    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      These authors have developed a method to induce MI or MII arrest. While this was previously possible in MI, the advantage of the method presented here is that it works for MII and is chemically inducible, because it is based on a system that is sensitive to the addition of ABA. Depending on when the ABA is added, they achieve a MI or MII delay. The ABA promotes dimerization of fragments of Mps1 and Spc105 that can't bind their chromosomal sites. The evidence that the MI arrest is weaker than the MII arrest is convincing, consistent with published data, and indicates that the SAC in MI is less robust than in MII or mitosis. The authors use this system to find evidence that the weak MI arrest is associated with PP1 binding to Spc105. This is a nice use of the system.

      The remainder of the paper uses the SynSAC system to isolate populations enriched for MI or MII stages and conduct proteomics. This shows a powerful use of the system but more work is needed to validate these results, particularly in normal cells.

      Overall, the most significant aspect of this paper is the technical achievement, which is validated by the other experiments. They have developed a system and generated some proteomics data that may be useful to others when analyzing kinetochore composition at each division. Overall, I have only a few minor suggestions.

      We appreciate the reviewer’s support of our study.

      1) In wild-type, Pds1 levels are high during M1 and A1, but low in MII. Can the authors comment on this? In line 217, what is meant by "slightly attenuated"? Can the authors comment on how anaphase occurs in the presence of high Pds1? There is even a low but significant level in MII.

      The higher levels of Pds1 in meiosis I compared to meiosis II have been observed previously using immunofluorescence and live imaging (refs 1–3). Although the reasons are not completely clear, we speculate that there is insufficient time between the two divisions to re-accumulate Pds1 prior to separase re-activation.

      We agree “slightly attenuated” was confusing and we have re-worded this sentence to read “Addition of ABA at the time of prophase release resulted in Pds1 (securin) stabilisation throughout the time course, consistent with delays in both metaphase I and II”.

      We do not believe that either anaphase I or II occurs in the presence of high Pds1. Western blotting represents the amount of Pds1 in the population of cells at a given time point. The time between meiosis I and II is very short even when cells are treated with ABA. For example, in Figure 2B, spindle morphology counts show that the anaphase I peak is around 40% at its maximum (105 min) and around 40% of cells are in either metaphase I or metaphase II, and will be Pds1 positive. In contrast, due to the more efficient arrest in meiosis II, anaphase II hardly occurs at all in these conditions, since anaphase II spindles (and the second nuclear division) are observed at very low frequency (maximum 10%) from 165 minutes onwards. Instead, metaphase II spindles partially or fully break down, without undergoing anaphase extension. Taking Pds1 levels from the western blot and the spindle data together leads to the conclusion that at the end of the time-course, these cells are biochemically in metaphase II, but unable to maintain a robust spindle. Spindle collapse is also observed in other situations where meiotic exit fails, and potentially reflects an uncoupling of the cell cycle from the programme governing gamete differentiation (refs 3–5). We will explain this point in a revised version while referring to representative images that provide evidence for this, as also requested by the reviewer below.

      2) The figures with data characterizing the system are mostly graphs showing time course of MI and MII. There is no cytology, which is a little surprising since the stage is determined by spindle morphology. It would help to see sample sizes (i.e., in the figure legends) and also representative images. It would also be nice to see images comparing the same stage in the SynSAC cells versus normal cells. Are there any differences in the morphology of the spindles or chromosomes when in the SynSAC system?

      This is an excellent suggestion and will also help clarify the point above. We will provide images of cells at the different stages. For each timepoint, 100 cells were scored. We have already included this information in the figure legends.

      3) A possible criticism of this system could be that the SAC signal promoting arrest is not coming from the kinetochore. Are there any possible consequences of this? In vertebrate cells, the RZZ complex streams off the kinetochore. Yeast don't have RZZ but this is an example of something that is SAC dependent and happens at the kinetochore. Can the authors discuss possible limitations such as this? Does the inhibition of the APC affect the native kinetochores? This could be good or bad. A bad possibility is that the cell is behaving as if it is in MII, but the kinetochores have made their microtubule attachments and behave as if in anaphase.

      In our view, the fact that SynSAC does not come from kinetochores is a major advantage as this allows the study of the kinetochore in an unperturbed state. It is also important to note that the canonical checkpoint components are all still present in the SynSAC strains, and perturbations in kinetochore-microtubule interactions would be expected to mount a kinetochore-driven checkpoint response as normal. Indeed, it would be interesting in future work to understand how disrupting kinetochore-microtubule attachments alters kinetochore composition (presumably checkpoint proteins will be recruited) and phosphorylation, but this is beyond the scope of this work. In terms of the state at which we are arresting cells, this is a true metaphase because cohesion has not been lost but kinetochore-microtubule attachments have been established. This is evident from the enrichment of microtubule regulators but not checkpoint proteins in the kinetochore purifications from metaphase I and II. While this state is expected to occur only transiently in yeast, since the establishment of proper kinetochore-microtubule attachments triggers anaphase onset, the ability to capture this properly bioriented state will be extremely informative for future studies. We appreciate the reviewer’s insight in highlighting these interesting discussion points, which we will include in a revised version.

      Reviewer #1 (Significance (Required)):

      These authors have developed a method to induce MI or MII arrest. While this was previously possible in MI, the advantage of the method presented here is that it works for MII and is chemically inducible, because it is based on a system that is sensitive to the addition of ABA. Depending on when the ABA is added, they achieve a MI or MII delay. The ABA promotes dimerization of fragments of Mps1 and Spc105 that can't bind their chromosomal sites. The evidence that the MI arrest is weaker than the MII arrest is convincing, consistent with published data, and indicates that the SAC in MI is less robust than in MII or mitosis. The authors use this system to find evidence that the weak MI arrest is associated with PP1 binding to Spc105. This is a nice use of the system.

      The remainder of the paper uses the SynSAC system to isolate populations enriched for MI or MII stages and conduct proteomics. This shows a powerful use of the system but more work is needed to validate these results, particularly in normal cells.

      Overall, the most significant aspect of this paper is the technical achievement, which is validated by the other experiments. They have developed a system and generated some proteomics data that may be useful to others when analyzing kinetochore composition at each division.

      We appreciate the reviewer’s enthusiasm for our work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript submitted by Koch et al. describes a novel approach to collect budding yeast cells in metaphase I or metaphase II by synthetically activating the spindle checkpoint (SAC). The arrest is transient and reversible. This synchronization strategy will be extremely useful for studying meiosis I and meiosis II, and for comparing the two divisions. The authors characterized this so-named SynSAC approach and could confirm previous observations that the SAC arrest is less efficient in meiosis I than in meiosis II. They found that downregulation of the SAC response through PP1 phosphatase is stronger in meiosis I than in meiosis II. The authors then went on to purify kinetochore-associated proteins from metaphase I and II extracts for proteome and phosphoproteome analysis. Their data will be of significant interest to the cell cycle community (they also compared their datasets to kinetochores purified from cells arrested in prophase I and, with SynSAC, in mitosis).

      I have only a couple of minor comments:

      1) I would add the Suppl Figure 1A to main Figure 1A. What is really exciting here is the arrest in metaphase II, so I don't understand why the authors characterize metaphase I in the main figure, but not metaphase II. But this is only a suggestion.

      This is a good suggestion; we will do this in our full revision.

      2) Line 197, the authors state: ...SynSAC induced a more pronounced delay in metaphase II than in metaphase I. However, in lines 229 and 240 the authors talk about a "longer delay in metaphase

      Thank you for pointing this out; this is indeed a typo and we have corrected it.

      3) The authors describe striking differences for both protein abundance and phosphorylation for key kinetochore associated proteins. I found one very interesting protein that seems to be very abundant and phosphorylated in metaphase I but not metaphase II, namely Sgo1. Do the authors think that Sgo1 is not required in metaphase II anymore? (Top hit in suppl Fig 8D).

      This is indeed an interesting observation, which we plan to investigate as part of another study in the future. Data from mouse oocytes indicate that shugoshin-dependent cohesin deprotection is already absent in meiosis II (ref. 6), though whether this is also true in yeast is not known. Furthermore, this does not rule out other functions of Sgo1 in meiosis II (for example, promoting biorientation). We will include this point in the discussion.

      Reviewer #2 (Significance (Required)):

      The technique described here will be of great interest to the cell cycle community. Furthermore, the authors provide data sets on purified kinetochores of different meiotic stages and compare them to mitosis. This paper will thus be highly cited, for the technique, and also for the application of the technique.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In their manuscript, Koch et al. describe a novel strategy to synchronize cells of the budding yeast Saccharomyces cerevisiae in metaphase I and metaphase II, thereby facilitating comparative analyses between these meiotic stages. This approach, termed SynSAC, adapts a method previously developed in fission yeast and human cells that enables the ectopic induction of a synthetic spindle assembly checkpoint (SAC) arrest by conditionally forcing the heterodimerization of two SAC components upon addition of the plant hormone abscisic acid (ABA). This is a valuable tool, which has the advantage that it induces SAC-dependent inhibition of the anaphase promoting complex without perturbing kinetochores. Furthermore, since the same strategy and yeast strain can also be used to induce a metaphase arrest during mitosis, the methodology developed by Koch et al. enables comparative analyses between mitotic and meiotic cell divisions. To validate their strategy, the authors purified kinetochores from meiotic metaphase I and metaphase II, as well as from mitotic metaphase, and compared their protein composition and phosphorylation profiles. The results are presented clearly and in an organized manner.

      We are grateful to the reviewer for their support.

      Despite the relevance of both the methodology and the comparative analyses, several main issues should be addressed:

      1.- In contrast to the strong metaphase arrest induced by ABA addition in mitosis (Supp. Fig. 2), the SynSAC strategy only promotes a delay in metaphase I and metaphase II as cells progress through meiosis. This delay extends the duration of both meiotic stages, but does not markedly increase the percentage of metaphase I or II cells in the population at a given timepoint of the meiotic time course (Fig. 1C). Therefore, although SynSAC broadens the time window for sample collection, it does not substantially improve differential analyses between stages compared with a standard NDT80 prophase block synchronization experiment. Could a higher ABA concentration or repeated hormone addition improve the tightness of the meiotic metaphase arrest?

      For many purposes the enrichment and extended time for sample collection are sufficient, as we demonstrate here. However, as pointed out by the reviewer below, the system can be improved by use of the 4A-RASA mutations to provide a stronger arrest (see our response below). We did not experiment with higher ABA concentrations or repeated addition since the very robust arrest achieved with the 4A-RASA mutant made this unnecessary.

      2.- Unlike the standard SynSAC strategy, introducing mutations that prevent PP1 binding to the SynSAC construct considerably extended the duration of the meiotic metaphase arrests. In particular, mutating PP1 binding sites in both the RVxF (RASA) and the SILK (4A) motifs of the Spc105(1-455)-PYL construct caused a strong metaphase I arrest that persisted until the end of the meiotic time course (Fig. 3A). This stronger and more prolonged 4A-RASA SynSAC arrest would directly address the issue raised above. It is unclear why the authors did not emphasize more this improved system. Indeed, the 4A-RASA SynSAC approach could be presented as the optimal strategy to induce a conditional metaphase arrest in budding yeast meiosis, since it not only adapts but also improves the original methods designed for fission yeast and human cells. Along the same lines, it is surprising that the authors did not exploit the stronger arrest achieved with the 4A-RASA mutant to compare kinetochore composition at meiotic metaphase I and II.

      We agree that the 4A-RASA mutant is the best tool to use for the arrest and going forward this will be our approach. We collected the proteomics data and the data on the SynSAC mutant variants concurrently, so we did not know about the improved arrest at the time the proteomics experiment was done. Because very good arrest was already achieved with the unmutated SynSAC construct, we could not justify repeating the proteomics experiment which is a large amount of work using significant resources. However, we will highlight the potential of the 4A-RASA mutant more prominently in our full revision.

      3.- The results shown in Supp. Fig. 4C are intriguing and merit further discussion. Mitotic growth in ABA suggests that the RASA mutation silences the SynSAC effect, yet this was not observed for the 4A or the double 4A-RASA mutants. Notably, in contrast to mitosis, the SynSAC 4A-RASA mutation leads to a more pronounced metaphase I meiotic delay (Fig. 3A). It is also noteworthy that the RVAF mutation partially restores mitotic growth in ABA. This observation supports, as previously demonstrated in human cells, that Aurora B-mediated phosphorylation of S77 within the RVSF motif is important to prevent PP1 binding to Spc105 in budding yeast as well.

      We agree these are intriguing findings that highlight key differences in the wiring of the spindle checkpoint in meiosis and mitosis, as well as potential for future studies. However, currently we can only speculate as to the underlying cause. The effect of the RASA mutation in mitosis is unexpected and unexplained. However, the fact that the 4A-RASA mutation causes a stronger delay in meiosis I compared to mitosis can be explained by a greater prominence of PP1 phosphatase in meiosis. Indeed, our data (Figure 4A) show that the PP1 phosphatase Glc7 and its regulatory subunit Fin1 are highly enriched on kinetochores at all meiotic stages compared to mitosis.

      We agree that the improved growth of the RVAF mutant is intriguing and points to a role of Aurora B-mediated phosphorylation, though previous work has not supported such a role (ref. 7).

      We will include a discussion of these important points in a revised version.

      4.- To demonstrate the applicability of the SynSAC approach, the authors immunoprecipitated the kinetochore protein Dsn1 from cells arrested at different meiotic or mitotic stages, and compared kinetochore composition using data independent acquisition (DIA) mass spectrometry. Quantification and comparative analyses of total and kinetochore protein levels were conducted in parallel for cells expressing either FLAG-tagged or untagged Dsn1 (Supp. Fig. 7A-B). To better detect potential changes, protein abundances were next scaled to Dsn1 levels in each sample (Supp. Fig. 7C-D). However, it is not clear why the authors did not normalize protein abundance in the immunoprecipitations from tagged samples at each stage to the corresponding untagged control, instead of performing a separate analysis. This would be particularly relevant given the high sensitivity of DIA mass spectrometry, which enabled quantification of thousands of proteins. Furthermore, the authors compared protein abundances in tagged-samples from mitotic metaphase and meiotic prophase, metaphase I and metaphase II (Supp. Fig. 7E-F). If protein amounts in each case were not normalized to the untagged controls, as inferred from the text (lines 333 to 338), the observed differences could simply reflect global changes in protein expression at different stages rather than specific differences in protein association to kinetochores.

      While we agree with the reviewer that at first glance, normalising to no tag makes the most sense, in practice there is very low background signal in the no tag sample which means that any random fluctuations have a big impact on the final fold change. This approach therefore introduces artefacts into the data rather than improving normalisation.
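      The point about low background can be illustrated with a small numerical sketch. The intensities below are hypothetical values chosen only for illustration (they are not taken from the dataset), and the function names are invented for this example. The sketch shows that the same absolute fluctuation in the no-tag signal changes a tagged/no-tag fold change far more when the no-tag signal is close to zero.

```python
def fold_change(tagged, untagged):
    """Ratio of the tagged IP signal to the no-tag background signal."""
    return tagged / untagged


def relative_swing(tagged, background, noise):
    """Spread of the fold change when the background shifts by +/- noise,
    expressed relative to the unperturbed fold change."""
    lowest = fold_change(tagged, background + noise)
    highest = fold_change(tagged, background - noise)
    return (highest - lowest) / fold_change(tagged, background)


# Hypothetical intensities in arbitrary units (assumptions, not measured data).
tagged_signal = 1000.0
background_low = 1.0      # near-zero no-tag background
background_high = 100.0   # a much larger background, for comparison
noise = 0.5               # identical absolute fluctuation in both cases

swing_low = relative_swing(tagged_signal, background_low, noise)
swing_high = relative_swing(tagged_signal, background_high, noise)

print(f"relative swing, low background:  {swing_low:.3f}")
print(f"relative swing, high background: {swing_high:.3f}")
```

      Under these assumed numbers, the fold change computed against the near-zero background swings by more than its own value, whereas the one computed against the larger background barely moves.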

      To provide reassurance that our kinetochore immunoprecipitations are specific, and that the background (no tag) signal is indeed very low, we will provide a new supplemental figure showing the volcano plots comparing kinetochore purifications at each stage with their corresponding no tag control. These volcano plots show very clearly that the major enriched proteins are kinetochore proteins and associated factors, in all cases.

      It is also important to note that our experiment looks at relative changes of the same protein over time, which we expect to be relatively small in the whole cell lysate. We previously documented proteins that change in abundance in whole cell lysates throughout meiosis (ref. 8). In that study, we found that relatively few proteins significantly change in abundance, supporting this view.

      Our aim in the current study was to understand how the relative composition of the kinetochore changes and, for this, we believe that a direct comparison to Dsn1, a central kinetochore protein which we immunoprecipitated, is the most appropriate normalisation.

      5.- Despite the large amount of potentially valuable data generated, the manuscript focuses mainly on results that reinforce previously established observations (e.g., premature SAC silencing in meiosis I by PP1, changes in kinetochore composition, etc.). The discussion would benefit from a deeper analysis of novel findings that underscore the broader significance of this study.

      We strongly agree with this point and we will re-frame the discussion to focus on the novel findings, as also raised by the other reviewers.

      Finally, minor concerns are:

      1.- Meiotic progression in SynSAC strains lacking Mad1, Mad2 or Mad3 is severely affected (Fig. 1D and Supp. Fig. 1), making it difficult to assess whether, as the authors state, the metaphase delays depend on the canonical SAC cascade. In addition, as a general note, graphs displaying meiotic time courses could be improved for clarity (e.g., thinner data lines, addition of axis gridlines and external tick marks, etc.).

      We will generate the data to include a checkpoint mutant +/- ABA for direct comparison. We will take steps to improve the clarity of presentation of the meiotic timecourse graphs, though our experience is that uncluttered graphs make it easier to compare trends.

      2.- Spore viability following SynSAC induction in meiosis was used as an indicator that this experimental approach does not disrupt kinetochore function and chromosome segregation. However, this is an indirect measure. Direct monitoring of genome distribution using GFP-tagged chromosomes would have provided more robust evidence. Notably, the SynSAC mad3Δ mutant shows a slight viability defect, which might reflect chromosome segregation defects that are more pronounced in the absence of a functional SAC.

      Spore viability is a much more sensitive way of analysing segregation defects than GFP-labelled chromosomes. This is because GFP labelling allows only a single chromosome to be followed. On the other hand, if any of the 16 chromosomes mis-segregate in a given meiosis, this would result in one or more aneuploid spores in the tetrad, which are typically inviable. The fact that spore viability is not significantly different from wild type in this analysis indicates that there are no major chromosome segregation defects in these strains, and we therefore do not plan to do this experiment.
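      The sensitivity argument can be made concrete with a short calculation. This is a minimal sketch that assumes, purely for illustration, that each of the 16 chromosomes mis-segregates independently with the same probability per meiosis; the rate used below is hypothetical and the function name is invented for this example.

```python
def p_any_missegregation(p_per_chromosome, n_chromosomes=16):
    """Probability that at least one of n chromosomes mis-segregates,
    assuming independent events with equal per-chromosome probability."""
    return 1.0 - (1.0 - p_per_chromosome) ** n_chromosomes


# Hypothetical per-chromosome mis-segregation rate (an assumption, not data).
p = 0.02

# Following one GFP-labelled chromosome only reports events on that chromosome.
p_detect_single = p_any_missegregation(p, n_chromosomes=1)

# Spore viability reports an error on any of the 16 chromosomes.
p_detect_any = p_any_missegregation(p, n_chromosomes=16)

print(f"single labelled chromosome: {p_detect_single:.3f}")
print(f"any of 16 chromosomes:      {p_detect_any:.3f}")
```

      Under this simplified model, an assay that responds to an error on any chromosome registers events in a much larger fraction of meioses than one that follows a single chromosome.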

      3.- It is surprising that, although SAC activity is proposed to be weaker in metaphase I, the levels of CPC/SAC proteins seem to be higher at this stage of meiosis than in metaphase II or mitotic metaphase (Fig. 4A-B).

      We agree that this is surprising and we will point this out in the revised discussion. We speculate that the challenge in biorienting homologs which are held together by chiasmata, rather than back-to-back kinetochores, results in a greater requirement for error correction in meiosis I. Interestingly, the data with the RASA mutant also point to increased PP1 activity in meiosis I, and we additionally observed increased levels of PP1 (Glc7 and Fin1) on meiotic kinetochores, consistent with the idea that cycles of error correction and silencing are elevated in meiosis I.

      4.- Although a more detailed exploration of kinetochore composition or phosphorylation changes is beyond the scope of the manuscript, some key observations could have been validated experimentally (e.g., enrichment of proteins at kinetochores, phosphorylation events that were identified as specific or enriched at a certain meiotic stage, etc.).

      We agree that this is beyond the scope of the current study but will form the start of future projects from our group, and hopefully others.

      5.- Several typographical errors should be corrected (e.g., "Knetochores" in Fig. 4 legend, "250uM ABA" in Supp. Fig. 1 legend, etc.)

      Thank you for pointing these out; they have been corrected.

      Reviewer #3 (Significance (Required)):

      Koch et al. describe a novel methodology, SynSAC, to synchronize budding yeast cells in metaphase I or metaphase II during meiosis, as well as in mitotic metaphase, thereby enabling differential analyses among these cell division stages. Their approach builds on prior strategies originally developed in fission yeast and human cell models to induce a synthetic spindle assembly checkpoint (SAC) arrest by conditionally forcing the heterodimerization of two SAC proteins upon addition of abscisic acid (ABA). The results from this manuscript are of special relevance for researchers studying meiosis and using Saccharomyces cerevisiae as a model. Moreover, the differential analysis of the composition and phosphorylation of kinetochores from meiotic metaphase I and metaphase II adds interest for the broader meiosis research community. Finally, regarding my expertise, I am a researcher specialized in the regulation of cell division.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      In their manuscript, Koch et al. describe a novel strategy to synchronize cells of the budding yeast Saccharomyces cerevisiae in metaphase I and metaphase II, thereby facilitating comparative analyses between these meiotic stages. This approach, termed SynSAC, adapts a method previously developed in fission yeast and human cells that enables the ectopic induction of a synthetic spindle assembly checkpoint (SAC) arrest by conditionally forcing the heterodimerization of two SAC components upon addition of the plant hormone abscisic acid (ABA). This is a valuable tool, which has the advantage that it induces SAC-dependent inhibition of the anaphase promoting complex without perturbing kinetochores. Furthermore, since the same strategy and yeast strain can also be used to induce a metaphase arrest during mitosis, the methodology developed by Koch et al. enables comparative analyses between mitotic and meiotic cell divisions. To validate their strategy, the authors purified kinetochores from meiotic metaphase I and metaphase II, as well as from mitotic metaphase, and compared their protein composition and phosphorylation profiles. The results are presented clearly and in an organized manner. Despite the relevance of both the methodology and the comparative analyses, several main issues should be addressed:

      1.- In contrast to the strong metaphase arrest induced by ABA addition in mitosis (Supp. Fig. 2), the SynSAC strategy only promotes a delay in metaphase I and metaphase II as cells progress through meiosis. This delay extends the duration of both meiotic stages, but does not markedly increase the percentage of metaphase I or II cells in the population at a given timepoint of the meiotic time course (Fig. 1C). Therefore, although SynSAC broadens the time window for sample collection, it does not substantially improve differential analyses between stages compared with a standard NDT80 prophase block synchronization experiment. Could a higher ABA concentration or repeated hormone addition improve the tightness of the meiotic metaphase arrest?

      2.- Unlike the standard SynSAC strategy, introducing mutations that prevent PP1 binding to the SynSAC construct considerably extended the duration of the meiotic metaphase arrests. In particular, mutating PP1 binding sites in both the RVxF (RASA) and the SILK (4A) motifs of the Spc105(1-455)-PYL construct caused a strong metaphase I arrest that persisted until the end of the meiotic time course (Fig. 3A). This stronger and more prolonged 4A-RASA SynSAC arrest would directly address the issue raised above. It is unclear why the authors did not emphasize more this improved system. Indeed, the 4A-RASA SynSAC approach could be presented as the optimal strategy to induce a conditional metaphase arrest in budding yeast meiosis, since it not only adapts but also improves the original methods designed for fission yeast and human cells. Along the same lines, it is surprising that the authors did not exploit the stronger arrest achieved with the 4A-RASA mutant to compare kinetochore composition at meiotic metaphase I and II.

      3.- The results shown in Supp. Fig. 4C are intriguing and merit further discussion. Mitotic growth in ABA suggests that the RASA mutation silences the SynSAC effect, yet this was not observed for the 4A or the double 4A-RASA mutants. Notably, in contrast to mitosis, the SynSAC 4A-RASA mutation leads to a more pronounced metaphase I meiotic delay (Fig. 3A). It is also noteworthy that the RVAF mutation partially restores mitotic growth in ABA. This observation supports, as previously demonstrated in human cells, that Aurora B-mediated phosphorylation of S77 within the RVSF motif is important to prevent PP1 binding to Spc105 in budding yeast as well.

      4.- To demonstrate the applicability of the SynSAC approach, the authors immunoprecipitated the kinetochore protein Dsn1 from cells arrested at different meiotic or mitotic stages, and compared kinetochore composition using data independent acquisition (DIA) mass spectrometry. Quantification and comparative analyses of total and kinetochore protein levels were conducted in parallel for cells expressing either FLAG-tagged or untagged Dsn1 (Supp. Fig. 7A-B). To better detect potential changes, protein abundances were next scaled to Dsn1 levels in each sample (Supp. Fig. 7C-D). However, it is not clear why the authors did not normalize protein abundance in the immunoprecipitations from tagged samples at each stage to the corresponding untagged control, instead of performing a separate analysis. This would be particularly relevant given the high sensitivity of DIA mass spectrometry, which enabled quantification of thousands of proteins. Furthermore, the authors compared protein abundances in tagged-samples from mitotic metaphase and meiotic prophase, metaphase I and metaphase II (Supp. Fig. 7E-F). If protein amounts in each case were not normalized to the untagged controls, as inferred from the text (lines 333 to 338), the observed differences could simply reflect global changes in protein expression at different stages rather than specific differences in protein association to kinetochores.

      5.- Despite the large amount of potentially valuable data generated, the manuscript focuses mainly on results that reinforce previously established observations (e.g., premature SAC silencing in meiosis I by PP1, changes in kinetochore composition, etc.). The discussion would benefit from a deeper analysis of novel findings that underscore the broader significance of this study.

      Finally, minor concerns are:

      1.- Meiotic progression in SynSAC strains lacking Mad1, Mad2 or Mad3 is severely affected (Fig. 1D and Supp. Fig. 1), making it difficult to assess whether, as the authors state, the metaphase delays depend on the canonical SAC cascade. In addition, as a general note, graphs displaying meiotic time courses could be improved for clarity (e.g., thinner data lines, addition of axis gridlines and external tick marks, etc.).

      2.- Spore viability following SynSAC induction in meiosis was used as an indicator that this experimental approach does not disrupt kinetochore function and chromosome segregation. However, this is an indirect measure. Direct monitoring of genome distribution using GFP-tagged chromosomes would have provided more robust evidence. Notably, the SynSAC mad3Δ mutant shows a slight viability defect, which might reflect chromosome segregation defects that are more pronounced in the absence of a functional SAC.

      3.- It is surprising that, although SAC activity is proposed to be weaker in metaphase I, the levels of CPC/SAC proteins seem to be higher at this stage of meiosis than in metaphase II or mitotic metaphase (Fig. 4A-B).

      4.- Although a more detailed exploration of kinetochore composition or phosphorylation changes is beyond the scope of the manuscript, some key observations could have been validated experimentally (e.g., enrichment of proteins at kinetochores, phosphorylation events that were identified as specific or enriched at a certain meiotic stage, etc.).

      5.- Several typographical errors should be corrected (e.g., "Knetochores" in Fig. 4 legend, "250uM ABA" in Supp. Fig. 1 legend, etc.)

      Significance

      Koch et al. describe a novel methodology, SynSAC, to synchronize budding yeast cells in metaphase I or metaphase II during meiosis, as well as in mitotic metaphase, thereby enabling differential analyses among these cell division stages. Their approach builds on prior strategies originally developed in fission yeast and human cell models to induce a synthetic spindle assembly checkpoint (SAC) arrest by conditionally forcing the heterodimerization of two SAC proteins upon addition of abscisic acid (ABA). The results from this manuscript are of special relevance for researchers studying meiosis and using Saccharomyces cerevisiae as a model. Moreover, the differential analysis of the composition and phosphorylation of kinetochores from meiotic metaphase I and metaphase II adds interest for the broader meiosis research community. Finally, regarding my expertise, I am a researcher specialized in the regulation of cell division.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript submitted by Koch et al. describes a novel approach to collect budding yeast cells in metaphase I or metaphase II by synthetically activating the spindle checkpoint (SAC). The arrest is transient and reversible. This synchronization strategy will be extremely useful for studying meiosis I and meiosis II, and for comparing the two divisions. The authors characterized this so-named SynSAC approach and could confirm previous observations that the SAC arrest is less efficient in meiosis I than in meiosis II. They found that downregulation of the SAC response through PP1 phosphatase is stronger in meiosis I than in meiosis II. The authors then went on to purify kinetochore-associated proteins from metaphase I and II extracts for proteome and phosphoproteome analysis. Their data will be of significant interest to the cell cycle community (they also compared their datasets to kinetochores purified from cells arrested in prophase I and, with SynSAC, in mitosis).

      I have only a couple of minor comments:

      1) I would add the Suppl Figure 1A to main Figure 1A. What is really exciting here is the arrest in metaphase II, so I don't understand why the authors characterize metaphase I in the main figure, but not metaphase II. But this is only a suggestion.

      2) Line 197, the authors state: ...SynSAC induced a more pronounced delay in metaphase II than in metaphase I. However, in lines 229 and 240 the authors talk about a "longer delay in metaphase I compared to metaphase II"... this seems to be a mix-up.

      3) The authors describe striking differences for both protein abundance and phosphorylation for key kinetochore associated proteins. I found one very interesting protein that seems to be very abundant and phosphorylated in metaphase I but not metaphase II, namely Sgo1. Do the authors think that Sgo1 is not required in metaphase II anymore? (Top hit in suppl Fig 8D).

      Significance

      The technique described here will be of great interest to the cell cycle community. Furthermore, the authors provide data sets on purified kinetochores of different meiotic stages and compare them to mitosis. This paper will thus be highly cited, for the technique, and also for the application of the technique.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      These authors have developed a method to induce MI or MII arrest. While this was previously possible in MI, the advantage of the method presented here is that it works for MII and is chemically inducible, because it is based on a system that is sensitive to the addition of ABA. Depending on when the ABA is added, they achieve a MI or MII delay. The ABA promotes dimerization of fragments of Mps1 and Spc105 that can't bind their chromosomal sites. The evidence that the MI arrest is weaker than the MII arrest is convincing and consistent with published data, indicating that the SAC in MI is less robust than in MII or mitosis. The authors use this system to find evidence that the weak MI arrest is associated with PP1 binding to Spc105. This is a nice use of the system.

      The remainder of the paper uses the SynSAC system to isolate populations enriched for MI or MII stages and conduct proteomics. This shows a powerful use of the system but more work is needed to validate these results, particularly in normal cells.

      Overall, the most significant aspect of this paper is the technical achievement, which is validated by the other experiments. They have developed a system and generated some proteomics data that may be useful to others when analyzing kinetochore composition at each division. Overall, I have only a few minor suggestions.

      1) In wild-type, Pds1 levels are high during MI and AI, but low in MII. Can the authors comment on this? In line 217, what is meant by "slightly attenuated"? Can the authors comment on how anaphase occurs in the presence of high Pds1? There is even a low but significant level in MII.

      2) The figures with data characterizing the system are mostly graphs showing time courses of MI and MII. There is no cytology, which is a little surprising since the stage is determined by spindle morphology. It would help to see sample sizes (i.e., in the figure legends) and also representative images. It would also be nice to see images comparing the same stage in the SynSAC cells versus normal cells. Are there any differences in the morphology of the spindles or chromosomes when in the SynSAC system?

      3) A possible criticism of this system could be that the SAC signal promoting arrest is not coming from the kinetochore. Are there any possible consequences of this? In vertebrate cells, the RZZ complex streams off the kinetochore. Yeast don't have RZZ, but this is an example of something that is SAC dependent and happens at the kinetochore. Can the authors discuss possible limitations such as this? Does the inhibition of the APC affect the native kinetochores? This could be good or bad. A bad possibility is that the cell is behaving as if it is in MII, but the kinetochores have made their microtubule attachments and behave as if in anaphase.

      Significance

      These authors have developed a method to induce MI or MII arrest. While this was previously possible in MI, the advantage of the method presented here is that it works for MII and is chemically inducible, because it is based on a system that is sensitive to the addition of ABA. Depending on when the ABA is added, they achieve a MI or MII delay. The ABA promotes dimerization of fragments of Mps1 and Spc105 that can't bind their chromosomal sites. The evidence that the MI arrest is weaker than the MII arrest is convincing and consistent with published data, indicating that the SAC in MI is less robust than in MII or mitosis. The authors use this system to find evidence that the weak MI arrest is associated with PP1 binding to Spc105. This is a nice use of the system.

      The remainder of the paper uses the SynSAC system to isolate populations enriched for MI or MII stages and conduct proteomics. This shows a powerful use of the system but more work is needed to validate these results, particularly in normal cells.

      Overall, the most significant aspect of this paper is the technical achievement, which is validated by the other experiments. They have developed a system and generated some proteomics data that may be useful to others when analyzing kinetochore composition at each division.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03160

      Corresponding author(s) Padinjat, Raghu


      1. General Statements [optional]

      We thank all three reviewers for appreciating the novelty of our analysis of CERT function in a physiological context in vivo. While many studies have been published on the biochemistry and function of CERT in cultured cells, there are limited studies, if any, relating the impact of CERT function at the biochemical level to its function in a physiological process, in our case the electrical response to light.

      We also thank all reviewers for commenting on the importance of our rescue of dcert mutants with hCERT and the scientific insights raised by this experiment. All reviewers have also noted the importance of strengthening our observation that hCERT, in these cells, is localized at ER-PM MCS rather than the more widely reported localization at the Golgi. We highlight that many excellent studies which have localized CERT at the Golgi were performed in cultured, immortalized, mammalian cells. There are limited studies on the localization of this protein in primary cells, neurons or polarized cells. With the additional experiments we have proposed in the revision for this aspect of the manuscript, we believe the findings will be of great novelty and widespread interest.

      We believe we can address almost all points raised by reviewers thereby strengthening this exciting manuscript.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript dissects the physiological function of ceramide transfer protein (CERT) by studying the phenotype of CERT null Drosophila.

      dCERT null animals have a reduced electrical response to light in their photoreceptors, reduced baseline PIP2 accumulation in the cells and delayed re-synthesis of PIP2 and its precursor, PI4P, after light stimulation. There are also reduced ER:PM contact sites at the rhabdomere and a corresponding reduction in the localization of the PI/PA exchange protein, RDGB, at this site. Therefore, the animals seem to have an impaired ability for sustaining phototransduction, which is nonetheless milder than that seen after loss of RDGB, for example. In terms of biochemical function, there is no overall change in ceramides, with some minor increases in specific short chain pools. There is however a large decrease in PE-ceramide species, again selective for a few molecular species. Curiously, decreasing ceramides with a mutant in ceramide synthesis is able to partially rescue both the electrical response and RDGB localization in dCERT flies, implying the increased ceramide species contribute to the phenotype. In addition, a mutation in PE-ceramide synthase largely phenocopies the dCERT null, exhibiting both increased ceramides and decreased PE-ceramide.

      In addition, dCERT flies were shown to have reduced localization of some plasma membrane proteins to detergent-resistant membrane fractions, as well as up regulation of the IRE1 and PERK stress-response pathways. Finally, dCERT nulls could be rescued with the human CERT protein, demonstrating conservation of core physiological function between these animals. Surprisingly, CERT is reported to localize to the ER:PM junctions at rhabdomeres, as opposed to the expected ER:Golgi contact sites. Specific areas where the manuscript could be strengthened include:

      Figure 2 studies the phototransduction system. Although clear changes in PI4P and PIP2 are seen, it would be interesting to see if changed PA accumulation occur in the dCERT animals, since RDGB localization is disrupted: this is expected to cause PM PA accumulation along with reduced PIP2 synthesis.

      The reviewer raises an important question about PA levels. In the present study we have noticed that localization of RDGB at the base of the rhabdomere in dcert1 is reduced but not completely removed. Consequently, one may consider the situation in dcert1 as a partial loss of function of RDGB and, consistent with this, the delay in PI4P and PI(4,5)P2 resynthesis is not as severe as in rdgB9, which is a strong hypomorph (PMID: 26203165).

      rdgB9 mutants also show an elevation in PA levels, and the reviewer is right that one might expect changes in PA levels too, as RDGB is a PI/PA transfer protein. We expect that, if measured, there will be a modest elevation in PA levels. However, previous work has shown that elevation of PA levels at or close to the rhabdomere leads to retinal degeneration. Specifically, elevating PA levels by dPLD overexpression disrupts rhabdomere biogenesis and leads to retinal degeneration (PMID: 19349583). Similarly, loss of the lipid transfer protein RDGB leads to photoreceptor degeneration (PMID: 26203165). In this study, we report that retinal degeneration is not a phenotype of dcert1. Thus, measurements of PA levels, though interesting, may not be that informative in the context of the present study. However, if necessary, we can measure PA levels in dcert1.

      Lines 228-230 state: "These findings suggest an important contribution for reduced PE - Cer levels in the eye phenotypes of dcert". Does it not also suggest a contribution of the elevated ceramide species, since these are also observed in the CPES animals?

      We agree with the reviewer that not only reduced PE-Ceramide but also elevated ceramide levels in GMR>CPESi could contribute to the eye phenotype. This statement will be revised to reflect this conclusion.

      Figure 6D is a key finding that human CERT localized to the rhabdomere at ER:PM contact sites, though the reviewer was not convinced by these images. Is the protein truly localized to the contact sites, or simply have a pool of over-expressed protein localized to the surrounding cytoplasm? It also does not rule out localization (and therefore function) at ER:PM contact sites.

      Since hCERT completely rescued the eye phenotype of dcert1, the localization we observe for hCERT must be at least partly relevant. We will perform additional IHC experiments to:

      • Co-localize hCERT with an ER-PM MCS marker, e.g., RDGB, in wild-type flies.
      • Co-localize hCERT with VAP-A, which is enriched at the ER-PM MCS. This should help to determine if there are MCS and non-MCS pools of hCERT in these cells.
      • Test if there is a pool of hCERT in these cells that also localizes (or not) with the Golgi marker Golgin 84.

      These will be included in the revision to strengthen this important point.

      Statistics: There are a large number of t-tests employed that do not correct for multiple comparisons, for example in figures 3B, 3D, 3H, 4C, 6C, S2A, S2B, S3B and S3C.

      We will apply multiple-comparison corrections to the data mentioned and incorporate the results in the revised manuscript.
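
      As an illustration only of what such a correction involves, the sketch below applies a Holm-Bonferroni adjustment to a set of raw p-values. The choice of Holm-Bonferroni is an assumption made for this example (the response above does not state which method will be used), and the function name `holm_adjust` and the p-values are hypothetical, not taken from the manuscript.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by (m - rank),
        # and the result is capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four pairwise t-tests (illustrative only).
raw = [0.010, 0.040, 0.030, 0.005]
adjusted = holm_adjust(raw)
# A comparison is called significant if its adjusted p-value is below 0.05.
significant = [p < 0.05 for p in adjusted]
```

      With these made-up inputs, two of the four comparisons that pass an uncorrected 0.05 threshold no longer do so after adjustment, which is the kind of change the reviewer's comment anticipates.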

      There are two Western blotting sections in the methods.

      The first Western blotting method is for general blots in the paper. The second Western blotting section relates to the samples from detergent-resistant membrane (DRM) fractions. We will clearly explain this in the methods section of the manuscript.

      Reviewer #1 (Significance (Required)):

      Overall, the manuscript is clearly and succinctly written, with the data well presented and mostly convincing. The paper demonstrates clear phenotypes associated with loss of dCERT function, with surprising consequences for the function of a signaling system localized to ER:PM contact sites. To this reviewer, there seem to be three cogent observations of the paper: (i) loss of dCERT leads to accumulation of ceramides and loss of PE-ceramide, which together drive the phenotype. (ii) this ceramide alteration disrupts ER:PM contact sites and thus impairs phototransduction and (iii) rescue by human CERT and its apparent localization to ER:PM contact sites implies a potential novel site of action. Although surprising and novel, the significance of these observations is a little unclear: there is no obvious mechanism by which the elevated ceramide species and decreased PE-ceramide cause the specific failure in phototransduction, and the evidence for a novel site of action of CERT at the ER:PM contact sites is not compelling. Therefore, although an interesting and novel set of observations, the manuscript does not reveal a clear mechanistic basis for CERT physiological function.

      We thank the reviewer for appreciating the quality of our manuscript while also highlighting points through which its impact can be enhanced. To our knowledge this is one of the first studies to tackle the challenging problem of a role for CERT in physiological function. We would like to highlight two points raised:

      • We do understand that the localisation of hCERT at ER-PM MCS is unusual compared to the traditionally reported localization to ER-Golgi sites. This is important for the overall interpretation of the results in the paper on how dCERT regulates phototransduction. As indicated in response to an earlier comment by the reviewer, we will perform additional experiments to strengthen our conclusion on the localization of hCERT.
      • With regard to how loss of dCERT affects phototransduction, we feel two likely mechanisms contribute. If the localization of hCERT to ER-PM MCS is verified through additional experiments (see proposal above), then it is important to note that ER-PM MCS in these cells includes the SMC (smooth endoplasmic reticulum), the major site of lipid synthesis. It is possible that loss of dCERT leads to ceramide accumulation in the smooth ER and disruption of ER-PM contacts. That may explain why reducing the levels of ceramide at this site partially rescues the eye phenotype.

      The multi-protein INAD-TRP-NORPA complex, central to phototransduction, has previously been shown to localise to DRMs in photoreceptors. PE-Ceramides are important contributors to the formation of plasma membrane DRMs, and we have presented biochemical evidence that the formation of these DRMs is reduced in dcert1. This may be a mechanism contributing to reduced phototransduction. This latter mechanism has been proposed as a physiological function of DRMs, but we think our data may be the first to show it in a physiological model.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Non-vesicular lipid transfer by lipid transfer proteins regulates organelle lipid compositions and functions. CERT transfers ceramide from the ER to Golgi to produce sphingomyelin, although CERT function in animal development and physiology is less clear. Using dcert1 (a protein-null allele), this paper shows that disruption of the sole Drosophila CERT gene causes reduced ERG amplitude in photoreceptors. While the level and localization of phototransduction machinery appear unaffected, the level of PIP2 and the localization of RDGB are perturbed. Collectively, these observations establish a novel link between CERT and phospholipase signaling in phototransduction. To understand the molecular mechanism further, the authors performed lipid chromatography and mass spec to characterize ceramide species in dcert1. This analysis reveals that whereas the total ceramide remains unaffected, most PE-ceramide species are reduced. The authors use a lace mutant (serine palmitoyl transferase) and CPES (ceramide phosphoethanolamine synthase) RNAi to distinguish whether it is the accumulation of ceramide in the ER or the reduction of sphingolipid derivatives in the Golgi that is the cause of the reduced ERG amplitude. Mutating one copy of lace reduces ceramide level by 50% and partially rescues the ERG defect, suggesting that the accumulation of ceramide in the ER is a cause. CPES RNAi phenocopies the reduced ERG amplitude, suggesting the production of certain sphingolipids is also relevant.

      Major comments: 1. By showing the reduced PIP2 level, the decreased SMC sites at the base of rhabdomeres, and the diffused RDGB localization in dcert1, the authors favor the model, in which the disruption of ceramide metabolism affects PIP transport. However, it is unclear if the reduced PIP2 level (i.e., reduced PH-PLCd::GFP staining) is specific to the rhabdomeres. It should be possible to compare PH-PLCd::GFP signals in different plasma membranes between wildtype and dcert1. If PH-PLCd::GFP signal is specifically reduced at the rhabdomeres, this conclusion will be greatly strengthened. In addition, the photoreceptor apical plasma membrane includes rhabdomere and stalk membrane. Is the PH-PLCd::GFP signal at the stalk membrane also affected?

      Due to the physical organization of optics in the fly eye, the pseudopupil imaging method used in this study collects the signal for the PIP2 probe (PH-PLCd::GFP) mainly from the apical rhabdomere membrane of photoreceptors in live imaging experimental mode. Therefore, the PIP2 signal from these experiments cannot be used to interpret the level of PIP2 either at the stalk membrane or indeed the basolateral membrane.

      The point raised by the reviewer, i.e., whether CERT selectively controls PIP2 levels at the rhabdomere membrane or not, is an interesting one. To address it, we will need to fix fly photoreceptors and determine the PH-PLCd::GFP signal using single-slice confocal imaging. When combined with a stalk marker such as CRUMBS, it should be possible to address the question of which membrane domains are those at which dCERT controls PIP2 levels. If the sole mechanism of action of dCERT is via disruption of ER-PM MCS, then only the apical rhabdomere membrane PIP2 should be affected, leaving the stalk membrane and basolateral membrane unaffected.

      Thank you very much for raising this specific point.

      The analysis of RDGB localization should be done in mosaic dcert1 retinas, which will be more convincing with internal control for each comparison. In addition, the phalloidin staining in Figure 2J shows distinct patterns of adherens junctions, indicating that the wildtype and dcert1 were imaged at different focal planes.

      We understand that mosaics are an alternative and elegant way to perform a side-by-side analysis of control and mutant. However, this would require a significant investment of time and effort. In addition, a mosaic analysis would compromise our ERG analysis, since the ERG is an extracellular recording. We feel that this is beyond the scope of this study and may not be necessary as such (see below).

      In the revision we will present equivalent sections of control and dcert1 taken from the nuclear plane of the photoreceptor. This should resolve the reviewer’s concerns.

      The significance of ceramide species levels in dcert1 and GMR>CPESRNAi needs to be explained better. Do certain alterations represent accumulation of ceramides in the ER?

      Species level analysis of changes in ceramides reveal that elevations in dcert1 are seen mainly in the short chain ceramides (14 and 16 carbon chains). These most likely represent the short chain ceramides synthesised in the ER and accumulating due to the block in further metabolism to PE-Cer due to depletion in CPES.

      Species-level analysis of changes in ceramides reveals that in dcert1 there is a ceramide transport-related defect leading to elevation, primarily, of the short-chain ceramides (14 and 16 carbon chains), and this selective supply defect leads to a reduction in PE-Cer levels, with a maximum change in the ratio of short-chain Cer:PE Cer (Figure 3A-D). Though there is no apparent change in the total ceramide level, the species-specific elevation in ceramides disturbs the fine balance between the short-chain ceramides and the long and very-long chain ceramides. As long and very-long chain ceramides are implicated in dendrite development and neuronal morphology (doi: 10.1371/journal.pgen.1011880), this alteration in the fine balance between different ceramide species probably impacts the integrity and fluidity of the membrane environment. It also raises the possibility of a defined function for the short-chain ceramides in electrical responses to light signalling in the eye, especially with respect to the PE-ceramides, which are reduced by around 50%.

      In contrast the GMR>CPESRNAi leads to more of a substrate accumulation showing ceramide increase (14, 16, 18, 20 carbon chains) and decrease in PE-Cer levels (Figure 4D, E). In this case Cer accumulation is due to the block in further metabolism to PE-Cer arising from depletion in CPES.

      We will include this in the discussion of a revised version.

      The suppression by lace is interpreted as evidence that the reduced ERG amplitude in dcert1 is caused by ceramide accumulation in the ER. This interpretation seems preliminary as lace may interact with dcert genetically by other mechanisms.

      The dcert1 mutant exhibits increased levels of short-chain ceramides (Fig 3B), whereas the lace heterozygous mutant (laceK05305/+) displays reduced short-chain ceramide levels (Supp Fig 2B). In the laceK05305/+; dcert1 double mutant, ceramide levels are lower than those observed in the dcert1 mutant alone (Supp Fig 2B), indicating a partial genetic rescue of the elevated ceramide phenotype.

      Furthermore, through multiple independent genetic manipulations that modulate ceramide metabolism (alterations of dcert, cpes and lace), we consistently observe that increased ceramide levels correlate with a reduction in ERG amplitude, suggesting that ceramide accumulation negatively impacts photoreceptor function. Taken together, these observations indicate that the reduction in ceramide levels in the laceK05305/+; dcert1 double mutant likely contributes to the suppression of the ERG defect observed in the dcert1 mutant.

      The authors show that ERG amplitude is reduced in GMR>CPESRNAi. While this phenocopying is consistent with the reduced ERG amplitude in dcert1 being caused by reduced production of PE-ceramide, GMR>CPESRNAi also shows an increase in total ceramide level. Could this support the hypothesis that reduced ERG amplitude is caused by an accumulation of ceramide elsewhere? In addition, is the ERG amplitude reduction in GMR>CPESRNAi sensitive to lace?

      We agree that, in addition to reduced PE-Ceramide, the elevated ceramide levels in GMR>CPESi could contribute to the eye phenotype. We will introduce a lace heterozygous mutant into the GMR>CPESi background to test the contribution of elevated ceramide levels in this background and incorporate the data in the revision. Thank you for this suggestion.

      Along the same line, while the total ceramide level is significantly reduced in lace heterozygotes, is the PE-ceramide level also reduced? If yes, wouldn't this be contradictory to PE-ceramide production being important for ERG amplitude?

      Mass spec measurements show that levels of PE-Cer were not reduced in laceK05305/+ compared to wild type. These data will be included in the revised manuscript. However, the ERG amplitude of these flies, and of those with lace depletion using two independent RNAi lines, was not reduced.

      What is the explanation and significance for the age-dependent deterioration of ERG amplitude in dcert1? Likewise, the significance of no retinal degeneration is not clearly presented.

      There could be multiple reasons for the age-dependent deterioration of the ERG amplitude in the absence of retinal degeneration. These may include instability of the DRMs due to reduced PE-Cer, lower ATP levels due to mitochondrial dysfunction, and perhaps others. The Drosophila phototransduction cascade depends heavily on ATP production, so an age-dependent reduction in ATP synthesis could lead to deterioration in the ERG amplitude. A previous study has shown that ATP production is highly reduced, along with oxidative stress and metabolic dysfunction, in dcert1 flies aged to 10 days and beyond (PMID: 17592126). The same study also found no neuronal degeneration in dcert1, consistent with the absence of photoreceptor degeneration in the present study. We will attempt a few experiments to rule these possibilities in or out and revise the discussion accordingly.

      The rescue of dcert1 phenotype by the expression of human CERT is a nice result. In addition to demonstrating a functional conservation, it allows a determination of CERT protein localization. However, the quality of images in Figure 6D should be improved. The phalloidin staining was rather poor, and the CNX99A in the lower panel was over-exposed, generating bleed-through signals at the rhabdomeres. In addition, the localization of hCERT should be explored further. For instance, does hCERT colocalize with RDGB? Is the hCERT localization altered in lace or GMR>CPESRNAi background?

      As indicated in response to reviewer 1:

      We will perform additional IHC experiments to:

      • Co-localize hCERT with an ER-PM MCS marker, e.g., RDGB, in wild-type flies.
      • Co-localize hCERT with VAP-A, which is enriched at the ER-PM MCS. This should help to determine if there are MCS and non-MCS pools of hCERT in these cells.
      • Test if there is a pool of hCERT in these cells that also localizes (or not) with the Golgi marker Golgin 84.

      These will be included in the revision to strengthen this important point.

      We will also attempt to determine hCERT localization in the lace or GMR>CPESRNAi background.

      Minor comments: 1. In Line 128, Df(732) should be Df(3L)BSC732.

      Changes will be incorporated in the main manuscript.

      GMR-SMSrRNAi shows an increase in ERG peak amplitude. Is there an explanation for this?

      GMR-SMSrRNAi did show a slight increase in ERG peak amplitude, but this was not statistically significant.

      Reviewer #2 (Significance (Required)):

      Significance: As CERT mutations are implicated in human learning disability, a better understanding of CERT function in neuronal cells is certainly of interest. While the link between ceramide transport and phospholipase signaling is novel and interesting, this paper does not clearly explain the mechanism. In addition, as the ERGs were measured long after the retinal cells became deficient in CERT or CPES, it is difficult to assess whether the observed phenotype is a primary defect. Furthermore, the quality of some images needs to be improved. Thus, I feel the manuscript in its current form is too preliminary.

      We thank the reviewer for highlighting the importance and significance of our work in the light of recent studies of CERT function in ID. As with all genetic studies, it is difficult to completely disentangle the role of a gene during development from a role only in the adult. However, we will attempt to use the GAL80ts system to uncouple these two potential components of CERT function in photoreceptors. The goal will be to determine if CERT has a specific role only in adult photoreceptors or if this is coupled to a developmental role. Since ID is a neurodevelopmental disorder, a developmental role for CERT would be equally interesting.

      As previously indicated, images will be improved, bearing in mind the reviewer's comments.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Lipid transfer proteins (LTPs) shuttle lipids between organelle membranes at membrane contact sites (MCSs). While extensive biochemical and cell culture studies have elucidated many aspects of LTP function, their in vivo physiological roles are only beginning to be understood. In this manuscript, the authors investigate the physiological role of the ceramide transfer protein (CERT) in Drosophila adult photoreceptors-a model previously employed by this group to study LTP function at ER-PM contact sites under physiological conditions. Using a combination of genetic, biochemical, and physiological approaches, they analyze a protein-null mutant of dcert. They show that loss of dcert causes a reduction in electrical response to light with progressive decrease in electroretinogram (ERG) amplitude with age but no retinal degeneration. Lipidomic analysis shows that while the total levels of ceramides are not changed in dcert mutants, they do observe significant change in certain species of ceramides and depletion of downstream metabolite phosphoethanolamine ceramide (PE-Cer). Using fluorescent biosensors, the authors demonstrate reduced PIP2 levels at the plasma membrane, unchanged basal PI4P levels and slower resynthesis kinetics of both lipids following depletion. Electron microscopy and immunolabeling further reveal a reduced density of ER-PM MCSs and mislocalization of the MCS-resident lipid transfer protein RDGB. Genetic interaction studies with lace and RNAi-mediated knockdown of CPES support the conclusion that both ER ceramide accumulation and PM PE-Cer depletion contribute to the observed defects in dcert mutants. In addition, detergent-resistant membrane fractionation indicates altered plasma membrane organization in the absence of dcert. The study also reports upregulation of unfolded protein response transcripts, including IRE1 and PERK, suggesting increased ER stress. 
Finally, expression of human CERT rescues the reduced electrical response, demonstrating functional conservation across species. Overall, the manuscript is well written, builds on established work, and the experiments are technically rigorous. The results are clearly presented and provide valuable insights into the physiological role of CERT.

      Major comments: 1. The reduced ERG amplitude appears to be the central phenotype associated with the loss of dcert, and most of the experiments in this manuscript effectively build a mechanistic framework to explain this observation. However, the experiments addressing detergent-resistant membrane domains (DRMs) and the unfolded protein response (UPR) seem somewhat disconnected from the main focus of the study. The DRM and UPR data feel peripheral and could benefit from a few experiments establishing a functional link to the ERG defect, or should be moved to the supplementary material.

      We agree with the reviewer that further experiments are needed to link the DRM data to the ERG defects. This would require specific biochemical alterations at the PM to modulate PE-Cer species and to assess their effect on scaffolding proteins required for phototransduction, which is beyond the scope of the present study. We will consider moving these data to the supplementary section as suggested by the reviewer.

      2. The changes in ceramide species and reduction in PE-Cer are key findings of the study. These results should be further validated by performing a genetic rescue using the BAC or hCERT fly line to confirm that the lipidomic changes are specifically due to loss of CERT function.

      Thank you for this comment. We will include this in the revised manuscript.

      3. Figure 2B-C and 2E-F: Representative images corresponding to the quantified data should be included to illustrate the changes in PIP2 and PI4P reporters. Given that the fluorescence intensity of the PIP2 reporter at the PM is reduced in the dcert mutant relative to control, the authors should also verify that the reporter is expressed at comparable levels across genotypes.

      • As suggested by the reviewer, we will include representative images alongside our quantified data, both for the basal measurements and for the kinetic study.
      • Western blot of reporters (PH-PLCd::GFP and P4M::GFP) across genotypes will be added to the revised manuscript.

      4. Figure 2J-K: The partial mislocalization of RDGB represents an important observation that could mechanistically explain the reduced resynthesis of PI4P and PIP2 and consequently, the decreased ERG amplitude in dcert mutants. However, this result requires further validation. First, the authors should confirm whether this mislocalization is specific to RDGB by performing co-staining with another ER-PM MCS marker, such as VAP-A, to assess whether overall MCS organization is disrupted. Second, the quantification of RDGB enrichment at ER-PM MCSs should be refined. From the representative images, RDGB appears redistributed toward the photoreceptor cell body, but the presented quantification does not clearly reflect this shift. The authors should therefore include an analysis comparing RDGB levels in the cell body versus the submicrovillar region across genotypes. This analysis should be repeated for similar experiments across the study. Additionally, the total RDGB protein level should be quantified and reported. Finally, since RDGB mislocalization could directly contribute to the decreased ERG amplitude, it would be valuable to test whether overexpression of RDGB in dcert mutants can rescue the ERG phenotype.

      • In our ultrastructural studies (Fig. 2H, 2I and Sup. Fig. 1A, 1B) we did see a reduction in PM-SMC MCS, which was corroborated by RDGB staining.

      • Comparative ratio analysis of RDGB localisation at ER-PM MCS vs cell body will be included in the manuscript for all RDGB staining.
      • We have performed western blot analysis of total RDGB protein levels in ROR and dcert1. These data will be included in the revised manuscript.
      • This is a very interesting suggestion, and we will test whether RDGB overexpression can rescue the ERG phenotype in dcert1.

      5. Figure 3F and I-J: Inclusion of appropriate WT and laceK05205/+ controls is necessary to allow proper interpretation of the results. These controls would strengthen the conclusions regarding the functional relationship between dcert and lace.

      Changes will be incorporated as per the suggestion.

      6. Figure 5C: The representative images shown here appear to contradict the findings described in Figure 2A. In Figure 5C, Rhodopsin 1 levels seem markedly reduced in the dcert mutants, whereas the text states that Rh1 levels are comparable between control and mutant photoreceptors. The authors should replace or reverify the representative images to ensure that they accurately reflect the conclusions presented in the text.

      We will reverify the representative images and incorporate changes accordingly.

      7. Figure 6D: The reported localization of hCERT to ER-PM MCSs is a key and potentially insightful observation, as it suggests the subcellular site of dcert activity in photoreceptors. However, the representative images provided are not sufficiently conclusive to support this claim. The authors should validate hCERT localization by co-staining with established markers such as RDGB for ER-PM MCSs, CNX99A for the ER, and a Golgi marker, since mammalian CERT is classically localized to ER-Golgi interfaces. Optionally, the authors could also quantify the relative distribution of hCERT among these compartments to provide a clearer assessment of its subcellular localization.

      As indicated in response to reviewer 1:

      We will perform additional IHC experiments to

      • Co-localize hCERT with an ER-PM MCS marker, e.g., RDGB, in wild type flies.
      • Co-localize hCERT with VAP-A, which is enriched at the ER-PM MCS. This should help to determine whether there are MCS and non-MCS pools of hCERT in these cells.
      • Test whether there is a pool of hCERT in these cells that also localizes (or not) with the Golgi marker Golgin 84. These will be included in the revision to strengthen this important point.

      Minor comments: 1. In the first paragraph of the introduction, the authors should consider citing some of the key MCS literature.

      Additional literature will be included as per the suggestion.

      2. Line 132: "data not shown" is not acceptable. The authors should consider presenting the findings in a supplemental figure.

      The data will be added to the supplement as per the suggestion.

      3. The authors should include a comprehensive table or Excel sheet summarizing all statistical analyses. This should include the sample size, type of statistical test used and exact p-values. Providing this information will improve the transparency, reproducibility and overall rigor of the study.

      We will provide all the statistical analyses in the mentioned format as per the suggestion.

      4. The materials and methods section can be reorganized to include citations for fly stocks that do not have a stock number or RRID, if the stocks were previously described but are not available from public repositories. The authors should expand on the details of the various quantification methods used in the study. Finally, including a section on statistical analyses would further enhance transparency and reproducibility.

      • Stock details will be added wherever missing as per the suggestion.
      • A statistical analyses section will be included in the materials and methods.

      **Referee cross-commenting**

      1. I concur with Reviewer 1 regarding the need for more detailed reporting of statistical analyses.

      We will perform multiple-comparison analyses on the data mentioned and incorporate them in the revised manuscript.

      2. I also agree with Reviewer 3 that the discussion should be expanded to address the age-dependent deterioration of ERG amplitude observed in the dcert mutants. This progressive decline could provide valuable insight into the long-term requirement of CERT function and signaling capacity at the photoreceptor membrane.

      An expanded discussion of the age-dependent decline in ERG amplitude will be incorporated as per the suggestion.

      Reviewer #3 (Significance (Required)):

      This study explores the physiological function of CERT, an LTP localized at MCSs in Drosophila photoreceptors, and uncovers a novel role in regulating plasma membrane PE-Cer levels and GPCR-mediated signaling. These findings significantly advance our understanding of how CERT-mediated lipid transport regulates G-protein coupled phospholipase C signaling in vivo. This work also highlights Drosophila photoreceptors as a powerful system to analyze the physiological significance of lipid-dependent signaling processes. This work will be of interest to researchers in the neuronal cell biology, membrane dynamics and lipid signaling communities. This review is based on my expertise in neuronal cell biology.

      We thank the reviewer for appreciating the significance of our work from a neuroscience perspective.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      We can address all reviewer points in the revision. However, we will not be able to perform a mosaic analysis of the impact of the dcert1 mutant in the retina. We feel this is beyond the scope of this revision. In our response, we have highlighted how controls included in the revision offset the need for a mosaic analysis at this stage.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Lipid transfer proteins (LTPs) shuttle lipids between organelle membranes at membrane contact sites (MCSs). While extensive biochemical and cell culture studies have elucidated many aspects of LTP function, their in vivo physiological roles are only beginning to be understood. In this manuscript, the authors investigate the physiological role of the ceramide transfer protein (CERT) in Drosophila adult photoreceptors-a model previously employed by this group to study LTP function at ER-PM contact sites under physiological conditions. Using a combination of genetic, biochemical, and physiological approaches, they analyze a protein-null mutant of dcert. They show that loss of dcert causes a reduction in electrical response to light with progressive decrease in electroretinogram (ERG) amplitude with age but no retinal degeneration. Lipidomic analysis shows that while the total levels of ceramides are not changed in dcert mutants, they do observe significant change in certain species of ceramides and depletion of downstream metabolite phosphoethanolamine ceramide (PE-Cer). Using fluorescent biosensors, the authors demonstrate reduced PIP2 levels at the plasma membrane, unchanged basal PI4P levels and slower resynthesis kinetics of both lipids following depletion. Electron microscopy and immunolabeling further reveal a reduced density of ER-PM MCSs and mislocalization of the MCS-resident lipid transfer protein RDGB. Genetic interaction studies with lace and RNAi-mediated knockdown of CPES support the conclusion that both ER ceramide accumulation and PM PE-Cer depletion contribute to the observed defects in dcert mutants. In addition, detergent-resistant membrane fractionation indicates altered plasma membrane organization in the absence of dcert. The study also reports upregulation of unfolded protein response transcripts, including IRE1 and PERK, suggesting increased ER stress. 
Finally, expression of human CERT rescues the reduced electrical response, demonstrating functional conservation across species. Overall, the manuscript is well written, builds on established work, and the experiments are technically rigorous. The results are clearly presented and provide valuable insights into the physiological role of CERT.

      Major comments:

      1. The reduced ERG amplitude appears to be the central phenotype associated with the loss of dcert, and most of the experiments in this manuscript effectively build a mechanistic framework to explain this observation. However, the experiments addressing detergent-resistant membrane domains (DRMs) and the unfolded protein response (UPR) seem somewhat disconnected from the main focus of the study. The DRM and UPR data feel peripheral and could benefit from a few experiments establishing a functional link to the ERG defect, or should be moved to the supplementary material.
      2. The changes in ceramide species and reduction in PE-Cer are key findings of the study. These results should be further validated by performing a genetic rescue using the BAC or hCERT fly line to confirm that the lipidomic changes are specifically due to loss of CERT function.
      3. Figure 2B-C and 2E-F: Representative images corresponding to the quantified data should be included to illustrate the changes in PIP2 and PI4P reporters. Given that the fluorescence intensity of the PIP2 reporter at the PM is reduced in the dcert mutant relative to control, the authors should also verify that the reporter is expressed at comparable levels across genotypes.
      4. Figure 2J-K: The partial mislocalization of RDGB represents an important observation that could mechanistically explain the reduced resynthesis of PI4P and PIP2 and consequently, the decreased ERG amplitude in dcert mutants. However, this result requires further validation. First, the authors should confirm whether this mislocalization is specific to RDGB by performing co-staining with another ER-PM MCS marker, such as VAP-A, to assess whether overall MCS organization is disrupted. Second, the quantification of RDGB enrichment at ER-PM MCSs should be refined. From the representative images, RDGB appears redistributed toward the photoreceptor cell body, but the presented quantification does not clearly reflect this shift. The authors should therefore include an analysis comparing RDGB levels in the cell body versus the submicrovillar region across genotypes. This analysis should be repeated for similar experiments across the study. Additionally, the total RDGB protein level should be quantified and reported. Finally, since RDGB mislocalization could directly contribute to the decreased ERG amplitude, it would be valuable to test whether overexpression of RDGB in dcert mutants can rescue the ERG phenotype.
      5. Figure 3F and I-J: Inclusion of appropriate WT and laceK05205/+ controls is necessary to allow proper interpretation of the results. These controls would strengthen the conclusions regarding the functional relationship between dcert and lace.
      6. Figure 5C: The representative images shown here appear to contradict the findings described in Figure 2A. In Figure 5C, Rhodopsin 1 levels seem markedly reduced in the dcert mutants, whereas the text states that Rh1 levels are comparable between control and mutant photoreceptors. The authors should replace or reverify the representative images to ensure that they accurately reflect the conclusions presented in the text.
      7. Figure 6D: The reported localization of hCERT to ER-PM MCSs is a key and potentially insightful observation, as it suggests the subcellular site of dcert activity in photoreceptors. However, the representative images provided are not sufficiently conclusive to support this claim. The authors should validate hCERT localization by co-staining with established markers such as RDGB for ER-PM MCSs, CNX99A for the ER, and a Golgi marker, since mammalian CERT is classically localized to ER-Golgi interfaces. Optionally, the authors could also quantify the relative distribution of hCERT among these compartments to provide a clearer assessment of its subcellular localization.

      Minor comments:

      1. In the first paragraph of the introduction, the authors should consider citing some of the key MCS literature.
      2. Line 132: "data not shown" is not acceptable. The authors should consider presenting the findings in a supplemental figure.
      3. The authors should include a comprehensive table or Excel sheet summarizing all statistical analyses. This should include the sample size, type of statistical test used and exact p-values. Providing this information will improve the transparency, reproducibility and overall rigor of the study.
      4. The materials and methods section can be reorganized to include citations for fly stocks that do not have a stock number or RRID, if the stocks were previously described but are not available from public repositories. The authors should expand on the details of the various quantification methods used in the study. Finally, including a section on statistical analyses would further enhance transparency and reproducibility.

      Referee cross-commenting

      1. I concur with Reviewer 1 regarding the need for more detailed reporting of statistical analyses.
      2. I also agree with Reviewer 3 that the discussion should be expanded to address the age-dependent deterioration of ERG amplitude observed in the dcert mutants. This progressive decline could provide valuable insight into the long-term requirement of CERT function and signaling capacity at the photoreceptor membrane.

      Significance

      This study explores the physiological function of CERT, an LTP localized at MCSs in Drosophila photoreceptors, and uncovers a novel role in regulating plasma membrane PE-Cer levels and GPCR-mediated signaling. These findings significantly advance our understanding of how CERT-mediated lipid transport regulates G-protein coupled phospholipase C signaling in vivo. This work also highlights Drosophila photoreceptors as a powerful system to analyze the physiological significance of lipid-dependent signaling processes. This work will be of interest to researchers in the neuronal cell biology, membrane dynamics and lipid signaling communities. This review is based on my expertise in neuronal cell biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Non-vesicular lipid transfer by lipid transfer proteins regulates organelle lipid compositions and functions. CERT transfers ceramide from the ER to Golgi to produce sphingomyelin, although CERT function in animal development and physiology is less clear. Using dcert1 (a protein-null allele), this paper shows that disruption of the sole Drosophila CERT gene causes reduced ERG amplitude in photoreceptors. While the level and localization of phototransduction machinery appear unaffected, the level of PIP2 and the localization of RDGB are perturbed. Collectively, these observations establish a novel link between CERT and phospholipase signaling in phototransduction. To understand the molecular mechanism further, the authors performed lipid chromatography and mass spec to characterize ceramide species in dcert1. This analysis reveals that whereas the total ceramide remains unaffected, most PE-ceramide species are reduced. The authors use a lace mutant (serine palmitoyl transferase) and CPES (ceramide phosphoethanolamine synthase) RNAi to distinguish whether it is the accumulation of ceramide in the ER or the reduction of sphingolipid derivatives in the Golgi that is the cause of the reduced ERG amplitude. Mutating one copy of lace reduces ceramide level by 50% and partially rescues the ERG defect, suggesting that the accumulation of ceramide in the ER is a cause. CPES RNAi phenocopies the reduced ERG amplitude, suggesting that the production of certain sphingolipids is also relevant.

      Major comments:

      1. By showing the reduced PIP2 level, the decreased SMC sites at the base of rhabdomeres, and the diffused RDGB localization in dcert1, the authors favor the model, in which the disruption of ceramide metabolism affects PIP transport. However, it is unclear if the reduced PIP2 level (i.e., reduced PH-PLC::GFP staining) is specific to the rhabdomeres. It should be possible to compare PH-PLC::GFP signals in different plasma membranes between wildtype and dcert1. If PH-PLC::GFP signal is specifically reduced at the rhabdomeres, this conclusion will be greatly strengthened. In addition, the photoreceptor apical plasma membrane includes rhabdomere and stalk membrane. Is the PH-PLC::GFP signal at the stalk membrane also affected?
      2. The analysis of RDGB localization should be done in mosaic dcert1 retinas, which will be more convincing with internal control for each comparison. In addition, the phalloidin staining in Figure 2J shows distinct patterns of adherens junctions, indicating that the wildtype and dcert1 were imaged at different focal planes.
      3. The significance of ceramide species levels in dcert1 and GMR>CPESRNAi needs to be explained better. Do certain alterations represent accumulation of ceramides in the ER?
      4. The suppression by lace is interpreted as evidence that the reduced ERG amplitude in dcert1 is caused by ceramide accumulation in the ER. This interpretation seems preliminary as lace may interact with dcert genetically by other mechanisms.
      5. The authors show that ERG amplitude is reduced in GMR>CPESRNAi. While this phenocopying is consistent with the reduced ERG amplitude in dcert1 being caused by reduced production of PE-ceramide, GMR>CPESRNAi also shows an increase in total ceramide level. Could this support the hypothesis that reduced ERG amplitude is caused by an accumulation of ceramide elsewhere? In addition, is the ERG amplitude reduction in GMR>CPESRNAi sensitive to lace?
      6. Along the same line, while the total ceramide level is significantly reduced in lace heterozygotes, is the PE-ceramide level also reduced? If yes, wouldn't this be contradictory to PE-ceramide production being important for ERG amplitude?
      7. What is the explanation and significance for the age-dependent deterioration of ERG amplitude in dcert1? Likewise, the significance of no retinal degeneration is not clearly presented.
      8. The rescue of dcert1 phenotype by the expression of human CERT is a nice result. In addition to demonstrating a functional conservation, it allows a determination of CERT protein localization. However, the quality of images in Figure 6D should be improved. The phalloidin staining was rather poor, and the CNX99A in the lower panel was over-exposed, generating bleed-through signals at the rhabdomeres. In addition, the localization of hCERT should be explored further. For instance, does hCERT colocalize with RDGB? Is the hCERT localization altered in lace or GMR>CPESRNAi background?

      Minor comments:

      1. In Line 128, Df(732) should be Df(3L)BSC732.
      2. GMR-SMSrRNAi shows an increase in ERG peak amplitude. Is there an explanation for this?

      Significance

      As CERT mutations are implicated in human learning disability, a better understanding of CERT function in neuronal cells is certainly of interest. While the link between ceramide transport and phospholipase signaling is novel and interesting, this paper does not clearly explain the mechanism. In addition, as the ERG were measured long after the retinal cells were deficient in CERT or CPES, it is difficult to assess whether the observed phenotype is a primary defect. Furthermore, the quality of some images needs to be improved. Thus, I feel the manuscript in its current form is too preliminary.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript dissects the physiological function of ceramide transfer protein (CERT) by studying the phenotype of CERT null Drosophila.

      dCERT null animals have a reduced electrical response to light in their photoreceptors, reduced baseline PIP2 accumulation in the cells and delayed re-synthesis of PIP2 and its precursor, PI4P, after light stimulation. There are also reduced ER:PM contact sites at the rhabdomere and a corresponding reduction in the localization of the PI/PA exchange protein, RDGB, at this site. Therefore, the animals seem to have an impaired ability for sustaining phototransduction, which is nonetheless milder than that seen after loss of RDGB, for example. In terms of biochemical function, there is no overall change in ceramides, with some minor increases in specific short chain pools. There is however a large decrease in PE-ceramide species, again selective for a few molecular species. Curiously, decreasing ceramides with a mutant in ceramide synthesis is able to partially rescue both the electrical response and RDGB localization in dCERT flies, implying the increased ceramide species contribute to the phenotype. In addition, a mutation in PE-ceramide synthase largely phenocopies the dCERT null, exhibiting both increased ceramides and decreased PE-ceramide.

      In addition, dCERT flies were shown to have reduced localization of some plasma membrane proteins to detergent-resistant membrane fractions, as well as up regulation of the IRE1 and PERK stress-response pathways. Finally, dCERT nulls could be rescued with the human CERT protein, demonstrating conservation of core physiological function between these animals. Surprisingly, CERT is reported to localize to the ER:PM junctions at rhabdomeres, as opposed to the expected ER:Golgi contact sites.

      Specific areas where the manuscript could be strengthened include:

      Figure 2 studies the phototransduction system. Although clear changes in PI4P and PIP2 are seen, it would be interesting to see whether PA accumulation changes in the dCERT animals, since RDGB localization is disrupted: this is expected to cause PM PA accumulation along with reduced PIP2 synthesis.

      Lines 228-230 state: "These findings suggest an important contribution for reduced PE - Cer levels in the eye phenotypes of dcert". Does it not also suggest a contribution of the elevated ceramide species, since these are also observed in the CPES animals?

      Figure 6D presents a key finding that human CERT localizes to the rhabdomere at ER:PM contact sites, though the reviewer was not convinced by these images. Is the protein truly localized to the contact sites, or is there simply a pool of over-expressed protein localized to the surrounding cytoplasm? It also does not rule out localization (and therefore function) at ER:PM contact sites.

      Statistics: There are a large number of t-tests employed that do not correct for multiple comparisons, for example in figures 3B, 3D, 3H, 4C, 6C, S2A, S2B, S3B and S3C.

      There are two Western blotting sections in the methods.

      Significance

      Overall, the manuscript is clearly and succinctly written, with the data well presented and mostly convincing. The paper demonstrates clear phenotypes associated with loss of dCERT function, with surprising consequences for the function of a signaling system localized to ER:PM contact sites. To this reviewer, there seem to be three cogent observations of the paper: (i) loss of dCERT leads to accumulation of ceramides and loss of PE-ceramide, which together drive the phenotype; (ii) this ceramide alteration disrupts ER:PM contact sites and thus impairs phototransduction; and (iii) rescue by human CERT and its apparent localization to ER:PM contact sites implies a potential novel site of action. Although surprising and novel, the significance of these observations is a little unclear: there is no obvious mechanism by which the elevated ceramide species and decreased PE-ceramide cause the specific failure in phototransduction, and the evidence for a novel site of action of CERT at the ER:PM contact sites is not compelling. Therefore, although an interesting and novel set of observations, the manuscript does not reveal a clear mechanistic basis for CERT physiological function.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      This paper addresses a very interesting problem of non-centrosomal microtubule organization in developing Drosophila oocytes. Using genetics and imaging experiments, the authors reveal an interplay between the activity of kinesin-1, together with its essential cofactor Ensconsin, and microtubule organization at the cell cortex by the spectraplakin Shot, minus-end binding protein Patronin and Ninein, a protein implicated in microtubule minus end anchoring. The authors demonstrate that the loss of Ensconsin affects the cortical accumulation of non-centrosomal microtubule organizing center (ncMTOC) proteins, microtubule length and vesicle motility in the oocyte, and show that this phenotype can be rescued by a constitutively active kinesin-1 mutant, but not by Ensconsin mutants deficient in microtubule or kinesin binding. The functional connection between Ensconsin, kinesin-1 and ncMTOCs is further supported by a rescue experiment with Shot overexpression. Genetics and imaging experiments further implicate Ninein in the same pathway. These data are a clear strength of the paper; they represent a very interesting and useful addition to the field.

      The weaknesses of the study are two-fold. First, the paper seems to lack a clear molecular model, uniting the observed phenomenology with the molecular functions of the studied proteins. Most importantly, it is not clear how kinesin-based plus-end directed transport contributes to cortical localization of ncMTOCs and regulation of microtubule length.

      Second, not all conclusions and interpretations in the paper are supported by the presented data.

      We thank the reviewer for recognizing the impact of this work. In response to the insightful suggestions, we performed extensive new experiments that establish a well-supported cellular and molecular model (Figure 7). The discussion has been restructured to directly link each conclusion to its corresponding experimental evidence, significantly strengthening the manuscript.

      Below is a list of specific comments, outlining the concerns, in the order of appearance in the paper/figures.

      Figure 1. The statement: "Ens loading on MTs in NCs and their subsequent transport by Dynein toward ring canals promotes the spatial enrichment of the Khc activator Ens in the oocyte" is not supported by data. The authors do not demonstrate that Ens is actually transported from the nurse cells to the oocyte while being attached to microtubules. They do show that the intensity of Ensconsin correlates with the intensity of microtubules, that the distribution of Ensconsin depends on its affinity to microtubules and that an Ensconsin pool locally photoactivated in a nurse cell can redistribute to the oocyte (and throughout the nurse cell) by what seems to be diffusion. The provided images suggest that Ensconsin passively diffuses into the oocyte and accumulates there because of higher microtubule density, which depends on dynein. To prove that Ensconsin is indeed transported by dynein in the microtubule-bound form, one would need to measure the residence time of Ensconsin on microtubules and demonstrate that it is longer than the time needed to transport microtubules by dynein into the oocyte; ideally, one would like to see movement of individual microtubules labelled with photoconverted Ensconsin from a nurse cell into the oocyte. Since microtubules are not enriched in the oocyte of the dynein mutant, analysis of Ensconsin intensity in this mutant is not informative and does not reveal the mechanism of Ensconsin accumulation.

      As noted by Reviewer 3, the directional movement of microtubules traveling at ~140 nm/s from nurse cells toward the oocyte through ring canals was previously reported by Lu et al. (2022) using a tagged Ens MT-binding domain reporter line. We have therefore cited this crucial work in the revised version of the manuscript (lines 155-157) and removed the photo-conversion panel.

      Critically, however, our study provides mechanistic insight that was missing from this earlier work: this mechanism is also crucial to enrich MAPs in the oocyte. The fact that Dynein mutants fail to enrich Ensconsin is a crucial piece of evidence: it supports a model of Ensconsin-loaded MT transport (Figure 1D-1F).

      Figure 2. According to the abstract, this figure shows that Ensconsin is "maintained at the oocyte cortex by Ninein". However, the figure doesn't seem to prove it: it shows that oocyte enrichment of Ensconsin is partially dependent on Ninein, but this applies to the whole cell and not just to the cell cortex. Furthermore, it is not clear whether Ninein mutation affects microtubule density, which in turn would affect Ensconsin enrichment, and therefore, it is not clear whether the effect of Ninein loss on Ensconsin distribution is direct or indirect.

      Ninein plays a critical role in Ensconsin enrichment and microtubule organization in the oocyte (new Figure 2, Figure 3, Figure S3). Quantification of total Tubulin signal shows no difference between control and Nin mutant oocytes (new Figure S3 panels A, B). In Nin mutants, we found decreased Ens enrichment in the oocyte, as well as decreased Ens localization on MTs and at the cell cortex (Figure 2E, 2F, and Figure S3C and S3D).

      New quantitative analyses of microtubule orientation at the anterior cortex, where MTs are normally preferentially oriented toward the posterior pole (Parton et al. 2011), demonstrate that Nin mutants exhibit randomized MT orientation compared to wild-type oocytes (new Figure 3C-3E). These findings establish that Ninein (although not essential) favors Ensconsin localization on MTs, Ens enrichment in the oocyte, ncMTOC cortical localization, and more robust MT orientation toward the posterior cortex. They also suggest that Ens levels in the oocyte act as a rheostat to control Khc activation.

      The observation that the aggregates formed by overexpressed Ninein accumulate other proteins, including Ensconsin, supports, though does not prove their interactions. Furthermore, there is absolutely no proof that Ninein aggregates are "ncMTOCs". Unless the authors demonstrate that these aggregates nucleate or anchor microtubules (for example, by detailed imaging of microtubules and EB1 comets), the text and labels in the figure would need to be altered.

      We have modified the manuscript: we now refer to an accumulation of these components in large puncta, rather than aggregates, consistent with previous observations (Rosen et al., 2000). We acknowledge in the revised version that these puncta recruit Shot, Patronin and Ens without mentioning direct interaction (line 218).

      Importantly, we conducted a more detailed characterization of these Ninein/Shot/Patronin/Ens-containing puncta in a new Figure S4. To rigorously assess their nucleation capacity, we analyzed Eb1-GFP-labeled MT comets, a robust readout of MT nucleation (Parton et al., 2011, Nashchekin et al., 2016). A few Eb1-positive comets occasionally emanate from these structures, consistent with their identity as putative ncMTOCs, but these puncta function as surprisingly weak nucleation centers (new Figure S4 E, Video S1), and their presence does not alter overall MT architecture (new Figure S4 F). Moreover, these puncta disappear over time and are barely visible at stage 10B, and they do not impair oocyte development or fertility (Figure S4 G and Table 1).

      Minor comment: Note that a "ratio" (Figure 2C) is just a ratio, and should not be expressed in arbitrary units.

      We have amended this point in all the figures.

      Figure 3B: immunoprecipitation results cannot be interpreted because the immunoprecipitated proteins (GFP, Ens-GFP, Shot-YFP) are not shown. It is also not clear that this biochemical experiment is useful. If the authors would like to suggest that Ensconsin directly binds to Patronin, the interaction would need to be properly mapped at the protein domain level.

      This is a good point: the GFP and Ens-GFP immunoprecipitated proteins are now much more clearly identified on the blots and in the figure legend (new Figure 4G). Shot-YFP, which was used as a positive control for the IP, is difficult to detect by Western blot using conventional acrylamide gels due to its large size (>10^6 Da) (Nashchekin et al., 2016).

      We now explicitly state that immunoprecipitations were performed at 4°C, where microtubules are fully depolymerized, thereby excluding indirect microtubule-mediated interactions. We agree with this reviewer: we cannot formally rule out interactions through bridging by other protein components. This is stated in the revised manuscript (lines 238-239).

      One of the major phenotypes observed by the authors in the Ens mutant is the loss of long microtubules. The authors make strong conclusions about the independence of this phenotype from the parameters of microtubule plus-end growth, but in fact, the quality of their data does not allow such a conclusion, because they only measured the number of EB1 comets and their growth rate but not the catastrophe, rescue or pausing frequency. Note that kinesin-1 has been implicated in promoting microtubule damage and rescue (doi: 10.1016/j.devcel.2021). In the absence of such measurements, one cannot conclude whether short microtubules arise through defects in the minus-end, plus-end or microtubule shaft regulation pathways.

      We thank the reviewer for raising this important point. Our data demonstrate that microtubule (MT) nucleation and polymerization rates remain unaffected under Khc RNAi and ens mutant conditions, indicating that MT dynamics alterations must arise through alternative mechanisms.

      As the reviewer suggested, recent studies on Kinesin activity and MT network regulation are indeed highly relevant. Two key studies from the Verhey and Aumeier laboratories examined Kinesin-1 gain-of-function conditions and revealed that constitutively active Kinesin-1 induces MT lattice damage (Budaitis et al., 2022). While damaged MTs can undergo self-repair, Aumeier and colleagues demonstrated that GTP-tubulin incorporation generates "rescue shafts" that promote MT rescue events (Andreu-Carbo et al., 2022). Extrapolating from these findings, loss of Kinesin-1 activity could plausibly reduce rescue shaft formation, thereby decreasing MT rescue frequency and stability. Although this hypothesis is challenging to test directly in our system, it provides a mechanistic framework for the observed reduction in MT number and stability.

      Additionally, the reviewer highlighted the role of Khc in transporting the dynactin complex, an anti-catastrophe factor, to MT plus ends (Nieuwburg et al., 2017), which could further contribute to MT stabilization. This crucial reference is now incorporated into the revised Discussion.

      Importantly, our work also demonstrates the contribution of Ens/Khc to ncMTOC targeting to the cell cortex. Our new quantitative analyses of MT organization (new Figure 5 B) reveal a defective anteroposterior orientation of cortical MTs in mutant conditions, pointing to a critical role for cortical ncMTOCs in organizing the MT network.

      Taken together, we propose that the observed MT reduction and disorganization result from multiple interconnected mechanisms: (1) reduced rescue shaft formation affecting MT stability; (2) impaired transport of anti-catastrophe factors to MT plus ends; and (3) loss of cortical ncMTOCs, which are essential for minus-end MT stabilization and network organization. The Discussion has been revised to reflect this integrated model in a dedicated paragraph (“A possible regulation of MT dynamics in the oocyte at both plus end minus MT ends by Ens and Khc”, lines 415-432).

      It is important to note that a spectraplakin, like Shot, can potentially affect different pathways, particularly when overexpressed.

      We agree that Shot harbors multiple functional domains and acts as a key organizer of both actin and microtubule cytoskeletons. Overexpression of such a cytoskeletal cross-linker could indeed perturb both networks, making interpretation of Ens phenotype rescue challenging due to potential indirect effects.

      To address this concern, we selected an appropriate Shot isoform for our rescue experiments that displayed similar localization to “endogenous” Shot-YFP (a genomic construct harboring shot regulatory sequences) and importantly that was not overexpressed.

      Elevated expression of the Shot.L(A) isoform (see Western Blot Figure S8 A), considered as the wild-type form with two CH1 and CH2 actin-binding motifs (Lee and Kolodziej, 2002), showed abnormal localization such as strong binding to the microtubules in nurse cells and oocyte confirming the risk of gain-of-function artifacts and inappropriate conclusions (Figure S8 B, arrows).

      By contrast, our rescue experiments using the Shot.L(C) isoform (that only harbors the CH2 motif) provide strong evidence against such artifacts for two reasons. First, Shot-L(C) is expressed at slightly lower levels than a Shot-YFP genomic construct (not overexpressed), and at much lower levels than Shot-L(A), despite using the same driver (Figure S8 A). Second, Shot-L(C) localization in the oocyte is similar to that of endogenous Shot-YFP, concentrating at the cell cortex (Figure S8 B, compare lower and top panels). Taken together, these controls suggest that our rescue with Shot-L(C) is specific.

      Note that this Shot-L(C) isoform is sufficient to complement the absence of the shot gene in other cell contexts (Lee and Kolodziej, 2002).

      Unjustified conclusions should be removed: the authors do not provide sufficient data to conclude that "ens and Khc oocytes MT organizational defects are caused by decreased ncMTOC cortical anchoring", because the actual cortical microtubule anchoring was not measured.

      This is a valid point. We acknowledge that we did not directly measure microtubule anchoring in this study. In response, we have revised the discussion to more accurately reflect our observations. Throughout the manuscript, we now refer to "cortical microtubule organization" rather than "cortical microtubule anchoring," which better aligns with the data presented.

      Minor comment: Microtubule growth velocity must be expressed in units of length per time, to enable evaluating the quality of the data, and not as a normalized value.

      This is now amended in the revised version (modified Figure S7).

      A significant part of the Discussion is dedicated to the potential role of Ensconsin in cortical microtubule anchoring and potential transport of ncMTOCs by kinesin. It is obviously fine that the authors discuss different theories, but it would be very helpful if the authors would first state what has been directly measured and established by their data, and what are the putative, currently speculative explanations of these data.

      We have carefully considered the reviewer's constructive comments and are confident that this revised version fully addresses their concerns.

      First, we have substantially strengthened the connection between the Results and Discussion sections, ensuring that our interpretations are more directly anchored in the experimental data. This restructuring significantly improves the overall clarity and logical flow of the manuscript.

      Second, we have added a new comprehensive figure presenting a molecular-scale model of Kinesin-1 activation upon release of autoinhibition by Ensconsin (new Figure 7D). Critically, this figure also illustrates our proposed positive feedback loop mechanism: Khc-dependent cytoplasmic advection promotes cortical recruitment of additional ncMTOCs, which generates new cortical microtubules and further accelerates cytoplasmic transport (Figure 7 A-C). This self-amplifying cycle provides a mechanistic framework consistent with emerging evidence that cytoplasmic flows are essential for efficient intracellular transport in both insect and mammalian oocytes.

      Minor comment: The writing and particularly the grammar need to be significantly improved throughout, which should be very easy with current language tools. Examples: "ncMTOCs recruitment" should be "ncMTOC recruitment"; "Vesicles speed" should be "Vesicle speed", "Nin oocytes harbored a WT growth,"- unclear what this means, etc. Many paragraphs are very long and difficult to read. Making shorter paragraphs would make the authors' line of thought more accessible to the reader.

      We have amended and shortened the manuscript according to this reviewer's feedback. We have specifically built more focused paragraphs to facilitate reading.

      Significance

      This paper represents a significant advance in understanding non-centrosomal microtubule organization in general and in developing Drosophila oocytes in particular by connecting the microtubule minus-end regulation pathway to the Kinesin-1 and Ensconsin/MAP7-dependent transport. The genetics and imaging data are of good quality and are appropriately presented and quantified. These are clear strengths of the study which will make it interesting to researchers studying the cytoskeleton, microtubule-associated proteins and motors, and fly development.

      The weaknesses of this study are due to the lack of clarity of the overall molecular model, which would limit the impact of the study on the field. Some interpretations are not sufficiently supported by data, but this can be solved by more precise and careful writing, without extensive additional experimentation.

      We thank the reviewer for raising these important concerns regarding clarity and data interpretation. We have thoroughly revised the manuscript to address these issues on multiple fronts. First, we have substantially rewritten key sections to ensure that our conclusions are clearly articulated and directly supported by the data. Second, we have performed several new experiments that now allow us to propose a robust mechanistic model, presented in new figures. These additions significantly strengthen the manuscript and directly address the reviewer's concerns.

      My expertise is cell biology and biochemistry of the microtubule cytoskeleton, including both microtubule-associated proteins and microtubule motors.

      Reviewer #2

      Evidence, reproducibility and clarity

      In this manuscript, Berisha et al. investigate how microtubule (MT) organization is spatially regulated during Drosophila oogenesis. The authors identify a mechanism in which the Kinesin-1 activator Ensconsin/MAP7 is transported by dynein and anchored at the oocyte cortex via Ninein, enabling localized activation of Kinesin-1. Disruption of this pathway impairs ncMTOC recruitment and MT anchoring at the cortex. The authors combine genetic manipulation with high-resolution microscopy and use three key readouts to assess MT organization during mid-to-late oogenesis: cortical MT formation, localization of posterior determinants, and ooplasmic streaming. Notably, Kinesin-1, in concert with its activator Ens/MAP7, contributes to organizing the microtubule network it travels along. Overall, the study presents interesting findings, though we have several concerns we would like the authors to address.

      Ensconsin enrichment in the oocyte

      1. Enrichment in the oocyte: Ensconsin is a MAP that binds MTs. Given that microtubule density in the oocyte significantly exceeds that in the nurse cells, its enrichment may passively reflect this difference. To assess whether the enrichment is specific, could the authors express a non-Drosophila MAP (e.g., mammalian MAP1B) to determine whether it also preferentially localizes to the oocyte?

      To address this point, we performed a new series of experiments analyzing the enrichment of other Drosophila and non-Drosophila MAPs, including Jupiter-GFP, Eb1-GFP, and bovine Tau-GFP, all widely used markers of the microtubule cytoskeleton in flies (see new Figure S2). Our results reveal that Jupiter-GFP, Eb1-GFP, and bovine Tau-GFP all exhibit significantly weaker enrichment in the oocyte compared to Ens-GFP. Khc-GFP also shows lower enrichment. These findings indicate that MAP enrichment in the oocyte is MAP-dependent, rather than solely reflecting microtubule density or organization. Of note, we cannot exclude that microtubule post-translational modifications contribute to differential MAP binding between nurse cells and the oocyte, but this remains a question for future investigation.

      The ability of ens-wt and ens-LowMT to induce tubulin polymerization according to the light scattering data (Fig. S1J) is minimal and does not reflect dramatic differences in localization. The authors should verify that, in all cases, the polymerization product in their in vitro assays is microtubules rather than other light-scattering aggregates. What is the control in these experiments? If it is just purified tubulin, it should not form polymers at physiological concentrations.

      The critical concentration Cr for microtubule self-assembly in classical BRB80 buffer found by us and others is around 20 µM (see Fig. 2c in Weiss et al., 2010). Here, microtubules were assembled at a tubulin concentration of 40 µM, i.e., well above the Cr. As stated in the materials and methods section, we systematically induced cooling at 4°C after assembly to assess the presence of aggregates, since those do not fall apart upon cooling. The decrease in optical density (OD) upon cooling is a direct control that the initial increase in OD is due to the formation of microtubules. Finally, aggregation and polymerization curves are widely different, the former displaying an exponential shape and the latter a sigmoid assembly phase (see Fig. 3A and 3B in Weiss et al., 2010).

      Photoconversion caveats

      MAPs are known to dynamically associate and dissociate from microtubules. Therefore, interpretation of the Ens photoconversion data should be made with caution. The expanding red signal from the nurse cells to the oocyte may reflect any combination of dynein-mediated MT transport and passive diffusion of unbound Ensconsin. Notably, photoconversion of a soluble protein in the nurse cells would also result in a gradual increase in red signal in the oocyte, independent of active transport. We encourage the authors to more thoroughly discuss these caveats. It may also help to present the green and red channels side by side rather than as merged images, to allow readers to better assess signal movement and spatial patterns.

      This is a valid point that mirrors the comment of Reviewers 1 and 3. The directional movement of microtubules traveling at ~140 nm/s from nurse cells toward the oocyte via the ring canals was previously reported by Lu et al. (2022) with excellent spatial resolution. Notably, this MT transport was measured using a fusion protein containing the Ens MT-binding domain. We now cite this relevant study in our revised manuscript and have removed this redundant panel in Figure 1.

      Reduction of Shot at the anterior cortex

      Shot is known to bind strongly to F-actin, and in the Drosophila ovary, its localization typically correlates more closely with F-actin structures than with microtubules, despite being an MT-actin crosslinker. Therefore, the observed reduction of cortical Shot in ens, nin mutants, and Khc-RNAi oocytes is unexpected. It would be important to determine whether cortical F-actin is also disrupted in these conditions, which should be straightforward to assess via phalloidin staining.

      As requested by the reviewer, we performed actin staining experiments, which are now presented in a new Figure S5. These data demonstrate that the cortical actin network remains intact in all mutant backgrounds analyzed, ruling out any indirect effect of actin cytoskeleton disruption on the observed phenotypes.

      MTs are barely visible in Fig. 3A, which is meant to demonstrate Ens-GFP colocalization with tubulin. Higher-quality images are needed.

      The revised version now provides significantly improved images to show the different components examined. Our data show that Ens and Ninein localize at the cell cortex where they co-localize with Shot and Patronin (Figure 2 A-C). In addition, novel images show that Ens extends along microtubules (new Figure 4 A).

      MT gradient in stage 9 oocytes

      In ens-/-, nin-/-, and Khc-RNAi oocytes, is there any global defect in the stage 9 microtubule gradient? This information would help clarify the extent to which cortical localization defects reflect broader disruptions in microtubule polarity.

      We now provide quantitative analysis of microtubule (MT) array organization in novel figures (Figure 3D and Figure 5B). Our data reveal that both Khc RNAi and ens mutant oocytes exhibit severe disruption of MT orientation toward the posterior (new Figure 5B). Importantly, this defect is significantly less pronounced in Nin-/- oocytes, which retain residual ncMTOCs at the cortex (new Figure 3D). This differential phenotype supports our model that cortical ncMTOCs are critical for maintaining proper MT orientation toward the posterior side of the oocyte.

      Role of Ninein in cortical anchoring

      The requirement for Ninein in cortical anchorage is the least convincing aspect of the manuscript and somewhat disrupts the narrative flow. First, it is unclear whether Ninein exhibits the same oocyte-enriched localization pattern as Ensconsin. Is Ninein detectable in nurse cells? Second, the Ninein antibody signal appears concentrated in a small area of the anterior-lateral oocyte cortex (Fig. 2A), yet Ninein loss leads to reduced Shot signal along a much larger portion of the anterior cortex (Fig. 2F), a spatial mismatch that weakens the proposed functional relationship. Third, Ninein overexpression results in cortical aggregates that co-localize with Shot, Patronin, and Ensconsin. Are these aggregates functional ncMTOCs? Do microtubules emanate from these foci?

      We now provide a more comprehensive analysis of Ninein localization. Similar to Ensconsin (Ens), endogenous Ninein is enriched in the oocyte during the early stages of oocyte development but is also detected in NCs (see modified Figure 2 A and Lasko et al., 2016). Improved imaging of Ninein further shows that the protein partially co-localizes with Ens, and ncMTOCs at the anterior cortex and with Ens-bound MTs (Figure 2B, 2C).

      Importantly, loss of Ninein (Nin) only partially reduces the enrichment of Ens in the oocyte (Figure 2E). Both Ens and Kinesin heavy chain (Khc) remain partially functional and continue to target non-centrosomal microtubule-organizing centers (ncMTOCs) to the cortex (Figure 3A). In Nin-/- mutants, a subset of long cortical microtubules (MTs) is present, thereby generating cytoplasmic streaming, although less efficiently than under wild-type (WT) conditions (Figure 3F and 3G). Because Nin is a non-essential gene, we envisage Ninein as a facilitator of MT organization during oocyte development.

      Finally, our new analyses demonstrate that the large puncta containing Ninein, Shot, Patronin, and Ens, despite their size, appear to be relatively weak nucleation centers (revised Figure S4 E and Video 1). In addition, their presence neither biases overall MT architecture (Figure S4 F) nor impairs oocyte development and fertility (Figure S4 G and Table 1).

      Inconsistency of Khc^MutEns rescue

      The Khc^MutEns variant partially rescues cortical MT formation and restores a slow but measurable cytoplasmic flow, yet it fails to rescue Staufen localization (Fig. 5). This raises questions about the consistency and completeness of the rescue. Could the authors clarify this discrepancy or propose a mechanistic rationale?

      This is a good point. The cytoplasmic flows (the consequence of cargo transport by Khc on MTs) generated by a constitutively active KhcMutEns in an ens mutant condition, are less efficient than those driven by Khc activated by Ens in a control condition (Figure 6C). The rescued flow is probably not efficient enough to completely rescue the Staufen localization at stage 10.

      Additionally, this KhcMutEns variant rescues the viability of embryos from Khc27 mutant germline clone oocytes but not from ens mutants (Table 1). One hypothesis is that Ens harbors additional functions beyond Khc activation.

      This incomplete rescue of ens mutants by an active Khc variant could also be the consequence of the “paradox of co-dependence”: Kinesin-1 also transports the antagonizing motor Dynein, which promotes cargo transport in the opposite direction (Hancock et al., 2016). The phenotype of a gain-of-function variant is therefore complex to interpret. Consistent with this, both KhcMutEns-GFP and KhcDhinge2, two active Khc variants, only partially rescue centrosome transport in ens mutant Neural Stem Cells (Figure S10).

      Minor points: 1. The pUbi-attB-Khc-GFP vector was used to generate the Khc^MutEns transgenic line, presumably under control of the ubiquitous ubi promoter. Could the authors specify which attP landing site was used? Additionally, are the transgenic flies viable and fertile, given that Kinesin-1 is hyperactive in this construct?

      All transgenic constructs were integrated at defined genomic landing sites to ensure controlled expression levels. Specifically, both GFP-tagged KhcWT and KhcMutEns were inserted at the VK05 (attP9A) site using PhiC31-mediated integration. Full details of the landing sites are provided in the Materials and Methods section. Both transgenic flies are homozygous lethal and the transgenes are maintained over TM6B balancers.

      On page 11 (Discussion, section titled "A dual Ensconsin oocyte enrichment mechanism achieves spatial relief of Khc inhibition"), the statement "many mutations in Kif5A are causal of human diseases" would benefit from a brief clarification. Since not all readers may be familiar with kinesin gene nomenclature, please indicate that KIF5A is one of the three human homologs of Kinesin heavy chain.

      We clarified this point in the revised version (lines 465-466).

      On page 16 (Materials and Methods, "Immunofluorescence in fly ovaries"), the sentence "Ovaries were mounted on a slide with ProlonGold medium with DAPI (Invitrogen)" should be corrected to "ProLong Gold."

      This is corrected.

      Significance

      This study shows that enrichment of MAP7/ensconsin in the oocyte is the mechanism of kinesin-1 activation there and is important for cytoplasmic streaming and localization of non-centrosomal microtubule-organizing centers to the oocyte cortex.

      We thank the reviewers for their careful review of our manuscript and their positive feedback.

      Reviewer #3

      Evidence, reproducibility and clarity

      The manuscript of Berisha et al. investigates the role of Ensconsin (Ens), Kinesin-1 and Ninein in the organisation of microtubules (MT) in the Drosophila oocyte. In stage 9 oocytes, Kinesin-1 transports oskar mRNA, a posterior determinant, along MT that are organised by ncMTOCs. At stage 10b, Kinesin-1 induces cytoplasmic advection to mix the contents of the oocyte. Ensconsin/Map7 is a MT associated protein (MAP) that uses its MT-binding domain (MBD) and kinesin binding domain (KBD) to recruit Kinesin-1 to the microtubules and to stimulate the motility of MT-bound Kinesin-1. Using various new Ens transgenes, the authors demonstrate the requirement of Ens MBD and Ninein in Ens localisation to the oocyte where Ens activates Kinesin-1 using its KBD. The authors also claim that Ens, Kinesin-1 and Ninein are required for the accumulation of ncMTOCs at the oocyte cortex and argue that the detachment of the ncMTOCs from the cortex accounts for the reduced localisation of oskar mRNA at stage 9 and the lack of cytoplasmic streaming at stage 10b. Although the manuscript contains several interesting observations, the authors' conclusions are not sufficiently supported by their data. The structure function analysis of Ensconsin (Ens) is potentially publishable, but the conclusions on ncMTOC anchoring and cytoplasmic streaming are not convincing.

      We are grateful that the regulation of Khc activity by MAP7 was well received by all reviewers. While our study focuses on Drosophila oogenesis, we believe this mechanism may have broader implications for understanding kinesin regulation across biological systems.

      For the novel function of the MAP7/Khc complex in organizing its own microtubule networks through ncMTOC recruitment, we have carefully considered the reviewers' constructive recommendations. We now provide additional experimental evidence supporting a model of flux self-amplification in which ncMTOC recruitment plays a key role. It is well established that cytoplasmic flows are essential for posterior localization of cell fate determinants at stage 10B. Slow flows have also been described at earlier oogenesis stages by the groups of Saxton and St Johnston. Building on these early publications and our new experiments, we propose that these flows are essential to promote a positive feedback loop that reinforces ncMTOC recruitment and MT organization (Figure 7).

      1) The main conclusion of the manuscript is that "MT advection failure in Khc and ens in late oogenesis stems from defective cortical ncMTOCs recruitment". This completely overlooks the abundant evidence that Kinesin-1 directly drives cytoplasmic streaming by transporting vesicles and microtubules along microtubules, which then move the cytoplasm by advection (Palacios et al., 2002; Serbus et al, 2005; Lu et al, 2016). Since Kinesin-1 generates the flows, one cannot conclude that the effect of khc and ens mutants on cortical ncMTOC positioning has any direct effect on these flows, which do not occur in these mutants.

      We regret the lack of clarity in the first version of the manuscript and some missing references. We propose a model in which the Kinesin-1-dependent slow flows (described by Serbus/Saxton and Palacios/St Johnston) play a central role in amplifying ncMTOC anchoring and cortical MT network formation (see model in the new Figure 7).

      2) The authors claim that streaming phenotypes of ens and khc mutants are due to a decrease in microtubule length caused by the defective localisation of ncMTOCs. In addition to the problem raised above, I am not convinced that they can make accurate measurements of microtubule length from confocal images like those shown in Figure 4. Firstly, they are measuring the length of bundles of microtubules and cannot resolve individual microtubules. This problem is compounded by the fact that the microtubules do not align into parallel bundles in the mutants. This will make the "microtubules" appear shorter in the mutants. In addition, the alignment of the microtubules in wild-type allows one to choose images in which the microtubules lie in the imaging plane, whereas the more disorganized arrangement of the microtubules in the mutants means that most microtubules will cross the imaging plane, which precludes accurate measurements of their length.

      As mentioned by Reviewer 4, we have been transparent about the methodology and its limitations, which are fully described in the Materials and Methods section.

      Cortical microtubules in oocytes are highly dynamic and move rapidly, making it technically impossible to capture their entire length using standard Z-stack acquisitions. We therefore adopted a compromise approach: measuring microtubules within a single focal plane positioned just below the oocyte cortex. This strategy is consistent with established methods in the field, such as those used by Parton et al. (2011) to track microtubule plus-end directionality. To avoid overinterpretation, we explicitly refer to these measurements as "minimum detectable MT length," acknowledging that microtubules may extend beyond the focal plane, particularly at stage 10, where long, tortuous bundles frequently exit the plane of focus. These methodological considerations and potential biases are clearly described in the Materials and Methods section, and the text now mentions the possible disorganization of the MT network in the mutant conditions (lines 272-273).

      In this revised version, we now provide complementary analyses of MT network organization. Beyond length measurements (and the mentioned limitations), we also quantified microtubule network orientation at stage 9, assessing whether cortical microtubules are preferentially oriented toward the posterior axis as observed in controls (revised Figure 3D and Figure 5B). While this analysis is subject to the same technical limitations, it reveals a clear biological difference: microtubules exhibit posterior-biased orientation in control oocytes, similar to a previous study (Parton et al., 2011), but adopt a randomized orientation in Nin-/-, ens, and Khc RNAi-depleted oocytes (revised Figure 3D and Figure 5B).

      Taken together, these complementary approaches, despite their technical constraints, provide convergent evidence for the role of the Khc/Ens complex in organizing cortical microtubule networks during oogenesis.

      3) "To investigate whether the presence of these short microtubules in ens and Khc RNAi oocytes is due to defects in microtubule anchoring or is also associated with a decrease in microtubule polymerization at their plus ends, we quantified the velocity and number of EB1comets, which label growing microtubule plus ends (Figure S3)." I do not understand how the anchoring or not of microtubule minus ends to the cortex determines how far their plus ends grow, and these measurements fall short of showing that plus end growth is unaffected. It has already been shown that the Kinesin-1-dependent transport of Dynactin to growing microtubule plus ends increases the length of microtubules in the oocyte because Dynactin acts as an anti-catastrophe factor at the plus ends. Thus, khc mutants should have shorter microtubules independently of any effects on ncMTOC anchoring. The measurements of EB1 comet speed and frequency in FigS2 will not detect this change and are not relevant for their claims about microtubule length. Furthermore, the authors measured EB1 comets at stage 9 (where they did not observe short MT) rather than at stage 10b. The authors' argument would be better supported if they performed the measurements at stage 10b.

      We thank the reviewer for raising this important point. The short microtubule (MT) length observed at stage 10B could indeed result from limited plus-end growth. Unfortunately, we were unable to test this hypothesis directly: strong endogenous yolk autofluorescence at this stage prevented reliable detection of Eb1-GFP comets, precluding velocity measurements.

      At least during stage 9, our data demonstrate that MT nucleation and polymerization rates are not reduced in either Khc RNAi or ens mutant conditions, indicating that the observed MT alterations must arise through alternative mechanisms.

      In the discussion, we propose the following interconnected explanations, supported by recent literature and the reviewers’ suggestions:

      1- Reduced MT rescue events. Two seminal studies from the Verhey and Aumeier laboratories have shown that constitutively active Kinesin-1 induces MT lattice damage (Budaitis et al., 2022), which can be repaired through GTP-tubulin incorporation into "rescue shafts" that promote MT rescue (Andreu-Carbo et al., 2022). Extrapolating from these findings, loss of Kinesin-1 activity could plausibly reduce rescue shaft formation, thereby decreasing MT stability. While challenging to test directly in our system, this mechanism provides a plausible framework for the observed phenotype.

      2- Impaired transport of stabilizing factors. As the reviewer astutely points out, Khc transports the dynactin complex, an anti-catastrophe factor, to MT plus ends (Nieuwburg et al., 2017). Loss of this transport could further compromise MT plus-end stability. We now discuss this important mechanism in the revised manuscript.

      3- Loss of cortical ncMTOCs. Critically, our new quantitative analyses (revised Figure 3 and Figure 5) also reveal defective anteroposterior orientation of cortical MTs in mutant conditions. These experiments suggest that Ens/Khc-mediated localization of ncMTOCs to the cortex is essential for proper MT network organization, and possibly minus-end stabilization as suggested in several studies (Feng et al., 2019, Goodwin and Vale, 2011, Nashchekin et al., 2016).

      Altogether, we now propose an integrated model in which MT reduction and disorganization may result from multiple complementary mechanisms operating downstream of Kinesin-1/Ensconsin loss. While some aspects remain difficult to test directly in our in vivo system, the convergence of our data with recent mechanistic studies provides an interesting conceptual framework. The Discussion has been revised to reflect this comprehensive view in a dedicated paragraph (“A possible regulation of MT dynamics in the oocyte at both plus end minus MT ends by Ens and Khc”, lines 415-432).

      4) The Shot overexpression experiments presented in Fig.3 E-F, Fig.4D and TableS1 are very confusing. Originally, the authors used Shot-GFP overexpression at stage 9 to show that there is a decrease of ncMTOCs at the cortex in ens mutants (Fig.3 E-F) and speculated that this caused the defects in MT length and cytoplasmic advection at stage 10B. However the authors later state on page 8 that: "Shot overexpression (Shot OE) was sufficient to rescue the presence of long cortical MTs and ooplasmic advection in most ens oocytes (9/14), resembling the patterns observed in controls (Figures 4B right panel and 4D). Moreover, while ens females were fully sterile, overexpression of Shot was sufficient to restore that loss of fertility (Table S1)". Is this the same UAS Shot-GFP and VP16 Gal4 used in both experiments? If so, this contradiction puts the authors' conclusions in question.

      This is an important point that requires clarification regarding our experimental design.

      The Shot-YFP construct is a genomic insertion on chromosome 3. The ens mutation is also located on chromosome 3 and we were unable to recombine this transgene with the ens mutant for live quantification of cortical Shot. To circumvent this technical limitation, we used a UAS-Shot.L(C)-GFP transgenic construct driven by a maternal driver, expressed in both wild-type (control) and ens mutant oocytes. We validated that the expression level and subcellular localization of UAS-Shot.L(C)-GFP were comparable to those of the genomic Shot-YFP (new Figure S8 A and B).

      From these experiments, we drew two key conclusions. First, cortical Shot.L(C)-GFP is less abundant in ens mutant oocytes compared to wild-type (the quantification has been removed from this version). Second, despite this reduced cortical accumulation, Shot.L(C)-GFP expression partially rescues ooplasmic flows and microtubule streaming in stage 10B ens mutant oocytes, and restores fertility to ens mutant females.

      5) The authors based their conclusions about the involvement of Ens, Kinesin-1 and Ninein in ncMTOC anchoring on the decrease in cortical fluorescence intensity of Shot-YFP and Patronin-YFP in the corresponding mutant backgrounds. However, there is a large variation in average Shot-YFP intensity between control oocytes in different experiments. In Fig. 2F-G the average level of Shot-YFP in the control is 130 AU while in Fig.3 G-H it is only 55 AU. This makes me worry about the reliability of such measurements and the conclusions drawn from them.

      To clarify this point, we have harmonized the method used to quantify the Shot-YFP signals in Figure 4E with the methodology used in Figure 3B, based on the original images. The levels are not strictly identical (Control Figure 2 B: 132.7+/-36.2 versus Control Figure 4 E: 164.0+/- 37.7). Such differences are common when experiments are performed several months apart and by different users.

      6) The decrease in the intensity of Shot-YFP and Patronin-YFP cortical fluorescence in ens mutant oocytes could be because of problems with ncMTOC anchoring or with ncMTOCs formation. The authors should find a way to distinguish between these two possibilities. The authors could express Ens-Mut (described in Sung et al 2008), which localises at the oocyte posterior and test whether it recruits Shot/Patronin ncMTOCs to the posterior.

      We tried to obtain the fly stocks described in the 2008 paper by contacting former members of Pernille Rørth's laboratory. Unfortunately, we learned that the lab no longer exists and that all reagents, including the requested stocks, were either discarded or lost over time. To our knowledge, these materials are no longer available from any source. We regret that this limitation prevented us from performing the straightforward experiments suggested by the reviewer using these specific tools.

      7) According to the Materials and Methods, the Shot-GFP used in Fig.3 E-F and Fig.4 was the BDSC line 29042. This is Shot L(C), a full-length version of Shot missing the CH1 actin-binding domain that is crucial for Shot anchoring to the cortex. If the authors indeed used this version of Shot-GFP, the interpretation of the above experiments is very difficult.

      The Shot.L(C) isoform lacks the CH1 domain but retains the CH2 actin-binding motif. Truncated proteins carrying this domain and fused to GST retain a weak ability to bind actin in vitro. Importantly, the function of this isoform is context-dependent: it cannot rescue shot loss-of-function in neuron morphogenesis but fully restores Shot-dependent tracheal cell remodeling (Lee and Kolodziej, 2002).

      In our experiments, when the Shot.L(C) isoform was expressed under the control of a maternal driver, its localization to the oocyte cortex was comparable to that of the genomic Shot-YFP construct (new Figure S8). This demonstrates unambiguously that the CH1 domain is dispensable for Shot cortical localization in oocytes, and that CH2-mediated actin binding is sufficient for this localization. Of note, a recent study showed that actin networks are not equivalent, highlighting the need for specific Shot isoforms harboring specialized actin-binding domains (Nashchekin et al., 2024).

      We note that the expression level of Shot.L(C)-GFP in the oocyte appeared slightly lower than that of Shot-YFP (expressed under endogenous Shot regulatory sequences), as assessed by Western blot (Figure S8 A).

      Critically, Shot.L(C)-GFP expression was substantially lower than that of Shot.L(A)-GFP (which harbors both the CH1 and CH2 domains). Shot.L(A)-GFP was overexpressed (Figure 8 A) and ectopically localized on MTs in both nurse cells and the ooplasm (Figure S8 B middle panel and arrow). These observations indicate that the Shot.L(C)-GFP rescue experiment was performed at near-physiological expression levels, strengthening the validity of our conclusions.

      8) Page 6 "converted in NCs, in a region adjacent to the ring canals, Dendra-Ens-labeled MTs were found in the oocyte compartment indicating they are able to travel from NC toward the oocyte through ring canals". I have difficulty seeing the translocation of MT through the ring canals. Perhaps it would be more obvious with a movie/picture showing only one channel. Considering that Dendra-Ens appears in the oocyte much faster than MT transport through ring canals (140nm/s, Lu et al 2022), the authors are most probably observing the translocation of free Ens rather than Ens bound to MT. The authors should also mention that Ens movement from the NC to the oocyte has been shown before with Ens MBD in Lu et al 2022 with better resolution.

      We fully agree with the caveat raised by the reviewer: we may be observing the translocation of free Dendra-Ensconsin. We have therefore removed this inconclusive experiment and instead refer to the work of the Gelfand lab. The movement of MTs travelling at ~140 nm/s from nurse cells toward the oocyte through the ring canals was previously reported by Lu et al. (2022) with very good resolution. Notably, this directed movement of MTs was measured using a fusion protein encompassing the Ens MT-binding domain.

      9) Page 6: The co-localization of Ninein with Ens and Shot at the oocyte cortex (Figure 2A). I have difficulty seeing this co-localisation. Perhaps it would be more obvious in merged images of only two channels and with higher resolution images

      10) "a pool of the Ens-GFP co-localized with Ch-Patronin at cortical ncMTOCs at the anterior cortex (Figure 3A)". I also have difficulty seeing this.

      We have performed new high-resolution acquisitions that provide clearer and more convincing evidence for the cortical distribution of these proteins (revised Figure 2A-2C and Figure 4A). These improved images demonstrate that Ens, Ninein, Shot, and Patronin partially colocalize at cortical ncMTOCs, as initially proposed. Importantly, the new data also reveal a spatial distinction: while Ens localizes along microtubules extending from these cortical sites, Ninein appears confined to small cytoplasmic puncta that are adjacent to, but also present on, cortical microtubules.

      11) "Ninein co-localizes with Ens at the oocyte cortex and partially along cortical microtubules, contributing to the maintenance of high Ens protein levels in the oocyte and its proper cortical targeting". I could not find any data showing the involvement of Ninein in the cortical targeting of Ens.

      We found decreased Ens localization to MTs and to the cell cortex region (new Figure S3 A-B).

      12) "our MT network analyses reveal the presence of numerous short MTs cytoplasmic clustered in an anterior pattern." "This low cortical recruitment of ncMTOCs is consistent with poor MT anchoring and their cytoplasmic accumulation." I could not find any data showing that short cortical MT observed at stage 10b in ens mutant and Khc RNAi were cytoplasmic and poorly anchored.

      The sentence was removed from the revised manuscript.

      13) "The egg chamber consists of interconnected cells where Dynein and Khc activities are spatially separated. Dynein facilitates transport from NCs to the oocyte, while Khc mediates both transport and advection within the oocyte." Dynein is involved in various activities in the oocyte. It anchors the oocyte nucleus and transports bcd and grk mRNA to mention a few.

      The text was amended to reflect Dynein involvement in transport activities in the oocyte, with the appropriate references (lines 105-107).

      14) The cartoons in Fig.2H and 3I exaggerate the effect of Ninein and Ens on cortical ncMTOCs. According to the corresponding graphs, there is a 20 and 50% decrease in each case.

      The new cartoons (now revised Figures 3E and 4F) have been amended to reflect the ncMTOC values as well as MT orientation (Figure 3E).

      Significance

      Given the important concerns raised, the significance of the findings is difficult to assess at this stage.

      We sincerely thank the reviewer for their thorough evaluation of our manuscript. We have carefully addressed their concerns through substantial new experiments and analyses. We hope that the revised manuscript, in its current form, now provides the clarifications and additional evidence requested, and that our responses demonstrate the significance of our findings.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript presents an investigation into the molecular mechanisms governing spatial activation of Kinesin-1 motor protein during Drosophila oogenesis, revealing a regulatory network that controls microtubule organization and cytoplasmic transport. The authors demonstrate that Ensconsin, a MAP7 family protein and Kinesin-1 activator, is spatially enriched in the oocyte through a dual mechanism involving Dynein-mediated transport from nurse cells and cortical maintenance by Ninein. This spatial enrichment of Ens is crucial for locally relieving Kinesin-1 auto-inhibition. The Ens/Khc complex promotes cortical recruitment of non-centrosomal microtubule organizing centers (ncMTOCs), which are essential for anchoring microtubules at the cortex, enabling the formation of long, parallel microtubule streams or "twisters" that drive cytoplasmic advection during late oogenesis. This work establishes a paradigm where motor protein activation is spatially controlled through targeted localization of regulatory cofactors, with the activated motor then participating in building its own transport infrastructure through ncMTOC recruitment and microtubule network organization.

      There's a lot to like about this paper! The data are generally lovely and nicely presented. The authors also use a combination of experimental approaches, combining genetics, live and fixed imaging, and protein biochemistry.

      We thank the reviewer for this enthusiastic and supportive review, which helped us further strengthen the manuscript.

      Concerns: Page 6: "to assay if elevation of Ninein levels was able to mis-regulate Ens localization, we overexpressed a tagged Ninein-RFP protein in the oocyte. At stage 9 the overexpressed Ninein accumulated at the anterior cortex of the oocyte and also generated large cortical aggregates able to recruit high levels of Ens (Figures 2D and 2H)... The examination of Ninein/Ens cortical aggregates obtained after Ninein overexpression showed that these aggregates were also able to recruit high levels of Patronin and Shot (Figures 2E and 2H)." Firstly, I'm not crazy about the use of "overexpressed" here, since there isn't normally any Ninein-RFP in the oocyte. In these experiments it has been therefore expressed, not overexpressed. Secondly, I don't understand what the reader is supposed to make of these data. Expression of a protein carrying a large fluorescent tag leads to large aggregates (they don't look cortical to me) that include multiple proteins - in fact, all the proteins examined. I don't understand this to be evidence of anything in particular, except that Ninein-RFP causes the accumulation of big multi-protein aggregates. While I can understand what the authors were trying to do here, I think that these data are inconclusive and should be de-emphasized.

      We have revised the manuscript by replacing "overexpressed" with "expressed" (lines 211 and 212). In addition, we now provide new localization data in both cortical (new Figure S4 A, top) and medial focal planes (new Figure S4 A, bottom), demonstrating that Ninein puncta (the word used in Rosen et al, 2019), rather than aggregates, are located cortically. We also show that live IRP-labelled MTs do not colocalize with Ninein-RFP puncta. In light of the new experiments and the comments from the other reviewers, the corresponding text has been revised and de-emphasized accordingly.

      Page 7: "Co-immunoprecipitations experiments revealed that Patronin was associated with Shot-YFP, as shown previously (Nashchekin et al., 2016), but also with EnsWT-GFP, indicating that Ens, Shot and Patronin are present in the same complex (Figure 3B)." I do not agree that association between Ens-GFP and Patronin indicates that Ens is in the same complex as Shot and Patronin. It is also very possible that there are two (or more) distinct protein complexes. This conclusion could therefore be softened. Instead of "indicating" I suggest "suggesting the possibility."

      We have toned down this conclusion and now write “suggesting the possibility” (lines 238-239).

      Page 7: "During stage 9, the average subcortical MT length, taken at one focal plane in live oocytes (see methods)..." I appreciate that the authors have been careful to describe how they measured MT length, as this is a major point for interpretation. I think the reader would benefit from an explanation of why they decided to measure in only one focal plane and how that decision could impact the results.

      We appreciate this helpful suggestion. Cortical microtubules are indeed highly dynamic and extend in multiple directions, including along the Z-axis. Moreover, their diameter is extremely small (approximately 25 nm), making it technically challenging to accurately measure their full length (over several microns) with high resolution using our Zeiss Airyscan confocal microscope: the acquisition of Z-stacks is relatively slow and therefore not well suited to capturing the rapid dynamics of these microtubules. Consequently, our length measurements represent a compromise and most likely underestimate the actual lengths of microtubules growing outside the focal plane. We note that other groups have encountered similar technical limitations (Parton et al., 2011).

      Page 7: "... the MTs exhibited an orthogonal orientation relative to the anterior cortex (Figures 4A left panels, 4C and 4E)." This phenotype might not be obvious to readers. Can it be quantified?

      We have now analyzed the orientation of microtubules (MTs) along the dorso-ventral axis. Our analysis shows that ens, Khc RNAi oocytes (new Figure 5B), and, to a lesser extent, Nin mutant oocytes (new Figure 3D), display a more random MT orientation compared to wild-type (WT) oocytes. In WT oocytes, MTs are predominantly oriented toward the posterior pole, consistent with previous findings (Parton et al., 2011).

      Page 8: "Altogether, the analyses of Ens and Khc defective oocytes suggested that MT organization defects during late oogenesis (stage 10B) were caused by an initial failure of ncMTOCs to reach the cell cortex. Therefore, we hypothesized that overexpression of the ncMTOC component Shot could restore certain aspects of microtubule cortical organization in ens-deficient oocytes. Indeed, Shot overexpression (Shot OE) was sufficient to rescue the presence of long cortical MTs and ooplasmic advection in most ens oocytes (9/14)..." The data are clear, but the explanation is not. Can the authors please explain why adding in more of an ncMTOC component (Shot) rescues a defect of ncMTOC cortical localization?

      We propose that cytoplasmic ncMTOCs can bind the cell cortex via the Shot subunit, which is so far the only component known to harbor actin-binding motifs. We therefore propose that elevating cytoplasmic Shot increases the probability that Shot encounters the cortex by diffusion when flows are absent. This is now explained in lines 282-285.

      I'm grateful to the authors for their inclusion of helpful diagrams, as in Figures 1G and 2H. I think the manuscript might benefit from one more of these at the end, illustrating the ultimate model.

      We have carefully considered and followed the reviewer’s suggestions. In response, we have included a new figure illustrating our proposed model: the recruitment of ncMTOCs to the cell cortex through low Khc-mediated flows at stage 9 enhances cortical microtubule density, which in turn promotes self-amplifying flows (new Figure 7, panels A to C). Note that this Figure also depicts activation of Khc by loss of auto-inhibition (Figure 7, panel D).

      I'm sorry to say that the language could use quite a bit of polishing. There are missing and extraneous commas. There is also regular confusion between the use of plural and singular nouns. Some early instances include:

      1. Page 3: thought instead of "thoughted."
      2. Page 5: "A previous studies have revealed"
      3. Page 5: "A significantly loss"
      4. Page 6: "troughs ring canals" should be "through ring canals"
      5. Page 7: lives stage 9 oocytes
      6. Page 7: As ens and Khc RNAi oocytes exhibits
      7. Page 7: we examined in details
      8. Page 7: This average MT length was similar in Khc RNAi and ens mutant oocyte..

      We apologize for these errors and have made the appropriate corrections to the manuscript.

      Reviewer #4 (Significance (Required)):

      This work makes a nice conceptual advance by showing that motor activation controls its own transport infrastructure, a paradigm that could extend to other systems requiring spatially regulated transport.

      We thank the reviewers for their evaluation of the manuscript and helpful comments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary: This manuscript presents an investigation into the molecular mechanisms governing spatial activation of Kinesin-1 motor protein during Drosophila oogenesis, revealing a regulatory network that controls microtubule organization and cytoplasmic transport. The authors demonstrate that Ensconsin, a MAP7 family protein and Kinesin-1 activator, is spatially enriched in the oocyte through a dual mechanism involving Dynein-mediated transport from nurse cells and cortical maintenance by Ninein. This spatial enrichment of Ens is crucial for locally relieving Kinesin-1 auto-inhibition. The Ens/Khc complex promotes cortical recruitment of non-centrosomal microtubule organizing centers (ncMTOCs), which are essential for anchoring microtubules at the cortex, enabling the formation of long, parallel microtubule streams or "twisters" that drive cytoplasmic advection during late oogenesis. This work establishes a paradigm where motor protein activation is spatially controlled through targeted localization of regulatory cofactors, with the activated motor then participating in building its own transport infrastructure through ncMTOC recruitment and microtubule network organization.

      There's a lot to like about this paper! The data are generally lovely and nicely presented. The authors also use a combination of experimental approaches, combining genetics, live and fixed imaging, and protein biochemistry.

      Concerns:

      Page 6: "to assay if elevation of Ninein levels was able to mis-regulate Ens localization, we overexpressed a tagged Ninein-RFP protein in the oocyte. At stage 9 the overexpressed Ninein accumulated at the anterior cortex of the oocyte and also generated large cortical aggregates able to recruit high levels of Ens (Figures 2D and 2H)... The examination of Ninein/Ens cortical aggregates obtained after Ninein overexpression showed that these aggregates were also able to recruit high levels of Patronin and Shot (Figures 2E and 2H)." Firstly, I'm not crazy about the use of "overexpressed" here, since there isn't normally any Ninein-RFP in the oocyte. In these experiments it has been therefore expressed, not overexpressed. Secondly, I don't understand what the reader is supposed to make of these data. Expression of a protein carrying a large fluorescent tag leads to large aggregates (they don't look cortical to me) that include multiple proteins - in fact, all the proteins examined. I don't understand this to be evidence of anything in particular, except that Ninein-RFP causes the accumulation of big multi-protein aggregates. While I can understand what the authors were trying to do here, I think that these data are inconclusive and should be de-emphasized.

      Page 7: "Co-immunoprecipitations experiments revealed that Patronin was associated with Shot-YFP, as shown previously (Nashchekin et al., 2016), but also with EnsWT-GFP, indicating that Ens, Shot and Patronin are present in the same complex (Figure 3B)." I do not agree that association between Ens-GFP and Patronin indicates that Ens is in the same complex as Shot and Patronin. It is also very possible that there are two (or more) distinct protein complexes. This conclusion could therefore be softened. Instead of "indicating" I suggest "suggesting the possibility."

      Page 7: "During stage 9, the average subcortical MT length, taken at one focal plane in live oocytes (see methods)..." I appreciate that the authors have been careful to describe how they measured MT length, as this is a major point for interpretation. I think the reader would benefit from an explanation of why they decided to measure in only one focal plane and how that decision could impact the results.

      Page 7: "... the MTs exhibited an orthogonal orientation relative to the anterior cortex (Figures 4A left panels, 4C and 4E)." This phenotype might not be obvious to readers. Can it be quantified?

      Page 8: "Altogether, the analyses of Ens and Khc defective oocytes suggested that MT organization defects during late oogenesis (stage 10B) were caused by an initial failure of ncMTOCs to reach the cell cortex. Therefore, we hypothesized that overexpression of the ncMTOC component Shot could restore certain aspects of microtubule cortical organization in ens-deficient oocytes. Indeed, Shot overexpression (Shot OE) was sufficient to rescue the presence of long cortical MTs and ooplasmic advection in most ens oocytes (9/14)..." The data are clear, but the explanation is not. Can the authors please explain why adding in more of an ncMTOC component (Shot) rescues a defect of ncMTOC cortical localization?

      I'm grateful to the authors for their inclusion of helpful diagrams, as in Figures 1G and 2H. I think the manuscript might benefit from one more of these at the end, illustrating the ultimate model.

      I'm sorry to say that the language could use quite a bit of polishing. There are missing and extraneous commas. There is also regular confusion between the use of plural and singular nouns. Some early instances include:

      1. Page 3: thought instead of "thoughted."
      2. Page 5: "A previous studies have revealed"
      3. Page 5: "A significantly loss"
      4. Page 6: "troughs ring canals" should be "through ring canals"
      5. Page 7: lives stage 9 oocytes
      6. Page 7: As ens and Khc RNAi oocytes exhibits
      7. Page 7: we examined in details
      8. Page 7: This average MT length was similar in Khc RNAi and ens mutant oocyte..

      Significance

      This work makes a nice conceptual advance by showing that motor activation controls its own transport infrastructure, a paradigm that could extend to other systems requiring spatially regulated transport.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript of Berisha et al. investigates the role of Ensconsin (Ens), Kinesin-1 and Ninein in the organisation of microtubules (MTs) in the Drosophila oocyte. In stage 9 oocytes, Kinesin-1 transports oskar mRNA, a posterior determinant, along MTs that are organised by ncMTOCs. At stage 10b, Kinesin-1 induces cytoplasmic advection to mix the contents of the oocyte. Ensconsin/Map7 is a MT-associated protein (MAP) that uses its MT-binding domain (MBD) and kinesin-binding domain (KBD) to recruit Kinesin-1 to the microtubules and to stimulate the motility of MT-bound Kinesin-1. Using various new Ens transgenes, the authors demonstrate the requirement of the Ens MBD and Ninein for Ens localisation to the oocyte, where Ens activates Kinesin-1 using its KBD. The authors also claim that Ens, Kinesin-1 and Ninein are required for the accumulation of ncMTOCs at the oocyte cortex and argue that the detachment of the ncMTOCs from the cortex accounts for the reduced localisation of oskar mRNA at stage 9 and the lack of cytoplasmic streaming at stage 10b.

      Although the manuscript contains several interesting observations, the authors' conclusions are not sufficiently supported by their data. The structure-function analysis of Ensconsin (Ens) is potentially publishable, but the conclusions on ncMTOC anchoring and cytoplasmic streaming are not convincing.

      1. The main conclusion of the manuscript is that "MT advection failure in Khc and ens in late oogenesis stems from defective cortical ncMTOCs recruitment". This completely overlooks the abundant evidence that Kinesin-1 directly drives cytoplasmic streaming by transporting vesicles and microtubules along microtubules, which then move the cytoplasm by advection (Palacios et al., 2002; Serbus et al, 2005; Lu et al, 2016). Since Kinesin-1 generates the flows, one cannot conclude that the effect of khc and ens mutants on cortical ncMTOC positioning has any direct effect on these flows, which do not occur in these mutants.
      2. The authors claim that the streaming phenotypes of ens and Khc mutants are due to a decrease in microtubule length caused by the defective localisation of ncMTOCs. In addition to the problem raised above, I am not convinced that they can make accurate measurements of microtubule length from confocal images like those shown in Figure 4. Firstly, they are measuring the length of bundles of microtubules and cannot resolve individual microtubules. This problem is compounded by the fact that the microtubules do not align into parallel bundles in the mutants. This will make the "microtubules" appear shorter in the mutants. In addition, the alignment of the microtubules in wild-type allows one to choose images in which the microtubules lie in the imaging plane, whereas the more disorganised arrangement of the microtubules in the mutants means that most microtubules will cross the imaging plane, which precludes accurate measurements of their length.
      3. "To investigate whether the presence of these short microtubules in ens and Khc RNAi oocytes is due to defects in microtubule anchoring or is also associated with a decrease in microtubule polymerization at their plus ends, we quantified the velocity and number of EB1comets, which label growing microtubule plus ends (Figure S3)." I do not understand how the anchoring or not of microtubule minus ends to the cortex determines how far their plus ends grow, and these measurements fall short of showing that plus end growth is unaffected. It has already been shown that the Kinesin-1-dependent transport of Dynactin to growing microtubule plus ends increases the length of microtubules in the oocyte because Dynactin acts as an anti-catastrophe factor at the plus ends. Thus, khc mutants should have shorter microtubules independently of any effects on ncMTOC anchoring. The measurements of EB1 comet speed and frequency in FigS2 will not detect this change and are not relevant for their claims about microtubule length. Furthermore, the authors measured EB1 comets at stage 9 (where they did not observe short MT) rather than at stage 10b. The authors' argument would be better supported if they performed the measurements at stage 10b.
      4. The Shot overexpression experiments presented in Fig.3 E-F, Fig.4D and TableS1 are very confusing. Originally, the authors used Shot-GFP overexpression at stage 9 to show that there is a decrease of ncMTOCs at the cortex in ens mutants (Fig.3 E-F) and speculated that this caused the defects in MT length and cytoplasmic advection at stage 10B. However, the authors later state on page 8 that: "Shot overexpression (Shot OE) was sufficient to rescue the presence of long cortical MTs and ooplasmic advection in most ens oocytes (9/14), resembling the patterns observed in controls (Figures 4B right panel and 4D). Moreover, while ens females were fully sterile, overexpression of Shot was sufficient to restore that loss of fertility (Table S1)". Is this the same UAS Shot-GFP and VP16 Gal4 used in both experiments? If so, this contradiction puts the authors' conclusions in question.
      5. The authors based their conclusions about the involvement of Ens, Kinesin-1 and Ninein in ncMTOC anchoring on the decrease in cortical fluorescence intensity of Shot-YFP and Patronin-YFP in the corresponding mutant backgrounds. However, there is a large variation in average Shot-YFP intensity between control oocytes in different experiments. In Fig. 2F-G the average level of Shot-YFP in the control is 130 AU, while in Fig.3 G-H it is only 55 AU. This makes me worry about the reliability of such measurements and the conclusions drawn from them.
      6. The decrease in the intensity of Shot-YFP and Patronin-YFP cortical fluorescence in ens mutant oocytes could be because of problems with ncMTOC anchoring or with ncMTOC formation. The authors should find a way to distinguish between these two possibilities. The authors could express Ens-Mut (described in Sung et al 2008), which localises at the oocyte posterior, and test whether it recruits Shot/Patronin ncMTOCs to the posterior.
      7. According to the Materials and Methods, the Shot-GFP used in Fig.3 E-F and Fig.4 was the BDSC line 29042. This is Shot L(C), a full-length version of Shot missing the CH1 actin-binding domain that is crucial for Shot anchoring to the cortex. If the authors indeed used this version of Shot-GFP, the interpretation of the above experiments is very difficult.
      8. Page 6: "converted in NCs, in a region adjacent to the ring canals, Dendra-Ens-labeled MTs were found in the oocyte compartment indicating they are able to travel from NC toward the oocyte trough ring canals". I have difficulty seeing the translocation of MTs through the ring canals. Perhaps it would be more obvious with a movie/picture showing only one channel. Considering that Dendra-Ens appears in the oocyte much faster than MT transport through ring canals (140 nm/s, Lu et al 2022), the authors are most probably observing the translocation of free Ens rather than Ens bound to MTs. The authors should also mention that Ens movement from the NC to the oocyte has been shown before with the Ens MBD in Lu et al 2022 with better resolution.
      9. Page 6: The co-localization of Ninein with Ens and Shot at the oocyte cortex (Figure 2A). I have difficulty seeing this co-localisation. Perhaps it would be more obvious in merged images of only two channels and with higher-resolution images.
      10. "a pool of the Ens-GFP co-localized with Ch-Patronin at cortical ncMTOCs at the anterior cortex (Figure 3A)". I also have difficulty seeing this.
      11. "Ninein co-localizes with Ens at the oocyte cortex and partially along cortical microtubules, contributing to the maintenance of high Ens protein levels in the oocyte and its proper cortical targeting". I could not find any data showing the involvement of Ninein in the cortical targeting of Ens.
      12. "our MT network analyses reveal the presence of numerous short MTs cytoplasmic clustered in an anterior pattern." "This low cortical recruitment of ncMTOCs is consistent with poor MT anchoring and their cytoplasmic accumulation." I could not find any data showing that short cortical MT observed at stage 10b in ens mutant and Khc RNAi were cytoplasmic and poorly anchored.
      13. "The egg chamber consists of interconnected cells where Dynein and Khc activities are spatially separated. Dynein facilitates transport from NCs to the oocyte, while Khc mediates both transport and advection within the oocyte." Dynein is involved in various activities in the oocyte. It anchors the oocyte nucleus and transports bcd and grk mRNA to mention a few.
      14. The cartoons in Fig.2H and 3I exaggerate the effect of Ninein and Ens on cortical ncMTOCs. According to the corresponding graphs, there is a 20 and 50% decrease in each case.

      Significance

      Given the important concerns raised, the significance of the findings is difficult to assess at this stage.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Berisha et al. investigate how microtubule (MT) organization is spatially regulated during Drosophila oogenesis. The authors identify a mechanism in which the Kinesin-1 activator Ensconsin/MAP7 is transported by dynein and anchored at the oocyte cortex via Ninein, enabling localized activation of Kinesin-1. Disruption of this pathway impairs ncMTOC recruitment and MT anchoring at the cortex. The authors combine genetic manipulation with high-resolution microscopy and use three key readouts to assess MT organization during mid-to-late oogenesis: cortical MT formation, localization of posterior determinants, and ooplasmic streaming. Notably, Kinesin-1, in concert with its activator Ens/MAP7, contributes to organizing the microtubule network it travels along. Overall, the study presents interesting findings, though we have several concerns we would like the authors to address.

      Ensconsin enrichment in the oocyte

      1. Enrichment in the oocyte
        • Ensconsin is a MAP that binds MTs. Given that microtubule density in the oocyte significantly exceeds that in the nurse cells, its enrichment may passively reflect this difference. To assess whether the enrichment is specific, could the authors express a non-Drosophila MAP (e.g., mammalian MAP1B) to determine whether it also preferentially localizes to the oocyte?
        • The ability of ens-wt and ens-LowMT to induce tubulin polymerization according to the light scattering data (Fig. S1J) is minimal and does not reflect dramatic differences in localization. The authors should verify that, in all cases, the polymerization product in their in vitro assays is microtubules rather than other light-scattering aggregates. What is the control in these experiments? If it is just purified tubulin, it should not form polymers at physiological concentrations.
      2. Photoconversion caveats. MAPs are known to dynamically associate and dissociate from microtubules. Therefore, interpretation of the Ens photoconversion data should be made with caution. The expanding red signal from the nurse cells to the oocyte may reflect any combination of dynein-mediated MT transport and passive diffusion of unbound Ensconsin. Notably, photoconversion of a soluble protein in the nurse cells would also result in a gradual increase in red signal in the oocyte, independent of active transport. We encourage the authors to more thoroughly discuss these caveats. It may also help to present the green and red channels side by side rather than as merged images, to allow readers to assess signal movement and spatial patterns better.
      3. Reduction of Shot at the anterior cortex
        • Shot is known to bind strongly to F-actin, and in the Drosophila ovary, its localization typically correlates more closely with F-actin structures than with microtubules, despite being an MT-actin crosslinker. Therefore, the observed reduction of cortical Shot in ens, nin mutants, and Khc-RNAi oocytes is unexpected. It would be important to determine whether cortical F-actin is also disrupted in these conditions, which should be straightforward to assess via phalloidin staining.
        • MTs are barely visible in Fig. 3A, which is meant to demonstrate Ens-GFP colocalization with tubulin. Higher-quality images are needed.
      4. MT gradient in stage 9 oocytes. In ens-/-, nin-/-, and Khc-RNAi oocytes, is there any global defect in the stage 9 microtubule gradient? This information would help clarify the extent to which cortical localization defects reflect broader disruptions in microtubule polarity.
      5. Role of Ninein in cortical anchoring. The requirement for Ninein in cortical anchorage is the least convincing aspect of the manuscript and somewhat disrupts the narrative flow. First, it is unclear whether Ninein exhibits the same oocyte-enriched localization pattern as Ensconsin. Is Ninein detectable in nurse cells? Second, the Ninein antibody signal appears concentrated in a small area of the anterior-lateral oocyte cortex (Fig. 2A), yet Ninein loss leads to reduced Shot signal along a much larger portion of the anterior cortex (Fig. 2F), a spatial mismatch that weakens the proposed functional relationship. Third, Ninein overexpression results in cortical aggregates that co-localize with Shot, Patronin, and Ensconsin. Are these aggregates functional ncMTOCs? Do microtubules emanate from these foci?
      6. Inconsistency of Khc^MutEns rescue. The Khc^MutEns variant partially rescues cortical MT formation and restores a slow but measurable cytoplasmic flow, yet it fails to rescue Staufen localization (Fig. 5). This raises questions about the consistency and completeness of the rescue. Could the authors clarify this discrepancy or propose a mechanistic rationale?

      Minor points:

      1. The pUbi-attB-Khc-GFP vector was used to generate the Khc^MutEns transgenic line, presumably under control of the ubiquitous ubi promoter. Could the authors specify which attP landing site was used? Additionally, are the transgenic flies viable and fertile, given that Kinesin-1 is hyperactive in this construct?
      2. On page 11 (Discussion, section titled "A dual Ensconsin oocyte enrichment mechanism achieves spatial relief of Khc inhibition"), the statement "many mutations in Kif5A are causal of human diseases" would benefit from a brief clarification. Since not all readers may be familiar with kinesin gene nomenclature, please indicate that KIF5A is one of the three human homologs of Kinesin heavy chain.
      3. On page 16 (Materials and Methods, "Immunofluorescence in fly ovaries"), the sentence "Ovaries were mounted on a slide with ProlonGold medium with DAPI (Invitrogen)" should be corrected to "ProLong Gold."

      Significance

      This study shows that enrichment of MAP7/ensconsin in the oocyte is the mechanism of kinesin-1 activation there and is important for cytoplasmic streaming and localization of non-centrosomal microtubule-organizing centers to the oocyte cortex.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This paper addresses a very interesting problem of non-centrosomal microtubule organization in developing Drosophila oocytes. Using genetics and imaging experiments, the authors reveal an interplay between the activity of kinesin-1, together with its essential cofactor Ensconsin, and microtubule organization at the cell cortex by the spectraplakin Shot, the minus-end binding protein Patronin and Ninein, a protein implicated in microtubule minus-end anchoring. The authors demonstrate that the loss of Ensconsin affects the cortical accumulation of non-centrosomal microtubule organizing center (ncMTOC) proteins, microtubule length and vesicle motility in the oocyte, and show that this phenotype can be rescued by a constitutively active kinesin-1 mutant, but not by Ensconsin mutants deficient in microtubule or kinesin binding. The functional connection between Ensconsin, kinesin-1 and ncMTOCs is further supported by a rescue experiment with Shot overexpression. Genetics and imaging experiments further implicate Ninein in the same pathway. These data are a clear strength of the paper; they represent a very interesting and useful addition to the field.

      The weaknesses of the study are two-fold. First, the paper seems to lack a clear molecular model, uniting the observed phenomenology with the molecular functions of the studied proteins. Most importantly, it is not clear how kinesin-based plus-end directed transport contributes to cortical localization of ncMTOCs and regulation of microtubule length.

      Second, not all conclusions and interpretations in the paper are supported by the presented data. Below is a list of specific comments, outlining the concerns, in the order of appearance in the paper/figures.

      1. Figure 1. The statement: "Ens loading on MTs in NCs and their subsequent transport by Dynein toward ring canals promotes the spatial enrichment of the Khc activator Ens in the oocyte" is not supported by data. The authors do not demonstrate that Ens is actually transported from the nurse cells to the oocyte while being attached to microtubules. They do show that the intensity of Ensconsin correlates with the intensity of microtubules, that the distribution of Ensconsin depends on its affinity to microtubules and that an Ensconsin pool locally photoactivated in a nurse cell can redistribute to the oocyte (and throughout the nurse cell) by what seems to be diffusion. The provided images suggest that Ensconsin passively diffuses into the oocyte and accumulates there because of higher microtubule density, which depends on dynein. To prove that Ensconsin is indeed transported by dynein in the microtubule-bound form, one would need to measure the residence time of Ensconsin on microtubules and demonstrate that it is longer than the time needed to transport microtubules by dynein into the oocyte; ideally, one would like to see movement of individual microtubules labelled with photoconverted Ensconsin from a nurse cell into the oocyte. Since microtubules are not enriched in the oocyte of the dynein mutant, analysis of Ensconsin intensity in this mutant is not informative and does not reveal the mechanism of Ensconsin accumulation.
      2. Figure 2. According to the abstract, this figure shows that Ensconsin is "maintained at the oocyte cortex by Ninein". However, the figure doesn't seem to prove it: it shows that oocyte enrichment of Ensconsin is partially dependent on Ninein, but this applies to the whole cell and not just to the cell cortex. Furthermore, it is not clear whether Ninein mutation affects microtubule density, which in turn would affect Ensconsin enrichment, and therefore, it is not clear whether the effect of Ninein loss on Ensconsin distribution is direct or indirect. The observation that the aggregates formed by overexpressed Ninein accumulate other proteins, including Ensconsin, supports, though does not prove, their interactions. Furthermore, there is absolutely no proof that Ninein aggregates are "ncMTOCs". Unless the authors demonstrate that these aggregates nucleate or anchor microtubules (for example, by detailed imaging of microtubules and EB1 comets), the text and labels in the figure would need to be altered.

      Minor comment: Note that a "ratio" (Figure 2C) is just a ratio, and should not be expressed in arbitrary units.

      3. Figure 3B: immunoprecipitation results cannot be interpreted because the immunoprecipitated proteins (GFP, Ens-GFP, Shot-YFP) are not shown. It is also not clear that this biochemical experiment is useful. If the authors would like to suggest that Ensconsin directly binds to Patronin, the interaction would need to be properly mapped at the protein domain level.
      4. One of the major phenotypes observed by the authors in the Ens mutant is the loss of long microtubules. The authors make strong conclusions about the independence of this phenotype from the parameters of microtubule plus-end growth, but in fact, the quality of their data does not allow such a conclusion, because they only measured the number of EB1 comets and their growth rate but not the catastrophe, rescue or pausing frequency. Note that kinesin-1 has been implicated in promoting microtubule damage and rescue (doi: 10.1016/j.devcel.2021). In the absence of such measurements, one cannot conclude whether short microtubules arise through defects in the minus-end, plus-end or microtubule shaft regulation pathways. It is important to note that a spectraplakin, like Shot, can potentially affect different pathways, particularly when overexpressed. Unjustified conclusions should be removed: the authors do not provide sufficient data to conclude that "ens and Khc oocytes MT organizational defects are caused by decreased ncMTOC cortical anchoring", because the actual cortical microtubule anchoring was not measured.

      Minor comment: Microtubule growth velocity must be expressed in units of length per time, to enable evaluating the quality of the data, and not as a normalized value.

      5. A significant part of the Discussion is dedicated to the potential role of Ensconsin in cortical microtubule anchoring and potential transport of ncMTOCs by kinesin. It is obviously fine that the authors discuss different theories, but it would be very helpful if the authors would first state what has been directly measured and established by their data, and what are the putative, currently speculative explanations of these data.

      Minor comment: The writing and particularly the grammar need to be significantly improved throughout, which should be very easy with current language tools. Examples: "ncMTOCs recruitment" should be "ncMTOC recruitment"; "Vesicles speed" should be "Vesicle speed", "Nin oocytes harbored a WT growth,"- unclear what this means, etc. Many paragraphs are very long and difficult to read. Making shorter paragraphs would make the authors' line of thought more accessible to the reader.

      Significance

      This paper represents a significant advance in understanding non-centrosomal microtubule organization in general and in developing Drosophila oocytes in particular by connecting the microtubule minus-end regulation pathway to Kinesin-1 and Ensconsin/MAP7-dependent transport. The genetics and imaging data are of good quality and are appropriately presented and quantified. These are clear strengths of the study which will make it interesting to researchers studying the cytoskeleton, microtubule-associated proteins and motors, and fly development.

      The weaknesses of this study are due to the lack of clarity of the overall molecular model, which would limit the impact of the study on the field. Some interpretations are not sufficiently supported by data, but this can be solved by more precise and careful writing, without extensive additional experimentation.

      My expertise is cell biology and biochemistry of the microtubule cytoskeleton, including both microtubule-associated proteins and microtubule motors.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Revision Plan

      Manuscript number: RC-2025-03208

      Corresponding author(s): Jared Nordman

      [The "revision plan" should delineate the revisions that authors intend to carry out in response to the points raised by the referees. It also provides the authors with the opportunity to explain their view of the paper and of the referee reports.

      The document is important for the editors of affiliate journals when they make a first decision on the transferred manuscript. It will also be useful to readers of the reprint and help them to obtain a balanced view of the paper.

      If you wish to submit a full revision, please use our "Full Revision" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

      1. General Statements [optional]

      All three reviewers of our manuscript were very positive about our work. The reviewers noted that our work represents a necessary advance that is timely, addresses important issues in the chromatin field, and will be of broad interest to this community. Given the nature of our work and the positive reviews, we feel that this manuscript would be best suited for the Journal of Cell Biology.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The authors investigate the function of the H3 chaperone NASP, which is known to bind directly to H3 and prevent degradation of soluble H3. What is unclear is where NASP functions in the cell (nucleus or cytoplasm), how NASP protects H3 from degradation (direct or indirect), and if NASP affects H3 dynamics (nuclear import or export). They use the powerful model system of Drosophila embryos because the soluble H3 pool is high due to maternal deposition and they make use of photoconvertible Dendra-tagged proteins, since these are maternally deposited and can be used to measure nuclear import/export rates.

      Using these systems and tools, they conclude that NASP affects nuclear import, but only indirectly, because embryos from NASP mutant mothers start out with 50% of the maternally deposited H3. Because of the depleted H3 and reduced import rates, NASP deficient embryos also have reduced nucleoplasmic and chromatin-associated H3. Using a new Dendra-tagged NASP allele, the authors show that NASP and H3 have different nuclear import rates, indicating that NASP is not a chaperone that shuttles H3 into the nucleus. They test H3 levels in embryos that have no nuclei and conclude that NASP functions in the cytoplasm, and through protein aggregation assays they conclude that NASP prevents H3 aggregation.

      Major comments:

      The text was easy to read and logical. The data are well presented, methods are complete, and statistics are robust. The conclusions are largely reasonable. However, I am having trouble connecting the conclusions in text to the data presented in Figure 4.

      First, I'm confused why the conclusion from Figure 4A is that NASP functions in the cytoplasm of the egg. Couldn't NASP be required in the ovary (in, say, nurse cell nuclei) to stimulate H3 expression and deposition into the egg? The results in 4A would look the same if the mothers deposit 50% of the normal H3 into the egg. Why is NASP functioning specifically in the cytoplasm when it is also so clearly imported into the nucleus? Maybe NASP functions wherever it is, and by preventing nuclear import, you force it to function in the cytoplasm. I do not have additional suggestions for experiments, but I think the authors need to be very clear about the different interpretations of these data and to discuss WHY they believe their conclusion is strongest.

      The concern raised by the reviewer regarding NASP function during oogenesis has been addressed in a previous work published from our lab. Unfortunately, we did not do a good job conveying this work in the original version of this manuscript. We demonstrated that total H3 levels are unaffected when comparing WT and NASP mutant stage 14 egg chambers. This means that the amount of H3 deposited into the eggs does not change in the absence of NASP. To address the reviewer's comment, we will change the text to make the link to our previous work clear.

      Second, an alternate conclusion from Figure 4D/E is that mothers are depositing less H3 protein into the egg, but the same total amount is being aggregated. This amount of aggregated protein remains constant in activated eggs, but additional H3 translation leads to more total H3? The authors mention that additional translation can compensate for reduced histone pools (line 416).

      Similar to our response above, the total amount of H3 in wild type and NASP mutant stage 14 egg chambers is the same. Therefore, mothers are depositing equal amounts of H3 into the egg. We will make the necessary changes in the text to make this point clear.

      As the function of NASP in the cytoplasm (when it clearly imports into the nucleus) and role in H3 aggregation are major conclusions of the work, the authors need to present alternative conclusions in the text or complete additional experiments to support the claims. Again, I do not have additional suggestions for experiments, but I think the authors need to be very clear about the different interpretations of these data and to discuss WHY they believe their conclusion is strongest.

      A common issue raised by all three reviewers was the need to demonstrate more convincingly that the assay we have used to isolate protein aggregates does, in fact, isolate protein aggregates. To verify this, we will perform the aggregate isolation assay using controls that are known to induce more protein aggregation. We will perform the aggregation assay with egg chambers or extracts that are exposed to heat shock or the aggregation-inducing chemicals Canavanine and Azetidine-2-carboxylic acid. The chemical treatment was a welcome suggestion from reviewer #3. These experiments will significantly strengthen any claims based on the outcome of the aggregation assay.

      We will also make changes to the text and include other interpretations of our work as the reviewer has suggested.

      Data presentation:

      Overall, I suggest moving some of the supplemental figures to the main text, adding representative movie stills to show where the quantitative data originated, and moving the H3.3 data to the supplement. Not because it's not interesting, but because H3.3 and H3.2 are behaving the same.

      Where possible, we will make changes to the figure display to improve the logic and flow of the manuscript.

      Fig 1:

      It would strengthen the figure to include representative still images that led to the quantitative data, mostly so readers understand how the data were collected.

      We will add representative stills to Figure 1 to help readers understand how the data are collected. We will also add a representative H3-Dendra movie similar to the NASP supplemental movie.

      The inclusion of a "simulated 50% H3" in panel C is confusing. Why?

      We used a 50% reduction in H3 levels because that is the reduction in H3 we measured in embryos laid by NASP-mutant mothers in our previous work. A reduction in H3 levels alone would be predicted to change the nuclear import rate of H3. Thus, having a quantitative model of H3 import kinetics was key to our understanding of NASP function in vivo. We will revise the text to make this clear.

      I would also consider normalizing the data between A and B (and C and D) by dividing NASP/WT. This could be included in the supplement (OPTIONAL)

      We can normalize the values and include the data in a supplemental figure.
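
      As a minimal sketch of the normalization discussed above, the NASP-mutant values could be divided by the matched wild-type values at each time point. The function name `normalize_to_wt` and the numbers below are hypothetical placeholders for illustration, not measurements or code from the study.

```python
def normalize_to_wt(nasp_values, wt_values):
    """Return the NASP/WT ratio at each matched time point."""
    if len(nasp_values) != len(wt_values):
        raise ValueError("NASP and WT series must have the same length")
    ratios = []
    for nasp, wt in zip(nasp_values, wt_values):
        if wt == 0:
            raise ValueError("a WT value of zero cannot be used as a denominator")
        ratios.append(nasp / wt)
    return ratios


# Hypothetical intensities (arbitrary units), not data from the manuscript.
wt = [10.0, 20.0, 40.0]
nasp = [5.0, 10.0, 20.0]
ratios = normalize_to_wt(nasp, wt)
```

      A ratio that stays flat across time points would indicate that the two genotypes differ by a constant factor rather than in the shape of the curve.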

      Fig S1:

      The data simulation S1G should be moved to the main text, since it is the primary reason the authors reject the hypothesis that NASP influences H3 import rates.

      This is a good point. We will move S1G into Figure 1.

      Fig 2:

      Once again, I think it would help to include a few representative images of the photoconverted Dendra2 in the main text.

      We will add representative images of the photoconversion in Figure 2.

      I struggled with A/B, I think due to not knowing how the data were normalized. When I realized that the WT and NASP data are not normalized to each other, but that the NASP values are likely starting less than the WT values, it made way more sense. I suggest switching the order of data presentation so that C-F are presented first to establish that there is less chromatin-bound H3 in the first place, and then present A/B to show no change in nuclear export of the H3 that is present, allowing the conclusion of both less soluble AND chromatin-bound H3.

      We ordered the data this way to test whether NASP acts as a nuclear receptor. Since Figure 1 compares nuclear import, we wanted to address nuclear export and provide a comprehensive analysis of the role of NASP in H3 nuclear dynamics before moving on to other consequences of NASP depletion. We can add graphs with the un-normalized values in a supplemental figure to show the actual difference in total intensity values.

      Fig S2:

      If M1-M3 indicate males, why are the ovaries also derived from males? I think this is just confusing labeling.

      We will change the labelling.

      Supplemental Movie S1:

      Beautiful. Would help to add a time stamp (OPTIONAL).

      Thank you! We will add a time stamp to the movie.

      Fig 3:

      Panel C is the same as Fig S1A (not Fig 1A, as is said in the legend), though I appreciate the authors pointing it out in the legend. Also see line 276.

      We appreciate the reviewer for pointing this out. We will make the change in the text to correct this.

      Panel D is a little confusing, because presumably the "% decrease in import rate" cannot be positive (Y axis). This could be displayed as a scatter (not bar) as in Panels B/C (right) where the top of the Y axis is set to 0.

      We understand the reviewer's concern that the decrease value cannot be positive. We can adjust the y-axis so that it caps off at 0.

      Fig S3:

      A: What do the different panels represent? I originally thought developmental time, but now I think just different representative images? Are these age-matched from time at egg lay?

      The different panels show representative images. We can clarify that in the figure legend.

      C: What does "embryos" mean? Same question for Fig 4A.

      In this figure, "embryos" refers to the exact number of embryos used to prepare the lysate for the western blot. We will clarify this in the figure legend.

      Fig 4:

      A: What does "embryos" mean? Number of embryos? Age in hours?

      In this figure, "embryos" refers to the exact number of embryos used to prepare the lysate for the western blot. We will clarify this in the figure legend.

      C: Not sure the workflow figure panel is necessary, as I can't tell what each step does. This is better explained in methods. However I appreciated the short explanation in the text (lines 314-5).

      The workflow panel helps to identify the samples labelled as input and aggregate for the western blot analysis. Since the input in our western blots does not refer to the total protein lysate, we feel it is helpful to point out exactly which stage of the protocol each analyzed sample comes from.

      Minor comments:

      The authors should describe the nature of the NASP alleles in the main text and present evidence of robust NASP depletion, potentially both in ovaries and in embryos. The antibody works well for westerns (Fig S2B). This is sort of demonstrated later in Figure 4A, but only in NASP x twine activated eggs.

      We appreciate the reviewer's comments about the NASP mutant allele. In our previous publication, we characterized the NASP mutant fly line and its effect on both stage 14 egg chambers and the embryos. We will emphasize the reference to our previous work in the text.

      Lines 163, 251, 339: minor typos

      Line 184: It would help to clarify- I'm assuming cytoplasmic concentration (or overall) rather than nuclear concentration. If nuclear, I'd expect the opposite relationship. This occurs again when discussing NASP (line 267). I suspect it's also not absolute concentration, but relative concentration difference between cytoplasm and nucleus. It would help clarify if the authors were more precise.

      We appreciate the reviewer's point and will add the clarification in the text.

      Line 189: Given that the "established integrative model" helps to reject the hypothesis that NASP is involved in H3 import, I think it's important to describe the model a little more, even though it's previously published.

      We will add a few sentences to the text giving a brief description of the model.

      Line 203: "The measured rate of H3.2 export from the nucleus is negligible" clarify this is in WT situations and not a conclusion from this study.

      We will add the clarification of this statement in the text.

      Line 211: How can the authors be so sure that the decrease in WT is due to "the loss of non-chromatin bound nucleoplasmic H3.2-Dendra2?"

      From the live imaging experiments, the H3.2-Dendra2 intensity in the nucleus reduces dramatically upon nuclear envelope breakdown with the only H3.2-Dendra2 intensity remaining being the chromatin bound H3.2. Excess H3.2 is imported into the nucleus and not all of it is incorporated into the chromatin. This is a unique feature of the embryo system that has been observed previously. We mention that the intensity reduction is due to the loss of non-chromatin bound nucleoplasmic H3.2.

      Line 217: In the conclusion, the authors indicate that NASP indirectly affects soluble supply of H3 in the nucleoplasm. I do believe they've shown that the import rate effect is indirect, but I don't know why they conclude that the effect of NASP on the soluble nucleoplasmic H3 supply is indirect. Similarly, the conclusion is indirect on line 239. Yet, the authors have not shown it's not direct, just assumed since NASP results in 50% decrease to deposited maternal histones.

      We appreciate the reviewer's feedback on the conclusions of Figure 2. Our conclusions are based primarily on the effect on H3 levels of the absence of NASP in early embryos. To establish direct causal effects, it would be important to rescue the phenotypes in complementation experiments and to identify the molecular interactions that cause the effects. In this study we have not established those specific details, so we cannot draw conclusions about direct effects. We will change the text to make this clearer.

      Line 292: What is the nature of the NASP "mutant?" Is it a null? Similarly, what kind of "mutant" is the twine allele? Line 295.

      We will include descriptions of the NASP and twine mutants in the text.

      Line 316: Why did the authors use stage 14 egg chambers here when they previously used embryos? This becomes clearer shortly afterward, when the authors examine activated eggs, but it's confusing in the text.

      The reason to use stage 14 egg chambers was to establish NASP function during oogenesis. We will modify the text to emphasize the reason behind using stage 14 egg chambers.

      Lines 343-348: It's unclear if the authors are drawing extended conclusions here or if they are drawing from prior literature (if so, citations would be required). For example, why during oogenesis/embryogenesis are aggregation and degradation developmentally separated?

      This conclusion is based primarily on the findings from this study (Figure 4) and our previously published work. We will modify the text for more clarity.

      Lines 386-7: I do not understand why the authors conclude that H3 aggregation and degradation are "developmentally uncoupled" and why, in the absence of NASP, "H3 aggregation precedes degradation."

      This is based on the data in Figure 4 combined with our previous work showing that the total level of H3 is not changed in NASP-mutant stage 14 egg chambers. Aggregates seem to be more persistent in stage 14 egg chambers (oogenesis), and they are cleared upon egg activation (entry into embryogenesis). This provides evidence that aggregation occurs prior to degradation and that these two events occur at different developmental stages. We will change the text to make this clearer.

      Line 395: Why suddenly propose that NASP also functions in the nucleus to prevent aggregation, when earlier the authors suggest it functions only in the cytoplasm?

      We will make the necessary edits to ensure that the results don't suggest a role for NASP exclusive to the cytoplasm. Our findings highlight a cytoplasmic function of NASP; however, we do not want to rule out that this same function could occur in the nucleus.

      Lines 409-413: The authors claim that histone deficiency likely does not cause the embryonic arrest seen in embryos from NASP mutant mothers. This is because H3 is reduced by 50% yet some embryos arrest long before they've depleted this supply. However, the authors also showed that H3 import rates are affected in these embryos due to lower H3 concentration. Since the early embryo cycles are so rapid, reduced H3 import rates could lead to early arrest, even though available H3 remains in the cytoplasm.

      We thank the reviewer for their suggestion. This conclusion is based on findings from our lab's previous study, which showed that the majority of embryos laid by NASP mutant females arrest in the very early nuclear cycles (

      Reviewer #1 (Significance (Required)):

      The significance of the work is conceptual, as NASP is known to function in H3 availability but the precise mechanism is elusive. This work represents a necessary advance, especially to show that NASP does not affect H3 import rates, nor does it chaperone H3 into the nucleus. However, the authors acknowledge that many questions remain. Foremost, why is NASP imported into the nucleus and what is its role there?

      I believe this work will be of interest to those who focus on early animal development, but NASP may also represent a tool, as the authors conclude in their discussion, to reduce histone levels during development and examine nucleosome positioning. This may be of interest to those who work on chromatin accessibility and zygotic genome activation.

      I am a genetics expert who works in Drosophila embryogenesis. I do not have the expertise to evaluate the aggregate methods presented in Figure 4.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript focuses on the role of the histone chaperone NASP in Drosophila. NASP is a chaperone specific to histone H3 that is conserved in mammals. Many aspects of the molecular mechanisms by which NASP selectively binds histone H3 have been revealed through biochemical studies. However, key aspects of NASP's in vivo roles remain unclear, including where in the cell NASP functions, and how it prevents H3 degradation. Through live imaging in the early Drosophila embryo, which possesses large amounts of soluble H3 protein, Das et al determine that NASP does not control nuclear import or export of H3.2 or H3.3. Instead, they find through differential centrifugation analysis that NASP functions in the cytoplasm to prevent H3 aggregation and hence its subsequent degradation.

      Major Comments:

      The protein aggregation assays raise several questions. From a technical standpoint, it would be helpful to have a positive control to demonstrate that the assay is effective at detecting protein aggregates, i.e., a genotype that exhibits increased protein aggregation; this could be for a protein besides H3.

      A common issue raised by all three reviewers was the need to demonstrate more convincingly that the assay we used to isolate protein aggregates does, in fact, isolate protein aggregates. To verify this, we will perform the aggregate isolation assay using controls that are known to induce more protein aggregation. We will perform the aggregation assay with egg chambers or extracts that are exposed to heat shock or to the aggregation-inducing chemicals canavanine and azetidine-2-carboxylic acid. The chemical treatment was a welcome suggestion from reviewer #3. These experiments will significantly strengthen any claims based on the outcome of the aggregation assay.

      If NASP is not required to prevent H3 degradation in egg chambers, then why are H3 levels much lower in NASP input lanes relative to wild-type egg chambers in Fig 4D?

      We appreciate the reviewer's input regarding the reduced H3 levels in the NASP mutant egg chambers. We observe this reduction in H3 levels in the input because the altered solubility of H3 leads to the loss of H3 protein at different steps of the aggregate isolation assay. We will add a supplemental figure showing H3 levels at different steps of the aggregate isolation assay. We do want to stress, however, that the total levels of H3 in stage 14 egg chambers do not change between WT and the NASP mutant.

      A corollary to this is that the increased fraction of H3 in aggregates in NASP mutants seems to be entirely due to the reduction in total H3 levels rather than an increase in aggregated H3. If NASP's role is to prevent aggregation in the cytoplasm, and degradation has not yet begun in egg chambers, then why are aggregated H3 levels not increased in NASP mutants relative to wild-type egg chambers? If the same number of egg chambers were used, shouldn't the total amount of histone be the same in the absence of degradation?
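
      The arithmetic behind this concern can be illustrated with a short sketch. The function name `aggregated_fraction` and the amounts below are hypothetical, chosen only to show that the aggregated fraction doubles when the input signal halves, even if the absolute aggregated amount is unchanged.

```python
def aggregated_fraction(aggregate, total_input):
    """Fraction of the input signal that is recovered in the aggregate sample."""
    if total_input <= 0:
        raise ValueError("input signal must be positive")
    return aggregate / total_input


# Hypothetical band intensities (arbitrary units), not values from Fig 4D.
# The aggregate signal is held constant; only the input signal differs.
wt_fraction = aggregated_fraction(aggregate=10.0, total_input=100.0)
nasp_fraction = aggregated_fraction(aggregate=10.0, total_input=50.0)
fold_change = nasp_fraction / wt_fraction
```

      Under these assumed numbers the fraction rises twofold without any increase in aggregated H3, which is why absolute aggregate levels are needed alongside the fraction.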

      In previously published work, we demonstrated that total H3 levels are unaffected when comparing WT and NASP mutant stage 14 egg chambers. This means that the amount of H3 deposited into the eggs does not change in the absence of NASP. To address the reviewer's comment, we will change the text to make the link to our previous work clear. As stated above, we will add a supplemental figure showing H3 levels at different steps of the aggregate isolation assay.

      The live imaging studies are well designed, executed, and quantified. They use an established genotype (H3.2-Dendra2) in wild-type and NASP maternal mutants to demonstrate that NASP is not directly involved in nuclear import of H3.2. Decreased import is likely due to reduced H3.2 levels in NASP mutants rather than reduced import rates per se. The same methodology was used to determine that loss of NASP did not affect H3.2 nuclear export. These findings eliminate H3.2 nuclear import/export regulation as possible roles for NASP, which had been previously proposed.

      Thank you.

      Live imaging also conclusively demonstrates that the levels of H3.2 in the nucleoplasm and in mitotic chromatin are significantly lower in NASP mutants than wild-type nuclei. Despite these lower histone levels, the nuclear cycle duration is only modestly lengthened. The live imaging of NASP-Dendra2 nuclear import conclusively demonstrates that NASP and H3.2 are unlikely to be imported into the nucleus as one complex.

      Thank you.

      Minor Comments:

      Additional details on how the NASP-Dendra2 CRISPR allele was generated should be provided. In addition, additional details on how it was determined that this allele is functional should be provided (e.g. quantitative assays for fertility/embryo viability of NASP-Dendra2 females).

      We will make these additions to the text.

      If statistical tests are used to determine significance, the type of test used should be reported in the figure legends throughout.

      We will add the statistical tests used to the figure legends.

      The western blot shown in Figure 4A looks more like a 4-fold reduction in H3 levels in NASP mutants relative to wild-type embryos, rather than the quantified 2-fold reduction. Perhaps a more representative blot can be shown.

      We have additional blots in the supplemental figure S3C. The quantification was performed after normalization to the total protein levels and we can highlight that in the figure legend.

      Reviewer #2 (Significance (Required)):

      As a fly chromatin biologist with colleagues that utilize mammalian experimental systems, I feel this manuscript will be of broad interest to the chromatin research community. Packaging of the genome into chromatin affects nearly every DNA-templated process, making the mechanisms by which histone proteins are expressed, chaperoned, and deposited into chromatin of high importance to the field. The study has multiple strengths, including high-quality quantitative imaging and use of a terrific experimental system (storage and deposition of soluble histones in early fly embryos). The study also answers outstanding questions in the field, specifically that NASP does not control nuclear import/export of histone H3. Instead, the authors propose that NASP functions to prevent protein aggregation. If this could be conclusively demonstrated, it would be valuable to the field. However, the protein aggregation studies need improvement. Technical demonstration that their differential centrifugation assay accurately detects aggregated proteins is needed. Further, NASP mutants do not exhibit increased H3 protein aggregation in the data presented. Instead, the increased fraction of aggregated H3 in NASP mutants seems to be due to a reduction in the overall levels of H3 protein, which is contrary to the model presented in this paper.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript by Das et al. entitled "NASP functions in the cytoplasm to prevent histone H3 aggregation during early embryogenesis", explores the role of the histone chaperone NASP in regulating histone H3 dynamics during early Drosophila embryogenesis. Using primarily live imaging approaches, the authors found that NASP is not directly involved in the import or export of H3. Moreover, the authors claimed that NASP prevents H3 aggregation rather than protects against degradation.

      Major Comments:

      Figure 1A-B: The plotted data appear to have substantial dispersion. Could the authors include individual data points or provide representative images to help the reader assess variability?

      We chose to show unnormalized data in Figure 1 so readers could better compare the actual import values of H3 in the presence and absence of NASP. We felt it was a better representation of the true biological difference, although the raw data are more dispersed. We also included normalized data in the supplement. Regardless, we will add representative stills to Figure 1 and include an H3-Dendra2 movie in the supplement to show the representative data.

      Given that the authors conclude that the reduced nuclear import is due to lowered H3 levels in NASP-deficient embryos, would overexpression of H3 rescue this phenotype? This would directly test whether H3 levels, rather than import machinery per se, drive the effect.

      We thank the reviewer for their valuable suggestion. We and others have tried to overexpress histones in the Drosophila early embryo without success. There must be an undefined feedback mechanism preventing histone overexpression in the germline. In fact, a recent paper deposited on bioRxiv (https://doi.org/10.1101/2024.12.23.630206) suggests that H4 protein could provide a feedback mechanism to prevent histone overexpression. While we would love to do this experiment, it is not technically feasible at this time.

      Figure 2A-B: The authors present the Relative Intensity of H3-Dendra2, but this metric obscures absolute differences between Control and NASP knockout embryos. Please include Total Intensity plots to show the actual reduction in H3 levels.

      We will add the total H3-Dendra2 intensity plots to the supplemental figure for the export curves.

      Additionally, Western blot analysis of nucleoplasmic H3 from wild-type vs. NASP-deficient embryos would provide essential biochemical confirmation of H3 level reductions.

      We will measure nuclear H3 levels by western from 0-2 hr embryos laid by WT and NASP mutant flies.

      Figure 4: To support the conclusion that NASP prevents H3 aggregation, I recommend performing aggregation assays by adding compounds that induce unfolding (amino acid analogues that induce unfolding, like canavanine or Azetidine-2-carboxylic acid) or using aggregation-prone H3 mutants.

      This is a very helpful suggestion! It is difficult to get chemicals into Drosophila eggs, but we will treat extracts directly with these chemicals. Additionally, we will use heat shocked eggs and extracts as an additional control.

      Inclusion of CMA and proteasome inhibition experiments could also clarify whether degradation pathways are secondarily involved or compensatory in the absence of NASP.

      The degradation pathway for H3 in the absence of NASP is unknown and a major focus of our future work is to define this pathway. Drosophila does not have a CMA pathway and therefore, we don't know how H3 aggregates are being sensed.

      Minor Comments:

      (1) The Introduction would benefit from mentioning the two NASP isoforms that exist in mammals (sNASP and tNASP), as this evolutionary context may inform interpretation of the Drosophila results.

      We will edit the text to state that Drosophila NASP is the sole homolog of sNASP and that a tNASP ortholog is not found in Drosophila.

      (2) Could the authors comment on the status of histone H4 in their experimental system? Given the observed cytoplasmic pool of H3, is it likely to exist as a monomer? If this H3 pool is monomeric, does that suggest an early failure in H3-H4 dimerization, and could this contribute to its aggregation propensity?

      In our previous work we noted that NASP binds preferentially to H3 and that the levels of H3 were much more reduced upon NASP depletion than those of H4. We pointed out in that publication that our data were consistent with H3 stores being monomeric in the Drosophila embryo. We don't have an H4-Dendra2 line to test this. In the future, however, this is something we are very keen to look at.

      Reviewer #3 (Significance (Required)):

      This work addresses a timely and important question in the field of chromatin biology and developmental epigenetics. The focus on histone homeostasis during embryogenesis and the cytoplasmic role of NASP adds a novel perspective. The live imaging experiments are a clear strength, providing valuable spatiotemporal insights. However, I believe that the manuscript would benefit significantly from additional biochemical validation to support and clarify some of the mechanistic claims.

      3. Description of the revisions that have already been incorporated in the transferred manuscript


      4. Description of analyses that authors prefer not to carry out


    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript by Das et al. entitled "NASP functions in the cytoplasm to prevent histone H3 aggregation during early embryogenesis", explores the role of the histone chaperone NASP in regulating histone H3 dynamics during early Drosophila embryogenesis. Using primarily live imaging approaches, the authors found that NASP is not directly involved in the import or export of H3. Moreover, the authors claimed that NASP prevents H3 aggregation rather than protects against degradation.

      Major Comments:

      Figure 1A-B: The plotted data appear to have substantial dispersion. Could the authors include individual data points or provide representative images to help the reader assess variability? Given that the authors conclude that the reduced nuclear import is due to lowered H3 levels in NASP-deficient embryos, would overexpression of H3 rescue this phenotype? This would directly test whether H3 levels, rather than import machinery per se, drive the effect.

      Figure 2A-B: The authors present the Relative Intensity of H3-Dendra2, but this metric obscures absolute differences between Control and NASP knockout embryos. Please include Total Intensity plots to show the actual reduction in H3 levels. Additionally, Western blot analysis of nucleoplasmic H3 from wild-type vs. NASP-deficient embryos would provide essential biochemical confirmation of H3 level reductions.

      Figure 4: To support the conclusion that NASP prevents H3 aggregation, I recommend performing aggregation assays by adding compounds that induce unfolding (amino acid analogues that induce unfolding, like canavanine or Azetidine-2-carboxylic acid) or using aggregation-prone H3 mutants. Inclusion of CMA and proteasome inhibition experiments could also clarify whether degradation pathways are secondarily involved or compensatory in the absence of NASP.

      Minor Comments:

      (1) The Introduction would benefit from mentioning the two NASP isoforms that exist in mammals (sNASP and tNASP), as this evolutionary context may inform interpretation of the Drosophila results.

      (2) Could the authors comment on the status of histone H4 in their experimental system? Given the observed cytoplasmic pool of H3, is it likely to exist as a monomer? If this H3 pool is monomeric, does that suggest an early failure in H3-H4 dimerization, and could this contribute to its aggregation propensity?

      Significance

      This work addresses a timely and important question in the field of chromatin biology and developmental epigenetics. The focus on histone homeostasis during embryogenesis and the cytoplasmic role of NASP adds a novel perspective. The live imaging experiments are a clear strength, providing valuable spatiotemporal insights. However, I believe that the manuscript would benefit significantly from additional biochemical validation to support and clarify some of the mechanistic claims.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript focuses on the role of the histone chaperone NASP in Drosophila. NASP is a chaperone specific to histone H3 that is conserved in mammals. Many aspects of the molecular mechanisms by which NASP selectively binds histone H3 have been revealed through biochemical studies. However, key aspects of NASP's in vivo roles remain unclear, including where in the cell NASP functions, and how it prevents H3 degradation. Through live imaging in the early Drosophila embryo, which possesses large amounts of soluble H3 protein, Das et al determine that NASP does not control nuclear import or export of H3.2 or H3.3. Instead, they find through differential centrifugation analysis that NASP functions in the cytoplasm to prevent H3 aggregation and hence its subsequent degradation.

      Major Comments:

      1. The protein aggregation assays raise several questions.

      a. From a technical standpoint, it would be helpful to have a positive control to demonstrate that the assay is effective at detecting protein aggregates, i.e., a genotype that exhibits increased protein aggregation; this could be for a protein besides H3.

      b. If NASP is not required to prevent H3 degradation in egg chambers, then why are H3 levels much lower in NASP input lanes relative to wild-type egg chambers in Fig 4D?

      c. A corollary to this is that the increased fraction of H3 in aggregates in NASP mutants seems to be entirely due to the reduction in total H3 levels rather than an increase in aggregated H3. If NASP's role is to prevent aggregation in the cytoplasm, and degradation has not yet begun in egg chambers, then why are aggregated H3 levels not increased in NASP mutants relative to wild-type egg chambers? If the same number of egg chambers were used, shouldn't the total amount of histone be the same in the absence of degradation?

      2. The live imaging studies are well designed, executed, and quantified. They use an established genotype (H3.2-Dendra2) in wild-type and NASP maternal mutants to demonstrate that NASP is not directly involved in nuclear import of H3.2. Decreased import is likely due to reduced H3.2 levels in NASP mutants rather than reduced import rates per se. The same methodology was used to determine that loss of NASP did not affect H3.2 nuclear export. These findings eliminate H3.2 nuclear import/export regulation as possible roles for NASP, which had been previously proposed.

      3. Live imaging also conclusively demonstrates that the levels of H3.2 in the nucleoplasm and in mitotic chromatin are significantly lower in NASP mutants than wild-type nuclei. Despite these lower histone levels, the nuclear cycle duration is only modestly lengthened.

      4. The live imaging of NASP-Dendra2 nuclear import conclusively demonstrates that NASP and H3.2 are unlikely to be imported into the nucleus as one complex.

      Minor Comments:

      1. Additional details on how the NASP-Dendra2 CRISPR allele was generated should be provided. In addition, additional details on how it was determined that this allele is functional should be provided (e.g. quantitative assays for fertility/embryo viability of NASP-Dendra2 females)
      2. If statistical tests are used to determine significance, the type of test used should be reported in the figure legends throughout.
      3. The western blot shown in Figure 4A looks more like a 4-fold reduction in H3 levels in NASP mutants relative to wild-type embryos, rather than the quantified 2-fold reduction. Perhaps a more representative blot can be shown.

      Significance

      As a fly chromatin biologist with colleagues that utilize mammalian experimental systems, I feel this manuscript will be of broad interest to the chromatin research community. Packaging of the genome into chromatin affects nearly every DNA-templated process, making the mechanisms by which histone proteins are expressed, chaperoned, and deposited into chromatin of high importance to the field. The study has multiple strengths, including high-quality quantitative imaging and use of a terrific experimental system (storage and deposition of soluble histones in early fly embryos). The study also answers outstanding questions in the field, specifically that NASP does not control nuclear import/export of histone H3. Instead, the authors propose that NASP functions to prevent protein aggregation. If this could be conclusively demonstrated, it would be valuable to the field. However, the protein aggregation studies need improvement. Technical demonstration that their differential centrifugation assay accurately detects aggregated proteins is needed. Further, NASP mutants do not exhibit increased H3 protein aggregation in the data presented. Instead, the increased fraction of aggregated H3 in NASP mutants seems to be due to a reduction in the overall levels of H3 protein, which is contrary to the model presented in this paper.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors investigate the function of the H3 chaperone NASP, which is known to bind directly to H3 and prevent degradation of soluble H3. What is unclear is where NASP functions in the cell (nucleus or cytoplasm), how NASP protects H3 from degradation (direct or indirect), and if NASP affects H3 dynamics (nuclear import or export). They use the powerful model system of Drosophila embryos because the soluble H3 pool is high due to maternal deposition and they make use of photoconvertible Dendra-tagged proteins, since these are maternally deposited and can be used to measure nuclear import/export rates.

      Using these systems and tools, they conclude that NASP affects nuclear import, but only indirectly, because embryos from NASP mutant mothers start out with 50% of the maternally deposited H3. Because of the depleted H3 and reduced import rates, NASP deficient embryos also have reduced nucleoplasmic and chromatin-associated H3. Using a new Dendra-tagged NASP allele, the authors show that NASP and H3 have different nuclear import rates, indicating that NASP is not a chaperone that shuttles H3 into the nucleus. They test H3 levels in embryos that have no nuclei and conclude that NASP functions in the cytoplasm, and through protein aggregation assays they conclude that NASP prevents H3 aggregation.

      Major comments:

      The text was easy to read and logical. The data are well presented, methods are complete, and statistics are robust. The conclusions are largely reasonable. However, I am having trouble connecting the conclusions in text to the data presented in Figure 4.

      First, I'm confused why the conclusion from Figure 4A is that NASP functions in the cytoplasm of the egg. Couldn't NASP be required in the ovary (in, say, nurse cell nuclei) to stimulate H3 expression and deposition into the egg? The results in 4A would look the same if the mothers deposit 50% of the normal H3 into the egg. Why is NASP functioning specifically in the cytoplasm when it is also so clearly imported into the nucleus? Maybe NASP functions wherever it is, and by preventing nuclear import, you force it to function in the cytoplasm. I do not have additional suggestions for experiments, but I think the authors need to be very clear about the different interpretations of these data and to discuss WHY they believe their conclusion is strongest.

      Second, an alternate conclusion from Figure 4D/E is that mothers are depositing less H3 protein into the egg, but the same total amount is being aggregated. This amount of aggregated protein remains constant in activated eggs, but additional H3 translation leads to more total H3? The authors mention that additional translation can compensate for reduced histone pools (line 416).

      As the function of NASP in the cytoplasm (when it clearly imports into the nucleus) and role in H3 aggregation are major conclusions of the work, the authors need to present alternative conclusions in the text or complete additional experiments to support the claims. Again, I do not have additional suggestions for experiments, but I think the authors need to be very clear about the different interpretations of these data and to discuss WHY they believe their conclusion is strongest.

      Data presentation:

      Overall, I suggest moving some of the supplemental figures to the main text, adding representative movie stills to show where the quantitative data originated, and moving the H3.3 data to the supplement. Not because it's not interesting, but because H3.3 and H3.2 are behaving the same.

      Fig 1:

      It would strengthen the figure to include representative still images that led to the quantitative data, mostly so readers understand how the data were collected. The inclusion of a "simulated 50% H3" in panel C is confusing. Why? I would also consider normalizing the data between A and B (and C and D) by dividing NASP/WT. This could be included in the supplement (OPTIONAL)

      Fig S1:

      The data simulation S1G should be moved to the main text, since it is the primary reason the authors reject the hypothesis that NASP influences H3 import rates.

      Fig 2:

      Once again, I think it would help to include a few representative images of the photoconverted Dendra2 in the main text. I struggled with A/B, I think due to not knowing how the data were normalized. When I realized that the WT and NASP data are not normalized to each other, but that the NASP values are likely starting less than the WT values, it made way more sense. I suggest switching the order of data presentation so that C-F are presented first to establish that there is less chromatin-bound H3 in the first place, and then present A/B to show no change in nuclear export of the H3 that is present, allowing the conclusion of both less soluble AND chromatin-bound H3.

      Fig S2:

      If M1-M3 indicate males, why are the ovaries also derived from males? I think this is just confusing labeling.

      Supplemental Movie S1:

      Beautiful. Would help to add a time stamp (OPTIONAL).

      Fig 3:

      Panel C is the same as Fig S1A (not Fig 1A, as is said in the legend), though I appreciate the authors pointing it out in the legend. Also see line 276. Panel D is a little confusing, because presumably the "% decrease in import rate" cannot be positive (Y axis). This could be displayed as a scatter (not bar) as in Panels B/C (right) where the top of the Y axis is set to 0.

      Fig S3:

      A: What do the different panels represent? I originally thought developmental time, but now I think just different representative images? Are these age-matched from time at egg lay? C: What does "embryos" mean? Same question for Fig 4A.

      Fig 4:

      A: What does "embryos" mean? Number of embryos? Age in hours? C: Not sure the workflow figure panel is necessary, as I can't tell what each step does. This is better explained in methods. However I appreciated the short explanation in the text (lines 314-5).

      Minor comments:

      The authors should describe the nature of the NASP alleles in the main text and present evidence of robust NASP depletion, potentially both in ovaries and in embryos. The antibody works well for westerns (Fig S2B). This is sort of demonstrated later in Figure 4A, but only in NASP x twine activated eggs.

      Lines 163, 251, 339: minor typos.

      Line 184: It would help to clarify. I'm assuming cytoplasmic concentration (or overall) rather than nuclear concentration. If nuclear, I'd expect the opposite relationship. This occurs again when discussing NASP (line 267). I suspect it's also not absolute concentration, but relative concentration difference between cytoplasm and nucleus. It would help if the authors were more precise.

      Line 189: Given that the "established integrative model" helps to reject the hypothesis that NASP is involved in H3 import, I think it's important to describe the model a little more, even though it's previously published.

      Line 203: "The measured rate of H3.2 export from the nucleus is negligible": clarify that this is in WT situations and not a conclusion from this study.

      Line 201: How can the authors be so sure that the decrease in WT is due to "the loss of non-chromatin bound nucleoplasmic H3.2-Dendra2"?

      Line 217: In the conclusion, the authors indicate that NASP indirectly affects soluble supply of H3 in the nucleoplasm. I do believe they've shown that the import rate effect is indirect, but I don't know why they conclude that the effect of NASP on the soluble nucleoplasmic H3 supply is indirect. Similarly, the conclusion is indirect on line 239. Yet, the authors have not shown it's not direct, just assumed since NASP results in 50% decrease to deposited maternal histones.

      Line 292: What is the nature of the NASP "mutant"? Is it a null? Similarly, what kind of "mutant" is the twine allele (line 295)?

      Line 316: Why did the authors use stage 14 egg chambers here when they previously used embryos? This becomes clearer shortly afterwards, when the authors examine activated eggs, but it's confusing in text.

      Lines 343-348: It's unclear if the authors are drawing extended conclusions here or if they are drawing from prior literature (if so, citations would be required). For example, why during oogenesis/embryogenesis are aggregation and degradation developmentally separated?

      Lines 386-7: I do not understand why the authors conclude that H3 aggregation and degradation are "developmentally uncoupled" and why, in the absence of NASP, "H3 aggregation precedes degradation."

      Line 395: Why suddenly propose that NASP also functions in the nucleus to prevent aggregation, when earlier the authors suggest it functions only in the cytoplasm?

      Lines 409-413: The authors claim that histone deficiency likely does not cause the embryonic arrest seen in embryos from NASP mutant mothers. This is because H3 is reduced by 50% yet some embryos arrest long before they've depleted this supply. However, the authors also showed that H3 import rates are affected in these embryos due to lower H3 concentration. Since the early embryo cycles are so rapid, reduced H3 import rates could lead to early arrest, even though available H3 remains in the cytoplasm.

      Significance

      The significance of the work is conceptual, as NASP is known to function in H3 availability but the precise mechanism is elusive. This work represents a necessary advance, especially to show that NASP does not affect H3 import rates, nor does it chaperone H3 into the nucleus. However, the authors acknowledge that many questions remain. Foremost, why is NASP imported into the nucleus and what is its role there?

      I believe this work will be of interest to those who focus on early animal development, but NASP may also represent a tool, as the authors conclude in their discussion, to reduce histone levels during development and examine nucleosome positioning. This may be of interest to those who work on chromatin accessibility and zygotic genome activation.

      I am a genetics expert who works in Drosophila embryogenesis. I do not have the expertise to evaluate the aggregate methods presented in Figure 4.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Review report for 'Sterols regulate ciliary membrane dynamics and hedgehog signaling in health and disease', Lamazière et al.

      Reviewer #1

      In this manuscript, Lamazière et al. address an important understudied aspect of primary cilium biology, namely the sterol composition in the ciliary membrane. It is known that sterols especially play an important role in signal transduction between PTCH1 and SMO, two upstream components of the Hedgehog pathway, at the primary cilium. Moreover, several syndromes linked to cholesterol biosynthesis defects present clinical phenotypes indicative of altered Hh signal transduction. To understand the link between ciliary membrane sterol composition and Hh signal transduction in health and disease, the authors developed a method to isolate primary cilia from MDCK cells and coupled this to quantitative metabolomics. The results were validated using biophysical methods and cellular Hh signaling assays. While this is an interesting study, it is not clear from the presented data how general the findings are: can cilia be isolated from different mammalian cell types using this protocol? Is the sterol composition of MDCK cells expected to be the same in fibroblasts or other cell types? Without this information, it is difficult to judge whether the conclusions reached in fibroblasts are indeed directly related to the sterol composition detected in MDCK cells. Below is a detailed breakdown of suggested textual changes and experimental validations to strengthen the conclusions of the manuscript.

      We would like to thank the reviewer for their helpful comments.

      Major comments:

      • It appears that the comparison has been made between ciliary membranes and the rest of the cell's membranes, which includes many other membranes besides the plasma membrane. This significantly weakens the conclusions on the sterol content specific to the cilium, as it may in fact be highly similar to the rest of the plasma membrane. It is for example known that lathosterol is biosynthesized in the ER, and therefore the non-presence in the cilium may reflect a high abundance in the ER but not necessarily in the plasma membrane.

      The reviewer is correct that we compared the sterol composition of the primary ciliary membrane to the average of the remaining cellular membranes. We agree that this broader reference fraction contains multiple intracellular membranes, including ER- and Golgi-derived compartments, and therefore does not isolate the plasma membrane specifically. We would like to emphasize that our study did not aim to compare the cilium directly to the plasma membrane, nor did we claim that the comparison was in any way related to the plasma membrane. It is also worth noting that previous studies in other ciliated organisms have reported a higher cholesterol content in cilia compared to the plasma membrane, suggesting that the two membranes may not be compositionally identical despite their continuity. However, we concur that determining the sterol composition of the MDCK plasma membrane would provide valuable context and enable a comparison with the membrane continuous with the ciliary membrane. Hence, we are willing to try isolating plasma membrane in the same cellular contexts.

      • While the protocol to isolate primary cilium from MDCK cells is a valuable addition to the methods available, it would be good to at least include a discussion on its general applicability. Have the authors tried to use this protocol on fibroblasts for example?

      We thank the reviewer for the positive comment on the value of the ciliary isolation protocol. Indeed, we have attempted to apply the same approach to other ciliated cell types, namely IMCD3 and MEF cells. In the case of IMCD3 cells, we were able to isolate primary cilia using the same general strategy; however, we are still refining the preparation, as the overall yield is lower than in MDCK cells and the amount of material obtained is currently insufficient for comprehensive biochemical analyses. With MEF (fibroblast) cells, the procedure proved even more challenging, as the yield of isolated cilia was extremely low. This difficulty is likely due to the shorter length of fibroblast cilia and to their positioning beneath the cell body, which probably makes them more resistant to detachment. Overall, these observations suggest that while the protocol can be adapted to other cell types, its efficiency depends on cellular architecture. We have added a discussion of these aspects in the revised manuscript to clarify the method's current scope and limitations (lines 492-502).

      • Some of the conclusions in the introduction (lines 75-80) seem to be incorrectly phrased based on the data: in basal conditions, ciliary membranes are already enriched in cholesterol and desmosterol, and the treatment lowers this in all membranes.

      We agree; this was modified in the revised manuscript (lines 75-80).

      • There seems to be little effect of simvastatin on overall cholesterol levels. Can the authors comment on this result? How would the membrane fluidity be altered when mimicking simvastatin-induced composition? Since the effect on Hh signaling appears to be the biggest (Figure 5B) under simvastatin treatment, it would be interesting to compare this against that found for AY9944 treatment. Also, the authors conclude that the effects of simvastatin treatment on ciliary membrane sterol composition are the mildest, however, one could argue that they are the strongest as there is a complete lack of desmosterol.

      We thank the reviewer for these insightful comments. Regarding the modest overall effect of simvastatin on cholesterol levels, we would like to note that MDCK cells are an immortalized epithelial cell line with high metabolic plasticity. Such cancer-like cell types are known to exhibit enhanced de novo lipogenesis, particularly under culture conditions with ample glucose availability. This compensatory lipid biosynthesis can partially counterbalance pharmacological inhibition of the cholesterol biosynthetic pathway. Because simvastatin acts upstream in the pathway (at HMG-CoA reductase), its inhibition primarily reduces early intermediates rather than fully depleting end-product cholesterol, explaining the relatively mild changes observed in total cholesterol content.

      Concerning desmosterol, we agree with the reviewer that its complete loss under simvastatin treatment is a striking finding that deserves further discussion. Interestingly, our data show that simvastatin treatment produces the strongest inhibition of pathway activation (as measured by SMO activation), but the weakest effect on signal transduction downstream of constitutively active SMOM2. This dichotomy suggests that the absence of desmosterol may preferentially affect the activation step of Hedgehog signaling at the ciliary membrane, without equally impacting downstream propagation. We have expanded the Results section to highlight this potential role of desmosterol in the activation phase of Hedgehog signaling and to contrast it with the effects observed under AY9944 treatment (lines 463-469).

      It is not clear to me why the authors have chosen to use SAG to activate the Hh pathway, as this is a downstream mode of activation and bypasses PTCH1 (and therefore a potentially sterol-mediated interaction between the two proteins). It would be very informative to compare the effect of sterol modulation on the ability of ShhN vs SAG to activate the pathway.

      Our study aims to demonstrate that the sterol composition of the ciliary membrane plays an essential role in the proper functioning of the Hedgehog (Hh) signaling pathway, comparable in importance to that of oxysterols and free cholesterol. Because ShhN itself is covalently modified by cholesterol, and Smoothened (SMO) can be directly activated by both oxysterols and cholesterol, we reasoned that using a non-native SMO agonist such as SAG would allow us to specifically assess defects arising from alterations in membrane-bound sterols. In this way, pathway activation by SAG provides a more direct readout of the functional contribution of ciliary membrane sterols to SMO activity, independent of potential confounding effects related to ShhN processing, secretion, or PTCH1-mediated regulation.

      • The conclusions about the effect of tamoxifen on SMO trafficking in MEFs should be validated in human patient cells before being able to conclude that there is a potential off-target effect (line 438). Also, if that is the case, the experiment of tamoxifen treatment of EBP KO cells should give an additional effect on SMO trafficking. Also, could the CDPX2 phenotypes in patients be the result of different cell types being affected than the fibroblast used in this study?

      We agree that carrying out the proposed experiment would be a good way to assess a potential off-target effect. However, such validation is beyond the scope of the present study, as this comment on an off-target effect was intended primarily to propose a mechanistic hypothesis to explain the differences observed in Hedgehog pathway activation between patient-derived fibroblasts and tamoxifen-treated MEFs. We leaned towards this hypothesis because drug treatments are known for their variable specificity, but we agree that other hypotheses are possible, among them the difference in cell type, as both are fibroblasts but of different origins. We rephrased this passage in the revised manuscript (lines 447-448).

      Regarding the reviewer's third point, we fully agree that the CDPX2 phenotype in patients is unlikely to arise solely from fibroblast dysfunction. Nevertheless, fibroblasts are the only patient-derived cells currently available to us, and they provide a useful model for assessing ciliary signaling. It is reasonable to expect that similar defects could occur in other, more physiologically relevant cell types.

      • For the experiments with the SMO-M2 mutant, it would be useful to show the extent of pathway activation by the mutant compared to SAG or ShhN treatment of non-transfected cells. Moreover, it will be necessary to exclude any direct effects of the compound treatment on the ability of this mutant to traffic to the primary cilium, which can easily be done using fluorescence microscopy as the mutant is tagged with mCherry.

      The SmoM2 mutant is indeed a well-characterized constitutively active form of Smoothened that has been extensively studied by us and others. It is well established that this mutant correctly localizes to the primary cilium and robustly activates the Hedgehog pathway in MEFs (see Eguether et al., Dev. Cell, 2014 or Eguether et al., Mol. Biol. Cell, 2018). In our study, we have already included supporting evidence for pathway activation in Supplementary Figure S1b, showing Gli1 expression levels in untreated MEFs transfected with SmoM2, which illustrates the extent of its activation compared to ligand-induced conditions.

      In line with the reviewer's recommendation, we will additionally include microscopy data showing SmoM2 localization in MEFs treated with the different sterol modulators. These data should confirm that the observed effects are not due to altered ciliary trafficking of the mutant protein but instead reflect changes in downstream signaling or membrane composition.

      Minor comments:

      Line 74: 'in patients', should be rephrased to 'patient-derived cells'

      This was modified in the revised manuscript.

      Figure 2A: What do the '+/-' indicate? They seem to be erroneously placed.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Figure 2B: no label present for which bar represents cilia/other membranes

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Figure 2C: this representation is slightly deceptive, since the difference between cells and cilia for lanosterol is not significant, as shown in figure 2A.

      This representation has been removed in the revised figures.

      Figure 3A: it would be useful to also show where 8-DHC is in the biosynthetic pathway.

      This has been modified in the revised figures.

      Line 373: the title should be rephrased as it infers that DHCR7 was blocked in model membranes, which is not the case.

      This has been modified in the revised manuscript.

      Lines 377-384: this paragraph seems to be a mix of methods and some explanation, but should be rephrased for clarity.

      We believe the technical information within this paragraph is useful for the reader's understanding. We would rather leave it as is unless recommended otherwise by other reviewers or editorial staff.

      Line 403: 'which could explain the resulting defects in Hedgehog signaling': how and what defects? At this point in the study no defects in Hh signaling have been shown.

      This has been modified in the revised manuscript.

      Figure 4D: 'd' is missing

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Line 408: SAG treatment resulted in slightly shorter cilia: this is not the case for just SAG treated cilia, but only for the combination of SAG + AY9944. However, in that condition there appears to be a subpopulation of very short cilia, are those real?

      This is correct: this is not the case for untreated cilia. The short population is real, however, and is seen not only with AY9944 but also with tamoxifen and simvastatin. Again, the relevance and significance of minor changes in cilia length are unclear, and we are not trying to draw any conclusion from this other than that the ciliary compartment is modified.

      Figure 5b: it would be good to add that all conditions contained SAG.

      This has been modified in the revised figures.

      Figure 5D: Since it is shown in Fig 5C that there are no positive cilia -SAG, there is no point to have empty graphs in Fig 5D on the left side, nor can any statistics be done. Similarly for 5K.

      We think this is still worth having in the figure. As the reviewer noted in one of their later comments, there are cases where Smoothened or Patched can be abnormally distributed (see also Eguether et al., Mol. Biol. Cell, 2018). This shows that we checked all conditions for the presence or absence of Smo and that there is no signal to be found. We would rather leave it as is unless asked otherwise by editorial staff.

      Figure 5E: it is not clearly indicated what is visualized in the inserts, sometimes it's a box, sometimes a line and they seem randomly integrated into the images.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Figure 5H: is this the intensity in just SMO positive cilia? If yes, this should be indicated, and the line at '0' for WT-SAG should be removed. I am also surprised there is then ns found for WT vs SLO, since in WT there are no positive cilia, but in SLO there are a few, so it appears to be more of a black-white situation. Perhaps it would be useful to split the data from different experiments to see if it is consistently the case that there is a low percentage of SMO positive cilia in SLO cells.

      Yes, as in the rest of Figure 5, the fluorescence intensity of Smo is only taken into account in SMO positive cells. This is now indicated in the figure legend (lines 890, 898, 903). As for the Smo positive cilia, this is a good suggestion. We checked, and for cilia in non-activated SLO patients, there are 8 positive cilia out of a total of 240 counted cilia, mainly from one of the experiments. We could remove the data or leave it as is, given that the result is not significant.

      Fig S1: panels are inverted compared to the order in which they are mentioned in the text.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Methods-pharmacological treatments: there appear to be large differences in concentrations chosen to treat MDCK versus MEF cells - can the authors comment on these choices and show that the enzymes are indeed inhibited at the indicated concentrations?

      We thank the reviewer for this important comment. The concentrations of the pharmacological treatments were optimized separately for MDCK and MEF cells based on cell-type-specific tolerance. For each compound, we used the highest concentration that produced no detectable cytotoxicity or morphological changes. These conditions ensured that the treatments were effective (as seen by changes in sterol composition in MDCK cilia and Hh pathway phenotypes in treated MEFs) and compatible with cell viability and ciliation. Although we did not directly assay enzymatic inhibition in each case, the selected concentrations are consistent with those previously reported to inhibit the targeted enzymes in similar cellular contexts.

      | Compound | Typical concentration range in mammalian cell culture | Typical exposure duration | Example cell types | Representative peer-reviewed references |
      | --- | --- | --- | --- | --- |
      | AY9944 (DHCR7 inhibitor) | 1-10 µM widely used; 1 µM for minimal on-target effects; 2.5-10 µM for robust sterol shifts | 24-72 h; some sterol studies up to several days | HEK293, fibroblasts, neuronal cells, macrophages | Kim et al., J Biol Chem, 2001 - used 1 µM in dose-response experiments; Haas et al., Hum Mol Genet, 2007 - 1 µM in cell-based assays; Recent macrophage sterol study - 2.5-10 µM to induce 7-DHC accumulation |
      | Simvastatin (HMG-CoA reductase inhibitor) | 0.1-10 µM common; 1-10 µM most widely used for robust pathway inhibition | 24-72 h | Diverse mammalian lines, including liver, fibroblasts, epithelial cells | Bytautaite et al., Cells (2020) - discusses common in-vitro ranges (1-10 µM); Mullen et al., 2011 - used 10 µM simvastatin, noting it is a standard in-vitro concentration |
      | Tamoxifen (modulator of sterol metabolism) | 1-20 µM; 1-5 µM for mild/longer treatments; 10-20 µM in cancer/cilia signaling studies | 24-72 h (longer treatments often at 1-5 µM) | MDCK, MEFs, MCF-7, diverse epithelial lines | Schlottmann et al., Cells (2022) - used 5-25 µM in sterol-related cell studies; MCF-7 literature - 0.1-1 µM for estrogenic signaling, higher (5-10 µM) for metabolic/sterol pathway effects; Additional cancer cell work indicating similar ranges |

      This information has been clarified in the revised Methods section (lines 222-224).

      (optional): it would be interesting to include a gamma-tubulin staining on the cilium prep to see if there is indeed a presence of the basal body as suggested by the proteomics data.

      Thank you, we will try this.

      There are many spelling mistakes and inconsistencies throughout the manuscript and its figures (mix of French and English for example) so careful proofreading would be warranted. Moreover, there are many mentions of 'Hedgehog defects' or 'Hedgehog-linked', where in fact it is a defect in or link to the Hedgehog pathway, not the protein itself. This should be corrected.

      We thank the reviewer for noting these issues. We apologize for the inconsistencies observed in the initial submission. As mentioned previously, some of the figures inadvertently included earlier versions, which may have contributed to the errors identified. All figures have now been carefully revised and updated in the resubmitted manuscript.

      Regarding the text, we are surprised to hear about the spelling inconsistencies, as the manuscript was professionally proofread prior to submission (documentation can be provided upon request). Nevertheless, we have conducted an additional round of thorough proofreading to ensure consistency throughout the text and figures.

      Finally, we have corrected all instances of "Hedgehog defects" or "Hedgehog-linked" to the more accurate phrasing "Hedgehog pathway defect" or "Hedgehog pathway-linked," as suggested by the reviewer throughout the manuscript.

      Reviewer #1 (Significance (Required)):

      The study of ciliary membrane composition is highly relevant to understand signal transduction in health and disease. As such, the topic of this manuscript is significant and timely. However, as indicated above, there are limitations to this study, most notably the comparison of ciliary membrane versus all cellular membranes (rather than the plasma membrane), which weakens the conclusions that can be drawn. Moreover, cell-type dependency should be more thoroughly addressed. There certainly is a methodological advance in the form of cilia isolation from MDCK cells, however, it is unclear how broadly applicable this is to other mammalian cell types.

      We would like to thank the reviewer for their helpful comments and we appreciate the reviewer's recognition of the relevance and timeliness of studying ciliary membrane composition in the context of signaling regulation. We fully acknowledge that our comparison was made between the primary ciliary membrane and the total cellular membrane fraction, which encompasses multiple intracellular membranes. Our intent, however, was to obtain a global overview of how the ciliary membrane differs from the average membrane environment within the cell, thereby highlighting features that are unique to the cilium as a signaling organelle. This approach provides valuable baseline information that complements, rather than replaces, future targeted comparisons with the plasma membrane. As mentioned in this reply, we aim at carrying out these experiments before publication. Regarding cell-type dependency, we concur that ciliary lipid composition may vary between cell types, reflecting differences in their functional specialization. Our method was intentionally established in MDCK cells, which are epithelial and highly ciliated, to ensure sufficient yield and reproducibility. We have initiated trials with other mammalian cell types, including IMCD3 and MEF cells, and while yields remain limited, preliminary results indicate that the approach is adaptable with further optimization. Thus, our current work establishes a robust and reproducible proof of concept in a mammalian model, providing the first detailed sterol fingerprint of a mammalian primary cilium.

      We believe this constitutes a significant methodological and conceptual advance, as it opens the way for systematic exploration of ciliary lipid composition across diverse mammalian systems and pathological contexts.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Overview Accumulating evidence suggests that sterols play critical roles in signal transduction within the primary cilium, perhaps most notably in the Hedgehog cascade. However, the precise sterol composition of the primary cilium, and how it may change under distinct biological conditions, remains unknown, in part because of the lack of reproducible, widely accepted procedures to purify primary cilia from mammalian cultured cells. In the present study, the authors have designed a method to isolate the cilium from the MDCK cells efficiently and then utilized this procedure in conjunction with mass spectrometry to systematically analyze the sterol composition of the ciliary membrane, which they then compare to the sterol composition of the cell body. By analyzing this sterol profile, the authors claim that the cilium has a distinct sterol composition from the cell body, including higher levels of cholesterol and desmosterol but lower levels of 8-DHC and lathosterol. This manuscript further demonstrates that alteration of sterol composition within cilia modulates Hedgehog signaling. These results strengthen the link between dysregulated Hedgehog signaling and defects in cholesterol biosynthesis pathways, as observed in SLOS and CDPX2.

      While the ability to isolate primary cilia from cultured MDCK cells represents an important technical achievement, the central claim of the manuscript - that cilia have a different sterol composition from the cell body - is not adequately supported by the data, and more rigorous comparisons between the ciliary membrane and key organellar membranes (such as plasma membrane) are required to make this claim. Moreover, although the authors have repeatedly mentioned that the ciliary sterol composition is "tightly regulated", there is no evidence provided to support such a claim. At best, the data suggest that the cilium and cell body may differ in sterol composition (though even that remains uncertain), but no underlying regulatory mechanisms are demonstrated. In addition, much of the 2nd half of the paper represents a rehash of experiments with sterol biosynthesis inhibitors that have already been published in the literature, making the conceptual advance modest at best. Lastly, the link between CDPX2 and defective Hedgehog signaling is tenuous.

      We would like to thank the reviewer for their helpful comments.

      Major comments

      Figure 1. C) Although the isolation of cilia from the MDCK cells using dibucaine treatment seems to be very efficient, the quality control of the fractionation procedure is limited to a single western blot of the purified cilia vs. cell body samples, with no representative data shown from the sucrose gradient fractionation steps. Given that prior studies (including those from the Marshall lab cited in this manuscript) found that 1) sucrose gradient fractionation was essential to obtain relatively pure ciliary fractions, and 2) the ciliary fractions appear to spread over many sucrose concentrations in those prior studies, the authors should have included the fractionation profile from the sucrose gradient used while isolating the primary cilium. This additional information would have further clarified and supported the efficiency of their proposed method.

      We thank the reviewer for their insightful comments regarding the quality control of our ciliary fractionation. We would like to clarify several important methodological aspects that distinguish our approach from those used in the studies cited (including those from the Marshall lab). In the cited work, the authors used a continuous sucrose gradient ranging from 30 % to 45 %, which allowed visualization of the distribution of ciliary proteins across the gradient. In contrast, we employed a discontinuous sucrose gradient (25 % / 50 %) optimized for higher recovery and reproducibility in our hands. In our preparation, the primary cilia consistently localize at the interface between the 25 % and 50 % layers. We systematically collect five 1 mL fractions from this interface and use fractions 1-3 for downstream analyses, as fractions 4-5 are typically already depleted of ciliary material. This targeted collection ensures good enrichment and low contamination, while avoiding unnecessary dilution of the limited ciliary sample. We also note that the prior studies the reviewer refers to were optimized for proteomic analyses, and therefore used actin as a marker of contamination from the cell body. In our case, the downstream application is lipidomic profiling, for which such protein-based contamination markers are not directly informative, since no reliable lipid marker exists to differentiate between organelle membranes. For this reason, we limited the protein-level validation to a semi-quantitative assessment of ciliary enrichment using ARL13B Western blotting, which robustly reports the presence and enrichment of ciliary membranes. Finally, to complement this targeted validation, we performed proteomic analysis followed by Gene Ontology (GO) Enrichment Analysis using the PANTHER database. 
This analysis evaluates the overrepresentation of proteins associated with ciliary structures and functions relative to the background frequency in the Canis lupus familiaris proteome. The resulting enrichment profile confirms that the isolated material is highly enriched in ciliary components and somewhat depleted of non-ciliary contaminants, thereby serving as an unbiased and global assessment of sample specificity and purity. We believe that, together, these methodological choices provide a rigorous and quantitative validation of our fractionation efficiency and support the robustness of the cilia isolation protocol used in this study.
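      To illustrate what "overrepresentation relative to the background frequency" means in practice, here is a minimal Python sketch. It is not the PANTHER implementation and does not claim to reproduce the settings used in the study: it computes a fold enrichment and a one-sided hypergeometric tail probability, which is one common way such tests are formulated. The function name and all counts are hypothetical.

```python
from math import comb


def go_enrichment(k, n, K, N):
    """Fold enrichment and one-sided hypergeometric p-value for one GO term.

    k: proteins in the sample annotated with the term
    n: total proteins in the sample
    K: proteins in the background proteome annotated with the term
    N: total proteins in the background proteome
    """
    # Observed frequency in the sample divided by the background frequency.
    fold = (k / n) / (K / N)
    # P(X >= k) when drawing n proteins without replacement from the background.
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    p_value = tail / comb(N, n)
    return fold, p_value


# Hypothetical counts: 12 of 100 sample proteins carry a term that
# annotates 40 of 2000 background proteins (expected count: 2).
fold, p_value = go_enrichment(k=12, n=100, K=40, N=2000)
print(f"fold enrichment = {fold:.1f}, p = {p_value:.2e}")
```

      A fold value above 1 with a small p-value indicates that the term is more frequent in the sample than chance would predict; real analyses additionally correct for testing many terms at once.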

      1. D) The authors presented proteomic data for the peptides analyzed from the isolated cilia in the form of GO term analysis; however, they did not provide examples of different proteins enriched within their fractionation procedure, aside from Arl13b shown in the blot. Including a summary table with representative proteins identified in the isolated ciliary fraction, along with the relative abundance or percentage distribution of these proteins, would make the data more informative.

      We thank the reviewer for this valuable suggestion. As mentioned in the manuscript, our proteomic dataset includes numerous hallmark components of the cilium, such as 18 IFT proteins, 4 BBS proteins, and several Hedgehog pathway components (including SuFu and Arl13b), as well as axonemal (Tubulin, Kinesin, Dynein) and centrosomal proteins (Centrin, CEPs, γ-Tubulin, and associated factors). This composition demonstrates that the isolated fraction is highly enriched in bona fide ciliary components while retaining a small proportion of basal body proteins, which is expected given their physical continuity. Importantly, our dataset shows a 70% overlap with the ciliary proteome published by Ishikawa et al. and a 41% overlap with the CysCilia consortium's list of potential ciliary proteins, which supports both the specificity and reliability of our isolation procedure. Regarding the suggestion to present relative protein abundances, we would like to clarify that defining "relative to what" is challenging in this context. The stoichiometry of ciliary proteins is largely unknown, and relative abundance normalized to total protein content can be misleading, as ciliary structural and signaling components differ greatly in copy number and membrane association. For this reason, we chose to highlight in the text proteins such as BBS and IFTs, which are known to be of low abundance within the cilium; their detection supports the depth and specificity of our proteomic coverage. In addition, we performed an unbiased Gene Ontology (GO) Enrichment Analysis using the PANTHER database, which provides a systematic and quantitative overview of the biological processes and cellular components overrepresented in our dataset relative to the canine proteome. This analysis, with regard to purity, was already discussed in the Discussion of the submitted manuscript.
To further address the reviewer's comment, we will include in the revised manuscript a supplemental table listing representative ciliary proteins identified in our fraction, including those overlapping with the CysCilia (Gold and potential lists), CiliaCarta and Ishikawa/Marshall proteomes. This addition should make the dataset more transparent and informative while preserving scientific rigor.

      Figure 2.

      The authors compare the sterol content of cilia with that of the whole cell (as cell membranes). However, different organelles have very different cholesterol contents: for instance, the plasma membrane itself contains around 50 mol% cholesterol, while organelles like the ER have barely any cholesterol. Thus, comparing these two samples and claiming a 2.5-fold increase in cholesterol levels is misleading. A more appropriate comparison would be between isolated primary cilia and isolated plasma membranes (procedures to isolate plasma membranes have been described previously, e.g., Naito et al., eLife 2019; Das et al., PNAS 2013). The absence of such controls makes it difficult to fully validate the reported magnitude of sterol enrichment in cilia relative to the cell surface.

      As already discussed above for reviewer 1, we would like to emphasize that our study did not aim to compare the cilium directly to the plasma membrane, nor did we claim that the comparison was in any way related to the plasma membrane. Our intent was to obtain a global overview of how the ciliary membrane differs from the average membrane environment within the cell, thereby highlighting features that are unique to the cilium as a signaling organelle. This approach provides valuable baseline information that complements, rather than replaces, future targeted comparisons with the plasma membrane. However, we concur that determining the sterol composition of the MDCK plasma membrane would provide valuable context and enable a comparison with the membrane continuous with the ciliary membrane. Hence, we are willing to try isolating plasma membrane in the same cellular contexts, and we thank the reviewer for the proposed literature.

      Also, because dibucaine was used here to isolate MDCK cilia, a control experiment to exclude possible effects of the dibucaine treatment on sterol biosynthesis would be helpful.

      Thank you for this comment. We will verify this point by using GC-MS to quantify the sterol content of whole MDCK cells with and without a 15-minute dibucaine treatment.

      Figure 3.

      Tamoxifen is a potent drug for nuclear hormone receptor activity and thus can independently influence various cellular processes. As several experiments in the later sections of the manuscript rely on tamoxifen treatment of cells, it is important that the authors include appropriate controls for tamoxifen treatment, to confirm that the observed effects do not stem from effects on nuclear hormone receptor activity. This would ensure that the observed effects can be confidently attributed to the experimental manipulation rather than to the intrinsic effects of tamoxifen.

      The reviewer is right: tamoxifen, like many drugs, has pleiotropic effects on different cellular processes. Aware of this possible issue, we turned to a genetic model, creating a CRISPR-Cas9-mediated knockdown of EBP, the enzyme targeted by tamoxifen. We showed in Figure 5 that the results from tamoxifen-treated cells and CRISPR EBP cells were in accordance with one another, showing that, for Hedgehog signaling, the effect of tamoxifen recapitulates the effect of the enzyme KO.

      Figure 4. The authors present the results of spectroscopy studies to analyze generalized polarization (GP) of liposomes in vitro, but only processed data are shown, and the raw spectra are not provided. The authors need to present representative spectra to enable the readers to interact with the raw data from the experiments.

      This has been added as new Supplemental Figure 1, with the corresponding figure legend (lines 898-904).
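      For readers unfamiliar with how a GP value is derived from raw emission spectra, the following Python sketch shows the standard two-band ratio. It assumes a Laurdan-type probe read at two emission bands (commonly near 440 nm and 490 nm); the probe, wavelengths, and intensities are assumptions for illustration and are not taken from the manuscript.

```python
def generalized_polarization(i_ordered, i_disordered):
    """Two-band generalized polarization.

    i_ordered:    emission intensity in the band favored by ordered membranes
                  (assumed here to be the shorter-wavelength band)
    i_disordered: emission intensity in the band favored by disordered membranes
    Returns a value between -1 and +1; higher means a more ordered membrane.
    """
    total = i_ordered + i_disordered
    if total == 0:
        raise ValueError("both intensities are zero; GP is undefined")
    return (i_ordered - i_disordered) / total


# Hypothetical intensities read from a single raw spectrum.
print(generalized_polarization(300.0, 100.0))  # 0.5
```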

      Figure 5. B) The experiment shows Gli1 mRNA levels following treatment with inhibitors of cholesterol biosynthesis, but similar findings have already been reported previously (e.g., Cooper et al, Nature Genetics 2003; Blassberg et al, Hum Mol Genet 2016), and the present results do not provide a significant conceptual advance over those earlier studies.

      We thank the reviewer for this comment and for highlighting the importance of earlier studies on Hedgehog (Hh) signaling and cholesterol metabolism. While we fully agree that confirming and extending established findings has intrinsic scientific value, we respectfully disagree with the assertion that our work does not provide conceptual novelty.

      The seminal work by Cooper et al. (Nature Genetics, 2003) indeed laid the foundation for linking sterol metabolism to Hedgehog signaling, and we cite it as such. However, that study was conducted in chick embryos, a model that is relatively distant from mammalian systems and human pathophysiology. Moreover, their approach relied heavily on cyclodextrin-mediated cholesterol depletion, which is non-specific and extracts multiple sterols from membranes (discussed in this article lines 512-516). In contrast, our study employs pharmacological inhibitors targeting specific enzymes in the sterol biosynthetic pathway, thereby allowing us to modulate distinct steps and intermediates in a controlled and mechanistically informative manner. We also extend these analyses to patient-derived fibroblasts and CRISPR-engineered cells, providing direct human and genetic validation of the observed effects. Importantly, we complement these cellular studies with biochemical characterization of isolated ciliary membranes from MDCK cells, enabling a direct assessment of how specific sterol alterations affect ciliary composition and Hh pathway function - an angle not addressed in prior work.

      Regarding Blassberg et al. (Hum. Mol. Genet., 2016), we agree that part of our findings recapitulates their observations on SMO-related signaling defects, which we view as an important confirmation of reproducibility. However, their study primarily sought to distinguish whether Hh pathway impairment in SLOS results from 7-DHC accumulation or cholesterol depletion, concluding that cholesterol deficiency was the main cause. Our results expand on this by demonstrating that perturbations extend beyond these two sterols, and that additional intermediates in the biosynthetic pathway also impact ciliary membrane composition and signaling competence. Furthermore, our experiments using the constitutively active SmoM2 mutant show that Hh signaling defects are not restricted to SMO activation per se, revealing a broader disruption of the signaling machinery within the cilium.

      Finally, neither of the above studies examined CDPX2 patient-derived cells or the consequences of EBP enzyme deficiency on Hh signaling. Our finding that this pathway is altered in this genetic context represents, to our knowledge, a novel link between CDPX2 and Hedgehog pathway dysfunction.

      Taken together, our work builds upon and extends previous findings by integrating cell-type-specific, biochemical, and patient-based analyses to provide a more comprehensive and mechanistically detailed view of how sterol composition of the ciliary membrane regulates Hedgehog signaling.

      In addition, the authors analyze the effect of these inhibitors on SAG stimulation, but the experiment lacks the control for Gli mRNA levels in the absence of SAG treatment. Without this control, it is impossible to know where the baseline in the experiment is and how large the effects in question really are.

      Below, we provide the data expressed using the ΔΔCt method (NT + SAG normalized to NT - SAG), which more clearly illustrates the magnitude of the effect in question. As similar qPCR-based Hedgehog pathway activation assays in MEFs have been published previously (see Eguether et al., Dev. Cell 2014; Eguether et al., Mol. Biol. Cell 2018), our goal here was not to re-establish the assay itself but to highlight the comparative effects across experimental conditions. In addition, one of the datasets was obtained using a new batch of SAG, which exhibited stronger pathway activation across all conditions (visible as higher overall expression levels). To ensure valid statistical comparisons across experiments and to focus on relative rather than absolute activation, we therefore chose to present the data as fold change values, which provides a more robust and statistically consistent measure for cross-condition analysis.
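      As a reminder of what the ΔΔCt normalization computes, here is a minimal Python sketch of the generic 2^(-ΔΔCt) calculation. It assumes roughly 100% amplification efficiency, and the reference gene and all Ct values are hypothetical; it is not the analysis script used for the data in this reply.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ΔΔCt) method.

    'treated' could be NT + SAG and 'control' NT - SAG; 'ref' is a
    housekeeping gene (hypothetical here) used to normalize input amounts.
    """
    # ΔCt: target normalized to the reference gene within each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # ΔΔCt: treated condition relative to the control condition.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: Gli1 crosses threshold 5 cycles earlier with SAG,
# while the reference gene is unchanged.
print(fold_change_ddct(24.0, 18.0, 29.0, 18.0))  # 32.0
```

      Because each condition is normalized to its own untreated baseline, batch effects such as a more potent lot of agonist shift absolute values but leave comparisons between conditions within an experiment interpretable.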

      J-K) The data shown in these panels for SAG treatment, as the fraction of Smo-positive cilia and as Smo fluorescence intensity for the same sample, appear to be inconsistent between the two graphs. Under SAG treatment, EBP mutants show higher Smo fluorescence intensity, while Smo-positive cilia seem to be fewer than in the wild-type control cells. If the number of Smo+ cilia (quantified by eye) differs between conditions, shouldn't the quantification of Smo intensity within cilia show a similar difference?

      We thank the reviewer for this careful observation. The apparent discrepancy arises because the two panels quantify different parameters. In panel (j), we counted the percentage of cilia positive for SMO (i.e., cilia in which SMO was detected above background). In contrast, panel (k) reports the fluorescence intensity of SMO, but this measurement was performed only within the SMO-positive cilia identified in panel (j). This distinction has now been explicitly clarified in the figure legend, as also suggested by Reviewer 1.

      Taken together, these two analyses indicate that although fewer cilia display detectable SMO accumulation in the EBP mutant cells, the amount of SMO present within those cilia that do recruit it is comparable to wild-type levels (as reflected by the non-significant difference in fluorescence intensity). This interpretation helps explain the partial functional preservation of Hedgehog signaling in this condition and contrasts with cases such as AY9944 treatment, where both the number of SMO-positive cilia and the SMO intensity are reduced.
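      The distinction between the two readouts can be made concrete with a short Python sketch. The intensity values and the detection threshold are hypothetical and chosen only to show how the fraction of SMO-positive cilia can drop while the mean intensity among positive cilia stays similar.

```python
from statistics import mean


def smo_metrics(intensities, threshold):
    """Return (fraction of SMO-positive cilia, mean intensity of positive cilia).

    intensities: one ciliary SMO fluorescence value per cilium
    threshold:   background cutoff above which a cilium counts as SMO-positive
    """
    positive = [value for value in intensities if value > threshold]
    fraction_positive = len(positive) / len(intensities)
    # The intensity readout is restricted to positive cilia, as in panel (k).
    mean_positive = mean(positive) if positive else float("nan")
    return fraction_positive, mean_positive


# Hypothetical per-cilium intensities (arbitrary units), threshold of 50.
wild_type = [120, 130, 110, 125, 15, 10]
ebp_mutant = [125, 118, 12, 9, 14, 11]

print(smo_metrics(wild_type, 50))   # more positive cilia
print(smo_metrics(ebp_mutant, 50))  # fewer positive cilia, similar intensity
```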

      1. I) The rationale for using SmoM2 in the analysis of cholesterol metabolism-related diseases such as SLOS and CDPX2 is unclear. The SmoM2 variant is primarily associated with cancer rather than cholesterol biosynthesis defects, and its relevance to either of these disorders is not immediately apparent.

      We thank the reviewer for this pertinent observation. We fully agree that SmoM2 was originally identified as an oncogenic mutation and is not directly associated with cholesterol biosynthesis disorders. However, our rationale for using this mutant was mechanistic rather than pathological. SmoM2 is a constitutively active form of SMO that triggers pathway activation independently of upstream components such as PTCH1 or ligand-mediated regulation.

      By using SmoM2, we aimed to determine whether the signaling defects observed under conditions that alter sterol metabolism (e.g., treatment with AY9944 or tamoxifen) occur upstream or downstream of SMO activation. The results demonstrate that, even when SMO is constitutively active, the Hedgehog pathway remains impaired under AY9944 treatment, and to a lesser extent with tamoxifen, indicating that these sterol perturbations disrupt the pathway beyond the level of SMO activation itself. In contrast, cells treated with simvastatin maintain normal pathway responsiveness, reinforcing the specificity of this effect.

      This experiment is therefore central to our study, as it reveals that sterol imbalance can hinder Hedgehog signaling even in the presence of an active SMO, providing new insight into how membrane composition influences downstream signaling competence.

      Minor corrections

      1. Line 385 is a bit confusing: it mentions that cilia were treated with AY9944. Do the authors mean that cells had been treated with the drugs before isolation of cilia, or were the purified cilia actually treated with the drugs?

      Thank you, this has been modified in the revised manuscript.

      The authors should add proper label in Figure 2 panel b for the bars representing the cilia and cell membranes.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Panels in Figure S1 should be re-arranged according to the figure legend and figure reference in line 450.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      The legend for Figure S1b should be corrected, as the data set in the graph shows 7 points while the technical replicates in the legend give 6 experimental values.

      Thank you, this has been modified in the revised manuscript.

      The labels for drug in Figure 3 and 5 should be corrected from tamoxifene to tamoxifen and simvastatine to simvastatin.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      Reviewer #2 (Significance (Required)):

      In the present study, the authors have designed a method to isolate the cilium from the MDCK cells efficiently and then utilized this procedure in conjunction with mass spectrometry to systematically analyze the sterol composition of the ciliary membrane, which they then compare to the sterol composition of the cell body. By analyzing this sterol profile, the authors claim that the cilium has a distinct sterol composition from the cell body, including higher levels of cholesterol and desmosterol but lower levels of 8-DHC and lathosterol. This manuscript further demonstrates that alteration of sterol composition within cilia modulates Hedgehog signaling. These results strengthen the link between dysregulated Hedgehog signaling and defects in cholesterol biosynthesis pathways, as observed in SLOS and CDPX2.

      While the ability to isolate primary cilia from cultured MDCK cells represents an important technical achievement, the central claim of the manuscript - that cilia have a different sterol composition from the cell body - is not adequately supported by the data, and more rigorous comparisons between the ciliary membrane and key organellar membranes (such as plasma membrane) are required to make this claim. Moreover, although the authors have repeatedly mentioned that the ciliary sterol composition is "tightly regulated", there is no evidence provided to support such a claim. At best, the data suggest that the cilium and cell body may differ in sterol composition (though even that remains uncertain), but no underlying regulatory mechanisms are demonstrated. In addition, much of the 2nd half of the paper represents a rehash of experiments with sterol biosynthesis inhibitors that have already been published in the literature, making the conceptual advance modest at best. Lastly, the link between CDPX2 and defective Hedgehog signaling is tenuous.

      We thank the reviewer for this detailed summary and for acknowledging the technical advance represented by our method for isolating primary cilia from MDCK cells. However, we respectfully disagree with several aspects of the reviewer's assessment of our work.

      As we elaborated in our responses to earlier comments, particularly regarding Figure 5, we disagree with the characterization of part of our study as a "rehash", a somewhat derogatory word, of previously published experiments. Our approach differs from earlier studies by relying on specific pharmacological modulation of defined enzymes in the sterol biosynthesis pathway, rather than using non-specific agents such as cyclodextrins, and by linking these manipulations to direct biochemical measurements of ciliary sterol composition. This strategy allows, for the first time, a targeted and physiologically relevant examination of how specific sterol perturbations affect Hedgehog signaling.

      Regarding our statement that ciliary sterol composition is "tightly regulated," we acknowledge that we have not yet explored the underlying molecular mechanisms of this regulation. Nevertheless, the experimental evidence supporting this statement lies in the variation of ciliary sterol composition across multiple treatments that strongly perturb cellular sterols. Despite broad cellular changes, the ciliary sterol profile remains very resilient for some parameters, an observation that, in our view, strongly supports the idea of a selective or regulated process maintaining ciliary sterol identity. This conclusion does not depend on comparison with other membrane compartments.

      We also respectfully disagree that the observed differences between cilia and the cell body (which is not equivalent to the plasma membrane) are "uncertain." The consistent enrichment in cholesterol and desmosterol, combined with the relative depletion in 8-DHC and lathosterol, was detected across independent replicates using robust lipidomic profiling and is statistically supported. These findings are, to our knowledge, the first quantitative demonstration of a sterol fingerprint specific to a mammalian cilium.

      Finally, while we agree that the mechanistic link between CDPX2 and defective Hedgehog signaling warrants further exploration, the data we present, combining pharmacological inhibition (tamoxifen), CRISPR-mediated EBP knockout, and SmoM2 activation assays, all consistently indicate a functional impairment of the Hedgehog pathway under EBP deficiency. This is further reinforced by clinical reports describing Hedgehog-related phenotypes in CDPX2 patients. We therefore believe that our work provides a solid experimental and conceptual basis for connecting EBP dysfunction to Hedgehog signaling defects.

      In summary, our study introduces a validated and reproducible method for mammalian cilia isolation, provides the first detailed sterol composition profile of primary cilia, and establishes a functional link between ciliary sterol imbalance and Hedgehog pathway modulation. We believe these findings represent a meaningful conceptual advance and a valuable resource for the field.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Lamaziere et al. describe an improved protocol for isolating primary cilia from MDCK cells for downstream lipidomics analysis. Using this protocol, they characterize sterol profile of MDCK cilia membrane under standard growth conditions and following pharmacological perturbations that are meant to mimic SLOS and CDPX2 disorders in humans. The authors then assess the impact of the same pharmacological manipulations on Shh pathway activity and validate their findings from these experiments using orthogonal genetic approaches. Major and minor concerns that require attention prior to publication are outlined below.

      We would like to thank the reviewer for their comments.

      Major 1. Since the extent of contamination of the cilia preps with non-cilia membranes is unclear, and variability between replicates is not reported, changes in cilia membrane sterol composition in response to pharmacological manipulations are somewhat difficult to interpret. Discussing reproducibility of cilia sterol composition between replicates (and including corresponding data) could alleviate these concerns to some extent.

      We thank the reviewer for this comment. We would like to clarify that variability between replicates is indeed reported throughout the manuscript. In Figures 2 and 3, all data are presented as mean ± SEM, as indicated in the figure legends. Specifically, the data in Figure 2 are derived from six independent experiments, reflecting the central dataset used for comparative analyses, while the data in Figure 3 are based on three independent experiments.

      We also note that the overall variability between replicates is low, further supporting the reproducibility of our ciliary sterol composition measurements. This consistency across independent biological replicates provides confidence that the differences observed between cilia and the cell body are robust and not due to stochastic contamination or technical variation.
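      For reference, the mean ± SEM summary used in the figures can be computed as in the following Python sketch. The replicate values are hypothetical and serve only to show the calculation (sample standard deviation divided by the square root of the number of independent experiments).

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for independent replicates."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed to estimate SEM")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical sterol measurements from three independent experiments.
average, sem = mean_sem([10.0, 12.0, 14.0])
print(f"{average:.2f} ± {sem:.2f}")
```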

      2. An abundant non-ciliary membrane protein (rather than GAPDH) may be a more appropriate loading control in Fig. 1C.

      This is a valuable comment and we will find a non-ciliary membrane protein to complement this experiment.

      3. Fig. 2b - which bar corresponds to cells and which one to cilia? What do numbers inside bars represent? Please label accordingly.

      We apologize for the oversight. The figures initially submitted with the manuscript inadvertently included some earlier versions, which explains several of the discrepancies noted by the reviewers. This issue has been corrected in the revised submission, and all figures have now been updated to reflect the finalized data.

      4. Fig. 3b-d, right panels - please define what numbers inside bars represent

      Thank you, this was done in the revised manuscript. The numbers report the absolute quantification values.

      5. The font in Figs 2, 3, and 4 is very small and difficult to read. Please make the font and/or panels bigger to improve readability.

      We did our best to enlarge the font despite space limitations, and we are willing to work with editorial staff to improve readability further as suggested.

      6. It would help to have a diagram of the key steps in the cholesterol synthesis pathway for reference early in the paper rather than in figure 3.

      We thank the reviewer for this comment, but we do not see why this would be helpful, as we only use sterol modulators targeting the pathway's enzymes in Fig. 3. We are open to discussing with editorial staff whether to move the diagram up to Fig. 2, if they feel this is needed.

      7. The authors need to discuss why/how global inhibition of enzymes (e.g. via AY9944 treatment) in a cell could cause reduction in cholesterol levels only in the cilium and not in other cell membranes (see also point 1). Yet, tamoxifen treatment lowers cholesterol across the board.

      We thank the reviewer for these insightful comments. Regarding the modest overall effect of simvastatin on cholesterol levels, we would like to note that MDCK cells are an immortalized epithelial cell line with high metabolic plasticity. Such cancer-like cell types are known to exhibit enhanced de novo lipogenesis, particularly under culture conditions with ample glucose availability. This compensatory lipid biosynthesis can partially counterbalance pharmacological inhibition of the cholesterol biosynthetic pathway. Because simvastatin acts upstream in the pathway (at HMG-CoA reductase), its inhibition primarily reduces early intermediates rather than fully depleting end-product cholesterol, explaining the relatively mild changes observed in total cholesterol content. This has been added in a new paragraph in the revised manuscript (lines 371-378).

      8. Fig. 5c, g, and j - statistical analyses are missing and need to be added in support of conclusions drawn in the text of the manuscript.

      Thank you, this has been done in the revised manuscript.

      9. The decrease in the fraction of Smo+ cilia observed in EBP KO cells is mild (panel j, no statistics), and there is possibly a clone-specific effect here as well (statistical analysis is needed to determine if EBP139 is indeed different from WT and whether EBP139 and 141 are different from each other). Similarly, Smo fluorescence intensity after SAG treatment (panel k) is the same in WT and EBP KO cells, while there is a marked difference in intraciliary Smo intensity after tamoxifen treatment. The authors' conclusion "...we were able to show that results with human cells aligned with our tamoxifen experiments" (line 436) should be modified to more accurately reflect the presented data. Ditto conclusions on lines 440-442, 530-531. In fact, it is the lack of Hh phenotypes in CDPX2 patients that is consistent with the EBP KO data presented in the paper.

      We thank the reviewer for this detailed comment. We have now performed the requested statistical analyses and incorporated them into the revised manuscript.

      The new analyses confirm that both EBP139 and EBP141 CRISPR KO clones show a statistically significant reduction in the fraction of Smo⁺ cilia compared to WT cells. They also reveal that the two clones differ significantly from each other, consistent with the expected clonal variability inherent to independently derived CRISPR lines.

      Despite this variability, several lines of evidence support our conclusion that the EBP KO phenotypes align with the effects observed after tamoxifen treatment:

      1- Directionally consistent reduction in Smo⁺ cilia:

      Although the magnitude of the decrease differs between clones, both clones display a significant reduction compared to WT, paralleling the reduction observed in tamoxifen-treated cells. This directional consistency is the key point for comparing pharmacological and genetic perturbations.

      2- Converging evidence from SmoM2 experiments:

      Tamoxifen treatment also reduces pathway output in the context of SmoM2 overexpression. This supports the interpretation that both EBP inhibition (tamoxifen) and EBP loss (CRISPR KO) impair Hedgehog signaling at the level of ciliary function, albeit more mildly than AY9944/SLOS-like perturbations.

      3- Interpretation of Smo intensity (panel k):

      As clarified in the revised text, the fluorescence intensities in panel K correspond only to cilia that are Smo-positive. The absence of a difference in intensity therefore does not contradict the observed reduction in the number of Smo⁺ cilia. Rather, it explains why the phenotype is milder than that observed for SLOS/AY9944: when Smo is able to enter the cilium, its enrichment level is comparable to WT.

      4- Clinical relevance for CDPX2:

      While Hedgehog-related phenotypes in CDPX2 patients may be milder or under-reported, several documented features, such as polydactyly (10% of cases), as well as syndactyly and clubfoot, are classically associated with ciliary/Hedgehog signaling defects. This clinical pattern is consistent with the milder yet detectable defects we observe in EBP KO cells.
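
      The reasoning in point 3 above (a lower fraction of Smo⁺ cilia is compatible with an unchanged intensity among Smo⁺ cilia) can be illustrated with a small sketch. All intensity values and the positivity threshold below are hypothetical and chosen only to make the arithmetic visible; they are not measurements from the manuscript, and the function name `summarize` is an assumption.

```python
from statistics import fmean


def summarize(intensities, threshold):
    """Return (fraction of Smo-positive cilia, mean intensity of positive cilia only)."""
    positive = [x for x in intensities if x > threshold]
    fraction = len(positive) / len(intensities)
    # The mean is conditional on positivity, mirroring a panel that
    # quantifies intensity only in Smo-positive cilia.
    mean_positive = fmean(positive) if positive else float("nan")
    return fraction, mean_positive


THRESHOLD = 1.0  # hypothetical cutoff for calling a cilium Smo-positive

# Hypothetical per-cilium intensities (arbitrary units), NOT manuscript data.
wt = [5.0, 6.0, 5.5, 6.5, 5.5, 6.0, 0.2, 0.1]
ko = [5.0, 6.0, 5.5, 6.5, 0.2, 0.1, 0.3, 0.2]

wt_fraction, wt_mean = summarize(wt, THRESHOLD)
ko_fraction, ko_mean = summarize(ko, THRESHOLD)
print(f"WT: fraction={wt_fraction:.2f}, mean positive intensity={wt_mean:.2f}")
print(f"KO: fraction={ko_fraction:.2f}, mean positive intensity={ko_mean:.2f}")
```

      In this toy case the fraction of positive cilia drops from 0.75 to 0.50 while the conditional mean intensity is identical, so the two readouts answer different questions and need not move together.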

      Minor

      • Line 310: 'intraflagellar' rather than 'intraciliary' transport particle B is a more conventional term

      We agree that intraflagellar is more conventional than intraciliary, but in this case, this is how the GO term is labeled in the database. In our opinion, it should stay as is.

      • Fig. 2c - typos in the color key, is grey meant to be "cells" and blue "cilia"? Individual panels are not referenced in the text

      This panel has been removed following the comments from Reviewers 1 and 3, who found it misleading.

      • Lines 357-358: "Notably, AY9944 treatment led to a greater reduction in cholesterol content as well as a greater increase in 7-DHC and 8-DHC in cilia than in the other cell membranes" - the authors need to support this statement with appropriate statistical analysis

      We respectfully believe there may be a misunderstanding in the reviewer's concern. In all cases, our comparisons are made between treated vs. untreated conditions within each compartment (cell bulk vs. ciliary membrane), and the statistical significance of these differences is already reported as determined by a Mann-Whitney test. In every case, the changes observed are greater in cilia than in the cell body. The statement in the manuscript simply summarizes this quantitative observation. However, if the reviewer feels that an additional statistical test directly comparing the magnitude of the two compartment-specific changes would strengthen the claim, we are willing to include this analysis. Alternatively, if preferred, we can remove the sentence entirely, as the comparison is already clearly visible in Figure 3b.
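
      The additional test offered above, a direct comparison of the magnitude of the two compartment-specific changes, could be sketched as follows. This is one possible approach under stated assumptions, not the analysis used in the manuscript: it assumes that a per-experiment change (treated minus untreated) is available for each compartment, uses hypothetical numbers, and implements an exact permutation version of the Mann-Whitney U test with illustrative function names.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U for x: number of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact p-value: share of group relabelings whose U lies at least as far
    from its null expectation (n * m / 2) as the observed U."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    center = n * m / 2
    observed = abs(u_statistic(x, y) - center)
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        if abs(u_statistic(gx, gy) - center) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical per-experiment changes in cholesterol (treated - untreated),
# in arbitrary units. These are NOT data from the manuscript.
cilia_change = [-30.0, -28.0, -35.0, -32.0]
cells_change = [-10.0, -12.0, -8.0, -11.0]

u = u_statistic(cilia_change, cells_change)
p = exact_two_sided_p(cilia_change, cells_change)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

      Testing the changes against each other addresses the "greater in cilia than in cells" claim directly, which two separate within-compartment tests do not. With small groups the exact p-value is bounded below by the number of possible relabelings.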

      • Line 473 - unclear what is meant by "olfactory cilia are mainly sensory and not primary". Primary cilia are sensory.

      We agree that primary cilia are sensory, but they still differ from the cilia of sensory epithelia, such as retinal photoreceptor cilia or olfactory cilia. Nevertheless, this statement was modified in the revised manuscript.

      • Line 551: 'data not shown'. Please include the data that you would like to discuss or remove discussion of these data from the manuscript.

      The data are not shown because there is nothing to show: as discussed in that sentence, use of the cholesterol probe resulted in the disappearance of primary cilia altogether. We are willing to work with editorial staff to find a better way of expressing this idea.

      Reviewer #3 (Significance (Required)):

      Overall, the manuscript expands our knowledge of cilia membrane composition and reports an interesting link between SLOS and Shh signaling defects, which could at least in part explain SLOS patients' symptoms. The findings reported in the manuscript could be of interest to a broad audience of cell biologists and geneticists.

      We would like to thank the reviewer for recognizing the importance of this work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Lamaziere et al. describe an improved protocol for isolating primary cilia from MDCK cells for downstream lipidomics analysis. Using this protocol, they characterize sterol profile of MDCK cilia membrane under standard growth conditions and following pharmacological perturbations that are meant to mimic SLOS and CDPX2 disorders in humans. The authors then assess the impact of the same pharmacological manipulations on Shh pathway activity and validate their findings from these experiments using orthogonal genetic approaches. Major and minor concerns that require attention prior to publication are outlined below.

      Major

      1. Since the extent of contamination of the cilia preps with non-cilia membranes is unclear, and variability between replicates is not reported, changes in cilia membrane sterol composition in response to pharmacological manipulations are somewhat difficult to interpret. Discussing reproducibility of cilia sterol composition between replicates (and including corresponding data) could alleviate these concerns to some extent.
      2. An abundant non-ciliary membrane protein (rather than GAPDH) may be a more appropriate loading control in Fig. 1C.
      3. Fig. 2b - which bar corresponds to cells and which one to cilia? What do numbers inside bars represent? Please label accordingly.
      4. Fig. 3b-d, right panels - please define what numbers inside bars represent
      5. The font in Figs 2, 3, and 4 is very small and difficult to read. Please make the font and/or panels bigger to improve readability.
      6. It would help to have a diagram of the key steps in the cholesterol synthesis pathway for reference early in the paper rather than in figure 3.
      7. The authors need to discuss why/how global inhibition of enzymes (e.g. via AY9944 treatment) in a cell could cause reduction in cholesterol levels only in the cilium and not in other cell membranes (see also point 1). Yet, tamoxifen treatment lowers cholesterol across the board.
      8. Fig. 5c, g, and j - statistical analyses are missing and need to be added in support of conclusions drawn in the text of the manuscript.
      9. The decrease in the fraction of Smo+ cilia observed in EBP KO cells is mild (panel j, no statistics), and there is possibly a clone-specific effect here as well (statistical analysis is needed to determine if EBP139 is indeed different from WT and whether EBP139 and 141 are different from each other). Similarly, Smo fluorescence intensity after SAG treatment (panel k) is the same in WT and EBP KO cells, while there is a marked difference in intraciliary Smo intensity after tamoxifen treatment. The authors' conclusion "...we were able to show that results with human cells aligned with our tamoxifen experiments" (line 436) should be modified to more accurately reflect the presented data. Ditto conclusions on lines 440-442, 530-531. In fact, it is the lack of Hh phenotypes in CDPX2 patients that is consistent with the EBP KO data presented in the paper.

      Minor

      • Line 310: 'intraflagellar' rather than 'intraciliary' transport particle B is a more conventional term
      • Fig. 2c - typos in the color key, is grey meant to be "cells" and blue "cilia"? Individual panels are not referenced in the text
      • Lines 357-358: "Notably, AY9944 treatment led to a greater reduction in cholesterol content as well as a greater increase in 7-DHC and 8-DHC in cilia than in the other cell membranes" - the authors need to support this statement with appropriate statistical analysis
      • Line 473 - unclear what is meant by "olfactory cilia are mainly sensory and not primary". Primary cilia are sensory.
      • Line 551: 'data not shown'. Please include the data that you would like to discuss or remove discussion of these data from the manuscript.

      Significance

      Overall, the manuscript expands our knowledge of cilia membrane composition and reports an interesting link between SLOS and Shh signaling defects, which could at least in part explain SLOS patients' symptoms. The findings reported in the manuscript could be of interest to a broad audience of cell biologists and geneticists.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Overview

      Accumulating evidence suggests that sterols play critical roles in signal transduction within the primary cilium, perhaps most notably in the Hedgehog cascade. However, the precise sterol composition of the primary cilium, and how it may change under distinct biological conditions, remains unknown, in part because of the lack of reproducible, widely accepted procedures to purify primary cilia from mammalian cultured cells. In the present study, the authors have designed a method to isolate cilia from MDCK cells efficiently and then used this procedure in conjunction with mass spectrometry to systematically analyze the sterol composition of the ciliary membrane, which they then compare to the sterol composition of the cell body. Based on this sterol profiling, the authors claim that the cilium has a distinct sterol composition from the cell body, including higher levels of cholesterol and desmosterol but lower levels of 8-DHC and lathosterol. This manuscript further demonstrates that alteration of sterol composition within cilia modulates Hedgehog signaling. These results strengthen the link between dysregulated Hedgehog signaling and defects in cholesterol biosynthesis pathways, as observed in SLOS and CDPX2.

      While the ability to isolate primary cilia from cultured MDCK cells represents an important technical achievement, the central claim of the manuscript - that cilia have a different sterol composition from the cell body - is not adequately supported by the data, and more rigorous comparisons between the ciliary membrane and key organellar membranes (such as plasma membrane) are required to make this claim. Moreover, although the authors repeatedly mention that the ciliary sterol composition is "tightly regulated", no evidence is provided to support such a claim. At best, the data suggest that the cilium and cell body may differ in sterol composition (though even that remains uncertain), but no underlying regulatory mechanisms are demonstrated. In addition, much of the second half of the paper represents a rehash of experiments with sterol biosynthesis inhibitors that have already been published in the literature, making the conceptual advance modest at best. Lastly, the link between CDPX2 and defective Hedgehog signaling is tenuous.

      Major comments

      Figure 1.

      C) Although the isolation of cilia from MDCK cells using dibucaine treatment seems to be very efficient, the quality control of the fractionation procedure is limited to a single western blot of the purified cilia vs. cell body samples, with no representative data shown from the sucrose gradient fractionation steps. Given that prior studies (including those from the Marshall lab cited in this manuscript) found that 1) sucrose gradient fractionation was essential to obtain relatively pure ciliary fractions, and 2) the ciliary fractions appear to spread over many sucrose concentrations, the authors should have included the fractionation profile from the sucrose gradient used to isolate the primary cilia. This additional information would have further clarified and supported the efficiency of their proposed method.

      D) The authors presented proteomic data for the peptides analyzed from the isolated cilia in the form of GO term analysis; however, they did not provide examples of different proteins enriched within their fractionation procedure, aside from Arl13b shown in the blot. Including a summary table with representative proteins identified in the isolated ciliary fraction, along with the relative abundance or percentage distribution of these proteins, would make the data more informative.

      Figure 2.

      The authors compare the sterol content of cilia with that of the whole cell (as cell membranes). Different organelles vary widely in cholesterol content: for instance, the plasma membrane itself is around 50 mol% cholesterol, while organelles like the ER have barely any cholesterol. Thus, comparing these two samples and claiming a 2.5-fold increase in cholesterol levels is misleading. A more appropriate comparison would be between isolated primary cilia and isolated plasma membranes (procedures to isolate plasma membranes have been described previously, e.g., Naito et al., eLife 2019; Das et al., PNAS 2013). The absence of such controls makes it difficult to fully validate the reported magnitude of sterol enrichment in cilia relative to the cell surface. Also, because dibucaine was used here to isolate MDCK cilia, a control experiment to exclude possible effects of the dibucaine treatment on sterol biosynthesis would be helpful.

      Figure 3.

      Tamoxifen is a potent modulator of nuclear hormone receptor activity and thus can independently influence various cellular processes. As several experiments in the later sections of the manuscript rely on tamoxifen treatment of cells, it is important that the authors include appropriate controls for tamoxifen treatment, to confirm that the observed effects do not stem from effects on nuclear hormone receptor activity. This would ensure that the observed effects can be confidently attributed to the experimental manipulation rather than to the intrinsic effects of tamoxifen.

      Figure 4.

      The authors present the results of spectroscopy studies to analyze generalized polarization (GP) of liposomes in vitro, but only processed data are shown, and the raw spectra are not provided. The authors need to present representative spectra to enable readers to interpret the raw data from the experiments.

      Figure 5.

      B) The experiment shows Gli1 mRNA levels following treatment with inhibitors of cholesterol biosynthesis, but similar findings have already been reported previously (e.g., Cooper et al., Nature Genetics 2003; Blassberg et al., Hum Mol Genet 2016), and the present results do not provide a significant conceptual advance over those earlier studies. In addition, the authors analyze the effect of these inhibitors on SAG stimulation, but the experiment lacks the control for Gli mRNA levels in the absence of SAG treatment. Without this control, it is impossible to know where the baseline in the experiment is and how large the effects in question really are.

      J-K) The data in these panels for SAG treatment, shown as the fraction of Smo-positive cilia and the Smo fluorescence intensity for the same samples, appear to be inconsistent between the two graphs. Under SAG treatment, the EBP mutants show higher Smo fluorescence intensity, while the fraction of Smo-positive cilia seems to be lower than in the wild-type control cells. If the number of Smo+ cilia (quantified by eye) differs between conditions, shouldn't the quantification of Smo intensity within cilia show a similar difference?

      I) The rationale for using SmoM2 in the analysis of cholesterol metabolism-related diseases such as SLOS and CDPX2 is unclear. The SmoM2 variant is primarily associated with cancer rather than cholesterol biosynthesis defects, and its relevance to either of these disorders is not immediately apparent.

      Minor corrections

      1. Line 385 is a bit confusing, as it mentions that cilia were treated with AY9944 - do the authors mean that cells had been treated with the drugs before isolation of cilia, or were the purified cilia actually treated with the drugs?
      2. The authors should add a proper label in Figure 2 panel b for the bars representing the cilia and cell membranes.
      3. Panels in Figure S1 should be re-arranged according to the figure legend and figure reference in line 450.
      4. The legend for Figure S1b should be corrected, as the graph shows 7 data points while the legend reports 6 experimental values for the technical replicates.
      5. The labels for drug in Figure 3 and 5 should be corrected from tamoxifene to tamoxifen and simvastatine to simvastatin.

      Significance

      In the present study, the authors have designed a method to isolate cilia from MDCK cells efficiently and then used this procedure in conjunction with mass spectrometry to systematically analyze the sterol composition of the ciliary membrane, which they then compare to the sterol composition of the cell body. Based on this sterol profiling, the authors claim that the cilium has a distinct sterol composition from the cell body, including higher levels of cholesterol and desmosterol but lower levels of 8-DHC and lathosterol. This manuscript further demonstrates that alteration of sterol composition within cilia modulates Hedgehog signaling. These results strengthen the link between dysregulated Hedgehog signaling and defects in cholesterol biosynthesis pathways, as observed in SLOS and CDPX2.

      While the ability to isolate primary cilia from cultured MDCK cells represents an important technical achievement, the central claim of the manuscript - that cilia have a different sterol composition from the cell body - is not adequately supported by the data, and more rigorous comparisons between the ciliary membrane and key organellar membranes (such as plasma membrane) are required to make this claim. Moreover, although the authors repeatedly mention that the ciliary sterol composition is "tightly regulated", no evidence is provided to support such a claim. At best, the data suggest that the cilium and cell body may differ in sterol composition (though even that remains uncertain), but no underlying regulatory mechanisms are demonstrated. In addition, much of the second half of the paper represents a rehash of experiments with sterol biosynthesis inhibitors that have already been published in the literature, making the conceptual advance modest at best. Lastly, the link between CDPX2 and defective Hedgehog signaling is tenuous.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Review report for 'Sterols regulate ciliary membrane dynamics and hedgehog signaling in health and disease', Lamazière et al.

      In this manuscript, Lamazière et al. address an important understudied aspect of primary cilium biology, namely the sterol composition of the ciliary membrane. It is known that sterols play an especially important role in signal transduction between PTCH1 and SMO, two upstream components of the Hedgehog pathway, at the primary cilium. Moreover, several syndromes linked to cholesterol biosynthesis defects present clinical phenotypes indicative of altered Hh signal transduction. To understand the link between ciliary membrane sterol composition and Hh signal transduction in health and disease, the authors developed a method to isolate primary cilia from MDCK cells and coupled this to quantitative metabolomics. The results were validated using biophysical methods and cellular Hh signaling assays. While this is an interesting study, it is not clear from the presented data how general the findings are: can cilia be isolated from different mammalian cell types using this protocol? Is the sterol composition of MDCK cells expected to be the same in fibroblasts or other cell types? Without this information, it is difficult to judge whether the conclusions reached in fibroblasts are indeed directly related to the sterol composition detected in MDCK cells. Below is a detailed breakdown of suggested textual changes and experimental validations to strengthen the conclusions of the manuscript.

      Major comments:

      • It appears that the comparison has been made between ciliary membranes and the rest of the cell's membranes, which includes many other membranes besides the plasma membrane. This significantly weakens the conclusions on the sterol content specific to the cilium, as it may in fact be highly similar to the rest of the plasma membrane. It is for example known that lathosterol is biosynthesized in the ER, and therefore the non-presence in the cilium may reflect a high abundance in the ER but not necessarily in the plasma membrane.
      • While the protocol to isolate primary cilium from MDCK cells is a valuable addition to the methods available, it would be good to at least include a discussion on its general applicability. Have the authors tried to use this protocol on fibroblasts for example?
      • Some of the conclusions in the introduction (lines 75-80) seem to be incorrectly phrased based on the data: in basal conditions, ciliary membranes are already enriched in cholesterol and desmosterol, and the treatment lowers this in all membranes.
      • There seems to be little effect of simvastatin on overall cholesterol levels. Can the authors comment on this result? How would the membrane fluidity be altered when mimicking simvastatin-induced composition? Since the effect on Hh signaling appears to be the biggest (Figure 5B) under simvastatin treatment, it would be interesting to compare this against that found for AY9944 treatment. Also, the authors conclude that the effects of simvastatin treatment on ciliary membrane sterol composition are the mildest, however, one could argue that they are the strongest as there is a complete lack of desmosterol.
      • It is not clear to me why the authors have chosen to use SAG to activate the Hh pathway, as this is a downstream mode of activation and bypasses PTCH1 (and therefore a potentially sterol-mediated interaction between the two proteins). It would be very informative to compare the effect of sterol modulation on the ability of ShhN vs SAG to activate the pathway.
      • The conclusions about the effect of tamoxifen on SMO trafficking in MEFs should be validated in human patient cells before being able to conclude that there is a potential off-target effect (line 438). Also, if that is the case, the experiment of tamoxifen treatment of EBP KO cells should give an additional effect on SMO trafficking. Also, could the CDPX2 phenotypes in patients be the result of different cell types being affected than the fibroblast used in this study?
      • For the experiments with the SMO-M2 mutant, it would be useful to show the extent of pathway activation by the mutant compared to SAG or ShhN treatment of non-transfected cells. Moreover, it will be necessary to exclude any direct effects of the compound treatment on the ability of this mutant to traffic to the primary cilium, which can easily be done using fluorescence microscopy as the mutant is tagged with mCherry.

      Minor comments:

      Line 74: 'in patients', should be rephrased to 'patient-derived cells'

      Figure 2A: What do the '+/-' indicate? They seem to be erroneously placed.

      Figure 2B: no label present for which bar represents cilia/other membranes

      Figure 2C: this representation is slightly deceptive, since the difference between cells and cilia for lanosterol is not significant, as shown in figure 2A.

      Figure 3A: it would be useful to also show where 8-DHC is in the biosynthetic pathway.

      Line 373: the title should be rephrased, as it implies that DHCR7 was blocked in model membranes, which is not the case.

      Lines 377-384: this paragraph seems to be a mix of methods and some explanation, but should be rephrased for clarity.

      Line 403: 'which could explain the resulting defects in Hedgehog signaling': how and what defects? At this point in the study no defects in Hh signaling have been shown.

      Figure 4D: 'd' is missing

      Line 408: SAG treatment resulted in slightly shorter cilia: this is not the case for just SAG treated cilia, but only for the combination of SAG + AY9944. However, in that condition there appears to be a subpopulation of very short cilia, are those real?

      Figure 5b: it would be good to add that all conditions contained SAG.

      Figure 5D: Since it is shown in Fig 5C that there are no positive cilia -SAG, there is no point to have empty graphs in Fig 5D on the left side, nor can any statistics be done. Similarly for 5K.

      Figure 5E: it is not clearly indicated what is visualized in the inserts, sometimes it's a box, sometimes a line and they seem randomly integrated into the images.

      Figure 5H: is this the intensity in just SMO positive cilia? If yes, this should be indicated, and the line at '0' for WT-SAG should be removed. I am also surprised that the WT vs SLO comparison is then found to be ns, since in WT there are no positive cilia, but in SLO there are a few, so it appears to be more of a black-white situation. Perhaps it would be useful to split the data from different experiments to see if it is consistently the case that there is a low percentage of SMO positive cilia in SLO cells.

      Fig S1: panels are inverted compared to their order of mention in the text.

      Methods-pharmacological treatments: there appear to be large differences in concentrations chosen to treat MDCK versus MEF cells - can the authors comment on these choices and show that the enzymes are indeed inhibited at the indicated concentrations?

      (optional): it would be interesting to include a gamma-tubulin staining on the cilium prep to see if there is indeed a presence of the basal body as suggested by the proteomics data.

      There are many spelling mistakes and inconsistencies throughout the manuscript and its figures (a mix of French and English, for example), so careful proofreading would be warranted. Moreover, there are many mentions of 'Hedgehog defects' or 'Hedgehog-linked', where in fact it is a defect in or link to the Hedgehog pathway, not the protein itself. This should be corrected.

      Significance

      The study of ciliary membrane composition is highly relevant to understand signal transduction in health and disease. As such, the topic of this manuscript is significant and timely. However, as indicated above, there are limitations to this study, most notably the comparison of ciliary membrane versus all cellular membranes (rather than the plasma membrane), which weakens the conclusions that can be drawn. Moreover, cell-type dependency should be more thoroughly addressed. There certainly is a methodological advance in the form of cilia isolation from MDCK cells, however, it is unclear how broadly applicable this is to other mammalian cell types.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript investigates the role of DOT1L and its H3K79 methyltransferase activity in dendritic cell (DC) differentiation. The authors employ a combination of in vitro FLT3L/SCF bone marrow culture systems, in vivo inducible knockout models, and genome-wide H3K79me2 ChIP-seq and RNA-seq analyses to demonstrate that DOT1L influences the balance between pDC and cDC2 differentiation, while leaving cDC1 development largely unaffected. The study further identifies transcriptional and epigenetic programs associated with these changes, linking DOT1L deficiency to altered antigen presentation pathways and loss of pDC-associated transcription factors. The paper provides valuable insights into DC biology. However, some of the key conclusions rely heavily on in vitro systems and short-term tamoxifen deletion models, which limit the interpretation of the in vivo data. Strengthening or clearly defining these limitations would substantially improve the paper's impact and clarity.

      Major Comments

      1. To strengthen the paper, the authors could follow one of two alternative strategies:

      (1) Validate their in vitro observations through in vivo experiments, or

      (2) Focus on deepening and refining their in vitro findings, moving the limited in vivo data to the supplementary material and explicitly acknowledging the limitations of the tamoxifen-inducible system.

      Strategy 1 - Strengthen in vivo validation

      -   The experiments presented in Figures 3 and 5 could be repeated in a competitive bone marrow chimera setting (e.g. CD45.1/CD45.2 irradiated hosts reconstituted with a 1:1 mix of WT CD45.1⁺ and Dot1l-KO CD45.2⁺ cells).
      -   This design would allow dissection of direct (cell-intrinsic) versus indirect effects of DOT1L deficiency and could mitigate confounding effects of incomplete or asynchronous deletion.
      -   After reconstitution, mice could be maintained on tamoxifen-supplemented chow for a longer period to ensure efficient recombination and adequate time for observing phenotypic consequences.
      -   Flow cytometric analysis of spleen and bone marrow should use more refined panels to explore DC precursor and subset deficiencies. Suggested reference panels: Rodrigues et al., Immunity 2024; Minutti et al., Nat. Immunol. 2024; Zhu et al., Nat. Immunol. 2015.
      

      Strategy 2 - Refine in vitro system and reposition in vivo data

      -   The authors could replicate their differentiation assays under conditions that emulate the chimera approach by co-culturing WT (CD45.1⁺) and Dot1l-KO (CD45.2⁺) bone marrow cells.
      -   This would reveal potential competition or cross-talk between WT and mutant cells and provide clearer mechanistic insight into cell-intrinsic versus extrinsic effects.
      -   The authors should examine how tamoxifen itself affects differentiation and measure the kinetics of deletion and H3K79me loss to better contextualize the dynamic response.
      -   It would also be valuable to assess which cDC2 subtypes (A vs. B) are preferentially affected by Dot1l deficiency, again using more sophisticated flow cytometry panels (see references above).

      If this in vitro-focused strategy is adopted, the in vivo data could be moved to the supplementary material, with explicit acknowledgment that the inducible deletion model and the gradual nature of H3K79me dilution limit the interpretation of the in vivo findings.

      2. In Figures 2 and 3, the efficiency of H3K79me2 depletion following Dot1l excision should be assessed directly. Although DOT1L is the sole H3K79 methyltransferase, the dilution kinetics of H3K79me2 can vary depending on the proliferation rate. Quantifying the H3K79me2 signal in bone marrow-derived cell culture samples would clarify whether the deletion window allowed complete loss of the methylation mark.

      3. Several observations are not discussed in sufficient depth:

      -   The finding that Dot1l deletion increases antigen-presentation signatures might reflect stress or activation rather than lineage fate change.
      -   The authors could also acknowledge that DOT1L's effect might be indirect, acting through cytokine feedback loops or altered progenitor proliferation, especially given the co-expression of Kit, Flt3, and Irf8 in early DC progenitors.
      -   Moreover, because H3K79 methylation is primarily associated with transcriptional elongation rather than initiation, the observed transcriptional changes could result from broader alterations in chromatin accessibility or polymerase processivity, rather than direct promoter regulation. Discussing this mechanistic aspect would help clarify whether DOT1L's role in DC differentiation reflects a direct control of lineage-defining gene expression or a secondary consequence of disrupted transcriptional elongation dynamics.
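
The dilution argument in comment 2 can be made explicit with a back-of-the-envelope model. This is only a sketch under simplifying assumptions that are not taken from the manuscript: DOT1L activity is lost instantly and completely at deletion, the mark is not actively removed, and pre-existing methylated histones are split evenly between daughter cells at each division. The symbols f, n, t, T_d and ε are introduced here purely for illustration.

```latex
% f(n): fraction of the pre-deletion H3K79me2 signal left after n divisions
% T_d:  population doubling time;  t: time elapsed since deletion
f(n) = 2^{-n}, \qquad f(t) = 2^{-t/T_d}

% Number of divisions needed to fall below a residual fraction \varepsilon:
n \ge \log_2\!\left(\frac{1}{\varepsilon}\right)
\quad\Rightarrow\quad
\varepsilon = 0.1 \ \text{requires}\ n \ge \log_2 10 \approx 3.3,\ \text{i.e. 4 divisions.}
```

Under these assumptions, a slowly proliferating population could retain a substantial fraction of the mark within a short deletion window, which is why a direct measurement of H3K79me2 would be informative.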

      Minor Comments

      1. Terminology: The manuscript repeatedly refers to "mature" DCs; please clarify whether this means activated or fully differentiated cells.
      2. Ontogeny statements: The assertion that DCs of lymphoid origin are well established should be softened; the lymphoid contribution to some DC lineages remains under discussion.
      3. Transitional DCs (tDCs): The equivalence between tDCs and pre-cDC2As remains controversial. This should be acknowledged.
      4. Cytokine supplementation: The inclusion of SCF in the FLT3L-based differentiation assays should be justified, as it is not a standard procedure.
      5. Macrophage contamination: The presence of C1qa, C1qb, and C1qc transcripts in some datasets suggests possible macrophage contamination. Please discuss how this was controlled for or how it might affect interpretation.

      Significance

      This study provides important insights into the epigenetic regulation of DC differentiation by DOT1L. The conclusions would be more compelling if supported by in vivo validation or, alternatively, if the limitations of the current in vivo data were transparently acknowledged and the focus shifted toward mechanistic in vitro depth.

      With these revisions, the manuscript would represent a valuable contribution to understanding how chromatin modification integrates with transcriptional control in shaping dendritic cell fate.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Bouma et al. present a comprehensive analysis of DOT1L-mediated histone H3K79 methylation across canonical DC subsets. By mapping the methylation landscape, the authors demonstrate that DOT1L regulates both shared and subset-specific gene programs. They show that in vitro or in vivo deletion of Dot1l, followed by in vitro differentiation, results in reduced myeloid progenitors and pDCs alongside an increase in cDC2s, while cDC1 numbers remain largely unaffected. Functionally, Dot1l-deficient DCs fail to produce IFNα upon stimulation. Transcriptomic profiling reveals enrichment of antigen presentation pathways in Dot1l-KO subsets, with upregulated MHC class II surface expression in pDCs. Mechanistically, pharmacological inhibition of DOT1L links these effects to its methyltransferase activity. Collectively, the data suggest that DOT1L differentially regulates canonical DC subset development and represses antigen presentation pathways.

      The manuscript is well-written and technically sound. However, several conclusions would benefit from deeper discussion or additional experimental validation.

      Major Comments

      1. Interpretation of DC balance changes and cell-cycle effects

      The authors propose that DOT1L loss skews DC differentiation toward a pDC-like phenotype. However, DOT1L deletion or inhibition, and the consequent global loss of H3K79 methylation, is well known to downregulate key cell-cycle genes (e.g., Cyclin D1, Cyclin E, CDK4/6, MCM family) while upregulating cell-cycle inhibitors (e.g., Cdkn1a and b). These transcriptional changes are associated with slower proliferation, G1 arrest or delayed S-phase entry, and reduced DNA replication fork progression. Importantly, blocking DNA synthesis (e.g., with aphidicolin or mitomycin C) during early culture inhibits DC emergence, underscoring that proliferation is essential for differentiation. The authors should discuss how their findings align with this established literature. Could the observed DC subset shifts result from impaired cell-cycle progression rather than lineage-specific transcriptional reprogramming? A more detailed consideration of this point is needed.

      2. Discrepancy between in vitro and in vivo pDC phenotypes

      The in vitro data show a marked reduction in pDCs, yet in vivo pDC numbers appear unchanged. Although the discussion briefly mentions proliferation differences, this discrepancy deserves a clearer explanation or experimental follow-up.

      Minor Comments

      • Clarify statistical methods, specify biological replicate numbers, and indicate whether corrections for multiple comparisons were applied to transcriptomic analyses.
      • The introduction is somewhat lengthy and repetitive; condensing it would improve focus.
      • In the discussion, the distinction between findings and speculation is sometimes unclear.
      • Ensure consistent gene name formatting throughout (e.g., Dot1l, Dot1L).

      Significance

      The current manuscript fills a gap in knowledge, and this is its major strength. Other strengths are clarity and technical appropriateness.

      The major weakness is that the work is mainly descriptive. Mechanistic insights into DOT1L-dependent transcriptional regulation are still weak. The proposed mechanism, that DOT1L maintains pDC identity through H3K79 methylation at key transcription factors (Tcf4, SpiB, Irf8), is intriguing but currently lacks functional evidence. The authors should consider validating this model experimentally, by modulating the expression of these genes without affecting DOT1L activity. Also, the model suggesting that DOT1L indirectly represses antigen presentation via the Fbxo11-Ciita pathway is interesting but remains speculative. Additional mechanistic data would help support this claim.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this study, Bouma et al. investigate the epigenetic mechanisms involved in dendritic cell (DC) development, focusing on the role of the lysine methyltransferase DOT1L, which mediates histone H3 lysine 79 (H3K79) methylation. The authors first show that Dot1l is expressed across most DC subsets and their progenitors. Consistently, DOT1L activity was detected in these subsets, as ChIP-seq analysis revealed an enrichment of H3K79 methylation marks around the transcription start sites of numerous genes that regulate DC fate. These marks were associated with active transcription, as confirmed by RNA sequencing. To assess the functional role of Dot1l in DC development, the authors used Rosa26Cre-ERT2 × Dot1l^flox/flox mice. Bone marrow (BM) cells from these mice were treated in vitro with tamoxifen and cultured with FLT3L and SCF to induce DC differentiation. Dot1l deletion impaired the development of plasmacytoid DCs (pDCs) and enhanced the generation of conventional DC2 (cDC2), while leaving cDC1 development unaffected. Similarly, in vivo tamoxifen treatment of Rosa26Cre-ERT2 × Dot1l^flox/flox mice for three days led to a comparable impairment of DC development upon in vitro culture of BM cells. Beyond mature DCs, Dot1l deletion also disrupted the ability of BM cells to generate common myeloid progenitors (CMPs), monocyte-dendritic cell progenitors (MDPs), and common DC progenitors (CDPs). These effects were attributed to the methyltransferase activity of DOT1L, as pharmacological inhibition of DOT1L produced similar outcomes. Interestingly, while in vivo tamoxifen treatment altered the frequencies of progenitor populations (MDP, CDP, CMP) in the BM, it did not significantly change the frequency of pDCs in the BM or spleen. Moreover, an increase in the cDC2 population was observed only in the BM, with no effect detected in the spleen. 
      With these findings, the authors claim that epigenetic regulation of gene expression by DOT1L is important for proper dendritic cell development.

      Major comments.

      While this study demonstrates that DOT1L regulates DC development in vitro, its inducible deletion in vivo using tamoxifen does not appear to significantly affect the overall distribution or function of DCs. Therefore, further investigation is needed to clarify the role of DOT1L in regulating DC fate under physiological conditions. The authors analyzed DC populations at only two time points (3 and 12 days) following tamoxifen-induced Dot1l deletion. As noted in the discussion, these time points are relatively early considering the lifespan of DCs, which often extends beyond this period. It would thus be important to assess the effects of Dot1l deletion over a longer duration (e.g., at least one month) to fully evaluate its impact on DC development. In addition to the BM, an extensive analysis of DC populations should be carried out in the spleen as well as lymph nodes. Given the broad activity of the Rosa26-Cre system, prolonged deletion may affect overall mouse health and/or the function of other cell types that contribute to DC development; therefore, using a DC-specific Cre driver (e.g., CD11c-Cre) would provide a more targeted approach. Alternatively, competitive BM chimera experiments could be performed by reconstituting irradiated control mice with a 1:1 mixture of BM cells from Rosa26Cre-ERT2 × Dot1l^flox/flox and Rosa26Cre-ERT2 × Dot1l^wt/flox mice, both pre-treated with tamoxifen in vitro. Such experiments would offer more definitive evidence for the role of DOT1L in DC development in vivo.

      Aside from this point, the data and methods are clearly presented, and the figures are largely self-explanatory. All experiments were adequately replicated three times. Statistical analyses were primarily performed using t-tests, and ANOVA with multiple comparisons when appropriate. Since these are parametric tests that assume a normal distribution, it would be important to confirm whether the analyzed samples meet this assumption. If not, non-parametric tests should be used instead.
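
To illustrate what a non-parametric alternative looks like at the replicate numbers mentioned above, the following is a minimal sketch of an exact two-sided Mann-Whitney U test for small, untied samples. The function name `mann_whitney_exact` and the measurement values are hypothetical and are not taken from the manuscript.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test for small samples without ties.

    Returns (U statistic for x, exact two-sided p-value).
    """
    pooled = sorted(x + y)
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties present; this sketch handles untied data only")

    # Rank every observation in the pooled sample (1 = smallest).
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n1, n2 = len(x), len(y)

    def u_stat(ranks_of_x):
        return sum(ranks_of_x) - n1 * (n1 + 1) / 2

    u_obs = u_stat([rank[v] for v in x])
    mean_u = n1 * n2 / 2

    # Under the null hypothesis every choice of n1 ranks for group x is equally likely.
    all_u = [u_stat(c) for c in combinations(range(1, n1 + n2 + 1), n1)]
    extreme = sum(1 for u in all_u if abs(u - mean_u) >= abs(u_obs - mean_u))
    return u_obs, extreme / len(all_u)


# Hypothetical measurements: three replicates per group, fully separated.
control = [1.2, 1.5, 1.9]
knockout = [2.4, 2.8, 3.1]
u, p = mann_whitney_exact(control, knockout)
print(u, p)
```

With three replicates per group there are only 20 possible rank assignments, so even complete separation of the groups gives a two-sided p-value of 2/20 = 0.1. The choice of test therefore interacts with the number of replicates.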

      Minor comments.

      It would be informative to show how specific Dot1l expression is in DCs and their progenitors compared with other immune lineages (e.g., lymphocytes) and their precursors. The data suggest that DOT1L regulates H3K79 methylation of both shared and subset-specific genes among DC populations. The authors could elaborate on how this regulation achieves cell-type specificity-perhaps through differential Dot1l expression levels across DC subsets.

      Interestingly, Dot1l deletion both in vitro and in vivo markedly reduces the frequency of common DC progenitors (CDPs), which give rise to cDC1 and cDC2. The authors should discuss how such a substantial loss of progenitors does not proportionally affect downstream cDC populations. Although in vivo tamoxifen-induced deletion of Dot1l in Rosa26Cre-ERT2 × Dot1l^flox/flox mice does not significantly alter the overall distribution of DC subsets (pDCs and cDCs), it appears to modify their phenotype. It would therefore be valuable to examine how Dot1l loss impacts the functional properties of individual DC subsets. While pDC responsiveness to CpG stimulation seems preserved in the absence of Dot1l, assessing how cDCs respond to TLR3 and TLR4 stimulation and their capacity to activate T cells would provide important additional insights.

      Significance

      General assessment: Bouma et al. present compelling evidence that DOT1L is an important regulator of DC differentiation in vitro from bone marrow-derived cells. They further demonstrate that DOT1L regulates DC development through its lysine methyltransferase activity, mediating histone H3K79 methylation. While these in vitro findings are robust and well supported, the physiological relevance of DOT1L function in vivo remains less clearly established. Additional experiments would help to strengthen the conclusions regarding its role under physiological conditions.

      Advance: While numerous transcription factors have been described as key regulators of DC subset development and fate, the role of epigenetic regulation in this process remains relatively understudied and poorly understood. This study addresses this important gap in the literature and provides novel insights into the role of H3K79 methylation mediated by DOT1L in controlling DC development.

      Audience: This paper will be of interest to a specialized audience in the field of the regulation of dendritic cell ontogeny. This work could prompt additional research into the epigenetic regulation of DC development.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank all the reviewers for their valuable comments and criticisms. We have thoroughly revised the manuscript and the resource to address all the points raised by the reviewers. Below, we provide a point-by-point response for the sake of clarity.

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary: This manuscript, "MAVISp: A Modular Structure-Based Framework for Protein Variant Effects," presents a significant new resource for the scientific community, particularly in the interpretation and characterization of genomic variants. The authors have developed a comprehensive and modular computational framework that integrates various structural and biophysical analyses, alongside existing pathogenicity predictors, to provide crucial mechanistic insights into how variants affect protein structure and function. Importantly, MAVISp is open-source and designed to be extensible, facilitating reuse and adaptation by the broader community.

      Major comments:

      - While the manuscript is formally well-structured (with clear Introduction, Results, Conclusions, and Methods sections), I found it challenging to follow in some parts. In particular, the Introduction is relatively short and lacks a deeper discussion of the state-of-the-art in protein variant effect prediction. Several methods are cited but not sufficiently described, as if prior knowledge were assumed. OPTIONAL: Extend the Introduction to better contextualize existing approaches (e.g., AlphaMissense, EVE, ESM-based predictors) and clarify what MAVISp adds compared to each.

      We have expanded the introduction on the state-of-the-art of protein variant effects predictors, explaining how MAVISp departs from them.

      - The workflow is summarized in Figure 1(b), which is visually informative. However, the narrative description of the pipeline is somewhat fragmented. It would be helpful to describe in more detail the available modules in MAVISp, and which of them are used in the examples provided. Since different use cases highlight different aspects of the pipeline, it would be useful to emphasize what is done step-by-step in each.

      We have added a concise, narrative description of the data flow for MAVISp, as well as improved the description of modules in the main text. We will integrate the results section with a more comprehensive description of the available modules, and then clarify in the case studies which modules were applied to achieve specific results.

      OPTIONAL: Consider adding a table or a supplementary figure mapping each use case to the corresponding pipeline steps and modules used.

      We have added a supplementary table (Table S2) to guide the reader on the modules and workflows applied for each case study.

      We also added Table S1 to map the toolkit used by MAVISp to collect the data that are imported and aggregated in the webserver for further guidance.

      - The text contains numerous acronyms, some of which are not defined upon first use or are only mentioned in passing. This affects readability. OPTIONAL: Define acronyms upon first appearance, and consider moving less critical technical details (e.g., database names or data formats) to the Methods or Supplementary Information. This would greatly enhance readability.

      We revised the usage of acronyms following the reviewer’s direction to define them at first appearance.

      • The code and trained models are publicly available, which is excellent. The modular design and use of widely adopted frameworks (PyTorch and PyTorch Geometric) are also strong points. However, the Methods section could benefit from additional detail regarding feature extraction and preprocessing steps, especially the structural features derived from AlphaFold2 models. OPTIONAL: Include a schematic or a table summarizing all feature types, their dimensionality, and how they are computed.

      We thank the reviewer for noticing and praising the availability of the tools of MAVISp. Our MAVISp framework utilizes methods and scores that incorporate machine learning features (such as EVE or RaSP), but does not employ machine learning itself. Specifically, we do not use PyTorch and do not utilize features in a machine learning sense. We do extract some information from the AlphaFold2 models that we use (such as the pLDDT score and their secondary structure content, as calculated by DSSP), and those are available in the MAVISp aggregated csv files for each protein entry and detailed in the Documentation section of the MAVISp website.

      • The section on transcription factors is relatively underdeveloped compared to other use cases and lacks sufficient depth or demonstration of its practical utility. OPTIONAL: Consider either expanding this section with additional validation or removing/postponing it to a future manuscript, as it currently seems preliminary.

      We have removed this section and included a mention in the conclusions as part of the future directions.

      Minor comments:

      - Most relevant recent works are cited, including EVE, ESM-1v, and AlphaFold-based predictors. However, recent methods like AlphaMissense (Cheng et al., 2023) could be discussed more thoroughly in the comparison.

      We have revised the introduction to accommodate the proper space for this comparison.

      • Figures are generally clear, though some (e.g., performance barplots) are quite dense. Consider enlarging font sizes and annotating key results directly on the plots.

      We have revised Figure 2 so that it presents only one case study, which makes it easier to read. We have also changed Figure 3, but retained the other figures since they seemed less problematic.

      • Minor typographic errors are present. A careful proofreading is highly recommended. Below are some of the issues I identified:

      -   Page 3, line 46: "MAVISp perform" -> "MAVISp performs"
      -   Page 3, line 56: "automatically as embedded" -> "automatically embedded"
      -   Page 3, line 57: "along with to enhance" -> unclear; please revise
      -   Page 4, line 96: "web app interfaces with the database and present" -> "presents"
      -   Page 6, line 210: "to investigate wheatear" -> "whether"
      -   Page 6, lines 215-216: "We have in queue for processing with MAVISp proteins from datasets relevant to the benchmark of the PTM module." -> unclear sentence; please clarify
      -   Page 15, line 446: "Both the approaches" -> "Both approaches"
      -   Page 20, line 704: "advantage of multi-core system" -> "multi-core systems"

      We have proofread the entire article, including the points above.

      Significance

      General assessment: the strongest aspects of the study are the modularity, open-source implementation, and the integration of structural information through graph neural networks. MAVISp appears to be one of the few publicly available frameworks that can easily incorporate AlphaFold2-based features in a flexible way, lowering the barrier for developing custom predictors. Its reproducibility and transparency make it a valuable resource. However, while the technical foundation is solid and the effort substantial, the scientific narrative and presentation could be significantly improved. The manuscript is dense and hard to follow in places, with a heavy use of acronyms and insufficient explanation of key design choices. Improving the descriptive clarity, especially in the early sections, would greatly enhance the impact of this work.

      Advance

      to the best of my knowledge, this is one of the first modular platforms for protein variant effect prediction that integrates structural data from AlphaFold2 with bioinformatic annotations and even clinical data in an extensible fashion. While similar efforts exist (e.g., ESMfold, AlphaMissense), MAVISp distinguishes itself through openness and design for reusability. The novelty is primarily technical and practical rather than conceptual.

      Audience

      this study will be of strong interest to researchers in computational biology, structural bioinformatics, and genomics, particularly those developing variant effect predictors or analyzing the impact of mutations in clinical or functional genomics contexts. The audience is primarily specialized, but the open-source nature of the tool may diffuse its use among more applied or translational users, including those working in precision medicine or protein engineering.

      Reviewer expertise: my expertise is in computational structural biology, molecular modeling, and (rather weak) machine learning applications in bioinformatics. I am familiar with graph-based representations of proteins, AlphaFold2, and variant effects based on Molecular Dynamics simulations. I do not have any direct expertise in clinical variant annotation pipelines.

      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: The authors present a pipeline and platform, MAVISp, for aggregating, displaying and analysis of variant effects with a focus on reclassification of variants of uncertain clinical significance and uncovering the molecular mechanisms underlying the mutations.

      Major comments:

      - On testing the platform, I was unable to look up a specific variant in ADCK1 (rs200211943, R115Q). I found that despite stating that the mapped RefSeq ID was NP_001136017 in the HGVSp column, it was actually mapped to the canonical UniProt sequence (Q86TW2-1). NP_001136017 actually maps to Q86TW2-3, which is missing residues 74-148 compared to the -1 isoform. The UniProt canonical sequence has no exact RefSeq mapping, so the HGVSp column is incorrect in this instance. This mapping issue may also affect other proteins and result in incorrect HGVSp identifiers for variants.

      We would like to thank the reviewer for pointing out these inconsistencies. We have revised all the entries and corrected them. If needed, the history of the cases that have been corrected can be found in the closed issues of the GitHub repository that we use for communication between biocurators and data managers (https://github.com/ELELAB/mavisp_data_collection). We have also revised the protocol we follow in this regard and the MAVISp toolkit to include better support for isoform matching in our pipelines for future entries, as well as for the revision/monitoring of existing ones, as detailed in the Method Section. In particular, we introduced a tool, uniprot2refseq, which aids the biocurator in identifying the correct match in terms of sequence length and sequence identity between RefSeq and UniProt. More details are included in the Method Section of the paper. The two relevant scripts for this step are available at: https://github.com/ELELAB/mavisp_accessory_tools/
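
The matching criterion described in this reply (sequence length and sequence identity between RefSeq and UniProt) can be sketched as follows. This is an illustrative toy, not the actual `uniprot2refseq` implementation: the function names, the toy sequences, and the identifiers `NP_TOY_SHORT` and `NP_TOY_FULL` are all hypothetical, and the sketch only accepts candidates of identical length instead of performing a real alignment.

```python
def percent_identity(a, b):
    """Percent identity for two equal-length sequences; 0.0 if lengths differ."""
    if len(a) != len(b) or not a:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_refseq_match(uniprot_seq, refseq_candidates):
    """Return (refseq_id, identity) for the best same-length candidate, or None.

    refseq_candidates maps a RefSeq protein ID to its amino acid sequence.
    """
    scored = [
        (percent_identity(uniprot_seq, seq), refseq_id)
        for refseq_id, seq in refseq_candidates.items()
    ]
    scored = [item for item in scored if item[0] > 0.0]
    if not scored:
        return None  # no candidate has the same length: flag for manual curation
    identity, refseq_id = max(scored)
    return refseq_id, identity


# Toy example: the shorter isoform lacks an internal segment of the canonical one.
canonical = "MKTAYIAKQR"
candidates = {
    "NP_TOY_SHORT": "MKTAQR",      # internal residues missing, as in a shorter isoform
    "NP_TOY_FULL": "MKTAYIAKQR",   # identical to the canonical sequence
}
print(best_refseq_match(canonical, candidates))
```

Returning `None` when no candidate has the same length reproduces the situation the reviewer describes, where a canonical UniProt sequence has no exact RefSeq counterpart and residue numbering cannot be transferred directly.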

      - The paper lacks a section on how to properly interpret the results of the MAVISp platform (the case-studies are helpful, but don't lay down any global rules for interpreting the results). For example: How should a variant with conflicts between the variant impact predictors be interpreted? Are specific indicators considered more 'reliable' than others?

      We have added a section in Results to clarify how to interpret results from MAVISp in the most common use cases.

      • In the Methods section, GEMME is stated as being rank-normalised with 0.5 as a threshold for damaging variants. On checking the data downloaded from the site, GEMME was not rank-normalised but rather min-max normalised. Furthermore, Supplementary text S4 conflicts with the methods section over how GEMME scores are classified, S4 states that a raw-value threshold of -3 is used.

      We thank the reviewer for spotting this inconsistency. This part of the main text was left over from an earlier, preliminary version of the preprint, and we have now revised it. Supplementary Text S4 includes the correct reference for the value in light of the benchmarking therein.
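
The distinction the reviewer draws between rank normalisation and min-max normalisation matters because a fixed cutoff behaves differently under each. The following is a minimal sketch with made-up scores; the function names and the example values are hypothetical and are not GEMME output or MAVISp code.

```python
def min_max_normalise(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalise(scores):
    """Replace each score by rank / n (ties share their average rank)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / n for r in ranks]


# Made-up raw scores for five variants, skewed towards values near zero.
raw = [-8.0, -3.0, -1.0, -0.5, 0.0]
mm = min_max_normalise(raw)
rk = rank_normalise(raw)
print(mm)  # depends on the distance between values
print(rk)  # depends only on the ordering
print(sum(v < 0.5 for v in mm), sum(v < 0.5 for v in rk))
```

For these values, one variant falls below 0.5 after min-max scaling and two after rank scaling, so a threshold of 0.5 is only meaningful once the normalisation is stated. A threshold on the raw value, as in Supplementary Text S4, avoids this ambiguity.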

      • Note. This is a major comment as one of the claims is that the associated web-tool is user-friendly. While functional, the web app is very awkward to use for analysis on any more than a few variants at once. The fixed window size of the protein table necessitates excessive scrolling to reach your protein-of-interest. This will also get worse as more proteins are added. Suggestion: add a search/filter bar. The same applies to the dataset window.

      We have changed the structure of the webserver in such a way that now the whole website opens as its own separate window, instead of being confined within the size permitted by the website at DTU. This solves the fixed window size issue. Hopefully, this will improve the user experience.

      We have refactored the web app by adding filtering functionality, both for the main protein table (that can now be filtered by UniProt AC, gene name or RefSeq ID) and the mutations table. Doing this required a general overhaul of the table infrastructure (we changed the underlying engine that renders the tables).

      • You are unable to copy anything out of the tables.
      • Hyperlinks in the tables only seem to work if you open them in a new tab or window.

      The table overhaul fixed both of these issues.

      • All entries in the reference column point to the MAVISp preprint even when data from other sources is displayed (e.g. MAVE studies).

      We clarified the meaning of the reference column in the Documentation on the MAVISp website, as we realized it had confused the reviewer. The reference column is meant to cite the papers where the computationally-generated MAVISp data are used, not external sources. Since we also have the experimental data module in the most recent release, we have also refactored the MAVISp website by adding a “Datasets and metadata” page, which details metadata for key modules. These include references to data from external sources that we include in MAVISp on a case-by-case basis (for example the results of a MAVE experiment). Additionally, we have verified that the papers using MAVISp data are updated in https://elelab.gitbook.io/mavisp/overview/publications-that-used-mavisp-data and in the csv file of the interested proteins.

      Below are the publications currently listed as using MAVISp data:

      | Protein(s) | Title | Journal | PMID | DOI |
      | --- | --- | --- | --- | --- |
      | SMPD1 | ASM variants in the spotlight: A structure-based atlas for unraveling pathogenic mechanisms in lysosomal acid sphingomyelinase | Biochim Biophys Acta Mol Basis Dis | 38782304 | https://doi.org/10.1016/j.bbadis.2024.167260 |
      | TRAP1 | Point mutations of the mitochondrial chaperone TRAP1 affect its functions and pro-neoplastic activity | Cell Death & Disease | 40074754 | https://doi.org/10.1038/s41419-025-07467-6 |
      | BRCA2 | Saturation genome editing-based clinical classification of BRCA2 variants | Nature | 39779848 | https://doi.org/10.1038/s41586-024-08349-1 |
      | TP53, GRIN2A, CBFB, CALR, EGFR | TRAP1 S-nitrosylation as a model of population-shift mechanism to study the effects of nitric oxide on redox-sensitive oncoproteins | Cell Death & Disease | 37085483 | https://doi.org/10.1038/s41419-023-05780-6 |
      | KIF5A, CFAP410, PILRA, CYP2R1 | Computational analysis of five neurodegenerative diseases reveals shared and specific genetic loci | Computational and Structural Biotechnology Journal | 38022694 | https://doi.org/10.1016/j.csbj.2023.10.031 |
      | KRAS | Combining evolution and protein language models for an interpretable cancer driver mutation prediction with D2Deep | Brief Bioinform | 39708841 | https://doi.org/10.1093/bib/bbae664 |
      | OPTN | Decoding phospho-regulation and flanking regions in autophagy-associated short linear motifs | Communications Biology | 40835742 | https://doi.org/10.1038/s42003-025-08399-9 |
      | DLG4, GRB2, SMPD1 | Deciphering long-range effects of mutations: an integrated approach using elastic network models and protein structure networks | JMB | 40738203 | https://doi.org/10.1016/j.jmb.2025.169359 |

      Entering multiple mutants in the "mutations to be displayed" window is time-consuming for more than a handful of mutants. Suggestion: Add a box where multiple mutants can be pasted in at once from an external document.

      During the table overhaul, we have revised the user interface to add a text box that allows free copy-pasting of mutation lists. While we understand having a single input box would have been ideal, the former selection interface (which is also still available) doesn’t allow copy-paste. This is a known limitation in Streamlit.
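
      As a rough illustration of what such a paste box has to do with its input, the sketch below splits a pasted block of text into individual mutations, validates them, and removes duplicates. It is a minimal sketch under stated assumptions, not the code used in the MAVISp web app: the function name `parse_mutations`, the accepted separators, and the single-letter format (e.g. E102K) are all assumptions made for this example.

```python
import re

# Assumed format: wild-type residue, position, mutant residue (e.g. "E102K"),
# using the 20 standard one-letter amino acid codes.
AA = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"^[{AA}]\d+[{AA}]$")


def parse_mutations(text):
    """Split pasted text into (valid, invalid) mutation lists.

    Tokens may be separated by whitespace, commas, or semicolons.
    Duplicates are dropped and the first-seen order is preserved.
    """
    valid, invalid, seen = [], [], set()
    for token in re.split(r"[\s,;]+", text.strip()):
        if not token:
            continue
        token = token.upper()
        if token in seen:
            continue
        seen.add(token)
        if MUTATION_RE.match(token):
            valid.append(token)
        else:
            invalid.append(token)
    return valid, invalid


valid, invalid = parse_mutations("E102K, H86D\nT29I; e102k foo")
print(valid, invalid)
```

      Returning the rejected tokens separately lets an interface report which pasted entries were not recognized instead of dropping them silently.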

      Minor comments

      • Grammar. I appreciate that this manuscript may have been compiled by a non-native English speaker, but I would be remiss not to point out that there are numerous grammar errors throughout, usually sentence order issues or non-pluralisation. The meaning of the authors is mostly clear, but I recommend very thoroughly proof-reading the final version.

      We have proofread the final version of the manuscript.

      • There are numerous proteins that I know have high-quality MAVE datasets that are absent in the database e.g. BRCA1, HRAS and PPARG.

      Yes, we are aware of this. It is far from trivial to properly import the datasets from multiplex assays; they often need to be treated on a case-by-case basis. We are carefully compiling all the MAVE data locally before releasing it in the public version of the database, which is why these datasets are missing. We are prioritizing the datasets that can be correlated with our predictions of changes in structural stability, and we will then cover the remaining datasets in batches. That said, we have checked the datasets for BRCA1, HRAS, and PPARG. We have imported those for PPARG and BRCA1 from ProteinGym, referring to the studies published in 10.1038/ng.3700 and 10.1038/s41586-018-0461-z, respectively. For HRAS, after checking both the available data and the literature in detail, we identified a suitable dataset (10.7554/eLife.27810) but struggled to determine a sensible cut-off for discriminating between pathogenic and non-pathogenic variants, so we have not included it in the MAVISp dataset for now. We will contact the authors to clarify which thresholds to apply before importing the data.

      • Checking one of the existing MAVE datasets (KRAS), I found that the variants were annotated as damaging, neutral or given a positive score (these appear to stand-in for gain-of-function variants). For better correspondence with the other columns, those with positive scores could be labelled as 'ambiguous' or 'uncertain'.

      In the KRAS case study presented in MAVISp, we used the protein abundance dataset reported in (http://dx.doi.org/10.1038/s41586-023-06954-0) and made available in the ProteinGym repository (specifically referenced at https://github.com/OATML-Markslab/ProteinGym/blob/main/reference_files/DMS_substitutions.csv#L153). We adopted the precalculated thresholds provided by the ProteinGym authors. We are therefore not sure whether the reviewer is referring to this dataset or to another KRAS dataset.

      • Numerous thresholds are defined for stabilizing / destabilizing / neutral variants in both the STABILITY and the LOCAL_INTERACTION modules. How were these thresholds determined? I note that (PMC9795540) uses a ΔΔG threshold of 1/-1 for defining stabilizing and destabilizing variants, which is relatively standard (though they also say that 2-3 would likely be better for pinpointing pathogenic variants).

      We improved the description of our classification strategies for both modules in the Documentation page of our website. Also, we explained more clearly the possible sources of ‘uncertain’ annotations for the two modules in both the web app (Documentation page) and main text. Briefly, in the STABILITY module, we consider FoldX and either Rosetta or RaSP to achieve a final classification. We first classify one and the other independently, according to the following strategy:

      • If DDG ≥ 3, the mutation is Destabilizing
      • If DDG ≤ −3, the mutation is Stabilizing
      • If −2 < DDG < 2, the mutation is Neutral
      • If 2 ≤ DDG < 3 or −3 < DDG ≤ −2, the mutation is Uncertain

      We then compare the classifications obtained by the two methods: if they agree, that is the final classification; if they disagree, the final classification is Uncertain. The thresholds were selected based on previous studies, in which variants with changes in stability below 3 kcal/mol did not show a markedly different abundance at the cellular level [10.1371/journal.pgen.1006739, 10.7554/eLife.49138].
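
      The consensus logic described for the STABILITY module can be sketched as follows. This is only an illustrative sketch, not the MAVISp implementation: the function names are hypothetical, and the Neutral bin (−2 < DDG < 2) and the Uncertain bin for values between the cut-offs (2 ≤ |DDG| < 3) are assumptions made here so that every value falls into a class.

```python
def classify_stability(ddg):
    """Classify a single ΔΔG value (kcal/mol) from one method."""
    if ddg >= 3.0:
        return "Destabilizing"
    if ddg <= -3.0:
        return "Stabilizing"
    if -2.0 < ddg < 2.0:
        return "Neutral"  # assumed interval for this sketch
    return "Uncertain"  # assumed: 2 <= |ddg| < 3 falls between the cut-offs


def stability_consensus(foldx_ddg, other_ddg):
    """Combine FoldX with a second method (Rosetta or RaSP).

    Each method is classified independently; agreement gives the final
    class, disagreement gives "Uncertain".
    """
    first = classify_stability(foldx_ddg)
    second = classify_stability(other_ddg)
    return first if first == second else "Uncertain"


print(stability_consensus(3.5, 4.1))   # both methods above the +3 cut-off
print(stability_consensus(3.5, 0.1))   # methods disagree
```

      Note that a mutation can end up Uncertain for two different reasons in this scheme: the two methods disagree, or both place it in the intermediate range.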

      The LOCAL_INTERACTION module works similarly to the STABILITY module, in that Rosetta and FoldX are considered independently, and an implicit classification is performed for each, according to the following rules (values in kcal/mol):

      • If DDG > 1, the mutation is Destabilizing
      • If DDG < −1, the mutation is Stabilizing
      • If −1 ≤ DDG ≤ 1, the mutation is Neutral

      Each mutation is therefore classified for both methods. If the methods agree (i.e., if they classify the mutation in the same way), their consensus is the final classification for the mutation; if they do not agree, the final classification will be Uncertain.

      If a mutation does not have an associated free energy value, the relative solvent accessible area is used to classify it: if SAS > 20%, the mutation is classified as Uncertain, otherwise it is not classified.

      The thresholds here were selected according to best practices followed by the tool authors and, more generally, in the literature, as the reviewer also noticed.
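
      For illustration, the LOCAL_INTERACTION rules, including the fallback on relative solvent accessibility, could be written as in the sketch below. This is a hedged sketch rather than the MAVISp code: the function names are hypothetical, the Stabilizing (DDG < −1) and Neutral (−1 ≤ DDG ≤ 1) bins are assumed symmetric counterparts of the stated DDG > 1 rule, and treating a value missing from either method as "no associated free energy value" is also an assumption.

```python
from typing import Optional


def classify_binding(ddg: float) -> str:
    """Classify one binding ΔΔG value (kcal/mol) from one method."""
    if ddg > 1.0:
        return "Destabilizing"
    if ddg < -1.0:
        return "Stabilizing"  # assumed symmetric cut-off
    return "Neutral"  # assumed: -1 <= ddg <= 1


def local_interaction_call(
    foldx_ddg: Optional[float],
    rosetta_ddg: Optional[float],
    rel_sas_percent: Optional[float],
) -> Optional[str]:
    """Return the final class, or None when the mutation is not classified."""
    if foldx_ddg is None or rosetta_ddg is None:
        # No free energy value: fall back to relative solvent accessibility.
        if rel_sas_percent is not None and rel_sas_percent > 20.0:
            return "Uncertain"
        return None
    first = classify_binding(foldx_ddg)
    second = classify_binding(rosetta_ddg)
    return first if first == second else "Uncertain"


print(local_interaction_call(1.8, 2.3, 35.0))    # methods agree
print(local_interaction_call(None, None, 35.0))  # exposed site, no ΔΔG
print(local_interaction_call(None, None, 5.0))   # buried site, no ΔΔG
```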

      • "Overall, with the examples in this section, we illustrate different applications of the MAVISp results, spanning from benchmarking purposes, using the experimental data to link predicted functional effects with structural mechanisms or using experimental data to validate the predictions from the MAVISp modules."

      The last of these points is not an application of MAVISp, but rather a way in which external data can help validate MAVISp results. Furthermore, none of the examples given demonstrate an application in benchmarking (what is being benchmarked?).

      We have revised the statements to avoid this confusion in the reader.

      • Transcription factors section. This section describes an intended future expansion to MAVISp, not a current feature, and presents no results. As such, it should be moved to the conclusions/future directions section.

      We have removed this section and included a mention in the conclusions as part of the future directions.

      • Figures. The dot-plots generated by the web app, and in Figures 4, 5 and 6 have 2 legends. After looking at a few, it is clear that the lower legend refers to the colour of the variant on the X-axis - most likely referencing the ClinVar effect category. This is not, however, made clear either on the figures or in the app.

      The reviewer’s interpretation of the second legend is correct: it does refer to the ClinVar classification. Nonetheless, we understand that the positioning of the legend makes it unclear what the legend refers to. We have revised the captions of the figures in the main text. On the web app, we have changed the location of the figure legend for the ClinVar effect category and added a label to make it clear what the classification refers to.

      • "We identified ten variants reported in ClinVar as VUS (E102K, H86D, T29I, V91I, P2R, L44P, L44F, D56G, R11L, and E25Q, Fig.5a)" E25Q is benign in ClinVar and has had that status since first submitted.

      We have corrected this in the text and the statements related to it.

      Significance

      Platforms that aggregate predictors of variant effect are not a new concept, for example dbNSFP is a database of SNV predictions from variant effect predictors and conservation predictors over the whole human proteome. Predictors such as CADD and PolyPhen-2 will often provide a summary of other predictions (their features) when using their platforms. MAVISp's unique angle on the problem is in the inclusion of diverse predictors from each of its different modules, giving a much wider perspective on variants and potentially allowing the user to identify the mechanistic cause of pathogenicity. The visualisation aspect of the web app is also a useful addition, although the user interface is somewhat awkward. Potentially the most valuable aspect of this study is the associated gitbook resource containing reports from biocurators for proteins that link relevant literature and analyse ClinVar variants. Unfortunately, these are only currently available for a small minority of the total proteins in the database with such reports. For improvement, I think that the paper should focus more on the precise utility of the web app / gitbook reports and how to interpret the results rather than going into detail about the underlying pipeline.

      We appreciate the interest in the gitbook resource, which we also see as very valuable and one of the strengths of our work. We have now implemented a new strategy based on a Python script, introduced in the MAVISp toolkit, that generates a template Markdown file of the report, which can be further customized and imported into GitBook directly (https://github.com/ELELAB/mavisp_accessory_tools/). This should allow us to streamline the production of more reports. We are currently assigning proteins in batches to biocurators for reporting through the mavisp_data_collection GitHub repository to expand their coverage. We also revised the text and added a section on the interpretation of results from MAVISp, with a focus on the utility of the web app and reports.

      In terms of audience, the fast look-up and visualisation aspects of the web-platform are likely to be of interest to clinicians in the interpretation of variants of unknown clinical significance. The ability to download the fully processed dataset on a per-protein database would be of more interest to researchers focusing on specific proteins or those taking a broader view over multiple proteins (although a facility to download the whole database would be more useful for this final group).

      While our website only displays the dataset per protein, the whole dataset, including all the MAVISp entries, is available at our OSF repository (https://osf.io/ufpzm/), which is cited in the paper and linked on the MAVISp website. We have further modified the MAVISp database to add a link to the repository in the modes page, so that it is more visible.

      My expertise. - I am a protein bioinformatician with a background in variant effect prediction and large-scale data analysis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Evidence, reproducibility and clarity:

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work correctly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window. In ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would like to explore the data myself and provide feedback on the user experience and utility.

      We have tried to reproduce the issue mentioned by the reviewer, using the exact same Ubuntu and Firefox versions, but unfortunately failed to reproduce it. The website worked fine for us in that environment. The issue experienced by the reviewer may have been due to either a temporary issue with the web server or a problem with the specific browser environment they were working in, which we are unable to reproduce. It would be useful to know the date on which this happened, to verify whether downtime on the DTU IT services side made the webserver inaccessible.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      We appreciate the reviewer’s concerns about long-term sustainability. It is a fair point that we discuss within our steering group, which oversees and plans the activities and meets monthly. Adding entries to MAVISp is moving more and more towards automation as we grow, and we aim to minimize the manual work where applicable. Still, expert intervention is really needed in some of the steps, and we do not want to give it up. We intend to keep working on MAVISp to make the process of adding and updating entries as automated as possible, and to streamline the process when manual intervention is necessary. From the point of view of the biocurators, there are three core workflows to use for the default modules, which also automatically cover the source of annotations. We are currently working to streamline the procedures behind LOCAL_INTERACTION, which is the most challenging one. On the data managers' and maintainers' side, we have workflows and protocols that help us with automation, quality control, etc., and we keep working to improve them. Among these, we have workflows for updating old entries. As an example, the update of erroneously attributed RefSeq data (pointed out by reviewer 2) took us only one week overall (from assigning revisions to importing into the database) because we have a reduced version of the Snakemake workflow for automation that can act on only the affected modules. We have also streamlined the generation of the templates for the gitbook reports (see also the answer to reviewer 2).

      The update of old entries is planned and carried out regularly. We also deposit the old datasets on OSF for transparency, in case someone needs to navigate and explore the changes. We have activities planned between May and August every year to update the old entries in relation to changes of protocols in the modules and updates in the core databases that we interact with (COSMIC, ClinVar, etc.). In case of major changes, the update activities continue in the fall. Other revisions can happen outside these time windows if an entry is needed for a specific research project and requires updating.

      Furthermore, the community of people contributing to MAVISp as biocurators or developers is growing and we have scientists contributing from other groups in relation to their research interest. We envision that for this resource to scale up, our team cannot be the only one producing data and depositing it to the database. To facilitate this we launched a pilot for a training event online (see Event page on the website) and we will repeat it once per year. We also organize regular meetings with all the active curators and developers to plan the activities in a sustainable manner and address the challenges we encounter.

      As stated in the manuscript, with the current team, automation, and resources that we have gathered around this initiative, we can provide updates to the public database every third month, and we have been regularly satisfied with them. Additionally, we are capable of processing 20 to 40 proteins every month, depending also on the need for revision or expansion of analyses on existing proteins. We also depend on these data for our own research projects and are fully committed to the resource.

      Additionally, we are planning future activities in these directions to improve scale up and sustainability:

      • Streamlining manual steps so that they are as convenient and fast as possible for our curators, e.g. by providing custom pages on the MAVISp website
      • Streamlining and automating the generation of useful output, for instance the reports, by using a combination of simple automation and large language models
      • Implementing ways to share our software and scripts with third parties, for instance by providing ready-made (or close to ready-made) containers or virtual machines
      • For a future version 2, if the database grows in a direction that is not compatible with Streamlit, the web data science framework we are currently using, we will rewrite the website using a framework that allows better flexibility and performance, for instance Django and a proper database backend.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      We thank the reviewer for this comment - we are aware of the upcoming EOL of Python 3.9. We tested MAVISp, both software package and web server, using Python 3.10 (which is the minimum supported version going forward) and Python 3.13 (which is the latest stable release at the time of writing) and updated the instructions in the README file on the MAVISp GitHub repository accordingly.

      We plan on keeping track of Python and library versions during our testing and updating them when necessary. In the future, we also plan to deploy Continuous Integration with automated testing for our repository, making this process easier and more standardized.

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      Since 2024, we have been depositing all previous versions of the dataset on OSF, the repository linked to the MAVISp website, at https://osf.io/ufpzm/files/osfstorage (folder: previous_releases). We prefer to keep everything on OSF, as we also use it to deposit, for example, the MD trajectory data.

      Additionally, we report all the changes in the NEWS space of the GitHub page that we use as a space for interaction between biocurators, developers, and data managers within the MAVISp community: https://github.com/ELELAB/mavisp_data_collection

      Finally, the individual tools are all available in our GitHub repository, where version control is in place (see Table S1, where we have now mapped all the resources used in the framework).

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation and it's quite muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      We revised the introduction in light of these suggestions. We have split the paragraph as recommended and added a longer second paragraph about VEPs and using structural data in the context of VEPs. We have also added the citation that the reviewer kindly recommended.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we can classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      We revised the statement in light of this comment from the reviewer.

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.

      We have revised the text making the two intervals explicit, for better clarity.

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset, and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      We have included the data from Mighell’s phosphatase assay, as provided by MaveDB, in the MAVISp database within the experimental_data module for PTEN. We have also revised the case study to include these data and to better explain the decision to support both the ProteinGym and MaveDB classifications in MAVISp (when available). See the revised Figure 3, Table 1, and the corresponding text.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      The reviewer is correct. We have revised the terminology used in the manuscript and now refer to VEPs (Variant Effect Predictors).

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      We have revised the website, adding a filtering option. In detail, we have refactored the web app by adding filtering functionality, both for the main protein table (that can now be filtered by UniProt AC, gene name, or RefSeq ID) and the mutations table. Doing this required a general overhaul of the table infrastructure (we changed the underlying engine that renders the tables).
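
      The kind of filtering described here can be illustrated with a small, framework-independent sketch. It is not the code behind the MAVISp web app (which relies on the table engine of the web framework): the function name `filter_proteins`, the dictionary keys, and the three sample rows are assumptions chosen purely for illustration.

```python
# Illustrative sample rows; the keys are hypothetical column names.
PROTEINS = [
    {"uniprot_ac": "P01116", "gene_name": "KRAS", "refseq_id": "NP_004976"},
    {"uniprot_ac": "P60484", "gene_name": "PTEN", "refseq_id": "NP_000305"},
    {"uniprot_ac": "P04637", "gene_name": "TP53", "refseq_id": "NP_000537"},
]

SEARCHABLE_KEYS = ("uniprot_ac", "gene_name", "refseq_id")


def filter_proteins(rows, query):
    """Keep rows where any searchable field contains the query.

    Matching is a case-insensitive substring match; an empty query
    returns all rows unchanged.
    """
    needle = query.strip().lower()
    if not needle:
        return list(rows)
    return [
        row
        for row in rows
        if any(needle in str(row.get(key, "")).lower() for key in SEARCHABLE_KEYS)
    ]


print([row["gene_name"] for row in filter_proteins(PROTEINS, "kras")])
print([row["gene_name"] for row in filter_proteins(PROTEINS, "np_000305")])
```

      A substring match across all three identifiers means a user does not need to know which type of identifier they are typing.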

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      We have revised and updated the data sources on the website, adding a metadata section with relevant information, including MaveDB references where applicable.

      Figure 2 is somewhat confusing, as it partially interleaves results from two different proteins. This would be nicer as two separate figures, one on each protein, or just of a single protein.

      As suggested by the reviewer, we have now revised the figure and corresponding legends and text, focusing only on one of the two proteins.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      We have revised Figure 3 to solve these issues and to integrate new data from the comparison with the phosphatase assay.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address this kind of issues.

      We have carefully proofread the paper for these inconsistencies.

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      We have added the reference that the reviewer recommended.

      On page 11, lines 338-339 the authors mention some interesting proteins including BCL2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      The assay mentioned in the paper refers to an experimental setup designed to investigate mutations that may confer resistance to the drug venetoclax. We have taken the first steps to implement a MAVISp module aimed at evaluating the impact of mutations on drug binding using alchemical free energy perturbations (ensemble mode), but it is far from complete. We expect to import these data once the module is finalized, since they can be used to benchmark it, and BCL2 is one of the proteins that we are using to develop and test the new module.

      Reviewer #3 (Significance (Required)):

      Significance:

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately as I wasn't able to use the site I can't comment further on MAVISp's position in the landscape.

      We have expanded the conclusions section to add a comparison with, and citations of, previously published work, and linked to a review we published last year that frames MAVISp in the context of computational frameworks for the prediction of variant effects. In brief, the Genomics 2 Proteins portal (G2P) includes data from several sources, including some that overlap with MAVISp, such as Phosphosite or MaveDB, as well as features calculated on the protein structure. ProtVar also aggregates mutations from different sources and includes both variant effect predictors and predictions of changes in stability upon mutation, as well as predictions of complex structures. These approaches only partially overlap with MAVISp. G2P is primarily focused on structural and other annotations of the effect of a mutation; it doesn’t include features about changes of stability, binding, or long-range effects, and doesn’t attempt to classify the impact of a mutation according to its measurements. It also doesn’t include information on protein dynamics. Similarly, ProtVar does not include information on binding free energies, long-range effects, or dynamics.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work properly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window, and in ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would have liked to be able to explore the data myself and provide feedback on the user experience and utility.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation, and the two are muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we are able to classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.
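
      The coverage problem described above (a range of values that falls into no bin) can be checked mechanically. The following is a minimal sketch only: the function name `find_gaps` and the cutoffs in `example_bins` are hypothetical, chosen to reproduce a gap between -3 and -2 kcal/mol, and are not taken from the manuscript or the MAVISp code.

```python
def find_gaps(bins, lo, hi):
    """Return the sub-intervals of [lo, hi] that no (start, end, label) bin covers."""
    gaps = []
    cursor = lo  # everything below `cursor` is already known to be covered
    for start, end, _label in sorted(bins):
        if start > cursor:
            # Nothing covers the stretch between `cursor` and this bin's start.
            gaps.append((cursor, min(start, hi)))
        cursor = max(cursor, end)
        if cursor >= hi:
            break
    if cursor < hi:
        gaps.append((cursor, hi))
    return gaps


# Hypothetical cutoffs (kcal/mol) for illustration only.
example_bins = [
    (-10, -3, "stabilizing"),
    (-2, 2, "neutral"),
    (2, 3, "uncertain"),
    (3, 10, "destabilizing"),
]

print(find_gaps(example_bins, -10, 10))
```

      With these illustrative bins the only uncovered stretch is the one from -3 to -2, which is the kind of omission a check like this would flag before the cutoffs are written into the text.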

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      I found Figure 2 to be a bit confusing in that it partially interleaves results from two different proteins. I think this would be nicer as two separate figures, one on each protein, or just of a single protein.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address these kinds of issues.

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      On page 11, lines 338-339 the authors mention some interesting proteins including BCL2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      Significance

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately as I wasn't able to use the site I can't comment further on MAVISp's position in the landscape.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors present a pipeline and platform, MAVISp, for aggregating, displaying and analysis of variant effects with a focus on reclassification of variants of uncertain clinical significance and uncovering the molecular mechanisms underlying the mutations.

      Major comments:

      • On testing the platform, I was unable to look up a specific variant in ADCK1 (rs200211943, R115Q). I found that despite stating that the mapped RefSeq ID was NP_001136017 in the HGVSp column, it was actually mapped to the canonical UniProt sequence (Q86TW2-1). NP_001136017 actually maps to Q86TW2-3, which is missing residues 74-148 compared to the -1 isoform. The UniProt canonical sequence has no exact RefSeq mapping, so the HGVSp column is incorrect in this instance. This mapping issue may also affect other proteins and result in incorrect HGVSp identifiers for variants.
      • The paper lacks a section on how to properly interpret the results of the MAVISp platform (the case-studies are useful, but don't lay down any global rules for interpreting the results). For example: How should a variant with conflicts between the variant impact predictors be interpreted? Are certain indicators considered more 'reliable' than others?
      • In the Methods section, GEMME is stated as being rank-normalised with 0.5 as a threshold for damaging variants. On checking the data downloaded from the site, GEMME was not rank-normalised but rather min-max normalised. Furthermore, Supplementary text S4 conflicts with the Methods section over how GEMME scores are classified; S4 states that a raw-value threshold of -3 is used.
      • Note. This is a major comment as one of the claims is that the associated web-tool is user-friendly. While functional, the web app is very awkward to use for analysis on any more than a few variants at once.
        • The fixed window size of the protein table necessitates excessive scrolling to reach your protein-of-interest. This will also get worse as more proteins are added. Suggestion: add a search/filter bar.
        • The same applies to the dataset window.
        • You are unable to copy anything out of the tables.
        • Hyperlinks in the tables only seem to work if you open them in a new tab or window.
        • All entries in the reference column point to the MAVISp preprint even when data from other sources is displayed (e.g. MAVE studies).
        • Entering multiple mutants in the "mutations to be displayed" window is time-consuming for more than a handful of mutants. Suggestion: Add a box where multiple mutants can be pasted in at once from an external document.
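
      On the GEMME normalisation point in the list above, the two schemes can give different classifications at the same 0.5 threshold. A minimal sketch of the difference follows; the scores are invented for illustration (not MAVISp or GEMME output), the helper names are hypothetical, and the rank scheme shown (fraction of scores less than or equal to each value) is only one of several possible rank normalisations.

```python
def min_max_normalise(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("min-max normalisation is undefined for constant scores")
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalise(scores):
    """Replace each score by the fraction of scores less than or equal to it."""
    n = len(scores)
    return [sum(1 for t in scores if t <= s) / n for s in scores]


# Invented raw scores with one strong outlier.
raw = [-8.0, -1.0, -0.5, 0.0]

mm = min_max_normalise(raw)   # the outlier compresses the rest towards 1
rk = rank_normalise(raw)      # evenly spaced regardless of the outlier

print(mm, sum(v > 0.5 for v in mm))
print(rk, sum(v > 0.5 for v in rk))
```

      Because a single outlier stretches the min-max scale, three of the four invented scores land above 0.5 under min-max normalisation but only two do under rank normalisation, so the stated threshold is not interchangeable between the two schemes.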

      Minor comments

      • Grammar. I appreciate that this manuscript may have been compiled by a non-native English speaker, but I would be remiss not to point out that there are numerous grammar errors throughout, usually sentence order issues or non-pluralisation. The meaning of the authors is mostly clear, but I recommend very thoroughly proof-reading the final version.
      • There are numerous proteins that I know have high-quality MAVE datasets that are absent in the database e.g. BRCA1, HRAS and PPARG.
      • Checking one of the existing MAVE datasets (KRAS), I found that the variants were annotated as damaging, neutral or given a positive score (these appear to stand-in for gain-of-function variants). For better correspondence with the other columns, those with positive scores could be labelled as 'ambiguous' or 'uncertain'.
      • Numerous thresholds are defined for stabilizing / destabilizing / neutral variants in both the STABILITY and the LOCAL_INTERACTION modules. How were these thresholds determined? I note that (PMC9795540) uses a ΔΔG threshold of 1/-1 for defining stabilizing and destabilizing variants, which is relatively standard (though they also say that 2-3 would likely be better for pinpointing pathogenic variants).
      • "Overall, with the examples in this section, we illustrate different applications of the MAVISp results, spanning from benchmarking purposes, using the experimental data to link predicted functional effects with structural mechanisms or using experimental data to validate the predictions from the MAVISp modules."

      The last of these points is not an application of MAVISp, but rather a way in which external data can help validate MAVISp results. Furthermore, none of the examples given demonstrate an application in benchmarking (what is being benchmarked?).
      • Transcription factors section. This section describes an intended future expansion to MAVISp, not a current feature, and presents no results. As such, it should probably be moved to the conclusions/future directions section.
      • Figures. The dot-plots generated by the web app, and in Figures 4, 5 and 6 have 2 legends. After looking at a few, it is clear that the lower legend refers to the colour of the variant on the X-axis - most likely referencing the ClinVar effect category. This is not, however, made clear either on the figures or in the app.
      • "We identified ten variants reported in ClinVar as VUS (E102K, H86D, T29I, V91I, P2R, L44P, L44F, D56G, R11L, and E25Q, Fig.5a)"

      E25Q is benign in ClinVar and has had that status since first submitted.

      Significance

      Platforms that aggregate predictors of variant effect are not a new concept, for example dbNSFP is a database of SNV predictions from variant effect predictors and conservation predictors over the whole human proteome. Predictors such as CADD and PolyPhen-2 will often provide a summary of other predictions (their features) when using their platforms. MAVISp's unique angle on the problem is in the inclusion of diverse predictors from each of its different modules, giving a much wider perspective on variants and potentially allowing the user to identify the mechanistic cause of pathogenicity. The visualisation aspect of the web app is also a useful addition, although the user interface is somewhat awkward. Potentially the most valuable aspect of this study is the associated gitbook resource containing reports from biocurators for proteins that link relevant literature and analyse ClinVar variants. Unfortunately, these are only currently available for a small minority of the total proteins in the database with such reports.

      For improvement, I think that the paper should focus more on the precise utility of the web app / gitbook reports and how to interpret the results rather than going into detail about the underlying pipeline.

      In terms of audience, the fast look-up and visualisation aspects of the web-platform are likely to be of interest to clinicians in the interpretation of variants of unknown clinical significance. The ability to download the fully processed dataset on a per-protein basis would be of more interest to researchers focusing on specific proteins or those taking a broader view over multiple proteins (although a facility to download the whole database would be more useful for this final group).

      My expertise.

      • I am a protein bioinformatician with a background in variant effect prediction and large-scale data analysis.
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript, "MAVISp: A Modular Structure-Based Framework for Protein Variant Effects," presents a significant new resource for the scientific community, particularly in the interpretation and characterization of genomic variants. The authors have developed a comprehensive and modular computational framework that integrates various structural and biophysical analyses, alongside existing pathogenicity predictors, to provide crucial mechanistic insights into how variants affect protein structure and function. Importantly, MAVISp is open-source and designed to be extensible, facilitating reuse and adaptation by the broader community.

      Major comments:

      • While the manuscript is formally well-structured (with clear Introduction, Results, Conclusions, and Methods sections), I found it challenging to follow in some parts. In particular, the Introduction is relatively short and lacks a deeper discussion of the state-of-the-art in protein variant effect prediction. Several methods are cited but not sufficiently described, as if prior knowledge were assumed. OPTIONAL: Extend the Introduction to better contextualize existing approaches (e.g., AlphaMissense, EVE, ESM-based predictors) and clarify what MAVISp adds compared to each.
      • The workflow is summarized in Figure 1(b), which is visually informative. However, the narrative description of the pipeline is somewhat fragmented. It would be helpful to describe in more detail the available modules in MAVISp, and which of them are used in the examples provided. Since different use cases highlight different aspects of the pipeline, it would be useful to emphasize what is done step-by-step in each. OPTIONAL: Consider adding a table or a supplementary figure mapping each use case to the corresponding pipeline steps and modules used.
      • The text contains numerous acronyms, some of which are not defined upon first use or are only mentioned in passing. This affects readability. OPTIONAL: Define acronyms upon first appearance, and consider moving less critical technical details (e.g., database names or data formats) to the Methods or Supplementary Information. This would greatly enhance readability.
      • The code and trained models are publicly available, which is excellent. The modular design and use of widely adopted frameworks (PyTorch and PyTorch Geometric) are also strong points. However, the Methods section could benefit from additional detail regarding feature extraction and preprocessing steps, especially the structural features derived from AlphaFold2 models. OPTIONAL: Include a schematic or a table summarizing all feature types, their dimensionality, and how they are computed.
      • The section on transcription factors is relatively underdeveloped compared to other use cases and lacks sufficient depth or demonstration of its practical utility. OPTIONAL: Consider either expanding this section with additional validation or removing/postponing it to a future manuscript, as it currently seems preliminary.

      Minor comments:

      • Most relevant recent works are cited, including EVE, ESM-1v, and AlphaFold-based predictors. However, recent methods like AlphaMissense (Cheng et al., 2023) could be discussed more thoroughly in the comparison.
      • Figures are generally clear, though some (e.g., performance barplots) are quite dense. Consider enlarging font sizes and annotating key results directly on the plots.
      • Minor typographic errors are present. A careful proofreading is highly recommended. Below are some of the issues I identified:

      Page 3, line 46: "MAVISp perform" -> "MAVISp performs"

      Page 3, line 56: "automatically as embedded" -> "automatically embedded"

      Page 3, line 57: "along with to enhance" -> unclear; please revise

      Page 4, line 96: "web app interfaces with the database and present" -> "presents"

      Page 6, line 210: "to investigate wheatear" -> "whether"

      Page 6, lines 215-216: "We have in queue for processing with MAVISp proteins from datasets relevant to the benchmark of the PTM module." -> unclear sentence; please clarify

      Page 15, line 446: "Both the approaches" -> "Both approaches"

      Page 20, line 704: "advantage of multi-core system" -> "multi-core systems"

      Significance

      General assessment: the strongest aspects of the study are the modularity, open-source implementation, and the integration of structural information through graph neural networks. MAVISp appears to be one of the few publicly available frameworks that can easily incorporate AlphaFold2-based features in a flexible way, lowering the barrier for developing custom predictors. Its reproducibility and transparency make it a valuable resource. However, while the technical foundation is solid and the effort substantial, the scientific narrative and presentation could be significantly improved. The manuscript is dense and hard to follow in places, with a heavy use of acronyms and insufficient explanation of key design choices. Improving the descriptive clarity, especially in the early sections, would greatly enhance the impact of this work.

      Advance: to the best of my knowledge, this is one of the first modular platforms for protein variant effect prediction that integrates structural data from AlphaFold2 with bioinformatic annotations and even clinical data in an extensible fashion. While similar efforts exist (e.g., ESMfold, AlphaMissense), MAVISp distinguishes itself through openness and design for reusability. The novelty is primarily technical and practical rather than conceptual.

      Audience: this study will be of strong interest to researchers in computational biology, structural bioinformatics, and genomics, particularly those developing variant effect predictors or analyzing the impact of mutations in clinical or functional genomics contexts. The audience is primarily specialized, but the open-source nature of the tool may diffuse its use among more applied or translational users, including those working in precision medicine or protein engineering.

      Reviewer expertise: my expertise is in computational structural biology, molecular modeling, and (rather weak) machine learning applications in bioinformatics. I am familiar with graph-based representations of proteins, AlphaFold2, and variant effects based on Molecular Dynamics simulations. I do not have any direct expertise in clinical variant annotation pipelines.

  2. Nov 2025
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses
      + The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A).
      + Though this is not stated in the MS

      2. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      1. The text contains several...

      Response: We added...

      Referee #2

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Meroni and colleagues present evidence that CIP2A is required to recruit the SMX complex to sites of replication stress in mitotic cells. Whilst the data generated when using U2OS cells seem to support a role for CIP2A in recruiting the SMX complex to sites of replication stress to facilitate MiDAS, as the authors point out, this pathway is not conserved in DLD1 cells. Although the authors suggest that this discrepancy in the data may relate to the fact that U2OS cells are ALT positive and the DLD1 cells are not, there is no experimental evidence to support this hypothesis. It would have been nice if the authors had backed up this hypothesis with data relating to how CIP2A regulates the SMX-MiDAS pathway in other ALT positive and negative cell lines. Taken together, after reading this manuscript, I am left wondering whether CIP2A is really important for SMX-dependent MiDAS or whether it is a phenomenon that is found in some commonly used cancer cell lines and not others. Whilst it is important to publish conflicting results as they can explain why some research labs can reproduce published data and others can't, I think this manuscript would benefit from assessment of the role of CIP2A in mediating the recruitment of the SMX complex to carry out MiDAS in a variety of additional cancer cell lines and also non-cancer cell lines, such as RPE1-hTERT cells, to obtain some sort of consensus about the importance of CIP2A in dealing with mitotic replication stress.

      Comments:

      1. Fig.2A-E: Can the authors comment on the difference in number of APH-induced FANCD2, SLX4, Mus81 and XPF foci in mitotic U2OS cells? Given that SLX4 should be recruiting both XPF and Mus81, there is a disparity between the numbers of mitotic foci given that there are approximately 30 FANCD2 foci per mitotic cell following APH treatment. Additionally, why do the XPF foci not increase after APH exposure?
      2. Fig.2G: I would say that the 'full rescue' of Mus81 foci in the CIP2A KO cells complemented with WT CIP2A is not hugely convincing since there is only a difference of 1-2 foci between the WT and CIP2A KO cells treated with APH.
      3. Fig.3A: I am not really sure how biologically meaningful a difference of 0.03-0.04 EdU foci per chromosome is when comparing BRCA2 KO DLD1 cells treated with control siRNA versus CIP2A siRNA. Would it not have been better to treat the BRCA2 KO DLD1 cells with APH?
      4. Fig.3H-I: Given that the reduction in MiDAS in the CIP2A KO cl.7 cell line is likely a clonal artifact not related to the loss of CIP2A, it is unclear how to interpret the data about the EdU foci pattern on chromosomes presented in Fig.3H-I and its relevance to CIP2A. Therefore, I am not sure this data really adds anything to the manuscript.
      5. Fig.4H: The difference in Mus81 foci per mitotic cell with/without the expression of B6L is only one focus per mitotic cell. Based on this, it is difficult to make any real conclusions about whether the TOPBP1-CIP2A interaction is really required for the recruitment of Mus81 to sites of mitotic replication stress.

      Significance

      As mentioned above, it is clear that the role of CIP2A in regulating the mitotic replication stress response by promoting recruitment of the SMX complex to sites of mitotic replication stress to promote MiDAS is complicated and may be specific to some cancer cell lines and not others. Whilst it is not clear what the underlying reason for this is, this manuscript would definitely benefit from additional analysis of this pathway in other cancer and non-cancer cell lines to obtain a consensus about the role of CIP2A.

      This manuscript would appeal to fundamental research scientists interested in understanding the mechanisms underlying DNA damage repair, the replication stress response and mitotic regulation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In the manuscript entitled "CIP2A Mediates the Recruitment of the SLX4-MUS81-XPF Tri-Nuclease Complex in Mitosis and Protects Against Replication Stress" by Meroni et al., the authors have characterized localization of the CIP2A-TopBP1 complex as well as some aspects of its function in U2OS and DLD1 cell lines exposed to different types of stress. They find that replication stress due to BRCA2 KO, APH or ATRi results in increased focus formation of the CIP2A-TopBP1 complex in mitotic cells. Moreover, the authors find a significant decrease in EdU incorporation in mitotic cells when disrupting CIP2A in (i) U2OS exposed to ATRi or APH; (ii) DLD1 BRCA2 KO; and (iii) one clone of DLD1 with CIP2A KO, and a non-significant decrease in the other DLD1 CIP2A KO clone that they tested. Thus, under most of the tested conditions CIP2A is facilitating MiDAS. However, the authors find that expression of a previously characterised fragment of TopBP1 called B6L, which disrupts CIP2A-TopBP1 interaction, does not inhibit MiDAS in DLD1 cells.

      Major comments:

      It is convincing but not surprising that CIP2A-TopBP1 form more foci in mitotic cells after replication stress. The authors' statement in the abstract: "We demonstrate that in the absence of CIP2A, cells fail to recruit the SLX4-MUS81-XPF (SMX) tri-nuclease complex to sites of under-replicated DNA in mitosis, resulting in a high incidence of lagging chromosomes during anaphase and subsequent micronuclei formation" is not supported by experiments. The authors indeed show that absence of CIP2A leads to lagging chromosomes during anaphase and subsequent micronuclei formation (which has previously been shown) but they have not shown that it is the failure to recruit the SMX complex that results in the phenotypes they mention. The authors should rephrase or remove this claim.

      There is a discrepancy between the B6L-mediated disruption of TopBP1-CIP2A interaction having no effect on MiDAS in DLD1 cells (Fig. 4F) whereas knockout of CIP2A in DLD1 cells seems to have an effect (Fig. 3E). The most obvious explanation for this observation is that the B6L peptide does not fully abolish TopBP1-CIP2A interaction and can still allow for some SLX4-MUS81 recruitment that is not visible as foci but still sufficient to induce MiDAS. To understand whether MiDAS in DLD1 expressing B6L is dependent on the fraction of TopBP1 that can still form foci (according to Fig. 4D), the authors must co-stain for TopBP1 together with EdU detection to address whether they observe any colocalization of TopBP1 with MiDAS.

      Many of the experiments are only performed with two independent replicates. The authors must perform 3 independent replicates. Also, it is not clear how many cells were analysed for each replicate. This should be clearly stated and the mean of each replicate should always be shown. Statistical analyses should be carried out using the means of the replicates. The authors must provide data showing the efficiency of CIP2A knockdown and CIP2A expression in the complementation assay (Fig. 2G).
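
      The request above to base statistics on replicate means, rather than on pooled cells, can be sketched as follows. This is only an illustration: the foci counts are invented, the function names are hypothetical, and no claim is made about how the authors processed their data.

```python
from statistics import mean, stdev


def replicate_means(counts_by_replicate):
    """Collapse each replicate (a list of per-cell foci counts) to its mean."""
    return [mean(cells) for cells in counts_by_replicate]


def summarise(counts_by_replicate):
    """Summarise a condition with the replicate, not the cell, as the unit."""
    means = replicate_means(counts_by_replicate)
    return {
        "n_replicates": len(means),
        "cells_per_replicate": [len(cells) for cells in counts_by_replicate],
        "replicate_means": means,
        "mean_of_means": mean(means),
        # Spread across replicates; requires at least two replicates.
        "sd_of_means": stdev(means),
    }


# Invented per-cell foci counts for three independent replicates.
condition = [[2, 4, 6], [1, 3], [5, 5, 5, 5]]
print(summarise(condition))
```

      Any downstream test would then be run on `replicate_means` (here n = 3), so that analysing more cells per replicate does not inflate the apparent sample size.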

      Minor comments:

      The authors should change "U-2 OS" in the figures to "U2OS" for consistency.

      In figure 4D - is the increase with APH and S1 significant compared to S1 alone?

      Figure 3 B and C. It is worrying that there is a huge difference in the EdU foci/mitotic cell in the untreated condition from panel B to panel C.

      Fig 3F - is the increase in EdU incorporation after complementation significant?

      For figure 3I, representative images should be added.

      Significance

      The data presented in the manuscript are of high quality but unfortunately do not represent a big advance compared to current knowledge. Nevertheless, it is useful to have a side-by-side comparison of different cell lines and conditions and IF localization studies. Given the therapeutic interest in the CIP2A-TopBP1 pathway it is important to get all the details right, and researchers with an interest in DNA repair during mitosis will be interested in this work.

      Moreover, in this manuscript the authors demonstrate that the impact of CIP2A disruption on MiDAS is variable across different cell lines-and even between individual clones. The concept of MiDAS is still clouded by considerable ambiguity, possibly due to earlier studies overstating the consequences of knockdown or knockout. It is therefore great that this manuscript presents clear, unbiased observations, highlighting both inter-cell line differences and the partial nature of the effects. This kind of nuanced reporting is valuable for the field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary and Significance

      This is a timely and exciting study that provides us with some new molecular insights into mitotic DNA repair. It builds on previous studies that identified the CIP2A-TOPBP1 complex as a molecular tether that connects broken DNA ends that get transmitted from interphase into mitosis (PMID: 30898438, 35842428, 35842428). The results are also largely complementary with those of Martin et al. (BioRxiv preprint at https://doi.org/10.1101/2024.11.12.621593) and de Haan et al. (BioRxiv preprint at https://www.biorxiv.org/content/10.1101/2025.04.03.647079v1).

      The authors report three main findings, as summarized below.

      1) The CIP2A oncoprotein is involved in the cellular response to replication stress in mitosis.

      2) CIP2A is required for the recruitment of SLX4, MUS81, and XPF into foci during mitosis. SLX4 is a well-established protein scaffold for multiple DNA repair factors, including three structure-selective endonucleases called SLX1, MUS81-EME1, and XPF-ERCC1 that together, form the SMX tri-nuclease that removes DNA repair intermediates and chromosome entanglements during mitosis. In some cell lines, the SMX complex is required for mitotic DNA synthesis at sites of under-replicated DNA, thus ensuring complete DNA replication prior to cell division.

      3) The role(s) of CIP2A in MiDAS are cell line-dependent/context-dependent.

      In general, this is a solid body of microscopy-based work that includes appropriate cell models and experimental controls. The manuscript is well-written, and the data is presented coherently. The main findings will have important implications for researchers interested in mitotic DNA damage, genome stability, and cancer biology. After addressing the points below, I believe this manuscript will be suitable for publication.

      Major comments

      1) Figure 1C: The CIP2A-TOPBP1 PLA experiments are lacking critical controls, namely cells lacking or depleted of CIP2A and TOPBP1. These controls are necessary to provide confidence for the results presented in Figure 1C. If these controls are too expensive or time-demanding for the manuscript, then I recommend removing the PLA data from Figure 1C.

      2) In Figure 2, the authors conclude that the loss of SLX4, XPF, and MUS81 foci in CIP2A-depleted cells is synonymous with the loss of recruitment to DNA lesions. However, I can think of many other reasons that could explain the loss of foci. For example, do the authors know that the proteins are expressed to similar levels in cells with and without CIP2A? This should be tested by a simple western blot. Along the same vein, a biochemical fractionation and western blot of the soluble vs chromatin-bound fraction would complement and substantiate their microscopy-based assays in Figure 2. If the fractionation is not possible, then the text should be adjusted accordingly.

      3) The experimental set-up in Figure 2 probes whether CIP2A mediates the recruitment of SMX subunits - SLX4, XPF, MUS81 - but not the SMX complex per se, which would require the study of SLX4 point mutants that selectively ablate the interactions with XPF or MUS81 (but not CIP2A). As such, I suggest that they rephrase their wording appropriately.

      4) Western blots must be provided to substantiate the experiments performed with siRNA (Figure 1G-J, Figure 2A-E and 2H, Figure 3A-D, Figure 5B-D). Similarly, the authors should provide western blots to confirm the BRCA2 and CIP2A statuses in their KO cell lines, as well as the complementation cell lines. In the absence of this information, it is difficult for someone to make an independent and meaningful interpretation of their data.

      5) Most of the data presented in this manuscript is derived from n = 2 biological replicates. All of the experiments reported in the study should be repeated for n = 3 biological replicates.

      6) Since the authors report the median of their data, they should also report the interquartile range or confidence interval to display the uncertainty.
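
      As an illustration of comment 6, the following is a minimal sketch of reporting a median together with its interquartile range and a bootstrap 95% confidence interval. It is not the authors' analysis: the function name `median_with_spread`, the percentile-bootstrap method, and the foci counts are all assumptions made up for this example.

```python
import numpy as np


def median_with_spread(values, n_boot=2000, seed=0):
    """Return the median, the IQR, and a percentile-bootstrap 95% CI of the median.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    q1, med, q3 = np.percentile(values, [25, 50, 75])

    # Resample the data with replacement and take the median of each resample.
    rng = np.random.default_rng(seed)
    resamples = rng.choice(values, size=(n_boot, values.size), replace=True)
    boot_medians = np.median(resamples, axis=1)
    ci_low, ci_high = np.percentile(boot_medians, [2.5, 97.5])

    return {
        "median": float(med),
        "iqr": (float(q1), float(q3)),
        "ci95": (float(ci_low), float(ci_high)),
    }


# Made-up foci counts per mitotic cell, not data from the manuscript.
foci_per_cell = [0, 1, 1, 2, 2, 3, 3, 4, 5, 8]
summary = median_with_spread(foci_per_cell)
print(summary)
```

      Either measure of spread could then be quoted next to each median in the figure legends.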

      Minor comments

      1) The references can be improved by acknowledging some of the foundational papers on SLX4 and the SMX tri-nuclease.

      1.a) Page 3: Neither Minocherhomji et al. 2015 nor Pedersen et al. 2015 were the first to describe SLX4 as a scaffold for structure-selective endonucleases. The founding papers were published in 2009 (Svendsen et al. 2009, Munoz et al. 2009, Fekairi et al. 2009, Andersen et al. 2009) with important mechanistic studies on nuclease activation reported in 2013 (Wyatt et al. 2013, Castor et al. 2013) and 2017 (Wyatt et al. 2017).

      1.b) Page 6: The authors should cite Wyatt et al. 2013, alongside Castor et al. 2013 and Garner et al. 2013 since these 3 articles were published at similar times. They may also want to acknowledge previous work from the Hickson and Rosselli labs showing that XPF-ERCC1 and MUS81-EME1 are recruited to fragile sites in mitosis.

      2) To improve broad readability, the authors should remove the following abbreviations: Aph and WT.

      3) In several figures, the authors show that a given treatment causes a very small change in the number of foci observed per mitotic cell. Although the values may be statistically different, it is important that they discuss the biological significance of these small effects - for example, I am not convinced that a difference of 2-3 foci per cell is sufficient to induce a robust cellular response.

      4) The methods could be expanded to ensure reproducibility, particularly with respect to the drug treatments (e.g., timing, washes, etc.).

      Significance

      This is a timely and exciting study that provides us with some new molecular insights into mitotic DNA repair. It builds on previous studies that identified the CIP2A-TOPBP1 complex as a molecular tether that connects broken DNA ends that get transmitted from interphase into mitosis (PMID: 30898438, 35842428, 35842428). The results are also largely complementary with those of Martin et al. (BioRxiv preprint at https://doi.org/10.1101/2024.11.12.621593) and de Haan et al. (BioRxiv preprint at https://www.biorxiv.org/content/10.1101/2025.04.03.647079v1).

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their detailed comments, which have already helped us improve our manuscript. The responses below detail changes we have already made as part of the Review Commons revision plan, and further changes we expect to make in a longer revision period.


      Reviewer #1

      Major points

      It is mentioned throughout the manuscript that 3 plates were evaluated per line. I believe these are independently differentiated plates. This detail is critical concerning rigor and reproducibility. This should be clearly stated in the Methods section and in the first description of the experimental system in the Results section for Figure 1.

      These experimental details have now been clarified. Unless otherwise stated, all findings were confirmed in three independently differentiated plates from the same line or at least one differentiation from each of three lines.

      For the patient-specific lines - how many lines were derived per patient?

      This has now been clarified in the methods. Microfluidic reprogramming of a small number of amniocytes produces one line per patient representing a pool of clones. Subcloning from individual cells would not be possible within the timeframe of a pregnancy.

      Methods: For patient-specific iPSC lines, one independent iPSC line was obtained per patient following microfluidic mmRNA reprogramming.

      Was the Vangl2 variant introduced by prime editing? Base editing? The details of the methods are sparse.

      We have now expanded these details:

      Methods: VANGL2 knock-in lines were generated using CRISPR-Cas9 homology directed repair editing by Synthego (SO-9291367-1). The guide sequence was AUGAGCGAAGGGUGCGCAAG and the donor sequence was CAATGAGTACTACTATGAGGAGGCTGAGCATGAGCGAAGGGTGTGCAAGAGGAGGGCCAGGTGGGTCCCTGGGGGAGAAGAGGAGAG. Sequence modification was confirmed by Sanger sequencing before delivery of the modified clones, and Sanger sequencing was repeated after expansion of the lines (Supplementary Figure 5) as well as SNP arrays (Illumina iScan, not shown) confirming genomic stability.

      Some additional suggestions for improvement.

      The abstract could be more clearly written to effectively convey the study's importance. Here are some suggestions.

      Line 26: Insert "apicobasal" before "elongation" - the way it is written, I initially interpreted it as anterior-posterior elongation.

      Line 29: Please specify that the lines refer to 3 different established parent iPSC lines with distinct origins and established using different reprogramming methods, plus 2 control patient-derived lines. - The reproducibility of the cell behaviors is impressive, but this is not captured in the abstract.

      Line 32: add that this mutation was introduced by CRISPR-Cas9 base/prime editing.

      The last sentence of the abstract states that the study only links apical constriction to human NTDs, but also reveals that neural differentiation and apical-basal elongation were found.

      The introduction could also use some editing.

      Line 71: insert "that pulls actin filaments together" after "power strokes"

      Line 73: "apically localized," do you mean "mediolaterally" or "radially"?

      Line 75: Can you specify that PCP components promote "mediolaterally orientated" apical constriction

      Lines 127: Specify that NE functions including apical-basal elongation and neurodifferentiation are disrupted in patient-derived models

      These text changes have all been made.

      Reviewer #2:

      Major comments:

      1. Figure 1. The authors use F-actin to segment cell areas. Perhaps this could be done more accurately with ZO-1, as F-actin cables can cross the surface of a single cell. In any case, the authors need to show a measure of segmentation precision: segmented image vs. raw image plus a nuclear marker (DAPI, H2B-GFP), so we can check that the number of segmented cells matches the number of nuclei.

      We used ZO-1 to quantify apical areas of the VANGL2 knock-in lines in Figure 3. Segmentation of neuroepithelial apical areas based on F-actin staining is commonplace in the field (e.g. Fig 9 of Bogart & Brooks 2025 as a recent example), and is generally robust because the cell junctions are much brighter than any apical fibres not associated with the apical cortex. However, we accept that at earlier stages of differentiation there may be more apical fibres when cells are cuboidal. We have therefore repeated our analysis of apical area using ZO-1 staining as suggested, shown in the new Supplementary Figure 1, analysing a more temporally-detailed time course in one iPSC line. This new analysis confirms our finding of lack of apical area change between days 2-4 of differentiation, then progressive reduction of apical area between days 4-8, further validating our system. Including nuclear images is not helpful because of the high nuclear index of pseudostratified epithelia (e.g. see Supplementary Figure 7) which means that nuclei overlap along the apicobasal axis. Individual nuclei cannot be related to their apical surface in projected images.

      2. Lines 156-166. The authors claim that changes in gene expression precede morphological changes. I am not convinced this is supported by their data. Fig. 1g (epithelial thickness) and Fig. 1k (PAX6 expression) seem to have similar dynamics. The authors can perform a cross-correlation between the two plots to see which Δt gives maximum correlation. If Δt

      We are happy to do this analysis fully in revision. Our initial analysis performing cross-correlation between apical area and CDH2 protein in one line shows the highest cross-correlation at Δt = -1, suggesting neuroepithelial CDH2 increases before apical area decreases. In contrast, the same analysis comparing apical area versus PAX6 shows Δt = 0, suggesting concurrence. This analysis will be expanded to include the other markers we quantified and the manuscript text amended accordingly. We are keen to undertake additional experiments to test whether these cells swap their key cadherins - CDH1 and CDH2 - before they begin to undergo morphological changes (see the response to Reviewer 3's minor comment 1 immediately below).
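
      The kind of lagged cross-correlation discussed here can be sketched as follows. This is an illustrative example only, not the analysis performed for the manuscript: the function name `lagged_correlation`, the sign convention (a positive lag means the second series follows the first), and the two toy time courses are assumptions.

```python
import numpy as np


def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each integer lag.

    Assumed convention: a positive best lag means changes in y follow
    changes in x by that many time points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[: len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: len(y) + lag]
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result


# Toy daily time courses (made up): the marker rises one day before the
# morphological measure does.
marker = [0, 0, 0, 1, 2, 3, 3, 3, 3]
morphology = [0, 0, 0, 0, 1, 2, 3, 3, 3]

correlations = lagged_correlation(marker, morphology, max_lag=2)
best_lag = max(correlations, key=correlations.get)
print(best_lag, correlations[best_lag])
```

      With time courses this short, each lag discards one or more time points, so the correlation at large lags rests on very few values and should be interpreted cautiously.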

      3. Figure 2d. The laser ablation experiment in the presence of ROCK inhibitor is clear, as I can easily see the cell outlines before and after the experiment. In the absence of ROCK inhibitor, the cell edges are blurry, and I am not convinced the outline that the authors drew is really the cell boundary. Perhaps the authors can try to ablate a larger cell patch so that the change in area is more defined.

      The outlines on these images are not intended to show cell boundaries, but rather link landmarks visible at both timepoints to calculate cluster (not cell) change in area. This is as previously shown in Galea et al Nat Commun 2021 and Butler et al J Cell Sci 2019. We have now amended the visualisation of retraction in Figure 2 to make representation of differences between conditions more intuitive.
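
      One way to compute a landmark-based change in cluster area is sketched below. It assumes the landmarks are joined in order into a simple polygon whose area is obtained with the shoelace formula; this is an assumption for illustration, not necessarily the procedure used in the cited papers, and the function names and coordinates are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon from ordered (x, y) vertices (shoelace formula)."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def percent_area_change(before, after):
    """Percentage change in polygon area between two sets of matched landmarks."""
    area_before = polygon_area(before)
    area_after = polygon_area(after)
    return 100.0 * (area_after - area_before) / area_before


# Made-up landmark coordinates (arbitrary units) before and after ablation.
before = [(0, 0), (10, 0), (10, 10), (0, 10)]
after = [(-1, -1), (11, -1), (11, 11), (-1, 11)]
print(percent_area_change(before, after))
```

      Because the same landmarks are tracked at both timepoints, the measure does not require the polygon edges to coincide with cell boundaries.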

      4. Figure 2d. Do the cells become thicker after recoil?

      This is unlikely because the ablated surface remains in the focal plane. Unfortunately, we are unable to image perpendicularly to the direction of ablation to test whether their apical surface moves in Z even by a very small amount. This has now been clarified in the results:

      Results: The ablated surface remained within the focal plane after ablation, indicating minimal movement along the apical-basal axis.

      5. Figure 3. The authors mention their previous study in which they show that Vangl2 is not cell-autonomously required for neural closure. It will be interesting to study whether this is also the case in the present human model by using mosaic cultures.

      We agree with the reviewer that this is one of the exciting potential future applications of our model, which will first require us to generate stable fluorescently-tagged lines (to identify those cells which lack VANGL2). We will also need to extensively analyze controls to validate that mixing fluo-tagged and untagged lines does not alter the homogeneity of differentiation, or apical constriction, independently of VANGL2 deletion. As such, the reviewer is suggesting an altogether new project which carries considerable risk and will require us to secure dedicated funding to undertake.

      6. Lines 403-415. The authors report poor neural induction and neuronal differentiation in GOSB2. As far as I understand, this phenotype does not represent the in vivo situation. Thus, it is not clear to what extent the in vitro 2D model describes the human patient.

      The GOSB2 iPSC line we describe does represent the in vivo situation in Med24 knockout mouse embryos, but is clearly less severe because we are still able to detect MED24 protein expressed in this line. We do not have detailed clinical data of the patient from which this line was obtained to determine whether their neurological development is normal. However, it is well established that some individuals who have spina bifida also have abnormalities in supratentorial brain development. It is therefore likely that abnormalities in neuron differentiation/maturation are concomitant with spina bifida. Our findings in the GOSB2 line complement earlier studies which also identified deficiencies in the ability of patient-derived lines to form neurons, but were unable to functionally assess neuroepithelial cell behaviours we studied. This has now been clarified in the discussion:

      Discussion: Neuroepithelial cells of the GOSB2 line described here, which has partial loss of MED24, similarly produce a thinner neuroepithelium with larger apical areas. Although apical areas were not analysed in mouse models of Med24 deletion, these embryos also have shorter and non-pseudostratified neuroepithelium.

      Our GOSB2 line - which retains readily detectable MED24 protein - is clearly less severe than the mouse global knockout, and the clinical features of the patient from which this line was derived are milder than the phenotype of Med24 knockout embryos [68]. Mouse embryos lacking one of Med24's interaction partners in the mediator complex, Med1, also have thinner neuroepithelium and diminished neuronal differentiation but successfully close their neural tube [85].

      7.The experimental feat to derive cell lines from amniotic fluid and to perform experiments before birth is, in my view, heroic. However, I do not feel I learned much from the in vitro assays. There are many genetic changes that may cause the in vivo phenotype in the patient. The authors focus on MED24, but there is not enough convincing evidence that this is the key gene. I would like to suggest overexpression of MED24 as a rescue experiment, but I am not sure this is a single-gene phenotype. In addition, the fact that one patient line does not differentiate properly leads me to think that the patient lines do not strengthen the manuscript, and that perhaps additional clean mutations might contribute more.

      We thank the reviewer for their praise of our personalised medicine approach and fully agree that neural tube defects are rarely monogenic. The patient lines we studied were not intended to provide mechanistic insight, but rather to demonstrate the future applicability of our approach to patient care. Our vision is that every patient referred for fetal surgery of spina bifida will have amniocytes (collected as part of routine cystocentesis required before surgery) reprogrammed and differentiated into neuroepithelial cells, then neural progenitors, to help stratify their post-natal care. One could also picture these cells becoming an autologous source for future cell-based therapies if they pass our reproducible analysis pipeline as functional quality control. This has now been clarified in the discussion:

      Discussion: The multi-genic nature of neural tube defect susceptibility, compounded by uncontrolled environmental risk factors (including maternal age and parity [102]), means that patient-derived iPSC models are unlikely to provide mechanistic insight. They do provide personalised disease models which we anticipate will enable functional validation of genetic diagnoses for patients and their parents' recurrence risk in future pregnancies, and may eventually stratify patients' postnatal care. We also envision this model will enable quality control of patient-derived cells intended for future autologous cell replacement therapies, as is being developed in post-natal spinal cord injury [103].

      Minor comments:

      1. Figure 1c. Text is cropped at the edge of the image.

      This image has been corrected.

      Reviewer #2 (Significance (Required)):

      ...In addition, the model was unsuccessful in one of the two patient-derived lines, which limits generalizability and weakens claims of patient-specific predictive value.

      We disagree with the reviewer that "the model was unsuccessful in one of the two patient-derived lines". The GOSB1 line demonstrated deficiency of neuron differentiation independently of neuroepithelial biomechanical function, whereas the GOSB2 line showed earlier failure of neuroepithelial function. We also do not, at this stage, make patient-specific predictive claims: this will require longer-term matching of cell model findings with patient phenotypes over the next 5-10 years.

      Reviewer #3:

      Major comments

      1) One of my few concerns with this work is that the relative constriction of the apical surface with respect to the basal surface is not directly quantified for any of the experiments. This worry is slightly compounded by the 3D reconstructions in Figure 1h, and the observation that overall cell volume is reduced and cell height increased simultaneously to area loss. Additionally, the net impact of apical constriction in tissues in vivo is to create local or global curvature change, but all the images in the paper suggest that the differentiated neural tissues are an uncurved monolayer even missing local buckles. I understand that these cells are grown on flat adherent surfaces limiting global curvature change, but is there evidence of localized buckling in the monolayer? While I believe, along with the authors, that their phenotypes are likely failures in apical constriction, I think they should work to strengthen this conclusion. I think the easiest way (and hopefully using data they already have) would be to directly compare apical area to basal area on a cell-wise basis for some number of cells. Given the heterogeneity of cells, perhaps 30-50 cells per condition/line/mutant would be good? I am open to other approaches; this just seems like it may not require additional experiments.

      As the reviewer observes, our cultures cannot bend because they are adhered on a rigid surface. The apical and basal lengths of the cultures will therefore necessarily be roughly equal in length. Some inwards bending of the epithelium is expected at the edges of the dish, but these cannot be imaged. The live imaging we show in Figure 2 illustrates that, just as happens in vivo, apical constriction is asynchronous. This means not all cells will have 'bottle' shapes in the same culture. We now illustrate the evolution of these shapes in more detail in Supplementary Figure 1 (shown in point 2.1 above).

      Additionally, the reviewer's comment motivated us to investigate local buckles in the apical surface of our cultures when their apical surfaces are dilated by ROCK inhibition. We hypothesised that the very straight apical surface in normal cultures is achieved by a balance of apical cell size and tension with pressure differences at the cell-liquid interface. Consistent with our expectation, the apical surface of ROCK-inhibited cultures becomes wrinkled (new Supplementary Figure 3). The VANGL2-KI lines do not develop this tortuous apical surface (as shown in Figure 3), which is to be expected given their modification is present throughout differentiation unlike the acute dilation caused by ROCK inhibition.

      This new data complements our visualisation of apical constriction in live imaging, apical accumulation of phospho-myosin, and quantification of ROCK-dependent apical tension as independent lines of evidence that our cultures undergo apical constriction.

      2) Another slight experimental concern I have regards the difference in laser ablation experiments detailed in Figure 3h-i from those of Figure 2d-e. It seems like WT recoil values in 3h-i are more variable and of a lower average than the earlier experiments and given that it appears significance is reached mainly by impact of the lower values, can the authors explain if this variability is expected to be due to heterogeneity in the tissue, i.e. some areas have higher local tension? If so, would that correspond with more local apical constriction?

      There is no significant difference in recoil between the control lines in Figures 2 and 3, albeit the data in Figure 3 is more variable (necessitating more replicates: none were excluded). We also showed laser ablation recoil data in Supplementary Figure 10, in which we did identify a graphing error (now corrected, also no significant difference in recoil from the other control groups).

      Minor comments

      1) There seems to be a critical window at day 5 of the differentiation protocol, both in terms of cell morphology and the marker panel presented in Figure 1i. Do the authors have any data spanning the hours from day 5 to 6? If not, I don't think they need to generate any, but I do think this is a very interesting window worthy of further discussion for a couple of reasons. First, several studies of mouse neural tube closure have shown that various aspects of cell remodeling are temporally separable. For example, between Grego-Bessa et al 2016 and Brooks et al 2020 we can infer that apicobasal elongation rapidly increases starting at E8.5, whereas apical surface area reduction and constriction are apparent somewhat earlier at E8.0. I think it would be interesting to see if this separability is conserved in humans. Second, is there a sense of how the temporal correlation between the pluripotent and early neural fate marker data presented here corroborate or contradict the emerging set of temporally resolved RNA seq data sets of mouse development at equivalent early neural stages?

      Cell shape analysis between days 5 and 6 has now been added (see the response to point 2.1 below). As the reviewer predicted, this is a transition point when apical area begins to decrease and apicobasal elongation begins to increase.

      We also thank the reviewer for this prompt to more closely compare our data to the previous mouse publications, which we have added to the discussion. The Grego-Bessa 2016 paper appears to show an increase in thickness between E7.75 and E8.5, but these are not statistically compared. Previous studies showed rapid apicobasal elongation during the period of neural fold elevation, when neuroepithelial cells apically constrict. This has now been added to the discussion:

      Discussion: In mice, neuroepithelial apicobasal thickness is spatially-patterned, with shorter cells at the midline under the influence of SHH signalling [14,77,78]. Apicobasal thickness of the cranial neural folds increases from ~25 µm at E7.75 to ~50 µm at E8.5 [79], closely paralleling the elongation between days 2 and 8 of differentiation in our protocol. The rate of thickening is non-uniform, with the greatest increase occurring during elevation of the neural folds [80], paralleled in our model by the rapid increase in thickness between days 4-6 as apical areas decrease. Elevation requires neuroepithelial apical constriction and these cells' apical area also decreases between E7.75 and E8.5 in mice [79], but we and others have recently shown that this reduction is both region and sex-specific [14,81]. Specifically, apical constriction occurs in the lateral (future dorsal) neuroepithelium: this corresponds with the identity of the cells generated by the dual SMAD inhibition model we use [56]. More recently, Brooks et al [82] showed that the rapid reduction in apical area from E8-E8.5 is associated with cadherin switching from CDH1 (E-cadherin) to CDH2 (N-cadherin). This is also directly paralleled in our human system, which shows low-level co-expression of CDH1 and CDH2 at day 4 of differentiation, immediately before apical area shrinks and apicobasal thickness increases.

      Prompted by the in vivo data in Brooks et al (2025) [82], we are keen to further explore the timing of CDH1/CDH2 switching versus apical constriction with new experimental data in revisions.

      2) Can the authors elaborate a bit more on what is known regarding apicobasal thickening and pseudo-stratification and how their work fits into the current understanding in the discussion? This is a very interesting and less well studied mechanism critical to closure, which their model is well suited to directly address. I am thinking mainly of the Grego-Bessa et al., 2016 work on PTEN, though interestingly the work of Ohmura et al., 2012 on the NUAK kinases also shows reduced tissue thickening (and apical constriction) and I am sure I have missed others. Given that the authors identify MED24 as a likely candidate for the lack of apicobasal thickening in one of their patient derived lines, is there any evidence that it interacts with any of the known players?

      We have now added further discussion on the mechanisms by which the neuroepithelium undergoes apicobasal elongation. Nuclear compaction is likely to be necessary to allow pseudostratification and apicobasal elongation. The reviewer's comment has led us to realise that diminished chromatin compaction is a potential outcome of MED24 down-regulation in our GOSB2 patient-derived line. Figure 4D suggests the nuclei of our MED24 deficient patient-derived line are less compacted than control equivalents and we propose to quantify nuclear volume in more detail to explore this possibility.

      Additionally, we have already expanded our discussion as suggested by the reviewer:

      Discussion: Mechanistic separability of apical constriction and apicobasal elongation is consistent with biomechanical modelling of Xenopus neural tube closure showing that both are independently required for tissue bending [61]. Nonetheless, neuroepithelial apical constriction and apicobasal elongation are co-regulated in mouse models: for example, deletion of Nuak1/2 [83], Cfl1 [84], and Pten [79] all produce shorter neuroepithelium with larger apical areas. Neuroepithelial cells of the GOSB2 line described here, which has partial loss of MED24, similarly produce a thinner neuroepithelium with larger apical areas. Although apical areas were not analysed in mouse models of Med24 deletion, these embryos also have shorter and non-pseudostratified neuroepithelium.

      Our GOSB2 line - which retains readily detectable MED24 protein - is clearly less severe than the mouse global knockout, and the clinical features of the patient from which this line was derived are milder than the phenotype of Med24 knockout embryos [68]. Mouse embryos lacking one of Med24's interaction partners in the mediator complex, Med1, also have thinner neuroepithelium and diminished neuronal differentiation but successfully close their neural tube [85]. As general regulators of polymerase activity, MED proteins have the potential to alter the timing or level of expression of many other genes, including those already known to influence pseudostratification or apicobasal elongation. MED depletion also causes redistribution of cohesion complexes [86] which may impact chromatin compaction, reducing nuclear volume during differentiation.

      3) Is there any indication that Vangl2 is weakly or locally planar polarized in this system? Figure 2F seems to suggest not, but Supplementary Figure 5 does show at least more supracellular cable like structures that may have some polarity. I ask because polarization seems to be one of the properties that differs along the anteroposterior axis of the neural plate, and I wonder if this offers some insight into the position along the axis that this system most closely models?

      VANGL2 does not appear to be planar polarised in this system. This is similar to the mouse spinal neuroepithelium, in which apical VANGL2 is homogenous but F-actin is planar polarised (Galea et al Disease Models and Mechanisms 2018). We do observe local supracellular cable-like enrichments of F-actin in the apical surface of iPSC-derived neuroepithelial cells. We propose to compare the length of F-actin cables and coherency of their orientation at the start and end of neuroepithelial differentiation, and in wild-type versus VANGL2-mutant epithelia.

      4) I think some of the commentary on the strengths and limitations of the model found in the Results section should be collated and moved to the discussion in a single paragraph. For example ' This could also briefly touch on/compare to some of the other models utilizing hiPSCs (These are mentioned briefly in the intro, but this comparison could be elaborated on a bit after seeing all the great data in this work).

      These changes have now been made:

      Discussion: Some of these limitations, potentially including inclusion of environmental risk factors, can be addressed by using alternative iPSC-derived models [93,94]. For example, if patients have suspected causative mutations in genes specific to the surface (non-neural) ectoderm, such as GRHL2/3, 3D models described by Karzbrun et al [49] or Huang et al [95] may be informative. Characterisation of surface ectoderm behaviours in those models is currently lacking. These models are particularly useful for high-throughput screens of induced mutations [95], but their reproducibility between cell lines, necessary to compare patient samples to non-congenic controls, remains to be validated. Spinal cell identities can be generated in human spinal cord organoids, although these have highly variable morphologies [96,97]. As such, each iPSC model presents limitations and opportunities, to which this study contributes a reductionist and highly reproducible system in which to quantitatively compare multiple neuroepithelial functions.

      5) While the authors are generally good about labeling figures by the day post smad inhibition, in some figures it is not clear either from the images or the legend text. I believe this includes supplemental figures 2,5,6,8, and 10 (apologies if I simply missed it in one or more of them)

      These have now been added.

      6) The legend for Figure 2 refers to a panel that is not present and the remaining panel descriptions are off by a letter. I'm guessing this is a versioning error as the text itself seems largely correct, but it may be good to check for any other similar errors that snuck in

      This has now been corrected.

      7) The cell outlines in Figure 3d are a bit hard to see both in print and on the screen, perhaps increase the displayed intensity?

      This has now been corrected.

      8) The authors show a fascinating piece of data in Supplementary Figure 1, demonstrating that nuclear volume is halved by day 8. Do they have any indication if the DNA content remains constant (e.g., integrated DAPI density)? I suppose it must, and this is a minor point in the grand scheme, but this represents a significant nuclear remodeling and may impact the overall DNA accessibility.

      We agree with the reviewer that the reduction in nuclear volume is important data both because it informs understanding of the reduction in total cell volume, and because it suggests active chromatin compaction during differentiation. Unfortunately, the thicker epithelium and superimposition of nuclei in the differentiated condition means the laser light path is substantially different, making direct comparisons of intensity uninterpretable. Additionally, the apical-most nuclei will mostly be in G2/M phase due to interkinetic nuclear migration. As such, the comparison of DAPI integrated density between epithelial morphologies would not be informative.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript by Ampartzidis et al., significantly extends the human induced pluripotent stem cell system originally characterized by the same group as a tool for examining cellular remodeling during differentiation stages consistent with those of human neural tube closure (Ampartzidis et al., 2023). Given that there are no direct ways to analyze cellular activity in human neural tube closure in vivo, this model represents an important platform for investigating neural tube defects which are a common and deleterious human developmental disease. Here, the authors carefully test whether this system is robust and reproducible when using hiPSC cells from different donors and pluripotency induction methods and find that despite all these variables the cellular remodeling programs that occur during early neural differentiation are statistically equivalent, suggesting that this system is a useful experimental substrate. Additionally, the carefully selected donor populations suggest these aspects of human neural tube closure are likely to be robust to sexual dimorphism and to reasonable levels of human genetic background variation, though more fully testing that proposition would require significant effort and be beyond the scope of the current work. Subsequent to this careful characterization, the authors next tested whether this system could be used to derive specific insights into cell remodeling during early neural differentiation. First, they used a reverse genetics approach to knock in a human point mutation in the critical regulator of planar cell polarity and apical constriction, Vangl2. Despite being identified in a patient, this R353C variant has not been directly functionally tested in a human system. The authors find that this variant, despite showing normal expression and phospho-regulation, leads to defects consistent with a failure in apical constriction, a key cell behavior required to drive curvature change during cranial closure. 
      Finally, the authors test the utility of their hiPSC platform to understand human patient-specific defects by differentiating cells derived from two clinical spina bifida patients. The authors identify that one of these patients is likely to have a significant defect in fully establishing early proneural identity as well as defects in apicobasal thickening. While early remodeling occurs normally in the other patient, the authors observe significant defects in later neuronal induction and maturation. In addition, using whole exome sequencing the authors identify candidate variant loci that could underlie these defects.

      Major comments

      1) One of my few concerns with this work is that the relative constriction of the apical surface with respect to the basal surface is not directly quantified for any of the experiments. This worry is slightly compounded by the 3D reconstructions in Figure 1h, and the observation that overall cell volume is reduced and cell height increased simultaneously with area loss. Additionally, the net impact of apical constriction in tissues in vivo is to create local or global curvature change, but all the images in the paper suggest that the differentiated neural tissues are an uncurved monolayer, even missing local buckles. I understand that these cells are grown on flat adherent surfaces, limiting global curvature change, but is there evidence of localized buckling in the monolayer? While I believe, along with the authors, that their phenotypes are likely failures in apical constriction, I think they should work to strengthen this conclusion. I think the easiest way (and hopefully using data they already have) would be to directly compare apical area to basal area on a cell-wise basis for some number of cells. Given the heterogeneity of cells, perhaps 30-50 cells per condition/line/mutant would be good? I am open to other approaches; this just seems like it may not require additional experiments.

      2) Another slight experimental concern I have regards the difference in laser ablation experiments detailed in Figure 3h-i from those of Figure 2d-e. It seems like WT recoil values in 3h-i are more variable and of a lower average than the earlier experiments, and given that it appears significance is reached mainly by impact of the lower values, can the authors explain if this variability is expected to be due to heterogeneity in the tissue, i.e. some areas have higher local tension? If so, would that correspond with more local apical constriction?

      Minor comments

      1) There seems to be a critical window at day 5 of the differentiation protocol, both in terms of cell morphology and the marker panel presented in Figure 1i. Do the authors have any data spanning the hours from day 5 to 6? If not, I don't think they need to generate any, but I do think this is a very interesting window worthy of further discussion for a couple of reasons. First, several studies of mouse neural tube closure have shown that various aspects of cell remodeling are temporally separable. For example, between Grego-Bessa et al 2016 and Brooks et al 2020 we can infer that apicobasal elongation rapidly increases starting at E8.5, whereas apical surface area reduction and constriction are apparent somewhat earlier at E8.0. I think it would be interesting to see if this separability is conserved in humans. Second, is there a sense of how the temporal correlation between the pluripotent and early neural fate marker data presented here corroborates or contradicts the emerging set of temporally resolved RNA seq data sets of mouse development at equivalent early neural stages?

      2) Can the authors elaborate a bit more on what is known regarding apicobasal thickening and pseudo-stratification and how their work fits into the current understanding in the discussion? This is a very interesting and less well studied mechanism critical to closure, which their model is well suited to directly address. I am thinking mainly of the Grego-Bessa et al., 2016 work on PTEN, though interestingly the work of Ohmura et al., 2012 on the NUAK kinases also shows reduced tissue thickening (and apical constriction) and I am sure I have missed others. Given that the authors identify MED24 as a likely candidate for the lack of apicobasal thickening in one of their patient derived lines, is there any evidence that it interacts with any of the known players?

      3) Is there any indication that Vangl2 is weakly or locally planar polarized in this system? Figure 2F seems to suggest not, but Supplementary Figure 5 does show at least more supracellular cable like structures that may have some polarity. I ask because polarization seems to be one of the properties that differs along the anteroposterior axis of the neural plate, and I wonder if this offers some insight into the position along the axis that this system most closely models?

      4) I think some of the commentary on the strengths and limitations of the model found in the Results section should be collated and moved to the discussion in a single paragraph. For example ' This could also briefly touch on/compare to some of the other models utilizing hiPSCs (These are mentioned briefly in the intro, but this comparison could be elaborated on a bit after seeing all the great data in this work).

      5) While the authors are generally good about labeling figures by the day post smad inhibition, in some figures it is not clear either from the images or the legend text. I believe this includes supplemental figures 2,5,6,8, and 10 (apologies if I simply missed it in one or more of them)

      6) The legend for Figure 2 refers to a panel that is not present and the remaining panel descriptions are off by a letter. I'm guessing this is a versioning error as the text itself seems largely correct, but it may be good to check for any other similar errors that snuck in

      7) The cell outlines in Figure 3d are a bit hard to see both in print and on the screen, perhaps increase the displayed intensity?

      8) The authors show a fascinating piece of data in Supplementary Figure 1, demonstrating that nuclear volume is halved by day 8. Do they have any indication if the DNA content remains constant (e.g., integrated DAPI density)? I suppose it must, and this is a minor point in the grand scheme, but this represents a significant nuclear remodeling and may impact the overall DNA accessibility.

      Significance

      Overall, I am enthusiastic about this work and believe it represents a significant step forward in the effort to establish precision medicine approaches for diagnoses of the patient-specific causative cellular defects underlying human neural tube closure defects. This work systematizes an important and novel tool to examine the cellular basis of neural tube defects. While other hiPSC models of neural tube closure capture some tissue level dynamics, which this model does not, they require complex microfluidic approaches and have limited accessibility to direct imaging of cell remodeling. Comparatively, the relative simplicity of the reported model and the work demonstrating its tractability as a patient-specific and reverse genetic platform make it unique and attractive. This work will be of interest to a broad cross section of basic scientists interested in the cellular basis of tissue remodeling and/or the early events of nervous system development as well as clinical scientists interested in modeling the consequences of patient specific human genetic deficits identified in neural tube defect pregnancies.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors' work focuses on studying cell morphological changes during differentiation of hPSCs into neural progenitors in a 2D monolayer setting. The authors use genetic mutations in VANGL2 and patient-derived iPSCs to show that (1) human phenotypes can be captured in the 2D differentiation assay, and (2) VANGL2 in humans is required for neural contraction, which is consistent with previous studies in animal models. The results are solid and convincing, the data are quantitative, and the manuscript is well written. The 2D model they present successfully addresses the questions posed in the manuscript. However, the broad impact of the model may be limited, as it does not contain NNE cells and does not exhibit tissue folding or tube closure, as seen in neural tube formation. Patient-derived lines are derived from amniotic fluid cells, and the experiments are performed before birth, which I find to be a remarkable achievement, showing the future of precision medicine.

      Major comments:

      1. Figure 1. The authors use F-actin to segment cell areas. Perhaps this could be done more accurately with ZO-1, as F-actin cables can cross the surface of a single cell. In any case, the authors need to show a measure of segmentation precision: segmented image vs. raw image plus a nuclear marker (DAPI, H2B-GFP), so we can check that the number of segmented cells matches the number of nuclei.

      2. Lines 156-166. The authors claim that changes in gene expression precede morphological changes. I am not convinced this is supported by their data. Fig. 1g (epithelial thickness) and Fig. 1k (PAX6 expression) seem to have similar dynamics. The authors can perform a cross-correlation between the two plots to see which Δt gives maximum correlation. If Δt < 0, then it would suggest that gene expression precedes morphology, as they claim. Fig. 1j shows that NANOG drops before the morphological changes, but loss of NANOG is not specific to neural differentiation and therefore should not be related to the observed morphological changes.

      3. Figure 2d. The laser ablation experiment in the presence of ROCK inhibitor is clear, as I can easily see the cell outlines before and after the experiment. In the absence of ROCK inhibitor, the cell edges are blurry, and I am not convinced the outline that the authors drew is really the cell boundary. Perhaps the authors can try to ablate a larger cell patch so that the change in area is more defined.

      4. Figure 2d. Do the cells become thicker after recoil?

      5. Figure 3. The authors mention their previous study in which they show that Vangl2 is not cell-autonomously required for neural closure. It will be interesting to study whether this is also the case in the present human model by using mosaic cultures.

      6. Lines 403-415. The authors report poor neural induction and neuronal differentiation in GOSB2. As far as I understand, this phenotype does not represent the in vivo situation. Thus, it is not clear to what extent the in vitro 2D model describes the human patient.

      7. The experimental feat to derive cell lines from amniotic fluid and to perform experiments before birth is, in my view, heroic. However, I do not feel I learned much from the in vitro assays. There are many genetic changes that may cause the in vivo phenotype in the patient. The authors focus on MED24, but there is not enough convincing evidence that this is the key gene. I would like to suggest overexpression of MED24 as a rescue experiment, but I am not sure this is a single-gene phenotype. In addition, the fact that one patient line does not differentiate properly leads me to think that the patient lines do not strengthen the manuscript, and that perhaps additional clean mutations might contribute more.
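      The cross-correlation proposed in major comment 2 could be sketched roughly as follows. This is only an illustration, not the authors' or the referee's analysis: the function names `lagged_correlation` and `best_lag`, the integer (per-day) lag grid, and the sign convention (the expression series is shifted by Δt, so Δt < 0 at the peak means expression leads morphology) are all assumptions chosen to match the wording of the comment. With only a handful of time points per series, any such estimate would be coarse.

```python
import numpy as np


def lagged_correlation(expr, morph, max_lag):
    """Pearson correlation between expr[t + dt] and morph[t] for integer lags dt.

    Hypothetical helper: `expr` could be a marker time course (e.g. PAX6) and
    `morph` a morphology time course (e.g. epithelial thickness), sampled at
    the same, evenly spaced time points.
    """
    expr = np.asarray(expr, dtype=float)
    morph = np.asarray(morph, dtype=float)
    if expr.ndim != 1 or expr.shape != morph.shape:
        raise ValueError("expr and morph must be 1-D arrays of equal length")
    n = expr.size
    correlations = {}
    for dt in range(-max_lag, max_lag + 1):
        # Keep only the time points where both shifted series overlap.
        if dt >= 0:
            a, b = expr[dt:], morph[: n - dt]
        else:
            a, b = expr[: n + dt], morph[-dt:]
        # Skip lags with too little overlap or a constant segment.
        if a.size < 3 or a.std() == 0 or b.std() == 0:
            continue
        correlations[dt] = float(np.corrcoef(a, b)[0, 1])
    return correlations


def best_lag(expr, morph, max_lag):
    """Return the lag dt with maximum correlation (dt < 0: expr leads morph)."""
    correlations = lagged_correlation(expr, morph, max_lag)
    if not correlations:
        raise ValueError("no lag had enough overlapping, non-constant data")
    return max(correlations, key=correlations.get)
```

      Under this convention, a peak at a negative lag would be consistent with gene expression preceding the morphological change, while a peak at zero would suggest the two cannot be separated at the sampled resolution.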

      Minor comments:

      1. Figure 1c. Text is cropped at the edge of the image.

      Significance

      This study establishes a quantitative, reproducible 2D human iPSC-to-neural-progenitor platform for analyzing cell-shape dynamics during differentiation. Using VANGL2 mutations and patient-derived iPSCs, the work shows that (1) human phenotypes can be captured in a 2D differentiation assay and (2) VANGL2 is required for neural contraction (apical constriction), consistent with animal studies. The results are solid, the data are quantitative, and the manuscript is well written. Although the planar system lacks non-neural ectoderm and does not exhibit tissue folding or tube closure, it provides a tractable baseline for mechanistic dissection and genotype-phenotype mapping. The derivation of patient lines from amniotic fluid and execution of experiments before birth is a remarkable demonstration that points toward precision-medicine applications, while motivating rescue strategies and additional clean genetic models. However, overall I did not learn anything substantively new from this manuscript; the conclusions largely corroborate prior observations rather than extend them. In addition, the model was unsuccessful in one of the two patient-derived lines, which limits generalizability and weakens claims of patient-specific predictive value.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Ampartzidis et al. report the establishment of an iPSC-derived neuroepithelial model to examine how mutations from spina bifida patients disrupt fundamental cellular properties that underlie neural tube closure. The authors utilize an adherent neural induction protocol that relies on dual SMAD inhibition to differentiate three previously established iPSC lines with different origins and reprogramming methods. The analysis is comprehensive and outstanding, demonstrating reproducible differentiation, apical-basal elongation, and apical constriction over an 8-day period among the 3 lines. In inhibitor studies, it is shown that apical constriction is dependent on ROCK and generates tension, which can be measured using an annular laser ablation assay. Since this pathway is dependent on PCP signaling, which is also implicated in neural tube defects, the authors investigated whether VANGL2 is required by generating 2 lines with a pathogenic patient-derived sequence variant. Both lines showed reduced apical constriction and reduced tension in the laser ablation assays. The authors then established lines obtained from amniocentesis, including 2 control and 2 spina bifida patient-derived lines. These remarkably exhibited different defects. One line showed defects in apical-basal elongation, while the other showed defects in neural differentiation. Both lines were sequenced to identify candidate variants in genes implicated in NTDs. While no smoking gun was found in the line that disrupts neural differentiation (as is often the case with NTDs), compound heterozygous MED24 variants were found in the patient whose cells were defective in apical-basal elongation. Since MED24 has been linked to this phenotype, this finding is especially significant.

      Some details are missing regarding the method to evaluate the rigor and reproducibility of the study.

      Major points

      It is mentioned throughout the manuscript that 3 plates were evaluated per line. I believe these are independently differentiated plates. This detail is critical concerning rigor and reproducibility. This should be clearly stated in the Methods section and in the first description of the experimental system in the Results section for Figure 1. For the patient-specific lines, how many lines were derived per patient? Was the Vangl2 variant introduced by prime editing? Base editing? The details of the methods are sparse.

      Some additional suggestions for improvement.

      The abstract could be more clearly written to effectively convey the study's importance. Here are some suggestions:

      - Line 26: Insert "apicobasal" before "elongation"; the way it is written, I initially interpreted it as anterior-posterior elongation.
      - Line 29: Please specify that the lines refer to 3 different established parent iPSC lines with distinct origins and established using different reprogramming methods, plus 2 control patient-derived lines. The reproducibility of the cell behaviors is impressive, but this is not captured in the abstract.
      - Line 32: Add that this mutation was introduced by CRISPR-Cas9 base/prime editing.
      - The last sentence of the abstract states that the study only links apical constriction to human NTDs, but the study also found defects in neural differentiation and apical-basal elongation.

      The introduction could also use some editing.

      - Line 71: Insert "that pulls actin filaments together" after "power strokes".
      - Line 73: "apically localized": do you mean "mediolaterally" or "radially"?
      - Line 75: Can you specify that PCP components promote "mediolaterally orientated" apical constriction?
      - Line 127: Specify that NE functions, including apical-basal elongation and neurodifferentiation, are disrupted in patient-derived models.

      Significance

      This paper is significant not only for verifying the cell behaviors necessary for neural tube closure in a human iPSC model, but also for establishing a robust assay for the functional testing of NTD-associated sequence variants. This will not only demonstrate that sequence variants result in loss of function but also determine which cellular behaviors are disrupted.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03220

      Corresponding author(s): Ryusuke Niwa, Yuko Shimada-Niwa, and Wei Sun

      Dear Editors,

      We are pleased to submit our revised manuscript, RC-2025-03220R. The reviewers’ comments from Review Commons are presented in italics.

      For submission of our current revised manuscript, we provide two Word files, which are the “clean” and “Track-and-Change” files. Page and line numbers described below correspond to those of the “clean” file. The “Track-and-Change” file might be helpful for Reviewers to find what we have changed for the current revision.

      We hope that the revised version is now suitable for the next stage of evaluation.

      Sincerely,

      Ryusuke Niwa, Yuko Shimada-Niwa, and Wei Sun

      1. General Statements [optional]

      We sincerely thank the reviewers for their thoughtful feedback on our initial submission. Experiments that we will conduct and the revisions on the manuscript that have already been incorporated are detailed below in the point-by-point response. For this revised submission, two versions of the manuscript are provided: a clean copy and a tracked-changes file. Page and line numbers mentioned below refer to the clean version, while the tracked-changes file is intended to help reviewers easily identify the revisions made.

      In preparing the revision plan, we have included additional data, some of which were generated in collaboration with new contributors. Accordingly, we would like to propose adding Yuichi Shichino and Shintaro Iwasaki as co-authors to acknowledge their contributions.

      2. Description of the planned revisions


      - Also, the authors show that two different RNAi lines for NudC give the same defects - it would be good to know if the RNAi lines target the same or different sequences in the NudC transcripts. Alternatively, it would be equally good to show that trans-allelic combinations of NudC mutants have the same defects in the prothoracic glands and the salivary glands as the RNAi. Instead, they examine only overall body size, developmental delays and lethality in the trans-hetero allelic NudC mutants.

      Author response:

      In response to the second part of the criticism, we will further validate the observed phenotypes by examining tissue and nuclear size, chromosomal structure, and the levels of Fibrillarin and RpS6 proteins in the prothoracic glands and salivary glands of NudC mutants.


      - It would be quite helpful to characterize the "5 blob" and "shortened polytene chromosome arm" defects shown in Figure 2 and Figure 6. Are these partially polytenized chromosomes or are large sections of the chromosomes missing or just underreplicated? What do the chromosomes look like if you lyse the nuclei, spread the chromosomes and stain with DAPI or Hoechst - this is a pretty standard practice and would reveal much more about the structure of the polytene chromosomes.

      Author response:

      To address these structural concerns more clearly, we plan to apply established protocols to obtain higher-resolution images and gather more detailed information on chromosome morphology.

      - Discussion, line 468. I don't think the authors have provided evidence of DNA damage. With the experiments they have shown, the chromosomes look abnormal - not clear what is abnormal.

      Author response:

      To further confirm DNA damage in NudC knockdown salivary gland cells, we plan to perform a TUNEL assay, which detects DNA fragmentation associated with damage.

      We would like to note that, in the current manuscript, we have shown that depletion of NudC, eIF5, RpLP0-like, or Nopp140 increased γH2Av levels, suggesting activation of the DNA damage response (Figures 6B and 6C).


      *The authors claim that NudC has a dual role as a cell cycle/cytoskeleton regulator and as a ribosome biogenesis factor. However, because NudC knockdown reduces nuclear size and ploidy (Figures 1F and 2H-2I), the authors cannot exclude that decreased rDNA dosage and nucleolar volume contribute to reduced rRNA signals and that the effects seen are due to a NudC involvement in endoreplication, the rRNA reduction being a consequence of lower polyploidy. Different allelic combinations of NudC induce larval growth defects (Figure S5), consistent with a NudC role in endoreplication. To circumvent this, the authors could genetically modulate endocycle progression (e.g., E2F or Fzr overexpression) in the NudC RNAi background to test whether inducing endoreplication rescues rRNA production and nucleolar volume. This would establish causality between the endocycle state and rRNA output and clarify whether NudC's primary role is in RiBi or endocycle control.*

      Author response: In response to Reviewer #2’s suggestion, we plan to genetically modify the progression of the endocycle by inducing continuous expression of Cyclin E (CycE), E2F1, and Fzr in NudC RNAi salivary glands to test whether promoting endoreplication can restore rRNA production and nucleolar volume.

      In fact, we have attempted to rescue the developmental arrest in animals with NudC-deficient prothoracic glands (PGs) by inducing continuous expression of CycE. Two constructs, UAS-CycE-1 (BDSC#30725) and UAS-CycE-2 (BDSC#30924), were used. UAS-CycE-1 has previously been shown to rescue developmental arrest in PG-specific TOR loss-of-function animals (Ohhara, Kobayashi, and Yamanaka. PLoS Genetics 13 (1): e1006583, 2017). We introduced each construct into NudC knockdown PGs. However, continuous expression of CycE did not restore development (Figure A as shown below), suggesting that NudC functions in the polyploid cells extend beyond endocycle regulation. We do not currently plan to include the PG data shown in Figure A in the revised manuscript. We will evaluate whether it would be meaningful to present PG data alongside salivary gland results once we have obtained and analyzed data from the salivary gland rescue experiment.

      Figure A. Survival and developmental progression following continuous expression of CycE. Control (phtm>dicer2, +), NudC knockdown (phtm>dicer2, NudC RNAi), and NudC RNAi + CycE (phtm>dicer2, NudC RNAi, CycE) flies were analyzed at 10 days after hatching (10 dAH). Dead indicates dead larvae; L3 denotes third-instar larvae. Sample sizes (number of flies) are shown below each bar.


      *The conclusion that NudC maintains rRNA levels is derived from salivary gland RNAi phenotypes with strong reductions in ITS1/ITS2 and 18S/28S signals (Figure 4B-4K) and reduced 28S by Northern (Figure 4L), plus corroboration in fat body cells (Figure S7). The authors verified knockdown using two independent RNAi lines for growth phenotypes and NudC::GFP reduction (Figure S2) and generated a UAS-FLAG::NudC transgene (Key Resources), but rRNA measurements were reported for only one RNAi line without rescue. Rescue of the rRNA phenotype by transgenic NudC re-expression, or replication of the rRNA decrease with a second, non-overlapping RNAi, would directly attribute the effect to NudC. In the absence of these standard validation controls, an off-target explanation remains plausible.*

      Author response:

      We plan to analyze rRNA FISH signals in salivary glands and fat bodies using a second, non-overlapping RNAi strain to confirm the reproducibility of the observed effects.

      - The authors report in Fig. 2 elevated γH2Av in SG cells upon NudC knockdown and interpret this as evidence of chromosome destabilization. They also state that apoptosis is not observed in Fig S10. However, the increase in γH2Av could reflect transient or early apoptotic events or other stress responses triggered by NudC depletion, rather than direct defects in endoreplication or genome stability. I suggest that the authors clarify this important point, for example, by co-expressing apoptotic inhibitors such as P35, or by using the TUNEL assay, which is more sensitive than anti-Caspase3 or Dcp1 antibodies.

      Author response:

      We plan to perform a TUNEL assay on salivary gland cells to evaluate apoptosis associated with NudC depletion.

      - Activation of the JNK pathway is often accompanied by apoptosis. It would strengthen the conclusions if the authors included a positive control to confirm that apoptosis is not induced under these experimental conditions, ensuring that the observed effects are specific to autophagy and not confounded by cell death.

      Author response:

      We will analyze pJNK and autophagy levels in animals expressing a constitutively active form of hemipterous (hep) (hep[CA]) under the control of the fkh-GAL4 driver as a positive control. hep encodes the Drosophila JNK kinase, and it is well established that forced expression of hep[CA] induces JNK phosphorylation and activation.

      - In Figure S1, reduction of NudC in the fat body appears to induce a starvation-like phenotype, suggesting a potential impairment of metabolic or nutrient-sensing pathways. It would be important to determine whether modulation of nutrient-responsive signaling could rescue this phenotype. Specifically, have the authors examined whether activation of the TOR or PI3K pathways mitigates the effects of NudC knockdown? Assessing pathway activity (e.g., via phospho-S6K or phospho-Akt levels) or performing genetic rescue experiments with pathway activators could clarify whether the observed phenotypes are mediated through disrupted nutrient signaling rather than a secondary effect of general cellular stress. Such analyses could also provide a mechanistic explanation for the increased autophagy observed in these cells.

      Author response:

      1. We will analyze phospho-S6K levels in salivary glands and fat bodies by immunostaining.
      2. To activate the TOR pathway in NudC RNAi fat bodies, we will overexpress Rheb, an established upstream activator of the TOR pathway in Drosophila, which has been shown to robustly increase TOR signaling and S6K phosphorylation.

      - The current images of autophagic vesicles in the SG in Fig. 8B are not clearly visible and quantified. Considering the large size of these polyploid cells, higher-resolution images or alternative imaging approaches should be presented to better visualize and quantify autophagy. This would make the conclusions regarding enhanced autophagy more convincing. In addition, this data could be further strengthened by expanding the analysis of autophagy to other cell types. For example, examining autophagy in fat body cells, where autophagy plays a primary physiological role associated with rRNA accumulation (Fig. S7), rather than a reduction like in SG (Fig. 4), could provide a useful comparison for the function of NudC between polyploid cells.

      Author response:

      In response to the second part of the reviewer’s comment, we will conduct additional experiments using anti-Atg8a immunostaining and/or LysoTracker staining to analyze autophagy in NudC RNAi fat bodies and prothoracic glands. These experiments will help further characterize the cellular responses associated with NudC depletion.

      3. Description of the revisions that have already been incorporated in the transferred manuscript



      -The title is a bit problematic since they haven't shown that NudC doesn't also affect normal mitotic cells - they only look at polyploid cells, but that doesn't mean normal mitotic cells are not also affected.

      Author response:

      In response to the suggestion from Reviewer #1, we have revised the title from “NudC moonlights in ribosome biogenesis and homeostasis in Drosophila melanogaster polyploid cells” to “NudC moonlights in ribosome biogenesis and homeostasis in polyploid cells of Drosophila melanogaster” to place greater emphasis on “polyploid cells.”

      Regarding mitotic cells, we have added new data in the revised manuscript (Figure S7; lines 249–256 and 417–418) demonstrating that NudC regulates apoptosis and stress responses in mitotic imaginal wing disc cells. However, as the main focus of our study remains polyploid cells, we have chosen to retain the emphasis in the title.


      - Also, the authors show that two different RNAi lines for NudC give the same defects - it would be good to know if the RNAi lines target the same or different sequences in the NudC transcripts. Alternatively, it would be equally good to show that trans-allelic combinations of NudC mutants have the same defects in the prothoracic glands and the salivary glands as the RNAi. Instead, they examine only overall body size, developmental delays and lethality in the trans-hetero allelic NudC mutants.

      Author response:

      In response to the first half of the criticism, the two RNAi lines used for NudC target distinct sequences. We have added the corresponding RNAi target sites to Figure S4A for clarity.


      - Results: Lines 261 - 266. Seeing electron dense structures in TEMs and seeing increased Me31B staining by confocal imaging in the cytoplasm is insufficient evidence that the electron dense structures are P-bodies. They could be the P-bodies but they could also be aggregated ribosomes; there is insufficient evidence to "confirm" that they are P-bodies - maybe just say "suggests".

      Author response:

      In response to Reviewer #1’s suggestion, we have revised lines 261–262 to avoid using the word "confirm." The new sentence reads: “Immunostaining with the P-body marker Me31B reveals numerous cytoplasmic P-bodies in NudC-deficient SG cells,” which appears in lines 293–295.


      - Abstract, lines 28 - 31. I think this gene has been identified before. The authors probably want to say they have discovered a role for this gene in RiBi.

      Author response:

      We have followed Reviewer #1’s suggestion and revised the sentence in lines 35–37 to: “In this study, we discovered a role for the gene NudC (nuclear distribution C, dynein complex regulator) in RiBi within polyploid cells of Drosophila melanogaster larvae.”


      - Introduction, line 66. The protein is imported into the nucleus, where it localizes to the nucleolus - technically the protein is not imported into the nucleolus.

      Author response:

      To correct the misrepresentation in line 66, we have revised the sentence to: “RP mRNAs are synthesized by RNA polymerase II, and exported to the cytoplasm for translation. Then, RPs are imported into the nucleus, where they localize to the nucleolus.” in lines 70–73.

      - Introduction, line 70. To be comprehensive in the description of ribosome biogenesis, the authors may want to mention that the 40S and 60S subunits are then exported from the nucleus and form the 80S subunit in the cytoplasm during translation.

      Author response:

      To improve the representation, we have revised the sentences in lines 73–78 as follows: “Within the nucleolus, rRNAs and RPs assemble into pre-40S and pre-60S subunits, immature versions of the small (40S) and large (60S) subunits, respectively, that undergo maturation with numerous ribosome biogenesis factors (RBFs) (Greber, 2016). The 40S and 60S subunits are then transported separately to the cytoplasm, where they combine to form functional 80S ribosomes, capable of sustaining protein synthesis (Pelletier et al., 2018).”

      - Introduction, line 98. May want to cite paper showing that Minute mutations turn out to be mutations in individual ribosomal protein genes.

      Author response:

      As Reviewer #1 suggested, we have cited two papers, Marygold et al. (2007), entitled “The ribosomal protein genes and Minute loci of Drosophila melanogaster”, and Recasens-Alvarez et al. (2021), entitled “Ribosomopathy-associated mutations cause proteotoxic stress that is alleviated by TOR inhibition”, along with He et al. (2015). The inappropriate citation to Brehme (1939) has been removed.

      - Results, lines 292. Since they didn't knock down NudC in the fat body cells in this experiment, this comment seems irrelevant.

      Author response:

      We would like to clarify that the phenotype observed with fkh-GAL4-driven NudC RNAi was specific to salivary glands, and no obvious phenotypes were detected in the surrounding fat body cells, which do not express fkh-GAL4. In this context, the adjacent fat body cells serve as an internal control.

      In the revised manuscript, the sentence has been rewritten as: “In contrast, the fat body cells surrounding NudC-deficient SGs did not show this reduction (Figure S9),” in lines 323–324.

      - Figure 6A. Hoechst is misspelled.

      - Fig. 2 I - Hoeschest should be Hoechst.

      Author response:

      We have fixed the error.

      - Given that prothoracic gland (PG) size influences ecdysone production, the finding that NudC knockdown alters PG cell size, morphology, and cytoskeletal organization raises the possibility that ecdysone synthesis or signaling may also be affected. This, in turn, could explain the delayed maturation phenotype observed in Figure 1. I recommend testing whether ectopic activation of ecdysone signaling, for instance through 20-hydroxyecdysone (20E) supplementation, can rescue the defects in PG size and developmental timing. Such an experiment would strengthen the link between NudC function, PG morphology, and ecdysone-dependent developmental progression.

      Author response:

      We have conducted experiments showing that developmental defects in NudC RNAi animals can be partially rescued by administering 20E. Approximately 32% of NudC RNAi larvae fed with 20E completed pupariation. These new data have been added to Figure S1B and are described in the main text (lines 165-168).

      Regarding PG size, our experiments show that PG growth remains inhibited following 20E administration (Figure B as shown below). This observation indicates that treatment with exogenous 20E does not restore PG growth in NudC RNAi animals, suggesting that other factors may be required for normal PG development beyond ecdysone supplementation.

      Because this analysis is not the main focus of our manuscript, we currently plan not to include these data in the revised manuscript.

      Figure B. Prothoracic gland (PG) size after 20E administration.

      To assess whether 20E supplementation could restore PG size, control (phtm>dicer2, +) and NudC RNAi (phtm>dicer2, NudC RNAi) larvae were transferred at 60 hours after hatching (hAH) to standard medium containing 20E dissolved in 100% ethanol. Control groups were transferred to medium containing the same volume of 100% ethanol at the same time point. PG size was quantified at the wandering stage. Sample sizes (number of glands) are shown below each bar. Bars represent mean ± SD. **p * *

      - Additionally, qRT-PCR can be performed to assess the expression levels of ecdysone precursors or target genes in whole larvae, serving as a readout of ecdysone activity, including dilp8, which is usually upregulated when ecdysone levels are reduced.

      Author response: To investigate ecdysone biosynthesis, we measured expression of the Halloween genes nvd, spok, sro, phm, dib, and sad by qRT-PCR. In NudC RNAi animals, nvd, sro and phm were suppressed at the late L3 stage, indicating that NudC in the PG is required for ecdysone biosynthesis. The new data are presented in Figure S1A and described in the main text (lines 159-164) of the revised manuscript.

      - The current images of autophagic vesicles in the SG in Fig. 8B are not clearly visible and quantified. Considering the large size of these polyploid cells, higher-resolution images or alternative imaging approaches should be presented to better visualize and quantify autophagy. This would make the conclusions regarding enhanced autophagy more convincing.

      Author response:

      Regarding the image quality issue, we have provided improved images of anti-Atg8a immunostaining in the salivary gland mosaic clones (Figure 8B) and included additional data from SG-specific knockdown cells (Supplemental Figures S13A-S13F) to provide quantitative results.

      - Furthermore, including experiments in other cell types, such as imaginal disc cells, where apoptosis is more readily induced, would help determine whether the effects of NudC knockdown are specific to polyploid cells or are more broadly applicable.

      Author response: We found that apoptosis was observed in NudC RNAi wing discs. In the revised manuscript, we have included this data in Figure S7 and referenced it in the main text (lines 249–256).

      4. Description of analyses that authors prefer not to carry out

      - Results, lines 285 to 298. In situs with multiple probes that detect all parts of both the pre-rRNA and processed rRNA indicate that all are down in the SG in NudC knockdowns, but that the 18S and 28S rRNAs are down the internal transcribed spacers go up - can the authors explain or hypothesize how this could happen?

      Author response:

      As Reviewer #1 indicated, we indeed observed that internal transcribed spacer (ITS) levels decrease in NudC knockdown salivary glands, but increase in knockdown fat bodies. Our hypothesis is that, as noted in the Discussion (lines 529–534), ribosome abundance is typically linked to protein synthesis. Salivary gland cells, which are highly active in protein production, may be particularly sensitive to disruptions in ribosome biogenesis. Therefore, NudC may maintain appropriate levels of rRNA with its impact varying according to the specific regulatory mechanisms of each cell type. We do not have a further explanation for this phenomenon, and therefore we have retained the original sentences without adding new ones.

      - The data presented in Fig 4 show that NudC knockdown reduces pre-rRNA (ITS1/ITS2) and mature 18S/28S rRNAs in a tissue-specific manner. However, it remains unclear whether these reductions have functional consequences for ribosome assembly and translation. I recommend that the authors perform polysome profiling or an equivalent assay to assess the impact of NudC loss on actively translating ribosomes. This approach would provide a quantitative readout of translation efficiency and clarify whether the observed rRNA defects lead to impaired protein synthesis. Additionally, polysome profiling could help explain the tissue-specific differences observed between salivary glands and fat body cells.

      Author response:

      We performed ribosome fractionation using wild-type salivary glands and repeated the experiment three times with 56–62 gland pairs per sample. As shown in Figure C, the polyribosome peaks (grey lines) are not prominent, indicating that a much larger number of glands would be required for robust polysome profiling. Given that NudC RNAi salivary glands are significantly smaller than wild-type glands, collecting enough tissue for equivalent profiling would be technically difficult. Therefore, we concluded that obtaining sufficient RNAi samples for polysome profiling is extremely challenging, and these data have not been included in the revised manuscript.

      On the other hand, we would like to emphasize that we observed a significant reduction in O-propargyl puromycin (OPP) labeling in NudC-deficient salivary gland cells (Figure 3B), which provides strong evidence for reduced translational activity.

      Figure C. Ribosomal fraction profiles of wild-type salivary glands. Salivary glands from the late L3 larvae were dissected for analysis. Polyribosome peaks are indicated in grey. The number of salivary gland pairs used for each sample is shown above each bar.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      NudC (Nuclear Distribution Protein C) is a conserved, dynein-associated protein that plays a critical role in nuclear positioning and neuronal development. It functions as a co-chaperone, stabilizing components of the dynein-motor complex, thereby facilitating proper microtubule-dependent nuclear migration and intracellular transport. In developing neurons, NudC is essential for correct dendritic morphogenesis, ensuring nuclei and dendritic processes attain their proper spatial organization. Loss or knockdown of nudC leads to defects in nuclear localization, aberrant dendritic architecture, and mitotic stress, which can predispose cells to apoptosis. These findings highlight NudC as a pivotal regulator of intracellular dynamics and cytoskeletal organization. In this paper, the authors propose a role for the gene in regulating ribosomal biogenesis. However, the interpretation of these results remains somewhat unclear, as the observed effects on ribosome biogenesis could potentially result from nonspecific cellular stress or toxicity caused by gene knockdown in polyploid cells. At this stage, the link between NudC and the regulation of ribosomal biogenesis is not fully convincing. Additional experiments could help clarify whether this relationship is direct or secondary to other cellular effects. I suggest conducting additional experiments to strengthen this hypothesis; for example, by examining whether knocking down NudC would give similar effects as observed for other genes that regulate RiBi in other organs and tissues where ribosomal biogenesis and stress responses have been well-characterized, such as the imaginal discs. Comparing the results across these different tissues would help clarify whether the effects of gene knockdown are specific to polyploid cells or represent a more general cellular response.

      Suggested experiments to sustain the paper:

      1. Given that prothoracic gland (PG) size influences ecdysone production, the finding that NudC knockdown alters PG cell size, morphology, and cytoskeletal organization raises the possibility that ecdysone synthesis or signaling may also be affected. This, in turn, could explain the delayed maturation phenotype observed in Figure 1. I recommend testing whether ectopic activation of ecdysone signaling, for instance through 20-hydroxyecdysone (20E) supplementation, can rescue the defects in PG size and developmental timing. Such an experiment would strengthen the link between NudC function, PG morphology, and ecdysone-dependent developmental progression.
      2. Additionally, qRT-PCR can be performed to assess the expression levels of ecdysone precursors or target genes in whole larvae, serving as a readout of ecdysone activity, including dilp8, which is usually upregulated when ecdysone levels are reduced.
      3. The authors report in Fig. 2 elevated γH2Av in SG cells upon NudC knockdown and interpret this as evidence of chromosome destabilization. They also state that apoptosis is not observed in Fig S10. However, the increase in γH2Av could reflect transient or early apoptotic events or other stress responses triggered by NudC depletion, rather than direct defects in endoreplication or genome stability. I suggest that the authors clarify this important point, for example, by co-expressing apoptotic inhibitors such as P35, or by using the TUNEL assay, which is more sensitive than anti-Caspase3 or Dcp1 antibodies.
      4. The data presented in Fig 4 show that NudC knockdown reduces pre-rRNA (ITS1/ITS2) and mature 18S/28S rRNAs in a tissue-specific manner. However, it remains unclear whether these reductions have functional consequences for ribosome assembly and translation. I recommend that the authors perform polysome profiling or an equivalent assay to assess the impact of NudC loss on actively translating ribosomes. This approach would provide a quantitative readout of translation efficiency and clarify whether the observed rRNA defects lead to impaired protein synthesis. Additionally, polysome profiling could help explain the tissue-specific differences observed between salivary glands and fat body cells.
      5. Activation of the JNK pathway is often accompanied by apoptosis. It would strengthen the conclusions if the authors included a positive control to confirm that apoptosis is not induced under these experimental conditions, ensuring that the observed effects are specific to autophagy and not confounded by cell death.
      6. In Figure S1, reduction of NudC in the fat body appears to induce a starvation-like phenotype, suggesting a potential impairment of metabolic or nutrient-sensing pathways. It would be important to determine whether modulation of nutrient-responsive signaling could rescue this phenotype. Specifically, have the authors examined whether activation of the TOR or PI3K pathways mitigates the effects of NudC knockdown? Assessing pathway activity (e.g., via phospho-S6K or phospho-Akt levels) or performing genetic rescue experiments with pathway activators could clarify whether the observed phenotypes are mediated through disrupted nutrient signaling rather than a secondary effect of general cellular stress. Such analyses could also provide a mechanistic explanation for the increased autophagy observed in these cells.
      7. The current images of autophagic vesicles in the SG in Fig. 8B are not clearly visible and quantified. Considering the large size of these polyploid cells, higher-resolution images or alternative imaging approaches should be presented to better visualize and quantify autophagy. This would make the conclusions regarding enhanced autophagy more convincing. In addition, this data could be further strengthened by expanding the analysis of autophagy to other cell types. For example, examining autophagy in fat body cells, where autophagy plays a primary physiological role associated with rRNA accumulation (Fig. S7), rather than a reduction like in SG (Fig. 4), could provide a useful comparison for the function of NudC between polyploid cells.
      8. Furthermore, including experiments in other cell types, such as imaginal disc cells, where apoptosis is more readily induced, would help determine whether the effects of NudC knockdown are specific to polyploid cells or are more broadly applicable.

      Significance

      NudC is a conserved dynein-associated protein essential for nuclear positioning, dendritic morphogenesis, and intracellular transport. This study suggests a novel role for NudC in regulating ribosome biogenesis, potentially linking cytoskeletal organization with protein synthesis and cellular homeostasis. Validating this connection across different tissues could reveal whether NudC serves as a general coordinator of intracellular architecture and translational capacity, providing new insights into how cells integrate structural and biosynthetic functions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, Duoduo Shi and colleagues propose that NudC, previously known for its role in dynein regulation, has a second role as a critical regulator of ribosome biogenesis (RiBi) in Drosophila melanogaster polyploid cells, where its depletion reduces rRNA levels and ribosome abundance, triggering a compensatory homeostatic response that upregulates ribosomal proteins and biogenesis factors, similar to the response observed upon depletion of established ribosome biogenesis factors.

      Strengths

      The authors propose a novel role for NudC as a regulator of ribosome biogenesis (RiBi) which is dynein-independent and they provide a detailed homeostatic response to RiBi stress.

      Weaknesses

      NudC downregulation may be affecting the endocycle and an endoreplication defect may drive rRNA reduction.

      Major comments

      The authors claim that NudC has a dual role as a cell cycle/cytoskeleton regulator and as a ribosome biogenesis factor. However, because NudC knockdown reduces nuclear size and ploidy (Figures 1F and 2H-2I), the authors cannot exclude that decreased rDNA dosage and nucleolar volume contribute to reduced rRNA signals and that the effects seen are due to a NudC involvement in endoreplication, the rRNA reduction being a consequence of lower polyploidy. Different allelic combinations of NudC induce larval growth defects (Figure S5), consistent with a NudC role in endoreplication. To circumvent this, the authors could genetically modulate endocycle progression (e.g., E2F or Fzr overexpression) in the NudC RNAi background to test whether inducing endoreplication rescues rRNA production and nucleolar volume. This would establish causality between the endocycle state and rRNA output and clarify whether NudC's primary role is in RiBi or endocycle control.

      The conclusion that NudC maintains rRNA levels is derived from salivary gland RNAi phenotypes with strong reductions in ITS1/ITS2 and 18S/28S signals (Figure 4B-4K) and reduced 28S by Northern (Figure 4L), plus corroboration in fat body cells (Figure S7). The authors verified knockdown using two independent RNAi lines for growth phenotypes and NudC::GFP reduction (Figure S2) and generated a UAS-FLAG::NudC transgene (Key Resources), but rRNA measurements were reported for only one RNAi line without rescue. Rescue of the rRNA phenotype by transgenic NudC re-expression, or replication of the rRNA decrease with a second, non-overlapping RNAi, would directly attribute the effect to NudC. In the absence of these standard validation controls, an off-target explanation remains plausible.

      Minor comments

      Fig. 2 I - Hoeschest should be Hoechst

      Significance

      The findings shown in this manuscript introduce a new player in endoreplication/ribosome biogenesis, a protein previously known as a dynein regulator. The strengths of the work lie in its novelty and thorough analysis of the cellular phenotypes induced by NudC depletion. However, its weaknesses are related to some claims not completely backed by the data, with some uncertainties related to a possible function of NudC in endoreplication.

      This basic research work will be of interest to a broad cell and developmental biology community as it provides a novel cellular function of a known protein. It is of specific interest to the specialized field of polyploidy and ribosome biogenesis.

      Field of expertise:

      Drosophila, morphogenesis, tubulogenesis, cytoskeleton, DNA damage and repair.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript describes evidence for a role for the Nuclear distribution C dynein complex regulator (NudC) in ribosome biogenesis (RiBi) independent of its role in microtubule-associated dynein function.

      Evidence: NudC was picked up in a screen for genes affecting ecdysteroid biosynthesis, a process that occurs in the prothoracic gland (PG; an endocrine organ). In the absence of ecdysone, larvae fail to pupate. Consistent with this finding, the authors find that prothoracic RNAi knockdown of NudC results in a failure in pupation and a decrease in total PG size. They also show defects in polytene chromosome architecture and a mild decrease in overall DNA content. They then turn to the salivary gland (SG) to further characterize the phenotypes associated with NudC knockdown. First, they show that an endogenously tagged version of NudC is abundant in the cytosol and has very weak nuclear staining in the region of the nucleolus (marked by the very low levels of DAPI staining). Knockdown of NudC using RNAi results in reduced NudC-GFP staining, a reduction in SG size, and a reduction in nuclear size. They also find that the SG polytene chromosomes are abnormal and that the production of a SG glue protein as measured by Sgs3-GFP levels and electron dense secretory granules is significantly reduced with NudC knockdown. Interestingly, they also observe the presence of abundant virus-like particles in the nucleus (these structures are thought to originate from retrotransposons and are an indicator of stress). Consistent with increased cellular stress, the authors show activation of JNK signalling. Ultrastructural analysis reveals an abnormally organized ER with an apparent loss of ER-associated ribosomes. They do see other electron dense structures in the cytosol, which they provide evidence (see below) of being P-bodies (structures associated with mRNA). They show that, consistent with a decrease in ribosomes, protein translation is reduced. This is supported by FISH experiments where they show significant decreases in ribosomal RNA (rRNA) transcript levels and decreased translation. 
Seeing the significant decreases in rRNA levels prompted them to look at overall changes in gene expression, where they discovered that both ribosomal protein gene expression as well as expression of other genes involved in ribosome biogenesis (RiBi) are upregulated with knockdown of NudC. They confirm the changes in mRNA for two genes by showing that levels of the corresponding proteins are also upregulated based on immunostaining of SG cells in which NudC is knocked down. Linking NudC function to a response to defects in RiBi, they show that SG knockdown of several ribosomal biogenesis factors (RBFs) causes similar chromosome structural defects and results in an increase in expression of ribosomal protein genes and of NudC itself. Finally, they show that knockdown of genes encoding proteins linked to NudC function in microtubule dynamics does not produce any of the same phenotypes as knockdown of NudC and RBFs. Altogether, their data support a moonlighting function for NudC in ribosome biogenesis. Moreover, defects in RiBi wherein ribosomal RNAs are decreased seem to result in compensatory changes where both RBFs and ribosomal protein genes are upregulated.

      Major issues:

      The title is a bit problematic since they haven't shown that NudC doesn't also affect normal mitotic cells - they only look at polyploid cells, but that doesn't mean normal mitotic cells are not also affected.

      Also, the authors show that two different RNAi lines for NudC give the same defects - it would be good to know if the RNAi lines target the same or different sequences in the NudC transcripts. Alternatively, it would be equally good to show that trans-allelic combinations of NudC mutants have the same defects in the prothoracic glands and the salivary glands as the RNAi. Instead, they examine only overall body size, developmental delays and lethality in the trans-hetero allelic NudC mutants.

      Results: Lines 261 - 266. Seeing electron dense structures in TEMs and seeing increased Me31B staining by confocal imaging in the cytoplasm is insufficient evidence that the electron dense structures are P-bodies. They could be the P-bodies but they could also be aggregated ribosomes; there is insufficient evidence to "confirm" that they are P-bodies - maybe just say "suggests".

      It would be quite helpful to characterize the "5 blob" and "shortened polytene chromosome arm" defects shown in Figure 2 and Figure 6. Are these partially polytenized chromosomes or are large sections of the chromosomes missing or just underreplicated? What do the chromosomes look like if you lyse the nuclei, spread the chromosomes and stain with DAPI or Hoechst - this is a pretty standard practice and would reveal much more about the structure of the polytene chromosomes.

      Minor points:

      Abstract, lines 28 - 31. I think this gene has been identified before. The authors probably want to say they have discovered a role for this gene in RiBi.

      Introduction, line 66. The protein is imported into the nucleus, where it localizes to the nucleolus - technically the protein is not imported into the nucleolus.

      Introduction, line 70. To be comprehensive in the description of ribosome biogenesis, the authors may want to mention that the 40S and 60S subunits are then exported from the nucleus and form the 80S subunit in the cytoplasm during translation.

      Introduction, line 98. May want to cite paper showing that Minute mutations turn out to be mutations in individual ribosomal protein genes.

      Results, lines 285 to 298. In situs with multiple probes that detect all parts of both the pre-rRNA and processed rRNA indicate that all are down in the SG in NudC knockdowns, but that the 18S and 28S rRNAs are down the internal transcribed spacers go up - can the authors explain or hypothesize how this could happen?

      Results, lines 292. Since they didn't knock down NudC in the fat body cells in this experiment, this comment seems irrelevant.

      Discussion, line 468. I don't think the authors have provided evidence of DNA damage. With the experiments they have shown, the chromosomes look abnormal - not clear what is abnormal.

      Figure 6A. Hoechst is misspelled.

      Referee cross-commenting

      I think the other reviewers have valid criticisms. I think the most critical issues to sort out are (1) what is wrong with the chromosomes, (2) are diploid tissues also affected, (3) are the RiBi phenotypes a primary or secondary consequence of NudC loss. I'm not sure how easy it is to do ribosomal profiling on tissues dissected from larvae as the third reviewer is suggesting.

      Significance

      It is a novel discovery that a protein regulating microtubule dynamics is moonlighting, presumably in the nucleolus, to regulate rRNA synthesis or stabilization. A little information regarding mechanism of action would make this a much more exciting paper - how does it do it? Right now, it is unclear whether rRNA synthesis or maintenance is being regulated and there are no hypotheses regarding how this protein localizes to nucleoli and exactly what it is doing there. Is it regulating all RNA Pol I-dependent transcription? Is it involved in processing or stabilizing rRNAs? The description of the chromosomal defects also falls short of satisfying. As is, this paper is probably of most interest to those who study ribosome biogenesis - an important topic, but without more mechanistic insight, not so interesting to a more general audience.

      My expertise

      I am an experienced Drosophila biologist who is familiar with the system and who fully understands all of the experiments presented in this manuscript and the relevance of the findings.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript describes evidence for a role for the Nuclear distribution C dynein complex regulator (NudC) in ribosome biogenesis (RiBi) independent of its role in microtubule-associated dynein function.

      Evidence: NudC was picked up in a screen for genes affecting ecdysteroid biosynthesis, a process that occurs in the prothoracic gland (PG; an endocrine organ). In the absence of ecdysone, larvae fail to pupate. Consistent with this finding, the authors find that prothoracic RNAi knockdown of NudC results in a failure in pupation and a decrease in total PG size. They also show defects in polytene chromosome architecture and a mild decrease in overall DNA content. They then turn to the salivary gland (SG) to further characterize the phenotypes associated with NudC knockdown. First, they show that an endogenously tagged version of NudC is abundant in the cytosol and has very weak nuclear staining in the region of the nucleolus (marked by the very low levels of DAPI staining). Knockdown of NudC using RNAi results in reduced NudC-GFP staining, a reduction in SG size, and a reduction in nuclear size. They also find that the SG polytene chromosomes are abnormal and that the production of a SG glue protein as measured by Sgs3-GFP levels and electron dense secretory granules is significantly reduced with NudC knockdown. Interestingly, they also observe the presence of abundant virus-like particles in the nucleus (these structures are thought to originate from retrotransposons and are an indicator of stress). Consistent with increased cellular stress, the authors show activation of JNK signalling. Ultrastructural analysis reveals an abnormally organized ER with an apparent loss of ER-associated ribosomes. They do see other electron dense structures in the cytosol, which they provide evidence (see below) of being P-bodies (structures associated with mRNA). They show that, consistent with a decrease in ribosomes, protein translation is reduced. This is supported by FISH experiments where they show significant decreases in ribosomal RNA (rRNA) transcript levels and decreased translation. 
Seeing the significant decreases in rRNA levels prompted them to look at overall changes in gene expression, where they discovered that both ribosomal protein gene expression as well as expression of other genes involved in ribosome biogenesis (RiBi) are upregulated with knockdown of NudC. They confirm the changes in mRNA for two genes by showing that levels of the corresponding proteins are also upregulated based on immunostaining of SG cells in which NudC is knocked down. Linking NudC function to a response to defects in RiBi, they show that SG knockdown of several ribosomal biogenesis factors (RBFs) produces similar chromosome structural defects and results in an increase in expression of ribosomal protein genes and of NudC itself. Finally, they show that knockdown of genes encoding proteins linked to NudC function in microtubule dynamics does not produce any of the same phenotypes as knockdown of NudC and RBFs. Altogether, their data support a moonlighting function for NudC in ribosome biogenesis. Moreover, defects in RiBi wherein ribosomal RNAs are decreased seem to result in compensatory changes where both RBFs and ribosomal protein genes are upregulated.

      Major issues:

      The title is a bit problematic since they haven't shown that NudC doesn't also affect normal mitotic cells - they only look at polyploid cells, but that doesn't mean normal mitotic cells are not also affected.

      Also, the authors show that two different RNAi lines for NudC give the same defects - it would be good to know if the RNAi lines target the same or different sequences in the NudC transcripts. Alternatively, it would be equally good to show that trans-allelic combinations of NudC mutants have the same defects in the prothoracic glands and the salivary glands as the RNAi. Instead, they examine only overall body size, developmental delays and lethality in the trans-hetero allelic NudC mutants.

      Results: Lines 261 - 266. Seeing electron dense structures in TEMs and seeing increased Me31B staining by confocal imaging in the cytoplasm is insufficient evidence that the electron dense structures are P-bodies. They could be the P-bodies but they could also be aggregated ribosomes; there is insufficient evidence to "confirm" that they are P-bodies - maybe just say "suggests".

      It would be quite helpful to characterize the "5 blob" and "shortened polytene chromosome arm" defects shown in Figure 2 and Figure 6. Are these partially polytenized chromosomes or are large sections of the chromosomes missing or just underreplicated? What do the chromosomes look like if you lyse the nuclei, spread the chromosomes and stain with DAPI or Hoechst - this is a pretty standard practice and would reveal much more about the structure of the polytene chromosomes.

      Minor points:

      Abstract, lines 28 - 31. I think this gene has been identified before. The authors probably want to say they have discovered a role for this gene in RiBi.

      Introduction, line 66. The protein is imported into the nucleus, where it localizes to the nucleolus - technically the protein is not imported into the nucleolus.

      Introduction, line 70. To be comprehensive in the description of ribosome biogenesis, the authors may want to mention that the 40S and 60S subunits are then exported from the nucleus and form the 80S subunit in the cytoplasm during translation.

      Introduction, line 98. May want to cite paper showing that Minute mutations turn out to be mutations in individual ribosomal protein genes.

      Results, lines 285 to 298. In situs with multiple probes that detect all parts of both the pre-rRNA and processed rRNA indicate that all are down in the SG in NudC knockdowns, but that while the 18S and 28S rRNAs are down, the internal transcribed spacers go up - can the authors explain or hypothesize how this could happen?

      Results, line 292. Since they didn't knock down NudC in the fat body cells in this experiment, this comment seems irrelevant.

      Discussion, line 468. I don't think the authors have provided evidence of DNA damage. With the experiments they have shown, the chromosomes look abnormal - not clear what is abnormal.

      Figure 6A. Hoechst is misspelled.

      Referee cross-commenting

      I think the other reviewers have valid criticisms. I think among the most critical issues to sort out is (1) what is wrong with the chromosomes, (2) are diploid tissues also affected, (3) are the RIBI phenotypes a primary or secondary consequence of nudC loss. I'm not sure how easy it is to do ribosomal profiling on tissues dissected from larvae as the third reviewer is suggesting.

      Significance

      It is a novel discovery that a protein regulating microtubule dynamics is moonlighting, presumably in the nucleolus, to regulate rRNA synthesis or stabilization. A little information regarding mechanism of action would make this a much more exciting paper - how does it do it? Right now, it is unclear whether rRNA synthesis or maintenance is being regulated and there are no hypotheses regarding how this protein localizes to nucleoli and exactly what it is doing there. Is it regulating all RNA Pol I-dependent transcription? Is it involved in processing or stabilizing rRNAs? The description of the chromosomal defects also falls short of satisfying. As is, this paper is probably of most interest to those who study ribosome biogenesis - an important topic, but without more mechanistic insight, not so interesting to a more general audience.

      My expertise

      I am an experienced Drosophila biologist who is familiar with the system and who fully understands all of the experiments presented in this manuscript and the relevance of the findings.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their valuable comments. A common suggestion by all reviewers was that the manuscript would benefit from restructuring. Following their recommendation, we have restructured this manuscript to improve its readability.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): The paper from Louka et al. studies the function of Cep104 during the development of Xenopus embryos. They perform overexpression and knock down experiments and address the consequences on neural tube closure, on ciliogenesis, and MT stability and on apical intercalation. There is a lot of data presented on a wide range of topics. While the data on MTs tracks reasonably well with other reports on Cep104, there are some concerns regarding the quality of some of the data and the interpretations based on the experimental results.

      Specific Points: It is difficult to assess the effect on apical constriction with the data provided. Please show zoomed in higher mag images. Also this should be coupled with a quantification of cell number and proliferation rates, as it is possible that Cep104 mildly affects proliferation / cell division which could affect cell size. Overall this experiment is not really addressing apical constriction since there is no before and after data. Lots of things could affect apical surface area, most notably proliferation rates which one might predict would be affected by subtle changes to MT dynamics.

      Response: Following the reviewer's recommendation, we now show zoomed-in, higher-magnification images to more clearly demonstrate the larger cell surface area in the morpholino injected neural plate compared to the control non-injected side in the same embryo. We agree with the reviewer that defects in cell proliferation could affect the cell size. If the effect of Cep104 on the cell surface area is caused by defects in cell proliferation, then we would expect this phenotype to persist in other tissues such as the ectoderm. However, we show that this phenotype is specific to the neural plate. On the other hand, if the cell surface area defect is caused by defects in apical constriction, we would expect this phenotype to be stage specific. Following the reviewer's recommendation, we compared the surface area of neuroectoderm cells before and after extensive apical constriction takes place. The new data is shown in Figure S2. Our results show no difference in the surface area of neuroectoderm cells between control tracer injected and morpholino injected neuroepithelial cells at stage 13, before extensive apical constriction, whereas significant differences are observed in stage 15 embryos, during which cells undergo apical constriction. This data strengthens our conclusion that downregulation of Cep104 affects apical constriction.

      "This defect was rescued with expression of exogenous human CEP104-GFP mRNA (300pg mRNA) (Figure 1D-E)." This was partially rescued as the control and the rescue are significantly different.

      Response: We thank the reviewer for this important clarification. We edited the text to more clearly reflect our data.

      I am unclear what is being depicted in Figure 1F and G. What is the intense red staining? Is that the blastopore? Which would imply that the stage of analysis is quite different between C and F which is concerning. The same stages should be used.

      Response: This is an image of the anterior most region of a stage 15 embryo. Occasionally some embryos do display intense phalloidin staining at the neural plate. We replaced the image with a clearer one and moved this data to Figure S2C.

      S1A has a boxed region as if there was going to be a zoomed in image, but there is not. It would be nice to see it zoomed in. While the localization is indeed at the base and tips of cilia the base looks too dispersed and big to be the basal body?

      Response: Following the reviewer's recommendation, we now show a zoomed-in image of a primary cilium. The boxed area in figure S2A shows the cilium that was used to generate the fluorescence intensity profile plot shown in S2B. The Cep104 signal at the basal body is much stronger compared to the ciliary tip signal. Exposure that allows simultaneous detection of both the base and the tip signal results in overexposure of the signal at the base. This is consistent with observations in primary cilia in cell culture (please refer to Figure 4 in Frikstad et al. 2019 and Figure 3 in Yamazoe et al. 2020).

      In other systems the depletion of Cep104 decreases primary cilia length. While the authors claim that neural tube cilia are normal there is no quantification to support that and the provided image is hard to assess.

      Response: Following the reviewer's recommendation, we now show quantifications of the length of floor plate cilia (Figure S3C). Floor plate cilia are longer than the cilia found elsewhere in the neural tube. This inherent variability in the length of cilia will likely prevent the detection of small changes in the cilium length elicited by downregulation of Cep104. Therefore, we chose to examine the length of floor plate cilia only, in control and morpholino injected cells. Our results show that downregulation of Cep104 leads to the formation of shorter floor plate cilia, which is in agreement with published data in other systems.

      While the authors claim broad expression in humans and MO effects in cells without cilia, there is little data supporting the expression of Cep104 in the Xenopus cells being assayed (e.g. goblet cells).

      Response: We agree with the reviewer that there is little evidence supporting the expression of Cep104 in Xenopus goblet cells. Cep104 is a very low abundance protein and thus very difficult to detect at endogenous levels. For example, Ryniawec et al. (2023) raised an antibody against Drosophila Cep104 that failed to detect the native (endogenous) protein via western blot or immunofluorescence, but successfully recognized the overexpressed (transgenic) Cep104. A proteomic study by Peshkin et al. 2019 showed that Cep104 levels remain relatively constant throughout Xenopus development, suggesting that this protein is expressed ubiquitously. This data is shown in Figure 4, where we plot the relative expression levels of Cep104 along with two motile cilia specific genes: hydin and RSPH9.

      The data in Figure 2 regarding the explants is difficult to understand and I think missing some key data. The text refers to the level of Gli increasing in the BF injected explants compared to uninjected explants, but the presentation of that is odd as the levels are normalized against uninjected rather than directly compared. And there are no stats for this key experiment. However, I think a bigger concern is the lack of information regarding the presence of cilia. While elongation and Sox2 expression are important they don't address if this tissue is similar to the neural tube in terms of cilia which is key to the interpretations.

      Response: Following the reviewer's recommendation, we changed the presentation of this data. GLI1 levels are now normalized to XBF2 injected explants. The results are the same: Gli1 levels are 25% lower in morphant XBF2 explants (t-test, p …). We understand the reviewer's concern regarding the presence of cilia in the explants. To our knowledge there are currently no reports on the presence of cilia in the neural ectoderm in Xenopus. We have made several attempts to determine if cilia are present in this tissue during neurulation. However, we have not been able to detect cilia based on immunofluorescence staining for acetylated tubulin and Arl13b in the neural ectoderm. We conclude from this experiment that downregulation of Cep104 negatively affects hedgehog signaling and it remains to be addressed whether this is due to defects in primary cilia.
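
      The normalization described in this response can be illustrated with a minimal sketch. It assumes one relative expression value per explant; the function name and every number below are hypothetical and were chosen only so that the arithmetic is easy to follow. They are not data from the manuscript.

```python
from statistics import mean

def relative_to_reference(values, reference_values):
    """Divide each value by the mean of the reference group."""
    reference_mean = mean(reference_values)
    return [v / reference_mean for v in values]

# Hypothetical relative expression values (arbitrary units), one per explant.
xbf2_explants = [1.5, 2.0, 2.5]            # reference group, mean 2.0
xbf2_morphant_explants = [1.0, 1.5, 2.0]   # morpholino co-injected group

normalized = relative_to_reference(xbf2_morphant_explants, xbf2_explants)

# Percent change of the morphant group relative to the reference group.
percent_change = (mean(normalized) - 1.0) * 100
print(normalized, percent_change)
```

      Normalizing to the reference group sets its mean to 1, so a value below 1 in the morphant group reads directly as a fractional decrease.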

      The localization of Cep104 GFP in the epidermis and the neuroepithelium does not look similar as stated. One does not really see the punctate pattern in the neuroectoderm.

      Response: We thank the reviewer for pointing this out. To more clearly present this data we now show a plot of the fluorescence profile of Cep104-GFP along cell-cell junctions to demonstrate the punctate localization in the neuroepithelium.

      The experiments linking Cep104 to the tips of paused MTs are not particularly convincing. The depolymerization of MTs with nocodazole will decrease all MTs as well as MT trafficking, which could affect Cep104. Comparing this experiment with taxol treatment to stabilize MTs (and decrease dynamics) would be more convincing. Plus the image provided does not support the claim that the leftover EMTB is marked with Cep104.

      Response: Following the reviewer's recommendation, we have examined the effect of taxol on the density of Cep104 apical puncta. We injected embryos with CEP104-GFP and EMTB-scarlet, exposed them to 20 μM taxol, and imaged them live at stage 38. Embryos not treated with taxol served as the control. As shown in Figure S4, treatment with taxol led to an increase in the density of Cep104 puncta. This further supports our conclusion that Cep104 localizes to the ends of stable or paused microtubules. We also revised Figure 5 to more clearly show that Cep104 remains associated with the ends of nocodazole resistant EMTB labeled microtubules.

      The data in Figure 6 is very difficult to interpret / believe. The quantified effects on MTs are pretty subtle (which is fine...that is why you quantify), but the massive experimental variability questions the meaningfulness of those quantifications. In Fig 6B there are cells with lots of MTs right next to cells with no MTs and both have similar expression levels of Cep104. The staining just doesn't look consistent enough to accurately quantify. Also, the effect of nocodazole on MT stability is quite rapid, on the order of seconds to minutes; it is unclear what ON treatment with nocodazole would even be measuring since in that time there would be lots of secondary effects.

      Response: We thank the reviewer for this comment. Some cells in the epidermis lack apical microtubules as the reviewer correctly points out. Cells without strong apical microtubule staining are seen in both control and morpholino injected cells. Here we quantified the number of control and morphant cells per embryo that lack apical microtubules (DMSO treated embryos). Our results show that similar numbers of control and morphant cells per embryo appear to lack apical microtubules. We think that the heterogeneity in tubulin signal is not an artifact of immunofluorescence staining since these cells are adjacent to cells with clear tubulin staining. Although the source of this variability is still unknown, the fact that an equal number of control and morphant cells show this phenotype suggests that this is unlikely to be linked to the injections or drug treatment. Those cells were excluded from the quantifications shown in Figures 6C and 6D. It is possible that these cells are preparing to enter mitosis.

      We think that the reviewer refers to the acute effects of nocodazole seen in cell cultures. However, in Xenopus tadpoles we didn't observe any effect on microtubules after short nocodazole treatment at low temperatures.

      The authors propose that overexpressing Cep104 would lead to stabilized MTs which is a reasonable hypothesis, however, they test this in multiciliated cells that already have a ton of acetylated MTs. If their hypothesis is correct it should lead to an increase in acetylated tubulin in non multiciliated cells which don't have much to begin with. This would be a marked improvement as the side projection quantification seems a little suspect as the analysis requires a precise ROI that eliminates the strong cilia acetylation staining. While I believe that could be done, the image provided looks as if it might cut off some of the apical surface which highlights the challenge.

      Response: Following the reviewer's recommendation, we examined the effect of Cep104 overexpression in non-MCCs on Xenopus epidermis. We show in Figure 7 that overexpression of Cep104 leads to a significant increase in the levels of acetylated tubulin in the cytoplasm of non-MCCs. We also show that overexpression of GFP alone did not have an effect on microtubule acetylation (Figure S5A). We moved the data on the cytoplasmic levels of acetylated microtubules in MCCs to figure S5B. We would like to clarify that the ROI to mark the cell body of MCCs was drawn right below the apical phalloidin signal to ensure that no signal derived from motile cilia will be included in the quantifications. A more detailed explanation of the quantification methods is included in this revised manuscript.

      Minor: Overall the color choice of images does not conform to the color blind favorable options that are becoming standard in the field. Also to the extent possible the colors should be consistent (e.g. Fig 4 A Cep104-GFP is green but in B it is red).

      Response: We thank the reviewer for this comment. We have changed the color choices in the figures to conform to color-blind-friendly options.

      The recent Xenopus Cep104 paper was referenced with two references, and the wording of those two sentences was redundant.

      Response: We thank the reviewer for this comment. We edited the text accordingly.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)): This study by Louka et al. investigates the function of Cep104, a protein associated with Joubert syndrome, in Xenopus. Several aspects are studied at different scales. Loss of function of this protein suggests a role in neural tube closure, apical constriction, and HH signaling. Moving on in the study, the authors investigate the localization of Cep104 in the primary cilia of the neural tube before focusing on its localization in multiciliated cells. They then look at the consequences of loss of function on motile cilia and conclude that it plays a role in the length of the distal segment. They then show an association of Cep104 with cytoplasmic microtubules in non-multiciliated cells of the Xenopus epidermis. They then analyze the function of Cep104 on these microtubules and show that loss of Cep104 function increases the speed of EB1 comets. They then looked at the impact of loss of function on microtubule stability and finally the impact of gain of function. Finally, they returned to the multiciliated cells and described an intercalation defect that correlated with decreases in acetylated tubulin. I think that certain controls are missing and that the choice of illustrations should be reconsidered (better quality, appropriate zoom). In terms of form, the text is not easy to read and the manuscript would benefit from reformatting to highlight the logical links between the different experiments and avoid a catalog-like effect. I would advise the authors to revise their introduction to make it less disjointed and guide readers toward the questions addressed by the manuscript.

      Response: We thank the reviewer for the constructive criticism. We have revised the introduction to make it easier to read.

      Below are specific comments and remarks: Figure 1: Why the conclusion is a "delay" in neural tube closure? At what stage is this analyzed? Is there a recovery of NT closure at later stage? A: I would suggest to provide control picture of non-injected and tracer only injected embryos. B: Statistics are missing on the graph D: mention what was injected instead of "+ rescue". Close up picture would allow a better appreciation of the differences in surface area.

      Response: We thank the reviewer for this comment. The image shown in Figure 1A is from late neurula embryos, stage 18. We conclude that it is a delay in neural tube closure because the neural tube does close and the embryos develop to tailbud stages. To demonstrate the delay in neural tube closure we now include a time lapse sequence of a neurula stage embryo injected with the morpholino unilaterally which shows that the morpholino injected side moves towards the midline slower compared to the control uninjected side (movie 1). We also included a representative image of the dorsal side of a tailbud embryo injected unilaterally with the CEP104 morpholino to show that the neural tube has closed and the embryos develop to tailbud stages (figure S1D).

      Following the reviewer's recommendation, we also show images of embryos injected unilaterally with the tracer alone (Figure S2), we included the statistical analysis for graph 1D, revised image 1D to show that the embryo is injected with the morpholino and CEP104-GFP and provide close ups to allow for better appreciation of the differences in surface area.

      Figure S1: To illustrate the claim that cilia are not affected, it would be good to show injection of tracer alone and compare to tracer + morpholino. Also, to provide a measure of the cilia size.

      Response: Following the reviewer's recommendation, we quantified the length of floor plate cilia in the neural tube of control and morpholino injected embryos. As explained in our response to a comment by reviewer 1, the floor plate cilia are longer than the cilia found elsewhere in the neural tube. This inherent variability in the length of cilia will likely prevent the detection of small changes in the cilium length elicited by downregulation of Cep104. Therefore, we chose to examine the length of floor plate cilia only in control and morpholino injected cells. Our results show that downregulation of Cep104 leads to the formation of shorter floor plate cilia, which is in agreement with published data in other systems (Figure S3C).

      Figure 2: Please provide pictures to illustrate graph D.


      Response: The graph in Figure 2D shows RT-qPCR results for CEP104 in BF2 injected explants and in explants injected with both BF2 and morpholino, as compared to non-injected explants. We do not have a working antibody that would allow us to show the downregulation at the protein level.

      Figure 5: "Interestingly, most of the nocodazole-resistant stable microtubules were positive for Cep104 (Figure 5C, arrows)." The variation in density of Cep104-GFP signal is not visible on the pictures provided in C. I would suggest to show higher magnifications. Also, in the DMSO treated picture the Cep104-GFP signal looks really different when compared to Cep104-GFP signal shown in B. Arrows should be reported on all channels. However, it is not clear what we should see with these arrows. 5C: it seems that in nocodazole treated condition the Cep104-GFP is at the cilia base in MCCs which is different from the DMSO control condition. The basal body signal was not seen in the figure 3A which analyzes the localization of Cep104-GFP in MCCs. Why not comment on this? Is it a phenotype on MCCs?

      Response: Following the reviewer's recommendation, we now show higher magnifications of the images shown in Figure 5C. We removed the arrows as most reviewers found them confusing. To demonstrate the presence of Cep104 at the ends of nocodazole resistant EMTB labeled microtubules we show zoomed images and a representative fluorescence intensity profile plot. Figure 5B shows an image of a non-MCC whereas Figure 5C shows a larger area on the tadpole epidermis which includes both MCCs and non-MCCs. We thank the reviewer for pointing out that the localization of Cep104 in 5C looks different from 3A. We do not think this is a phenotype on MCCs. In Figure 3A we imaged only the tips of cilia which is why it looks different from 5C in which we imaged the apical surface of the cells as well. We disagree with the reviewer regarding the comment '5C: it seems that in nocodazole treated condition the Cep104-GFP is at the cilia base in MCCs which is different from the DMSO control condition'. The basal body localization of Cep104 is shown in the DMSO image as well. We hope that it will be clear in this revised figure.

      Figure 6: "Intriguingly, morphant non-MCCs have significantly more mean β-tubulin signal compared to control non-MCCs in embryos treated with DMSO (Figure 6C)." This is impossible to appreciate on the figures. Please specify on the figure what is considered as a morphant non-MCC versus a control non-MCC. The membrane-cherry positive cells (supposedly morphant? it has to be clarified) show very heterogeneous tubulin expression. If the point here is to show that microtubules are more sensitive to nocodazole in morphant cells as compared to control, I would suggest to show all conditions on the same graph. At least annotate the graph more for a self-explanatory figure (DMSO, Nocodazole).

      Response: We agree with the reviewer that it is impossible to appreciate the difference in β-tubulin signal between control and morphant non-MCCs. Based on the quantifications of mean β-tubulin fluorescence intensity, there is a 5% difference in the fluorescence intensity between the two groups. Statistical analysis using t-test shows that, although very small, this difference is statistically significant, which is why we mention it in the manuscript. We have removed this statement and data from the revised manuscript because this is a very subtle phenotype, and it is beyond the scope of this experiment.

      Following the reviewer's recommendation, we clarify that mem-cherry positive cells contain the morpholino and mem-cherry negative cells are the control cells. We marked the morphant non-MCCs with a white asterisk. To address the heterogeneous tubulin levels, we provide quantifications which show that a similar number of control and morphant cells appear to lack microtubules. We think that the heterogeneity in tubulin signal is not an artifact of immunofluorescence staining since these cells are adjacent to cells with clear tubulin staining. Although the source of this variability is still unknown, the fact that an equal number of control and morphant cells show this phenotype suggests that this is unlikely to be linked to the injections or drug treatment. Those cells were excluded from the quantifications shown in figure 6. It is possible that these cells are preparing to enter mitosis. The reviewer is correct; the point of this experiment is to examine the effect of Cep104 downregulation on the sensitivity of microtubules to nocodazole. To more clearly present the results of this experiment, we normalize the β-tubulin fluorescence intensity in morphant cells to that in control cells in the same embryo and we compare the normalized intensity in DMSO and nocodazole treated embryos.
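
      The within-embryo normalization described above can be sketched as follows. This is only an illustration under stated assumptions: it takes one mean intensity per cell, and the function names, data layout and all numbers are hypothetical rather than taken from the manuscript.

```python
from statistics import mean

def normalized_intensity(morphant_cells, control_cells):
    """Mean morphant signal divided by mean control signal within one embryo."""
    return mean(morphant_cells) / mean(control_cells)

def per_embryo_ratios(embryos):
    """One normalized value per embryo, so each embryo is its own control."""
    return [normalized_intensity(e["morphant"], e["control"]) for e in embryos]

# Hypothetical per-cell mean β-tubulin intensities (arbitrary units), grouped by embryo.
dmso_embryos = [
    {"morphant": [100.0, 110.0], "control": [100.0, 110.0]},
    {"morphant": [90.0, 100.0], "control": [95.0, 95.0]},
]
nocodazole_embryos = [
    {"morphant": [40.0, 50.0], "control": [80.0, 100.0]},
    {"morphant": [30.0, 50.0], "control": [70.0, 90.0]},
]

dmso_ratios = per_embryo_ratios(dmso_embryos)
nocodazole_ratios = per_embryo_ratios(nocodazole_embryos)
print(dmso_ratios, nocodazole_ratios)
```

      A ratio near 1 means morphant and control cells in that embryo have similar signal; a lower ratio under nocodazole than under DMSO would indicate greater sensitivity of morphant microtubules. Because the ratio is taken within each embryo, embryo-to-embryo differences in staining intensity cancel out.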

      Figure 7: Statistics are missing on Graph B

      Response: Following the reviewer's recommendation, we added the statistics on the graph.

      Comment on the text: "Cep104 signal shows the characteristic two dot pattern in motile cilia (Figure 3A) that was also observed in a recent study using Xenopus Cep104 [65] and in the cilia of Tetrahymena [50]. This is in agreement with a recent study showing the characteristic two dot pattern for Xenopus Cep104 as well [66]." Refs 65 and 66b are the same (Hong et al., preprint).

      Response: We thank the reviewer for pointing this out. We edited the text to avoid repetition and corrected the references.

      "This data suggests that downregulation of CEP104 affects the stability of cytoplasmic microtubules." I would suggest a more precise conclusion by stating how is it affected? More stable? Less stable? Important for the follow-up demonstration.

      Response: We edited the text according to the reviewer's recommendation to precisely conclude that downregulation of Cep104 makes cytoplasmic microtubules less stable.

      Movies: Please annotate properly movie 2 and 3 so the reader can know what he/she is looking at.


      Response: Following the reviewer's comment, we revised the movie annotations to help the reader know what they are looking at.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled "Ciliary and non-ciliary functions of Cep104 in Xenopus" by Louka et al. investigates roles for the centriole and cilia tip protein Cep104 in Xenopus embryos. The authors show that depletion of Cep104 prevents neural tube closure due to inefficient apical constriction of neural cells and defective hedgehog signaling. Cep104 depletion also resulted in structural and functional ciliary defects in multi-ciliated cells. Surprisingly, the authors discover a role for Cep104 in stabilizing cytoplasmic microtubules in non-ciliated and multi-ciliated cells. Reduced microtubule stability in Cep104-depleted cells correlated with reduced apical intercalation of multi-ciliated cells in the epidermis.

      Overall, I find this manuscript difficult to understand because the experiments lack description of the findings within a normal developmental context and the findings are not developed into a cohesive narrative. I do find the study to be potentially impactful as the authors characterize Cep104 in a novel system (previous peer-reviewed studies have investigated Cep104 in human cell lines, Drosophila, zebrafish, Tetrahymena, and Chlamydomonas) with disease-relevant biology (neural development); however, mechanistic links are not properly explored. Over the course of their investigation, the authors made the novel finding that Cep104 controls the dynamics of cytoplasmic microtubules. However, this is not directly tested and potential pleiotropic effects of the developmental defects caused by Cep104 depletion confound the results.

      Response: We thank the reviewer for their comments. We tried to address this by restructuring the manuscript to describe the results in more detail within a normal developmental context.

      Major Critiques: The developmental context of experiments is not made clear. The authors use different tissues at varying developmental stages to perform experiments. However, these findings are not explored in depth and, therefore, the manuscript does not advance our understanding of Cep104's role in any of the processes explored.

      Response: We thank the reviewer for their comment. We took advantage of different tissues during Xenopus development to understand the cellular and molecular function of this protein in vivo. In this manuscript we show that Cep104 is involved in neural tube closure, likely through its effect on apical constriction. Our data show that Cep104 is important for the stability of cytoplasmic microtubules, and this is further demonstrated through its role in apical intercalation of multiciliated cells, a process known to depend on stable microtubules. Although our data do not advance our understanding of developmental processes such as apical constriction and MCC apical intercalation, they do improve our understanding of how Cep104 impacts cytoplasmic microtubules, which has not yet been addressed in vivo.

      While the potential role of Cep104 in cytoplasmic microtubule regulation is intriguing, the experiments in the manuscript do not directly test this function. Because Cep104 depletion appears to have a profound developmental effect, it is difficult to interpret changes to EB1 velocity as directly attributed to Cep104 function. Additionally, the only evidence for Cep104 localization occurs in cells overexpressing human Cep104. The authors must directly visualize endogenous Cep104 to conclude microtubule or membrane localization, which they can also use to demonstrate Cep104 depletion in the morpholino experiments. Additionally, the assertion that Cep104 is binding plus-ends of cytoplasmic microtubules is not experimentally supported.

      Response: Unfortunately, we cannot directly visualize endogenous Cep104 because there is no commercially available antibody that works in Xenopus. Cep104 is a very low-abundance protein, as highlighted in the study by Ryniawec et al. 2023, who generated an antibody against Drosophila Cep104 that detected GFP-tagged DmCep104 but failed to detect the endogenous protein. Given that the ciliary and basal body signal of Cep104 represents the cumulative signal from nine microtubules, one can appreciate the difficulty of observing the Cep104 signal on individual microtubules. None of the commercially available Cep104 antibodies that we have tested worked against the Xenopus protein in immunofluorescence or western blot experiments. We agree with the reviewer that we do not experimentally test the binding of Cep104 to the microtubule plus-end. This has been demonstrated by others. Jiang et al. 2012 showed that GFP-Cep104 co-immunoprecipitates with GST-EB1 but not with GST-EB1 that lacks the tail which contains the SxIP binding motif. Yamazoe et al. 2020 showed that exogenous Cep104 co-immunoprecipitates with exogenous EB1, whereas Cep104 with a mutated SxIP motif (SKNN) fails to co-immunoprecipitate with EB1. This shows that Cep104 interacts with EB1 through its SxIP motif. In addition, overexpression of Cep104 recruits Cep97 to microtubule tips, suggesting that it acts as a +TIP protein. A recent study by Saunders et al. 2025 showed that, in in vitro microtubule reconstitution assays, Cep104 could not autonomously bind the microtubule plus-end at low concentrations, but in the presence of EB3 it could bind the microtubule plus-end and block microtubule polymerization at the same low concentration. This shows that Cep104 interacts with EB3, localizes to the microtubule plus-end and affects its dynamics in vitro. We added this information to the manuscript to show more clearly that the interaction of Cep104 and EB proteins is well documented. We anticipate that this interaction will hold true in all cell types where the two proteins are co-expressed.

      Additional Critiques: Figure S1. I only see the emergence of a shorter product after Cep104 depletion. Should PCR using Exon5-7 still work in successful knockdown? If not, then it is unclear what was quantified to determine Cep104 depletion as morpholino bands appear no different than control.

      Response: We thank the reviewer for this comment. PCR using exon 5-7 primers will not work when splice blocking by the morpholino takes place. This is a knockdown approach and the efficiency of the morpholino is about 90%. Upon completion of the RT-qPCR cycle, the samples were analyzed by gel electrophoresis to demonstrate 1) that alternative splicing took place (see the two products with exon 3-7 primers) and 2) the presence of a single product for all primer sets used.

      Figure 1A. Is this an example of an open or closed NTC? Show data used to determine the statement "no difference during convergent extension".

      Response: This is an example of an embryo that was unilaterally injected with the morpholino. The left side is the control non-injected side and the right side is the morpholino-injected side. We added this information to the figure to make it more self-explanatory. In Figure 2 the elongation of the BF2 injected explants is due to convergent extension. The statement "no difference during convergent extension" was removed from the revised manuscript.

      Figure S2C. What does "Does not effect formation of cilia" mean? Does Cep104 depletion not affect number, length, etc.? Show the quantitation used to determine this.

      Response: Following the reviewer's recommendation, we quantified the length of floor plate cilia in control and morpholino-injected embryos. As mentioned in our responses to reviewers 1 and 2, floor plate cilia are longer than the cilia found elsewhere in the neural tube. This inherent variability in the length of cilia will likely prevent the detection of small changes in cilium length elicited by downregulation of Cep104. Therefore, we chose to examine the length of floor plate cilia only, in control and morpholino-injected cells. Our results show that downregulation of Cep104 leads to the formation of shorter floor plate cilia, which is in agreement with published data in other systems.

      Figure 5B. Along with strong Cep104 localization to membranes, there also appears to be strong EMTB localization. Is this also present in beta-tubulin immunostaining? Are these localizing to a cortical population of microtubules or to the membrane?

      Response: We thank the reviewer for their comment. The Cep104 puncta at the cell periphery are reduced or lost upon nocodazole treatment; thus, we conclude that Cep104 localizes to microtubules and not to the cell membrane (Figure 5C, zoomed images). Of course, we cannot exclude the possibility that microtubules are required to target CEP104 to the plasma membrane. We edited the text to clearly state this conclusion.

      Figure 6C and 6D. These two panels have the same labels. The authors should denote that 6D is in nocodazole-treated explants.

      Response: We thank the reviewer for this comment. We edited this figure to more clearly present the results of this experiment: we normalized the β-tubulin levels in morphant cells to those of control cells in the same embryo (mosaic morphant embryos were used in this experiment). The graph shows the mean normalized β-tubulin levels per embryo treated with DMSO or nocodazole.
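For illustration only, the per-embryo normalization described in this response could be sketched as follows. This is not the authors' analysis pipeline: the function name, the tuple layout and the example intensities are all hypothetical, and the sketch simply assumes that each mosaic embryo contributes at least one control and one morphant cell measurement.

```python
from collections import defaultdict
from statistics import mean


def normalized_intensity_per_embryo(measurements):
    """Return {treatment: [morphant/control ratio for each embryo]}.

    `measurements` is an iterable of hypothetical tuples
    (embryo_id, treatment, cell_type, intensity), where cell_type is
    either "control" or "morphant".
    """
    # Collect the raw intensities of each cell type within each embryo.
    grouped = defaultdict(lambda: {"control": [], "morphant": []})
    for embryo_id, treatment, cell_type, intensity in measurements:
        grouped[(embryo_id, treatment)][cell_type].append(intensity)

    # Divide the mean morphant signal by the mean control signal of the
    # same embryo, so that embryo-to-embryo staining differences cancel.
    ratios = defaultdict(list)
    for (_, treatment), cells in grouped.items():
        if not cells["control"] or not cells["morphant"]:
            continue  # skip embryos lacking one of the two populations
        ratios[treatment].append(mean(cells["morphant"]) / mean(cells["control"]))
    return dict(ratios)


# Made-up intensities, purely to show the shape of the calculation.
example = [
    ("e1", "DMSO", "control", 100.0),
    ("e1", "DMSO", "control", 120.0),
    ("e1", "DMSO", "morphant", 110.0),
    ("e2", "nocodazole", "control", 80.0),
    ("e2", "nocodazole", "morphant", 40.0),
]
result = normalized_intensity_per_embryo(example)
```

With this layout, a ratio near 1 means morphant and control cells in the same embryo carry similar β-tubulin signal, and a lower ratio in nocodazole-treated than in DMSO-treated embryos would correspond to morphant microtubules being more sensitive to the drug.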

      Figure 7. What are Cep104 levels at stage 18-19?

      Response: Following the reviewer's comment, we now show the Cep104 protein expression levels during Xenopus development as reported on Xenbase (Figure 4). Cep104 is expressed at low levels from gastrulation to tailbud stages (Figure 4D).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript entitled "Ciliary and non-ciliary functions of Cep104 in Xenopus" by Louka et al. investigates roles for the centriole and cilia tip protein Cep104 in Xenopus embryos. The authors show that depletion of Cep104 prevents neural tube closure due to inefficient apical constriction of neural cells and defective hedgehog signaling. Cep104 depletion also resulted in structural and functional ciliary defects in multi-ciliated cells. Surprisingly, the authors discover a role for Cep104 in stabilizing cytoplasmic microtubules in non-ciliated and multi-ciliated cells. Reduced microtubule stability in Cep104-depleted cells correlated with reduced apical intercalation of multi-ciliated cells in the epidermis.

      Overall, I find this manuscript difficult to understand because the experiments lack description of the findings within a normal developmental context and the findings are not developed into a cohesive narrative. I do find the study to be potentially impactful as the authors characterize Cep104 in a novel system (previous peer-reviewed studies have investigated Cep104 in human cell lines, Drosophila, zebrafish, Tetrahymena, and Chlamydomonas) with disease-relevant biology (neural development); however, mechanistic links are not properly explored. Over the course of their investigation, the authors made the novel finding that Cep104 controls the dynamics of cytoplasmic microtubules. However, this is not directly tested and potential pleiotropic effects of the developmental defects caused by Cep104 depletion confound the results.

      Major Critiques:

      The developmental context of experiments is not made clear. The authors use different tissues at varying developmental stages to perform experiments. However, these findings are not explored in depth and, therefore, the manuscript does not advance our understanding of Cep104's role in any of the processes explored.

      While the potential role of Cep104 in cytoplasmic microtubule regulation is intriguing, the experiments in the manuscript do not directly test this function. Because Cep104 depletion appears to have a profound developmental effect, it is difficult to interpret changes to EB1 velocity as directly attributed to Cep104 function. Additionally, the only evidence for Cep104 localization occurs in cells overexpressing human Cep104. The authors must directly visualize endogenous Cep104 to conclude microtubule or membrane localization, which they can also use to demonstrate Cep104 depletion in the morpholino experiments. Additionally, the assertion that Cep104 is binding plus-ends of cytoplasmic microtubules is not experimentally supported.

      Additional Critiques:

      Figure S1. I only see the emergence of a shorter product after Cep104 depletion. Should PCR using Exon5-7 still work in successful knockdown? If not, then it is unclear what was quantified to determine Cep104 depletion as morpholino bands appear no different than control.

      Figure 1A. Is this an example of an open or closed NTC? Show data used to determine the statement "no difference during convergent extension".

      Figure S2C. What does "Does not effect formation of cilia" mean? Does Cep104 depletion not affect number, length, etc.? Show the quantitation used to determine this.

      Figure 5B. Along with strong Cep104 localization to membranes, there also appears to be strong EMTB localization. Is this also present in beta-tubulin immunostaining? Are these localizing to a cortical population of microtubules or to the membrane?

      Figure 6C and 6D. These two panels have the same labels. The authors should denote that 6D is in nocodazole-treated explants.

      Figure 7. What are Cep104 levels at stage 18-19?

      Significance

      Overall, I find this manuscript difficult to understand because the experiments lack description of the findings within a normal developmental context and the findings are not developed into a cohesive narrative. I do find the study to be potentially impactful as the authors characterize Cep104 in a novel system (previous peer-reviewed studies have investigated Cep104 in human cell lines, Drosophila, zebrafish, Tetrahymena, and Chlamydomonas) with disease-relevant biology (neural development); however, mechanistic links are not properly explored. Over the course of their investigation, the authors made the novel finding that Cep104 controls the dynamics of cytoplasmic microtubules. However, this is not directly tested and potential pleiotropic effects of the developmental defects caused by Cep104 depletion confound the results.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study by Louka et al. investigates the function of Cep104, a protein associated with Joubert syndrome, in Xenopus. Several aspects are studied at different scales. Loss of function of this protein suggests a role in neural tube closure, apical constriction, and HH signaling. Moving on in the study, the authors investigate the localization of Cep104 in the primary cilia of the neural tube before focusing on its localization in multiciliated cells. They then look at the consequences of loss of function on motile cilia and conclude that it plays a role in the length of the distal segment. They then show an association of Cep104 with cytoplasmic microtubules in non-multiciliated cells of the Xenopus epidermis. They then analyze the function of Cep104 on these microtubules and show that loss of Cep104 function increases the speed of EB1 comets. They then look at the impact of loss of function on microtubule stability and finally at the impact of gain of function. Finally, they return to the multiciliated cells and describe an intercalation defect that correlates with decreases in acetylated tubulin. I think that certain controls are missing and that the choice of illustrations should be reconsidered (better quality, appropriate zoom). In terms of form, the text is not easy to read and the manuscript would benefit from reformatting to highlight the logical links between the different experiments and avoid a catalog-like effect. I would advise the authors to revise their introduction to make it less disjointed and guide readers toward the questions addressed by the manuscript.

      Below are specific comments and remarks:

      Figure 1:

      Why is the conclusion a "delay" in neural tube closure? At what stage is this analyzed? Is there a recovery of NT closure at later stages? A: I would suggest providing control pictures of non-injected and tracer-only injected embryos. B: Statistics are missing on the graph. D: Mention what was injected instead of "+ rescue". Close-up pictures would allow a better appreciation of the differences in surface area.

      Figure S1:

      To illustrate the claim that cilia are not affected, it would be good to show injection of tracer alone and compare to tracer + morpholino. Also, to provide a measure of the cilia size.

      Figure 2:

      Please provide pictures to illustrate graph D.

      Figure 5:

      "Interestingly, most of the nocodazole-resistant stable microtubules were positive for Cep104 (Figure 5C, arrows)." - The variation in density of Cep104-GFP signal is not visible in the pictures provided in C. I would suggest showing higher magnifications. Also, in the DMSO-treated picture the Cep104-GFP signal looks really different from the Cep104-GFP signal shown in B. Arrows should be reported on all channels. However, it is not clear what we should see with these arrows. 5C: it seems that in the nocodazole-treated condition Cep104-GFP is at the cilia base in MCCs, which is different from the DMSO control condition. The basal body signal was not seen in Figure 3A, which analyzes the localization of Cep104-GFP in MCCs. Why not comment on this? Is it a phenotype in MCCs?

      Figure 6:

      Intriguingly, morphant non-MCCs have significantly more mean β-tubulin signal compared to control non-MCCs in embryos treated with DMSO (Figure 6C). - Impossible to appreciate on the figures. Please specify on the figure what is considered a morphant non-MCC versus a control non-MCC. The membrane-cherry positive cells (supposedly morphant? this has to be clarified) show very heterogeneous tubulin expression.

      If the point here is to show that microtubules are more sensitive to nocodazole in morphant cells than in control cells, I would suggest showing all conditions on the same graph. At least annotate the graph more for a self-explanatory figure (DMSO, nocodazole).

      Figure 7:

      Statistics are missing on Graph B.

      Comment on the text: "Cep104 signal shows the characteristic two dot pattern in motile cilia (Figure 3A) that was also observed in a recent study using Xenopus Cep104 [65] and in the cilia of Tetrahymena [50]. This is in agreement with a recent study showing the characteristic two dot pattern for Xenopus Cep104 as well [66]." - Refs 65 and 66b are the same (Hong et al., preprint).

      "This data suggests that downregulation of CEP104 affects the stability of cytoplasmic microtubules." - I would suggest a more precise conclusion by stating how is it affected? More stable? Less stable? Important for the follow-up demonstration.

      Movies:

      Please properly annotate movies 2 and 3 so the reader knows what he/she is looking at.

      Referees cross-commenting

      Similar feeling that reviews are consistent

      Significance

      This study investigates the role of the protein Cep104 in Xenopus. Cep104 is a protein associated with Joubert syndrome, whose role in primary cilia has been extensively documented. While its localization at the tip of motile cilia has also been reported, this study provides functional evidence for the role of Cep104 in motile cilia. In addition, the study looks at the role of Cep104 on non-ciliary microtubules, which is the original aspect of the paper and may ultimately lead to a better understanding of Joubert syndrome. However, I believe that the evidence provided (controls, illustrations) needs to be improved. This paper will be of interest to a specialized audience with an interest in proteins associated with cilia and microtubules.

      I am a cell biologist specialized in the study of multiciliated cells using advanced imaging methods and Xenopus and mice as models. I believe my expertise was a perfect match for this manuscript.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The paper from Louka et al. studies the function of Cep104 during the development of Xenopus embryos. They perform overexpression and knock down experiments and address the consequences on neural tube closure, on ciliogenesis, and MT stability and on apical intercalation. There is a lot of data presented on a wide range of topics. While the data on MTs tracks reasonably well with other reports on Cep104, there are some concerns regarding the quality of some of the data and the interpretations based on the experimental results.

      Specific Points:

      It is difficult to assess the effect on apical constriction with the data provided. Please show zoomed in higher mag images. Also this should be coupled with a quantification of cell number and proliferation rates, as it is possible that Cep104 mildly affects proliferation / cell division which could affect cell size. Overall this experiment is not really addressing apical constriction since there is no before and after data. Lots of things could affect apical surface area, most notably proliferation rates which one might predict would be affected by subtle changes to MT dynamics.

      "This defect was rescued with expression of exogenous human CEP104-GFP mRNA (300pg mRNA) (Figure 1D-E)." This was partially rescued as the control and the rescue are significantly different.

      I am unclear what is being depicted in Figure 1F and G. What is the intense red staining? Is that the blastopore? Which would imply that the stage of analysis is quite different between C and F which is concerning. The same stages should be used.

      S1A has a boxed region as if there was going to be a zoomed in image, but there is not. It would be nice to see it zoomed in. While the localization is indeed at the base and tips of cilia the base looks too dispersed and big to be the basal body?

      In other systems the depletion of Cep104 decreases primary cilia length. While the authors claim that neural tube cilia are normal there is no quantification to support that and the provided image is hard to assess.

      While the authors claim broad expression in humans and MO effects in cells without cilia, there is little data supporting the expression of Cep104 in the Xenopus cells being assayed (e.g. goblet cells).

      The data in Figure 2 regarding the explants is difficult to understand and I think missing some key data. The text refers to the level of Gli increasing in the BF injected explants compared to uninjected explants, but the presentation of that is odd as the levels are normalized against uninjected rather than directly compared. And there are no stats for this key experiment. However, I think a bigger concern is the lack of information regarding the presence of cilia. While elongation and Sox2 expression are important they don't address if this tissue is similar to the neural tube in terms of cilia which is key to the interpretations.

      The localization of Cep104-GFP in the epidermis and the neuroepithelium does not look similar as stated. One does not really see the punctate pattern in the neuroectoderm.

      The experiments linking Cep104 to the tips of paused MTs are not particularly convincing. The depolymerization of MTs with nocodazole will decrease all MTs as well as MT trafficking, which could affect Cep104. Comparing this experiment with taxol treatment to stabilize MTs (and decrease dynamics) would be more convincing. Plus, the image provided does not support the claim that the leftover EMTB is marked with Cep104.

      The data in Figure 6 are very difficult to interpret / believe. The quantified effects on MTs are pretty subtle (which is fine...that is why you quantify), but the massive experimental variability calls into question the meaningfulness of those quantifications. In Fig 6B there are cells with lots of MTs right next to cells with no MTs, and both have similar expression levels of Cep104. The staining just doesn't look consistent enough to accurately quantify. Also, the effect of nocodazole on MT stability is quite rapid, on the order of seconds to minutes, so it is unclear what overnight (ON) treatment with nocodazole would even be measuring, since in that time there would be lots of secondary effects.

      The authors propose that overexpressing Cep104 would lead to stabilized MTs, which is a reasonable hypothesis; however, they test this in multiciliated cells that already have a ton of acetylated MTs. If their hypothesis is correct, it should lead to an increase in acetylated tubulin in non-multiciliated cells, which don't have much to begin with. This would be a marked improvement, as the side projection quantification seems a little suspect: the analysis requires a precise ROI that eliminates the strong cilia acetylation staining. While I believe that could be done, the image provided looks as if it might cut off some of the apical surface, which highlights the challenge.

      Minor:

      Overall the color choice of images does not conform to the color blind favorable options that are becoming standard in the field. Also to the extent possible the colors should be consistent (e.g. Fig 4 A Cep104-GFP is green but in B it is red).

      The recent Xenopus Cep104 paper was referenced with two references, and the wording of those two sentences was redundant.

      Referees cross-commenting

      I feel that all three reviews are pretty consistent and I do not have any issues with the other reviews.

      Significance

      Strengths. Cep104 appears to be a hot topic right now as there are several papers on bioRxiv. I suspect that this led to a bit of a rushed submission. The other papers focus mostly on understanding the mechanisms of the ciliary roles of Cep104, which are well established. In other systems the broad phenotypes associated with Cep104 depletion are assumed to be through loss of cilia-mediated HH signaling. This paper proposes a number of non-ciliary roles for Cep104 which, given its broad distribution, could be relevant. If true, these findings would add considerably to the field. Given that MTs do lots of things other than make cilia, it would not be too surprising for Cep104 to have MT-specific phenotypes as proposed here.

      Weaknesses. The quality of much of the data makes it difficult to assess the claims of broad importance. Key experiments critical to the interpretation of the data are lacking.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Authors’ reply (Ono et al.)

      Review Commons Refereed Preprint #RC-2025-03137

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

      We greatly appreciate the reviewer’s supportive comments. The reviewer has accurately recognized our new findings concerning the collaborative roles of condensin II and cohesin in establishing and maintaining interphase chromosome territories.

      Major point:

      They propose a functional 'handover' from condensin II to cohesin, for the organization of CTs at the M-to-G1 transition. However, the 'handover', i.e. difference in timing of executing their functions, was not experimentally substantiated. Ideally, they can deplete condensin II and cohesin at different times to prove the 'handover'. However, this would require the use of two different degron tags and go beyond the revision of this manuscript. At least, based on the literature, the authors should discuss why they think condensin II and cohesin should work at different timings in the CT organization.

      We take this comment seriously, especially because Reviewer #2 also expressed the same concern. 

      First of all, we must admit that the basic information underlying the “handover” idea was insufficiently explained in the original manuscript. Let us make it clear below:

      • Condensin II binds to chromosomes and is enriched along their axes from anaphase through telophase (Ono et al., 2004; Hirota et al., 2004; Walther et al., 2018).
      • In early G1, condensin II is diffusely distributed within the nucleus and does not bind tightly to chromatin, as shown by detergent extraction experiments (Ono et al., 2013).
      • Cohesin starts binding to chromatin when the cell nucleus reassembles (i.e., during the cytokinesis stage shown in Fig. 1B), apparently replacing condensins I and II (Brunner et al., 2025).
      • Condensin II progressively rebinds to chromatin from S through G2 phase (Ono et al., 2013).

      The cell cycle-dependent changes in chromosome-bound condensin II and cohesin summarized above are illustrated in Fig. 1A. We now realize that Fig. 1B in the original manuscript was inconsistent with Fig. 1A, creating unnecessary confusion, and we sincerely apologize for this. The fluorescence images shown in the original Fig. 1B were captured without detergent extraction prior to fixation, giving the misleading impression that condensin II remained bound to chromatin from cytokinesis through early G1. This was not our intention. To clarify this, we have repeated the experiment with detergent extraction and replaced the original Fig. 1B with a revised panel. Figs. 1A and 1B are now more consistent with each other. Accordingly, we have modified the corresponding sentences as follows:

      Although condensin II remains nuclear throughout interphase, its chromatin binding is weak in G1 and becomes robust from S phase through G2 (Ono et al., 2013). Cohesin, in contrast, replaces condensin II in early G1 (Fig. 1B) (Abramo et al., 2019; Brunner et al., 2025), and establishes topologically associating domains (TADs) in the G1 nucleus (Schwarzer et al., 2017; Wutz et al., 2017).

      While there is a loose consensus in the field that condensin II is replaced by cohesin during the M-to-G1 transition, it remains controversial whether there is a short window during which neither condensin II nor cohesin binds to chromatin (Abramo et al., 2019), or whether there is a stage in which the two SMC protein complexes “co-occupy” chromatin (Brunner et al., 2025). Our images shown in the revised Fig. 1B cannot clearly distinguish between these two possibilities.

      From a functional point of view, the results of our depletion experiments are more readily explained by the latter possibility. If this is the case, the “interplay” or “cooperation” rather than the “handover” may be a more appropriate term to describe the functional collaboration between condensin II and cohesin during the M-to-G1 transition. For this reason, we have avoided the use of the word “handover” in the revised manuscript. It should be emphasized, however, that given their distinct chromosome-binding kinetics, the cooperation of the two SMC complexes during the M-to-G1 transition is qualitatively different from that observed in G2. Therefore, the central conclusion of the present study remains unchanged.

      For example, a sentence in the Abstract has been changed as follows:

      a functional interplay between condensin II and cohesin during the mitosis-to-G1 transition is critical for establishing chromosome territories (CTs) in the newly assembling nucleus.

      While the reviewer suggested one experiment, it is clearly beyond the scope of the current study. It should also be noted that even if such a cell line were available, the proposed application of sequential depletion to cells progressing from mitosis to G1 phase would be technically challenging and unlikely to produce results that could be interpreted with confidence.

      Other points:

      Figure 2E: It seems that the chromosome length without IAA is shorter in Rad21-aid cells than H2-aid cells or H2-aid Rad21-aid cells. How can this be interpreted?

      This comment is well taken. A related comment was made by Reviewer #3 (Major comment #2). Given the substantial genetic manipulations applied to establish multiple cell lines used in the present study, it is, strictly speaking, not straightforward to compare the -IAA controls between different cell lines. Such variations are most prominently observed in Fig. 2E, although they can also be observed to a lesser extent in other experiments (e.g., Fig. 3E). This issue is inherently associated with all studies using genetically manipulated cell lines and therefore cannot be completely avoided. For this reason, we focus on the differences between -IAA and +IAA within each cell line, rather than comparing the -IAA conditions across different cell lines. In this sense, a sentence in the original manuscript (lines 178-180) was misleading. In the revised manuscript, we have modified the corresponding and subsequent sentence as follows:

      Although cohesin depletion had a marginal effect on the distance between the two site-specific probes (Fig.2, C and E), double depletion did not result in a significant change (Fig.2, D and E), consistent with the partial restoration of centromere dispersion (Fig. 1G).

      In addition, we have added a section entitled “Limitations of the study” at the end of the Discussion to address technical issues that are inevitably associated with the current approach.

      Figure 3: Regarding the CT morphology, could they explain further the difference between 'elongated' and 'cloud-like (expanded)'? Is it possible to quantify the frequency of these morphologies?

      In the original manuscript, we provided data that quantitatively distinguished between the “elongated” and “cloud-like” phenotypes. Specifically, Fig. 2E shows that the distance between two specific loci (Cen 12 and 12q15) is increased in the elongated phenotype but not in the cloud-like phenotype. In addition, the cloud-like morphology clearly deviated from circularity, as indicated by the circularity index (Fig. 3F). However, because circularity can also decrease in rod-shaped chromosomes, these datasets alone may not be sufficiently convincing, as the reviewer pointed out. We have now included an additional parameter, the aspect ratio, defined as the ratio of an object’s major axis to its minor axis (new Fig. 3F). While this intuitive parameter was altered upon condensin II depletion and double depletion, again, we acknowledge that it is not sufficient to convincingly distinguish between the elongated and cloud-like phenotypes proposed in the original manuscript. For these reasons, in the revised manuscript, we have toned down our statements regarding the differences in CT morphology between the two conditions. Nonetheless, together with the data from Figs. 1 and 2, these results indicate that the Rabl configuration observed upon condensin II depletion is further exacerbated in the absence of cohesin. Accordingly, we have modified the main text and the cartoon (Fig. 3H) to more accurately depict the observations summarized above.

      Figure 5: How did they assign C, P and D3 for two chromosomes? The assignment seems obvious in some cases, but not in other cases (e.g. in the image of H2-AID#2 +IAA, two D3s can be connected to two Ps in the other way). They may have avoided line crossing between two C-P-D3 assignments, but can this be justified when the CT might be disorganized e.g. by condensin II depletion?

      This comment is well taken. As the reviewer suspected, we avoided line crossing between two sets of assignments. Whenever there was ambiguity, such images were excluded from the analysis. Because most chromosome territories derived from two homologous chromosomes are well separated even under the depleted conditions as shown in Fig. 6C, we did not encounter major difficulties in making assignments based on the criteria described above. We therefore remain confident that our conclusion is valid.

      That said, we acknowledge that our assignments of the FISH images may not be entirely objective. We have added this point to the “Limitations of the study” section at the end of the Discussion.

      Figure 6F: The mean is not indicated on the right-hand side graph, in contrast to other similar graphs. Is this an error?

      We apologize for having caused this confusion. First, we would like to clarify that the right panel of Fig. 6F should be interpreted together with the left panel, unlike the seemingly similar plots shown in Figs. 6G and 6H. In the left panel of Fig. 6F, the percentages of CTs that contact the nucleolus are shown in grey, whereas those that do not are shown in white. All CTs classified in the “non-contact” population (white) have a value of zero in the right panel, represented by the bars at 0 (i.e., each bar corresponds to a collection of dots having a zero value). In contrast, each CT in the “contact” population (grey) has a unique contact ratio value in the right panel. Because the right panel consists of two distinct groups, we reasoned that placing mean or median bars would not be appropriate. This was why no mean or median bars were shown in the right panel (the same is true for Fig. S5, A and B).

      That said, for the reviewer’s reference, we have placed median bars in the right panel (see below). In the six cases of H2#2 (-/+IAA), Rad21#2 (-/+IAA), Double#2 (-IAA), and Double#3 (-IAA), the median bars are located at zero (note that in these cases the mean bars [black] completely overlap with the “bars” derived from the data points [blue and magenta]). In the two cases of Double#2 (+IAA) and Double#3 (+IAA), they are placed at values of ~0.15. Statistically significant differences between -IAA and +IAA are observed only in Double#2 and Double#3, as indicated by the P-value shown on the top of the panel. Thus, we are confident in our conclusion that CTs undergo severe deformation in the absence of both condensin II and cohesin.
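      The behaviour of the median bars described above can be sketched with a small calculation. The values below are hypothetical and chosen only for illustration; they are not taken from Fig. 6F. The sketch assumes that each CT contributes one contact ratio and that non-contacting CTs contribute zero, as described above.

```python
from statistics import mean, median

def summarize(contact_ratios):
    """Summarize a mixed population of non-contacting (0.0) and contacting (>0) CTs."""
    n_contact = sum(1 for value in contact_ratios if value > 0)
    return {
        "fraction_contact": n_contact / len(contact_ratios),
        "mean": mean(contact_ratios),
        "median": median(contact_ratios),
    }

# Hypothetical population in which most CTs do not touch the nucleolus.
mostly_non_contact = [0.0] * 7 + [0.1, 0.2, 0.3]

# Hypothetical population in which most CTs touch the nucleolus.
mostly_contact = [0.0] * 3 + [0.05, 0.1, 0.15, 0.15, 0.2, 0.25, 0.3]

low = summarize(mostly_non_contact)
high = summarize(mostly_contact)
print(low)
print(high)
```

      Whenever more than half of the CTs fall in the non-contact group, the median sits at zero regardless of how large the contact ratios of the remaining CTs are, which is why a single summary bar can be hard to read for such two-group data.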

      Figure S1A: The two FACS profiles for Double-AID #3 Release-2 may be mixed up between -IAA and +IAA.

      The reviewer is right. This inadvertent error has been corrected.

      The method section explains that 'circularity' shows 'how closely the shape of an object approximates a perfect circle (with a value of 1 indicating a perfect circle), calculated from the segmented regions'. It would be helpful to provide further methodological details about it.

      We have added further explanations regarding the circularity in Materials and Methods together with a citation (two added sentences are underlined below):

      To analyze the morphology of nuclei, CTs, and nucleoli, we measured “circularity,” a morphological index that quantifies how closely the shape of an object approximates a perfect circle (value = 1). Circularity was defined as 4π × Area/Perimeter², where both the area and perimeter of each segmented object were obtained using ImageJ. This index ranges from 0 to 1, with values closer to 1 representing more circular objects and lower values corresponding to elongated or irregular shapes (Chen et al., 2017).

      Chen, B., Y. Wang, S. Berretta and O. Ghita. 2017. Poly Aryl Ether Ketones (PAEKs) and carbon-reinforced PAEK powders for laser sintering. J Mater Sci 52:6004-6019.
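      As a rough illustration of the circularity index defined above, and of the aspect ratio (major axis divided by minor axis) mentioned in the reply on Figure 3, the following minimal sketch evaluates both for two idealized shapes. The function names and the example shapes are assumptions made for illustration only; in the study itself, area and perimeter were obtained from segmented objects in ImageJ.

```python
import math

def circularity(area: float, perimeter: float) -> float:
    """Return 4 * pi * area / perimeter**2, which equals 1 for a perfect circle."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2

def aspect_ratio(major_axis: float, minor_axis: float) -> float:
    """Return the ratio of an object's major axis to its minor axis."""
    if minor_axis <= 0:
        raise ValueError("minor_axis must be positive")
    return major_axis / minor_axis

# Idealized circle of radius 1 (hypothetical example shape).
radius = 1.0
circle = circularity(math.pi * radius ** 2, 2.0 * math.pi * radius)

# Idealized rod approximated by a 4 x 1 rectangle (hypothetical example shape).
width, height = 4.0, 1.0
rod = circularity(width * height, 2.0 * (width + height))
rod_aspect = aspect_ratio(width, height)

print(f"circle circularity: {circle:.3f}")
print(f"rod circularity:    {rod:.3f}")
print(f"rod aspect ratio:   {rod_aspect:.1f}")
```

      The rod scores well below the circle even though its outline is smooth, which is consistent with the point made in the reply on Figure 3 that a low circularity alone cannot separate an elongated object from an irregular one.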

      Reviewer #1 (Significance (Required)):

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

      See our reply above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Ono et al use a variety of imaging and genetic (AID) depletion approaches to examine the roles of condensin II and cohesin in the reformation of interphase genome architecture in human HCT116 cells. Consistent with previous literature, they find that condensin II is required for CENP-A dispersion in late mitosis/early G1. Using in situ FISH at the centromere/q arm of chromosome 12 they then establish that condensin II removal causes lengthwise elongation of chromosomes that, interestingly, can be suppressed by cohesin removal. To better understand changes in whole-chromosome morphology, they then use whole chromosome painting to examine chromosomes 18 and 19. In the absence of condensin II, cells effectively fail to reorganise their chromosomes from rod-like structures into spherical chromosome territories (which may explain why CENP-A dispersion is suppressed). Cohesin is not required for spherical CT formation, suggesting condensin II is the major initial driver of interphase genome structure. Double depletion results in complete disorganisation of chromatin, leading the authors to conclude that a typical cell cycle requires orderly 'handover' from the mitotic to interphase genome organising machinery. The authors then move on to G2 phase, where they use a variety of different FISH probes to assess alterations in chromosome structure at different scales. They thereby establish that perturbation of cohesin or condensin II influences local and longer range chromosome structure, respectively. The effects of condensin II depletion become apparent at a genomic distance of 20 Mb, but are negligible either below or above. The authors repeat the G1 depletion experiment in G2 and now find that condensin II and cohesin are individually dispensable for CT organisation, but that dual depletion causes CT collapse. This rather implies that there is cooperation rather than handover per se.

      Overall this study is a broadly informative multiscale investigation of the roles of SMC complexes in organising the genome of postmitotic cells, and solidifies a potential relationship between condensin II and cohesin in coordinating interphase genome structure. The deeper investigation of the roles of condensin II in establishing chromosome territories and intermediate range chromosome structure in particular is a valuable and important contribution, especially given our incomplete understanding of what functions this complex performs during interphase.

      We sincerely appreciate the reviewer’s supportive comments. The reviewer has correctly acknowledged both the current gaps in our understanding of the role of condensin II in interphase chromosome organization and our new findings on the collaborative roles of condensin II and cohesin in establishing and maintaining interphase chromosome territories.

      Major comments:

      In general the claims and conclusions of the manuscript are well supported by multiscale FISH labelling. An important absent control is western blotting to confirm protein depletion levels. Currently only fluorescence is used as a readout for the efficiency of the AID depletion, and we know from prior literature that even small residual quantities of SMC complexes are quite effective in organising chromatin. I would consider a western blot a fairly straightforward and important technical control.

      Let me explain why we used immunofluorescence measurements to evaluate the efficiency of depletion. In our current protocol for synchronizing at the M-to-G1 transition, ~60% of control and H2-depleted cells, and ~30% of Rad21-depleted and co-depleted cells, are successfully synchronized in G1 phase. The apparently lower synchronization efficiency in the latter two groups is attributable to the well-documented mitotic delay caused by cohesin depletion. From these synchronized populations, early G1 cells were selected based on their characteristic morphologies (see the legend of Fig. 1C). In this way, we analyzed an early G1 cell population that had completed mitosis without chromosome segregation defects. We acknowledge that this represents a technically challenging aspect of M-to-G1 synchronization in HCT116 cells, whose synchronization efficiency is limited compared with that of HeLa cells. Nevertheless, this approach constitutes the most practical strategy currently available. Hence, immunofluorescence provides the only feasible means to evaluate depletion efficiency under these conditions.

      Although immunoblotting can, in principle, be applied to G2-arrested cell populations, we do not believe that information obtained from such experiments would affect the main conclusions of the current study. Please note that we carefully designed and performed all experiments with appropriate controls: H2 depletion, RAD21 depletion, and double depletion, with outcomes confirmed using independent cell lines (Double-AID#2 and Double-AID#3) whenever deemed necessary.

      We fully acknowledge the technical limitations associated with the AID-mediated depletion techniques, which are now described in the section entitled “Limitations of the study” at the end of the Discussion. Nevertheless, we emphasize that these limitations do not compromise the validity of our findings.

      I find the point on handover as a mechanism for maintaining CT architecture somewhat ambiguous, because the authors find that the dependence simply switches from condensin II to both condensin II and cohesin, between G1 and G2. To me this implies augmented cooperation rather than handover. I have two further suggestions, both of which I would strongly recommend but would consider desirable but 'optional' according to review commons guidelines.

      First of all, we would like to clarify a possible misunderstanding regarding the phrase “handover as a mechanism for maintaining CT architecture somewhat ambiguous”. In the original manuscript, we proposed handover as a mechanism for establishing G1 chromosome territories, not for maintaining CTs.

      That said, we take this comment very seriously, especially because Reviewer #1 also expressed the same concern. Please see our reply to Reviewer #1 (Major point).

      In brief, we agree with the reviewer that the word “handover” may not be appropriate to describe the functional relationship between condensin II and cohesin during the M-to-G1 transition. In the revised manuscript, we have avoided the use of the word “handover”, replacing it with “interplay”. It should be emphasized, however, that given their distinct chromosome-binding kinetics, the cooperation of the two SMC complexes during the M-to-G1 transition is qualitatively different from that observed in G2. Therefore, the central conclusion of the present study remains unchanged.

      For example, a sentence in the Abstract has been changed as follows:

      a functional interplay between condensin II and cohesin during the mitosis-to-G1 transition is critical for establishing chromosome territories (CTs) in the newly assembling nucleus.

      Firstly, the depletions are performed at different stages of the cell cycle but have different outcomes. The authors suggest this is because handover is already complete, but an alternative possibility is that the phenotype is masked by other changes in chromosome structure (e.g. duplication/catenation). I would be very curious to see, for example, how the outcome of this experiment would change if the authors were to repeat the depletions in the presence of a topoisomerase II inhibitor.

      The reviewer’s suggestion here is somewhat vague, and it is unclear to us what rationale underlies the proposed experiment or what meaningful outcomes could be anticipated. Does the reviewer suggest that we perform topo II inhibitor experiments both during the M-to-G1 transition and in G2 phase, and then compare the outcomes between the two conditions?

      For the M-to-G1 transition, Hildebrand et al. (2024) have already reported such experiments. They used a topo II inhibitor to provide evidence that mitotic chromatids are self-entangled and that the removal of these mitotic entanglements is required to establish a normal interphase nucleus. Our own preliminary experiments (not presented in the current manuscript) showed that ICRF treatment of cells undergoing the M-to-G1 transition did not affect post-mitotic centromere dispersion. The same treatment also had little effect on the suppression of centromere dispersion observed in condensin II-depleted cells.

      Under G2-arrested conditions, because chromosome territories are largely individualized, we would expect topo II inhibition to affect only the extent of sister catenation, which is not the focus of our current study. We anticipate that inhibiting topo II in G2 would have only a marginal effect, if any, on the maintenance of chromosome territories detectable by our current FISH approaches.

      In any case, we consider the suggested experiment to be beyond the scope of the present manuscript, which focuses on the collaborative roles of condensin II and cohesin as revealed by multi-scale FISH analyses.

      Secondly, if the authors' claim of handover is correct then one (not exclusive) possibility is that there is a relationship between condensin II and cohesin loading onto chromatin. There does seem to be a modest co-dependence (e.g. Figs. S4 and S7); could the authors comment on this?

      First of all, we wish to point out the reviewer’s confusion between the G2 experiments and the M-to-G1 experiments. Figs. S4 and S7 concern experiments using G2-arrested cells, not M-to-G1 cells in which a possible handover mechanism is discussed. Based on Fig. 1, in which the extent of depletion in M-to-G1 cells was tested, no evidence of “co-dependence” between H2 depletion and RAD21 depletion was observed.

      That said, as the reviewer correctly points out, we acknowledge the presence of marginal yet statistically significant reductions in the RAD21 signal upon H2 depletion (and vice versa) in G2-arrested cells (Figs. S4 and S7).

      Another control experiment here would be to treat fully WT cells with IAA and test whether non-AID labelled H2 or RAD21 dip in intensity. If they do not, then perhaps there's a causal relationship between condensin II and cohesin levels?

      Following the reviewer’s suggestion, we tested whether IAA treatment causes an unintentional decrease in the H2 or RAD21 signals in G2-arrested cells, and found that this is not the case (see the attached figure below).

      Thus, these data indicate that there is a modest functional interdependence between condensin II and cohesin in G2-arrested cells. For instance, condensin II depletion may modestly destabilize chromatin-bound cohesin (and vice versa). However, we note that these effects are minor and do not affect the overall conclusions of the study. In the revised manuscript, we have described these potentially interesting observations briefly as a note in the corresponding figure legends (Fig. S4).

      I recognise this is something considered in Brunner et al 2025 (JCB), but in their case they depleted SMC4 (so all condensins are lost or at least dismantled). Might bear further investigation.

      Methods:

      Data and methods are described in reasonable detail, and a decent number of replicates/statistical analyses have been performed. Documentation of the cell lines used could be improved. The actual cell line is not mentioned once in the manuscript. Although it is referenced, I'd recommend including the identity of the cell line (HCT116) in the main text when the cells are introduced and also in the relevant supplementary tables. Will make it easier for readers to contextualise the findings.

      We apologize for the omission of important information regarding the parental cell line used in the current study. The information has been added to Materials and Methods as well as the resource table.

      Minor comments:

      Overall the manuscript is well-written and well presented. In the introduction it is suggested that no experiment has established a causal relationship between human condensin II and chromosome territories, but this is not correct: Hoencamp et al 2021 (Cell) observed loss of CTs after condensin II depletion. Although that manuscript did not investigate it in as much detail as the present study, the fundamental relationship was previously established, so I would encourage the authors to revise this statement.

      We are somewhat puzzled by this comment. In the original manuscript, we explicitly cited Hoencamp et al (2021) in support of the following sentences:

      (Lines 78-83 in the original manuscript)

      Moreover, high-throughput chromosome conformation capture (Hi-C) analysis revealed that, under such conditions, chromosomes retain a parallel arrangement of their arms, reminiscent of the so-called Rabl configuration (Hoencamp et al., 2021). These findings indicate that the loss or impairment of condensin II during mitosis results in defects in post-mitotic chromosome organization.

      That said, to make the sentences even more precise, we have made the following revision in the manuscript.

      (Lines 78-82 in the revised manuscript)

      Moreover, high-throughput chromosome conformation capture (Hi-C) analysis revealed that, under such conditions, chromosomes retain a parallel arrangement of their arms, reminiscent of the so-called Rabl configuration (Hoencamp et al., 2021). These findings, together with cytological analyses of centromere distributions, indicate that the loss or impairment of condensin II during mitosis results in defects in post-mitotic chromosome organization.

      The following statement was intended to explain our current understanding of the maintenance of chromosome territories. Because Hoencamp et al (2021) did not address the maintenance of CTs, we have kept this sentence unchanged.

      (Lines 100-102 in the original manuscript)

      Despite these findings, there is currently no evidence that either condensin II, cohesin, or their combined action contributes to the maintenance of CT morphology in mammalian interphase cells (Cremer et al., 2020).

      Reviewer #2 (Significance (Required)):

      General assessment:

      Strengths: the multiscale investigation of genome architecture at different stages of interphase allow the authors to present convincing and well-analysed data that provide meaningful insight into local and global chromosome organisation across different scales.

      Limitations:

      As suggested in major comments.

      Advance:

      Although the role of condensin II in generating chromosome territories, and the roles of cohesin in interphase genome architecture are established, the interplay of the complexes and the stage specific roles of condensin II have not been investigated in human cells to the level presented here. This study provides meaningful new insight in particular into the role of condensin II in global genome organisation during interphase, which is much less well understood compared to its participation in mitosis.

      Audience:

      Will contribute meaningfully and be of interest to the general community of researchers investigating genome organisation and function at all stages of the cell cycle. Primary audience will be cell biologists, geneticists and structural biochemists. Importance of genome organisation in cell/organismal biology is such that within this grouping it will probably be of general interest.

      My expertise is in genome organization by SMCs and chromosome segregation.

      We appreciate the reviewer’s supportive comments. As the reviewer fully acknowledges, this study is the first systematic survey of the collaborative role of condensin II and cohesin in establishing and maintaining interphase chromosome territories. In particular, multi-scale FISH analyses have enabled us to clarify how the two SMC protein complexes contribute to the maintenance of G2 chromosome territories through their actions at different genomic scales. As the reviewer notes, we believe that the current study will appeal to a broad readership in cell and chromosome biology. The limitations of the current study mentioned by the reviewer are addressed in our reply above.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript “Condensin II collaborates with cohesin to establish and maintain interphase chromosome territories" investigates how condensin II and cohesin contribute to chromosome organization during the M-to-G1 transition and in G2 phase using published auxin-inducible degron (AID) cell lines which render the respective protein complexes nonfunctional after auxin addition. In this study, a novel degron cell line was established that enables the simultaneous depletion of both protein complexes, thereby facilitating the investigation of synergistic effects between the two SMC proteins. The chromosome architecture is studied using fluorescence in situ hybridization (FISH) and light microscopy. The authors reproduce a number of already published data and also show that double depletion causes during the M-to-G1 transition defects on chromosome territories, producing expanded, irregular shapes that obscure condensin II-specific phenotypes. Findings in G2 cells point to a new role of condensin II for chromosome conformation at a scale of ~20Mb. Although individual depletion has minimal effects on large-scale CT morphology in G2, combined loss of both complexes produces marked structural abnormalities, including irregular crescent-shaped CTs displaced toward the nucleolus and increased nucleolus-CT contact. The authors propose that condensin II and cohesin act sequentially and complementarily to ensure proper post-mitotic CT formation and maintain chromosome architecture across genomic scales.

      We greatly appreciate the reviewer’s supportive comments. The reviewer has accurately recognized our new findings concerning the collaborative roles of condensin II and cohesin in the establishment and maintenance of interphase chromosome territories.

      Concerns about statistics:

      • The authors provide the information on how many cells are analyzed but not the number of independent experiments. My concern is that there might be variations in synchronization of the cell population and in the subsequent preparation (FISH) affecting the final result.

      We appreciate the reviewer’s important comment regarding the biological reproducibility of our experiments. As the reviewer correctly points out, variations in cell-cycle synchronization and FISH sample preparation can occur across experiments. To address this concern, we repeated the key experiments supporting our main conclusions (Figs. 3 and 6) two additional times, resulting in three independent biological replicates in total. All replicate experiments reproduced the major observations from the original analyses. These results further substantiated our original conclusion, despite the inevitable variability arising from cell synchronization or sample preparation in this type of experiment. In the revised manuscript, we have now explicitly indicated the number of biological replicates in the corresponding figures.

      The analyses of chromosome-arm conformation shown in Fig. 5 were already performed in three independent rounds of experiments, as noted in the original submission. In addition, similar results were already obtained in other analyses reported in the manuscript. For example, centromere dispersion was quantified using an alternative centromere detection method (related to Fig. 1), and distances between specific chromosomal sites were measured using different locus-specific probes (related to Figs. 2 and 4). In both cases, the results were consistent with those presented in the manuscript.

      • Statistically the authors analyze the effect of cells with induced degron vs. vehicle control (non-induced). However, the biologically relevant question is whether the data differ between cell lines when the degron system is induced. This is not tested here (cf. major concern 2 and 3).

      See our reply to major concerns 2 and 3.

      • Some journals ask for blinded analysis of the data, which might make sense here as manual steps are involved in the data analysis (e.g., lines 626/627: the convex hull of the signals was manually delineated; lines 635/636: chromosome segmentation in FISH images was performed using individual thresholding). However, personally I have no doubts about the correctness of the work.

      We thank the reviewer for pointing out that some steps in our data analysis were performed manually, such as delineating the convex hull of signals and segmenting chromosomes in FISH and IF images using individual thresholds. These manual steps were necessary because signal intensities vary among cells and chromosomes, making fully automated segmentation unreliable. To ensure objectivity, we confirmed that the results were consistent across two independently established double-depletion cell lines, which produced essentially identical findings. In addition, we repeated the key experiments underpinning our main conclusions (Figs. 3 and 6) two additional times, and the results were fully consistent with the original analyses. Therefore, we are confident that our current data analysis approach does not compromise the validity of our conclusions. Finally, we appreciate the reviewer’s kind remark that there is no doubt regarding the correctness of our work.

      Major concerns:

      • Degron induction appears to delay the transition from M to G1 in Rad21-AID#1 and Double-AID#1 cells, as shown in Fig. S1. After auxin treatment, more cells exhibit a G2 phenotype than in an untreated population. What are the implications of this for the interpretation of the experiments?

      In our protocol shown in Fig. 1C, cells were released into mitosis after G2 arrest, and IAA was added 30 min after release. It is well established that cohesin depletion causes a prometaphase delay due to spindle checkpoint activation (e.g., Vass et al, 2003, Curr Biol; Toyoda and Yanagida, 2006, MBoC; Peters et al, 2008, Genes Dev), which explains why cells with 4C DNA content accumulated, as judged by FACS (Fig. S1). The same was true for doubly depleted cells. However, a fraction of cells that escaped this delay progressed through mitosis and entered the G1 phase of the next cell cycle. We selected these early G1 cells and used them for downstream analyses. This experimental procedure was explicitly described in the legends of Fig. 1C and Fig. S1A as follows:

      (Lines 934-937; Legend of Fig. 1C)

      From the synchronized populations, early G1 cells were selected based on their characteristic morphologies (i.e., pairs of small post-mitotic cells) and subjected to downstream analyses. Based on the measured nuclear sizes (Fig. S2 G), we confirmed that early G1 cells were appropriately selected.

      (Lines 1114-1119; Legend of Fig. S1A)

      In this protocol, ~60% of control and H2-depleted cells, and ~30% of Rad21-depleted and co-depleted cells, were successfully synchronized in G1 phase. The apparently lower synchronization efficiency in the latter two groups is attributable to the well documented mitotic delay caused by cohesin depletion (Hauf et al., 2005; Haarhuis et al., 2013; Perea-Resa et al., 2020). From these synchronized populations, early G1 cells were selected based on their characteristic morphologies (see the legend of Fig. 1 C).

      Thus, using this protocol, we analyzed an early G1 cell population that had completed mitosis without chromosome segregation defects. We acknowledge that this represents a technically challenging aspect of synchronizing cell-cycle progression from M to G1 in HCT116 cells, whose synchronization efficiency is limited compared with that of HeLa cells. Nevertheless, this approach constitutes the most practical strategy currently available.

      • Line 178 "In contrast, cohesin depletion had a smaller effect on the distance between the two site-specific probes compared to condensin II depletion (Fig. 2, C and E)." The data in Fig. 2 E show both a significant effect of H2 and a significant effect of RAD21 depletion. Whether the absolute difference in effect size between the two conditions is truly relevant is difficult to determine, as the distribution of the respective control groups also appears to be different. This comment is well taken. Reviewer #1 has made a comment on the same issue. See our reply to Reviewer #1 (Other points, Figure 2E).

      In brief, in the current study, we should focus on the differences between -IAA and +IAA within each cell line, rather than comparing the -IAA conditions across different cell lines. In this sense, a sentence in the original manuscript (lines 178-180) was misleading. In the revised manuscript, we have modified the corresponding and subsequent sentence as follows:

      Although cohesin depletion had a marginal effect on the distance between the two site-specific probes (Fig.2, C and E), double depletion did not result in a significant change (Fig.2, D and E), consistent with the partial restoration of centromere dispersion (Fig. 1G).

      • In Figures 3, S3 and related text in the manuscript I cannot follow the authors' argumentation, as H2 depletion alone leads to a significant increase in the CT area (Chr. 18, Chr. 19, Chr. 15). Similar to Fig. 2, the authors argue about the different magnitude of the effect (H2 depletion vs double depletion). Here, too, appropriate statistical tests or more suitable parameters describing the effect should be used. I also cannot fully follow the argumentation regarding chromosome elongation, as double depletion in Chr. 18 and Chr. 19 also leads to a significantly reduced circularity. Therefore, the schematic drawing Fig. 3 H (double depletion) seems very suggestive to me. This comment is related to the comment above (Major comment #2). See our reply to Reviewer #1 (Other points, Figure 2E).

      It should be noted that, in Figure 3 (unlike in Figure 2), we did not compare the different magnitudes of the effect observed between H2 depletion and double depletion. Thus, the reviewer’s comment that “Similar to Fig. 2, the authors argue about the different magnitude of the effect (H2 depletion vs double depletion)” does not accurately reflect our description.

      Moreover, while the distance between two specific loci (Fig. 2E) and CT circularity (Fig. 3G) are intuitively related, they represent distinct parameters. It is therefore not unexpected that double depletion resulted in apparently different outcomes for the two measurements, and the reviewer’s counter-argument is not strictly applicable here.

      That said, we agree with the reviewer that our descriptions here need to be clarified.

      The differences between H2 depletion and double depletion are two-fold: (1) centromere dispersion is suppressed upon H2 depletion, but not upon double depletion (Fig 1G); (2) the distance between Cen 12 and 12q15 increased upon H2 depletion, but not upon double depletion (Fig 2E).

      We have decided to remove the “homologous pair overlap” panel (formerly Fig. 3E) from the revised manuscript. Accordingly, the corresponding sentence has been deleted from the main text. Instead, we have added a new panel of “aspect ratio”, defined as the ratio of the major to the minor axis (new Fig. 3F). While this intuitive parameter was altered upon condensin II depletion and double depletion, again, we acknowledge that it is not sufficient to convincingly distinguish between the elongated and cloud-like phenotypes proposed in the original manuscript. For these reasons, in the revised manuscript, we have toned down our statements regarding the differences in CT morphology between the two conditions. Nonetheless, together with the data from Figs. 1 and 2, it is clear that the Rabl configuration observed upon condensin II depletion is further exacerbated in the absence of cohesin. Accordingly, we have modified the main text and the cartoon (Fig 3H) to more accurately depict the observations summarized above.
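To make the two shape parameters mentioned above concrete, the following is a minimal sketch of how an aspect ratio (major to minor axis) and a circularity value could be computed from a manually delineated outline. It is not the authors' analysis code. The outline is assumed to be a list of (x, y) vertices; circularity is taken as the common definition 4πA/P², and the axis ratio is estimated from the second moments of the vertices, which is an approximation that depends on how the outline is sampled. All function names are hypothetical.

```python
import math


def polygon_area_perimeter(pts):
    """Shoelace area and perimeter of a closed polygon given as (x, y) vertices."""
    area2 = 0.0
    perim = 0.0
    n = len(pts)
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]  # wrap around to close the outline
        area2 += x0 * y1 - x1 * y0
        perim += math.hypot(x1 - x0, y1 - y0)
    return abs(area2) / 2.0, perim


def circularity(pts):
    """4*pi*A / P^2: equals 1.0 for a circle and decreases for elongated or irregular shapes."""
    area, perim = polygon_area_perimeter(pts)
    return 4.0 * math.pi * area / perim ** 2


def aspect_ratio(pts):
    """Major/minor axis ratio from the covariance of the vertices (moment-based estimate)."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n
    # Eigenvalues of the 2x2 covariance matrix in closed form.
    tr = sxx + syy
    det = sxx * syy - sxy * sxy
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    lam_major = (tr + disc) / 2.0
    lam_minor = (tr - disc) / 2.0
    if lam_minor <= 0.0:
        raise ValueError("degenerate outline: all vertices are collinear")
    return math.sqrt(lam_major / lam_minor)


if __name__ == "__main__":
    rect = [(-2.0, -1.0), (2.0, -1.0), (2.0, 1.0), (-2.0, 1.0)]
    print(aspect_ratio(rect), circularity(rect))
```

The two values respond to different features: the aspect ratio tracks elongation only, whereas circularity also drops when the outline becomes irregular, so the two parameters need not change in parallel.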

      • Fig. 5 and accompanying text. I agree with the authors that this is a significant and very interesting effect. However, I believe the sharp bends are in most cases an artifact caused by the maximum intensity projection. I tried to illustrate this effect in two photographs: Reviewer Fig. 1, side view, and Reviewer Fig. 2, same situation top view (https://cloud.bio.lmu.de/index.php/s/77npeEK84towzJZ). As I said, in my opinion, there is a significant and important effect; the authors should simply adjust the description. This comment is well taken. We appreciate the reviewer’s effort to help clarify our original observations. We have therefore added a new section entitled “Limitations of the study” to explicitly describe the constraints of our current approach. That said, as the reviewer also acknowledges, our observations remain valid because all experiments were performed with appropriate controls.
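The geometric point behind the reviewer's concern can be illustrated with a small sketch: a fiber that bends only gently in 3D can look sharply bent once the z-coordinate is discarded. The sketch treats a projection along z as simply dropping the z-component of each arm of the fiber, which is an idealisation of a maximum intensity projection for a thin structure. The vectors are invented for illustration and are not taken from the manuscript.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def project_xy(v):
    """Idealised projection along the optical axis: keep x and y, drop z."""
    return (v[0], v[1])


if __name__ == "__main__":
    # Two arms of a fiber leaving a common vertex; both run mostly along z.
    arm_a = (1.0, 0.3, 3.0)
    arm_b = (1.0, -0.3, -3.0)
    print("angle in 3D:        ", angle_deg(arm_a, arm_b))
    print("angle in projection:", angle_deg(project_xy(arm_a), project_xy(arm_b)))
```

With these example vectors the angle between the arms is roughly 143° in 3D (a gentle bend) but roughly 33° in the projection (an apparent hairpin), which is the kind of discrepancy the reviewer describes.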

      Minor concerns:

      • I would like to suggest proactively discussing possible artifacts that may arise from the harsh conditions during FISH sample preparation. We fully agree with the reviewer’s concerns. For FISH sample preparation, we used relatively harsh conditions, including (1) fixation under a hypotonic condition (0.3x PBS), (2) HCl treatment, and (3) a denaturation step. We recognize that these procedures inevitably affect the preservation of the original structure; however, they are unavoidable in the standard FISH protocol. We also acknowledge that our analyses were limited to 2D structures based on projected images, rather than full 3D reconstructions. These technical limitations are now explicitly described in a new section entitled “Limitations of the study”, and the technical details are provided in Materials and Methods.

      • It would be helpful if the authors could provide the original data (microscopic image stacks) for download. We thank the reviewer for this suggestion and understand that providing the original image stacks could be of interest to readers. We agree that if the nuclei were perfectly spherical, as is the case for example in lymphocytes, 3D image stacks would contain much more information than 2D projections. However, as is typical for adherent cultured cells, including the HCT116-derived cells used in this study, the nuclei are flattened due to cell adhesion to the culture dish, with a thickness of only about one-tenth of the nuclear diameter (10–20 μm). Considering also the inevitable loss of structural preservation during FISH sample preparation, we were concerned that presenting 3D images might confuse rather than clarify. We therefore believe that representing the data as 2D projections, while explicitly acknowledging the technical limitations, provides the clearest and most interpretable presentation of our results. These limitations are now described in a new section of the manuscript.

      • The authors use a blind deconvolution algorithm to improve image quality. It might be helpful to test other methods for this purpose (optional). We thank the reviewer for this valuable suggestion and fully agree that it is a valid point. We recognize that alternative image enhancement methods can offer advantages, particularly for smaller structures or when multiple probes are analyzed simultaneously. In our study, however, the focus was on detecting whole chromosome territories (CTs) and specific chromosomal loci, which can be visualized clearly with our current FISH protocol combined with blind deconvolution. We therefore believe that the image quality we obtained is sufficient to support the conclusions of this manuscript.

      Reviewer #3 (Significance (Required)):

      Advance:

      Ono et al. address the important question of how the complex pattern of chromatin is reestablished after mitosis and maintained during interphase. In addition to affinity interactions (1,2), it is known that cohesin plays an important role in the formation and maintenance of chromosome organization in interphase (3). However, current knowledge does not explain all known phenomena. Even with complete loss of cohesin, TAD-like structures can be recognized at the single-cell level (4), and higher structures such as chromosome territories are also retained (5). The function of condensin II during mitosis is another important factor that affects chromosome architecture in the following G1 phase (6). Although condensin II is present in the cell nucleus throughout interphase, very little is known about the role of this protein in this phase of the cell cycle. This is where the present publication comes in, with a new double degron cell line in which essential subunits of cohesin AND condensin can be degraded in a targeted manner. I find the data from the experiments in the G2 phase most interesting, as they suggest a previously unknown involvement of condensin II in the maintenance of larger chromatin structures such as chromosome territories.

      The experiments regarding the M-G1 transition are less interesting to me, as it is known that condensin II deficiency in mitosis leads to elongated chromosomes (Rabl configuration)(6), and therefore the double degradation of condensin II and cohesin describes the effects of cohesin on an artificially disturbed chromosome structure.

      For further clarification, we provide below a table summarizing previous studies relevant to the present work. We wish to emphasize three novel aspects of the present study. First, newly established cell lines designed for double depletion enabled us to address questions that had remained inaccessible in earlier studies. Second, to our knowledge, no study has previously reported condensin II depletion, cohesin depletion and double depletion in G2-arrested cells. Third, the present study represents the first systematic comparison of two different stages of the cell cycle using multiscale FISH under distinct depletion conditions. Although the M-to-G1 part of the present study partially overlaps with previous work, it serves as an important prelude to the subsequent investigations. We are confident that the reviewer will also acknowledge this point.

      | cell cycle | cond II depletion | cohesin depletion | double depletion |
      |---|---|---|---|
      | M-to-G1 | Hoencamp et al (2021); Abramo et al (2019); Brunner et al (2025); this study | Schwarzer et al (2017); Wutz et al (2017); this study | this study |
      | G2 | this study | this study | this study |

      Hoencamp et al (2021): Hi-C and imaging (CENP-A distribution)

      Abramo et al (2019): Hi-C and imaging

      Brunner et al (2025): mostly imaging (chromatin tracing)

      Schwarzer et al (2017); Wutz et al (2017): Hi-C

      this study: imaging (multi-scale FISH)

      General limitations:

      (1) Single cell imaging of chromatin structure typically shows only minor effects which are often obscured by the high (biological) variability. This holds also true for the current manuscript (cf. major concern 2 and 3).

      See our reply above.

      (2) A common concern are artefacts introduced by the harsh conditions of conventional FISH protocols (7). The authors use a method in which the cells are completely dehydrated, which probably leads to shrinking artifacts. However, differences between samples stained using the same FISH protocol are most likely due to experimental variation and not an artefact (cf. minor concern 1).

      See our reply above.

      • The anisotropic optical resolution (x-, y- vs. z-) of widefield microscopy (and most other light microscopic techniques) might lead to misinterpretation of the imaged 3D structures. This seems to be the case in the current study (cf. major concern 4). See our reply above.

      • In the present study, the cell cycle was synchronized. This requires the use of inhibitors such as the CDK1 inhibitor RO-3306. However, CDK1 has many very different functions (8), so unexpected effects on the experiments cannot be ruled out. The current approaches involving FISH inevitably require cell cycle synchronization. We believe that the use of the CDK1 inhibitor RO-3306 to arrest the cell cycle at G2 is a reasonable choice, although we cannot rule out unexpected effects arising from the use of the drug. This issue has now been addressed in the new section entitled “Limitations of the study”.

      Audience:

      The spatial arrangement of genomic elements in the nucleus and their (temporal) dynamics are of high general relevance, as they are important for answering fundamental questions, for example, in epigenetics or tumor biology (9,10). The manuscript from Ono et al. addresses specific questions, so its intended readership is more likely to be specialists in the field.

      We are confident that, given the increasing interest in the 3D genome and its role in regulating diverse biological functions, the current manuscript will attract the broad readership of leading journals in cell biology.

      About the reviewer:

      By training I'm a biologist with strong background in fluorescence microscopy and fluorescence in situ hybridization. In recent years, I have been involved in research on the 3D organization of the cell nucleus, chromatin organization, and promoter-enhancer interactions.

      We greatly appreciate the reviewer’s constructive comments on both the technical strengths and limitations of our fluorescence imaging approaches, which have been very helpful in revising the manuscript. As mentioned above, we have decided to add a special paragraph entitled “Limitations of the study” at the end of the Discussion section to discuss these issues.

      All questions regarding the statistics of angularly distributed data are beyond my expertise. The authors do not correct their statistical analyses for "multiple testing". Whether this is necessary, I cannot judge.

      We thank the reviewer for raising this important point. In our study, the primary comparisons were made between -IAA and +IAA conditions within the same cell line. Accordingly, the figures report P-values for these pairwise comparisons.

      For the distance measurements, statistical evaluations were performed in PRISM using ANOVA (Kruskal–Wallis test), and the P-values shown in the figures are based on these analyses (Fig. 1, G and H; Fig. 2 E; Fig. 3 F and G; Fig. 4 F; Fig. 6 F [right]–H; Fig. S2 B and G; Fig. S3 D and H; Fig. S5 A [right] and B [right]; Fig. S8 B). While the manuscript focuses on pairwise comparisons between -IAA and +IAA conditions within the same cell line, we also considered potential differences across cell lines as part of the same ANOVA framework, thereby ensuring that multiple testing was properly addressed. Because cell line differences are not the focus of the present study, the corresponding results are not shown.
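For readers unfamiliar with the Kruskal–Wallis test named above, the following minimal sketch computes its H statistic for several groups of measurements. It is an illustration only and not the PRISM workflow used in the manuscript: it omits the tie correction and the conversion of H to a P-value (which uses a chi-square distribution with k − 1 degrees of freedom for k groups), and the function names and example numbers are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


if __name__ == "__main__":
    # Invented distance measurements for three conditions.
    print(kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]))
```

Because the test ranks all groups jointly, a single H value summarises the whole set of conditions; pairwise P-values then come from a post-hoc procedure, which is where a correction for multiple comparisons would enter.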

      For the angular distribution analyses, we compared -IAA and +IAA conditions within the same cell line using the Mardia–Watson–Wheeler test; these analyses do not involve multiple testing (circular scatter plots; Fig. 5 C–E and Fig. S6 B, C, and E–H). In addition, to determine whether angular distributions exhibited directional bias under each condition, we applied the Rayleigh test to each dataset individually (Fig. 5 F and Fig. S6 I). As these tests were performed on a single condition, they are also not subject to the problem of multiple testing. Collectively, we consider that the statistical analyses presented in our manuscript appropriately account for potential multiple testing issues, and we remain confident in the robustness of the results.
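As a sketch of what the Rayleigh test measures, the code below computes the mean resultant length and the Rayleigh statistic Z = n·R̄² for a set of angles, together with the first-order large-sample approximation P ≈ exp(−Z). This is not the authors' implementation; statistical packages typically apply finite-sample corrections to the P-value, and the function name is hypothetical.

```python
import math


def rayleigh_test(angles_rad):
    """Return (mean resultant length, Z statistic, approximate P-value).

    The P-value uses only the leading term exp(-Z), which is a large-sample
    approximation for the null hypothesis of a uniform circular distribution.
    """
    n = len(angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    s = sum(math.sin(a) for a in angles_rad)
    r_bar = math.hypot(c, s) / n  # 0 = no preferred direction, 1 = all angles identical
    z = n * r_bar ** 2
    return r_bar, z, math.exp(-z)


if __name__ == "__main__":
    evenly_spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
    clustered = [0.1] * 10
    print(rayleigh_test(evenly_spread))
    print(rayleigh_test(clustered))
```

Each call evaluates one dataset against the uniform null, which is why the test is applied to each condition individually rather than to a comparison between conditions.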

      Literature

      1. Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg, G., Solovei, I. et al. (2019) Heterochromatin drives compartmentalization of inverted and conventional nuclei. Nature, 570, 395-399.
      2. Mirny, L.A., Imakaev, M. and Abdennur, N. (2019) Two major mechanisms of chromosome organization. Curr Opin Cell Biol, 58, 142-152.
      3. Rao, S.S.P., Huang, S.C., Glenn St Hilaire, B., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, S.E., Bascom, G.D., Bochkov, I.D. et al. (2017) Cohesin Loss Eliminates All Loop Domains. Cell, 171, 305-320 e324.
      4. Bintu, B., Mateo, L.J., Su, J.H., Sinnott-Armstrong, N.A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A.N. and Zhuang, X. (2018) Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science, 362.
      5. Cremer, M., Brandstetter, K., Maiser, A., Rao, S.S.P., Schmid, V.J., Guirao-Ortiz, M., Mitra, N., Mamberti, S., Klein, K.N., Gilbert, D.M. et al. (2020) Cohesin depleted cells rebuild functional nuclear compartments after endomitosis. Nat Commun, 11, 6146.
      6. Hoencamp, C., Dudchenko, O., Elbatsh, A.M.O., Brahmachari, S., Raaijmakers, J.A., van Schaik, T., Sedeno Cacciatore, A., Contessoto, V.G., van Heesbeen, R., van den Broek, B. et al. (2021) 3D genomics across the tree of life reveals condensin II as a determinant of architecture type. Science, 372, 984-989.
      7. Beckwith, K.S., Ødegård-Fougner, Ø., Morero, N.R., Barton, C., Schueder, F., Tang, W., Alexander, S., Peters, J.-M., Jungmann, R., Birney, E. et al. (2023) Nanoscale 3D DNA tracing in single human cells visualizes loop extrusion directly in situ. BioRxiv https://doi.org/10.1101/2021.04.12.439407.
      8. Massacci, G., Perfetto, L. and Sacco, F. (2023) The Cyclin-dependent kinase 1: more than a cell cycle regulator. Br J Cancer, 129, 1707-1716.
      9. Bonev, B. and Cavalli, G. (2016) Organization and function of the 3D genome. Nat Rev Genet, 17, 661-678.
      10. Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O'Shea, C.C., Park, P.J., Ren, B. et al. (2017) The 4D nucleome project. Nature, 549, 219-226.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript "Condensin II collaborates with cohesin to establish and maintain interphase chromosome territories" investigates how condensin II and cohesin contribute to chromosome organization during the M-to-G1 transition and in G2 phase using published auxin-inducible degron (AID) cell lines which render the respective protein complexes nonfunctional after auxin addition. In this study, a novel degron cell line was established that enables the simultaneous depletion of both protein complexes, thereby facilitating the investigation of synergistic effects between the two SMC proteins. The chromosome architecture is studied using fluorescence in situ hybridization (FISH) and light microscopy. The authors reproduce a number of already published data and also show that double depletion causes defects in chromosome territories during the M-to-G1 transition, producing expanded, irregular shapes that obscure condensin II-specific phenotypes. Findings in G2 cells point to a new role of condensin II for chromosome conformation at a scale of ~20Mb. Although individual depletion has minimal effects on large-scale CT morphology in G2, combined loss of both complexes produces marked structural abnormalities, including irregular crescent-shaped CTs displaced toward the nucleolus and increased nucleolus-CT contact. The authors propose that condensin II and cohesin act sequentially and complementarily to ensure proper post-mitotic CT formation and maintain chromosome architecture across genomic scales.

      Concerns about statistics:

      (1) The authors provide the information on how many cells are analyzed but not the number of independent experiments. My concern is that there might be variations in synchronization of the cell population and in the subsequent preparation (FISH) affecting the final result.

      (2) Statistically the authors analyze the effect of cells with induced degron vs. vehicle control (non-induced). However, the biologically relevant question is whether the data differ between cell lines when the degron system is induced. This is not tested here (cf. major concern 2 and 3).

      (3) Some journals ask for blinded analysis of the data, which might make sense here as manual steps are involved in the data analysis (e.g. line 626/627: the convex hull of the signals was manually delineated; line 635/636: chromosome segmentation in FISH images was performed using individual thresholding). However, personally I have no doubts about the correctness of the work.

      Major concerns:

      (1) Degron induction appears to delay the transition from M to G1 in Rad21-AID#1 and Double-AID#1 cells, as shown in Fig. S1. After auxin treatment, more cells exhibit a G2 phenotype than in an untreated population. What are the implications of this for the interpretation of the experiments?

      (2) Line 178 "In contrast, cohesin depletion had a smaller effect on the distance between the two site-specific probes compared to condensin II depletion (Fig. 2, C and E)." The data in Fig. 2 E show both a significant effect of H2 and a significant effect of RAD21 depletion. Whether the absolute difference in effect size between the two conditions is truly relevant is difficult to determine, as the distribution of the respective control groups also appears to be different.

      (3) In Figures 3, S3 and related text in the manuscript I cannot follow the authors' argumentation, as H2 depletion alone leads to a significant increase in the CT area (Chr. 18, Chr. 19, Chr. 15). Similar to Fig. 2, the authors argue about the different magnitude of the effect (H2 depletion vs double depletion). Here, too, appropriate statistical tests or more suitable parameters describing the effect should be used. I also cannot fully follow the argumentation regarding chromosome elongation, as double depletion in Chr. 18 and Chr. 19 also leads to a significantly reduced circularity. Therefore, the schematic drawing Fig. 3 H (double depletion) seems very suggestive to me.

      (4) Fig. 5 and accompanying text. I agree with the authors that this is a significant and very interesting effect. However, I believe the sharp bends are in most cases an artifact caused by the maximum intensity projection. I tried to illustrate this effect in two photographs: Reviewer Fig. 1, side view, and Reviewer Fig. 2, same situation top view (https://cloud.bio.lmu.de/index.php/s/77npeEK84towzJZ). As I said, in my opinion, there is a significant and important effect; the authors should simply adjust the description.

      Minor concerns:

      (1) I would like to suggest proactively discussing possible artifacts that may arise from the harsh conditions during FISH sample preparation.

      (2) It would be helpful if the authors could provide the original data (microscopic image stacks) for download.

      (3) The authors use a blind deconvolution algorithm to improve image quality. It might be helpful to test other methods for this purpose (optional).

      Significance

      Advance:

      Ono et al. address the important question of how the complex pattern of chromatin is reestablished after mitosis and maintained during interphase. In addition to affinity interactions (1,2), it is known that cohesin plays an important role in the formation and maintenance of chromosome organization in interphase (3). However, current knowledge does not explain all known phenomena. Even with complete loss of cohesin, TAD-like structures can be recognized at the single-cell level (4), and higher structures such as chromosome territories are also retained (5). The function of condensin II during mitosis is another important factor that affects chromosome architecture in the following G1 phase (6). Although condensin II is present in the cell nucleus throughout interphase, very little is known about the role of this protein in this phase of the cell cycle. This is where the present publication comes in, with a new double degron cell line in which essential subunits of cohesin AND condensin can be degraded in a targeted manner. I find the data from the experiments in the G2 phase most interesting, as they suggest a previously unknown involvement of condensin II in the maintenance of larger chromatin structures such as chromosome territories. The experiments regarding the M-G1 transition are less interesting to me, as it is known that condensin II deficiency in mitosis leads to elongated chromosomes (Rabl configuration)(6), and therefore the double degradation of condensin II and cohesin describes the effects of cohesin on an artificially disturbed chromosome structure.

      General limitations:

      (1) Single cell imaging of chromatin structure typically shows only minor effects which are often obscured by the high (biological) variability. This holds also true for the current manuscript (cf. major concern 2 and 3).

      (2) A common concern are artefacts introduced by the harsh conditions of conventional FISH protocols (7). The authors use a method in which the cells are completely dehydrated, which probably leads to shrinking artifacts. However, differences between samples stained using the same FISH protocol are most likely due to experimental variation and not an artefact (cf. minor concern 1).

      (3) The anisotropic optical resolution (x-, y- vs. z-) of widefield microscopy (and most other light microscopic techniques) might lead to misinterpretation of the imaged 3D structures. This seems to be the case in the current study (cf. major concern 4).

      (4) In the present study, the cell cycle was synchronized. This requires the use of inhibitors such as the CDK1 inhibitor RO-3306. However, CDK1 has many very different functions (8), so unexpected effects on the experiments cannot be ruled out.

      Audience:

      The spatial arrangement of genomic elements in the nucleus and their (temporal) dynamics are of high general relevance, as they are important for answering fundamental questions, for example, in epigenetics or tumor biology (9,10). The manuscript from Ono et al. addresses specific questions, so its intended readership is more likely to be specialists in the field.

      About the reviewer: By training I'm a biologist with strong background in fluorescence microscopy and fluorescence in situ hybridization. In recent years, I have been involved in research on the 3D organization of the cell nucleus, chromatin organization, and promoter-enhancer interactions.

      All questions regarding the statistics of angularly distributed data are beyond my expertise. The authors do not correct their statistical analyses for "multiple testing". Whether this is necessary, I cannot judge.

      Literature

      1. Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg, G., Solovei, I. et al. (2019) Heterochromatin drives compartmentalization of inverted and conventional nuclei. Nature, 570, 395-399.
      2. Mirny, L.A., Imakaev, M. and Abdennur, N. (2019) Two major mechanisms of chromosome organization. Curr Opin Cell Biol, 58, 142-152.
      3. Rao, S.S.P., Huang, S.C., Glenn St Hilaire, B., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, S.E., Bascom, G.D., Bochkov, I.D. et al. (2017) Cohesin Loss Eliminates All Loop Domains. Cell, 171, 305-320 e324.
      4. Bintu, B., Mateo, L.J., Su, J.H., Sinnott-Armstrong, N.A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A.N. and Zhuang, X. (2018) Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science, 362.
      5. Cremer, M., Brandstetter, K., Maiser, A., Rao, S.S.P., Schmid, V.J., Guirao-Ortiz, M., Mitra, N., Mamberti, S., Klein, K.N., Gilbert, D.M. et al. (2020) Cohesin depleted cells rebuild functional nuclear compartments after endomitosis. Nat Commun, 11, 6146.
      6. Hoencamp, C., Dudchenko, O., Elbatsh, A.M.O., Brahmachari, S., Raaijmakers, J.A., van Schaik, T., Sedeno Cacciatore, A., Contessoto, V.G., van Heesbeen, R., van den Broek, B. et al. (2021) 3D genomics across the tree of life reveals condensin II as a determinant of architecture type. Science, 372, 984-989.
      7. Beckwith, K.S., Ødegård-Fougner, Ø., Morero, N.R., Barton, C., Schueder, F., Tang, W., Alexander, S., Peters, J.-M., Jungmann, R., Birney, E. et al. (2023) Nanoscale 3D DNA tracing in single human cells visualizes loop extrusion directly in situ. BioRxiv https://doi.org/10.1101/2021.04.12.439407.
      8. Massacci, G., Perfetto, L. and Sacco, F. (2023) The Cyclin-dependent kinase 1: more than a cell cycle regulator. Br J Cancer, 129, 1707-1716.
      9. Bonev, B. and Cavalli, G. (2016) Organization and function of the 3D genome. Nat Rev Genet, 17, 661-678.
      10. Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O'Shea, C.C., Park, P.J., Ren, B. et al. (2017) The 4D nucleome project. Nature, 549, 219-226.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      • Ono et al use a variety of imaging and genetic (AID) depletion approaches to examine the roles of condensin II and cohesin in the reformation of interphase genome architecture in human HCT116 cells. Consistent with previous literature, they find that condensin II is required for CENP-A dispersion in late mitosis/early G1. Using in situ FISH at the centromere/q arm of chromosome 12 they then establish that condensin II removal causes lengthwise elongation of chromosomes that, interestingly, can be suppressed by cohesin removal. To better understand changes in whole-chromosome morphology, they then use whole chromosome painting to examine chromosomes 18 and 19. In the absence of condensin II, cells effectively fail to reorganise their chromosomes from rod-like structures into spherical chromosome territories (which may explain why CENP-A dispersion is suppressed). Cohesin is not required for spherical CT formation, suggesting condensin II is the major initial driver of interphase genome structure. Double depletion results in complete disorganisation of chromatin, leading the authors to conclude that a typical cell cycle requires orderly 'handover' from the mitotic to interphase genome organising machinery.

      • The authors then move on to G2 phase, where they use a variety of different FISH probes to assess alterations in chromosome structure at different scales. They thereby establish that perturbation of cohesin or condensin II influences local and longer range chromosome structure, respectively. The effects of condensin II depletion become apparent at a genomic distance of 20 Mb, but are negligible either below or above. The authors repeat the G1 depletion experiment in G2 and now find that condensin II and cohesin are individually dispensable for CT organisation, but that dual depletion causes CT collapse. This rather implies that there is cooperation rather than handover per se.

      • Overall this study is a broadly informative multiscale investigation of the roles of SMC complexes in organising the genome of postmitotic cells, and solidifies a potential relationship between condensin II and cohesin in coordinating interphase genome structure. The deeper investigation of the roles of condensin II in establishing chromosome territories and intermediate range chromosome structure in particular is a valuable and important contribution, especially given our incomplete understanding of what functions this complex performs during interphase.

      Major comments:

      • In general the claims and conclusions of the manuscript are well supported by multiscale FISH labelling. An important absent control is western blotting to confirm protein depletion levels. Currently only fluorescence is used as a readout for the efficiency of the AID depletion, and we know from prior literature that even small residual quantities of SMC complexes are quite effective in organising chromatin. I would consider a western blot a fairly straightforward and important technical control.

      • I find the point on handover as a mechanism for maintaining CT architecture somewhat ambiguous, because the authors find that the dependence simply switches from condensin II to both condensin II and cohesin, between G1 and G2. To me this implies augmented cooperation rather than handover.

      • I have two further suggestions, both of which I would strongly recommend but would consider 'optional' according to Review Commons guidelines.

      Firstly, the depletions are performed at different stages of the cell cycle but have different outcomes. The authors suggest this is because handover is already complete, but an alternative possibility is that the phenotype is masked by other changes in chromosome structure (e.g. duplication/catenation). I would be very curious to see, for example, how the outcome of this experiment would change if the authors were to repeat the depletions in the presence of a topoisomerase II inhibitor.

      Secondly, if the authors' claim of handover is correct then one (not exclusive) possibility is that there is a relationship between condensin II and cohesin loading onto chromatin. There does seem to be a modest co-dependence (e.g. fig S4 and S7), could the authors comment on this? Another control experiment here would be to treat fully WT cells with IAA and test whether non-AID labelled H2 or RAD21 dip in intensity. If they do not, then perhaps there's a causal relationship between condensin II and cohesin levels?

      • I recognise this is something considered in Brunner et al 2025 (JCB), but in their case they depleted SMC4 (so all condensins are lost or at least dismantled). Might bear further investigation.

      Methods:

      Data and methods are described in reasonable detail, and a decent number of replicates/statistical analyses have been performed. Documentation of the cell lines used could be improved. The actual cell line is not mentioned once in the manuscript. Although it is referenced, I'd recommend including the identity of the cell line (HCT116) in the main text when the cells are introduced and also in the relevant supplementary tables. This will make it easier for readers to contextualise the findings.

      Minor comments:

      Overall the manuscript is well-written and well presented. In the introduction it is suggested that no experiment has established a causal relationship between human condensin II and chromosome territories, but this is not correct, Hoencamp et al 2021 (cell) observed loss of CTs after condensin II depletion. Although that manuscript did not investigate it in as much detail as the present study, the fundamental relationship was previously established, so I would encourage the authors to revise this statement.

      Significance

      General assessment: Strengths: the multiscale investigation of genome architecture at different stages of interphase allows the authors to present convincing and well-analysed data that provide meaningful insight into local and global chromosome organisation across different scales. Limitations: As suggested in major comments.

      Advance: Although the role of condensin II in generating chromosome territories, and the roles of cohesin in interphase genome architecture are established, the interplay of the complexes and the stage specific roles of condensin II have not been investigated in human cells to the level presented here. This study provides meaningful new insight in particular into the role of condensin II in global genome organisation during interphase, which is much less well understood compared to its participation in mitosis.

      Audience: Will contribute meaningfully and be of interest to the general community of researchers investigating genome organisation and function at all stages of the cell cycle. Primary audience will be cell biologists, geneticists and structural biochemists. Importance of genome organisation in cell/organismal biology is such that within this grouping it will probably be of general interest.

      My expertise is in genome organization by SMCs and chromosome segregation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

      Major point:

      They propose a functional 'handover' from condensin II to cohesin, for the organization of CTs at the M-to-G1 transition. However, the 'handover', i.e. difference in timing of executing their functions, was not experimentally substantiated. Ideally, they can deplete condensin II and cohesin at different times to prove the 'handover'. However, this would require the use of two different degron tags and go beyond the revision of this manuscript. At least, based on the literature, the authors should discuss why they think condensin II and cohesin should work at different timings in the CT organization.

      Other points:

      • Figure 2E: It seems that the chromosome length without IAA is shorter in Rad21-aid cells than H2-aid cells or H2-aid Rad21-aid cells. How can this be interpreted?

      • Figure 3: Regarding the CT morphology, could they explain further the difference between 'elongated' and 'cloud-like (expanded)'? Is it possible to quantify the frequency of these morphologies?

      • Figure 5: How did they assign C, P and D3 for two chromosomes? The assignment seems obvious in some cases, but not in other cases (e.g. in the image of H2-AID#2 +IAA, two D3s can be connected to two Ps in the other way). They may have avoided line crossing between two C-P-D3 assignments, but can this be justified when the CT might be disorganized e.g. by condensin II depletion?

      • Figure 6F: The mean is not indicated on the right-hand side graph, in contrast to other similar graphs. Is this an error?

      • Figure S1A: The two FACS profiles for Double-AID #3 Release-2 may be mixed up between -IAA and +IAA.

      • The method section explains that 'circularity' shows 'how closely the shape of an object approximates a perfect circle (with a value of 1 indicating a perfect circle), calculated from the segmented regions'. It would be helpful to provide further methodological details about it.
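
      For reference, one widely used definition of circularity is 4πA/P², where A is the area and P the perimeter of the segmented region. Whether the authors used this exact formula is an assumption; the sketch below only illustrates how such a value behaves for a circle versus a rod-like shape. The function name and the example shapes are hypothetical.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Return 4*pi*A / P**2 (assumed definition, not confirmed by the manuscript).

    The value is 1.0 for a perfect circle and decreases as the shape elongates.
    """
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# Hypothetical shapes: a circle of radius 5 and a 1 x 10 rectangle (rod-like).
r = 5.0
circle_value = circularity(math.pi * r ** 2, 2 * math.pi * r)
rod_value = circularity(area=1.0 * 10.0, perimeter=2 * (1.0 + 10.0))

print(f"circle: {circle_value:.3f}, rod: {rod_value:.3f}")
```

      Stating in the methods which formula and which perimeter estimate were used would make values comparable across segmentation tools, since pixel-based perimeter estimates can differ.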

      Significance

      Ono et al addressed how condensin II and cohesin work to define chromosome territories (CT) in human cells. They used FISH to assess the status of CT. They found that condensin II depletion leads to lengthwise elongation of G1 chromosomes, while double depletion of condensin II and cohesin leads to CT overlap and morphological defects. Although the requirement of condensin II in shortening G1 chromosomes was already shown by Hoencamp et al 2021, the cooperation between condensin II and cohesin in CT regulation is a new finding. They also demonstrated that cohesin and condensin II are involved in G2 chromosome regulation on a smaller and larger scale, respectively. Though such roles in cohesin might be predictable from its roles in organizing TADs, it is a new finding that the two work on a different scale on G2 chromosomes. Overall, this is technically solid work, which reports new findings about how condensin II and cohesin cooperate in organizing G1 and G2 chromosomes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary: In this manuscript, Turner AH. et al. assessed viral replication in cells depleted of the small GTPase Rab11B, a paralogue of Rab11A. It has been reported that Rab11A is responsible for the intracellular transport of viral RNP via recycling endosomes. The authors showed that Rab11B knockdown reduced viral protein expression and viral titer. This may be caused by reduced attachment of viral particles to Rab11B knockdown cells.

      Major comments:

      Comment 1 Fig 2-4: The authors should provide Western blot results with equal amounts of loading control (GAPDH). The bands shown in these figures lack quantifiability and are not reliable as data.

      We have rerun these western blots with more equal loading, and included a second loading control (beta-actin) in addition to the GAPDH. These blots can be seen in new Figures 2 and 3, and the quantification against both GAPDH (Figure 2/3) as well as actin (Fig S2) is now included. We have also included additional biological replicates for Fig 2 B-D. These additional experiments have strengthened our conclusion that Rab11B is required for efficient protein production in cells infected with recent H3N2, but not H1N1, isolates.

      Comment 2 Fig 2-4: Why are the results different between Rab11B knockdown alone and Rab11A/B double knockdown? If the authors claims are correct, the results of Rab11B knockdown should be reproducible in Rab11A/B double knockdown cells.

      Prior literature indicates that the Rab11A and Rab11B isoforms can play opposing roles in the trafficking of some cargos (ie, with one isoform transporting a molecule to the cell surface, while the other isoform takes it off again). In this scenario, it is possible that removing both 'halves' of the trafficking loop can ablate a phenotype. However, since our double knockdown used half the amount of siRNA for each isoform (for the same total amount), it is also possible this observation is simply the result of less efficient knockdown. In order to distinguish between these possibilities we depleted Rab11A or Rab11B individually, with this same 'half dose' of siRNA (see new Figure S3). We observed that Rab11B was still robustly required for H3N2 viral protein production. These results suggest that Rab11A and Rab11B could be playing mutually opposing roles in this case, which is consistent with prior Rab11 literature.

      Comment 3 Fig 6: For better understanding, please provide a schematic illustration of experimental setting.

      We have added a new graphical overview to this figure (see new Figure 6A).

      Comment 4: It is necessary to test other siRNA sequences or perform a rescue experiment by expressing an siRNA-resistant clone in the knockdown cells. There seems to be an activation of host defense system, such as IFN pathways.

      In order to rule out the possibility of off-target effects we created a novel cell line that inducibly expresses a Rab11B shRNA sequence (see new Fig 4). This knockdown strategy used a completely different method (shRNA delivered by lentiviral vector vs transient transfection of siRNA), in a different cellular background (H441 "club like" cells vs A549 lung adenocarcinoma). This new depletion strategy showed that the Rab11B dependent H3N2 protein production phenotype is seen across multiple knockdown strategies and cellular backgrounds.

      **Referees cross-commenting**

      I agree with other reviewers' comments in part.

      Reviewer #1 (Significance (Required)):

      The authors propose a novel role for Rab11B in modulating attachment pathway of H3N2 influenza A virus by unknown mechanism. Although previous studies focus on the function of Rab11A on endocytic transport, the function and specificity of Rab11B has remained less clear. The findings may be of interest to a broad audience, including researchers in cell biology, immunology, and host-pathogen interactions. However, the study remains at a superficial level of analysis and does not lead to a deeper understanding of the underlying mechanisms.

      We agree with the reviewer that a strength of this manuscript is its multi-disciplinary nature, particularly with regard to advances in our understanding of Rab11B function. We have added a significant number of experiments and new figures to bolster the rigor and reproducibility of our findings. We have also added a new figure (Fig 7) that uses reverse genetics to map the Rab11B phenotype to the HA gene of the H3N2 isolate under study. By creating '7+1' reassortant viruses with the H3 HA or the N2 NA on a PR8 (H1N1) background (see Fig 7E-H) we were able to demonstrate that Rab11B is acting specifically on one of the HA-mediated entry steps. This provides additional mechanistic insight, by mapping the Rab11B-phenotype to a step at or prior to fusion. Fundamentally, we believe the novelty and rigor of our observation that recent H3N2 viruses enter through a different route than H1N1 isolates is worthy of observation in this updated form, so that the field can begin follow up studies.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: The authors compare the effect of RAB11A and RAB11B knockdown on replication of contemporary H1N1 and H3N2 influenza A virus strains in A549 cells (human lung epithelial cells). They find a reduction in viral protein expression for tested H3N2 but not for H1N1 isolates. Mechanistically they suggest that RAB11A affects virion attachment to the cell surface.

      Major comments: The provided data do not conclusively support the suggested mechanism of action and essential controls are missing to substantiate the authors' claims: • Knockdown efficacy has to be confirmed at the protein level, showing reduced levels of RAB11A and B by Western blot. This is a standard in the field. Off target effects cannot be avoided by RNAi approaches and are usually ruled out by using multiple siRNAs or by complementing the targeted protein in trans.

      We have verified knockdown efficacy at the protein level in new Fig 1A/B. However, due to the high degree of protein level conservation between Rab11A and Rab11B it is very difficult to develop isoform specific antibodies, and we were unable to obtain a Rab11B-specific antibody that can detect endogenous protein (despite testing 6 commercially available antibodies for specificity). Using an antibody that detects both 11A and 11B (Fig1A) we were able to observe very slight changes in the molecular weight of the Rab11 band(s) detected upon knockdown of 11A vs 11B (suggestive of the two isoforms running as a dimer, with Rab11A the lower band and Rab11B the upper band). Cells depleted of both isoforms simultaneously showed a near complete loss of signal. Using a Rab11A antibody (that we confirmed as specific) we were able to observe loss of the Rab11A signal in both the 11A and 11A+B knockdowns (Fig 1B).

      • Viral titers should be presented as absolute titers not as % (here the labelling is actually misleading in all graphs indicating pfu/ml)

      This data is now shown in new Figure S1, where it is clear that the trends remain consistent across biological replicates. The axis labels of Fig 1D/E and Fig 3A have been corrected as requested to make clear we are normalizing to account for experiment-to-experiment variation in peak titer.

      • Reduction of viral protein expression goes hand in hand with a reduction in GAPDH. While this is accounted for in the quantification a general block of protein expression cannot be ruled out since the stability of house keeper proteins and viral proteins might be different. Testing multiple house keeping proteins could overcome this issue.

      We have included a second loading control (beta-actin) in addition to the GAPDH for new Figure 2 and 3. The quantification of viral protein production compared to beta actin is now included in new Fig S2. We have also included additional biological replicates for Fig 2 B-D. These additional experiments have strengthened our conclusion that Rab11B is required for efficient protein production in cells infected with recent H3N2, but not H1N1, isolates.

      • The FACS data in Fig 5 are not convincing. The previous figures showed modest reduction in viral protein expression and the fluorescence is indicated here on a logarithmic scale. Quantification and indication of mean fluorescence intensity from the same data would be a better readout to convincingly show that less cells are infected.

      We have reanalyzed the existing data to quantify the geometric mean of viral protein expression in the infected cell populations (new Figure 5D, E). This analysis shows no significant difference in geometric mean of HA (Fig 5D) or M2 (Fig 5E) expression between cells treated with NT, 11A or 11B siRNA. This additional analysis strengthens our original conclusion that when Rab11B is knocked down, fewer cells get infected, but those that do produce the same level of viral proteins.
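
      To illustrate why a geometric mean suits log-scaled fluorescence data, here is a minimal sketch using made-up intensity values; it is not the analysis pipeline used for Figure 5, and the function name is hypothetical. The geometric mean is the exponential of the mean of the log-transformed values, so a few very bright cells pull it up far less than they pull up the arithmetic mean.

```python
import math


def geometric_mean(values):
    """Geometric mean: exp of the arithmetic mean of the natural logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up fluorescence intensities (arbitrary units) for two cells.
intensities = [10.0, 1000.0]
geo = geometric_mean(intensities)
arith = sum(intensities) / len(intensities)

print(f"geometric mean: {geo:.1f}, arithmetic mean: {arith:.1f}")
```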

      • During the time of addition experiment in Fig 6, the authors are testing for HA/M2 positive cells after 16h of infection. This is a multicycle scenario so in a second round they would measure the effect of knockdown in absence of ammonium chloride. Shorter infections up to 8h with higher MOI would overcome this problem.

      By maintaining cells in ammonium chloride throughout the infection we are preventing endosomal acidification at any point in the infection period, so this experiment should be measuring solely the effect of one round of infection. The 16 hr timepoint was chosen to allow for optimized staining and analysis of samples by flow cytometry, within the available hours of the flow cytometry facility.

      • Standard error of mean is not an appropriate way of representing experimental error for the provided results and should be replaced by SD. Correct labeling of axis with units is required.

      We have updated the axes throughout the manuscript as requested. We have obtained additional statistical expertise (reflected in the updated author list) regarding the issue of SD vs SEM. Standard deviation (SD) would show a measure of the spread of the data, however the full distribution can be clearly seen as we plotted every individual data point. Standard error of the mean (SEM) is a measure of confidence for the mean of the population which takes into account SD and also sample size. SEM is not as easy to estimate by eye as SD, and we feel it is more helpful to the reader in judging how likely it is that two population means on a given graph differ from each other.
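
      The relationship between the two statistics can be sketched as follows, using made-up values rather than data from the manuscript: SEM is the sample SD divided by the square root of the number of observations, so it shrinks as replicates are added while SD does not. The helper name is hypothetical.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample SD, SEM), where SEM = SD / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    sem = sd / math.sqrt(len(values))
    return sd, sem


# Made-up measurements, for illustration only.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sd, sem = sd_and_sem(values)

print(f"n={len(values)}  SD={sd:.3f}  SEM={sem:.3f}")
```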

      Minor comments: • The authors show a rescue of viral replication upon double knockdown of RAB11A and B. Maybe this is just a consequence of inefficient knockdown since only half of the siRNAs were used?

      In order to determine if this was the case we depleted Rab11A or Rab11B individually, with this same 'half dose' of siRNA (see new Figure S3). We observed that Rab11B was still robustly required for H3N2 viral protein production. These results suggest that Rab11A and Rab11B could be playing mutually opposing roles in this case (ie, Rab11B transporting a molecule to the surface, while Rab11A recycles it off), which is consistent with prior Rab11 literature.


      Reviewer #2 (Significance (Required)): Significance The authors claim an H3N2 specific dependency on RAB11B for early steps of infection. While this is per se interesting the provided data do not fully support the claims and lack a mechanistic explanation. What is the difference between H1 and H3 strains (virion shape, HA load per virion, attachment force of H1 vs H3)? The readouts used are not close enough to the events with regards to timing and could be supported by established entry assays in the field.

      We have provided additional discussion of the differences between H1s and H3s, including sialic acid binding preferences and changes in the HA-sialic acid avidity (lines 76-84). Notably, we have included a new assay (new Fig 7) that provides additional mechanistic insight into the observation that recent H3N2 but not H1N1 isolates depend on Rab11B early in infection. Using reverse genetics we were able to map the Rab11B phenotype to the HA gene of the H3N2 isolate under study. By creating '7+1' reassortant viruses with either the H3 HA or the N2 NA on a PR8 (H1N1) background (see Fig 7E) we are able to demonstrate that Rab11B is acting specifically at one of the HA-mediated entry steps. This excludes several non-HA dependent steps early in the life cycle (uncoating, RNP transport to the nucleus, nuclear import), thus providing additional confirmation that Rab11B acts at one of the earliest steps in the viral life cycle (and by definition, at or prior to fusion). Fundamentally, we believe the novelty and rigor of our observation that recent H3N2 viruses enter through a different route than H1N1 isolates is worthy of observation in this updated form, so that the field can begin follow up studies.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Manuscript Reference: RC-2025-03007 TITLE: Rab11B is required for binding and entry of recent H3N2, but not H1N1, influenza A isolates Allyson Turner, Sara Jaffrani, Hannah Kubinski, Deborah Ajayi, Matthew Owens, Madeline McTigue, Conor Fanuele, Cailey Appenzeller, Hannah Despres, Madaline Schmidt, Jessica Crothers, and Emily Bruce

      Summary Here, Turner et al. build upon existing knowledge of Influenza A virus (IAV) dependence on the Rab11 family of proteins and provide insights into the specific role of Rab11B isoform in H3N2 virus binding and entry. The introduction is clearly written and provides sufficient background on prior research involving Rab11. It effectively identifies the current gap in knowledge and justifies the investigation of more clinically relevant, circulating strains of IAV. The methods section provides sufficient detail to ensure reproducibility. Similarly, the discussion is well structured, aligns with the introduction, and thoughtfully outlines relevant follow-up experiments. The authors present data from a series of experiments which suggest that the reduced H3N2 infection and viral protein production in Rab11B-depleted cells is due to impaired virus binding. While the evidence supports a Rab11B-specific phenotype in the context of H3N2 infection, we recommend additional experiments (outlined below), to further validate and strengthen these findings. These would help solidify the mechanistic link between Rab11B depletion and the observed phenotype for H3N2 strains of IAV.

      Major comments Figure 1. (B) & (C) The authors normalise viral titers to the non-targeting control (NTC) siRNA set at 100. While this approach allows for relative comparisons, we recommend including the corresponding raw PFU/ml values, at least in the supplementary materials. This will better illustrate the biological significance of gene depletion and variability of the results.

      We have included the raw PFU/mL values in new Figure S1; peak viral production varied by biological replicate (pasted below, with each biological replicate having a differently shaped data point). While the depletion-induced trends are clearly visible across biological replicates, normalization to the average titer in the NT condition for each replicate allows for cleaner visualization.
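
      The per-replicate normalization described above can be sketched as follows. The titers, condition labels, and function name are made up for illustration and are not values from Figure S1; the sketch assumes each titer is expressed as a percentage of the mean NT titer within the same biological replicate.

```python
from statistics import mean


def percent_of_nt(titers_by_condition, control="NT"):
    """Express every titer as a percentage of the mean control titer.

    Intended to be applied separately to each biological replicate.
    """
    control_mean = mean(titers_by_condition[control])
    return {
        condition: [100.0 * titer / control_mean for titer in titers]
        for condition, titers in titers_by_condition.items()
    }


# Made-up PFU/mL values for a single biological replicate.
replicate = {"NT": [2.0e6, 4.0e6], "11B": [1.5e6]}
normalized = percent_of_nt(replicate)

print(normalized)
```

      Normalizing within each replicate removes experiment-to-experiment differences in peak titer, at the cost of hiding absolute values, which is why reporting the raw PFU/mL alongside is useful.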

      In addition, the current protocol uses a high MOI (1), and a relatively short infection period (16 hours) to capture single-cycle replication. However, to better assess the impact of gene knockdown on virus production and spread, we suggest performing a multicycle replication assay using a lower MOI (e.g., 0.01-0.001) over an extended time period, such as 48 hours before titration, provided that cell viability under these conditions is acceptable.

      We appreciate this suggestion and repeatedly attempted to carry out a multicycle growth curve to obtain this data. Unfortunately, out of four independent biological replicates we attempted, we were only able to maintain cell viability and adherence in one biological replicate (shown below). We have not included this data in the revised manuscript due to the limited replicates we were able to obtain, though we can add it in a further revision if the reviewer feels it is warranted.

      Figure 7. (B) & (C) The authors present interesting data showing that siRNA-mediated depletion of Rab11B reduces virion binding of a recently circulating strain of H3N2, but not H1N1, suggesting a subtype-specific role. However, we strongly recommend complementing this assay with a single-cell resolution approach such as immunofluorescence detection of surface-bound viruses through HA staining and image quantification. This would allow the authors to directly assess virion binding per cell and visualise the phenotype, strengthening the mechanistic insight on H3N2 binding in Rab11B-depleted cells. Furthermore, the data, particularly for H1N1 (Figure 7.C), show substantial variance, which suggests suboptimal assay sensitivity and limits the strength of the conclusion that the knockdown does not affect H1N1 binding. This limitation may be overcome by implementing the above experimental suggestion.

      We have made substantial efforts to include this data, but were ultimately unable to include this assay due to technical difficulties in implementation (NA stripping caused cells to lift off coverslips, difficulties in antibody sensitivity and specificity, among other issues). We also piloted single cell-based flow cytometry assays to attempt to measure signal from bound virions, but were unable to achieve sufficient differentiation between mock and bound samples with the antibodies we could obtain. However, we have included a new experimental approach that is able to genetically map the 11B-dependent phenotype to the HA gene, thus providing additional mechanistic insight and confirming that Rab11B acts on one of the earliest steps in the viral life cycle (prior to or at fusion).

      Minor comments General The authors should state which statistical test was used for each dataset in the respective figure legends.

      This information is now included in each figure legend.

      Figure 1. Suggest changing Y axis title to PFU/ml [relative to NTC]

      We have changed the axis titles of normalized data to "PFU as % of NT" throughout.

      The co-depletion of Rab11A and Rab11B appears to be less efficient than individual knockdowns, based on RT-qPCR data (Figure 1.A). It is possible that the partial 'rescue' phenotype observed in Figures 2-4 is due to incomplete knockdown, rather than a true biological interaction. This possibility should be acknowledged.

      In order to distinguish between a partial 'rescue' and inefficient knockdown, we depleted Rab11A or Rab11B individually, with the same 'half dose' of siRNA used in the double knockdown (see new Figure S3). We observed that Rab11B was still robustly required for H3N2 viral protein production. These results suggest that Rab11A and Rab11B could be playing mutually opposing roles in this case, which is consistent with prior Rab11 literature, rather than simply inefficient knockdown.

      Furthermore, knockdown efficiency is assessed only at the mRNA level. To strengthen the conclusions, the authors are encouraged to provide western blot data confirming protein-level depletion of Rab11A and Rab11B, particularly in the double knockdown condition. This would help clarify whether co-transfection of siRNAs affect the efficiency of each individual knockdown at the protein level.

      We have verified knockdown efficacy at the protein level in new Fig 1A/B. However, due to the high degree of protein level conservation between Rab11A and Rab11B it is very difficult to develop isoform specific antibodies, and we were unable to obtain a Rab11B-specific antibody that can detect endogenous protein (despite testing 6 commercially available antibodies for specificity). Using an antibody that detects both 11A and 11B (Fig1A) we were able to observe very slight changes in the molecular weight of the Rab11 band(s) detected upon knockdown of 11A vs 11B (suggestive of the two isoforms running as a dimer, with Rab11A the lower band and Rab11B the upper band). Cells depleted of both isoforms simultaneously showed a near complete loss of signal. Using a Rab11A antibody (that we confirmed as specific) we were able to observe loss of the Rab11A signal in both the 11A and 11A+B knockdowns (Fig 1B).

      Figure 6. (A) & (B) are missing error bars, particularly the Rab11B knockdown data points.

      Error bars are plotted in each graph, but due to very limited experimental variation these error bars are too small to appear on the graph (11B points in Fig 6B, D).

      Figure 7. If including any repeats in the binding assay, authors are encouraged to use appropriate controls in each experiment such as exogenous neuraminidase treatment or sialidase treatment.

      When attempting to establish a microscopy based binding assay we included exogenous neuraminidase in each experiment. Unfortunately, the combination of glass coverslips and treatment with exogenous neuraminidase at incubation times sufficient to strip virus also removed cells from the coverslips.

      Reviewer #3 (Significance (Required)):

      General assessment: Provides a conceptual advancement of subtype specific receptor preferences.

      Advance: The study raises interesting observations regarding influenza virus subtype differences in cell surface receptor binding, in a Rab11B-dependent manner.

      Audience: Influenza virologists, respiratory virologists

      Expertise: Virus entry, Virus cell biology

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Title: Rab11B is required for binding and entry of recent H3N2, but not H1N1, influenza A isolates

      Allyson Turner, Sara Jaffrani, Hannah Kubinski, Deborah Ajayi, Matthew Owens, Madeline McTigue, Conor Fanuele, Cailey Appenzeller, Hannah Despres, Madaline Schmidt, Jessica Crothers, and Emily Bruce

      Summary

      Here, Turner et al. build upon existing knowledge of Influenza A virus (IAV) dependence on the Rab11 family of proteins and provide insights into the specific role of Rab11B isoform in H3N2 virus binding and entry. The introduction is clearly written and provides sufficient background on prior research involving Rab11. It effectively identifies the current gap in knowledge and justifies the investigation of more clinically relevant, circulating strains of IAV. The methods section provides sufficient detail to ensure reproducibility. Similarly, the discussion is well structured, aligns with the introduction, and thoughtfully outlines relevant follow-up experiments. The authors present data from a series of experiments which suggest that the reduced H3N2 infection and viral protein production in Rab11B-depleted cells is due to impaired virus binding. While the evidence supports a Rab11B-specific phenotype in the context of H3N2 infection, we recommend additional experiments (outlined below), to further validate and strengthen these findings. These would help solidify the mechanistic link between Rab11B depletion and the observed phenotype for H3N2 strains of IAV.

      Major comments

      Figure 1. (B) & (C)

      The authors normalise viral titers to the non-targeting control (NTC) siRNA set at 100. While this approach allows for relative comparisons, we recommend including the corresponding raw PFU/ml values, at least in the supplementary materials. This will better illustrate the biological significance of gene depletion and the variability of the results. In addition, the current protocol uses a high MOI (1) and a relatively short infection period (16 hours) to capture single-cycle replication. However, to better assess the impact of gene knockdown on virus production and spread, we suggest performing a multicycle replication assay using a lower MOI (e.g., 0.001-0.01) over an extended time period, such as 48 hours before titration, provided that cell viability under these conditions is acceptable.
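      To make the reporting point concrete, the following is a minimal sketch of how titers relative to NTC could be tabulated alongside the raw values. The function name, condition labels, and titer values are hypothetical placeholders chosen for illustration; they are not data from the manuscript.

```python
def normalize_to_control(titers_pfu_per_ml, control_key="NTC"):
    """Return each titer as a percentage of the control condition.

    titers_pfu_per_ml: dict mapping condition name -> raw titer (PFU/ml).
    The raw values are kept by the caller so both can be reported together.
    """
    control = titers_pfu_per_ml[control_key]
    if control <= 0:
        raise ValueError("control titer must be positive")
    return {name: 100.0 * value / control
            for name, value in titers_pfu_per_ml.items()}


# Hypothetical raw titers (PFU/ml), for illustration only.
raw = {"NTC": 2.0e6, "siRab11A": 1.0e6, "siRab11B": 5.0e5}
relative = normalize_to_control(raw)

for name in raw:
    # Report raw and relative values side by side.
    print(f"{name}: {raw[name]:.2e} PFU/ml ({relative[name]:.1f}% of NTC)")
```

      Presenting both columns lets readers judge the absolute magnitude of the effect, which a percentage alone cannot convey.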

      Figure 7. (B) & (C)

      The authors present interesting data showing that siRNA-mediated depletion of Rab11B reduces virion binding of a recently circulating strain of H3N2, but not H1N1, suggesting a subtype-specific role. However, we strongly recommend complementing this assay with a single-cell resolution approach such as immunofluorescence detection of surface-bound viruses through HA staining and image quantification. This would allow the authors to directly assess virion binding per cell and visualise the phenotype, strengthening the mechanistic insight on H3N2 binding in Rab11B-depleted cells. Furthermore, the data, particularly for H1N1 (Figure 7.C), show substantial variance, which suggests suboptimal assay sensitivity and limits the strength of the conclusion that the knockdown does not affect H1N1 binding. This limitation may be overcome by implementing the above experimental suggestion.

      Minor comments

      General

      The authors should state which statistical test was used for each dataset in the respective figure legends.

      Figure 1.

      Suggest changing Y axis title to PFU/ml [relative to NTC]. The co-depletion of Rab11A and Rab11B appears to be less efficient than individual knockdowns, based on RT-qPCR data (Figure 1.A). It is possible that the partial 'rescue' phenotype observed in Figures 2-4 is due to incomplete knockdown, rather than a true biological interaction. This possibility should be acknowledged. Furthermore, knockdown efficiency is assessed only at the mRNA level. To strengthen the conclusions, the authors are encouraged to provide western blot data confirming protein-level depletion of Rab11A and Rab11B, particularly in the double knockdown condition. This would help clarify whether co-transfection of siRNAs affects the efficiency of each individual knockdown at the protein level.

      Figure 6.

      (A) & (B) are missing error bars, particularly the Rab11B knockdown data points.

      Figure 7.

      If including any repeats in the binding assay, authors are encouraged to use appropriate controls in each experiment such as exogenous neuraminidase treatment or sialidase treatment.

      Significance

      General assessment: Provides a conceptual advancement of subtype specific receptor preferences.

      Advance: The study raises interesting observations regarding influenza virus subtype differences in cell surface receptor binding, in a Rab11B-dependent manner.

      Audience: Influenza virologists, respiratory virologists

      Expertise: Virus entry, Virus cell biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors compare the effect of RAB11A and RAB11B knockdown on replication of contemporary H1N1 and H3N2 influenza A virus strains in A549 cells (human lung epithelial cells). They find a reduction in viral protein expression for the tested H3N2 isolates but not for the H1N1 isolates. Mechanistically they suggest that RAB11A affects virion attachment to the cell surface.

      Major comments:

      The provided data do not conclusively support the suggested mechanism of action, and essential controls are missing to substantiate the authors' claims:

      • Knockdown efficacy has to be confirmed at the protein level, showing reduced levels of RAB11A and B by Western blot. This is standard in the field. Off-target effects cannot be avoided with RNAi approaches and are usually ruled out by using multiple siRNAs or by complementing the targeted protein in trans.
      • Viral titers should be presented as absolute titers, not as % (here the labelling is actually misleading, as all graphs indicate pfu/ml).
      • Reduction of viral protein expression goes hand in hand with a reduction in GAPDH. While this is accounted for in the quantification, a general block of protein expression cannot be ruled out, since the stability of housekeeping proteins and viral proteins might be different. Testing multiple housekeeping proteins could overcome this issue.
      • The FACS data in Fig 5 are not convincing. The previous figures showed a modest reduction in viral protein expression, and the fluorescence is indicated here on a logarithmic scale. Quantification and indication of mean fluorescence intensity from the same data would be a better readout to convincingly show that fewer cells are infected.
      • During the time-of-addition experiment in Fig 6, the authors are testing for HA/M2-positive cells after 16h of infection. This is a multicycle scenario, so in a second round they would measure the effect of knockdown in the absence of ammonium chloride. Shorter infections of up to 8h with a higher MOI would overcome this problem.
      • Standard error of the mean is not an appropriate way of representing experimental error for the provided results and should be replaced by SD. Correct labeling of axes with units is required.
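      As an illustration of the last point, the sketch below computes both the sample standard deviation (SD) and the standard error of the mean (SEM) for a small set of replicates. The replicate values are hypothetical and serve only to show how much narrower SEM is than SD when n is small.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample SD, SEM) for a list of replicate measurements.

    SD describes the spread of the replicates themselves;
    SEM = SD / sqrt(n) describes the uncertainty of the mean and
    is always smaller than SD for n > 1.
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    sd = statistics.stdev(values)          # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(len(values))
    return sd, sem


# Hypothetical replicate values (e.g. % of control), for illustration only.
replicates = [90.0, 100.0, 110.0]
sd, sem = sd_and_sem(replicates)
print(f"mean = {statistics.mean(replicates):.1f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

      With three replicates, the SEM is smaller than the SD by a factor of √3, so error bars drawn as SEM can make the data look considerably less variable than they are.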

      Minor comments:

      • The authors show a rescue of viral replication upon double knockdown of RAB11A and B. Maybe this is just a consequence of inefficient knockdown since only half of the siRNAs were used?

      Significance

      The authors claim an H3N2-specific dependency on RAB11B for early steps of infection. While this is per se interesting, the provided data do not fully support the claims and lack a mechanistic explanation. What is the difference between H1 and H3 strains (virion shape, HA load per virion, attachment force of H1 vs H3)? The readouts used are not close enough in timing to the events in question and could be supported by established entry assays in the field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Turner AH. et al. examined viral replication in cells depleted of the small GTPase Rab11B, which is a paralogue of Rab11A. It has been reported that Rab11A is responsible for the intracellular transport of viral RNP via recycling endosomes. The authors showed that Rab11B knockdown reduced viral protein expression and viral titer. This may be caused by reduced attachment of viral particles to Rab11B knockdown cells.

      Major comments:

      Comment 1 Fig 2-4: The authors should provide Western blot results with equal amounts of the loading control (GAPDH). The bands shown in these figures cannot be reliably quantified and are not reliable as data.

      Comment 2 Fig 2-4: Why are the results different between Rab11B knockdown alone and Rab11A/B double knockdown? If the authors' claims are correct, the results of Rab11B knockdown should be reproducible in Rab11A/B double knockdown cells.

      Comment 3 Fig 6: For better understanding, please provide a schematic illustration of the experimental setup.

      Comment 4: It is necessary to test other siRNA sequences or perform a rescue experiment by expressing an siRNA-resistant clone in the knockdown cells. There seems to be activation of a host defense system, such as the IFN pathways.

      Referees cross-commenting

      I agree with other reviewers' comments in part.

      Significance

      The authors propose a novel role for Rab11B in modulating the attachment pathway of H3N2 influenza A virus by an unknown mechanism. Although previous studies have focused on the function of Rab11A in endocytic transport, the function and specificity of Rab11B have remained less clear. The findings may be of interest to a broad audience, including researchers in cell biology, immunology, and host-pathogen interactions. However, the study remains at a superficial level of analysis and does not lead to a deeper understanding of the underlying mechanisms.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We will provide the revised manuscript as a PDF with highlighted changes, the Word file with tracked changes linked to reviewer comments, and all updated figures.

      To address the reviewers' suggestions, we have conducted additional experiments that are now incorporated into new figures, or we have added new images to several existing figures where appropriate.

      Please note that all figures have been renumbered to improve clarity and facilitate cross-referencing throughout the text. As recommended by Referee #3, all figure legends have been thoroughly revised to reflect these updates and are now labeled following the standard A-Z panel format, enhancing readability and ensuring easier identification. In addition, all figure legends now include the sample size for each statistical analysis.

      For clarity and ease of reference, we provide below a comprehensive list of all figures included in the revised version. Figures that have undergone modifications are underlined.

      Figure 1. The first spermatogenesis wave in prepuberal mice.

      This figure now includes amplified images of representative spermatocytes and a summary schematic illustrating the timeline of spermatogenesis. In addition, it now presents the statistical analysis of spermatocyte quantification to support the visual data.

      Figure 2. Cilia emerge across all stages of prophase I in spermatocytes during the first spermatogenesis wave.

      The images in this figure remain unchanged from the original submission, but all graphs now present the statistical analysis of spermatocyte quantification.

      Figure 3. Ultrastructure and markers of prepuberal meiotic cilia.

      This figure remains unchanged from the original submission; however, we have replaced the ARL3-labelled spermatocyte image (A) with one displaying a clearer and more representative signal.

      Figure 4. Testicular tissue presents spermatocyte cysts in prepuberal mice and adult humans.

      This figure remains unchanged from the original submission.

      Figure 5. Cilia and flagella dynamics are correlated during prepuberal meiosis.

      This figure remains unchanged from the original submission.

      Figure 6. Comparative proteomics identifies potential regulators of ciliogenesis and flagellogenesis.

      This figure remains unchanged from the original submission.

      Figure 7. Deciliation induces persistence of DNA damage in meiosis.

      This figure has been substantially revised and now includes additional experiments analyzing chloral hydrate treatment, aimed at more accurately assessing DNA damage under both control and treated conditions. Images F-I and graph J are new.

      Figure 8. Aurora kinase A is a regulator of cilia disassembly in meiosis.

      This figure has been remodelled because the original version contained a mistake in former panel II; the graph in new Fig. 8I has been corrected accordingly. In addition, it now contains additional data on αTubulin staining in arrested ciliated metaphases I after AURKA inhibition (new panel L1´).

      Figure 9. Schematic representation of the prepuberal versus adult seminiferous epithelium.

      This figure remains unchanged from the original submission.

      Supplementary Figure 1. Meiotic stages during the first meiotic wave.

      This figure remains unchanged from the original submission.

      Supplementary Figure 2 (new).

      This is a new figure that includes additional data requested by the reviewers. It includes additional markers of cilia in spermatocytes (glutamylated Tubulin/GT335) and the control data for cilia markers in non-ciliated spermatocytes. It also now includes the separate quantification of ciliated spermatocytes for each stage, as requested by the reviewers, complementing the graphs included in Figure 2.

      Please note that with the inclusion of this new Supplementary Figure 2, the numbering of subsequent supplementary figures has been updated accordingly.

      Supplementary Figure 3 (previously Suppl. Fig. 2). Ultrastructure of prophase I spermatocytes.

      This figure is identical in content to the original submission, but some annotations have been added.

      Supplementary Figure 4 (previously Suppl. Fig. 3). Meiotic centrosome under the electron microscope.

      This figure remains unchanged from the original submission, but additional annotations have been included.

      Supplementary Figure 5 (previously Suppl. Fig. 4). Human testis contains ciliated spermatocytes.

      This figure has been revised and now includes additional H2AX staining to better determine the stage of ciliated spermatocytes and improve their identification.

      Supplementary Figure 6 (previously Suppl. Fig. 5). GLI1 and GLI3 readouts of Hedgehog signalling are not visibly affected in prepuberal mouse testes.

      This figure has been remodeled and now includes the quantification of GLI1 and GLI3 and the corresponding statistical analysis. It also includes the control data for Tubulin, instead of GAPDH.

      Supplementary Figure 7 (previously Suppl. Fig. 6). CH and MLN8237 optimization protocol.

      This figure has been remodeled to incorporate control experiments using 1-hour organotypic culture treatment.

      Supplementary Figure 8 (previously Suppl. Fig. 7). Tracking first meiosis wave with EdU pulse injection during prepubertal meiosis.

      This figure remains unchanged from the original submission.

      Supplementary Figure 9 (previously Suppl. Fig. 8). PLK1 and AURKA inhibition in cultured spermatocytes.

      This figure has been remodeled and now includes additional data on spindle detection in control and AURKA-inhibited spermatocytes (both ciliated and non ciliated).


      Response to the reviewers

      We will submit both the PDF version of the revised manuscript and the Word file with tracked changes relative to the original submission. Each modification made in response to reviewers' suggestions is annotated in the Word document within the corresponding section of the text.

      A detailed, point-by-point response to each reviewer's comments is provided in the following section.

      Response to the Referee #1


      In this manuscript by Perez-Moreno et al., titled "The dynamics of ciliogenesis in prepubertal mouse meiosis reveal new clues about testicular maturation during puberty", the authors characterize the development of primary cilia during meiosis in juvenile male mice. The authors catalog a variety of testicular changes that occur as juvenile mice age, such as changes in testis weight and germ cell-type composition. They next show that meiotic prophase cells initially lack cilia, and ciliated meiotic prophase cells are detected after 20 days postpartum, coinciding with the time when post-meiotic spermatids within the developing testes acquire flagella. They describe that germ cells in juvenile mice harbor cilia at all substages of meiotic prophase, in contrast to adults where only zygotene stage meiotic cells harbor cilia. The authors also document that cilia in juvenile mice are longer than those in adults. They characterize cilia composition and structure by immunofluorescence and EM, highlighting that cilia polymerization may initially begin inside the cell, followed by extension beyond the cell membrane. Additionally, they demonstrate ciliated cells can be detected in adult human testes. The authors next perform proteomic analyses of whole testes from juvenile mice at multiple ages, which may not provide direct information about the extremely small numbers of ciliated meiotic cells in the testis, and is lacking follow up experiments, but does serve as a valuable resource for the community. Finally, the authors use a seminiferous tubule culturing system to show that chemical inhibition of Aurora kinase A likely inhibits cilia depolymerization upon meiotic prophase I exit and leads to an accumulation of metaphase-like cells harboring cilia. They also assess meiotic recombination progression using their culturing system, but this is less convincing.

      Author response: We sincerely thank Ref #1 for the thorough and thoughtful evaluation of our manuscript. We are particularly grateful for the reviewer's careful reading and constructive feedback, which have helped us refine several sections of the text and strengthen our discussion. All comments and suggestions have been carefully considered and addressed, as detailed below.


      Major comments:

      1. There are a few issues with the experimental set up for assessing the effects of cilia depolymerization on DNA repair (Figure 7-II). First, how were mid pachytene cells identified and differentiated from early pachytene cells (which would have higher levels of gH2AX) in this experiment? I suggest either using H1t staining (to differentiate early/mid vs late pachytene) or the extent of sex chromosome synapsis. This would ensure that the authors are comparing similarly staged cells in control and treated samples. Second, what were the gH2AX levels at the starting point of this experiment? A more convincing set up would be if the authors measure gH2AX immediately after culturing in early and late cells (early would have higher gH2AX, late would have lower gH2AX), and then again after 24hrs in late cells (upon repair disruption the sampled late cells would have high gH2AX). This would allow them to compare the decline in gH2AX (i.e., repair progression) in control vs treated samples. Also, it would be informative to know the starting gH2AX levels in ciliated vs non-ciliated cells as they may vary.

      Response:

      We thank Ref #1 for this valuable comment, which significantly contributed to improving both the design and interpretation of the cilia depolymerization assay.

      Following this suggestion, we repeated the experiment including 1-hour (immediately after culturing) and 24-hour cultures for both control and chloral hydrate (CH)-treated samples (n = 3 biological replicates). To ensure accurate staging, we now employ triple immunolabelling for γH2AX, SYCP3, and H1T, allowing clear distinction of zygotene (H1T−), early pachytene (H1T−), and late pachytene (H1T+) cells. The revised data (Figure 7) now provide a more complete and statistically robust analysis of DNA damage dynamics. These results confirm that CH-induced deciliation leads to persistence of the γH2AX signal at 24 hours, indicating impaired DNA repair progression in pachytene spermatocytes. The new images and graphs are included in the revised Figure 7.

      Regarding the reviewer's final point, we regret that a direct comparison of γH2AX levels between ciliated and non-ciliated cells is not technically feasible. To preserve cilia integrity, all cilia-related imaging is performed using the squash technique, which maintains the three-dimensional structure of the cilia but does not allow reliable quantification of DNA damage markers due to nuclear distortion. Conversely, the nuclear spreading technique, used for DNA damage assessment, provides optimal visualization of repair foci but results in the loss of cilia due to cytoplasmic disruption during the hypotonic step. Given that spermatocytes in juvenile testes form developmentally synchronized cytoplasmic cysts, we consider that analyzing a statistically representative number of spermatocytes offers a valid and biologically meaningful measure of tissue-level effects.

      In conclusion, we believe that the additional experiments and clarifications included in revised Figure 7 strengthen our conclusion that cilia depolymerization compromises DNA repair during meiosis. Further functional confirmation will be pursued in future work, since we are currently generating a conditional genetic model for a ciliopathy in our laboratory.

      The authors analyze meiotic progression in cells cultured with/without AURKA inhibition in Figure 8-III and conclude that the distribution of prophase I cells does not change upon treatment. Is Figure 8-III A and B the same data? The legend text is incorrect, so it's hard to follow. Figure 8-III A shows a depletion of EdU-labelled pachytene cells upon treatment. Moreover, the conclusion that a higher proportion of ciliated zygotene cells upon treatment (Figure 8-II C) suggests that AURKA inhibition delays cilia depolymerization (page 13 line 444) does not make sense to me.

      Response:

      We thank Ref#1 for identifying this issue and for the careful examination of Figure 8. We discovered that the submitted version of Figure 8 contained a mismatch between the figure legend and the figure panels. The legend text was correct; however, the figure inadvertently included a non-corresponding graph (previously panel II-A), which actually belonged to Supplementary Figure 7 in the original submission. We apologize for this mistake.

      This error has been corrected in the revised version. The updated Figure 8 now accurately presents the distribution of EdU-labelled spermatocytes across prophase I substages in control and AURKA-inhibited cultures (previously Figure 8-II B, now Figure 8-A). The corrected data show no significant differences in the proportions of EdU-labelled spermatocytes among prophase I substages after 24 hours of AURKA inhibition, confirming that meiotic progression is not delayed and that no accumulation of zygotene cells occurs under this treatment. Therefore, the observed increase in ciliated zygotene spermatocytes upon AURKA inhibition (new Figure 8 H-I) is best explained by a delay in cilia disassembly, rather than by an arrest or slowdown in meiotic progression. The figure legend and main text have been revised accordingly.

      How do the authors know that there is a monopolar spindle in Figure 8-IV treated samples? Perhaps the authors can use a different Tubulin antibody (that does not detect only acetylated Tubulin) to show that there is a monopolar spindle.

      Response:

      We thank Ref #1 for this excellent suggestion. In the original submission (lines 446-447), we described that ciliated metaphase I spermatocytes in AURKA-inhibited samples exhibited monopolar spindle phenotypes. This description was based on previous reports showing that AURKA or PLK1 inhibition produces metaphases with monopolar spindles characterized by aberrant yet characteristic SYCP3 patterns, abnormal chromatin compaction, and circular bivalent alignment around non-migrated centrosomes (1). In our study, we observed SYCP3 staining consistent with these characteristic features of monopolar metaphases I.

      However, we agree with Ref #1 that this could be better supported with data. Following the reviewer's suggestion, we performed additional immunostaining using α-Tubulin, which labels total microtubules rather than only the acetylated fraction. The revised Figure 8 now includes α-Tubulin staining in the same ciliated metaphase I cells shown in the original submission, confirming the presence of defective microtubule polymerization and defective spindle organization. For clarity, we now refer to these ciliated metaphases I as "arrested MI". These new data further support our conclusion that AURKA inhibition disrupts spindle bipolarization and prevents cilia depolymerization, indicating that cilia maintenance and bipolar spindle organization are mechanistically incompatible events during male meiosis. The Abstract, Results, and Discussion sections have been revised accordingly. In particular, the Discussion has been expanded to emphasize that persistence of cilia may interfere with centrosome separation and microtubule polymerization under AURKA inhibition, in contrast to invertebrate systems, e.g. Drosophila (2) and P. brassicae (3), in which meiotic cilia persist through metaphase I without impairing bipolar spindle assembly.

      1. Alfaro et al., EMBO Rep 22 (2021). DOI: 10.15252/embr.202051030 (PMID: 33615693)
      2. Riparbelli et al., Dev Cell (2012). DOI: 10.1016/j.devcel.2012.05.024 (PMID: 22898783)
      3. Gottardo et al., Cytoskeleton (Hoboken) (2023). DOI: 10.1002/cm.21755 (PMID: 37036073)

      The authors state in the abstract that they provide evidence suggesting that centrosome migration and cilia depolymerization are mutually exclusive events during meiosis. This is not convincing with the data present in the current manuscript. I suggest amending this statement in the abstract.

      Response:

      We thank Ref#1 for this valuable observation, with which we fully agree. To avoid overstatement, the original statement has been removed from the Abstract, Results, and Discussion, and replaced with a more accurate formulation indicating that cilia maintenance and bipolar spindle formation are mutually exclusive events during mouse meiosis.

      This revised statement is now directly supported by the new data presented in Figure 8, which demonstrate that AURKA inhibition prevents both spindle bipolarization and cilia depolymerization. We are grateful to the reviewer for highlighting this important clarification.


      Minor comments:

      The presence of cilia in all stages of meiotic prophase I in juvenile mice is intriguing. Why is the cellular distribution and length of cilia different in prepubertal mice compared to adults (where shorter cilia are present only in zygotene cells)? What is the relevance of these developmental differences? Do cilia serve prophase I functions in juvenile mice (in leptotene, pachytene etc.) that are perhaps absent in adults?

      Related to the above point, what is the relevance of the absence of cilia during the first meiotic wave? If cilia serve a critical function during prophase I (for instance, facilitating DSB repair), does the lack of cilia during the first wave imply differing cilia (and repair) requirements during the first vs latter spermatogenesis waves?

      In my opinion, these would be interesting points to discuss in the discussion section.

      Response:

      We thank the reviewer for these thoughtful observations, which we agree are indeed intriguing.

      We believe that our findings likely reflect a developmental role for primary cilia during testicular maturation. We hypothesize that primary cilia at this stage might act as signaling organelles, receiving cues from Sertoli cells or neighboring spermatocytes and transmitting them through the cytoplasmic cysts shared by spermatocytes. Such intercellular communication could be essential for coordinating tissue maturation and meiotic entry during puberty. Although speculative, this hypothesis aligns with the established role of primary cilia as sensory and signaling hubs for GPCR and RTK pathways regulating cell differentiation and developmental patterning in multiple tissues (e.g., 1, 2). The Discussion section has been expanded to include these considerations.

      1. Goetz et al., Nat Rev Genet (2010). DOI: 10.1038/nrg2774 (PMID: 20395968)
      2. Naturky et al., Cell (2019). DOI: 10.1038/s41580-019-0116-4 (PMID: 30948801)

      Our study focuses on the first spermatogenic wave, which represents the transition from the juvenile to the reproductive phase. It is therefore plausible that the transient presence of longer cilia during this period reflects a developmental requirement for external signaling that becomes dispensable in the mature testis. Given that this is only the second study to date examining mammalian meiotic cilia, there remains a vast area of research to explore. We plan to address potential signaling cascades involved in these processes in future studies.

      On the other hand, while we cannot confirm that the cilia observed in zygotene spermatocytes persist until pachytene within the same cell, it is reasonable to speculate that they do, serving as longer-lasting signaling structures that facilitate testicular development during the critical pubertal window. In addition, the observation of ciliated spermatocytes at all prophase I substages at 20 dpp, together with our proteomic data, supports the idea that the emergence of meiotic cilia exerts a significant developmental impact on testicular maturation.

      In summary, although we cannot yet define specific prophase I functions for meiotic cilia in juvenile spermatocytes, our data demonstrate that the first meiotic wave differs from later waves in cilia dynamics, suggesting distinct regulatory requirements between puberty and adulthood. These findings underscore the importance of considering developmental context when using the first meiotic wave as a model for studying spermatogenesis.

      The authors state on page 9 lines 286-288 that the presence of cytoplasmic continuity via intercellular bridges (between developmentally synchronous spermatocytes) hints towards a mechanism that links cilia and flagella formation. Please clarify this statement. While the correlation between the timing of appearance of cilia and flagella in cells that are located within the same segment of the seminiferous tubule may be hinting towards some shared regulation, how would cytoplasmic continuity participate in this regulation? Especially since the cytoplasmic continuity is not between the developmentally distinct cells acquiring the cilia and flagella?

      Response:

      We thank Ref#1 for this excellent question and for the opportunity to clarify our statement.

      The presence of intercellular bridges between spermatocytes is well known and has long been proposed to support germ cell communication and synchronization (1,2) as well as sharing mRNA (3) and organelles (4). A classic example is the Akap gene, located on the X chromosome and essential for the formation of the sperm fibrous sheath; cytoplasmic continuity through intercellular bridges allows Akap-derived products to be shared between X- and Y-bearing spermatids, thereby maintaining phenotypic balance despite transcriptional asymmetry (5). In addition, more recent work has further demonstrated that these bridges are critical for synchronizing meiotic progression and for processes such as synapsis, double-strand break repair, and transposon repression (6).

      In this context, and considering our proteomic data (Figure 6), our statement did not intend to imply direct cytoplasmic exchange between ciliated and flagellated cells. Although our current methods do not allow comprehensive tracing of cytoplasmic continuity from the basal to the luminal compartment of the seminiferous epithelium, we plan to address this limitation using high-resolution 3D and ultrastructural imaging approaches in future studies.

      Based on our current data, we propose that cytoplasmic continuity within developmentally synchronized spermatocyte cysts could facilitate the coordinated regulation of ciliogenesis, and similarly enable the sharing of regulatory factors controlling flagellogenesis within spermatid cysts. This coordination may occur through the diffusion of centrosomal or ciliary proteins, mRNAs, or signaling intermediates involved in the regulation of microtubule dynamics. However, we cannot exclude the possibility that such cytoplasmic continuity extends across all spermatocytes derived from the same spermatogonial clone, potentially providing a larger regulatory network. This mechanism could help explain the temporal correlation we observe between the appearance of meiotic cilia and the onset of flagella formation in adjacent spermatids within the same seminiferous segment.

      We have revised the Discussion to explicitly clarify this interpretation and to note that, although hypothetical, it is consistent with established literature on cytoplasmic continuity and germ cell coordination.

      1. Dym et al. Biol Reprod. (1971). DOI: 10.1093/biolreprod/4.2.195 (PMID: 4107186)
      2. Braun et al. Nature (1989). DOI: 10.1038/337373a0 (PMID: 2911388)
      3. Greenbaum et al. Proc. Natl. Acad. Sci. USA (2006). DOI: 10.1073/pnas.0505123103 (PMID: 16549803)
      4. Ventelä et al. Mol Biol Cell (2003). DOI: 10.1091/mbc.e02-10-0647 (PMID: 12857863)
      5. Turner et al. Journal of Biological Chemistry (1998). DOI: 10.1074/jbc.273.48.32135 (PMID: 9822690)
      6. Sorkin et al. Nat Commun (2025). DOI: 10.1038/s41467-025-56742-9 (PMID: 39929837)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      Individual germ cells in H&E-stained testis sections in Figure 1-II are difficult to see. I suggest adding zoomed-in images where spermatocytes/round spermatids/elongated spermatids are clearly distinguishable.

      Response:

      Ref#1 is right to suggest this. We have revised Figure 1 to improve the quality of the H&E-stained testis sections and have added zoomed-in panels where spermatocytes, round spermatids, and elongated spermatids are clearly distinguishable. These additions significantly enhance the clarity and interpretability of the figure.

      In Figure 2-II B, the authors document that most ciliated spermatocytes in juvenile mice are pachytene. Is this because most meiotic cells are pachytene? Please clarify. If the data are available (perhaps could be adapted from Figure 1-III), it would be informative to see a graph representing what proportions of each meiotic prophase substages have cilia.

      Response:

      We thank the reviewer for this valuable observation. Indeed, the predominance of ciliated pachytene spermatocytes reflects the fact that most meiotic cells in juvenile testes are at the pachytene stage (Figure 1). We have clarified this point in the text and have added a new supplementary figure (Supplementary Figure 2, new figure) presenting a graph showing the proportion of spermatocytes at each prophase I substage that possess primary cilia. This visualization provides a clearer quantitative overview of ciliation dynamics across meiotic substages.

      I suggest annotating the EM images in Sup Figure 2 and 3 to make it easier to interpret.

      Response:

      We thank the reviewer for this helpful suggestion. We have now added annotations to the EM images in Supplementary Figures 3 and 4 to facilitate their interpretation. These visual guides help readers more easily identify the relevant ultrastructural features described in the text.

      The authors claim that the ratio between GLI3-FL and GLI3-R is stable across their analyzed developmental window in whole testis immunoblots shown in Sup Figure 5. Quantifying the bands and normalizing to the loading control would help strengthen this claim as it is hard to interpret the immunoblot in its current form.

      Response:

      We thank the reviewer for this valuable suggestion. Following this recommendation, Supplementary Figure 5 has been revised to include quantification of GLI1 and GLI3 protein levels, normalized to the loading control.

      After quantification, we observed statistically significant differences across developmental stages. Specifically, GLI1 expression is slightly higher at 21 dpp compared to 8 dpp. For GLI3, we performed two complementary analyses:

      • Total GLI3 protein (sum of full-length and repressor forms normalized to loading control) shows a progressive decrease during development, with the lowest levels at 60 dpp (Supplementary Figure 5D).
      • GLI3 activation status, assessed as the GLI3-FL/GLI3-R ratio, is highest during the 19-21 dpp window, compared to 8 dpp and 60 dpp. Although these results suggest a possible transient activation of GLI3 during testicular maturation, we caution that this cannot automatically be attributed to increased Hedgehog signaling, as GLI3 processing can also be affected by other processes, such as changes in ciliogenesis. Furthermore, because the analysis was performed on whole-testis protein extracts, these changes cannot be specifically assigned to ciliated spermatocytes.
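The two GLI3 measures described in the bullets above can be written out as a minimal sketch. It assumes densitometry readings per lane in arbitrary units; the `Lane` class, the function names, and the numbers are hypothetical illustrations, not the authors' analysis pipeline or data.

```python
from dataclasses import dataclass


@dataclass
class Lane:
    """Densitometry readings for one immunoblot lane (arbitrary units)."""
    age_dpp: int      # animal age in days post partum
    gli3_fl: float    # full-length GLI3 band
    gli3_r: float     # GLI3 repressor band
    loading: float    # loading-control band


def total_gli3(lane: Lane) -> float:
    """Sum of both GLI3 forms, normalized to the loading control."""
    return (lane.gli3_fl + lane.gli3_r) / lane.loading


def fl_to_r_ratio(lane: Lane) -> float:
    """GLI3-FL/GLI3-R ratio; the loading control cancels out of this ratio."""
    return lane.gli3_fl / lane.gli3_r


# Made-up readings, for illustration only (not data from the study).
example_lanes = [
    Lane(age_dpp=8, gli3_fl=2.0, gli3_r=4.0, loading=2.0),
    Lane(age_dpp=60, gli3_fl=1.0, gli3_r=2.0, loading=2.0),
]

for lane in example_lanes:
    print(lane.age_dpp, total_gli3(lane), fl_to_r_ratio(lane))
```

Because both bands come from the same lane, the ratio does not depend on the loading control, so total GLI3 and the FL/R ratio can change independently of each other.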

      We have expanded the Discussion to address these findings and to highlight the potential involvement of the Desert Hedgehog (DHH) pathway, which plays key roles in testicular development, Sertoli-germ cell communication, and spermatogenesis (1, 2, 3). We plan to investigate these pathways further in future studies.

      1. Bitgood et al. Curr Biol. (1996). DOI: 10.1016/s0960-9822(02)00480-3 (PMID: 8805249)
      2. Clark et al. Biol Reprod. (2000). DOI: 10.1095/biolreprod63.6.1825 (PMID: 11090455)
      3. O'Hara et al. BMC Dev Biol. (2011). DOI: 10.1186/1471-213X-11-72 (PMID: 22132805)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      There are a few typos throughout the manuscript. Some examples: page 5 line 172, Figure 3-I legend text, Sup Figure 5-II callouts, Figure 8-III legend, page 15 line 508, page 17 line 580, page 18 line 611.

      Response:

      We thank the reviewer for detecting this. All typographical errors have been corrected, and figure callouts have been reviewed for consistency.

      Response to the Referee #2

      This study focuses on the dynamic changes of ciliogenesis during meiosis in prepubertal mice. It was found that primary cilia are not an intrinsic feature of the first wave of meiosis (initiating at 8 dpp); instead, they begin to polymerize at 20 dpp (after the completion of the first wave of meiosis) and are present in all stages of prophase I. Moreover, prepubertal cilia (with an average length of 21.96 μm) are significantly longer than adult cilia (10 μm). The emergence of cilia coincides temporally with flagellogenesis, suggesting a regulatory association in the formation of axonemes between the two. Functional experiments showed that disruption of cilia by chloral hydrate (CH) delays DNA repair, while the AURKA inhibitor (MLN8237) delays cilia disassembly, and centrosome migration and cilia depolymerization are mutually exclusive events. These findings represent the first detailed description of the spatiotemporal regulation and potential roles of cilia during early testicular maturation in mice. The discovery of this phenomenon is interesting; however, there are certain limitations in functional research.

      We thank Ref#2 for taking the time to evaluate our manuscript and for summarizing its main findings. We regret that the reviewer did not find the study sufficiently compelling, but we respectfully clarify that the strength of our work lies precisely in addressing a largely unexplored aspect of mammalian meiosis for which virtually no prior data exist. Given the extremely limited number of studies addressing cilia in mammalian meiosis (only five to date, including our own previous publication on adult mouse spermatogenesis) (1-5), we consider that the present work provides the first robust and integrative evidence on the emergence, morphology, and potential roles of primary cilia during prepubertal testicular development. The study combines histology, high-resolution microscopy, proteomics, and pharmacological perturbations, supported by quantitative analyses, thereby establishing a solid and much-needed reference framework for future functional studies.

      We emphasize that this manuscript constitutes the first comprehensive characterization of ciliogenesis during prepubertal mouse meiosis, complemented by functional in vitro assays that begin to address potential roles of these cilia. For this reason, we want to underscore the importance of this study in providing a solid framework that will support and guide future research.

      Major points:

      1. The prepubertal cilia in spermatocytes discovered by the authors lack specific genetic ablation to block their formation, making it impossible to evaluate whether such cilia truly have functions. Because neither in the first wave of spermatogenesis nor in adult spermatogenesis does this type of cilium seem to be essential. In addition, the authors also imply that the formation of such cilia appears to be synchronized with the formation of sperm flagella. This suggests that the production of such cilia may merely be transient protein expression noise rather than a functionally meaningful cellular structure.

      Response:

      We agree that a genetic ablation model would represent the ideal approach to directly test cilia function in spermatogenesis. However, given the complete absence of prior data describing the dynamics of ciliogenesis during testis development, our priority in this study was to establish a rigorous structural and temporal characterization of this process in the main mammalian model organism, the mouse. This systematic and rigorous phenotypic characterization is a necessary first step before any functional genetics could be meaningfully interpreted.

      To our knowledge, this study represents the first comprehensive analysis of ciliogenesis during prepubertal mouse meiosis, extending our previous work on adult spermatogenesis (1). Beyond these two contributions, only four additional studies have addressed meiotic cilia: two in zebrafish (2, 3), with Mytlis et al. also providing preliminary observations relevant to prepubertal male meiosis that we discuss in the present work, one in Drosophila (4), and a recent one in butterfly (5). No additional information exists for mammalian gametogenesis to date.

      1. López-Jiménez et al. Cells (2022) DOI: 10.3390/cells12010142 (PMID: 36611937)
      2. Mytlis et al. Science (2022) DOI: 10.1126/science.abh3104 (PMID: 35549308)
      3. Xie et al. J Mol Cell Biol (2022) DOI: 10.1093/jmcb/mjac049 (PMID: 35981808)
      4. Riparbelli et al. Dev Cell (2012) DOI: 10.1016/j.devcel.2012.05.024 (PMID: 22898783)
      5. Gottardo et al. Cytoskeleton (Hoboken) (2023) DOI: 10.1002/cm.21755 (PMID: 37036073)

      We therefore consider this descriptive and analytical foundation to be essential before the development of functional genetic models. Indeed, we are currently generating a conditional genetic model for a ciliopathy in our laboratory. These studies are ongoing and will directly address the type of mechanistic questions raised here, but they extend well beyond the scope and feasible timeframe of the present manuscript.

      We thus maintain that the present work constitutes a necessary and timely contribution, providing a robust reference dataset that will facilitate and guide future functional studies in the field of cilia and meiosis.

      Taking this into account, we would be very pleased to address any additional, concrete suggestions from Ref#2 that could further strengthen the current version of the manuscript.

      The high expression of axoneme assembly regulators such as TRiC complex and IFT proteins identified by proteomic analysis is not particularly significant. This time point is precisely the critical period for spermatids to assemble flagella, and TRiC, as a newly discovered component of flagellar axonemes, is reasonably highly expressed at this time. No intrinsic connection with the argument of this paper is observed. In fact, this testicular proteomics has little significance.

      Response:

      We appreciate this comment but respectfully disagree with the reviewer's interpretation of our proteomic data. To our knowledge, this is the first proteomic study explicitly focused on identifying ciliary regulators during testicular development at the precise window (19-21 dpp) when both meiotic cilia and spermatid flagella first emerge.

      While Piprek et al. (1) analyzed the expression of primary cilia in developing gonads, proteomic data specifically covering the developmental transition at 19-21 dpp were not previously available. Furthermore, a recent cell-sorting study (2) detected expression of cilia proteins in pachytene spermatocytes compared to round spermatids, but did not explore their functional relevance or integrate these data with developmental timing or histological context.

      In contrast, our dataset integrates histological staging, high-resolution microscopy, and quantitative proteomics, revealing a set of candidate regulators (including DCAF7, DYRK1A, TUBB3, TUBB4B, and TRiC) potentially involved in cilia-flagella coordination. We view this as a hypothesis-generating resource that outlines specific proteins and pathways for future mechanistic studies on both ciliogenesis and flagellogenesis in the testis.

      Although we fully agree that proteomics alone cannot establish causal function, we believe that dismissing these data as having little significance overlooks their value as the first molecular map of the testis at the developmental window when axonemal structures arise. Our dataset provides, for the first time, an integrated view of proteins associated with ciliary and flagellar structures at the developmental stage when both axonemal organelles first appear. We thus believe that our proteomic dataset represents an important and novel contribution to the understanding of testicular development and ciliary biology.

      Considering this, we would again welcome any specific suggestions from Ref#2 on additional analyses or clarifications that could make the relevance of this dataset even clearer to readers.

      1. Piprek et al. Int J Dev Biol. (2019) doi: 10.1387/ijdb.190049rp (PMID: 32149371).
      2. Fang et al. Chromosoma. (1981) doi: 10.1007/BF00285768 (PMID: 7227045).

      Response to the Referee #3

      In "The dynamics of ciliogenesis in prepubertal mouse meiosis reveals new clues about testicular development" Pérez-Moreno, et al. explore primary cilia in prepubertal mouse spermatocytes. Using a combination of microscopy, proteomics, and pharmacological perturbations, the authors carefully characterize prepubertal spermatocyte cilia, providing foundational work regarding meiotic cilia in the developing mammalian testis.

      Response: We sincerely thank Ref#3 for their positive assessment of our work and for the thoughtful suggestions that have helped us strengthen the manuscript. We are pleased that the reviewer recognizes both the novelty and the relevance of our study in providing foundational insights into meiotic ciliogenesis during prepubertal testicular development. All specific comments have been carefully considered and addressed as detailed below.


      Major concerns:

      1. The authors provide evidence consistent with cilia not being present in a larger percentage of spermatocytes or in other cells in the testis. The combination of electron microscopy and acetylated tubulin antibody staining establishes the presence of cilia; however, proving a negative is challenging. While acetylated tubulin is certainly a common marker of cilia, it is not in some cilia such as those in neurons. The authors should use at least one additional cilia marker to better support their claim of cilia being absent.

      Response:

      We thank the reviewer for this helpful suggestion. In the revised version, we have strengthened the evidence for cilia identification by including an additional ciliary marker, glutamylated tubulin (GT335), in combination with acetylated tubulin and ARL13B (which were included in the original submission). These data are now presented in the new Supplementary Figure 2, which also includes an example of a non-ciliated spermatocyte showing absence of both ARL13B and AcTub signals.

      Taken together, these markers provide a more comprehensive validation of cilia detection and confirm the absence of ciliary labelling in non-ciliated spermatocytes.

      The conclusion that IFT88 localizes to centrosomes is premature as key controls for the IFT88 antibody staining are lacking. Centrosomes are notoriously "sticky", often showing non-specific antibody staining. The authors must include controls to demonstrate the specificity of the staining they observe such as staining in a genetic mutant or an antigen competition assay.

      Response:

      We appreciate the reviewer's concern and fully agree that antibody specificity is critical when interpreting centrosomal localization. The IFT88 antibody used in our study is commercially available and has been extensively validated in the literature as both a cilia marker (1, 2) and a centrosome marker in somatic cells (3). Labelling of IFT88 in centrosomes has also been previously described using other antibodies (4, 5). In our material, the IFT88 signal consistently appears at one of the duplicated centrosomes and at both spindle poles, patterns identical to those reported in somatic cells. We therefore consider the reported meiotic IFT88 staining as specific and biologically reliable.

      That said, we agree that genetic validation would provide the most definitive confirmation. We are currently generating a conditional genetic model for a ciliopathy in our laboratory that will directly assess both antibody specificity and functional consequences of cilia loss during meiosis. These experiments are in progress and will be reported in a follow-up study.

      1. Wong et al. Science (2015). DOI: 10.1126/science.aaa5111 (PMID: 25931445)
      2. Ocbina et al. Nat Genet (2011). DOI: 10.1038/ng.832 (PMID: 21552265)
      3. Vitre et al. EMBO Rep (2020). DOI: 10.15252/embr.201949234 (PMID: 32270908)
      4. Robert A. et al. J Cell Sci (2007). DOI: 10.1242/jcs.03366 (PMID: 17264151)
      5. Singla et al. Developmental Cell (2010). DOI: 10.1016/j.devcel.2009.12.022 (PMID: 20230748)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      There are many inconsistent statements throughout the paper regarding the timing of the first wave of spermatogenesis. For example, the authors state that round spermatids can be detected at 21dpp on line 161, but on line 180, say round spermatids can be detected at 19dpp. Not only does this lead to confusion, but such discrepancies undermine the validity of the rest of the paper. A summary graphic displaying key events and their timing in the first wave of spermatogenesis would be instrumental for reader comprehension and could be used by the authors to ensure consistent claims throughout the paper.

      Response:

      We thank the reviewer for identifying this inconsistency and apologize for the confusion. We confirm that early round spermatids first appear at 19 dpp, as shown in the quantitative data (Figure 1J). This can be detected in squashed spermatocyte preparations, where individual spermatocytes and spermatids can be accurately quantified. The original text contained an imprecise reference to the histological image of 21 dpp (previous line 161), since certain H&E sections did not clearly show all cell types simultaneously. We have now revised Figure 1, improving the image quality and adding a zoomed-in panel highlighting early round spermatids. The image for 19 dpp mice in Fig. 1D shows early, still aflagellated spermatids. The first ciliated spermatocytes and the earliest flagellated spermatids are observed at 20 dpp. This has been clarified in the text.

      In addition, we also thank the reviewer for the suggestion of adding a summary graphic, which we agree greatly facilitates reader comprehension. We have added a new schematic summary (Figure 1K) illustrating the key stages and timing of the first spermatogenic wave.

      In the proteomics experiments, it is unclear why the authors assume that changes in protein expression are predominantly due to changes within the germ cells in the developing testis. The analysis is on whole testes including both the somatic and germ cells, which makes it possible that protein expression changes in somatic cells drive the results. The authors need to justify why and how the conclusions drawn from this analysis warrant such an assumption.

      Response:

      We agree with the reviewer that our proteomic analysis was performed on whole testis samples, which contain both germ and somatic cells. Although isolation of pure spermatocyte populations by FACS would provide higher resolution, obtaining sufficient prepubertal material for such analysis would require an extremely large number of animals. To remain compliant with the 3Rs principle for animal experimentation, we therefore used whole-testis samples from three biological replicates per age.

      We acknowledge that our assumption (that the main differences arise from germ cells) is a simplification. However, germ cells constitute the vast majority of testicular cells during this developmental window and are the population undergoing major compositional changes between 15 dpp and adulthood. It is therefore reasonable to expect that a substantial fraction of the observed proteomic changes reflects alterations in germ cells. We have clarified this point in the revised text and have added a statement noting that changes in somatic cells could also contribute to the proteomic profiles.

      The authors should provide details on how proteins were categorized as being involved in ciliogenesis or flagellogenesis, specifically in the distinction criteria. It is not clear how the categorizations were determined or whether they are valid. Thus, no one can repeat this analysis or perform this analysis on other datasets they might want to compare.

      Response:

      We thank the reviewer for this opportunity to clarify our approach. The categorization of proteins as being involved in ciliogenesis or flagellogenesis was based on their Gene Ontology (GO) cellular component annotations obtained from the PANTHER database (Version 19.0), using the gene IDs of the Differentially Expressed Proteins (DEPs). Specifically, we used the GO terms cilium (GO:0005929) and motile cilium (GO:0031514). Since motile cilium is a subcategory of cilium, proteins annotated only with the general cilium term, but not included under motile cilium, were considered to be associated with primary cilia or with shared structural components common to different types of cilia. These GO terms are represented in the bottom panel of Figure 6.

      This information has been added to the Methods section and referenced in the Results for transparency and reproducibility.
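The categorization rule described in this response can be sketched in a few lines. This is only an illustration under stated assumptions: the annotations are taken as already exported from PANTHER into a gene-to-GO-terms mapping, and the function names, category labels, and placeholder gene IDs are hypothetical; only the two GO identifiers come from the response itself.

```python
from typing import Dict, Set

CILIUM = "GO:0005929"         # GO cellular component: cilium
MOTILE_CILIUM = "GO:0031514"  # GO cellular component: motile cilium


def categorize(go_terms: Set[str]) -> str:
    """Assign one category from a protein's GO cellular component terms."""
    # Motile cilium is a child of cilium, so it is checked first.
    if MOTILE_CILIUM in go_terms:
        return "motile cilium"
    # Annotated with the general term only: primary cilia or shared components.
    if CILIUM in go_terms:
        return "primary cilium or shared component"
    return "not ciliary"


def categorize_deps(annotations: Dict[str, Set[str]]) -> Dict[str, str]:
    """Categorize every differentially expressed protein by gene ID."""
    return {gene: categorize(terms) for gene, terms in annotations.items()}


# Placeholder gene IDs and annotations, for illustration only.
example = {
    "GENE_A": {CILIUM, MOTILE_CILIUM},
    "GENE_B": {CILIUM},
    "GENE_C": set(),
}
print(categorize_deps(example))
```

Checking the more specific term first keeps the categories mutually exclusive, so each protein is counted once when the groups are compared.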

      In the pharmacological studies, the authors conclude that the phenotypes they observe (DNA damage and reduced pachytene spermatocytes) are due to loss of or persistence of cilia. This overinterprets the experiment. Chloral hydrate and MLN8237 certainly impact ciliation as claimed, but have additional cellular effects. Thus, it is possible that the observed phenotypes were not a direct result of cilia manipulation. Either additional controls must address this or the conclusions need to be more specific and toned down.

      Response:

      We thank the reviewer for this fair observation and have taken steps to strengthen and refine our interpretation. In the revised version, we now include data from 1-hour and 24-hour cultures for both control and chloral hydrate (CH)-treated samples (n = 3 biological replicates). The triple immunolabelling with γH2AX, SYCP3, and H1T allows accurate staging of zygotene (H1T⁻), early pachytene (H1T⁻), and late pachytene (H1T⁺) spermatocytes.

      The revised Figure 7 now provides a more complete and statistically supported analysis of DNA damage dynamics, confirming that CH-induced deciliation leads to persistent γH2AX signal at 24 hours, indicative of delayed or defective DNA repair progression. We have also toned down our interpretation in the Discussion, acknowledging that CH could affect other cellular pathways.

      As mentioned before, the conditional genetic model that we are currently generating will allow us to evaluate the role of cilia in meiotic DNA repair in a more direct and specific way.

      Assuming the conclusions of the pharmacological studies hold true with the proper controls, the authors still conflate their findings with meiotic defects. Meiosis is not directly assayed, which makes this conclusion an overstatement of the data. The conclusions need to be rephrased to accurately reflect the data.

      Response:

      We agree that this aspect required clarification. As noted above, we have refined both the Results and Discussion sections to make clear that our assays specifically targeted meiotic spermatocytes.

      We now present data for meiotic stages at zygotene, early pachytene and late pachytene. This is demonstrated by labelling for SYCP3 and H1T, both specific markers of meiosis that are not detectable in non-meiotic cells. We believe that this is indeed a way to assay meiotic cells; however, we now specify in the text that we are analysing potential defects in meiotic progression. We apologize if this was not properly explained in the original manuscript; it has been rephrased in the Results and Discussion sections of the new version.

      It is not clear why the authors chose not to use widely accepted assays of Hedgehog signaling. Traditionally, pathway activation is measured by transcriptional output, not GLI protein expression because transcription factor expression does not necessarily reflect transcription levels of target genes.

      Response:

      We agree with the reviewer that measuring mRNA levels of Hedgehog pathway target genes, typically GLI1 and PTCH1, is the most common method for measuring pathway activation, and is widely accepted by researchers in the field. However, the methods we use in this manuscript (GLI1 and GLI3 immunoblots) are also quite common and widely accepted:

      Regarding GLI1 immunoblot, many articles have used this method to monitor Hedgehog signaling, since GLI1 protein levels have repeatedly been shown to also go up upon pathway activation, and down upon pathway inhibition, mirroring the behavior of GLI1 mRNA. Here are a few publications that exemplify this point:

      • Banday et al. 2025 Nat Commun. DOI: 10.1038/s41467-025-56632-0 (PMID: 39894896)
      • Shi et al 2022 JCI Insight DOI: 10.1172/jci.insight.149626 (PMID: 35041619)
      • Deng et al. 2019 eLife, DOI: 10.7554/eLife.50208 (PMID: 31482846)
      • Zhu et al. 2019 Nat Commun, DOI: 10.1038/s41467-019-10739-3 (PMID: 31253779)
      • Caparros-Martin et al 2013 Hum Mol Genet, DOI: 10.1093/hmg/dds409 (PMID: 23026747)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      As for GLI3 immunoblot, Hedgehog pathway activation is well known to inhibit GLI3 proteolytic processing from its full length form (GLI3-FL) to its transcriptional repressor (GLI3-R), and such processing is also commonly used to monitor Hedgehog signal transduction, of which the following are but a few examples:

      • Pedraza et al 2025 eLife, DOI: 10.7554/eLife.100328 (PMID: 40956303)
      • Somatilaka et al 2020 Dev Cell, DOI: 10.1016/j.devcel.2020.06.034 (PMID: 32702291)
      • Infante et al 2018, Nat Commun, DOI: 10.1038/s41467-018-03339-0 (PMID: 29515120)
      • Wang et al 2017 Dev Biol DOI: 10.1016/j.ydbio.2017.08.003 (PMID: 28800946)
      • Singh et al 2015 J Biol Chem DOI: 10.1074/jbc.M115.665810 (PMID: 26451044)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.*

      In summary, we think that we have used two well established markers to look at Hedgehog signaling (three, if we include the immunofluorescence analysis of SMO, which we could not detect in meiotic cilia).

      These Hh pathway analyses did not provide any convincing evidence that the prepubertal cilia we describe here are actively involved in this pathway, even though Hh signaling is cilia-dependent and is known to be active in the male germline (Sahin et al 2014 Andrology PMID: 24574096; Mäkelä et al 2011 Reproduction PMID: 21893610; Bitgood et al 1996 Curr Biol. PMID: 8805249).

      That said, we fully agree that our current analyses do not allow us to draw definitive conclusions regarding Hedgehog pathway activity in meiotic cilia, and we now state this explicitly in the revised Discussion.

      Also in the Hedgehog pathway experiment, it is confusing that the authors report no detection of SMO yet detect little to no expression of GLIR in their western blot. Undetectable SMO indicates Hedgehog signaling is inactive, which results in high levels of GLIR. The impact of this is that it is not clear what is going on with Hh signaling in this system.

      Response:

      It is true that, when Hh signaling is inactive (and hence SMO not ciliary), the GLI3FL/GLI3R ratio tends to be low.

      Although our data in prepubertal mouse testes show a strong reduction in total GLI3 protein levels (GLI3FL+GLI3R) as these mice grow older, this downregulation of total GLI3 occurs without any major changes in the GLI3FL/GLI3R ratio, which is only modestly affected (suppl. Figure 6).

      Hence, since it is the ratio that correlates with Hh signaling rather than total levels, we do not think that the GLI3R reduction we see is incompatible with our non-detection of SMO in cilia: it seems more likely that overall GLI3 expression is being downregulated in developing testes via a Hh-independent mechanism.
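
The distinction between total GLI3 and the GLI3FL/GLI3R ratio can be illustrated with a minimal sketch. The band intensities below are hypothetical placeholders chosen for illustration, not measured densitometry values, and the helper name `gli3_summary` is an assumption introduced here.

```python
def gli3_summary(fl, r):
    """Return total GLI3 (FL + R) and the FL/R ratio for one sample.

    fl and r are band intensities in arbitrary units (hypothetical values).
    """
    if r <= 0:
        raise ValueError("GLI3-R intensity must be positive to form a ratio")
    return {"total": fl + r, "ratio": fl / r}


# Hypothetical intensities: both bands drop four-fold with age.
young = gli3_summary(fl=40.0, r=80.0)
older = gli3_summary(fl=10.0, r=20.0)

# Total GLI3 falls, but the FL/R ratio is unchanged.
print(young, older)
```

In this toy case the total falls while the ratio stays the same, which is the pattern the argument above relies on: a drop in GLI3-R alone does not imply a change in the readout that tracks Hh signaling.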

      Also potentially relevant here is the fact that some cell types depend more on GLI2 than on GLI3 for Hh signaling. For instance, in mouse embryos, Hh-mediated neural tube patterning relies more heavily on GLI2 processing into a transcriptional activator than on the inhibition of GLI3 processing into a repressor. In contrast, the opposite is true during Hh-mediated limb bud patterning (Nieuwenhuis and Hui 2005 Clin Genet. PMID: 15691355). We have not looked at GLI2, but it is conceivable that it could play a bigger role than GLI3 in our model.

      Moreover, several forms of GLI-independent non-canonical Hh signaling have been described, and they could potentially play a role in our model, too (Robbins et al 2012 Sci Signal. PMID: 23074268).

      We have revised the discussion to clarify some of these points.

      All in all, we agree that our findings regarding Hh signaling are not conclusive, but we still think they add important pieces to the puzzle that will help guide future studies.

      There are multiple instances where it is not clear whether the authors performed statistical analysis on their data, specifically when comparing the percent composition of a population. The authors need to include appropriate statistical tests to make claims regarding this data. While the authors state some impressive sample sizes, once evaluated in individual categories (eg specific cell type and age) the sample sizes of evaluated cilia are as low as 15, which is likely underpowered. The authors need to state the n for each analysis in the figures or legends.

      We thank the reviewer for highlighting this important issue. We have now included the sample size (n) for every analysis directly in the figure legends. Although this adds length, it improves transparency and reproducibility.

      Regarding Referee #3's concern about the differing sample sizes, the number of spermatocytes quantified at each stage reflects the relative duration of that stage in meiosis. For example, pachytene lasts for 10 days, so this stage is widely represented in the preparations, whereas metaphases I are much more difficult to quantify because the stage itself lasts for less than 24 hours and is therefore far less represented. Taking this into account, we ensured that all analyses remain statistically valid and representative, applying the appropriate statistical tests for each dataset. These details are now clearly indicated in the revised figures and legends.
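
To make the sample-size concern concrete, the sketch below computes a Wilson score interval for a proportion at two sample sizes. It is offered only as an illustration: the counts are hypothetical, the function name `wilson_interval` is introduced here, and the Wilson interval is not necessarily the test used in the manuscript.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical counts: the same 40% proportion at n = 15 and n = 150.
small = wilson_interval(6, 15)
large = wilson_interval(60, 150)
print(small, large)
```

With the same observed proportion, the interval at n = 15 is several times wider than at n = 150, which is why stating n for each category in the legends matters when percentages are compared.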

      Minor concerns:

      1. The phrase "lactating male" is used throughout the paper and is not correct. We assume this term to mean male pups that have yet to be weaned from their lactating mother, but "lactating male" suggests a rare disorder requiring medical intervention. Perhaps "pre-weaning males" is what the authors meant.

      Response:

      We thank the reviewer for noticing this terminology error. The expression has been corrected to "pre-weaning males" throughout the manuscript.

      The convention used to label the figures in this paper is confusing and difficult to read as there are multiple panels with the same letter in the same figure (albeit distinct sections). Labeling panels in the standard A-Z format is preferred. "Panel Z" is easier to identify than "panel III-E".

      Response:

      We thank the reviewer for this suggestion. All figures have been relabelled using the standard A-Z panel format, ensuring consistency and easier readability across the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In "The dynamics of ciliogenesis in prepubertal mouse meiosis reveals new clues about testicular development" Pérez-Moreno, et al. explore primary cilia in prepubertal mouse spermatocytes. Using a combination of microscopy, proteomics, and pharmacological perturbations, the authors carefully characterize prepubertal spermatocyte cilia, providing foundational work regarding meiotic cilia in the developing mammalian testis.

      Major concerns:

      1. The authors provide evidence consistent with cilia not being present in a larger percentage of spermatocytes or in other cells in the testis. The combination of electron microscopy and acetylated tubulin antibody staining establishes the presence of cilia; however, proving a negative is challenging. While acetylated tubulin is certainly a common marker of cilia, it is not in some cilia such as those in neurons. The authors should use at least one additional cilia marker to better support their claim of cilia being absent.

      2. The conclusion that IFT88 localizes to centrosomes is premature as key controls for the IFT88 antibody staining are lacking. Centrosomes are notoriously "sticky", often showing non-specific antibody staining. The authors must include controls to demonstrate the specificity of the staining they observe such as staining in a genetic mutant or an antigen competition assay.

      3. There are many inconsistent statements throughout the paper regarding the timing of the first wave of spermatogenesis. For example, the authors state that round spermatids can be detected at 21dpp on line 161, but on line 180, say round spermatids can be detected at 19dpp. Not only does this lead to confusion, but such discrepancies undermine the validity of the rest of the paper. A summary graphic displaying key events and their timing in the first wave of spermatogenesis would be instrumental for reader comprehension and could be used by the authors to ensure consistent claims throughout the paper.

      4. In the proteomics experiments, it is unclear why the authors assume that changes in protein expression are predominantly due to changes within the germ cells in the developing testis. The analysis is on whole testes including both the somatic and germ cells, which makes it possible that protein expression changes in somatic cells drive the results. The authors need to justify why and how the conclusions drawn from this analysis warrant such an assumption.

      5. The authors should provide details on how proteins were categorized as being involved in ciliogenesis or flagellogenesis, specifically in the distinction criteria. It is not clear how the categorizations were determined or whether they are valid. Thus, no one can repeat this analysis or perform this analysis on other datasets they might want to compare.

      6. In the pharmacological studies, the authors conclude that the phenotypes they observe (DNA damage and reduced pachytene spermatocytes) are due to loss of or persistence of cilia. This overinterprets the experiment. Chloral hydrate and MLN8237 certainly impact ciliation as claimed, but have additional cellular effects. Thus, it is possible that the observed phenotypes were not a direct result of cilia manipulation. Either additional controls must address this or the conclusions need to be more specific and toned down.

      7. Assuming the conclusions of the pharmacological studies hold true with the proper controls, the authors still conflate their findings with meiotic defects. Meiosis is not directly assayed, which makes this conclusion an overstatement of the data. The conclusions need to be rephrased to accurately reflect the data.

      8. It is not clear why the authors chose not to use widely accepted assays of Hedgehog signaling. Traditionally, pathway activation is measured by transcriptional output, not GLI protein expression because transcription factor expression does not necessarily reflect transcription levels of target genes.

      9. Also in the Hedgehog pathway experiment, it is confusing that the authors report no detection of SMO yet detect little to no expression of GLIR in their western blot. Undetectable SMO indicates Hedgehog signaling is inactive, which results in high levels of GLIR. The impact of this is that it is not clear what is going on with Hh signaling in this system.

      10. There are multiple instances where it is not clear whether the authors performed statistical analysis on their data, specifically when comparing the percent composition of a population. The authors need to include appropriate statistical tests to make claims regarding this data. While the authors state some impressive sample sizes, once evaluated in individual categories (eg specific cell type and age) the sample sizes of evaluated cilia are as low as 15, which is likely underpowered. The authors need to state the n for each analysis in the figures or legends.

      Minor concerns:

      1. The phrase "lactating male" is used throughout the paper and is not correct. We assume this term to mean male pups that have yet to be weaned from their lactating mother, but "lactating male" suggests a rare disorder requiring medical intervention. Perhaps "pre-weaning males" is what the authors meant.

      2. The convention used to label the figures in this paper is confusing and difficult to read as there are multiple panels with the same letter in the same figure (albeit distinct sections). Labeling panels in the standard A-Z format is preferred. "Panel Z" is easier to identify than "panel III-E".

      Significance

      Overall, this is a well-done body of work that deserves recognition for the novel and implicative discoveries it presents. Assuming the conclusions hold true following appropriate statistical analysis and rephrasing, this paper would report the first documented evidence of meiotic cilia in the developing mammalian testis with sufficient rigor to become the foundational work on this topic.

      This paper will be of interest to communities focused on germ cell development, cilia, and Hedgehog signaling. It may prompt a new perspective on Desert Hedgehog signaling as it pertains to spermatogenesis. Further, this work will be of interest to those studying male fertility, as it highlights the potential role of cilia in spermatogenesis.

      Further, the proteomic analysis presented has the potential to invoke hypotheses and experimentation investigating the role of several proteins with previously uncharacterized roles in ciliogenesis, flagellogenesis, and/or spermatogenesis. The finding that the onset of ciliogenesis and flagellogenesis appear to be temporally linked has the potential to prompt research regarding shared molecular mechanisms dictating axonemal formation. We believe this paper has the potential to have an impact in its respective field, underscored by the exquisite microscopy and detailed characterization of meiotic cilia.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study focuses on the dynamic changes of ciliogenesis during meiosis in prepubertal mice. It was found that primary cilia are not an intrinsic feature of the first wave of meiosis (initiating at 8 dpp); instead, they begin to polymerize at 20 dpp (after the completion of the first wave of meiosis) and are present in all stages of prophase I. Moreover, prepubertal cilia (with an average length of 21.96 μm) are significantly longer than adult cilia (10 μm). The emergence of cilia coincides temporally with flagellogenesis, suggesting a regulatory association in the formation of axonemes between the two. Functional experiments showed that disruption of cilia by chloral hydrate (CH) delays DNA repair, while the AURKA inhibitor (MLN8237) delays cilia disassembly, and centrosome migration and cilia depolymerization are mutually exclusive events. These findings represent the first detailed description of the spatiotemporal regulation and potential roles of cilia during early testicular maturation in mice. The discovery of this phenomenon is interesting; however, there are certain limitations in functional research.

      Major points:

      1. The prepubertal cilia in spermatocytes discovered by the authors lack specific genetic ablation to block their formation, making it impossible to evaluate whether such cilia truly have functions, because this type of cilium seems to be essential neither in the first wave of spermatogenesis nor in adult spermatogenesis. In addition, the authors also imply that the formation of such cilia appears to be synchronized with the formation of sperm flagella. This suggests that the production of such cilia may merely be transient protein expression noise rather than a functionally meaningful cellular structure.

      2. The high expression of axoneme assembly regulators such as TRiC complex and IFT proteins identified by proteomic analysis is not particularly significant. This time point is precisely the critical period for spermatids to assemble flagella, and TRiC, as a newly discovered component of flagellar axonemes, is reasonably highly expressed at this time. No intrinsic connection with the argument of this paper is observed. In fact, this testicular proteomics has little significance.

      Significance

      Strengths: The discovery of a very interesting time window for ciliary growth in spermatocytes.

      Weaknesses: Insufficient analysis of the function of such cilia.

      Readers: Developmental biologists, reproductive biologists

      My expertise: Spermatogenesis, genetics

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript by Perez-Moreno et al., titled "The dynamics of ciliogenesis in prepubertal mouse meiosis reveal new clues about testicular maturation during puberty", the authors characterize the development of primary cilia during meiosis in juvenile male mice. The authors catalog a variety of testicular changes that occur as juvenile mice age, such as changes in testis weight and germ cell-type composition. They next show that meiotic prophase cells initially lack cilia, and ciliated meiotic prophase cells are detected after 20 days postpartum, coinciding with the time when post-meiotic spermatids within the developing testes acquire flagella. They describe that germ cells in juvenile mice harbor cilia at all substages of meiotic prophase, in contrast to adults where only zygotene stage meiotic cells harbor cilia. The authors also document that cilia in juvenile mice are longer than those in adults. They characterize cilia composition and structure by immunofluorescence and EM, highlighting that cilia polymerization may initially begin inside the cell, followed by extension beyond the cell membrane. Additionally, they demonstrate ciliated cells can be detected in adult human testes. The authors next perform proteomic analyses of whole testes from juvenile mice at multiple ages, which may not provide direct information about the extremely small numbers of ciliated meiotic cells in the testis, and is lacking follow up experiments, but does serve as a valuable resource for the community. Finally, the authors use a seminiferous tubule culturing system to show that chemical inhibition of Aurora kinase A likely inhibits cilia depolymerization upon meiotic prophase I exit and leads to an accumulation of metaphase-like cells harboring cilia. They also assess meiotic recombination progression using their culturing system, but this is less convincing.

      Few suggestions/comments are listed below:

      Major comments

      1. There are a few issues with the experimental set up for assessing the effects of cilia depolymerization on DNA repair (Figure 7-II). First, how were mid pachytene cells identified and differentiated from early pachytene cells (which would have higher levels of gH2AX) in this experiment? I suggest either using H1t staining (to differentiate early/mid vs late pachytene) or the extent of sex chromosome synapsis. This would ensure that the authors are comparing similarly staged cells in control and treated samples. Second, what were the gH2AX levels at the starting point of this experiment? A more convincing set up would be if the authors measure gH2AX immediately after culturing in early and late cells (early would have higher gH2AX, late would have lower gH2AX), and then again after 24hrs in late cells (upon repair disruption the sampled late cells would have high gH2AX). This would allow them to compare the decline in gH2AX (i.e., repair progression) in control vs treated samples. Also, it would be informative to know the starting gH2AX levels in ciliated vs non-ciliated cells as they may vary.

      2. The authors analyze meiotic progression in cells cultured with/without AURKA inhibition in Figure 8-III and conclude that the distribution of prophase I cells does not change upon treatment. Is Figure 8-III A and B the same data? The legend text is incorrect, so it's hard to follow. Figure 8-III A shows a depletion of EdU-labelled pachytene cells upon treatment. Moreover, the conclusion that a higher proportion of ciliated zygotene cells upon treatment (Figure 8-II C) suggests that AURKA inhibition delays cilia depolymerization (page 13 line 444) does not make sense to me.

      3. How do the authors know that there is a monopolar spindle in Figure 8-IV treated samples? Perhaps the authors can use a different Tubulin antibody (that does not detect only acetylated Tubulin) to show that there is a monopolar spindle.

      4. The authors state in the abstract that they provide evidence suggesting that centrosome migration and cilia depolymerization are mutually exclusive events during meiosis. This is not convincing with the data present in the current manuscript. I suggest amending this statement in the abstract.

      Minor comments

      1. The presence of cilia in all stages of meiotic prophase I in juvenile mice is intriguing. Why is the cellular distribution and length of cilia different in prepubertal mice compared to adults (where shorter cilia are present only in zygotene cells)? What is the relevance of these developmental differences? Do cilia serve prophase I functions in juvenile mice (in leptotene, pachytene etc.) that are perhaps absent in adults?

      Related to the above point, what is the relevance of the absence of cilia during the first meiotic wave? If cilia serve a critical function during prophase I (for instance, facilitating DSB repair), does the lack of cilia during the first wave imply differing cilia (and repair) requirements during the first vs latter spermatogenesis waves?

      In my opinion, these would be interesting points to discuss in the discussion section.

      1. The authors state on page 9 lines 286-288 that the presence of cytoplasmic continuity via intercellular bridges (between developmentally synchronous spermatocytes) hints towards a mechanism that links cilia and flagella formation. Please clarify this statement. While the correlation between the timing of appearance of cilia and flagella in cells that are located within the same segment of the seminiferous tubule may be hinting towards some shared regulation, how would cytoplasmic continuity participate in this regulation? Especially since the cytoplasmic continuity is not between the developmentally distinct cells acquiring the cilia and flagella?

      2. Individual germ cells in H&E-stained testis sections in Figure 1-II are difficult to see. I suggest adding zoomed-in images where spermatocytes/round spermatids/elongated spermatids are clearly distinguishable.

      3. In Figure 2-II B, the authors document that most ciliated spermatocytes in juvenile mice are pachytene. Is this because most meiotic cells are pachytene? Please clarify. If the data are available (perhaps could be adapted from Figure 1-III), it would be informative to see a graph representing what proportions of each meiotic prophase substages have cilia.

      4. I suggest annotating the EM images in Sup Figure 2 and 3 to make it easier to interpret.

      5. The authors claim that the ratio between GLI3-FL and GLI3-R is stable across their analyzed developmental window in whole testis immunoblots shown in Sup Figure 5. Quantifying the bands and normalizing to the loading control would help strengthen this claim as it is hard to interpret the immunoblot in its current form.

      6. There are a few typos throughout the manuscript. Some examples: page 5 line 172, Figure 3-I legend text, Sup Figure 5-II callouts, Figure 8-III legend, page 15 line 508, page 17 line 580, page 18 line 611.

      Significance

      This work provides new information about an important but poorly understood cellular structure present in meiotic cells, the primary cilium. More generally, this work expands on our understanding of testis development in juvenile mice. The microscopy images presented here are beautiful. The work is mostly descriptive but lays the groundwork for future investigations. I believe that this study would be of interest to the germ cell, meiosis, and spermatogenesis communities, and with a few modifications, is suitable for publication.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank the three reviewers for their careful reading of our manuscript and suggested modifications. We have incorporated their suggestions as described below; these changes have significantly improved the structure and focus of the manuscript.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary

      The possibility of observing 3D cellular organisation in tissues at nanometre resolution is a hope for many cell biologists. Here, the authors have combined two volume electron microscopy approaches with scanning electron microscopy: Focused Ion Beam (FIB-SEM) and Array Tomography (AT-SEM) to study the evolution of the shape and organisation of cytoplasmic bridges, the 'ring canals' (RCs) in the Drosophila ovarian follicle that connect nurse cells and oocyte. This type of cytoplasmic link, found in insects and humans, is essential for oocyte development.

      RCs have mainly been studied using light microscopy with various markers that constitute them, but this approach does not fully capture an overall view of their organization. Due to their three-dimensional arrangement within the ovarian follicle, characterizing their organization using transmission electron microscopy (TEM) has been very limited until now. This v-EM study allows the authors to document the evolution of RC size and thickness during the development of germline cysts, from the germarium to stage 4, and potentially beyond. This study confirmed previous findings, namely that RC size correlates with lineage: the largest RC is formed after the first division, while the smallest is formed during the last division.

      Furthermore, this work allowed a better characterisation of the membrane interdigitation surrounding the RCs. In addition, the authors highlight the important potential of v-EM for further structural analysis of the fusome, migrating border cells and the stem cell niche.

      Major comments

      The output of this work can be divided into two parts. First, this work presents a technical challenge, involving image acquisition by volume electron microscopy and manual 3D reconstruction of the contours of the membranes, nuclei, RCs, and fusome in different cysts at different stages.

      Secondly, this work is based on a structural study of the RCs and their associated membranes. This work is descriptive but important, although the results largely confirm previous findings, both for the structure of the RCs and their relationship to the division sequence of the cyst cells, and for the organisation of the membranes around the RCs.

      Very interestingly, the authors report the spatial characterisation of membrane structures associated with and close to RCs that have already been identified (Loyer et al.). However, their characterisation is somewhat incomplete, as it lacks quantified data: how many RCs were analysed and, above all, what are the characteristics of these membranes (their length and orientation according to their position and their connection in the lineage)? These data could be obtained from the vEM data already collected and would be an important addition to the RC structural analysis in this work.

      *Following the suggestions of this reviewer, we have reduced the emphasis on the technical approach to better highlight the ring canal data. We have summarized the ring canal measurements in graphs presented in Fig. 4B, C and included the sample sizes for these measurements in the figure legend.*

      *To gain further insight into the membrane interdigitations, we have developed a detailed model of the oocyte and four ring canals that connect to the posterior nurse cells of the stage 4 egg chamber (Fig. 5). From this model, we see that the interdigitations are longer and more abundant than in the germarium (Fig. S5), but not as extensive as in the stage 8 egg chamber (Fig. 6). The interdigitations were not all oriented in the same direction, and we did not observe an obvious correlation between interdigitation number, orientation, and lineage. We plan to continue to explore these structures in future studies.*

      In line with this, the authors importantly report the presence of an ER-like membrane structure lining the RCs. First, it would be nice to have statistics to support the observation of how many RCs..? Secondly, does this ER membrane structure vary according to the position of the RC in the cyst, are they related to the RC lineage?

      *We appreciate the reviewer's interest in this novel ER-like structure lining the ring canals. We have generated a detailed model of these structures within the stage 4 egg chamber (Fig. 5D,E). However, because we do not have data from a large number of egg chambers, we believe that performing statistics would not be appropriate.*

      The addition of graphs showing the quantitative data with statistics in the figures would improve understanding of the results. This is particularly the case for the characterisation of RCs according to the stage of cyst development, as shown in Figure 3. This also applies to the characterisation of RCs within a cyst and the relationship between RC size and lineage, as shown in Figure 4, and to the characterisation (thickness) of the inner part of the RC.

      *We have included graphs of ring canal diameter based on stage (Fig. 4B) or lineage (Fig. 4C); however, because we only have data from a few germline cysts, we have not performed any statistical analysis.*

      The part on the structural analysis of the fusome is interesting but still secondary to the characterisation of the RCs. This part should be moved to the results and figures after the various parts concerning the RCs.

      *We have deemphasized the fusome structural analysis in the results section; however, we chose to leave these images in the figures, since there could be a connection between the novel ER-like structures and the fusome.*

      Minor comments

      The distribution of the fusome in Figure 2 is difficult to see with Hts labelling and does not really correspond to the schematic, especially in regions 2a and 2b.

      *We have modified the images and the schematic.*

      In panel C of Figure 2, it is a little distracting that the legend is placed directly on the image of the RC. It hides some information in the images and could be placed at the bottom of the panel. This is also the case for panel G.

      We understand the possible confusion and have changed the layout in the figure.

      In Figure 3B, it would be good to highlight the position of the cyst.

      We have pseudocolored the portion that corresponds to the relevant cyst in the same color used for the reconstruction (which is now Fig. 3A).

      Reviewer #1 (Significance (Required)): As mentioned above, this work can be divided into two parts. The part corresponding to the acquisition of images by volume electron microscopy and manual 3D reconstruction is new and a great source of valuable information. The part related to the spatial characterisation of the RC is important, but corresponds more to an extension and reinforcement of previously available information than to the contribution of significant new insights. I think it will be of great interest to an audience interested in Drosophila oogenesis.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study presents a high-resolution volumetric analysis of germline ring canals (RCs) during Drosophila oogenesis. By combining two complementary electron microscopy techniques-Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) and Array Tomography Scanning Electron Microscopy (AT-SEM)-the authors compare RC structural features at different developmental stages, ranging from the relatively small germarium to the significantly larger, later-stage egg chambers.

      At early stages of oogenesis, FIB-SEM analysis confirms that the average RC size increases progressively with cyst development, in agreement with previous studies. The authors further show that lineage reliably predicts RC size (an observation previously reported, but here identified at an earlier stage in region 2a) and, importantly, that the thickness of the actin rim can also be predicted by lineage (reported here for the first time, at stage 1). FIB-SEM analysis also enables a clear delineation of the fusome, allowing for detailed characterization of its assembly and disassembly. Notably, the authors report, for the first time, structural evidence of ER-like membranes capping the inner rim of actin RCs.

      At later developmental stages, AT-SEM analysis reveals that the microvilli observed by FIB-SEM evolve into extensive interdigitations extending beyond the outer rim in mid-stage egg chambers, a structural feature detected earlier than previously reported. Moreover, by analyzing a sample in which tissue organization was disrupted during preparation, the authors demonstrate that these interdigitations preferentially occur in proximity to the RC. In addition to RC analysis at later stages, the authors use AT-SEM to readily identify small cell populations, such as the germline stem cell niche and border cells, and provide high-resolution volumetric EM data for these structures.

      MAJOR COMMENT

      My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy.

      *Although TEM data has been used to perform foundational studies in the field, there are limitations to this approach. Due to the size of the ring canals, it is challenging to locate them within the large volume of the egg chamber (especially at later stages). Even if ring canals can be located, they are typically not oriented the same way, so a single section is not sufficient. Although some of the results shown by our complementary vEM approaches do confirm results that have been previously reported by TEM or fluorescence microscopy, our approach provides important additional insight into structures that have been studied for many decades that would not be possible using other approaches. Further, this approach has identified a novel membrane structure lining the ring canals, and it has provided structural details of the membrane interdigitations that would not be possible with conventional electron microscopy. Further, this complementary set of vEM approaches would be applicable to the study of many other structures within other tissue types.*


      One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).

      *We appreciate the reviewer's interest in the fusome, and we agree that our approach has provided significant insight into its three-dimensional structure. The rendering of the fusome was performed using a large number of small isosurface volumes, and it is therefore difficult to accurately determine the fusome volume, since additional (non-fusome) material could be included in the model. Further, the fusomes that were rendered were within the germline clusters from region 2b, where the fusome has already started to break down, so these would not provide an accurate quantification of the full fusome volume. Because the focus of the manuscript is on the germline ring canals and associated structures such as the interdigitations (which we have tried to further streamline in this revised version), we believe that additional analysis of the fusome is outside of the scope of this work. *
      

      MINOR COMMENT • The fluorescent markers used in the fly stocks are neither described in the Materials and Methods section nor depicted in the figures.

      *We apologize if this was not clear in the original manuscript. Based on the comment from Reviewer #3 (see below), we have repeated the Hts staining using flies that do not have CheerioYFP in the background. We have also clarified the materials and methods section to indicate the panels that correspond with each strain used. *

      • The authors should quote (Nashchekin et al., Science, 2021) when mentioning unequal partitioning of the fusome (p4) and oocyte determination (p12). *We have added the reference to these parts of the manuscript. *

      • P11-12, when mentioning electron dense regions reflecting strong cell-cell adhesion, the authors could refer to (Fichelson et al. Development, 2010), where AJ have been described around ring canals. *We have added the reference to this part of the manuscript. *

      • Figure 2A: The schematic diagram (4th line) is not explained in the figure legend. *We have updated the figure legend to describe this schematic. *

      • Figure 2D: Please clarify whether the RC stage shown corresponds to stage 1 or stage 10, as indicated in panel 2E. Alternatively, are these examples representing the minimum and maximum RC sizes observed across the entire dataset? *These were not meant to be examples of the minimum and maximum ring canal sizes observed across the dataset. Instead, they were used to demonstrate the significant expansion that occurs during oogenesis. In the updated version of this figure, this panel has been removed. *

      • Figure 5D: Please specify which panel in 5B this corresponds to. • Figure 5E: Please specify which panels in 5B this corresponds to. The two green boxes are not defined. Why is there a grey background under the ovariole assembly? • Figures 5G, 5H: Does panel 5G correspond to the left green box in 5E, and 5H to the right green box in 5E? Please clarify. *We have modified Figure 5 and merged it with the figure 6. In this updated format, panels 5B and 5E have been removed. *

      • Figure 6: The figure title is not on the same page as the figure itself.

      *We have made this change. *

      • Figure 6A: The black box marking the germarium is not defined. *In this revised version, we have modified Fig. 6, and this panel has been removed. *

      • Figure 6B-E: The arrows point to long interdigitations. However, arrowheads (which are not mentioned in the legend) appear to indicate the RC outer rim. Please specify this clearly in the figure legend. *In the updated version of Fig. 6, these arrowheads have been removed. *

      Reviewer #2 (Significance (Required)):

      I am not an expert in electron microscopy, so I cannot comment in detail on these techniques, but they appear to bridge the gap between conventional EM and optical microscopy in terms of resolution, user-friendliness, and other aspects. This is technically interesting, although these EM approaches have been previously described and applied. The images and movies are beautiful and clearly presented. My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy.

      One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Kolotuev et al. used two volume-based electron microscopy approaches to identify, segment, and document the changes in intercellular bridges, or ring canals, in early egg chambers of the fruit fly, Drosophila melanogaster. Using array tomography and focused ion beam scanning electron microscopy, Kolotuev et al. provide a high-resolution and content-rich lineage analysis of ring canal size, shape and orientation among early and late egg chambers. Their analysis included parameters such as the presence and shape of the fusome, the recruitment of actin to the inner ring, and development of membrane fingers that presumably spatially stabilize such structures. Last, Kolotuev and co-authors highlight additional aspects of their dataset including a reconstruction of the border cell cluster in stage 9 egg chambers. The data presented are a treasure trove of the ultrastructural features of the developing dipteran germline and subsequent ovarian follicle development. The data presented represent the highest resolution 3D dataset available and thus are a valuable and worthwhile contribution to the field. My overall impression is that this paper sits intellectually between a valuable method and a loose experimental manuscript. This critique is not requesting additional experimental evidence because the data are unique and are the foundation for a new experimental paradigm. But there is not sufficient detail presented to be a full method, nor any hypothesis testing to be considered experimental. I suggest the authors consider amplifying their methods in detail and then note that using these methods provides a foundation for additional future investigations (as mentioned in the discussion). Problems with data interpretation and presentation should be addressed before publication. Below are the major and minor concerns that I believe need to be considered.

      Major comments: In general, images in figures are thought-provoking; however, changes to figure layout and design should be considered to better highlight the results. For instance, I don't know how to follow figure 1a. The arrow leads from a whole ovary to an ovulated egg with an ovariole strand connecting the two. What is the purpose of the arrow? Is it to represent time? And why is the mature egg in the figure when no data regarding this stage is presented? The authors should consider removing the mature egg and helping the reader understand that the ovariole is a subset of the whole ovary. They might do this by putting a box around a single ovariole in the whole ovary to indicate their ovariole illustration. Several other figures have similar problems. Throughout, the authors used black and white arrows on black and white EM data and these arrows were lost. Color should be considered to effectively point out what they want the reader to see.

      We have modified the layout of Fig. 1 and added additional explanation to the introduction and figure legend to guide readers through the introduction to the system. We have also added color to some of the arrows throughout the manuscript.

      Can the authors provide additional information for the genotypes used? For instance the Cheerio-YFP (which might affect actin). When was this used, and can the authors provide information on how this affected the data between when it was used and when it was not used? Additionally, why was analysis done in transgenic flies over fully wild-type?

      *We have repeated the Hts staining in Fig. 2A in flies that do not express Cheerio-YFP and have made the appropriate changes to the methods section. For the AT-SEM experiment, we chose to use this genetic background since it would align with that of the negative controls that we often use in RNAi or over-expression experiments. FIB-SEM datasets were collected while imaging other tissues of the fly, so the choice of that genotype was not intentional. However, these datasets provided us with the opportunity to do this proof-of-concept work without such a large financial investment in the acquisition of new image stacks. In the future, we hope to expand this work to generate additional datasets from flies of different genotypes. *

      Figure 1 seeks to lay out the ovary system and narrow the reader into the stages that will be analyzed in subsequent figures. Figure 1B is meant to show the types and kinds of electron microscopy; however, it lacks a full detailed description and legend for each of the colored arrows. And to that fact, so does figure S1. The authors need to provide additional information so the reader can glean the point the authors are trying to convey. In addition, the authors might add pros and cons to each. I know this was attempted in S1, but did not fully come across.

      We appreciate this feedback, and we have modified the layout of Figure 1 and updated Figure S1 to better highlight the technical challenge of EM in general and the benefits of vEM in particular.

      Figure 1 and 2 seek to set up both the biological and technical system to be understood. The authors might consider combining the two figures and eliminate elements that don't represent a result of any kind (Figure 1B, 2B, 3D and 3F). Or more fully explain the result and point they are trying to make with these illustrations. I fully understand and appreciate what they are trying to get across, but it does not come across clearly. For example, I don't know how figure 2B effectively gets across the point that rotation of the image has an effect on how it is sliced and segmented in EM data. Not sure it is necessary. Furthermore, what is the bottom panel with a green ring canal supposed to allow us to interpret or conclude? The same for 3D and F. The result in 3E is far more interesting and should be two panels that emphasize the growth characteristics between young and old rings or those of M1 and M4.

              *We greatly appreciate these suggestions, and we have modified and reorganized several figures to make the flow of scientific ideas easier to follow.* *We have moved panel 1B to the supplementary figure and gave additional indications in the text as to the differences between the EM methods. We have moved panel 2B to the supplementary material. We have moved Fig. 3D to Fig. S5A,B. Fig. 5 now provides more extensive rendering of membrane interdigitations from the stage 4 egg chamber. We have chosen to leave Fig. 3F to allow readers to compare the novel ER-like structures within the ring canals to the fusome that is present within younger germline clusters. *
      

      The HTS and actin stain in figure 2A overlap significantly and obscure the fusome staining. Can the authors confirm that there is no bleed through in their staining and imaging procedure?

      *We have repeated this staining and can confirm that there was no bleed through between the two channels. *

      The data in Figure 2C are critical to showing the z-resolution enhancement of sectioned EM. However, the use of green pseudocolor only in one panel is confusing. Can the authors duplicate the whole panel and provide one without and one with pseudocolor? This would be ideal for fully orienting the reader to the sectioning and setting them up to understand the rest of the figures.

      *In the revised version of Figure 2, we have split the sections into two rows of panels; we have added the pseudocolor to every other section (in the bottom row of panels). *


      The results section for figure 2 does not outline the results presented. For example, the germarium contains syncytia of differing stages and ring canals with intervening fusomes... It does more to talk about the pros and cons of different technical aspects and their difficulty. This should be saved for the rationale or the discussion. Rather the section should outline the results presented.

      *We have modified the layout of figure 2 in order to describe the system in a more straightforward manner with a smoother transition from Figure 1 while further explaining technical points. *

      I appreciate the color coding of the differentially segmented cysts in Figure 3. The color coding helped orient me to which cysts were being evaluated. However, I found the lack of detail bothersome. For instance, which ring canals are in the two panels of D? Are they M1 or M4?

      *With the additional analysis of the interdigitations in the stage 4 cluster, we have moved panel D to Fig. S5. We did not have enough coverage of the region 2a cluster (red) to determine lineage, but we have added a statement to the legend to indicate that the ring canal shown in Fig. S5B is an M1 ring canal. *

      Also, the presentation of ring canal size and distribution should be presented in a graph. Statistics are not necessary, but a dot-plot would go a long way to presenting the result. Two plots can add value, one in which the ring canals for each phase is shown, and the other is the distribution of sizes for each cyst.

      *We have added these graphs in Fig. 4B, C. *

      Lastly, the results section for figure 3 interprets the membrane bound vesicles in the ring canal as "ER-like". This should be removed since they neither look ER-like to me, nor have been shown to be ER in the data.

      *We appreciate this suggestion, and although we cannot be absolutely certain of the identity of these structures without further study, with our additional analysis of the stage 4 egg chamber, we are further convinced of the similar appearance of these novel structures and the ER in other regions of the nurse cell (Fig. 5). We have clarified this point in the text. *

      Figure 4A is not called out specifically in the results and thus should be interpreted or removed from the figure.

      In this revised version, we have removed panel 4A.

      Figure 5 was confusing. I understand the authors wanted to show the wafer and the ribbons, however, this is not a result and does not offer any interpretation of a result and is thus confusing on why it is in the figure. If this were a method paper, I would understand its presence.

      *We have removed this panel from the figure. *

      Can the authors comment on the shape of the nuclei in older egg chambers? They are not round at all. I am interested in whether this is a fixation artifact or the real ultrastructure of the nuclei. Of the border cell nuclei for instance. If it is an artifact, this should be added to the discussion.

      *Some of the nuclei appear to have a peculiar shape in the cross-section. We cannot entirely exclude the role of the fixation in the shape irregularities. However, since not all the nuclei are subject to this phenomenon, we are inclined to attribute it to the intrinsic qualities of the late-stage nuclei. In numerous cases, different tissue and cell stages determine the shape of the nucleus, which frequently deviates from a spherical shape. *

      Although data from "imperfect" samples is interesting, consider relegating Figure 6 to the supplement section, as it takes away from the pre-existing narrative flow established in the paper.

      *In this draft, we have combined parts of figures 5 and 6, and much of the data from the imperfect sample has been removed. *

      Interpretation of the data throughout the results should be left to the discussion section. For instance, interpretation of Figure 4 results on page 14 beginning with "these data demonstrate the importance...". The importance is not related to the result, but rather discussion of past and future studies.

      We have removed this sentence from the results.

      In another example, Figure 5I is introduced and discussed in the results section on page 15, second whole paragraph with an overall introduction/discussion on junctions, which convolutes the actual result. Discussion of future studies or how structures like the novel membrane fingers should be viewed in a larger biological context, should not be in the results.

      We have made this change.

      Minor comments: Remove words such as "pseudo-timelapse", they invoke precision on a point that is imprecise.

      *This has been removed. *

      Re-consider the acronyms for ring canal and egg chamber.

      *We have removed these acronyms. *

      Consider finding another way to call out each supplemental movie other than with another acronym.

      *We have added small icons to indicate that a supplemental movie is associated with a given figure or panel. *

      Reviewer #3 (Significance (Required)): The present manuscript is a technical advance in the field. The use of serial EM imaging with two separate modalities, on what is considered to be a challenging problem in the field, represents a useful technical advance. Light microscopy has thus far limited the resolution to which we can understand the spatial organization and the cellular features therein that regulate germline development. This manuscript brings to bear two serial EM methods to begin approaching this problem. The audience for this work are those working at the forefront of understanding germline architecture and development. I make these statements as an expert in live and super resolution of fruit fly egg chamber development, in addition to having performed 3D SEM in past works.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Kolotuev et al. used two volume-based electron microscopy approaches to identify, segment, and document the changes in intercellular bridges, or ring canals, in early egg chambers of the fruit fly, Drosophila melanogaster. Using array tomography and focused ion beam scanning electron microscopy, Kolotuev et al. provide a high-resolution and content-rich lineage analysis of ring canal size, shape and orientation among early and late egg chambers. Their analysis included parameters such as the presence and shape of the fusome, the recruitment of actin to the inner ring, and development of membrane fingers that presumably spatially stabilize such structures. Last, Kolotuev and co-authors highlight additional aspects of their dataset including a reconstruction of the border cell cluster in stage 9 egg chambers. The data presented are a treasure trove of the ultrastructural features of the developing dipteran germline and subsequent ovarian follicle development. The data presented represent the highest resolution 3D dataset available and thus are a valuable and worthwhile contribution to the field. My overall impression is that this paper sits intellectually between a valuable method and a loose experimental manuscript. This critique is not requesting additional experimental evidence because the data are unique and are the foundation for a new experimental paradigm. But there is not sufficient detail presented to be a full method, nor any hypothesis testing to be considered experimental. I suggest the authors consider amplifying their methods in detail and then note that using these methods provides a foundation for additional future investigations (as mentioned in the discussion). Problems with data interpretation and presentation should be addressed before publication. Below are the major and minor concerns that I believe need to be considered.

      Major comments:

      • In general, images in figures are thought-provoking; however, changes to figure layout and design should be considered to better highlight the results. For instance, I don't know how to follow figure 1a. The arrow leads from a whole ovary to an ovulated egg with an ovariole strand connecting the two. What is the purpose of the arrow? Is it to represent time? And why is the mature egg in the figure when no data regarding this stage is presented? The authors should consider removing the mature egg and helping the reader understand that the ovariole is a subset of the whole ovary. They might do this by putting a box around a single ovariole in the whole ovary to indicate their ovariole illustration. Several other figures have similar problems. Throughout, the authors used black and white arrows on black and white EM data and these arrows were lost. Color should be considered to effectively point out what they want the reader to see.

      • Can the authors provide additional information for the genotypes used? For instance the Cheerio-YFP (which might affect actin). When was this used, and can the authors provide information on how this affected the data between when it was used and when it was not used? Additionally, why was analysis done in transgenic flies over fully wild-type? Figure 1 seeks to lay out the ovary system and narrow the reader into the stages that will be analyzed in subsequent figures. Figure 1B is meant to show the types and kinds of electron microscopy; however, it lacks a full detailed description and legend for each of the colored arrows. And to that fact, so does figure S1. The authors need to provide additional information so the reader can glean the point the authors are trying to convey. In addition, the authors might add pros and cons to each. I know this was attempted in S1, but did not fully come across. Figure 1 and 2 seek to set up both the biological and technical system to be understood. The authors might consider combining the two figures and eliminate elements that don't represent a result of any kind (Figure 1B, 2B, 3D and 3F). Or more fully explain the result and point they are trying to make with these illustrations. I fully understand and appreciate what they are trying to get across, but it does not come across clearly. For example, I don't know how figure 2B effectively gets across the point that rotation of the image has an effect on how it is sliced and segmented in EM data. Not sure it is necessary. Furthermore, what is the bottom panel with a green ring canal supposed to allow us to interpret or conclude? The same for 3D and F. The result in 3E is far more interesting and should be two panels that emphasize the growth characteristics between young and old rings or those of M1 and M4.

      • The HTS and actin stain in figure 2A overlap significantly and obscure the fusome staining. Can the authors confirm that there is no bleed through in their staining and imaging procedure?

      • The data in Figure 2C are critical to showing the z-resolution enhancement of sectioned EM. However, the use of green pseudocolor only in one panel is confusing. Can the authors duplicate the whole panel and provide one without and one with pseudocolor? This would be ideal for fully orienting the reader to the sectioning and setting them up to understand the rest of the figures.

      • The results section for figure 2 does not outline the results presented. For example, the germarium contains syncytia of differing stages and ring canals with intervening fusomes... It does more to talk about the pros and cons of different technical aspects and their difficulty. This should be saved for the rationale or the discussion. Rather the section should outline the results presented.

      • I appreciate the color coding of the differentially segmented cysts in Figure 3. The color coding helped orient me to which cysts were being evaluated. However, I found the lack of detail bothersome. For instance, which ring canals are in the two panels of D? Are they M1 or M4? Also, the presentation of ring canal size and distribution should be presented in a graph. Statistics are not necessary, but a dot-plot would go a long way to presenting the result. Two plots can add value, one in which the ring canals for each phase is shown, and the other is the distribution of sizes for each cyst. Lastly, the results section for figure 3 interprets the membrane bound vesicles in the ring canal as "ER-like". This should be removed since they neither look ER-like to me, nor have been shown to be ER in the data.

      • Figure 4A is not called out specifically in the results and thus should be interpreted or removed from the figure.

      • Figure 5 was confusing. I understand the authors wanted to show the wafer and the ribbons, however, this is not a result and does not offer any interpretation of a result and is thus confusing on why it is in the figure. If this were a method paper, I would understand its presence.

      • Can the authors comment on the shape of the nuclei in older egg chambers? They are not round at all. I am interested in whether this is a fixation artifact or the real ultrastructure of the nuclei. Of the border cell nuclei for instance. If it is an artifact, this should be added to the discussion.

      • Although data from "imperfect" samples is interesting, consider relegating Figure 6 to the supplement section, as it takes away from the pre-existing narrative flow established in the paper. Interpretation of the data throughout the results should be left to the discussion section. For instance, interpretation of Figure 4 results on page 14 beginning with "these data demonstrate the importance...". The importance is not related to the result, but rather discussion of past and future studies. In another example, Figure 5I is introduced and discussed in the results section on page 15, second whole paragraph with an overall introduction/discussion on junctions, which convolutes the actual result. Discussion of future studies or how structures like the novel membrane fingers should be viewed in a larger biological context, should not be in the results.

      Minor comments:

      • Remove words such as "pseudo-timelapse", they invoke precision on a point that is imprecise.

      • Re-consider the acronyms for ring canal and egg chamber.

      • Consider finding another way to call out each supplemental movie other than with another acronym.

      Significance

      The present manuscript is a technical advance in the field. The use of serial EM imaging with two separate modalities, on what is considered to be a challenging problem in the field, represents a useful technical advance. Light microscopy has thus far limited the resolution to which we can understand the spatial organization and the cellular features therein that regulate germline development. This manuscript brings to bear two serial EM methods to begin approaching this problem. The audience for this work are those working at the forefront of understanding germline architecture and development. I make these statements as an expert in live and super resolution of fruit fly egg chamber development, in addition to having performed 3D SEM in past works.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study presents a high-resolution volumetric analysis of germline ring canals (RCs) during Drosophila oogenesis. By combining two complementary electron microscopy techniques, Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) and Array Tomography Scanning Electron Microscopy (AT-SEM), the authors compare RC structural features at different developmental stages, ranging from the relatively small germarium to the significantly larger, later-stage egg chambers. At early stages of oogenesis, FIB-SEM analysis confirms that the average RC size increases progressively with cyst development, in agreement with previous studies. The authors further show that lineage reliably predicts RC size (an observation previously reported, but here identified at an earlier stage in region 2a) and, importantly, that the thickness of the actin rim can also be predicted by lineage (reported here for the first time, at stage 1). FIB-SEM analysis also enables a clear delineation of the fusome, allowing for detailed characterization of its assembly and disassembly. Notably, the authors report, for the first time, structural evidence of ER-like membranes capping the inner rim of actin RCs. At later developmental stages, AT-SEM analysis reveals that the microvilli observed by FIB-SEM evolve into extensive interdigitations extending beyond the outer rim in mid-stage egg chambers, a structural feature detected earlier than previously reported. Moreover, by analyzing a sample in which tissue organization was disrupted during preparation, the authors demonstrate that these interdigitations preferentially occur in proximity to the RC. In addition to RC analysis at later stages, the authors use AT-SEM to readily identify small cell populations, such as the germline stem cell niche and border cells, and provide high-resolution volumetric EM data for these structures.

      MAJOR COMMENT

      My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy. One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).

      MINOR COMMENTS

      • The fluorescent markers used in the fly stocks are neither described in the Materials and Methods section nor depicted in the figures.

      • The authors should quote (Nashchekin et al., Science, 2021) when mentioning unequal partitioning of the fusome (p4) and oocyte determination (p12).

      • P11-12, when mentioning electron dense regions reflecting strong cell-cell adhesion, the authors could refer to (Fichelson et al. Development, 2010), where AJ have been described around ring canals.

      • Figure 2A: The schematic diagram (4th line) is not explained in the figure legend.

      • Figure 2D: Please clarify whether the RC stage shown corresponds to stage 1 or stage 10, as indicated in panel 2E. Alternatively, are these examples representing the minimum and maximum RC sizes observed across the entire dataset?

      • Figure 5D: Please specify which panel in 5B this corresponds to.

      • Figure 5E: Please specify which panels in 5B this corresponds to. The two green boxes are not defined. Why is there a grey background under the ovariole assembly?

      • Figures 5G, 5H: Does panel 5G correspond to the left green box in 5E, and 5H to the right green box in 5E? Please clarify.

      • Figure 6: The figure title is not on the same page as the figure itself.

      • Figure 6A: The black box marking the germarium is not defined.

      • Figure 6B-E: The arrows point to long interdigitations. However, arrowheads (which are not mentioned in the legend) appear to indicate the RC outer rim. Please specify this clearly in the figure legend.

      Significance

      I am not an expert in electron microscopy, so I cannot comment in detail on these techniques, but they appear to bridge the gap between conventional EM and optical microscopy in terms of resolution, user-friendliness, and other aspects. This is technically interesting, although these EM approaches have been previously described and applied. The images and movies are beautiful and clearly presented.

      My main comment is that we don't learn much new about the biology of these ring canals. The results primarily confirm findings from previous studies using conventional electron microscopy. One particularly interesting biological question, which is briefly mentioned in the text, is whether the oocyte is the cell that inherits the majority of the fusome. Since the authors are able to reconstruct the fusome using their data, they could measure the fusome volume in each cell (especially in the two pro-oocytes) and investigate whether the cell with the larger fusome ultimately becomes the oocyte. This question has been discussed for some time, and recent studies have proposed opposing models based on fusome volume to explain how the oocyte is selected among the 16 sister cells (Nashchekin et al., Science, 2021; Barr et al., Genetics, 2024).

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The possibility of observing 3D cellular organisation in tissues at nanometre resolution is a hope for many cell biologists. Here, the authors have combined two volume electron microscopy approaches with scanning electron microscopy: Focused Ion Beam (FIB-SEM) and Array Tomography (AT-SEM) to study the evolution of the shape and organisation of cytoplasmic bridges, the 'ring canals' (RCs) in the Drosophila ovarian follicle that connect nurse cells and oocyte. This type of cytoplasmic link, found in insects and humans, is essential for oocyte development. RCs have mainly been studied using light microscopy with various markers that constitute them, but this approach does not fully capture an overall view of their organization. Due to their three-dimensional arrangement within the ovarian follicle, characterizing their organization using transmission electron microscopy (TEM) has been very limited until now. This v-EM study allows the authors to document the evolution of RC size and thickness during the development of germline cysts, from the germarium to stage 4, and potentially beyond. This study confirmed previous findings, namely that RC size correlates with lineage: the largest RC is formed after the first division, while the smallest is formed during the last division. Furthermore, this work allowed a better characterisation of the membrane interdigitation surrounding the RCs. In addition, the authors highlight the important potential of v-EM for further structural analysis of the fusome, migrating border cells and the stem cell niche.

      Major comments

      • The output of this work can be divided into two parts. First, this work presents a technical challenge, involving image acquisition by volume electron microscopy and manual 3D reconstruction of the contours of the membranes, nuclei, RCs, and fusome in different cysts at different stages. Secondly, this work is based on a structural study of the RCs and their associated membranes. This work is descriptive but important, although the results largely confirm previous findings, both for the structure of the RCs and their relationship to the division sequence of the cyst cells, and for the organisation of the membranes around the RCs.

      • Very interestingly, the authors report the spatial characterisation of membrane structures associated with and close to RCs that have already been identified (Loyer et al.). However, their characterisation is somewhat incomplete, as it lacks quantified data: how many RCs were analysed and, above all, what are the characteristics of these membranes (their length and orientation according to their position and their connection in the lineage)? These data could be obtained from the v-EM data already collected and would be an important addition to the RC structural analysis in this work. In line with this, the authors importantly report the presence of an ER-like membrane structure lining the RCs. First, it would be nice to have statistics to support this observation: in how many RCs was it seen? Secondly, does this ER membrane structure vary according to the position of the RC in the cyst, and is it related to the RC lineage? The addition of graphs showing the quantitative data with statistics in the figures would improve understanding of the results. This is particularly the case for the characterisation of RCs according to the stage of cyst development, as shown in Figure 3. This also applies to the characterisation of RCs within a cyst and the relationship between RC size and lineage, as shown in Figure 4, and to the characterisation (thickness) of the inner part of the RC.

      • The part on the structural analysis of the fusome is interesting but still secondary to the characterisation of the RCs. This part should be moved to the results and figures after the various parts concerning the RCs.

      Minor comments

      • The distribution of the fusome in Figure 2 is difficult to see with Hts labelling and does not really correspond to the schematic, especially in regions 2a and 2B.

      • In panel C of Figure 2, it is a little disturbing that the legend is directly on the image of the RC. It hides some information in the images and could be placed at the bottom of the panel. This is also the case for panel G.

      • In Figure 3B, it would be good to highlight the position of the cyst.

      Significance

      As mentioned above, this work can be divided into two parts.

      The part corresponding to the acquisition of images by volume electron microscopy and manual 3D reconstruction is new and a great source of valuable information. The part related to the spatial characterisation of the RC is important, but corresponds more to an extension and reinforcement of previously available information than to the contribution of significant new insights.

      I think it will be of great interest to an audience interested in Drosophila oogenesis.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Dufour et al. is a follow-up on the group's previous publication that introduced the photo-inducible Cre recombinase, LiCre. In the present work, the authors further characterize the properties and kinetics of their optogenetic switch. Initially, the authors show that light affects only LiCre-mediated recombination itself and not DNA binding. Following these observations, they measure and mathematically model LiCre kinetics, demonstrating high efficiency in vivo and a surprising temperature sensitivity. Finally, Dufour et al. evaluate several mutations that affect the LOV photo-cycle and provide recommendations for LiCre applications. The study thoroughly investigates various aspects of the function of LiCre, confirming some previously known characteristics (i.e. temperature-dependence of Cre activity and functionality of LOV-based optogenetic tools in yeast without co-factor supplementation), while providing new LiCre-specific insights (kinetics, light-independent DNA binding). Please note that the reviewer is no expert in mathematical modeling and cannot fully judge the methodological details of the models. While I have some concerns as listed below, I believe the study should be well-suited for publication after a revision.

      Major comments:

      1. After completing the initial experiment, the authors discovered that their plasmids carry different numbers of V5 epitopes. I am wondering whether this was due to a recombination event happening during the experiment or whether the constructs were not sequence verified prior to use? In any case, an additional ChIP experiment using Cre and LiCre constructs with the identical number of tag-repeats will be necessary. The result, i.e. the strong reduction of DNA-binding of LiCre (which is close to the negative control), is quite remarkable given that LiCre is still considerably active and high DNA affinities were observed in SPR experiments. In light of these counterindications, identical experiment conditions for test and reference group become even more important.
      2. The conclusion that DNA-binding of LiCre is completely light-independent is not entirely convincing to me. The differences between the light and dark conditions in Fig. 2d are indeed small, but the values for LiCre are almost on par with the vector control and therefore hard to interpret. Based on this experiment alone, one could even be inclined to argue that LiCre does not bind DNA at all (which is of course falsified by the later experiments), showing that the resolution of the corresponding dataset is too low to draw final conclusions. Light-independent DNA binding should either be confirmed by a more sensitive method or the conclusion statements on this matter should be revised accordingly.
      3. If I understand the explanations correctly, replicates and plotted data points refer to multiple samples (different colonies), that were handled in a single experiment, i.e. by one researcher at the same time/same day. As already mentioned by the authors in the main text, this workflow explains the considerable differences between some of the results in the present manuscript and an identical experiment in a previous publication by the same authors. Providing truly independent experiments (performed on different days) that are therefore independent towards variables such as the fluctuation in incubation temperature (which was the issue in the described experiments) will be crucial, at least for the key datasets.

      Minor comments:

      1. At the end of the Introduction, the authors mention that the interaction of the Cre heptamers was weakened via point mutations in LiCre. A short sentence about the engineering rationale behind this weakened interaction would help readers who are not familiar with the authors' prior work.
      2. Fig. 2a-b depicts images relating to the purification procedure. These could be moved to the supplements as they don't provide any insight apart from the fact that the proteins were successfully purified.
      3. The kinetic characterization was only performed for LiCre. Especially for scientists, who have worked with wildtype Cre before, a side-by-side comparison with wt Cre would be valuable to judge the loss in reaction speed that has to be expected when switching from Cre to LiCre.
      4. The difference between the ChIP results and the SPR results is striking but not mentioned in the discussion section. Also, the statement: "Finally, our results have practical implications on experimental protocols employing LiCre. First, given its high affinity for loxP (Fig. 5b), over-expressing LiCre at high levels will probably not increase its efficiency." (line 502) refers only to the affinity but seems to ignore the low DNA-occupancy of LiCre observed in Fig. 2d. Adapting the discussion section accordingly would improve the manuscript.

      Significance

      General assessment and advance:

      The present study provides a large set of experiments and analyses characterizing the optogenetic LiCre recombinase. In general, the study is well conceived and executed. Although some of my concerns listed above affect key aspects of the study, they should be straightforward to address. The manuscript is a follow-up study providing a more detailed characterization of an optogenetic tool previously developed by the same authors. Its novelty is therefore somewhat limited. While the study provides a rich body of additional data, many of the findings merely confirmed aspects that were to be expected based on the two proteins LiCre is built of (temperature-dependent activity of Cre, optogenetics in yeast w/o the need of co-factor supplementation, weaker DNA-affinity of the Cre fusion protein as compared to wildtype Cre). New insights are provided by the facts that (i) light only controls recombination but not DNA binding and (ii) light activation of only some protomers within the LiCre heptamer is likely to be sufficient to activate recombination. The former aspect is, however, not entirely evident from the results as described above.

      Audience:

      The study will be of interest for researchers focusing on inducible DNA recombination and especially relevant to those who plan to work with LiCre and can now rely on a more detailed and extended characterization compared to the original LiCre publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary: This manuscript presents a detailed kinetic and mechanistic characterization of the optogenetic recombinase LiCre, which enables site-specific DNA recombination upon blue-light stimulation. The authors combine in vitro surface plasmon resonance assays, yeast-based recombination assays, and mathematical modeling to dissect the DNA-binding properties, activation dynamics, and recombination efficiency of LiCre. They demonstrate that LiCre binds DNA even in the absence of light, albeit with reduced cooperativity compared to Cre recombinase. Through kinetic modeling, they propose that activation of only two LiCre units may suffice for recombination. The study also evaluates the impact of point mutations in the LOV domain on LiCre's photocycle. The experimental methods are described in detail. Statistical analyses are appropriate and clearly reported.

      Major Comments:

      1. In Figure 1, control experiments with no loxP sequences (i.e. original strain) should be performed to demonstrate specific binding of Cre/LiCre to loxP sequence.
      2. In Figure 2, the SPR experiments are robust and informative. However, the lack of a DNA-binding measurement for light-activated LiCre is a notable gap; such a measurement would help clarify whether the cooperativity of LiCre can be modulated by light. If this is difficult under the experimental conditions, there is a lit-mimetic mutant of LOV2 (https://www.nature.com/articles/nmeth.3926).

      Significance

      General Assessment: This is a rigorous study that combines experimental and computational approaches to advance our understanding of LiCre-based optogenetic genome engineering. The strongest aspects are the integration of SPR data with kinetic modeling and the practical insights into LiCre's performance under various conditions. However, one limitation is the lack of direct validation of some model predictions.

      Advance: To the best of my knowledge, this is the first study to quantitatively model the activation dynamics of LiCre. The work extends previous findings on LiCre and provides new mechanistic and practical insights.

      Audience: This study will be of interest to specialized audiences, particularly those developing or applying the LiCre system.

      Reviewers' Field of Expertise: Protein engineering, Genome editing, Optogenetics, Cell Biology.

      Limitations of Expertise: I do not have deep expertise in mathematical modeling.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Dufour et al describe characterization of the light-activated recombinase LiCre. This work combines the yeast reporter assay, surface plasmon resonance (SPR) and kinetic modeling to provide a comprehensive study of how LiCre functions both in vivo in yeast and in vitro. The authors show that LiCre binds to loxP sites in the dark with high affinity, but reduced cooperativity compared to wild-type Cre, and that recombination efficiency is affected by temperature and illumination regime. Importantly, the authors establish a kinetic model that not only explains these observations but also predicts the altered behavior of a mutant (T418S), which was experimentally validated. It would be valuable to highlight what other predictions the model could make, even if for future work. Overall, this work combines quantitative experiments and modeling to provide new insights into the biochemical and kinetic properties of LiCre.

      Specific comments:

      Line 110-115: Although described in the Methods section, a brief statement of dark and light treatment conditions would help readers better follow the experiments. Likewise, listing the three unrelated positions would improve the clarity.

      Line 185: Is there a typo?

      Line 216: Have the authors considered performing surface plasmon resonance (SPR) to confirm the binding affinity of LiCre-V5 for DNA?

      Line 233-234: To determine whether the observed difference in recombination efficiency is due to the genomic context of the reporter loci or due to the measurement accuracy of GFP and RFP signals, have the authors considered swapping the positions of GFP and RFP?

      Line 236: The sentence "Importantly, we never observed recombination in the entire cell population" is ambiguous. I believe it means recombination was never observed in 100% of the cells. Please rephrase it.

      Line 245-249: The hypothesis of plasmid loss based on plating samples on selective and non-selective media without illumination assumes that loss of growth on selective media is only due to plasmid loss, without considering other factors like burden or toxicity. Moreover, the broad range of 10-30% makes it difficult to justify that the ~15% recombination-negative fraction falls within expected variation. The conclusion that LiCre-mediated recombination efficiency is close to 100% after prolonged photoactivation (Line 249, 301-303) is not fully convincing unless more evidence is provided.

      Line 275-276: The authors suspect that the decrease in recombination efficiency at very high light intensity is possibly attributed to phototoxicity. Could photobleaching also contribute to this effect? A viability assay would help to validate the phototoxicity explanation.

      Line 345-346: While the model with x=2 provides a slightly better fit compared to the others, the possibility of x=4 cannot be excluded. The inference that "photo-activation of at least two LiCre protomers enables recombination" is not sufficiently proven.

      Figure 1e: Please clarify whether the Western blots shown represent biological replicates.

      Figure 4: Please include the error bars. Panel a - The authors integrated GFP and mCherry reporters at two different loci to avoid positional bias. Why then is only mCherry used as the ON readout in most experiments, rather than analyzing both reporters in parallel? Please clarify. For panel 4h and line 272, the statement that maximal activation was reached at 12 mW/cm² should be rephrased more cautiously, as no intermediate intensities between 12 and 35.6 mW/cm² were tested.

      Significance

      This study provides a quantitative experimental and predictive analysis of the light-activated recombinase LiCre, offering new insights into its binding, activation and recombination properties. The predictive validation of the mutant is a strength of this work. While the modeling part is an innovative aspect, more clarification is needed, especially regarding the conclusion that photo-activation of at least two LiCre protomers enables recombination. More mechanistic investigations are needed to support the conclusions. The work will be of interest to researchers in optogenetics, genome engineering, and DNA-protein interactions. My expertise is in yeast genome engineering and applications of Cre-mediated recombination system. Modeling is outside my primary area of expertise.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      Thank you for providing an assessment of our manuscript. We suggest here a revision plan to address the points raised by the reviewers regarding code documentation, benchmarking, and biological applications.

      As part of the revisions implemented we have:

      • Clarified the management of dependencies of our package

      • Fixed the data download run times of test data

      • Clarified the parameters of the normalization and optimization functions

      We plan to:

      • Extend our manuscript to include a section on cross-condition analysis that builds on our tutorials, where we will illustrate how ParTIpy can quantify shifts in the distribution of fibroblasts across the functional space defined by archetypal analysis between healthy and failing hearts.

      • Extend our benchmarks of scalability of coresets, by reporting wall-clock time and peak memory usage across distinct data sizes.

      • Extend our benchmarks of stability of coresets, by reporting the similarity of the estimated archetypes based on the original versus the sampled data.

      • Include the original enrichment analysis of ParTI to provide users with distinct options to work with the archetypes, and provide a larger discussion on the distinct strategies.

      We believe these revisions will strengthen our software manuscript and will help us to provide a robust and practical tool to analyze functional trade-offs from biological data.

      2. Description of the planned revisions

      Reviewer #1

      Summary

      The paper "ParTIpy: A Scalable Framework for Archetypal Analysis and Pareto Task Inference" presents ParTIpy, an open-source Python package that modernizes and scales the Pareto Task Inference (ParTI) framework for analyzing biological trade-offs and functional specialization. Unlike the earlier MATLAB implementation, which required a commercial license and was limited in scalability, ParTIpy leverages Python's open ecosystem and integration with tools such as scverse to make archetypal analysis more accessible, flexible, and compatible with modern biological data workflows. Through advanced optimization and coreset algorithms, it efficiently handles large scale single cell and spatial transcriptomics datasets. ParTIpy identifies "archetypes", or optimal phenotypic extremes, to reveal how cells balance competing functional programs. The paper demonstrates its application in modeling hepatocyte specialization across the liver lobule, highlighting spatial patterns of metabolic division of labor.

      Overall, ParTIpy represents a modern, accessible, and scalable Python-based solution for exploring biological trade-offs and resource allocation in high-dimensional data. The paper is clearly written and addresses an important methodological gap. However, the enrichment analysis differs from the original ParTI framework and should be discussed more explicitly, and the documentation and tutorials, while helpful, could be refined to improve usability and reproducibility.

      Major Comments

      1. The archetype enrichment analysis used in this paper differs from the original enrichment analysis implemented in ParTI. This is acceptable, but: a) The authors should explicitly state and discuss the differences between the two approaches. b) The enrichment analysis should be made more systematic. For each tested feature (e.g. gene or pathway), the analysis should report a p-value for the hypothesis that the feature is enriched near an archetype - that is, its expression (or value) is high close to the archetype and decreases with distance. Appropriate multiple-hypothesis correction should also be applied.

      We thank the reviewer for this valuable comment and agree that the differences between our enrichment analysis and the original ParTI implementation should be stated more explicitly. We will incorporate the original enrichment algorithm into ParTIpy, enabling users to select their preferred method. In the revised manuscript, we will note that two enrichment algorithms are available and describe both in greater detail in the supplementary methods section. We also note that the current enrichment analysis already reports p-values adjusted for multiple hypothesis testing.
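The kind of test requested in point (b) can be illustrated with a minimal sketch. It is not the enrichment algorithm of ParTI or ParTIpy. It assumes a one-sided Spearman rank correlation between each feature and the Euclidean distance to the archetype as the test statistic, followed by Benjamini-Hochberg correction. The function names `benjamini_hochberg` and `enrichment_near_archetype` are hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    pvals = np.asarray(pvals, dtype=float)
    n = pvals.size
    order = np.argsort(pvals)
    # Scale the i-th smallest p-value by n / i.
    scaled = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def enrichment_near_archetype(X, features, archetype):
    """Adjusted p-values for 'feature decreases with distance to archetype'.

    X:         (n_cells, n_dims) coordinates used for archetypal analysis
    features:  (n_cells, n_features) values to test (e.g. gene expression)
    archetype: (n_dims,) position of one archetype
    """
    dist = np.linalg.norm(X - archetype, axis=1)
    pvals = []
    for j in range(features.shape[1]):
        # alternative="less" tests for a negative rank correlation
        # (requires SciPy >= 1.7).
        res = spearmanr(dist, features[:, j], alternative="less")
        pvals.append(res.pvalue)
    return benjamini_hochberg(pvals)
```

A rank-based statistic is used here only because it makes no assumption about the shape of the decrease; other test statistics would fit the same scaffold.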

      Reviewer #2

      Summary

      This paper introduces the software ParTIpy, a scalable Python implementation of Pareto Task Inference (ParTI), designed to infer functional trade-offs in biological systems through archetypal analysis. The framework modernizes the previous toolbox with efficient optimization, memory-saving coreset construction, and integration with the scverse ecosystem for single-cell transcriptomic data.

      Using hepatocytes scRNA-seq data as a test case, the authors identify archetypes corresponding to distinct gene expression patterns. These archetypes align with known liver domains in spatial transcriptomics data, validating both the method's interpretability and its biological relevance.

      Major comments

      (1) Conclusions

      The core computational and biological claims are well supported. ParTIpy clearly scales better than earlier implementations and reproduces known biological structure. However, claims about "scalability to large datasets" should be further qualified (see below).

      We will implement further performance benchmarks as discussed below.

      (2) Claims

      Archetypal analysis based on the current matrix computation formulation is non-parametric, and new data require recomputation of archetypes. Therefore, the method cannot generalize to unseen data in the way deep learning approaches do, which could be further acknowledged and clarified.

      We thank the reviewer for this insightful comment. We agree that deep learning frameworks are typically amortized, allowing them to generalize to unseen data without retraining, and we will clarify this distinction in the discussion of the revised manuscript. However, we note that mapping new cells into an existing archetypal space is computationally inexpensive, as it only requires solving a single convex optimization problem.
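The convex problem mentioned above can be sketched as follows: with the archetypes held fixed, a new cell is assigned the simplex weights that minimise its reconstruction error. This is a minimal illustration, not ParTIpy's solver. It assumes a least-squares objective and uses SciPy's general-purpose SLSQP method; the function name `project_onto_archetypes` is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def project_onto_archetypes(x, archetypes):
    """Find weights w >= 0 with sum(w) == 1 minimising ||w @ archetypes - x||^2.

    x:          (n_dims,) coordinates of the new cell
    archetypes: (n_archetypes, n_dims) fixed archetype positions
    """
    k = archetypes.shape[0]

    def objective(w):
        residual = w @ archetypes - x
        return 0.5 * residual @ residual

    def gradient(w):
        return archetypes @ (w @ archetypes - x)

    result = minimize(
        objective,
        x0=np.full(k, 1.0 / k),  # start from the centre of the simplex
        jac=gradient,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return result.x
```

Each new cell is solved independently, so the cost grows linearly with the number of cells and the archetypes themselves never need to be refitted.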

      (3) Additional suggested analyses or experiments

      1) Absolute performance benchmarks: it is suggested to report wall-clock time and memory for a few dataset sizes (10k, 100k, 1M cells).

      We thank the reviewer for this helpful suggestion. We will extend the coreset benchmark to quantify how coreset size affects both archetype positions and biological interpretation. Specifically, we will match archetypes across coreset sizes by solving the linear sum assignment problem, as we currently do when comparing bootstrap samples. We will then compare the distances between archetypes inferred from the full dataset and those obtained from different coreset sizes. In addition to measuring displacement, we will assess biological stability by comparing the gene expression vectors of corresponding archetypes as well as their enriched pathways (using metrics such as cosine similarity and Jaccard index).
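The matching and comparison steps described above can be sketched as follows. This is a minimal illustration, not the ParTIpy implementation. It assumes Euclidean distance as the assignment cost, and the helper names `match_archetypes`, `cosine_similarity` and `jaccard_index` are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def match_archetypes(reference, candidate):
    """Match each reference archetype to one candidate archetype.

    Returns the candidate index assigned to each reference row and the
    Euclidean distance of each matched pair.
    """
    cost = cdist(reference, candidate)  # cost[i, j] = ||reference_i - candidate_j||
    row_idx, col_idx = linear_sum_assignment(cost)
    return col_idx, cost[row_idx, col_idx]


def cosine_similarity(u, v):
    """Cosine similarity between two expression vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def jaccard_index(set_a, set_b):
    """Jaccard index between two collections of enriched pathway names."""
    set_a, set_b = set(set_a), set(set_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0
```

Solving the assignment problem first is what makes the later comparisons meaningful, because archetypes have no intrinsic order and two runs may return them in different permutations.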

      **Referee cross-commenting**

      I agree with the other reviewer's suggestion to check consistency and reproducibility with previous implementation, and enhance the tutorial of the software for users from a biological background. Combined with my comments to further improve the biological application showcase, the revised manuscript could be an impactful contribution to the field, if these comments could be properly addressed.

      (1) Advance

      This paper is primarily a technical contribution. It modernizes the Pareto Task Inference framework into a scalable and user-friendly Python implementation, which is valuable. However, to further improve its significance, especially for the broader biological audience, more detailed analysis could be performed (see below).

      (2) Biological scope and applications [optional]

      The current biological validation in hepatocytes is technically fine but limited in breadth and impact. It demonstrates that ParTIpy works but falls short of showing what new insights it can reveal. Several promising applications could be further explored:

      1) Cross-condition comparisons: could ParTIpy quantify how the Pareto front shifts between conditions (e.g., normal vs. tumor, treated vs. control)?

      We thank the reviewer for this valuable suggestion. We have shown ParTIpy's applicability to cross-condition settings in our online tutorials (https://partipy.readthedocs.io/en/latest/notebooks/cross_condition_lupus.html). However, we agree that a more explicit mention in the manuscript is needed. Thus, we will include a cross-condition analysis as a second application in the revised manuscript, focusing on fibroblasts from heart failure patients from Amrute et al. (2023). This will illustrate how ParTIpy can quantify shifts in the distribution of cells across the functional space defined by archetypal analysis.

      Because the manuscript does not explore these scenarios, the biological impact remains narrow, and the framework's broader interpretive power is somewhat underrepresented.

      We hope that the additional application included in the revised manuscript helps better illustrate the framework's strength. We would also like to note that the online tutorials provide a comprehensive overview of ParTIpy's functionality, as we expect these will serve as a primary entry point for many researchers interested in archetypal analysis and Pareto Task Inference.

      (3) Audience and impact

      The paper will interest computational biologists, systems biologists, and bioinformaticians focused on single-cell analysis, and its impact will grow substantially if the authors demonstrate more biological applications.

      (4) Reviewer expertise

      Computational biology, single-cell transcriptomics, machine learning, computational math

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      2. The package documentation on GitHub and ReadTheDocs is a major strength, but the tutorials can be improved for clarity and accessibility:

      We thank the reviewer for this positive feedback. Indeed, providing comprehensive documentation to facilitate ease of adoption was a major motivation behind this project. In response to the reviewer's suggestions, we have revised the tutorials to further improve their clarity, structure, and accessibility, as detailed below.

      a) The documentation should list external dependencies that need to be installed separately, e.g. pybiomart.

      We thank the reviewer for pointing this out. We had added all dependencies under the optional-dependencies.extra header, which allows users to run pip install partipy[extra] and thereby run all tutorial notebooks. However, we had not explained this in the tutorials or the Readme page, which we have now corrected. The Readme now reads:

      Install the latest stable full release from PyPI with the extra dependencies (e.g., pybiomart, squidpy, liana) that are required to run every tutorial:

      ```
      pip install partipy[extra]
      ```

      Additionally, we include clarifications in every tutorial notebook that uses additional dependencies: "To run this notebook, install ParTIpy with the tutorial extras: pip install partipy[extra]".

      b) The dataset used in the Quickstart demo appears to be inaccessible or extremely slow to download (the function load_hepatocyte_data_2() did not complete even after 30 minutes, at least in my experience). The authors should verify data availability on Zenodo and consider providing a smaller or cached version to make the demo more reliable and reproducible.

      We thank the reviewer for this helpful comment. We agree that the previous implementation of load_hepatocyte_data_2() was not reliable due to slow download speeds from Zenodo. To address this, we now host the required AnnData object on figshare (https://figshare.com/articles/dataset/scRNA-seq_hepatocyte_data_from_Ben-Moshe_et_al_2022_/30588713?file=59459459), ensuring faster and more stable access for the Quickstart tutorial via scanpy.read:

      ```
      import scanpy as sc

      # Read the local copy if present; otherwise download it from figshare first.
      adata = sc.read("data/hepatocyte_processed.h5ad", backup_url="https://figshare.com/ndownloader/files/59459459")
      adata
      ```

      c) The tutorial order could be more intuitive - for instance, "archetype crosstalk network" appears before "archetypal analysis". Consider starting with the simulated dataset and presenting the full pipeline before moving to more complex real-world examples.

      We thank the reviewer for this helpful suggestion and agree that the previous ordering was not intuitive. We have reordered the tutorials such that the notebook introducing archetypal analysis now appears first, followed by the Quickstart tutorial and the subsequent applied examples.

      Minor comments

      1. In the Python function, the parameter "optim" could use more descriptive option names - for example, renaming "projected_gradients" to "PCHA" would make it clearer and more consistent with terminology used in the paper.

      We thank the reviewer for this helpful suggestion. We agree that the previous naming could be misleading. While PCHA does not precisely describe the underlying algorithm, it is the term most users are familiar with from the literature. We have therefore updated the function to accept both "PCHA" and "projected_gradients", which now map to the same underlying optimization routine.
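A minimal sketch of how two option names can map to one routine is shown below. The names `_OPTIM_ALIASES` and `resolve_optim` are hypothetical and are not taken from the ParTIpy code base; only the two option strings come from the text above.

```python
# Hypothetical illustration: these names are NOT ParTIpy's actual API.
# Both user-facing option strings resolve to one canonical routine name.
_OPTIM_ALIASES = {
    "projected_gradients": "projected_gradients",
    "PCHA": "projected_gradients",
}


def resolve_optim(optim: str) -> str:
    """Return the canonical optimizer name for a user-supplied option."""
    try:
        return _OPTIM_ALIASES[optim]
    except KeyError:
        valid = ", ".join(sorted(_OPTIM_ALIASES))
        raise ValueError(f"Unknown optim {optim!r}; expected one of: {valid}") from None
```

With such a lookup, existing scripts that pass "projected_gradients" keep working, while new users can pass the more familiar "PCHA".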

      2. In the Quickstart preprocessing, the authors use the following code:

      sc.pp.normalize_total(adata)

      sc.pp.log1p(adata)

      However, they do not specify the target sum in the normalize_total function. The authors should ensure that the data values before the logarithmic transformation span several orders of magnitude (e.g., 0-10,000); if normalization is performed to a sum of 1, the log transformation becomes ineffective.

      We thank the reviewer for this helpful comment. By default, sc.pp.normalize_total scales the counts in each cell to the median total counts across all cells, which preserves the typical range of expression values prior to logarithmic transformation. We therefore consider this default behavior appropriate for the Quickstart example. Nonetheless, we will clarify this explicitly in the tutorial to avoid confusion.
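The difference between the two normalization targets can be illustrated with a small NumPy sketch. It is a simplified re-implementation for illustration only, not scanpy's code: the helper `normalize_to_median_total` and the toy count matrix are assumptions, and the median is taken over cells with non-zero totals.

```python
import numpy as np


def normalize_to_median_total(counts):
    """Scale each row (cell) so that its total equals the median total across cells.

    Illustrative stand-in for median-based total-count normalization;
    cells with zero total counts are left as zeros.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1)
    target = np.median(totals[totals > 0])
    scale = np.divide(target, totals, out=np.zeros_like(totals), where=totals > 0)
    return counts * scale[:, None]


# Toy data: three cells with the same composition but different sequencing depth.
counts = np.array([[10, 30, 60], [100, 300, 600], [1, 3, 6]])

norm_median = normalize_to_median_total(counts)           # each row sums to the median total
norm_unit = counts / counts.sum(axis=1, keepdims=True)    # each row sums to 1

log_median = np.log1p(norm_median)
log_unit = np.log1p(norm_unit)
```

In this toy example the values normalized to a sum of 1 all lie below 1, where log1p is close to linear and compresses very little, whereas the median-scaled values stay on the original count scale and are compressed by the logarithm.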

      **Referee cross-commenting**

      I agree with Reviewer #2's observation that the paper's contribution is primarily technical; however, I consider this technical advance to be an important and timely one that will enable many biologists to apply archetypal analysis more effectively in their own work.

      We thank the reviewer for this positive and encouraging assessment.

      Reviewer #1 (Significance (Required)):

      This study presents ParTIpy, a Python-based implementation of Pareto Task Inference (ParTI) that makes archetypal analysis more accessible, scalable, and compatible with modern single-cell and spatial transcriptomics workflows. Its main strength lies in translating a conceptually powerful but technically limited MATLAB framework into an open-source, efficient Python package, enabling wider use in computational biology. The package is well-documented, which further enhances its accessibility and adoption potential, though documentation could be improved to enhance reproducibility and ease of use. It will be of interest to computational systems biologists, particularly those working with omics data, and those interested in studying functional trade-offs and resource allocation.

      We appreciate the reviewer's positive evaluation and are encouraged by their recognition of ParTIpy's relevance and potential impact in computational biology.

      4. Description of analyses that authors prefer not to carry out

      Reviewer #2

      The current biological validation in hepatocytes is technically fine but limited in breadth and impact. It demonstrates that ParTIpy works but falls short of showing what new insights it can reveal. Several promising applications could be further explored:

      2) Transient or plastic states: Cells with mixed archetype weights or high mixture entropy can be interpreted as transient, functionally flexible states. ParTIpy can quantify such transience geometrically, even in static data, providing a competitive counterpart to models like CellRank or CellSimplex (https://doi.org/10.1093/bioinformatics/btaf119).

      We thank the reviewer for this interesting suggestion. While we agree that quantifying transient or plastic states based on archetype mixtures is an intriguing idea, validating whether cells with mixed archetype weights ("generalists") truly represent transient states would require additional data modalities such as temporal or lineage-tracing measurements. Although we find this direction highly interesting, given that the manuscript is intended as a software paper, we prefer to focus on more directly supported applications of cross-condition data, where labeled data is available.

      However, we will expand our discussion to relate ParTIpy with CellSimplex since we believe this is an interesting angle that future users could explore.
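For readers unfamiliar with the "mixture entropy" mentioned in the reviewer's comment, one common way to compute it is the Shannon entropy of each cell's archetype weights. The sketch below uses a hypothetical helper, `mixture_entropy`, and toy weights; it is not part of ParTIpy and makes no claim about whether high-entropy cells are biologically transient.

```python
import numpy as np


def mixture_entropy(weights):
    """Normalized Shannon entropy of per-cell archetype weights.

    `weights` has shape (n_cells, n_archetypes); rows are non-negative and are
    renormalized to sum to 1. Returns values in [0, 1]: 0 for a cell sitting on a
    single archetype, 1 for a cell mixing all archetypes equally.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum(axis=1, keepdims=True)
    # Treat 0 * log(0) as 0, the usual convention for entropy.
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(w > 0, w * np.log(w), 0.0)
    return -plogp.sum(axis=1) / np.log(w.shape[1])


# Toy weights: a specialist, a full generalist, and a two-archetype mixture.
weights = np.array([
    [1.0, 0.0, 0.0],
    [1 / 3, 1 / 3, 1 / 3],
    [0.5, 0.5, 0.0],
])
entropy = mixture_entropy(weights)
```

Dividing by the logarithm of the number of archetypes keeps the score comparable across analyses that use different numbers of archetypes.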

      5. References

      1. Amrute, J. M. et al. Defining cardiac functional recovery in end-stage heart failure at single-cell resolution. Nat. Cardiovasc. Res. 2, 399-416 (2023).
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This paper introduces the software ParTIpy, a scalable Python implementation of Pareto Task Inference (ParTI), designed to infer functional trade-offs in biological systems through archetypal analysis. The framework modernizes the previous toolbox with efficient optimization, memory-saving coreset construction, and integration with the scverse ecosystem for single-cell transcriptomic data.

      Using hepatocytes scRNA-seq data as a test case, the authors identify archetypes corresponding to distinct gene expression patterns. These archetypes align with known liver domains in spatial transcriptomics data, validating both the method's interpretability and its biological relevance.

      Major comments

      (1) Conclusions

      The core computational and biological claims are well supported. ParTIpy clearly scales better than earlier implementations and reproduces known biological structure. However, claims about "scalability to large datasets" should be further qualified (see below).

      (2) Claims

      Archetypal analysis based on the current matrix computation formulation is non-parametric, and new data require recomputation of the archetypes. Therefore, the method cannot generalize to unseen data in the way deep learning approaches can, which should be acknowledged and clarified.

      (3) Additional suggested analyses or experiments

      1. Absolute performance benchmarks: we suggest reporting wall-clock time and memory for a few dataset sizes (10k, 100k, 1M cells).
      2. Coreset sensitivity analysis: Could authors show how coreset size affects archetype positions and biological interpretation?
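One way the benchmark suggested in point 1 could be set up is sketched below, using only the standard library and NumPy. The function `placeholder_fit` is a stand-in workload, not ParTIpy's fitting routine, and the dataset sizes are kept small so that the sketch runs quickly; `tracemalloc` reports peak memory allocated through Python, which is only an approximation of total process memory.

```python
import time
import tracemalloc

import numpy as np


def benchmark(fit, sizes, n_features=10, seed=0):
    """Record wall-clock time and peak traced memory of `fit` on random data."""
    rng = np.random.default_rng(seed)
    results = []
    for n_cells in sizes:
        X = rng.random((n_cells, n_features))
        tracemalloc.start()
        start = time.perf_counter()
        fit(X)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append(
            {"n_cells": n_cells, "seconds": elapsed, "peak_mib": peak / 2**20}
        )
    return results


def placeholder_fit(X):
    """Stand-in workload (a Gram matrix); replace with the method under test."""
    return X.T @ X


results = benchmark(placeholder_fit, sizes=[1_000, 10_000])
for row in results:
    print(row)
```

Repeating each measurement several times and reporting the median would make the timings more robust to background load.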

      Referee cross-commenting

      I agree with the other reviewer's suggestion to check consistency and reproducibility with previous implementation, and enhance the tutorial of the software for users from a biological background. Combined with my comments to further improve the biological application showcase, the revised manuscript could be an impactful contribution to the field, if these comments could be properly addressed.

      Significance

      (1) Advance

      This paper is primarily a technical contribution. It modernizes the Pareto Task Inference framework into a scalable and user-friendly Python implementation, which is valuable. However, to further improve its significance, especially for the broader biological audience, more detailed analysis could be performed (see below).

      (2) Biological scope and applications [optional]

      The current biological validation in hepatocytes is technically fine but limited in breadth and impact. It demonstrates that ParTIpy works but falls short of showing what new insights it can reveal. Several promising applications could be further explored:

      1) Cross-condition comparisons: could ParTIpy quantify how the Pareto front shifts between conditions (e.g., normal vs. tumor, treated vs. control)?

      2) Transient or plastic states: Cells with mixed archetype weights or high mixture entropy can be interpreted as transient, functionally flexible states. ParTIpy can quantify such transience geometrically, even in static data, providing a competitive counterpart to models like CellRank or CellSimplex (https://doi.org/10.1093/bioinformatics/btaf119).

      Because the manuscript does not explore these scenarios, the biological impact remains narrow, and the framework's broader interpretive power is somewhat underrepresented.

      (3) Audience and impact

      The paper will interest computational biologists, systems biologists, and bioinformaticians focused on single-cell analysis, and its impact will grow substantially if the authors demonstrate more biological applications.

      (4) Reviewer expertise

      Computational biology, single-cell transcriptomics, machine learning, computational math

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The paper "ParTIpy: A Scalable Framework for Archetypal Analysis and Pareto Task Inference" presents ParTIpy, an open-source Python package that modernizes and scales the Pareto Task Inference (ParTI) framework for analyzing biological trade-offs and functional specialization. Unlike the earlier MATLAB implementation, which required a commercial license and was limited in scalability, ParTIpy leverages Python's open ecosystem and integration with tools such as scverse to make archetypal analysis more accessible, flexible, and compatible with modern biological data workflows. Through advanced optimization and coreset algorithms, it efficiently handles large scale single cell and spatial transcriptomics datasets. ParTIpy identifies "archetypes", or optimal phenotypic extremes, to reveal how cells balance competing functional programs. The paper demonstrates its application in modeling hepatocyte specialization across the liver lobule, highlighting spatial patterns of metabolic division of labor. Overall, ParTIpy represents a modern, accessible, and scalable Python-based solution for exploring biological trade-offs and resource allocation in high-dimensional data. The paper is clearly written and addresses an important methodological gap. However, the enrichment analysis differs from the original ParTI framework and should be discussed more explicitly, and the documentation and tutorials, while helpful, could be refined to improve usability and reproducibility.

      Major Comments

      1. The archetype enrichment analysis used in this paper differs from the original enrichment analysis implemented in ParTI. This is acceptable, but:

      a. The authors should explicitly state and discuss the differences between the two approaches.

      b. The enrichment analysis should be made more systematic. For each tested feature (e.g. gene or pathway), the analysis should report a p-value for the hypothesis that the feature is enriched near an archetype - that is, its expression (or value) is high close to the archetype and decreases with distance. Appropriate multiple-hypothesis correction should also be applied.

      2. The package documentation on GitHub and ReadTheDocs is a major strength, but the tutorials can be improved for clarity and accessibility:

      a. The documentation should list external dependencies that need to be installed separately, e.g. pybiomart.

      b. The dataset used in the Quickstart demo appears to be inaccessible or extremely slow to download (the function load_hepatocyte_data_2() did not complete even after 30 minutes, at least in my experience). The authors should verify data availability on Zenodo and consider providing a smaller or cached version to make the demo more reliable and reproducible.

      c. The tutorial order could be more intuitive - for instance, "archetype crosstalk network" appears before "archetypal analysis". Consider starting with the simulated dataset and presenting the full pipeline before moving to more complex real-world examples.

      Minor comments

      1. In the Python function, the parameter "optim" could use more descriptive option names - for example, renaming "projected_gradients" to "PCHA" would make it clearer and more consistent with terminology used in the paper.
      2. In the Quickstart preprocessing, the authors use the following code:

      sc.pp.normalize_total(adata)

      sc.pp.log1p(adata)

      However, they do not specify the target sum in the normalize_total function. The authors should ensure that the data values before the logarithmic transformation span several orders of magnitude (e.g., 0-10,000); if normalization is performed to a sum of 1, the log transformation becomes ineffective.

      Referee cross-commenting

      I agree with Reviewer #2's observation that the paper's contribution is primarily technical; however, I consider this technical advance to be an important and timely one that will enable many biologists to apply archetypal analysis more effectively in their own work.

      Significance

      This study presents ParTIpy, a Python-based implementation of Pareto Task Inference (ParTI) that makes archetypal analysis more accessible, scalable, and compatible with modern single-cell and spatial transcriptomics workflows. Its main strength lies in translating a conceptually powerful but technically limited MATLAB framework into an open-source, efficient Python package, enabling wider use in computational biology. The package is well-documented, which further enhances its accessibility and adoption potential, though documentation could be improved to enhance reproducibility and ease of use. It will be of interest to computational systems biologists, particularly those working with omics data, and those interested in studying functional trade-offs and resource allocation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Response to referee comments: RC-2025-03008


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this article, the authors used synthetic TALE DNA-binding proteins, tagged with YFP, which were designed to target five specific repeat elements in the Trypanosoma brucei genome, including centromere- and telomere-associated repeats and those of a transposon element. The aim was to detect and identify, using YFP pulldown, specific proteins that bind to these repetitive sequences in T. brucei chromatin. The approach was validated using a TALE protein designed to target the telomere repeat (TelR-TALE), which detected many of the proteins previously implicated in telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair, and the protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated with T. brucei megabase chromosomes, as well as with intermediate and mini-chromosomes, which implies that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosomes.

      Major comments: Are the key conclusions convincing?

      The authors report that they have successfully used TALE-based affinity selection of proteins associated with repetitive sequences in the T. brucei genome. They claim that this study provides new information regarding the relevance of the repetitive regions of the genome to chromosome integrity, telomere biology, chromosomal segregation and immune evasion strategies. These conclusions are based on high-quality research that basically merits publication, provided that the major concerns raised below are addressed before acceptance for publication.

      1. The authors used the TALE-YFP approach to examine the proteome associated with five different repetitive regions of the T. brucei genome and confirmed the binding of TALE-YFP with ChIP-seq analyses. Ultimately, they obtained a list of proteins bound to the synthetic proteins by affinity purification and LC-MS analysis, and concluded that these proteins bind to different repetitive regions of the genome. Two control proteins, TRF-YFP and KKT2-YFP, were used to confirm the interactions. However, no experiment confirms that the analysis gives insight into the role of any putative or new protein in telomere biology, VSG gene regulation or chromosomal segregation. The proteins that have already been reported by other studies are mentioned. Although the authors discovered many proteins at these repetitive regions, their roles are as yet unknown. It is recommended that the authors take one or more of the new putative proteins from the repetitive elements and (1) show whether or not they bind directly to the specific repetitive sequence (e.g., by EMSA); and (2) knock down one or a small sample of the newly discovered proteins, which may shed light on their function at the repetitive region, as a proof of concept.

      Response

      The main request from Referee 1 is for individual evaluation of protein-DNA interaction for a few candidates identified in our TALE-YFP affinity purifications, particularly using EMSA to identify binding to the DNA repeats used for the TALE selection. In our opinion, such an approach would not actually provide the validation anticipated by the reviewer. The power of TALE-YFP affinity selection is that it enriches for protein complexes that associate with the chromatin that coats the target DNA repetitive elements rather than only identifying individual proteins or components of a complex that directly bind to DNA assembled in chromatin.

      The referee suggests we express recombinant proteins and perform EMSA for selected candidates, but many of the identified proteins are unlikely to directly bind to DNA - they are more likely to associate with a combination of features present in DNA and/or chromatin (e.g. specific histone variants or histone post-translational modifications). Of course, a positive result would provide some validation but only IF the tested protein can bind DNA in isolation - thus, a negative result would be uninformative.

      In fact, our finding that KKT proteins are enriched using the 177R-TALE (minichromosome repeat sequence) identifies components of the trypanosome kinetochore known (KKT2) or predicted (KKT3) to directly bind DNA (Marciano et al., 2021; PMID: 34081090), and likewise the TelR-TALE identifies the TRF component that is known to directly associate with telomeric (TTAGGG)n repeats (Reis et al 2018; PMID: 29385523). This provides reassurance on the specificity of the selection, as does the lack of cross selectivity between different TALEs used (see later point 3 below). The enrichment of the respective DNA repeats quantitated in Figure 2B (originally Figure S1) also provides strong evidence for TALE selectivity.

      It is very likely that most of the components enriched on the repetitive elements targeted by our TALE-YFP proteins do not bind repetitive DNA directly. The TRF telomere binding protein is an exception - but it is the only obvious DNA binding protein amongst the many proteins identified as being enriched in our TelR-TALE-YFP and TRF-YFP affinity selections.

      The referee also suggests that follow up experiments using knockdown of the identified proteins found to be enriched on repetitive DNA elements would be informative. In our opinion, this manuscript presents the development of a new methodology previously not applied to trypanosomes, and referee 2 highlights the value of this methodological development which will be relevant for a large community of kinetoplastid researchers. In-depth follow-up analyses would be beyond the scope of this current study but of course will be pursued in future. To be meaningful such knockdown analyses would need to be comprehensive in terms of their phenotypic characterisation (e.g. quantitative effects on chromosome biology and cell cycle progression, rates and mechanism of recombination underlying antigenic variation, etc) - simple RNAi knockdowns would provide information on fitness but little more. This information is already publicly available from genome-wide RNAi screens (www.tritrypDB.org), with further information on protein location available from the genome-wide protein localisation resource (Tryptag.org). Hence basic information is available on all targets selected by the TALEs after RNAi knock down but in-depth follow-up functional analysis of several proteins would require specific targeted assays beyond the scope of this study.

      2. NonR-TALE-YFP does not have a binding site in the genome, but YFP protein should still be expressed by T. brucei clones with NLS. The authors have to explain why there is no signal detected in the nucleus, while a prominent signal was detected near kDNA (see Fig.2). Why is the expression of YFP in NonR-TALE almost not shown compared to other TALE clones?

      Response

      The NonR-TALE-YFP immunolocalisation signal indeed is apparently located close to the kDNA and away from the nucleus. We are not sure why this is so, but the construct is sequence validated and correct. However, we note that artefactual localisation near the kinetoplast of proteins fused to a globular eGFP tag, compared to a short linear epitope V5 tag, has been previously reported (Pyrih et al, 2023; PMID: 37669165).

      The expression of NonR-TALE-YFP is shown in Supplementary Fig. S2 in comparison to other TALE proteins. Although it is evident that NonR-TALE-YFP is expressed at lower levels than other TALEs (the different TALEs have different expression levels), it is likely that in each case the TALE proteins would be in relative excess.

      It is possible that the absence of a target sequence for the NonR-TALE-YFP in the nucleus affects its stability and cellular location. Understanding these differences is tangential to the aim of this study.

      However, importantly, NonR-TALE-YFP is not the only control used for specificity in our affinity purifications. Instead, the lack of cross-selection of the same proteins by different TALEs (e.g. TelR-TALE-YFP, 177R-TALE-YFP) and the lack of enrichment of any proteins of interest by the well expressed ingiR-TALE-YFP or 147R-TALE-YFP proteins each provide strong evidence for the specificity of the selection using TALEs, as does the enrichment of similar protein sets following affinity purification of the TelR-TALE-YFP and TRF-YFP proteins, which both bind telomeric (TTAGGG)n repeats. Moreover, control affinity purifications to assess background were performed using cells that completely lack an expressed YFP protein, which further supports specificity (Figure 6).

      We have added text to highlight these important points in the revised manuscript:

      Page 8:

      "However, the expression level of NonR-TALE-YFP was lower than other TALE-YFP proteins; this may relate to the lack of DNA binding sites for NonR-TALE-YFP in the nucleus."

      Page 8:

      "NonR-TALE-YFP displayed a diffuse nuclear and cytoplasmic signal; unexpectedly the cytoplasmic signal appeared to be in the vicinity of the kDNA of the kinetoplast (mitochondrion). We note that artefactual localisation of some proteins fused to an eGFP tag has previously been observed in T. brucei (Pyrih et al, 2023)."

      Page 10:

      "Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications whether compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP or the ingiR-TALE-YFP as controls (Fig. S7B, S8A; Tables S3, S4). Thus, the most enriched proteins are specific to TelR-TALE-YFP-associated chromatin rather than to the TALE-YFP synthetic protein module or other chromatin."

      3. As a proof of concept, the authors showed that the TALE method identified the same enriched interacting partners with TelR-TALE as with TRF-YFP. They also show the same interacting partners for other TALE proteins, whether compared with WT cells or with the NonR-TALE parasites. This may be because NonR-TALE parasites have almost no (or very little) YFP expression (see Fig. S3) compared to other TALE clones and the TRF-YFP clone. To address this concern, a control with proper YFP expression should be included.

      Response

      See response to point 2, but we reiterate that the ingiR-TALE-YFP and 147R-TALE-YFP proteins are well expressed (western blot, originally Fig. S3, now Fig. S2), yet few proteins are detected as being enriched, and few correspond to those enriched in TelR-TALE-YFP or TRF-YFP affinity purifications (see Fig. S9). Therefore, the ingiR-TALE-YFP and 147R-TALE-YFP proteins provide good additional negative controls for specificity as requested. To further reassure the referee we have also included additional volcano plots which compare TelR-TALE-YFP, 70R-TALE-YFP or 177R-TALE-YFP to the ingiR-TALE-YFP affinity selection (new Figure S8). As with the No-YFP or NonR-TALE-YFP controls, the use of ingiR-TALE-YFP as a negative control demonstrates that known telomere-associated proteins are enriched in TelR-TALE-YFP affinity purifications, RPA subunits are enriched with 70R-TALE-YFP and kinetochore KKT proteins are enriched with 177R-TALE-YFP. These analyses demonstrate specificity in the proteins enriched following affinity purification of our different TALE-YFPs and provide support to strengthen our original findings.

      We now refer to use of No-YFP, NonR-TALE-YFP, and ingiR-TALE-YFP as controls for comparison to TelR-TALE-YFP, 70R-TALE-YFP or 177R-TALE-YFP in several places:

      Page 10:

      "Moreover, a similar set of enriched proteins was identified in TelR-TALE-YFP affinity purifications whether compared with cells expressing no YFP fusion protein (No-YFP), the NonR-TALE-YFP or the ingiR-TALE-YFP as controls (Fig. S7B, S8A; Tables S3, S4)."

      Page 11:

      "Thus, the nuclear ingiR-TALE-YFP provides an additional chromatin-associated negative control for affinity purifications with the TelR-TALE-YFP, 70R-TALE-YFP and 177R-TALE-YFP proteins (Fig. S8)."

      "Proteins identified as being enriched with 70R-TALE-YFP (Figure 6D) were similar in comparisons with either the No-YFP, NonR-TALE-YFP or ingiR-TALE-YFP as negative controls."

      Top Page 12:

      "The same kinetochore proteins were enriched regardless of whether the 177R-TALE proteomics data was compared with No-YFP, NonR-TALE or ingiR-TALE-YFP controls."

      Discussion Page 13:

      "Regardless, the 147R-TALE and ingiR-TALE proteins were well expressed in T. brucei cells, but their affinity selection did not significantly enrich for any relevant proteins. Thus, 147R-TALE and ingiR-TALE provide reassurance for the overall specificity for proteins enriched in TelR-TALE, 70R-TALE and 177R-TALE affinity purifications."

      4. After the artificial expression of the five repetitive-sequence-binding TALE proteins, the question is whether the TALE proteins compete with the corresponding endogenous proteins. Is there any effect on parasite survival or health, compared to the control, after the expression of these five TALE-YFP proteins? It is recommended to add parasite growth curves for all the TALE-protein-expressing cultures.

      Response

      Growth curves for cells expressing TelR-TALE-YFP, 177R-TALE-YFP and ingiR-TALE-YFP are now included (New Fig S3A). No deficit in growth was evident while passaging 70R-TALE-YFP, 147R-TALE-YFP, NonR-TALE-YFP cell lines (indeed they grew slightly better than controls).

      The following text has been added page 8:

      "Cell lines expressing representative TALE-YFP proteins displayed no fitness deficit (Fig. S3A)."

      5. Since the experiments were performed using whole-cell extracts without prior nuclear fractionation, the authors should consider the possibility that some identified proteins may have originated from compartments other than the nucleus. Specifically, the detection of certain binding proteins might reflect sequence homology (or partial homology) between mitochondrial DNA (maxicircles and minicircles) and repetitive regions in the nuclear genome. Additionally, the lack of subcellular separation raises the concern that cytoplasmic proteins could have been co-purified due to whole cell lysis, making it challenging to discern whether the observed proteome truly represents the nuclear interactome.

      Response

      In our experimental design, we confirmed bioinformatically that the repeat sequences targeted were not represented elsewhere in the nuclear or mitochondrial genome (kDNA). The absence of subcellular fractionation could result in some cytoplasmic protein selection, but this is unlikely since each TALE targets a specific DNA sequence but is otherwise identical, such that cross-selection of the same contaminating protein set would be anticipated if there was significant non-specific binding. We have previously successfully affinity selected 15 chromatin modifiers and identified associated proteins without major issues concerning cytoplasmic protein contamination (Staneva et al 2021 and 2022; PMID: 34407985 and 36169304). Of course, the possibility that some proteins are contaminants will need to be borne in mind in any future follow-up analysis of proteins of interest that we identified as being enriched on specific types of repetitive element in T. brucei. Proteins that are also detected in negative control affinity selections such as No-YFP, NonR-TALE-YFP, ingiR-TALE or 147R-TALE must be disregarded.

      6. Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? As mentioned earlier, the authors claimed that this study has provided new information concerning telomere biology, chromosomal segregation mechanisms, and immune evasion strategies. But there are no experiments that provide a role for any unknown or known protein in these processes. Thus, it is suggested to select one or two proteins of choice from the list and validate their direct binding to repetitive region(s), and their role in that region of interaction.

      Response

      As highlighted in response to point 1 the suggested validation and follow up experiments may well not be informative and are beyond the scope of the methodological development presented in this manuscript. Referee 2 describes the study in its current form as "a significant conceptual and technical advancement" and "This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology."

      The Referee's phrase 'validate their direct binding to repetitive region(s)' here may also mean to test if any of the additional proteins that we identified as being enriched with a specific TALE protein actually display enrichment over the repeat regions when examined by an orthogonal method. A key unexpected finding was that kinetochore proteins including KKT2 are enriched in our affinity purifications of the 177R-TALE-YFP that targets 177bp repeats (Figure 6F). By conducting ChIP-seq for the kinetochore specific protein KKT2 using YFP-KKT2 we confirmed that KKT2 is indeed enriched on 177bp repeat DNA but not flanking DNA (Figure 7). Moreover, several known telomere-associated proteins are detected in our affinity selections of TelR-TALE-YFP (Figure 6B, FigS6; see also Reis et al, 2018 Nuc. Acids Res. PMID: 29385523; Weisert et al, 2024 Sci. Reports PMID: 39681615).

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. The answer for this question depends on what the authors want to present as the achievements of the present study. If the achievement of the paper was is the creation of a new tool for discovering new proteins, associated with the repeat regions, I recommend that they add a proof for direct interactions between a sample the newly discovered proteins and the relevant repeats, as a proof of concept discussed above, However, if the authors like to claim that the study achieved new functional insights for these interactions they will have to expand the study, as mentioned above, to support the proof of concept.

      Response

      See our response to point 1 and the point we labelled '6' above.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. I think that they are realistic. If the authors decided to check the capacity of a small sample of proteins (which was unknown before as a repetitive region binding proteins) to interacts directly with the repeated sequence, it will substantially add of the study (e.g., by EMSA; estimated time: 1 months). If the authors will decide to check the also the function of one of at least one such a newly detected proteins (e.g., by KD), I estimate the will take 3-6 months.

      Response

      As highlighted previously the proposed EMSA experiment may well be uninformative for protein complex components identified in our study or for isolated proteins that directly bind DNA in the context of a complex and chromatin. RNAi knockdown data and cell location data (as well as developmental expression and orthology data) are already available through TriTrypDB.org and tryptag.org.

      Are the data and the methods presented in such a way that they can be reproduced? Yes

      Are the experiments adequately replicated, and statistical analysis adequate? The authors did not mention replicates. There is no statistical analysis mentioned.

      Response

      The figure legends indicate that all volcano plots of TALE affinity selections were derived from three biological replicates. Cutoffs used for significance: P… For ChIP-seq, two biological replicates were analysed for each cell line expressing the specific YFP tagged protein of interest (TALE or KKT2). This is now stated in the relevant figure legends - apologies for this oversight. The resulting data are available for scrutiny at GEO: GSE295698.

      Minor comments: -Specific experimental issues that are easily addressable. The following suggestions can be incorporated: 1. Page 18, in the material method section author mentioned four drugs: Blasticidine, Phleomycin and G418, and hygromycin. It is recommended to mention the purpose of using these selective drugs for the parasite. If clonal selection has been done, then it should also be mentioned.

      Response

      We erroneously added information on several drugs used for selection in our laboratory. In fact all TALE-YFP constructs carry the Bleomycin resistance gene, which we select for using Phleomycin. Also, clones were derived by limiting dilution immediately after transfection.

      We have amended the text accordingly:

      Page 17/18:

      "Cell cultures were maintained below 3 x 10⁶ cells/ml. Phleomycin 2.5 mg/ml was used to select transformants containing the TALE construct BleoR gene."

      "Electroporated bloodstream cells were added to 30 ml HMI-9 medium and two 10-fold serial dilutions were performed in order to isolate clonal Phleomycin resistant populations from the transfection. 1 ml of transfected cells were plated per well on 24-well plates (1 plate per serial dilution) and incubated at 37°C and 5% CO2 for a minimum of 6 h before adding 1 ml media containing 2X concentration Phleomycin (5 mg/ml) per well."

      In the method section the authors mentioned that there is only one site for binding of NonR-TALE in the parasite genome. But in Fig. 1C, the authors showed zero binding site. So, there is one binding site for NonR-TALE-YFP in the genome or zero?

      Response

      We thank the reviewer for pointing out this discrepancy. We have checked the latest Tb427v12 genome assembly for predicted NonR-TALE binding sites and there are no exact matches. We have corrected the text accordingly.

      Page 7:

      "A control NonR-TALE protein was also designed which was predicted to have no target sequence in the T. brucei genome."

      Page 17:

      "A control NonR-TALE predicted to have no recognised target in the T. brucei genome was designed as follows: BLAST searches were used to identify exact matches in the TREU927 reference genome. Candidate sequences with one or more match were discarded."

      The authors used two different anti-GFP antibodies, one from Roche and the other from Thermo Fisher. Why were two different antibodies used for the same protein?

      Response

      We have found that only some anti-GFP antibodies are effective for affinity selection of associated proteins, whereas others are better suited for immunolocalisation. The respective suppliers' antibodies were optimised for each application.

      Page 6: in the introduction, the authors give the number of total VSG genes as 2,634. Is it known how many of them are pseudogenes?

      Response

      This value corresponds to the number reported by Cosentino et al. 2021 (PMID: 34541528) for subtelomeric VSGs, which is similar to the value reported by Muller et al 2018 (PMID: 30333624) (2486), both in the same strain of trypanosomes as used by us. Based on the earlier analysis by Cross et al (PMID: 24992042), 80% of the identified VSGs in their study (2584) are pseudogenes. This approximates to the estimation by Cosentino of 346/2634 (13%) being fully functional VSG genes at subtelomeres, or 17% when considering VSGs at all genomic locations (433/2872).
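The fractions quoted in this response can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the helper name `percent` is ours. As written, 346/2634 comes to about 13%, while 433/2872 comes to about 15%, so the 17% figure may rest on counts that are not restated here.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Counts as quoted in the response above.
subtelomeric = percent(346, 2634)    # functional VSGs at subtelomeres
all_locations = percent(433, 2872)   # functional VSGs at all genomic locations

print(f"subtelomeric:  {subtelomeric:.1f}%")
print(f"all locations: {all_locations:.1f}%")
```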

      I found several typos throughout the manuscript.

      Response

      Thank you for raising this. We have read through the manuscript several times and hopefully corrected all outstanding typos.

      Fig. 1C: Table: below TOTAL 2nd line: the number should be 1838 (rather than 1828)

      Corrected- thank you.

      • Are prior studies referenced appropriately? Yes

      • Are the text and figures clear and accurate? Yes

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Suggested above

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field: This study represents a significant conceptual and technical advancement by employing a synthetic TALE DNA-binding protein tagged with YFP to selectively identify proteins associated with five distinct repetitive regions of T. brucei chromatin. To the best of my knowledge, it is the first report to utilize TALE-YFP for affinity-based isolation of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology. Importantly, any essential or unique interacting partners identified could serve as potential targets for therapeutic intervention.

      • Place the work in the context of the existing literature (provide references, where appropriate). I agree with the information that has already described in the submitted manuscript, regarding its potential addition of the data resulted and the technology established to the study of VSGs expression, kinetochore mechanism and telomere biology.

      • State what audience might be interested in and influenced by the reported findings. These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as scientists investigating chromatin structure and the functional roles of repetitive genomic elements in higher eukaryotes.

      • (1) Define your field of expertise with a few keywords to help the authors contextualize your point of view. (2) Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. (1) Protein-DNA interactions/ chromatin/ DNA replication/ Trypanosomes (2) None

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: 1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. 2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

      Major Comments

      None. The experiments are well-controlled, claims well-supported, and methods clearly described. Conclusions are convincing.

      Response Thank you for these positive comments.

      Minor Comments

      1) Fig. 2 - I couldn't find an uncropped version showing multiple cells. If it exists, it should be linked in the legend or main text; Otherwise, this should be added to the supplement.

      Response

      The images presented represent reproducible analyses, and were independently verified by two of the authors. Although wider field of view images do not provide the resolution to be informative on cell location, as requested we have provided uncropped images in new Fig. S4 for all the cell lines shown in Figure 2A.

      In addition, we have included as supplementary images (Fig. S3B) additional images of TelR-TALE-YFP, 177R-TALE-YFP and ingiR-TALE-YFP localisation to provide additional support for their observed locations presented in Figure 1. The sets of cells and images presented in Figure 2A and in Fig. S3B were prepared and obtained by different authors, independently and reproducibly validating the location of the tagged protein.

      2) I think Suppl. Fig. 1 is very valuable, as it is a quantification and summary of the ChIP-seq data. I think the authors could consider making this a panel of a main figure. For the main figure, I think the plot could be trimmed down to only show the background and the relevant repeat for each TALE protein, leaving out the non-target repeats. (This relates to minor comment 6.) Also, I believe, it was not explained how background enrichment was calculated.

      Response

      We are grateful for the reviewer's positive view of original Fig. S1 and appreciate the suggestion. We have now moved this analysis to part B of main Figure 2 in the revised manuscript - now Figure 2B. We have also provided additional details in the Methods section on the approaches used to assess background enrichment.

      Page 19:

      Background enrichment calculation

      The genome was divided into 50 bp sliding windows, and each window was annotated based on overlapping genomic features, including CIR147, 177 bp repeats, 70 bp repeats, and telomeric (TTAGGG)n repeats. Windows that did not overlap with any of these annotated repeat elements were defined as "background" regions and used to establish the baseline ChIP-seq signal. Enrichment for each window was calculated using bamCompare, as log₂(IP/Input). To adjust for background signal amongst all samples, enrichment values for each sample were further normalized against the corresponding No-YFP ChIP-seq dataset.
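The calculation described above can be illustrated with a minimal sketch. This is not the authors' script: the function names, the non-overlapping 50 bp bins, the pseudocount, and the choice to normalise by subtracting the No-YFP log₂ ratio are all assumptions made for illustration, and the read counts are invented toy values.

```python
import math

WINDOW = 50  # bp, as stated in the Methods text

def make_windows(chrom_length: int, size: int = WINDOW):
    """Split a chromosome into consecutive windows (assumed non-overlapping here)."""
    return [(s, min(s + size, chrom_length)) for s in range(0, chrom_length, size)]

def label_window(window, repeats):
    """Name of the first repeat class overlapping the window, else 'background'."""
    start, end = window
    for name, intervals in repeats.items():
        if any(start < r_end and r_start < end for r_start, r_end in intervals):
            return name
    return "background"

def log2_ratio(ip: float, inp: float, pseudocount: float = 1.0) -> float:
    """log2(IP/Input) with a pseudocount to avoid division by zero (assumption)."""
    return math.log2((ip + pseudocount) / (inp + pseudocount))

def normalised_enrichment(ip, inp, ctrl_ip, ctrl_inp) -> float:
    """Sample log2(IP/Input) minus the No-YFP control log2(IP/Input) (assumed form)."""
    return log2_ratio(ip, inp) - log2_ratio(ctrl_ip, ctrl_inp)

# Toy example: a 200 bp contig with one hypothetical 177 bp repeat interval.
windows = make_windows(200)
repeats = {"177bp": [(60, 120)]}
labels = [label_window(w, repeats) for w in windows]
example = normalised_enrichment(ip=31, inp=3, ctrl_ip=3, ctrl_inp=3)
print(labels, example)
```

Windows labelled "background" would supply the baseline signal, and the same per-window value would be computed for every TALE or KKT2 sample against its matching No-YFP control.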

      Note: While revising the manuscript we also noticed that the script had a normalization error. We have therefore included a corrected version of these analyses as Figure 2B (old Fig. S1).

      3) Generally, I would plot enrichment on a log2 axis. This concerns several figures with ChIP-seq data.

      Response

      Our ChIP-seq enrichment is calculated by bamCompare. The resulting enrichment values are indeed log2 (IP/Input). We have made this clear in the updated figures/legends.

      4) Fig. 4C - The violin plots are very hard to interpret, as the plots are very narrow compared to the line thickness, making it hard to judge the actual volume. For example, in Centromere 5, YFP-KKT2 is less enriched than 147R-TALE over most of the centromere with some peaks of much higher enrichment (as visible in panel B), however, in panel C, it is very hard to see this same information. I'm sure there is some way to present this better, either using a different type of plot or by improving the spacing of the existing plot.

      Response

      We thank the reviewer for this suggestion; we have elected to provide a Split-Violin plot instead. This improves the presentation of the data for each centromere. The original violin plot in Figure 4C has been replaced with this Split-Violin plot (still Figure 4C).

      5) Fig. 6 - The panels are missing an x-axis label (although it is obvious from the plot what is displayed). Maybe the "WT NO-YFP vs" part that is repeated in all the plot titles could be removed from the title and only be part of the x-axis label?

      Response

      In fact, to save space the X axis was labelled inside each volcano plot but we neglected to indicate that values are a log2 scale indicating enrichment. This has been rectified - see Figure 6, and Fig. S7, S8 and S9.

      6) Fig. 7 - I would like to have a quantification for the examples shown here. In fact, such a quantification already exists in Suppl. Figure 1. I think the relevant plots of that quantification (YFP-KKT2 over 177bp-repeats and centromere-repeats) with some control could be included in Fig. 7 as panel C. This opportunity could be used to show enrichment separated out for intermediate-sized, mini-, and megabase-chromosomes. (relates to minor comment 2 & 8)

      Response

      The CIR147 sequence is found exclusively on megabase-sized chromosomes, while the 177 bp repeats are located on intermediate- and mini-sized chromosomes. Due to limitations in the current genome assembly, it is not possible to reliably classify all chromosomes into intermediate- or mini- sized categories based on their length. Therefore, original Supplementary Fig. S1 presented the YFP-KKT2 enrichment over CIR147 and 177 bp repeats as a representative comparison between megabase chromosomes and the remaining chromosomes (corrected version now presented as main Figure 2B). Additionally, to allow direct comparison of YFP-KKT2 enrichment on CIR147 and 177 bp repeats we have included a new plot in Figure 7C which shows the relative enrichment of YFP-KKT2 on these two repeat types.

      We have added the following text, page 12:

      "Taking into account the relative number of CIR147 and 177 bp repeats in the current T. brucei genome (Cosentino et al., 2021; Rabuffo et al., 2024), comparative analyses demonstrated that YFP-KKT2 is enriched on both CIR147 and 177 bp repeats (Figure 7C)."

      7) Suppl. Fig. 8 A - I believe there is a mistake here: KKT5 occurs twice in the plot, the one in the overlap region should be KKT1-4 instead, correct?

      Response

      Thanks for spotting this. It has been corrected.

      8) The way that the authors mapped ChIP-seq data is potentially problematic when analyzing the same repeat type in different regions of the genome. The authors assigned reads that had multiple equally good mapping positions to one of these mapping positions, randomly. This is perfectly fine when analysing repeats by their type, independent of their position on the genome, which is what the authors did for the main conclusions of the work. However, several figures show the same type of repeat at different positions in the genome. Here, the authors risk that enrichment in one region of the genome 'spills' over to all other regions with the same sequence. Particularly, where they show YFP-KKT2 enrichment over intermediate- and mini-chromosomes (Fig. 7) due to the spillover, one cannot be sure to have found KKT2 in both regions. Instead, the authors could analyze only uniquely mapping reads / read-pairs where at least one mate is uniquely mapping. I realize that with this strict filtering, data will be much more sparse. Hence, I would suggest keeping the original plots and adding one more quantification where the enrichment over the whole region (e.g., all 177bp repeats on intermediate-/mini-chromosomes) is plotted using the unique reads (this could even be supplementary). This also applies to Fig. 4 B & C.

      Response

      We thank the reviewer for their thoughtful comments. Repetitive sequences are indeed challenging to analyze accurately, particularly in the context of short read ChIP-seq data. In our study, we aimed to address YFP-KKT2 enrichment not only over CIR147 repeats but also on 177 bp repeats, using both ChIP-seq and proteomics using synthetic TALE proteins targeted to the different repeat types. We appreciate the referee's suggestion to consider uniquely mapped reads; however, in the updated genome assembly, the 177 bp repeats are frequently immediately followed by long stretches of 70 bp repeats which can span several kilobases. The size and repetitive nature of these regions exceed the resolution limits of ChIP-seq. It is therefore difficult to precisely quantify enrichment across all chromosomes.

      Additionally, the repeat sequences are highly similar, and relying solely on uniquely mapped reads would result in the exclusion of most reads originating from these regions, significantly underestimating the relative signals. To address this, we used Bowtie2 with settings that allow multi-mapping, assigning reads randomly among equivalent mapping positions, but ensuring each read is counted only once. This approach is designed to evenly distribute signal across all repetitive regions and preserve a meaningful average.
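The effect of this mapping strategy can be illustrated with a toy simulation. The sketch below does not call Bowtie2; it only mimics the stated behaviour (each multi-mapping read is assigned to one of its equally good positions at random and counted once). The function name, the locus labels and the read numbers are hypothetical.

```python
import random
from collections import Counter

def assign_multimappers(reads, seed=0):
    """Assign each read to one of its equally good mapping positions at random.

    `reads` is a list of tuples, each holding the candidate positions of one read.
    Every read contributes exactly one count.
    """
    rng = random.Random(seed)
    counts = Counter()
    for candidate_positions in reads:
        counts[rng.choice(candidate_positions)] += 1
    return counts

# 10,000 reads that match three identical repeat copies equally well.
copies = ("repeat_copy_A", "repeat_copy_B", "repeat_copy_C")
reads = [copies] * 10_000
counts = assign_multimappers(reads)

for locus in copies:
    print(locus, counts[locus])
```

Under these assumptions the total read count is preserved and each copy receives roughly one third of the signal, so the average over a repeat class is meaningful while the value at any individual copy reflects the class as a whole.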

      Single molecule methods such as DiMeLo (Altemose et al. 2022; PMID: 35396487) will need to be developed for T. brucei to allow more accurate and chromosome specific mapping of kinetochore or telomere protein occupancy at repeat-unique sequence boundaries on individual chromosomes.

      Reviewer #2 (Significance (Required)):

      This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with mini-chromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes to take note of and build on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How and at which stages during the process, 70bp repeats are involved in antigenic variation is an unresolved, and very actively studied, question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

      Response

      Thank you for supporting the novelty and broad interest of our manuscript.

      My field of expertise / Point of view:

      I'm a computer scientist by training and am now a postdoctoral bioinformatician in a molecular parasitology laboratory. The laboratory is working on antigenic variation in T. brucei. The focus of my work is on analyzing sequencing data (such as ChIP-seq data) and algorithmically improving bioinformatic tools.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: 1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. 2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

      Major Comments

      None. The experiments are well-controlled, claims well-supported, and methods clearly described. Conclusions are convincing.

      Minor Comments

      1. Fig. 2 - I couldn't find an uncropped version showing multiple cells. If it exists, it should be linked in the legend or main text; Otherwise, this should be added to the supplement.
      2. I think Suppl. Fig. 1 is very valuable, as it is a quantification and summary of the ChIP-seq data. I think the authors could consider making this a panel of a main figure. For the main figure, I think the plot could be trimmed down to only show the background and the relevant repeat for each TALE protein, leaving out the non-target repeats. (This relates to minor comment 6.) Also, I believe, it was not explained how background enrichment was calculated.
      3. Generally, I would plot enrichment on a log2 axis. This concerns several figures with ChIP-seq data.
      4. Fig. 4C - The violin plots are very hard to interpret, as the plots are very narrow compared to the line thickness, making it hard to judge the actual volume. For example, in Centromere 5, YFP-KKT2 is less enriched than 147R-TALE over most of the centromere with some peaks of much higher enrichment (as visible in panel B), however, in panel C, it is very hard to see this same information. I'm sure there is some way to present this better, either using a different type of plot or by improving the spacing of the existing plot.
      5. Fig. 6 - The panels are missing an x-axis label (although it is obvious from the plot what is displayed). Maybe the "WT NO-YFP vs" part that is repeated in all the plot titles could be removed from the title and only be part of the x-axis label?
      6. Fig. 7 - I would like to have a quantification for the examples shown here. In fact, such a quantification already exists in Suppl. Figure 1. I think the relevant plots of that quantification (YFP-KKT2 over 177bp-repeats and centromere-repeats) with some control could be included in Fig. 7 as panel C. This opportunity could be used to show enrichment separated out for intermediate-sized, mini-, and megabase-chromosomes. (relates to minor comment 2 & 8)
      7. Suppl. Fig. 8 A - I believe there is a mistake here: KKT5 occurs twice in the plot, the one in the overlap region should be KKT1-4 instead, correct?
      8. The way that the authors mapped ChIP-seq data is potentially problematic when analyzing the same repeat type in different regions of the genome. The authors assigned reads that had multiple equally good mapping positions to one of these mapping positions, randomly. This is perfectly fine when analyzing repeats by their type, independent of their position on the genome, which is what the authors did for the main conclusions of the work. However, several figures show the same type of repeat at different positions in the genome. Here, the authors risk that enrichment in one region of the genome 'spills' over to all other regions with the same sequence. Particularly, where they show YFP-KKT2 enrichment over intermediate- and mini-chromosomes (Fig. 7) due to the spillover, one cannot be sure to have found KKT2 in both regions. Instead, the authors could analyze only uniquely mapping reads / read-pairs where at least one mate is uniquely mapping. I realize that with this strict filtering, data will be much more sparse. Hence, I would suggest keeping the original plots and adding one more quantification where the enrichment over the whole region (e.g., all 177bp repeats on intermediate-/mini-chromosomes) is plotted using the unique reads (this could even be supplementary). This also applies to Fig. 4 B & C.

      Significance

      This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with mini-chromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes to take note of and build on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How and at which stages during the process, 70bp repeats are involved in antigenic variation is an unresolved, and very actively studied, question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

      My field of expertise / Point of view:

      I'm a computer scientist by training and am now a postdoctoral bioinformatician in a molecular parasitology laboratory. The laboratory is working on antigenic variation in T. brucei. The focus of my work is on analyzing sequencing data (such as ChIP-seq data) and algorithmically improving bioinformatic tools.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      In this article, the authors used the synthetic TALE DNA binding proteins, tagged with YFP, which were designed to target five specific repeat elements in Trypanosoma brucei genome, including centromere and telomeres-associated repeats and those of a transposon element. This is in order to detect and identified, using YFP-pulldown, specific proteins that bind to these repetitive sequences in T. brucei chromatin. Validation of the approach was done using a TALE protein designed to target the telomere repeat (TelR-TALE) that detected many of the proteins that were previously implicated with telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair and the protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated T. brucei mega base chromosomes, as well as in intermediate and mini-chromosomes, which imply that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosome.

      Major comments:

      Are the key conclusions convincing?

      The authors reported that they have successfully used TALE-based affinity selection of protein-associated with repetitive sequences in the T. brucei genome. They claimed that this study has provided new information regarding the relevance of the repetitive region in the genome to chromosome integrity, telomere biology, chromosomal segregation and immune evasion strategies. These conclusions are based on high-quality research and it is, basically, merits publication, provided that some major concerns, raised below, will be addressed before acceptance for publication.

      1. The authors used TALE-YFP approach to examine the proteome associated with five different repetitive regions of the T. brucei genome and confirmed the binding of TALE-YFP with Chip-seq analyses. Ultimately, they got the list of proteins that bound to synthetic proteins, by affinity purification and LS-MS analysis and concluded that these proteins bind to different repetitive regions of the genome. There are two control proteins, one is TRF-YFP and the other KKT2-YFP, used to confirm the interactions. However, there are no experiment that confirms that the analysis gives some insight into the role of any putative or new protein in telomere biology, VSG gene regulation or chromosomal segregation. The proteins, which have already been reported by other studies, are mentioned. Although the author discovered many proteins in these repetitive regions, their role is yet unknown. It is recommended to take one or more of the new putative proteins from the repetitive elements and show whether or not they (1) bind directly to the specific repetitive sequence (e.g., by EMSA); (2) it is recommended that the authors will knockdown of one or a small sample of the new discovered proteins, which may shed light on their function at the repetitive region, as a proof of concept.
      2. NonR-TALE-YFP does not have a binding site in the genome, but YFP protein should still be expressed by T. brucei clones with NLS. The authors have to explain why there is no signal detected in the nucleus, while a prominent signal was detected near kDNA (see Fig.2). Why is the expression of YFP in NonR-TALE almost not shown compared to other TALE clones?
      3. As a proof of concept, the author showed that the TALE method determined the same interacting partners enrichment in TelR-TALE as compared to TRF-YFP. And they show the same interacting partners for other TALE proteins, whether compared with WT cells or with the NonR-TALE parasites. It may be because NonR-TALE parasites have almost no (or very little) YFP expression (see Fig. S3) as compared to other TALE clones and the TRF-YFP clone. To address this concern, there should be a control included, with proper YFP expression.
      4. After the artificial expression of repetitive sequence binding five-TALE proteins, the question is if there is any competition for the TALE proteins with the corresponding endogenous proteins? Is there any effect on parasite survival or health, compared to the control after the expression of these five TALEs YFP protein? It is recommended to add parasite growth curves, for all the TALE-proteins expressing cultures.
      5. Since the experiments were performed using whole-cell extracts without prior nuclear fractionation, the authors should consider the possibility that some identified proteins may have originated from compartments other than the nucleus. Specifically, the detection of certain binding proteins might reflect sequence homology (or partial homology) between mitochondrial DNA (maxicircles and minicircles) and repetitive regions in the nuclear genome. Additionally, the lack of subcellular separation raises the concern that cytoplasmic proteins could have been co-purified due to whole cell lysis, making it challenging to discern whether the observed proteome truly represents the nuclear interactome.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      As mentioned earlier, the authors claim that this study has provided new information concerning telomere biology, chromosomal segregation mechanisms, and immune evasion strategies. But there are no experiments that establish a role for any unknown or known protein in these processes. Thus, it is suggested that the authors select one or two proteins of choice from the list and validate their direct binding to the repetitive region(s) and their role in that region of interaction.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The answer to this question depends on what the authors want to present as the achievements of the present study. If the achievement of the paper is the creation of a new tool for discovering new proteins associated with the repeat regions, I recommend that they add proof of direct interactions between a sample of the newly discovered proteins and the relevant repeats, as the proof of concept discussed above. However, if the authors would like to claim that the study achieved new functional insights into these interactions, they will have to expand the study, as mentioned above, to support the proof of concept.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      I think that they are realistic. If the authors decide to check the capacity of a small sample of proteins (not previously known as repetitive-region-binding proteins) to interact directly with the repeated sequence, it will add substantially to the study (e.g., by EMSA; estimated time: 1 month). If the authors also decide to check the function of at least one such newly detected protein (e.g., by knockdown), I estimate this will take 3-6 months.

      Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      Are the experiments adequately replicated, and statistical analysis adequate?

      The authors did not mention replicates. There is no statistical analysis mentioned.

      Minor comments:

      Specific experimental issues that are easily addressable.

      The following suggestions can be incorporated:

      1. Page 18: in the Materials and Methods section, the authors mention four drugs: blasticidin, phleomycin, G418, and hygromycin. It is recommended to state the purpose of using these selective drugs for the parasite. If clonal selection was done, this should also be mentioned.
      2. In the Methods section, the authors mention that there is only one binding site for NonR-TALE in the parasite genome. But in Fig. 1C, the authors show zero binding sites. So, is there one binding site for NonR-TALE-YFP in the genome, or zero?
      3. The authors used two different anti-GFP antibodies, one from Roche and the other from Thermo Fisher. Why were two different antibodies used for the same protein?
      4. Page 6: in the introduction, the authors give the number of total VSG genes as 2,634. Is it known how many of them are pseudogenes?
      5. I found several typos throughout the manuscript.
      6. Fig. 1C: Table: below TOTAL 2nd line: the number should be 1838 (rather than 1828)

      Are prior studies referenced appropriately?

      Yes

      Are the text and figures clear and accurate?

      Yes

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Suggested above

      Significance

      Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field:

      This study represents a significant conceptual and technical advancement by employing a synthetic TALE DNA-binding protein tagged with YFP to selectively identify proteins associated with five distinct repetitive regions of T. brucei chromatin. To the best of my knowledge, it is the first report to utilize TALE-YFP for affinity-based isolation of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of chromatin organization in these regions and provides a foundation for investigating the functional roles of associated proteins in parasite biology. Importantly, any essential or unique interacting partners identified could serve as potential targets for therapeutic intervention.

      Place the work in the context of the existing literature (provide references, where appropriate).

      I agree with the information already described in the submitted manuscript regarding the potential contribution of the resulting data and the established technology to the study of VSG expression, kinetochore mechanisms and telomere biology.

      State what audience might be interested in and influenced by the reported findings.

      These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as scientists investigating chromatin structure and the functional roles of repetitive genomic elements in higher eukaryotes.

      1. Define your field of expertise with a few keywords to help the authors contextualize your point of view. 2. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      1. Protein-DNA interactions/ chromatin/ DNA replication/ Trypanosomes
      2. None
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      This is a reply/revision plan, not definitive. Planned and already implemented revisions are underlined.

      First of all, we wish to express our gratitude to the reviewers: they helped to improve the paper.

      Reviewer #1:

      Reviewer #1 wrote: Major Comments: 1. Differential gene/pathway analysis across epithelial clusters: What are the differential genes or pathways among the epithelial clusters? Without CCA/Harmony integration, do the tumor subgroups show distinct differences? In addition, I suggest applying NMF or hdWGCNA to identify shared modules and test whether ATC and PTC harbor overlapping regulatory modules.

      Reply plan: Both reviewers suggested some regulatory network analysis. We propose to run SCENIC+ (Nature Methods, 2023, https://doi.org/10.1038/s41592-023-01938-4) on our data.

      Reviewer #1 wrote: 2. Validation of TSHR/TPO-based subgrouping: While the TSHR/TPO grouping appears appropriate for stratification at the single-cell level, it is necessary to exclude sequencing depth as a confounding factor. Should validate the existence of these subpopulations using mIHC/IF on corresponding samples.

      Reply plan: We made claims about RNA expression, not protein expression. Thus, validation should be at the RNA level:

      • We already replicated part of our analysis on the dataset published by Lu et al. (JCI 2023, https://doi.org/10.1172/JCI169653), see Figs. 3 and 4. This effort will be extended to all single cell analysis results from our study in the revised paper.
      • We will also present plots demonstrating that the sequencing depth is similar in the different cancer cell subgroups, further excluding it as a confounding factor.

      Reviewer #1 wrote: 3. Impact of mutational differences on conclusions: According to Supplementary Table 1, almost all PTC cases carried BRAF mutations, whereas four ATC patients harbored no BRAF mutation. Could this difference influence the conclusions of the study? Although the authors briefly mention this in the Discussion, a more thorough clarification is warranted.

      Reply plan: The dataset of Lu et al. includes BRAF-mutated ATCs along with BRAF-mutated PTCs. Therefore, the replication mentioned earlier will also address those concerns. In fact, Fig. 4E-I already confirms the ordered loss of markers in the Lu et al. data. Replication will be extended to the other results of the study and emphasized more in the paper.

      Reviewer #1 wrote: 4. The statement "Myeloid and T cells also grouped in specific clusters" seems descriptive. Is this clustering biologically meaningful? Please elaborate.

      Reply plan: This is an important point, and accordingly, a cell mixing experiment was specifically designed to separate technical effects from biological effects. We therefore know with certainty that the patient-specific myeloid and T cell clusters are the result of biological variation (Fig. 1). We further demonstrate that part of this variation is associated with hypoxia (Supp. Fig. 4). So yes, the clustering is biologically meaningful.

      Reviewer #1 wrote: Minor Comments: In Figure 2C, the "Epith TSHR-" population resembles myeloid cells. Could the authors clarify why this is the case? For the correlation analysis in Figure 2C, were highly variable genes or all genes used?

      Reply plan: There is a simple explanation: the Epith TSHR- population the reviewer is referring to consists of cells from anaplastic thyroid cancers (ATC), which are tumors notoriously infiltrated by macrophages (Supp. Fig. 4). A high correlation between the proportions of Epith TSHR- cells and macrophages across our panel of ATC and papillary cancers (PTC) is therefore expected. Among other things, Fig. 2C shows that high correlation, but it is not meant to show, and does not show, that Epith TSHR- cells and macrophages "resemble" one another. It shows that their proportions are highly positively correlated. This correlation analysis does not rely on gene expression but on cell type proportions. It measures co-occurrence rather than resemblance. The text has been clarified in order to prevent any confusion.
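      The distinction between correlating cell type proportions and correlating expression profiles can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample names and cell counts are hypothetical, not data from the study, and the helper functions `proportions` and `pearson` are not the code used for Fig. 2C. It only shows that the input to such an analysis is one proportion per sample and per cell type, with no gene expression values involved.

```python
from math import sqrt


def proportions(counts):
    """Convert per-sample cell counts into per-sample cell type proportions."""
    result = {}
    for sample, by_type in counts.items():
        total = sum(by_type.values())
        result[sample] = {cell_type: n / total for cell_type, n in by_type.items()}
    return result


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical cell counts per sample (illustrative only, not study data).
counts = {
    "ATC_a": {"Epith TSHR-": 600, "Macrophage": 300, "T cell": 100},
    "ATC_b": {"Epith TSHR-": 500, "Macrophage": 400, "T cell": 100},
    "PTC_a": {"Epith TSHR-": 50, "Macrophage": 50, "T cell": 900},
    "PTC_b": {"Epith TSHR-": 100, "Macrophage": 100, "T cell": 800},
}

props = proportions(counts)
samples = sorted(props)

# One value per sample for each cell type: the inputs are proportions only.
epith = [props[s]["Epith TSHR-"] for s in samples]
macro = [props[s]["Macrophage"] for s in samples]

r = pearson(epith, macro)
print(f"Correlation of proportions across samples: {r:.2f}")
```

      In this toy panel the two proportions rise and fall together across samples, so the coefficient is strongly positive even though nothing has been said about whether the two cell types have similar transcriptomes.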

      Reviewer #2:

      Reviewer #2 wrote: 1. This study largely confirms established facts that 1) PTC due to BRAF driver mutation is a heterogeneous tumour entity and 2) ATC is the most dedifferentiated of all thyroid cancers. Although interesting, observations of a highly variable tissue cell composition including immune cells and the gradual loss of thyroid differentiation markers, in part linked to tumor subclone development featured by altered chromosomal copy numbers, are thus not surprising.

      Reply plan: We wish to respectfully express our take on this perception of the work:

      • There is a difference between conjecturing a high heterogeneity in the cell composition of thyroid cancers and establishing it with the level of accuracy and quantitative rigor our analysis provides. The extreme amplitude of that variation was surprising to us: the microenvironment makes up 8.4% to 80% of the cells in PTCs driven by the same BRAF mutation.
      • We don't simply show that a subclone characterized by a large number of copy number events is less differentiated. We go all the way to proving that those copy number alterations are associated with specific cell states that produce specific histology (Fig. 5). It required a combination of single cell transcriptomics, spatial transcriptomics and sophisticated computational analysis to establish that connection between genomic changes and histology. The fragmentation of epithelial sheets uncovered from CNV analysis had at first escaped our attention and that of our pathologist colleagues; to our knowledge, this is not a parameter typically assessed in diagnostics.
      • We don't simply show that there is a gradual loss of differentiation markers: this loss is ordered in a very specific way that mirrors the gain of markers during thyroid organoid differentiation.

      Reviewer #2 wrote: 2. Considering tumor progression, comparison of PTC and ATC should preferably include specimens with the same driver mutation (BRAF or RAS), which is not the case here. This notion should be more clearly explained to readers. An optional improvement would be to conduct similar analyses on an ATC specimen that contains more differentiated PTC tumor portions arguably suggesting that PTC progresses to ATC (by mechanisms that are still largely unexplored).

      Reply plan: This is clearly a limitation of our study. As already proposed in our reply to Reviewer #1, we will extend to all our single cell results the replication of our analysis in the dataset of Lu et al., which includes ATCs and PTCs harboring BRAF mutations.

      Reviewer #2 wrote: 3. Comments on findings of lymphocytic infiltration need to be balanced. Although autoimmune thyroid disease is inferred to be a risk factor for developing malignancy, it is unlikely that the majority of TCGA samples of PTC is associated with thyroiditis as indicated in Fig. 3 and Suppl Fig. 3. Immune cell abundance may rather reflect the tumor immune microenvironment (TIME).

      Reply plan: The figure the reviewer is referring to demonstrates that PTC occurring in a background of thyroiditis also has a higher proportion of B cells. We did not claim, and the figure did not show, that "the majority of TCGA samples of PTC is associated with thyroiditis", because it is not. This point has been clarified.

      Reviewer #2 wrote: 4. Some tissue sections seem of quite poor quality, either shape-wise or containing rifts, e.g. PTC7 in Fig. 3 and PTC2 in Fig. 5. The authors should explain whether and how this might influence analysis.

      Reply plan: Spatial transcriptomics is typically performed on frozen sections, which are obviously of lower visual quality than slices from FFPE-preserved samples. Since no computational analyses were performed on the image, this lower quality has no impact on our results. Regarding RNA quality, the RINs were >7 for all tumors. RINs are now presented in Supp. Table S1.

      Reviewer #2 wrote: 5. The experiment on mouse ESC/organoids (Fig. 6H-J) does not show much of an expected enhanced thyroid progenitor cell proliferation after induction of the mutant Braf allele by tamoxifen, which raises doubt whether the subsequent promoted growth by fibronectin at all is oncogene-related. This differs from the impact of BrafCA activation along with mouse thyroid development in vivo (Schoultz et al iScience 2023 PMID: 37534159). In the same experimental setup, it appears that mutant Braf prevents follicle formation (Fig. 6I). A control experiment investigating the influence of fibronectin in the absence of oncogene activation should be conclusive. The effect of Braf and fibronectin on thyroid organoid structure and function should be better explained, if necessary based on complementary experiments, and discussed in relation to the claimed association of fibronectin expression to "...low amounts of thyroid differentiation markers..." and "...loss of epithelial structure (PTC7, Figure 6E)." in the previous section of Results.

      Reply plan: The induction of the mutant Braf allele for 7 days increases the percentage of BrdU+ cells by 1.43-fold (p-value for Wilcoxon test = 0.035). The effects observed by Schoultz et al. are certainly more dramatic, but they result from an oncogenic activity spanning 1 to 6 months (4 to 26 times longer) in an in vivo model. Most importantly, oncogenic activity is initiated in Nkx2.1+ cells and not Tg+ cells, thus much earlier during development. These two models are thus not comparable. As for the effects of fibronectin on thyroid structure, we do not claim that our organoid model recapitulates the complex interactions between cancer cells and their microenvironment that shape tissue morphology in vivo. This is now clarified in the text.

      We presented controls with no oncogene expression and no Fn1, controls with oncogene induction and no Fn1, and organoids with oncogene induction and Fn1 treatment. This alone establishes the effect of Fn1 on induced organoids, which was our goal. We regard it as a novel and interesting but non-essential development in our paper.

      As the reviewer points out, while our results show an increased proliferation in Braf-mutated organoids treated with Fn1, they do not allow us to draw conclusions about any potential interaction between Fn1 and the oncogenic process. The suggested experiment with Fn1 in the absence of oncogene activation would add information, but we cannot follow up for practical lab management reasons detailed in Section 4 below.

      Reviewer #2 wrote: 6. Concerning EMT profiling (Supplementary Fig. 7B), there is a great similarity of ATC tumor cells and fibroblasts, and as stated in the text the malignant status of the former is indicated by chromosomal aberrations (referring to Suppl fig. 6). However, looking at Suppl. Fig. 7B it is evident that fractions of cells identified as fibroblasts express TG and TSHR suggesting mismatch. How was this comparison done in order to exclude mismatch? Are there no other profiled markers that distinguish cancer cells from stromal cells that can support conclusions?

      Reply plan: TG, a thyrocyte marker, seems to be expressed by fibroblasts in Supplementary Figure 7B. The reviewer suggests this could be caused by an incomplete distinction between bona fide fibroblasts and thyrocytes in an advanced EMT state. We argue that:

      • Ambient TG RNA leaking out of thyrocyte nuclei contaminates the transcriptomes of all cell types. It is a well-known technical problem, with dedicated software packages to mitigate it. We preprocessed our data with one of them, SoupX, which corrected for most, but not all, ambient RNA contamination.
      • The plot below shows that there is nothing special about fibroblasts in that respect. For example, B and T cells are contaminated by TG at levels comparable to fibroblasts, and endothelial cells and pericytes at higher levels.
      • In addition, the UMAP of Fig. 2A shows that EMT cells and fibroblasts form very distinct clusters. Furthermore, the fibroblast cluster, but not the two EMT clusters, contains cells from PTC, and the PTC cluster does not contain cells with DNA copy number aberrations. Thus, although both EMT cells and fibroblasts express the typical mesenchymal markers of Supplementary Fig. 7B, they are easy to distinguish on the basis of their overall transcriptomes.
      • The panel below has been added to the Supplementary Figure 7B. [Panel cannot be displayed here]

      Reviewer #2 wrote: In the same figure, it appears there are no clear differences in EMT marker expression among PTC samples regardless of differentiation state, suggesting that the gradual loss of thyroid differentiation in PTC tumor cells and EMT are not parallel and potentially linked phenomena? Please clarify this dissociation of results. Is it possible that refocusing on other EMT markers than the top 10, of which almost all concerns various collagen genes, might better reveal partial EMT in PTCs?

      Reply plan: The technical basis of this comment is related to the previous point. Our perception is that the mesenchymal markers in Supplementary Fig. 7B show a binary effect, i.e. strong expression in ATC and no expression in PTC (beyond ambient RNA noise), not a gradual effect. Thus, there is no correlation of COL1A1 and other mesenchymal markers with dedifferentiation in PTC, as these markers are not expressed beyond the noise level of the experiments. A lot has been written about EMT in PTC, but one of the findings of our study is that while ATCs undergo full EMT, EMT in PTCs is very limited. PTCs express FN1 but no other major mesenchymal markers such as collagens I and III, for example.

      Reviewer #2 wrote: 7. According to Suppl. Table 1, the ATC2 tumor does not harbor any mutations. What about chromosomal aberrations, was that included in analysis? Considering previous consistent reports of a high mutation burden in ATC, if not supported by other data (clinical, pathological) the diagnosis might be questioned for this particular case included in multiple analyses of the present study.

      Reply: There is little doubt about the diagnosis of ATC2 made by our pathologist collaborators:

      • The histology of this tumor is strikingly anaplastic, i.e. without structure, as shown in the image below.
      • This tumor has a high level of macrophage infiltration typical of ATCs (Supplementary Fig. 4).

      Reviewer #2 wrote: Minor comments: The logical order of presentation of Results might benefit from first presenting specific PTC data followed by ATC ditto. I'm thinking of swapping the section of EMT in ATC to end of Results.

      Reply plan: We do not see why the reviewer thinks that way. We believe that discussing the microenvironment and then the tumor cells brings conciseness and clarity to how we propose to stratify the latter. By contrast, the suggested tumor type-centered structure entails going back and forth between the microenvironment and tumor cells, diluting the messages about both.

      Reviewer #2 wrote: Methods paragraph "Mouse ESC-derived thyroid organoids experiments" (starting with "ccc") seems to be missing some essential information.

      Reply plan: A sentence was missing, indeed, and has been re-introduced in the manuscript. We thank the reviewer for catching that error.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Expression pattern profiling of human thyroid cancer tissues by combining single cell/nuclei RNAseq analysis, spatial transcriptomics and immunofluorescence on corresponding tumor histologic sections. Papillary and anaplastic thyroid carcinomas (PTC n=10 and ATC n=4) were compared; some data were extracted from TCGA. The results indicate that ATCs consist of completely dedifferentiated tumor cells whereas PTCs show variable levels of dedifferentiation, which in a sense mimics the reverse process of thyroid differentiation as observed in stem cell-based organoids. Moreover, PTC and ATC tumors show different levels of epithelial-mesenchymal transition. Fibronectin is inferred to have a role in promoting tumor growth, supported by functional studies on organoids. The authors suggest that global profiling of differentiation state is a promising technique to stratify tumor heterogeneity, which might potentially be useful for distinguishing thyroid malignancies that are suitable or unsuitable for adjuvant treatment, e.g. radioiodine (RAI) therapy.

      Major comments:

      1. This study largely confirms established facts that 1) PTC due to BRAF driver mutation is a heterogeneous tumour entity and 2) ATC is the most dedifferentiated of all thyroid cancers. Although interesting, observations of a highly variable tissue cell composition including immune cells and the gradual loss of thyroid differentiation markers, in part linked to tumor subclone development featured by altered chromosomal copy numbers, are thus not surprising.
      2. Considering tumor progression, comparison of PTC and ATC should preferably include specimens with the same driver mutation (BRAF or RAS), which is not the case here. This notion should be more clearly explained to readers. An optional improvement would be to conduct similar analyses on an ATC specimen that contains more differentiated PTC tumor portions arguably suggesting that PTC progresses to ATC (by mechanisms that are still largely unexplored).
      3. Comments on findings of lymphocytic infiltration need to be balanced. Although autoimmune thyroid disease is inferred to be a risk factor for developing malignancy, it is unlikely that the majority of TCGA samples of PTC is associated with thyroiditis as indicated in Fig. 3 and Suppl Fig. 3. Immune cell abundance may rather reflect the tumor immune microenvironment (TIME).
      4. Some tissue sections seem of quite poor quality, either shape-wise or containing rifts, e.g. PTC7 in Fig. 3 and PTC2 in Fig. 5. The authors should explain whether and how this might influence analysis.
      5. The experiment on mouse ESC/organoids (Fig. 6H-J) does not show much of an expected enhanced thyroid progenitor cell proliferation after induction of the mutant Braf allele by tamoxifen, which raises doubt whether the subsequent promoted growth by fibronectin at all is oncogene-related. This differs from the impact of BrafCA activation along with mouse thyroid development in vivo (Schoultz et al iScience 2023 PMID: 37534159). In the same experimental setup, it appears that mutant Braf prevents follicle formation (Fig. 6I). A control experiment investigating the influence of fibronectin in the absence of oncogene activation should be conclusive. The effect of Braf and fibronectin on thyroid organoid structure and function should be better explained, if necessary based on complementary experiments, and discussed in relation to the claimed association of fibronectin expression to "...low amounts of thyroid differentiation markers..." and "...loss of epithelial structure (PTC7, Figure 6E)." in the previous section of Results.
      6. Concerning EMT profiling (Supplementary Fig. 7B), there is a great similarity of ATC tumor cells and fibroblasts, and as stated in the text the malignant status of the former is indicated by chromosomal aberrations (referring to Suppl fig. 6). However, looking at Suppl. Fig. 7B it is evident that fractions of cells identified as fibroblasts express TG and TSHR suggesting mismatch. How was this comparison done in order to exclude mismatch? Are there no other profiled markers that distinguish cancer cells from stromal cells that can support conclusions? In the same figure, it appears there are no clear differences in EMT marker expression among PTC samples regardless of differentiation state, suggesting that the gradual loss of thyroid differentiation in PTC tumor cells and EMT are not parallel and potentially linked phenomena? Please clarify this dissociation of results. Is it possible that refocusing on other EMT markers than the top 10, of which almost all concerns various collagen genes, might better reveal partial EMT in PTCs?
      7. According to Suppl. Table 1, the ATC2 tumor does not harbor any mutations. What about chromosomal aberrations, was that included in analysis? Considering previous consistent reports of a high mutation burden in ATC, if not supported by other data (clinical, pathological) the diagnosis might be questioned for this particular case included in multiple analyses of the present study.

      Minor comments:

      • The logical order of presentation of Results might benefit from first presenting specific PTC data followed by ATC ditto. I'm thinking of swapping the section of EMT in ATC to end of Results.
      • Methods paragraph "Mouse ESC-derived thyroid organoids experiments" (starting with "ccc") seems to be missing some essential information.

      Significance

      The study confirms at the single cell level the fundamental difference between PTC and ATC that is evident clinically and biologically, but does not address the intriguing issue of how ATC may progress from PTC.

      Tumor heterogeneity of BRAFV600E-driven PTC in terms of dedifferentiation of functional parameters, which are of potential clinical relevance, is well documented.

      Reviewer expertise: thyroid development, thyroid cell and tumor biology, superficial knowledge in scRNAseq analysis

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      This study is well designed with a rational sample collection strategy. The authors collected PTC and ATC tissue samples for snRNA and spRNA sequencing, clearly characterizing tumor heterogeneity. Using representative thyroid differentiation markers (TSHR, TPO, TG, NIS), they distinguished different differentiation states of PTC and ATC and further validated the role of FN1 in organoid models. However, the manuscript is largely descriptive in nature, and several key issues remain to be addressed.

      Major Comments:

      1. Differential gene/pathway analysis across epithelial clusters: What are the differential genes or pathways among the epithelial clusters? Without CCA/Harmony integration, do the tumor subgroups show distinct differences? In addition, I suggest applying NMF or hdWGCNA to identify shared modules and test whether ATC and PTC harbor overlapping regulatory modules.
      2. Validation of TSHR/TPO-based subgrouping: While the TSHR/TPO grouping appears appropriate for stratification at the single-cell level, it is necessary to exclude sequencing depth as a confounding factor. Should validate the existence of these subpopulations using mIHC/IF on corresponding samples.
      3. Impact of mutational differences on conclusions: According to Supplementary Table 1, almost all PTC cases carried BRAF mutations, whereas four ATC patients harbored no BRAF mutation. Could this difference influence the conclusions of the study? Although the authors briefly mention this in the Discussion, a more thorough clarification is warranted.
      4. The statement "Myeloid and T cells also grouped in specific clusters" seems descriptive. Is this clustering biologically meaningful? Please elaborate.

      Minor Comments:

      In Figure 2C, the "Epith TSHR-" population resembles myeloid cells. Could the authors clarify why this is the case? For the correlation analysis in Figure 2C, were highly variable genes or all genes used?

      Significance

      This study provides a comprehensive single-nucleus and spatial transcriptomic atlas of papillary and anaplastic thyroid carcinomas. Its strengths include well-designed sample collection, high-resolution profiling of tumor heterogeneity, and validation of FN1 function. By stratifying malignant cells with thyroid differentiation markers (TSHR, TPO, TG, NIS), the authors delineate differentiation states and highlight mechanisms of progression from PTC to ATC. However, the study remains mainly descriptive, and additional analyses of gene modules and pathway regulation would increase its conceptual depth. The findings will interest researchers in thyroid cancer, tumor heterogeneity, and the single-cell/spatial genomics field, with potential relevance for translational oncology.

      Field of expertise: thyroid cancer biology, single-cell and spatial transcriptomics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Detailed point-by-point response

      The Reviewers provided suggestions to improve the manuscript, most notably by adding experiments to (1) further support the role of Stim and Orai in epidermal heat-off responses and (2) further characterize the thermosensory responses of epidermal cells. We additionally propose to include a new set of calcium imaging experiments to visualize nociceptor sensitization by epidermal cells.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Drosophila larvae are known to respond to noxious stimuli by rolling. The authors propose that this response arises not only from the sensory response of nociceptive neurons but also from a direct response of larval epidermal cells. They go on to test this idea by independently manipulating epidermal cells and nociceptive sensory neurons using GAL4 lines, GCaMPs and RNAis. The behavioural data are convincing and presented clearly with good statistical analysis. However, the involvement of epidermal cells in evoking the behaviour, as well as STIM/Orai-mediated Ca2+ entry, requires further experiments. Use of another independent GAL4 strain for epidermal cells, alternate RNAi lines for STIM and Orai, mutants for STIM and Orai and overexpression constructs for STIM and Orai would significantly enhance the data. Thus, as of now the key results are not yet fully convincing. The following additional experiments would be required to support their claims:

      1) Either use a second epidermal GAL4 strain to show key results OR provide images of the epidermal GAL4 expression double-labelled with a ppk driver using a different fluorescent protein to establish NO overlap of the epidermal GAL4 with neurons. These strains should be available free in Bloomington.

      We agree that the specificity of the GAL4 driver is an important point. In a recent publication (Yoshino et al, eLife, 2025) we provide the most comprehensive analysis of larval epidermal GAL4 drivers published to date. Included in this study is expression analysis of R38F11-GAL4 demonstrating that it is indeed specifically expressed in the epidermis. Based on the detailed expression analysis and functional analysis provided in that paper, R38F11-GAL4 was chosen for these studies as it is both highly specific for epidermal cells and provides uniform expression across the body wall.

      In our revised manuscript, we will more clearly detail how the driver was chosen for this study and provide a citation to the prior work to accompany our description of R38F11-GAL4 as an epidermis-specific driver line.

      2) Authors need to provide better data for the involvement of STIM and Orai in the Calcium responses observed. A single RNAi for each gene with marginal change in response is insufficient. The authors also do not state if the RNAis used are validated by them or anyone else. Minimally they should repeat their experiments with at least one other validated RNAi and rescue these with overexpression constructs of STIM and Orai (available in Bloomington). It is well established in literature that overexpression of STIM/Orai can rescue SOCE in Drosophila. Ideally, to be fully convincing they should test a Drosophila knockout for STIM (available in Bloomington). Heterozygotes of this are viable and should be tested. Additionally a UAS Orai dominant negative (OraiDN) strain is available in Bloomington and can be tested.

      We appreciate the Reviewer’s perspective on the importance of characterizing the efficacy of the reagents we used in this study. However, we disagree with the characterization of the change in response as “marginal”. Our results demonstrate that epidermal knockdown of Stim or Orai causes a significant reduction in the heat-off response of epidermal cells and heat-induced nociceptive sensitization.

      In a prior published study (Yoshino et al, eLife, 2025) we validated the efficacy of these RNAi lines in combination with the same GAL4 driver at the same developmental stage. Specifically, we demonstrated that R38F11-GAL4-mediated expression of UAS-Stim RNAi or UAS-Orai RNAi significantly attenuated store-operated calcium entry following store depletion by thapsigargin. In the revised manuscript, we will add a statement referring to this prior validation along with a citation. In light of this prior characterization, we disagree that additional RNAi lines are required to corroborate the results.

      The most salient point of the Reviewer’s comment is that additional evidence should be provided to demonstrate more convincingly the requirement of Stim/Orai in epidermal heat-off responses. We detail our plans to address this point below, but first address the specific experimental suggestions the Reviewer provides.

      First, the Reviewer suggests the use of a dominant-negative version of Orai, and we agree that this could prove complementary to our RNAi experiments.

      The Reviewer suggests two additional genetic approaches which are well-reasoned but problematic. First, they suggest rescuing the RNAi knockdowns with overexpression approaches. In addition to requiring the generation of new, RNAi-refractory transgenes, this approach is confounded by the effects of overexpressing CRAC channel components. Orai channels exhibit highly cooperative activation by Stim, and we previously showed that epidermal Stim overexpression drove mechanical nociceptive sensitization. Although this dosage effect confounds the rescue assays, we will examine whether epidermal Stim overexpression similarly sensitizes larvae to noxious thermal inputs as we would predict from our model.

      The final experiment the Reviewer suggests – phenotypic analysis of Stim knockouts – is not possible due to the lethal phase of the mutants. Furthermore, it is not possible using traditional mosaic analysis to generate mutant epidermal clones that span the entire epidermis. Such an approach might be possible with a newly engineered FLP-out Stim allele, but generating that reagent is beyond the scope of this work. The Reviewer suggests characterization of Stim heterozygotes, but Drosophila genes rarely show strong dosage effects as heterozygotes (though we acknowledge that dosage effects can be amplified in the cases of genetic interactions), hence a negative result (no effect on heat-off responses) would not be meaningful. In principle we could test whether Stim heterozygosity enhances effects of epidermal Stim RNAi. Although a negative result will not be telling, the experiment is straightforward, and an enhancement of the effect of Stim RNAi would support the model that RNAi provides an incomplete functional knockdown of Stim. We will therefore perform this experiment and incorporate the results into the revised manuscript, pending a positive outcome.

      To better define the contributions of Stim and Orai to heat-off responses of epidermal cells, we will incorporate results from the following new experiments into our revised manuscript:

      • We will monitor effects of epidermis-specific expression of a dominant negative form of Orai on epidermal heat-off responses (calcium imaging) and heat-induced nociceptive sensitization (behavioral assays).
      • We will monitor effects of epidermis-specific co-expression of Stim+Orai RNAi on epidermal heat-off responses (calcium imaging) and heat-induced nociceptive sensitization (behavioral assays).
      • Orai channels exhibit highly cooperative activation by Stim, therefore we will examine whether epidermal Stim overexpression increases the amplitude of heat-off responses (calcium imaging) and sensitizes larvae to noxious thermal inputs (behavioral assays) as we would predict from our model.

        Minor comments that can be addressed:

      1) Figure 1: Further details required on how the rolling response is measured. Figure is uninformative. A video would be really helpful.

      We appreciate the suggestion. We will add a more detailed explanation of how the behaviors were scored along with an annotated video.

      2) I could not find Figure 1I described in the text. This section should be explained properly.

      Figure 1I is described in the figure legend and we will add an in-text citation.

      3) Figure 3: There appears to be a small response at 32°C - why is this ignored in the text? It would be useful to have S3 in the main figure.

      The small response at 32°C is not ignored, though that individual response is better understood in the context of all responses plotted in Figure 3D. We will reword the phrase “At temperature maxima below 35°C epidermal cells rarely exhibited heat-off responses” to reflect the small response that is observed at lower temperatures. We will also replace the trace in the figure – the original submission contained the one outlier sample that exhibited robust responses at 32°C.

      We appreciate the suggestion to include Fig S3 in the main text – we initially included it, but moved it to the supplement for space considerations. We will include it as a main figure in our revised submission.

      4) Fig 4: The DF/F traces for the two RNAis should be included in this figure.

      We appreciate the suggestion; we will add these traces to our revised submission.

      5) Extent of knockdown in the epidermis by each RNAi should be shown by RTPCRs.

      We note that efficacy of the knockdowns has been validated by us in acutely dissociated epidermal cells. RTPCR validation as described would require FACS-sorting of acutely dissociated, GFP-labeled epidermal cells from each specimen, an extremely time- and resource-intensive experiment that provides limited information. The more relevant information is the physiological readout of Stim/Orai functional knockout using these reagents, which we previously conducted. As described above, we will add a description of these experiments and the relevant citation.

      6) The authors need to explain why only a small change in the Ca2+ response is seen with either RNAi. Are there other Ca2+ channels involved? Ideally they could test mutants/RNAi for the TRP channel family. Loss of SOCE in Drosophila neurons changes the expression of other membrane channels - is this possible here? Minimally, this possibility needs to be discussed.

      We agree with the Reviewer that this topic warrants further discussion. Pending the results of our planned experiments (Orai dominant negative, Stim+Orai RNAi), we will incorporate a discussion of other channels that may contribute to the heat-off response. We appreciate the Reviewer's point that loss of SOCE in Drosophila neurons can change the expression of membrane channels – that is an intriguing possibility that might explain the modest effects of Stim or Orai knockdown. We have not investigated effects of epidermal Stim/Orai knockdown on expression of other channels, but will incorporate this possibility into our discussion.

      7) In the methods section please explain how the % DF/F calculations are done and how are they normalised to the ionomycin response.

      We will incorporate these additional details in the methods section.
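
      For orientation, the kind of calculation the Reviewer asks about can be sketched as follows. This is a minimal illustration of a conventional %ΔF/F0 computation and a normalization to a reference peak, not the procedure used in the study, which is not described in this response. The function names, the choice of the first frames as baseline, and the example values are all assumptions made for illustration.

```python
from statistics import mean


def percent_dff(trace, baseline_frames):
    """Return %dF/F0 per frame, taking F0 as the mean of the first baseline_frames.

    Using the leading frames as baseline is an assumption for this sketch.
    """
    if baseline_frames <= 0 or baseline_frames > len(trace):
        raise ValueError("baseline_frames must be between 1 and len(trace)")
    f0 = mean(trace[:baseline_frames])
    if f0 == 0:
        raise ValueError("baseline fluorescence must be non-zero")
    return [100.0 * (f - f0) / f0 for f in trace]


def normalize_to_reference(dff, reference_peak):
    """Express each %dF/F0 value as a fraction of a reference peak response.

    reference_peak is hypothetical here, e.g. the peak %dF/F0 measured
    after ionomycin in the same sample.
    """
    if reference_peak == 0:
        raise ValueError("reference_peak must be non-zero")
    return [v / reference_peak for v in dff]


# Made-up fluorescence values (arbitrary units), three baseline frames.
trace = [100.0, 100.0, 100.0, 150.0, 200.0, 120.0]
dff = percent_dff(trace, baseline_frames=3)
normalized = normalize_to_reference(dff, reference_peak=200.0)
```

      Expressing each response as a fraction of a saturating reference makes samples with different indicator expression levels comparable, which is presumably why the Reviewer asks how the normalization was done.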

      8) Authors need to look at previous work on STIM and Orai in Drosophila and reference appropriately.

      We appreciate the suggestion and will incorporate additional discussion of relevant Drosophila work on STIM and Orai.

      **Referees cross-commenting**

      Reviewers 2 and 3 have raised some additional queries to what I had mentioned in my review. I agree with their comments. The authors should attempt to address all comments by all three reviewers.

      We address their comments below.

      Reviewer #1 (Significance (Required)):

      This is an interesting study that identifies epidermal cells in Drosophila with the ability to sense a drop in temperature after receiving noxious heat stimuli and invoke appropriate behaviour. Behaviour experiments are well conducted and convincing. So far only nociceptive neurons were thought to control such behavioural responses so the work is significant and important for the field. The mechanism identified needs further convincing and I have suggested experiments that would be of help. With the additional experiments suggested the work will be of interest to neuroethologists, Drosophila neuroscientists and scientists in the field of Ca signaling.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Noxious heat can have a strong adverse effect on animals, resulting in sensitization when noxious thermal stimuli are applied repeatedly. Noxious heat induces a characteristic rolling behavior in Drosophila melanogaster larvae. This study investigates sensitization, whereby a second heat stimulus evokes this behavior with significantly shorter latency (e.g., 3.4 seconds) than the initial exposure (e.g., 8.79 seconds). While prior research has implicated central and peripheral neurons in this process, recent findings in mammalian systems suggest a role for keratinocytes.

      In this manuscript, Yoshino et al. report that epidermal cells are necessary and sufficient to mediate heat sensitization in D. melanogaster larvae. Using an ex vivo epidermal imaging system, the authors demonstrate that calcium influx in epidermal cells is crucial for sensitization. Importantly, this calcium influx was observed only when the temperature was lowered from a dangerously high to a safe temperature. The calcium channel system Orai and Stim facilitates this influx.

      Major comments:

      (1) The authors clearly demonstrate the heat-off reaction using calcium influx imaging. However, all of the imaging shows the response to the first stimulation. Since the study focuses on sensitization, which shows a quicker response to the second heat stimulus, it would be helpful if the authors showed calcium influx when the second stimulus was applied. It would also be interesting to see how many times the epidermal cells can react to heat stimulation.

      We appreciate the suggestion from the Reviewer but note that the calcium influx we show occurs in epidermal cells, which signal to neurons to potentiate future responses in our model. We have emphasized this point in our revised manuscript.

      The relevant response to visualize the sensitization is the heat-evoked calcium response in nociceptors, not epidermal cells. We have verified that C4da neurons exhibit calcium responses to the warming stimulus we use in our heat-off paradigm and our preliminary studies suggest that the heat-off stimulus potentiates future responses to noxious heat in nociceptors. We will therefore examine (1) whether epidermal stimulation triggers a sensitization of nociceptors to thermal stimuli by monitoring heat-induced calcium responses using GCaMP, and (2) whether epidermal Stim and Orai are required for this sensitization.

      The second comment addresses the response of epidermal cells to repeated rounds of stimuli. We agree that this is an interesting point. We have verified that epidermal cells indeed respond to multiple rounds of heat-off stimuli. We will incorporate results from a paradigm in which epidermal cells are presented with two successive heat-off stimuli, spaced by 5 minutes to allow epidermal cytosolic calcium to return to baseline. We will incorporate new analysis examining the relative magnitude of epidermal cell responses to the first and second stimuli.

      (2) Figure 5 only shows one condition: a 30-second interval between the first and second heat application. While the rolling latency of the Luciferase RNAi control ranges from 4 to 12 seconds (with a median of 5 seconds), Fig. 1E shows a latency ranging from 6 to 12 seconds (with a median of 10 seconds) under the same 30-second interval conditions. This difference makes interpreting the effect of Stim and Orai confusing. The authors need to clarify whether the knockdowns accelerate the first response or delay the second response.

      The Reviewer notes that we assayed effects of Stim/Orai RNAi on heat-induced nociceptive sensitization in only one paradigm. Given the kinetics of cytosolic calcium increases following Stim or Orai RNAi in epidermal cells (Fig. 4F), we agree that an additional set of behavior experiments investigating sensitization following a 60 s recovery is warranted. For our revision we will conduct a time-course to assay requirements of epidermal Stim and Orai (using epidermal expression of Stim/Orai RNAi and Orai dominant negative transgenes) on heat-induced nociceptive sensitization. Our preliminary studies indicate that Stim and Orai RNAi significantly reduce heat-induced sensitization following 60 s of recovery (we present results from 30 s of recovery in the original submission).

      The Reviewer raises some questions about differences in behavioral latencies in Figure 1E and Figure 5B. We intentionally avoid such comparisons both because the genetic backgrounds are different and the experiments were conducted at very different times (more than 1 year apart). In both experiments the salient feature that we discuss is the presence or absence of sensitization, not the mean latency. We note that we do compare mean latency values in Figure 1B, but that was a distinct experimental paradigm (global heat of variable temperatures followed by focal noxious heat) designed specifically to define heat stimuli that generate the maximum level of sensitization. In that case, the genotype was fixed and all assays were conducted concurrently.

      Minor comments:

      (i) In Fig. 2C´´, the authors observed clear calcium influx in epidermal cells by combining the GCaMP genetic tool with an ex vivo thermal perfusion system. Although this system applies heat uniformly across the epidermal tissue, calcium influx is spatially restricted, appearing primarily in the head and tail regions of the epidermis. These results suggest that the heat-responsive epidermal cells are localized to these regions or that there are regional differences in sensitivity. The authors should explain the spatial relationship between the heat-applied epidermal cells and the occurrence of calcium influx.

      The Reviewer notes that the epidermal GCaMP signal is particularly intense in the anterior and posterior portions of the fillet preparation (Fig. 1B-1C), and we agree that it would be useful to include an explanation of this result, which is an artifact of the sample preparation.

      The specimens we use for calcium imaging are “butterfly” preparations – the body wall is filleted along the long axis with the exception of regions at the head and tail that are pinned down on Sylgard plates. Hence, the regions in the head and tail contain intact tissue (including a double layer of skin when we image in widefield), not a single layer of skin (the rest of the prep). More significantly, the head and tail regions are pinned down, creating a wound that triggers lasting local calcium transients (note signal in the absence of temperature stimulus, Figure 1B’ and 1B”, 1C’). We therefore exclude this region from our analysis. We note that our behavior studies relied on stimuli presented to the abdominal segments we sample in the semi-intact calcium imaging. Similarly, we dissociated epidermal cells exclusively from these segments for imaging of acutely isolated epidermal cells.

      We do note that there is a periodicity to the signal – within each segment there are local maxima and minima of signal, and we agree with the Reviewer that this spatial segregation is an interesting point for discussion. We will add 1-2 sentences to our discussion of the result to acknowledge this point.

      (ii) Related to comment (i) above, if heat stimuli are applied topically using a heat probe under the ex vivo imaging system, how large an area reacts to the stimuli?

      The Reviewer raises an interesting question about the local response to heat stimuli. In our dissociated cell experiments we found that the overwhelming majority of isolated epidermal cells exhibit heat-off responses, and we likewise find that the majority of cells in our semi-intact preparation respond to heat-off stimuli. However, our current probe for delivering local heat stimuli is not compatible with our imaging system. We are working to incorporate an IR laser to focally deliver heat stimulus to explore whether epidermal cells signal to neighbors following stimulation, but such studies are beyond the scope of the current work.

      (iii) Providing supplementary movie(s) of the calcium live imaging would enhance the reader's understanding.

      We agree with the Reviewer that this would be a useful supplement. We will add representative movies as experimental supplements in our revised manuscript.

      (iv) The time point of the image in Fig. 2C´ ("before heat") is not the most informative for demonstrating a "heat-off" response. The authors should replace it with an image taken during the heat application to provide a more direct comparison with the post-stimulus influx shown in Fig. 2C´´.

      We appreciate the Reviewer’s suggestion and agree this would be a better choice to visually represent the change in fluorescence induced by the heat-off response. We will make this change in our revised manuscript.

      (v) The authors state that sensitization occurs "primarily in the 30-45 ºC range." However, rolling probability and latency changed in opposite directions at 45 ºC compared with 40 ºC. It seems possible that 45 ºC approaches a noxious or damaging threshold that engages a different phenomenon. The authors should reconsider including 45 ºC within the optimal sensitization range or provide a justification.

      We agree with the Reviewer that a more detailed discussion of the effects of temperature at the end of the range (45°C) is warranted. Exposure to a 45°C global heat stimulus triggered temporary paralysis in some larvae, and we suspect that this accounts for the apparent reduction in roll probability following the second stimulus. We can add a plot depicting the proportion of larvae that exhibited paralysis during 45°C global heat, determine whether these heat-paralyzed larvae exhibited distinct responses from larvae that were not paralyzed, and provide a more detailed account of the optimal sensitization range.

      Treatment with 45°C stimuli still triggered a significant reduction in roll latency (sensitization), but we did not examine whether the latency was significantly different from what was observed at 40°C. We can add that analysis in the revision.

      (vi) In the sentence "To this end, we developed a perfusion system, that would deliver thermal ramps from ~20-45ºC ...," the tilde ~ should be replaced with "approximately".

      Noted. We will make the change.

      (vii) Throughout the manuscript, please clarify in the figure legends whether the sample size (n) refers to the number of individual animals or the number of cells.

      Noted. We will add the relevant details to our sample size notations.

      (viii) The Key Resources Table does not specify the wild-type (WT) strain used for the control experiments (e.g., in Fig. 1). Please provide the full genotype of the control strain used.

      We included the experimental genotypes in each figure legend, which we find more useful than the key resource table, which contains a list of all reagents used in the study (Drosophila alleles included).

      Reviewer #2 (Significance (Required)):

      General Assessment

      This study addresses a fundamental question in sensory biology: whether epidermal cells, long regarded as passive participants in somatosensation, actively contribute to noxious heat detection and avoidance behavior. While previous work has defined the neuronal circuits and TRP channel mechanisms underlying thermal nociception in Drosophila larvae, the potential sensory role of skin cells has remained largely unexplored. The authors integrate behavioral analysis with in vitro and ex vivo calcium imaging to provide a rigorous, multi-level investigation of epidermal thermosensitivity.

      Advancement

      The work advances the field by revealing that Drosophila epidermal cells are intrinsically thermosensitive and can acutely sensitize larval nociceptive responses to noxious heat through heat-off signaling. This discovery shifts the current paradigm of thermal nociception from a neuron-centric model to one that incorporates epidermal contributions, highlighting a conserved and previously underappreciated role of skin cells in active environmental sensing.

      The reviewer's expertise: Molecular genetics, developmental biology, insect physiology and endocrinology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes the temperature responses of Drosophila larval epidermal cells. These cells are activated by cooling and also exhibit strong heat-off responses. Orai and Stim are required in epidermal cells for these heat-off responses. The heat-off responses sensitize the epidermal cells, leading to a greater proportion of animals displaying rolling behaviors and a reduced latency to initiate rolling following noxious heating treatment. The following comments are intended to help improve the manuscript.

      Major:

      1. In Figure 3A, the conclusion will be strengthened by testing heat-off responses from 10 °C to 40 °C.

      The Reviewer makes an important point. In our original experiment, the lack of response in the 10°C – 30°C experiment could be due to some cold-induced suppression of the off response. We have found that this is not the case: off responses following a 10°C-40°C ramp are indistinguishable from responses to a 20°C-40°C ramp. In our revised manuscript we will incorporate new results showing epidermal heat-off responses to a 10°C-40°C ramp as well as normalization to 20°C-40°C responses performed in parallel.

      Figure 4C shows that 2-APB suppresses the heat-off response. Since 2-APB blocks both Orai and TRP channels, it is unclear why the authors focused exclusively on the Orai pathway without testing TRP channels.

      We found that epidermal cells exhibited minimal responses to warming stimuli, as would be expected for the epidermally expressed TRP channel TRPA1. In addition, the heat-off response we identified was remarkably similar to characteristic heat-off responses of mammalian CRAC channels. Hence, we focused our attention on the Orai pathway. While we agree that contributions of TRP channels could be of interest, especially if our additional analyses (double RNAi and Orai Dominant Negative) support the model that additional channels likely contribute to the heat-off response, the characteristic temperature responses of CRAC channels made them the most plausible candidate.

      In parallel to the experiments to further characterize Stim/Orai contributions to the heat-off response, we will assay requirements of TRPA1 to heat-induced nociceptor sensitization.

      While 2-APB completely abolishes the heat-off response, Orai and Stim RNAi only slightly (although significantly) reduce calcium responses. The knockdown efficiency of the RNAi constructs should be validated. Furthermore, testing whether combining Orai RNAi and Stim RNAi produces a stronger reduction in calcium responses would be informative.

      We addressed the question of knockdown efficiency above, and agree that testing the effects of Orai RNAi and Stim RNAi in combination is worthwhile. We detailed our plans for these experiments above.

      The study uses third-instar larvae. Please specify whether early, mid, or late third instar were used.

      In our original submission we stated “Third-instar larvae (96-120 AEL) larvae were used in all experiments.” We provide additional details on the staging of larvae for all experiments in the methods section of our revised submission. To synchronize cultures, embryos were collected from experimental crosses for 24 h, aged for 96 h, and foraging mid-third instar larvae (96-120 h old) were used for all experiments.

      Please provide more details about the thin layer of water used. Specifically, indicate the size of the Peltier plate and the volume of water applied.

      We provide additional details on the application of global heat stimulus in the methods section of our revised manuscript. “For assays testing effects of varying the temperature of prior thermal stimuli on thermal nociception, larvae were individually transferred to a pre-warmed Peltier plate (11 x 7 cm; Torrey Pines Scientific). Peltier plates were warmed to the indicated temperatures, a thin layer of water was applied to the surface using a paint brush, and the temperature was verified using an infrared thermometer. Larvae were transferred individually to the Peltier plate, incubated for the indicated time, and recovered to 2% Agar Pads using a paint brush. Following 10 s of recovery, larvae were stimulated with a 41.5°C thermal probe, as above, and latency to the first complete roll was recorded.”

      Minor:

      1. There is an inconsistency between the text and the figure regarding the sample number in Figure 1D.

      We thank the reviewer for identifying the discrepancy. This inconsistency has been corrected in the revised submission.

      Please provide the raw representative data for the time course of heat-off calcium responses in Figure 1E.

      We will incorporate representative traces for the heat-off responses plotted in Figure 1E.

      A period is missing at the end of the sentence: "For curve fitting, sample-averaged fluorescence traces were fitted with a single exponential decay function using R to extract a representative time constant (τ) and assess response kinetics."

      We thank the reviewer for identifying the omission. The period has been added.
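
      For readers unfamiliar with the fit mentioned in the quoted sentence, the idea of extracting a time constant τ from a single exponential decay can be sketched as below. The manuscript states that the fit was performed in R; this sketch instead uses Python and a simpler technique (ordinary least squares on log-transformed values) than the nonlinear fitting typically used for such data, so it is an illustration only. The function name and the synthetic data are assumptions, and the log-linear approach requires strictly positive values with no offset term.

```python
import math


def fit_single_exponential(times, values):
    """Fit values ~ A * exp(-t / tau) by linear least squares on log(values).

    Returns (A, tau). A simplified stand-in for a nonlinear fit.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired samples")
    if any(v <= 0 for v in values):
        raise ValueError("log-linear fit requires strictly positive values")
    logs = [math.log(v) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data do not decay; tau is undefined")
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -1.0 / slope


# Synthetic, noise-free decay with amplitude 2.0 and tau = 4.0 (arbitrary units).
times = [float(t) for t in range(10)]
values = [2.0 * math.exp(-t / 4.0) for t in times]
amplitude, tau = fit_single_exponential(times, values)
```

      On noisy traces the log transform weights small values heavily, so a nonlinear least-squares fit on the untransformed trace is usually preferable in practice.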

      In the sentence "Behavior Responses were analyzed post-hoc blind to genotype and were plotted according to roll probability and roll latency," the word Responses should begin with a lowercase r.

      This has been corrected in the revised submission.

      Reviewer #3 (Significance (Required)):

      This manuscript describes the heat-off responses of larval epidermal cells and investigates their underlying molecular mechanisms as well as associated behavioral consequences.

      The calcium responses and behavioral assays are clearly presented. However, the contribution of Stim and Orai to this process is not convincing.

      The study may be of interest to researchers working on Drosophila and temperature sensation, as well as to those studying Orai and Stim function.

      I am a researcher specializing in Drosophila thermosensation.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript describes the temperature responses of Drosophila larval epidermal cells. These cells are activated by cooling and also exhibit strong heat-off responses. Orai and Stim are required in epidermal cells for these heat-off responses. The heat-off responses sensitize the epidermal cells, leading to a greater proportion of animals displaying rolling behaviors and a reduced latency to initiate rolling following noxious heating treatment. The following comments are intended to help improve the manuscript.

      Major:

      1. In Figure 3A, the conclusion will be strengthened by testing heat-off responses from 10 °C to 40 °C.
      2. Figure 4C shows that 2-APB suppresses the heat-off response. Since 2-APB blocks both Orai and TRP channels, it is unclear why the authors focused exclusively on the Orai pathway without testing TRP channels.
      3. While 2-APB completely abolishes the heat-off response, Orai and Stim RNAi only slightly (although significantly) reduce calcium responses. The knockdown efficiency of the RNAi constructs should be validated. Furthermore, testing whether combining Orai RNAi and Stim RNAi produces a stronger reduction in calcium responses would be informative.
      4. The study uses third-instar larvae. Please specify whether early, mid, or late third instar were used.
      5. Please provide more details about the thin layer of water used. Specifically, indicate the size of the Peltier plate and the volume of water applied.

      Minor:

      1. There is an inconsistency between the text and the figure regarding the sample number in Figure 1D.
      2. Please provide the raw representative data for the time course of heat-off calcium responses in Figure 1E.
      3. A period is missing at the end of the sentence: "For curve fitting, sample-averaged fluorescence traces were fitted with a single exponential decay function using R to extract a representative time constant (τ) and assess response kinetics."
      4. In the sentence "Behavior Responses were analyzed post-hoc blind to genotype and were plotted according to roll probability and roll latency," the word Responses should begin with a lowercase r.

      Significance

      This manuscript describes the heat-off responses of larval epidermal cells and investigates their underlying molecular mechanisms as well as associated behavioral consequences.

      The calcium responses and behavioral assays are clearly presented. However, the contribution of Stim and Orai to this process is not convincing.

      The study may be of interest to researchers working on Drosophila and temperature sensation, as well as to those studying Orai and Stim function.

      I am a researcher specializing in Drosophila thermosensation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Noxious heat can have a strong adverse effect on animals, resulting in sensitization when noxious thermal stimuli are applied repeatedly. Noxious heat induces a characteristic rolling behavior in Drosophila melanogaster larvae. This study investigates sensitization, whereby a second heat stimulus evokes this behavior with significantly shorter latency (e.g., 3.4 seconds) than the initial exposure (e.g., 8.79 seconds). While prior research has implicated central and peripheral neurons in this process, recent findings in mammalian systems suggest a role for keratinocytes. In this manuscript, Yoshino et al. report that epidermal cells are necessary and sufficient to mediate heat sensitization in D. melanogaster larvae. Using an ex vivo epidermal imaging system, the authors demonstrate that calcium influx in epidermal cells is crucial for sensitization. Importantly, this calcium influx was observed only when the temperature was lowered from a dangerously high to a safe temperature. The calcium channel system Orai and Stim facilitates this influx.

      Major comments:

      (1) The authors clearly demonstrate the heat-off reaction using calcium influx imaging. However, all of the imaging shows the response to the first stimulation. Since the study focuses on sensitization, which shows a quicker response to the second heat stimulus, it would be helpful if the authors showed calcium influx when the second stimulus was applied. It would also be interesting to see how many times the epidermal cells can react to heat stimulation.

      (2) Figure 5 only shows one condition: a 30-second interval between the first and second heat application. While the rolling latency of the Luciferase RNAi control ranges from 4 to 12 seconds (with a median of 5 seconds), Fig. 1E shows a latency ranging from 6 to 12 seconds (with a median of 10 seconds) under the same 30-second interval conditions. This difference makes interpreting the effect of Stim and Orai confusing. The authors need to clarify whether the knockdowns accelerate the first response or delay the second response.

      Minor comments:

      (i) In Fig. 2C´´, the authors observed clear calcium influx in epidermal cells by combining the GCaMP genetic tool with an ex vivo thermal perfusion system. Although this system applies heat uniformly across the epidermal tissue, calcium influx is spatially restricted, appearing primarily in the head and tail regions of the epidermis. These results suggest that the heat-responsive epidermal cells are localized to these regions or that there are regional differences in sensitivity. The authors should explain the spatial relationship between the heat-applied epidermal cells and the occurrence of calcium influx.

      (ii) Related to comment (i) above, if heat stimuli are applied topically using a heat probe under the ex vivo imaging system, how large an area reacts to the stimuli?

      (iii) Providing supplementary movie(s) of the calcium live imaging would enhance the reader's understanding.

      (iv) The time point of the image in Fig. 2C´ ("before heat") is not the most informative for demonstrating a "heat-off" response. The authors should replace it with an image taken during the heat application to provide a more direct comparison with the post-stimulus influx shown in Fig. 2C´´.

      (v) The authors state that sensitization occurs "primarily in the 30-45 ºC range." However, rolling probability and latency changed in the opposite direction at 45 ºC stimulation compared with 40 ºC. It is possible that 45 ºC approaches a noxious or damaging threshold that engages a different phenomenon. The authors should reconsider including 45 ºC within the optimal sensitization range or provide a justification.

      (vi) In the sentence "To this end, we developed a perfusion system, that would deliver thermal ramps from ~20-45ºC ...," the tilde ~ should be replaced with "approximately".

      (vii) Throughout the manuscript, please clarify in the figure legends whether the sample size (n) refers to the number of individual animals or the number of cells.

      (viii) The Key Resources Table does not specify the wild-type (WT) strain used for the control experiments (e.g., in Fig. 1). Please provide the full genotype of the control strain used.

      Significance

      General Assessment

      This study addresses a fundamental question in sensory biology: whether epidermal cells, long regarded as passive participants in somatosensation, actively contribute to noxious heat detection and avoidance behavior. While previous work has defined the neuronal circuits and TRP channel mechanisms underlying thermal nociception in Drosophila larvae, the potential sensory role of skin cells has remained largely unexplored. The authors integrate behavioral analysis with in vitro and ex vivo calcium imaging to provide a rigorous, multi-level investigation of epidermal thermosensitivity.

      Advancement

      The work advances the field by revealing that Drosophila epidermal cells are intrinsically thermosensitive and can acutely sensitize larval nociceptive responses to noxious heat through heat-off signaling. This discovery shifts the current paradigm of thermal nociception from a neuron-centric model to one that incorporates epidermal contributions, highlighting a conserved and previously underappreciated role of skin cells in active environmental sensing.

      The reviewer's expertise: Molecular genetics, developmental biology, insect physiology and endocrinology.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: Drosophila larvae are known to respond to noxious stimuli by rolling. The authors propose that this response arises not only from the sensory response of nociceptive neurons but also from a direct response of larval epidermal cells. They go on to test this idea by independently manipulating epidermal cells and nociceptive sensory neurons using GAL4 lines, GCaMPs and RNAis. The behavioural data are convincing and presented clearly with good statistical analysis. However, the involvement of epidermal cells in evoking the behaviour, as well as STIM/Orai-mediated Ca2+ entry, requires further experiments. Use of another independent GAL4 strain for epidermal cells, alternate RNAi lines for STIM and Orai, mutants for STIM and Orai and overexpression constructs for STIM and Orai would significantly enhance the data. Thus, as of now, the key results are not yet fully convincing. The following additional experiments would be required to support their claims:

      1) Either use a second epidermal GAL4 strain to show key results OR provide images of the epidermal GAL4 expression double-labelled with a ppk driver using a different fluorescent protein to establish NO overlap of the epidermal GAL4 with neurons. These strains should be available free in Bloomington.

      2) Authors need to provide better data for the involvement of STIM and Orai in the Calcium responses observed. A single RNAi for each gene with marginal change in response is insufficient. The authors also do not state if the RNAis used are validated by them or anyone else. Minimally they should repeat their experiments with at least one other validated RNAi and rescue these with overexpression constructs of STIM and Orai (available in Bloomington). It is well established in literature that overexpression of STIM/Orai can rescue SOCE in Drosophila. Ideally, to be fully convincing they should test a Drosophila knockout for STIM (available in Bloomington). Heterozygotes of this are viable and should be tested. Additionally a UAS Orai dominant negative (OraiDN) strain is available in Bloomington and can be tested.

      Minor comments that can be addressed:

      1) Figure 1: Further details required on how the rolling response is measured. Figure is uninformative. A video would be really helpful.

      2) I could not find Figure 1I described in the text. This section should be explained properly.

      3) Figure 3: There appears to be a small response at 32 °C - why is this ignored in the text? It would be useful to have S3 in the main figure.

      4) Fig 4: The DF/F traces for the two RNAis should be included in this figure.

      5) The extent of knockdown in the epidermis by each RNAi should be shown by RT-PCR.

      6) The authors need to explain why only a small change in the Ca2+ response is seen with either RNAi. Are there other Ca2+ channels involved? Ideally they could test mutants/RNAi for the TRP channel family. Loss of SOCE in Drosophila neurons changes the expression of other membrane channels - is this possible here? Minimally, this possibility needs to be discussed.

      7) In the methods section please explain how the % DF/F calculations are done and how are they normalised to the ionomycin response.

      8) Authors need to look at previous work on STIM and Orai in Drosophila and reference appropriately.
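To illustrate the kind of detail requested in comment 7, here is a minimal sketch of one common convention for computing % ΔF/F and normalising it to a reference response. It is not taken from the manuscript: the baseline window, the choice of dividing by a peak ionomycin response, and all names and numbers below are assumptions made for illustration only.

```python
def percent_dff(trace, n_baseline):
    """Return % dF/F0 for each sample; F0 is the mean of the first n_baseline samples."""
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [100.0 * (f - f0) / f0 for f in trace]

def normalize_to_reference(dff, reference_peak):
    """Express each % dF/F0 value as a fraction of a reference peak (e.g. ionomycin)."""
    return [v / reference_peak for v in dff]

# Hypothetical fluorescence trace (arbitrary units); first three samples are baseline
trace = [100.0, 100.0, 100.0, 150.0, 200.0, 120.0]
dff = percent_dff(trace, n_baseline=3)

# Hypothetical peak % dF/F0 measured after ionomycin in the same preparation
normalized = normalize_to_reference(dff, reference_peak=400.0)
```

Stating the baseline window and the exact reference value used would let readers reproduce the normalisation.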

      Referees cross-commenting

      Reviewers 2 and 3 have raised some additional queries to what I had mentioned in my review. I agree with their comments. The authors should attempt to address all comments by all three reviewers.

      Significance

      This is an interesting study that identifies epidermal cells in Drosophila with the ability to sense a drop in temperature after receiving noxious heat stimuli and invoke appropriate behaviour. Behaviour experiments are well conducted and convincing. So far only nociceptive neurons were thought to control such behavioural responses so the work is significant and important for the field. The mechanism identified needs further convincing and I have suggested experiments that would be of help. With the additional experiments suggested the work will be of interest to neuroethologists, Drosophila neuroscientists and scientists in the field of Ca signaling.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1


      Major comments


      1. The manuscript posits that the loss of function of MASh components (Ogc1 and Aralar) decreases adrenergic-stimulated lipolysis by altering the cytosolic NAD⁺/NADH ratio, with AMPK/ACC mentioned as possible mediators. However, this remains speculative. Please provide mechanistic data directly linking MASh-dependent NAD⁺/NADH changes to the regulation of lipolysis in brown adipocytes during adrenergic stimulation.

      Answer 1) The reviewer raises an important point regarding the direct assessment of cytosolic NAD⁺/NADH redox changes as a mechanistic link for altered lipolysis in brown adipocytes lacking MASh components. To address this point, we added new data to the revised manuscript showing lactate/pyruvate ratio as measured by metabolomics. This is a well-established surrogate marker to monitor changes in redox balance. Notably, under basal (non-stimulated) conditions, the lactate/pyruvate ratio did not display any significant differences between Aralar 1 KD and control cells, suggesting preservation of cytosolic NAD⁺/NADH levels in the absence of functional MASh under these conditions. This finding is consistent with reports showing the robustness of NAD⁺ regeneration via multiple shuttles and the possibility of metabolic compensation when one shuttle is compromised (PMID: 40540398; PMID: 37647199).

      The results have been added as new supplementary Figure 1 as follows:

      Our new metabolomics data also revealed substantial reductions in the aspartate/glutamate ratio in Aralar 1 knockdown cells, serving as a metabolomic signature of impaired MASh function and reduced exchange of these amino acids between the cytosol and mitochondria. Given that the MASh is a major mechanism for exporting cytosolic reducing equivalents into the mitochondria under high metabolic demand, its loss would be expected to impact redox homeostasis, particularly under adrenergic stimulation when glycolytic flux and lipolytic activity are elevated (PMID: 40540398).

      Importantly, although our redox surrogate marker did not detect alterations, this may be explained by activation of compensatory pathways, most notably the glycerol phosphate shuttle (GPSh), which is highly expressed and active in brown adipocytes. Indirect support for this compensation comes from data shown in figure 4I showing reduced glycerol release in Aralar 1 KD cells upon norepinephrine stimulation and blocked lipolysis. This suggests a redirection of glycolytically derived G3P away from release and toward enhanced cycling within the GPSh, supporting cytosolic NAD⁺ regeneration via mitochondrial FAD-dependent G3PDH and cytosolic NAD⁺-dependent G3PDH activity. This is consistent with studies documenting that the combined action of MASh and GPSh maintains NAD redox homeostasis in brown adipocytes especially during non-thermogenic conditions (PMID: 168075; PMID: 40540398; PMID: 37647199). We have included a discussion about this possibility at page 9, third paragraph as follows:

      “Previous studies have shown that BAT exhibits high activity of mitochondrial FAD-dependent glycerol-3-phosphate dehydrogenase (mG3PDH), which functions as an electron sink to sustain low cytosolic NADH levels essential for continuous glycolytic flux [11]. Accordingly, suppression of the MASh, either genetically or pharmacologically, is likely to induce a compensatory upregulation of the GPSh. This adaptation would enhance G3P turnover, contributing to the maintenance of cytosolic NAD redox balance. Moreover, the increased flux through the GPSh could favor fatty acid esterification and triglyceride synthesis or re-esterification, consistent with our findings in Ogc and/or Aralar 1 KD cells, where (i) triglyceride content rises (Fig. 3), (ii) overall respiratory rates remain largely unaltered (Figs. 2D–G), and (iii) glycerol release declines significantly (Fig. 4I). Notably, the decrease in glycerol release persists even when lipolysis is blocked by ATGlistatin, suggesting that the available G3P pool is rerouted from dephosphorylation and extracellular release toward oxidation to DHAP by mG3PDH to regenerate cytosolic NAD+ under MASh-deficient conditions. We propose that interference with the MASh does not directly impact lipolysis but instead alters the cellular balance between DHAP and G3P owing to enhanced activity of the GPSh. This metabolic shift would favor the esterification of G3P with free fatty acids, thereby promoting triglyceride synthesis. These results support the notion that, even during adrenergic stimulation, when long-chain unsaturated fatty acids and their CoA esters strongly inhibit mG3PDH activity [11], the residual flux through the glycerophosphate shuttle remains critical for sustaining cytosolic NAD redox equilibrium [11,19,32].”

      At the mechanistic level, adrenergic stimulation in brown adipocytes activates robust lipolysis and thermogenic gene programs, generating high NADH that must be efficiently reoxidized to sustain flux through glycolysis and lipolysis-linked pathways. Our findings are consistent with a model in which the loss of MASh does not prevent cytosolic NAD⁺ regeneration or lipolytic flux during acute adrenergic stimulation, due to compensatory upregulation of the GPSh, as suggested by the glycerol release changes. Thus, while MASh normally acts as a conduit for NADH export and aspartate/glutamate exchange, in its absence, the GPSh maintains cytosolic redox balance, thereby sustaining glycolytic and lipolytic capacity.

      We agree that future studies should employ direct measurements of cytosolic NAD⁺/NADH ratios (e.g., genetically-encoded redox sensors) during adrenergic stimulation and specific pharmacological inhibition of both shuttles to dissect these relationships in greater detail. We sincerely appreciate the reviewer's input, which has prompted us to clarify the indirect but robust evidence supporting a role for compensatory redox shuttle activity in preserving brown adipocyte lipolysis in the setting of MASh impairment.
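As background for why the lactate/pyruvate ratio is used as a surrogate for cytosolic NAD⁺/NADH in this reply, the relationship can be sketched as follows. This is the standard textbook argument, not a derivation from the manuscript, and it assumes that the lactate dehydrogenase (LDH) reaction is close to equilibrium in the cytosol and that pH is constant; the symbol K_LDH is introduced here for illustration.

```latex
% LDH reaction, assumed to be near equilibrium in the cytosol
\text{lactate} + \text{NAD}^{+} \rightleftharpoons \text{pyruvate} + \text{NADH} + \text{H}^{+}

% Equilibrium constant and the resulting redox ratio
K_{\mathrm{LDH}} = \frac{[\text{pyruvate}]\,[\text{NADH}]\,[\text{H}^{+}]}{[\text{lactate}]\,[\text{NAD}^{+}]}
\quad\Longrightarrow\quad
\frac{[\text{NAD}^{+}]}{[\text{NADH}]} = \frac{[\text{pyruvate}]}{[\text{lactate}]} \cdot \frac{[\text{H}^{+}]}{K_{\mathrm{LDH}}}
```

Under these assumptions, an unchanged lactate/pyruvate ratio implies an unchanged free cytosolic NAD⁺/NADH ratio, and a higher lactate/pyruvate ratio implies a more reduced cytosol.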

      We have further added a new paragraph in the discussion section (page 10):

      “Mechanistically, the connection between the MASh and lipolysis appears to involve regulation of the cytosolic NAD⁺/NADH redox balance. MASh activity facilitates the regeneration of NAD⁺ from NADH in the cytosol, primarily through the reduction of oxaloacetate to malate by cytosolic malate dehydrogenase (Fig. 1G-H). Despite the theoretical expectation that reductions in MASh activity would disturb redox homeostasis, our metabolomic data show that the lactate/pyruvate ratio remains unchanged under conditions of MASh impairment, indicating that the overall cytosolic NAD⁺/NADH ratio is maintained (Figure S1A-C). While direct measurements of cytosolic NAD⁺/NADH were not performed, the preserved lactate/pyruvate ratio in Aralar 1 KD cells under basal conditions strongly suggests redox stability, likely due to compensatory activity by alternative mitochondrial shuttles or metabolic adaptations that maintain NAD redox homeostasis despite MASh impairment [18,33].

      Previous evidence indicates that BAT exhibits high activity of mitochondrial FAD-dependent glycerol-3-phosphate dehydrogenase (G3PDH), which acts as an electron sink to sustain low cytosolic NADH levels critical for glycolysis [34]. In this sense, it is conceivable that genetic or pharmacological suppression of MASh triggers compensatory enhancement of the G3P shuttle, increasing G3P availability and facilitating the maintenance of cytosolic NAD redox balance. This adaptation could also promote fatty acid esterification and triglyceride synthesis or re-esterification, aligning with our observations that in Ogc and/or Aralar 1 KD cells: (i) triglyceride levels increase (Fig. 3); (ii) overall respiratory rates are preserved (Figs. 2D–G); and (iii) glycerol release is significantly reduced (Fig 4I).”

      2. The absence of in vivo analysis of lipid-droplet size in MASh loss-of-function models is a major concern. In vitro results could be confounded by differences in differentiation stage between groups. Please document equivalent adipogenesis across groups (e.g., Pparg/Cebpa/Plin1/Fabp4 expression).

      Answer 2) We thank the reviewer for the thoughtful and constructive comment regarding potential confounding by differences in differentiation stage, and for highlighting the importance of documenting equivalence between experimental groups. We appreciate the opportunity to clarify and provide additional assurance on this point.

      As detailed in our manuscript, we have performed qPCR analysis of multiple well-established markers of brown adipocyte differentiation, including Ucp1, Elovl3, Prdm16, Pparg, Cebpa, Plin1, and Fabp4, in scramble, Aralar 1 KD, and Ogc KD cells (see Fig. S1A and accompanying text). Our results show no apparent effect of these genetic interventions on overall differentiation, as the expression levels of these key markers were consistently unaltered across groups. Furthermore, adenoviral-mediated knockdown of Ogc achieved an approximate 80% reduction in Ogc mRNA (see Fig. S1B), yet most differentiation markers remained unaffected. We did observe significant increases in Atgl, Pgc1α, and Tfam mRNA levels, which may indicate a degree of pathway reprogramming without affecting the general differentiation profile. We propose that interference with the MASh does not directly impact lipolysis but instead alters the cellular balance between DHAP and G3P owing to enhanced activity of the GPSh. This metabolic shift would favor the esterification of G3P with free fatty acids, thereby promoting triglyceride synthesis.

      Additional experimental support for equivalent differentiation can be drawn from our respirometry data presented in Figures 2E and 2G. These figures demonstrate that respiratory rates upon norepinephrine stimulation, which are a sensitive indicator of brown adipocyte thermogenic capacity, were essentially identical in scramble, Aralar 1 KD, and Ogc KD cells. Since norepinephrine-stimulated respiration requires both functional mitochondria and the full differentiation of brown adipocytes, these results strongly support the conclusion that silencing either MASh component does not impair the fundamental ability of cells to undergo brown adipocyte differentiation or achieve functional thermogenic competence.

      This is consistent with published findings showing that norepinephrine triggers robust respiration and thermogenic activation only in fully differentiated and functional brown adipocytes, making such measurements a widely accepted proxy for differentiation status and mitochondrial integrity. Thus, the equivalent respiratory responses observed in all groups further validate that differentiation was not compromised by the genetic interventions.

      We hope this clarifies that equivalent adipogenesis was carefully documented and that any observed phenotypes are unlikely to be attributable to differences in differentiation stages. Thank you again for your rigorous assessment and for helping to ensure the robustness of our study.

      3. Please include rescue experiments (add-back OGC1 and Aralar) to rule out siRNA/shRNA off-target effects and verify that the phenotype stems from MASh loss of function.

      Answer 3) We thank the reviewer for this important suggestion regarding the inclusion of rescue experiments with add-back of Ogc and Aralar to definitively exclude off-target effects of the siRNA/shRNA-mediated knockdowns.

      We would like to kindly point out that although we did not perform add-back rescue experiments directly, the consistency of phenotypes observed across two independent genetic interventions (Aralar 1 KD and Ogc KD) strongly argues against off-target effects being responsible for the observed metabolic and functional alterations. Specifically, both knockdowns yielded remarkably similar phenotypes in multiple assays, including respirometry analyses, mitochondrial morphology, lipid droplet homeostasis, and lipid metabolism, supporting the conclusion that these effects stem from MASh loss of function rather than nonspecific silencing.

      Furthermore, our new supplementary data (new Supplementary Figure 1A-F) reveal a significant reduction in the aspartate/glutamate ratio in Aralar 1 KD cells, a compelling functional readout for MASh impairment. This molecular evidence corroborates that our genetic interventions effectively disrupted MASh activity as intended.

      We sincerely appreciate the reviewer’s thorough evaluation and understand the importance of rescue experiments. While recognizing their value, we believe the convergent genetic, metabolic, and functional evidence presented across two different MASh components provides strong and consistent support that the phenotypes observed are due to specific loss of MASh function.


      4. Please expand on physiological significance: What is the importance of MASh regulation of BAT lipolysis in long-term adaptive thermogenesis?

      Answer 4) This is a very interesting aspect, and we have included a new paragraph in the discussion section (page 14) to address it as follows:

      “Our results, supported by recent literature, strongly indicate that the malate–aspartate shuttle (MASh) plays a key role in facilitating fatty acid–dependent thermogenesis in brown adipocytes. Specifically, BAT-targeted overexpression of GOT1 has been shown to enhance β-oxidation and support acute cold-induced thermogenesis (PMID: 40540398). Interestingly, genetic ablation of GOT1—and thus MASh inhibition—preserves cold-induced thermogenesis by promoting a metabolic shift from fatty acid to glucose oxidation. Our findings corroborate and extend these observations by demonstrating that MASh impairment sustains overall respiratory activity in norepinephrine-stimulated brown adipocytes (Figures 2D–2G), while concurrently impairing lipolysis and resulting in an accumulation of small lipid droplets (Figures 3 and 4). Collectively, these data suggest that MASh not only modulates substrate preference towards fatty acid oxidation but also facilitates lipolysis, an essential upstream step that enables lipid oxidation and supports thermogenic heat production.”

      Minor comments

      1. Fig. 4 legend/title contains a typo ("lypolysis" → lipolysis).

      Answer 1) Corrected.

      2. In Fig. 2 legend line: "Adevirus-mediated" → Adenovirus-mediated; "OCAR" → OCR.

      Answer 2) Corrected.

      3. For lipolysis imaging, you already show Forskolin/Atglistatin/Etomoxir controls; add a vehicle-only time course overlay in the main figure (currently in text/legend) to aid visual comparison.

      Answer 3) We thank the reviewer for pointing this out. To improve clarity, we have updated the labeling in Figures 3 and 4: “basal” now clearly refers to the unstimulated/untreated condition, and the previously labeled “UT” condition has been clarified as “untransduced.” These changes make the figure legends and data presentation more consistent and easier to interpret.

      4. Ensure consistent gene symbols (Atgl/Pnpla2), and protein capitalization.

      Answer 4) Corrected.

      Reviewer #2

      Major points:

      1. In the current manuscript, mitochondrial morphology (area, aspect ratio, and roundness) was analyzed in OGC1 KD cells using TMRE, whereas MitoTracker Deep Red (MTDR) was used in Aralar1 KD cells. Notably, TMRE is a ΔΨm-dependent probe. The signal intensity can change, or the distribution may reflect alterations in membrane potential rather than true morphological changes. Therefore, the observed differences in OGC1 KD cells based on TMRE staining may be confounded by the dye's functional dependence, potentially biasing the conclusions. It is recommended to evaluate mitochondrial morphology with consistent trackers across conditions. In addition, in the subsequent OCR analysis, mitochondrial area was used for normalization. Please clarify which staining method was employed, and provide justification for its suitability.

      Answer 1) We thank the reviewer for this insightful comment. Indeed, TMRE is a membrane potential-sensitive dye and could therefore potentially affect measurements of mitochondria.

      We would like to point out that mitochondrial morphology was quantified based on mitochondrial area rather than fluorescence intensity. To create an accurate binary map of mitochondria, we used a low threshold, which allowed us to include even weakly stained mitochondria and thereby detect them independently of their membrane potential. In all imaged cells, TMRE signal was sufficient to reliably identify mitochondrial pixels. Moreover, these images were acquired using a confocal microscope, where the risk of pixel expansion due to higher fluorescence intensity is minimized. Lastly, given that overall mitochondrial oxygen consumption in these cells remains largely intact, we do not expect a substantial loss of membrane potential, although minor effects cannot be entirely excluded.

      We opted to use TMRE for imaging Ogc KD cells because the scramble control for these shRNA viruses carries an mKate fluorescent tag, which overlaps with the MTDR signal. Since accurate assessment of transduction efficiency relied on detecting mKate, MTDR could not be used in these experiments. Importantly, we only compare mitochondrial morphology within the same staining condition and do not draw conclusions across cells stained with different dyes.

      To ensure transparency, we have added a new section to the discussion (page 17, 2nd paragraph) highlighting the potential influence of ΔΨm-dependent dyes on morphological measurements as follows:

      “It is also important to note that mitochondrial morphology was quantified using MTDR in Aralar 1 KD cells and TMRE in Ogc KD cells due to experimental constraints (see Methods). TMRE is a membrane potential–dependent dye, which could potentially influence morphology measurements. To minimize this risk, we used confocal microscopy, which reduces the likelihood of pixel expansion due to higher fluorescence intensity, and set thresholds to detect even weakly stained mitochondria. Nonetheless, we cannot fully exclude the possibility that the differences in morphology observed between Aralar 1 and Ogc KD are influenced by the use of different dyes; however, statistical comparisons were never performed across samples stained with different dyes.”

      Also, we have expanded the Methods section (page 22, 2nd paragraph) to include a rationale for using these dyes and describe the analysis protocol as follows:

      “TMRE was used for Ogc KD cells because the scramble control for the shRNA viruses carries an mKate fluorescent tag, which overlaps with MTDR fluorescence, preventing its use. MTDR was used for Aralar KD cells. Image Analysis was performed in FIJI (ImageJ, NIH). For the quantification of mitochondrial morphology and area, images stained with TMRE or MTDR were analyzed. Thresholds were adjusted to ensure that even weakly stained mitochondria were detected and included in the analysis. Only the mitochondrial area was evaluated, independent of fluorescence intensity.”
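The point that area is measured independently of fluorescence intensity can be illustrated with a minimal sketch. This is not the authors' FIJI workflow: it is a Python illustration on a hypothetical toy image, and the function names, pixel values, and thresholds are assumptions chosen only to show the effect of a low threshold.

```python
def mitochondrial_mask(image, threshold):
    """Binary map: True wherever the pixel intensity exceeds the threshold."""
    return [[pixel > threshold for pixel in row] for row in image]

def mask_area(mask, pixel_area=1.0):
    """Area of the binary map: number of True pixels times the area of one pixel."""
    return pixel_area * sum(sum(row) for row in mask)

# Hypothetical 3 x 4 image: two dim pixels (5, 8) and two bright pixels (40, 200)
image = [
    [0, 5, 40, 0],
    [0, 8, 200, 0],
    [0, 0, 0, 0],
]

# A low threshold keeps dim and bright pixels alike, so each contributes equally to area
mask = mitochondrial_mask(image, threshold=3)
area = mask_area(mask)

# A higher threshold would drop the dim pixels and underestimate the area
area_high_threshold = mask_area(mitochondrial_mask(image, threshold=20))
```

Once a pixel passes the threshold it counts as one unit of area regardless of brightness, which is why a low threshold reduces the dependence of the measured area on dye intensity.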

      Minor points:

      1. __ In the introduction, the authors state that "LDH activity increases in the context of BAT activation". This point is important for the logic of the manuscript, reference [10] cited here is not sufficient to support this claim. It is recommended to provide appropriate references to support this statement.__

      Answer 1) We have substantially changed this paragraph in the revised manuscript to better explain why LDH would not act as a major player in contributing to NAD redox balance in the context of BAT thermogenesis, as follows:

      “In mammalian cells, cytosolic NAD⁺ is regenerated through lactate dehydrogenase (LDH), the glycerol-3-phosphate shuttle (GPSh), or the malate-aspartate shuttle (MASh). In BAT, however, lactate production rises only slightly with adrenergic activation and most lactate is oxidized via the TCA cycle, suggesting that LDH primarily consumes NAD⁺ rather than regenerating it [PMID: 30456392; PMID: 37337122; PMID: 37802078; PMID: 40982723]. Consequently, mitochondrial redox shuttles become critical for sustaining cytosolic NAD⁺ supply”.

      We have also provided additional references to support this new section of the Introduction.

      __ In Fig. 1A and B-D, there are inconsistencies and duplications in the abbreviation labels. Please check and revise accordingly. __

      Answer 2) We thank the reviewer for this comment. We would like to clarify that Figure 1A is a schematic overview of the system, while Figures 1B–D show protein expression in specific contexts: whole BAT (B), whole liver (C), and BAT mitochondria (D). In Figures 1B and 1C, all components are shown because both cytosolic (MDH1 and GOT1) and mitochondrial proteins (MDH2, GOT2, Aralar 1 and 2 and OGC) are present. In contrast, Figure 1D shows only mitochondrial components (OGC, Aralar1, MDH2, and GOT2). Although Aralar2 is a mitochondrial protein, it was not detected in this study (Forner et al., 2009). Similarly, cytosolic components such as MDH1 and GOT1 are not shown in Figure 1D because they are absent in the mitochondrial fraction. We have revised the figure legend to make these distinctions clearer.

      __ In Fig. S1, the number of n indicated does not match the number of data points shown. Please clarify whether these represent technical replicates or biological replicates, and provide a detailed description of the statistical methods used throughout the manuscript.__

      Answer 3) We thank the reviewer for catching this and allowing us to correct our mistakes. In the revised version, we have corrected the figure legend of Supplementary Figure 1 so that the number of n matches the data points shown.

      __ Please provide details on the normalization strategy used in the BODIPY-C12/BODIPY-493 staining analysis, such as whether fluorescence intensity was quantified as mean or integrated values, and whether the analysis was normalized to lipid droplet area, cell number, or baseline. Since lipolytic stimulation can reduce droplet size and increase droplet number, these factors may bias the results. __

      Answer 4) We thank the reviewer for this important comment and apologize for the lack of detail regarding this analysis. The analysis of BODIPY-C12 and BODIPY-493 was performed by quantifying the mean fluorescence intensity of BODIPY-C12 detected within a mask generated from the BODIPY-493 signal. This approach allowed us to define all lipid droplets and measure the release of previously esterified C12. To account for variability across samples, the data were normalized to each sample’s individual baseline at time point 0 and expressed as fold change relative to this baseline. In the revised manuscript we have included this description in the Methods section (page 18, last paragraph) for clarity and reproducibility, as follows:

      “Lipid Droplet area was defined based on Bodipy 493/503 signal, which was used to generate a mask identifying all lipid droplets. Within this mask, the mean fluorescence intensity of BODIPY C12 was quantified over time to monitor the release of previously esterified C12. To account for variability between samples, data were normalized to each sample’s individual baseline at time point 0 and expressed as fold change relative to this baseline.”
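The normalization described above can be sketched as follows. This is a minimal illustration and not the authors' analysis script: the function names, the fixed threshold used to build the mask, and the choice to recompute the mask for every frame are all assumptions.

```python
import numpy as np

def mean_intensity_in_mask(c12_frame, mask):
    """Mean BODIPY C12 intensity over the pixels inside the droplet mask."""
    return float(c12_frame[mask].mean())

def fold_change_over_baseline(c12_stack, bodipy493_stack, threshold):
    """Per-frame mean C12 signal inside the BODIPY 493/503 mask,
    expressed as fold change relative to frame 0 (time point 0)."""
    means = []
    for c12, ld in zip(c12_stack, bodipy493_stack):
        mask = ld > threshold   # hypothetical threshold defining lipid droplets
        means.append(mean_intensity_in_mask(c12, mask))
    means = np.asarray(means)
    return means / means[0]     # each sample is normalized to its own baseline
```

Using the mean rather than the integrated intensity keeps the readout from scaling directly with total droplet area, which matters when stimulation changes droplet size or number. A frame with an empty mask would produce NaN and would need separate handling.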

      __ The manuscript notes that the unexpected result in Fig. 3K-M in parallel with increased Atgl mRNA expression might be because it does not reflect protein levels or enzymatic activity. To strengthen this point, it is recommended to include data on ATGL and phosphorylation ATGL. __

      Answer 5) We thank the reviewer for this constructive comment. We have clarified these aspects in the revised Results and Discussion sections to reflect this interpretation more accurately as follows:

      “Notably, Atgl mRNA measurement in our study was primarily used as a marker of brown adipocyte differentiation, rather than as a direct indicator of ATGL protein abundance or enzymatic activity. We detected increased Atgl expression only in Ogc KD cells (Fig. S1H), but not in Aralar 1 KD cells (Fig. S1G). This likely does not reflect a major difference in differentiation status, as other brown adipocyte markers and norepinephrine-stimulated respiration were comparable between scramble and knockdown cells (Fig. 2D-G and 2N-O and S1G-H). Although lipolysis was not evaluated in Ogc KD cells, in Aralar 1 KD cells basal lipolysis remained unchanged (Fig. 4D-E and 4G-I), whereas norepinephrine-stimulated lipolysis was delayed or partially inhibited. Notably, the enhanced fatty acid esterification observed in Ogc KD cells despite elevated Atgl expression is not contradictory, since in brown adipocytes lipolysis and re-esterification occur concurrently to sustain high lipid turnover [34].”

      __ Red-on-black is not a great color code for IMFs, how about black-and-white? __

      Answer 6) We have changed the text color to white in Figures 2H and 2K, as suggested.

      __Reviewer #3 __

      Major points;

      1. __ Although in the manuscript Veliova and coworkers demonstrated that MAS is functional in brown adipocytes showing kinetic parameters equivalent to that previously described in other tissues, surprisingly, when its components are downregulated, no effect, or very little, on mitochondrial respiration is found (figure 2). This is an intriguing result since MAS disruption has been widely reported to impair respiration in different cell types and tissues. However, since no direct evidence of MAS dysfunction is provided, it is possible that MAS may still remain partially or fully functional under the conditions used by the authors, and therefore this point needs to be clarified to validate these results.__

      Answer 1) We thank the reviewer for the insightful comment and the opportunity to clarify these important points regarding MASh dysfunction validation in our study. We acknowledge the reviewer’s observation that mitochondrial respiration was largely unaffected by MASh component knockdown, which is indeed intriguing. Importantly, as already indicated in our responses to Reviewer 1, we have provided new data showing direct molecular evidence of MASh impairment through substantial reductions in the aspartate/glutamate ratio in Aralar 1 KD cells (new Supplementary Figure S1F). This ratio is a well-established functional readout reflecting MASh activity and amino acid exchange between cytosol and mitochondria, as demonstrated in original experimental studies of MASh function in multiple tissues including brown adipocytes (PMID: 4436323). The reduction in the aspartate/glutamate ratio directly confirms loss of MASh functionality even though respiratory rates remained unchanged, likely due to metabolic compensation by robust glycerol phosphate shuttle (GPSh) activity, as further supported by our data showing reduced glycerol release upon norepinephrine stimulation in Aralar 1 KD cells (Figure 4I). This metabolic rerouting maintains cytosolic NAD⁺ regeneration and partially preserves respiration and energy metabolism under these experimental conditions (PMID: 168075; PMID: 40540398; PMID: 37647199). Thus, the combination of metabolomic, respirometry, and functional lipid data strongly indicates that MASh activity was disrupted specifically and effectively by our genetic interventions. This molecular evidence was already signposted in our original manuscript and responses, underscoring that MASh loss of function, and not residual or compensatory MASh activity, is responsible for the phenotypes reported. We greatly appreciate the reviewer’s insightful attention to this critical mechanistic issue and hope this provides clear reassurance that MASh impairment was indeed achieved and functionally validated within our study framework.

      Furthermore, strategies used to downregulate MAS components produce only a partial reduction in mRNA levels, about 70 %, but its outcome on protein levels has not been determined, and the remaining protein level could be sufficient to maintain shuttle activity. Therefore, the effect of silencing at protein level should be analyzed, because as authors also point out on page 16; "mRNA levels may not reflect actual protein levels or activity".

      Answer 2) We thank the reviewer for this important point. Our knockdowns resulted in ~70–80% reduction in mRNA levels. While not complete, this represents a substantial decrease and is sufficient to produce strong functional effects. At the time the experiments were performed, we did not have access to suitable antibodies, and the available antibodies did not provide reliable signals in our samples, which is why we used qPCR to estimate knockdown efficiency. Importantly, we observed clear phenotypic changes in both knockdowns (Aralar and OGC), and both showed very similar phenotypes. This suggests that the level of knockdown was sufficient to significantly impair MAS activity. In the revised version we added new data which further validated the functional impact of Aralar KD (given that this protein has an alternative isoform, as pointed out by the reviewer). We performed metabolomics experiments measuring aspartate and glutamate levels. Our new data shows that the aspartate-to-glutamate ratio is significantly reduced in Aralar KD cells. This ratio serves as a proxy for glutamate catabolism, and the observed decrease suggests reduced glutamate catabolism, likely due to impaired MAS activity. Therefore, the reduced whole-cell aspartate/glutamate ratio serves as a metabolic signature of MAS impairment, consistent with Aralar KD. These data indicate that Aralar is sufficiently downregulated to produce a functional effect, supporting our conclusion that MAS activity is impaired. The results have been added as new supplementary Figure 1 as follows:

      __ In the case of aspartate/glutamate carriers (AGCs) the role of citrin/slc25a13, the second AGC paralog, should also be analyzed. This AGC isoform is discarded based on proteomic data from brown adipose tissue, but, as it is shown in figure 1B, its levels are similar to those of Aralar/slc25a12, the only AGC silenced. Besides, primary brown adipocytes differentiated for 7 days are used here, and it is possible that factors such as culture conditions or differentiation itself could alter AGC levels. Therefore, it is necessary to determine the protein levels of citrin/AGC2, and, if necessary, downregulate it together with the Aralar/AGC1 isoform. citrin/AGC2 activity may be responsible for the observed difference between the OGC and Aralar/AGC1 KD adipocytes.__

      Answer 3) We thank the reviewer for this important point. We chose Aralar1 because it is the isoform predominantly expressed in brown adipose tissue (PMID: 23436904). We acknowledge, however, that compensatory increases in Citrin/AGC2 upon Aralar1 knockdown are possible. To address this, we have included new metabolomics data in the revised manuscript (added as Supplementary Figure 1), which provides additional support that downregulation of Aralar1, even if not complete, is sufficient to cause a metabolic change reflected by a reduced aspartate/glutamate ratio in these cells. This functional change supports that the knockdown of Aralar1 alone is sufficient to study its role in brown adipocytes, although minor compensation by Citrin/AGC2 cannot be entirely excluded.

      To address this explicitly, we have added a paragraph to the discussion (page 13, 2nd paragraph) highlighting the potential for partial compensation by Citrin/AGC2 and explaining why the observed metabolic effects are still attributable to Aralar 1 knockdown, as follows:

      “Phenotypes observed in Aralar 1 KD cells closely resemble those in Ogc KD cells, particularly in terms of lipid metabolism alterations and energy expenditure. The main difference lies in mitochondrial morphology, which is altered in Ogc KD cells but remains unchanged in Aralar 1-silenced cells (Fig. 2J,M). Unlike Ogc, which lacks an alternative isoform, Aralar 1 has a paralog Aralar 2 (Citrin, or SLC25A13) that may partially compensate for its loss. This potential compensation might explain the preservation of mitochondrial morphology in Aralar 1 KD cells. Nonetheless, our metabolomics data demonstrate that downregulation of Aralar 1 alone significantly reduces the aspartate/glutamate ratio (Fig. S1D-F). Since this ratio reflects glutamate catabolism, its decrease indicates impaired malate-aspartate shuttle activity and reduced glutamate catabolism. Therefore, although compensation by Aralar 2 cannot be entirely excluded, Aralar 1 KD alone suffices to cause substantial impairment of malate-aspartate shuttle function”.

      __ OGC and Aralar/AGC1 silencing is associated with the accumulation of smaller lipid droplets and impaired norepinephrine-induced lipolysis, but no mechanistic evidence is provided. The authors discuss a role for AMPK signaling associated with the redox imbalance generated by MAS dysfunction but neither of them is proven.__

      Answer 4) We thank the reviewer for this insightful question, which was also raised by Reviewer 1 (see Reviewer 1, Question 1 above). Here, we aim to clarify the mechanistic basis by which MASh may regulate lipolysis in BAT in a complementary and refined manner.

      Our new data directly addresses this issue by examining cytosolic redox status through the lactate/pyruvate ratio, a well-established indicator of NAD⁺/NADH balance. Under basal conditions, Aralar 1 KD cells showed no change in this ratio compared to controls, indicating preserved cytosolic NAD⁺ regeneration despite reduced MASh activity. This observation is consistent with previous studies demonstrating the resilience of cellular redox homeostasis through overlapping NAD⁺-regenerating systems (PMID: 40540398; PMID: 37647199). The new results are shown in Supplementary Figure 1.

      At the same time, we detected a marked decrease in the aspartate/glutamate ratio in Aralar 1 KD cells, confirming impaired MASh function and reduced amino acid exchange between cytosol and mitochondria. The lack of redox imbalance likely reflects compensatory mechanisms, most notably the GPSh, which is highly active in brown adipocytes. Supporting this view, Aralar 1 KD cells displayed significantly reduced glycerol release upon norepinephrine stimulation (Fig. 4I), suggesting enhanced metabolic cycling of G3P through mitochondrial and cytosolic G3PDH, thereby sustaining NAD⁺ regeneration and redox equilibrium.

      We therefore propose that, although MASh normally facilitates NADH export and aspartate/glutamate exchange, its loss activates GPSh-mediated compensation that preserves cytosolic NAD⁺/NADH balance and maintains lipolytic flux during adrenergic stimulation. These findings refine our mechanistic understanding of how redox shuttle interplay supports glycolytic and lipolytic processes in BAT. Future studies employing NAD⁺/NADH sensors and simultaneous blockade of both shuttles will be essential to dissect these compensatory mechanisms in greater detail.

      Minor points;

      1. __ Is pyruvate present in respiration medium? If so, no effect on respiration is expected as pyruvate reverses the respiratory defects caused by MAS inactivation. __

      Answer 1) Thanks for this important insight. As indicated in the Methods section (page 17, last paragraph), all respirometry experiments were carried out in the absence of pyruvate in the medium. Therefore, the preserved overall respiratory rates in Aralar 1 and Ogc KD cells cannot be explained by compensatory oxidation of pyruvate supplied in the medium.

      __ In figure 4, only data from Aralar KD cells in relation to norepinephrine-stimulated lipolysis are shown. What happens when OGC is silenced? __

      Answer 2) This is a very interesting and relevant question. We did not perform the norepinephrine-stimulated lipolysis experiments in Ogc-silenced cells, since in most of the other experiments presented in the manuscript Ogc and Aralar 1 silencing converged to very similar, if not identical, phenotypes. Based on these consistent overlaps, we anticipate that Ogc KD would likely lead to comparable effects on lipolysis as observed in Aralar 1 KD cells. Nonetheless, we fully agree that direct assessment of lipolysis upon Ogc KD would strengthen this conclusion, and we consider this an important aspect for future studies.

      __ Nomenclature used for mitochondrial carriers is confusing. Please do not use OGC1 as there is only one isoform. Furthermore, different names for OGC are used in the manuscript; oxoglutarate carrier, malate-ketoglutarate carrier or OGC1/SLC25A11. In the case of citrin/AGC2, Aralar2 is used and is an uncommon designation.__

      Answer 3) We have corrected all OGC naming in the revised manuscript. We have also replaced “Aralar 2” with “citrin”, since the latter is more commonly used in the literature.

      __ Some panels of figures 3 and 4 should be improved. Panels 3J, 3L and 4G are difficult to see. In panel 3J please clarify UT line from untreated/NE, are they not transduced? No equivalent conditions are assayed in Aralar KD and OGC KO cells.__

      Answer 4) We thank the reviewer for giving us the opportunity to improve this figure and apologize for the confusing labeling. In the revised version, we have clarified the labels in panels 3J, 3L, and 4G to improve visibility, and we have added descriptions of all abbreviations to the figure legends, accordingly.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Veliova and coworkers explore the contribution of the malate-aspartate NADH shuttle (MAS) to energy metabolism in brown adipose tissue. This work, done by a group expert in mitochondrial metabolism, continues an interesting previous study (Veliova, 2020) in which it was shown that the inhibition of the mitochondrial pyruvate carrier caused an increase in energy expenditure mediated by the activation of MAS in BAT. Here, the authors have explored the consequences of the lack of MAS activity on BAT metabolism by silencing the metabolite transporters that are part of MAS in cultured primary brown adipocytes. Using this loss-of-function approach, the role of MAS in the regulation of lipid homeostasis in BAT is analyzed. The results could be interesting, but in my opinion, they are not sufficiently proven. Much more evidence should be provided to confirm MAS deficiency and the mechanisms involved in the alteration of lipid homeostasis.

      Major points

      1. Although in the manuscript Veliova and coworkers demonstrated that MAS is functional in brown adipocytes showing kinetic parameters equivalent to that previously described in other tissues, surprisingly, when its components are downregulated, no effect, or very little, on mitochondrial respiration is found (figure 2). This is an intriguing result since MAS disruption has been widely reported to impair respiration in different cell types and tissues. However, since no direct evidence of MAS dysfunction is provided, it is possible that MAS may still remain partially or fully functional under the conditions used by the authors, and therefore this point needs to be clarified to validate these results. Furthermore, strategies used to downregulate MAS components produce only a partial reduction in mRNA levels, about 70 %, but its outcome on protein levels has not been determined, and the remaining protein level could be sufficient to maintain shuttle activity. Therefore, the effect of silencing at protein level should be analyzed, because as authors also point out on page 16; "mRNA levels may not reflect actual protein levels or activity".
      2. In the case of aspartate/glutamate carriers (AGCs) the role of citrin/slc25a13, the second AGC paralog, should also be analyzed. This AGC isoform is discarded based on proteomic data from brown adipose tissue, but, as it is shown in figure 1B, its levels are similar to those of Aralar/slc25a12, the only AGC silenced. Besides, primary brown adipocytes differentiated for 7 days are used here, and it is possible that factors such as culture conditions or differentiation itself could alter AGC levels. Therefore, it is necessary to determine the protein levels of citrin/AGC2, and, if necessary, downregulate it together with the Aralar/AGC1 isoform. citrin/AGC2 activity may be responsible for the observed difference between the OGC and Aralar/AGC1 KD adipocytes.
      3. OGC and Aralar/AGC1 silencing is associated with the accumulation of smaller lipid droplets and impaired norepinephrine-induced lipolysis, but no mechanistic evidence is provided. The authors discuss a role for AMPK signaling associated with the redox imbalance generated by MAS dysfunction but neither of them is proven.

      Minor points

      1. Is pyruvate present in respiration medium? If so, no effect on respiration is expected as pyruvate reverses the respiratory defects caused by MAS inactivation.
      2. In figure 4, only data from Aralar KD cells in relation to norepinephrine-stimulated lipolysis are shown. What happens when OGC is silenced?
      3. Nomenclature used for mitochondrial carriers is confusing. Please do not use OGC1 as there is only one isoform. Furthermore, different names for OGC are used in the manuscript; oxoglutarate carrier, malate-ketoglutarate carrier or OGC1/SLC25A11. In the case of citrin/AGC2, Aralar2 is used and is an uncommon designation.
      4. Some panels of figures 3 and 4 should be improved. Panels 3J, 3L and 4G are difficult to see. In panel 3J please clarify UT line from untreated/NE, are they not transduced? No equivalent conditions are assayed in Aralar KD and OGC KO cells.

      Significance

      General assessment: The robust part of this study is its analysis of some aspects related to lipid metabolism in cultured primary cells derived from brown adipose tissue. The participating teams are well-versed in this topic and the approaches used are correct. However, no data from animal models supporting these results are provided, and this detracts from the interest of the study.

      Advance: This manuscript is the "logical" continuation of a previous study, Veliova et al., (2020) EMBO Rep, more relevant in my opinion. Also, a role for MAS in BAT thermogenesis has recently been proposed using animal models, either overexpressing or deficient in GOT1, a cytosolic protein component of MAS (Park et al., Cell Rep. 2025). The novelty in this manuscript is the analysis of cells deficient in the metabolite transporters that regulate the direction of NADH shuttling. However, since no evidence of its effect on the NAD+/NADH ratio is provided, the conclusions related to the role of MAS, or the mitochondrial carriers silenced, in the regulation of lipolysis in BAT and its involvement in thermogenesis are not convincing.

      Audience: These results could be of interest to the audience interested in basic research, but could also be useful in the translational/clinical area because they address metabolic aspects in adipose tissue.

      My expertise is focused on mitochondrial metabolism, specifically on the function of a subtype of mitochondrial carriers regulated by cytosolic calcium and how they participate in the control of different mitochondrial functions, such as respiration, calcium buffering, and cell proliferation. Some of these transporters, such as Aralar/AGC1 or citrin/AGC2, are components of MAS.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript presents novel findings on the role of the malate-aspartate shuttle (MASh) in brown adipose tissue (BAT). Building on the recent advances in elucidating the contribution of MASh to BAT metabolism, the present study provides new evidence by offering direct biochemical validation using a reconstituted BAT mitochondrial system and by introducing genetic data on the mitochondrial carriers OGC1 and Aralar1, thereby adding significant new insight. However, the following points require further clarification.

      Major points:

      1. In the current manuscript, mitochondrial morphology (area, aspect ratio, and roundness) was analyzed in OGC1 KD cells using TMRE, whereas MitoTracker Deep Red (MTDR) was used in Aralar1 KD cells. Notably, TMRE is a ΔΨm-dependent probe. The signal intensity can change, or the distribution may reflect alterations in membrane potential rather than true morphological changes. Therefore, the observed differences in OGC1 KD cells based on TMRE staining may be confounded by the dye's functional dependence, potentially biasing the conclusions. It is recommended to evaluate mitochondrial morphology with consistent trackers across conditions. In addition, in the subsequent OCR analysis, mitochondrial area was used for normalization. Please clarify which staining method was employed, and provide justification for its suitability.

      Minor points:

      1. In the introduction, the authors state that "LDH activity increases in the context of BAT activation". This point is important for the logic of the manuscript, reference [10] cited here is not sufficient to support this claim. It is recommended to provide appropriate references to support this statement.
      2. In Fig. 1A and B-D, there are inconsistencies and duplications in the abbreviation labels. Please check and revise accordingly.
      3. In Fig. S1, the number of n indicated does not match the number of data points shown. Please clarify whether these represent technical replicates or biological replicates, and provide a detailed description of the statistical methods used throughout the manuscript.
      4. Please provide details on the normalization strategy used in the BODIPY-C12/BODIPY-493 staining analysis, such as whether fluorescence intensity was quantified as mean or integrated values, and whether the analysis was normalized to lipid droplet area, cell number, or baseline. Since lipolytic stimulation can reduce droplet size and increase droplet number, these factors may bias the results.
      5. The manuscript notes that the unexpected result in Fig. 3K-M in parallel with increased Atgl mRNA expression might be because it does not reflect protein levels or enzymatic activity. To strengthen this point, it is recommended to include data on ATGL and phosphorylation ATGL.
      6. Red-on-black is not a great color code for IMFs, how about black-and-white?

      Referees cross-commenting

      In my opinion, all three reviewers have provided constructive criticism of the work.

      Significance

      The work dives deeper into mitochondrial function and metabolism of brown adipocytes and, thus, advances our understanding of thermogenesis in an incremental fashion. The work will be relevant to brown adipose tissue researchers and mitochondrial biologists.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The paper makes a clear, well-supported case that the malate-aspartate shuttle (MAS) is active in brown adipocytes and supports adrenergically stimulated lipolysis. The combination of a functional MAS assay, targeted carrier knockdowns, and multi-modal lipolysis measurements is a strong package. The reconstituted mitochondrial assay paired with live-cell lipolysis imaging is technically elegant and broadly reusable. The main gap is the limited in-vitro scope relative to in-vivo cold adaptation.

      Major comments

      1. The manuscript posits that the loss of function of MASh components (Ogc1 and Aralar) decreases adrenergic-stimulated lipolysis by altering the cytosolic NAD⁺/NADH ratio, with AMPK/ACC mentioned as possible mediators. However, this remains speculative. Please provide mechanistic data directly linking MASh-dependent NAD⁺/NADH changes to the regulation of lipolysis in brown adipocytes during adrenergic stimulation.
      2. The absence of in vivo analysis of lipid-droplet size in MASh loss-of-function models is a major concern. In vitro results could be confounded by differences in differentiation stage between groups. Please document equivalent adipogenesis across groups (e.g., Pparg/Cebpa/Plin1/Fabp4 expression).
      3. Please include rescue experiments (add-back OGC1 and Aralar) to rule out siRNA/shRNA off-target effects and verify that the phenotype stems from MASh loss of function.
      4. Please expand on physiological significance: What is the importance of MASh regulation of BAT lipolysis in long-term adaptive thermogenesis?

      Minor comments

      1. Fig. 4 legend/title contains a typo ("lypolysis" → lipolysis).
      2. In Fig. 2 legend line: "Adevirus-mediated" → Adenovirus-mediated; "OCAR" → OCR.
      3. For lipolysis imaging, you already show Forskolin/Atglistatin/Etomoxir controls; add a vehicle-only time course overlay in the main figure (currently in text/legend) to aid visual comparison.
      4. Ensure consistent gene symbols (Atgl/Pnpla2), and protein capitalization.

      Referees cross-commenting

      In my view, the feedback offered by all three reviewers has been highly constructive, as each of them has contributed thoughtful and meaningful criticism that can help improve the quality, clarity, and overall impact of the work.

      Significance

      Advance - how it fits the literature and what kind of advance.

      Relative to prior work linking MASh (often via GOT1) to fuel preference and redox during thermogenesis, this study fills a mechanistic gap by showing that carrier-level MASh disruption (Aralar1/OGC1) specifically impairs adrenergic lipid mobilization upstream of β-oxidation, while respiration per cell can be buffered by compensatory mitochondrial biogenesis (lower OCR per mitochondrion). Conceptual/fundamental advance: it sharpens the redox - lipolysis axis in BAT and clarifies why changes in fuel availability (lipolysis) may limit thermogenesis even when bulk OCR looks preserved.

      Audience - who will be interested/influenced.

      Specialized but cross-cutting: adipose biology & thermogenesis, mitochondrial/redox metabolism, lipid-droplet and lipolysis communities, and metabolic-disease researchers exploring strategies to modulate BAT fuel handling.

      Reviewer expertise

      Adipose tissue and systemic energy metabolism; mitochondrial bioenergetics; thermogenic mechanisms in BAT/beige fat; transcriptional and metabolic control of lipid mobilization. Not a specialist in membrane-carrier biophysics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewer’s Comments

      We thank all three reviewers for their thoughtful and detailed comments, which will help us to improve the quality and clarity of our manuscript.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary: In this work, Tripathi et al address the open question of how the Fat/Ds pathway affects organ shape, using the Drosophila wing as a model. The Fat/Ds pathway is a conserved but complex pathway, interacting with Hippo signalling to affect growth and providing planar cell polarity that can influence cellular dynamics during morphogenesis. Here, the authors use genetic perturbations combined with quantification of larval, pupal, and adult wing shape and laser ablation to conclude that the Ft/Ds pathway affects wing shape only during larval stages in a way that is at least partially independent of its interaction with Hippo and rather due to an effect on tissue tension and myosin II distribution. Overall the work is clearly written and well presented. I only have a couple of major comments on the limitations of the work.

      Major comments: 1. Authors conclude from data in Figures 1 and 2 that the Fat/Ds pathway only affects wing shape during larval stages. When looking at the pupal wing shape analysis in Figure 2L, however, it looks like there is a difference in wt over time (6h-18h, consistent with literature), but that difference in time goes away in RNAi-ds, indicating that actually there is a role for Ds in changing shape during pupal stages, although the phenotype is clearly less dramatic than that of larval stages. No statistical test was done over time (within the genotype), however, so it's hard to say. I recommend the authors test over time - whether 6h and 18h are different in wild type and in ds mutant. I think this is especially important because there is proximal overgrowth in the Fat/Ds mutants, much of which is contained in the folds during larval stages. That first fold, however, becomes the proximal part of the pupal wing after eversion and contracts during pupal stages to elongate the blade (Aigouy 2010, Etournay 2015). Also, according to Trinidad Curr Biol 2025, there is a role for Fat/Ds pathway in pupal stages. All of that to say that it seems likely that there would be a phenotype in pupal stages. It's true it doesn't show up in the adult wing in the experiments in Fig 1, but looking at the pupal wing itself is more direct - perhaps the very proximal effect is less prominent later, as there is potential for further development after 18hr before adulthood and the most proximal parts are likely anyway excluded in the analysis.

      Response: Our main purpose in examining pupal wing shape was to emphasize that wings lacking ds are visibly abnormal even at early pupal stages. The reviewer makes the point that the change in shape from 6h to 18h APF is greater in control wings than in RNAi-ds wings. We have added quantitation of this to the revised manuscript as suggested. This difference could be interpreted as indicating that Ds-Fat signaling actively contributes to wing shape during pupal morphogenesis. However, given the genetic evidence that Ds-Fat signaling influences wing shape only during larval growth, we favor the interpretation that it reflects consequences of Ds-Fat action during larval stages – eg, overgrowth of the wing, particularly the proximal wing and hinge as occurs in ds and fat mutants, could result in relatively less elongation during the pupal hinge contraction phase. This wouldn’t change our key conclusions, but it is something that we discuss in a revised manuscript.

      I think there needs to be a mention and some discussion of the fact that the wing is not really flat. While it starts out very flat at 72h, by 96h and beyond, there is considerable curvature in the pouch that may affect measurements of different axes and cell shape. It is not actually specified in the methods, so I assume the measurements were taken using a 2D projection. It is not clear whether the curvature of the pouch was taken into account, either for cell shape measurements presented in Fig 4 or for the wing pouch dimensional analysis shown in Fig 3, 6, and supplements. Do perturbations in Ft/Ds affect this curvature? Are they more or less curved in one or both axes? Such a change could affect the results and conclusions. The extent to which the fat/ds mutants fold properly is another important consideration that is not mentioned. For example, maybe the folds are deeper and contain more material in the ds/fat mutants, and that's why the pouch is a different shape? At the very least, this point about the 3D nature of the wing disc must be raised in discussion of the limitations of the study. For the cell shape analysis, you can do a correction based on the local curvature (calculated from the height map from the projection). For the measurement of A/P, D/V axes of the wing pouch, best would be to measure the geodesic distance in 3D, but this is not reasonable to suggest at this point. One can still try to estimate the pouch height/curvature, however, both in wild type and in fat/ds mutants.

      Response: The wing pouch measurements were done on 2D projections of wing discs that were already slightly flattened by coverslips, so there is not much curvature outside of the folds. We will revise the methods to make sure this is clear. While we recognize that the absolute values measured can be affected by this, our conclusions are based on the qualitative differences in proportions between genotypes and time points, and we wouldn’t expect these to differ significantly even if 3D distances were measured. Obtaining accurate 3D measures is technically more challenging - it requires having spacers matching the thickness of the wing disc, which varies at different time points and genotypes, and then measuring distances across curved surfaces. To address this, we propose to do a limited set of 3D measures on wild-type and ds mutant wing discs at early and late stages, which we expect will confirm that the conclusions of our analysis are unaffected, while at the same time providing an indication of how much curvature affects the values obtained. We will also make sure the issue of wing disc curvature and folds is discussed in the text.

      Minor comments: 1. The analysis of the laser ablation is not really standard - usually one looks at recoil velocity or a more complicated analysis of the equilibrium shape using a model (e.g., Shivakumar and Lenne 2016, Piscitello-Gomez 2023, Dye et al 2021). One may be able to extract more information from these experiments - nevertheless, I doubt the conclusions would change, given that there seems to be a pretty clear difference between wt and ds (OPTIONAL).

      Response: We will add measurements of recoil velocities to complement our current analysis of circular cuts.

      Figure 7G: I think you also need a statistical test between RNAi-ds and UAS-rokCA+RNAi-ds.

      Response: We include this statistical test in the revised manuscript (it shows that they are significantly different).

      In the discussion, there is a statement: "However, as mutation or knock down of core PCP components, including pk or sple, does not affect wing shape... 59." Reference 59 is quite old and as far as I can tell shows neither images nor quantifications of the wing shape phenotype (not sure it uses "knockdown" either - unless you mean hypomorph?). A more recent publication Piscitello-Gomez et al Elife 2023 shows a very subtle but significant wing shape phenotype in core PCP mutants. It doesn't change your logic, but I would change the statement to be more accurate by saying "mutation of core PCP components has only subtle changes in adult wing shape"

      Response: Thank you for pointing this out; we have revised the manuscript accordingly.

      Referee cross-commenting

      Reviewer2: Reviewer 2 makes the statement: "The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing."

      I disagree - the DV boundary wraps around the entire margin of the adult wing (as correctly drawn with the pink line in Fig 2A). It is not the same as the wide axis of the adult wing (perpendicular to the AP boundary). It is not trivial to map the proximal-distal axis of the larval wing to the proximal-distal axis of the adult, due to the changes in shape that occur during eversion. Thus, I find it much easier to look at the exact measurement that the authors make, and it is much more standard in the field, rather than what the reviewer suggests. Alternatively, one could I guess measure in the adult the ratio of the DV margin length (almost the circumference of the blade?) to the AP boundary length. That may be a more direct comparison. Actually the authors leave out the term "boundary" - what they call AP is actually the AP boundary, not the AP axis, and likewise for the DV - what they measure is DV boundary, but I only noticed that in the second read-through now. Just another note, these measurements of the pouch really only correspond to the very distal part of the wing blade, as so much of the proximal blade comes from the folds in the wing disc. Therefore, a measurement of only distal wing shape would be more comparable.

      Response: We thank Reviewer 1 for their comments here. In terms of the region measured, we measure to the inner Wg ring in the disc, the location of this ring in the adult is actually more proximal than described above (eg see Fig 1B of Liu, X., Grammont, M. & Irvine, K. D. Roles for scalloped and vestigial in regulating cell affinity and interactions between the wing blade and the wing hinge. Developmental Biology 228, 287–303 (2000)), and this defines roughly the region we have measured in adult wings (with the caveat noted above that the measurements in the disc can be affected by curvature and the hinge/pouch fold, which we will address).

      Reviewer 2 states that the authors cannot definitively conclude anything about mechanical tension from their reported cutting data because the authors have not looked at initial recoil velocity. I strongly disagree. The wing disc tissue is elastic on much longer timescales than what's considered after laser ablation (even hours), and the shape of the tissue after it equilibrates from a circular cut (1-2 min) can indeed be used to infer tissue stresses (see Dye et al eLife 2021, Piscitello-Gomez et al eLife 2023, Tahaei et al arXiv 2024). In the wing disc, the direction of stresses inferred from initial recoil velocity is correlated with the direction of stresses inferred from analysing the equilibrium shape after a circular cut. Rearrangements, a primary mechanism of fluidization in epithelia, do not occur within 1 min. Analysing the equilibrium shape after circular ablation may be more accurate for assessing tissue stresses than initial recoil velocity - in Piscitello-Gomez et al 2023, the authors found that a prickle mutation (PCP pathway) affected initial recoil velocity but not tissue stresses in the pupal wing. Such equilibrium circular cuts have also been used to analyze stresses in the avian embryo, where they correlate with directions of stress gathered from force inference methods (Kong et al Scientific Reports 2019). The Tribolium example noted by the reviewer is on the timescale of tens to hundreds of minutes - much longer than the timescale of laser ablation retraction. It is true the analysis of the ablation presented in this paper is not at the same level as those other cited papers and could be improved. But I don't think the analysis would be improved by additional experiments doing timelapse of initial retraction velocity.

      Response: Thank you, we agree with Reviewer 1 here.

      Reviewer 2 states "If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges." Not true in this case. Myosin II accumulates along long boundaries (Legoff and Lecuit 2013). "Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult." Agreed - but this is well beyond the scope of this manuscript. The authors clearly show that there is a change of cell shape, at least in these two regions. Better would be to quantify it throughout the pouch and across multiple discs. Similar point for myosin quantifications - yes, polarity would be interesting and possible to look at in these data, and it would be better to do so on multiple discs, but the lack of overall myosin on the junctions shown here is not nothing. Interpreting what Ft/Ds does to influence tension and myosin and eventually tissue shape is a big question that's not answered here. I think the authors do not claim to fully understand this though, and maybe further toning down the language of the conclusions could help.

      Response: We agree with Reviewer 1 here and will also add quantitation of myosin across multiple discs and will include higher magnification myosin images and polarity tests.

      Reviewer 3: I agree with many of the points raised by Reviewer 3, in particular that relevant for Fig 1. The additional experiments looking at myosin II localization and laser ablation in the other perturbations (Hippo and Rok mutants/RNAi) would certainly strengthen the conclusions.

      Response: Reviewer 3's comment on Fig 1 requests antibody stains to assess recovery of expression after downshift, which we will do.

      We will add examination of myosin localization in hpo RNAi wing discs, and in the ds/rok combinations. We note that the effects of Rok manipulations on myosin and on recoil velocity have been described previously (eg Rauskolb et al 2014).

      Reviewer #1 (Significance (Required)): I think the work provides a clear conceptual advance, arguing that the Ft/Ds pathway can influence mechanical stress independently of its interaction with Hippo and growth. Such a finding, if conserved, could be quite important for those studying morphogenesis and Fat function in this and other organisms. For this point, the genetic approach is a clear strength. Previous work in the Drosophila wing has already shown an adult wing phenotype for Ft/Ds mutations that was attributed to its role in the larval growth phase, as marked clones show aberrant growth in mutants. The novelty of this work is the dissection of the temporal progression of this phenotype and how it relates to Hippo and myosin II activation. It remains unclear exactly how Ft/Ds may affect tissue tension, except that it involves a downregulation of myosin II - the mechanism of that is not addressed here and would involve considerably more work. I think the temporal analysis of the wing pouch shape was quite revealing, providing novel information about how the phenotype evolves in time, in particular that there is already a phenotype quite early in development. As mentioned above, however, the lack of consideration of the wing disc as a 3D object is a potential limitation. While the audience is likely mostly developmental biologists working in basic research, it may also interest those studying the pathway in other contexts, including in vertebrates given its conservation and role in other processes.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): The manuscript begins with very nice data from a ts sensitive period experiment. Instead of a ts mutation, the authors induced RNAi in a temperature dependent manner. The results are striking and strong. Knockdown of FT or DS during larval stages to late L3 changed shape while knockdown of FT or DS during later pupal stages did not. This indicates they are required during larval, not pupal stages of wing development for this shape effect. They did shift-up or shift-down at "early pupa stage" but precisely what stage that means was not described anywhere in the manuscript. White prepupal? Time? Likewise a shift-down was done at "late L3" but that meaning is also vague. Moreover, I was surprised to see they did not do a shift-up at the late L3 stage, to give completeness to the experiment. Why?

      Response: We have added more precise descriptions of the timing, and we will also add the requested late L3 shift-up experiment.

      Looking at the "shape" of the larval wing pouch they see a difference in the mutants. The pouch can be approximated as an ellipse, but with differing topology to the adult wing. Here, they muddled the analysis. The adult wing surface is analogous to one hemisphere of the larval wing pouch, i.e., either dorsal or ventral compartment. The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing. They confusingly call this latter metric the "DV length" and the former metric the "AP length", and in fact they do not measure the PD length but PD+DP length. Confusing. Please change to make this consistent with earlier analysis of the adult and invert the reported ratio and divide by two.

      Then you would find the larval PD/AP ratio is smaller in the FT and DS mutants than wildtype, which resembles the smaller PD/AP ratio seen in the mutant adult wings. Totally consistent and also provides further evidence with the ts experiments that FT and DS exert shape effects in the larval phase of life.

      Response: As noted by Reviewer 1 in cross-commenting, some of the statements made by Reviewer 2 here are incorrect, eg “The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing.” They are correct where they note that the A-P length we measure in the discs is actually equivalent to 2x the adult wing length, since we are measuring along both the dorsal and ventral wing, but this makes no difference to the analysis as the point is to compare shape between time points and genotypes, not to make inferences based on the absolute numbers obtained. The numerical manipulations suggested are entirely feasible but we think they are unnecessary.
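
      The conversion Reviewer 2 requests can be sketched as follows. This is a minimal illustration, assuming the reported disc ratio is the "DV length" divided by the "AP length" as those terms are used above (the manuscript's exact definition is not restated here); the function name and example values are hypothetical. Because the map r -> 1/(2r) is strictly decreasing for positive r, it reverses the ordering of two values but cannot make distinct values equal, so a shape difference between genotypes survives the conversion.

```python
def adult_like_pd_ap_ratio(reported_dv_over_ap: float) -> float:
    """Invert the reported disc ratio and halve it, as Reviewer 2 suggests.

    Assumption: the reported ratio is (length along the DV boundary) divided
    by (length along the AP boundary), and the AP-boundary length spans both
    dorsal and ventral surfaces, i.e. roughly twice the adult PD length.
    """
    if reported_dv_over_ap <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / (2.0 * reported_dv_over_ap)


# Hypothetical ratios for two genotypes (illustrative values, not data).
control_ratio, mutant_ratio = 0.8, 1.0
print(adult_like_pd_ap_ratio(control_ratio))
print(adult_like_pd_ap_ratio(mutant_ratio))
```

      With these made-up inputs the control gives the larger converted ratio, which is the direction Reviewer 2 describes for the adult wing.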

      The remainder of the manuscript has experimental results that are more problematic, and really the authors do not figure out how the shape effect in larval stages is altered. I outline below the main problems.

      1. They compare the FT DS shape phenotypes to those of mutants or knockdowns in Hippo pathway genes (Hippo is known to be downstream of FT and DS). They find these Hippo perturbations do have shape effects trending in the same direction as FT and DS effects. Knockdown reduces the PD/AP ratio while overexpressing WARTS increases the PD/AP ratio. The effect magnitudes are not as strong, but then again, they are using hypomorphic alleles and RNAi, which often induces partial or hypomorphic phenotypes. The effect strength is comparable when wing pouches are young but then dissipates over time, while FT and DS effects do not dissipate over time. The complexity of the data does not negate the idea that Hippo signaling is also playing some role and could be downstream of FT and DS in all of this. But the authors really downplay the data to the point of stating "These results imply that Ds-Fat influences wing pouch shape during wing disc growth separately from its effects on Hippo signaling." I think a more expansive perspective is needed given the caveats of the experiments.

      Response: Our results emphasize that the effects of Ds-Fat on wing shape cannot be explained solely by effects on Hippo signaling, eg as we stated on page 7 “These observations suggest that Hippo signaling contributes to, but does not fully explain, the influence of ds or fat on adult wing shape.” We also note that impairment of Hippo signaling has similar effects in younger discs, but very different effects in older discs, which clearly indicates that they are having very different effects during disc growth; we will revise the text to make sure our conclusions are clear.

      The reviewer wonders whether some of the differences could be due to the nature of the alleles or gene knockdown. First, the ex, ds, and fat alleles that we use are null alleles (eg see FlyBase), so it is not correct to say that we use only hypomorphic alleles and RNAi. We do use a hypomorphic allele for wts, and RNAi for hpo, for the simple reason that null alleles in these genes are lethal, so adult wings could not be examined. A further issue that is not commented on by the reviewer, but is more relevant here, is that there are multiple inputs into Hippo signaling, so of course even a null allele for ex, ds or fat is not a complete shutdown of Hippo signaling. Nonetheless, one can estimate the relative impairment of Hippo signaling by measuring the increased size of the wings, and from this perspective the knockdown conditions that we use are associated with roughly comparable levels of Hippo pathway impairment, so we stand by our results. We do, however, recognize that these issues could be discussed more clearly in the text, and will do so in a revised manuscript.
      

      Puzzlingly, this lack of taking seriously a set of complex results does not transfer to another set of experiments in which they inhibit or activate ROK, the rho kinase. When ROK is perturbed, they also see weak effects on shape when compared to FT or DS perturbation. This weakness is seen in adults, larvae, clones and in epistasis experiments. The epistasis experiment in particular convincingly shows that constitutive ROK activation is not epistatic to loss of DS; in fact if anything the DS phenotype suppresses the ROK phenotype. These results also show that one cannot simply explain what FT and DS are doing with some single pathway or effector molecule like ROK. It is more complex than that.

      What I really think was needed were experiments combining FT and DS knockdown with other mutants or knockdowns in the Hippo and Rho pathways, and even combining Hippo and Rho pathway mutants with FT or DS intact, to see if there are genetic interactions (additive, synergistic, epistatic) that could untangle the phenotypic complexity.

      Response: We’re puzzled by these comments. First, we never claimed that what Fat or Ds do could be explained simply by manipulation of Rok (eg, see Discussion). Moreover, examination of wings and wing discs where ds is combined with Rho manipulations is in Fig 7, and Hippo and Rho pathway manipulation combinations are in Fig S5. We don’t think that combining ds or fat mutations with other Hippo pathway mutations would be informative, as it is well established that Ds-Fat are upstream regulators of Hippo signaling.

      Laser cutting experiments were done to see if there is anisotropy in tissue tension within the wing pouch. This was to test a favored idea that FT and DS activity generates anisotropy in tissue tension, thereby controlling overall anisotropic shape of the pouch. However there is a fundamental flaw to their laser cutting analysis. Laser cutting is a technique used to measure mechanical tension, with initial recoil velocity directly proportional to the tissue's tension. By cutting a small line and observing how quickly the edges of the cut snap apart, people can quantify the initial recoil velocity and infer the stored mechanical stress in the tissue at the time of ablation. Live imaging with high-speed microscopy is required to capture the immediate response of the tissue to the cut since initial recoil velocity occurs in the first few seconds. A kymograph is created by plotting the movement of the tissue edges over this time scale, perpendicular to the cut. The initial recoil velocity is the slope of the kymograph at time zero, representing how fast the severed edges move apart. A higher recoil velocity indicates higher mechanical tension in the tissue. However, the authors did not measure this initial recoil velocity but instead measured the distance between the severed edges at one time point: 60 seconds after cutting. This is much later than the time point at which the recoil usually begins to dissipate or decay. This decay phase typically lasts a minute or two, during which time the edges continue to separate but at a progressively slower rate. This time-dependent decay of the recoil reveals whether the tissue behaves more like a viscous fluid or an elastic solid. Therefore, the distance metric at 60 seconds is a measurement of both tension and the material properties of the cells. One cannot know then whether a difference in the distance is due to a difference in tension or fluidity of the cells. If the authors made measurements of edge separation at several time points in the first 10 seconds after ablation, they could deconvolute the two. Otherwise their analysis is inconclusive. Anisotropy in recoil could be caused by greater tissue fluidity along one axis. A gradient of cell fluidity along one axis of a tissue has been observed, for example, in the amnioserosa of Tribolium. (A related and important point: was the anisotropy of recoil oriented along the PD axis, the AP axis, or neither? This key point was never stated.)
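
      The kymograph-slope estimate described above can be sketched as follows. This is a minimal illustration, not the analysis used in the manuscript: it assumes edge separation has already been measured at several time points after the cut, and it approximates the initial recoil velocity as the least-squares slope over a short early window. The function name, the 10-second default window, and the example values are hypothetical. A straight-line fit underestimates the time-zero slope if the recoil is already slowing within the window.

```python
def initial_recoil_velocity(times_s, separations_um, window_s=10.0):
    """Least-squares slope of edge separation vs. time in the early window.

    times_s        -- seconds after ablation (hypothetical measurements)
    separations_um -- distance between severed edges at each time, in microns
    window_s       -- only time points at or before this time are used
    Returns the slope in microns per second.
    """
    points = [(t, d) for t, d in zip(times_s, separations_um) if t <= window_s]
    if len(points) < 2:
        raise ValueError("need at least two time points inside the window")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_d = sum(d for _, d in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("time points inside the window must differ")
    cov_td = sum((t - mean_t) * (d - mean_d) for t, d in points)
    return cov_td / var_t


# Made-up series: roughly linear early opening, then a plateau by 60 s.
times = [0.0, 2.0, 4.0, 6.0, 8.0, 60.0]
separations = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0]
print(initial_recoil_velocity(times, separations))
```

      The 60-second point falls outside the window and is ignored, which is the distinction the reviewer draws between an early slope and a single late separation measurement.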

      The authors cannot definitively conclude anything about mechanical tension from their reported cutting data.

      Response: As noted by Reviewer 1 in cross-commenting, there is no fluidity on a time scale of 1 minute in the wing disc, and circular ablations are an established method to investigate tissue stress. We chose the circular ablation method in part because it interrogates stress over a larger area, whereas cutting individual junctions is subject to more variability, particularly as the orientation of the junction (eg radial vs tangential) impacts the tension detected in the wing disc. Nonetheless, we will add recoil measurements to the revised manuscript to complement our circular ablations, which we expect will provide independent confirmation of our results and address the Reviewer’s concern here.

      They measured the eccentricity of wing pouch cells near the pouch border, and found they were highly anisotropic compared to DS mutant cells at comparable locations. Cells were elongated, but again which axis (PD or AP), if either, they were elongated along was never stated. If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges. Thus, recoil velocity after laser cutting would be stronger along the axis aligned with short cell edges. It looks like the cutting anisotropy they see is greater along the axis aligned with long cell edges. Of course, if the cell anisotropy is caused by a pulling force exerted by the pouch boundary, then it would stretch the cells. This would in fact fit their cutting data. But then again, the observed cell anisotropy could also be caused by variation in the fluid-solid properties of the wing cells as discussed earlier. Compression of the cells then would deform them anisotropically and produce the anisotropic shapes that were observed. Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult.

      Response: As noted by Reviewer 1 in cross-commenting, it is well established that tension and myosin are higher along long edges in the proximal wing. However, we acknowledge that we could do a better job of making the location and orientation of the regions shown in these experiments clear, and we will address this in a revised manuscript.

      The imaging and analysis of the myosin RLC by GFP tagging is also flawed. SQH-GFP is a tried-and-true proxy for myosin activity in Drosophila. Although the authors imaged the wing pouch of wildtype and DS mutants, they did so under low magnification to image the entire pouch. This gives a "low-res" perspective of overall myosin but what they needed to do was image at high magnification in that proximal region of the pouch and see if Sqh-GFP is polarized in wildtype cells along certain cell edges aligned with an axis. And if such a polarity is observed, is it present or absent in the DS mutant? From the data shown in Figure 5, I cannot see any significant difference between wildtype and knocked down samples at this low resolution. Any difference, if there is any, is not really interpretable.

      Response: We agree that examination of myosin localization at high resolution to see if it is polarized is a worthwhile experiment. We did in fact do this, and myosin (Sqh:GFP) appeared unpolarized in ds mutants. However, the levels of myosin were so low that we didn’t feel confident in our assessment, so we didn’t include it. We now recognize that this was a mistake, and we will include high resolution myosin images and assessments of (lack of) polarity in a revised manuscript to address this comment.

      In conclusion, the manuscript has multiple problems that make it impossible for the authors to make the claims they make in the current manuscript. And even if they calibrated their interpretations to fit the data, there is not much of a simple clear picture as to how FT and DS regulate pouch eccentricity in the larval wing.

      Response: We think that the legitimate issues raised are addressable, as described above, while some of the criticisms are incorrect (as noted by Reviewer 1).

      Reviewer #2 (Significance (Required)): This manuscript describes experiments studying the role that the protocadherins FAT and DACHSOUS play in determining the two-dimensional "shape" of the fruit fly wing. By "shape", the manuscript really means how much the wing's outline, when approximated as an ellipse, deviates from a circle. The elliptical approximations of FT and DS mutant wings more closely resemble a circle compared to the more eccentric wildtype wings. This suggests the molecules contribute to anisotropic growth in some way. A great deal of attention has been paid to how FT and DS regulate overall organ growth and planar cell polarity, and the Irvine lab has made extensive contributions to these questions over the years. Somewhat understudied is how FT and DS regulate wing shape, and this manuscript focuses on that. It follows up on an interesting result that the Irvine lab published in 2019, in which mud mutants randomized spindle pole orientation in wing cells but did not change the eccentricity of wings, ruling out biased cell division orientation as a mechanism for the anisotropic growth.
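
      For reference, the deviation of an ellipse from a circle is commonly summarized by its eccentricity, e = sqrt(1 - (b/a)^2), where a and b are the longer and shorter axis lengths. The sketch below is illustrative only: the reports above mostly refer to length ratios rather than eccentricity, and the function name and values here are hypothetical.

```python
import math


def ellipse_eccentricity(axis_1: float, axis_2: float) -> float:
    """Eccentricity of an ellipse given its two axis lengths, in any order.

    Returns 0.0 for a circle and approaches 1.0 as the ellipse elongates.
    Full or semi-axis lengths both work, since only their ratio matters.
    """
    if axis_1 <= 0 or axis_2 <= 0:
        raise ValueError("axis lengths must be positive")
    major, minor = max(axis_1, axis_2), min(axis_1, axis_2)
    return math.sqrt(1.0 - (minor / major) ** 2)


print(ellipse_eccentricity(1.0, 1.0))  # circle: 0.0
print(ellipse_eccentricity(5.0, 3.0))  # a more elongated outline
```

      A rounder outline, as described for the mutant wings, corresponds to a value closer to zero.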

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): Summary The authors investigate the mechanisms underlying epithelial morphogenesis using the Drosophila wing as a model system. Specifically, they analyze the contribution of the conserved Fat/Ds pathway to wing shape regulation. The main claim of the manuscript is that Ds/Fat controls wing shape by regulating tissue mechanical stress through MyoII levels, independently of Hippo signaling and tissue growth.

      Major Comments To support their main conclusions, the authors should address the following major points and consider additional experiments where indicated. Most of the suggested experiments are feasible within a reasonable timeframe, while a few are more technically demanding but would substantially strengthen the manuscript's central claims.

      Figure 1: The authors use temperature-sensitive inactivation of Fat or Ds to determine the developmental window during which these proteins regulate wing shape. To support this claim, it is essential to demonstrate that upon downshift during early pupal stages, Ds or Fat protein levels are restored to normal. For consistency, please include statistical analyses in Figure 1P and ensure that all y-axis values in shape quantifications start at 1.

      Response: We will do the requested antibody stains for Fat (Ds antibody is unfortunately no longer available, but the point made by the reviewer can be addressed by Fat as the approach and results are the same for both genes). We have also added the requested statistical analysis to Fig 1P, and adjusted the scales as requested.

      Figure 2: The authors propose that wing shape is regulated by Fat/Ds during larval development. However, Figure 2L suggests that wing elongation occurs in control conditions between 6 and 12 h APF, while this elongation is not observed upon Ds RNAi. The authors should therefore perform downshift experiments while monitoring wing shape during the pupal stage to substantiate their main claim. In addition, equivalent data for Fat loss of function should be included to support the assertion that Fat and Ds act similarly.

      Response: As noted in our response to point 1 of Reviewer 1, we agree that there does seem to be relatively more elongation in control wings than in ds RNAi wings, but we think this likely reflects effects of ds on growth during larval stages, and we will revise the manuscript to comment on this.

      We will also add the suggested examination of fat RNAi pupal wings.

      The suggested examination of pupal wing shape in downshift experiments is unfortunately not feasible. Our temperature shift experiments expressing ds or fat RNAi are done using the UAS-Gal4-Gal80ts system. We also use the UAS-Gal4 system to mark the pupal wing. If we do a downshift experiment, then expression of the fluorescent marker will be shut down in parallel with the shutdown of ds or fat RNAi, so the pupal wings would no longer be visible.

      Figure 3: The authors state that "These observations indicate that Ds-Fat signaling influences wing shape during the initial formation of the wing pouch, in addition to its effects during wing growth." This conclusion is not fully supported, as the authors only examine wing shape at 72 h AEL. At this stage, fat or ds mutant wings already display altered morphology. The authors could only make this claim if earlier time points were fully analyzed. In fact, the current data rather suggest that Ds function is required before 72 h AEL, as a rescue of wing shape is observed between 72 and 120 h AEL.

      Response: First, I think we are largely in agreement with the Reviewer, as the basis for our saying that Ds-Fat are likely required during initial formation of the wing pouch is that our data show they must be required before 72 h AEL. Second, 72 h is the earliest that we can look using Wg expression as a marker, as at earlier stages it is in a ventral wedge rather than a ring around the future wing pouch + DV line (eg see Fig 8 of Tripathi, B. K. & Irvine, K. D. The wing imaginal disc. Genetics (2022) doi:10.1093/genetics/iyac020.). We can revise the text to make sure this is clear.

      Figure 4: The authors state that "The influence of Ds-Fat on wing shape is not explained by Hippo signaling." However, this conclusion is not supported by their data, which show that partial loss of ex or hippo causes clear defects in wing shape. In addition, the initial wing shape is affected in wts and ex mutants, and hypomorphic alleles were used for these experiments. Therefore, the main conclusion requires revision. It would be useful to include a complete dataset for hippo RNAi, ex, and wts conditions in Figure S1. The purpose and interpretation of the InR^CA experiments are also unclear. While InR^CA expression can increase tissue growth, Hippo signaling has functions beyond growth control. Whether Hippo regulates tissue shape through InR^CA-dependent mechanisms remains to be clarified.

      Response: As noted in our response to point 1 of Reviewer 2 - our results emphasize that the effects of Ds-Fat on wing shape cannot be explained solely by effects on Hippo signaling, eg as we stated on page 7 “These observations suggest that Hippo signaling contributes to, but does not fully explain, the influence of ds or fat on adult wing shape.” We also note that impairment of Hippo signaling has similar effects in younger discs, but very different effects in older discs, which clearly indicates that they are having very different effects during disc growth. We will make some revisions to the text to make sure that our conclusions are clear throughout.

      While we used a hypomorphic allele for wts, because null alleles are lethal, the ex allele that we used is described in Flybase as an amorph, not a hypomorph, and as noted in our response to Reviewer 2, we will add some discussion about relative strength of effects on Hippo signaling.

      In Fig S1, we currently show adult wings for ex[e1] and RNAi-Hpo, and wing discs for wts[P2]/wts[x1], and for ex[e1]. The wts combination does not survive to adult so we can’t include this. We will however, add hpo RNAi wing discs as requested.

      The purpose of including InR^CA experiments is to try to separate effects of Hippo signaling from effects of growth, because InR signaling manipulation provides a distinct mechanism for increasing growth. We will revise the text to try to make sure this is clearer.

      Figure 5: This figure presents images of MyoII distribution, but no quantification across multiple samples is provided. Moreover, the relationship between changes in tissue stress and MyoII levels remains unclear. Performing laser ablation and MyoII quantification on the same samples would provide stronger support for the proposed conclusions.

      Response: We will revise the quantitation so that it presents analysis of averages across multiple discs, rather than representative examples of single discs.

      Both the myosin imaging, and the laser ablation were done on the same genotypes (wildtype and ds) at the same ages (108 h AEL) so we think it is valid to directly compare them. Moreover, the imaging conditions for laser ablation and myo quantification are different, so it’s not feasible to do them at the same time (For ablations we do a single Z plane and a single channel (has to include Ecad, or an equivalent junctional marker) on live discs, so that fast imaging can be done. For Myo imaging we do multiple Z stacks and multiple channels (eg Ecad and Myo), which is not compatible with the fast imaging needed for analysis of laser ablations).

      Figure 6: It is unclear when Rok RNAi and Rok^CA misexpression were induced. To substantiate their claims, the authors should measure both MyoII levels and mechanical tension under the different experimental conditions in which wing shape was modified through Rok modulation (i.e. the condition shown in Fig. 7G). For comparison, fat and ds data should be added to Fig 6H. Overall, the effects of Rok modulation appear milder than those of Fat manipulation. Given that Dachs has been shown to regulate tension downstream of Fat/Ds, it would be informative to determine whether tissue tension is altered in dachs mutant wings and to assess the relative contribution of Dachs- versus MyoII-mediated tension to wing shape control. It would also be interesting to test whether Rok activation can rescue dachs loss-of-function phenotypes.

      Response: In these Rok experiments there was no separate temporal control of Rok RNAi or Rok^CA expression, they were expressed under nub-Gal4 control throughout development.

      We will add examination of myosin in combinations of ds RNAi and rok manipulation as in Fig 7G to a revised manuscript.

      Data for fat and ds comparable to that shown in Fig 6H is already presented in Fig 3D, and we don’t think it’s necessary to reproduce this again in Fig 6H.

      We agree that the effects of Rok manipulations are milder than those of Fat manipulations; as we try to discuss, this could be because the pattern or polarity of myosin is also important, not just the absolute level, and we will add assessment of myosin polarity.

      The suggestion to also look at dachs mutants is reasonable, and we will add this. In addition, we plan to add an "activated" Dachs (a Zyxin-Dachs fusion protein previously described in Pan et al 2013) that we anticipate will provide further evidence that the effects of Ds-Fat are mediated through Dachs. We will also add the suggested experiment combining Rok activation with dachs loss-of-function.

      Figure 7: The authors use genetic interactions to support their claim that Fat controls wing shape independently of Hippo signaling. However, these interactions do not formally exclude a role for Hippo. Moreover, previous work has shown that tissue tension regulates Hippo pathway activity, implying that any manipulation of tension could indirectly affect Hippo and growth. To provide more direct evidence, the authors should further analyze MyoII localization and tissue tension under the various experimental conditions tested (as also suggested above).

      Response: As discussed above, our data clearly show that Fat has effects independently of Hippo signaling that are crucial for its effects on wing shape, but we did not mean to imply that the regulation of Hippo signaling by Fat makes no contribution to wing shape control, and we will revise the text to make this clearer. We will also add additional analysis of Myosin localization, as described above.

      Reviewer #3 (Significance (Required)): How organ growth and shape are controlled remains a fundamental question in developmental biology, with major implications for our understanding of disease mechanisms. The Drosophila wing has long served as a powerful and informative model to study tissue growth and morphogenesis. Work in this system has been instrumental in delineating the conserved molecular and mechanical processes that coordinate epithelial dynamics during development. The molecular regulators investigated by the authors are highly conserved, suggesting that the findings reported here are likely to be of broad biological relevance.

      Previous studies have proposed that anisotropic tissue growth regulates wing shape during larval development and that such anisotropy induces mechanical responses that promote MyoII localization (Legoff et al., 2013, PMID: 24046320; Mao et al., 2013, PMID: 24022370). The Ds/Fat system has also been shown to regulate tissue tension through the Dachs myosin, a known modulator of the Hippo/YAP signaling pathway. As correctly emphasized by the authors, the respective contributions of anisotropic growth and mechanical tension to wing shape control remain only partially understood. The current study aims to clarify this issue by analyzing the role of Fat/Ds in controlling MyoII localization and, consequently, wing shape. This represents a potentially valuable contribution. However, the proposed mechanistic link between Fat/Ds and MyoII localization remains insufficiently explored. Moreover, the role of MyoII is not fully discussed in the broader context of Dachs function and its known interactions with MyoII (Mao et al., 2011, PMID: 21245166; Bosveld et al., 2012, PMID: 22499807; Trinidad et al., 2024, PMID: 39708794). Most importantly, the experimental evidence supporting the authors' conclusions would benefit from further strengthening. It should also be noted that disentangling the relative contributions of anisotropic growth and MyoII polarization to tissue shape and size remains challenging, as MyoII levels are known to increase in response to anisotropic growth (Legoff et al., 2013; Mao et al., 2013), and mechanical tension itself can modulate Hippo/YAP signaling (Rauskolb et al., 2014, PMID: 24995985).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors investigate the mechanisms underlying epithelial morphogenesis using the Drosophila wing as a model system. Specifically, they analyze the contribution of the conserved Fat/Ds pathway to wing shape regulation. The main claim of the manuscript is that Ds/Fat controls wing shape by regulating tissue mechanical stress through MyoII levels, independently of Hippo signaling and tissue growth.

      Major Comments

      To support their main conclusions, the authors should address the following major points and consider additional experiments where indicated. Most of the suggested experiments are feasible within a reasonable timeframe, while a few are more technically demanding but would substantially strengthen the manuscript's central claims.

      Figure 1:

      The authors use temperature-sensitive inactivation of Fat or Ds to determine the developmental window during which these proteins regulate wing shape. To support this claim, it is essential to demonstrate that upon downshift during early pupal stages, Ds or Fat protein levels are restored to normal. For consistency, please include statistical analyses in Figure 1P and ensure that all y-axis values in shape quantifications start at 1.

      Figure 2:

      The authors propose that wing shape is regulated by Fat/Ds during larval development. However, Figure 2L suggests that wing elongation occurs in control conditions between 6 and 12 h APF, while this elongation is not observed upon Ds RNAi. The authors should therefore perform downshift experiments while monitoring wing shape during the pupal stage to substantiate their main claim. In addition, equivalent data for Fat loss of function should be included to support the assertion that Fat and Ds act similarly.

      Figure 3:

      The authors state that "These observations indicate that Ds-Fat signaling influences wing shape during the initial formation of the wing pouch, in addition to its effects during wing growth." This conclusion is not fully supported, as the authors only examine wing shape at 72 h AEL. At this stage, fat or ds mutant wings already display altered morphology. The authors could only make this claim if earlier time points were fully analyzed. In fact, the current data rather suggest that Ds function is required before 72 h AEL, as a rescue of wing shape is observed between 72 and 120 h AEL.

      Figure 4:

      The authors state that "The influence of Ds-Fat on wing shape is not explained by Hippo signaling." However, this conclusion is not supported by their data, which show that partial loss of ex or hippo causes clear defects in wing shape. In addition, the initial wing shape is affected in wts and ex mutants, and hypomorphic alleles were used for these experiments. Therefore, the main conclusion requires revision. It would be useful to include a complete dataset for hippo RNAi, ex, and wts conditions in Figure S1. The purpose and interpretation of the InR^CA experiments are also unclear. While InR^CA expression can increase tissue growth, Hippo signaling has functions beyond growth control. Whether Hippo regulates tissue shape through InR^CA-dependent mechanisms remains to be clarified.

      Figure 5:

      This figure presents images of MyoII distribution, but no quantification across multiple samples is provided. Moreover, the relationship between changes in tissue stress and MyoII levels remains unclear. Performing laser ablation and MyoII quantification on the same samples would provide stronger support for the proposed conclusions.

      Figure 6:

      It is unclear when Rok RNAi and Rok^CA misexpression were induced. To substantiate their claims, the authors should measure both MyoII levels and mechanical tension under the different experimental conditions in which wing shape was modified through Rok modulation (i.e. the condition shown in Fig. 7G). For comparison, fat and ds data should be added to Fig 6H. Overall, the effects of Rok modulation appear milder than those of Fat manipulation. Given that Dachs has been shown to regulate tension downstream of Fat/Ds, it would be informative to determine whether tissue tension is altered in dachs mutant wings and to assess the relative contribution of Dachs- versus MyoII-mediated tension to wing shape control. It would also be interesting to test whether Rok activation can rescue dachs loss-of-function phenotypes.

      Figure 7:

      The authors use genetic interactions to support their claim that Fat controls wing shape independently of Hippo signaling. However, these interactions do not formally exclude a role for Hippo. Moreover, previous work has shown that tissue tension regulates Hippo pathway activity, implying that any manipulation of tension could indirectly affect Hippo and growth. To provide more direct evidence, the authors should further analyze MyoII localization and tissue tension under the various experimental conditions tested (as also suggested above).

      Significance

      How organ growth and shape are controlled remains a fundamental question in developmental biology, with major implications for our understanding of disease mechanisms. The Drosophila wing has long served as a powerful and informative model to study tissue growth and morphogenesis. Work in this system has been instrumental in delineating the conserved molecular and mechanical processes that coordinate epithelial dynamics during development. The molecular regulators investigated by the authors are highly conserved, suggesting that the findings reported here are likely to be of broad biological relevance.

      Previous studies have proposed that anisotropic tissue growth regulates wing shape during larval development and that such anisotropy induces mechanical responses that promote MyoII localization (Legoff et al., 2013, PMID: 24046320; Mao et al., 2013, PMID: 24022370). The Ds/Fat system has also been shown to regulate tissue tension through the Dachs myosin, a known modulator of the Hippo/YAP signaling pathway. As correctly emphasized by the authors, the respective contributions of anisotropic growth and mechanical tension to wing shape control remain only partially understood. The current study aims to clarify this issue by analyzing the role of Fat/Ds in controlling MyoII localization and, consequently, wing shape. This represents a potentially valuable contribution. However, the proposed mechanistic link between Fat/Ds and MyoII localization remains insufficiently explored. Moreover, the role of MyoII is not fully discussed in the broader context of Dachs function and its known interactions with MyoII (Mao et al., 2011, PMID: 21245166; Bosveld et al., 2012, PMID: 22499807; Trinidad et al., 2024, PMID: 39708794). Most importantly, the experimental evidence supporting the authors' conclusions would benefit from further strengthening. It should also be noted that disentangling the relative contributions of anisotropic growth and MyoII polarization to tissue shape and size remains challenging, as MyoII levels are known to increase in response to anisotropic growth (Legoff et al., 2013; Mao et al., 2013), and mechanical tension itself can modulate Hippo/YAP signaling (Rauskolb et al., 2014, PMID: 24995985).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript begins with very nice data from a ts sensitive period experiment. Instead of a ts mutation, the authors induced RNAi in a temperature dependent manner. The results are striking and strong. Knockdown of FT or DS during larval stages to late L3 changed shape while knockdown of FT or DS during later pupal stages did not. This indicates they are required during larval, not pupal stages of wing development for this shape effect. They did shift-up or shift-down at "early pupa stage" but precisely what stage that means was not described anywhere in the manuscript. White prepupal? Time? Likewise a shift-down was done at "late L3" but that meaning is also vague. Moreover, I was surprised to see they did not do a shift-up at the late L3 stage, to give completeness to the experiment. Why?

      Looking at the "shape" of the larval wing pouch they see a difference in the mutants. The pouch can be approximated as an ellipse, but with differing topology to the adult wing. Here, they muddled the analysis. The adult wing surface is analogous to one hemisphere of the larval wing pouch, i.e., either the dorsal or the ventral compartment. The distance along the AP boundary from the pouch border to the DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from the A border to the P border is topologically comparable to the AP length of the adult wing. They confusingly call this latter metric the "DV length" and the former metric the "AP length", and in fact they do not measure the PD length but the PD+DP length. Confusing. Please change to make this consistent with the earlier analysis of the adult, and invert the reported ratio and divide by two. Then you would find the larval PD/AP ratio is smaller in the FT and DS mutants than in wildtype, which resembles the smaller PD/AP ratio seen in the mutant adult wings. Totally consistent, and it also provides further evidence, together with the ts experiments, that FT and DS exert shape effects in the larval phase of life.
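      The requested conversion is simple arithmetic and can be sketched as follows. This is only an illustration: it assumes the manuscript's reported ratio is the "DV length" divided by the "AP length" (where the "AP length" spans PD+DP), which is how the wording above reads but is not confirmed here. The function name and the input values are hypothetical and are not data from the manuscript.

```python
def larval_pd_over_ap(reported_ratio):
    """Convert a reported "DV length" / "AP length" ratio to a PD/AP ratio.

    Assumption: the reported "AP length" covers PD+DP (both compartments),
    so inverting the ratio gives (PD+DP)/AP and halving gives PD/AP.
    """
    if reported_ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 / reported_ratio) / 2.0


# Hypothetical reported ratios, chosen only to show the direction of the change.
wildtype_pd_over_ap = larval_pd_over_ap(0.40)
mutant_pd_over_ap = larval_pd_over_ap(0.50)

print(f"wildtype PD/AP: {wildtype_pd_over_ap:.2f}")
print(f"mutant PD/AP:   {mutant_pd_over_ap:.2f}")
```

      With these made-up inputs, a larger reported ratio in the mutant converts to a smaller PD/AP ratio, which is the direction described for the adult wings.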

      The remainder of the manuscript has experimental results that are more problematic, and really the authors do not figure out how the shape effect in larval stages is altered. I outline below the main problems.

      1. They compare the FT DS shape phenotypes to those of mutants or knockdowns in Hippo pathway genes (Hippo is known to be downstream of FT and DS). They find these Hippo perturbations do have shape effects trending in same direction as FT and DS effects. Knockdown reduces the PD/AP ratio while overexpressing WARTS increases the PD/AP ratio. The effect magnitudes are not as strong, but then again, they are using hypomorphic alleles and RNAi, which often induces partial or hypomorphic phenotypes. The effect strength is comparable when wing pouches are young but then dissipates over time, while FT and DS effects do not dissipate over time. The complexity of the data do not negate the idea that Hippo signaling is also playing some role and could be downstream of FT and DS in all of this. But the authors really downplay the data to the point of stating "These results imply that Ds-Fat influences wing pouch shape during wing disc growth separately from its effects on Hippo signaling." I think a more expansive perspective is needed given the caveats of the experiments.

      Puzzlingly, this lack of taking seriously a set of complex results does not transfer to another set of experiments in which they inhibit or activate ROK, the rho kinase. When ROK is perturbed, they also see weak effects on shape when compared to FT or DS perturbation. This weakness is seen in adults, larvae, clones and in epistasis experiments. The epistasis experiment in particular convincingly shows that constitutive ROK activation is not epistatic to loss of DS; in fact if anything the DS phenotype suppresses the ROK phenotype. These results also show that one cannot simply explain what FT and DS are doing with some single pathway or effector molecule like ROK. It is more complex than that.

      What I really think was needed were experiments combining FT and DS knockdown with other mutants or knockdowns in the Hippo and Rho pathways, and even combining Hippo and Rho pathway mutants with FT or DS intact, to see if there are genetic interactions (additive, synergistic, epistatic) that could untangle the phenotypic complexity.

      2. Laser cutting experiments were done to see if there is anisotropy in tissue tension within the wing pouch. This was to test a favored idea that FT and DS activity generates anisotropy in tissue tension, thereby controlling overall anisotropic shape of the pouch. However, there is a fundamental flaw in their laser cutting analysis. Laser cutting is a technique used to measure mechanical tension, with initial recoil velocity directly proportional to the tissue's tension. By cutting a small line and observing how quickly the edges of the cut snap apart, people can quantify the initial recoil velocity and infer the stored mechanical stress in the tissue at the time of ablation. Live imaging with high-speed microscopy is required to capture the immediate response of the tissue to the cut, since initial recoil occurs in the first few seconds. A kymograph is created by plotting the movement of the tissue edges over this time scale, perpendicular to the cut. The initial recoil velocity is the slope of the kymograph at time zero, representing how fast the severed edges move apart. A higher recoil velocity indicates higher mechanical tension in the tissue. However, the authors did not measure this initial recoil velocity but instead measured the distance between the severed edges at one time point: 60 seconds after cutting. This is much later than the time point at which the recoil usually begins to dissipate or decay. This decay phase typically lasts a minute or two, during which time the edges continue to separate but at a progressively slower rate. This time-dependent decay of the recoil reveals whether the tissue behaves more like a viscous fluid or an elastic solid. Therefore, the distance metric at 60 seconds is a measurement of both tension and the material properties of the cells. One cannot know then whether a difference in the distance is due to a difference in tension or fluidity of the cells. If the authors made measurements of edge separation at several time points in the first 10 seconds after ablation, they could deconvolute the two. Otherwise their analysis is inconclusive. Anisotropy in recoil could be caused by greater tissue fluidity along one axis. A gradient of cell fluidity along one axis of a tissue has been observed in the amnioserosa of Tribolium, for example. (Related and important point: was the anisotropy of recoil oriented along the PD or AP axis, or not oriented to either axis? This key point was never stated.)
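      The difference between an initial recoil velocity and a single late separation measurement can be illustrated with a toy calculation. The sketch below assumes a Kelvin-Voigt style relaxation, d(t) = d_inf * (1 - exp(-t / tau)), which is a common modeling choice but is not taken from the manuscript or from this review. All parameter values are made up and all function names are hypothetical.

```python
import math


def separation(t, d_inf, tau):
    """Edge separation at time t (s) under the assumed relaxation curve."""
    return d_inf * (1.0 - math.exp(-t / tau))


def initial_recoil_velocity(times, separations):
    """Least-squares slope through the origin over early time points."""
    numerator = sum(t * d for t, d in zip(times, separations))
    denominator = sum(t * t for t in times)
    return numerator / denominator


# Two hypothetical tissues with different relaxation times (seconds).
TAU_A, TAU_B = 5.0, 20.0
D_INF_A = 4.0  # plateau separation, arbitrary length units
# Choose the plateau of tissue B so that both tissues match at 60 s.
D_INF_B = separation(60.0, D_INF_A, TAU_A) / (1.0 - math.exp(-60.0 / TAU_B))

sep60_a = separation(60.0, D_INF_A, TAU_A)
sep60_b = separation(60.0, D_INF_B, TAU_B)

# Early time points of the kind a fast imaging protocol would provide.
early_times = [0.5, 1.0, 1.5, 2.0]
v_a = initial_recoil_velocity(
    early_times, [separation(t, D_INF_A, TAU_A) for t in early_times]
)
v_b = initial_recoil_velocity(
    early_times, [separation(t, D_INF_B, TAU_B) for t in early_times]
)

print(f"separation at 60 s: A = {sep60_a:.3f}, B = {sep60_b:.3f}")
print(f"early recoil velocity: A = {v_a:.3f}, B = {v_b:.3f}")
```

      In this toy case the two tissues are indistinguishable by their 60 second separation, yet their early recoil velocities differ several-fold, which is why sampling several time points shortly after ablation carries information that a single late time point does not.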

      The authors cannot definitively conclude anything about mechanical tension from their reported cutting data.

      3. They measured the eccentricity of wing pouch cells near the pouch border, and found they were highly anisotropic compared to DS mutant cells at comparable locations. Cells were elongated, but again which axis (PD or AP), if either, they were elongated along was never stated. If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges, not the long edges. Thus, recoil velocity after laser cutting would be stronger along the axis aligned with short cell edges. It looks like the cutting anisotropy they see is greater along the axis aligned with long cell edges. Of course, if the cell anisotropy is caused by a pulling force exerted by the pouch boundary, then it would stretch the cells. This would in fact fit their cutting data. But then again, the observed cell anisotropy could also be caused by variation in the fluid-solid properties of the wing cells as discussed earlier. Compression of the cells then would deform them anisotropically and produce the anisotropic shapes that were observed. Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult.

      4. The imaging and analysis of the myosin RLC by GFP tagging is also flawed. SQH-GFP is a tried and true proxy for myosin activity in Drosophila. Although the authors image the wing pouch of wildtype and DS mutants, they did so under low magnification to image the entire pouch. This gives a "low-res" perspective of overall myosin, but what they needed to do was image at high magnification in that proximal region of the pouch and see if Sqh-GFP is polarized in wildtype cells along certain cell edges aligned with an axis. And if such a polarity is observed, is it present or absent in the DS mutant? From the data shown in Figure 5, I cannot see any significant difference between wildtype and knocked down samples at this low resolution. Any difference, if there is any, is not really interpretable.

      In conclusion, the manuscript has multiple problems that make it impossible for the authors to make the claims they make in the current manuscript. And even if they calibrated their interpretations to fit the data, there is not much of a simple clear picture as to how FT and DS regulate pouch eccentricity in the larval wing.

      Significance

      This manuscript describes experiments studying the role that the protocadherins FAT and DACHSOUS play in determining the two-dimensional "shape" of the fruit fly wing. By "shape", the manuscript really means how much the wing's outline, when approximated as an ellipse, deviates from a circle. The elliptical approximations of FT and DS mutant wings more closely resemble a circle compared to the more eccentric wildtype wings. This suggests the molecules contribute to anisotropic growth in some way. A great deal of attention has been paid to how FT and DS regulate overall organ growth and planar cell polarity, and the Irvine lab has made extensive contributions to these questions over the years. Somewhat understudied is how FT and DS regulate wing shape, and this manuscript focuses on that. It follows up on an interesting result that the Irvine lab published in 2019, in which mud mutants randomized spindle pole orientation in wing cells but did not change the eccentricity of wings, ruling out biased cell division orientation as a mechanism for the anisotropic growth.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this work, Tripathi et al address the open question of how the Fat/Ds pathway affects organ shape, using the Drosophila wing as a model. The Fat/Ds pathway is a conserved but complex pathway, interacting with Hippo signalling to affect growth and providing planar cell polarity that can influence cellular dynamics during morphogenesis. Here, authors use genetic perturbations combined with quantification of larval, pupal, and adult wing shape and laser ablation to conclude that the Ft/Ds pathway affects wing shape only during larval stages in a way that is at least partially independent of its interaction with Hippo and rather due to an effect on tissue tension and myosin II distribution. Overall the work is clearly written and well presented. I only have a couple major comments on the limitations of the work.

      Major comments:

      1. Authors conclude from data in Figures 1 and 2 that the Fat/Ds pathway only affects wing shape during larval stages. When looking at the pupal wing shape analysis in Figure 2L, however, it looks like there is a difference in wt over time (6h-18h, consistent with literature), but that difference in time goes away in RNAi-ds, indicating that actually there is a role for Ds in changing shape during pupal stages, although the phenotype is clearly less dramatic than that of larval stages. No statistical test was done over time (within the genotype), however, so it's hard to say. I recommend the authors test over time - whether 6h and 18h are different in wild type and in ds mutant. I think this is especially important because there is proximal overgrowth in the Fat/Ds mutants, much of which is contained in the folds during larval stages. That first fold, however, becomes the proximal part of the pupal wing after eversion and contracts during pupal stages to elongate the blade (Aigouy 2010, Etournay 2015). Also, according to Trinidad Curr Biol 2025, there is a role for Fat/Ds pathway in pupal stages. All of that to say that it seems likely that there would be a phenotype in pupal stages. It's true it doesn't show up in the adult wing in the experiments in Fig 1, but looking at the pupal wing itself is more direct - perhaps the very proximal effect is less prominent later, as there is potential for further development after 18hr before adulthood and the most proximal parts are likely anyway excluded in the analysis.
      2. I think there needs to be a mention and some discussion of the fact that the wing is not really flat. While it starts out very flat at 72h, by 96h and beyond, there is considerable curvature in the pouch that may affect measurements of different axis and cell shape. It is not actually specified in the methods, so I assume the measurements were taken using a 2D projection. Not clear whether the curvature of the pouch was taken into account, either for cell shape measurements presented in Fig 4 or for the wing pouch dimensional analysis shown in Fig 3, 6, and supplements. Do perturbations in Ft/Ds affect this curvature? Are they more or less curved in one or both axes? Such a change could affect the results and conclusions. The extent to which the fat/ds mutants fold properly is another important consideration that is not mentioned. For example, maybe the folds are deeper and contain more material in the ds/fat mutants, and that's why the pouch is a different shape? At the very least, this point about the 3D nature of the wing disc must be raised in discussion of the limitations of the study. For the cell shape analysis, you can do a correction based on the local curvature (calculated from the height map from the projection). For the measurement of A/P, D/V axes of the wing pouch, best would be to measure the geodesic distance in 3D, but this is not reasonable to suggest at this point. One can still try to estimate the pouch height/curvature, however, both in wild type and in fat/ds mutants.

      Minor comments:

      1. The analysis of the laser ablation is not really standard - usually one looks at recoil velocity or a more complicated analysis of the equilibrium shape using a model (e.g. Shivakumar and Lenne 2016, Piscitello-Gomez 2023, Dye et al 2021). One may be able to extract more information from these experiments - nevertheless, I doubt the conclusions would change, given that there seems to be a pretty clear difference between wt and ds (OPTIONAL).
      2. Figure 7G: I think you also need a statistical test between RNAi-ds and UAS-rokCA+RNAi-ds.
      3. In the discussion, there is a statement: "However, as mutation or knock down of core PCP components, including pk or sple, does not affect wing shape... 59." Reference 59 is quite old and as far as I can tell shows neither images nor quantifications of the wing shape phenotype (not sure it uses "knockdown" either - unless you mean hypomorph?). A more recent publication Piscitello-Gomez et al Elife 2023 shows a very subtle but significant wing shape phenotype in core PCP mutants. It doesn't change your logic, but I would change the statement to be more accurate by saying "mutation of core PCP components has only subtle changes in adult wing shape"

      Referee cross-commenting

      Reviewer2:

      Reviewer 2 makes the statement: "The distance along the AP boundary from the pouch border to DV midline is topologically comparable to the PD length of the adult wing. The distance along the DV boundary from A border to P border is topologically comparable to the AP length of the adult wing."

      I disagree - the DV boundary wraps around the entire margin of the adult wing (as correctly drawn with the pink line in Fig 2A). It is not the same as the wide axis of the adult wing (perpendicular to the AP boundary). It is not trivial to map the proximal-distal axis of the larval wing to the proximal-distal axis of the adult, due to the changes in shape that occur during eversion. Thus, I find it much easier to look at the exact measurement that the authors make, and it is much more standard in the field, rather than what the reviewer suggests. Alternatively, one could I guess measure in the adult the ratio of the DV margin length (almost the circumference of the blade?) to the AP boundary length. That may be a more direct comparison. Actually the authors leave out the term "boundary" - what they call AP is actually the AP boundary, not the AP axis, and likewise for the DV - what they measure is DV boundary, but I only noticed that in the second read-through now. Just another note, these measurements of the pouch really only correspond to the very distal part of the wing blade, as so much of the proximal blade comes from the folds in the wing disc. Therefore, a measurement of only distal wing shape would be more comparable.

      Reviewer 2 states that authors cannot definitively conclude anything about mechanical tension from their reported cutting data because the authors have not looked at initial recoil velocity. I strongly disagree. The wing disc tissue is elastic on much longer timescales than what's considered after laser ablation (even hours), and the shape of the tissue after it equilibrates from a circular cut (1-2min) can indeed be used to infer tissue stresses (see Dye et al Elife 2021, Piscitello-Gomez et al eLife 2023, Tahaei et al arXiv 2024). In the wing disc, the direction of stresses inferred from initial recoil velocity is correlated with the direction of stresses inferred from analysing the equilibrium shape after a circular cut. Rearrangements, a primary mechanism of fluidization in epithelia, do not occur within 1'. Analysing the equilibrium shape after circular ablation may be more accurate for assessing tissue stresses than initial recoil velocity - in Piscitello-Gomez et al 2023, the authors found that a prickle mutation (PCP pathway) affected initial recoil velocity but not tissue stresses in the pupal wing. Such equilibrium circular cuts have also been used to analyze stresses in the avian embryo, where it correlates with directions of stress gathered from force inference methods (Kong et al Scientific Reports 2019). The Tribolium example noted by the reviewer is on the timescale of tens to hundreds of minutes - much longer than the timescale of laser ablation retraction. It is true the analysis of the ablation presented in this paper is not at the same level as those other cited papers and could be improved. But I don't think the analysis would be improved by additional experiments doing timelapse of initial retraction velocity.

      Reviewer 2 states "If cell anisotropy is caused by polarized myosin activity, that activity is typically polarized along the short edges not long edges" Not true in this case. Myosin II accumulates along long boundaries (Legoff and Lecuit 2013). "Therefore, interpreting what causes the cell anisotropy and how DS regulates it is difficult," Agreed - but this is well beyond the scope of this manuscript. The authors clearly show that there is a change of cell shape, at least in these two regions. Better would be to quantify it throughout the pouch and across multiple discs. Similar point for myosin quantifications - yes, polarity would be interesting and possible to look at in these data, and it would be better to do so on multiple discs, but the lack of overall myosin on the junctions shown here is not nothing. Interpreting what Ft/Ds does to influence tension and myosin and eventually tissue shape is a big question that's not answered here. I think the authors do not claim to fully understand this though, and maybe further toning down the language of the conclusions could help.

      Reviewer 3:

      I agree with many of the points raised by Reviewer 3, in particular those relevant to Fig 1. The additional experiments looking at myosin II localization and laser ablation in the other perturbations (Hippo and Rok mutants/RNAi) would certainly strengthen the conclusions.

      Significance

      I think the work provides a clear conceptual advance, arguing that the Ft/Ds pathway can influence mechanical stress independently of its interaction with Hippo and growth. Such a finding, if conserved, could be quite important for those studying morphogenesis and Fat function in this and other organisms. For this point, the genetic approach is a clear strength. Previous work in the Drosophila wing has already shown an adult wing phenotype for Ft/Ds mutations that was attributed to its role in the larval growth phase, as marked clones show aberrant growth in mutants. The novelty of this work is the dissection of the temporal progression of this phenotype and how it relates to Hippo and myosin II activation. It remains unclear exactly how Ft/Ds may affect tissue tension, except that it involves a downregulation of myosin II - the mechanism of that is not addressed here and would involve considerably more work. I think the temporal analysis of the wing pouch shape was quite revealing, providing novel information about how the phenotype evolves in time, in particular that there is already a phenotype quite early in development. As mentioned above, however, the lack of consideration of the wing disc as a 3D object is a potential limitation. While the audience is likely mostly developmental biologists working in basic research, it may also interest those studying the pathway in other contexts, including in vertebrates given its conservation and role in other processes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Overall Response.

      We would like to thank the reviewers for their analysis of the manuscript. From their comments it is clear that our manuscript was not clear. We completely rewrote the manuscript to focus on the central question: how Adam13 regulates gene expression in general, and TFap2a in particular, leading to the expression of Calpain8, a protein required for CNC migration.

      The following model will be the central line of our story. It will address all of the proteins involved and the mechanistic evidence that links Adam13 to one of its proven effector targets, Calpain8.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *In this manuscript, Pandey et al. show that the ADAM13 protein modulates histone modifications in cranial neural crest and that the Arid3a protein binds the Tfap2a promoter in an Adam13-dependent manner and has promoter-specific effects on transcription. Furthermore, they show that the Adam13 and human ADAM9 proteins associate with histone modifiers as well as proteins involved in RNA splicing. Although the manuscript is mostly clearly written and the figures well assembled, it reads like a couple of separate and unfinished stories.*

      I believe that our story line was not clear and that the overarching question was not well stated. We have made every effort to change this in the revised manuscript. I would like to include a figure that explains the story.

      In short:

      1 We knew that Adam13 could regulate gene expression in CNC via its cytoplasmic domain.

      2 We also knew that this required Adam13 interaction with Arid3a and that a direct target was the transcription factor TFAP2a, which in turn regulates functional targets that we had identified, including the protocadherin PCNS and the protease Calpain8.

      Our goal was to understand the mechanism allowing Adam13 to regulate gene expression.

      3 The first part of this manuscript shows how Adam13 modulates histone modification in vivo in the CNC, both globally and specifically on the Tfap2a promoter. This results in open chromatin.

      4 Using ChIP we show that Adam13 and Arid3a both bind to the Tfap2a promoter and that Arid3a binding to the first ATG depends on Adam13.

      5 Using Luciferase reporter we show that both Adam13 and Arid3a can induce expression at the first ATG.

      *They show using immunocytochemistry and qPCR that ADAM13 knockouts in CNCs affects histone modifications. Here ChIP-seq or Cut-n-Run experiments would be more appropriate and would result in a more comprehensive understanding of the changes mediated.*

      I agree, but we did not have the funds and I now have nobody working in the lab to do this experiment. These data are also likely to overlap with the RNAseq data that we have and would simply add more open leads. We chose to pursue the only direct target that we know, which is TFAP2a, and to focus on this gene to understand the mechanism.

      We believe that the ChIP PCR experiments are sufficient for this story.

      *The immunohistochemistry assays should at least be verified further using western blotting or other more quantitative methods.*

      Immunofluorescence with statistical analysis is a valid quantification method. Western blot of CNC explants is not trivial and requires a large amount of material. Given the small overall change, we also would not expect to be able to detect the change over the noise of western blot. The ChIP PCR confirms our finding in a completely independent manner.

      *The authors then show that ADAM13 interacts with a number of histone modifiers such as KDM3B, KDM4B and KMT2A but strangely they do not follow up this interesting observation to map the interactions further (apart from a co-ip with KMT2A), the domains involved, the functional role of the interactions or how they mediate the changes in chromatin modifications. *

      We selected KMT2a because it is expressed in the Hek293T cells. KMT2D has been shown to regulate CNC development in Xenopus and is responsible for Kabuki syndrome in humans. We used AlphaFold to predict interactions and found that Adam13 interacts with the SET domain. In addition, we see multiple SET domain-containing proteins in our mass spec data. The mass spec is done on human Hek293T cells, which express a subset of KMT proteins. We now include evidence that Adam13 interacts with the KMT2D SET domain (new figure 5D).

      *The authors then show that ADAM13 affects expression of the TFAP2a gene in a promoter specific manner - affecting expression from S1 but not S2.*

      It is the S1 but not S3. Adam13 has no effect on S2.

      *They further show that ADAM13 affects the binding of the Arid3 transcription factor to the S1-promoter but not to the S3 promoter. However, ADAM13 was present at both promoters. Absence of ADAM13 resulted in increased H3K9me2/3 and decreased H3K4me3 at the S1 promoter whereas only H3K4me3 was changed at the S2 promoter.*

      S3, not S2.

      *Unfortunately, they do not show how this is mediated or through which binding elements this takes place. Why is ADAM13 present at both promoters but only affects Arid3 binding at S1?*

      We agree this is a very interesting question that could be the subject of an entire publication. Promoter deletion and mutation to identify which sites are bound and modulated by Adam13/Arid3a is not trivial.

      *The authors claim that transfecting Arid3a and Adam13 together further increases expression from a reporter (Fig 4E) but this is not true as no statistical comparison is done between the singly transfected and double transfected cells.*

      This is correct; with both, there is a small increase that is not significant. The fact that both proteins can induce the promoter suggests, but does not prove, that they can have additive roles. The loss of function experiment shows that the human Arid3a expressed in Hek293T cells is important for the Adam13-dependent increase of S1. It is possible that the dose of the endogenous Arid3a is sufficient to get full activity of Adam13.

      *Then the authors surprisingly start investigating association of proteins with the two isoforms of TFAP2a which in the mind of this reviewer is a different question entirely.*

      We agree and have removed this part of the manuscript.

      *They find a number of proteins involved in splicing. And the observation that ADAM13 also interacts with splicing factors is really irrelevant in terms of the story that they are trying to tell. Transcription regulation and splicing are different processes and although both affect the final outcome, mRNA, they need to be investigated separately. The link is at least not very clear from the manuscript. Again, the effects on splicing are not further investigated through functional analysis and as presented the data presented is too open-ended and lacking in clarity.*

      We agree that, besides the different activities of the TFap2a isoforms, the rest of the splicing regulation could be a separate study. We were interested in understanding how these two isoforms could activate Calpain8 so differently, which is why we looked at LC/MS/MS. We have removed this part of the story from the manuscript.

      *Additional points: 1. In the abstract they propose that the ADAMs may act as extracellular sensors. This is not substantiated by the results.*

      As an extracellular protein translocating into the nucleus it is a possibility that we propose, but I agree this is not investigated in this manuscript. We will modify the text.

      *2. Page 5, line 16: what is referred to by 6 samples 897 proteins? Were 6 samples analyzed for each condition? The number of repeats for the mass spec analysis is not clear from the text nor are the statistical parameters used to analyse the data. This is also true for the mass spec presented in the part on TFAP2aL-S1 and Adam13 regulate splicing. Statistics and repeats are not presented.*

      In general we provide biological triplicates and use the statistical function of Scaffold to identify proteins that are significantly enriched or absent in each sample.

      When we specify 6 samples, it means 6 independent protein samples were analyzed and used for our statistics. We use the Scaffold T-test with a p value less than 0.05. Peptides were identified with 95% confidence and proteins with 99% confidence.

      *3. Page 6, line 19: set domain should be SET domain.*

      Yes.

      *4. The number of repeats in the RNA sequencing of the CNCs is not clear from the text.*

      Three biological replicates (different batches of embryos from different females).

      *5. The explanation of Figure C is a bit lacking. There are two forms of TFAP2a, L and S, but only one is presented in the figure. Do both forms have the extra S1-3 exons? Also, at the top of the figure it is not clear that the boxes are part of a continuous DNA sequence. Also, it is not clear which codon is not coding.*

      Xenopus laevis are pseudo tetraploid, giving in most cases L and S genes in addition to the 2 alleles from being diploid. The TFAP2a gene structure is conserved between both alloalleles and is similar to the human gene. For promoter analysis and ChIP PCR we chose one of the alloalleles (L), given that the RNAseq data showed that both genes and variants behave the same in response to Adam13. This only becomes important in loss of function experiments, in which both the L and S versions need to be knocked down or knocked out.

      *In the sashimi plot there are green and pink shaded areas. What do they denote? What exactly is lacking in the MO13 mutant - seems that a particular exon is missing suggesting skipping?*

      MO13 is a morpholino that blocks the translation of Adam13 (already characterized with >90% of the protein absent) but does not affect Adam13 mRNA expression.

      *7. Page 11, line 9: „with either MbC or MbC and MO13" needs to be rephrased.*

      Will do.

      *8. Page 11, line 19: „the c-terminus of....and S3) and" should be „the C-terminus of...and S3 and". 9. Page 15, line 10: substrateS. 10. Page 16, line 23: the sentence „increases H3K9 to the promoter of the most upstream" needs revision. 11. Page 26, line 12: Here the authors say: „for two samples two-tail unpaired". What does this mean? Statistics should not be performed on fewer than three samples. In legend to Figure 6 it indicates that T-test was performed on two samples. 12. The discussion should be shortened and simplified. 13. Figure 1 legend. How many images were quantitated for each condition?*

      At least 3 images per condition, for 3 independent experiments (9 images per condition).

      *14. Figure 2 has a strange order of panels where G is below B. 15. Figure 6 legend, line 12. „proteins that were significantly enriched in either of the 2 samples" is not very clear. What exactly does this mean?*

      Reviewer #1 (Significance (Required)):

      *If the authors follow up on either the transcription-part of the story, or the splicing part of the story, they are likely to have important results to present. However, in the present format the paper is lacking in focus as both issues are mixed together without a clear end-result.*

      We have entirely changed the paper according to these comments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *Pandey et al seek to determine the function of ADAM13 in regulating histone modifications, gene expression and splicing during cranial neural crest development. Specifically, the authors tested how Adam13, a metalloprotease, could modify chromatin by interaction with Arid3a and Tfap2a and RNA splicing and gene expression. They then utilize knockouts in Xenopus and HEK293T cells followed by immunofluorescence, IPs, BioID, luciferase assays, Mass spec and RNA assays. Although there is some strong data in the BioID and luciferase experiments, the manuscript tells multiple stories, linking together too many things to make a compelling story. The result is a paper that is very difficult to read and understand the take home message. In addition, some of the conclusions are not supported by the data. This unfortunately means it is not ready for publication. However, I have added below some suggestions that would strengthen the manuscript. My comments are below:*

      Clarity is clearly an issue here. The new version is entirely re-written.

      Here is the take home message:

      We knew that Adam13 could regulate gene expression via its cytoplasmic domain. One of the key targets was identified as Calpain8, a protein critical for CNC migration. We subsequently showed that Adam13 and Arid3a regulated Tfap2a expression which in turn regulated Calpain8.

      In this manuscript we investigated 1) how Adam13 regulates TFAP2a and 2) how Tfap2a controls Calpain8 expression.

      The take home message is that Adam13 binds to histone methyltransferases and changes the histone methylation code overall in the CNC and in particular at the TFAP2a promoter. This results in more open chromatin. We further find that Adam13 binds to the Tfap2a promoter in vivo and is important for Arid3a binding to the first start. Tfap2a that includes this N-terminal sequence regulates Capn8 expression.

      *Major comments: 1. I think it would be better to split out the chromatin modification function from the splicing in two separate papers. While there is a connection, having it all together makes the story difficult to follow.*

      Agree, but I believe that the S1 vs S3 story of Tfap2a is important for the overall story. The new paper does not emphasize splicing.

      *2. The immunofluorescence of H3K9me2/3, in Figure 1, 2, 3 following Adam13 knockdown is not convincing. There seems to be a strong edge effect especially in Figure 2 and 3.*

      The statistical analysis shows that the results, while modest, are significant (three independent experiments using 3 different females and 3 explants for each condition were analyzed). The edge effect observed is eliminated by the mask that we use, which normalizes the expression to either DAPI or Snai2. The edge effect is seen in both control and KD as well. These results are further confirmed by the ChIP PCR on one direct target.

      *Similarly the Arid3a expression in Supp Figure 1 if anything seems increased.*

      We have previously shown that Arid3a expression is not affected by Adam13 KD (Khedgikar et al). Our point here is simply that the difference in Tfap2a cannot be explained by a decrease in Arid3a expression. It is not a critical figure and was eliminated in the new manuscript.

      *It would be better to quantify by western blot and not by fluorescent intensity since it is difficult to determine what a small change in fluorescent intensity means in vivo. *

      Not all antibodies used here work by western blot and the quantity of material required for western blot is much larger than IF. Given the small overall changes and the variability observed in Western blot it is not a viable alternative.

      IF is a quantitative method that has been used widely to assess increases or decreases in protein level or post translational modification. The fact that the same post translational modification that we see in cranial neural crest explants can also be seen by ChIP PCR on the Tfap2a promoter confirms this observation.

      *Also, it does not say in the text or the figure legend what these are, Xenopus explants of CNC?*

      These are CNC explants. It is now clearly stated in the figure legend.

      *3. The rationale for isolating KMT2A from the other chromatin modifiers in the dataset is not clear.*

      The new manuscript clarifies that point. Because we are using Hek293T cells in this assay, which are derived from human embryonic kidney rather than Xenopus cranial neural crest cells, we are not interested in a specific protein but rather in a family of proteins that can modify histones (KMT and KDM). Our rationale is that if Adam13 can bind to KMT2 via the SET domain, it is likely to interact with the KMT2 proteins that are expressed in the CNC. KMT2A and D are expressed in the CNC. This is why we selected KMT2a here (Hek293T). We now include a co-IP with the SET domain of Xenopus KMT2D (new figure 5D).

      *From the RNA-seq in Supp Figure 2 it is not changed as much as likely some of the others.*

      The new manuscript addresses this point. We did not show or expect that the loss of Adam13 would affect mRNA expression of Kmt2.

      *Also, the arrow seems to indicate that it is right above the cutoff. What about other proteins with ATPase activity? That is the top hit in the Dot plot nuclear function. Would be helpful to write out Adam13 cytoplasm/nucleus here.*

      We have used another set of proteomics data that does not include the cytoplasmic/nuclear extract to simplify the results. We hope that the changes make it more obvious.

      Given that we are looking at chromatin remodeling enzymes here, we chose not to investigate the ATPases further in this report. This is such a wide category that it could lead us away from the main story here.

      *4. The splicing information, while interesting would be better as a different manuscript. The sashimi plot requires more explanation as written.*

      We agree and think that a simple representation of the fold change of the different isoforms is more obvious. It is now a minor part of figure 1 and the legend has been improved to describe the method here.

      *How do you tell if the interactions are changed from this?*

      I do not understand this question. The sashimi plot indicates the read-through from the mRNA that goes from one exon to the next, quantifying the specific exon usage. It can therefore be quantified and compared between different conditions.

      *The authors argue there is a reduction of Tfap2a in Figure 3H but half the explant is not expressing sox9 in the Adam13 knockdown. How is this kind of experiment controlled when measuring areas that don't have any fluorescence because of the nature of the explants?*

      We have removed this figure as we had already shown previously by western blot that Tfap2a protein decreased in MO13 embryos. As noted on the histogram, the fluorescence is only measured in Sox9 positive cells in each explant. Three independent experiments were analyzed, with 3 explants for each. We also have seen a decrease by Western blot and mRNA expression (both RNAseq and real-time PCR). In most of our explants, the vast majority of the cells are positive for Snai2 and Sox9, while those that are negative are positive for Sox3 (data not shown here). There is always less signal in the center of the explant, possibly due to the penetration of antibody or interference with the signal by the cell pigment or yolk autofluorescence. Our control explants have the same effect so our quantification is valid.

      *5. The use of a germ line Xenopus mutant for Adam13 is great but how were these knockouts validated?*

      All of the KOs were validated by sequencing, RNAseq and protein expression. These are now included in supplemental figure 1.

      *More information is required here. The Chip-qPCR has a lot of variability between the samples, especially in the H3K9me2/3.*

      All ChIP PCR experiments were performed on Xenopus embryos. The variability is tested by statistical analysis and is either significant or not.

      *Because these are in cell lines, this should be more consistent.*

      They are not in cell lines but in Xenopus embryos.

      *In addition, it is difficult to understand what this means for cranial neural crest cells when assaying in HEK293T cells with the luciferase assay.*

      We use the luciferase assay in Hek293T cells to test if Xenopus proteins can induce a specific reporter (gain of function). We also use luciferase reporters in Xenopus to test if they can perceive the loss of a specific protein (for example Adam13).

      Our results show that Adam13 or Arid3a expression in Hek293T cells can induce the TFAP2S1 reporter.

      *6. The migration assay shows only an example of what it looks like to have defective migration. But it would be better to show control embryos, embryos with Adam13 knockdown and what the rescues look like so the reader can make their own conclusion.*

      We can certainly include this but have published this assay in multiple publications before. The picture is a single example; the histogram shows the statistical validation.

      *The argument from the section above suggests the S1 isoform is the primary one but S3 in this assay also rescues, please explain what this result means since it seems to suggest that even though these isoforms have different activity the function is similar in terms of the ability to rescue defective migration.*

      The result in Hek293T cells shows that only TFAP2aS1 can induce Calpain8, while both S1 and S3 can partially rescue CNC migration in embryos lacking Adam13. The issue here is the dose of mRNA injected for each variant might be too high. Adam13 proteolytic activity is also critical, so we do not expect a complete rescue. The fact that S1 is significantly better at rescuing than S3 is relevant here. It is possible that if we were to decrease the dose of each mRNA we would find one in which S3 no longer rescues but S1 does.

      *The next section again talks about yet another protein Calpain-8. Here the authors use MO13 for luciferase assays instead of HEK293 cells. The authors do not explain why they decided to switch from cells to MO.*

      Calpain8 is one of the validated targets of Adam13 that can rescue CNC migration (Cousin et al Dev Cell). We use the luciferase reporter corresponding to the Xenopus Capn8 promoter to show in vivo that loss of Adam13 reduces its expression (similar to the Capn8 gene). We then went in vitro, using Hek293T cells for gain of function experiments that show that only the TFAP2aS1 variant can induce it while S3 does not.

      We hope that the graphical summary and the new manuscript make this clear.

      *8. The experiment to IP RNA supports only the correlation that Adam9 and Adam13 bind RNA and RNA binding proteins to regulate splicing. This conclusion presented is not supported by the data presented here. While there is a sentence about why Adam9 was chosen here, it would be preferred to focus on Adam13 as the rest of the manuscript is focused on Adam13. The conclusions are generalized to all ADAMs, but ADAM13 and ADAM9 are the only ADAMs investigated in the manuscript.*

      This figure is no longer included. For each of the protein classes that we identify by mass spec we try to find a validation. RNA-IP is simply a validation that Adam13 and Adam9 can bind to complexes that include RNA in a cytoplasmic domain dependent fashion. The conclusion that Adam13 and possibly ADAM9 might be involved in regulating splicing is based on three observations: 1) the proteins associated with Adam13 include multiple splicing factors, 2) the RNAseq analysis shows abnormal splicing in CNC missing Adam13, and 3) the form of TFAP2a induced by Adam13 (S1) associates significantly more with splicing factors than the S3 isoform.

      We agree that the generalization to other ADAMs is not demonstrated here but only suggested. We selected ADAM9 and ADAM19 because we have shown that they can each rescue Adam13 function in the CNC. Unfortunately, there is no ADAM19 antibody on the market that works for IP. We have tested multiple companies and multiple cell lines.

      We believe that the ADAM9 experiment is critical to show that the proteins associated with Adam13 are not simply the result of overexpressing a protein from a different species, since ADAM9 is the endogenous protein.

      Minor comments

      *1. The manuscript uses a lot of abbreviations (PCNS, NI, MO, SH3) and lingo that are unclear to a general reader. Please define acronyms when first used, as well as be clear on which model is being used in each experiment.*

      We have corrected this.

      *2. Similarly, the figures are not labeled such that a reader would be able to understand, i.e. MO13 should be Adam13 knockdown etc.*

      We have corrected this in the legend

      *3. Please identify the genes on the heatmap and some highlighted genes from volcano plot from the RNA-seq.*

      The volcano plot is from MS/MS, not RNAseq. We have lists of all of the genes and/or proteins corresponding to each figure in tables.

      We now have a figure from the RNAseq, and a subset of genes of interest is shown.

      *4. Why use the flag tag in Figure 5?*

      We used Flag-tagged constructs to immunoprecipitate only the variants and not the endogenous TFAP2a in these experiments. We also used RFP-Flag to eliminate any protein that bound to the tag or the antibody.

      This figure is no longer in the manuscript.

      *5. Is the data in figure 4A-D the same as Supp. Figure 4A-D?*

      These are independent biological replicates of the same experiment.

      *6. Please italicize gene symbols - e.g. "key transcription factors that exemplify CNC, such as the SOX9, FOXD3, SNAI1, SNAI2, and TFAP2 family".*

      We clearly have missed some; we are using italics for genes and regular type for proteins. It might not be clear in the text when we are referring to genes and when to proteins. We will correct this in the rewrite.

      *7. Please review the manuscript for grammatical and typographical errors.*

      We have used all available software including Word and Grammarly. We will try to improve on the next version.

      **Cross-commenting**

      I think the two reviewers are on the same page on this manuscript.

      Reviewer #2 (Significance (Required)):

      If more solid, would be a conceptual advance in the role of Adam13 in mediating chromatin modification and transcription factors; adds to existing work from this lab; good for a specialized audience. My expertise is in neural crest development, non-mammalian models, epigenetic regulators.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Panday et al seek to determine the function of ADAM13 in regulating histone modifications, gene expression and splicing during cranial neural crest development. Specifically, the authors tested how Adam13, a metalloprotease, could modify chromatin by interaction with Arid3a and Tfap2a and RNA splicing and gene expression. They then utilize knockouts in Xenopus and HEK293T cells followed by immunofluorescence, IPs, BioID, luciferase assays, Mass spec and RNA assays. Although there is some strong data in the BioID and luciferase experiments, the manuscript tells multiple stories, linking together too many things to make a compelling story. The result is a paper that is very difficult to read, with a take-home message that is hard to understand. In addition, some of the conclusions are not supported by the data. This unfortunately means it is not ready for publication. However, I have added below some suggestions that would strengthen the manuscript. My comments are below:

      Major comments:

      1. I think it would be better to split out the chromatin modification function from the splicing in two separate papers. While there is a connection, having it all together makes the story difficult to follow.
      2. The immunofluorescence of H3K9me2/3, in Figure 1, 2, 3 following Adam13 knockdown is not convincing. There seems to be a strong edge effect especially in Figure 2 and 3. Similarly the Arid3a expression in Supp Figure 1 if anything seems increased. It would be better to quantify by western blot and not by fluorescent intensity since it is difficult to determine what a small change in fluorescent intensity means in vivo. Also, it does not say in the text or the figure legend what these are, Xenopus explants of CNC?
      3. The rationale for isolating KMT2A from the other chromatin modifiers in the dataset is not clear. From the RNA-seq in Supp Figure 2 it is not changed as much as likely some of the others. Also, the arrow seems to indicate that it is right above the cutoff. What about other proteins with ATPase activity? That is the top hit in the Dot plot nuclear function. Would be helpful to write out Adam13 cytoplasm/nucleus here.
      4. The splicing information, while interesting, would be better as a different manuscript. The sashimi plot requires more explanation as written. How do you tell if the interactions are changed from this? The authors argue there is a reduction of Tfap2a in Figure 3H but half the explant is not expressing sox9 in the Adam13 knockdown. How is this kind of experiment controlled when measuring areas that don't have any fluorescence because of the nature of the explants?
      5. The use of a germ line Xenopus mutant for Adam13 is great but how were these knockouts validated? More information is required here. The Chip-qPCR has a lot of variability between the samples, especially in the H3K9me2/3. Because these are in cell lines, this should be more consistent. In addition, it is difficult to understand what this means for cranial neural crest cells when assaying in HEK293T cells with the luciferase assay.
      6. The migration assay shows only an example of what it looks like to have defective migration. But it would be better to show control embryos, embryos with Adam13 knockdown and what the rescues look like so the reader can make their own conclusion. The argument from the section above suggests the S1 isoform is the primary one but S3 in this assay also rescues, please explain what this result means since it seems to suggest that even though these isoforms have different activity the function is similar in terms of the ability to rescue defective migration.
      7. The next section again talks about yet another protein Calpain-8. Here the authors use MO13 for luciferase assays instead of HEK293 cells. The authors do not explain why they decided to switch from cells to MO.
      8. The experiment to IP RNA supports only the correlation that Adam9 and Adam13 bind RNA and RNA binding proteins to regulate splicing. This conclusion presented is not supported by the data presented here. While there is a sentence about why Adam9 was chosen here, it would be preferred to focus on Adam13 as the rest of the manuscript is focused on Adam13. The conclusions are generalized to all ADAMs, but ADAM13 and ADAM9 are the only ADAMs investigated in the manuscript

      Minor comments

      1. The manuscript uses a lot of abbreviations (PCNS, NI, MO, SH3) and lingo that are unclear to a general reader. Please define acronyms when first used, as well as be clear on which model is being used in each experiment.
      2. Similarly, the figures are not labeled such that a reader would be able to understand, i.e. MO13 should be Adam13 knockdown etc.
      3. Please identify the genes on the heatmap and some highlighted genes from volcano plot from the RNA-seq.
      4. Why use the flag tag in Figure 5?
      5. Is the data in figure 4A-D the same as Supp. Figure 4A-D?
      6. Please italicize gene symbols - e.g. "key transcription factors that exemplify CNC, such as the SOX9, FOXD3, SNAI1, SNAI2, and TFAP2 family".
      7. Please review the manuscript for grammatical and typographical errors.

      Cross-commenting

      I think the two reviewers are on the same page on this manuscript.

      Significance

      If more solid, would be a conceptual advance in the role of Adam13 in mediating chromatin modification and transcription factors; adds to existing work from this lab; good for a specialized audience. My expertise is in neural crest development, non-mammalian models, epigenetic regulators.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Pandey et al. show that the ADAM13 protein modulates histone modifications in cranial neural crest and that the Arid3a protein binds the Tfap2a promoter in an Adam13-dependent manner and has promoter-specific effects on transcription. Furthermore, they show that the Adam13 and human ADAM9 proteins associate with histone modifiers as well as proteins involved in RNA splicing.

      Although the manuscript is mostly clearly written and the figures well assembled, it reads like a couple of separate and unfinished stories. They show using immunocytochemistry and qPCR that ADAM13 knockouts in CNCs affect histone modifications. Here ChIP-seq or Cut-n-Run experiments would be more appropriate and would result in a more comprehensive understanding of the changes mediated. The immunohistochemistry assays should at least be verified further using western blotting or other more quantitative methods. The authors then show that ADAM13 interacts with a number of histone modifiers such as KDM3B, KDM4B and KMT2A but strangely they do not follow up this interesting observation to map the interactions further (apart from a co-ip with KMT2A), the domains involved, the functional role of the interactions or how they mediate the changes in chromatin modifications.

      The authors then show that ADAM13 affects expression of the TFAP2a gene in a promoter specific manner - affecting expression from S1 but not S2. They further show that ADAM13 affects the binding of the Arid3 transcription factor to the S1-promoter but not to the S3 promoter. However, ADAM13 was present at both promoters. Absence of ADAM13 resulted in increased H3K9me2/3 and decreased H3K4me3 at the S1 promoter whereas only H3K4me3 was changed at the S2 promoter. Unfortunately, they do not show how this is mediated or through which binding elements this takes place. Why is ADAM13 present at both promoters but only affects Arid3 binding at S1? The authors claim that transfecting Arid3a and Adam13 together further increases expression from a reporter (Fig 4E) but this is not substantiated, as no statistical comparison is done between the singly transfected and double transfected cells.

      Then the authors surprisingly start investigating association of proteins with the two isoforms of TFAP2a which in the mind of this reviewer is a different question entirely. They find a number of proteins involved in splicing. And the observation that ADAM13 also interacts with splicing factors is really irrelevant in terms of the story that they are trying to tell. Transcription regulation and splicing are different processes and although both affect the final outcome, mRNA, they need to be investigated separately. The link is at least not very clear from the manuscript. Again, the effects on splicing are not further investigated through functional analysis and as presented the data presented is too open-ended and lacking in clarity.

      Additional points:

      1. In the abstract they propose that the ADAMs may act as extracellular sensors. This is not substantiated by the results.
      2. Page 5, line 16: what is referred to by 6 samples 897 proteins? Were 6 samples analyzed for each condition? The number of repeats for the mass spec analysis is not clear from the text nor are the statistical parameters used to analyse the data. This is also true for the mass spec presented in the part on TFAP2aL-S1 and Adam13 regulate splicing. Statistics and repeats are not presented.
      3. Page 6, line 19: set domain should be SET domain.
      4. The number of repeats in the RNA sequencing of the CNCs is not clear from the text.
      5. The explanation of Figure C is a bit lacking. There are two forms of TFAP2a, L and S, but only one is presented in the figure. Do both forms have the extra S1-3 exons? Also, at the top of the figure it is not clear that the boxes are part of a continuous DNA sequence. Also, it is not clear which codon is not coding.
      6. In the sashimi plot there are green and pink shaded areas. What do they denote? What exactly is lacking in the MO13 mutant - seems that a particular exon is missing suggesting skipping?
      7. Page 11, line 9: „with either MbC or MbC and MO13" needs to be rephrased.
      8. Page 11, line 19: „the c-terminus of....and S3) and" should be „the C-terminus of...and S3 and".
      9. Page 15, line 10: substrateS
      10. Page 16, line 23: the sentence „increases H3K9 to the promoter of the most upstream" needs revision.
      11. Page 26, line 12: Here the authors say: „for two samples two-tail unpaired". What does this mean? Statistics should not be performed on fewer than three samples. The legend to Figure 6 indicates that a t-test was performed on two samples.
      12. The discussion should be shortened and simplified.
      13. Figure 1 legend. How many images were quantitated for each condition?
      14. Figure 2 has a strange order of panels where G is below B.
      15. Figure 6 legend, line 12. „proteins that were significantly enriched in either of the 2 samples" is not very clear. What exactly does this mean?

      Significance

      If the authors follow up on either the transcription-part of the story, or the splicing part of the story, they are likely to have important results to present. However, in the present format the paper is lacking in focus as both issues are mixed together without a clear end-result.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1, point 1: In general, the statistical analysis is not transparent. The size of the sample, i.e. the number of observations or data points, is never specified. This information is essential for further evaluation of the statistical details.

      The size of each sample quantified, given as number of ommatidia/number of retinas, is indicated in the figure legends. This must have escaped the attention of reviewer 1, so we have added a sentence in the legend of Fig. 2 to state it more clearly. We think that the figure legends are the best place to put this information for ease of comparison to the figures.

      *Reviewer 1, point 2: To gain a better understanding of chitin deposition, it would be beneficial to have data on Kkv overexpression in cone cells versus outer pigment cells. Does it cause reb/exp-like effects on chitin deposition and corneal lens formation? Furthermore, can the authors rule out the involvement of chitin synthase 2 in chitin matrix formation and the retention of the matrix in kkv knockdowns? *

      We will generate clones of cells that over-express Kkv in either central cells (cone and primary pigment cells) or lattice cells (secondary and tertiary pigment cells), using the same drivers that we used to over-express Reb, and will examine chitin secretion at 54 h after puparium formation (APF) and in adults.

      As there are no available mutations in Chitin synthase 2 (Chs2), we will knock it down with RNAi in all retinal cells using lGMR-GAL4 and look for corneal lens defects. However, we think that Chs2 is unlikely to contribute chitin to the corneal lens, because its expression is restricted to the digestive system, and because kkv knockdown essentially eliminates chitin from the corneal lens.

      *Reviewer 1, point 3: Recent results published by the authors regarding ZP domain proteins, such as dusky-like (dyl), have not been adequately discussed in the context of chitin secretion and Kkv expression, a matter that must be addressed. It has been demonstrated that dyl mutants do not affect Kkv expression, but chitin levels are reduced. Does Dyl exhibit Kkv-like phenotypes? Furthermore, what is the expression of Dyl or Dumpy in Kkv knockdowns? Is there any interaction between the ZP domain protein matrix and the chitin matrix required for lens formation? *

      In dyl mutants, chitin deposition is delayed, but it does accumulate later in development, so the phenotype is different from kkv mutants. We have clarified this in the manuscript (p. 6). To address the other points, we will examine the expression of Dyl and of Dumpy-YFP in mid-pupal and late pupal retinas in which kkv is knocked down in all cells with lGMR-GAL4. The ZP protein matrix is originally deposited before chitin secretion begins, so we will examine whether loss of chitin affects its later maintenance.

      *Reviewer 1, point 4: What is retained in the chitin matrix if chitin is missing in kkv knockdown? Is it the ZP domain matrix (see the above question) or are the chitin matrix proteins also involved, such as Obst-A, Obst-C (Gasp), Knk and others? Obst proteins are particularly essential for the regular packaging of chitin and thus for the formation of the chitin layer, which is shown in Fig. 1. Beyond this story, it would also be interesting to see how the aforementioned chitin matrix proteins (Obst-A, Obst-C (Gasp), Knk and others) impact lens formation. *

      Adult corneal lenses derived from kkv knockdown retinas do not contain chitin, but there is remaining corneal lens material. We do not think that this is the ZP domain matrix, as this is normally lost in late pupal development, but we will check whether Dpy-YFP is retained in kkv knockdown adults. We will try to detect Obst-A and Gasp proteins using available antibodies. However, this may not be successful, as we have found that antibodies do not penetrate the corneal lens well. Our transcriptomic studies have identified numerous secreted proteins that are expressed at high levels in the mid-pupal retina and could be components of the corneal lens. We may be able to detect some of these using fluorescently tagged forms, but it is possible that the currently available tools will not be sufficient to answer this question.

      We have begun to work on how some of these proteins affect corneal lens structure, but this will take a significant amount of time and we think it would work better as a separate manuscript. We see our current manuscript as a short and focused story about the importance of the source of chitin in determining corneal lens shape.

      *Reviewer 1, minor comment 1: Figure 1 is not easily comprehensible for those who are not already familiar with the subject of eye development. Fig -1A' please label the cone cells and pigment cells. *

      We have labeled these cells in Fig. 1A’’.

      *Reviewer 1, minor comment 2: Fig. 1H - The meaning of the abbreviations and numbers is not given in the legend. It would also be beneficial to include a meaningful cartoon illustrating the corneal lens situation before and after chitin secretion, as shown in Figure 3. *

      We have defined the abbreviations in the figure legend. Fig. 1H did show the corneal lens situation before, during and after chitin secretion, but we have added the cone and pigment cells to the 72 h APF and adult diagrams to make them more meaningful (now Fig. 1I).

      *Reviewer 1, minor comment 3: Fig. 1F: when do the authors recognize a first chitin assembly as the initial corneal lens of the eye, and what does it look like? Chitin expression is high already at 54h APF, which means 20 hours earlier. *

      We think that the reviewer is asking when the chitin first starts to form a dome shape. We have added an orthogonal view of chitin in a 54 h APF retina viewed with LIGHTNING microscopy, showing that the external curvature is already present at this stage (new Fig. 1F).

      *Reviewer 1, minor comment 4: Page 6 / Fig 2E: cells autonomously synthesize chitin and no lateral diffusion. Please label which lens contains chitin and which not *

      Fig. 2E shows part of a retina in which kkv has been knocked down in all cells, so none of the corneal lenses contain chitin. We have clarified this in the legend to Fig. 2.

      *Reviewer 1, minor comment 5: Page 7: The authors state that reb/exp knockdown affects external and internal curvature. However, Fig. S1 statistics does not support this statement. *

      We were referring to the double knockdown, which Fig. 2L, M show is significant, and not to the single knockdowns quantified in Fig. S1. We have clarified this in the text.

      *Reviewer 1, minor comment 6: Fig.2 and Fig. S1: what is Chp (Chaoptin)? *

      We have stated in the legend to Fig. 2 that Chaoptin is a component of photoreceptor rhabdomeres.

      *Reviewer 1, minor comment 7: Fig. S1E,I: which part of the eye is marked by the chitin staining outside the cone and pigment cells? *

      Chitin is still present in the mechanosensory bristles in Fig.S1I, as these do not express lGMR-GAL4. We have stated this in the figure legend.

      *Reviewer 1, minor comment 8: Fig. 2 L,M, Why do exp/reb show different statistical results at outer angle in exp and reb knockdown when compared with the lGMR driver line, although chitin reduction is eliminated in exp knockdown already from 54h APF onwards? *

      The double knockdown of exp and reb has a more significant effect on the adult corneal lens outer angle than the single exp knockdown, even though the exp knockdown lacks chitin at 54 h APF. We believe that this is because Reb is sufficient for some chitin synthesis at later stages of development. This was mentioned in the text (p. 6) and we have added further clarification in the legend to Fig. S1.

      *Reviewer 1, minor comment 9: Fig 3 G-H: please clarify where the chitin reduction can be observed at the edge of adult corneal lens and provide comparable wt stainings. Fig. S2 D. What was the normalization and the sample number? *

      We have added a high magnification image of a mosaic ommatidium with one wild-type and one kkv knockdown edge, showing the region at the edge of the corneal lens in which chitin fluorescence was quantified and the central region used for the normalization (Fig. 3I). The sample numbers are given in the legend to Fig. S2D.

      *Reviewer 1, minor comment 10: Page 6, last paragraph: I fully agree that ZP domain proteins may retain other corneal lens components. But deeper discussion is missing. It should be noted that the authors' hypothesis fits well with the proposed function of the ZP matrix in providing chitin matrix adhesion to the underlying cell surface. A loss of the ZP domain protein Piopio causes loss of the chitin matrix, as shown recently in trachea and at epidermal tendon cells (Göpfert et al., 2025; https://www.sciencedirect.com/science/article/pii/S1742706125003733). Furthermore, a recent publication identifies ZPD proteins as modular units that establish the mechanical environment essential for nanoscale morphogenesis (Itakura et al., https://www.biorxiv.org/content/10.1101/2024.08.20.608778v1.full.pdf). This should be cited and discussed accordingly.*

      *It could be that the outer and inner parts of the chitin differ in ultrastructure due to expression pattern. In dragonfly, surface morphology analysis by scanning electron microscopy revealed that the outer part of corneal lenses consisted of long chitin fibrils with regular arrays of papillary structures, while the smoother inner part had concentric lamellated chitin formation with shorter chitin nanofibrils (Kaya et al., 2016; https://www.sciencedirect.com/science/article/pii/S0141813016303646?via%3Dihub#fig0020). Thus, an ultrastructural analysis would be very beneficial, or at least a detailed discussion.*

      We have added a discussion of these points and papers to the text (p. 6 and 9). Although we are not specifically addressing differences between the inner and outer parts of the corneal lens in this manuscript, we have now included a high-resolution LIGHTNING image showing how the layered structure of the corneal lens is affected when chitin production by central cells is increased (Fig. 4F).

      *Reviewer 2, point 1: Adult corneal lenses lacking chitin still form a thin structure in kkv RNAi. The authors suggest that this may be due to the presence of the ZP domain proteins Dyl, Dpy and Pio. Immunostaining for these ZP domain proteins could provide supporting evidence. *

      To clarify, we meant to say that the earlier presence of the ZP domain matrix could retain components other than chitin in the corneal lens. The ZP domain proteins are no longer present in the adult. We have made this clearer in the text. As described under reviewer 1, points 3 and 4, we will examine Dyl and Dpy-YFP expression in kkv knockdown retinas at mid-pupal and adult stages, and we will also look at the expression of another ZP domain protein, Piopio.

      *Reviewer 2, minor comment 1: At 50 h APF, Kkv (Fig. 2B, B') and Reb (Fig. S1A, A') appear to be expressed at higher levels in lattice cells than in central cells, even though chitin is mainly present in the central cells at this time (Fig. 1B-B'). Discuss possible explanation for their expression pattern and their roles at this stage. *

      We agree that this is a surprising result. We have added a discussion of possible explanations, such as the lack of another component necessary for chitin secretion in lattice cells at this stage, or the presence of high levels of chitinases (p. 7).

      *Reviewer 2, minor comment 2: Fig. 1F and G: Indicate that the cryosection images represent single ommatidia, and label "external" and "internal" to help orient readers. *

      We have made these changes to the figure panels (now G and H), and indicated in the legend that they are single ommatidia.

      *Reviewer 2, minor comment 3: Figure 2. The cartoon diagram showing the angle measurement (currently Fig S1K) should be moved to the main figure to help readers understand the quantifications. *

      We have moved this diagram to Figure 2L.

      *Reviewer 2, minor comment 4: Figure 3H. It would be helpful to clearly mark the edge of the corneal lens in the chitin intensity image. *

      As described under reviewer 1, minor comment 9, we have added a high magnification picture showing the edge region used for chitin quantification (Fig. 3I), which should also address reviewer 2’s concern.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Ghosh and Treisman demonstrates that localized chitin secretion shapes the Drosophila corneal lens. Building on their previous work showing that zona pellucida domain proteins influence corneal lens architecture, which correlates with a delay in chitin deposition, the authors investigate how chitin and its secretion contribute to defining the lens curvature. Through cell-type-specific RNAi and overexpression experiments combined with beautiful imaging and quantifications, they provide convincing evidence that the central cone and primary pigment cells are the principal sources of chitin in the middle region of the corneal lens. Overall, this study offers strong evidence that localized chitin secretion and restricted diffusion underlie the precise shaping of the corneal lens. I have only one major comment and a few relatively minor suggestions to improve clarity.

      Major comments:

      • Adult corneal lenses lacking chitin still form a thin structure in kkv RNAi. The authors suggest that this may be due to the presence of the ZP domain proteins Dyl, Dpy and Pio. Immunostaining for these ZP domain proteins could provide supporting evidence.

      Minor comments:

      • At 50 h APF, Kkv (Fig. 2B, B') and Reb (Fig. S1A, A') appear to be expressed at higher levels in lattice cells than in central cells, even though chitin is mainly present in the central cells at this time (Fig. 1B-B'). Discuss possible explanation for their expression pattern and their roles at this stage.
      • Fig. 1F and G: Indicate that the cryosection images represent single ommatidia, and label "external" and "internal" to help orient readers.
      • Figure 2. The cartoon diagram showing the angle measurement (currently Fig S1K) should be moved to the main figure to help readers understand the quantifications.
      • Figure 3H. It would be helpful to clearly mark the edge of the corneal lens in the chitin intensity image.

      Significance

      This study provides novel insights into how the differential secretion of a polysaccharide determines the curvature of a complex optical structure. The elegant use of cell-type-specific genetic manipulations, together with high-quality imaging and rigorous quantification is the key strength.

      The study advances our understanding of how chitin secretion and limited diffusion shape apical ECM structures during tissue morphogenesis. It also extends findings from the tracheal and cuticular chitin systems into a new optical context.

      The manuscript will be of interest to developmental biologists, particularly those studying epithelial tissue morphogenesis and apical ECM organization.

      I have expertise in Drosophila epithelial morphogenesis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Chitin plays a crucial role in the morphogenesis of the Drosophila corneal lens by supporting the structural integrity and biconvex shape of the lens. The Drosophila corneal lens is a biconvex structure that focuses light. Chitin, a major component, is produced mainly by the central cone and primary pigment cells. The production and arrangement of chitin by central cells directly impacts the thickness and curvature of the lens. Adequate chitin secretion is necessary to ensure the correct shape and function of the corneal lens, while disturbances in chitin production can lead to deformed lenses. Blocking chitin synthesis leads to a significant reduction in chitin deposition in the corneal lens, resulting in a thinner and deformed lens. In particular, the corneal lens shows reduced outer and inner curvature, which compromises its biconvex shape. These changes in chitin production and arrangement result in abnormal morphology of the corneal lens in the adult stage. The key messages of the paper's results are: 1.) The Drosophila corneal lens is a biconvex structure that focuses light. 2.) Chitin, a significant component, is produced mainly by central cells (cone and primary pigment cells). 3.) Downregulation of the chitin synthase gene krotzkopf verkehrt (kkv) reduces lens thickness and curvature. 4.) Overexpression of Rebuf increases chitin secretion and lens thickness. 5.) Localized chitin secretion is crucial for the typical shape of the corneal lens.

      Comments

      Main comments

      The manuscript provides an exciting insight into how the formation of the lens is regulated by the secretion of chitin. However, the data set appears to have shortcomings that must be considered for the next steps.

      1.) In general, the statistical analysis is not transparent. The size of the sample, i.e. the number of observations or data points, is never specified. This information is essential for further evaluation of the statistical details.

      2.) To gain a better understanding of chitin deposition, it would be beneficial to have data on Kkv overexpression in cone cells versus outer pigment cells. Does it cause reb/exp-like effects on chitin deposition and corneal lens formation? Furthermore, can the authors rule out the involvement of chitin synthase 2 in chitin matrix formation and the retention of the matrix in kkv knockdowns?

      3.) Recent results published by the authors regarding ZP domain proteins, such as dusky-like (dyl), have not been adequately discussed in the context of chitin secretion and Kkv expression; this must be addressed. It has been demonstrated that dyl mutants do not affect Kkv expression, but chitin levels are reduced. Does Dyl exhibit Kkv-like phenotypes? Furthermore, what is the expression of Dyl or Dumpy in Kkv knockdowns? Is there any interaction between the ZP domain protein matrix and the chitin matrix required for lens formation?

      4.) What is retained in the chitin matrix if chitin is missing in kkv knockdown? Is it the ZP domain matrix (see the above question) or are the chitin matrix proteins also involved, such as Obst-A, Obst-C (Gasp), Knk and others? Obst proteins are particularly essential for the regular packaging of chitin and thus for the formation of the chitin layer, which is shown in Fig. 1. Beyond this story, it would also be interesting to see how the aforementioned chitin matrix proteins impact lens formation.

      Minor comments:

      Page 6: Figure 1 is not easily comprehensible for those who are not already familiar with the subject of eye development.

      Fig. 1A': Please label the cone cells and pigment cells.

      Fig. 1H: The meaning of the abbreviations and numbers is not given in the legend. It would also be beneficial to include a meaningful cartoon illustrating the corneal lens situation before and after chitin secretion, as shown in Figure 3.

      Fig. 1F: When do the authors recognize a first chitin assembly as the initial corneal lens of the eye, and what does it look like? Chitin expression is already high at 54 h APF, which is 20 hours earlier.

      Page 6 / Fig. 2E: Cells synthesize chitin autonomously and there is no lateral diffusion. Please label which lens contains chitin and which does not.

      Page 7: The authors state that reb/exp knockdown affects external and internal curvature. However, the statistics in Fig. S1 do not support this statement.

      Fig.2 and Fig. S1: what is Chp (Chaoptin)?

      Fig. S1E,I: which part of the eye is marked by the chitin staining outside the cone and pigment cells?

      Fig. 2L,M: Why do the exp and reb knockdowns show different statistical results for the outer angle when compared with the IGMR driver line, although chitin reduction is eliminated in the exp knockdown already from 54 h APF onwards?

      Fig. 3G-H: Please clarify where the chitin reduction can be observed at the edge of the adult corneal lens and provide comparable wt stainings.

      Fig. S2D: What was the normalization and the sample number?

      Page 6, last paragraph: I fully agree that ZP domain proteins may retain other corneal lens components, but a deeper discussion is missing. It should be noted that the authors' hypothesis fits well with the proposed function of the ZP matrix in providing chitin matrix adhesion to the underlying cell surface. A loss of the ZP domain protein Piopio causes loss of the chitin matrix, as shown recently in the trachea and at epidermal tendon cells (Göpfert et al., 2025; https://www.sciencedirect.com/science/article/pii/S1742706125003733). Furthermore, a recent publication identifies ZPD proteins as modular units that establish the mechanical environment essential for nanoscale morphogenesis (Itakura et al., https://www.biorxiv.org/content/10.1101/2024.08.20.608778v1.full.pdf). This should be cited and discussed accordingly.

      It could be that the outer and inner parts of the chitin differ in ultrastructure due to the expression pattern. In the dragonfly, surface morphology analysis by scanning electron microscopy revealed that the outer part of the corneal lenses consisted of long chitin fibrils with regular arrays of papillary structures, while the smoother inner part had a concentric lamellated chitin formation with shorter chitin nanofibrils (Kaya et al., 2016; https://www.sciencedirect.com/science/article/pii/S0141813016303646?via%3Dihub#fig0020). Thus, an ultrastructural analysis would be very beneficial, or at least a detailed discussion.

      Significance

      The manuscript's strength and most important aspects are the genetic, expression, and localization studies of chitin under the control of the chitin synthase kkv, reb and exp in the Drosophila pupal and adult eye. However, beyond this manuscript, the development of mechanistic details, such as interaction partners that trigger secretion and action at the ZP matrix and adjacent apical membranes, will be interesting.

      The manuscript uses nice genetic tools to describe the differences in chitin secretion in the Drosophila eye and their specific impact on corneal lens formation. Such a precise molecular analysis has not been performed before in insects. Therefore, the study substantially extends knowledge about the role of chitin synthases and chitin secretion in the insect eye.

      The audience will not only be those rather specialized in basic research in zoology, developmental biology, and cell biology who are interested in how chitin synthases produce chitin. As chitin is relevant to material research and to medical and immunological aspects, the manuscript will be interesting beyond the specific field and thus for a broader audience.

      I'm working on chitin in the tracheal system and epidermis in Drosophila.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Overall, the authors show an interesting and conclusive work on the activation of ERM proteins upon TBXA2R signaling. The use of the ebBRET biosensor to assess ERM-protein activation enables elegant investigation of activation modalities. The Thromboxane A2 analogue U46619 robustly shows activation of ERM proteins in ebBRET assays as well as an increase in ERM-protein phosphorylation status. The functional effects of this signaling pathway are shown convincingly for moesin, where moesin mediates a TBXA2R-mediated increase in cell motility, invasion and metastasis of triple-negative breast cancer Hs578 cells in vitro and in vivo. Nonetheless, some points need to be clarified.

      Significance

      Comment 1: In the title, the authors state that ERM activation via TBXA2R controls invasion and motility of triple-negative breast cancer cells. In the manuscript, there are only data supporting this assumption for moesin (MSN). Therefore, the authors need to change the title accordingly or provide additional experiments for the other two ERM proteins, radixin and ezrin. Throughout the experiments, the p-ERM antibody is used to measure ERM-protein activation. Since the effects on invasion and motility observed in Hs578 cells are mainly mediated through moesin, it would be necessary to see, at least for one experiment per cell line (HEK293T, Hs578), the detailed phosphorylation status of ezrin, radixin and moesin separately. As there are specific phospho-detecting antibodies for this case, this could be done rather easily. Furthermore, showing a specific increase of phosphorylated moesin would support the functional data shown in Figures 5 and 6. To investigate the functional effect of TBXA2R-mediated activation of ezrin and radixin on cell motility and invasion, similar experiments could be done in e.g. HMC-1-8 breast cancer cells (high ezrin expression) and HCC1187 (high radixin expression).

      Comment 2: Figure 1A, C, D: At 100 nM, the concentration of staurosporine is relatively high for kinase inhibition. It would be informative to see the assay with increasing staurosporine concentrations, e.g. from 1 nM to 50 nM. In general, a concentration of 1-10 nM should be sufficient for kinase inhibition, preventing unspecific effects of the drug.

      Comment 3: The citation for the p-ERM antibody is confusing, as there is only p-Moe used in the cited paper (Roubinet, 2011). There is a p-ERM antibody commercially available (Cell Signaling, Phospho-ezrin (Thr567)/radixin (Thr564)/moesin (Thr558) Antibody #3141). Could you clarify which antibody you are using?

      Comment 4: From the inhibitor experiments using C3 transferase toxin (Figure 2), the authors conclude that RhoA plays a role in TBXA2R-mediated ERM activation. As mentioned in the manufacturer's description, C3 toxin inhibits RhoA, RhoB and RhoC. Therefore, it would be necessary to repeat those experiments under RhoA knockdown conditions (e.g. using an siRNA-based approach) to state that specifically RhoA is involved.

      Comment 5: To assess whether the findings in Figures 5 and 6 are due to the higher moesin expression in Hs578 cells or are linked to a specific function of moesin, a re-expression experiment would be informative. To achieve this, the 2D and 3D migration experiments could be redone after re-expression of moesin, ezrin and radixin separately in moesin knockdown conditions.

      Minor comments:

      • Even though U46619 is a known Thromboxane A2 analogue, including negative and positive controls would strengthen the results. In detail, this could be done by showing a known protein that is phosphorylated downstream of TBXA2R signaling and a protein that is not affected by this signaling pathway, alongside the shown effects on ERM proteins.
      • Figure 1 J: There are no statistics comparing the conditions of SQ-29548-treated cells in the presence/absence of U46619; these should be added.
      • Figure 1 G, H: How was the quantification for cell periphery performed? In detail, how were the thresholds set for cell periphery / not cell periphery?
      • Figure 3 H:
        • The labelling indicating presence of U46619 is missing.
        • Also, what is the rationale behind normalizing MB-453 for 3 cell lines and comparing the BT-549 to MB-157?
      • Suppl. Fig 4 D: Define the y-axis better. Absorbance at what wavelength?
      • Define FERM and ERMAD abbreviations in introduction.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The Ezrin, radixin, and moesin (ERM) family of proteins orchestrate morphological changes that potentiate metastatic invasion in cancer cells. In this study, Leguay et al. identify the GPCR, TBXA2R, as a key activator of the ERM proteins which promotes motility and invasion in triple-negative breast cancer (TNBC) cells. Using BRET-based sensors developed by them previously for monitoring the activation of ERM proteins and building upon their previous findings on the role of the small GTPase RhoA in the activation of ERM proteins, the authors carefully dissect the molecular pathway leading to the activation of ERM proteins upon stimulation of the TBXA2R. The authors also establish the pathological relevance of the pathway in TNBC using in vitro and in vivo models, opening up possibilities for targeting this pathway in cancer cells. Overall, the study is well-conceived and executed, and the results are clearly described and presented in the manuscript. However, the following comments must be addressed before publication.

      Major comments

      Fig 1C - Why was p-ERM normalized over Ezrin and not ERM? It would be more appropriate and consistent to normalize against the ERM signal as done in other experiments in the manuscript.

      Fig 1E and S3C - The levels of total ERM also seem to change with increasing treatment times. This must be clarified and discussed in the manuscript.

      Fig 1F - Why is the mean of all three independent experiments not presented here as in S3C?

      Fig 2E - Though SLK seems to play a dominant role in the phosphorylation of ERM in HEK293T cells, the depletion of LOK also substantially reduces the phosphorylation of ERM in the representative figure (Fig 2E), which is not reflected in the quantification (Fig 2F). Indeed, both SLK and LOK seem to be equally crucial in Hs578T cells (Fig 4I), unlike the conclusion here. The authors must check if the quantifications were affected by any white spots in the blot for total ERM as seen in the representative figure. If necessary, the authors must include additional replicates, and the model in Fig 2G should be updated accordingly. If the contributions of LOK are indeed quite minimal in HEK293T cells, then the difference in Hs578T cells must be adequately highlighted and discussed rather than broadly mentioning similar results were observed in both cell lines. The discussion mentions that SLK kinases are the only kinases needed for ERM activation, which conflicts with findings from Hs578T cells, where both SLK and LOK contribute to ERM phosphorylation (Fig 4I). The authors should revise this to reflect their data accurately.

      Minor comments

      FigS3B should cite the source dataset and not just the database. Also, details of how the extracted data was processed (if any) should be described clearly.

      When multiple treatments are involved (e.g. U46619 and staurosporine), the exact sequence of treatments and the overlap in timings of different treatments must be clearly mentioned, e.g. Fig 1A and 1C.

      There are a few grammatical errors which need to be fixed, e.g. paragraph 2 in the second section of the results: "We next aimed to identify" (not "identifying") "which kinase(s) acts downstream of TBXA2R".

      Significance

      Triple-negative breast cancer, which is characterized by a lack of estrogen, progesterone or HER2 receptors, is a highly metastatic and aggressive form of breast cancer with poor prognosis. Currently, there are fewer treatment options than for other types of invasive breast cancer. The current study opens up the possibility of targeting the TBXA2R or the downstream signalling components, which are still expressed in TNBC cells. However, certain TNBC subtypes express low levels of p-ERM and TBXA2R (Fig 3E, 3F), indicating a minor role for the TBXA2R pathway, and targeting this pathway in these subtypes may be inefficient. In addition, certain subtypes express high p-ERM and low TBXA2R, indicating alternative pathways for ERM activation. Currently, it is not clear which other GPCRs can contribute to ERM activation by engaging similar downstream effectors. A comprehensive screening of different GPCR antagonists could identify alternative strategies to target ERM-mediated metastasis in TNBC cells that show low expression of TBXA2R.

      Audience: The manuscript is relevant to a broad audience, especially to cell biologists, cancer biologists and clinical scientists.

      The reviewer's field of expertise includes cell signaling, gene expression, and RNA biology in mammalian systems. Moderate expertise in cancer biology. Limited knowledge of histopathological analysis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Leguay et al present an interesting and logical series of studies that investigate the activity and signaling of the GPCR TBXA2R in TNBC cells. The premise of the overall study is that metastasis is often associated with a more invasive/motile cancer cell phenotype. The investigators have an interest in ERM (Ezrin, Radixin, Moesin) proteins, which have been implicated in cell motility. The authors link stimulation of TBXA2R, a GPCR, to activation of ERM proteins and also show that TBXA2R is associated with worse outcome in TNBC patients. Through the use of genetic and pharmacologic tools the authors provide convincing biochemical and cell based data to support their model that stimulation of TBXA2R activates Gα11 & Gα12/13, which subsequently stimulate RhoA and SLK/LOK, which then phosphorylate ERMs. The authors show relevant biologic consequences of the pathway. Data include orthogonal assays with similar results, the manuscript is written clearly, and the data are displayed well. Overall it is a solid story that is largely well done. There are a few comments that should be addressed.

      Comments:

      1. All the biochemical/cell based in vitro data exploit the use of small molecule agonists of TBXA2R, not the natural ligand. A comment on this and why use of TXA2 is not feasible would be helpful to the reader.
      2. The data in figures 1-5 are solid and clear. However, I suggest adding a higher magnification inset for the IHC images shown in Fig 3E. It would be useful to be able to distinguish cells in the IHC; a higher mag shot should suffice.
      3. A) The use of Hs578t cells for the in vivo modeling is unfortunate. Additionally, the use of iv injection in a study focused on cell invasion is also unfortunate. The metastatic propensity of Hs578t is not clear; in fact, a recent report comparing metastasis in breast cancer cell lines shows that Hs578t perform poorly in terms of metastasis after orthotopic injection (see PMID 38468326). I searched the literature a bit to try and find other examples of iv injection of Hs578t cells and found 1 (PMID:27654855, I did not search exhaustively); this paper shows significant lung metastasis and does not mention liver metastases. Were other breast cancer cells investigated for the in vivo studies?

      B) I was interested because the typical organ that is seeded post iv injection is the lungs (as seen in the above ref); liver metastases post iv injection are not common, especially with breast cancer cells. What did the lungs look like in your experiments?

      C) Further, while the data presented in figure 6 are supportive of the overall conclusions, the data are modest at best in terms of metastatic burden. Repetition of the experiment using a breast cancer cell line injected orthotopically would likely be more useful in highlighting the importance of the pathway to metastasis. I understand performing an orthotopic assay may be outside the scope of the study, but it would provide greater impact given the focus of the paper on cell invasion.

      Cross-commenting

      I think reviewer comments are generally aligned. I was least critical but appreciate the concerns of the other reviewers, especially rev #1, who requested additional validation and controls. In my opinion the in vivo studies are not robust; I expect that is due to cell line choice. Repetition of the in vivo study with a breast cancer cell line that is capable of metastasis (from a primary tumor) would be more effective.

      Significance

      The manuscript presents a solid, logical flow and the biochemical/cell based in vitro data are clean. Clear differences between groups, appropriate controls, and displayed effectively.

      The challenge is the in vivo study. IV injection of cancer cells is a valid model for seeding and growing in a target organ BUT it does not reflect cell invasion, which is typically thought of as a step that occurs earlier in the metastatic cascade. That said, the data are supportive of the conclusions but not necessarily consistent with expected results based on iv injection of this cell line. A caveat is that the cell line used is characterized as having metastatic characteristics in vitro but is not a consistent metastatic line in vivo. The recommendation is to perform a new in vivo experiment. An orthotopic injection of a strongly metastatic cell line, such as MDA MB 231 or other (see paper ref above), would be a more stringent and accurate test of the importance of the pathway to cell invasion in vivo.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript investigates the role of the thromboxane A2 receptor (TBXA2R) in activating ERM (ezrin, radixin, and moesin) proteins to promote cell motility and invasion in triple-negative breast cancer (TNBC) cells. Using TBXA2R stimulation and a series of in vitro and in vivo experiments, the authors report that ERM activation is mediated through a TBXA2R signaling pathway involving Gαq/11 and Gα12/13 subunits, RhoA, and SLK/LOK kinases. They propose that this pathway enhances cell migration, invasion, and metastatic potential in TNBC.

      General criticisms

      Experimental design and analyses are adequate, even though certain experiments lack appropriate controls or employ the wrong statistical tests. However, the study primarily relies on a single TNBC cell line and heavy use of overexpression systems and/or small molecule inhibitors, raising concerns about the generalizability and specificity of the findings. Furthermore, several conclusions appear premature and unsupported by the current data. Critical controls and additional validation experiments are necessary to support the claims about the role of TBXA2R in metastasis and to justify the strong mechanistic conclusions drawn.

      Specific criticisms

      Figure 1

      TBXA2R expression should be shown to understand whether different ebBRET signals are dependent on the overexpression levels of TBXA2R.

      E-F: As ERM levels change over time, one would like to understand whether this is due to misloading or whether there is an underlying biological event going on in the stimulated cells. Are total ERM levels really changing over time? Please add a blot for 1-2 housekeeping proteins as loading controls. This is also crucial to clarify the kinetics of ERM activation; such notable intensity variations make quantifications of non-linear WB signals not fully reliable. In F, mean and SD should be plotted.

      G: The authors need to use a PM marker if they want to claim that pERM increases at the cell cortex. TBXA2R localization should also be shown.

      Figure 2

      A: This reviewer cannot see the purported partial inhibition in Ga12/13 KO cells. Are differences between the two KOs significant? Furthermore, there are reports indicating that YM-254890 may not be specific for Gaq. Experiments on double KO cells are needed to assess the possible redundancy between the two Ga subfamilies. C-D: it is important to add a positive control for the activity of Y-27632 in these experiments. Please show that a ROCK-dependent effect is inhibited in the treated cells. G: The working model is premature as it is unknown whether ROCKi was active. While asking for ROCK1/2 KO cells would be too much, this claim is far-fetched.

      Figure 3

      B: In the legend, it is not clear what the grey and light red colours mark. E-F: This reviewer finds it difficult to believe that p-ERM and TBXA2R signal intensities at the cell cortex could be reliably quantified using IHC images. The representative samples would indicate that p-ERM and TBXA2R positivity are not correlated. It would be crucial to show examples for each of the TNBC subgroups, the existence of which is inferred based on p-ERM and TBXA2R staining. The conclusion that "no TNBC samples exhibited high TBXA2R expression and low levels of p-ERMs, further supporting a role for TBXA2R signalling in ERM activation in TNBC" is an overstatement.

      Figure 4

      The authors wrote that "We focused on the Hs578T cell line, which showed a median level of TBXA2R mRNA expression among the six TNBC cell lines tested". I do not understand the rationale for it as anti-TBXA2R antibodies detecting endogenous TBXA2R are available and thus why not use the median protein levels?

      Figure 5

      Effects of the knockouts are subtle, and rescue experiments would be needed to corroborate these results. The employed statistical analysis is prone to overestimating differences. The authors should use the superplots instead. The authors might also decide to use other TNBC cell lines to explore the functional relevance of this pathway in BC progression. This is particularly important because Hs578T are poorly tumorigenic, and they often do not form palpable tumours in mice.

      Figure 6

      The fact that Hs578T are poorly tumorigenic in mice is likely the reason why the authors used the experimental metastasis model. However, it is puzzling that metastases were studied in the liver but not in the lungs. Furthermore, the whole approach is rather artefactual as the TBXA2R agonist was administered for the entire duration of these experiments. What is the pathological relevance of such a study? Including a spontaneous metastasis model or alternative TNBC lines that mimic human disease more closely would help strengthen the functional relevance of this pathway in BC progression and study's translational relevance.

      Figure S2

      B-M: the pERM signal appears to be perinuclear in some of the tested cell lines. Please use a PM marker.

      Figure S3

      The authors should use the superplots to analyse the cell migration data.

      Discussion

      The claim that "our findings demonstrated that kinases of the SLK family are the only kinases needed for ERM activation by TBXA2R" should be toned down as only 2 cell lines were tested. In this section, the authors should also discuss the proposed pro-metastatic functions of TXA2 and TXA2R in more detail, including vascular permeability. The sweeping conclusion that "TBXA2R expression correlates with phosphorylation and activation of ERMs in TNBC patient samples" clashes with the authors' own results; please stick to the data.

      Concluding remarks

      This study investigates a signaling pathway whereby TBXA2R, through ERM activation, enhances the migratory and invasive potential of TNBC cells. However, several improvements are needed to support the main claims. The dependence on a single TNBC cell line, reliance on pharmacological inhibitors with potential off-target effects, and limited in vivo relevance detract from the generalizability of the findings. Additional TNBC models, adequate controls, and a broader focus on natural metastasis patterns would make the conclusions more compelling. Moderating certain overstated claims would be needed to align the interpretations with the actual data.

      Cross-commenting

      I found comments in the other reviewers' reports that align with my criticisms on the mouse experiments as well as with those pertaining to the tissue culture work.

      Significance

      General comments

      The manuscript investigates the role of TBXA2R in the regulation of ERM in the context of TNBC metastasis. Much of this TBXA2R signalling axis is already known, as is the fact that SLK and LOK can phosphorylate ERM in other cell systems. Similarly, the positive role of ERM in cell migration/invasion and cancer progression has long been reported. The somewhat unexpected finding that ERM phosphorylation is independent of ROCK remains not fully convincing. The BC-related part is problematic as the continuous administration of a TBXA2R agonist is required for key tumour metrics to show some differences in vivo. This calls into question the main conclusion of the work, namely that the TBXA2R/ERM-dependent pathway is activated during BC progression in TNBC cells.

      Audience

      Specialists interested in GPCRs and signal transduction or in the cytoskeleton.

      Expertise

      Rev: cancer cell biology, signal transduction, cytoskeleton, actin biochemistry, multiplexed imaging, mouse model of human diseases.

      Co-rev: nanoparticles, cell biology.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Overall, the authors show an interesting and conclusive work on the activation of ERM proteins upon TBXA2R signaling. The use of the ebBRET biosensor to assess ERM-protein activation enables elegant investigation of activation modalities. The Thromboxane A2 analogue U46619 robustly shows activation of ERM proteins in ebBRET assays as well as an increase in ERM-protein phosphorylation status. The functional effects of this signaling pathway are shown convincingly for moesin, where moesin mediates an TBXA2R mediated increase in cell motility, invasion and metastasis of triple-negative breast cancer Hs578 cells in vitro and in vivo. Nonetheless, some points need to be clarified.

      Significance

      Comment 1: In the title the authors state, that ERM-activation via TBXA2R is controlling invasion and motility of triple-negative breast cancer cells. In the manuscript, there is only data supporting this assumption for moesin (MSN). Therefore, the authors need to change the title accordingly or support additional experiments for the other two ERM-proteins radixin and ezrin. Throughout the experiments, the p-ERM antibody is used to measure ERM-protein activation. Since the effects on invasion and motility observed in Hs578 cells are mainly mediated through moesin, it would be necessary to see, at least for one experiment per cell line (HEK293T, Hs578) the detailed phosphorylation status of ezrin, radixin and moesin separately. As there are specific, phospho-detecting antibodies for this case, this could be done rather easy. Furthermore, showing specific increase of phosphorylated moesin would support the functional data shown in Figure 5 and 6. To investigate the functional effect of TBXA2R mediated activation of ezrin and radixin on cell motility and invasion, similar experiments could be done in e.g. HMC-1-8 breast cancer cells (high ezrin expression) and HCC1187 (high radixin expression).

      Comment 2: Figure 1A, C, D: The concentration of staurosporine is with 100 nM relatively high for kinase inhibition. It would be informative to see the assay with increasing staurosporine concentrations, e.g. from 1 nM to 50 nM. In general, a concentration of 1-10 nM should be sufficient for kinase inhibition, preventing unspecific effects of the drug.

      Comment 3: The citation for the p-ERM antibody is confusing, as there is only p-Moe used in the cited paper (Roubinet, 2011). There is a p-ERM antibody commercially available (Cell Signaling, Phospho-ezrin (Thr567)/radixin (Thr564)/moesin (Thr558) Antibody #3141). Could you clarify which antibody you are using?

      Comment 4: From the inhibitor experiments using C3 transferase toxin (Figure 2), the authors conclude that RhoA plays a role in TBXA2R mediated ERM activation. As mentioned in the manufacturer's description, C3 toxin is inhibiting RhoA, RhoB and RhoC. Therefore, it would be necessary to repeat those experiments under RhoA knockdown conditions (e.g. using an siRNA-based approach) to state that specifically RhoA is involved.

      Comment 5: To assess, if the findings in Figure 5 and 6 are due to the higher moesin expression in Hs578 cells or are linked to a specific function of moesin, a re-expression experiment would be informative. To achieve this, the 2D and 3D migration experiments could be redone after re-expression of moesin, ezrin and radixin separately in moesin knockdown conditions.

      Minor comments:

      • Even though U46619 is a known Thromboxane A2 analogue, including negative and positive controls would strengthen the results. In detail, this could be done by showing a known protein which gets phosphorylated downstream of TBXA2R signaling and a protein which is not affected by this signaling pathway alongside the shown effects on ERM-proteins.
      • Figure 1 J: There are no statistics comparing the conditions of SQ-29548-treated cells in the presence/absence of U46619; these should be added.
      • Figure 1 G, H: How was the quantification for cell periphery performed? In detail, how were the thresholds set for cell periphery / not cell periphery?
      • Figure 3 H:
        • The labelling indicating presence of U46619 is missing.
        • Also, what is the rationale behind normalizing MB-453 for 3 cell lines and comparing the BT-549 to MB-157?
      • Suppl. Fig 4 D: Define the y-axis better. Absorbance at what wavelength?
      • Define FERM and ERMAD abbreviations in introduction.
    6. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The ezrin, radixin, and moesin (ERM) family of proteins orchestrates morphological changes that potentiate metastatic invasion in cancer cells. In this study, Leguay et al. identify the GPCR TBXA2R as a key activator of the ERM proteins, which promotes motility and invasion in triple-negative breast cancer (TNBC) cells. Using BRET-based sensors developed by them previously for monitoring the activation of ERM proteins and building upon their previous findings on the role of the small GTPase RhoA in the activation of ERM proteins, the authors carefully dissect the molecular pathway leading to the activation of ERM proteins upon stimulation of TBXA2R. The authors also establish the pathological relevance of the pathway in TNBC using in vitro and in vivo models, opening up possibilities for targeting this pathway in cancer cells. Overall, the study is well-conceived and executed, and the results are clearly described and presented in the manuscript. However, the following comments must be addressed before publication.

      Major comments

      Fig 1C - Why was p-ERM normalized to ezrin and not to ERM? It would be more appropriate and consistent to normalize against the ERM signal, as done in other experiments in the manuscript.

      Fig 1E and S3C - The levels of total ERM also seem to change with increasing treatment times. This must be clarified and discussed in the manuscript.

      Fig 1F - Why is the mean of all three independent experiments not presented here as in S3C?

      Fig 2E - Though SLK seems to play a dominant role in the phosphorylation of ERM in HEK293T cells, the depletion of LOK also substantially reduces the phosphorylation of ERM in the representative figure (Fig 2E), which is not reflected in the quantification (Fig 2F). Indeed, both SLK and LOK seem to be equally crucial in Hs578T cells (Fig 4I), unlike the conclusion here. The authors must check if the quantifications were affected by any white spots in the blot for total ERM as seen in the representative figure. If necessary, the authors must include additional replicates, and the model in Fig 2G should be updated accordingly. If the contributions of LOK are indeed quite minimal in HEK293T cells, then the difference in Hs578T cells must be adequately highlighted and discussed rather than broadly mentioning similar results were observed in both cell lines. The discussion mentions that SLK kinases are the only kinases needed for ERM activation, which conflicts with findings from Hs578T cells, where both SLK and LOK contribute to ERM phosphorylation (Fig 4I). The authors should revise this to reflect their data accurately.

      Minor comments

      Fig S3B should cite the source dataset and not just the database. Also, details of how the extracted data were processed (if at all) should be described clearly.

      When multiple treatments are involved (e.g. U46619 and staurosporine), the exact sequence of treatments and the overlap in timings of different treatments must be clearly mentioned, e.g. in Fig 1A and 1C. There are a few grammatical errors which need to be fixed, e.g. in paragraph 2 of the second section of the results: We next aimed to identify (not identifying) which kinase(s) acts downstream of TBXA2R

      Significance

      Triple-negative breast cancer, which is characterized by a lack of estrogen, progesterone or HER2 receptors, is a highly metastatic and aggressive form of breast cancer with poor prognosis. Currently, there are fewer treatment options than for other types of invasive breast cancer. The current study opens up the possibility of targeting TBXA2R or the downstream signalling components, which are still expressed in TNBC cells. However, certain TNBC subtypes express low levels of p-ERM and TBXA2R (Fig 3E, 3F), indicating a minor role for the TBXA2R pathway, and targeting this pathway in these subtypes may be inefficient. In addition, certain subtypes express high p-ERM and low TBXA2R, indicating alternative pathways for ERM activation. Currently, it is not clear which other GPCRs can contribute to ERM activation by engaging similar downstream effectors. A comprehensive screening of different GPCR antagonists could identify alternative strategies to target ERM-mediated metastasis in TNBC cells that show low expression of TBXA2R.

      Audience

      The manuscript is relevant to a broad audience, especially to cell biologists, cancer biologists and clinical scientists.

      The reviewer's field of expertise includes cell signaling, gene expression, and RNA biology in mammalian systems. Moderate expertise in cancer biology. Limited knowledge of histopathological analysis.

    7. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Leguay et al. present an interesting and logical series of studies that investigate the activity and signaling of the GPCR TBXA2R in TNBC cells. The premise of the overall study is that metastasis is often associated with a more invasive/motile cancer cell phenotype. The investigators have an interest in ERM (Ezrin, Radixin, Moesin) proteins, which have been implicated in cell motility. The authors link stimulation of TBXA2R, a GPCR, to activation of ERM proteins and also show that TBXA2R is associated with worse outcome in TNBC patients. Through the use of genetic and pharmacologic tools, the authors provide convincing biochemical and cell-based data to support their model that stimulation of TBXA2R activates Gαq/11 and Gα12/13, which subsequently stimulate RhoA and SLK/LOK, which then phosphorylate ERMs. The authors show relevant biologic consequences of the pathway. Data include orthogonal assays with similar results, the manuscript is written clearly, and the data are displayed well. Overall it is a solid story that is largely well done. There are a few comments that should be addressed.

      Comments:

      1. All the biochemical/cell-based in vitro data exploit the use of small molecule agonists of TBXA2R, not the natural ligand. A comment on this and on why use of TXA2 is not feasible would be helpful to the reader.
      2. The data in figures 1-5 are solid and clear. However, I suggest adding a higher magnification inset for the IHC images shown in Fig 3E. It would be useful to be able to distinguish cells in the IHC, a higher mag shot should suffice.
      3. A) The use of Hs578T cells for the in vivo modeling is unfortunate. Additionally, the use of iv injection in a study focused on cell invasion is also unfortunate. The metastatic propensity of Hs578T is not clear; in fact, a recent report comparing metastasis in breast cancer cell lines shows that Hs578T cells perform poorly in terms of metastasis after orthotopic injection (see PMID 38468326). I searched the literature a bit to try to find other examples of iv injection of Hs578T cells. I found one (PMID:27654855; I did not search exhaustively); this paper shows significant lung metastasis and does not mention liver metastases. Were other breast cancer cells investigated for the in vivo studies?

      B) I was interested because the typical organ that is seeded post iv injection is the lungs (as seen in the above ref); liver metastases post iv injection are not common, especially with breast cancer cells. What did the lungs look like in your experiments?

      C) Further, while the data presented in figure 6 are supportive of the overall conclusions, the data are modest at best in terms of metastatic burden. Repetition of the experiment using a breast cancer cell line injected orthotopically would likely be more useful in highlighting the importance of the pathway to metastasis. I understand performing an orthotopic assay may be outside the scope of the study, but it would provide greater impact given the focus of the paper on cell invasion.

      Cross-commenting

      I think reviewer comments are generally aligned. I was least critical but appreciate the concerns of the other reviewers, especially rev #1 who requested additional validation and controls. In my opinion in vivo studies are not robust, I expect that is due to cell line choice. Repetition of the in vivo study with a breast cancer cell line that is capable of metastasis (from a primary tumor) would be more effective.

      Significance

      The manuscript presents a solid, logical flow and the biochemical/cell based in vitro data are clean. Clear differences between groups, appropriate controls, and displayed effectively.

      The challenge is the in vivo study. IV injection of cancer cells is a valid model for seeding and growing in a target organ, BUT it does not reflect cell invasion, which is typically thought of as a step that occurs earlier in the metastatic cascade. That said, the data are supportive of the conclusions but not necessarily consistent with expected results based on iv injection of this cell line. A caveat is that the cell line used is characterized as having metastatic characteristics in vitro but is not a consistently metastatic line in vivo. The recommendation is to perform a new in vivo experiment. An orthotopic injection of a strongly metastatic cell line, such as MDA-MB-231 or another (see the paper referenced above), would be a more stringent and accurate test of the importance of the pathway to cell invasion in vivo.

    8. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript investigates the role of the thromboxane A2 receptor (TBXA2R) in activating ERM (ezrin, radixin, and moesin) proteins to promote cell motility and invasion in triple-negative breast cancer (TNBC) cells. Using TBXA2R stimulation and a series of in vitro and in vivo experiments, the authors report that ERM activation is mediated through a TBXA2R signaling pathway involving Gαq/11 and Gα12/13 subunits, RhoA, and SLK/LOK kinases. They propose that this pathway enhances cell migration, invasion, and metastatic potential in TNBC.

      General criticisms

      Experimental design and analyses are adequate, even though certain experiments lack appropriate controls or employ the wrong statistical tests. However, the study primarily relies on a single TNBC cell line and heavy use of overexpression systems and/or small molecule inhibitors, raising concerns about the generalizability and specificity of the findings. Furthermore, several conclusions appear premature and unsupported by the current data. Critical controls and additional validation experiments are necessary to support the claims about the role of TBXA2R in metastasis and to justify the strong mechanistic conclusions drawn.

      Specific criticisms

      Figure 1

      TBXA2R expression should be shown to understand whether different ebBRET signals are dependent on the overexpression levels of TBXA2R.

      E-F: As ERM levels change over time, one would like to understand whether this is due to misloading or whether there is an underlying biological event going on in the stimulated cells. Are total ERM levels really changing over time? Please add a blot for 1-2 housekeeping proteins as loading controls. This is also crucial to clarify the kinetics of ERM activation; such notable intensity variations make quantifications of non-linear WB signals not fully reliable. In F, mean and SD should be plotted.

      G: The authors need to use a PM marker if they want to claim that pERM increases at the cell cortex. TBXA2R localization should also be shown.

      Figure 2

      A: This reviewer cannot see the purported partial inhibition in Gα12/13 KO cells. Are differences between the two KOs significant? Furthermore, there are reports indicating that YM-254890 may not be specific for Gαq. Experiments on double KO cells are needed to assess the possible redundancy between the two Gα subfamilies.

      C-D: It is important to add a positive control for the activity of Y-27632 in these experiments. Please show that a ROCK-dependent effect is inhibited in the treated cells.

      G: The working model is premature as it is unknown whether ROCKi was active. While asking for ROCK1/2 KO cells would be too much, this claim is far-fetched.

      Figure 3

      B: In the legend, it is not clear what the grey and light red colours mark.

      E-F: This reviewer finds it difficult to believe that p-ERM and TBXA2R signal intensities at the cell cortex could be reliably quantified using IHC images. The representative samples would indicate that p-ERM and TBXA2R positivity are not correlated. It would be crucial to show examples for each of the TNBC subgroups, the existence of which is inferred based on p-ERM and TBXA2R staining. The conclusion that "no TNBC samples exhibited high TBXA2R expression and low levels of p-ERMs, further supporting a role for TBXA2R signalling in ERM activation in TNBC" is an overstatement.

      Figure 4

      The authors wrote that "We focused on the Hs578T cell line, which showed a median level of TBXA2R mRNA expression among the six TNBC cell lines tested". I do not understand the rationale for it as anti-TBXA2R antibodies detecting endogenous TBXA2R are available and thus why not use the median protein levels?

      Figure 5

      Effects of the knockouts are subtle, and rescue experiments would be needed to corroborate these results. The employed statistical analysis is prone to overestimating differences. The authors should use the superplots instead. The authors might also decide to use other TNBC cell lines to explore the functional relevance of this pathway in BC progression. This is particularly important because Hs578T are poorly tumorigenic, and they often do not form palpable tumours in mice.
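
      A minimal sketch of the superplot-style analysis requested here: per-cell values are collapsed to one mean per independent experiment, and the test is run on those means so that n is the number of experiments rather than the number of cells. The speeds, the experiment labels and the paired design (control and knockout run side by side in each experiment) are assumptions for illustration, not values from the manuscript.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(cells_by_replicate):
    """Collapse per-cell values to one mean per independent experiment."""
    return {rep: mean(values) for rep, values in cells_by_replicate.items()}


def paired_t_statistic(control_means, treated_means):
    """Paired t statistic on replicate means.

    The sample size is the number of shared replicates, not the number of cells.
    Returns the t statistic and its degrees of freedom.
    """
    reps = sorted(set(control_means) & set(treated_means))
    diffs = [treated_means[r] - control_means[r] for r in reps]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical migration speeds (um/h), three cells per experiment for brevity.
control = {"exp1": [10.0, 12.0, 11.0], "exp2": [14.0, 15.0, 16.0], "exp3": [9.0, 10.0, 11.0]}
knockout = {"exp1": [8.0, 9.0, 10.0], "exp2": [12.0, 13.0, 11.0], "exp3": [8.0, 7.0, 9.0]}

ctrl_means = replicate_means(control)
ko_means = replicate_means(knockout)
t_stat, dof = paired_t_statistic(ctrl_means, ko_means)
```

      In the plot itself, individual cells would be drawn as small points coloured by experiment, with the replicate means overlaid as larger markers; only the replicate means enter the statistical test.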

      Figure 6

      The fact that Hs578T cells are poorly tumorigenic in mice is likely the reason why the authors used the experimental metastasis model. However, it is puzzling that metastases were studied in the liver but not in the lungs. Furthermore, the whole approach is rather artefactual as the TBXA2R agonist was administered for the entire duration of these experiments. What is the pathological relevance of such a study? Including a spontaneous metastasis model or alternative TNBC lines that mimic human disease more closely would help strengthen the functional relevance of this pathway in BC progression and the study's translational relevance.

      Figure S2

      B-M: the pERM signal appears to be perinuclear in some of the tested cell lines. Please use a PM marker.

      Figure S3

      The authors should use the superplots to analyse the cell migration data.

      Discussion

      The claim that "our findings demonstrated that kinases of the SLK family are the only kinases needed for ERM activation by TBXA2R" should be toned down as only 2 cell lines were tested. In this section, the authors should also discuss the proposed pro-metastatic functions of TXA2 and TBXA2R in more detail, including vascular permeability. The sweeping conclusion that "TBXA2R expression correlates with phosphorylation and activation of ERMs in TNBC patient samples" clashes with the authors' own results; please stick to the data.

      Concluding remarks

      This study investigates a signaling pathway whereby TBXA2R, through ERM activation, enhances the migratory and invasive potential of TNBC cells. However, several improvements are needed to support the main claims. The dependence on a single TNBC cell line, reliance on pharmacological inhibitors with potential off-target effects, and limited in vivo relevance detract from the generalizability of the findings. Additional TNBC models, adequate controls, and a broader focus on natural metastasis patterns would make the conclusions more compelling. Moderating certain overstated claims would be needed to align the interpretations with the actual data.

      Cross-commenting

      I found comments in the other reviewers' reports that align with my criticisms on the mouse experiments as well as with those pertaining to the tissue culture work.

      Significance

      General comments

      The manuscript investigates the role of TBXA2R in the regulation of ERM in the context of TNBC metastasis. Much of this TBXA2R signalling axis is already known, as is the fact that SLK and LOK can phosphorylate ERM in other cell systems. Similarly, the positive role of ERM in cell migration/invasion and cancer progression has long been reported. The somewhat unexpected finding that ERM phosphorylation is independent of ROCK remains not fully convincing. The BC-related part is problematic as the continuous administration of a TBXA2R agonist is required for key tumour metrics to show some differences in vivo. This calls into question the main conclusion of the work, namely that the TBXA2R/ERM-dependent pathway is activated during BC progression in TNBC cells.

      Audience

      Specialists interested in GPCRs and signal transduction or in the cytoskeleton.

      Expertise

      Rev: cancer cell biology, signal transduction, cytoskeleton, actin biochemistry, multiplexed imaging, mouse model of human diseases.

      Co-rev: nanoparticles, cell biology.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The article titled 'Non-Invasive Mechanical-Functional Analysis of Individual Liver Mitochondria by Atomic Force Microscopy' discusses how the mechanical properties of mitochondria change in response to various drugs, such as CCCP and ADP or rotenone and antimycin A.

      Key findings:

      The Authors correlated the power spectral density (PSD) of thermal noise, measured in contact on top of mitochondria using an atomic force microscope (AFM), with the membrane activity of the organelles measured using fluorescence markers.

      They identified correlation trends between PSD, height, elasticity and fluorescence marker intensities for various cases, where the organelle activity was modified using drugs or genetic changes. The work is a very interesting approach and an excellent application of mechanobiology to gain further understanding of the properties of the energy-producing organelles of eukaryotes. However, the overall results the Authors present have some serious flaws.

      I would recommend publication after significant changes are made.

      Major comments:

      When measuring the power spectral density (PSD) of thermal fluctuations in contact with an organelle, several factors influence the measurements, such as the spatial inhomogeneity of the mitochondrion, the loading force applied, the feedback system of the AFM, and the hydrodynamic drag of the medium on the cantilever.

      None of the above points are addressed in the manuscript. That is:

      • what was the spatial variability of the signal on the top of the organelle? (Using a tip with 30 nm apex radius has a relatively high variability even in microscopically homogeneous systems)
      • what was the loading force applied, and how did the PSD vary with the loading force?
      • according to the text on the bottom of page 5 the feedback was ON. How did this influence the recorded PSD? Significance of differences between organelles can be only properly estimated in relation to the spatial and load dependence of the same information.

      Minor comments:

      • Numerical Fourier transform generating the PSD is very noise prone, thus many curves need to be averaged for a good result. Please provide statistical information on this aspect of the obtained curves.
      • In the text it is mentioned that characteristic changes of the PSD were observed. What are the characteristic changes between unperturbed and drug-affected mitochondria? Please highlight them on the graphs of PSD.
      • How is the distribution of the results e.g. in Figure 1.E? Histogram and box-plots are more informative than bar plots.
      • How many curves were recorded for the individual mitochondria? (30 mitochondria were measured)
      • Figure 2.A and Figure S1.C indicate nicely how heterogeneous the mitochondria are. How did you eliminate the corresponding error from the PSD measurements?
      • To highlight correlations, simple plots of the parameters as the function of each-other can be very informative.
      • On Figure 1, the correlation between the fluorescence intensities and the PSD integrals are only qualitative.
      • In Figure 3, the inverse correlation between the height and Young's modulus is not clear. Can it be plotted in such a way that the intended information becomes clear?
      • While the Authors claim that the PSD is characteristic of the mechanical properties of the organelles, its direct connection remains elusive and is not discussed in the paper. Again, loading force dependence is expected to be present and to influence whether the probe detects changes in membrane properties or senses something deeper, i.e. structures under the membrane.
      • While the Authors correlate various measures derived from AFM data, these are only ensemble comparisons, since imaging and PSD measurements were done using different AFMs, thus different sample points. This should be clearly stated in the text.
      • QI mode is very robust for imaging, but its Young's moduli are difficult to compare to any real situation, since the measurement is typically performed in the 500 - 2000 Hz frequency range. In addition, the individual force curves are usually rather noisy for biological samples.
      • In Figure S1.B, nothing is visible for the CCCP sample.
      • In Figure S2, what does the value of 300 mean for alpha in the first sentence?
      • While the frequency dependence of the PSD makes sense, the data indicated in figure S2 also indicates very high noise, making the fits unreliable. What would be the exponent value in the 5% - 95% confidence interval?
      • It may also be informative to see a common plot of individual PSDs for the various cases and, in the representative plot, mean +/- SE for each frequency point.
      • The experiment description states: 'Bruker Multimode AFM was used for overall imaging and power spectra in tapping mode.' This is misleading, because in tapping mode the end of the cantilever is driven at a constant frequency, which would interfere with the thermal PSD measurement. If it was done this way, this is a driven state which should be discussed, and which also depends on the driving frequency.
      • When preparing the PLL surfaces, how were the mica substrates washed before adding the organelles?
      • The topography images are most probably measured Z-piezo sensor outputs. However, this is not mentioned.
      • The imaging conditions for QI mode are incomplete: the point measurement frequency, a parameter that affects the apparent Young's modulus, is not mentioned.
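
      The point about averaging made in the minor comments can be illustrated with a short sketch: a single periodogram scatters by roughly 100% per frequency bin, whereas averaging over many segments reduces the scatter and yields a mean +/- SE at each frequency. This is a minimal example on synthetic white noise, assuming a sampling rate of 1 kHz and non-overlapping Hann-windowed segments; it is not the authors' acquisition or analysis pipeline.

```python
import numpy as np


def averaged_psd(signal, fs, n_segments):
    """One-sided PSD averaged over non-overlapping, Hann-windowed segments.

    Returns frequencies, the mean PSD and its standard error per frequency bin.
    """
    seg_len = len(signal) // n_segments
    window = np.hanning(seg_len)
    scale = 1.0 / (fs * np.sum(window ** 2))
    spectra = []
    for i in range(n_segments):
        seg = signal[i * seg_len:(i + 1) * seg_len]
        seg = seg - seg.mean()  # remove the DC offset of each segment
        spec = scale * np.abs(np.fft.rfft(seg * window)) ** 2
        # Fold negative frequencies into the one-sided spectrum
        # (DC, and Nyquist for even lengths, are not doubled).
        if seg_len % 2 == 0:
            spec[1:-1] *= 2.0
        else:
            spec[1:] *= 2.0
        spectra.append(spec)
    spectra = np.array(spectra)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    mean_psd = spectra.mean(axis=0)
    if n_segments > 1:
        sem_psd = spectra.std(axis=0, ddof=1) / np.sqrt(n_segments)
    else:
        sem_psd = np.full_like(mean_psd, np.nan)  # no SE from a single curve
    return freqs, mean_psd, sem_psd


def relative_scatter(psd):
    """Bin-to-bin scatter relative to the mean level (DC and Nyquist excluded)."""
    return psd[1:-1].std() / psd[1:-1].mean()


# Synthetic unit-variance white noise standing in for a deflection trace.
rng = np.random.default_rng(0)
fs = 1000.0
noise = rng.normal(size=2 ** 16)

_, psd_single, _ = averaged_psd(noise, fs, n_segments=1)
freqs, psd_avg, sem_avg = averaged_psd(noise, fs, n_segments=64)
```

      Averaging trades frequency resolution for lower variance, so the number of segments (or of repeated recordings per mitochondrion) is exactly the statistical information requested above.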

      Referee cross-commenting

      Reading the review of Reviewer 1 highlights flaws in the organelle biology part of the work that I was not aware of. (I am an expert in mechanical characterization at the molecular to cellular level.) Putting the reviews together highlights that this study is at a very early stage of investigation. It would be really interesting to see its results, but claiming it to be a novel diagnostic tool may be far-fetched. (I agree with Referee 1.)

      Significance

      In general, estimating the mechanical properties of mitochondria and correlating them with the activity of the organelles is a very interesting idea in the field of mechanobiology. The Authors have done a relatively large number of experiments to identify correlations between activity, followed by more traditional fluorescence labels, and the AFM data they generated. They performed many experiments, spanning three AFM devices and other experimental methods.

      Limitations:

      I believe however, they missed some key points influencing their results, most importantly the dependence of the data on the:

      • normal loading force
      • spatial inhomogeneity (their own images prove the presence of this)

      I am afraid some of the effects they detect are not only qualitative but also biased; however, with the current figures and data I cannot substantiate this.

      Audience: specific to microbiology, especially the audience interested in mechanobiology

      I believe this is an interesting work, and contributes to our understanding of micromechanics at the organelle level. Thus I would really like to see it published in a more complete form.

      Advance: Mitochondria are known to respond to environmental cues and can remodel their internal structure in response to stresses. However, it is difficult to find studies on the individual mechanical properties of these organelles, even in ex situ environments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the article "Non-Invasive Mechanical-Functional Analysis of Individual Liver Mitochondria by Atomic Force Microscopy", O. Zorikova and colleagues propose the use of Atomic Force Microscopy (AFM) as a tool for characterizing the biophysical properties of individual mitochondria. By analyzing parameters such as height, membrane fluctuation power spectra, and Young's modulus under various drug treatments and genetic mutations, the authors aim to provide a novel, label-free method for assessing mitochondrial functionality.

      While the manuscript presents an interesting approach, the introduction would benefit from a clearer and more cohesive narrative. The authors highlight the need to monitor the function of individual mitochondria, which is indeed an important challenge, but the rationale for doing so should be more explicitly stated. A stronger emphasis on the biological importance of mitochondrial biophysical parameters and the added value of using AFM would enhance the motivation for the study. Additionally, the symbol Δψ, referring to mitochondrial membrane potential, should be defined and briefly explained in the introduction for clarity.

      In the results section, a schematic diagram of the experiment would aid comprehension, especially for readers less familiar with this technique. In general, in the figures it would be good to find the individual data points. The integration of the results into the main text could also be improved. Currently, several findings are presented in a descriptive manner, but the biological interpretation or relevance is not always clear. For example, the sentence "Figure 2 presents a comprehensive analysis of the height and elastic properties of mitochondria" could be expanded to explain what those findings actually mean and how they help support the main goal of the study. Similarly, the statement that "the integrated power of mitochondrial membrane fluctuations decreased significantly upon valinomycin treatment" is presented without explanation of what this metric represents or why valinomycin was chosen. When discussing MTH2, the authors refer to "mechanical alterations in mitochondria lacking this protein" without explaining what MTH2 is, where it is localized, or why it is biologically relevant.

      Finally, in the discussion, the interpretation of results could be expanded. For example, the statement "MKO/MLM exhibited increased integrated power/potential, increased modulus/stiffness, and decreased height" would benefit from more biological context - what do these changes imply about mitochondrial function or physiology? Adding this kind of interpretation would help the reader better understand the broader significance of the findings.

      Methods: The authors say they record the piezo movement, but it is not clear to this reviewer whether the authors perform a closed-loop force-feedback experiment. If so, this will introduce noise into the measurement, which can be avoided by performing an open-loop measurement. Why did the authors not record the cantilever fluctuation at a constant piezo height? This gives enough bandwidth and low noise to record Angstrom deflections. Likewise, it is unclear to this reviewer why the power spectrum is given in V and not in nm, as is typical in AFM measurements. I assume the authors calibrated the deflection sensitivity and spring constant of the cantilever; hence, if possible, the authors should convert the PSD into nm/Hz.
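
      As a sketch of the requested unit conversion, assuming the spectrum is a power density in V²/Hz and the deflection sensitivity is known in nm/V: each value scales with the square of the sensitivity (for an amplitude spectrum the factor would be the sensitivity itself). The sensitivity, spring constant and temperature below are hypothetical. The equipartition estimate applies to a free cantilever and is meant only as an order-of-magnitude sanity check.

```python
from math import sqrt

BOLTZMANN = 1.380649e-23  # J/K


def psd_volts_to_nm(psd_v2_per_hz, sensitivity_nm_per_v):
    """Convert a power spectral density from V^2/Hz to nm^2/Hz."""
    factor = sensitivity_nm_per_v ** 2
    return [value * factor for value in psd_v2_per_hz]


def thermal_rms_nm(spring_constant_n_per_m, temperature_k):
    """Equipartition RMS deflection of a free cantilever, in nm."""
    mean_square_m2 = BOLTZMANN * temperature_k / spring_constant_n_per_m
    return sqrt(mean_square_m2) * 1e9


# Hypothetical calibration values, not taken from the manuscript.
psd_nm = psd_volts_to_nm([1e-6, 4e-6], sensitivity_nm_per_v=50.0)
rms_nm = thermal_rms_nm(spring_constant_n_per_m=0.1, temperature_k=298.0)
```

      Integrating the converted PSD over frequency gives a mean-square deflection in nm², which can be compared across instruments, unlike a voltage-based spectrum.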

      During the elasticity measurements, did the authors correct for the finite thickness of the mitochondria? What was the contact force and indentation depth, and how thick were the mitochondria to begin with? If the indentation is larger than 20% of the thickness, I suggest performing a correction to account for the effectively infinite stiffness of the substrate. Given that the mitochondrial stiffness is in the tens of kPa, this seems to be important (perhaps not for relative values but for absolute stiffness measurements).
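
      One commonly used finite-thickness correction is that of Dimitriadis et al. (2002) for a spherical indenter on an incompressible layer. The sketch below assumes that model, a Hertz fit as the starting point, and hypothetical tip and sample dimensions. The manuscript does not state which contact model or tip geometry was used, and a sharp pyramidal or conical tip would need a different correction.

```python
from math import sqrt

# Polynomial coefficients of the Dimitriadis correction for Poisson ratio 0.5.
BONDED = (1.133, 1.283, 0.769, 0.0975)      # layer adhered to the substrate
NON_BONDED = (0.884, 0.781, 0.386, 0.0048)  # layer free to slide


def indentation_fraction(indentation, thickness):
    """Indentation depth as a fraction of the sample thickness."""
    return indentation / thickness


def dimitriadis_factor(tip_radius, indentation, thickness, bonded=True):
    """Factor by which the Hertz force is increased by the stiff substrate.

    All lengths must share the same unit; chi = sqrt(R * delta) / h.
    """
    chi = sqrt(tip_radius * indentation) / thickness
    c1, c2, c3, c4 = BONDED if bonded else NON_BONDED
    return 1.0 + c1 * chi + c2 * chi ** 2 + c3 * chi ** 3 + c4 * chi ** 4


def corrected_modulus(apparent_modulus, tip_radius, indentation, thickness, bonded=True):
    """Divide the apparent (Hertz) modulus by the finite-thickness factor."""
    return apparent_modulus / dimitriadis_factor(tip_radius, indentation, thickness, bonded)


# Hypothetical values: 30 nm tip radius, 50 nm indentation, 400 nm tall organelle.
fraction = indentation_fraction(50.0, 400.0)
factor = dimitriadis_factor(30.0, 50.0, 400.0)
modulus_kpa = corrected_modulus(20.0, 30.0, 50.0, 400.0)
```

      Reporting the indentation fraction alongside the modulus would let readers judge whether such a correction matters for the absolute values.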

      Figures. The figures are well constructed and guide the reader through the important messages of the paper. The authors, however, should not overuse bar charts without explicitly mentioning the number of measurements for each condition. In essence, I strongly recommend plotting individual data points to show the distribution and replacing the stars with actual p-values.

      Significance

      The premise of the study is compelling and could have important clinical implications for distinguishing dysfunctional mitochondria in pathological contexts. However, the manuscript in its current form should be improved. First of all, non-invasive is more than a euphemism, as the mitochondria need to be taken out of the cell, which is highly invasive. The authors should delete non-invasive from the title.

      As the work presents an orthogonal and non-standard approach, the authors introduced a novel assay that can guide future investigations into the biophysics of mitochondrial physiology. Thus the paper is of high interest, timely and cutting edge.

      In summary, the study presents a promising approach with potentially high relevance for mitochondrial research.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors use atomic force microscopy (AFM) to study mitochondria isolated from primary mouse livers, and they attempt to correlate these measurements with mitochondrial membrane potential and oxygen consumption under different bioenergetic conditions. They argue that AFM could be used diagnostically to assess mitochondrial function. While there is some novelty in potentially using AFM to assess mitochondrial function in the clinic, it is not clear how this would be more efficient or meaningful than assessing mitochondrial parameters by more standard methods, such as respirometry, confocal microscopy, etc. Considerably more work would need to be performed, particularly on relevant patient samples, to show that AFM holds potential as a diagnostic tool. It is important to note that the authors of this study have not taken sufficient care to quantify the mitochondrial membrane potential in a manner that could be considered reliable, which casts further doubt upon the merits of this method for diagnosing mitochondrial function. These concerns, laid out in detail below, should be thoroughly addressed before publication.

      Major comments:

      The authors used azide to inhibit complex V, but azide is also a potent inhibitor of complex IV (Bowler et al., 2006). Why did the authors not use oligomycin, which is more specific, to inhibit complex V?

      In Fig. 1 H - K, the y axes are labelled in a confusing or ambiguous way. The legend says that all data represent the mean ± SEM; however, panels D, F, H, and K have no error bars. For example, the data in H and K are shown as violin plots. Typically, the y axis would say what the name of the quantity is (e.g., mean TMRM fluorescence intensity) followed by the units (e.g., a.u.) in parentheses. However, the authors write, for example, in panel K "Mean pixel (TMRM)." The authors seem to follow the correct convention in panels D - G, so it is not clear why H - K are written incorrectly. In any event, the authors need to specify how these data were obtained, as there are virtually no details as to the methods of how these measurements of mitochondrial membrane potential were acquired.

      For example, JC-1 is a ratiometric probe. In its monomeric form, it emits a green signal, but, as the dye aggregates into so-called J-aggregates, the emission is red. The correct way of analyzing JC-1 signal is to compute the ratio of red over green fluorescence intensity. However, in the authors' quantifications, they simply say "Fluorescence (JC-1)." The units of the y axes go from zero to 20,000, which means that the authors likely did not assess the ratio of these emissions, so the data are not informative as to the actual mitochondrial membrane potential. Moreover, the authors indicate that they use 5 µM JC-1. This seems quite a high concentration, particularly for staining isolated mitochondria, which means that the dye has direct access to the organelle without having to cross the plasma membrane. There is no information about how long the dye was allowed to load and whether it was washed off prior to obtaining the measurements with the plate reader.

      Likewise, the authors used TMRM to also try to assess the mitochondrial membrane potential. In this case, they used 0.5 µM, but they did not indicate for what duration the mitochondria were exposed to the dye before going through the FACS. It should be noted, too, that TMRM is a Nernstian probe, which effectively stains mitochondria at concentrations as low as 1 nM. Accordingly, it is known that TMRM (and other mitochondrial dyes) can be toxic at higher concentrations, inhibiting essential processes such as OXPHOS. The very low dynamic range of the TMRM signal in panels H and K suggests that the signal was saturated, because there was too much dye loaded into the mitochondria. Moreover, the values, ranging merely from zero to 80, suggest a very insensitive method for quantifying the mitochondrial membrane potential.

      In Fig. S1 A-B, the authors used confocal microscopy to assess the isolated mitochondria. It would be wise to continue to use this technique for the other experiments, as plate readers and FACS offer no direct visual cues to validate that the numbers reflect bona fide biological measurements. Especially in the case of FACS, where there is an exceedingly large number of events, the statistics become essentially meaningless, as it is possible to show that almost anything is statistically significantly different if there is a sufficiently high number of samples or events. The authors should bear in mind that measuring the mitochondrial membrane potential is not trivial. One needs to understand the properties of the probes that are being employed as well as the instruments that are used to make the measurements. Care must be taken to ascertain that the quantifications reflect true biological processes.

      The authors claim, for Fig. 1, that there is an "excellent correlation" between height fluctuations and mitochondrial membrane potential. Given that the mitochondrial membrane potential measurements were associated with various errors (see above), it is premature to assert that there is any correlation at all. Furthermore, if the authors want to argue that there is indeed a correlation between these variables, then they should perform an appropriate statistical analysis, e.g., a Pearson correlation coefficient test.
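
      As an illustration of the two analyses requested above (a red/green JC-1 ratio and a Pearson correlation coefficient), a minimal sketch follows. It is not the authors' pipeline: the function names `jc1_ratio` and `pearson_r` and all numbers in the usage note are hypothetical, and background subtraction is assumed to have been done beforehand.

```python
import math


def jc1_ratio(red, green):
    """Return the per-sample ratio of red (J-aggregate) to green (monomer) intensity.

    Both inputs are assumed to be background-corrected intensities of equal length.
    """
    if len(red) != len(green):
        raise ValueError("red and green must have the same length")
    if any(g == 0 for g in green):
        raise ValueError("green intensity must be non-zero")
    return [r / g for r, g in zip(red, green)]


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

      With hypothetical per-condition values, one would call `ratios = jc1_ratio(red, green)` and then `pearson_r(height_fluctuation, ratios)`. The coefficient should be reported together with the number of independent samples, since a correlation across a handful of conditions carries little weight.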

      For the reasons explained above, the JC-1 and TMRM measurements in Figs. 3 and 4 are not convincing. The authors must demonstrate, unambiguously, that they understand the use of these probes and that they are making accurate measurements.

      Given that MTCH2 was recently reported to function as an insertase of the OMM (Guna et al., 2022), understanding the KO phenotype is extremely challenging, since it implicates the downstream loss of function of numerous other proteins. It would be valuable to examine other KO models with more specific mitochondrial defects, which can simplify the interpretation of the data. For example, suppression of any of the large Dynamin GTPases that control mitochondrial shape, i.e., MFN1/2, OPA1, or DRP1. Conversely, modulation of mitochondrial membrane composition by suppression of specific phospholipid biosynthetic enzymes would be valuable. It is important to note that the authors are attempting to highlight AFM as a novel way to assess patient samples, but they do not provide any data as to whether mitochondria, derived from a patient with a known mitochondrial defect, could be meaningfully assessed by this method. It is worth pointing out, too, that isolating mitochondria from primary tissues involves a significant amount of stress to the organelle. To understand mitochondrial function in a manner that reflects an in vivo state as much as possible, it would be essential to show that the isolated mitochondria from the liver are largely the same as those in intact liver cells. The authors should be aware that isolating live hepatocytes is far from a trivial thing to do (Charni-Natan & Goldstein, 2020). Simply mincing the liver and subjecting it to mechanical and enzymatic dissociation likely involves significant mitochondrial stress, which implies that the values derived from isolated mitochondria represent a highly non-physiological, even dysfunctional, condition. These are fundamental concerns which should be considered and discussed in any report that is lauding the potential diagnostic benefits of quantifying isolated mitochondria from primary tissues.

      The authors say, in the discussion, "Accordingly, the AFM method employed here measured several characteristics such as morphology and elastic modulus of the structures, as well as fully exploiting the rich information available from the noise spectra." There was no measurement of "morphology" in this study. Differences in height are not what is generally considered in discussions of mitochondrial morphology, which reflects the dynamic changes in organelle shape and connectivity, typically in the x-y (rather than z) axes.

      The authors performed experiments on fixed and dried mitochondria; however, there is no systematic comparison of the integrated power and other parameters compared to the live mitochondria isolates. This is a key comparison that should have been performed, as it would offer a basic frame of reference for the values of the live organelles. Another key experiment that is lacking in this study is measurement of the same organelle over time to understand the variance in individual organelles from moment to moment.

      Minor comments:

      Generally, the authors should moderate their claims that AFM could be used diagnostically until the above concerns are addressed.

      There needs to be considerably more detail as to the methods that were used here. This is essential insofar as the authors wish to convince potential readers that the experiments were carefully conducted and that the data is reliable. Putting numbers on the margin of the manuscript would be helpful for the referee to specifically address certain points.

      References:

      Bowler MW, Montgomery MG, Leslie AG, Walker JE. How azide inhibits ATP hydrolysis by the F-ATPases. Proc Natl Acad Sci U S A. 2006 Jun 6;103(23):8646-9. doi: 10.1073/pnas.0602915103. Epub 2006 May 25. PMID: 16728506; PMCID: PMC1469772.

      Guna A, Stevens TA, Inglis AJ, Replogle JM, Esantsi TK, Muthukumar G, Shaffer KCL, Wang ML, Pogson AN, Jones JJ, Lomenick B, Chou TF, Weissman JS, Voorhees RM. MTCH2 is a mitochondrial outer membrane protein insertase. Science. 2022 Oct 21;378(6617):317-322. doi: 10.1126/science.add1856. Epub 2022 Oct 20. PMID: 36264797; PMCID: PMC9674023.

      Charni-Natan M, Goldstein I. Protocol for Primary Mouse Hepatocyte Isolation. STAR Protoc. 2020 Aug 13;1(2):100086. doi: 10.1016/j.xpro.2020.100086. PMID: 33111119; PMCID: PMC7580103.

      Significance

      I am an expert in imaging of mitochondria, with considerable direct knowledge of various super-resolution and advanced imaging systems. I have also studied mitochondrial function, using standard biochemical and molecular approaches. I have great familiarity with mitochondrial behavior and dynamics, as understood from live-cell imaging approaches and morphological analysis.

      This study is potentially interesting due to its relatively novel use of AFM to examine mitochondria. However, there is a lot of uncertainty in the measurements due to technical oversights and lack of relevant controls. Whether AFM could be useful in the clinic remains an open question. If the authors could address the comments above, it would go a long way to finding out one way or the other.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      (...) The study describes meticulously conducted and controlled experiments, showing the impressive biochemistry work consistently produced by this group. The statistical analysis and data presentation are appropriate, with the following major comments noted:

      Response: We thank the reviewer for their thoughtful and constructive review of our manuscript. We appreciate the positive comments on our experimentation.

      Major comments

      1. Please clarify why K8ac/K12ac, K5ac/K16ac, K5ac/K12ac are not quantified (Figure 3). If undetected, state explicitly and annotate figures with "n.d." rather than leaving gaps. If detected but excluded, justify the exclusion.

      Response: We restricted ourselves to mapping those diacetylated motifs that can be readily identified by MS2. The characteristic ions of the d3-labeled and endogenous acetylated peptides in the MS2 spectra could not differentiate the diacetylated forms mentioned by the reviewer. Rather than expanding the figure with non-informative rows, we amended the legend of Figure 3 accordingly: "Diacetylated forms K8-K12, K5-K16, K5-K12 could not be distinguished from each other by MS2 and were thus not included in the analysis".

      The statement "Nevertheless, combinations of di- and triacetylation were much more frequent if K12ac was included, suggesting that K12 is the primary target." is under-supported because only two non-K12ac combinations are shown, and only one is lower than K12ac-containing combinations. Either soften the claim ("trend toward ... in our dataset") or expand the analysis to all observed di/tri combinations with effect sizes, n, and statistical tests.

      Response: The reviewer is right: our statement does not properly reflect the data. It rather seems that combinations lacking K12ac are considerably less frequent (K5K8K16 tri-ac, K5K8 di-ac). We have now modified the sentence as follows: "Peptides lacking K12ac were less frequent, suggesting that K12 is a primary target".

      Please provide a more detailed discussion about the known nature of NU9056 inhibition and how it fits or doesn't fit with your data. Are there any structural studies on this?

      Response: Unfortunately, NU9056 is very poorly described; neither the mode of interaction with Tip60 nor the mechanism of inhibition is known. The specificity of the chemical has not really been shown, but nevertheless it is used as a selective Tip60 inhibitor in several papers, which is why we picked it in the first place. Our conclusions on the inhibitor are in the last paragraph of the discussion: "The fact that acetylation of individual lysines is inhibited with different kinetics argues against a mechanism involving competition with acetyl-CoA, but for an allosteric distortion of the catalytic center." We think that any further interpretation would likely be considered an overstatement.

      Why was the inhibitor experiment MS only performed for H2A.V and not H2A? Given the clear H2A vs H2A.V differences reported in Fig. 2, it would be useful to have the matched data for H2A.

      Response: In these costly mass spec experiments we strive to balance limited resources and most informative output. Because H2A.V and H4 are the major functional targets of Tip60, we considered that documenting the effect of the inhibitor on these substrates would be most appropriate. In hindsight, including H2A would have been nice to have, but would not change our conclusions about the inhibitor.

      The inhibitor observations are very interesting as they can highlight systems to study the loss of specific acetyl residues: can the authors perform WB/IF validation in treated cells? I understand it will not be possible with the H2A antibodies, but the difference in H4K5ac vs H4K12ac should be possible to validate in cells.

      Response: We attempted to monitor changes of histone modifications upon treatment of cells with NU9056 by immunoblotting. Probing H4K5 and K12, the results were variable. We also observed occasionally that acetylation of H4K5 and H4K12 was slightly diminished in whole cell extracts, but not in nuclear extracts. This reminded us that diacetylation of H4 at K5 and K12 is a feature of cytoplasmic H4 in complex with chaperones, a mark that is placed by HAT1 (Agudelo Garcia et al., DOI: 10.1021/acs.jproteome.9b00843; Varga et al., DOI: 10.1038/s41598-019-54497-0). The observed proliferation arrest by NU9056 may thus affect chromatin assembly and indirectly K5K12 acetylation. H4K12 is also acetylated by chameau (Chm).

      We observed a reduction of acetylated H4K16 and H2A.V. H4K16 is not a preferred target of Tip60, but Tip60 acetylates MSL1 and MBDR2, two subunits of the NSL1 complex (Apostolou et al. DOI: 10.1101/2025.07.15.664872). We, therefore, consider that effects on H4 acetylation upon NU9056 treatment may at least partially be affected indirectly. Because we are not confident about the data and because our manuscript emphasizes the direct, intrinsic specificity of Tip60, we refrain from showing the corresponding Western blots.

      You highlight that H2AK10 (a major TIP60 site here) is not conserved in human canonical H2A. Please expand the discussion of the potential function and physiological relevance. Maybe in relation to H2A.V being a fusion of different human variants?

      Response: The reviewer noted an interesting aspect of the evolution of the histone H2A variants. It turns out that H2A.Z is the more ancient variant, from which H2A derived by mutation. H2A.Z/H2A.V sequences are more conserved than H2A sequences. We summarized these evolutionary notions in Baldi and Becker (DOI: 10.1007/s00412-013-0409-x). In the context of the question, this means that mammalian H2A.Z, Drosophila H2A.V and mammalian H2A still contain the ancient sequence (lacking K10), and Drosophila H2A acquired K10 by mutation. The evolutionary advantage associated with this mutation is unclear. We have now added a small paragraph summarizing these ideas on page 13 of the manuscript (changes tracked in red).

      To enable direct comparisons between variants and residues, please match y-axis scales where the biology invites comparison (e.g., H2A vs H2A.V; Figs. 2-3).

      Response: We adjusted the Y-axes in Figure 2 and 3 to facilitate direct comparisons, where such comparison is informative.

      Minor comments

      1. Add 1-2 sentences in the abstract on the gap in the field being addressed by the study.

      Response: We are grateful for this suggestion and have expanded the abstract accordingly (changes tracked in red).

      Either in the introduction or discussion, comment on your prior Tip60 three-subunit data (Kiss et al.). The three-subunit complex was significantly less active on H4, as indicated in that publication, which is likely due to the absence of Eaf6.

      Response: We thank the reviewer for the opportunity to emphasize this point. Motivated by findings in the yeast and mammalian systems that Eaf6 was important for acetylation, we added this subunit to our previously reconstituted 3-subunit 'piccolo' complex. As can be seen by the comparison of the older data (Kiss et al.) and the new data, the 4-subunit TIP60 core complex is a much more potent HAT. We amended the introduction (see marked text) accordingly. We also added a paragraph on what is known about the properties and function of Eaf6 to the discussion.

      3a. Text references Fig.1E before Fig.1C, please reorder

      Response: We deleted the premature mention of Figure 1E and added the following explanation to the relevant panels in Figure 1: "The blot was reprobed with an antibody detecting H3 as an internal standard for nucleosome input."

      3b. Fig.1B/C legend labels appear swapped.

      Response: We thank the reviewer for spotting the swap. We corrected the figure legend.

      3c. Fig.1E, 4A, 4B: add quantification

      Response: We quantified each acetylation level, and added to the relevant panel of Figure 1 and 4 the following phrase: "The quantified levels of each acetylation mark over H3 are shown below each plot." Notably, the difference in acetylation signal strength between the two antibodies highlights the inherent variability of antibody-based detection.

      3d. Fig.2A: Note explicitly that K5-K10 and K8-K10 are unresolvable pairs to explain the shading scheme used.

      Response: The legend of Figure 2A now includes the following sentences: "Peptides that are diacetylated at either K5/K10 or K8/K10 cannot be resolved by MS2. The last row reminds of this fact by the patterning of boxes and displays the combined values."

      Ensure consistent KAT5/TIP60 naming.

      Response: Our naming follows this logic: We use 'Tip60' for the Drosophila protein and 'TIP60' for the Drosophila 'piccolo' or 'core' complexes. The mammalian protein is referred to by the capital acronym TIP60, as is established in the literature. We use KAT5/TIP60 according to the unified nomenclature in the introduction and parts of the discussion, when we refer to the enzymes in more general terms, independent of species. We scrutinized the manuscript again and made a few changes to adhere to the above scheme.

      Consider moving the first two Discussion paragraphs (field context and challenges in antibody-based detection) into the Introduction to better frame the significance.

      Response: We thank the reviewer for this suggestion that improved the manuscript a lot. We incorporated the first two paragraphs of the discussion into the introduction.

      Significance

      This is a valuable and timely study for the histone acetylation field. The substrate specificity of many individual HATs remains incompletely understood owing to (i) cross-reactivity and limited selectivity of many anti-acetyl-lysine antibodies, (ii) functional redundancy among KATs, (iii) variability across in-vitro assays (HAT domain vs full-length/complex; free histones vs oligonucleosomes), and (iv) incomplete translation of in-vitro specificity to in-vivo settings. These factors have produced conflicting reports in the literature. By combining quantitative mass spectrometry with carefully engineered oligonucleosomal arrays, the authors make a principal step toward deconvoluting TIP60 biology in a controlled yet close-to-physiologically relevant system. Conceptually, the work delineates intrinsic, site-specific preferences of the TIP60 core on variant versus canonical nucleosomes, consistent with largely distributive behaviour and site-dependent inhibitor sensitivity. The inhibitor-dependent shifts in acetylation patterns are particularly intriguing and could enable dissection of residue-specific functions, with potential translational implications for preclinical cancer research and biomarker development. Overall, this manuscript will be of interest to the chromatin community, and I am supportive of publication pending satisfactory resolution of the points raised above.

      Response: Once more we thank the reviewer for their time and efforts devoted to help us improve the manuscript.


      Reviewer #2

      Major comments

      (...) A central limitation of the study, noted by the authors, is the uncertainty regarding the biological relevance of the findings. While the in vitro system provides a controlled framework for analyzing residue specificity and kinetics, it does not address the functional significance of these results in a cellular or organismal context. This limitation is outside the scope of the current work but indicates potential directions for follow-up studies. Within its defined objectives, the study presents a methodological framework and dataset that contribute to understanding TIP60 activity in a biochemical setting.

      Response: We agree with the referee.

      Minor comments

      While the manuscript is clearly presented overall, there are two minor issues that could be addressed:

      1. In Figure 1, the panels are not ordered according to their appearance in the Results section. In addition, the legends for Figures 1B and 1C appear to be swapped.

      Response: We thank the reviewer for spotting these oversights. We deleted the premature mention of Figure 1E and added the following explanation to the relevant panels in Figure 1: "The blot was reprobed with an antibody detecting H3 as an internal standard for nucleosome input." We also swapped the legends.

      For the quantitative MS data (N = 2 biological replicates), the phrasing "Error bars represent the two replicate values" could be refined. With N = 2, showing individual data points or the range may convey the information more transparently than conventional error bars, which are typically associated with statistical measures (e.g., SEM) from larger sample sizes. Alternatively, a brief note explaining the choice to use two replicates and represent them with error bars could be added.

      Response: We appreciate the reviewer's comment and have revised the figure to display individual data points for the two biological replicates instead of error bars, providing a clearer representation of the data distribution. We changed the phrasing 'Error bars represent...' to "Bars represent the mean of two biological replicates (each consisting of two TIP60 core complexes and two nucleosome arrays - each analyzed with two technical replicates), with individual replicate values shown as open circles." and hope that this describes the data better.

      Significance

      Krause and colleagues, using a clean in vitro system, define the substrate specificity of the Drosophila TIP60 core complex. They identify the main acetylation sites and their kinetic dynamics on H2A, H2A.V, and H4 tails, and further characterize the inhibitory activity of NU9056. This work addresses a longstanding question in the field and provides compelling evidence to support its conclusions. Future studies will be needed to establish the biological relevance of these findings.

      Response: We thank the reviewer for a thoughtful and constructive review of our manuscript. We appreciate the suggestions that helped to improve the manuscript.


      Reviewer #3

      (...) However, the authors should revisit some additional points:

      Major comments:

      1. The Tip60 core complex is usually described as containing three subunits: Tip60, Ing3 and E(Pc). The authors also included Eaf6 in their analysis, however, their motivation to include Eaf6 specifically remains unclear. They should explain in the manuscript why Eaf6 was included and how this could affect the observed acetylation pattern.

      Response: We thank the reviewer for the opportunity to emphasize this point. Motivated by findings in the yeast and mammalian systems that Eaf6 was important for acetylation, we added this subunit to our previously reconstituted 3-subunit piccolo complex. As can be seen by the comparison of the older data (ref Kiss) and the new data, the 4-subunit Tip60 core complex is a much more potent HAT. We amended the introduction accordingly. We also added a paragraph on what is known about the properties and function of Eaf6 to the discussion. Please see the amended text marked in red.

      The authors investigated the effectiveness of two Tip60 inhibitors by testing their effects on H4K12ac using an antibody. They state that "TH1834 had no detectable effect on either complex [Tip60 or Msl], even at very high concentrations." However, the initial publication describing TH1834 also stated that this inhibitor particularly affected H2AX with no direct effect on H4 acetylation. The authors should revisit TH1834 and specifically investigate its effect on H2A and, in particular, on H2Av as H2Av is the corresponding ortholog of H2AX.

      Response: The case of TH1834 is not very strong in the literature, which is why we discontinued the line of experimentation when we did not see any effect of TH1834 (2 different batches) on the preferred substrate. The reviewer's suggestion is very good, but given our limited resources we decided to remove the data and discussion of TH1834 from the manuscript (old Figure 4A). The deletion of these very minor data does not diminish the overall conclusion and significance of the manuscript.

      The authors performed a detailed analysis of NU9056 effects. However, they did not include effects on H2A. H2A is distinct from H4 and H2Av as it is the only one containing K10 and this lysine also showed high levels of acetylation by Tip60. Therefore, a comprehensive analysis of NU9056 effects should include analyzing its effects on H2A acetylation.

      Response: In these costly mass spec experiments, we strive to balance limited resources and most informative output. Because H2A.V and H4 are the major functional targets of Tip60, we considered that documenting the effect of the inhibitor on these substrates would be most appropriate. In hindsight, including H2A would have been nice to have, but would not change our conclusions about the inhibitor.

      The authors have previously reported non-histone substrates of Tip60. It would be interesting to test whether the two investigated Tip60 inhibitors affect acetylation of non-histone substrates of Tip60. This analysis would greatly increase the understanding of how selective these inhibitors are. (OPTIONAL)

      Response: We agree with the reviewer that the proposed experiments may be an interesting extension of our current work. However, the Becker lab will be closed down by the end of this year due to retirement, precluding major follow-up studies at this point.

      Minor comments:

      1. Fig. 1 a: instead of "blue residues", would it be more accurate to refer to "blue arrows"?

      Response: Yes of course - the text has been revised accordingly.

      Fig.1 b-c: it would be helpful to include which staining (silver/Ponceau?) was performed here.

      Response: The legends now contain the relevant information.

      Fig. 2a: I did not understand the shading for the K5/K8-K10ac panel from the figure legend. The explanation is present in the main text but would be helpful in the figure legend to allow easy access for readers.

      Response: We agree and revised text accordingly.

      Fig. 4 c: bar graphs on the top: the X-values are missing.

      Response: The figure has been revised accordingly.

      This sentence in the discussion seems to require revision: "Whereas the replication-dependent H2A resides in most nucleosomes in the genome, H2A.V, the only H2A variant histone in Drosophila, is incorporated by exchange of H2A, independent of replication."

      Response: We revised the sentence as follows to improve clarity. "While the replication-dependent H2A is present in most nucleosomes across the genome, H2A.V, the only H2A variant in Drosophila, is incorporated through replication-independent exchange of H2A."

      In this sentence: "A comparison with the TIP60 core complex is instructive since both enzymes are MYST acetyltransferases and bear significant similarity in their catalytic center." do the authors mean "informative" rather than "instructive"?

      Response: We replaced 'instructive' by 'informative'.

      Significance

      The findings are novel and expand our knowledge of Tip60 histone tail acetylation dynamics and specificity. The manuscript does not address the biological relevance of distinct acetylation marks, which is clearly beyond the scope of the study, but discusses their relevance where possible. The analysis of NU9056 is informative and relevant in a broad context. Optionally, the authors could expand their analysis of NU9056 on its effects on non-histone Tip60 targets to increase impact further. Their analysis of TH1834, however, is currently insufficient as they focused on H4 acetylation alone, which has already been reported to not be affected by TH1834. The authors should include an analysis of TH1834 effects on H2A and H2A.V acetylation. The manuscript is well written, easy to follow and of appropriate length. The methods are elegant and the findings of the study are novel. The manuscript targets researchers specifically interested in chromatin remodeling as well as a broader audience using the Tip60 inhibitor NU9056.

      Response: We thank the reviewer for their profound assessment and the general appreciation of our work. We agree that the analysis of the TH1834 is not satisfactory at this point and have removed the corresponding data and description from figure 4. The deletion of these very minor data does not diminish the overall conclusion and significance of the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In their manuscript, Krause et al. investigate Tip60 selectivity on histone tail acetylation. They use elegant mass spectrometry analysis to analyze lysine acetylation marks and combinations of acetylation marks on the histone tails of the Tip60 targets H2A, H2A.V and H4. They further consider distinct dynamics by performing a time course experiment and compare Tip60 to MOF. Using these methods, the authors describe interesting and previously undescribed selectivity, dynamics and di-acetylation patterns of Tip60 that will be the starting point of follow-up studies diving into the biological relevance of these findings. Lastly, they investigate the effects of two Tip60 inhibitors and characterize the effects of NU9056 on Tip60 histone tail acetylation in detail. These studies showed that NU9056 has selective effects, impacting some lysine acetylations with greater efficiency than others. As antibodies available to investigate histone acetylations affected by NU9056 are not selective enough, these findings are relevant for any user of NU9056.

      However, the authors should revisit some additional points:

      Major comments:

      1. The Tip60 core complex is usually described as containing three subunits: Tip60, Ing3 and E(Pc). The authors also included Eaf6 in their analysis, however, their motivation to include Eaf6 specifically remains unclear. They should explain in the manuscript why Eaf6 was included and how this could affect the observed acetylation pattern.
      2. The authors investigated the effectiveness of two Tip60 inhibitors by testing their effects on H4K12ac using an antibody. They state that "TH1834 had no detectable effect on either complex [Tip60 or Msl], even at very high concentrations." However, the initial publication describing TH1834 also stated that this inhibitor particularly affected H2AX with no direct effect on H4 acetylation. The authors should revisit TH1834 and specifically investigate its effect on H2A and, in particular, on H2Av as H2Av is the corresponding ortholog of H2AX.
      3. The authors performed a detailed analysis of NU9056 effects. However, they did not include effects on H2A. H2A is distinct from H4 and H2Av as it is the only one containing K10 and this lysine also showed high levels of acetylation by Tip60. Therefore, a comprehensive analysis of NU9056 effects should include analyzing its effects on H2A acetylation.
      4. The authors have previously reported non-histone substrates of Tip60. It would be interesting to test whether the two investigated Tip60 inhibitors affect acetylation of non-histone substrates of Tip60. This analysis would greatly increase the understanding of how selective these inhibitors are. (OPTIONAL)

      Minor comments:

      1. Fig. 1 a): instead of "blue residues", would it be more accurate to refer to "blue arrows"?
      2. Fig.1 b-c): it would be helpful to include which staining (silver/Ponceau?) was performed here
      3. Fig. 2a): I did not understand the shading for the K5/K8-K10ac panel from the figure legend. The explanation is present in the main text but would be helpful in the figure legend to allow easy access for readers.
      4. Fig. 4 c) bar graphs on the top: the X-values are missing.
      5. This sentence in the discussion seems to require revision: "Whereas the replication-dependent H2A resides in most nucleosomes in the genome, H2A.V, the only H2A variant histone in Drosophila, is incorporated by exchange of H2A, independent of replication."
      6. In this sentence: "A comparison with the TIP60 core complex is instructive since both enzymes are MYST acetyltransferases and bear significant similarity in their catalytic center." do the authors mean "informative" rather than "instructive"?

      Significance

      The findings are novel and expand our knowledge of Tip60 histone tail acetylation dynamics and specificity. The manuscript does not address the biological relevance of distinct acetylation marks, which is clearly beyond the scope of the study, but discusses their relevance where possible. The analysis of NU9056 is informative and relevant in a broad context. Optionally, the authors could expand their analysis of NU9056 on its effects on non-histone Tip60 targets to increase impact further. Their analysis of TH1834, however, is currently insufficient as they focused on H4 acetylation alone, which has already been reported to not be affected by TH1834. The authors should include an analysis of TH1834 effects on H2A and H2A.V acetylation.

      The manuscript is well written, easy to follow and of appropriate length. The methods are elegant and the findings of the study are novel. The manuscript targets researchers specifically interested in chromatin remodeling as well as a broader audience using the Tip60 inhibitor NU9056.

      My expertise: I am a researcher working with Drosophila melanogaster and have published on the functions of the Tip60-p400 complex. I do not have extensive expertise in nucleosome arrays, the major method applied in this manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, Krause and colleagues investigate the intrinsic substrate selectivity of the four-subunit TIP60 core module from Drosophila melanogaster using synthetic nucleosome arrays. To quantitatively assess acetylation at individual lysines on histones H2A, the variant H2A.V, and H4, the authors employ targeted mass spectrometry, thereby overcoming the limitations of antibody-based approaches. Contrary to earlier reports, their results reveal that the TIP60 core complex displays a selective lysine acetylation pattern, with distinct kinetics toward specific residues on each histone tail. For example, H2A lysines K5, K8, and K10 were acetylated, with K10 exhibiting the highest modification levels. On H2A.V, K4 and K7 were modified, with K7 showing greater initial efficiency. For H4, K12 was identified as the primary target, and its acetylation was further enhanced in the presence of H2A.V. The study also examined the activity of the KAT5 inhibitor NU9056, uncovering variable inhibition across different acetylation sites. Overall, the authors conclude that intrinsic substrate selectivity is central to understanding the mechanism of Tip60 activity and that the presence of H2A variants can modulate both the efficiency and specificity of acetylation.

      Major comments:

      The study by Krause et al. examines the in vitro substrate selectivity of the Drosophila TIP60 core complex and the lysine-specific effects of the inhibitor NU9056. The authors use a defined in vitro system with recombinant proteins and nucleosome arrays, together with targeted mass spectrometry, to assess intrinsic enzyme activity while avoiding potential issues of antibody specificity and avidity. Heatmaps and bar plots derived from the MS data show site-specific acetylation patterns and the effects of the inhibitor. A comparative analysis with the MSL core complex, which has a well-characterized selectivity, is used as a reference point for interpreting the specificity of TIP60. The observation that NU9056 exhibits different levels of effectiveness on individual lysines, including residues within the same histone tail, is supported by the quantitative MS measurements. A central limitation of the study, noted by the authors, is the uncertainty regarding the biological relevance of the findings. While the in vitro system provides a controlled framework for analyzing residue specificity and kinetics, it does not address the functional significance of these results in a cellular or organismal context. This limitation is outside the scope of the current work but indicates potential directions for follow-up studies. Within its defined objectives, the study presents a methodological framework and dataset that contribute to understanding TIP60 activity in a biochemical setting.

      Minor comments:

      While the manuscript is clearly presented overall, there are two minor issues that could be addressed:

      • In Figure 1, the panels are not ordered according to their appearance in the Results section. In addition, the legends for Figures 1B and 1C appear to be swapped.
      • For the quantitative MS data (N = 2 biological replicates), the phrasing "Error bars represent the two replicate values" could be refined. With N = 2, showing individual data points or the range may convey the information more transparently than conventional error bars, which are typically associated with statistical measures (e.g., SEM) from larger sample sizes. Alternatively, a brief note explaining the choice to use two replicates and represent them with error bars could be added.

      Significance

      Krause and colleagues, using a clean in vitro system, define the substrate specificity of the Drosophila TIP60 core complex. They identify the main acetylation sites and their kinetic dynamics on H2A, H2A.V, and H4 tails, and further characterize the inhibitory activity of NU9056. This work addresses a longstanding question in the field and provides compelling evidence to support its conclusions. Future studies will be needed to establish the biological relevance of these findings.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This study uses defined, reconstituted nucleosome arrays (H2A- or H2A.V-containing) and the four-subunit Drosophila TIP60 core complex to map intrinsic substrate selectivity across time courses and in the presence of reported TIP60 inhibitors (NU9056, TH1834). Key findings are: (i) selective H2A-tail acetylation (K10 > K8 > K5) with negligible K12/K14; (ii) preferential H2A.V K4 and K7 acetylation with distinct kinetics and low co-occurrence on a single tail; (iii) H4K12 is strongly favoured over other H4 sites; (iv) acetylation patterns are consistent with a more distributive (non-processive) mechanism relative to MOF/MSL; (v) NU9056 inhibits TIP60 activity with site-specific differences suggestive of a non-competitive/allosteric component, whereas TH1834 shows no effect in this Drosophila system.

      Major comments

      The study describes meticulously conducted and controlled experiments, showing the impressive biochemistry work consistently produced by this group. The statistical analysis and data presentation are appropriate, with the following major comments noted:

      1. Please clarify why K8ac/K12ac, K5ac/K16ac, K5ac/K12ac are not quantified (Figure 3). If undetected, state explicitly and annotate figures with "n.d." rather than leaving gaps. If detected but excluded, justify the exclusion.
      2. The statement "Nevertheless, combinations of di- and triacetylation were much more frequent if K12ac was included, suggesting that K12 is the primary target." is under-supported because only two non-K12ac combinations are shown, and only one is lower than K12ac-containing combinations. Either soften the claim ("trend toward ... in our dataset") or expand the analysis to all observed di/tri combinations with effect sizes, n, and statistical tests.
      3. Please provide a more detailed discussion about the known nature of NU9056 inhibition and how it fits or doesn't fit with your data. Are there any structural studies on this?
      4. Why was the inhibitor experiment MS only performed for H2A.V and not H2A? Given the clear H2A vs H2A.V differences reported in Figure 2, it would be useful to have the matched data for H2A.
      5. The inhibitor observations are very interesting as they can highlight systems to study the loss of specific acetyl residues: can the authors perform WB/IF validation in treated cells? I understand it will not be possible with the H2A antibodies, but the difference in H4K5ac vs H4K12ac should be possible to validate in cells.
      6. You highlight that H2A K10 (a major TIP60 site here) is not conserved in human canonical H2A. Please expand the discussion of the potential function and physiological relevance. Maybe in relation to H2A.V being a fusion of different human variants?
      7. To enable direct comparisons between variants and residues, please match y-axis scales where the biology invites comparison (e.g., H2A vs H2A.V; Figs. 2-3).

      Minor comments

      1. Add 1-2 sentences in the abstract on the gap in the field being addressed by the study.
      2. Either in the introduction or discussion, comment on your prior Tip60 three-subunit data (Kiss et al.). The three-subunit complex was significantly less active on H4, as indicated in that publication, which is likely due to the absence of Eaf6.
      3. Figure order/legends:

      a. Text references Fig.1E before Fig.1C, please reorder

      b. Fig.1B/C legend labels appear swapped.

      c. Fig.1E, 4A, 4B: add quantification

      d. Fig.2A: Note explicitly that K5-K10 and K8-K10 are unresolvable pairs to explain the shading scheme used.

      4. Ensure consistent KAT5/TIP60 naming.
      5. Consider moving the first two Discussion paragraphs (field context and challenges in antibody-based detection) into the Introduction to better frame the significance.

      Significance

      This is a valuable and timely study for the histone acetylation field. The substrate specificity of many individual HATs remains incompletely understood owing to (i) cross-reactivity and limited selectivity of many anti-acetyl-lysine antibodies, (ii) functional redundancy among KATs, (iii) variability across in-vitro assays (HAT domain vs full-length/complex; free histones vs oligonucleosomes), and (iv) incomplete translation of in-vitro specificity to in-vivo settings. These factors have produced conflicting reports in the literature. By combining quantitative mass spectrometry with carefully engineered oligonucleosomal arrays, the authors make a principal step toward deconvoluting TIP60 biology in a controlled yet close-to-physiologically relevant system. Conceptually, the work delineates intrinsic, site-specific preferences of the TIP60 core on variant versus canonical nucleosomes, consistent with largely distributive behaviour and site-dependent inhibitor sensitivity. The inhibitor-dependent shifts in acetylation patterns are particularly intriguing and could enable dissection of residue-specific functions, with potential translational implications for preclinical cancer research and biomarker development. Overall, this manuscript will be of interest to the chromatin community, and I am supportive of publication pending satisfactory resolution of the points raised above.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream of TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution to the field of scientists studying the molecular basis of muscle diseases.

      We thank the Reviewer for appreciating our work and for their valuable suggestions to improve our manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental efforts, will be addressed as detailed in the Revision Plan.

      In my opinion, a few aspects would improve the manuscript. Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      If possible, studies in which the overexpression of different mutants presenting specific altered functional domains (NHL domain known to bind RNAs and Ring domain reportedly involved in protein ubiquitination) would be used to test if they are capable or incapable of rescuing the reported alteration of Trim32 KO cell lines in c-Myc expression and muscle maturation.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 might be replicated with primary myoblasts or satellite cells devoid of Trim32.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      I also have a few minor points to highlight:

        • It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.

      Authors’ response. We thank the Reviewer for raising this point. We have now indicated the statistical analyses performed on the data presented in the mentioned figures (also in response to a point raised by Reviewer #3). Consistent with the conclusion that Trim32 is necessary for proper regulation of c-Myc transcript stability, 2-way ANOVA of the data now reported as Figure 5G shows a statistically significant effect of the genotype at 6h (right-hand graph) but not at D0 (left-hand graph). In the graphs of Fig. EV5 D and E, no significant changes are observed at D0, whereas at 6h the data show a significant difference at the 40 min time point. We included this information in the graphs and in the corresponding legends.

      - On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Authors’ response. As suggested, we included the graph showing the differentiation index upon c-Myc silencing in the Trim32 KO clones and in the WT clones, as a novel panel in Figure 6 (Fig. 6D). As already reported in the text, a partial recovery of differentiation index is observed but the increase is not statistically significant. In contrast, no changes are observed applying the same silencing in the WT cells. Legend and text were modified accordingly.

      Reviewer #1 (Significance (Required)):

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has a disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8.

      At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular partners-such as RNA-binding activity, microRNA involvement, or ligase function-remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking. Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8.

      This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase often associated with its dysregulation in Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the way of transcriptomic profiling that displays an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in-vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32-regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      We thank the Reviewer for valuing our work and for their appreciated suggestions to improve our manuscript. We have carefully addressed some of the concerns raised as detailed here, while others, which require more laborious experimental efforts, will be addressed as reported in the Revision Plan.

      Major Comments:

      The work is a bit incremental based on this:

      https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445

      And this:

      https://www.nature.com/articles/s41418-018-0129-0

      To their credit, the authors do cite the above papers.

      Authors’ response. We thank the Reviewer for this careful evaluation of our work against the current literature and for recognising the contribution of our findings to the complex picture of myogenesis, in which the involvement of Trim32 and c-Myc, and of the Trim32-c-Myc axis, can occur at several stages and likely within narrow time windows along the process, possibly explaining some inconsistencies among published reports.

      The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent an overgeneralization and better align the conclusions with the authors' data.

      Authors’ response. We agree with the Reviewer and have modified the phrasing that implied the Trim32-c-Myc axis is the exclusive mechanism, explicitly indicating in the Abstract and in the Discussion that other pathways contribute to ensuring proper myogenesis.

      The Abstract now reads: “… suggesting that the Trim32–c-Myc axis may represent an essential hub, although likely not the exclusive molecular mechanism, in muscle regeneration within LGMDR8 pathogenesis.”

      The Discussion now reads: “Functionally, we demonstrated that c-Myc contributes to the impaired myogenesis observed in Trim32 KO clones, although this is clearly not the only factor involved in the Trim32-mediated myogenic network; realistically other molecular mechanisms can participate in this process as also suggested by our transcriptomic results.”

      The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3 flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail to withdraw normally from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

      Authors’ response. We thank the Reviewer for appreciating our thorough analyses on cell cycle dynamics in proliferation conditions and at the onset of the differentiation process.

      The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across their multiple analyses like MDS, clustering, heatmaps, and bubble plots without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section, so the primary biological insights are highlighted and supported by each of their analyses. This would allow the biological implications to be more accessible to a broader readership.

      Authors’ response. We thank the Reviewer for raising this point and apologise for being too brief in describing the data, leaving indeed some points excessively implicit. As suggested, we have now reorganised this section and added the lists of enriched canonical pathways relative to WT vs KO comparisons at D0 and D3 (Fig. EV3B) as well as those relative to the comparison between D0 and D3 for both WT and Trim32 KO samples (Fig. EV3C), with their relative scores. We changed the Results section “Transcriptomic analysis of Trim32 WT and Trim32 KO clones along early differentiation” as reported below and modified the legends accordingly.

      The paragraph now reads: Based on our initial observations, the absence of Trim32 already exerts a significant impact by day 3 (D3) of C2C12 myogenic differentiation. To investigate how Trim32 influences early global transcriptional changes during the proliferative phase (D0) and early differentiation (D3), we performed an unbiased transcriptomic profiling of WT and Trim32 KO clones (Fig. 2A). Multidimensional Scaling (MDS) analysis revealed clear segregation of gene expression profiles based on both time of differentiation (Dim1, 44% variance) and Trim32 genotype (Dim2, 16% variance) (Fig. 2A). Likewise, hierarchical clustering grouped WT and Trim32 KO clones into distinct clusters at both timepoints, indicating consistent genotype-specific transcriptional differences (Fig. EV3A). Differentially Expressed Genes (DEGs) were detected in the Trim32 KO transcriptome relative to WT, at both D0 and D3. In proliferating conditions, 72 genes were upregulated and 189 were downregulated whereas at D3 of differentiation, 72 genes were upregulated and 212 were downregulated. Ingenuity Pathway Analysis of the DEGs revealed the top 10 Canonical Pathways displayed in Fig. EV3B as enriched at either D0 or D3 (Fig. EV3B). Several of these pathways can underscore relevant Trim32-mediated functions though most of them represent generic functions not immediately attributable to the observed myogenesis defects.

      Notably, the transcriptional divergence between WT and Trim32 KO cells is more pronounced at D3, as evidenced by a greater separation along the MDS Dim2 axis, suggesting that Trim32-dependent transcriptional regulation intensifies during early differentiation (Fig. 2A). Given our interest in the differentiation process, we therefore focused our analyses comparing the changes occurring from D0 to D3 in WT (WT D3 vs. D0) and in Trim32 KO (KO D3 vs. D0) RNAseq data.

      Pathway enrichment analysis of D3 vs. D0 DEGs allowed the selection of the top-scored pathways for both WT and Trim32 KO data. We obtained 18 top-scored pathways enriched in each genotype (-log(p-value) ≥ 9 cut-off): 14 are shared while 4 are top-ranked only in WT and 4 only in Trim32 KO (Fig. EV3C). For the following analyses, we employed thus a total of 22 distinct pathways and to better mine those relevant in the passage from the proliferation stage to the early differentiation one and that are affected by the lack of Trim32, we built a bubble plot comparing side-by-side the scores and enrichment of the 22 selected top-scored pathways above in WT and Trim32 KO (Fig. 2B). A heatmap of DEGs included within these selected pathways confirms the clustering of the samples considering both the genotypes and the timepoints highlighting gene expression differences (Fig. 2C). These pathways are mainly related to muscle development, cell cycle regulation, genome stability maintenance and few other metabolic cascades.

      As expected given the results related to Figure 1, moving from D0 to D3 WT clones showed robust upregulation of key transcripts associated with the Inactive Sarcomere Protein Complex, a category encompassing most genes in the “Striated Muscle Contraction” pathway, while in Trim32 KO clones this pathway was not among those enriched in the transition from D0 to D3 (Fig. EV3C). Detailed analyses of transcripts enclosed within this pathway revealed that in the transition from proliferation to differentiation, WT clones show upregulation of several Myosin Heavy Chain isoforms (e.g., MYH3, MYH6, MYH8), α-Actin 1 (ACTA1), α-Actinin 2 (ACTN2), Desmin (DES), Tropomodulin 1 (TMOD1), and Titin (TTN), a pattern consistent with previous reports, while these same transcripts were either not detected or only modestly upregulated in Trim32 KO clones at D3 (Fig. 2D). This genotype-specific disparity was further confirmed by gene set enrichment barcode plots, which demonstrated significant enrichment of these muscle-related transcripts in WT cells (FDR_UP = 0.0062), but not in Trim32 KO cells (FDR_UP = 0.24) (Fig. EV3D). These findings support an early transcriptional basis for the impaired myogenesis previously observed in Trim32 KO cells.

      In addition to differences in muscle-specific gene expression, we also observed that several pathways related to cell proliferation and cell cycle regulation were more enriched in Trim32 KO cells compared to WT. This suggests that altered cell proliferation may contribute to the distinct differentiation behavior observed in Trim32 KO versus WT (Fig. 2B). Given that cell cycle exit is a critical prerequisite for the onset of myogenic differentiation and considering that previous studies on the role of Trim32 in cell cycle regulation have reported inconsistent findings, we further examined cell cycle dynamics under our experimental conditions to clarify the contribution of Trim32 to this process.

      The work would be greatly strengthened by the inclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

      Authors’ response. We thank the Reviewer for bringing these two publications to our attention; they indeed add important data on the in vivo complexity of the role of c-Myc in myogenesis. We have included this point in our Discussion.

      The Discussion now reads: “On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025). Other reports, instead, demonstrated the implication of c-Myc periodic pulses, mimicking resistance-exercise, in muscle growth, a role that cannot though be observed in our experimental model (Edman et al., 2024; Jones et al., 2025).”

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Authors’ response. As suggested, we modified the z-score-representing colors using a more distinct gradient especially in the positive to negative transition in Figure 2B.

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

      Authors’ response. As now better explained (see comment regarding Major point: Transcriptomics), we used a cut-off of -log(p-value) above or equal to 9 for pathways enriched in DEGs of the D0 vs D3 comparison for both WT and Trim32 KO. The threshold is now included in the Results section and the pathways (shared between WT and Trim32 KO and unique) are listed as Fig. EV3C.
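      The cut-off described in this response can be sketched as a simple filter. This is only an illustration, assuming the logarithm is base 10 (the response does not state the base); the pathway names and p-values are hypothetical placeholders, not data from the study.

```python
import math

# Hypothetical enrichment results: pathway name -> p-value (illustrative only).
enrichment = {
    "pathway_A": 1e-12,
    "pathway_B": 2e-10,
    "pathway_C": 3e-7,
}

THRESHOLD = 9  # cut-off on -log10(p-value), as stated in the response


def neg_log10(p):
    """Return -log10(p) for a p-value in (0, 1]."""
    return -math.log10(p)


def select_pathways(results, threshold=THRESHOLD):
    """Keep pathways whose -log10(p-value) is at or above the threshold."""
    return sorted(name for name, p in results.items() if neg_log10(p) >= threshold)


selected = select_pathways(enrichment)
```

      Applying the same threshold to the WT and Trim32 KO comparisons separately, then intersecting the two selections, would give the shared and unique pathway lists mentioned for Fig. EV3C.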

      The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

      Authors’ response. We thank the Reviewer for this remark, and we apologise for having overlooked it. We amended this throughout the manuscript by always using for clarity “Trim32 KO clones/cells”.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

      Authors’ response. We agree with the Reviewer that C2C12 passaging can reduce the differentiation potential of this myoblast cell line; this is indeed the main reason why we decided to employ WT clones, which underwent the same editing process as those carrying mutations in the Trim32 gene, as reference controls throughout our study. We apologise for not indicating the passages in the first version of the manuscript. The Methods section has now been amended as follows:

      The C2C12 parental cells used in this study were maintained within passages 3–8. All clonal cell lines (see below) were utilized within 10 passages following gene editing. In all experiments, WT and Trim32 KO clones of comparable passage numbers were used to ensure consistency and minimize passage-related variability.

      Reviewer #2 (Significance (Required)):

      General Assessment:

      This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach as the authors incorporated transcriptomics, cell cycle profiling, and stability assays which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones which helps address potential clonal variation issues that could occur. The largest limitation to this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

      Advance:

      To my knowledge, this is the first study to demonstrate the mRNA stability level of c-Myc regulation by Trim32, rather than through the ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete understanding of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

      Audience:

      This research will be of interest to an audience that focuses on broad skeletal muscle biology but primarily to readers with more focused research such as myogenesis and neuromuscular disease (LGMDR8 in particular) where the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside of skeletal muscle that study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

      Expertise:

      My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

      We thank the Reviewer for the in-depth assessment of our work and the valuable suggestions to improve the manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental effort, will be addressed as detailed in the Revision Plan.

      - TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      - The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      -Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR

      Authors’ response. We thank the reviewer for this suggestion. This point will be addressed as detailed in the Revision Plan. We have selected several transcripts that will be evaluated in independent samples in order to validate the RNAseq results.

      - The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      - There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.

      Authors’ response. We thank the Reviewer for raising this point.

      Regarding the replicates, we clarified in the Methods and Legends that the Trim32 KO experiments have been performed on 3 biological replicates (independent clones) and the same for the reference control (3 independent WT clones), except for the Fig. 6 experiments that were performed on 2 Trim32 KO and 2 WT clones. All the Western Blots, immunofluorescence, qPCR data are representative of the results of at least 3 independent experiments unless otherwise stated. We reported the number and type of replicates as well as the microscope fields analyzed.

      We repeated the statistical analyses of the data in Figure 5G, EV5D, and EV5E using the more appropriate two-way ANOVA test, as suggested, and we now report this information in the graphs and legends.

      We agree with the Reviewer's point on normalisation and have replaced the graphs in Fig. EV5B and 6B with versions showing the control values normalised as suggested. The statistical analyses now reflect this change.
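      The normalisation issue raised by the Reviewer can be illustrated with a small numerical sketch. It assumes that "normalized to one another" means dividing every value by the mean of the controls, which is one common reading and not necessarily the exact procedure used in the manuscript; the numbers are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical raw signal (arbitrary units) from three independent experiments.
control = [80.0, 100.0, 120.0]
treated = [40.0, 45.0, 70.0]

# Approach A: divide each experiment by its own control.
# Every control becomes exactly 1, so the control group has zero variance.
control_a = [c / c for c in control]
treated_a = [t / c for t, c in zip(treated, control)]

# Approach B: divide all values by the mean of the controls.
# The control group averages 1 but keeps its experiment-to-experiment spread.
control_mean = mean(control)
control_b = [c / control_mean for c in control]
treated_b = [t / control_mean for t in treated]

sd_a = stdev(control_a)  # zero: control variability is hidden
sd_b = stdev(control_b)  # positive: control variability is retained
```

      With approach A a test compares the treated group against a control group that has no spread, so the control variance is understated; approach B keeps that spread available to the test.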

      -Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."

      Authors’ response. We re-edited this revised version of the manuscript as suggested.

      -Results in Figure 5A should be quantified

      Authors’ response. We addressed this point by quantifying the results shown in Fig. 5A and adding a graph of the quantification of 3 experimental replicates to the Figure. The quantification confirms that no statistically significant difference is observed. The Figure and its legend have been modified accordingly.

      -Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D

      Authors’ response. We agree with the Reviewer that the presence of p84 in the cytoplasmic fraction as well is not ideal. Regrettably, we observed this faint p84 band in all the experiments performed. We think, however, that this does not affect the result, which clearly shows that c-Myc and Trim32 are never detected in the same compartment.

      -In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.

      Authors’ response. We agree with the Reviewer and we now show the graph of the results of the 3 technical replicates for 2 biological replicates and do not indicate any statistics (Fig. 6B). The graph was also modified according to a previous point raised.

      -The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?

      Authors’ response. We agree with the Reviewer that Trim32 might also be necessary for myoblast fusion. This point is however beyond the scope of the present study and will be addressed in future work.

      - The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      Reviewer #3 (Significance (Required)):

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Reply to the Reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses + The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A). + Though this is not stated in the MS 2. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      1. The text contains several...

      Response: We added...

      Referee #2

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

      • TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.
      • The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.
      • Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR.
      • The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.
      • There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.
      • Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."
      • Results in Figure 5A should be quantified.
      • Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D.
      • In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.
      • The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?
      • The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

      Significance

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase often associated with its dysregulation in Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the form of transcriptomic profiling that displays an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32-regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      Major Comments:

      The work is a bit incremental based on this: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445 And this: https://www.nature.com/articles/s41418-018-0129-0 To their credit, the authors do cite the above papers.

      The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent an overgeneralization and better align the conclusions with the authors' data.

      The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3 staining, flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail to withdraw normally from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

      The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across their multiple analyses like MDS, clustering, heatmaps, and bubble plots without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section, so the primary biological insights are highlighted and supported by each of their analyses. This would allow the biological implications to be more accessible to a broader readership.

      The work would be greatly strengthened by the inclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis. Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

      The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

      Significance

      General Assessment:

      This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach as the authors incorporated transcriptomics, cell cycle profiling, and stability assays which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones which helps address potential clonal variation issues that could occur. The largest limitation to this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

      Advance:

      To my knowledge, this is the first study to demonstrate the mRNA stability level of c-Myc regulation by Trim32, rather than through the ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete understanding of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

      Audience:

      This research will be of interest to an audience that focuses on broad skeletal muscle biology but primarily to readers with more focused research such as myogenesis and neuromuscular disease (LGMDR8 in particular) where the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside of skeletal muscle that study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

      Expertise:

      My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream of TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution for scientists studying the molecular basis of muscle diseases. In my opinion, a few aspects would improve the manuscript.

      Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.
      2. If possible, studies in which different mutants with specific altered functional domains (the NHL domain, known to bind RNAs, and the RING domain, reportedly involved in protein ubiquitination) are overexpressed to test whether they can rescue the reported alterations in c-Myc expression and muscle maturation in Trim32 KO cell lines. An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 can be replicated in primary myoblasts or satellite cells devoid of Trim32.

      I also have a few minor points to highlight:

      • It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.
      • On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Significance

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8. At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular partners (such as RNA-binding activity, microRNA involvement, or ligase function) remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking.

      Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8. This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Shukla et al. describes the "chromatin states" in the bryophyte Marchantia polymorpha and compares them with those in Arabidopsis thaliana. The authors describe the generally common features of chromatin states between these evolutionarily distant plant species, but they also find some differences. The authors also studied the connection between chromatin states and TF binding, mostly in Arabidopsis due to the scarcity of TF binding data in Marchantia. Their analyses led to the interesting finding that specific transcription factor families tend to associate with specific chromatin states, which in turn tend to associate with specific genomic regions such as promoter, TSS, gene body, and facultative heterochromatin. Overall, the authors provide a novel piece of information regarding the evolutionary conservation of chromatin states and the relationship between chromatin states and TFs.

      Major comments:

      1. At the end of the abstract they state "The association with the +1 nucleosome defines a list of candidate pioneer factors we know little about in plants", which is one of their major points. This is based on the results in Fig. 4F and 4G, described on P27 L16-17. The question is: are cluster 1 TFs really associated with the +1 nucleosome? From Fig. 1C, the +1 nucleosome is characterized mostly by the E1 state and also by E2, F3, and F4. However, from Fig. 4F, cluster 1 TFs are not associated with E1/E2, and the association is not particularly strong for F3/F4. Indeed, association with E1/E2 is much more conspicuous for cluster 4 TFs. Therefore, the authors should reconsider this point and consider rephrasing or showing results of further analyses.

      2. P17 last line to P18, they state "The facultative heterochromatin states were primarily associated with the intergenic states I1 to I3, based on their enrichment in H3K27me3 and H2AK121ub, low accessibility, and low gene expression". I'm not sure about this statement. How can they say "primarily associated" from the data they cite? As far as the PTM and variant patterns are concerned, I1 to I3 and facultative heterochromatin look different. The authors should explain more or rephrase.

      3. P20 L15, the authors state "Contrary to Arabidopsis, the promoters of Marchantia defined by the region just upstream of the TSS showed enrichment of H2AUb and the elongation mark H3K36me3, along with other euchromatic marks." I have a concern that the TSS annotation could be inaccurate in Marchantia compared to the more rigorously tested annotation of Arabidopsis thaliana, so that the relationship between TSS and histone PTMs could be different between species. The authors should make sure this is not the case.

      4. P21 last line to P22, they analyzed only H3K27me3 and H2Aub in the mutants of E(z) (Fig. 2E) and state that "we analyzed chromatin landscape in the Marchantia...". Is analyzing two histone marks enough to say "chromatin landscape"? In addition, they state "These findings suggest a strong independence of the two Polycomb repressive pathways in Marchantia." However, they did not analyze the reverse, i.e., the effect of loss of PRC1 on H3K27me3. In fact, in Arabidopsis, loss of PRC1 causes loss of both H2Aub and H3K27me3 (Zhou et al (2017) Genome Biol: DOI 10.1186/s13059-017-1197-z).

      5. Related to the above comments, they state "To further compare the regulation by PRC2 in both species,". However, they did not describe what is known about regulation by PRC2 in Arabidopsis. They should consider describing it.

      6. P25 L14: "With this method to estimate TF activity, the scores of TF occupancy and activity converged. To look at different patterns of chromatin preferences among TFs, we kept ChIP-seq and DAP-seq data for ~300 TFs in Arabidopsis (after filtering out TFs with low scores of occupancy and activity)." This part is a little hard to follow. Perhaps better to explain in more detail.

      7. In the discussion section, P30 L19-21: "This could be due to open chromatin, which is associated with highly expressed genes and permissive for TF binding, generating highly occupied target regions (HOT) with redundant or passive activity (19)." This part needs further explanation; especially for the latter part, it is not clear what the authors claim.

      Minor comments:

      1. P17 L21: H2bUb should be H2Bub.

      2. Legend of Fig. 4D: later should be latter.

      3. Legend of Fig. 4G and H: "clusters defined in figure-H" should be "defined in Fig. 4F"?

      Referee cross-commenting

      Reviewer #1 raises thorough and important points that should be addressed before the manuscript is published. This applies particularly to the comparison of chromatin states between Arabidopsis and Marchantia: as this paper will lay a foundation for future research and serve as a resource for the community, the authors should thoroughly look into the points raised by reviewer #1, including the annotation of transcriptional units.

      Significance

      Strength and limitation: The strength of this paper is the insight into chromatin-based transcriptional regulation gained by defining chromatin states using a combination of many epigenome datasets and comparing them with TF binding data. A limitation is the lack of experimental support for their interesting claims, for example by perturbing histone PTMs. Another limitation is that comparing only two species can yield only subjective judgments of "similar" or "different" between species.

      Advance comparing past literature: One clear advance is studying chromatin states in a plant other than Arabidopsis thaliana. Another one is revealing that TFs can be classified into a number of groups according to the relationships with chromatin-based transcription regulation. However, experimental tests for these are awaited.

      Audience: Epigenetics, chromatin, and transcription researchers, plant biologists interested in transcriptional regulation.

      My expertise: Epigenome, genetics, histone PTMs, plants

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors characterize chromatin states in the flowering plant Arabidopsis thaliana and the bryophyte Marchantia polymorpha. Here, they draw from ChIP-seq data that was previously published, and from data generated as part of this study, in particular for Marchantia H2A variants (H2A.X.1, H2A.X.1, H2A.Z, H2A.M.2). The authors compute chromatin states, which enables a comparison over more than 450 million years of land plant evolution. While comparisons of plant chromatin to other species highlighted conservation as well as differences, this study targets a knowledge gap of evaluating chromatin conservation during land plant evolution. The authors investigate a connection between transcription factor binding sites and chromatin states. They propose a list of candidate pioneer factors associating with the +1 nucleosome.

      Major comments:

      • For the Association of chromatin states with expression, the authors use the TAIR10 annotation for extracting TSSs and promoter sequences. When investigated, a comparison of data resolving TSS with this annotation (or Araport11) shows a pretty poor overlap between the TSS based on Tair10/Araport11 and experimentally derived TSSs. This information was captured in Arabidopsis genome annotation files where the experimental TSS matches the genome annotation. What is the advantage of using an annotation with the inaccurate TSSs in TAIR10? It seems to confound the study.

      • The TSS annotation in Marchantia polymorpha (Tak1 v7.1) may also match poorly to the experimentally derived TSS. I suggest that the authors generate data to detect TSS in their tissue of choice and compare the positions to the genome annotation they use (f.x. PMID: 38831668).

      • I am not convinced that it is a wise choice to utilize fewer ChIP-seq data in Marchantia than Arabidopsis. Can the missing Marchantia ChIP-seq experiments not be performed and included to complete the comparison?

      • P. 26 onwards, the authors investigate different TF clusters and their association with chromatin states. They state "cluster 1 TFs primarily associated with the first nucleosome downstream of the TSS". However, if the gene is not really expressed in these "leaf" tissues, then how can the authors be sure that the same TSS position would be used in "flower" tissue? It could be an artifact of a genome annotation file that misses flower-tissue TSS data. It is not an obvious conclusion to name these factors "pioneer TFs". Experiments testing this are missing as far as I can gather.

      Minor comments:

      • Can the authors add files (e.g. .bed) with their segmented chromatin states as part of their GEO submission? That could improve the impact and make the findings more accessible.

      • Can the authors rule out issues with the Marchantia annotation, for example missing read-through transcription or alternative isoforms, that would essentially have the effect that the genomic segmentation they use contains elongating upstream transcripts in front of the promoter TSS? This could be an alternative explanation for the enrichment of H2AUb/H3K36me3 just upstream of the TSSs as they describe on p.21. If it cannot be ruled out, the limitations from genome annotations, and examples offering improvements, could be highlighted in the discussion. This may also be supported by the long persistence of E4 after the TTS (p.23).

      • P.23 - "This further suggests that in Marchantia, the orientation of genes defines distinct chromatin environment in their vicinity, through mechanisms yet to be uncovered." Does this correlate with the distance of the closest (annotated) transcript pairs?

      • The E1 state highlighted on p.24 and in Fig. 3A/D is not annotated in Fig. 3A/D. It is also not clear from the legends which number it is.

      • P.30 - "The marks H3K4me1 and H3K36me3 reflecting transcriptional elongation and confined to the gene bodies in Arabidopsis, extend beyond the TTS in Marchantia, suggesting that signals for transcriptional termination differ between flowering plants and bryophytes." There are multiple alternative explanations. Most likely, it is a combination of missing transcripts in their genome annotation (e.g. lncRNAs), annotation errors (e.g. wrong ends) and the segmentation of these regions (e.g. the transcripts are closer than in Arabidopsis). The discussion could be extended significantly to address these issues and include the efforts to improve the genome annotations.

      Referee cross-commenting

      Reviewer #2 raises fair and valuable questions.

      Significance

      Significance: The authors corroborate prior chromatin state analyses in Arabidopsis and provide a chromatin state analysis for Marchantia. These data represent a resource that will be used and appreciated by the plant and ChromEvoDevo communities. The quality of the analyses is high and the description is transparent. I am not aware of a similar study comparing bryophytes and a land plant, so this study addresses a gap in knowledge.

      General assessment: The quality of the manuscript is high. The analyses are described well, and in sufficient detail to be understood. The effort going into documentation is high, I rate the study as reproducible. The linked github deposition looks good. The data generated as part of this study is available in the linked GEO deposition. An experimental design of 2 biological repeats is used, which is OK, but the lower limit. The GEO-deposited .bw files should be of interest to the ChromEvoDevo community, and researchers interested in Marchantia epigenetics and gene expression. The manuscript is written clearly and to the point. The figures condense a lot of data and match the text. The figures are rather complex and not easily accessible to someone browsing through a journal issue. However, that is fine for these types of papers. The manuscript is strong on data analysis. Other approaches, for example mutants to validate their hypothesis, are not utilized. The calculation of chromatin states offers a way to condense complex information into simpler terms. Nevertheless, it re-organizes information that largely existed before. To me, the biggest value of this study appears to be to regard it as a resource that calculated the chromatin states in a comparable fashion between organisms.

      Advance: The manuscript provides several advances. It provides new ChIP-seq data for Marchantia, it generates a chromatin state map for Marchantia, it compares chromatin state maps across a large evolutionary distance, and it generates a new hypothesis regarding pioneer TFs in plants. Some of the points described in the article hold true for even larger evolutionary distances, for example comparing plants to yeast and metazoans. The manuscript fills a knowledge gap and offers a comparison via the computation of comparable chromatin states.

      Audience: The audience will be colleagues interested in chromatin and epigenetics, the Marchantia and plant communities as well as researchers interested in EvoDevo of chromatin organization. Even though the study uses plant models, it is highly relevant for non-plant models.

  3. Oct 2025
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In the manuscript "Nucleosome positioning shapes cryptic antisense transcription", Kok and colleagues perform a characterization of nucleosome remodeling factors in S. pombe by assaying the impact of their deletion on antisense transcription and nucleosome organization. They find that deletion of Hrp3 leads to up-regulation of antisense RNA transcripts as well as disruption of phased nucleosomes in gene bodies. The authors then establish a catalogue of antisense transcripts in S. pombe using long read RNA sequencing, which they use to analyze the relationship between nucleosome positioning and antisense transcription. Through this analysis, they associate nucleosome positioning with the initiation of antisense transcription and conclude that nucleosome positioning within gene bodies represses cryptic antisense transcription. They further support this observation by showing that the up-regulated genes in the Hrp3 knock-out are enriched for genes usually expressed in meiosis, which in S. pombe often occur as nested transcripts in reverse orientation. Using growth assays under various stress conditions, the authors narrow down the domain responsible for the phenotype to the C-terminal CHCT domain. To address how Hrp3 gains specificity, they perform an in-silico interaction prediction screen to identify Prf1 as a putative interactor of the CHCT domain. Using recombinant expression in bacteria followed by pulldowns from lysates, they confirm the interaction and introduce point mutants that abolish the interaction. The authors then link the interaction with Prf1 to transcriptional elongation, where they observe a correlation between Hrp3 presence and chromatin marks of transcription elongation, especially H2BK119ub, which is also reduced in the Hrp3 knockout. They further demonstrate that both gene body nucleosome phasing and antisense transcription are similarly affected in the prf1 knockout as well as the hrp1-hrp3-prf1 triple knock-out cells, which indicates that they affect the same pathway.

      Major comments:

      The manuscript is well-written and the claims are generally supported by the data. The authors demonstrate scientific rigor through comprehensive experiments using single and double knockouts. I have three main comments that can be addressed through additional analysis and limited experimentation:

      1. The authors use the terms "Prf1" and "Paf1 complex" interchangeably multiple times in the manuscript (e.g., line 296). However, the experimental data presented only demonstrate a connection between Prf1 and Hrp3. Furthermore, published literature establishes that Prf1 and Paf1 represent distinct entities in S. pombe (Mbogning et al., 2013, PLoS Genetics 9(3): e1004029). The authors should clarify this distinction and use consistent, accurate terminology throughout the text. Reference: Mbogning, J., et al. (2013). The PAF Complex and Prf1/Rtf1 Delineate Distinct Cdk9-Dependent Pathways Regulating Transcription Elongation in Fission Yeast. PLoS Genetics, 9(3), e1004029. https://doi.org/10.1371/journal.pgen.1004029

      2. The authors demonstrate that Hrp3 limits antisense promoter usage; however, the analysis lacks characterization of sequence composition, promoter classes (TATA-box versus TATA-less), or identification of enriched transcription factor motifs near these sites. A more thorough bioinformatic analysis would strengthen the paper and potentially reveal interesting biology, as the effect may be specific to certain transcription factors or promoter architectures.

      3. The Hrp3-Prf1 interaction is demonstrated solely through recombinant overexpression and pulldown assays, which carries the risk of detecting non-physiological interactions. While the authors use mutations to verify pulldown specificity, in vivo evidence for this interaction is absent. Given that the authors cite a recent preprint demonstrating sophisticated techniques to show S. cerevisiae Chd1-Prf1 interactions, I presume standard approaches such as co-immunoprecipitation followed by mass spectrometry or Western blot were attempted. Even negative results from such experiments should be reported, as readers will likely question the physiological relevance of the interaction. Additionally, establishing the hierarchy between Hrp3, Prf1, and H2BK119Ub is crucial. While the authors show that Hrp3 ChIP-seq signal correlates with gene expression levels, the proposed Prf1-Hrp3 interaction raises questions about recruitment specificity and hierarchy. The authors mention in lines 344-345: "...the CHCT domain of Hrp3 is critical for its association with transcription elongation along the gene body..." which requires support from experimental data. Testing Hrp3 ChIP-seq in Prf1-depleted conditions would clarify how specificity is achieved and substantiate the functional importance of this interaction. As the authors have all the required strains I would estimate around 1.5-2 months for data generation and analysis.

      4. [Optional] Based on structure predictions, the authors suggest that the interaction of CHD1 and RTF1 is conserved in Arabidopsis and mouse. This should be further supported by pulldown assays, and the pre-print (Reference nr. 99) should also be cited, as it shows similar results using yeast two-hybrid assays.

      Minor comments:

      1. Figure 1B: Grouping individual panels according to different paralog groups would make the figure more accessible.

      2. Figure 1D: The display of antisense transcription is not accessible. Perhaps boxplots, like those in Figures 2B and 5D, would be easier to read.

      3. Line 335: The transition is abrupt and would benefit from additional explanation. Why do the authors use Rtf1 instead of Prf1 here? Consistent nomenclature would improve clarity.

      4. Line 352: For the phrase "significant loss," please provide a statistical test or omit the word "significant."

      5. Figure 7F: The model presented in panel F suggests that there are two parallel routes that lead to nucleosome phasing; however, the authors state in the text (lines 363-364): "further supporting the idea that Hrp3 and Prf1 act together in the same pathway to control antisense transcription." The model and the text should align better.

      Significance

      • In the study, the authors establish Hrp3, one of the fission yeast CHD1 remodelers, as a crucial regulator of antisense transcription within gene bodies, which they link to both fitness penalties and the regulation of genes typically expressed during meiosis. They further link the recruitment of Hrp3 at gene bodies to transcriptional elongation, which provides an interesting model for how antisense transcription is prevented in actively transcribed regions of the genome.

      • The study is overall very well executed and controlled and provides strong evidence for connecting Hrp3 with the repression of antisense transcription using adequate experiments and technologies. This provides novel insights into a widespread phenomenon present in many organisms. A point that needs further improvement is the suggested physical link between Hrp3 and Prf1. Despite potentially being challenging to address using molecular biology techniques, the authors can further improve the study by dissecting the genetic hierarchy of Hrp3 and Prf1 using accessible tools. This study will be of interest to a broad audience in basic research as it addresses the broad question of how antisense transcription is repressed and provides mechanistic insights into this process. Consequently, this study will be relevant for the broader field of transcriptional regulation and could provide entry points for studying the role of CHD remodelers in other organisms.

      • Field of expertise: chromatin biology, small RNA mediated heterochromatin formation

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Kok et al. report on the role of the chromatin remodelers Hrp1 and Hrp3 in maintaining nucleosome positioning and preventing antisense transcription in Schizosaccharomyces pombe. As commented below, the main criticism of the manuscript is that the first half describes results that are very similar to those already reported by several other laboratories. Therefore, the main novel aspect of the work is the interaction between Hrp3 and the Prf1 subunit of the PAF complex.

      Specific points:

      1. The articles of Hennig et al. (2012), Pointner et al. (2012) and Shim et al. (2012) are cited in the manuscript (line 119, Refs. 61-63) only as a confirmation of the minor effect of the absence of Hrp1 on nucleosome positioning and antisense expression. However, these three articles reached the same conclusion as Kok et al. that the absence of Hrp3 in S. pombe causes severe, genome-wide loss of nucleosome positioning and overexpression of antisense transcripts, whereas the absence of Hrp1 has a much weaker effect. These results were also discussed in a short review article (Touat-Todeschini et al. EMBO J. 2012. 31: 4371). Although Kok et al. analysed transcription at a higher resolution and mapped transcription initiation using Pro-Seq (Figures 1, 2 and 3), their results do not add much to what was already reported in these previous studies.

      2. Several places in the manuscript state that Hrp3 belongs to the SWI/SNF family of chromatin remodelers (for example, line 92). However, Hrp3 is a member of the CHD family, whose members have a very different structure and function (see, for example, Clapier et al. 2017. Nat Rev Mol Cell Biol 18: 407; Paliwal et al. 2024 TIGs 41:236).

      3. The authors should indicate where the nucleosome remodelling activity of some of the proteins in Figure 1A (like Irc20, Rrp1, Rrp2 and Mot1) has been reported.

      4. The analysis of nucleosome positioning by aggregating thousands of genes, such as those shown in Figure 1B, has low resolution and can only detect gross alterations affecting many genes. Nevertheless, several mutants, such as swr1∆ and rrp1∆, also exhibit altered nucleosomal profiles in Figure 1B. In other cases, the occupancy of the first and second nucleosomes after the TSS is reduced relative to the wild type. Therefore, it cannot be concluded that "nucleosome arrays in wild type and most remodeller mutant cells were highly ordered and regular" (line 105).

      5. Although it was previously reported that hrp3∆ mutants overexpress antisense transcripts (see point 1 above), it is unclear how this finding is represented in Figure 1D. Similarly, it is also unclear why antisense transcription is undetectable in hrp1∆ relative to WT in Figure 1D, yet significantly higher than in WT in Figures 2B, 3A and 3B. Furthermore, sense transcription in the single and double mutants is comparable to WT in Figure 2A, yet much higher in Figure S3B.

      6. Figure S3C claims that antisense transcription is higher in genes with greater nucleosome disruption in the double mutant hrp1∆hrp3∆. However, without a quantitative analysis, it is difficult to discern any significant differences in the degree of disruption across the four quartiles of antisense expression.

      7. Figures 3D and S4C show that the TSS of antisense transcription colocalizes with a region resistant to MNase that is at least 300 bp wide. This size does not correspond to that occupied by a nucleosome and contrasts with the expected size of the four nucleosome peaks downstream from it.

      8. In relation to the previous point, Figure S4C (bottom) shows that the centre of the region above the TSS is slightly displaced in the three mutants. This displacement corresponds to an increase in the G+C content of approximately 1.5% (Figure S4C top), equivalent to an increase of less than 2.5 Gs and Cs every 150 bp of nucleosomal DNA. Without some cause and effect experiments, it is difficult to attribute a functional significance to such a tiny difference. How reproducible is this difference in biological replicates?

      9. The authors should also explain how the position of the dyads was estimated in the double mutant hrp1∆hrp3∆ in Figure S4B. The severe loss of nucleosomal positioning suggests that the dyads occupy different positions in different cells within the same population. While most of the remaining figures show data for the three mutants, this figure shows results for the double hrp1∆hrp3∆ mutant only.

      10. Figures 3G and 3H show the analysis of the promoter activity of some regions upstream from antisense transcripts, achieved by replacing the endogenous ura4 gene promoter with these regions. This analysis lacks negative controls showing the level of transcription in the recipient strain following the removal of the endogenous ura4 promoter and its replacement with genomic regions not associated with the initiation of antisense transcription in the mutants. Furthermore, transcription should be measured by quantitative PCR of the ura4 mRNA rather than by the more indirect method of measuring OD600 in 384-well plates (line 708).

      11. Figure F4 suggests that Hrp3 may regulate the expression of genes specific to meiosis by showing an anticorrelation between the expression levels of Hrp3 and a selection of genes that are upregulated during meiosis (MUGs) 5 hours after the onset of meiosis. While this is an interesting possibility, it will remain speculative until it is demonstrated that the level of Hrp3 protein is reduced at the same stage of meiosis, and that MUG overexpression is associated with reduced nucleosomal occupancy adjacent to their TSS at that stage.

      12. The experiments in Figures 5 and 6, which describe the interaction between the Hrp3-specific CHCT domain and the Prf1 protein, are interesting and represent the main element of novelty of the manuscript. However, the interaction shown in Figures 6D and 6E should be confirmed in vivo.

      13. Kok et al. indicate that the triple prf1∆ hrp1∆ hrp3∆ mutant exhibits stronger growth defects than the single prf1∆ mutant. However, Figure S9F shows that no growth is detectable in the single prf1∆ mutant, a phenotype that cannot be exacerbated in the triple mutant. Perhaps the use of a prf1 mutant showing a less severe phenotype might help.

      Significance

      As indicated in point 1, the first half of the manuscript describes results that are very similar to those already reported in the literature.

      The interaction between Hrp3 and the Prf1 subunit is new and interesting, and could lead to further research and a new manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is an excellent study that leverages the chromatin biology of Schizosaccharomyces pombe to uncover the central role of CHD1-family remodelers in maintaining nucleosome organisation and suppressing cryptic transcription. The work is carefully executed. In short, the authors show that Hrp3 is the primary CHD1-family remodeler responsible for maintaining nucleosome organisation over gene bodies. This represses antisense transcription from cryptic promoters in gene bodies. They provide evidence that Hrp3 is repressed in meiosis to allow the induction of meiotic genes. They further identified that a conserved domain, the CHCT domain of Hrp3, is essential for its interaction with Prf1 (PAF complex subunit), which is critical for the chromatin organisation in gene bodies. This manuscript is of excellent quality and is an important contribution towards understanding how transcription initiation is repressed within gene bodies. I have small comments and suggestions for clarification.

      Minor comments:

      • The study demonstrates that Hrp3 represses antisense transcription at meiotic genes, showing that Hrp3 is reduced in meiosis, which could facilitate the induction of meiotic genes. Is there a phenotype in the hrp3Δ or the hrp1Δ hrp3Δ mutant in relation to meiosis? E.g. do these strains enter meiosis uncontrolled?

      • Figure 3C - ORC4 Locus TSS presentation. The presented data do not show a well-defined TSS on the sense strand. For reference, it would be useful to show that sense TSS is not altered between the different strains.

      • The study focuses on antisense cryptic transcription, which is relatively easy to measure by RNA-seq. Often, however, cryptic transcription can also occur in the sense direction in gene bodies. Do the authors also find evidence of cryptic sense transcription in gene bodies (based on TSS-seq data)? This could be useful for completeness to report, as this could lead to aberrant protein-coding isoforms.

      • The manuscript alternates between "Prf1" (S. pombe) and "RTF1" (other eukaryotes). This is at times confusing. I recommend consistent use of gene nomenclature.

      • The authors show epistatic interaction for nucleosome spacing in Figure 7D for the prf1Δ and hrp1Δ hrp3Δ prf1Δ strains. It would be informative to have the hrp1Δ hrp3Δ data also included in Figure 7D, like in the other figure panels.

      Significance

      This is an excellent study that leverages the chromatin biology of Schizosaccharomyces pombe to uncover the central role of CHD1-family remodelers in maintaining nucleosome organisation and suppressing cryptic transcription. This manuscript is of excellent quality and is an important contribution towards understanding how transcription initiation is repressed within gene bodies.

      I am an expert on transcription regulation and noncoding transcription.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Statements

      We would like to thank the referees for their time and effort in giving feedback on our work, and their overall positive attitude towards the manuscript. Most of the referees' points were of clarifying and textual nature. We have identified three points which we think require more attention in the form of additional analyses, simulations or significant textual changes:

      Within the manuscript we state that conserved non-coding sequences (CNSs) are a proxy for cis-regulatory elements (CREs). We then use these terms interchangeably without explaining the underlying assumption, which makes the text inaccurate. To improve on this point, we ensured in the new text that we are explicit about when we mean CNS or CRE. We also added a section to the discussion (‘Limitations of CNSs as CREs’) dedicated to this topic.

      During stabilising selection (maintaining the target phenotype) DSD can occur fully neutrally, or through the evolution of either mutational or developmental robustness. We describe the evolutionary trajectories of our simulations as neutral once fitness has mostly plateaued; however, as reviewer 3 points out, small gains in median fitness still occur, indicating that development becomes more robust to noisy gene expression and tissue variation, and/or that the GRNs become more robust to mutations. To discern between fully neutral evolution, where the fitness distribution of the population does not change, and the higher-order emergence of robustness, we performed additional analysis of the existing results. Preliminary results showed that many (near-)neutral mutations affect the mutational robustness and developmental robustness, both positively and negatively. To investigate this further we will run an additional set of simulations without developmental stochasticity, which will take about a week. These simulations should allow us to examine more closely the role of stabilising selection (of developmental robustness) in DSD by removing the need to evolve developmental robustness. Additionally, we will set up simulations in which we change the total number of genes and the number of genes under selection, to investigate how this modelling choice influences DSD.

      In the section on rewiring (‘Network redundancy creates space for rewiring’) we will analyse the mechanism allowing for rewiring in more depth, especially in the light of gene duplications and redundancy. We will extend this section with an additional analysis aimed at highlighting how and when rewiring is facilitated.

      We will describe the planned and incorporated revisions in detail below; we believe these have led to a greatly improved manuscript.

      Kind regards,

      Pjotr van der Jagt, Steven Oud and Renske Vroomans

      Description of the planned revisions

      Referee cross commenting (Reviewer 4)

      Reviewer 3's concern about DSD resulting from stabilising selection for robustness is something I missed -- this is important and should be addressed.

      We understand this concern, and agree that we should be more thorough in our analysis of DSD by assessing the higher-order effects of stabilising selection on mutational robustness and/or environmental (developmental) robustness (McColgan & DiFrisco 2024).

      We will 1) extend our analysis of fitness under DSD by computing the mutational and developmental robustness (similar to Figure 2F) over time for a number of ancestral lineages. By comparing these two measures over evolutionary time we will gain a much more fine-grained picture of the evolutionary dynamics and should be able to find adaptive trends through gain of either type of robustness. Preliminary results suggest that during the plateaued fitness phase both mutational robustness and developmental robustness undergo weak gains and losses, likely due to the pleiotropic nature of our GPM. Collectively, these weak gains and losses result in the gain observed in Figure S3. So, rather than fully neutral evolution, we should discern (near-)neutral regimes in which clear adaptive steps are absent, but in which the sum of the steps is a net gain. These are interesting findings that we initially missed; they give insights into how this high-dimensional fitness landscape is traversed, and will be included in a future revised version of the manuscript.

      2) We will run extra simulations without stochasticity to investigate DSD in the absence of adaptation through developmental robustness, and include the comparison between these and our original simulations in a future revised version.

      Finally 3) we will address stabilising selection more prominently in the introduction and discussion to accommodate these additional simulations.

      Reviewer 3 suggests that the model construction may favor DSD because there are many genes (14) of which only two determine fitness. I agree that some discussion on this point is warranted, though I am not sure enough is known about "the possible difference in constraints between the model and real development" for such a discussion to be on firm biological footing. A genetic architecture commonly found in quantitative genetic studies is that a small number of genes have large effects on the phenotype/fitness, whereas a very large number of genes have effects that are individually small but collectively large (see, e.g. literature surrounding the "omnigenic model" of complex traits). Implementing such an architecture is probably beyond the scope of the study here. More generally, it would be natural to assume that the larger the number of genes, and the smaller the number of fitness-determining genes, the more likely DSD / re-wiring is to occur. That being said, I think the authors' choice of a 14-gene network is biologically defensible. It could be argued that the restriction of many modeling studies to small networks (often including just 3 genes) on the grounds of convenience artificially ensures that DSD will not occur in these networks.

      The choice of 14 genes does indeed stem from a compromise between constraining the number of available genes, but at the same time allowing for sufficient degrees of freedom and redundancy. We have added a ‘modelling choices’ section in the discussion in which we address this point. Additionally, it is important to note that, while the fitness criterion only measures the pattern of 2 genes, throughout the evolutionary lineage additional genes become highly important for the fitness of an individual, because these genes evolved to help generate the target pattern (see for example Figure 4); the other genes indeed reflect reviewer 4’s point that most genes have a small effect. Crucially, we observe that even the genes and interactions that are important for fitness undergo DSD.

      Nevertheless, we think it is interesting to investigate this point of the influence of this particular modelling choice on the potential for DSD, and have set up an extra set of simulations with fewer gene types, and one with additional fitness genes.

      Furthermore, we discuss the choice of our network architecture more in depth in a discussion section on our modelling choices: ‘Modelling assumptions and choices’.

      Reviewer 1

      The observation of DSD in the computational models remains rather high-level in the sense that no motifs, mechanisms, subgraphs, mutations or specific dynamics are reported to be associated to it ---with the exception of gene expression domains overlapping. Perhaps the authors feel it is beyond this study, but a Results section with a more in-depth "mechanistic" analysis on what enables DSD would (a) make a better case for the extensive and expensive computational models and (b) would push this paper to a next level. As a starting point, it could be nice to check Ohno's intuition that gene duplications are a creative "force" in evolution. Are they drivers of DSD? Or are TFBS mutations responsible for the majority of cases?

      We agree that some mechanistic analysis would strengthen the manuscript, and will therefore extend the section ‘Network redundancy creates space for rewiring’ to address how this redundancy is facilitated. For instance, in the rewiring examples given in Figure 4 we can highlight how this new interaction emerges, if this is through a gene mutation followed by rewiring and loss of a redundant gene, or if the gain, redundancy and loss are all on the level of TFBS mutations. Effectively we will investigate which route of the three in the following schematic is most prominent:

      Additionally, we will analyse the different effects of the transcription dynamics for each of these routes. (Note that this is not an exhaustive schematic, and combinations of routes are possible.)

      l171. You discuss an example here, would it be possible to generalize this analysis and quantify the amount of DSD amongst all cloned populations? And related question: of the many conserved interactions in Fig 4A, how many do the two clonal lineages share? None? All?

      We agree that this is a good idea. In a new supplementary figure, we will show the number of times a conserved interaction gets lost, and a new interaction is gained as a metric for DSD in every cloned population.

      The populations in Fig 4A are cloned at generation 50,000, so any interaction that started before then and is still present at a given point in time is shared. Any interactions starting after generation 50,000 are unique (or at least independently gained).
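
      As a rough illustration of the proposed metric, the sketch below counts conserved, lost and gained interactions between an ancestral GRN and a descendant GRN. It is only a minimal sketch: representing a GRN as a set of (regulator, target) pairs, the function name `interaction_turnover`, and the example gene labels are all assumptions made for illustration, not the representation used in the model.

```python
def interaction_turnover(ancestor, descendant):
    """Count interactions conserved, lost and gained relative to the ancestor.

    Both arguments are iterables of (regulator, target) pairs; this edge-set
    representation is a hypothetical simplification of a GRN.
    """
    ancestor, descendant = set(ancestor), set(descendant)
    return {
        "conserved": len(ancestor & descendant),  # present in both networks
        "lost": len(ancestor - descendant),       # ancestral edges that disappeared
        "gained": len(descendant - ancestor),     # edges absent in the ancestor
    }


# Made-up example networks (gene labels are placeholders).
ancestor = {("g1", "g12"), ("g3", "g12"), ("g5", "g13")}
descendant = {("g1", "g12"), ("g7", "g12"), ("g5", "g13"), ("g7", "g13")}

turnover = interaction_turnover(ancestor, descendant)
print(turnover)
```

      Applied to the network at the cloning point and to each clone at a later generation, counts of this kind would give one number per cloned population; how losses and gains are combined into a single DSD score is left open here.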

      - l269. What about phenotypic plasticity due to stochastic gene expression? Does it play a role in DSD in your model? I am thinking about https://pubmed.ncbi.nlm.nih.gov/24884746/ and https://pubmed.ncbi.nlm.nih.gov/21211007/

      We agree that this is an interesting point which should be included in the discussion. Following the comments of reviewer 3 we have set up extra simulations to investigate this in more detail. We will make sure to include these citations in the revised discussion when we have the results of those simulations.

      Reviewer 3

      Issue One: Interpretation of fitness gains under stabilising selection

      A central issue concerns how the manuscript defines and interprets developmental systems drift (DSD) in relation to evolution on the fitness landscape. The authors define DSD as the conservation of a trait despite changes in its underlying genetic basis, which is consistent with the literature. However, the manuscript would benefit from clarifying the relationship between DSD, genotype-to-phenotype maps, and fitness landscapes. Very simply, we can say that (i) DSD can operate along neutral paths in the fitness landscape, (ii) DSD can operate along adaptive paths in the fitness landscape. During DSD, these neutral or adaptive paths along the fitness landscape are traversed by mutations that change the gene regulatory network (GRN) and consequent gene expression patterns whilst preserving the developmental outcome, i.e., the phenotype. While this connection between DSD and fitness landscapes is referenced in the introduction, it is not fully elaborated upon. A complete elaboration is critical because, when I read the manuscript, I got the impression that the manuscript claims that DSD is prevalent along neutral paths in the fitness landscape, not just adaptive ones. If I am wrong and this is not what the authors claim, it should be explicitly stated in the results and discussed. Nevertheless, claiming DSD operates along neutral paths is a much more interesting statement than claiming it operates along adaptive paths. However, it requires sufficient evidence, which I have an issue with.

      The issue I have is about adaptations under stabilising selection. Stabilising selection occurs when there is selection to preserve the developmental outcome. Stabilising selection is essential to the results because evolutionary change in the GRN under stabilising selection should be due to DSD, not adaptations that change the developmental outcome. To ensure that the populations are under stabilising selection, the authors perform clonal experiments for 100,000 generations for 8 already evolved populations, 5 clones for each population. They remove 10 out of 40 clones because the fitness increase is too large, indicating that the developmental outcome changes over the 100,000 generations. However, the remaining 30 clonal experiments exhibit small but continual fitness increases over 100,000 generations. The authors claim that the remaining 30 are predominantly evolving due to drift, not adaptations (in the main text, line 137: "indicating predominantly neutral evolution", and section M: "too shallow for selection to outweigh drift"). The authors' evidence for this claim is a mathematical analysis showing that the fitness gains are too small to be caused by beneficial adaptations, so evolution must be dominated by drift. I found this explanation strange, given that every clone unequivocally increases in fitness throughout the 100,000 generations, which suggests populations are adapting. Upon closer inspection of the mathematical analysis (section M), I believe it will miss many kinds of adaptations possible in their model, as I now describe.

      The mathematical analysis treats fitness as a constant, but it's a random variable in the computational model. Fitness is a random variable because gene transcription and protein translation are stochastic (Wiener terms in Eqs. (1)-(5)) and cell positions change for each individual (Methods C). So, for a genotype G, the realised fitness F is picked from a distribution with mean μ_G and higher order moments (e.g., variance) that determine the shape of the distribution. I think these assumptions lead to two problems.

      The first problem with the mathematical analysis is that F is replaced by an absolute number f_q, with beneficial mutations occurring in small increments denoted "a", representing an additive fitness advantage. The authors then take a time series of the median population fitness from their simulations and treat its slope as the individual's additive fitness advantage "a". The authors claim that drift dominates evolution because this slope is lower than a drift-selection barrier, which they derive from the mathematical analysis. This analysis ignores that the advantage "a" is a distribution, not a constant, which means that it does not pick up adaptations that change the shape of the distribution. Adaptations that change the shape of the distribution can be adaptations that increase robustness to stochasticity. Since there are multiple sources of noise in this model, I think it is highly likely that robustness to noise is selected for during these 100,000 generations.
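
      The point about distribution shape can be made concrete with a toy calculation. The sketch below is purely illustrative and does not use the model or its data: the deviations, the shared median fitness of 0.8 and the scaling factor of 0.25 are invented, and the two lists simply stand in for realised fitness draws of a noise-sensitive and a noise-robust genotype.

```python
import statistics

# Invented noise deviations, symmetric around zero (illustrative only).
deviations = [-0.2, -0.1, 0.0, 0.1, 0.2]
central_fitness = 0.8  # hypothetical location of the fitness distribution

# Realised fitness draws for two hypothetical genotypes: the robust genotype
# responds to the same noise with a quarter of the amplitude.
sensitive = [central_fitness + d for d in deviations]
robust = [central_fitness + 0.25 * d for d in deviations]

# A location-based measure (here the median) sees no advantage at all...
median_gain = statistics.median(robust) - statistics.median(sensitive)

# ...while the spread of realised fitness has clearly narrowed.
spread_change = statistics.pstdev(robust) - statistics.pstdev(sensitive)

print(f"median gain: {median_gain:.3f}, change in spread: {spread_change:.3f}")
```

      In this toy case a measure based only on the median registers no additive advantage even though the distribution has changed shape, which is the kind of change the comment above argues the slope-based analysis would not pick up.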

      The second problem is that the mathematical analysis ignores traits that have higher-order effects on fitness. A trait has higher-order effects when it increases the fitness of the lineage (e.g., offspring) but not the parent. One possible trait that can evolve in this model with higher-order effects is mutational robustness, i.e., traits that lower the expected mutational load of descendants. Since many kinds of mutations occur in this model (Table 2), mutational robustness may also be evolving.

      Taken together, the analysis in Section M is set up to detect only immediate, deterministic additive gains in a single draw of fitness. It therefore cannot rule out weak but persistent adaptive evolution of robustness (to developmental noise and/or to mutations), and is thus insufficient evidence that DSD is occurring along neutral paths instead of adaptive paths. The small but monotonic fitness increases observed in all 40 clones are consistent with such adaptation (Fig. S3). The authors also acknowledge the evolution of robustness in lines 129-130 and 290-291, but the possibility of these adaptations driving DSD instead of neutral evolution is not discussed.

      To address the issue I have with adaptations during stabilising selection, the authors should, at a minimum, state clearly in their results that DSD is driven by both the evolution of robustness and drift. Moreover, a paragraph in the discussion should be dedicated to why this is the case, and why it is challenging to separate DSD through neutral evolution vs DSD through adaptations such as those that increase robustness.

      [OPTIONAL] A more thorough approach would be to make significant changes to the manuscript by giving sufficient evidence that the experimental clones are evolving by drift, or changing the model construction. One possible way to provide sufficient evidence is to improve the mathematical analysis. Another way is to show that the fitness distributions (both without and with mutations, like in Fig. 2F) do not significantly change throughout the 100,000 generations in experimental clones. It seems more likely that the model construction makes it difficult to separate the evolution of robustness from evolution by drift in the stabilising selection regime. Thus, I think the model should be constructed differently so that robustness against mutations and noise is much less likely to evolve after a "fitness plateau" is reached. This could be done by removing sources of noise from the model or reducing the kinds of possible mutations (related to issue two). In fact, I could not find justification in the manuscript for why these noise terms are included in the model, so I assume they are included for biological realism. If this is why noise is included, or if there is a separate reason why it is necessary, please write that in the model overview and/or the methods.
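
      One way such a comparison of fitness distributions could be sketched is with a two-sample Kolmogorov-Smirnov statistic, i.e. the largest gap between the empirical cumulative distributions of fitness sampled early and late in a clonal run. This is a minimal standard-library sketch under assumptions: the function name `ks_statistic` and the fitness values are made up, and the choice of this statistic is an illustration rather than something proposed in the manuscript or the review.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # The empirical CDFs are step functions, so the largest gap is reached
    # at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Made-up fitness samples from an early and a late generation.
early = [0.70, 0.72, 0.75, 0.78]
late = [0.74, 0.76, 0.79, 0.80]

gap = ks_statistic(early, late)
print(f"KS statistic: {gap}")
```

      A statistic of 0 means the two empirical distributions coincide and 1 means they do not overlap. Judging whether a given gap is significant would additionally require a p-value, for example from `scipy.stats.ks_2samp`, and realistic sample sizes.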

      We agree that we should be more precise about whether DSD operates along neutral vs adaptive paths in the fitness landscape, and have expanded our explanation of this distinction in the introduction. We also agree that it is worthwhile to distinguish between neutral evolution that does not change the fitness distribution of the population (either through changes in developmental or mutational robustness), higher-order evolutionary processes that increase developmental robustness, and drift along a neutral path in the fitness landscape towards regions of greater connectivity, resulting in mutational robustness (as described in Huynen et al., 1999). We have performed a preliminary analysis to identify changes in mutational robustness and developmental robustness over evolutionary time in the populations in which the maximum fitness has already plateaued. This analysis shows frequent weak gains and losses, in which clear adaptive steps are absent but a net gain can be seen in robustness, as consistent with higher-order fitness effects.

      To investigate the role of stabilising selection more in depth we will run simulations without developmental noise in the form of gene expression noise and tissue connectivity variation, thus removing the effect of the evolution of developmental robustness. We will compare the evolutionary dynamics of the GRNs with our original set of simulations, and include both these types of analyses in a supplementary figure of the revised manuscript.

      Furthermore, we now discuss the limitations of the mathematical analysis with regard to adaptation vs neutrality in our simulations, in the supplementary section.

      Issue two: The model construction may favour DSD

      In this manuscript, fitness is determined by the expression pattern of two types of genes (genes 12 and 13 in Table 1). There are 14 types of genes in total that can all undergo many kinds of mutations, including duplications (Table 2). Thus, gene regulatory networks (GRNs) encoded by genomes in this model tend to contain large numbers of interactions. The results show that most of these interactions have minimal effect on reaching the target pattern in high fitness individuals (e.g. Fig. 2F). A consequence of this is that only a minimal number of GRN interactions are conserved through evolution (e.g. Fig. 2D). From these model constructions and results from evolutionary simulations, we can deduce that there are very few constraints on the GRN. By having very few constraints on the GRN, I think it makes it easy for a new set of pattern-producing traits to evolve and subsequently for an old set of pattern-producing traits to be lost, i.e., DSD. Thus, I believe that the model construction may favour DSD.

      I do not have an issue with the model favouring DSD because it reflects real multicellular GRNs, where it is thought that a minority fraction of interactions are critical for fitness and the majority are not. However, it is unknown whether the constraints GRNs face in the model are more or less constrained than real GRNs. Thus, it is not known whether the prevalence of DSD in this model applies generally to real development, where GRN constraints depend on so many factors. At a minimum, the possible difference in constraints between the model and real development should be discussed as a limitation of the model. A more thorough change to the manuscript would be to test the effect of changing the constraints on the GRN. I am sure there are many ways to devise such a test, but I will give my recommendation here.

      [OPTIONAL] My recommendation is that the authors should run additional simulations with simplified mutational dynamics by constraining the model to N genes (no duplications and deletions), of which M out of these N genes contribute to fitness via the specific pattern (with M=2 in the current model). The authors should then test the effect of changing N and M independently, and how this affects the prevalence of DSD. If the prevalence of DSD is robust to changes in N and M, it supports the authors' argument that DSD is highly prevalent in developmental evolution. If DSD prevalence is highly dependent on M and/or N, then the claims made in the manuscript about the prevalence of DSD must change accordingly. I acknowledge that these simulations may be computationally expensive, and I think it would be great if the authors knew (or devised) a more efficient way to test the effect of GRN constraints on DSD prevalence. Nevertheless, these additional simulations would make for a potentially very interesting manuscript.
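
      The suggested sweep over N and M can be laid out as a small grid of simulation settings. The sketch below only enumerates the settings; the function name `constraint_grid` and the particular values of N and M are assumptions chosen for illustration, apart from the current setting of 14 genes with 2 fitness-determining genes.

```python
from itertools import product


def constraint_grid(gene_counts, fitness_gene_counts):
    """List (N, M) settings, keeping only those with M <= N."""
    return [
        (n, m)
        for n, m in product(gene_counts, fitness_gene_counts)
        if m <= n  # cannot select on more genes than the network contains
    ]


# Hypothetical values; (14, 2) corresponds to the current model.
grid = constraint_grid(gene_counts=[4, 8, 14], fitness_gene_counts=[2, 4, 8])

for n, m in grid:
    print(f"N={n} genes, M={m} fitness-determining genes")
```

      Each setting would then be run with replicate populations, so that the prevalence of DSD can be compared while N and M are varied independently.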

      We agree that these modelling choices likely influence the potential for DSD. We think that our model setup, where most transcription factors are not under direct selection for a particular pattern, more accurately reflects biological development, where the outcome of the total developmental process (a functional organism) is what is under selection, rather than each individual gene pattern. As also mentioned by the referee, in real multicellular development the majority of interactions is not crucial for fitness, similar to our model. We also observe that, as fitness increases, additional genes experience emergent selection for particular expression patterns or interaction structures in the GRN, resulting in their conservation. Nevertheless, we do agree that the effect of model construction on DSD is an unexplored avenue and this work lends itself to addressing this. We will run additional sets of simulations: one in which we reduce the size of the network (‘N’), and a second set where we double the number of fitness contributing genes (‘M’), and show the effect on the extent of DSD in a future supplementary figure.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Referee cross commenting (Reviewer 4)

      Overall I agree with the comments of Reviewer 1, 2 and 3. I note that reviewers 1, 3, and 4 each pointed out the difficulties with assuming that CNSs = CREs, so this needs to be addressed. Two reviewers (3 and 4) also point out problems with equating bulk RNAseq with a conserved phenotype.

      We agree that caution is warranted with the assumption of CNSs = CREs. We have added a section to the discussion in which we discuss this more thoroughly, see ‘Limitations of CNSs as CREs’ in the revised manuscript.

      Additionally, we made textual changes to the statement of significance, abstract and results to better reflect when we talk about CNSs or CREs.

      I agree with Reviewer 1's hesitancy about the rhetorical framing of the paper potentially generalising too far from a computational model of plant meristem patterning.

      We agree that the title should reflect the scope of the manuscript, and our short title reflects that better than ‘ubiquitous’, which implies that we investigated beyond plant (meristem) development. We have changed the title in the revised version to ‘System drift in the evolution of plant meristem development’.

      Reviewer 1

      It is system drift, not systems drift (see True and Haag 2001). No 's' after system.

      Thank you for catching this – we corrected this throughout.

      - I am afraid I have a problem with the manuscript title. I think "Ubiquitous" is misplaced, because it strongly suggests you have a long list of case studies across plants and animals, and some quantification of DSD in these two kingdoms. That would have been an interesting result, but it is not what you report. I suggest something along the lines of "System drift in the evolution of plant meristem development", similar to the short title used in the footer.

      - Alternatively, the authors may aim to say that DSD happens all over the place in computational models of development? In that case the title should reflect that the claim refers to modeling. (But what then about the data analysis part?)

      As remarked in the summary (point 2), we agree with this assessment and have changed the title to ‘System drift in the evolution of plant meristem development’.

      Multiple times in the Abstract and Introduction the authors make statements on "cis-regulatory elements" that are actually "conserved non-coding sequences" (CNS). Even if it is not uncommon for CNSs to harbor enhancers etc., I would be very hesitant to use the two as synonyms. As the authors state themselves, sequences, even non-coding, can be conserved for many reasons other than CREs. I would ask the authors to support better their use of "CREs" or adjust language. As roughly stated in their Discussion (lines 310-319), one way forward could be to show for a few CNS that are important in the analysis (of Fig 5), that they have experimentally-verified enhancers. Is that do-able or a bridge too far?

      We changed the text such that we use CNS instead of CRE when discussing the bioinformatic analysis. Additionally we added a section in the discussion to clarify the relationship between CNS and CRE.

      line 7. evo-devo is jargon

      We changed this to ‘…evolution of development (evo-devo) research…’

      l9. I would think "using a computational model and data analysis"

      Yes, corrected.

      l13. Strictly speaking you did not look at CREs, but at conserved non-coding sequences.

      Indeed, we changed this to CNS.

      l14. "widespread" is exaggerated here, since you show for a single organ in a handful of plant species. You may extrapolate and argue that you do not see why it should not be widespread, but you did not show it. Or tie in all the known cases that can be found in literature.

      We understand that ‘widespread’ seems to suggest that we have investigated a broader range of species and organs. To be more accurate we changed the wording to ‘prevalent’.

      l16. "simpler" than what?

      We added the example of RNA folding.

      l27. Again the tension between CREs and non-coding sequence.

      Changed to conserved non coding sequence.

      l28. I don't understand the use of "necessarily" here.

      This is indeed confusing and unnecessary; removed.

      l34-35. A very general biology statement is backed up by two modeling studies. I would have expected also a few based on comparative analyses (e.g., fossils, transcriptomics, etc).

      We added extra citations and a discussion of more experimental work.

      l36. I was missing the work on "phenogenetic drift" by Weiss; and Pavlicev & Wagner 2012 on compensatory mutations.

      Changed the text to:

      This phenomenon is called developmental system drift (DSD) (True and Haag, 2001; McColgan and DiFrisco, 2024), or phenogenetic drift (Weiss and Fullerton, 2000), and can occur when multiple genotypes which are separated by few mutational steps encode the same phenotype, forming a neutral (Wagner, 2008a; Crombach et al., 2016) or adaptive path (Johnson and Porter, 2007; Pavlicev and Wagner, 2012).

      l38. Kimura and Wagner never had a developmental process in mind, which is much bigger than a single nucleotide or a single gene, respectively. First paper that I am aware of that explicitly connects DSD to evolution on genotype networks is my own work (Crombach 2016), since the editor of that article (True, of True and Haag 2001) highlighted that point in our communications.

      Added citation and moved Kimura to the theoretical examples of protein folding DSD.

      l40. While Huynen and Hogeweg definitely studied the GP map in many of their works, the term goes back to Pere Alberch (1991).

      Added citation.

      l54-55. I'm missing some motivation here. If one wants to look at multicellular structures that display DSD, vulva development in C. elegans and related worms is an "old" and extremely well-studied example. Also, studies on early fly development by Yogi Jaeger and his co-workers are not multicellular, but at least multi-nuclear. Obviously these are animal-based results, so to me it would make sense to make a contrast animal-plant regarding DSD research and take it from there.

      Indeed, DSD has been found in these species and we now reference some of this work; the principle is better known in animals. Nevertheless, within the theoretical literature there is a continuing debate on the importance/extent of DSD.

      Changed text:

      ‘For other GPMs, such as those resulting from multicellular development, it has been suggested that complex phenotypes are sparsely distributed in genotype space, and have low potential for DSD because the number of neutral mutations anti-correlates with phenotypic complexity (Orr, 2000; Hagolani et al., 2021). On the other hand, theoretical and experimental studies in nematodes and fruit flies have shown that DSD is present in a phenotypically complex context (Verster et al., 2014; Crombach et al., 2016; Jaeger, 2018). It therefore remains debated how much DSD actually occurs in species undergoing multicellular development. DSD in plants has received little attention. One multicellular structure which …’

      l66-86. It is a bit of a style-choice, but this is a looong summary of what is to come. I would not have done that. Instead, in the Introduction I would have expected a bit more digging into the concept of DSD, mention some of the old animal cases, perhaps summarize where in plants it should be expected. More context, basically.

      We extended the paragraph on empirical examples of DSD by adding the animal cases and condensed our summary.

      l108. Could you quantify the conserved interactions shared between the populations? Or is each simulation so different that they are pretty much unique?

      Each simulation here is independent of the other simulations, so a per-interaction comparison would be uninformative. After cloning they do share ancestry, but that is much later in the manuscript, and here the quantification of the conserved interactions would be the inverse of the divergence as shown in, for instance, Figure 3B.

      l169. "DSD driving functional divergence" needs some context, since DSD is supposed to not affect function (of the final phenotype). Or am I misunderstanding?

      This is indeed a confusing sentence. We mean to say that DSD allows for divergence to such an extent that the underlying functional pathway is changed. So instead of a mere substitution of the underlying network, in which the topology and relative functions stay conserved, a different network structure is found. We have modified the line to read “Taken together, we found that DSD can drive functional divergence in the underlying GRN resulting in novel spatial expression dynamics of the genes not directly under selection.”

      l176. Say which interaction it is. Is it 0->8, as mentioned in the next paragraph?

      It is indeed 0->8; we have clarified this in the text.

      l197. Bulk RNAseq has the problem of averaging gene expression over the population of cells. How do you think that impacts your test for rewiring? If you would do a similar "bulk RNA" style test on your computational models, would you pick up DSD?

      The rewiring is based on the CNSs, whereas the RNAseq is used as phenotype, so it does not impact the test for rewiring.

      The averaging of bulk RNAseq does, however, mean that we cannot show conservation/divergence of the phenotype within the tissues, only between the different tissues.

      The most important implication of doing this in our model would be the definition of the ‘phenotype’ which undergoes DSD. Currently the phenotype is a gene expression pattern on a cellular level, for bulk RNA this phenotype would change to tissue-level gene expression.

      This change in what we measure as phenotype affects how we interpret our results, but it would not hinder us in picking up DSD; it simply has a different meaning than DSD on a cellular and single-tissue scale.

      We added clarification of the roles of the datasets at the start of the paragraph.

      ‘The Conservatory Project collects conserved non-coding sequences (CNSs) across plant genomes, which we used to investigate the extent of GRN rewiring in flowering plants. Schuster et al. measured gene expression in different homologous tissues of several species via bulk RNAseq, which we used to test for gene expression (phenotype) conservation, and how this relates to the GRN rewiring inferred from the CNSs.’

      l202. I do not understand the "within" of a non-coding sequence within an orthogroup. How are non-coding sequences inside an orthogroup of genes?

      We clarify this sentence by saying ‘A CNS is defined as a non-coding sequence conserved within the upstream/downstream region of genes within an orthogroup’, to more clearly separate the CNS from the orthogroup of genes. We also updated Figure 5A to reflect this better.

      l207-217. This paragraph is difficult to read and would benefit of a rephrasing. Plant-specific jargon, numbers do not add up (line 211), statements are rather implicit (9 deeply conserved CNS are the 3+6? Where do I see them in Fig 5B? And where do I see the lineage-specific losses?).

      We added extra annotations to the figure to make the plant jargon (angiosperm, eudicot, Brassicaceae) clear, and show the loss more clearly in the figure. We also clarified the text by splitting up 9 to 3 and 6.

      l223. Looking at the shared CNS between SEP1-2, can you find a TF binding site or another property that can be interpreted as regulatory importance?

      Reliably showing an active TF binding site would require experimental data, which we don’t have. We do mention in the discussion the need for datasets which could help address this gap.

      l225. My intuition says that the continuity of the phenotype may not be necessary if its loss can be compensated for somehow by another part of the organism. I.e., DSD within DSD. It is a poorly elaborated thought, I leave it here for your information. Perhaps a Discussion point?

      Although this is very interesting, we think this discussion might be outside of the scope of this work, and would benefit from a standalone discussion, especially since the capacity for such compensation might differ between animals and plants (which are more “modular” organisms). This is our interpretation:

      First, let’s take a step back from ‘genotype’ and ‘phenotype’ and redefine DSD more generally: in a system with multiple organisational levels, where a hierarchical mapping between them exists, DSD consists of changes on one organisational level which do not alter the outcome of the ‘higher’ organisational level. In other words, DSD can exist in any many-to-one mapping in which a set of many (which map to the same one) are within a certain distance in space, which we generally define as a single mutational step.

      Within this (slightly) more general definition we can extend the definition of DSD to the level of phenotype and function, in which phenotype describes the ‘many’ layer, and multiple phenotypes can fulfill the same function. When we are freed from the constraint of ‘genotype’ and ‘phenotype’, and DSD is defined at the level of this mapping, then it becomes an easy exercise to have multiple mappings (genotype→phenotype→function) and thus ‘DSD within DSD’.
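
      The many-to-one definition above can be made concrete with a toy example. The following is only an illustrative sketch, not the model used in the manuscript: the bit-string genotypes, the threshold rule in `phenotype`, and all function names are assumptions chosen for brevity. It lists the single-step mutants of a genotype that map to the same outcome, i.e. the neighbours along which drift could proceed without changing the higher level.

```python
def phenotype(genotype):
    """Toy many-to-one map (assumed): phenotype is 1 if at least two sites are 'on'."""
    return int(sum(genotype) >= 2)


def one_step_mutants(genotype):
    """All genotypes that differ from `genotype` at exactly one site."""
    mutants = []
    for i in range(len(genotype)):
        mutant = list(genotype)
        mutant[i] = 1 - mutant[i]  # flip a single site
        mutants.append(tuple(mutant))
    return mutants


def neutral_neighbours(genotype):
    """Single-step mutants that keep the same phenotype as `genotype`."""
    target = phenotype(genotype)
    return [m for m in one_step_mutants(genotype) if phenotype(m) == target]


# Hypothetical starting genotypes with the same phenotype
robust = (1, 1, 1, 0)   # well inside the neutral set
fragile = (1, 1, 0, 0)  # on the edge of the neutral set

print(len(neutral_neighbours(robust)), len(neutral_neighbours(fragile)))
```

      Applying the same neighbour search to a second map (for instance phenotype to function) would give the nested case described above.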

      l233. "rarely"? I don't see any high Pearson distances.

      True, in the given example there are no high Pearson distances; however, some of the supplementary figures do show them, so ‘rarely’ felt like the most honest description. We changed the text to refer to these supplementary figures.

      Fig 4. Re-order of panels? I was expecting B at C and vice versa.

      Agreed, we swapped the order of the panels.

      Fig 5B. Red boxes not explained. Mention that it is an UpSetplot?

      We added clarification to the figure caption.

      Fig 5D. It would be nice to quantify the minor and major diffs between orthologs and paralogs.

      We quantify the similarities (and thus differences) in Figure 5F, but we do indeed not show orthologs vs paralogs explicitly. We have extended Figure 5F to distinguish which comparisons are between orthologs vs paralogs with different tick marks, which shows their different distributions quite clearly.

      - l247. Over-generalization. In a specific organ of plants...

      Changed to vascular plant meristem.

      - l249. Where exactly is this link between diverse expression patterns and the Schuster dataset made? I suggest the authors to make it more explicit in the Results.

      We are slightly overambitious in this sentence. The Schuster dataset confirms the preservation of expression where the CNS dataset shows rewiring. That this facilitates diversification of expression patterns in traits not under selection is solely an outcome of the computational model. We have changed the text to reflect this more clearly.

      - l268. Final sentence of the paragraph left me puzzled. Why talk about opposite function?

      The goal here was to highlight regulatory rewiring which, in the most extreme case, would achieve an opposite function for a given TF within development. We agree that this was formulated vaguely so we rewrote this to be more to the point.

      These examples demonstrate that whilst the function of pathways is conserved, their regulatory wiring often is not.

      - l269. What about time scales generated by the system? Looking at Fig 2C and 2D, the elbow pattern is pretty obvious. That means interactions sort themselves into either short-lived or long-lived. Worth mentioning?

      Added a sentence to highlight this.

      - l291. Evolution in a *constant* fitness landscape increases robustness.

      Changed

      - l296. My thoughts, for your info: I suspect morphogenesis as single parameters instead of as mechanisms makes for a brittle landscape, resulting in isolated parts of the same phenotype.

      We agree, and now include citations to different models in which morphogenesis evolves which seem to display a more connected landscape.

      Reviewer 2

      Every computational model necessarily makes some simplifying assumptions. It would be nice if the authors could summarise in a paragraph in the Discussion the main assumptions made by their model, and which of those are most worth revisiting in future studies. In the current draft, some assumptions are described in different places in the manuscript, which makes it hard for a non-expert to evaluate the limitations of this model.

      We added a section to the discussion: ‘Modelling assumptions and choices’

      I did not find any mention of potential energetic constraints or limitations in this model. For example, I would expect high levels of gene expression to incur significant energy costs, resulting in evolutionary trade-offs. Could the authors comment on how taking energy limitations into account might influence their results?

      This would put additional constraints on the evolution/fitness landscape. Some paths/regions of the fitness landscape which are currently accessible will not be traversable anymore. On the other hand, an energy constraint might reduce certain high fitness areas to a more even plane and thus make it more traversable. During analysis of our data there were no signs of extremely high gene expression levels.

      Figure 3C lists Gene IDs 1, 2, 8, and 11, but the caption refers to genes 1, 2, 4, and 11.

      Thank you for catching this.

      Reviewer 3

      The authors present an analysis correlating conserved non-coding sequence (CNS) composition with gene expression to investigate developmental systems drift. One flaw of this analysis is that it uses deeply conserved sequences as a proxy for the entire cis-regulatory landscape. The authors acknowledge this flaw in the discussion.

      Another potential flaw is equating the bulk RNA-seq data with a conserved phenotype. In lines 226-227 of the manuscript, it is written that "In line with our computational model, we compared gene expression patterns to measure changes in phenotype." I am not sure if there is an equivalence between the two. In the computational model, the developmental outcome determining fitness is a spatial pattern, i.e., an emergent product of gene expression and cell interactions. In contrast, the RNA-seq data shows bulk measurements in gene expression for different organs. It is conceivable that, despite having very similar bulk measurements, the developmental outcome in response to gene expression (such as a spatial pattern or morphological shape) changes across species. I think this difference should be explicitly addressed in the discussion. The authors may have intended to discuss this in lines 320-326, although it is unclear to me.

      It is correct that the CNS data and RNA-seq data have certain limitations, and the brief discussion of some of these limitations in lines 320-326 is not sufficient. We have been more explicit on this point in the discussion.

      The gene expression data used in this study represents bulk expression at the organ level, such as the vegetative meristem (Schuster et al., 2024). This limits our analysis of the phenotypic effects of rewiring to comparisons between organs, which is different to our computational simulations where we look at within organ gene expression. Additionally, the bulk RNA-seq does not allow us to discern whether the developmental outcome of similar gene expression is the same in all these species. More fine-grained approaches, such as single-cell RNA sequencing or spatial transcriptomics, will provide a more detailed understanding of how gene expression is modulated spatially and temporally within complex tissues of different organisms, allowing for a closer alignment between computational predictions and experimental observations.

      Can the authors justify using these six species in the discussion or the results? Are there any limitations with choosing four closely related and two distantly related species for this analysis, in contrast to, say, six distantly related species? If so, please elaborate in the discussion.

      The use of these six species is mainly limited by the datasets we have available. Nevertheless, the combination of four closely related species, and two more distantly related species gives a better insight into the short vs long term divergence dynamics than six distantly related species would. We have noted this when introducing the datasets:

      This set of species contains both closely (A. thaliana, A. lyrata, C. rubella, E. salsugineum) and more distantly related species (M. truncatula, B. distachyon), which should give insight into short- and long-term divergence.

      In Figure S7, some profiles show no conservation across the six species. Can we be sure that a stabilising selection pressure conserves any CNSs? Is it possible that the deeply conserved CNSs mentioned in the main text are conserved by chance, given the large number of total CNSs? A brief comment on these points in the results or discussion would be helpful.

      In our simulations, we find that even CREs that were under selection for a long time can disappear; however, in our neutral simulations, CREs were not conserved, suggesting that deep conservation is the result of selection. When it comes to CNSs, the assumption is that they often contain CREs that are under selection. We have added a more elaborate section on CNSs in the discussion. See ‘Limitations of CNSs as CREs’.

      Line 7-8: I thought this was a bit difficult to read. The connection between (i) evolvability of complex phenotypes, (ii) neutral/beneficial change hindered by deleterious mutations, and (iii) DSD might not be so simple for many readers, so I think it should be rewritten. The abstract was well written, though.

      We made the connection to DSD and evolvability clearer and removed the specific mutational outcomes:

      *A key open question in evolution of development (evo-devo) is the evolvability of complex phenotypes. Developmental system drift (DSD) may contribute to evolvability by exploring different genotypes with similar phenotypic outcome, but with mutational neighbourhoods that have different, potentially adaptive, phenotypes. We investigated the potential for DSD in plant development using a computational model and data analysis.*

      Line 274 vs 276: Is there a difference between regulatory dynamics and regulatory mechanisms?

      No, we should use the same terminology. We have changed this to be clearer.

      Figure S4: Do you expect the green/blue lines to approach the orange line in the long term? In some clonal experiments, it seems like it will. In others, it seems like it has plateaued. Under continual DSD, I assume they should converge. It would be interesting to see simulations run sufficiently long to see if this occurs.

      In principle, yes; however, this might take a considerable amount of time given that some conserved interactions take >75000 generations to be rewired.

      Line 27: Evolutionarily instead of evolutionary?

      Changed

      Line 67-68: References in brackets?

      Changed

      Line 144: Capitalise "fig"

      Changed

      Fig. 3C caption: correct "1, 2, 4, 11" (should be 8)

      Changed

      Line 192: Reference repeated

      Changed

      Fig. 5 caption: Capitalise "Supplementary figure"

      Changed

      Line 277: Correct "A previous model Johnson.."

      Changed

      Line 290: Brackets around reference

      Changed

      Line 299: Correct "will be therefore be"

      Changed

      Line 394: Capitalise "table"

      Changed

      Line 449: Correct "was build using"

      Changed

      Fig. 5B: explain the red dashed boxes in the caption

      Added explanation to the caption

      Some of the Figure panels might benefit from further elaboration in their respective captions, such as 3C and 5F.

      Improved the figure captions.

      Reviewer 4

      Statement of significance. The logical connection between the first two sentences is not clear. What does developmental system drift have to do with neutral/beneficial mutations?

      This is indeed an unclear jump. Changed such that the connection between evolvability of complex phenotypes and DSD is more clear:

      *A key open question in evolution of development (evo-devo) is the evolvability of complex phenotypes. Developmental system drift (DSD) contributes to evolvability by exploring different genotypes with similar phenotypic outcome, but with mutational neighbourhoods that have different, potentially adaptive, phenotypes. We investigated the potential for DSD in plant development using a computational model and data analysis.*

      l 41 - "DSD is found to ... explain the developmental hourglass." Caution is warranted here. Wotton et al 2015 claim that "quantitative system drift" explains the hourglass pattern, but it would be more accurate to say that shifting expression domains and strengths allows compensatory regulatory change to occur with the same set of genes (gap genes). It is far from clear how DSD could explain the developmental hourglass pattern. What does DSD imply about the causes of differential conservation of different developmental stages? It's not clear there is any connection here.

      We should indeed be more cautious here. DSD is indeed not in itself an explanation of the hourglass model, but only a mechanism by which the developmental divergence observed in the hourglass model could have emerged. As per Pavlicev and Wagner, 2012, compensatory changes resulting from other shifts would fall under DSD, and can explain how the patterning outcome of the gap gene network is conserved. However, this does not explain why some stages are under stronger selection than others. We changed the text to reflect this.

      ‘...be a possible evolutionary mechanism involved in the developmental hourglass model (Wotton et al., 2015; Crombach et al., 2016)...’

      ll 51-53 - "Others have found that increased complexity introduces more degrees of freedom, allowing for a greater number of genotypes to produce the same phenotype and potentially allowing for more DSD (Schiffman and Ralph, 2022; Greenbury et al., 2022)." Does this refer to increased genomic complexity or increased phenotypic complexity? It is not clear that increased phenotypic complexity allows a greater number of genotypes to produce the same phenotype. Please explain further.

      The paragraph discusses complexity in the GPM as a whole, where the first few examples in the paragraph regard phenotypic complexity, and the ones in l51-53 refer to genomic complexity. This is currently not clear so we clarified the text.

      ‘For other GPMs, such as those resulting from multicellular development, it has been suggested that complex phenotypes are sparsely distributed in genotype space, and have low potential for DSD because the number of neutral mutations anti-correlates with phenotypic complexity (Orr, 2000; Hagolani et al., 2021). Others have found that increased genomic complexity introduces more degrees of freedom, allowing for a greater number of genotypes to produce the same phenotype and potentially allowing for more DSD (Schiffman and Ralph, 2022; Greenbury et al., 2022).’

      It was not clear why some gene products in the model have the ability to form dimers. What does this contribute to the simulation results? This feature is introduced early on, but is not revisited. Is it necessary?

      *Fitness. The way in which fitness is determined in the model was not completely clear to me.*

      Dimers are not necessary, but as they have been found to play a role in actual SAM development we added them to increase the realism of the developmental simulations. In some simulations the patterning mechanism involves the dimer, in others it does not, suggesting that dimerization is not essential for DSD.

      We have made changes to the methods to clarify fitness.

      Lines 103-104 say: "Each individual is assigned a fitness score based on the protein concentration of two target genes in specific regions of the SAM: one in the central zone (CZ), and one in the organizing center (OC)." How are these regions positionally defined in the simulation?

      We have defined bounding boxes to define cells as either CZ, OC or both. We have added these bounds in the figure description and more clearly in the revised methods.

      In Methods section F, one reads (l. 385): "Fitness depends on the correct protein concentration of the two fitness genes in each cell, pcz and poc respectively." This sounds like fitness is determined by the state of all cells rather than the state of the two specific regions of the SAM. Please clarify.

      A fitness penalty is given for incorrect expression so it is true that the fitness is determined by the state of all cells. We agree that it is phrased unclearly and have clarified this in the text.

      The authors use conserved non-coding sequences as a proxy for cis-regulatory elements. More specification of how CNSs were assigned to an orthogroup seems necessary in this section. Is assignment based on proximity to the coding region? Of course the authors will appreciate that regulatory elements can be located far from the gene they regulate. This data showed extensive gains and losses of CNS. It might be interesting to consider how much of this is down to transposons, in which case rapid rearrangement is not unexpected. A potential problem with the claim that the data supports the simulation results follows from the fact that DSD is genetic divergence despite trait conservation, but conserved traits appear to have only been defined or identified in the case of the SEP genes. It can't be ruled out that divergence in CNSs and in gene expression captured by the datasets is driven by straightforward phenotypic adaptation, thus not by DSD. Further caution on this point is needed.

      CNSs are indeed assigned based on proximity up to 50 kb; the full methods are described in detail in Hendelman et al. (2021). CREs can be located further than 50 kb away, but evidence suggests that this is rare for species with smaller genomes.

      In the cases where both gene expression and the CNSs diverged it can indeed not be ruled out that there has been phenotypic adaptation. We clarified in the text that the lower Pearson distances are informative for DSD as they highlight conserved phenotypes.
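
      To illustrate the comparison described above, the following minimal sketch computes a Pearson distance between two expression profiles. It assumes the distance is defined as 1 − r, with r the Pearson correlation; this is a common convention, but the exact definition used in the manuscript is not stated here. The function name and the expression values are hypothetical.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of profiles x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / math.sqrt(var_x * var_y)


# Hypothetical expression of three related genes across four tissues
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # same pattern as gene_a, different scale
gene_c = [4.0, 3.0, 2.0, 1.0]   # opposite pattern

print(pearson_distance(gene_a, gene_b))  # close to 0: conserved pattern
print(pearson_distance(gene_a, gene_c))  # close to 2: inverted pattern
```

      Under this convention a low distance indicates a conserved expression pattern regardless of absolute expression level, which is why low distances paired with divergent CNS content are the informative cases for DSD.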

      l. 290-291 - "However, evolution has been shown to increase mutational robustness over time, resulting in the possibility for more neutral change." It is doubtful that there is any such unrestricted trend. If mutational robustness only tended to increase, new mutations would not affect the phenotype, and phenotypes would be unable to adapt to novel environments. Consider rethinking this statement.

      We have reformulated this statement, since it is indeed not expected that this trend is indefinite. Infinite robustness would indeed result in the absence of evolvability; however, it has been shown for other genotype-phenotype maps that mutational robustness, where a proportion of mutations is neutral, aids the evolution of novel traits. The evolution of mutational robustness also depends on population size and mutation rate. This trend will, most probably, also be stronger in modelling work where the fitness function is fixed, compared to a real life scenario where ‘fitness’ is much less defined and subject to continuous change. We added ‘constant’ to the fitness landscape to highlight this disparity.

      ll. 316-317 "experimental work investigating the developmental role of CREs has shown extensive epistasis - where the effect of a mutation depends on the genetic background - supporting DSD." How does extensive epistasis support DSD? One can just as easily imagine scenarios where high interdependence between genes would prevent DSD from occurring. Please explain further.

      We should be more clear. Experimental work has shown that the effect of mutating a particular CRE strongly depends on the genetic background, also known as epistasis. Counterintuitively, this indirectly supports the presence of DSD, since it means that different species or strains have slightly different developmental mechanisms, resulting in these different mutational effects. We have shown how epistatic effects shift over evolutionary time.

      Overall I found the explanation of the Methods, especially the formal aspects, to be unclear at times and would recommend that the authors go back over the text to improve its clarity.

      We rewrote parts of the methods and some of the equations to be more clear and cohesive throughout the text.

      C. Tissue Generation. Following on the comment on fitness above, it would be advisable to provide further details on how cell positions are defined. How much do the cells move over the course of the simulation? What is the advantage of modelling the cells as "springs" rather than as a simple grid?

      The tissue generation is purely a process to generate a database of tissue templates: the random positions, springs and Voronoi method serve the purpose of having similar but different tissues to prevent unrealistic overfitting of our GRNs on a single topology. For each individual’s development, however, only one unchanging template is used. We clarified this in the methods.

      E. Development of genotype into phenotype. The diffusion term in the SDE equations is hard to understand as no variable for spatial position (x) is included in the equation. It seems this equation should rather be an SPDE with a position variable and a specified boundary condition (i.e. the parabola shape). In eq. 5 it should be noted that the Wi are independent. Also please justify the choice of how much noise/variance is being stipulated here.

      We have rewritten parts of this section for clarity and added citations.

      F. Fitness function. I must say I found formula 7 to be unclear. It looks like fi is the fitness of cell(s) but, from Section G, fitness is a property of the individual. It seems formula 7 should define fi as a sum over the cell types or should capture the fitness contribution of the cell types.

      Correct. We have rewritten this equation. We’ll define fi as the fitness contribution of a cell, F as the sum of fi, so the fitness of an individual, and use F in function 8.

      What is the basis for the middle terms (fractions) in the equation? After plugging in the values for pcz and poc, this yields a number, but how does that number assign a cell to one of the types? If a reviewer closely scrutinizing this section cannot make sense of it, neither will readers. Please explain further.

      The cell type is assigned based on the spatial location of the cell, and the correct fitness function for each of these cell types is described in this equation. We have clarified the text and functions.
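
      The structure described in this reply (cell type from spatial location, a per-cell contribution fi, and an individual fitness F as the sum of fi) can be sketched as follows. This is a hedged illustration only: the bounding-box coordinates, the target concentration, and the absolute-deviation penalty are assumptions made for the example and are not the manuscript's actual equation.

```python
# Hypothetical bounding boxes as (x_min, x_max, y_min, y_max); they overlap on
# purpose, because a cell may be CZ, OC, or both.
CZ_BOX = (-2.0, 2.0, 6.0, 10.0)
OC_BOX = (-1.5, 1.5, 3.0, 7.0)


def in_box(x, y, box):
    """True if the cell position (x, y) lies inside the bounding box."""
    x_min, x_max, y_min, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def cell_fitness(x, y, p_cz, p_oc, target=1.0):
    """Assumed f_i: negative absolute deviation from the desired expression.

    Each fitness gene should be at `target` inside its own region and at 0
    elsewhere, so every cell contributes, not only cells inside CZ or OC.
    """
    want_cz = target if in_box(x, y, CZ_BOX) else 0.0
    want_oc = target if in_box(x, y, OC_BOX) else 0.0
    return -(abs(p_cz - want_cz) + abs(p_oc - want_oc))


def individual_fitness(cells):
    """F: sum of the per-cell contributions f_i over all cells."""
    return sum(cell_fitness(*cell) for cell in cells)


# Hypothetical cells as (x, y, p_cz, p_oc)
cells = [
    (0.0, 8.0, 1.0, 0.0),  # CZ only, correct expression
    (0.0, 6.5, 1.0, 1.0),  # CZ and OC, correct expression
    (5.0, 0.0, 0.5, 0.0),  # outside both, ectopic CZ-gene expression
]

print(individual_fitness(cells))
```

      In this sketch a perfect pattern scores 0 and every incorrect cell lowers F, which reflects the point that fitness depends on the state of all cells and not just those in the two target regions.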

      A minor note: it would be best practice not to re-use variables to refer to different things within the same paper. For example p refers to protein concentration but also probability of mutation.

      Corrected

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      In "Ubiquitous system drift in the evolution of development," van der Jagt et al. report a large-scale simulation study of the evolution of gene networks controlling a developmental patterning process. The 14-gene simulation shows interesting results: continual rewiring of the network and establishment of essential genes which themselves are replaced on long time scales. The authors suggest that this result is validated by plant genome and expression data from some public datasets. Overall, this study lends support to the idea that developmental system drift may be more pervasive in the evolution of complex gene networks than is currently appreciated.

      I have a number of comments, mostly of a clarificatory nature, that the authors can consider in revision.

      1. Intro

      Statement of significance. The logical connection between the first two sentences is not clear. What does developmental system drift have to do with neutral/beneficial mutations?

      l 41 - "DSD is found to ... explain the developmental hourglass." Caution is warranted here. Wotton et al 2015 claim that "quantitative system drift" explains the hourglass pattern, but it would be more accurate to say that shifting expression domains and strengths allows compensatory regulatory change to occur with the same set of genes (gap genes). It is far from clear how DSD could explain the developmental hourglass pattern. What does DSD imply about the causes of differential conservation of different developmental stages? It's not clear there is any connection here.

      ll 51-53 - "Others have found that increased complexity introduces more degrees of freedom, allowing for a greater number of genotypes to produce the same phenotype and potentially allowing for more DSD (Schiffman and Ralph, 2022; Greenbury et al., 2022)." Does this refer to increased genomic complexity or increased phenotypic complexity? It is not clear that increased phenotypic complexity allows a greater number of genotypes to produce the same phenotype. Please explain further.

      2. Model

      It was not clear why some gene products in the model have the ability to form dimers. What does this contribute to the simulation results? This feature is introduced early on, but is not revisited. Is it necessary?

      Fitness. The way in which fitness is determined in the model was not completely clear to me. Lines 103-104 say: "Each individual is assigned a fitness score based on the protein concentration of two target genes in specific regions of the SAM: one in the central zone (CZ), and one in the organizing center (OC)." How are these regions positionally defined in the simulation? In Methods section F, one reads (l. 385): "Fitness depends on the correct protein concentration of the two fitness genes in each cell, pcz and poc respectively." This sounds like fitness is determined by the state of all cells rather than the state of the two specific regions of the SAM. Please clarify.

      3. Data

      The authors use conserved non-coding sequences as a proxy for cis-regulatory elements. More specification of how CNSs were assigned to an orthogroup seems necessary in this section. Is assignment based on proximity to the coding region? Of course the authors will appreciate that regulatory elements can be located far from the gene they regulate. This data showed extensive gains and losses of CNS. It might be interesting to consider how much of this is down to transposons, in which case rapid rearrangement is not unexpected. A potential problem with the claim that the data supports the simulation results follows from the fact that DSD is genetic divergence despite trait conservation, but conserved traits appear to have only been defined or identified in the case of the SEP genes. It can't be ruled out that divergence in CNSs and in gene expression captured by the datasets is driven by straightforward phenotypic adaptation, thus not by DSD. Further caution on this point is needed.

      4. Discussion

      ll. 290-291 - "However, evolution has been shown to increase mutational robustness over time, resulting in the possibility for more neutral change." It is doubtful that there is any such unrestricted trend. If mutational robustness only tended to increase, new mutations would not affect the phenotype, and phenotypes would be unable to adapt to novel environments. Consider rethinking this statement.

      ll. 316-317 "experimental work investigating the developmental role of CREs has shown extensive epistasis - where the effect of a mutation depends on the genetic background - supporting DSD." How does extensive epistasis support DSD? One can just as easily imagine scenarios where high interdependence between genes would prevent DSD from occurring. Please explain further.

      5. Methods

      Overall I found the explication of the Methods, especially the formal aspects, to be unclear at times and would recommend that the authors go back over the text to improve its clarity.

      C. Tissue Generation. Following on the comment on fitness above, it would be advisable to provide further details on how cell positions are defined. How much do the cells move over the course of the simulation? What is the advantage of modelling the cells as "springs" rather than as a simple grid?

      E. Development of genotype into phenotype. The diffusion term in the SDE equations is hard to understand as no variable for spatial position (x) is included in the equation. It seems this equation should rather be an SPDE with a position variable and a specified boundary condition (i.e. the parabola shape). In eq. 5 it should be noted that the Wi are independent. Also please justify the choice of how much noise/variance is being stipulated here.
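
One way to read a diffusion term that carries no explicit position variable is as exchange between neighbouring cells, with position entering only through the neighbour list. The sketch below is a toy under that assumption, not the authors' equations: a one-dimensional chain of cells, Euler-Maruyama integration, one independent Wiener increment per cell, and zero-flux ends because edge cells have a single neighbour. All names and parameter values are hypothetical.

```python
import math
import random


def simulate(initial, production, decay=1.0, diffusion=0.5, sigma=0.05,
             dt=0.01, steps=1000, seed=0):
    """Euler-Maruyama for dc_i = (prod_i - decay*c_i + D*sum_j(c_j - c_i)) dt + sigma dW_i."""
    rng = random.Random(seed)
    c = list(initial)
    n = len(c)
    for _ in range(steps):
        nxt = []
        for i in range(n):
            # Space enters only through who neighbours whom.
            neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
            exchange = diffusion * sum(c[j] - c[i] for j in neighbours)
            drift = production[i] - decay * c[i] + exchange
            # Independent Wiener increment for each cell, variance dt.
            dw = rng.gauss(0.0, math.sqrt(dt))
            # Clip at zero so concentrations stay non-negative (a crude choice).
            nxt.append(max(0.0, c[i] + drift * dt + sigma * dw))
        c = nxt
    return c


# A source in cell 0 produces a gradient along the chain.
profile = simulate([0.0] * 5, [1.0, 0.0, 0.0, 0.0, 0.0])
print([round(v, 3) for v in profile])
```

Written this way, the boundary condition is implicit in the neighbour list, which is the kind of detail the Methods would need to state for the continuous formulation.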

      F. Fitness function. I must say I found formula 7 to be unclear. It looks like fi is the fitness of cell(s) but, from Section G, fitness is a property of the individual. It seems formula 7 should define fi as a sum over the cell types or should capture the fitness contribution of the cell types.

      What is the basis for the middle terms (fractions) in the equation? After plugging in the values for pcz and poc, this yields a number, but how does that number assign a cell to one of the types? If a reviewer closely scrutinizing this section cannot make sense of it, neither will readers. Please explain further.

      A minor note: it would be best practice not to re-use variables to refer to different things within the same paper. For example p refers to protein concentration but also probability of mutation.

      Referee cross-commenting

      Overall I agree with the comments of Reviewers 1, 2 and 3. I note that reviewers 1, 3, and 4 each pointed out the difficulties with assuming that CNSs = CREs, so this needs to be addressed. Two reviewers (3 and 4) also point out problems with equating bulk RNAseq with a conserved phenotype.

      I agree with Reviewer 1's hesitancy about the rhetorical framing of the paper potentially generalising too far from a computational model of plant meristem patterning.

      Reviewer 3's concern about DSD resulting from stabilising selection for robustness is something I missed -- this is important and should be addressed.

      Reviewer 3 suggests that the model construction may favor DSD because there are many genes (14) of which only two determine fitness. I agree that some discussion on this point is warranted, though I am not sure enough is known about "the possible difference in constraints between the model and real development" for such a discussion to be on firm biological footing. A genetic architecture commonly found in quantitative genetic studies is that a small number of genes have large effects on the phenotype/fitness, whereas a very large number of genes have effects that are individually small but collectively large (see, e.g. literature surrounding the "omnigenic model" of complex traits). Implementing such an architecture is probably beyond the scope of the study here. More generally, it would be natural to assume that the larger the number of genes, and the smaller the number of fitness-determining genes, the more likely DSD / re-wiring is to occur. That being said, I think the authors' choice of a 14-gene network is biologically defensible. It could be argued that the restriction of many modeling studies to small networks (often including just 3 genes) on the grounds of convenience artificially ensures that DSD will not occur in these networks.

      I agree with the other reviewers on the overall positive assessment of the significance of the manuscript. There are many points to address and revise, but the core setup and result of this study is sound and should be published.

      Significance

      In "Ubiquitous system drift in the evolution of development," van der Jagt et al. report a large-scale simulation study of the evolution of gene networks controlling a developmental patterning process. The 14-gene simulation shows interesting results: continual rewiring of the network and establishment of essential genes which themselves are replaced on long time scales. The authors suggest that this result is validated by plant genome and expression data from some public datasets. Overall, this study lends support to the idea that developmental system drift may be more pervasive in the evolution of complex gene networks than is currently appreciated.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript uses an Evo-Devo model of the plant apical meristem to explore the potential for developmental systems drift (DSD). DSD occurs when the genetic underpinnings of development change through evolution while reaching the same developmental outcome. The mechanisms underlying DSD are theoretically intriguing and highly relevant for our understanding of how multicellular species evolve. The manuscript shows that DSD occurs extensively and continuously in their evolutionary simulations whilst populations evolve under stabilising selection. The authors examine regulatory rewiring across plant angiosperms to link their theoretical model with real data. The authors claim that, despite the conservation of genetic wiring in angiosperm species over shorter evolutionary timescales, this genetic wiring changes over long evolutionary timescales due to DSD, which is consistent with their theoretical model.

      Major comments:

      I enjoyed reading the authors' approach to understanding DSD and the link to empirical data. I think it is a very important line of investigation that deserves more theoretical and experimental attention. All the data and methods are clearly presented, and the software for the research is publicly available. Sufficient information is given to reproduce all results. However, I have two major issues relating to the theoretical part of the research.

      Issue One: Interpretation of fitness gains under stabilising selection

      A central issue concerns how the manuscript defines and interprets developmental systems drift (DSD) in relation to evolution on the fitness landscape. The authors define DSD as the conservation of a trait despite changes in its underlying genetic basis, which is consistent with the literature. However, the manuscript would benefit from clarifying the relationship between DSD, genotype-to-phenotype maps, and fitness landscapes. Very simply, we can say that (i) DSD can operate along neutral paths in the fitness landscape, (ii) DSD can operate along adaptive paths in the fitness landscape. During DSD, these neutral or adaptive paths along the fitness landscape are traversed by mutations that change the gene regulatory network (GRN) and consequent gene expression patterns whilst preserving the developmental outcome, i.e., the phenotype. While this connection between DSD and fitness landscapes is referenced in the introduction, it is not fully elaborated upon. A complete elaboration is critical because, when I read the manuscript, I got the impression that the manuscript claims that DSD is prevalent along neutral paths in the fitness landscape, not just adaptive ones. If I am wrong and this is not what the authors claim, it should be explicitly stated in the results and discussed. Nevertheless, claiming DSD operates along neutral paths is a much more interesting statement than claiming it operates along adaptive paths. However, it requires sufficient evidence, which I have an issue with. The issue I have is about adaptations under stabilising selection. Stabilising selection occurs when there is selection to preserve the developmental outcome. Stabilising selection is essential to the results because evolutionary change in the GRN under stabilising selection should be due to DSD, not adaptations that change the developmental outcome. 

      To ensure that the populations are under stabilising selection, the authors perform clonal experiments for 100,000 generations for 8 already evolved populations, 5 clones for each population. They remove 10 out of 40 clones because the fitness increase is too large, indicating that the developmental outcome changes over the 100,000 generations. However, the remaining 30 clonal experiments exhibit small but continual fitness increases over 100,000 generations. The authors claim that the remaining 30 are predominantly evolving due to drift, not adaptations (in the main text, line 137: "indicating predominantly neutral evolution", and section M: "too shallow for selection to outweigh drift"). The authors' evidence for this claim is a mathematical analysis showing that the fitness gains are too small to be caused by beneficial adaptations, so evolution must be dominated by drift. I found this explanation strange, given that every clone unequivocally increases in fitness throughout the 100,000 generations, which suggests populations are adapting. Upon closer inspection of the mathematical analysis (section M), I believe it will miss many kinds of adaptations possible in their model, as I now describe. The mathematical analysis treats fitness as a constant, but it's a random variable in the computational model. Fitness is a random variable because gene transcription and protein translation are stochastic (Wiener terms in Eqs. (1)-(5)) and cell positions change for each individual (Methods C). So, for a genotype G, the realised fitness F is picked from a distribution with mean μ_G and higher order moments (e.g., variance) that determine the shape of the distribution. I think these assumptions lead to two problems. The first problem with the mathematical analysis is that F is replaced by an absolute number f_q, with beneficial mutations occurring in small increments denoted "a", representing an additive fitness advantage. 
The authors then take a time series of the median population fitness from their simulations and treat its slope as the individual's additive fitness advantage "a". The authors claim that drift dominates evolution because this slope is lower than a drift-selection barrier, which they derive from the mathematical analysis. This analysis ignores that the advantage "a" is a distribution, not a constant, which means that it does not pick up adaptations that change the shape of the distribution. Adaptations that change the shape of the distribution can be adaptations that increase robustness to stochasticity. Since there are multiple sources of noise in this model, I think it is highly likely that robustness to noise is selected for during these 100,000 generations. The second problem is that the mathematical analysis ignores traits that have higher-order effects on fitness. A trait has higher-order effects when it increases the fitness of the lineage (e.g., offspring) but not the parent. One possible trait that can evolve in this model with higher-order effects is mutational robustness, i.e., traits that lower the expected mutational load of descendants. Since many kinds of mutations occur in this model (Table 2), mutational robustness may also be evolving. Taken together, the analysis in Section M is set up to detect only immediate, deterministic additive gains in a single draw of fitness. It therefore cannot rule out weak but persistent adaptive evolution of robustness (to developmental noise and/or to mutations), and is thus insufficient evidence that DSD is occurring along neutral paths instead of adaptive paths. The small but monotonic fitness increases observed in all 40 clones are consistent with such adaptation (Fig. S3). The authors also acknowledge the evolution of robustness in lines 129-130 and 290-291, but the possibility of these adaptations driving DSD instead of neutral evolution is not discussed. 
To address the issue I have with adaptations during stabilising selection, the authors should, at a minimum, state clearly in their results that DSD is driven by both the evolution of robustness and drift. Moreover, a paragraph in the discussion should be dedicated to why this is the case, and why it is challenging to separate DSD through neutral evolution vs DSD through adaptations such as those that increase robustness. [OPTIONAL] A more thorough approach would be to make significant changes to the manuscript by giving sufficient evidence that the experimental clones are evolving by drift, or changing the model construction. One possible way to provide sufficient evidence is to improve the mathematical analysis. Another way is to show that the fitness distributions (both without and with mutations, like in Fig. 2F) do not significantly change throughout the 100,000 generations in experimental clones. It seems more likely that the model construction makes it difficult to separate the evolution of robustness from evolution by drift in the stabilising selection regime. Thus, I think the model should be constructed differently so that robustness against mutations and noise is much less likely to evolve after a "fitness plateau" is reached. This could be done by removing sources of noise from the model or reducing the kinds of possible mutations (related to issue two). In fact, I could not find justification in the manuscript for why these noise terms are included in the model, so I assume they are included for biological realism. If this is why noise is included, or if there is a separate reason why it is necessary, please write that in the model overview and/or the methods.
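
The concern that a single additive advantage "a" cannot capture changes in the shape of the fitness distribution can be made concrete with a toy example. The sketch below is not the authors' model or their Section M analysis: it assumes one expression level with Gaussian noise and a Gaussian-shaped fitness peak, and all names and parameter values are hypothetical. Two genotypes have the same mean expression, so their noise-free fitness is identical, yet the one with less expression noise has higher realised fitness on average.

```python
import math
import random
import statistics


def realised_fitness(mean_expr, noise_sd, rng, target=1.0, width=0.2):
    """One noisy draw of fitness for a genotype (hypothetical fitness peak)."""
    expr = rng.gauss(mean_expr, noise_sd)  # developmental noise
    return math.exp(-0.5 * ((expr - target) / width) ** 2)


def mean_fitness(mean_expr, noise_sd, draws=20000, seed=1):
    """Monte Carlo estimate of the expected fitness of a genotype."""
    rng = random.Random(seed)
    samples = [realised_fitness(mean_expr, noise_sd, rng) for _ in range(draws)]
    return statistics.fmean(samples)


# Same mean expression (on target), different noise levels.
noisy = mean_fitness(mean_expr=1.0, noise_sd=0.2)
robust = mean_fitness(mean_expr=1.0, noise_sd=0.1)
print(f"noisy genotype:  {noisy:.3f}")
print(f"robust genotype: {robust:.3f}")
```

For this particular toy the expectation has the closed form 1 / sqrt(1 + (noise_sd / width)^2), which gives approximately 0.71 and 0.89 for the two genotypes. An analysis that compares only noise-free fitness values would score them as equal.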

      Issue two: The model construction may favour DSD

      In this manuscript, fitness is determined by the expression pattern of two types of genes (genes 12 and 13 in Table 1). There are 14 types of genes in total that can all undergo many kinds of mutations, including duplications (Table 2). Thus, gene regulatory networks (GRNs) encoded by genomes in this model tend to contain large numbers of interactions. The results show that most of these interactions have minimal effect on reaching the target pattern in high fitness individuals (e.g. Fig. 2F). A consequence of this is that only a minimal number of GRN interactions are conserved through evolution (e.g. Fig. 2D). From these model constructions and results from evolutionary simulations, we can deduce that there are very few constraints on the GRN. By having very few constraints on the GRN, I think it makes it easy for a new set of pattern-producing traits to evolve and subsequently for an old set of pattern-producing traits to be lost, i.e., DSD. Thus, I believe that the model construction may favour DSD. I do not have an issue with the model favouring DSD because it reflects real multicellular GRNs, where it is thought that a minority of interactions are critical for fitness and the majority are not. However, it is unknown whether GRNs in the model are more or less constrained than real GRNs. Thus, it is not known whether the prevalence of DSD in this model applies generally to real development, where GRN constraints depend on so many factors. At a minimum, the possible difference in constraints between the model and real development should be discussed as a limitation of the model. A more thorough change to the manuscript would be to test the effect of changing the constraints on the GRN. I am sure there are many ways to devise such a test, but I will give my recommendation here. 

      [OPTIONAL] My recommendation is that the authors should run additional simulations with simplified mutational dynamics by constraining the model to N genes (no duplications and deletions), of which M out of these N genes contribute to fitness via the specific pattern (with M=2 in the current model). The authors should then test the effect of changing N and M independently, and how this affects the prevalence of DSD. If the prevalence of DSD is robust to changes in N and M, it supports the authors' argument that DSD is highly prevalent in developmental evolution. If DSD prevalence is highly dependent on M and/or N, then the claims made in the manuscript about the prevalence of DSD must change accordingly. I acknowledge that these simulations may be computationally expensive, and I think it would be great if the authors knew (or devised) a more efficient way to test the effect of GRN constraints on DSD prevalence. Nevertheless, these additional simulations would make for a potentially very interesting manuscript.

      Minor comments:

      1. The authors present an analysis correlating conserved non-coding sequence (CNS) composition with gene expression to investigate developmental systems drift. One flaw of this analysis is that it uses deeply conserved sequences as a proxy for the entire cis-regulatory landscape. The authors acknowledge this flaw in the discussion. Another potential flaw is equating the bulk RNA-seq data with a conserved phenotype. In lines 226-227 of the manuscript, it is written that "In line with our computational model, we compared gene expression patterns to measure changes in phenotype." I am not sure if there is an equivalence between the two. In the computational model, the developmental outcome determining fitness is a spatial pattern, i.e., an emergent product of gene expression and cell interactions. In contrast, the RNA-seq data shows bulk measurements in gene expression for different organs. It is conceivable that, despite having very similar bulk measurements, the developmental outcome in response to gene expression (such as a spatial pattern or morphological shape) changes across species. I think this difference should be explicitly addressed in the discussion. The authors may have intended to discuss this in lines 320-326, although it is unclear to me.
      2. Can the authors justify using these six species in the discussion or the results? Are there any limitations with choosing four closely related and two distantly related species for this analysis, in contrast to, say, six distantly related species? If so, please elaborate in the discussion.
      3. In Figure S7, some profiles show no conservation across the six species. Can we be sure that a stabilising selection pressure conserves any CNSs? Is it possible that the deeply conserved CNSs mentioned in the main text are conserved by chance, given the large number of total CNSs? A brief comment on these points in the results or discussion would be helpful.
      4. Line 7-8: I thought this was a bit difficult to read. The connection between (i) evolvability of complex phenotypes, (ii) neutral/beneficial change hindered by deleterious mutations, and (iii) DSD might not be so simple for many readers, so I think it should be rewritten. The abstract was well written, though.
      5. Line 274 vs 276: Is there a difference between regulatory dynamics and regulatory mechanisms?
      6. Figure S4: Do you expect the green/blue lines to approach the orange line in the long term? In some clonal experiments, it seems like it will. In others, it seems like it has plateaued. Under continual DSD, I assume they should converge. It would be interesting to see simulations run sufficiently long to see if this occurs.
      7. Line 27: Evolutionarily instead of evolutionary?
      8. Line 67-68: References in brackets?
      9. Line 144: Capitalise "fig"
      10. Fig. 3C caption: correct "1, 2, 4, 11" (should be 8)
      11. Line 192: Reference repeated
      12. Fig. 5 caption: Capitalise "Supplementary figure"
      13. Line 277: Correct "A previous model Johnson.."
      14. Line 290: Brackets around reference
      15. Line 299: Correct "will be therefore be"
      16. Line 394: Capitalise "table"
      17. Line 449: Correct "was build using"
      18. Fig. 5B: explain the red dashed boxes in the caption
      19. Some of the Figure panels might benefit from further elaboration in their respective captions, such as 3C and 5F.

      Significance

      General Assessment:

      This manuscript tackles a fundamental evolutionary problem of developmental systems drift (DSD). Its primary strength lies in its integrative approach, combining a multiscale evo-devo model with a comparative genomic analysis in angiosperms. This integrative approach provides a new way of investigating how developmental mechanisms can evolve even while the resulting phenotype is conserved. The details of the theoretical model are well defined and succinctly combined across scales. The manuscript employs several techniques to analyse the conservation and divergence of the theoretical model's gene regulatory networks (GRNs), which are rigorous yet easy to grasp. This study provides a strong platform for further integrative approaches to tackle DSD and multicellular evolution.

      The study's main limitations are due to the theoretical model construction and the interpretation of the results. The central claim that DSD occurs extensively through predominantly neutral evolution is not sufficiently supported, as the analysis does not rule out an alternative: DSD is caused by adaptive evolution for increased robustness to developmental or mutational noise. Furthermore, constructing the model with a high-dimensional GRN space and a low-dimensional phenotypic target may create particularly permissive conditions for DSD, raising questions about the generality of the theoretical conclusions. However, these limitations could be resolved by changes to the model and further simulations, although these require extensive research. The genomic analysis uses cis-regulatory elements as a proxy for the entire regulatory landscape, a limitation the authors are aware of and discuss. The genomic analysis uses bulk RNA-seq as a proxy for the developmental outcome, which may not accurately reflect differences in plant phenotypes.

      Advance:

      The concept of DSD is well-established, but mechanistic explorations of its dynamics in complex multicellular models are still relatively rare. This study represents a mechanistic advance by providing a concrete example of how DSD can operate continuously under stabilising selection. I found the evolutionary simulations and subsequent analysis of mechanisms underlying DSD in the theoretical model interesting, and these simulations and analyses open new pathways for studying DSD in theoretical models. To my knowledge, the attempt to directly link the dynamics from such a complex evo-devo model to patterns of regulatory element conservation across a real phylogeny (angiosperms) is novel. However, I think that the manuscript does not have sufficient evidence to show a high prevalence of DSD through neutral evolution in their theoretical model, which would be a highly significant conceptual result. The manuscript does have sufficient evidence to show a high prevalence of DSD through adaptive evolution under stabilising selection, which is a conceptually interesting, albeit somewhat expected, result.

      Audience:

      This work will be of moderate interest to a specialised audience in the fields of evolutionary developmental biology (evo-devo), systems biology, and theoretical/computational biology. Researchers in these areas will be interested in the model and the dynamics of GRN conservation and divergence. The results may interest a broader audience across the fields of evolutionary biology and molecular evolution.

      Expertise:

      My expertise is primarily in theoretical and computational models of biology and biophysics. While I have sufficient background knowledge in bioinformatics to assess the logic of the authors' genomic analysis and its connection to their theoretical model, I do not have sufficient expertise to critically evaluate the technicalities of the bioinformatic methods used for the identification of conserved non-coding sequences (CNSs) or analysis of RNA-seq data. A reviewer with expertise in plant comparative genomics would be better suited to judge the soundness of these specific methods.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, van der Jagt and co-workers present a computational model of the evolution of gene regulatory networks that underpin the development of shoot apical meristems in plants. They find evidence for conservation of a subset of regulatory interactions over many thousands of generations. They also show that after reaching a fitness plateau, the topology of regulatory interactions continues to evolve, giving rise to substantial differences in regulatory networks among cloned populations. Their model suggests that cis-regulatory rewiring is key for developmental evolution, and they reach a similar conclusion after analysing two empirical datasets covering six land plant species. Overall, I find that this study is excellently executed, its methodology sufficiently described, and that its claims are well-supported by the data presented.

      Major comments:

      • Every computational model necessarily makes some simplifying assumptions. It would be nice if the authors could summarise in a paragraph in the Discussion the main assumptions made by their model, and which of those are most worth revisiting in future studies. In the current draft, some assumptions are described in different places in the manuscript, which makes it hard for a non-expert to evaluate the limitations of this model.
      • I did not find any mention of potential energetic constraints or limitations in this model. For example, I would expect high levels of gene expression to incur significant energy costs, resulting in evolutionary trade-offs. Could the authors comment on how taking energy limitations into account might influence their results?

      Minor comments:

      • Figure 3C lists Gene IDs 1, 2, 8, and 11, but the caption refers to genes 1, 2, 4, and 11.

      Significance

      I have to note that my expertise is not in developmental systems drift, but I am generally interested in the evolution of complex phenotypes in response to various environmental pressures. Thus, I do not feel qualified to evaluate the novelty of this work, which I hope other reviewers have done. Nevertheless, I found this study very interesting and the manuscript generally easy to understand. I believe that this study will be of strong interest primarily (but not only) to evolutionary and systems biologists, regardless of the taxonomic group of their research focus.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      # Summary

      On the basis of computational modelling and bioinformatic data analysis, the authors report evidence for Developmental System Drift in the plant apical meristem (a plant stem cell tissue from which other tissues and organs grow, like shoots and roots). The modelling focuses on a general (shoot) apical meristem, the data analysis on the floral meristem. As a non-plant computational biologist, I was lacking some basic plant biology to immediately understand all the technical terms. It hindered a bit, but was not a show-stopper. That said, I interpret their study as follows.

      In the computational modelling part, the authors take into account gene expression, protein complex formation, stochasticity (expression noise), tissue shape, etc. to do evolutionary simulations to obtain a "standard" gene expression pattern known from the shoot apical meristem. Next, they analyze the gene regulatory networks in terms of conserved regulatory interactions. They find two timescales: interactions either turn over quickly or are replaced slowly (because they are under selection). The slowly replaced interactions are important for the realization of the phenotype and their turnover (further explored in a separate set of "neutral evolution" simulations) is called DSD by the authors. The authors state that at the basis of DSD is overlap in gene expression domains, such that genes can take over from each other. Next, the authors analyze two public data sets to show that DSD-associated phenomena such as turn-over of (conserved) noncoding sequences and differences in gene expression patterns occur in plants.

      Considering my limited amount of time and energy, I apologize in advance for stupidities and/or un-elegantly formulated sentences. I'll be happy to discuss with the authors about this work, it was a pleasant summer read!

      Anton Crombach

      Major comments

      • It is system drift, not systems drift (see True and Haag 2001). No 's' after system.
      • I am afraid I have a problem with the manuscript title. I think "Ubiquitoes" is misplaced, because it strongly suggests you have a long list of case studies across plants and animals, and some quantification of DSD in these two kingdoms. That would have been an interesting result, but it is not what you report. I suggest something along the lines of "System drift in the evolution of plant meristem development", similar to the short title used in the footer.
      • Alternatively, the authors may aim to say that DSD happens all over the place in computational models of development? In that case the title should reflect that the claim refers to modeling. (But what then about the data analysis part?)
      • The observation of DSD in the computational models remains rather high-level in the sense that no motifs, mechanisms, subgraphs, mutations or specific dynamics are reported to be associated to it ---with the exception of gene expression domains overlapping. Perhaps the authors feel it is beyond this study, but a Results section with a more in-depth "mechanistic" analysis on what enables DSD would (a) make a better case for the extensive and expensive computational models and (b) would push this paper to a next level. As a starting point, it could be nice to check Ohno's intuition that gene duplications are a creative "force" in evolution. Are they drivers of DSD? Or are TFBS mutations responsible for the majority of cases?
      • Multiple times in the Abstract and Introduction the authors make statements on "cis-regulatory elements" that are actually "conserved non-coding sequences" (CNS). Even if it is not uncommon for CNSs to harbor enhancers etc., I would be very hesitant to use the two as synonyms. As the authors state themselves, sequences, even non-coding, can be conserved for many reasons other than CREs. I would ask the authors to support better their use of "CREs" or adjust language. As roughly stated in their Discussion (lines 310-319), one way forward could be to show for a few CNS that are important in the analysis (of Fig 5), that they have experimentally-verified enhancers. Is that do-able or a bridge too far?

      Minor comments

      Statement of significance:

      • line 7. evo-devo is jargon
      • l9. I would think "using a computational model and data analysis"
      • l13. Strictly speaking you did not look at CREs, but at conserved non-coding sequences.
      • l14. "widespread" is exaggerated here, since you show for a single organ in a handful of plant species. You may extrapolate and argue that you do not see why it should not be widespread, but you did not show it. Or tie in all the known cases that can be found in literature.

      Abstract:

      • l16. "simpler" than what?
      • l27. Again the tension between CREs and non-coding sequence.
      • l28. I don't understand the use of "necessarily" here.

      Introduction:

      • l34-35. A very general biology statement is backed up by two modeling studies. I would have expected also a few based on comparative analyses (e.g., fossils, transcriptomics, etc).
      • l36. I was missing the work on "phenogenetic drift" by Weiss; and Pavlicev & Wagner 2012 on compensatory mutations.
      • l38. Kimura and Wagner never had a developmental process in mind, which is much bigger than a single nucleotide or a single gene, respectively. First paper that I am aware of that explicitly connects DSD to evolution on genotype networks is my own work (Crombach 2016), since the editor of that article (True, of True and Haag 2001) highlighted that point in our communications.
      • l40. While Huynen and Hogeweg definitely studied the GP map in many of their works, the term goes back to Pere Alberch (1991).
      • l54-55. I'm missing some motivation here. If one wants to look at multicellular structures that display DSD, vulva development in C. elegans and related worms is an "old" and extremely well-studied example. Also, studies on early fly development by Yogi Jaeger and his co-workers are not multicellular, but at least multi-nuclear.
      • Obviously these are animal-based results, so to me it would make sense to make a contrast animal-plant regarding DSD research and take it from there.
      • l66-86. It is a bit of a style-choice, but this is a looong summary of what is to come. I would not have done that. Instead, in the Introduction I would have expected a bit more digging into the concept of DSD, mention some of the old animal cases, perhaps summarize where in plants it should be expected. More context, basically.

      Results:

      • l108. Could you quantify the conserved interactions shared between the populations? Or is each simulation so different that they are pretty much unique?
      • l169. "DSD driving functional divergence" needs some context, since DSD is supposed to not affect function (of the final phenotype). Or am I misunderstanding?
      • l171. You discuss an example here, would it be possible to generalize this analysis and quantify the amount of DSD amongst all cloned populations? And related question: of the many conserved interactions in Fig 4A, how many do the two clonal lineages share? None? All?
      • l176. Say which interaction it is. Is it 0->8, as mentioned in the next paragraph?
      • l190. In the section on DSD in plant gene regulation, the repeated explanation of where the data comes from is a bit tedious to read. You intro it clearly at the start, that is enough.
      • l197. Bulk RNAseq has the problem of averaging gene expression over the population of cells. How do you think that impacts your test for rewiring? If you would do a similar "bulk RNA" style test on your computational models, would you pick up DSD?
      • l202. I do not understand the "within" of a non-coding sequence within an orthogroup. How are non-coding sequences inside an orthogroup of genes?
      • l207-217. This paragraph is difficult to read and would benefit from rephrasing. Plant-specific jargon, numbers do not add up (line 211), statements are rather implicit (9 deeply conserved CNS are the 3+6? Where do I see them in Fig 5B? And where do I see the lineage-specific losses?).
      • l223. Looking at the shared CNS between SEP1-2, can you find a TF binding site or another property that can be interpreted as regulatory importance?
      • l225. My intuition says that the continuity of the phenotype may not be necessary if its loss can be compensated for somehow by another part of the organism. I.e., DSD within DSD. It is a poorly elaborated thought, I leave it here for your information. Perhaps a Discussion point?
      • l233. "rarely"? I don't see any high Pearson distances.

      • Fig 4. Re-order of panels? I was expecting B at C and vice versa.

      • Fig 5B. Red boxes not explained. Mention that it is an UpSetplot?
      • Fig 5D. It would be nice to quantify the minor and major diffs between orthologs and paralogs.

      Discussion:

      • l247. Over-generalization. In a specific organ of plants...
      • l249. Where exactly is this link between diverse expression patterns and the Schuster dataset made? I suggest the authors to make it more explicit in the Results.
      • l268. Final sentence of the paragraph left me puzzled. Why talk about opposite function?
      • l269. What about phenotypic plasticity due to stochastic gene expression? Does it play a role in DSD in your model? I am thinking about https://pubmed.ncbi.nlm.nih.gov/24884746/ and https://pubmed.ncbi.nlm.nih.gov/21211007/
      • l269. What about time scales generated by the system? Looking at Fig 2C and 2D, the elbow pattern is pretty obvious. That means interactions sort themselves into either short-lived or long-lived. Worth mentioning?
      • l291. Evolution in a constant fitness landscape increases robustness.
      • l296. My thoughts, for your info: I suspect morphogenesis as single parameters instead of as mechanisms makes for a brittle landscape, resulting in isolated parts of the same phenotype.

      Methods: I have diagonally read through the Methods section, I did not have time to dig in. I hope another reviewer can compensate for me.

      Significance

      Nature and significance of advance

      I find this study a strong contribution to the concept of DSD. It was good to see that colleagues have made the effort of making a convincing case for the presence of DSD in plants. This will be appreciated by the evo-devo community in general. On top of that, the computational modelling work is excellent and sets new standards that will be appreciated by computational colleagues. And I anticipate that the evolutionary biology community welcomes the extension of DSD to the plant kingdom; so far it has been dominated by animal studies.

      I see two limitations: (1) almost no mechanistic explanation of what drives DSD in the simulations. (2) the Abstract, Introduction, etc. need some polishing to be better in line with the results reported.

      Context of existing literature

      Literature is very modeling focused, it could use some empirical support. Also, some literature on DSD is missing: Weiss 2005, Pavlicev 2012, "Older" C. elegans work by the group of Marie-Anne Felix. Probably some more recent empirical case studies have established DSD as well... I may not be aware, as I did not keep track of it.

      What audience?

      In no particular order: plant evolution, plant development, evo-devo, computational biology.

      My field of expertise

      My expertise: gene regulatory networks, evolution, development (in animals), computational modelling, bioinformatic data analysis (single cell omics).

      Phylogenetic tree building is surely not my strength.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all the reviewers for their comments and suggestions, which have helped in revising the manuscript for a broader audience. Some of the experiments that were suggested by the reviewers have been performed and included in the revised manuscript. The response to reviewers is provided below their comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      MprF proteins exist in many bacteria to synthesize aminoacyl phospholipids that have diverse biological functions, e.g. in the defense against small cationic peptides. They integrate two functions, the aminoacylation of lipids, i.e. the transfer of Lys, Arg or Ala from tRNAs to the head group, and the flipping of these modified lipids to the membrane outer leaflet. The authors present structures of MprF from Pseudomonas aeruginosa and describe these structures in great detail. As MprF enzymes confer antibiotic resistance and are therefore highly important, studying them is significant and interesting. Consequently, their structures have been substantially characterized in recent years, including the publication of the dimeric full-length MprF from Rhizobium (Song et al., 2021).

      While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule description of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      It is true that we have described the experimental details of PaMprF, including the constructs, at length. We had reconstructed the map of dimeric PaMprF in 2020, but with the publication of the homologous structures (Song et al 2021 and the unpublished Rhizobium etli structure), we had to make sure that the PaMprF dimer is not an artefact. Hence our attempts to rule this out with different constructs and extensive testing with various detergents, which we would like to keep in the manuscript. We realise the importance of focusing on the novel and interesting parts and have reshuffled the sections (comparing structures and validating the dimer interface), followed by the description of the modelling of lipid molecules.

      Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, both of which has not been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      The trials with cholesterol hemisuccinate (CHS) stemmed more out of curiosity (we are aware that no cholesterol is present in bacterial membranes). We had started the initial analysis of PaMprF with DDM, in which it was largely monomeric (unpublished observation, supported by the recent publication of PaMprF in DDM by Hankins et al 2025). When we observed that GDN was essential for the stability of the dimer (even LMNG did not suffice), we asked whether a combination of CHS with DDM would keep the dimer intact; it did not, and GDN was found to be important. The use of CHS for prokaryotic membrane protein studies has now been reported in a few different systems, a recent example being Caliseki et al., 2025. We would like to keep the observation with CHS in the manuscript, and we have moved this figure to Appendix Fig. S3C.

      In addition, a recent report on MgtA, a magnesium transporter (Zeinert et al., 2025), observed that DDM/LMNG resulted in monomeric enzyme while GDN resulted in dimeric enzyme, although the dimer interface was in the soluble domain. We have added this reference and the observation on MgtA to the discussion (page 13, lines 407-411).

      We like to think that the milder GDN tends to keep membrane proteins, or oligomers of membrane proteins, more stable, but further studies on multiple labile membrane protein systems will be required to substantiate this.

      Hence, while I think that the data presented here would be worth publishing, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

      Further major points:

      • The authors always jump between their structures in detergent and nanodisc during all the descriptions, which makes following the story even more difficult. Please first describe one of the structures and then (briefly) discuss relevant similarities and differences afterwards.

      The flow and description of the structures have been modified and the figures rearranged to make them easier to follow. The panel in figure 2 describing the overlay of the GDN and nanodisc structures has been moved to Appendix Fig. S2B. Thus, figure 2 now describes only the salient features of the structures (the interacting residues between the membrane and soluble domain) and the terminal helix.

      • The difference in dimerization between Pseudomonas and Rhizobium is the most interesting and surprising feature (if true) of the new structures. However, it is not really presented as such. The authors should put more emphasis on making clear that this is a complete rotation of the monomers with respect to each other (by how many degrees?) and they should visualize it even more clearly in Figure 4 (and label the figure so that it is possible to understand it without having to read the text or the legend first).

      We thought the colouring of the TM helices should make the difference in interface more obvious (the N and C-terminal TM helices in different colours). Now, we have also labelled the TM helices, so that it is easier to follow (this was also shown in panel E). The rotation is ~180° and this is now mentioned in the figure legend.

      • P. 10: The authors insinuate that only one of the dimer interfaces, either Pseudomonas or Rhizobium could be real, but disregard the possibility that both might be the biologically relevant interfaces of the respective species and that there might have been a switch of interfaces during evolution. They should also mention and discuss this possibility.

      We did not imply that only one of the interfaces is real, but clearly mentioned that it could also be a different conformational state (page 7, lines 226-228). In the revised version, we have included a multiple sequence alignment (we had not included it in the initial draft as it had been presented in several previous publications). The MSA (Appendix Fig. S6) reveals that neither of the interfaces is highly conserved.

      • Fig. 5G: The authors claim that the higher molecular band that appears in the mutant is a "dimer with aberrant migration" of >250 kDa as opposed to the expected 150 kDa. They should explain how they came to this conclusion and how they can be sure that the band does not correspond to a higher oligomer (trimer or tetramer). They could show, by extraction and purification scheme similar to the wildtype using first LMNG and then GDN, followed by at least a preliminary EM analysis, that the crosslinked mutant MprF is indeed a dimer, or use other biophysical methods to do the same, otherwise this experiment does not show much. Furthermore, they should also include a cysteine mutant in the part of Pseudomonas MprF that would be involved in a Rhizobium-like interface in their crosslinking experiments to check whether they could also stabilize dimers in this case.

      The band of the double mutant after crosslinking (or even without crosslinking) migrates at a higher molecular weight than that expected for a dimer, and could potentially be a higher-order oligomer than a dimer. We also note that in the previous publication by Song et al 2021, the crosslinking of RtMprF also resulted in a higher molecular weight band (shown also by Western blot).

      We now substantiate the dimer of PaMprF with different approaches. We employed blue-native gel and also SDS-PAGE of the purified protein. This clearly shows that the higher molecular band after crosslinking is a dimer (Figure 4B and Fig. EV4D). In particular, in the BN-PAGE, the treatment of mutants with crosslinkers revealed a dimeric band even in the presence of SDS. Further, we have performed cryoEM analysis of the mutants H386C/F389C and H566C. The images, classes and reconstruction show that the enzyme forms a dimer similar to the WT. Interestingly, in the H566C mutant in nanodisc we also observe a small population whose architecture resembles the Rhizobium-like interface (classes shown in Fig. EV7 and Appendix Fig. S5). This prompted us to look closely at other datasets, and it is clear that during reconstitution in nanodisc we observe both kinds of dimer interface, but the PaMprF dimer is predominant. We also observe higher-order oligomers (tetramers) in GDN, but as only a few views are visible, a reconstruction could not be obtained (Appendix Fig. S5). In addition, we introduced two cysteines on the Rhizobium-like interface and no crosslinking on the membranes was observed (Figure 4B). It is possible, however, that these chosen mutants are not accessible to the crosslinker. Thus, we conclude that the oligomers of PaMprF are labile and sensitive to the nature of the detergent.

      • As the question whether the observed interface is real or an artefact is very central to the value of the structural data and the drawn conclusions from it, the authors should make more effort to analyze and try to validate the interface. First, an analysis of interface properties (buried surface area, nature of the interactions, conservation) should be performed for the interface as observed in the Pseudomonas structure but also for a (hypothetical) Rhizobium-like interface of two Pseudomonas monomers (such a model of a dimer should be easily obtainable by AlphaFold using the available Rhizobium structures as models). Then, experimental methods such as FRET or crosslinking-MS would allow to draw more solid conclusions on the distances between potential interface residues. While these experiments are a certain effort, the question whether the dimer interface is real is so central to the paper that it would be worthwhile to make this effort.

      We have included the interface area and nature of interactions in the revised manuscript (page 7, lines 221-223).

      We attempted to predict the dimeric structure of PaMprF with AlphaFold (and also included RtMprF). Some of the attempts from the predictions are summarised in figure 1.

      The prediction of the monomer is of high confidence, but that of the oligomer (here a dimer) is of low confidence (from ipTM values). Even the prediction for the Rhizobium enzyme has low confidence and gives a completely different architecture (and in some trials with lipids, it gives an inverted or non-physiological dimer). Only when the monomer of PaMprF with lipids and tRNA was given as input (requested by reviewer 2 and described below) did it predict an oligomeric structure with some confidence; the rest were not informative.

      • As it seems that detergents might disrupt or modify the dimer interface, it might be an alternative to solubilize the protein in a more native environment by polymer-stabilized nanodiscs using DIBMA or similar molecules.

      We have tried to use SMALPs for extraction of PaMprF. We were able to solubilise the enzyme but currently cannot enrich it sufficiently for structural studies; this will require further optimisation.

      • Since parts of the Discussion are mostly repetitions of the Results part and other parts of the Discussion also contain a large extent of structure analysis one would usually rather expect in the Results part instead of the Discussion, the authors should consider condensing both to a combined (and overall much shorter) Results & Discussion section.

      We have rewritten much of the discussion section and removed any repetition from the results sections. We would prefer to keep the results and discussion separate.

      Minor points:

      • Explain abbreviations the first time they appear in the text, e.g. TTH

      This is now expanded in the first instance.

      • Figure labels are very minimalistic. This should be improved, e.g. by putting labels to important structural features that appear in the text, otherwise the figures are not an adequate support for the text.

      The font size for the labels has been increased.

      • Figure 5: Label where the different oligomers run on the gels

      Labelled.

      Reviewer #1 (Significance (Required)):

      While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule description of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, both of which has not been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      Hence, while I think that the data presented here would be worth publishing, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Shaileshanand J. et al., reported the structures of Multiple Peptide Resistance Factor, MprF, which is a bi-functional enzyme in bacteria responsible for aminoacylation of lipid head groups. The authors purified MprF from Pseudomonas aeruginosa in GDN micelles and nanodiscs, and by applying cryo-EM single particle method, they successfully reached near-atomic resolution, and built corresponding atomic models. By applying structural analysis as well as biochemistry methods, the authors demonstrated dimeric formation of MprF, exhibited the dynamic nature of the catalytic domain of this enzyme, and proposed a possible model on tRNA binding and aminoacylation.

      Major comments

      1. In abstract, the authors stated 'Several lipid-like densities are observed in the cryoEM maps, which might indicate the path taken by the lipids and the coupling function of the two functional domains. Thus, the structure of a well characterised PaMprF lays a platform for understanding the mechanism of amino acid transfer to a lipid head group and subsequent flipping across the leaflet that changes the property of the membrane.'

      Firstly, those lipid-like densities were demonstrated in Fig 3A, since densities of lipids of purified membrane proteins often exist within regions of relatively low local resolution, or low quality, I think more detailed description on how the authors defined which part of the density belongs to lipid and how they acquired the modeling of some of the lipids is required. And the authors modeled phosphatidylglycerol into the GDN MprF, I would require additional experiment, for instance, mass spectrometry over the purified sample, to demonstrate the existence of this specific lipid with the sample.

      Secondly, regarding the last sentence in the abstract, how these structures lay a platform for further understanding was poorly discussed in both result section and discussion section, since the authors clearly stated 'This cavity perhaps provides a path for holding lipids...', then the statement in the next sentence 'Taken together... the vicinity to the cavities described above indicates the possible path taken by the lipids to enter and exit the enzyme' does not have a reliable evidence to support this conclusion, I would suggest the authors move these statements into discussion section, and elaborate more over this issue since it is an important part in the abstract, or make a more solid proof using other approaches, such as molecular dynamics simulation, to make these statements solid in the result section.

      The membranes of E. coli contain predominantly phosphatidylethanolamine (PE), with phosphatidylglycerol (PG) as the next most abundant lipid; cardiolipin, though less abundant, plays an important role in the functioning of many membrane proteins. In our map, the non-protein densities are unambiguous and can be observed as long densities reflective of acyl chains (note that the GDN used in purification has no acyl chain), and hence we attributed these densities to lipids (Fig. EV4E/F and Figure 5A). Only in a few of these densities could a head group be modelled, and the identity of the lipid at the dimer interface as PG is based on the requirement of negatively charged lipids for oligomerisation of membrane proteins in general (for example, KcsA tetramer formation requires PG; Marius et al., 2005; Valiyaveetil et al., 2002;2004). It is true that the lipid densities are at the peripheral regions of the map, but here only acyl chains have been modelled. Within the membrane domain, one reasonably ordered lipid is observed, and by analogy with the R. tropici structure it is possible to build a modified PG (in PaMprF, ala-PG). However, the density of the head group is not unambiguous (unlike lysine in R. tropici, whose density stands out), and hence we have modelled it as PG alone. The identification and modelling of lipid densities is described in the methods (page 20, lines 649-650).

      We agree that mass spectrometry analysis of purified lipids will be useful, but it will not be able to tell the position of the lipid in the map (model); for this we still require a map at higher resolution with better ordered lipids. We have recently developed a workflow for native MS and plan to initiate analysis of PaMprF in the near future, which will provide details of the lipids purified with the enzyme.

      We had initiated molecular dynamics simulation during the review process, and we had included tRNA molecules (shorter version) as we felt the connection between tRNA binding and lipid modification was important. This would have also explained the path taken by lipids (performed by Hankins et al., 2025 in their publication). However, this is likely to require more work (and computing resources) and both mass spectrometry and molecular dynamics will be part of the future work.

      We have rewritten the discussion and changed the last line of the abstract to the following

      “From the structures, the binding modes of tRNA and lipid transport can be postulated and the mobile secondary structural elements in the synthase domain might play a mechanistic role”.

      (in the abstract, lines 24-26).

      Fig 2B, it seems the H566 sidechains were overlapping in the zoom-in figure of distance measurement between H566 residues, to clarify this, authors should either present another figure with rotation, to better demonstrate their relative locations, or swap this zoom-in figure with another figure with rotations. Also, could the authors briefly comment on why they chose H566 for distance measurement specifically?

      The side chains of residue H566 in the nanodisc model face each other at the interface; hence this residue was chosen to show the proximity.

      Related to previous comment, I see one additional green square in Fig. 2A and an additional green square in Fig. 2B, without any zoom-in images provided on these regions. Besides, they're focusing on two different domains with same color, any particular reason why they're there? If so, please provide the information in figure legends.

      The green squares in panels 2A and 2B are the regions that have been zoomed in panels 2D and 2E showing the interactions of the TTH. This is now made clear in the legend as well as in the figure.

      Related to previous comment, authors should also provide distance measurement over electrostatic interaction sites in Fig. 2A, since distance plays as an important factor in these forces.

      The electrostatic interactions have been included.

      For Fig. 2C, since in Fig. 1, the authors have already indicated the differences between reconstruction of the GDN and nanodisc datasets, this information provided here seems to be a bit abundant, I suggest either move this panel to Fig. 1, to make a visualization on both electron densities as well as atomic models, or move this panel to supplementary figures.

      We thank the reviewer for the suggestion. Panel 2C has been moved to Appendix Fig. S2B.

      Fig. 3B, some of the spheres of the lipids were also marked as red, any particular reason why they're red? Do they indicate they're phosphate heads? If so, could the authors provide evidences how they define these orientations of the lipid heads? If not, any particular reason why they're red?

      The non-protein densities (i.e., densities above noise that remain after modelling of the protein residues and are found individually) have been modelled as lipids (these additional densities are shown in Fig. EV4E). Except for a few, all these densities have been modelled only as acyl chains. The lipids modelled with a head group and phosphate (which contain oxygen) and their fit to the density are shown in both Figure 3A and EV4F. Hence, red (oxygen) is seen in the space-filling model of the lipids (the density for a few lipids is shown; see also the response to the comment below).

      Fig. 3C, the fitted model of lipid and its corresponding density should be added to Fig. S4, to give more detailed view on the quality of the fitting.

      Figure 3 has now been reorganised, and the new figure (Fig. 5) has only three panels. We have provided an enlarged view of the lipids in the membrane domain along with unmodelled densities in 3A. In addition, Fig. EV4F shows the fit of select lipids to the density.

      Fig. 4D and 4E, could the authors also indicate the RMSD values when comparing the differences of RtMprF, PaMprF, ReMprF, this information would be helpful to understand how big of a difference within these three models.

      The RMSD values of the structural comparison are given in the text.

      Fig. 6E, the coloring used for CCA-Ala were similar to the blue part of soluble domain, could the authors change the coloring a bit? Also, for Fig. 6F, I would suggest the authors provide a prediction model, such as using AlphaFold3, of this tRNA interaction site, to further validate this proposed model.

      The colour of the CCA part has been changed in the revised figure. Following the suggestion of the reviewer, we used AlphaFold3 to predict the complex of PaMprF with tRNA (or a shorter version) (Figure 2). As mentioned above in response to reviewer 1, the prediction of the dimeric enzyme was of low confidence, and this is also the case when a combination of tRNA, lipids and enzyme sequence is given. If only the CCA end is provided instead of full-length tRNA, the prediction program does position it in the postulated cavity. Only with the monomeric enzyme and tRNA does one get a reasonable model. With respect to the proposed model in 6F, we currently do not have any evidence and this remains a postulate. In the revised manuscript, we have replaced it with a conservation figure, which we think is more relevant.

      In Supplementary Figures S1 and S3, the angular distribution of maps exhibited preferred orientation to certain extent, 3D FSC estimation should also be supplied for these maps, as an indication of whether the reconstructed densities were affected or not.

      We have included the 3DFSC plots for all the data sets (including the new ones in figures EV1, 2, 5, 6, 7). It is evident that the nanodisc datasets in general are slightly anisotropic.

      For Fig S3B, could the authors switch to another image with better contrast?

      This is now replaced with an image to show the particles.

      Minor comments

      Fig. 2E and 2F, distance measurement should also be supplied to these two panels.

      We have now included the distance measurement in both the panels, which are now Fig. 2D and 2E.

      Fig. 5D, since in Fig. 4F and 4G already mentioned the skeleton of GDN, this modeling part should be presented before exhibit it in dimer interface, the authors should rearrange the sequence over these three panels.

      The figures in the revised manuscript have been rearranged. Figure 5 (now Figure 4) has been modified to include the biochemical analysis (crosslinking studies), and panel 5D has been removed.

      In Supplementary Figure S3, which density was shown for the PaMprF local resolution estimation result? Authors should provide this information as two maps were shown in this figure.

      The local resolution is for the C2-symmetrised map, and this is now mentioned in the panel.

      CROSS-REFEREE COMMENTS

      Both Reviewer #1 and #3 made comments over technical issue, their evaluation over functional aspects of this protein is what I was lacking over my comments, also, their evaluation of the biological narrative, relevance toward previous research is also more insightful. Finally, they offer valuable suggestions on how to adjust the article to make it more readable, and better describing the biological story which I would suggest the authors to pay attention to.

      Reviewer #2 (Significance (Required)):

      Significance

      The authors mainly focused on the structure of MprF in Pseudomonas aeruginosa, this protein is essential for the resistance to cationic antimicrobial peptides. A combination of structural and biochemical analysis provided evidences to the dimeric formation to this enzyme, and the analysis over differences of purified proteins using GDN and nanodisc was particular interesting, which provide new insight regarding the flexible nature of this enzyme, and potentially could be beneficial to the membrane protein community, as it demonstrates the differences in detergent/nanodisc of choice could affect the assembly of the protein of interest. Still, some of the statements in the manuscript, for instance, the assignment of lipids was over-claimed and could be benefited from additional approaches to support the issue. I would suggest some refinement in the discussion section as well as some of the figures.

      My expertise: cryo-EM single particle analysis; cryo-ET; sub-tomo averaging; cryo-FIB;

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Jha and Vinothkumar characterize the cryoEM structure of the alanyl-phosphatidylglycerol producing multiple peptide resistance factor (MprF) of Pseudomonas aeruginosa. MprF proteins mediate the transfer of amino acids from aminoacyl-tRNAs to negatively charged phospholipids resulting in reduced membrane interactions with cationic antimicrobial peptides (produced by the host and competing microorganisms). The phospholipid modifications involve in most cases the transfer of lysine or alanine to phosphatidylglycerol. MprF proteins are membrane proteins consisting of a soluble and hydrophobic domain. Multiple functional studies have shown that the soluble domain of MprF mediates the aminoacylation of phosphatidylglycerol, while the hydrophobic domain mediates the "flipping" of aminoacylated phospholipids across the membrane, a process that is crucial to repulse or prevent the interaction of antimicrobial peptides encountered at the outer leaflet of bacterial membranes. Aside from its role in conferring antimicrobial peptide resistance, other roles of MprF have been described including more physiological roles such as improving growth under acidic conditions. Interestingly, MprF proteins are also found in Gram-negative bacteria which are already protected by an additional membrane that includes LPS. However, in Pseudomonas aeruginosa, MprF confers phenotypes that are similar to those observed in Gram-positive bacteria. Importantly, crystal structures of the soluble domain have led to important insights into aminoacyl phospholipid synthesis and recent studies on the cryoEM structure of Rhizobium tropici have confirmed functional and preliminary structural studies with other MprF proteins. The cryoEM structure from R. tropici confirmed the dimeric structure of MprF and supported a role of the hydrophobic domain in flipping lysyl-phosphatidylglycerol across the membrane. 

      A comparison of the structures of lysyl-phosphatidylglycerol with alanyl-phosphatidylglycerol producing MprFs could reveal new insights into the mechanism of transferring aminoacyl-phospholipids from the soluble domain to the hydrophobic domain and translocation of alanyl- vs lysyl-phosphatidylglycerol across the membrane.

      Major concerns

      1. The study by Jha and Vinothkumar provides the cryoEM structure of an alanyl-phosphatidylglycerol producing MprF protein which is in principle an important milestone in gaining a better understanding of the mechanism of aminoacyl-phospholipid synthesis and flipping, including the potentially different requirements of accommodating different aminoacyl -tRNAs and aminoacyl-phospholipid species. However, this is not addressed. The authors present a "distinct architecture" compared to the structure of R. tropici- MprF, without providing functional insights and the focus of the study shifts to the role of detergents in determining MprF structures via cryoEM. Thus, after fundamental discoveries have been made with crystal structures of the soluble domain and cryoEM structure of R. tropici, this study -while valuable as a resource- seems to offer only an incremental advance in understanding the mode of action of MprF and the potential different requirements for transferring alanyl-phosphatidylglycerol to the hydrophobic domain and flipping across the membrane. The reader is left with the finding of a distinct architecture with no further explanation or hypothesis.

      We thank the reviewer for his/her comments. It is true that crystal structures of the soluble domain of MprF (from three species) and cryoEM structures (from two Rhizobium species) are now available. However, the cryoEM maps that we have obtained have several salient features, including the distinct dimeric interface and the position of the C-terminal helix of the soluble domain. This in particular is important. In a previous study, Hebecker et al 2011 reported that the terminal helix of PaMprF is important for activity and that the construct without the TM domain can also modify lipids. The full-length cryoEM map of PaMprF in GDN now provides an idea of how this occurs, with the terminal helix buried at the interface. Further, the proposed tRNA binding sites (from Hebecker et al 2015, lysine amide bound structure) face each other in the dimeric architecture of R. tropici, and it is not clear how full-length tRNA would bind without disrupting the dimer. In contrast, in the dimer architecture observed for PaMprF the tRNA binding sites face away from each other, and tRNA can bind to the enzyme without any constraints. We think the mobile/dynamic elements (or secondary structure) of the synthase domain play a major role in the interaction with substrates and in the mechanism. The current structures provide some evidence for this and form the basis of future studies. Instead of a cartoon description, we have now included a conservation plot of the molecule, along with the surface representation, in figure 6 to explain the possible mechanism.

      Differences to R.tropici MprF and other studies are difficult to follow as only a topological map of the Pseudomonas MprF is provided and conserved amino acids that have been shown to be crucial in mediating synthesis and flipping are not highlighted in the text or in the figures, specifically addressed, or discussed. Conserved amino acids in the presented cryoEM structure could provide important mechanistic insights and could address substrate specificity/requirements for aminoacyl phospholipid synthesis, transfer to the hydrophobic domain and flipping.

      The conservation of residues across MprF homologues has been presented in previously published articles, and hence we had not initially included it in the manuscript. We have now included a multiple sequence alignment of select MprF homologues highlighting conserved residues (Appendix Fig. S6), as well as a figure (Fig. 6F) colouring the molecule by conservation scores from CONSURF. In the zoomed-in version of figure 6F, we highlight many of the conserved residues in the synthase domain, as they play a role in substrate selectivity.

      Authors characterize an alanyl-phosphatidylglycerol producing MprF but do not detect the lipid in the cryoEM structure. Thus, the potential path taken by alanyl-phosphatidylglycerol remains unclear. Authors model the detected lipids as phosphatidylglycerol, which may be an interesting finding as it would indicate that MprF is generally capable of flipping phospholipids (this is however not discussed). While it is plausible that MprF flippases may be able to flip phosphatidylglycerol it could have a different path and structural requirements. It is also difficult to follow what the suggested pathway of flipping is in the Pseudomonas-MprF flippase (compared to R.tropici). Authors could provide a similar overview figure as in Song et al. and indicate what the potential differences are.

      We modelled the lipid as phosphatidylglycerol because the current density does not allow ala-PG to be modelled unambiguously, although it is found in the same position as the lys-PG in the R. tropici maps. The recent in vitro assay by Hankins et al 2025 shows that PaMprF is able to flip a wide range of lipids, and we would also like to point out that PG from the outer leaflet can be flipped, its headgroup modified at the inner leaflet, and the modified lipid flipped back. As shown by Song et al 2021 and Hebecker et al 2011 (by mutagenesis and domain swapping), the specificity for the substrates lies in the synthase domain. We do not think there will be any difference between the lys-PG and ala-PG paths, but in our opinion the positional relation between the soluble and membrane domains is the most important aspect and, along with the dimeric architecture, has remained the focus of the manuscript. Figure 6 in the manuscript illustrates this and provides a summary of the structural observations from the presented structures.

      Minor concerns

      • Page 13: the following sentence should be rephrased: "Among the missing links in the current cryoEM maps is the lack of well-ordered density for lipid molecules on the inner leaflet closer to the re-entrant helices but it is reasonable to assume from the cluster of positive charge that there will be lipid molecules and are dynamic. "

      This has been rephrased.

      • Page 4: Klein et al do not show that the Pseudomonas aeruginosa MprF mediates flipping

      Corrected to reflect only the modification of lipid and not flipping.

      Reviewer #3 (Significance (Required)):

      General assessment: see review

      Advance: Minor

      Audience: Specialized

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Jha and Vinothkumar characterize the cryoEM structure of the alanyl-phosphatidylglycerol producing multiple peptide resistance factor (MprF) of Pseudomonas aeruginosa. MprF proteins mediate the transfer of amino acids from aminoacyl-tRNAs to negatively charged phospholipids resulting in reduced membrane interactions with cationic antimicrobial peptides (produced by the host and competing microorganisms). The phospholipid modifications involve in most cases the transfer of lysine or alanine to phosphatidylglycerol. MprF proteins are membrane proteins consisting of a soluble and hydrophobic domain. Multiple functional studies have shown that the soluble domain of MprF mediates the aminoacylation of phosphatidylglycerol, while the hydrophobic domain mediates the "flipping" of aminoacylated phospholipids across the membrane, a process that is crucial to repulse or prevent the interaction of antimicrobial peptides encountered at the outer leaflet of bacterial membranes. Aside from its role in conferring antimicrobial peptide resistance, other roles of MprF have been described including more physiological roles such as improving growth under acidic conditions. Interestingly, MprF proteins are also found in Gram-negative bacteria which are already protected by an additional membrane that includes LPS. However, in Pseudomonas aeruginosa, MprF confers phenotypes that are similar to those observed in Gram-positive bacteria. Importantly, crystal structures of the soluble domain have led to important insights into aminoacyl phospholipid synthesis and recent studies on the cryoEM structure of Rhizobium tropici have confirmed functional and preliminary structural studies with other MprF proteins. The cryoEM structure from R. tropici confirmed the dimeric structure of MprF and supported a role of the hydrophobic domain in flipping lysyl-phosphatidylglycerol across the membrane. 

      A comparison of the structures of lysyl-phosphatidylglycerol with alanyl-phosphatidylglycerol producing MprFs could reveal new insights into the mechanism of transferring aminoacyl-phospholipids from the soluble domain to the hydrophobic domain and translocation of alanyl- vs lysyl-phosphatidylglycerol across the membrane.

      Major concerns:

      1. The study by Jha and Vinothkumar provides the cryoEM structure of an alanyl-phosphatidylglycerol producing MprF protein which is in principle an important milestone in gaining a better understanding of the mechanism of aminoacyl-phospholipid synthesis and flipping, including the potentially different requirements of accommodating different aminoacyl -tRNAs and aminoacyl-phospholipid species. However, this is not addressed. The authors present a "distinct architecture" compared to the structure of R. tropici- MprF, without providing functional insights and the focus of the study shifts to the role of detergents in determining MprF structures via cryoEM. Thus, after fundamental discoveries have been made with crystal structures of the soluble domain and cryoEM structure of R. tropici, this study -while valuable as a resource- seems to offer only an incremental advance in understanding the mode of action of MprF and the potential different requirements for transferring alanyl-phosphatidylglycerol to the hydrophobic domain and flipping across the membrane. The reader is left with the finding of a distinct architecture with no further explanation or hypothesis.

      2. Differences to R.tropici MprF and other studies are difficult to follow as only a topological map of the Pseudomonas MprF is provided and conserved amino acids that have been shown to be crucial in mediating synthesis and flipping are not highlighted in the text or in the figures, specifically addressed, or discussed. Conserved amino acids in the presented cryoEM structure could provide important mechanistic insights and could address substrate specificity/requirements for aminoacyl phospholipid synthesis, transfer to the hydrophobic domain and flipping.

      3. Authors characterize an alanyl-phosphatidylglycerol producing MprF but do not detect the lipid in the cryoEM structure. Thus, the potential path taken by alanyl-phosphatidylglycerol remains unclear. Authors model the detected lipids as phosphatidylglycerol, which may be an interesting finding as it would indicate that MprF is generally capable of flipping phospholipids (this is however not discussed). While it is plausible that MprF flippases may be able to flip phosphatidylglycerol it could have a different path and structural requirements. It is also difficult to follow what the suggested pathway of flipping is in the Pseudomonas-MprF flippase (compared to R.tropici). Authors could provide a similar overview figure as in Song et al. and indicate what the potential differences are.

      Minor concerns:

      1. Page 13: the following sentence should be rephrased: "Among the missing links in the current cryoEM maps is the lack of well-ordered density for lipid molecules on the inner leaflet closer to the re-entrant helices but it is reasonable to assume from the cluster of positive charge that there will be lipid molecules and are dynamic. "

      2. Page 4: Klein et al do not show that the Pseudomonas aeruginosa MprF mediates flipping

      Significance

      General assessment:

      The study by Jha and Vinothkumar provides the cryoEM structure of an alanyl-phosphatidylglycerol producing MprF protein which is in principle an important milestone in gaining a better understanding of the mechanism of aminoacyl-phospholipid synthesis and flipping, including the potentially different requirements of accommodating different aminoacyl -tRNAs and aminoacyl-phospholipid species. However, this is not addressed. The authors present a "distinct architecture" compared to the structure of R. tropici- MprF, without providing functional insights and the focus of the study shifts to the role of detergents in determining MprF structures via cryoEM. Thus, after fundamental discoveries have been made with crystal structures of the soluble domain and cryoEM structure of R. tropici, this study -while valuable as a resource- seems to offer only an incremental advance in understanding the mode of action of MprF and the potential different requirements for transferring alanyl-phosphatidylglycerol to the hydrophobic domain and flipping across the membrane

      Advance: Minor

      Audience: Specialized

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Shaileshanand J. et al., reported the structures of Multiple Peptide Resistance Factor, MprF, which is a bi-functional enzyme in bacteria responsible for aminoacylation of lipid head groups. The authors purified MprF from Pseudomonas aeruginosa in GDN micelles and nanodiscs, and by applying cryo-EM single particle method, they successfully reached near-atomic resolution, and built corresponding atomic models. By applying structural analysis as well as biochemistry methods, the authors demonstrated dimeric formation of MprF, exhibited the dynamic nature of the catalytic domain of this enzyme, and proposed a possible model on tRNA binding and aminoacylation.

      Major comments:

      1. In abstract, the authors stated 'Several lipid-like densities are observed in the cryoEM maps, which might indicate the path taken by the lipids and the coupling function of the two functional domains. Thus, the structure of a well characterised PaMprF lays a platform for understanding the mechanism of amino acid transfer to a lipid head group and subsequent flipping across the leaflet that changes the property of the membrane.' Firstly, those lipid-like densities were demonstrated in Fig 3A, since densities of lipids of purified membrane proteins often exist within regions of relatively low local resolution, or low quality, I think more detailed description on how the authors defined which part of the density belongs to lipid and how they acquired the modeling of some of the lipids is required. And the authors modeled phosphatidylglycerol into the GDN MprF, I would require additional experiment, for instance, mass spectrometry over the purified sample, to demonstrate the existence of this specific lipid with the sample. Secondly, regarding the last sentence in the abstract, how these structures lay a platform for further understanding was poorly discussed in both result section and discussion section, since the authors clearly stated 'This cavity perhaps provides a path for holding lipids...', then the statement in the next sentence 'Taken together... the vicinity to the cavities described above indicates the possible path taken by the lipids to enter and exit the enzyme' does not have a reliable evidence to support this conclusion, I would suggest the authors move these statements into discussion section, and elaborate more over this issue since it is an important part in the abstract, or make a more solid proof using other approaches, such as molecular dynamics simulation, to make these statements solid in the result section.

      2. Fig 2B, it seems the H566 sidechains were overlapping in the zoom-in figure of distance measurement between H566 residues, to clarify this, authors should either present another figure with rotation, to better demonstrate their relative locations, or swap this zoom-in figure with another figure with rotations. Also, could the authors briefly commenting on why they chose H566 for distance measurement specifically?

      3. Related to previous comment, I see one additional green square in Fig. 2A and an additional green square in Fig. 2B, without any zoom-in images provided on these regions. Besides, they're focusing on two different domains with same color, any particular reason why they're there? If so, please provide the information in figure legends.

      4. Related to previous comment, authors should also provide distance measurement over electrostatic interaction sites in Fig. 2A, since distance plays as an important factor in these forces.

      5. For Fig. 2C, since in Fig. 1, the authors have already indicated the differences between reconstruction of the GDN and nanodisc datasets, this information provided here seems to be a bit abundant, I suggest either move this panel to Fig. 1, to make a visualization on both electron densities as well as atomic models, or move this panel to supplementary figures.

      6. Fig. 3B, some of the spheres of the lipids were also marked as red, any particular reason why they're red? Do they indicate they're phosphate heads? If so, could the authors provide evidences how they define these orientations of the lipid heads? If not, any particular reason why they're red?

      7. Fig. 3C, the fitted model of lipid and its corresponding density should be added to Fig. S4, to give more detailed view on the quality of the fitting.

      8. Fig. 4D and 4E, could the authors also indicate the RMSD values when comparing the differences of RtMprF, PaMprF, ReMprF, this information would be helpful to understand how big of a difference within these three models.

      9. Fig. 6E, the coloring used for CCA-Ala were similar to the blue part of soluble domain, could the authors change the coloring a bit? Also, for Fig. 6F, I would suggest the authors provide a prediction model, such as using AlphaFold3, of this tRNA interaction site, to further validate this proposed model.

      10. In Supplementary Figures S1 and S3, the angular distribution of maps exhibited preferred orientation to certain extent, 3D FSC estimation should also be supplied for these maps, as an indication of whether the reconstructed densities were affected or not.

      11. For Fig S3B, could the authors switch to another image with better contrast?

      Minor comments:

      1. Fig. 2E and 2F, distance measurement should also be supplied to these two panels.

      2. Fig. 5D, since in Fig. 4F and 4G already mentioned the skeleton of GDN, this modeling part should be presented before exhibit it in dimer interface, the authors should rearrange the sequence over these three panels.

      3. In Supplementary Figure S3, which density was shown for the PaMprF local resolution estimation result? Authors should provide this information as two maps were shown in this figure.

      CROSS-REFEREE COMMENTS

      Both Reviewer #1 and #3 made comments over technical issue, their evaluation over functional aspects of this protein is what I was lacking over my comments, also, their evaluation of the biological narrative, relevance toward previous research is also more insightful. Finally, they offer valuable suggestions on how to adjust the article to make it more readable, and better describing the biological story which I would suggest the authors to pay attention to.

      Significance

      The authors mainly focused on the structure of MprF in Pseudomonas aeruginosa, this protein is essential for the resistance to cationic antimicrobial peptides. A combination of structural and biochemical analysis provided evidences to the dimeric formation to this enzyme, and the analysis over differences of purified proteins using GDN and nanodisc was particular interesting, which provide new insight regarding the flexible nature of this enzyme, and potentially could be beneficial to the membrane protein community, as it demonstrates the differences in detergent/nanodisc of choice could affect the assembly of the protein of interest. Still, some of the statements in the manuscript, for instance, the assignment of lipids was over-claimed and could be benefited from additional approaches to support the issue. I would suggest some refinement in the discussion section as well as some of the figures.

      My expertise: cryo-EM single particle analysis; cryo-ET; sub-tomo averaging; cryo-FIB;

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      MprF proteins exist in many bacteria to synthesize aminoacyl phospholipids that have diverse biological functions, e.g. in the defense against small cationic peptides. They integrate two functions, the aminoacylation of lipids, i.e. the transfer of Lys, Arg or Ala from tRNAs to the head group, and the flipping of these modified lipids to the membrane outer leaflet. The authors present structures of MprF from Pseudomonas aeruginosa and describe these structures in great detail. As MprF enzymes confer antibiotic resistance and are therefore highly important, studying them is significant and interesting. Consequently, their structures have been substantially characterized in recent years, including the publication of the dimeric full-length MprF from Rhizobium (Song et al., 2021).

      • While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule description of the structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      • Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, neither of which has been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      • Hence, while I think that the data presented here would be worth publishing, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

      Major points:

      • The authors always jump between their structures in detergent and nanodisc during all the descriptions, which makes following the story even more difficult. Please first describe one of the structures and then (briefly) discuss relevant similarities and differences afterwards.

      • The difference in dimerization between Pseudomonas and Rhizobium is the most interesting and surprising feature (if true) of the new structures. However, it is not really presented as such. The authors should put more emphasis on making clear that this is a complete rotation of the monomers with respect to each other (by how many degrees?) and they should visualize it even more clearly in Figure 4 (and label the figure so that it is possible to understand it without having to read the text or the legend first).

      • P. 10: The authors insinuate that only one of the dimer interfaces, either Pseudomonas or Rhizobium could be real, but disregard the possibility that both might be the biologically relevant interfaces of the respective species and that there might have been a switch of interfaces during evolution. They should also mention and discuss this possibility.

      • Fig. 5G: The authors claim that the higher molecular weight band that appears in the mutant is a "dimer with aberrant migration" of >250 kDa as opposed to the expected 150 kDa. They should explain how they came to this conclusion and how they can be sure that the band does not correspond to a higher oligomer (trimer or tetramer). They could show, using an extraction and purification scheme similar to that for the wildtype (first LMNG and then GDN), followed by at least a preliminary EM analysis, that the crosslinked mutant MprF is indeed a dimer, or use other biophysical methods to do the same; otherwise this experiment does not show much. Furthermore, they should also include in their crosslinking experiments a cysteine mutant in the part of Pseudomonas MprF that would be involved in a Rhizobium-like interface, to check whether they could also stabilize dimers in this case.

      • As the question of whether the observed interface is real or an artefact is very central to the value of the structural data and the conclusions drawn from them, the authors should make more effort to analyze and try to validate the interface. First, an analysis of interface properties (buried surface area, nature of the interactions, conservation) should be performed for the interface as observed in the Pseudomonas structure but also for a (hypothetical) Rhizobium-like interface of two Pseudomonas monomers (such a model of a dimer should be easily obtainable by AlphaFold using the available Rhizobium structures as models). Then, experimental methods such as FRET or crosslinking-MS would allow more solid conclusions to be drawn on the distances between potential interface residues. While these experiments are a certain effort, the question of whether the dimer interface is real is so central to the paper that it would be worthwhile to make this effort.

      • As it seems that detergents might disrupt or modify the dimer interface, it might be an alternative to solubilize the protein in a more native environment by polymer-stabilized nanodiscs using DIBMA or similar molecules.

      • Since parts of the Discussion are mostly repetitions of the Results part and other parts of the Discussion also contain a large extent of structure analysis that one would usually expect in the Results part rather than the Discussion, the authors should consider condensing both into a combined (and overall much shorter) Results & Discussion section.

      Minor points:

      • Explain abbreviations the first time they appear in the text, e.g. TTH

      • Figure labels are very minimalistic. This should be improved, e.g. by putting labels to important structural features that appear in the text, otherwise the figures are not an adequate support for the text.

      • Figure 5: Label where the different oligomers run on the gels

      Significance

      While the structural work appears to be solid and carried out well on the technical part, one big criticism is how the data are presented in the manuscript, how they are analyzed and how they are put into relation to previous work. As structures of MprF from Rhizobium have been published, it is not required and rather distracting to explain the methodological details and the structure of Pseudomonas MprF in such great detail. Instead, the manuscript would benefit very strongly from reaching the interesting and novel parts, the comparison with the previous structures, as early as possible. Overall, the manuscript should be substantially shortened to not divert the reader's attention away from the novel parts by drowning them in minuscule descriptions of structural features such as secondary structure elements or lipid molecule positions where it remains completely unclear what their relevance is to the story and the message of the paper. Finally, during this revision, care should be taken to improve the language and maybe involve a native speaker in doing so.

      • Even more importantly, since the authors observe a dimer interface which strongly deviates from the previously presented arrangement of another species, the most important thing would be to properly characterize this interface and experimentally validate it, neither of which has been done sufficiently. When also taking into account that there were significant differences in the arrangement of the dimer between their structures in GDN and nanodisc, and that in the GDN structure, the cholesterol backbone of GDN appears to be involved in the interface (there should not be any cholesterol in native bacterial membranes!), there is a realistic chance that the observed dimer is an artefact. If the authors cannot convincingly rule out this possibility, all their conclusions are meaningless.

      • Hence, while I think that the data presented here would be worth publishing, a major drawback is that the authors do not sufficiently analyse, characterise and validate the dimer interface and fail to show that the dimer is biologically relevant.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03130

      Corresponding author(s): Ellie S. Heckscher

      1. General Statements [optional]

      We thank all three reviewers for their feedback on the paper. Reviewers stated that the paper was of broad interest to developmental biologists and neurobiologists. However, we want to ensure that our two key conceptual contributions are clear. We clarify in the following paragraph and include a revised abstract. We will update the introduction and paper to better reflect these advances. We also attach a supplemental table 1, which was inadvertently omitted from the previous submission due to our error.

      The first advance is that serially homologous neuroblasts follow a multimodal production model: In principle, stem cells can divide any number of times, from once to throughout the entire lifetime of the animal. And, on each division, a stem cell can generate either a proliferative daughter cell or a post-mitotic neuron. Together, therefore, there is a vast potential number of neurons any given stem cell could produce. From the literature on the vertebrate neocortex, we had the following models: (1) "random production" model, in which any number of neurons could be made by a stem cell; or (2) "unitary production" model, in which the same number of neurons (~eight) is produced by a stem cell regardless of context. Our data revealed an entirely new "multimodal production" model, which could not have been predicted by prior literature. In the context of serially homologous neuroblasts arrayed along the Drosophila larval body axis, sets of five to seven neurons are produced in increments of one, two, or four. These increments correspond to units called temporal cohorts. Temporal cohorts are lineage fragments, or small sets of neurons that share synaptic partners, making them lineage-based units of circuit assembly. Thus, in a multimodal production model, serially homologous stem cells produce different numbers of temporal cohorts depending on location. Our data advance the field by showing that stem cells produce circuit-relevant sets of neurons by adding or omitting temporal cohorts from a region, to meet regional needs.

      Key to understanding the second advance is that there are multiple types of temporal cohorts: early-born Notch OFF, early-born Notch ON, late-born Notch OFF, and late-born Notch ON. One temporal cohort type, the early-born Notch OFF, is found in every segment, which we term the "ubiquitous" temporal cohort. The other temporal cohort types can be produced in various combinations depending on the stem cell division pattern and segmental location. In a result that could not have been predicted, we found that the ubiquitous temporal cohorts are refined both in terms of the number of neurons and their connectivity, depending on body region. In contrast, when other temporal cohort types are produced, they are not refined to the same degree.

      The impact of this work is to advance how we think about stem cell-based circuit assembly.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *Summary: The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. *

      Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

      • Thank you! In addition to the contributions highlighted by the reviewer, we also showed that all segments have ELs with early-born molecular identities, but only a subset have ELs with late-born identities (Figure 5). And we showed that early-born temporal cohorts can be mapped into different circuits depending on the axial region (Figure 6).

      *Major comments: The authors performed careful analyses of the NB3-3 lineage using EL neurons. My main concerns are limited applicability of their findings and lack of mechanisms as how NB3-3 generate various numbers of EL neurons. Their findings are exclusively relevant to the NB3-3 lineage despite their effort in highlighting that other NB lineages also generate temporal cohorts of EL neurons. *

        Thank you for raising these points. First, to clarify, as Reviewer 4 also mentioned, NB3-3 is the only lineage to produce EL neurons. We will ensure that this is clearly stated in the revised text.
      

      We agree that our findings might not apply beyond the NB3-3 lineage. However, as this is the first study of its kind, it is impossible to know a priori to what extent the concepts surfaced here are generalizable. In our opinion, this speaks to the novelty and impact of the study. A contribution is to motivate a need for future studies. We will make this explicit in our updated manuscript in the Discussion section.

        Our manuscript provides cell biological mechanisms that explain how stem cells give rise to different numbers of EL neurons in different regions, including stem cell division duration and type, neural cell death, identity gene expression, and differentiation state. If the reviewer is interested in genetic or molecular mechanisms, this is an interesting point. Several prior studies using NB3-3 as a model (e.g., Tsuji et al., 2008, Birkholz et al., 2013, Baumgardt et al., 2014) have elucidated the genetic regulation of specific cell biological processes. However, these studies provided fragmentary insight with regard to serially homologous stem cell development along the body axis. A comprehensive understanding of how the NB3-3 lineage, or any other serially homologous lineage, develops was missing. This is what makes our study both novel and needed. Without an analysis that both examines every segment and assays multiple cell biological processes, we would have missed key insights: that there is a ubiquitous type of temporal cohort, and that neurons within the ubiquitous temporal cohort are selectively refined post-mitotically (See General Statements for more details).
      

      *I disagree with their conclusion that failure to express Eve is a mechanism for controlling EL neuron numbers when Eve serves as the marker for these neurons. Is there any other strategy to assess the fates and functions of these cells besides relying solely on Eve expression? I am not familiar with the significance of Eve expression for the functions of these neurons. Is it possible to perform clonal analyses of NB3-3 mutant for Eve and see if these neurons adopt different functionalities/identities?*

      • We agree that if Eve were only a marker, our logic would be circular. The Eve homolog, Evx1/2, is crucial for vertebrate interneuron cell fate (Moran-Rivard et al., 2001). Eve is essential for motor neuron morphology in Drosophila (Fujioka et al., 2003). Eve is critical for both the morphology and function of Even-skipped interneurons (Marshall et al., 2022). Hence, ELs cannot fully differentiate or incorporate into circuits without Eve. Thus, we consider the failure to express Eve a mechanism for controlling EL number. Furthermore, our prior study (Wang et al., 2022) showed that NB3-3 Notch OFF neurons in A1 that fail to express Eve have small soma and "stick-like" neurite projections that are typical of undifferentiated neurons. We will be sure to add this context to the revised manuscript.

      *If NB3-3 in the SEZ continually generate GMCs based on the interpretation of clonal analyses and depicted in Fig. 2A, why is the percent of clones that are 1:0 virtually at or near 100% from division 6-11 shown in 2G? *

      Admittedly, the ts-MARCM heat-shock-based lineage tracing experiments are inherently messy. This is part of the reason why we included the G-TRACE lineage tracing experiments in Figure 3. In Figure 3E, one can see that the number of Notch ON/A neurons in SEZ3 is equal to the number of ELs in that segment (Figure 1E). This is a second independent method that supports the assertion that in SEZ, NB3-3 stem cells continually generate GMCs. Given this independent observation, we believe that this question is most likely explained by technical issues inherent in ts-MARCM. These issues include but are not limited to: cell-type specific accessibility/success of heat-shock induced recombination; variably effective RNAi; and idiosyncrasies of the EL-GAL4 line used to detect recombination events. If the question is why the data are reported only for divisions 6-11, the answer is that the ts-MARCM dataset that included SEZ clones used only later heat-shock time points (line from the paper: "for the SEZ-containing dataset, inductions started at NB3-3's 5th division"). Along with this revision plan, we will include Supplemental Table 1, which was inadvertently omitted from the previous submission due to our error. This table shows all of the clonal data. We will include a section in the discussion to describe limitations in ts-MARCM.

      The authors also indicate that NB3-3 in the abdomen directly generate Notch OFF/B cells that assume EL neuronal identity. In this scenario, shouldn't the percent of 1:0 clones be 100% in later divisions in Fig. 2G? Based on the number of clones in abdomen shown in Fig. 2E, I cannot seem to understand how the authors come to the percent of 1:0 clones shown in Fig. 2G

        We agree that one might expect the 12th division to be 100% 1:0 clones in the abdomen. Unfortunately, we didn't sample that late in our dataset, and even when we sampled the inferred 11th division, we had a small sample size (Figure 2E). Other studies suggest that NB3-3 in the abdomen directly generates Notch OFF/B neurons (Baumgardt et al., 2014), which served as our starting point. We will revise the text to make this clearer. As you can see from Figure 3E, there is only one NB3-3 Notch ON/A neuron produced in each abdominal segment in comparison to the number of NB3-3 Notch OFF/B/EL neurons (Figure 1E). According to two independent assessments, Figure 3 and Baumgardt et al., 2014, the data support the conclusion that NB3-3 in the abdomen directly generates Notch OFF/B cells that assume EL identity for all but one of their divisions. Again, we believe technical issues make the ts-MARCM dataset messy. We will include a section in the discussion to describe limitations in ts-MARCM.
      

      *There are many potentially interesting questions related to this study that can significantly broaden its impact. For example, do other NB lineages that also generate distinct temporal cohorts of EL neurons display similar proliferation patterns (type 1 division in SEZ, early termination of cell division in thoracic segments and type 0 division in abdomen)?*

      • NB3-3 is the only lineage that makes ELs. Many lineages switch proliferation fates along the body axis. Previous studies have described how this switch in division patterns produces the wedge-shaped CNS (Cobeta et al., 2017). In the revision, we will be sure to clarify both points.

      *Why does NB3-3 in the thoracic segments become quiescent so much sooner than in the SEZ and abdominal segments?*

      • NB3-3 in the thorax enters quiescence due to Hox genes and temporal transcription factors (Tsuji et al., 2008). In the revision, we will be sure to clarify this point.

      The authors' observations suggest that NB3-3 in SEZ and abdomen generate a similar number of EL neurons despite the difference in their division patterns (type 1 vs type 0). Are the mechanisms that promote EL neuron generation by NB3-3 in SEZ and abdomen the same? Is anything else known besides Notch OFF?

      • We agree this is an interesting point. Previous work has detailed NB3-3 division patterns, showing Type 1 divisions in the thorax, and a Type 1 to Type 0 switch in the abdomen (Baumgardt et al., 2014). However, the proliferation pattern of NB3-3 in the SEZ had not been addressed until our study. Figures 2 and 3 suggest the following: (1) NB3-3 in the SEZ proliferates for the duration of embryonic neurogenesis; (2) it produces a GMC on each division; (3) the GMC divides to produce one EL Notch OFF neuron and one Notch ON neuron. In our revision, we will manipulate the Notch pathway using two mutants, sanpodo, which produces two Notch OFF cells, and numb, which produces two Notch ON cells (Skeath et al., 1998), to specifically test how ELs in the SEZ are regulated by Notch signaling. The other difference we know of between the SEZ and abdomen is Hox gene expression. In Figure S2, we show that a subset of ELs in the SEZ express the anterior Hox gene Sex combs reduced (Scr). The role of Hox genes in this lineage is an interesting question, as addressed in the discussion. This is an important future direction that merits in-depth study and is beyond the scope of what this study is trying to accomplish.

      Minor comments: The authors' writing style is highly unusual, especially in the result section. There is an overwhelmingly large amount of background information in the result section but very thin description of their observations. The background information portion also includes previously published observations. Since the nature of this study is not hypothesis-driven, it is very confusing to read in many places and difficult to distinguish their original observations from previously published results and making. One easily achievable improvement is to insert relevant figure numbers into the text more often.

      Thank you for this comment. It is invaluable. In the revision, we will expand the background into a more comprehensive introduction and present the results more clearly. We will certainly insert relevant figure numbers. In responding to the reviewer's comments above, we can see where our writing lacked clarity and will improve these areas. Thank you again.

      Reviewer #1 (Significance (Required)):

      The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

      Because this text is the same as the summary, please see our response to that section.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Vasudevan et al provide a detailed characterisation of the different numbers and temporal birthdates of Even-skipped Lateral (EL) neurons produced in different segments from the same neuroblast, NB3-3. The work highlights that the differences in EL neuronal generation across segments are achieved through a combination of different division patterns, failure to upregulate the EL marker Eve and segment-specific programmed cell death. For neurons born within the same window and segment, the authors describe additional heterogeneity in their circuit formation. The work underscores the large diversity that the same neuroblast can generate across segments.

      Thank you!

      Major comments:

      - Based on the ts-MARCM 1:0 clones representing 100% of the SEZ clones at any given inferred cell division, the authors conclude "NB3-3 neuroblasts generate proliferative daughter GMCs in the SEZ and thorax on most divisions". Figure 2G does not have any data for SEZ before inferred division 5, whereas there is data in other regions. The authors also state "In the SEZ and abdomen, ELs were labelled regardless of induction time" in reference to Fig 2F, which seems inaccurate given there are no SEZ clones before inferred division 5. There is no comment on this fact, which is surprising given their focus on temporal cohorts. The authors should explain this discrepancy, if known, or modify their statements to reflect the data.

      • Thank you for raising this point. The reason is that we produced two ts-MARCM datasets. One had SEZ clones, the other did not. The dataset with SEZ clones used heat shock protocols only for later time points, because those were most informative. The text from the paper is "We combined a published ts-MARCM (Wang et al., 2022) dataset with a new one (Table S1). The differences between the datasets are (1) CNSs were imaged either at low resolution for all regions (SEZ to terminus) or higher resolution for nerve cords (thorax to terminus); (2) for the SEZ-containing dataset, inductions started at NB3-3's 5th division. The combined data includes ~12 different heat shock protocols, 80 CNS, and 234 clones (Table S2)". In response to this comment, however, we will further clarify this point. In addition, we are submitting Supplemental Table 1, which contains all the clonal data. As you can see, experiments a-h lack SEZ data and experiments i-k contain SEZ data.

      - The temporal cohort (early-born vs late-born) identity is exclusively examined based on markers. Given the absence of SEZ clones from early NB3-3 divisions, a time course showing that the SEZ generates early-born ELs or some other complementary method would be desirable.

      Thank you for raising this point. We show early-born versus late-born identity using markers in Figure 5. We conducted the time-course experiment as suggested and can confirm that there are early-born ELs in the SEZ at stage 13. We will include a new Supplemental Figure that includes a time course of EL number at stages 11, 13, 15, and 17 for segments SEZ3 to Te2 in the revision. See figure below.

      - The authors repeatedly refer to their work as showing how a stem cell type can have "flexibility". Flexibility would imply that NB3-3 from one segment could adopt a different behaviour (different division pattern, or cell death or connectivity) if it were placed in a different segment. This is not what is being shown. In my opinion, "heterogeneity" of the same neuroblast across different segments would be more appropriate.

      • Thank you for this comment. We will change the wording to heterogeneity in the revision.

      Minor comments:

      - Figure 2A depicts a combination of known data and conclusions from their own (mainly SEZ). The authors might consider editing the figure to highlight what is new. A possibility would be for figure A to be a diagram of the experimental design and their summary division pattern to be shown after the new data instead of being panel A.

      Thank you for this suggestion. We will make the suggested change.

      - The authors state that they combined published ts-MARCM with their new one, which differed in a number of ways that they list, but they don't specify which limitations are associated with the published vs new dataset. Could the authors please clarify?

        We now include Supplemental Table 1, which shows the complete combined datasets. In the first dataset, experiments a-h, the CNS was imaged at high resolution, but in a smaller region. The limitation is that the SEZ is missing. In the second dataset, i-k, inductions started at NB3-3's 5th division. The limitation is that we fail to sample early time points. This was a strategic decision. There were two possible scenarios: (1) in the SEZ, NB3-3 divided early, made GMCs, but both daughters expressed Eve; (2) in the SEZ, NB3-3 divided for the entirety of the embryonic neurogenesis, making GMCs, with only the Notch OFF daughters expressing Eve. Our data support (2). Only late heat shocks were needed to distinguish between these possibilities. As these experiments are labor-intensive, we focused our efforts on the later time points. We will make this clearer in our revised text.
      

      - The title refers exclusively to "temporal cohorts", which in the manuscript are defined quite narrowly and do not seem to apply to all segments.

      • Thank you! This, in our opinion, is a central rather than a minor point, because the impact of this study involves temporal cohort biology. We outlined the essential concepts in the Part 1 "General Statements" section of this revision plan. We did not mean to use "temporal cohort" in a limited sense, and we can see how the writing of our results section led to this comment. We will revise to make this clear.

      - Several cited references are missing from the Reference list at the end. Could the authors please double check this? (e.g. Matsushita, 1997; Sweeney et al., 2018)

      • Thank you, we will remedy this!

      - Legend for figure 2 is a bit confusing, there is a "(A)" within the legend for (D), which indicates that segments A1-A7 are shown (this seems inaccurate, as it only goes to A6).

      Thank you, we will remedy this!

      Reviewer #3 (Significance (Required)):

      This study provides a comprehensive analysis of different cell biological scenarios for a neuroblast to generate distinct progeny across repeating axial units. The strength is the detailed and systematic approach across segments and possible scenarios: different division patterns, cell death, molecular marker expression. While it focuses on one specific neuroblast of the ventral nerve cord of Drosophila, the authors have done extensive work to place their findings and interpretation in the context of other cell types and across model organisms both in the introduction and discussion. This makes the work of interest for developmental biologists in general, neurodevelopment research in particular and those interested in circuit assembly, beyond their specialised community. This point of view comes from someone working in vertebrate CNS development.

      Thank you!

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary

      This manuscript addresses the question of how the number of neurons produced by each progenitor in the nervous system is determined. To address this question the authors use the Drosophila embryo model. They focus on a single type of neural stem cell (neuroblast), with homologues in each hemisegment along the anterior-posterior axis.

      Using a combination of clonal labelling, antibody stainings, and blockade of programmed cell death, they provide a detailed description of segment-specific differences in the proliferation patterns of these neuroblasts, as well as in the fate and survival of their neuronal progeny.

      Furthermore, by employing trans-synaptic labelling, they demonstrate that neurons derived from the same progenitor type receive distinct patterns of synaptic input depending on their segmental origin, in part due to their temporal window origin.

      Overall this work shows that different mechanisms contribute to the final number and identity of the neuronal progeny arising from a single progenitor, even within homologous progenitors along the anterior posterior body axis.

      Thank you!

      Major Comments

      I would suggest adding line numbers to the text for future submissions, this massively helps providing comments.

        Thank you for this comment. We will definitely add line numbers to the revised manuscript. We also thank you for providing comments despite this oversight on our part. We appreciate your time, and did not mean to make extra work.
      

      The authors propose that all neuroblasts produce the same type of temporal cohort (early born) and that, by changing the pattern of cell division, different temporal cohorts can be added. The way this is presented in the abstract sounds like an obvious thing, what would be the alternative scenario/s?

        Thank you for raising the point that the abstract should be updated. We have included a revised abstract. The things that are obvious are: (1) changing a neuroblast's division pattern will change the number of neurons produced, and (2) if you have late-born neurons, the stem cell must at some point, have made early-born neurons. However, within those bounds is an extremely large parameter space. Each stem cell can choose to divide or not, and it can also choose to produce a proliferative daughter or not. The stem cell must navigate these choices at every division. The field had two models for what a stem cell might do - a "random production" model and a "unitary production" model. Our data support a third "multimodal production" model, which could not have been predicted based on prior literature or data.
      

      We raised these points in the discussion as follows:

      "Under a null model, the durations and types of proliferation would vary stochastically across segments, resulting in a continuous and unstructured distribution of neuron numbers (Llorca et al., 2019). In a unitary production model, based on the vertebrate neocortex, there is a fixed neurogenic output of ~8-9 neurons per progenitor (Gao et al., 2014). However, our data support a third model, a multimodal production model. In a multimodal model, serially homologous neuroblasts generate different numbers of neurons depending on the segment."

      We will now update the text to address this concern.

      Here it's the late born neurons that lack in thoracic segments because of early NB quiescence, but it cannot be excluded that different neuroblast types adopt a different strategy.

      True. Neural development is complex. Other lineages could easily employ alternative strategies. Our study presents a new conceptual framework that should inspire future research.

      I found the ts-MARCM results confusing for 2 reasons:

      1- It's not clear to me why there are so many single cell clones in div 3 and 4 in abdominal segments. This is not compatible with the division model depicted for abdominal segments, unless GMCs are produced in those division window and the MARCM hits the GMC, as also mentioned in the legend for G. This aspect is important because, either the previous model by Baumgardt et al. - please correct cit. currently Gunnar et al. 2026 - is wrong, or something strange happens in this experiment, or the relative temporal order is incorrect.

      Thank you for raising this point. Having multiple single-cell (i.e., 1:0) clones in divisions 3 and 4 is not precisely what would be predicted by the model in Figure 2C. In part because heat-shock-based recombination methods in flies are stochastic and inherently "messy", we also conducted a second set of lineage tracing experiments, as shown in Figure 3, using G-TRACE. Figure 3E shows one Notch ON/A neuron in each abdominal segment, suggesting there is only one GMC present during lineage progression. However, the result in Figure 3E does not localize the GMC to any particular division. One possibility is that the GMC is generated once, but at a random point during lineage progression. This possibility is consistent with the idea that the relative temporal order is incorrect and suggests that Baumgardt is erroneous. However, the Baumgardt data are strong, so we do not favor this idea. A second possibility, which we favor, is that something strange happened in this experiment, namely heterogeneity in the EL driver. In ts-MARCM, recombination timing dictates the upper limit for the number of cells within a clone. However, recombination is detected by GAL4. So, if the GAL4 driver for some reason labels fewer cells than expected, one would see unusually small clones, as in the case in question. To detect ts-MARCM recombination in Figure 2, we used the EL-GAL4 driver. The EL-GAL4 driver is an enhancer fragment, ~400KB, meaning that it does not capture the full regulatory context of the eve locus. In our experience (e.g., Manning et al., 2012), drivers using small enhancers tend to give highly specific but somewhat variable expression, and this is the case for EL-GAL4. We will update the discussion to address the ts-MARCM dataset and its limitations. We will also correct the citation to Baumgardt et al., 2014, not Gunnar. Thank you!

      2- In segments other than abdomen, it is quite rare to hit proper clones, it appears that only GMCs are hit by recombination, with very few exceptions. Could the author please provide an explanation for this or at least mention this aspect?

      This is true. We cannot explain it. It could have something to do with the RNAi cassettes that are used in ts-MARCM, because the original paper mentions that RNAi can be differently regulated in GMCs versus neuroblasts (Yu et al., 2009). We will mention it in the revised discussion about ts-MARCM limitations.

      It is also unclear whether in F the graph includes all types of clones (including 1:0 clones). This is important, because the timing of division for NBs and GMCs is different, and inclusion of 1:0 might lead to a wrong estimate of the NB proliferation window (longer than it actually is because GMCs divide for longer). This is particularly important for the SEZ, where most clones in normalised division 10 and 11 are with ratio 1:0, thus compatible with both terminal division as well as GMC division.

      The graph in F does include all types of clones. We provide Supplemental Table 1, which shows the full dataset. Unfortunately, we do not have enough data to analyze only NB clones. We agree that the estimate of the NB proliferation window is coarse using this analysis method and could overrepresent the division time by one cell division. We will mention this in the discussion and make sure that our results text is free from any overreaching claims about the precision of these measurements.

      To obtain an estimate of the timing of division, the authors normalise clone size to the size of the bigger clone in the abdomen. What happened to those samples where no abdominal clones were hit? Were they simply excluded from the analysis?

        From the analysis in Figure 2, we excluded the clones that were SEZ, thorax, or terminus only. They were rare. They are shown in Supplemental Table 1, which will now be added in our revision plan.
      

      It is proposed that in the thorax late temporal cohort neurons are not produced, yet the ts-MARCM experiment detects some 1:0 clones. What is the fate of these cells? Are they all derived from GMC division and therefore decoupled from the temporal identity window? Or is this a re-activation of division?

      Figure 2F shows that at the inferred 11th NB3-3 division, 100% of thoracic clones are of the 1:0 type. This is an n=1 observation (Supplemental Table 1, row f-Jan20-2). When we look at the morphology of this thoracic EL, we can see that it is a fully differentiated neuron that crosses the midline and ascends to the CNS, which is similar to EL morphologies in A1, so we do not think it is a whole new cell type. We have no way of determining whether this neuron was derived from a GMC division. It is also possible that this is an infrequent event or a technical anomaly. To address the question of reactivation of the thoracic NB3-3 division, we plan to include a Supplemental Figure of EL number over developmental time (stages 11, 13, 15, 17) for segments SEZ3 to Te2. These are the same data that we mentioned to Reviewer 3. This will reveal the extent to which the thorax produces late-born ELs.

      "in A1, a majority of segments had one Notch OFF/B neuron that failed to label with Eve" does "the majority" in this sentence mean that there were cases where all B neurons were labelled with Eve? If yes, where would this stochasticity come from?

        Yes, "the majority" in this sentence means that there were cases where all B neurons were labeled by Eve. In Figure 3F, for segment A1, that number is four. In contrast, there are 6 cases where B neurons failed to label with Eve. We can only speculate about the origin of the stochasticity. It could be biological (e.g., low level of Eve expression) or technical (e.g., poor antibody penetration). We plan to mention this in the discussion.

      Additionally, there is no evidence that it's the first born NotchOFF neuron in A1 that does not express Eve. The authors should clarify where this speculation comes from.

      The evidence that the first-born Notch OFF neuron in A1 does not express Eve comes from our ts-MARCM data: "So far, our ts-MARCM analyses grouped segments into regions (Figure 2A-C), however, EL number varies on a segment-by-segment basis (Figure 1). Therefore, we looked for segment-by-segment differences in ts-MARCM data (Table S1). The only detectable difference was between A1 and the other abdominal segments: When both A1 and another abdominal segment were labeled in a single CNS, a majority had smaller A1 clones. These data suggest that the production of ELs by NB3-3 neuroblasts lags in A1 compared to A2-A7." We will add a representation of these data to the ts-MARCM figure. As we stated above, we will add a Supplemental Figure of EL number over developmental time (stages 11, 13, 15, 17) for segments SEZ3 to Te2, which could strengthen this point.

      When discussing trends shared with other phyla:

      A- "In the mammalian spinal cord, more neurons are present in regions that control limbs (Francius et al., 2013). Analogously, EL numbers do not smoothly taper from anterior to posterior; instead, the largest number of ELs is found in two non-adjacent regions, SEZ and the abdomen." It's unclear what is the link between the figure in the mammalian spinal cord and the Drosophila embryo. The embryo doesn't even have limbs and the number of neurons measured here refer only to a single lineage, while there could be (and in fact there are) lineage-to-lineage differences that could depict a different scenario.

      Thank you for this comment. We will rewrite this sentence, "in the mammalian spinal cord, more neurons are present in regions that control limbs (Francius et al., 2013)" to more accurately reflect the data in the Francius paper, and make the parallel more explicit. We will say "the size of columns of V3, V1, V2a, V2b, and V0v neurons differ at brachial compared to lumbar levels in the developing spinal cord." This removes the confusion about limbs and somewhat mitigates the concern about lineage-to-lineage differences, at least from the perspective of the spinal cord.

      B- The parallelism between V1 mouse neurons and EL Drosophila neurons is also unclear to me. The similarity in fold change across segments could be a pure coincidence and, from what I understand, the two cell types are not functionally linked.

        Thank you for this comment. We believe this is the sentence in question (sorry about no line numbers). "(3) In the mouse spinal cord, ~10-fold differences in molecular subtypes for V1 neurons (Sweeney et al., 2018). In *Drosophila*, NB3-3 neuroblasts show differences in EL number, depending on region, with similar fold changes, suggesting this trait is shared across phyla."  The emphasis was intended to be on the fold-changes, not cell types. Coincidence or not, it is parallel. We will update the sentence to say "(3) In the mouse spinal cord, ~10-fold differences in molecular subtypes for V1 neurons (Sweeney et al., 2018). Although V1 neurons are not direct homologs of EL neurons, the number also varies ~10-fold depending on the region. One possibility is that this trait is shared across phyla." And, we will remove the final part of the paragraph, which distracts from the point "Thus, for this study and future research, NB3-3 development now offers a uniquely tractable, detailed, and comprehensive model for studying how stem cells flexibly produce neurons."
      

      Minor comments:

      I found the manuscript somewhat difficult to follow, even though I am familiar with both the model and the topic. For non-specialist readers, I expect it will be even more challenging. The presentation of the results often feels fragmented, at times resembling a sequence of brief statements rather than a continuous narrative. I would encourage the authors to provide more synthesis and interpretation, for example by summarising key findings, rather than listing in detail the number of neurons labelled in each segment for every experiment. This would make the results more accessible and easier to digest.

      Thank you for this comment. We will provide more synthesis and interpretation in the results by summarizing key findings.

      From the way the MS is written it's not clear from the beginning that the work focuses exclusively on embryonic-born neurons. Since in Drosophila neuronal stem cells undergo two rounds of neurogenesis, one in the embryo and one in the larva, this omission could lead to confusion.

        Thank you for this comment. We will mention this in the abstract, introduction and discussion.
      

      In the abstract, what would be the other temporal cohorts generated in specific regions? (ref to: "In specific regions, NB3-3 neuroblasts produce additional types of temporal cohorts, including but not limited to the late-born EL temporal cohort.")

        In this manuscript, we use lineage tracing to identify four types of temporal cohorts: early-born Notch ON, early-born Notch OFF, late-born Notch ON, and late-born Notch OFF. This is now reflected in the revised abstract. ELs are early-born Notch OFF and/or late-born Notch OFF.
      

      This sentence in the introduction is inaccurate: "The Drosophila CNS is organized into an anterior hindbrain-like subesophageal zone (SEZ) and a posterior spinal cord-like nerve cord". The anterior hindbrain-like portion of the CNS is in fact the supraesophageal ganglion (or cerebrum), while the SEZ is a posterior-like region.

        Thank you. We will change this sentence to: "The *Drosophila* CNS is organized into a hindbrain-like subesophageal zone (SEZ) and a spinal cord-like nerve cord".

      Fig 1E: the encoding of the significance is not immediately clear. In the legend the 4 stars could also be arranged in the same way for clarity.

      Thank you. We will change it for clarity.

      Fig 2E legend: it is mentioned that B corresponds to a 1:4 clone, however the MARCM example is shown for C and it's a 1:5.

      Thank you. We will fix this.

      The occurrence of "undifferentiated" neurons in Th segments is in less than 10% of the clones, I wonder if this is a stochastic or deterministic event and to what extent small cell bodies could just be the consequence of local differences in tissue architecture.

      Because we are using a stochastic technique, it is difficult for us to determine whether the occurrence of neurons with small somas is a stochastic or deterministic event. Several papers suggest neurons with small axons are found across insect species (Pearson and Fourtner, 1975; Burrows, 1996). Neurons with a small soma and short or no axons are found in the Drosophila embryonic abdominal nerve cord (Lacin et al., 2009). In our unpublished work on the Drosophila nerve cord at the first instar larval stage, we found small somas with short axons in segment A1 (see Figure 4.6 below). This leads us to believe it is not a consequence of local tissue architecture.

      Fig 2I: it's unclear what the purple means (I suppose it might be Eve expression) and why in J there should be one purple cell not labelled by the ts-MARCM when this is not present in H and I.

      Purple is Eve. We will add labels for stains used in H and I, and remove the extra purple cell from the illustration in J.

      "When synapses do occur, they are numerically similar from segment to segment". It's unclear where the evidence for this statement comes from, please clarify or remove the sentence.

      We calibrated our trans-Tango data against available connectomic data using segment A1 as a reference. We learned that the trans-Tango method identifies only strongly connected neurons (>15 synapses).

      "First, we calibrated trans-Tango for use in larval Drosophila, focusing on segment A1, where connectome data are available (Wang et al., 2022). In the connectome, of the five early-born ELs in A1, three are strongly connected to CHOs (>15 synapses), two are weakly connected (15 synapses) connected to somatosensory neurons."

        We will modify this sentence to say "when synapses do occur, they are of similar strengths from segment to segment".
      

      "In SEZ2, NB3-3 divides 10 times (Figure 2F)". Figure 2F does not support this statement and Figure 7 shows 12 divisions. Possibly SEZ2 and 3 have been inverted in this statement, please clarify.

      Thank you for pointing this out. We will correct it!

      **Referees cross-commenting**

      I agree with most of the comments/suggestions provided by the other two reviewers.

      In particular:

      I agree with reviewer #1's comment about failure to express Eve being a mechanism for controlling neurons number, as this is a circular argument.

      We address this earlier and direct you to that text. Briefly, Eve is not just a marker, but a key differentiation gene for ELs.

      I agree with reviewer #2's concern about the use of the word "flexibility"; "heterogeneity" would be a more appropriate term, as I would associate the word "flexibility" to the ability of a single neuroblast in a single segment to produce neurons with different fates under, for example, unusual growth conditions. Here no genetic/epigenetic manipulations were performed to address flexibility and the observed (stereotypical) differences result from axial patterning.

      We will change this, thank you.

      As a note, Reviewer #1 asks about other temporal cohorts of EL neurons produced by other lineages, but these neurons are specifically generated from NB3-3.

      Thank you for adding this clarification.

      To generalise the observations reported in this study, the authors would need to focus on other molecularly defined temporal cohorts or, more generally, on other lineages, which, however, are likely to adopt different combinations of mechanisms to tune progeny number across segments.

      We agree that further studies are needed to assess the generalizability of our findings.

      Reviewer #4 (Significance (Required)):

      In Drosophila melanogaster, the relationship between neural progenitors and their neuronal progeny has been studied in great detail. This work has provided a comprehensive description of the number of progenitors present in each embryonic segment, their molecular identities, the number of neurons they produce, and the temporal transcriptional cascades that couple progenitor temporal identity to neuronal fate.

      This work adds to the existing knowledge a detailed characterisation of intersegmental differences in the pattern of proliferation of a single type of neuronal progenitor as well as in post-divisional fate depending on anterior-posterior position in the body axis (i.e. programmed cell death and Notch signalling activation). This is a first step towards understanding the cellular and molecular mechanisms underlying such differences, but it's not disclosing them.

      We have disclosed the cellular mechanisms (stem cell division duration and type, neural cell death, identity gene expression, and differentiation state), unless something else is envisaged by this comment. The molecular mechanisms are beyond the scope of this paper.

      That homologous neuroblasts can generate variable numbers of progeny neurons depending on their segmental position has been established previously. What this manuscript adds is the demonstration that these differences arise through a combination of altered division patterns and differential programmed cell death, thereby revealing a more complex and less predictable scenario than could have been anticipated from existing knowledge in other contexts. The advance provided by this study is therefore incremental, refining rather than overturning our understanding of how segmental diversity in neuroblast lineages is achieved.

      The key conceptual advances provided by this study are described in the General Statements section above. We don't overturn, but we advance the field.

      By touching on the general question of how progenitors generate diversity, this work could be of broad interest to developmental neuroscientists beyond the fly field. However, the way it is currently written does not make it very accessible to non-specialists.

      Thank you for this comment. We will endeavor to make it more accessible in the revised manuscript. Reviewer 3, an expert in vertebrate neurobiology, agreed that our work was of broad interest.

      My expertise: Drosophila neurodevelopment, nerve cord, cell types specification

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      With this Revision Plan, we submit a revised abstract and Supplemental Table 1. We plan to address every point raised by the reviewers.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript addresses the question of how the number of neurons produced by each progenitor in the nervous system is determined. To address this question the authors use the Drosophila embryo model. They focus on a single type of neural stem cell (neuroblast), with homologues in each hemisegment along the anterior-posterior axis.

      Using a combination of clonal labelling, antibody stainings, and blockade of programmed cell death, they provide a detailed description of segment-specific differences in the proliferation patterns of these neuroblasts, as well as in the fate and survival of their neuronal progeny. Furthermore, by employing trans-synaptic labelling, they demonstrate that neurons derived from the same progenitor type receive distinct patterns of synaptic input depending on their segmental origin, in part due to their temporal window origin. Overall this work shows that different mechanisms contribute to the final number and identity of the neuronal progeny arising from a single progenitor, even within homologous progenitors along the anterior posterior body axis.

      Major Comments

      I would suggest adding line numbers to the text for future submissions, this massively helps providing comments.

      The authors propose that all neuroblasts produce the same type of temporal cohort (early born) and that, by changing the pattern of cell division, different temporal cohorts can be added. The way this is presented in the abstract sounds like an obvious thing, what would be the alternative scenario/s? Here it's the late born neurons that lack in thoracic segments because of early NB quiescence, but it cannot be excluded that different neuroblast types adopt a different strategy.

      I found the ts-MARCM results confusing for 2 reasons:

      1. It's not clear to me why there are so many single cell clones in div 3 and 4 in abdominal segments. This is not compatible with the division model depicted for abdominal segments, unless GMCs are produced in those division window and the MARCM hits the GMC, as also mentioned in the legend for G. This aspect is important because, either the previous model by Baumgardt et al. - please correct cit. currently Gunnar et al. 2026 - is wrong, or something strange happens in this experiment, or the relative temporal order is incorrect.
      2. In segments other than abdomen, it is quite rare to hit proper clones, it appears that only GMCs are hit by recombination, with very few exceptions. Could the author please provide an explanation for this or at least mention this aspect? It is also unclear whether in F the graph includes all types of clones (including 1:0 clones). This is important, because the timing of division for NBs and GMCs is different, and inclusion of 1:0 might lead to a wrong estimate of the NB proliferation window (longer than it actually is because GMCs divide for longer). This is particularly important for the SEZ, where most clones in normalised division 10 and 11 are with ratio 1:0, thus compatible with both terminal division as well as GMC division.

      To obtain an estimate of the timing of division, the authors normalise clone size to the size of the bigger clone in the abdomen. What happened to those samples where no abdominal clones were hit? Were they simply excluded from the analysis?

      It is proposed that in the thorax late temporal cohort neurons are not produced, yet the ts-MARCM experiment detects some 1:0 clones. What is the fate of these cells? Are they all derived from GMC division and therefore decoupled from the temporal identity window? Or is this a re-activation of division?

      "in A1, a majority of segments had one Notch OFF/B neuron that failed to label with Eve" does "the majority" in this sentence mean that there were cases where all B neurons were labelled with Eve? If yes, where would this stochasticity come from? Additionally, there is no evidence that it's the first born NotchOFF neuron in A1 that does not express Eve. The authors should clarify where this speculation comes from. When discussing trends shared with other phyla:

      A- "In the mammalian spinal cord, more neurons are present in regions that control limbs (Francius et al., 2013). Analogously, EL numbers do not smoothly taper from anterior to posterior; instead, the largest number of ELs is found in two non-adjacent regions, SEZ and the abdomen." It's unclear what is the link between the figure in the mammalian spinal cord and the Drosophila embryo. The embryo doesn't even have limbs and the number of neurons measured here refer only to a single lineage, while there could be (and in fact there are) lineage-to-lineage differences that could depict a different scenario.

      B- The parallelism between V1 mouse neurons and EL Drosophila neurons is also unclear to me. The similarity in fold change across segments could be a pure coincidence and, from what I understand, the two cell types are not functionally linked.

      Minor comments:

      I found the manuscript somewhat difficult to follow, even though I am familiar with both the model and the topic. For non-specialist readers, I expect it will be even more challenging. The presentation of the results often feels fragmented, at times resembling a sequence of brief statements rather than a continuous narrative. I would encourage the authors to provide more synthesis and interpretation, for example by summarising key findings, rather than listing in detail the number of neurons labelled in each segment for every experiment. This would make the results more accessible and easier to digest.

      From the way the MS is written it's not clear from the beginning that the work focuses exclusively on embryonic-born neurons. Since in Drosophila neuronal stem cells undergo two rounds of neurogenesis, one in the embryo and one in the larva, this omission could lead to confusion.

      In the abstract, what would be the other temporal cohorts generated in specific regions? (ref to: "In specific regions, NB3-3 neuroblasts produce additional types of temporal cohorts, including but not limited to the late-born EL temporal cohort.")

      This sentence in the introduction is inaccurate: "The Drosophila CNS is organized into an anterior hindbrain-like subesophageal zone (SEZ) and a posterior spinal cord-like nerve cord". The anterior hindbrain-like portion of the CNS is in fact the supraesophageal ganglion (or cerebrum), while the SEZ is a posterior-like region.

      Fig 1E: the encoding of the significance is not immediately clear. In the legend the 4 stars could also be arranged in the same way for clarity.

      Fig 2E legend: it is mentioned that B corresponds to a 1:4 clone, however the MARCM example is shown for C and it's a 1:5.

      The occurrence of "undifferentiated" neurons in Th segments is in less than 10% of the clones, I wonder if this is a stochastic or deterministic event and to what extent small cell bodies could just be the consequence of local differences in tissue architecture.

      Fig 2I: it's unclear what the purple means (I suppose it might be Eve expression) and why in J there should be one purple cell not labelled by the ts-MARCM when this is not present in H and I.

      "When synapses do occur, they are numerically similar from segment to segment". It's unclear where the evidence for this statement comes from, please clarify or remove the sentence.

      "In SEZ2, NB3-3 divides 10 times (Figure 2F)". Figure 2F does not support this statement and Figure 7 shows 12 divisions. Possibly SEZ2 and 3 have been inverted in this statement, please clarify.

      Referees cross-commenting

      I agree with most of the comments/suggestions provided by the other two reviewers. In particular: I agree with reviewer #1's comment about failure to express Eve being a mechanism for controlling neurons number, as this is a circular argument. I agree with reviewer #2's concern about the use of the word "flexibility"; "heterogeneity" would be a more appropriate term, as I would associate the word "flexibility" to the ability of a single neuroblast in a single segment to produce neurons with different fates under, for example, unusual growth conditions. Here no genetic/epigenetic manipulations were performed to address flexibility and the observed (stereotypical) differences result from axial patterning. As a note, Reviewer #1 asks about other temporal cohorts of EL neurons produced by other lineages, but these neurons are specifically generated from NB3-3. To generalise the observations reported in this study, the authors would need to focus on other molecularly defined temporal cohorts or, more generally, on other lineages, which, however, are likely to adopt different combinations of mechanisms to tune progeny number across segments.

      Significance

      In Drosophila melanogaster, the relationship between neural progenitors and their neuronal progeny has been studied in great detail. This work has provided a comprehensive description of the number of progenitors present in each embryonic segment, their molecular identities, the number of neurons they produce, and the temporal transcriptional cascades that couple progenitor temporal identity to neuronal fate. This manuscript adds to the existing knowledge a detailed characterisation of intersegmental differences in the pattern of proliferation of a single type of neuronal progenitor, as well as in post-divisional fate depending on anterior-posterior position in the body axis (i.e. programmed cell death and Notch signalling activation). This is a first step towards understanding the cellular and molecular mechanisms underlying such differences, but it does not disclose them.

      That homologous neuroblasts can generate variable numbers of progeny neurons depending on their segmental position has been established previously. What this manuscript adds is the demonstration that these differences arise through a combination of altered division patterns and differential programmed cell death, thereby revealing a more complex and less predictable scenario than could have been anticipated from existing knowledge in other contexts. The advance provided by this study is therefore incremental, refining rather than overturning our understanding of how segmental diversity in neuroblast lineages is achieved. By touching on the general question of how progenitors generate diversity, this work could be of broad interest to developmental neuroscientists beyond the fly field. However, the way it is currently written does not make it very accessible to non-specialists.

      My expertise: Drosophila neurodevelopment, nerve cord, cell types specification

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Vasudevan et al provide a detailed characterisation of the different numbers and temporal birthdates of Even-skipped Lateral (EL) neurons produced in different segments from the same neuroblast, NB3-3. The work highlights that the differences in EL neuronal generation across segments are achieved through a combination of different division patterns, failure to upregulate the EL marker Eve and segment-specific programmed cell death. For neurons born within the same window and segment, the authors describe additional heterogeneity in their circuit formation. The work underscores the large diversity that the same neuroblast can generate across segments.

      Major comments:

      • Based on the ts-MARCM 1:0 clones representing 100% of the SEZ clones at any given inferred cell division, the authors conclude "NB3-3 neuroblasts generate proliferative daughter GMCs in the SEZ and thorax on most divisions". Figure 2G does not have any data for SEZ before inferred division 5, whereas there is data in other regions. The authors also state, in reference to Fig 2F, "In the SEZ and abdomen, ELs were labelled regardless of induction time", which seems inaccurate given there are no SEZ clones before inferred division 5. There is no comment on this fact, which is surprising given their focus on temporal cohorts. The authors should explain this discrepancy, if known, or modify their statements to reflect the data.
      • The temporal cohort (early-born vs late-born) identity is exclusively examined based on markers. Given the absence of SEZ clones from early NB3-3 divisions, a time course showing that the SEZ generates early-born ELs, or some other complementary method, would be desirable.
      • The authors repeatedly refer to their work as showing how a stem cell type can have "flexibility". Flexibility would imply that NB3-3 from one segment could adopt a different behaviour (different division pattern, or cell death or connectivity) if it were placed in a different segment. This is not what is being shown. In my opinion, "heterogeneity" of the same neuroblast across different segments would be more appropriate.

      Minor comments:

      • Figure 2A depicts a combination of known data and conclusions from their own (mainly SEZ). The authors might consider editing the figure to highlight what is new. A possibility would be for figure A to be a diagram of the experimental design and their summary division pattern to be shown after the new data instead of being panel A.
      • The authors state that they combined published ts-MARCM with their new one, which differed in a number of ways that they list, but they don't specify which limitations are associated with the published vs new dataset. Could the authors please clarify?
      • The title refers exclusively to "temporal cohorts", which in the manuscript are defined quite narrowly and do not seem to apply to all segments.
      • Several cited references are missing from the Reference list at the end. Could the authors please double check this? (e.g. Matsushita, 1997; Sweeney et al., 2018)
      • Legend for figure 2 is a bit confusing, there is a "(A)" within the legend for (D), which indicates that segments A1-A7 are shown (this seems inaccurate, as it only goes to A6).

      Significance

      This study provides a comprehensive analysis of different cell biological scenarios for a neuroblast to generate distinct progeny across repeating axial units. The strength is the detailed and systematic approach across segments and possible scenarios: different division patterns, cell death, molecular marker expression. While it focuses on one specific neuroblast of the ventral nerve cord of Drosophila, the authors have done extensive work to place their findings and interpretation in the context of other cell types and across model organisms both in the introduction and discussion. This makes the work of interest for developmental biologists in general, neurodevelopment research in particular and those interested in circuit assembly, beyond their specialised community. This point of view comes from someone working in vertebrate CNS development.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

      Major comments: The authors performed careful analyses of the NB3-3 lineage using EL neurons. My main concerns are the limited applicability of their findings and the lack of mechanisms as to how NB3-3 generates various numbers of EL neurons. Their findings are exclusively relevant to the NB3-3 lineage despite their effort in highlighting that other NB lineages also generate temporal cohorts of EL neurons. I disagree with their conclusion that failure to express Eve is a mechanism for controlling EL neuron numbers when Eve serves as the marker for these neurons. Is there any other strategy to assess the fates and functions of these cells besides relying solely on Eve expression? I am not familiar with the significance of Eve expression for the functions of these neurons. Is it possible to perform clonal analyses of NB3-3 mutant for Eve and see if these neurons adopt different functionalities/identities? If NB3-3 in the SEZ continually generates GMCs, based on the interpretation of clonal analyses and as depicted in Fig. 2A, why is the percent of clones that are 1:0 virtually at or near 100% from division 6-11 shown in 2G? The authors also indicate that NB3-3 in the abdomen directly generates Notch OFF/B cells that assume EL neuronal identity. In this scenario, shouldn't the percent of 1:0 clones be 100% in later divisions in Fig. 2G? Based on the number of clones in the abdomen shown in Fig. 2E, I cannot understand how the authors arrive at the percent of 1:0 clones shown in Fig. 2G.

      There are many potentially interesting questions related to this study that could significantly broaden its impact. For example, do other NB lineages that also generate distinct temporal cohorts of EL neurons display similar proliferation patterns (type 1 division in SEZ, early termination of cell division in thoracic segments and type 0 division in abdomen)? Why does NB3-3 in the thoracic segments become quiescent so much sooner than in SEZ and abdominal segments? The authors' observations suggest that NB3-3 in SEZ and abdomen generate a similar number of EL neurons despite the difference in their division patterns (type 1 vs type 0). Are the mechanisms that promote EL neuron generation in NB3-3 in SEZ and abdomen the same? Is anything else known besides Notch OFF?

      Minor comments:

      The authors' writing style is highly unusual, especially in the results section. There is an overwhelmingly large amount of background information in the results section but very thin description of their observations. The background information portion also includes previously published observations. Since the nature of this study is not hypothesis-driven, it is very confusing to read in many places and difficult to distinguish their original observations from previously published results. One easily achievable improvement is to insert relevant figure numbers into the text more often.

      Significance

      The study by Vasudevan et al intends to address how serially homologous neural progenitors generate different numbers and types of neurons depending on their location along the body axis. Investigation of full repertoire of neurogenesis for these progenitors necessitates a precise ability to track the fates of both progenitors and their neuronal progeny making it extremely difficult in vertebrate paradigm. The authors used NB3-3 in the developing fly embryo as a model to investigate the full extent of the flexibility in neurogenesis from a single type of serially homologous stem cell. Previous work showed NB3-3 generates neurons including lateral interneurons that can be positively labeled by Even-skipped, but detailed characterization of the NB3-3 lineage mainly focused on 3 segments during embryogenesis. The authors defined the number of EL neurons in all segments of the central nervous system in early larvae after the completion of circuit formation and carried out clonal analyses to determine the proliferation pattern of NB3-3. They described the failure to express Eve in Notch OFF/B neurons as a new mechanism for controlling the number of EL neurons and PCD limits EL neurons in terminal segments.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This work by Matsui et al. examined the function of the gene stand still (stil) in Drosophila in the regulation of germ cell death in the female germline. They show that stil mutants contain many apoptotic cells, leading to germ cell loss and infertility. Gene expression analysis showed upregulation of pro-apoptotic genes such as rpr in stil mutants. A DamID experiment further showed that Stil binds to the rpr promoter region to repress its expression. Additionally, they also show that undifferentiated germ cells are resistant to cell death in stil mutants (but stil mutants still eventually lose all germ cells).

      Major comments: Overall, experiments adhere to a general standard of rigor, and each result is fairly convincing. In that sense, this paper warrants publication, as a paper that revealed a new gene important for preventing germ cell death. With that said, I feel that this paper does not reveal a new biological insight. In a nutshell, this paper is about a transcriptional repressor for pro-apoptotic gene, hence its depletion leads to cell death. Data is solid and the conclusion is well supported. But the readers will be left wondering why nature implemented such control? Unless one can show what kind of defects stil rpr double mutant (which rescues germ cell loss phenotype) exhibits, there is no insight why the balance of pro-apoptotic gene and its repressor is important. The paper discusses the 'molecular' mechanisms that explain the phenomenon, but it does not provide insights. The lack of conceptual advancement is the limitation of this work.

      Response:

      We thank the reviewer for pointing out a biological insight into the evolutionary rationale underlying the adoption of such a regulatory mechanism in nature. To address this point, we assessed the evolutionary conservation of rpr and stil through BLAST searches and comparative analyses. Our results showed that both genes are Diptera-restricted, whereas their key domains (the rpr IAP-binding motif and the Stil BED finger) are widely conserved across metazoans. In this phylogenetic context, we propose that Stil acts as a dedicated repressor of rpr in the Drosophila female germline, thereby establishing an apoptotic control architecture in which hid predominates and rpr is repressed by Stil. This explains why the balance between a potent effector (Rpr) and its repressor (Stil) is critical in oogenesis; preventing catastrophic germline loss while preserving hid-mediated responsiveness.

      We have incorporated these phylogenetic analyses and the perspective into the revised Discussion section as follows.

      Revised Page 22, Line 475; rpr is conserved only within Diptera, although its IAP-binding motif, essential for apoptosis induction, is broadly conserved across metazoans (Du et al., 2000; Gottfried et al., 2004; Hegde et al., 2002; Shi, 2002; Verhagen et al., 2000; Vucic et al., 1998; Wing et al., 2001; L. Zhou, 2005) (Fig. S7). Similarly, stil is also restricted to Diptera, predominantly within Drosophila, whereas its BED-type zinc finger domain is widely conserved among diverse organisms (Aravind, 2000; Hayward et al., 2013; Tue et al., 2017b; H. Zhou et al., 2016). Phylogenetic patterns across Diptera are consistent with a model in which stil acts as a dedicated repressor of rpr in the Drosophila germline cells (Fig. S7). Due to its potent pro-apoptotic activity, rpr must be stringently repressed in a spatiotemporal manner through mechanisms that are specific to both cell type and developmental stage. During embryogenesis, repression of rpr is mediated by the Dpp-signaling factor Shn, which binds to the rpr regulatory region, whereas in intestinal stem cells (ISCs), its expression is suppressed through chromatin conformation. In Drosophila female germline cells, hid serves as the primary regulator of apoptosis, while rpr activity is generally suppressed (Park et al., 2019; Xing et al., 2015). However, rpr mutants exhibit reduced fertility despite producing viable eggs (Fig. 3H), suggesting that rpr-mediated apoptosis may be required for proper egg development. Accordingly, we propose that stil restrains rpr in the Drosophila female germline, allowing hid to predominate in apoptotic regulation.

      New Fig. S7;

      The legend of new Fig. S7;

      Figure S7 Conservation of Rpr and Stil within Diptera

      Homologs of Drosophila melanogaster Rpr and Stil were identified by BLASTp, aligned, and analyzed phylogenetically. Homologs are present across Dipteran lineages, with the genus Drosophila highlighted in blue. Branch lengths indicate the expected number of substitutions per site, as shown by the scale bar.

      Minor comments: Although this is a minor point, and this is not specifically pointing a finger at the author of this paper, I really don't like the term 'safeguard'. This term is now overutilized to add hype to papers, when 'is necessary' is sufficient. In this case, unless the answer is provided as to 'against what stil is safeguarding germ cells', this term is not meaningful. For example, if one can show that stil specifically senses germline-specific threat and tweaks the regular apoptotic pathway based on germline-specific needs, then the term 'safeguard' may be warranted.

      Response:

      In light of the reviewer's comment, we have revised the title of the manuscript to replace 'safeguard' with 'ensure,' which better reflects the demonstrated function of Stil without overstating its role. The new title of the manuscript is: 'Transcriptional Repression of reaper by Stand Still Ensures Female Germline Development in Drosophila'

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this well-executed study, Matsui et al. investigate how the female Drosophila germline prevents inappropriate apoptosis during development. They identify stand still (stil) as a key germline-specific repressor of apoptosis. Stil mutant flies are homozygous viable but female sterile due to widespread germ cell loss at the time of eclosion, which is driven by activation of the pro-apoptotic gene reaper (rpr) and caspase-dependent cell death. Germline-specific expression of anti-apoptotic factors such as p35 can rescue this phenotype, confirming that the defect lies in apoptotic regulation. The authors show that Stil directly represses rpr transcription through its BED-type zinc finger domain. Notably, undifferentiated germline cells remain resistant to apoptosis in the absence of stil, which the authors attribute to a silenced chromatin state at the rpr locus, marked by H3K9me3. These findings support a dual mechanism of protection: transcriptional repression of rpr by Stil, and a potential parallel chromatin-based silencing mechanism operating specifically in undifferentiated cells.

      Major Issues:

      1. Clarify cell identity in Figure 2E: It is unclear whether the apoptotic cells shown are somatic or germline in origin. Including a somatic marker such as 1B1 would allow the reader to clearly distinguish the apoptotic population and better interpret the figure.

      Response:

      We thank the reviewer for this helpful suggestion. Occasionally, the signal of the germline marker Vasa can be attenuated in dying germline cells. As suggested by the reviewer, we also tested α-Spectrin (a plasma membrane and fusome marker) instead of 1B1 together with TUNEL labeling, but this approach did not clearly distinguish somatic from germline apoptotic cells. To directly clarify cell identity, we now provide an improved co-stained image in which TUNEL-positive nuclei are surrounded by Vasa-positive cytoplasm, indicating a germline origin. Figure 2E has been updated accordingly.

      New Fig. 2E;

      Quantification of undifferentiated cells in mutants: There appears to be inconsistency in the representation of undifferentiated germ cells across figures. Early panels show near-complete germline loss, while later analyses focus on undifferentiated cells that are reportedly apoptosis-resistant. The authors should quantify the proportion of ovarioles retaining undifferentiated cells and present this data in Figure 1 or the supplements to resolve this discrepancy.

      Response:

      Thank you for raising the important point regarding the apparent inconsistency in the representation of undifferentiated germ cell populations. In early panels (Fig. 1C, D), we analyzed adult ovaries of stil loss-of-function mutants, where all germline cells, including undifferentiated germline stem cells (GSCs), are almost completely lost (Fig. 1C), showing nearly 100% agametic ovarioles. However, in later analyses such as those in Fig. 5A, B, we showed 3rd instar larval ovaries of stil loss-of-function mutants containing a few surviving germline cells near the future cap cells, the niche providing the stem cell ligand Decapentaplegic (Dpp) (Xie & Spradling, 1998). This suggests that Dpp-responsive undifferentiated germline cells may be relatively resistant to apoptosis caused by stil loss.

      Indeed, the GSC-like cells generated by the overexpression of a constitutively active form of Dpp receptor, Thickveins (Tkv.CA) or loss of the differentiation factor bam, were resistant to apoptosis caused by stil loss (Fig. 5C, D). These GSC-like cells may possess enhanced stemness, owing to either excessively elevated Dpp signaling or complete loss of bam, which could lead to stronger repression of rpr expression through tighter chromatin compaction.

      We added this argument in the Results section of the revised manuscript as follows.

      Revised Page 16, Line 361; Compared to GSCs, which were almost completely lost in stil mutants, GSC-like cells may retain a more robust stemness owing to the extremely elevated Dpp signaling pathway, potentially resulting in stronger repression of rpr expression.

      Interpretation of chromatin state at the rpr locus: The claim that H3K9me3, but not H3K27me3, marks the rpr locus is not fully convincing given the low ChIP-seq signal shown. Including a comparison to a known positive control locus would strengthen the argument. Alternatively, the authors could broaden the discussion to include global chromatin reorganization during the germ cell to maternal transition, as reported in Kotb et al., 2024, and how such changes may impact rpr accessibility. Also, stil mutants rescued with P35 have a "string of pearls" phenotype that is associated with germ cell to maternal transition defects (Figure S3, P35 OE).

      Response:

      We thank the reviewer for the thoughtful and constructive comment regarding the interpretation of chromatin state at the rpr locus. To strengthen the inference that the rpr locus shows H3K9me3 enrichment, whereas clear H3K27me3 enrichment is not evident, we have now included ChIP-seq signal profiles for known positive control loci, using light (lt) as an H3K9me3-enriched locus (Akkouche et al., 2017; Greil et al., 2003) and Ultrabithorax (Ubx) as a canonical H3K27me3 target (Torres-Campana et al., 2022). These comparisons support our interpretation that H3K9me3, rather than H3K27me3, characterizes chromatin around the rpr locus in GSCs. Accordingly, while we do not exclude a minor H3K27me3 contribution, our analyses indicate H3K9me3 as the predominant signature at rpr in GSCs.

      New Fig.6B and 6C;

      The legend of new Fig. 6B and Fig. 6C;

      (B) H3K9me3 ChIP-seq signal at the rpr locus and the lt locus (H3K9me3-positive control) in GSCs and 4C NCs. (C) H3K27me3 ChIP-seq signal at the rpr locus and the Ubx locus (H3K27me3-positive control) in GSCs and 32C NCs.

      A sentence in the Results section was revised as follows.

      Revised Page 17, Line 396; As internal controls, we confirmed H3K9me3 enrichment at the light (lt) locus and H3K27me3 enrichment at the Ultrabithorax (Ubx) locus, consistent with their established chromatin states (Akkouche et al., 2017; Greil et al., 2003; Torres-Campana et al., 2022); relative to these controls, the rpr locus shows H3K9me3 but no clear H3K27me3 enrichment in GSCs.

      Regarding the suggestion to broaden the discussion to include global chromatin reorganization during the germline-to-maternal transition, as reported in Kotb et al., 2024, we agree that this is an important avenue for understanding rpr accessibility. The "string of pearls" phenotype observed in stil mutants rescued with P35 overexpression (Figure S3) is consistent with perturbations during this transition. However, a detailed analysis of such chromatin reorganization and its potential impact on rpr regulation lies beyond the scope of the present study and represents a valuable direction for future work.

      Broader analysis of rpr regulation in somatic cells: It would be informative to examine publicly available chromatin or transcriptional data for the rpr locus in somatic ovarian cells. This could help clarify whether rpr regulation by Stil is truly germline-specific or reflects broader developmental trends. This will also clarify why the flies are homozygous viable but female sterile.

      Response:

      We thank the reviewer for this insightful suggestion. We agree that exploring chromatin accessibility and transcriptional regulation at the rpr locus in somatic ovarian cells would provide valuable insights into tissue- or cell-type-specific chromatin environments that influence rpr expression.

      However, to our knowledge, there are currently no publicly available ATAC-seq or comparable chromatin datasets for purified ovarian somatic cells, including follicle cells or ovarian somatic cells (OSCs). As such, we are unable to incorporate this analysis in the current study. Nevertheless, we fully recognize the importance of this line of inquiry and consider it a valuable direction for future research.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript describes the characterization of stand still (stil), a previously identified gene needed for germ cell survival in Drosophila. The molecular function of Stil has until now remained poorly understood. This new work shows that loss of stil results in reaper (rpr)-dependent apoptosis within female germ cells. Loss of rpr suppresses many of the phenotypes observed in stil mutants. Experiments performed using Drosophila cell culture suggest that Stil binds to elements within the rpr promoter. DamID and structure/function experiments indicate that Stil likely directly represses the transcription of rpr within germ cells.

      In general, the experiments are well executed, and the data largely support the basic claims of the authors. Replicates are included and appropriate statistical analyses have been provided. The text and figures are clear and accurate. Appropriate references were cited. There are a few things the authors should address or rephrase before publication.

      On page 9 line 190-192. The authors state "Altogether, these findings indicate that the loss of stil function not only triggers apoptosis that can be suppressed by apoptosis inhibitors but also causes defects in oogenesis progression that are not rescued by blocking cell death." Failure to rescue defects during mid-oogenesis could be due to insufficient transgene expression. Indeed, loss of rpr appears to rescue the fertility of stil mutants. The conclusions of this section should be restated.

      Response:

      We agree that the failure to rescue mid-oogenesis defects by P35 overexpression may, at least in part, be due to insufficient transgene expression. This explanation is particularly plausible given that loss of rpr more effectively restored fertility in stil mutants. As suggested by the reviewer, we have revised the relevant sentences, to avoid misinterpretation as below.

      Revised Page 9, Line 191; Altogether, these findings indicate that the loss of stil function triggers apoptosis that can be suppressed by apoptosis inhibitors.

      Revised Page 12, Line 253; The complete rescue of germline survival in stil rpr double mutants also suggests that the failure of P35 overexpression to restore mid-oogenesis defects may partly reflect insufficient transgene expression (Fig. S3).

      The authors should present the overlap between genes that change expression in a stil mutant and those in which the DamID experiments indicate are directly bound by Stil protein. DamID can sometimes give spurious results depending on expression levels. Further discussion along this point is necessary.

      Response:

      We thank the reviewer for raising this issue. As suggested, we have now analyzed the overlap between genes that are differentially expressed in stil mutant ovaries (identified by RNA-seq with stil mutant expressing P35) and genes that are potentially bound by Stil based on DamID-seq data (promoter-proximal peaks ≤1 kb) as Supplementary Table 4. The list includes genes with DamID peaks within promoter regions and that also exhibit significant differential expression (|log2FC| > 1, adjusted p The overlap between DamID-seq and RNA-seq comprises 682 genes, including rpr, supporting the idea that Stil regulates rpr expression through interaction with its upstream promoter region. However, the detected peak signal at rpr was 3.41, which was not that strong, suggesting that Stil may also bind to and regulate other genes in female germline cells. Investigating the potential role of Stil in regulating other genes represents an important future direction of our study.

      We have included this analysis and argument in the revised manuscript as below.

      Revised Page 13, Line 280; A total of 682 genes with Stil-enriched peaks detected at promoter regions (≤1 kb) showed significantly altered expression in RNA-seq analysis of stil mutants expressing P35, including rpr (Supplementary Table 4).

      Revised Page 20, Line 440; Notably, the DamID peak intensity at the rpr locus reached 3.41, which is moderate rather than strong (Supplementary Table 4). This suggests that, in addition to repressing rpr, Stil may bind to and regulate other genomic loci in the female germline. Investigating the repertoire of Stil target genes and elucidating their roles in germline cells will be an important future direction of this study.
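
      The overlap analysis described in this response can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the input layout, the toy gene names (geneA, geneB) and all numeric values other than the rpr peak signal of 3.41 are hypothetical, and the adjusted p-value cutoff of 0.05 is an assumption because the threshold is not legible in the response. Only the ≤1 kb promoter window and the |log2FC| > 1 criterion come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffExprResult:
    """One RNA-seq differential expression record (hypothetical layout)."""
    gene: str
    log2_fold_change: float
    adjusted_p: float


def overlap_bound_and_changed(damid_peaks, rnaseq_results,
                              max_distance_bp=1000,    # promoter window from the text (<= 1 kb)
                              min_abs_log2fc=1.0,      # |log2FC| > 1 from the text
                              max_adjusted_p=0.05):    # ASSUMED cutoff, not stated legibly
    """Return genes that have a promoter-proximal DamID peak AND changed expression.

    damid_peaks: iterable of (gene, distance_to_tss_bp, peak_signal) tuples.
    rnaseq_results: iterable of DiffExprResult.
    """
    bound = {gene for gene, distance_bp, _signal in damid_peaks
             if abs(distance_bp) <= max_distance_bp}
    changed = {r.gene for r in rnaseq_results
               if abs(r.log2_fold_change) > min_abs_log2fc
               and r.adjusted_p < max_adjusted_p}
    return sorted(bound & changed)


# Toy input: only the rpr peak signal (3.41) is taken from the response.
toy_peaks = [("rpr", 250, 3.41), ("geneA", 5000, 6.0), ("geneB", 800, 2.0)]
toy_results = [
    DiffExprResult("rpr", 2.3, 0.001),
    DiffExprResult("geneA", 3.0, 0.0001),   # changed, but peak is outside the 1 kb window
    DiffExprResult("geneB", 0.4, 0.2),      # bound, but expression is not changed
]

print(overlap_bound_and_changed(toy_peaks, toy_results))
```

      Taking the intersection of two independently thresholded gene sets keeps the binding and expression criteria separate, so either cutoff can be varied to check how sensitive the overlap size is to it.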

      For structure function experiments, a western blot showing expression levels of the different transgenes in ovaries should be included.

      Response:

      We thank the reviewer for this helpful comment. To address this point, we examined the expression levels of the four Stil variants (FL, NT, CT, and AAYA) in ovaries driven by a germline driver under a wild-type background using Western blotting. The representative blot and quantification from three biological replicates showed comparable expression levels among the variants, with the CT variant displaying a slightly reduced signal. Importantly, AAYA showed expression comparable to FL yet, like CT, failed to rescue, indicating that the rescue failure is not explained by expression-level differences. These data instead support a requirement for the BED-type zinc finger for Stil function in the germline. While we cannot fully exclude a minor contribution from the slightly lower expression of the CT variant to the lack of rescue, the AAYA result argues that loss of BED-type zinc-finger function is the primary cause; we note this caveat in the revised text. The corresponding data are now presented in Figure S6A of the revised manuscript.

      New Fig. S6A;

      The legend of new Fig. S6A;

      (A) Western blot analysis of 6×Myc-tagged Stil variants (FL, NT, CT, and AAYA) driven by NGT40-Gal4; NosGal4-VP16, with y w as a control. Stil variants were detected with anti-Myc, and α-Tubulin (αTub) served as a loading control. Arrowheads indicate Stil variant proteins. The lower panel shows quantification of the Myc/αTub signal ratio normalized to FL. Error bars indicate standard deviation (s.d.) (n = 3).
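
      The quantification described in this legend (Myc/αTub signal ratio normalized to FL, with s.d. from n = 3) can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' analysis: the function name and the band intensities are hypothetical, and normalizing every replicate ratio to the mean FL ratio is one of several reasonable conventions.

```python
from statistics import mean, stdev


def normalized_ratios(myc, tub, reference="FL"):
    """Compute Myc/alpha-Tubulin ratios per replicate, normalized to a reference variant.

    myc, tub: dicts mapping variant name -> list of band intensities, one per replicate.
    Returns a dict mapping variant name -> (normalized mean, normalized sample s.d.).
    ASSUMPTION: ratios are normalized to the mean ratio of the reference variant.
    """
    ratios = {variant: [m / t for m, t in zip(myc[variant], tub[variant])]
              for variant in myc}
    reference_mean = mean(ratios[reference])
    return {variant: (mean(r) / reference_mean, stdev(r) / reference_mean)
            for variant, r in ratios.items()}


# Hypothetical band intensities for three biological replicates.
toy_myc = {"FL": [100.0, 110.0, 90.0], "CT": [50.0, 55.0, 45.0]}
toy_tub = {"FL": [100.0, 110.0, 90.0], "CT": [100.0, 110.0, 90.0]}

for variant, (avg, sd) in normalized_ratios(toy_myc, toy_tub).items():
    print(f"{variant}: mean = {avg:.2f}, s.d. = {sd:.2f}")
```

      Dividing by the loading control before normalizing to FL corrects for lane-to-lane loading differences, so the normalized value reflects relative transgene expression and not the amount of lysate loaded.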

      Sentences in the Results section were revised as below.

      Revised Page 13, Line 291; The expression of all four Stil variant proteins from the transgenes was confirmed, although Stil-CT showed a slightly reduced expression level (Fig. S6A).

      Revised Page 14, Line 305; Although CT shows slightly lower expression, AAYA fails to rescue despite FL-like expression, indicating that expression level is not limiting and that loss of the BED-type zinc finger underlies the phenotype.

      With the addition of the new Fig. S6A, the following figure labels have been updated;

      Fig. S6A → S6B

      Fig. S6B → S6C

      Fig. S6C → S6D

      Fig. S6D → S6E

      Individual data points should be shown in each graph in place of simple bar graphs. This type of presentation was inconsistent throughout the paper.

      Response:

      We thank the reviewer for this constructive comment. In line with the reviewer's suggestion, we have revised the relevant graphs to include individual data points overlaid on bar plots with error bars. This modification enables readers to better assess data variability. We also ensured consistency in data presentation among the revised figures while maintaining clarity throughout the manuscript.

      Reference "G & D., 1997" should be properly formatted.

      Page 6 line 117 and 121- a couple of instances where "cell" should be "cells"

      Page 14 line 304- typo "Still"

      Response:

      As suggested, we have revised all figures to display individual data points in each graph instead of using simple bar graphs. This change has been applied consistently throughout the manuscript to improve data transparency and readability. The revised figures include Figure 1A, 2B, S1A, and S2A.

      We have also corrected the following textual issues;

      ・The reference "G & D., 1997" has been properly formatted as "Pennetta & Pauli, 1997".

      ・On page 6, lines 119 and 123, "cell" has been corrected to "cells" to ensure grammatical accuracy.

      ・On page 14, line 315, the typo "Still" has been corrected to "Stil".

      Reviewer #3 (Significance (Required)):

      The significance of the work lies in characterizing a previously unknown function of Stil. By showing that Stil acts to repress transcription of the cell death gene rpr, the authors provide new insights into how programmed cell death is regulated in the Drosophila female germline. Readers interested in reproductive biology, cell death, chromatin, and general developmental biology will find value in these new findings.

      One thing to consider is the possibility that Stil represses rpr in the context of the double strand breaks that form during meiosis. Experiments in the paper indicate that stil knockdown results in TUNEL labeling in region 2A/2B of the germarium. The authors should consider co-labeling for a meiosis marker (C(3)G or gammaH2Av) to see if this PCD correlates with this expression. In addition, they could test whether loss of Spo11 (mei-W68) suppresses stil phenotypes during early germ cell development. Relating the function of Stil to repression of cell death during this critical time of germ cell development would elevate the impact and significance of the paper. However, this may be considered beyond the scope of the current study.

      Response:

      We deeply thank the reviewer for this insightful and thought-provoking suggestion.

      As suggested, we conducted co-staining for γH2Av (a DSB marker), as well as genetic interaction experiments with Spo11 (mei-W68) mutants, to address this question; the results are shown below. In region 2 of all genotypes, including the y w control and stil heterozygous and homozygous ovaries expressing P35, γH2Av signals were discernible and were subsequently lost in region 3 through the meiotic recombination-specific DNA repair program (Additional Figure A). In stil mutants, however, an additional strong γH2Av signal was specifically observed in the oocyte, beyond the expected meiotic pattern. Furthermore, loss of meiotic recombination factors, including mei-W68, in stil mutants partially rescued the germline loss phenotype, although not to the same extent as in rpr mutants (Additional Figure B, C: 43.5 % in mei-W68-GLKD, 23.9 % in mei-P22P22 and 12.8 % in vilya826 versus 100 % with loss of rpr in Fig. 3E, F of the revised manuscript). These findings suggest that accumulation of meiotic DSBs is not the main cause of rpr upregulation in stil mutants. We feel that these analyses are beyond the scope of the current study, which focuses on identifying Stil as a transcriptional repressor of rpr and characterizing its role in germline apoptosis. Elucidating other mechanisms that elevate rpr expression in stil mutants will be the focus of future work. Hence, we are providing these data here for the reviewer's reference, but if the reviewer prefers, we would be happy to incorporate them into the manuscript.

      Additional Figure (A) Immunostaining of ovarioles from y w, stilEY16156/CyO; P35 OE (NGT40; NosGal4-VP16> P35), stilEY16156; P35 OE flies with antibody against DNA double-strand break marker H2Av (green), Vasa (red), and DAPI (blue). Insets show enlarged views of egg chamber. White dots indicate oocyte nuclei. Scale bars: 50 μm (ovariole) and 20 μm (egg chamber). (B) Immunofluorescence of Vasa (red) and DAPI (blue) in ovaries from stilEY16156, stilEY16156; mei-W68-GLKD (driven by NGT40; NosGal4-VP16), stilEY16156; meiP22P22, and stilEY16156; vilya826. Scale bar: 50 μm. (C) Quantification of the percentage of ovarioles containing germline cells in 2-3-day-old females. The genotypes of females are indicated below the x-axis, and the number of germaria analyzed is shown above each bar. Error bars represent the standard deviation (s.d.).

      Akkouche, A., Mugat, B., Barckmann, B., Varela-Chavez, C., Li, B., Raffel, R., Pélisson, A. & Chambeyron, S. (2017). Piwi Is Required during Drosophila Embryogenesis to License Dual-Strand piRNA Clusters for Transposon Repression in Adult Ovaries. Molecular Cell, 66(3), 411-419.e4. https://doi.org/10.1016/j.molcel.2017.03.017

      Greil, F., Kraan, I. van der, Delrow, J., Smothers, J. F., Wit, E. de, Bussemaker, H. J., Driel, R. van, Henikoff, S. & Steensel, B. van. (2003). Distinct HP1 and Su(var)3-9 complexes bind to sets of developmentally coexpressed genes depending on chromosomal location. Genes & Development, 17(22), 2825-2838. https://doi.org/10.1101/gad.281503

      Röper, K. & Brown, N. H. (2004). A Spectraplakin Is Enriched on the Fusome and Organizes Microtubules during Oocyte Specification in Drosophila. Current Biology, 14(2), 99-110. https://doi.org/10.1016/j.cub.2003.12.056

      Torres-Campana, D., Horard, B., Denaud, S., Benoit, G., Loppin, B. & Orsi, G. A. (2022). Three classes of epigenomic regulators converge to hyperactivate the essential maternal gene deadhead within a heterochromatin mini-domain. PLoS Genetics, 18(1), e1009615. https://doi.org/10.1371/journal.pgen.1009615

      Xie, T. & Spradling, A. C. (1998). decapentaplegic Is Essential for the Maintenance and Division of Germline Stem Cells in the Drosophila Ovary. Cell, 94(2), 251-260. https://doi.org/10.1016/s0092-8674(00)81424-5

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript describes the characterization of stand still (stil), a previously identified gene needed for germ cell survival in Drosophila. The molecular function of Stil has until now remained poorly understood. This new work shows that loss of stil results in reaper (rpr)-dependent apoptosis within female germ cells. Loss of rpr suppresses many of the phenotypes observed in stil mutants. Experiments performed using Drosophila cell culture suggest that Stil binds to elements within the rpr promoter. DamID and structure/function experiments indicate that Stil likely directly represses the transcription of rpr within germ cells.

      In general, the experiments are well executed, and the data largely support the basic claims of the authors. Replicates are included and appropriate statistical analyses have been provided. The text and figures are clear and accurate. Appropriate references were cited. There are a few things the authors should address or rephrase before publication.

      On page 9, lines 190-192, the authors state "Altogether, these findings indicate that the loss of stil function not only triggers apoptosis that can be suppressed by apoptosis inhibitors but also causes defects in oogenesis progression that are not rescued by blocking cell death." Failure to rescue defects during mid-oogenesis could be due to insufficient transgene expression. Indeed, loss of rpr appears to rescue the fertility of stil mutants. The conclusions of this section should be restated.

      The authors should present the overlap between genes that change expression in a stil mutant and those that the DamID experiments indicate are directly bound by Stil protein. DamID can sometimes give spurious results depending on expression levels. Further discussion along this point is necessary.

      For structure function experiments, a western blot showing expression levels of the different transgenes in ovaries should be included.

      Individual data points should be shown in each graph in place of simple bar graphs. This type of presentation was inconsistent throughout the paper.

      Reference "G & D., 1997" should be properly formatted. Page 6 line 117 and 121- a couple of instances where "cell" should be "cells" Page 14 line 304- typo "Still"

      Referee cross-commenting

      I also agree with the points raised by the other two reviewers. I think we are in general agreement on the strengths and weaknesses of the study.

      Significance

      The significance of the work lies in characterizing a previously unknown function of Stil. By showing that Stil acts to repress transcription of the cell death gene rpr, the authors provide new insights into how programmed cell death is regulated in the Drosophila female germline. Readers interested in reproductive biology, cell death, chromatin, and general developmental biology will find value in these new findings.

      One thing to consider is the possibility that Stil represses rpr in the context of the double strand breaks that form during meiosis. Experiments in the paper indicate that stil knockdown results in TUNEL labeling in region 2A/2B of the germarium. The authors should consider co-labeling for a meiosis marker (C(3)G or gammaH2Av) to see if this PCD correlates with this expression. In addition, they could test whether loss of Spo11 (mei-W68) suppresses stil phenotypes during early germ cell development. Relating the function of Stil to repression of cell death during this critical time of germ cell development would elevate the impact and significance of the paper. However, this may be considered beyond the scope of the current study.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this well-executed study, Matsui et al. investigate how the female Drosophila germline prevents inappropriate apoptosis during development. They identify stand still (stil) as a key germline-specific repressor of apoptosis. Stil mutant flies are homozygous viable but female sterile due to widespread germ cell loss at the time of eclosion, which is driven by activation of the pro-apoptotic gene reaper (rpr) and caspase-dependent cell death. Germline-specific expression of anti-apoptotic factors such as p35 can rescue this phenotype, confirming that the defect lies in apoptotic regulation. The authors show that Stil directly represses rpr transcription through its BED-type zinc finger domain. Notably, undifferentiated germline cells remain resistant to apoptosis in the absence of stil, which the authors attribute to a silenced chromatin state at the rpr locus, marked by H3K9me3. These findings support a dual mechanism of protection: transcriptional repression of rpr by Stil, and a potential parallel chromatin-based silencing mechanism operating specifically in undifferentiated cells.

      Major Issues:

      1. Clarify cell identity in Figure 2E: It is unclear whether the apoptotic cells shown are somatic or germline in origin. Including a somatic marker such as 1B1 would allow the reader to clearly distinguish the apoptotic population and better interpret the figure.
      2. Quantification of undifferentiated cells in mutants: There appears to be inconsistency in the representation of undifferentiated germ cells across figures. Early panels show near-complete germline loss, while later analyses focus on undifferentiated cells that are reportedly apoptosis-resistant. The authors should quantify the proportion of ovarioles retaining undifferentiated cells and present this data in Figure 1 or the supplements to resolve this discrepancy.
      3. Interpretation of chromatin state at the rpr locus: The claim that H3K9me3, but not H3K27me3, marks the rpr locus is not fully convincing given the low ChIP-seq signal shown. Including a comparison to a known positive control locus would strengthen the argument. Alternatively, the authors could broaden the discussion to include global chromatin reorganization during the germ cell to maternal transition, as reported in Kotb et al., 2024, and how such changes may impact rpr accessibility. Also, stil mutants rescued with P53 have a "string of pearls" phenotype that is associated with germ cell to maternal transition defects (Figure S3, p53 OE).
      4. Broader analysis of rpr regulation in somatic cells: It would be informative to examine publicly available chromatin or transcriptional data for the rpr locus in somatic ovarian cells. This could help clarify whether rpr regulation by Stil is truly germline-specific or reflects broader developmental trends. This will also clarify why the flies are homozygous viable but female sterile.

      Referee cross-commenting

      I agree with the assessment of the other two reviewers. I think reviewer 3's point about "the overlap between genes that change expression in a stil mutant and those in which the DamID experiments indicate are directly bound by Stil" is important and needs to be addressed.

      Significance

      This study provides important insight into how germline cells in Drosophila evade apoptosis through both transcriptional and chromatin-based regulation. While reaper is a well-known effector of apoptosis, the identification of stil as a direct repressor in the female germline adds a new layer of cell type-specific control. The authors also delineate an epigenetic mechanism that protects undifferentiated germline cells, highlighting stage-specific differences in apoptotic susceptibility. This dual mechanism is conceptually significant and expands our understanding of how cell survival is maintained during gametogenesis. However, the precise novelty of stil relative to other rpr regulators could be articulated more clearly, and some data interpretations would benefit from additional clarification.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This work by Matsui et al. examined the function of the Drosophila gene stand still (stil) in the regulation of germ cell death in the female germline. They show that stil mutants contain many apoptotic cells, leading to germ cell loss and infertility. Gene expression analysis showed upregulation of pro-apoptotic genes such as rpr in stil mutants. DamID experiments further showed that Stil binds to the rpr promoter region to repress its expression. Additionally, they show that undifferentiated germ cells are resistant to cell death in stil mutants (although stil mutants still eventually lose all germ cells).

      Major comments: Overall, experiments adhere to a general standard of rigor, and each result is fairly convincing. In that sense, this paper warrants publication, as a paper that revealed a new gene important for preventing germ cell death. With that said, I feel that this paper does not reveal a new biological insight. In a nutshell, this paper is about a transcriptional repressor of a pro-apoptotic gene, hence its depletion leads to cell death. Data is solid and the conclusion is well supported. But the readers will be left wondering why nature implemented such control. Unless one can show what kind of defects the stil rpr double mutant (which rescues the germ cell loss phenotype) exhibits, there is no insight into why the balance of the pro-apoptotic gene and its repressor is important. The paper discusses the 'molecular' mechanisms that explain the phenomenon, but it does not provide insights. The lack of conceptual advancement is the limitation of this work.

      Minor comments: Although this is a minor point, and this is not specifically pointing a finger at the author of this paper, I really don't like the term 'safeguard'. This term is now overutilized to add hype to papers, when 'is necessary' is sufficient. In this case, unless the answer is provided as to 'against what stil is safeguarding germ cells', this term is not meaningful. For example, if one can show that stil specifically senses germline-specific threat and tweaks the regular apoptotic pathway based on germline-specific needs, then the term 'safeguard' may be warranted.

      Referee cross-commenting

      I also agree with other reviewers.

      Significance

      As I summarized above, as is, this manuscript's impact is limited to identifying a gene that is required to prevent germ cell death.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Point-by-Point Response to Reviewers for Manuscript #RC-2024-02720

      Manuscript Title: Molecular and Neural Circuit Mechanisms Underlying Sexual Experience-dependent Long-Term Memory in Drosophila.

      Corresponding Author: Woo Jae Kim

      We extend our sincere gratitude to the Managing Editor and both reviewers for their diligent and insightful evaluation of our manuscript. The comprehensive feedback provided has been invaluable, guiding us to significantly strengthen the manuscript's scientific rigor, logical cohesion, and overall impact. We have undertaken a substantial revision, incorporating new experimental evidence, reframing the central narrative, and improving data presentation to address all concerns raised.

      The major revisions include:

      1. New Experimental Evidence: We have performed three new sets of experiments to address key questions raised by the reviewers. First, we used the protein synthesis inhibitor cycloheximide to pharmacologically validate that the observed memory is indeed a form of long-term memory (LTM). Then, we performed genetic intersectional analyses to determine if the identified Yuelao (YL) neurons express the canonical sex-determination transcription factors doublesex (dsx) and fruitless (fru).
      2. Narrative Reframing and Logical Restructuring: We fully agree with the reviewers that the logic of the original manuscript was confusing, particularly regarding the distinction between the broad Mushroom Body (MB) Kenyon Cell (KC) population and the specific YL neurons. The manuscript has been extensively rewritten to present a clear, hypothesis-driven narrative. We now frame the initial KC-related findings as part of a broader screening effort that logically led to the identification and focused investigation of the YL neuron circuit.
      3. Refined Central Claim: Guided by the reviewers' feedback and our new data, we have sharpened our central claim. We now propose that YL neurons constitute a critical circuit for forming attractive taste- and pheromone-based memories derived from Gr5a neuronal inputs. This form of appetitive memory is distinct from the previously characterized internal reward state associated with ejaculation, adding a new layer to our understanding of how male flies remember and evaluate reproductive experiences.
      4. Improved Data Quality and Analysis: In response to valid critiques, all imaging figures have been replaced with high-resolution versions. Furthermore, our methods for fluorescence quantification, particularly for the TRIC calcium imaging experiments, have been corrected to include normalization against an internal reference channel, adhering to established best practices. All requested genetic control experiments have been performed.

      We are confident that these comprehensive revisions have fully addressed all concerns and have transformed our manuscript into a much stronger, more focused, and logically sound contribution. We thank you again for the opportunity to improve our work and look forward to your evaluation of the revised manuscript.

      Responses to Reviewer #1

      General Comments: This study explores the molecular and neural circuitry mechanisms underlying sexual experience-dependent long-term memory (SELTM) in male Drosophila. The authors use behavioral, imaging, and bioinformatics approaches to identify YL neurons, a subset of mushroom body (MB) projecting neurons, as crucial for SELTM formation. They propose that YL neurons receive inputs from WG neurons via the sNPF-sNPFR pathway and implicate molecular players such as orb2, fmr1, MDAR2-CaMK, and synaptic plasticity in their function.

      However, the evidence presented does not adequately support the authors' claims. The data fail to cohesively tell a logical story, and key conclusions appear to be based on assumptions and correlations rather than robust evidence.

      • Answer: We are deeply grateful to both reviewers for their thorough and constructive evaluation of our manuscript. Their collective feedback has been instrumental in helping us to clarify the study's rationale, strengthen our interpretations, and significantly improve the overall quality and impact of the work. We appreciate the recognition of our study's potential to advance the understanding of how sexual experience modifies future mating behaviors and to elucidate the neuronal and molecular mechanisms of how memory regulates a key sexual behavior in male Drosophila.

      • In response to the general comments, we have undertaken a major revision of the manuscript to improve the clarity, logic, and presentation. We have rewritten the Abstract and Introduction to more clearly define "sexual experience-dependent long-term memory" (SELTM) and articulate its significance in the context of adaptive decision-making and interval timing. The entire manuscript has been restructured to present a more logical, hypothesis-driven narrative that clearly distinguishes our initial broad screening from the focused investigation of the YL neuron circuit. We have also incorporated alternative interpretations of our data, particularly regarding the role of the YL circuit in regulating baseline mating duration in naive males, which has added more depth to the study. Finally, all figures have been remade in high resolution, and all requested genetic controls and methodological clarifications have been added to ensure rigor and reproducibility. We are confident that these revisions have addressed the reviewers' concerns and have resulted in a much stronger manuscript.

      Comment 1: The study identifies the knowledge gap (lines 103-104) but fails to integrate relevant literature, particularly Shohat-Ophir et al., Science (2012), and Zer-Krispil et al., Curr Biol (2018). These studies established that ejaculation induces appetitive memory in male Drosophila via corazonin and NPF neurons. The current study does not provide direct evidence that the "act of mating itself" drives SELTM, as it includes both courtship and copulation.

      Response: Thank you for highlighting these two landmark studies. We fully agree that Shohat-Ophir et al., Science (2012) and Zer-Krispil et al., Curr Biol (2018) were pivotal in demonstrating that ejaculation—and the accompanying corazonin/NPF signalling—can establish an appetitive memory in males.

      In the revised manuscript we have now integrated both papers on lines 111-118:

      “Previous work has shown that successful copulation is intrinsically rewarding to male Drosophila: a single mating encounter elevates brain neuropeptide F (NPF) levels and suppresses subsequent ethanol preference19. Importantly, Zer-Krispil et al. further demonstrated that ejaculation itself—artificially induced by optogenetic activation of corazonin (Crz) neurons—is sufficient to mimic this reward state, driving appetitive memory formation and up-regulation of NPF. These findings indicate that the act of ejaculation, rather than the entire courtship sequence, is the critical sensory event that gates post-mating reward.”

      Comment 2: The nature of the observed long-lasting reduced mating duration requires clearer characterization: Is this an associative memory or experience-dependent behavioral plasticity? Can the formation of this long-term memory be blocked by protein synthesis inhibitors, such as cycloheximide?

      Response: We thank the reviewer for this excellent suggestion to pharmacologically characterize the nature of the memory. To definitively test whether the observed SMD is a form of protein synthesis-dependent long-term memory (LTM), we performed a new experiment as suggested.

      We have now included data in new Figure supplement 1I showing that feeding males the protein synthesis inhibitor cycloheximide (CXM) for 24 hours immediately following the sexual experience completely blocks the formation of the long-lasting SMD phenotype. Control flies fed a vehicle solution exhibited robust SMD. This result provides strong evidence that SELTM is not merely a form of transient behavioral plasticity but is a genuine form of LTM that requires de novo protein synthesis for its consolidation, a hallmark of LTM across species.[1]

      The revised text appears on lines 173-176:

      "To determine whether the persistent reduction in mating duration (SMD) depends on de novo protein synthesis, we fed males the translational inhibitor cycloheximide (CXM). Under this regimen, CXM completely abolished the SMD phenotype (Fig. 1I)."

      Comment 3: While schematics illustrate the working hypotheses, the text lacks detailed explanations, leaving the reader unclear about the rationale behind certain conclusions.

      Response: Thank you very much for this insightful comment. We fully agree that the original manuscript did not provide sufficient textual justification for the conclusions derived from the schematics. In the revised version we have therefore added comprehensive explanations immediately following each figure (or schematic) that explicitly state the underlying rationale, the key observations supporting our hypotheses, and the logical steps leading to each conclusion. We believe these additions now make the reasoning transparent and easy to follow. We appreciate your feedback, which has substantially improved the clarity of our work.

      Comment 4: The logic to draw certain conclusions was confusing and misleading.

      - For instance, the role of orb2 in SELTM is examined via knockdown in MB Kenyon cells (KCs) (using ok107>orb2-RNAi), which is irrelevant to the claim that orb2 functions in YL neurons. Additionally, RNAseq analyses (Fig. 1N-S) focusing on orb2 expression in a/b KCs are irrelevant to and cannot support the claim that Orb2 functions in YL neurons.

      - Similarly, the claim (lines 302-303) that sNPF-R expression is exclusive to MB KCs conflicts with data showing effects when sNPF-R is knocked down in YL neurons. How can knocking down a gene that is exclusively expressed in neural population A, in neural population B, affect a phenotype? This inconsistency undermines the interpretation of the results.

      - Other examples include lines 223-227 and lines 246-249. It is very confusing how the authors came to the indications.

      - The authors also kept confusing the readers and themselves by mistakenly referring to MB KC a-lobe and YL a-lobe projection. They may know the difference between the two neural populations but they did not always refer to the right one in the text.

      Response: We agree completely with the reviewer that the logic in the original manuscript was confusing and failed to clearly distinguish between the general MB Kenyon Cell (KC) population and the specific YL projection neurons. This was a major flaw, and we are grateful for the opportunity to correct it. We have undertaken a major revision of the manuscript's narrative and structure to present a clear, logical progression of discovery.

      The new logical flow of the manuscript is as follows:

      1. We first establish that sexual experience induces a robust, long-lasting SMD behavior that is dependent on protein synthesis.
      2. We then perform initial experiments to implicate the MB as a key brain region. We show that broad inhibition of MB KCs (using the ok107-GAL4 driver) disrupts SMD behavior. This result establishes the general involvement of the MB but lacks cellular specificity.
      3. The remainder of the manuscript then focuses specifically on dissecting the molecular and cellular properties of the YL neurons.

      Finally, we have meticulously edited the entire manuscript to ensure that we always use precise terminology, clearly distinguishing between "YL neuron projections to the MB α-lobe" and the "MB KC α-lobe."

      Comment 5: The imaging figures provided are unfocused and poorly resolved, making it difficult to assess data quality.

      - Colocalization analyses of orb2 and YL are unconvincing... Maximum intensity projection images are insufficient... complete image stacks with staining of orb2, YL, and KCs (MB-dsRed) are needed for validation.

      - Quantification of imaging data appears flawed. For example, claims of orb2 and CaMKII upregulation in MB a-lobe projections (e.g., Fig. S2F-J, Fig. 3M,N) are confounded by widespread increases in intensity across the brain, lacking specificity.

      - The TRIC experiment analysis should normalize GFP signals to internal reference channel (RFP in the TRIC construct)...

      - In Fig. 6H-J, methods for counting synapse numbers are not described. How are synapse numbers counted in these low-resolution images?

      Response: We sincerely apologize for the poor quality of the imaging data presented in the original manuscript. We agree with the reviewer's critiques and have taken comprehensive steps to rectify these issues.

      • Image Quality: We apologize for not including the full image data in the original submission. The complete figure is now presented in revised Fig. 2J .
      • Fluorescence Quantification: The fluorescence quantification has been re-analyzed. The Methods section now includes a detailed description of our protocol.
      • TRIC Normalization: We apologize for not stating this explicitly in the previous version. As now described in the revised Methods subsection “Quantitative Analysis of Fluorescence Intensity”, all TRIC images were acquired with identical laser power and exposure settings. The GFP signal was background-corrected and then normalized to the RFP fluorescence encoded by the TRIC construct itself (UAS-mCD8RFP), which serves as an internal reference for construct expression and mounting thickness.
      • Synapse Counting: We agree with the reviewer that the resolution of our images was insufficient for accurate synapse particle counting. We have therefore removed the problematic analysis from the former Fig 6H-J. Our conclusions regarding synaptic plasticity now rest on the more robust and quantifiable data showing a significant increase in the total area of dendritic (DenMark) and presynaptic (syt.eGFP) markers.

      Comment 6: The study presents data from unrelated learning paradigms (e.g., olfactory associative learning, courtship conditioning; Fig. 7) without justifying how these paradigms relate to SELTM. Particularly, the authors claimed that SELTM is related to Gr5a, which leads to appetitive memories, which involve PAM dopaminergic neurons and MB horizontal lobes. However, the olfactory associative learning with electric shock and courtship conditioning lead to aversive memories, which involve PPL1 dopaminergic neurons and the vertical lobes.


      Response: We thank the reviewer for requesting clarification on the rationale for including these experiments. The purpose of these assays was to test the specificity of the YL neuron circuit. A key question is whether YL neurons represent a general-purpose LTM circuit or one specialized for a particular memory modality.

      The data show that knockdown of Orb2 or Nmdar2 specifically in YL neurons has no effect on the formation of LTM for aversive olfactory conditioning or aversive courtship conditioning. These negative results are critically important, as they demonstrate that the YL circuit is not required for all forms of LTM. This finding strongly supports our revised central claim that YL neurons are specialized for processing appetitive memories derived from the specific sensory context of mating (i.e., taste and pheromonal cues from Gr5a neurons).

      To improve the narrative flow of the main text, we rearranged the order of presentation. The relevant description is in lines 398-401:

      “To determine whether YL neurons constitute a general LTM circuit or are dedicated to the appetitive context of mating, we tested two canonical aversive paradigms: electric-shock olfactory conditioning and courtship conditioning. If YL neurons serve as a universal LTM module, their genetic impairment should also impair aversive memory.”

      lines 469-472:

      “The inability of YL perturbation to impair aversive memories (Fig. 7) corroborates that this micro-circuit is dedicated to Gr5a-dependent SELTM rather than acting as a generic LTM hub”

      Minor Issues

      Comment 1: Fig 2F. YL projections are labeled as MBONs. Clarify whether YL neurons are the upstream or downstream (MBON) of KCs.

      Response: Thank you for this helpful comment. As reported by Huang et al., 2018 [2] (Nat. Commun. 9:872), the MB093C-GAL4 driver labels the MBON-α3 mushroom body output neuron. Consequently, YL neurons are positioned downstream of the MBON-α3.

      We have now clarified this point in the revised manuscript (lines 217-222):

      “Each of these neurons extends a vertical fiber to the dorsal brain region, where they form dense arbors within the α-lobes of the mushroom body. Because the MB093C-GAL4 driver labels MBON-α3 output neuron[51], these YL arbors are positioned postsynaptically within the α-lobe and relay mushroom-body output to the anterior, middle, and posterior superior-medial protocerebrum.”

      Comment 2: Extensive language polishing is required, as several sentences are unclear (e.g., lines 169-172).

      Response: We apologize for the lack of clarity in the original text. The entire manuscript has undergone extensive revision and professional language editing to improve readability, precision, and grammatical accuracy.

      Responses to Reviewer #2


      Major Comments

      Comment 1: Clearer articulation of the rationale, motivation, and significance of the overall study design and individual experiments can strengthen the manuscript and promote readership. For example, the beginnings of the abstract and introduction should define what authors mean by sexual experience-dependent long-term memory and its significance (including why it is "significant for reproductive success" (lines 46 and 92)). Similarly, employing more concrete language throughout the text will help anchor and contextualize the study. Interpretation is occasionally insufficient or does not follow directly from the data provided.

      Response: We thank the reviewer for this valuable advice. We agree that the motivation and significance of our study were not articulated clearly enough. We have rewritten the Abstract and the beginning of the Introduction to address this. The revised text now explicitly defines SELTM as a protein synthesis-dependent, appetitive memory formed in response to gustatory and pheromonal cues. We explain its significance in the context of adaptive behavior, linking it to interval timing, a process by which male flies strategically adjust their mating investment (i.e., mating duration) based on prior experience to optimize reproductive success and energy expenditure. This framing provides a clearer context for our investigation into its underlying neural and molecular mechanisms.

      Comment 2: Long term memory: I do not work on Drosophila memory, but a cursory search suggests that the field generally considers long term memory in Drosophila to last for 24 hr to days (courtship memory lasts for >24 hr). SMD decays between 12-24 hr after copulation. Could SMD be considered a short-term effect?

      Response: This is an important point of clarification. As described in our response to Reviewer #1 (Major Comment 2), we have performed a new experiment demonstrating that the formation of SMD is blocked by the protein synthesis inhibitor cycloheximide (Figure 1I). This dependence on de novo protein synthesis is a defining characteristic of LTM, distinguishing it from short- and intermediate-term memory forms [1]. Memories lasting 12-24 hours are also well established as forms of LTM [3]. Therefore, based on both its duration and its molecular requirements, SMD represents a bona fide form of LTM.

      The relevant statement is in lines 174-178:

      "To determine whether the persistent reduction in mating duration (SMD) depends on de-novo protein synthesis, we fed males the translational inhibitor cycloheximide (CXM). Under this regimen, CXM completely abolished the SMD phenotype (Fig. 1I). This finding suggests that the reduction in mating investment is contingent upon the formation of LTM."

      Comment 3: Fig 1B-E share the same control (naive) group. If these experiments were performed in the same replicate(s), they should be plotted in the same figure. If not, please provide more details on how experimental blocks were set up and how controls compared between replicates.

      Response: Thank you for this helpful suggestion. We understand that sharing the same naive control across multiple panels (Fig. 1B–E) may raise concerns about data independence. However, we chose to present these panels separately for the following reasons:

      1. Clarity and Readability: Each panel (B–E) represents a distinct temporal condition (0 h, 6 h, 12 h, 24 h post-isolation). Separating them avoids visual clutter and allows readers to focus on one time point at a time, improving interpretability.

      2. Consistency with Internal Controls: Although the naive group is identical across panels, each experimental block (i.e., each isolation time point) was run independently on the same days, with internal controls (naive vs. experienced) included in every block. This ensures that statistical comparisons remain valid within each panel, even if the naive data overlap.

      We have now added a clear statement in the figure legend explaining that the naive group is shared across panels and that each time point was tested independently with internal controls. This maintains transparency while preserving the visual clarity of the current layout.

      Comment 4: Serial mating (Fig 1F-H): please provide details on the methods. How much time elapsed between successive matings? Is a paired statistical test used? Sperm depletion also affects mating duration, and without this information the authors' conclusion (lines 155-156) does not automatically follow from the data.

      Response:

      1. Interval between successive matings: We have rewritten the Methods to state explicitly that “as soon as one copulation ended the male was transferred immediately to a fresh virgin female, so the next mating began immediately.”

      We have added a new Methods subsection:

      "Serial mating duration assay

      Serial mating duration assay was identical to the standard procedure except that each male was presented with four DF virgin females in immediate succession: upon termination of the first copulation the male was immediately put into a fresh chamber containing the next virgin, the timer was restarted at first contact, and this step was repeated until four complete matings were recorded or 5 min elapsed without initiation, whichever came first."

      2. Statistical test

      We apologize for omitting this detail. An unpaired t-test was used: for each male, the mating duration before (naïve) and after sexual experience was recorded, yielding paired observations. Prism’s unpaired t-test module was therefore applied to evaluate the mean difference.
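Since the reviewer asks whether a paired test was used, the following minimal Python sketch contrasts a paired and an unpaired t statistic computed on the same observations. The durations are invented for illustration, the function names are hypothetical, and the unpaired statistic is written in its Welch form (which coincides with the pooled-variance form when both groups have equal size). It is not the analysis performed in the manuscript, which reports Prism's unpaired t-test.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic for a paired design: mean per-animal difference over its standard error."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for b, a in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def unpaired_t(group_a, group_b):
    """Welch-form t statistic for two independent groups."""
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


# Invented mating durations (min) for four hypothetical males, same order in both lists.
naive = [20.0, 22.0, 19.0, 21.0]
experienced = [15.0, 18.0, 14.0, 17.0]

print("paired t:  ", round(paired_t(naive, experienced), 2))
print("unpaired t:", round(unpaired_t(naive, experienced), 2))
```

When the same animal is measured twice, the paired statistic removes between-animal variability, which is why the two values can differ substantially on identical data.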

      The figure legend now states “with error bars representing SEM. Asterisks represent significant differences, as revealed by the Unpaired t test and ns represents non-significant difference (**p

      3. Mating duration versus sperm depletion

      We apologize for not having made it clear that these two observations are complementary, not contradictory. Previous work has shown that when male Drosophila copulate repeatedly, mating duration remains stable even though the number of sperm transferred, and thus the number of progeny sired, declines progressively [4].

      The revised text is as follows (lines 235-241):

      "Previous work has shown that when male Drosophila copulate repeatedly, mating duration remains stable even though the number of sperm transferred—and thus the number of progeny sired—declines progressively. This dissociation confirms that the constant mating duration we observe in our serial-mating experiment (Fig. 1F–H) is consistent with normal sperm depletion and does not compromise the conclusion that the experience-dependent reduction in mating duration reflects long-term memory."

      Thank you for helping us improve the clarity of our study.

      Comment 5: Mating duration assay: Which isolation interval was chosen for the rest of the SMD experiments? The 12 hr en masse mating setup is relatively uncommon among studies on courtship/copulation/post-copulatory phenotypes, and introduces uncertainty and variability in the number and timing of matings that occurred during the 12 hr-window. This source of variability and its implication in interpreting the data should be acknowledged. Moreover, the 3 studies referenced in the methods all house males in groups of 4, whereas this study uses groups of 40. Could density confound the manifestation of SMD?

      Response: We thank the reviewer for these important methodological questions.

      • Isolation Interval: We have clarified in the Methods that virgin females were introduced into vials for the last day before the assay.
      • Housing Density: This is an excellent point. To control for any potential effects of housing density itself, we have clarified that our "naive" control males are also housed in groups of 40 for the same duration as the "experienced" males. Therefore, the only difference between the two groups is the presence of females, isolating the effect of sexual experience from the effect of social density.

      Comment 6: SMD behavior: comparing orb2 mutants and controls (Fig 1M and Fig S1K-L), loss of orb2 actually reduces the mating duration in naive males (mean ~15 min) relative to controls (~20 min), and has possibly no effect on experienced males (~15 min). This is inconsistent with the SMD behavior demonstrated in Fig 1B-E. The same pattern is found for mushroom body silencing (Fig 1P, Fig S1M-N), orb2 knockdown in YL neurons (Fig 2D, Fig S2A-B), Fmr1 knockdown in YL neurons (Fig 3D, Fig S2B, S3D) and most other experiments where mating duration is not significantly different between naive and experienced males. This might demonstrate a separate role of YL neurons and their related circuit in regulating mating duration in naive males. Could the authors discuss this interpretation? As an aside, plotting genetic controls next to experimental groups is customary and facilitates comparisons between relevant groups.

      Response: Thank you very much for this insightful observation.

      1. Baseline differences among genotypes: We agree that absolute mating duration differs slightly between genotypes (e.g. naive orb2∆/+ about 15 min vs. wild-type CS about 20 min). Such differences are common when mutations or transgenes are introduced into distinct genetic backgrounds, and they do not affect the within-genotype comparison that is the essence of SMD (sexual-experience-dependent shortening of mating duration). Therefore, for every experiment we compared naive vs. experienced males of the identical genotype, keeping all other variables constant.

      2. Consistency of SMD across figures

      In every manipulation that disrupts SMD memory (orb2∆, MB silencing, orb2-RNAi in YL neurons, Fmr1-RNAi in YL neurons, etc.) the naive–experienced difference disappears, whereas the genetic controls retain a significant ΔMD. This is fully consistent with Fig. 1B–E and demonstrates that the memory trace, not the basal duration, is abolished.

      3. Figure layout

      Following your suggestion, we have re-ordered all bar graphs so that the relevant genetic controls are placed immediately adjacent to the experimental groups, making within-panel comparisons easier.

      We hope these clarifications and adjustments address your concerns.

      Comment 7: Bitmap figures: unfortunately the bitmap figures are compressed and their resolution makes it difficult to evaluate the visual evidence.

      Response: We apologize for the poor quality of the figures. All figures in the revised manuscript, including the scRNA-seq plots, have been remade as high-resolution vector graphics to ensure clarity and detail. To aid interpretation, colour-coded illustrations are also placed next to the scRNA-seq plots.

      Comment 8: Sexual dimorphism of YL neurons: many neurons involved in sexual behaviors express dsx and/or fru. Do YL neurons express them?

      Response: This is an excellent question. To address it, we performed a new set of experiments using genetic intersectional tools to test for the expression of doublesex (dsx) and fruitless (fru) in YL neurons. Our analysis, presented in figure supplement 2B, reveals that YL neurons are indeed fru-negative and dsx-negative. We therefore conclude that YL neurons do not belong to the canonical fru- or dsx-expressing neuronal classes and are unlikely to be intrinsically sex-specific.

      We have added an explanation in lines 223-229:

      "Our further analysis confirmed the presence of only three pairs of nuclei near the SOG in male brains, whereas female brains exhibit a greater number of nuclei near the AL (Fig. 2I), suggesting subtle sexual dimorphisms in GAL4MB093C-expressing neurons. Importantly, these neurons do not overlap with either fru- or dsx-expressing cells: co-immunostaining for GFP and Fru or Dsx revealed almost no colocalization in any brain region examined (Fig. S2B), indicating that YL neurons are distinct from the canonical sex-specific fru/dsx circuits."

      Comment 9: Genetic controls for some crucial experiments are not provided, e.g. Fig 2J, Fig S3C, Fig S3E-F Fig 5B-C, F, Q-R, Fig S5A-E.

      Response: We thank the reviewer for their careful attention to detail. We have now performed all the missing genetic control experiments.

      Comment 10: Colocalization experiments: please provide more detail on how fluorescence is normalized for each channel across images, especially when the overall expression of an effector is up- or down-regulated after mating.

      Response: We have updated the Methods section under "Quantitative Analysis of Fluorescence Intensity" and "Colocalization Analysis" to provide a detailed description of our normalization procedure.
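As a rough illustration of the reference-channel normalization described for the TRIC images in the response to Reviewer #1 (GFP background-corrected, then divided by the RFP signal from the same construct), a minimal Python sketch follows. The function name, the decision to background-correct the RFP channel as well, and the example intensities are assumptions made for illustration; the authors' actual protocol is the one given in their Methods.

```python
def normalized_gfp(gfp_roi, rfp_roi, gfp_background, rfp_background):
    """Return background-corrected GFP expressed relative to the RFP reference.

    All arguments are mean intensities (arbitrary units) measured in the same
    image with identical acquisition settings. Subtracting background from the
    RFP channel too is an assumption of this sketch.
    """
    gfp = max(gfp_roi - gfp_background, 0.0)  # clip negative values after correction
    rfp = rfp_roi - rfp_background
    if rfp <= 0:
        raise ValueError("reference (RFP) signal must exceed its background")
    return gfp / rfp


# Hypothetical measurements: ROI GFP 150, ROI RFP 220, backgrounds 50 and 20.
print(normalized_gfp(150.0, 220.0, 50.0, 20.0))
```

Expressing GFP as a ratio to the co-expressed reference makes values comparable across brains that differ in construct expression or mounting thickness, which is the rationale given in the response.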

      Comment 11: Please resolve this apparent contradiction on the expression of Nmdar1 and 2 in YL neurons. On line 261: "both receptors co-expressing in Orb2-positive MB Kenyon cells"; on line 279-281 "Nmdar1 is not expressed with YL neurons [...] whereas Nmdar2 is expressed in a single pair of YL neurons..."

      Response: We apologize for this contradiction, which arose from the confusing narrative structure of the original manuscript. As detailed in our response to Reviewer #1 (Major Comment 4), we have reframed the manuscript.

      Comment 12: Particle analysis (Fig 6H-J): experienced males seem to have more synapses but trend towards smaller average size. It would be helpful to show number of synapses and average size as paired data, or show that the total particle area is larger in experienced males.

      Response: We agree with the reviewer that this analysis was inconclusive and potentially misleading due to the limitations of image resolution. As noted in our response to Reviewer #1, we have removed this particle analysis (former Fig 6H-J) from the revised manuscript. Our claim for increased synaptic plasticity is now supported by the more robust measurement of the total fluorescence area of the pre- and postsynaptic markers, which shows a significant increase in experienced males.

      Minor Comments

      We thank the reviewer for their meticulous attention to detail. We have addressed all minor comments as follows:

      Comment 1: 1. Some figures (e.g. Fig 3M-R) and experiments (e.g. oenocyte scRNA-seq) are not referenced in the text. dnc data is shown alongside amn and rut but the rationale for its inclusion is not provided.

      Response: Original Fig. 3M-R (now Fig. 3M-O) was referenced on line 283. The rationale for including dnc data (as a canonical memory mutant) is now clarified in the text on lines 187-189:

      "To ask whether the same molecular machinery underlies the SMD that follows sexual experience, we tested three classical memory mutants: dunce (dnc), amnesiac (amn), and rutabaga (rut)."

      Comment 2: Some references might not point to the intended article (e.g. ref 123).

      Response: The reference list has been checked and corrected.


      Comment 3. Please plot genetic controls next to experimental genotypes as they are a crucial part of the experiment.


      Response: All relevant figures now include plots of genetic controls next to experimental genotypes.

      Comment 4. The "estimation statistics" plots are not necessary since the authors show individual data points. To further enhance data transparency, the authors may consider reducing the alpha and/or dot size so the individual data points are more readily visible.

      Response: Thank you for this helpful suggestion! We fully agree that data transparency is essential. After carefully testing lower alpha values and smaller dot sizes, we found that either change markedly obscured the dense regions of the distributions. We therefore left the alpha and dot size unchanged.

      The estimation-statistics overlays are kept for two reasons: (i) they provide an immediate visual estimate of the mean difference and its 95% confidence interval, which is the key statistic we base our conclusions on, and (ii) they spare readers from having to cross-reference separate tables.


      Comment 5. For accessibility, please avoid using green and red in the same plot.

      Response: We fully agree that red–green combinations can be problematic for colour-vision-impaired readers. In the present manuscript, however, the only panel that juxtaposes pure red and pure green is the Fly-SCOPE co-expression data. These scRNA-seq plots are provided only as supportive reference; the actual quantitative conclusions are based on independent genetic and imaging experiments that use magenta, cyan, yellow, and greyscale palettes. Moreover, the SCope images are accompanied by detailed text descriptions of the overlapping cell clusters, so no essential information is lost even if the colours are indistinguishable.

      Comment 6. Fly Cell Atlas: please show color scales used for each gene as the color thresholds are gene-specific by default. The 3-color overlap on SCope also makes it very difficult to see the expression pattern for each gene. One possibility is outlining the Kenyon cells on the tSNE plots and showing the expression for each gene of interest.

      Response: Thank you for this helpful suggestion. To avoid the ambiguity that arises from RGB blending in the three-colour overlay, we have added a small colour-mixing diagram next to the t-SNE plots (revised Fig. 1). This key shows the exact hues produced by pairwise and three-way overlaps:

      • Red + Green = Yellow

      • Red + Blue = Magenta

      • Green + Blue = Cyan

      • Red + Green + Blue = White

      Thus, yellow, magenta or cyan dots indicate co-expression of two genes, while white dots mark cells where all three genes are detected. This diagram allows readers to interpret overlap colours at a glance without re-entering SCope.
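The overlap key listed above is ordinary additive RGB mixing, which can be sketched in a few lines of Python. The channel assignment and helper names are hypothetical, and the sketch assumes fully saturated channels; real SCope overlays use gene-specific intensity scales, so observed hues will be graded rather than pure.

```python
# Pure, fully saturated channels (an idealisation of the three-gene overlay).
CHANNELS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}

# Names of the colours that result from additive mixing of pure channels.
COLOUR_NAMES = {
    (255, 0, 0): "red",
    (0, 255, 0): "green",
    (0, 0, 255): "blue",
    (255, 255, 0): "yellow",
    (255, 0, 255): "magenta",
    (0, 255, 255): "cyan",
    (255, 255, 255): "white",
}


def mix(*channel_names):
    """Add the selected channels component-wise, saturating at 255."""
    mixed = [0, 0, 0]
    for name in channel_names:
        for i, value in enumerate(CHANNELS[name]):
            mixed[i] = min(mixed[i] + value, 255)
    return tuple(mixed)


def overlap_colour(*channel_names):
    """Name of the colour shown where the given channels overlap."""
    return COLOUR_NAMES[mix(*channel_names)]


for combo in [("red", "green"), ("red", "blue"), ("green", "blue"), ("red", "green", "blue")]:
    print(" + ".join(combo), "=", overlap_colour(*combo))
```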

      Comment 7. Please also refer to Fly Cell Atlas as such. SCope is a visualization platform that houses multiple datasets.

      Response: The reference to Fly Cell Atlas was added.

      Comment 8. Please introduce acronyms and genetic reagents the first time they are mentioned.

      Response: All acronyms and genetic reagents are now defined upon their first use.

      Comment 9. Line 184: please specify "split-GAL4 reagents" instead of "advanced genetic tools".

      Response: We have replaced "advanced genetic tools" with the more specific term "Split-GAL4 reagents."


      Comment 10. Line 187: there are a few other lines with p>0.05 or p>0.01, so "uniquely" is inaccurate. Are the p-values in Table 1 corrected for multiple testing?

      Response: The term "uniquely" has been revised for accuracy. No correction for multiple testing was applied because each entry in Table 1 represents a single pairwise comparison (naive vs. exp). Thus only one p-value was generated per experiment.
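For context on the reviewer's question, the sketch below shows what a family-wise correction across several pairwise comparisons could look like, using Holm's step-down adjustment. The p-values are invented and the function name is hypothetical; this is an illustration of the technique the reviewer refers to, not an analysis that the response says was performed.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


# Invented p-values standing in for three independent table entries.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

Whether such a correction is appropriate depends on whether the table entries are treated as one family of hypotheses (for example, a screen) or as separate experiments, which is the point at issue between the reviewer and the authors.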

      Comment 11. Some immunofluorescence panels lack scale bars.

      Response: Scale bars have been added to all immunofluorescence panels.


      Comment 12. Fig S2G-I: do authors mean "naive" instead of "group"?

      Response: The term "group" in Fig S2G-I has been corrected to "naive."

      Comment 13. Movie 1 should be referenced when YL neurons are first introduced.

      Response: Movie 1 is now referenced when YL neurons are first introduced in the text.

      Comment 14. Is Fig 4L similar to Fig 6L-N?

      Response: This error has been corrected after the article was reformatted.

      Comment 15. Fig 7: please plot olfactory conditioning experiment results as either percentages, preference index, or paired numbers. "Number of flies/tube" is not as informative.

      Response: Thank you for pointing this out. The bars in Fig. 7 indeed represent paired numbers, but we realise this was not stated explicitly. We apologize for the lack of clarity. In the revised manuscript we explain this in detail in the figure legend and Methods. In the figure, we have also marked the percentage of flies that chose to avoid the side of the stimulus with gas, and this is explained in the figure legend.
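To make the reviewer's suggested summaries concrete, the following minimal Python sketch converts paired counts from a two-choice assay into an avoidance percentage and a preference index. The index shown, (avoiding minus approaching) divided by the total, is a commonly used definition and is an assumption here; the manuscript's own definition, the function names, and the example counts are not taken from the source.

```python
def avoidance_percentage(n_avoid, n_approach):
    """Percentage of flies found on the side away from the conditioned stimulus."""
    total = n_avoid + n_approach
    if total == 0:
        raise ValueError("no flies counted")
    return 100.0 * n_avoid / total


def preference_index(n_avoid, n_approach):
    """Index in [-1, 1]: +1 means all flies avoided, 0 means no preference."""
    total = n_avoid + n_approach
    if total == 0:
        raise ValueError("no flies counted")
    return (n_avoid - n_approach) / total


# Hypothetical tube: 30 flies avoid the stimulus side, 10 choose it.
print(avoidance_percentage(30, 10))
print(preference_index(30, 10))
```

Both summaries normalize for the number of flies per tube, which is why they are more informative than raw counts when tube sizes differ.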




      References

      1. Lagasse F, Devaud J-M, Mery F. A Switch from Cycloheximide-Resistant Consolidated Memory to Cycloheximide-Sensitive Reconsolidation and Extinction in Drosophila. J Neurosci. 2009;29: 2225–2230. doi:10.1523/jneurosci.3789-08.2009
      2. Huang C, Maxey JR, Sinha S, Savall J, Gong Y, Schnitzer MJ. Long-term optical brain imaging in live adult fruit flies. Nat Commun. 2018;9: 872. doi:10.1038/s41467-018-02873-1
      3. Tonoki A, Davis RL. Aging Impairs Protein-Synthesis-Dependent Long-Term Memory in Drosophila. J Neurosci. 2015;35: 1173–1180. doi:10.1523/jneurosci.0978-14.2015
      4. Macartney EL, Zeender V, Meena A, De Nardo AN, Bonduriansky R, Lüpold S. Sperm depletion in relation to developmental nutrition and genotype in Drosophila melanogaster. Evol Int J Org Evol. 2021;75: 2830–2841. doi:10.1111/evo.14373
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Sun et al. show that Orb2-expressing, glutamatergic mushroom body neurons (YL neurons) are central to the "shorter mating duration (SMD)" behavior, where males reduce their mating duration up to 12 hours after the initial copulation. The authors use SMD as a model for understanding sexual experience-dependent long-term memory in males. A few genes implicated in long-term memory (Fmr1, CrebB) are required in YL neurons for SMD. The Nmdar-CaMKII signaling pathway is also implicated, and mating attenuates Ca2+ signaling and increases synaptic plasticity in the mushroom body and subesophageal ganglion.

      Major comments:

      1. Clearer articulation of the rationale, motivation, and significance of the overall study design and individual experiments can strengthen the manuscript and promote readership. For example, the beginnings of the abstract and introduction should define what authors mean by sexual experience-dependent long-term memory and its significance (including why it is "significant for reproductive success" (lines 46 and 92)). Similarly, employing more concrete language throughout the text will help anchor and contextualize the study. Interpretation is occasionally insufficient or does not follow directly from the data provided.
      2. Long term memory: I do not work on Drosophila memory, but a cursory search suggests that the field generally considers long term memory in Drosophila to last for 24 hr to days (courtship memory lasts for >24 hr). SMD decays between 12-24 hr after copulation. Could SMD be considered a short-term effect?
      3. Fig 1B-E share the same control (naive) group. If these experiments were performed in the same replicate(s), they should be plotted in the same figure. If not, please provide more details on how experimental blocks were set up and how controls compared between replicates.
      4. Serial mating (Fig 1F-H): please provide details on the methods. How much time elapsed between successive matings? Is a paired statistical test used? Sperm depletion also affects mating duration, and without this information the authors' conclusion (lines 155-156) does not automatically follow from the data.
      5. Mating duration assay: Which isolation interval was chosen for the rest of the SMD experiments? The 12 hr en masse mating setup is relatively uncommon among studies on courtship/copulation/post-copulatory phenotypes, and introduces uncertainty and variability in the number and timing of matings that occurred during the 12 hr-window. This source of variability and its implication in interpreting the data should be acknowledged. Moreover, the 3 studies referenced in the methods all house males in groups of 4, whereas this study uses groups of 40. Could density confound the manifestation of SMD?
      6. SMD behavior: comparing orb2 mutants and controls (Fig 1M and Fig S1K-L), loss of orb2 actually reduces the mating duration in naive males (mean ~15 min) relative to controls (~20 min), and has possibly no effect on experienced males (~15 min). This is inconsistent with the SMD behavior demonstrated in Fig 1B-E. The same pattern is found for mushroom body silencing (Fig 1P, Fig S1M-N), orb2 knockdown in YL neurons (Fig 2D, Fig S2A-B), Fmr1 knockdown in YL neurons (Fig 3D, Fig S2B, S3D) and most other experiments where mating duration is not significantly different between naive and experienced males. This might demonstrate a separate role of YL neurons and their related circuit in regulating mating duration in naive males. Could the authors discuss this interpretation? As an aside, plotting genetic controls next to experimental groups is customary and facilitates comparisons between relevant groups.
      7. Bitmap figures: unfortunately the bitmap figures are compressed and their resolution makes it difficult to evaluate the visual evidence.
      8. Sexual dimorphism of YL neurons: many neurons involved in sexual behaviors express dsx and/or fru. Do YL neurons express them? If they do, they might be a subset of characterized and named dsx/fru neurons.
      9. Genetic controls for some crucial experiments are not provided, e.g. Fig 2J, Fig S3C, Fig S3E-F Fig 5B-C, F, Q-R, Fig S5A-E.
      10. Colocalization experiments: please provide more detail on how fluorescence is normalized for each channel across images, especially when the overall expression of an effector is up- or down-regulated after mating.
      11. Please resolve this apparent contradiction on the expression of Nmdar1 and 2 in YL neurons. On line 261: "both receptors co-expressing in Orb2-positive MB Kenyon cells"; on line 279-281 "Nmdar1 is not expressed with YL neurons [...] whereas Nmdar2 is expressed in a single pair of YL neurons in both male and female brains".
      12. Particle analysis (Fig 6H-J): experienced males seem to have more synapses but trend towards smaller average size. It would be helpful to show number of synapses and average size as paired data, or show that the total particle area is larger in experienced males.

      Minor comments:

      1. Some figures (e.g. Fig 3M-R) and experiments (e.g. oenocyte scRNA-seq) are not referenced in the text. dnc data is shown alongside amn and rut but the rationale for its inclusion is not provided.
      2. Some references might not point to the intended article (e.g. ref 123).
      3. Please plot genetic controls next to experimental genotypes as they are a crucial part of the experiment.
      4. The "estimation statistics" plots are not necessary since the authors show individual data points. To further enhance data transparency, the authors may consider reducing the alpha and/or dot size so the individual data points are more readily visible.
      5. For accessibility, please avoid using green and red in the same plot.
      6. Fly Cell Atlas: please show color scales used for each gene as the color thresholds are gene-specific by default. The 3-color overlap on SCope also makes it very difficult to see the expression pattern for each gene. One possibility is outlining the Kenyon cells on the tSNE plots and showing the expression for each gene of interest.
      7. Please also refer to Fly Cell Atlas as such. SCope is a visualization platform that houses multiple datasets.
      8. Please introduce acronyms and genetic reagents the first time they are mentioned.
      9. Line 184: please specify "split-GAL4 reagents" instead of "advanced genetic tools".
      10. Line 187: there are a few other lines with p>0.05 or p>0.01, so "uniquely" is inaccurate. Are the p-values in Table 1 corrected for multiple testing?
      11. Some immunofluorescence panels lack scale bars.
      12. Fig S2G-I: do authors mean "naive" instead of "group"?
      13. Movie 1 should be referenced when YL neurons are first introduced.
      14. Is Fig 4L similar to Fig 6L-N?
      15. Fig 7: please plot olfactory conditioning experiment results as either percentages, preference index, or paired numbers. "Number of flies/tube" is not as informative.

      Significance

      The manuscript describes an extensive and comprehensive set of experiments aimed at elucidating the role of a subset of mushroom body neurons in mediating a male post-mating sexual behavior, which the authors use as a model for sexual experience-dependent long-term memory. Long-term post-mating responses in females have been well characterized in Drosophila and other insects, but post-mating long-term memory in males is less well understood despite a few studies reporting its importance in mating success. How males adjust their mating duration based on internal and external cues can reveal insights about decision making and interval timer mechanisms. This study represents a functional advancement in the neuronal and molecular mechanisms of how memory and experience regulate a sexual behavior in male Drosophila. Overall, the manuscript would benefit significantly from general editing for clearer articulation of rationale and more appropriate interpretations of data. Higher resolution versions of bitmap figures are also crucial. The SMD experiments invite an alternative interpretation of data that centers on YL neurons' role in regulating mating duration in naive males, which, alongside other roles of the mushroom body demonstrated in this manuscript, could add more depth to the study.

      The findings in this manuscript will be of interest to a specialized audience interested in memory, neural circuits of behavior, and Drosophila sexual behavior. I work on Drosophila sexual behavior and circuits, but lacking experience in memory research, I am not as familiar with the mushroom body and conditioning experiments.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study explores the molecular and neural circuitry mechanisms underlying sexual experience-dependent long-term memory (SELTM) in male Drosophila. The authors use behavioral, imaging, and bioinformatics approaches to identify YL neurons, a subset of mushroom body (MB) projecting neurons, as crucial for SELTM formation. They propose that YL neurons receive inputs from WG neurons via the sNPF-sNPFR pathway and implicate molecular players such as orb2, fmr1, MDAR2-CaMK, and synaptic plasticity in their function.

      However, the evidence presented does not adequately support the authors' claims. The data fail to cohesively tell a logical story, and key conclusions appear to be based on assumptions and correlations rather than robust evidence.

      Major comments:

      1. The study identifies the knowledge gap (lines 103-104) but fails to integrate relevant literature, particularly Shohat-Ophir et al., Science (2012), and Zer-Krispil et al., Curr Biol (2018). These studies established that ejaculation induces appetitive memory in male Drosophila via corazonin and NPF neurons. The current study does not provide direct evidence that the "act of mating itself" drives SELTM, as it includes both courtship and copulation.
      2. The nature of the observed long-lasting reduced mating duration requires clearer characterization: Is this an associative memory or experience-dependent behavioral plasticity? Can the formation of this long-term memory be blocked by protein synthesis inhibitors, such as cycloheximide?
      3. While schematics illustrate the working hypotheses, the text lacks detailed explanations, leaving the reader unclear about the rationale behind certain conclusions.
      4. The logic to draw certain conclusions was confusing and misleading.
        • For instance, the role of orb2 in SELTM is examined via knockdown in MB Kenyon cells (KCs) (using ok107>orb2-RNAi), which is irrelevant to the claim that orb2 functions in YL neurons. Additionally, RNAseq analyses (Fig. 1N-S) focusing on orb2 expression in a/b KCs are irrelevant to and cannot support the claim that Orb2 functions in YL neurons.
        • Similarly, the claim (lines 302-303) that sNPF-R expression is exclusive to MB KCs conflicts with data showing effects when sNPF-R is knocked down in YL neurons. How can knocking-down a gene, which is exclusively expressed in neural population A, in neural population B affect a phenotype? This inconsistency undermines the interpretation of the results.
        • Other examples include lines 223-227 and lines 246-249. It is unclear how the authors arrived at these interpretations.
        • The authors also kept confusing the readers and themselves by mistakenly referring to MB KC a-lobe and YL a-lobe projection. They may know the difference between the two neural populations but they did not always refer to the right one in the text.
      5. The imaging figures provided are unfocused and poorly resolved, making it difficult to assess data quality.
        • Colocalization analyses of orb2 and YL are unconvincing, especially given that orb2 is well-documented in literature as expressed in MB a-KCs and YL projection wrapping MB a-lobe. Maximum intensity projection images are insufficient for confirming colocalization; complete image stacks with staining of orb2, YL, and KCs (MB-dsRed) are needed for validation.
        • Quantification of imaging data appears flawed. For example, claims of orb2 and CaMKII upregulation in MB a-lobe projections (e.g., Fig. S2F-J, Fig. 3M,N) are confounded by widespread increases in intensity across the brain, lacking specificity.
        • The TRIC experiment analysis should normalize GFP signals to internal reference channel (RFP in the TRIC construct), as per established protocols in the original paper.
        • In Fig. 6H-J, methods for counting synapse numbers are not described. How are synapse numbers counted in these low-resolution images?
      6. The study presents data from unrelated learning paradigms (e.g., olfactory associative learning, courtship conditioning; Fig. 7) without justifying how these paradigms relate to SELTM. Particularly, the authors claimed that SELTM is related to Gr5a, which leads to appetitive memories, which involve PAM dopaminergic neurons and MB horizontal lobes. However, the olfactory associative learning with electric shock and courtship conditioning lead to aversive memories, that involve PPL1 dopaminergic neurons and the vertical lobes.
      7. Some figures are not referred to in the text. For example, Fig S1 K and L (also, what's the difference between these two figures?) and Fig 3M-R. What is MB-V3 in Fig 4J-K?

      Minor issues

      1. Fig 2F. YL projections are labeled as MBONs. Clarify whether YL neurons are the upstream or downstream (MBON) of KCs.
      2. Extensive language polishing is required, as several sentences are unclear (e.g., lines 169-172).

      Significance

      This study potentially advances our understanding of how sexual experience modifies future mating behaviors. While previous work has shown that mating induces appetitive memory in males, the mechanisms linking this memory to future mating behavior remain poorly understood. This work could provide valuable insights into these mechanisms, pending appropriate revisions.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The manuscript presents IGNITE (Inference of Gene Networks using Inverse kinetic Theory and Experiments), an unsupervised machine learning framework for constructing gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. IGNITE utilizes a kinetic inverse Ising model to infer gene interactions from binarized expression data and can predict genetic perturbation effects, such as those from knockout experiments. Although the application of inverse Ising models to network reconstruction is not entirely novel, IGNITE's specific implementation and its application to single-cell RNA sequencing data represent a new development. The method is tested on the transition from naive to formative states in murine pluripotent stem cells, a system the authors are highly knowledgeable about, and its performance is compared to state-of-the-art alternative methods.

      Major concerns

      My concern regards the generality of the method, particularly the entire pipeline presented, and the fairness of the performance comparison. These concerns can be easily addressed by the authors by better explaining their choices and their general applicability, and by toning down the conclusions about the comparison with existing inference methods.

      The pre-processing steps are extensive, and their rationale is not always clear, though the results heavily depend on this analysis. Several steps appear to involve arbitrary choices optimized for specific outcomes, potentially introducing biases. The authors should better explain the rationale behind their choices to mitigate these concerns.

      Specifically, part of the pipeline seems to be built to reproduce a specific expression pattern of 24 genes that some of the authors discovered in a previous paper. Although this prior knowledge could be useful and relevant in this specific system, it could limit the generality of the method. For example, the authors selected approximately 2000 genes based on prior knowledge and used a combination of t-SNE and UMAP for dimensionality reduction (although the two techniques have a similar goal). This specific combination seems to reproduce the pseudotime alignment the authors were expecting to find, but such prior information might not be available in general. Therefore, feature selection and the methods used to project data need more justification, especially if the goal is to create a general tool applicable across different biological systems.

      Analogously, the clustering seems manually adjusted to match known expression patterns of 24 relevant genes, rather than being the result of an optimized clustering method. Additionally, the clusters overlap with different time points, raising concerns about potential batch effects. These issues should be addressed to strengthen the validity of the method.

      The claims about the comparison with existing methods should be toned down. While the comparisons are useful and interesting, they might be biased due to the method's fine-tuning for the specific system studied. The claim that the model requires only scRNA-seq data is misleading, as strong prior biological knowledge was used to select, for example, the genes analyzed.

      Significance

      The manuscript is scientifically sound, clearly written, and deserves publication. The proposed method is quantitative, novel, theoretically grounded, and was tested in detail with appropriate null models and statistical methods. Moreover, IGNITE can be applied to various biological systems as the availability of scRNA-seq datasets is continuously growing. The paper will be of interest to a broad community of computational biologists and biology labs interested in gene regulation using scRNA-seq data.

      The limitation, in my opinion, is the method's (particularly the pre-processing pipeline) fine-tuning for the specific biological system tested. Testing IGNITE on another biological system without pre-selected pre-processing steps or detailed biological priors would be more convincing and make the paper's conclusions much stronger. The comparison with other methods also may be slightly biased due to this fine-tuning.

      My background is in statistical physics, with expertise in biological physics, specifically in mathematical modeling and data analysis in molecular biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Corridori et al. introduce IGNITE, a computational framework to infer gene regulatory networks (GRNs) from scRNA-seq data leveraging the kinetic Ising model, which can be used to simulate synthetic gene expression and perform in-silico knockout experiments. Other similar frameworks exist, but none combine these three aspects. The authors have generated a scRNA-seq dataset of murine ESC differentiation, which they use to compare their method with others. Specifically, they show that they can infer known regulatory interactions, that they can generate data similar to the original, and that the method can potentially predict gene expression changes in transcription factor knock-out perturbations.

      Major comments:

      • Many of the authors' claims are backed by qualitative results and not properly quantified. In Fig2, authors qualitatively compare correlations between genes for the original data and their prediction. Instead of just visualizing them, they should compute and report the Spearman correlation between the original expression and the predicted one. The Fraction of Agreement is not a good metric to compare knockout predictions since it is completely dependent on the class imbalance of signs. For example, if the selected genes are 75% positive and 25% negative, a naive predictor that only outputs positive predictions will still have a high score. Instead, the authors should quantify this with Spearman correlation or RMSE and compare across methods. In FigS4a-b the authors qualitatively claim that other methods could not predict the expected cell composition; they should quantify this and report the values across methods. When comparing against the ground truth network, the fraction of correctly inferred interactions is technically the same as precision but ignores recall. I suggest the authors compute precision, recall and a combined F1 score to compare the evaluated methods. Authors claim that the method is scalable to a larger number of genes but no data are provided; they should show how their method compares to others, in terms of memory usage and running time, when using different numbers of cells and genes.
      • The authors need to better describe which tests were performed when talking about significance, which thresholds and which corrections, if any, were employed.
      • To reduce the number of dimensions of scRNA-seq data the authors use t-SNE and then from the obtained result UMAP to project the data into a lower dimensional space. This is fundamentally wrong since distances are not well preserved in t-SNE. Instead the authors should first employ PCA and then UMAP. Additionally, the authors use UMAP distances in the Slingshot pseudotime calculation. Similar to t-SNE, UMAP distances have no real meaning and should only be used for visualization purposes. Instead, the authors should provide Slingshot the obtained PCA embeddings.
      • Dictys (PMID: 37537351) is a known GRN inference method that also can simulate gene expression but is missing in the benchmark, the authors should add it to the method comparison.
      • The current manuscript is not reproducible since it is missing the method's code, the code to reproduce the figures and the generated scRNA-seq data.
      • Authors claim that the method is scalable to a larger number of genes but no data is provided to back this claim. They should show how their method compares to others when using a different number of cells and number of genes.
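
      The point about class imbalance and about precision versus recall can be illustrated with a minimal sketch. All values below are hypothetical and the function names are illustrative only; they are not taken from the manuscript or from any of the benchmarked tools.

```python
def fraction_of_agreement(true_signs, pred_signs):
    """Share of genes whose predicted sign of change matches the observed sign."""
    matches = sum(t == p for t, p in zip(true_signs, pred_signs))
    return matches / len(true_signs)


def precision_recall_f1(reference_edges, predicted_edges):
    """Precision, recall and F1 of a predicted edge set against a reference edge set."""
    reference_edges = set(reference_edges)
    predicted_edges = set(predicted_edges)
    true_positives = len(reference_edges & predicted_edges)
    precision = true_positives / len(predicted_edges) if predicted_edges else 0.0
    recall = true_positives / len(reference_edges) if reference_edges else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical knockout readout: 75% of genes go up (+1), 25% go down (-1).
observed = [+1] * 15 + [-1] * 5
# Naive predictor that ignores the data and always predicts "up".
always_up = [+1] * len(observed)
print(fraction_of_agreement(observed, always_up))

# Hypothetical reference and inferred interactions as (regulator, target) pairs.
reference = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
inferred = {("A", "B"), ("D", "A")}
print(precision_recall_f1(reference, inferred))
```

      In this toy case the uninformative predictor reaches an agreement of 0.75, and the inferred network has a precision of 0.5 while recovering only a quarter of the reference interactions, which is why reporting precision alone can be misleading.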

      Minor points:

      • In the introduction, authors mention multimodal GRN inference methods but do not provide any references.
      • In Table 1, CellOracle is annotated as not being able to do multiple KO which is wrong. Additionally, the authors mention that IGNITE uses no prior knowledge which is not really true since it requires pseudotime ordering. The authors should add a column to Table 1 whether methods require pseudotime.
      • It is unclear what the dashed arrow of Fig1b means. Moreover, plotting gene expression values on top of UMAPs can be misleading, instead authors should plot the gene expression distributions binned by pseudotime.
      • The authors report a p-value of 1.04×10⁻¹⁷¹ which is below detection limit (see PMID: 30921532). Authors should change it to an interval such as p < 2.2×10⁻¹⁶.
      • To make CellOracle results easier to interpret and more comparable, authors should run it at the atlas level instead of at the cell type level, this way generating only one GRN. This can be achieved by assigning the same cluster label to all cells.
      • Experimental values in FigS3b seem to have been repeated and do not match the previous ones for IGNITE and SCODE.
      • It is unclear what the different circles mean in Fig5b.
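
      As an illustration of the p-value reporting suggested above, the following is a minimal sketch of a formatting helper. It assumes that the 2.2×10⁻¹⁶ bound corresponds to double-precision machine epsilon; the helper name and the choice of floor are assumptions, not part of the manuscript.

```python
import sys

# Assumption: the reporting floor is double-precision machine epsilon (about 2.2e-16).
P_FLOOR = sys.float_info.epsilon


def format_p_value(p, floor=P_FLOOR):
    """Report p-values below the floor as an upper bound instead of a point value."""
    if p < floor:
        return f"p < {floor:.1e}"
    return f"p = {p:.2e}"


print(format_p_value(1.04e-171))
print(format_p_value(0.003))
```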

      Significance

      This manuscript is an incremental and methodological work for specialized audiences. Its strengths are that the authors employ kinetic Ising model for GRN inference and that they provide a single framework capable of inferring, simulating and perturbing gene expression. The main limitations are that the claims should be better quantified and that the code and data need to be made accessible.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Corridori and colleagues propose IGNITE, a novel method to recover Gene Regulatory Networks (GRN) from single cell RNA-sequencing (scRNA-seq) data. Their method solves the inverse Ising problem, generating a cohort of candidate GRNs and optimising it to minimise the difference to the input expression matrix. Authors report that IGNITE is able to predict wild type data and simulate both single and multiple gene knockouts. Authors benchmark this method on an in-house data set of differentiating pluripotent stem cells (PSC). They focus on a small set of genes known to be involved in PSC differentiation into formative cells. Authors benchmark IGNITE against state of the art tools (SCODE, MaxEnt and CELLORACLE). They evaluate IGNITE's ability to predict wild type gene expression by comparing their data with experimental data and with SCODE. They conclude the tool has generative capacity comparable with SCODE. They also evaluate IGNITE's ability to recover known interactions with respect to other tools, without finding it to significantly outperform them.

      Major comments

      • Are the key conclusions convincing?

      Conclusions appear convincing although model generalizability could be shown in a more thorough manner. For instance, analysing some other publicly available dataset could help demonstrate the effects of hyperparameters on GRN predictions and their robustness across different experiments.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Claims are well supported by data.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I think the work would benefit from an additional benchmark on a different cellular system. This experiment would show how hyperparameters generalise across datasets and would provide potential users with insights into how to tweak them.

      Also, how does the model scale with the number of genes? A benchmark on computation time and resources required to infer GRN of growing size would be valuable in the adoption of this tool.

      In addition, I think the GRN comparison benchmark presented in section (3.4) would benefit from a quantitative discussion. Authors show inferred GRNs in Figure 4 and S5. For instance, measuring matrix similarity (when appropriate) would help in understanding how the predicted GRNs compare. I understand authors attempt to do so by focusing on validated interactions and computing the fraction of correctly inferred interactions (FCI) but I think a measurement of the overall similarity (eg. Pearson correlation) would add to this.

      Another comment regards the dependency between Correlation Matrices Distance (CMD) and FCI, shown in Figure 5. I understand that the IGNITE GRNs that maximise FCI are not the same as those that minimise CMD. However, it looks like the GRNs that maximise FCI carry more biological information. I wonder whether optimization for one or the other metric could be left to the end user as a tunable parameter.
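
      One way such an overall similarity could be computed is sketched below, assuming each inferred GRN is available as a square interaction matrix over the same ordered gene set. The matrices are hypothetical, and excluding the diagonal is an assumption made here for illustration rather than something stated in the manuscript.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant matrix")
    return cov / math.sqrt(var_x * var_y)


def off_diagonal(matrix):
    """Flatten a square matrix, skipping self-interactions (assumption: they are not compared)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def grn_similarity(grn_a, grn_b):
    """Pearson correlation between the off-diagonal entries of two interaction matrices."""
    return pearson(off_diagonal(grn_a), off_diagonal(grn_b))


# Hypothetical 3-gene interaction matrices from two inference methods.
method_1 = [[0.0, 0.8, -0.5], [0.2, 0.0, 0.0], [-0.7, 0.4, 0.0]]
method_2 = [[0.0, 0.6, -0.4], [0.0, 0.0, 0.1], [-0.9, 0.5, 0.0]]
print(grn_similarity(method_1, method_2))
```

      A rank-based alternative such as Spearman correlation could be substituted if the interaction strengths of different methods are not on comparable scales.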

      Authors should discuss why the expression of some genes does not follow the expected trends (Fig 1C vs Fig S1A). Out of the 24 genes they select for their analysis, at least four do not follow the expected trends: Sox2, according to literature, is a Naive gene, however, in Figure 1C its gene expression pattern is more similar to Formative late genes. Other genes with similar "unexpected" patterns are Zic3, Etv4 and Sall4.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      I think suggested experiments are doable as long as authors get publicly available data, i.e. the in-house dataset they generated for this study is enough to show applicability. For example, the datasets analysed in the SCODE paper (https://doi.org/10.1093/bioinformatics/btx194) could be used as a second benchmark. The point of applying the tool to another dataset is to show how it generalises across different biological systems, experiments and, potentially, sequencing technologies.

      • Are the data and the methods presented in such a way that they can be reproduced?

      The methods section is really clear. To enable reproducibility, the raw scRNA-seq data, the IGNITE source code and the code written to benchmark it should be released in the public domain in appropriate repositories (eg. ENA, GitHub, Binder etc).

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes.

      Minor comments

      • Specific experimental issues that are easily addressable.

      Related to the Sox2 expression pattern is the binarization shown in Figure 2D. How is it possible that Sox2 is always marked as active? Could the authors clarify how these outlier behaviours emerge and propose mitigation strategies, if any?

      In section 5.11.2 it is unclear if xi are in log scale or not. Since the model starts from binarized, log transformed expression values, should the generated ones not be on the same scale as the input?

      • Are prior studies referenced appropriately?

      Yes, referencing is clear.

      • Are the text and figures clear and accurate?

      Yes, figures appear to be clear, readable and well documented both in captions and main text.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Section 3.3 could be improved by better describing experimental datasets. Only in the methods section it is clearly stated that experimental data for single KO experiments were retrieved from the literature.

      Check typesetting:

      • parenthesis missing in Eq. 1
      • Leftover $ in section 3.1
      • Parenthesis missing in Section 3.3
      • Misplaced comma in section 5.2.1

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper presents a method to infer GRN from scRNA-seq data alone. Applications include GRN prediction and their perturbations. This paper represents a technical advance in the field as it is the first application of the inverse Ising problem to GRN inference.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The paper itself presents the landscape of GRN inference tools using scRNA-seq data: SCODE, MaxEnt and CELLORACLE. More tools exist, for instance SCENIC (https://doi.org/10.1038/nmeth.4463) mainly relies on co-expression matrices. Other tools exist but require additional data types e.g. GRaNIE and GRaNPA (https://doi.org/10.15252/msb.202311627) leverage physical interaction data (ATAC-seq, ChIP-seq). Similarly DeepFlyBrain uses deep neural networks to infer eGRN in Drosophila (https://doi.org/10.1038/s41586-021-04262-z). The value of tools like IGNITE and its competitors is that they do not require additional data types, which, in turn, helps in controlling experimental costs.

      • State what audience might be interested in and influenced by the reported findings.

      The paper might be of interest to biologists interested in regulation of gene expression. The tool might turn out to be useful in planning experimental work by guiding the choice of perturbations to introduce in experimental systems.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am a computational biologist.

      I do not have sufficient expertise to evaluate the mathematical details of the method.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We appreciated the positive, detailed and helpful feedback from all three reviewers.

      Reviewer 1.

      Minor comments.

      1. In the introduction, on page 2, the authors seem a little confused about the Plk1 Polo-box domain - text as written: "...kinase domain linked to tandem Polo-box domains (PBD)", and cite a review paper. Actually, there is only a single Polo-box domain in these kinases, which contains both Polo-boxes and a bit of the upstream linker region. The "PBD" terminology denotes this 2-Polo-box + linker structure. Perhaps it would be better to cite the PBD structure (Elia et al., Cell, 2002) as a primary citation here.

      Response: Thank you for finding this error, the text has been updated and the new citation included within the text on line 65.

      1. Similarly, the line "...during the G2/M transition following successful DNA damage repair" cites the Seki et al paper, but those findings are shown in the Macurek et al paper, not the Seki et al paper.

      Response: Thank you for finding this error; the new citation is included within the text on line 69.

      1. Using the model of the ternary complex as shown in Figure 1B, deletion constructs of Bora missing regions within the disordered loops, but still retaining the residues that bind the PBD, FW pocket and Aurora A, can be modeled and tested to see if such deletions can improve the ipTM scores and binding affinity.

      Response: AlphaFold3 modelling was attempted with shorter regions of Bora to see the effect on the ipTM scores. Unfortunately, when Bora was reduced to shorter sequences, such as 18-88 or 18-45 modelled with 68-120, the models became inconsistent and of a low quality. Models were also created including the short region of Bora surrounding Ser252 that interacts with the polo box domain as well as Bora 18-120, but this had minimal effect on the calculated ipTM scores.

      1. On page 5, "S112A" within the sentence "Unexpectedly, the F56A/W58A Bora was less efficiently phosphorylated on S112A (Supplementary Figure S11, F compared to H and Supplementary Table S4)." This should be "S112".

      Response: Thank you for spotting this, the error has been corrected.

      1. In the assays shown in Figure 2D, the presence of excess F56AW58A Bora that remained unphosphorylated on S112 may complicate the interpretation of the results. Can the authors show that the S112-phosphorylated F56AW58A Bora is predominantly bound to Aurora A in such a mixture, perhaps by NMR using labelled pS112 F56AW58A Bora and unlabeled S112 F56AW58A Bora?

      Response: ¹⁵N¹³C-labelled Bora 18-120 F56A W58A was produced and assigned. We then phosphorylated a sample using ERK2, tracking with NMR, and when the reaction had progressed to a 50:50 mixture of pSer112 and Ser112 (based on peak intensities) the kinase activity was quenched by addition of EDTA to sequester Mg²⁺. This produced a solution containing both pS112 and unphosphorylated S112 Bora species with marker peaks in HSQC spectra that could be used to directly compare Aurora-binding to the two species. Aurora-A was introduced to the sample and the peak intensities were monitored. Although both species are affected, there is much greater peak loss from the pS112-related peaks than from those for unphosphorylated S112. This indicates that Aurora-A still preferentially binds pS112 Bora over S112 Bora when the F56A W58A mutation is present. This data has been included in Supplementary Figure S11.

      1. Please expand Figure 3A to better show the FW pocket-forming residues on Plk1.

      Response: Figure 3 has been amended to reduce the size of the sequence alignments so that 3A could be made slightly larger.

      1. It would be helpful to label the peaks in the mass spectra in Fig. S11 with the phospho-species that they correspond to.

      Response: This information has been added to the mass spectra in Fig. S11 (now supplementary Figure S14) to make them easier to view.

      1. In the last paragraph on page 7, "see we" in the sentence "As well as a decrease in intensity around pSer112 in Bora, see we an overall effect with decreased intensity across most of the Bora sequence." Should be corrected to "we see".

      Response: Thank you for spotting this, the error has been corrected.

      1. While not required, it would be helpful if binding of Bora to Aurora A after Erk2 phosphorylation could be shown using fluorescence polarization or ITC to lend additional support to the NMR data for S112 and S59 phosphorylation and for CEP192 and TPX2 competition.

      Response: This question has been partially answered in previous work by Tavernier et al. (2021), who showed improved binding of Aurora-A to Bora after Erk phosphorylation (by SPR), and they used labelled-TPX2 for a series of competition FP assays in that and the recent parallel study (Pillan et al. 2025).

      We made initial efforts to perform additional FP assays using longer sections of Bora with different phosphorylation states but without success (perhaps due to the multisite-binding nature of the Bora–Aurora interaction, and difficulties with directly expressing phosphorylated Bora). The revised manuscript now includes some additional NMR data to show improved Bora–Aurora-A interaction after phosphorylation at Ser59 (Supplementary Figure S12).

      1. The Aurora A phosphorylation motif has been further defined beyond that reported by the Pinna lab in 2005. Notably, the Ser-59 sequence on Bora (F-R-W-S-I) has, in addition to dominant selection for R in the -2 position, both favorable -1 (W) and +1 (I) positions based on peptide library measurements (Alexander et al., Science Signaling 2011), further arguing that it may be an excellent Aurora A phosphorylation site.

      Response: Thank you for highlighting this publication and how it further reinforces the likelihood of Ser59 being an effective substrate for Aurora-A; this should have been included in the original manuscript. This citation has now been included.

      1. Have the authors tried to model the Drosophila melanogaster Aurora A-Bora-Polo complex to see if the Asn substitution of Bora Ser59, and the expected loss of the interactions between Bora pSer59 and Plk1 Arg59 and Aurora A Arg205 are compensated by other features?

      Response: A ternary complex between the Drosophila melanogaster orthologues was modelled using AlphaFold3 (Uniprot codes: Bora (Q9VVR2) 72-165, Aurora-A kinase (Q9VGF9) 151-411 and PLK1 (P52304) 21-280). This model was analysed using PDBe PISA to identify potential interactions between the three proteins, focusing on residues that are not conserved between the human and Drosophila sequences. From this model a potential salt bridge was identified between Drosophila Bora Lys120 and PLK1 Glu93 that would not occur in the human ternary complex given Lys120 is replaced with an asparagine. This could be an alternative (kinase-independent) method for improved Bora-PLK1 interaction. When comparing the Bora:Aurora-A side of the predicted interface and focusing on the short region of Bora in between Aurora-A and PLK1, there were no clear differences seen in the residues predicted to bind to Aurora-A. This modelling has been included in Supplementary Figure S10 C and D.

      1. Given the relevance of the recent publication from Zhu et al. to this study, the authors may want to comment on, or test, the relative importance of PKA and Aurora A as a potential kinase for Bora S59. While those authors argue that PKA phosphorylates Bora on Ser-59, one could easily imagine a model in which either PKA or Aurora A could initially phosphorylate that site followed by a propagation step after initial Aurora A activation, in which Aurora A phosphorylation of Bora Ser-59 is the dominant process.

      Response: A brief discussion of this recent publication has been added to the discussion, highlighting the similarities between the two publications and the importance of pSer59, as well as suggesting that in cellulo this modification could be achieved via more than one pathway. We also include some additional NMR data to show improved Bora–Aurora-A interaction after phosphorylation at Ser59 (Supplementary Figure S12).

      Reviewer 2.

      Minor comments.

      Page 5: '... a K82R PLK1 mutant was used to increase the stability of the protein' - It is not clear how this mutation confers increased stability of the protein. The authors do not show any data to support this. Isn't the PLK1 K82R an ATP-binding-deficient, kinase-inactive mutant?

      Response: Thank you for spotting this. The text has been updated to clarify that this version of PLK1 was used because it acts as a substrate in the in vitro assay, and we did not want any PLK1 activity in this assay.

      All panels showing the Alphabridge diagram - it would be helpful if pictorial definitions of the colour codes were provided with corresponding score ranges (in addition to the description in the figure legend).

      Response: The AlphaBridge images have been updated to include the pLDDT score range that each colour refers to.

      Fig 2B - The Fluorescence anisotropy assay curves do not reach a plateau. Though the effect of mutation on binding affinity is pretty clear, if possible, I suggest including more data points at higher concentrations and estimating apparent Kd values.

      Response: The direct binding assay was repeated with a higher concentration of PLK1 to try to reach a top plateau. This was successful and has been included in Figure 2B (shown in black). The measured Kd was 24 ± 3 µM.
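Why the top plateau matters for estimating Kd can be illustrated with a minimal sketch of a one-site binding model. This is not the authors' fitting procedure: it assumes the labelled peptide concentration is far below Kd (so free PLK1 is approximately total PLK1) and that anisotropy is a simple weighted mix of the free and bound signals. The function names and the top concentrations are hypothetical; only the Kd value of 24 µM comes from the response.

```python
def fraction_bound(protein_uM, kd_uM):
    """Fraction of labelled peptide bound in a one-site model.

    Assumes the labelled peptide is far below Kd, so free protein is
    approximately equal to total protein.
    """
    return protein_uM / (kd_uM + protein_uM)


def anisotropy(protein_uM, kd_uM, r_free, r_bound):
    """Observed anisotropy as a weighted mix of free and bound signal."""
    return r_free + (r_bound - r_free) * fraction_bound(protein_uM, kd_uM)


KD_UM = 24.0  # Kd reported in the response

# Hypothetical top concentrations, chosen only to illustrate saturation.
for top_uM in (24.0, 100.0, 240.0):
    print(f"{top_uM:6.1f} uM PLK1 -> {fraction_bound(top_uM, KD_UM):.2f} of plateau")
```

In this model a titration that stops at a concentration equal to Kd reaches only half of the plateau signal, which is why extending the curve to higher concentrations constrains the fit.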

      The cartoon representation of the structures and molecular interfaces - better to avoid shadows, as they compromise the clarity of the figures, particularly the ones where side chains are shown in stick representation.

      Response: The structural images have been remade to remove the shadows and improve the clarity of the images.

      It is important to discuss how the parallel studies by Verza et al. and Pillan et al. complement this study, highlighting similarities and differences.

      Response: References to these two publications and details on the similarities and differences seen are now included in the discussion.

      Reviewer 3.

      Major comments

      It would be helpful to measure the level of pThr210 PLK1 in some experiments and graph the data. The current presentation in Fig. 2D-E is qualitative rather than quantitative.

      Response: Graphs displaying the levels of pThr210 produced in the assay are now shown in Supplementary Figure S4.

      Have the authors measured the binding affinity of the F/W mutant Bora for PLK1 using the assay in Fig. 2B? Likewise, for Fig. 7 the S59 mutant could be tested to see if it affects PLK1 binding or activation.

      Response: The direct binding assay has been repeated using a FAM-Bora peptide that incorporates the F56A W58A mutations, which shows reduced binding (Figure 2B, shown in blue). A version of the Bora peptide phosphorylated on Ser59 was also tested in the direct binding assay and shows a similar affinity for PLK1 to the wild-type sequence (Figure 2B, shown in red compared to the wild-type shown in black).

      It would be helpful if measurements of pThr210 PLK1 for all conditions were shown in the graph Fig. 7F.

      Response: This graph has been updated to include the levels of phosphorylation seen for PLK1 in all of the conditions tested.

      Minor comments

      I found Figure S1B easier to understand than Fig S1A and Fig 1A-B. Some of the supplemental data Fig. S1C-E could be moved to a revised Figure 1, dropping the current Fig. 1A-B. Can the interaction plots (Fig. S1C-D) be rotated to have the same origin at the top and the same order of proteins (i.e. Bora > Aurora A > ± PLK1 depending on the plot)?

      Response: Figure 1 and S1 have been rearranged to hopefully make them easier to understand, with all AlphaFold3 models of the full-length sequences kept in the supplementary figure and the focus in 1B just on the truncated model. The AlphaBridge plots have been rotated as suggested.

      Figure 3F. Typo "Strongyl" not "Strongly".

      Response: Thank you for spotting this; it has been corrected in the updated manuscript.

      Figure 3 could be supplemental material.

      Response: Thank you for your suggestion, but we have decided to keep this as a main figure.

      Fig. 7E. Run a positive control reaction +ERK2 on the second gel to allow direct comparison of pThr210 across all the conditions tested.

      Response: These samples have been rerun on the same membrane and the levels of phosphorylation have been quantified and included in Figure 7F.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary.

      Miles and co-workers have carried out a careful and high-quality study of the activation mechanisms of the mitotic kinase PLK1. Multiple proteins have been implicated in PLK1 activation and localisation as cells enter and pass through mitosis. Initial activation of PLK1 is promoted by a complex of Bora with another kinase, Aurora A. Later in mitosis, this activated PLK1 associates with mitotic spindle and centrosome proteins regulating different aspects of mitosis and cytokinesis. In this study, Miles et al. extend previous work on this question by proposing and testing detailed models for Bora/Aurora A-mediated activation of PLK1 to elucidate the mechanism of this reaction.

      Using the latest AlphaFold, they generate a series of models of the PLK1/Bora/Aurora A complex to home in on the key regions mediating interactions of the three proteins. This approach suggests an arrangement where the first ~120 amino acids of Bora wrap Aurora A and create an interaction surface for the N-terminal kinase domain of PLK1. This orients Thr210 in PLK1 towards Aurora A, creating a situation likely favourable for phosphorylation, although, as the authors discuss, there are some caveats to this. A further prediction of the modelling helps explain the requirement for Bora phosphorylation to promote the interaction with Aurora A. This data is presented in Fig. 1 and Fig. S1-S3.

      In the subsequent figures the details of this model are tested using biochemical assays and structural biology methods to validate key predictions. First the PLK1 interaction with Bora was shown to require the conserved F/W motif of Bora and a conserved pocket close to R106 on PLK1 (Fig. 2 and 3). In reconstituted PLK1 activation assays the F/W motif mutant Bora showed greatly attenuated pThr210 phosphorylation. This reaction also required phosphorylation of Bora at S112, presumably due to the interaction with Aurora A. An R106A mutant PLK1 showed reduced binding to Bora and reduced kinase activation. This data is clear and provides compelling support for the model.

      Using NMR, the authors then investigate the interaction between Bora and Aurora A, and more specifically the requirement for Bora phosphorylation at Ser112. The NMR data in Fig. 4 and Fig. 6 provide good support for the AlphaFold model. A helpful comparison with known Aurora A binding proteins is also shown to highlight the way CEP192, TPX2 and TACC3 contact a series of conserved pockets on the surface of Aurora A which are common to the Bora interaction. S59 phosphorylation by Aurora A is also shown to play an important role in contacting PLK1 and is required for pThr210 phosphorylation.

      In summary, the authors have made valuable progress in working out details of the PLK1 activation mechanism that extends previous work in the field.

      Major comments.

      It would be helpful to measure the level of pThr210 PLK1 in some experiments and graph the data. The current presentation in Fig. 2D-E is qualitative rather than quantitative.

      Have the authors measured the binding affinity of the F/W mutant Bora for PLK1 using the assay in Fig. 2B? Likewise, for Fig. 7 the S59 mutant could be tested to see if it affects PLK1 binding or activation.

      It would be helpful if measurements of pThr210 PLK1 for all conditions were shown in the graph Fig. 7F.

      Minor comments.

      I found Figure S1B easier to understand than Fig S1A and Fig 1A-B. Some of the supplemental data Fig. S1C-E could be moved to a revised Figure 1, dropping the current Fig. 1A-B. Can the interaction plots (Fig. S1C-D) be rotated to have the same origin at the top and the same order of proteins (i.e. Bora > Aurora A > ± PLK1 depending on the plot)?

      Figure 3F. Typo "Strongyl" not "Strongly".

      Figure 3 could be supplemental material.

      Fig. 7E. Run a positive control reaction +ERK2 on the second gel to allow direct comparison of pThr210 across all the conditions tested.

      Significance

      Timely and orchestrated activation of multiple mitotic protein kinases is crucial for the alignment and segregation of chromosomes, and for the process of cell division. In this study the authors explore how activation of the mitotic kinase PLK1 is triggered by another mitotic kinase Aurora A, and the role played by a scaffold protein Bora.

      Strengths: Detailed analysis of mechanism using biochemical and structural approaches.

      Limitations: The study is focussed on the biochemical and structural mechanisms rather than the cellular outcomes. Some data would benefit from additional quantitative measurement.

      Relevance: Cancer and cell biology due to the role of Aurora A in many cancers.

      Reviewer expertise: Biochemistry, molecular and cell biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      PLK1 is one of the master regulators of cell division. The activation of PLK1 requires the activation loop phosphorylation at T210, mediated by Aurora A kinase. However, Aurora A phosphorylation of PLK1 T210 requires Bora, one of the several activators of Aurora A kinase. While the molecular requirement of Aurora A kinase and Bora for PLK1 activation is well established, the mechanistic understanding of how Bora facilitates PLK1 activation by Aurora A has remained an important open question for a long time. Exploiting the latest development in AI-driven structure prediction, three independent studies provide a structural and mechanistic basis for PLK1 activation by Aurora A and Bora. Here, Miles et al. have generated AlphaFold models, further characterised some of the interfaces using NMR, and validated the contribution of intermolecular interactions at suggested interfaces in vitro using recombinant proteins in kinase assays. Overall, this is a well-executed work providing important new insights into our understanding of the activation of the critical regulator of cell division, PLK1. However, as the authors have highlighted in the discussion section, one limitation of this modelling study is that the models still do not entirely explain how these interactions facilitate the phosphorylation of Thr210, as this residue is oriented too far away from Aurora A's active site for the reaction to take place. Despite this limitation, I believe this is an important work that advances our understanding significantly.

      Comments:

      Experimental data satisfactorily support claims. Hence, most of my comments are minor in nature.

      Points to consider during revision:

      Page 5: '... a K82R PLK1 mutant was used to increase the stability of the protein' - It is not clear how this mutation confers increased stability of the protein. The authors do not show any data to support this. Isn't the PLK1 K82R an ATP-binding-deficient, kinase-inactive mutant?

      All panels showing the Alphabridge diagram - it would be helpful if pictorial definitions of the colour codes were provided with corresponding score ranges (in addition to the description in the figure legend).

      Fig 2B - The Fluorescence anisotropy assay curves do not reach a plateau. Though the effect of mutation on binding affinity is pretty clear, if possible, I suggest including more data points at higher concentrations and estimating apparent Kd values.

      The cartoon representation of the structures and molecular interfaces - better to avoid shadows, as they compromise the clarity of the figures, particularly the ones where side chains are shown in stick representation.

      It is important to discuss how the parallel studies by Verza et al. and Pillan et al. complement this study, highlighting similarities and differences.

      Significance

      As highlighted in the summary, a mechanistic understanding of how PLK1 is activated by Aurora A kinase and its activator Bora has remained a long-standing open question. As PLK1 is one of the major regulators of cell division, which exerts its function (via phosphorylating numerous substrates) during different stages of mitosis, understanding its activation mechanism is of critical interest for those working on the cell cycle in general and cell division in particular. A key limitation of this study is the lack of any cellular functional evaluation of the interaction interfaces.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Miles et al. used a combination of AlphaFold modeling, biochemical assays of mutant constructs and NMR spectroscopy to model the ternary complex of Aurora A, Bora and Plk1, and elucidate how Bora can act as a molecular bridge that facilitates the phosphorylation of the activation loop Thr210 within Plk1 by Aurora A. Their studies identified an interaction between residues 52-73 within Bora and the 'FW' pocket on the N-terminal lobe of Plk1, which binds Phe56 and Trp58 of Bora. Additionally, Ser59 of Bora was identified as a good Aurora A substrate using a Bora peptide array, and pSer59 was predicted to form bridging interactions with Aurora A Arg205 and Plk1 Arg59. This was supported by NMR and biochemical assays. In addition, the authors validate that phosphorylation of Ser-112 on Bora enhances stabilization of the Aurora A-Bora complex. Overall, the model revealed novel details of the interactions within the Aurora A-Bora-Plk1 ternary complex that are supported by the biochemical and NMR data. The work will be of significant interest to basic scientists whose work involves protein kinase signaling, cell division/mitosis, signal transduction, and cancer biology. We recommend publication of this manuscript with the following minor changes and additions.

      1. In the introduction, on page 2, the authors seem a little confused about the Plk1 Polo-box domain - text as written: "...kinase domain linked to tandem Polo-box domains (PBD)", and cite a review paper. Actually, there is only a single Polo-box domain in these kinases, which contains both Polo-boxes and a bit of the upstream linker region. The "PBD" terminology denotes this 2-Polo-box + linker structure. Perhaps it would be better to cite the PBD structure (Elia et al., Cell, 2002) as a primary citation here.
      2. Similarly, the line "...during the G2/M transition following successful DNA damage repair" cites the Seki et al paper, but those findings are shown in the Macurek et al paper, not the Seki et al paper.
      3. Using the model of the ternary complex as shown in Figure 1B, deletion constructs of Bora missing regions within the disordered loops, but still retaining the residues that bind the PBD, FW pocket and Aurora A, can be modeled and tested to see if such deletions can improve the ipTM scores and binding affinity.
      4. On page 5, "S112A" within the sentence "Unexpectedly, the F56A/W58A Bora was less efficiently phosphorylated on S112A (Supplementary Figure S11, F compared to H and Supplementary Table S4)." This should be "S112".
      5. In the assays shown in Figure 2D, the presence of excess F56AW58A Bora that remained unphosphorylated on S112 may complicate the interpretation of the results. Can the authors show that the S112-phosphorylated F56AW58A Bora is predominantly bound to Aurora A in such a mixture, perhaps by NMR using labelled pS112 F56AW58A Bora and unlabeled S112 F56AW58A Bora?
      6. Please expand Figure 3A to better show the FW pocket-forming residues on Plk1.
      7. It would be helpful to label the peaks in the mass spectra in Fig. S11 with the phospho-species that they correspond to.
      8. In the last paragraph on page 7, "see we" in the sentence "As well as a decrease in intensity around pSer112 in Bora, see we an overall effect with decreased intensity across most of the Bora sequence." should be corrected to "we see".
      9. While not required, it would be helpful if binding of Bora to Aurora A after Erk2 phosphorylation could be shown using fluorescence polarization or ITC to lend additional support to the NMR data for S112 and S59 phosphorylation and for CEP192 and TPX2 competition.
      10. The Aurora A phosphorylation motif has been further defined beyond that reported by the Pinna lab in 2005. Notably, the Ser-59 sequence on Bora (F-R-W-S-I), has, in addition to dominant selection for AR in the -2 position, both favorable -1 (W) and +1 (I) positions based on peptide library measurements (Alexander et al., Science Signaling 2011), further arguing that it may be an excellent Aurora A phosphorylation site.
      11. Have the authors tried to model the Drosophila melanogaster Aurora A-Bora-Polo complex to see if the Asn substitution of Bora Ser59, and the expected loss of the interactions between Bora pSer59 and Plk1 Arg59 and Aurora A Arg205 are compensated by other features?
      12. Given the relevance of the recent publication from Zhu et al. in https://doi.org/10.1038/s41467-025-63352-y to this study, the authors may want to comment on, or test, the relative importance of PKA and Aurora A as a potential kinase for Bora S59. While those authors argue that PKA phosphorylates Bora on Ser-59, one could easily imagine a model in which either PKA or Aurora A could initially phosphorylate that site followed by a propagation step after initial Aurora A activation, in which Aurora A phosphorylation of Bora Ser-59 is the dominant process.

      -Dan Lim and Michael Yaffe

      Significance

      The work is well done and clearly presented.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Major comments:

      (comment #1)- It is interesting that TRF2 loss not only fails to increase γH2AX/53BP1 levels but may even slightly reduce them (e.g., Fig. S2c and the IF images). While the main hypothesis is that TRF2 loss does not trigger telomere dysfunction in NSCs, this observation raises the possibility that TRF2 itself contributes to DDR signaling (ATM-P, γH2AX, 53BP1) in these cells and that in its absence, cells are not able to form those foci. To exclude the possibility that telomere-specific DDR is being missed due to an overall dampened DDR response in the absence of TRF2, it would be informative to induce exogenous DSBs in TRF2-depleted cells and test DDR competence (e.g., IF for γH2AX/53BP1). In other words, are those NSC lacking TRF2 even able to form H2AX/53BP1 foci when damaged? In addition, it would be interesting to perform telomere fusion analysis in TRF2 silenced cells (and TRF1 silenced cells as a positive control).

      We acknowledge a slight reduction; however, this difference is not statistically significant (Fig S2c,e). We will quantify the levels of DDR markers upon TRF2 loss and exogenous DSBs and include it in the subsequent revision.

      (comment #2) - A TRF2 ChIP-seq should be performed in NSC, as this list of genes (named TAN genes in the text) was determined using a ChIP performed in another cell line (HT1080). For the ChIP-qPCR in the various conditions, primers for negative control regions should be included to show the specific binding of TRF2 to the promoter of the genes associated with neuronal differentiation. For example, an intergenic region and/or promoters of genes that are not associated with neuronal differentiation (or don't contain a potential G4). The same comment holds true for the gene expression analysis: a few genes that are not bound by TRF2 should be included as negative controls to exclude a potential global effect of TRF2 loss on gene expression (ideally an RNA-seq would be performed instead).

      We have performed NSC-specific TRF2 ChIP-seq for an upcoming manuscript, which confirms TRF2 occupancy at multiple promoters of differentiation-associated genes. These data are provided solely for confidential evaluation by the designated reviewers.

      Regarding the ChIP-qPCR control experiments: We thank the reviewer for pointing this out. We did include controls in our PCR assays, with telomeric loci as positive controls and TRF2-nonbinding loci (GAPDH, RPS18, and ACTB, based on HT1080 TRF2 ChIP-seq data) as negative controls. These results were not included earlier for clarity, given that we were presenting several ChIP-PCR figures; in response to the comment, we have now included them in the revised version (Fig. S3d,e). Gene expression analyses show selective upregulation of the TAN genes upon TRF2 loss (data normalised to GAPDH), whereas negative control genes lacking TRF2 binding (RPS18, ACTB) remain unchanged, ruling out non-specific effects (Fig S3f,g,j,k).

      (comment #3) - A co-IP should be performed between the TRF2 PTM mutant K176R or WT TRF2 and REST and PRC2 components to directly show a defect of interaction between them when TRF2 is mutated (a co-IP with DNase/RNase treatment to exclude nucleic-acid bridging). The TRF2 PTM mutant T188N also seems to lead to increased differentiation (Fig. S5a). Could the authors repeat the measurement of gene expression and the co-IP with REST upon overexpression of this mutant too?

      We confirm that DNase/RNase is routinely included in our pull-down experiments to exclude nucleic-acid bridging, with detailed methodology now elaborated in the Methods section. Not including this in the manuscript Methods was an oversight from our side. Our data demonstrate that only REST directly interacts with TRF2, while TRF2 engages PRC2 indirectly via REST, as also previously shown by us and others (page 6; ref. [62]; Sharma et al., ref. [15]).

      We thank the reviewer for noting the apparent differentiation in Fig. S5a. However, this observation represents a rare spontaneous differentiation event and is not statistically significant (as shown in Fig S5b). Consistently, gene expression analysis of the TRF2-T188N mutant shows no significant change in TRF2-associated neuronal differentiation (TAN) genes. Therefore, co-IP of TRF2-T188N with REST was not done.

      (comment #4) - The authors show that the G4 ligands SMH14.6 and Bis-indole carboxamide upregulate TAN genes and promote neuronal differentiation, but the underlying mechanism remains unclear. Bis-indole carboxamide is generally considered a G4 stabilizer, while SMH14.6 is less characterized and should be better introduced. The authors should clarify how G4 stabilization would interfere with TRF2 binding, it seems that it would likely be by blocking access. A more detailed discussion, and ideally TRF2 ChIP after ligand treatment and/or G4 helicase treatment, would strengthen the model.

      We clarify that Bis-indole carboxamide acts as a G4 stabilizer, while SMH14.6 is also a noted G4-binding ligand that stabilizes G4s (ref. [15]). The exclusion of TRF2 from G4 motifs in gene promoters by G4-binding ligands has also been documented previously (ref. [18]). In line with these findings, ChIP experiments performed following ligand treatment revealed a decreased occupancy of TRF2 at TAN gene promoters, supporting the proposed mechanism (added Fig. 6h).

      Minor comments:

      • Supp Figures related to the scRNA-seq are difficult to read (blurry).

      Corrected

      • Fig S1h: The red box mentioned in the legend is not visible

      Corrected

      • In the text, the Figures 1 f-g are misannotated as Fig 1m and l

      Corrected

      • The symbol γ of γH2AX is missing in the text

      Corrected

      • Fig.3d, please indicate in the legend that it is done in SH-SY5Y.

      Added SH-SY5Y in the legend of Fig. 3d.

      • Fig. S3b: Please consider replotting this panel with an increased y-axis scale. As currently presented, the TRF2 ChIP-seq peaks at several promoters appear truncated by the scaling.

      Corrected

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      1. For most of the data graphs in the manuscript, there is no indication of the number of independent biological replicates carried out (which should ideally be plotted as individual dots overlaying the column graphs), or what the error bars represent, or what statistical test was used.

      All the figure legends and methods have now been updated with the corresponding biological replicates per experiment, with error bars as SD/SEM and the corresponding statistical test along with p values.

      Figure S1.1a: needs a marker to show that the tissue is dentate gyrus.

      We acknowledge the reviewers' concern that high-magnification images alone make it difficult to verify whether the fields are taken from the correct anatomical location. The dentate gyrus (DG) of the hippocampus is a well-defined structure. In the revised figure (Fig S1.1a), we now include a low-magnification image showing the entire hippocampus, including the CA fields, along with two high-magnification fields specifically from the DG region. Consistent with our claim, the co-immunostaining demonstrates that Sox2-positive neural stem cells in the DG are also positive for TRF2.

      Figure 1c (and all other flow cytometry panels throughout the manuscript): it is not clear if the expression of any of these proteins, except maybe MAP2, are significantly different in the presence or absence of TRF2. These differences need to be presented more quantitatively, with the results compiled from multiple biological replicates and analysed statistically. I am not sure that flow cytometry is the best way to determine differences in protein expression levels for non-surface proteins, because many of the reported differences are not at all convincing.

      To detect intracellular/nuclear proteins by flow cytometry, cells were permeabilized using pre-chilled 0.2% Triton X-100 for 10 minutes, as described in the Methods section.

      We have revised the figures (Fig 1c,e) and now include statistical analysis from three independent biological replicates for these experiments (Fig S1.4h-j, S2e, S6d).

      Fig 1d: has TRF2 been effectively silenced in this experiment? There appears to be just as many TRF2+ nuclei in the "TRF2 silenced" panel vs the control, including in the cells with neurite outgrowths.

      Quantification of nuclear levels of TRF2 showing decrease in nuclear TRF2 has been included in supplementary Fig S1g.

      Fig 2a-c: these experiments need a positive control, showing increased expression of these proteins in mNSC and SH-SY5Y cells in response to a DNA damaging agent. Again, flow cytometry may not be the best method for this; immunofluorescence combined with telomere FISH would be more convincing.

      We confirm that doxorubicin induces 53BP1 foci (IF-FISH, Sup Fig. S2b) and that TRF1 silencing elevates γH2AX (Sup Fig. S2c), validating DDR sensitivity. In contrast, upon TRF2 loss (Fig. 2a-c), no TIFs appear with IF and telomere probes (Fig. 2d, Sup Fig. 2a), and without TIFs there is no telomeric fusion. Flow cytometry was performed with Triton X-100 to target nuclear proteins. These findings adequately address the concern; therefore, further IF-FISH experiments were not included in the present study.

      To conclude that telomere damage is not occurring, an independent marker of such damage, such as telomere fusions, should also be measured.

      In response to uncapped telomeres, ATM kinase activates the DNA damage response (DDR), recruiting γH2AX and 53BP1 to telomeres, which precedes end-to-end fusions (Takai et al., 2003; Maciejowski & de Lange, 2015; d'Adda di Fagagna et al., 2003; Cesare & Reddel, 2010; Hayashi et al., 2012; Sarek et al., 2015). We observe no DDR activation or foci (Fig. 2; Sup. Fig. 2). This absence of a DDR response and TIFs indicates no telomere uncapping, negating the need for direct telomere fusion analysis.

      Figure S2b is lacking a no-doxorubicin control.

      An untreated control has been included in Fig. S2b.

      Figures 3a and 3b need a positive control (e.g. TRF2 binding to telomeric DNA) and a negative control (e.g. a promoter that did not show any TRF2 binding in the HT1080 ChIP-seq experiment in Fig S3).

      We have included positive (telomere) and negative (GAPDH) controls (based on HT1080 TRF2 ChIP-seq data) for the TRF2 ChIP assay in Supplementary Fig. S3d,e. Additionally, positive and negative controls for all ChIP experiments conducted in this study are presented in Supplementary Figs. S3d, S3e, S3h, S3i, S4c-h, and S5c-e.

      The data in Figure 3 would be more compelling if all experiments were also performed in fibroblasts to confirm the cell-type specificity of the effect.

      Our HT1080 fibrosarcoma ChIP-seq data (ref. [18]; Sup. Fig. 3a,b) show TRF2 binding to TAN gene promoters in a fibroblast-derived model, with enrichment in neurogenesis-related genes (refs. [19,20]). In fibroblasts, TRF2 depletion, as expected, induces telomere dysfunction and DDR (Fig. 2d; Sup. Fig. 2a), and eventually cell-cycle arrest and cell death, as also reported earlier (van Steensel et al., 1998; Smogorzewska & de Lange, 2002). Therefore, the suggested experiments, which would require sustained TRF2 depletion, are not possible to perform in fibroblasts. TRF2 occupancy on the promoters of the genes in question in cells other than NSC was noted in HT1080 cells (ref. [18]; Sup. Fig. 3a,b).

      No references are provided for the TRF2 posttranslational modifications on R17, K176, K190 and T188. What is the evidence for these modifications, and is it known if they participate in the telomeric role of TRF2?

      These lines with references have been included in the manuscript (highlighted in blue).

      R17 methylation enhances telomere stability (66). K176/K190 acetylation stabilizes telomeres and is deacetylated by SIRT6 (67). T188 phosphorylation facilitates telomere repair after DSBs (68). These PTMs primarily support telomeric roles.

      The experiments in Fig 5 should also be performed with WT TRF2, to confirm that effects are not due to the overexpression of TRF2.

      WT TRF2 shows no differentiation phenotype or change in TAN gene expression (Fig. 1f,g; 3h, Sup Fig. 5a), confirming that the effects are not due to TRF2 overexpression.

      Fig 5c has not been described in the text, and there are multiple technical problems with the TRF2 WT experiment: i) There appears to be significant background binding of REST to the IgG beads, though this blot has such high background it is hard to tell (the REST blot in Fig S4b is also of poor quality), ii) TRF2 is migrating at two different positions in the Input and IP lanes, and the TRF2 band in the K176R blot is at a different position to either, and iii) the relative loading of the Input and IP lanes is not indicated, so it's not clear why K176R appears to be so enriched in the IP.

      We acknowledge the oversight in not citing Fig 5c in the manuscript. This has been corrected and highlighted in blue in the revised manuscript.

      i) Multiple optimization attempts were made for the Co-IP experiments, and the presented figure reflects the best achievable result despite REST blot smearing, a pattern also reported previously (Ref. 65). The TRF2-REST interaction is well established, and a similar background was also observed in the cited study.

      ii) Variable migration patterns of TRF2 were also noted in the cited study (Ref. 65), consistent with our observations. Our primary emphasis, however, is on the TRF2 K176R mutant, which clearly disrupts its interaction with REST.

      iii) The input loading corresponds to 10% of the total lysate. As the experiments were conducted independently, variations in transfection and pull-down efficiencies may account for the observed differences.

      To rule out indirect effects of the G4 ligands on the results in Fig 6g, the binding of BG4 and TRF2 at the promoters of these genes should be measured by ChIP.

      To confirm that G4 ligand effects on TAN gene promoters are direct, TRF2 occupancy was assessed using ChIP. Significantly decreased occupancy of TRF2 was noted at TAN gene promoters (added Fig. 6h). This implies that ligand-induced changes in TRF2 binding are directly linked to promoter-level G4 stabilization.

      Minor comments:

      1. The size of all the size markers in western blots should be added to the figures.

      Sizes have been included in all the western blots.

      2. There are several figure panels that are incorrectly referenced in the text, e.g. Fig S1.1 (e-f) should be Fig S1.1 (e-h); Fig. 1m should be Fig. 1f; Figs 5e and 5f have been swapped.

      Corrected.

      3. Fig S1.4 is not referred to in the text. It is not clear what the purpose of Fig S1.4a is.

      The following line has been included in the manuscript (highlighted in blue).

      Neurospheres were characterized using PAX6, a NSC marker (Fig S1.4a).

      4. Are the experiments in Figs 3e, 4a, 4c and 4e using 4-OHT treatment, or siRNA? If the latter, I don't think a control for the effectiveness of the knockdown in this cell type has been included anywhere in the manuscript.

      These experiments use siRNA; a western blot showing the effectiveness of the knockdown is presented in Supplementary Fig. S4c (now S4a).

      5. The lanes of the western blots in Fig S4c are not labelled.

      Corrected.

      6. Given that the experiments in Fig 5 were carried out on a background of endogenous WT TRF2 expression, presumably the K176R mutant is having a dominant negative effect. To understand the mechanism of this effect (e.g., is it simply due to replacement of endogenous WT TRF2 at its genomic binding sites by a large excess of exogenous K176R, or is dimerisation with WT TRF2 needed?) it would be helpful to know the relative expression levels of endogenous and K176R TRF2.

      To address the query, qRT-PCR with 3′ UTR-specific primers showed no change in endogenous TRF2 mRNA upon K176R expression in SH-SY5Y cells, while primers detecting total TRF2 revealed ~10-fold higher expression of K176R compared to control (Figure below). This indicates the absence of suppression of endogenous TRF2 mRNA. Given that the mutant's DNA binding is intact (Fig. 5f), the dominant-negative effect of K176R likely arises from overexpression of the exogenous mutant.

      7. For the sentence "...and critical for transcription factor binding including epigenetic functions that are G4 dependent" (bottom of page 3 of the PDF), the authors cite only their own prior papers, but there are examples from others that could be cited.

      We have incorporated citations from other research groups, now included as references 23-26.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript examines the effects of depletion of the telomeric protein TRF2 in mouse neural stem cells, using mice carrying a floxed allele of TRF2 and inducible Cre recombinase under the control of the stem cell-specific Nestin promoter. The results are also backed up in a human neuroblastoma cell line that has progenitor-like properties. There is no apparent induction of telomere damage in either of these cell types, but there is an increase in expression of neurogenesis genes. This is accompanied by an increase in binding of TRF2 to the relevant promoters, and evidence is provided that this binding involves G-quadruplexes in the promoters.

      On the whole, the core findings of this study are interesting and reasonably robust. However, the study is marred by a large number of technical issues and missing controls, which should be addressed prior to publication:

      1. For most of the data graphs in the manuscript, there is no indication of the number of independent biological replicates carried out (which should ideally be plotted as individual dots overlaying the column graphs), or what the error bars represent, or what statistical test was used.
      2. Figure S1.1a: needs a marker to show that the tissue is dentate gyrus.
      3. Figure 1c (and all other flow cytometry panels throughout the manuscript): it is not clear if the expression of any of these proteins, except maybe MAP2, is significantly different in the presence or absence of TRF2. These differences need to be presented more quantitatively, with the results compiled from multiple biological replicates and analysed statistically. I am not sure that flow cytometry is the best way to determine differences in protein expression levels for non-surface proteins, because many of the reported differences are not at all convincing.
      4. Fig 1d: has TRF2 been effectively silenced in this experiment? There appear to be just as many TRF2+ nuclei in the "TRF2 silenced" panel vs the control, including in the cells with neurite outgrowths.
      5. Fig 2a-c: these experiments need a positive control, showing increased expression of these proteins in mNSC and SH-SY5Y cells in response to a DNA damaging agent. Again, flow cytometry may not be the best method for this; immunofluorescence combined with telomere FISH would be more convincing.
      6. To conclude that telomere damage is not occurring, an independent marker of such damage, such as telomere fusions, should also be measured.
      7. Figure S2b is lacking a no-doxorubicin control.
      8. Figures 3a and 3b need a positive control (e.g. TRF2 binding to telomeric DNA) and a negative control (e.g. a promoter that did not show any TRF2 binding in the HT1080 ChIP-seq experiment in Fig S3).
      9. The data in Figure 3 would be more compelling if all experiments were also performed in fibroblasts to confirm the cell-type specificity of the effect.
      10. No references are provided for the TRF2 posttranslational modifications on R17, K176, K190 and T188. What is the evidence for these modifications, and is it known if they participate in the telomeric role of TRF2?
      11. The experiments in Fig 5 should also be performed with WT TRF2, to confirm that effects are not due to the overexpression of TRF2.
      12. Fig 5c has not been described in the text, and there are multiple technical problems with the TRF2 WT experiment: i) There appears to be significant background binding of REST to the IgG beads, though this blot has such high background it is hard to tell (the REST blot in Fig S4b is also of poor quality), ii) TRF2 is migrating at two different positions in the Input and IP lanes, and the TRF2 band in the K176R blot is at a different position to either, and iii) the relative loading of the Input and IP lanes is not indicated, so it's not clear why K176R appears to be so enriched in the IP.
      13. To rule out indirect effects of the G4 ligands on the results in Fig 6g, the binding of BG4 and TRF2 at the promoters of these genes should be measured by ChIP.

      Minor comments:

      1. The size of all the size markers in western blots should be added to the figures.
      2. There are several figure panels that are incorrectly referenced in the text, e.g. Fig S1.1 (e-f) should be Fig S1.1 (e-h); Fig. 1m should be Fig. 1f; Figs 5e and 5f have been swapped.
      3. Fig S1.4 is not referred to in the text. It is not clear what the purpose of Fig S1.4a is.
      4. Are the experiments in Figs 3e, 4a, 4c and 4e using 4-OHT treatment, or siRNA? If the latter, I don't think a control for the effectiveness of the knockdown in this cell type has been included anywhere in the manuscript.
      5. The lanes of the western blots in Fig S4c are not labelled.
      6. Given that the experiments in Fig 5 were carried out on a background of endogenous WT TRF2 expression, presumably the K176R mutant is having a dominant negative effect. To understand the mechanism of this effect (e.g., is it simply due to replacement of endogenous WT TRF2 at its genomic binding sites by a large excess of exogenous K176R, or is dimerisation with WT TRF2 needed?) it would be helpful to know the relative expression levels of endogenous and K176R TRF2.
      7. For the sentence "...and critical for transcription factor binding including epigenetic functions that are G4 dependent" (bottom of page 3 of the PDF), the authors cite only their own prior papers, but there are examples from others that could be cited.

      Significance

      The protein TRF2 was first identified as one of the core proteins that bind to the double-stranded region of telomeric DNA, and its many-faceted role in telomere protection has been well studied over the last 3 decades. More recent data from several labs indicate that TRF2 has additional roles outside the telomere, including in regulating gene expression, but these roles are so far much less characterised. Also, it has recently been shown that mouse ES cells, unexpectedly, do not require TRF2 for telomere protection (references 3 and 4 in this paper).

      The findings of the current study expand the type of stem cells in which TRF2 is likely to be playing more of a role elsewhere in the genome, and not at telomeres, and hence are likely to be of high interest to both researchers of telomere biology, and those interested in the regulation of stem cell biology and neurogenesis.

      The strengths of the study are its novelty, its use of an inducible system to knock out TRF2 in the mouse neural stem cells of interest, and a thorough analysis of changes in gene expression and promoter occupancy across a range of genes of relevance to neurogenesis. The major weakness of the study, as described above, is the large number of technical problems, missing controls and missing indications of biological reproducibility.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this study, the authors show that TRF2 binds non-telomeric G-quadruplexes in promoters of a set of genes ("TAN" genes for TRF2-associated neuronal differentiation) and recruits REST/chromatin remodelers to repress those genes in neural stem cells, thereby maintaining the NSC state in a telomere-independent manner. They show that the loss of TRF2 derepresses TAN genes and promotes neuronal differentiation.

      However, key experiments are missing to fully support the claims: a genome-wide TRF2 ChIP-seq in NSC to validate binding beyond a restricted set of TAN genes, more robust evidence confirming the absence of telomeric dysfunction, and mechanistic clarification of the effects of G4 ligands on TRF2 binding.

      Major comments:

      • It is interesting that TRF2 loss not only fails to increase γH2AX/53BP1 levels but may even slightly reduce them (e.g., Fig. S2c and the IF images). While the main hypothesis is that TRF2 loss does not trigger telomere dysfunction in NSCs, this observation raises the possibility that TRF2 itself contributes to DDR signaling (ATM-P, γH2AX, 53BP1) in these cells and that in its absence, cells are not able to form those foci. To exclude the possibility that telomere-specific DDR is being missed due to an overall dampened DDR response in the absence of TRF2, it would be informative to induce exogenous DSBs in TRF2-depleted cells and test DDR competence (e.g., IF for γH2AX/53BP1). In other words, are those NSCs lacking TRF2 even able to form γH2AX/53BP1 foci when damaged? In addition, it would be interesting to perform telomere fusion analysis in TRF2-silenced cells (and TRF1-silenced cells as a positive control).
      • A TRF2 ChIP-seq should be performed in NSCs, as this list of genes (named TAN genes in the text) was determined using a ChIP performed in another cell line (HT1080). For the ChIP-qPCR in the various conditions, primers for negative control regions should be included to show the specific binding of TRF2 to the promoters of the genes associated with neuronal differentiation, for example an intergenic region and/or promoters of genes that are not associated with neuronal differentiation (or don't contain a potential G4). The same comment holds true for the gene expression analysis: a few genes that are not bound by TRF2 should be included as negative controls to exclude a potential global effect of TRF2 loss on gene expression (ideally an RNA-seq would be performed instead).
      • A co-IP should be performed between the TRF2 PTM mutant K176R or WT TRF2 and REST and PRC2 components to directly show a defect of interaction between them when TRF2 is mutated (a co-IP with DNase/RNase treatment to exclude nucleic-acid bridging). The TRF2 PTM mutant T188N also seems to lead to increased differentiation (Fig. S5a). Could the authors repeat the measurement of gene expression and the co-IP with REST upon overexpression of this mutant too?
      • The authors show that the G4 ligands SMH14.6 and Bis-indole carboxamide upregulate TAN genes and promote neuronal differentiation, but the underlying mechanism remains unclear. Bis-indole carboxamide is generally considered a G4 stabilizer, while SMH14.6 is less characterized and should be better introduced. The authors should clarify how G4 stabilization would interfere with TRF2 binding; it seems that it would likely be by blocking access. A more detailed discussion, and ideally TRF2 ChIP after ligand treatment and/or G4 helicase treatment, would strengthen the model.

      Minor comments:

      • Supp Figures related to the scRNA-seq are difficult to read (blurry).
      • Fig S1h: The red box mentioned in the legend is not visible.
      • In the text, the Figures 1 f-g are misannotated as Fig 1m and l.
      • The symbol γ of γH2AX is missing in the text.
      • Fig.3d, please indicate in the legend that it is done in SH-SY5Y.
      • Fig. S3b: Please consider replotting this panel with an increased y-axis scale. As currently presented, the TRF2 ChIP-seq peaks at several promoters appear truncated by the scaling.
      • Fig S4b: the legends should be fixed, the figure shows TRF2 occupancy upon REST silencing and not the other way around.

      Significance

      Non-telomeric roles of TRF2 have been reported before: in repressing neuronal genes and promoting a stem-like state by stabilizing REST (PMID: 18818083), in promoter G4 binding and recruitment of chromatin repressors (previous studies from the same lab), and TRF2 was shown to be dispensable for telomere protection in pluripotent stem cells (ES). The novelty of the current study lies primarily in extending/combining these mechanisms to NSCs.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We thank the reviewers for their thoughtful and constructive feedback, which helped us strengthen the study on both the computational and biological sides. In response, we added substantial new analyses and results in a total of 26 new supplementary figures and a new supplementary note. Importantly, we demonstrated that our approach generalizes beyond tissue outcomes by predicting final-timepoint morphology clusters from early frames with good accuracy (new Figure 4C). Furthermore, we completely restructured and expanded the human expert panel: six experts now provided >30,000 annotations across evenly spaced time intervals, allowing us to benchmark human predictions against CNNs and classical models under comparable conditions. We verified that morphometric trajectories are robust: PCA-based reductions and nearest-neighbor checks confirmed that patterns seen in t-SNE/UMAP are genuine, not projection artifacts. To test whether z-stacks are required, we repeated all analyses with sum- and maximum-intensity projections across five slices; results were unchanged, showing that single-slice imaging is sufficient. From a bioinformatics perspective, we performed negative-label baselines, downsampling analyses to quantify dataset needs, and statistical tests confirming that CNNs significantly outperform classical models. Biologically, we clarified that each well contains one organoid, further introduced the Latent Determination Horizon concept tied to expert visibility thresholds, and discussed limits in cross-experiment transfer alongside strategies for domain adaptation and adaptive interventions. Finally, we clarified methods, corrected terminology and a scaler leak, and made all code and raw data publicly available.

      Together, these revisions, in our opinion, provide an even clearer, more reproducible, and stronger case for the utility of predictive modeling in retinal organoid development.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study presents predictive modeling for developmental outcome in retinal organoids based on high-content imaging. Specifically, it compares the predictive performance of an ensemble of deep learning models with classical machine learning based on morphometric image features and predictions from human experts for four different tasks: prediction of RPE presence and lens presence (at the end of development) as well as the respective sizes. It finds that the DL model outperforms the other approaches and is predictive from early timepoints on, strongly indicating a time-frame for important decision steps in the developmental trajectory.

      Response: We thank the reviewer for the constructive and thoughtful feedback. In response to the review as found below, we have made substantial revisions and additions to the manuscript. Specifically, we clarified key aspects of the experimental setup, changed terminology regarding training/validation/test sets, and restructured our human expert baseline analysis by collecting and integrating a substantially larger dataset of expert annotations according to suggestion. We introduced the Latent Determination Horizon concept with clearer rationale and grounding. Most importantly, we significantly expanded our interpretability analyses across three CNN architectures and eight attribution methods, providing comprehensive quantitative evaluations and supplementary figures that extend beyond the initial DenseNet121 examples (new Supplementary Figures S29-S37). We also ensured full reproducibility by making both code and raw data publicly available with documentation. While certain advanced interpretability methods (e.g., Discover) could not be integrated despite considerable effort, we believe the revised manuscript presents a robust, well-documented, and carefully qualified analysis of CNN predictions in retinal organoid development.

      Major comments: I find the paper overall well written and easy to understand. The findings are relevant (see significance statement for details) and well supported. However, I have some remarks on the description and details of the experimental set-up, the data availability and reproducibility / re-usability of the data.

      1. Some details about the experimental set-up are unclear to me. In particular, it seems like there is a single organoid per well, as the manuscript does not mention any need for instance segmentation or tracking to distinguish organoids in the images and associate them over time. Is that correct? If yes, it should be explicitly stated so. Are there any specific steps in the organoid preparation necessary to avoid multiple organoids per well? Having multiple organoids per well would require the aforementioned image analysis steps (instance segmentation and tracking) and potentially add significant complexity to the analysis procedure, so this information is important to estimate the effort for setting up a similar approach in other organoid cultures (for example cancer organoids, where multiple organoids per well are common / may not be preventable in certain experimental settings).

      Response: We thank the reviewer for this question. We agree that instance segmentation and tracking would add complexity to the preprocessing presented here and would definitely be required in some organoid systems. In our experimental setup, there is only one organoid per well, which forms spontaneously after cell seeding from (almost) all seeded cells. No additional steps are necessary to ensure this behaviour in our setup. We amended the Methods section to state this explicitly (paragraph ‘Organoid timelapse imaging’).

      The terminology used with respect to the test and validation set is contrary to the field, and reporting the results on the test set (should be called validation set), should be avoided since it is used to select models. In more detail: the terms "test set" and "validation set" (introduced in 213-221) are used with the opposite meaning to their typical use in the deep learning literature. Typically, the validation set refers to a separate split that is used to monitor convergence / avoid overfitting during training, and the test set refers to an external set that is used to evaluate the performance of trained models. The study uses these terms in an opposite manner, which becomes apparent from line 624: "best performing model ... judged by the loss of the test set.". Please exchange this terminology, it is confusing to a machine learning domain expert. Furthermore, the performance on the test set (should be called validation set) is typically not reported in graphs, as this data was used for model selection, and thus does not provide an unbiased estimate of model performance. I would remove the respective curves from Figures 3 and 4.

      Response: We are thankful for the reviewer's comments on this matter. Indeed, we were using the opposite terminology to what is commonly used within the field. We have adjusted the Results, Discussion and Methods sections as well as the figures accordingly. Further, we added a corresponding disclaimer for the code base in the GitHub repository. However, we prefer not to remove the respective curves from the figures. We think that this information is crucial to interpret the variability in accuracy between organoids from the same experiments and organoids acquired from a different, independent experiment. The results suggest that the accuracy for organoids within the same experiments is still higher, indicating to users the potential accuracy drop resulting from independent experiments. As we think that this is crucial information for the interpretability of our results, we would like to still include it side-by-side with the test data in the figures.

      The experimental set-up for the human expert baseline is quite different from the evaluation of the machine learning models. The former is based on the annotation of 4,000 images by seven experts, the latter on cross-validation experiments on a larger dataset. First of all, the details on the human expert labeling procedure are very sparse; I could only find a very short description in the paragraph 136-144, but did not find any further details in the methods section. Please add a methods section paragraph that explains in more detail how the images were chosen, how they were assigned to annotators, and if there was any redundancy in annotation, and if yes how this was resolved / evaluated. Second, the fact that the set-up for human experts and ML models is quite different means that these values are not quite comparable in a statistical sense. Ideally, human estimators would follow the same set-up as in ML (as in, evaluate the same test sets). However, this would likely be prohibitive in the required effort, so I think it's enough to state this fact clearly, for example by adding a comment on this to the captions of Figure 3 and 4.

      Response: We thank the reviewer for this constructive suggestion. We agree that the curves for human evaluations in the original draft were calculated differently from the curves for the classification algorithms, mostly because of the limited feasibility of data set annotation at the time. To address this suggestion, we repeated the full human expert annotation and substantially expanded the number of annotated images. Each of 6 human experts was asked to predict/interpret 6 images of each organoid within the full dataset. To select the images, we divided the time course (0-72h) into 6 evenly spaced intervals of 12 hours. For each interval, one image per organoid and human expert was randomly selected and assigned. This resulted in a total of 31,626 classified images (up from 4,000 in the original version of the manuscript); the source intervals overlapped between experts, but the individual images did not. We then changed the calculation of the curves to be the same as for the classification analysis: F1 data were calculated for each experiment over 6 timeframes and all experts, and plotted within the respective figure. We have amended the Methods section accordingly and replaced the respective curves within Figures 3 and 4 and Supplementary Figures S1, S8 and S19.
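
      The selection scheme described above (6 intervals of 12 hours, one randomly chosen image per organoid, expert and interval) can be illustrated with a minimal sketch. This is not the code used in the study: the function name, the data layout (a mapping from organoid ID to imaging timepoints in hours) and the choice to draw distinct images for different experts where enough frames exist are all assumptions made for illustration.

```python
import random

def assign_expert_images(frames_by_organoid, experts, n_intervals=6, interval_h=12, seed=0):
    """Pick one image per organoid, expert and time interval (hypothetical helper).

    frames_by_organoid: {organoid_id: [timepoint_in_hours, ...]} (assumed layout)
    Returns a list of (expert, organoid_id, interval_index, timepoint_h) tuples.
    """
    rng = random.Random(seed)
    assignments = []
    for organoid_id, timepoints in sorted(frames_by_organoid.items()):
        for k in range(n_intervals):
            lo, hi = k * interval_h, (k + 1) * interval_h
            last = k == n_intervals - 1
            # Half-open bins [lo, hi); the last bin also keeps the final timepoint.
            in_bin = [t for t in timepoints if lo <= t < hi or (last and t == hi)]
            if not in_bin:
                continue
            if len(in_bin) >= len(experts):
                # Assumption: experts share the interval but not the individual image.
                picks = rng.sample(in_bin, len(experts))
            else:
                picks = [rng.choice(in_bin) for _ in experts]
            for expert, t in zip(experts, picks):
                assignments.append((expert, organoid_id, k, t))
    return assignments

# Usage with hourly frames for two hypothetical organoids and six experts.
frames = {"org1": list(range(0, 73)), "org2": list(range(0, 73))}
experts = ["A", "B", "C", "D", "E", "F"]
assignments = assign_expert_images(frames, experts)
```

      Fixing the random seed keeps the assignment reproducible, which makes it easier to recompute per-interval scores such as F1 on exactly the same images.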

      It is unclear to me where the theoretical time window for the Latent Determination Horizon in Figure 5 (also mentioned in line 350) comes from. Please explain this in more detail and provide a citation for it.

      Response: We thank the reviewer for this important point. The Latent Determination Horizon (LDH) is a conceptual framework we introduced in this study to describe the theoretical period during which the eventual presence of a tissue outcome of interest (TOI) is being determined but not yet detectable. It is derived from two main observations in our dataset: (i) the inherent intra- and inter-experimental heterogeneity of organoid outcomes despite standardized protocols, and (ii) the progressive increase in predictive performance of our deep learning models over time, which suggests that informative morphological features only emerge gradually. We have now clarified this rationale in the manuscript (Discussion section) further and explicitly stated that the LDH is a concept we introduce here, rather than a previously described or cited term.

      The time window is bounded by TOI visibility, which we define empirically from the results of our human expert panel (compare also Supplementary Figure S1).

      The interpretability analysis (Figure 4, 634-639) based on relevance backpropagation was performed based on DenseNet121 only. Why did you choose this model and not the ResNet / MobileNet? I think it is quite crucial to see if there are any differences between these models, as this would show how much weight can be put on the evidence from this analysis, and I would suggest adding an additional experiment and supplementary figure on this.

      Response: We thank the reviewer for this important comment regarding the interpretability analysis and the choice of model. In the original submission, we restricted the attribution analyses shown in original Figure 4C to DenseNet121, which served as our main reference model throughout the study. This choice was made primarily for clarity and to avoid redundancy in the main figures, as all three convolutional neural network (CNN) architectures (DenseNet121, ResNet50, MobileNetV3_Large) achieved comparable classification performance on our tasks.

      In response to the reviewer’s concern, we have now extended the interpretability analyses to include all three CNN architectures and a total of eight attribution methods (new Supplementary Note 1). Specifically, we generated saliency maps for DenseNet121, ResNet50, and MobileNetV3_Large across multiple time points and evaluated them using a systematic set of metrics: pairwise method agreement within each model (new Supplementary Figure S29), cross-model consistency per method (new Supplementary Figure S34), entropy and diffusion of saliencies over time (new Supplementary Figure S35), regional voting overlap across methods (new Supplementary Figure S36), and spatial drift of saliency centers of mass (new Supplementary Figure S37).

      These pooled analyses consistently showed that attribution methods differ markedly in the regions they prioritize, but that their relative behaviors were mostly stable across the three CNN architectures. For example, Grad-CAM and Guided Grad-CAM exhibited strong internal agreement and progressively focused relevance into smaller regions, while gradient-based methods such as DeepLiftSHAP and Integrated Gradients maintained broader and more diffuse relevance patterns but were the most consistent across models. Perturbation-based methods like Feature Ablation and Kernel SHAP often showed decreasing entropy and higher spatial drift, again similarly across architectures.
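
      To make two of the metrics named above concrete (entropy of a saliency map and spatial drift of its center of mass), a minimal sketch follows. It is only illustrative: the function names are hypothetical, saliency maps are assumed to be plain 2D lists of numbers, and the exact definitions used in the study may differ.

```python
import math

def saliency_entropy(saliency):
    """Shannon entropy (in bits) of a 2D map after normalising |values| to sum to 1."""
    weights = [abs(v) for row in saliency for v in row]
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

def center_of_mass(saliency):
    """Intensity-weighted (row, column) centroid of the absolute saliency values."""
    total = sum(abs(v) for row in saliency for v in row)
    if total == 0:
        raise ValueError("saliency map has no mass")
    r = sum(i * abs(v) for i, row in enumerate(saliency) for v in row) / total
    c = sum(j * abs(v) for row in saliency for j, v in enumerate(row)) / total
    return r, c

def spatial_drift(map_a, map_b):
    """Euclidean distance (in pixels) between the centroids of two saliency maps."""
    (r0, c0), (r1, c1) = center_of_mass(map_a), center_of_mass(map_b)
    return math.hypot(r1 - r0, c1 - c0)

# Toy maps: a diffuse map has high entropy, a single hot pixel has none.
diffuse = [[1.0, 1.0], [1.0, 1.0]]
early = [[1.0, 0.0], [0.0, 0.0]]
late = [[0.0, 0.0], [0.0, 1.0]]
```

      Under these definitions, relevance that concentrates into smaller regions over time shows up as decreasing entropy, and a shifting focus between timepoints shows up as a larger drift.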

      To further address the reviewer’s point, we visualized the organoid depicted in original Figure 4C across all three CNNs and all eight attribution methods (new Supplementary Figures S30-S33). These comparisons confirm and extend analysis of the qualitative patterns described in original Figure 4C and show that they are not specific to DenseNet121, but are representative of the general behavior across architectures.

      In sum, we observed notable differences in how relevance was assigned and how consistently these assignments aligned. Highlighted organoid patterns were not consistent enough across attribution methods for us to base unequivocal biological interpretation on them. Nevertheless, we believe that the analyses in response to the reviewer’s suggestions (new Supplementary Note 1 and new Supplementary Figures S29-S37) add valuable context to what can be expected from machine learning models in an organoid research setting.

      As we did not base further unequivocal biological claims on the relevance backpropagation, we decided to move the analyses to the Supporting Information and now show a new model predicting organoid morphology by morphometrics clustering at the final imaging timepoint in new Figure 4C in line with suggestions by Reviewer #3.

      The code referenced in the code availability statement is not yet present. Please make it available and ensure a good documentation for reproducibility. Similarly, it is unclear to me what is meant by "The data that supports the findings will be made available on HeiDoc". Does this only refer to the intermediate results used for statistical analysis? I would also recommend to make the image data of this study available. This could for example be done through a dedicated data deposition service such as BioImageArchive or BioStudies, or with less effort via zenodo. This would ensure both reproducibility as well as potential re-use of the data. I think the latter point is quite interesting in this context; as the authors state themselves it is unclear if prediction of the TOIs isn't even possible at an earlier point that could be achieved through model advances, which could be studied by making this data available.

      Response: We thank the reviewer for this comment. We have now made the repository and raw data public on the suggested platform (Zenodo) and apologize for this oversight. The links are contained within the GitHub repository, which is stated in the manuscript under “Data availability”.

      Minor comments:

      Line 315: Please add a citation for relevance backpropagation here.

      Response: We have included citations for all relevance backpropagation methods used in the paper.

      Line 591: There seems to be a typo: "[...] classification of binary classification [...]"

      Response: Corrected as suggested.

      Line 608: "[...] where the images of individual organoids served as groups [...]" It is unclear to me what this means.

      Response: We wanted to express that organoid images belonging to one organoid were assigned in full to a training/validation set. We have now stated this more clearly in the Methods section.
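
      The grouping described here can be illustrated with a minimal sketch, assuming each image carries an organoid identifier. The function name, the validation fraction, and the seed are hypothetical choices for illustration and not the study's actual settings.

```python
import random

def split_by_organoid(organoid_ids, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so that all images of one organoid share a set."""
    unique_ids = sorted(set(organoid_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_val = max(1, round(len(unique_ids) * val_fraction))
    val_ids = set(unique_ids[:n_val])
    # Assign every image index according to the set its organoid belongs to.
    train_idx = [i for i, g in enumerate(organoid_ids) if g not in val_ids]
    val_idx = [i for i, g in enumerate(organoid_ids) if g in val_ids]
    return train_idx, val_idx
```

      Splitting at the organoid level rather than the image level prevents near-identical consecutive frames of the same organoid from appearing in both sets.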

      Reviewer #1 (Significance (Required)):

      General assessment: This study demonstrates that (retinal) organoid development can be predicted from early timepoints with deep learning, where these cannot be discerned by human experts or simpler machine learning models. This fact is very interesting in itself due to its implication for organoid development, and could provide a valuable tool for molecular analysis of different organoid populations, as outlined by the authors. The contribution could be strengthened by providing a more thorough investigation of what features in the image are predictive at early timepoints, using a more sophisticated approach than relevance backprop, e.g. Discover (https://www.nature.com/articles/s41467-024-51136-9). This could provide further biological insight into the underlying developmental processes and enhance the understanding of retinal organoid development.

      Response: We thank the reviewer for this assessment and suggestion. We agree that identifying image features predictive at early timepoints would add important biological context. We therefore attempted to apply Discover to our dataset. However, we were unable to get the system to run successfully. After considerable effort, we concluded that this approach could not be integrated into our current analysis. Instead, we report our substantially expanded results obtained with relevance backpropagation, which provided the most interpretable and reproducible insights for our study as described above (New Supplementary Note 1, new Supplementary Figures S29-S37).

      Advance: similar studies that predict developmental outcome based on image data, for example cell proliferation or developmental outcome exist. However, to the best of my knowledge, this study is the first to apply such a methodology to organoids and convincingly shows its efficacy and argues its potential practical benefits. It thus constitutes a solid technical advance, that could be especially impactful if it could be translated to other organoid systems in the future.

      Response: We thank the reviewer for this positive assessment of our work and for highlighting its novelty and potential impact. We are encouraged that the reviewer recognizes the value of applying predictive modeling to organoids and the opportunities this creates for translation to other organoid systems.

      Audience: This research is of interest to a technical audience. It will be of immediate interest to researchers working on retinal organoids, who could adapt and use the proposed system to support experiments by better distinguishing organoids during development. To enable this application, code and data availability should be ensured (see above comments on reproducibility). It is also of interest to researchers in other organoid systems, who may be able to adapt the methodology to different developmental outcome predictions. Finally, it may also be of interest to image analysis / deep learning researchers as a dataset to improve architectures for predictive time series modeling.

      My research background: I am an expert in computer vision and deep learning for biomedical imaging, especially in microscopy. I have some experience developing image analysis for (cancer) organoids. I don't have any experience on the wet lab side of this work.

      Response: We thank the reviewer for this encouraging feedback and for recognizing the broad relevance of our work across retinal organoid research, other organoid systems, and the image analysis community. We are pleased that the potential utility of our dataset and methodology is appreciated by experts in computer vision and biomedical imaging. We have now made the repository and raw data public and apologize for this oversight. The links are provided in the manuscript under “Data availability”.

      Constantin Pape


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Afting et al. present a computational pipeline for analyzing timelapse brightfield images of retinal organoids derived from Medaka fish. Their pipeline processes images along two paths: 1) morphometrics (based on computer vision features from skimage) and 2) deep learning. They discovered, through extensive manual annotation of ground truth, that their deep learning method could predict retinal pigmented epithelium and lens tissue emergence in time points earlier than either morphometrics or expert predictions. Our review is formatted based on the review commons recommendation.

      Response: We thank the reviewer for the detailed and constructive feedback, which has greatly improved the clarity and rigor of our manuscript. In response, we have corrected a potential data leakage issue, re-ran the affected analyses, and confirmed that results remain unchanged. We clarified the use of data augmentation in CNN training, tempered some claims throughout the text, and provided stronger justification for our discretization approach together with new supplementary analyses (New Supplementary Figures S26, S27). We substantially expanded our interpretability analyses across three CNN architectures and eight attribution methods, quantified their consistency and differences (new Supplementary Figures S29, S34-S37, new Supplementary Note 1), and added comprehensive visualizations (New S30-S33). We also addressed technical artifact controls, provided downsampling analyses to support our statement on sample size sufficiency (new Supplementary Figure S28), and included negative-control baselines with shuffled labels in Figures 3 and 4. Furthermore, we improved the clarity of terminology, figures, and methodological descriptions, and we have now made both code and raw data publicly available with documentation. Together, we believe these changes further strengthen the robustness, reproducibility, and interpretability of our study while carefully qualifying the claims.

      Major comments:

      Are the key conclusions convincing?

      Yes, the key conclusion that deep learning outperforms morphometric approaches is convincing. However, several methodological details require clarification. For instance, were the data splitting procedures conducted in the same manner for both approaches? Additionally, the authors note in the methods: "The validation data were scaled to the same range as the training data using the fitted scalers obtained from the training data." This represents a classic case of data leakage, which could artificially inflate performance metrics in traditional machine learning models. It is unclear whether the deep learning model was subject to the same issue. Furthermore, the convolutional neural network was trained with random augmentations, effectively increasing the diversity of the training data. Would the performance advantage still hold if the sample size had not been artificially expanded through augmentation?

      Response: We thank the reviewer for raising these important methodological points. As Reviewer #1 correctly noted, our use of the terms validation and test may have contributed to confusion. To clarify: in the original analysis the scalers were fitted on the training and validation data and then applied to the test data. This indeed constitutes a form of data leakage. We have corrected the respective code, re-ran all analyses that were potentially affected, and did not observe any meaningful change in the reported results. The Methods section has been amended to clarify this important detail.
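
      To make the corrected procedure concrete, the following is a minimal sketch of leakage-free scaling, assuming standard (Z-score) scaling of a feature matrix with samples in rows. The function names are hypothetical; the type of scaler used in the study may differ.

```python
import numpy as np

def fit_standard_scaler(train_features):
    """Estimate per-feature mean and standard deviation from training data only."""
    train_features = np.asarray(train_features, dtype=float)
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return mean, std

def apply_scaler(features, mean, std):
    """Scale any split (training, validation, test) with the training statistics."""
    return (np.asarray(features, dtype=float) - mean) / std
```

      The key point is that `fit_standard_scaler` never sees validation or test samples; those splits are only passed through `apply_scaler`.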

      For the neural networks, each image was normalized independently (per image), without using dataset-level statistics, thereby avoiding any risk of data leakage.
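
      A minimal sketch of per-image normalization is given below. It assumes zero-mean, unit-variance normalization; the exact scheme used for the networks (for example min-max scaling) is not specified here, so this is an illustrative assumption.

```python
import numpy as np

def normalize_per_image(image, eps=1e-8):
    """Normalize one image using only its own mean and standard deviation."""
    image = np.asarray(image, dtype=np.float32)
    return (image - image.mean()) / (image.std() + eps)
```

      Because the statistics come from the image itself, no information from other samples or splits enters the preprocessing.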

      Regarding data augmentation, the convolutional neural network was indeed trained with augmentations. Early experiments without augmentation led to severe overfitting, confirming that the performance advantage would not hold without artificially increasing the effective sample size. We have added a clarifying statement in the Methods section to make this explicit.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Their claims are currently preliminary, pending increased clarity and additional computational experiments described below.

      Response: We believe our additionally performed computational experiments qualify all the claims we make in the revised version of the manuscript.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • The authors discretize continuous variables into four bins for classification. However, a regression framework may be more appropriate for preserving the full resolution of the data. At a minimum, the authors should provide a stronger justification for this binning strategy and include an analysis of bin performance. For example, do samples near bin boundaries perform comparably to those near the bin centers? This would help determine whether the discretization introduces artifacts or obscures signals.

      Response: We thank the reviewer for this thoughtful suggestion. We agree that regression frameworks can, in principle, preserve the full resolution of continuous outcome variables. However, in our setting we deliberately chose a discretization approach. First, the discretized outcome categories correspond to ranges of tissue sizes that are biologically meaningful and allow direct comparison to expert annotations. In practice, human experts also tend to judge tissue presence and size in categorical rather than strictly continuous terms, which was mirrored by our human expert annotation strategy. As we aimed to compare deep learning with classical machine learning models and with expert annotations across the same prediction tasks, a categorical outcome formulation provided the most consistent and fair framework. Secondly, the underlying outcome variables did not follow a normal distribution, but instead exhibited a skewed and heterogeneous spread. Regression models trained on such distributions often show biases toward the most frequent value ranges, which may obscure less common but biologically important outcomes. Discretization mitigated this issue by balancing the prediction task across defined size categories.

      In line with the reviewer’s request, we have now analyzed the performance in relation to the distance of each sample from the bin center. These results are provided as new Supplementary Figures S26 and S27. Interestingly, for the classical machine learning classifiers, F1 scores tended to be somewhat higher for samples close to bin edges. For the convolutional neural networks, however, F1 scores were more evenly distributed across distances from bin centers. While the reason for this difference remains unclear, the analysis demonstrates that the discretization did not obscure predictive signals in either framework. We have amended the results section accordingly.
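
      The distance measure behind such an analysis can be sketched as follows, assuming four bins defined by five edges and a distance normalized to the half-width of each bin. The function name and the example edges are hypothetical.

```python
import numpy as np

def distance_from_bin_center(values, bin_edges):
    """Return (bin index, normalized distance) where 0 = bin center and 1 = bin edge."""
    values = np.asarray(values, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    # Bin index of each value; the last edge is treated as inclusive.
    idx = np.clip(np.searchsorted(edges, values, side="right") - 1, 0, len(edges) - 2)
    lower, upper = edges[idx], edges[idx + 1]
    center = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return idx, np.abs(values - center) / half_width
```

      Per-sample correctness can then be grouped by this normalized distance to compare performance near bin centers with performance near bin edges.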

      • The relevance backpropagation interpretation analysis is not convincing. The authors argue that the model's use of pixels across the entire image (rather than just the RPE region) indicates that the deep learning approach captures holistic information. However, only three example images are shown out of hundreds, with no explanation for their selection, limiting the generalizability of the interpretation. Additionally, it is unclear how this interpretability approach would work at all in earlier time points, particularly before the model begins making confident predictions around the 8-hour mark. It is also not specified whether the input used for GradSHAP matches the input used during CNN training. The authors should consider expanding this analysis by quantifying pixel importance inside versus outside annotated regions over time. Lastly, Figure 4C is missing a scale bar, which would aid in interpretability.

      Response: We thank the reviewer for raising these important concerns. In the initial version we showed examples of relevance backpropagation that suggested CNNs rely on visible RPE or lens tissue for their predictions (original Figure 4C). Following the reviewer’s comment, we expanded the analysis extensively across all models and attribution methods (compare new Supplementary Note 1), and quantified agreement, consistency, entropy, regional overlap, and drift (new Supplementary Figures S29 and S34-S37), as well as providing comprehensive visualizations across models and methods (new Supplementary Figures S30-S33).

      This extended analysis showed that attribution methods behave very differently from each other, but consistently so across the three CNN architectures. Each method displayed characteristic patterns, for example in entropy or center-of-mass drift, but the overlap between methods was generally low. While integrated gradients and DeepLiftSHAP tended to concentrate on tissue regions, other methods produced broader or shifting relevance patterns, and overall we could not establish robust or interpretable signals from a biological point of view that would support stronger conclusions.

      We have therefore revised the text to focus on descriptive results only, without making claims about early structural information or tissue-specific cues being used by the networks. We also added missing scale bars and clarified methodological details. Together, the revised section now reflects the extensive work performed while remaining cautious about what can and cannot be inferred from saliency methods in this setting.

      • The authors claim that they removed technical artifacts to the best of their ability, but it is unclear if the authors performed any adjustment beyond manual quality checks for contamination. Did the authors observe any illumination artifacts (either within a single image or over time)? Any other artifacts or procedures to adjust?

      Response: We thank the reviewer for this comment. We have not performed any adjustment beyond manual quality control post organoid seeding. The aforementioned removal of technical artifacts included, among others, seeding at the same time of day, seeding and cell processing by the same investigator according to a standardized protocol, usage of reproducible chemicals (same LOT, frozen only once, etc.) and temperature control during image acquisition. We adhered strictly to internal, previously published workflows that were aimed to reduce any variability due to technical variations during cell harvesting, organoid preparation and imaging. We have clarified this important point in the Methods section.

      • In line 434-436 the authors state "In this work, we used 1,000 organoids in total, to achieve the reported prediction accuracies. Yet, we suspect that as little as ~500 organoids are sufficient to reliably recapitulate our findings." It is unclear what evidence the authors use to support this claim? The authors could perform a downsampling analysis to determine tradeoff between performance and sample size.

      Response: We thank the reviewer for this important comment. To clarify, our statement regarding the sufficiency of ~500 organoids was based on a downsampling-style analysis we had already performed. In this analysis, we systematically reduced the number of experiments used for training and assessed predictive performance for both CNN- and classifier-based approaches (former Supplementary Figure S11, new Supplementary Figure S28). For CNNs, performance curves plateaued at approximately six experiments (corresponding to ~500 organoids), suggesting that increasing the sample size further only marginally improved prediction accuracy. In contrast, we did not observe a clear plateau for the machine learning classifiers, indicating that these models can achieve comparable performance with fewer training experiments. We have revised the manuscript text to clarify that this conclusion is derived from these analyses, and continue to include Supplementary Figure S11 as new Supplementary Figure S28 for transparency (compare Supplementary Note 1).

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes, we believe all experiments are realistic in terms of time and resources. We estimate all experiments could be completed in 3-6 months.

      Response: We confirm that the suggested experiments are realistic in terms of time and resources and have been able to complete them within 6 months.

      Are the data and the methods presented in such a way that they can be reproduced? No, the code is not currently available. We were not able to review the source code.

      Response: We have now made the repository public. We apologize for this initial oversight. The links are provided in the revised version of the manuscript under “Data availability”.

      Are the experiments adequately replicated and statistical analysis adequate?

      • The experiments are adequately replicated.

      • The statistical analysis (deep learning) is lacking a negative control baseline, which would be helpful to observe if performance is inflated.

      Response: We thank the reviewer for this comment. We have calculated the respective curves with neural networks and machine learning classifiers that were trained on data with shuffled labels and have included these results as a separate curve in the respective Figures 3 and 4. We have also amended the Methods section accordingly.
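
      A minimal sketch of how such a shuffled-label control can be set up is shown below, assuming labels are permuted once with a fixed seed before training. The function name is hypothetical.

```python
import numpy as np

def shuffle_labels(labels, seed=0):
    """Randomly permute training labels, breaking the link between images and outcomes."""
    rng = np.random.default_rng(seed)
    return rng.permutation(np.asarray(labels))
```

      The permutation preserves class frequencies, so a model trained on the shuffled labels indicates the performance that can be reached from class balance alone.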

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes.

      Are the text and figures clear and accurate?

      The authors must improve clarity on terminology. For example, they should define a comprehensive dataset, significant, and provide clarity on their morphometrics feature space. They should elaborate on what they mean by "confounding factor of heterogeneity".

      Response: We thank the reviewer for highlighting the need to clarify terminology. We have revised the manuscript accordingly. Specifically, we now explicitly define comprehensive dataset as longitudinal brightfield imaging of ~1,000 organoids from 11 independent experiments, imaged every 30 minutes over several days, covering a wide range of developmental outcomes at high temporal resolution. Furthermore, we replaced the term significantly with wording that avoids implying statistical significance, where appropriate. We have clarified the morphometrics feature space in the Methods section in a more detailed fashion, describing the custom parameters that we used to enhance the regionprops_table function of skimage.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions? - Figure 2C describes a distance between what? The y axis is likely too simple. Same confusion over Figure 2D. Was distance computed based on tsne coordinates?

      Response: We thank the reviewer for pointing out this potential source of confusion. The distances shown in original Figures 2C and 2D were not calculated in tSNE space. Instead, morphometrics features were first Z-scaled, and then dimensionality reduction by PCA was applied, with the first 20 principal components retaining ~93% of the variance. Euclidean distances were subsequently computed in this 20-dimensional PC space. For inter-organoid distances (Figure 2C), we calculated mean pairwise Euclidean distances between all organoids at each imaging time point, capturing the global divergence of organoid morphologies over time in an experiment-specific manner. For intra-organoid distances (Figure 2D), we calculated Euclidean distances between consecutive time points (n vs. n+1) for each individual organoid, thereby quantifying the extent of morphological change within organoids over time. We have revised the Figure legend and Methods section to make these definitions clearer.
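
      The distance definitions above can be sketched as follows, assuming a feature matrix with one row per organoid image and PCA computed via singular value decomposition of the Z-scaled features. Function names are hypothetical, and how the PCA was fitted across time points in the study is not specified here.

```python
import numpy as np

def pca_scores(features, n_components=20):
    """Z-scale features, then project onto the leading principal components."""
    features = np.asarray(features, dtype=float)
    std = features.std(axis=0)
    std[std == 0] = 1.0
    z = (features - features.mean(axis=0)) / std
    # Rows of vt are principal axes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T

def mean_pairwise_distance(scores):
    """Mean Euclidean distance over all organoid pairs at one time point (inter-organoid)."""
    diffs = scores[:, None, :] - scores[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(scores), k=1)
    return float(dist[upper].mean())

def consecutive_distances(trajectory):
    """Distances between time points n and n+1 for one organoid (intra-organoid)."""
    return np.linalg.norm(np.diff(trajectory, axis=0), axis=1)
```

      Here `scores` holds the PC coordinates of all organoids at a single time point, while `trajectory` holds the PC coordinates of one organoid ordered by time.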

      • The authors perform a Herculean analysis comparing dozens of different machine learning classifiers. They select two, but they should provide justification for this decision.

      Response: We thank the reviewer for this comment. In our initial machine learning analyses, we systematically benchmarked a broad set of classifiers on the morphometrics feature space, using cross-validation and hyperparameter tuning where appropriate. The classifiers that we ultimately focused on were those that consistently achieved the best performance in these comparisons. This process is described in the Methods and summarized in Supplementary Figures S4 and S15 (for sum- and maximum-intensity z-projections, new Supplementary Figures S5/6 and S16/17), which show the results of the benchmarking. We have clarified the text to state that the selected classifiers were chosen on the basis of their superior performance in these evaluations.

      • It would be good to get a sense for how these retinal organoids grow - are they moving all over the place? They are in Matrigel so maybe not, but are they rotating?

      Can the authors' approach predict an entire non-emergence experiment? The authors tried to standardize protocol, but ultimately if it's deriving this much heterogeneity, then how well it will actually generalize to a different lab is a limitation.

      Response: We thank the reviewer for these thoughtful questions. The retinal organoids in our study were embedded in low concentrations of Matrigel and remained relatively stable in position throughout imaging. We did not observe substantial displacement or lateral movement of organoids, and no systematic rotation could be detected in our dataset. Small morphological rearrangements within organoids were observed, but the gross positioning of organoids within the wells remained consistent across time-lapse recordings.

      Regarding generalization across laboratories, we agree with the reviewer that this is an important limitation. While we minimized technical variability by adhering to a highly standardized, published protocol (see Methods), considerable heterogeneity remained at both intra- and inter-experimental levels. This variability likely reflects inherent properties of the system, similar to reports in the literature across organoid systems, rather than technical artifacts, and poses a potential challenge for applying our models to independently generated datasets. We therefore highlight the need for future work to test the robustness of our models across laboratories, which will be essential to determine the true generalizability of our approach. We have amended the Discussion accordingly.

      • The authors should dampen claims throughout. For example, in the abstract they state, "by combining expert annotations with advanced image analysis". The image analysis pipelines use common approaches.

      Response: We thank the reviewer for this comment. We agree that the individual image analysis steps we used, such as morphometric feature extraction, are based on well-established algorithms. By referring to “advanced image analysis,” we intended to highlight not the novelty of each single algorithm, but rather the way in which we systematically combined a large number of quantitative parameters and leveraged them through machine learning models to generate predictive insights into organoid development.

      • The authors state: "the presence of RPE and lenses were disagreed upon by the two independently annotating experts in a considerable fraction of organoids (3.9 % for RPE, 2.9% for lenses).", but it is unclear why there were two independently annotating experts. The supplements say images were split between nine experts for annotation.

      Response: We thank the reviewer for pointing out this ambiguity. To clarify, the ground truth definition at the final time point was established by two experts who annotated all organoids. These two annotators were part of the larger group of six experts who contributed to the earlier human expert annotation tasks. Thus, while six experts provided annotations for subsets of images during the expert prediction experiments, the final annotation for every single organoid at its last time frame was consistently performed by the same two experts to ensure a uniform ground truth. We have amended this in the revised manuscript to make this distinction clear.

      • Details on the image analysis pipeline would be helpful to clarify. For example, why did they choose to measure these 165 morphology features? Which descriptors were used to quantify blur? Did the authors apply blur metrics per FOV or per segmented organoid?

      Response: We thank the reviewer for this comment. To clarify, we extracted 165 morphometric features per segmented organoid, combining standard scikit-image region properties with custom implementations (e.g., blur quantified as the variance of the Laplace filter response within the organoid mask). All metrics, including blur, were calculated per segmented organoid rather than per full field of view. This broad feature space was deliberately chosen to capture size, shape, and intensity distributions in a comprehensive and unbiased manner. We now provide a more detailed description of the preprocessing steps, the full feature list, and the exact code implementations in the Methods section (“Large-scale time-lapse Image analysis”) of the revised version of the manuscript as well as in the source code GitHub repository.
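
      As an illustration of the blur descriptor, the following is a minimal sketch that computes the variance of a discrete Laplace filter response inside a boolean organoid mask. It assumes a 4-neighbor Laplacian with edge padding; the function name is hypothetical and the implementation in the repository may differ.

```python
import numpy as np

def laplacian_variance(image, mask):
    """Variance of the 4-neighbor Laplacian response within a boolean mask."""
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    # Sum of the four direct neighbors minus four times the center pixel.
    lap = (padded[:-2, 1:-1] + padded[2:, 1:-1]
           + padded[1:-1, :-2] + padded[1:-1, 2:] - 4.0 * img)
    return float(lap[np.asarray(mask, dtype=bool)].var())
```

      Sharp images contain strong local intensity changes and yield a high variance, whereas blurred images yield values close to zero.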

      • The description of the number of images is confusing and distracts from the number of organoids. The number of organoids and number of timepoints used would provide a better description of the data with more value. For example, does this image count include all five z slices?

      Response: We thank the reviewer for this comment. The reported image count includes slice 3 only, which we based our models on. The five z-slices that we used to create the MAX- and SUM-intensity z-projections would increase this number 5-fold. While we agree that the number of organoids and time points are highly informative metrics and have provided these details in the manuscript, we also believe that reporting the image count is valuable, as it directly reflects the size of the dataset processed by our analysis pipelines. For this reason, we prefer to keep the current description.

      • The authors should consider applying a maximum projection across the five z slices (rather than the middle z) as this is a common procedure in image analysis. Why not analyze three-dimensional morphometrics or deep learning features? Might this improve performance further?

      Response: We thank the reviewer for this valuable suggestion. To address this point, we repeated all analyses using both sum- and maximum-intensity z-projections and have included the results as new Supplementary Figures S8-S10, S13/S14 for TOI emergence and new Supplementary Figures S19-S21, S24/S25 for TOI sizes (classifier benchmarking and hyperparameter tuning in new Supplementary Figures S5/S6 and S16/S17). These additional analyses did not reveal a noticeable improvement in performance, suggesting that projections incorporating all slices are not strictly necessary in our setting. An analysis that included all five z-slices separately for classification would indeed be of interest, but was not feasible within the scope of this study, as it would substantially increase the computational demands beyond the available resources and timeframe.
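
      The three image variants compared here can be sketched as follows, assuming a z-stack stored as an array of shape (z, height, width). The function name and dictionary keys are hypothetical.

```python
import numpy as np

def z_projections(stack):
    """Return the middle slice plus maximum- and sum-intensity projections of a z-stack."""
    stack = np.asarray(stack)
    return {
        "middle": stack[stack.shape[0] // 2],  # slice 3 of 5 (1-based)
        "max": stack.max(axis=0),
        # Accumulate in float64 so that integer image types cannot overflow.
        "sum": stack.sum(axis=0, dtype=np.float64),
    }
```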

      • There is a lot of manual annotation performed in this work, the authors could speculate how this could be streamlined for future studies. How does the approach presented enable streamlining?

      Response: We thank the reviewer for raising this important point. The current study relied on expert visual review, which is time-intensive, but our findings suggest several ways to streamline future work. For instance, model-assisted prelabeling could be used to automatically accept high-confidence cases while routing only uncertain cases to experts. Active sampling strategies, focusing expert review on boundary cases or rare classes, as well as programmatic checks from morphometrics (e.g., blur or contrast to flag low-quality frames), could further reduce effort. Consensus annotation could be reserved only for cases where the model and expert disagree or confidence is low. Finally, new experiments could be bootstrapped with a small seed set of annotated organoids for fine-tuning before switching to such a model-assisted workflow. These possibilities are enabled by our approach, where organoids are imaged individually, morphometrics provide automated quality indicators, and the CNN achieves reliable performance at early developmental stages, making model-in-the-loop annotation a feasible and efficient strategy for future studies. We have added a clarifying paragraph to the Discussion accordingly.
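
      The model-assisted prelabeling idea can be sketched as follows, assuming the model outputs one list of class probabilities per organoid. The function name and the threshold of 0.95 are hypothetical and would need to be calibrated on annotated data.

```python
def route_for_annotation(probabilities, accept_threshold=0.95):
    """Split samples into auto-accepted predictions and cases needing expert review."""
    auto_accepted, needs_review = [], []
    for index, class_probs in enumerate(probabilities):
        confidence = max(class_probs)
        predicted = class_probs.index(confidence)
        if confidence >= accept_threshold:
            auto_accepted.append((index, predicted))
        else:
            needs_review.append(index)
    return auto_accepted, needs_review
```

      Only the indices in `needs_review` would be passed to experts, which concentrates annotation effort on uncertain cases.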

      Reviewer #2 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The paper's advance is technical (providing new methods for organoid quality control) and conceptual (providing proof of concept that earlier time points contain information to predict specific future outcomes in retinal organoids)

      Place the work in the context of the existing literature (provide references, where appropriate).

      • The authors do a good job of placing their work in context in the introduction.
      • The work presents a simple image analysis pipeline (using only the middle z slice) to process timelapse organoid images. So not a 4D pipeline (time and space), just 3D (time). It is likely that more and more of these approaches will be developed over time, and this article is one of the early attempts.

      • The work uses standard convolutional neural networks.

      Response: We thank the reviewer for this assessment. We agree that our work represents one of the early attempts in this direction, applying a straightforward pipeline with standard convolutional neural networks, and we appreciate the reviewer’s acknowledgment of how the study has been placed in context within the Introduction.

      State what audience might be interested in and influenced by the reported findings.

      • Data scientists performing image-based profiling for time lapse imaging of organoids.

      • Retinal organoid biologists

      • Other organoid biologists who may have long growth times with indeterminate outcomes.

      Response: We thank the reviewer for outlining the relevant audiences. We agree that the reported findings will be of interest to data scientists working on image-based profiling, retinal organoid biologists, and more broadly to organoid researchers facing long culture times with uncertain developmental outcomes.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Image-based profiling/morphometrics

      • Organoid image analysis

      • Computational biology

      • Cell biology

      • Data science/machine learning

      • Software

      This is a signed review:

      Gregory P. Way, PhD

      Erik Serrano

      Jenna Tomkinson

      Michael J. Lippincott

      Cameron Mattson

      Department of Biomedical Informatics, University of Colorado


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript by Afting et al. addresses the challenge of heterogeneity in retinal organoid development by using deep learning to predict eventual tissue outcomes from early-stage images. The central hypothesis is that deep learning can forecast which tissues an organoid will form (specifically retinal pigmented epithelium, RPE, and lens) well before those tissues become visibly apparent. To test this, the authors assembled a large-scale time-lapse imaging dataset of ~1,000 retinal organoids (~100,000 images) with expert annotations of tissue outcomes. They characterized the variability in organoid morphology and tissue formation over time, focusing on two tissues: RPE (which requires induction) and lens (which appears spontaneously). The core finding is that a deep learning model can accurately predict the emergence and size of RPE and lens in individual organoids at very early developmental stages. Notably, a convolutional neural network (CNN) ensemble achieved high predictive performance (F1-scores ~0.85-0.9) hours before the tissues were visible, significantly outperforming human experts and classical image-analysis-based classifiers. This approach effectively bypasses the issue of stochastic developmental heterogeneity and defines an early "determination window" for fate decisions. Overall, the study demonstrates a proof-of-concept that artificial intelligence can forecast organoid differentiation outcomes non-invasively, which could revolutionize how organoid experiments are analyzed and interpreted.

      Recommendation:

      While this manuscript addresses an important and timely scientific question using innovative deep learning methodologies, it currently cannot be recommended for acceptance in its present form. The authors must thoroughly address several critical limitations highlighted in this report. In particular, significant issues remain regarding the generalizability of the predictive models across different experimental conditions, the interpretability of deep learning predictions, and the use of Euclidean distance metrics in high-dimensional morphometric spaces, potentially leading to distorted interpretations of organoid heterogeneity. These revisions are essential for validating the general applicability of their approach and enhancing biological interpretability. After thoroughly addressing these concerns, the manuscript may become suitable for future consideration.

      Response: We thank the reviewer for the thoughtful and constructive comments. In response, we expanded our analyses in several key ways. We clarified limitations regarding external datasets. Interpretability analyses were greatly extended across three CNN architectures and eight attribution methods (new Supplementary Figures S29-S37, new Supplementary Note 1), showing consistent but method-specific behaviors; as no reproducible biologically interpretable signals emerged, we now present these results descriptively and clearly state their limitations. We further demonstrated the flexibility of our framework by predicting morphometric clusters in addition to tissue outcomes (new Figure 4C), confirmed robustness of the morphometrics space using PCA and nearest-neighbor analyses (new Supplementary Figure S3), and added statistical tests confirming CNNs significantly outperform classical classifiers (Supplementary File 1). Finally, we made all code and raw data publicly available, clarified species context, and added forward-looking discussion on adaptive interventions. We believe these revisions now further improve the rigor and clarity of our work.

      Major Issues (with Suggestions):

      1. Generalization to Other Batches or Protocols: The drop in performance on independent validation experiments suggests the model may partially overfit to specific experimental conditions. A major concern is how well this approach would work on organoids from a different batch or produced by a slightly different differentiation protocol. Suggestion: The authors should clarify the extent of variability between their "independent experiment" and training data (e.g., were these done months apart, with different cell lines or minor protocol tweaks?). To strengthen confidence in the model's robustness, I recommend testing the trained model on one or more truly external datasets, if available (for instance, organoids generated in a separate lab or under a modified protocol). Even a modest analysis showing the model can be adapted (via transfer learning or re-training) to another dataset would be valuable. If new data cannot be added, the authors should explicitly discuss this limitation and perhaps propose strategies (like domain adaptation techniques or more robust training with diverse conditions) to handle batch effects in future applications.

      Response: We thank the reviewer for this important comment. We fully agree with the reviewer that this would be a valuable addition to the manuscript. Unfortunately, we are not able to obtain the requested external dataset. Although retinal organoid systems exist and are widely used across different species, to the best of our knowledge our laboratory is the only one currently raising retinal organoids from primary embryonic pluripotent stem cells of Oryzias latipes, and there is currently only one known (and published) differentiation protocol that allows the successful generation of these organoids. We note that our datasets were collected over the course of nine months, which already introduces variability across time and thus partially addresses concerns regarding batch effects. While we did not have access to truly external datasets (e.g., from other laboratories), we have clarified this limitation as suggested in the revised version of the manuscript and outlined strategies such as domain adaptation and training on more diverse conditions as promising future directions to improve robustness.

      Biological Interpretation of Early Predictive Features: The study currently concludes that the CNN picks up on complex, non-intuitive features that neither human experts nor conventional analysis could identify. However, from a biological perspective, it would be highly insightful to know what these features are (e.g., subtle texture, cell distribution patterns, etc.). Suggestion: I encourage the authors to delve deeper into interpretability. They might try complementary explainability techniques (for example, occlusion tests where parts of the image are masked to see if predictions change, or activation visualization to see what patterns neurons detect) beyond GradientSHAP. Additionally, analyzing false predictions might provide clues: if the model is confident but wrong for certain organoids, what visual traits did those have? If possible, correlating the model's prediction confidence with measured morphometrics or known markers (if any early marker data exist) could hint at what the network sees. Even if definitive features remain unidentified, providing the reader with any hypothesis (for instance, "the network may be sensing a subtle rim of pigmentation or differences in tissue opacity") would add value. This would connect the AI predictions back to biology more strongly.

      Response: We thank the reviewer for this thoughtful suggestion. We agree that linking CNN predictions to specific biological features would be highly valuable. In response, we expanded our interpretability analyses beyond GradientSHAP to a broad set of attribution methods and quantified their behavior across models and timepoints (new Supplementary Figures S29-S37, new Supplementary Note 1). While some methods (e.g., Integrated Gradients, DeepLiftSHAP) occasionally highlighted visible tissue regions, others produced diffuse or shifting relevance, and overall overlap was low. Therefore, our results did not yield reproducible, interpretable biological signals.

      Given these results, we have refrained from speculating about specific early image features and now present the interpretability analyses descriptively. We agree that future studies integrating imaging with molecular markers will be required to directly link early predictive cues to defined biological processes.
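
      One simple way to quantify agreement between two attribution maps is the overlap of their most relevant pixels. The sketch below is illustrative only and is not necessarily the measure used in the study: the function names, the choice of top-k intersection-over-union, and the 2x2 toy maps are assumptions made for this example.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def top_k_pixels(attribution: List[List[float]], k: int) -> Set[Pixel]:
    """Coordinates of the k pixels with the largest absolute attribution."""
    scored = [
        (abs(value), (r, c))
        for r, row in enumerate(attribution)
        for c, value in enumerate(row)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return {pixel for _, pixel in scored[:k]}


def top_k_overlap(
    map_a: List[List[float]], map_b: List[List[float]], k: int
) -> float:
    """Intersection-over-union of the top-k pixel sets of two attribution maps."""
    a = top_k_pixels(map_a, k)
    b = top_k_pixels(map_b, k)
    return len(a & b) / len(a | b)


# Made-up 2x2 attribution maps from two hypothetical methods.
method_a = [[0.9, 0.1], [0.8, 0.0]]
method_b = [[0.7, 0.6], [0.1, 0.0]]
overlap = top_k_overlap(method_a, method_b, k=2)
```

      Here the two maps share one of their two most relevant pixels, so the overlap is 1/3; values near 0 across many images would indicate that the methods highlight different regions.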

      Expansion to Other Outcomes or Multi-Outcome Prediction: The focus on RPE and lens is well-justified, but these are two outcomes within retinal organoids. A major question is whether the approach could be extended to predict other cell types or structures (e.g., presence of certain retinal neurons, or malformations) or even multiple outcomes at once. Suggestion: The authors should discuss the generality of their approach. Could the same pipeline be trained to predict, say, photoreceptor layer formation or other features if annotated? Are there limitations (like needing binary outcomes vs. multi-class)? Even if outside the scope of this study, a brief discussion would reassure readers that the method is not intrinsically limited to these two tissues. If data were available, it would be interesting to see a multi-label classification (predict both RPE and lens presence simultaneously) or an extension to other organoid systems in future. Including such commentary would highlight the broad applicability of this platform.

      Response: We thank the reviewer for this helpful and important suggestion. While our study focused on RPE and lens as the most readily accessible tissues of interest in retinal organoids, our new analyses demonstrate that the pipeline is not limited to these outcomes. In addition to tissue-specific predictions, we trained both a convolutional neural network (on image data) and a decision tree classifier (on morphometric features) to predict more abstract morphological clusters defined at the final timepoint using the morphometric features, showing that both approaches could successfully capture non-tissue features from early frames (new Figure 4C). This illustrates that the framework can be extended beyond binary tissue outcomes to multi-class problems and can predict relevant outcomes such as the overall organoid morphology. Given appropriate annotations, the framework could in principle be trained to detect additional structures such as photoreceptor layers or malformations. Furthermore, the CNN architecture we employed and the morphometric feature space are compatible with multi-label classification, meaning simultaneous prediction of several outcomes would also be feasible. We have clarified this point in the discussion to highlight the methodological flexibility and potential generality of our approach and are excited to share this very interesting, additional model with the readership.

      Curse of high dimensionality: Using Euclidean distance in a 165-dimensional morphometric space likely suffers from the curse of dimensionality, which diminishes the meaning of distances as dimensionality increases. In such high-dimensional settings, the range of pairwise distances tends to collapse, undermining the ability to discern meaningful intra- vs. inter-organoid differences. Suggestion: To address this, I would encourage the authors to apply principal component analysis (PCA) in place of (or prior to) tSNE. PCA would reduce the data to a few dominant axes of variation that capture most of the morphometric variance, directly revealing which features drive differences between organoids. These principal components are linear combinations of the original 165 parameters, so one can examine their loadings to identify which morphometric traits carry the most information - yielding interpretable axes of biological variation (e.g., organoid size, shape complexity, etc.). In addition, I would like to mention an important cautionary remark regarding tSNE embeddings. tSNE does not preserve global geometry of the data. Distances and cluster separations in a tSNE map are therefore not faithful to the original high-dimensional distances and should be interpreted with caution. See Chari T, Pachter L (2023), The specious art of single-cell genomics, PLoS Comput Biol 19(8): e1011288, for an enlightening discussion in the context of single cell genomics. The authors have shown that extreme dimensionality reduction to 2D can introduce significant distortions in the data's structure, meaning the apparent proximity or separation of points in a tSNE plot may be an artifact of the algorithm rather than a true reflection of morphometric similarity. Implementing PCA would mitigate high-dimensional distance issues by focusing on the most informative dimensions, while also providing clear, quantitative axes that summarize organoid heterogeneity. 
      This change would strengthen the analysis by making the results more robust (avoiding distance artifacts) and biologically interpretable, as each principal component can be traced back to specific morphometric features of interest.
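
      The distance-concentration effect described in this comment can be illustrated with a small synthetic sketch. It uses uniformly random points, not the organoid data, and the function name `relative_contrast` is hypothetical. Real morphometric features are typically correlated, so the effect on actual data may be milder than in this worst-case example.

```python
import math
import random


def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for Euclidean distances from one query point.

    Points are drawn uniformly from the unit hypercube (synthetic data).
    A small value means near and far neighbors are hard to tell apart.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


# Compare a 2-dimensional space with a 165-dimensional one.
contrast_low = relative_contrast(n_points=500, n_dims=2)
contrast_high = relative_contrast(n_points=500, n_dims=165)
```

      In such a simulation the relative contrast is expected to be much smaller in 165 dimensions than in 2, which is the collapse of the distance range that the comment refers to.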

      Response: We thank the reviewer for raising this point. Indeed, high dimensionality and dimensionality reduction can lead to misleading interpretations. We approached this issue as follows: first, we recalculated the tSNE projections and distances using the first 20 principal components (PCs) and supplied these data as the new Figure 2 and new Supplementary Figure 2. While the scale of the data shifted slightly, there were no differences in the data distribution that would contradict our prior conclusions.

      To confirm these findings and further support the validity of our dimensionality reduction, we calculated the intersection of the 30 nearest neighbors in raw data space (or PCA space) and the 30 nearest neighbors in reduced space (tSNE or UMAP; we included UMAP to show that the effect is not specific to tSNE projections and also holds for a dimensionality reduction that is better known for preserving global rather than local structure). As shown in the new Supplementary Figure S3 (A-D), the high Jaccard index confirmed that our projections accurately reflect the structure of the data obtained from raw distance measurements. Moreover, the Jaccard index generally increased over time, which is best explained by the stronger morphological similarity of organoids at timepoint 0, reflected by the dense point cloud in the tSNE projections at that timepoint. These effects were independent of whether the data were derived from 20 PCs or from all 165 dimensions.
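
      The neighbor-preservation check described above can be sketched as follows. This is a minimal, brute-force illustration and not the analysis code of the study: the function names are hypothetical, the toy coordinates are made up, and a real analysis would normally use a library nearest-neighbor search with k = 30.

```python
import math
from typing import List, Sequence, Set


def knn_indices(points: Sequence[Sequence[float]], i: int, k: int) -> Set[int]:
    """Indices of the k nearest neighbors of point i (Euclidean, excluding i)."""
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )
    return set(order[:k])


def mean_knn_jaccard(
    high_dim: Sequence[Sequence[float]],
    low_dim: Sequence[Sequence[float]],
    k: int = 30,
) -> float:
    """Average Jaccard index between neighbor sets in the two spaces."""
    scores: List[float] = []
    for i in range(len(high_dim)):
        a = knn_indices(high_dim, i, k)
        b = knn_indices(low_dim, i, k)
        scores.append(len(a & b) / len(a | b))
    return sum(scores) / len(scores)


# Toy data: four points in a "raw" space and two candidate 1D embeddings.
raw = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
faithful = [(0.0,), (1.0,), (10.0,), (11.0,)]    # keeps every nearest neighbor
scrambled = [(0.0,), (10.0,), (1.0,), (11.0,)]   # swaps the two middle points

score_faithful = mean_knn_jaccard(raw, faithful, k=1)
score_scrambled = mean_knn_jaccard(raw, scrambled, k=1)
```

      A score of 1 means every point keeps exactly the same neighbors after projection, while a score of 0 means none are preserved.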

      We next wanted to confirm the conclusion that, at later timepoints, data points from the same organoid were more closely related to each other than to data points from different organoids. We therefore identified the 30 nearest-neighbor data points and found that, at later timepoints, these were almost all attributable to the same organoid (new Supplementary Figure S3 E/F). The only exceptions were experiments that lacked intermediate timepoints (E007 and E002), which misaligned the organoids in the reduced space and confounded the nearest-neighbor analysis.
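
      The same-organoid neighbor analysis can be sketched in a few lines. Again, this is only an illustration under simplifying assumptions: the function name, the organoid IDs "A" and "B", and the coordinates are made up, and a small k is used instead of 30.

```python
import math
from typing import Sequence


def same_organoid_fraction(
    points: Sequence[Sequence[float]],
    organoid_ids: Sequence[str],
    i: int,
    k: int = 30,
) -> float:
    """Fraction of the k nearest neighbors of point i that share its organoid ID."""
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )
    neighbors = order[:k]
    same = sum(1 for j in neighbors if organoid_ids[j] == organoid_ids[i])
    return same / len(neighbors)


# Two hypothetical organoids, each imaged at three consecutive timepoints.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (5.2, 5.0)]
ids = ["A", "A", "A", "B", "B", "B"]

fraction_k2 = same_organoid_fraction(points, ids, i=0, k=2)
fraction_k3 = same_organoid_fraction(points, ids, i=0, k=3)
```

      With k = 2 both neighbors of the first point belong to organoid "A"; with k = 3 one neighbor must come from organoid "B", so the fraction drops to 2/3.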

      We have included the respective new Figures and new Supplementary Figures and linked them in the main manuscript.

      Statistical Reporting and Significance: The manuscript focuses on F1-score as the metric to report accuracy over time, which is appropriate. However, it's not explicitly stated whether any statistical significance tests were performed on the differences between methods (e.g., CNN vs human, CNN vs classical ML). Suggestion: The authors could report statistical significance of the performance differences, perhaps using a permutation test or McNemar's test on predictions. For example, is the improvement of the CNN ensemble over the Random Forest/QDA classifier statistically significant across experiments? Given the n of organoids, this should be assessable. Demonstrating significance would add rigor to the analysis.

      Response: We thank the reviewer for this helpful suggestion. Following the recommendation, we quantified per-experiment differences in predictive performance by calculating the area under the F1-score curves (AUC) for each classifier and experiment. We then compared methods using paired Wilcoxon signed-rank tests across experiments, with Holm-Bonferroni correction for multiple comparisons. This analysis confirmed that the CNN consistently and significantly outperformed the baseline models and classical machine learning classifiers in validation and test organoids; for RPE area and lens size in test organoids, the CNNs performed notably, but not significantly, better than the machine learning classifiers. In summary, these tests add the requested statistical rigor to our analysis. The results of these tests are now provided in the Supplementary Material as Supplementary File 1.
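
      The three steps of this comparison (area under the F1 curve, paired signed-rank test, Holm-Bonferroni correction) can be sketched in plain Python. This is a simplified illustration, not the code behind Supplementary File 1: the exact-enumeration test below stands in for a library routine such as `scipy.stats.wilcoxon`, it assumes no zero or tied differences, and all numbers are made up.

```python
from itertools import product
from typing import Dict, Sequence


def f1_curve_auc(timepoints: Sequence[float], f1_scores: Sequence[float]) -> float:
    """Trapezoidal area under an F1-over-time curve."""
    area = 0.0
    for k in range(1, len(timepoints)):
        width = timepoints[k] - timepoints[k - 1]
        area += 0.5 * width * (f1_scores[k] + f1_scores[k - 1])
    return area


def exact_wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact two-sided p-value for paired samples.

    Assumes no zero differences and no tied absolute differences,
    so the ranks are simply 1..n.
    """
    diffs = sorted((a - b for a, b in zip(x, y)), key=abs)
    n = len(diffs)
    total = n * (n + 1) // 2
    w_plus = sum(rank for rank, d in enumerate(diffs, start=1) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Enumerate all 2^n sign assignments under the null hypothesis.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in enumerate(signs, start=1) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return extreme / 2 ** n


def holm_adjust(p_values: Dict[str, float]) -> Dict[str, float]:
    """Holm-Bonferroni adjusted p-values (step-down, kept monotone)."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    adjusted: Dict[str, float] = {}
    running_max = 0.0
    for rank, (name, p) in enumerate(ordered):
        running_max = max(running_max, min(1.0, (m - rank) * p))
        adjusted[name] = running_max
    return adjusted


# Made-up per-experiment AUC values for two classifiers (six experiments).
cnn_auc = [8.1, 7.9, 8.4, 8.0, 7.7, 8.3]
classical_auc = [7.0, 7.1, 7.2, 7.3, 6.2, 7.35]
p_value = exact_wilcoxon_signed_rank(cnn_auc, classical_auc)
```

      With six experiments in which one method wins every time, the smallest achievable two-sided p-value is 2/64 = 0.03125, which shows how the number of experiments limits the attainable significance before any multiple-comparison correction.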

      Minor Issues (with Suggestions):

      1. Data Availability: Given the resource-intensive nature of the work, the value to the community will be highest if the data is made publicly available. I understand that this is of course at the behest of the authors and they do mention that they will make the data available upon publication of the manuscript. For the time being, the authors can consider sharing at least a representative subset of the data or the trained model weights. This will allow others to build on their work and test the method in other contexts, amplifying the impact of the study.

      Response: We have now made the repository and raw data public and apologize for the earlier oversight. The link to the GitHub repository is now provided in the manuscript under “Data availability”, and the links to the datasets are contained within the GitHub repository.

      Discussion - Future Directions: The Discussion does a good job of highlighting applications (like guiding molecular analysis). One minor addition could be speculation on using this approach to actively intervene: for example, could one imagine altering culture conditions mid-course for organoids predicted not to form RPE, to see if their fate can be changed? The authors touch on reducing variability by focusing on the window of determination; extending that thought to an experimental test (though not done here) would inspire readers. This is entirely optional, but a sentence or two envisioning how predictive models enable dynamic experimental designs (not just passive prediction) would be a forward-looking note to end on.

      Response: We thank the reviewer for this constructive suggestion. We have expanded the discussion to briefly address how predictive modeling could go beyond passive observation. Specifically, we now discuss that predictive models may enable dynamic interventions, such as altering culture conditions mid-course for organoids predicted not to form RPE, to test whether their developmental trajectory can be redirected. While outside the scope of the present work, this forward-looking perspective emphasizes how predictive modeling could inspire adaptive experimental strategies in future studies.

      I believe with the above clarifications and enhancements - especially regarding generalizability and interpretability - the paper will be suitable for broad readership. The work represents an exciting intersection of developmental biology and AI, and I commend the authors for this contribution.

      Response: We thank the reviewer for the positive assessment and their encouraging remarks regarding the contribution of our work to these fields.

      Novelty and Impact:

      This work fills an important gap in organoid biology and imaging. Previous studies have used deep learning to link imaging with molecular profiles or spatial patterns in organoids, but there remained a "notable gap" in predicting whether and to what extent specific tissues will form in organoids. The authors' approach is novel in applying deep learning to prospectively predict organoid tissue outcomes (RPE and lens) on a per-organoid basis, something not previously demonstrated in retinal organoids. Conceptually, this is a significant advance: it shows that fate decisions in a complex 3D culture model can be predicted well in advance, suggesting the existence of subtle early morphogenetic cues that only a sophisticated model can discern. The findings will be of broad interest to researchers in organoid technology, developmental biology, and biomedical AI.

      Response: We thank the reviewer for this thoughtful and encouraging assessment. We agree that our study addresses an important gap by prospectively predicting tissue outcomes at the single-organoid level, and we appreciate the recognition that this represents a conceptual advance with relevance not only for retinal organoids but also for broader applications in organoid biology, developmental biology, and biomedical AI.

      Methodological Rigor and Technical Quality:

      The study is methodologically solid and carefully executed. The authors gathered a uniquely large dataset under consistent conditions, which lends statistical power to their analyses. They employ rigorous controls: an expert panel provided human predictions as a baseline, and a classical machine learning pipeline using quantitative image-derived features was implemented for comparison. The deep learning approach is well-chosen and technically sound. They use an ensemble of CNN architectures (DenseNet121, ResNet50, and MobileNetV3) pre-trained on large image databases, fine-tuning them on organoid images. The use of image segmentation (DeepLabV3) to isolate the organoid from background is appropriate to ensure the models focus on the relevant morphology. Model training procedures (data augmentation, cross-entropy loss with class balancing, learning rate scheduling, and cross-validation) are thorough and follow best practices. The evaluation metrics (primarily F1-score) are suitable for the imbalanced outcomes and emphasize prediction accuracy in a biologically relevant way. Importantly, the authors separate training, test, and validation sets in a meaningful manner: images of each organoid are grouped to avoid information leakage, and an independent experiment serves as a validation to test generalization. The observation that performance is slightly lower on independent validation experiments underscores both the realism of their evaluation and the inherent heterogeneity between experimental batches. In addition, the study integrates interpretability (using GradientSHAP-based relevance backpropagation) to probe what image features the network uses. Although the relevance maps did not reveal obvious human-interpretable features, the attempt reflects a commendable thoroughness in analysis. Overall, the experimental design, data analysis, and reporting are of high quality, supporting the credibility of the conclusions.

      Response: We thank the reviewer for their very positive and detailed assessment. We appreciate the recognition of our efforts to ensure methodological rigor and reproducibility, and we agree that interpretability remains an important but challenging area for future work.

      Reviewer #3 (Significance (Required)):

      Scientific Significance and Conceptual Advances:

      Biologically, the ability to predict organoid outcomes early is quite significant. It means researchers can potentially identify when and which organoids will form a given tissue, allowing them to harvest samples at the right moment for molecular assays or to exclude organoids that will not form the desired structure. The manuscript's results indicate that RPE and lens fate decisions in retinal organoids are made much earlier than visible differentiation, with predictive signals detectable as early as ~11 hours for RPE and ~4-5 hours for lens. This suggests a surprising synchronization or early commitment in organoid development that was not previously appreciated. The authors' introduction of deep learning-derived determination windows refines the concept of a developmental "point of no return" for cell fate in organoids. Focusing on these windows could help in pinpointing the molecular triggers of these fate decisions. Another conceptual advance is demonstrating that non-invasive imaging data can serve a predictive role akin to (or better than) destructive molecular assays. The study highlights that classical morphology metrics and even expert eyes capture mainly recognition of emerging tissues, whereas the CNN detects subtler, non-intuitive features predictive of future development. This underlines the power of deep learning to uncover complex phenotypic patterns that elude human analysis, a concept that could be extended to other organoid systems and developmental biology contexts. In sum, the work not only provides a tool for prediction but also contributes conceptual insights into the timing of cell fate determination in organoids.

      Response: We thank the reviewer for this thoughtful and positive assessment. We agree that the determination windows provide a valuable framework to study early fate decisions in organoids, and we have emphasized this point in the discussion to highlight the biological significance of our findings.

      Strengths:

      The combination of high-resolution time-lapse imaging with advanced deep learning is innovative. The authors effectively leverage AI to solve a biological uncertainty problem, moving beyond qualitative observations to quantitative predictions. The study uses a remarkably large dataset (1,000 organoids, >100k images), which is a strength as it captures variability and provides robust training data. This scale lends confidence that the model isn't overfit to a small sample. By comparing deep learning with classical machine learning and human predictions, the authors provide context for the model's performance. The CNN ensemble consistently outperforms both the classical algorithms and human experts, highlighting the value added by the new method. The deep learning model achieves high accuracy (F1 > 0.85) at impressively early time points. The fact that it can predict lens formation just ~4.5 hours into development with confidence is striking. Performance remained strong and exceeded human capability at all assessed times. Key experimental and analytical steps (segmentation, cross-validation between experiments, model calibration, use of appropriate metrics) are executed carefully. The manuscript is transparent about training procedures and even provides source code references, enhancing reproducibility. The manuscript is generally well-written with a logical flow from the problem (organoid heterogeneity) to the solution (predictive modeling) and clear figures referenced.

      Response: We thank the reviewer for this very positive and encouraging assessment of our study, particularly regarding the scale of our dataset, the methodological rigor, and the reproducibility of our approach.

      Weaknesses and Limitations:

      Generalizability Across Batches/Conditions: One limitation is the variability in model performance on organoids from independent experiments. The CNN did slightly worse on a validation set from a separate experiment, indicating that differences in the experimental batch (e.g., slight protocol or environmental variations) can affect accuracy. This raises the question of how well the model would generalize to organoids generated under different protocols or by other labs. While the authors do employ an experiment-wise cross-validation, true external validation (on a totally independent dataset or a different organoid system) would further strengthen the claim of general applicability.

      Response: We thank the reviewer for this important point. We agree that generalizability across batches and experimental conditions is a key consideration. We have carefully revised the discussion to explicitly address this limitation and to highlight the variability observed between independent experiments.

      Interpretability of the Predictions: Despite using relevance backpropagation, the authors were unable to pinpoint clear human-interpretable image features that drive the predictions. In other words, the deep learning model remains somewhat of a "black box" in terms of what subtle cues it uses at early time points. This limits the biological insight that can be directly extracted regarding early morphological indicators of RPE or lens fate. It would be ideal if the study could highlight specific morphological differences (even if minor) correlated with fate outcomes, but currently those remain elusive.

      Response: We thank the reviewer for raising this important point. Indeed, while our models achieved robust predictive performance, the underlying morphological cues remained difficult to interpret using relevance backpropagation. We believe this limitation reflects both the subtlety of the early predictive signals and the complexity of the features captured by deep learning models, which may not correspond to human-intuitive descriptors. We have clarified this limitation in the Discussion and Supplementary Note 1 and emphasize that further methodological advances in interpretability, or integration with complementary molecular readouts, will be essential to uncover the precise morphological correlates of fate determination.

      Scope of Outcomes: The study focuses on two particular tissues (RPE and lens) as the outcomes of interest. These were well-chosen as examples (one induced, one spontaneous), but they do not encompass the full range of retinal organoid fates (e.g., neural retina layers). It's not a flaw per se, but it means the platform as presented is specialized. The method might need adaptation to predict more complex or multiple tissue outcomes simultaneously.

      Response: We agree with the reviewer that our study focuses on two specific tissues, RPE and lens, which served as proof-of-concept outcomes representing both induced and spontaneous differentiation events. While this scope is necessarily limited, we believe it demonstrates the general feasibility of our approach. We have clarified in the Discussion that the same framework could, in principle, be extended to additional retinal fates such as neural retina layers, or even to multi-label prediction tasks, provided appropriate annotations are available. We now also provide additional experiments showing that even abstract morphological classes can be predicted well. Extending the framework to further outcomes will be an important next step to broaden the applicability of our platform.

      Requirement of Large Data and Annotations: Practically, the approach required a very large imaging dataset and extensive manual annotation: each organoid's RPE and lens outcome, plus manual masking for training the segmentation model. This is a substantial effort that may be challenging to reproduce widely. The authors suggest that perhaps ~500 organoids might suffice to achieve similar results, but the data requirement is still high. Smaller labs or studies with fewer organoids might not immediately reap the full benefits of this approach without access to such imaging throughput.

      Response: We thank the reviewer for highlighting this important point. We agree that the generation of a large imaging dataset and the associated annotations represent a substantial investment of time and resources. At the same time, we consider this effort highly relevant, as it reflects the intrinsic heterogeneity of organoid systems rather than technical artifacts, and therefore ensures robust model training. We have clarified this limitation in the Discussion. While our full dataset included ~1,000 organoids, our downsampling analysis suggests that as few as ~500 organoids may already be sufficient to reproduce the key findings, which we believe makes the approach feasible for many organoid systems (compare new Supplementary Note 1). Moreover, as we outline in the Discussion, future refinements such as combining image- and tabular-based features or incorporating fluorescence data could further enhance predictive power and reduce annotation effort.

      Medaka Fish vs. Other Systems: The retinal organoids in this study appear to be from medaka fish, whereas much organoid research uses human iPSC-derived organoids. It is not fully clear from the manuscript how the findings translate to mammalian or human organoids. If there are species-specific differences, the applicability to human retinal organoids (which are important for disease modeling) might need discussion. This is a minor point if the biology is conserved, but worth noting as a potential limitation.

      Response: We thank the reviewer for pointing out this important consideration. We have now explicitly clarified in the Discussion that our proof-of-concept study was performed in medaka organoids, which offer high reproducibility and rapid development. While species-specific differences may exist, the predictive framework is not inherently restricted to medaka and should, in principle, be transferable to mammalian or human iPSC/ESC-derived organoids, provided sufficiently annotated datasets are available. We have amended the Discussion accordingly.

      Predicting Tissue Size is Harder: The model's accuracy in predicting how much tissue (relative area) an organoid will form, while good, is notably lower than for simply predicting presence/absence. Final F1 scores for size classes (~0.7) indicate moderate success. This implies that quantitatively predicting organoid phenotypic severity or extent is more challenging, perhaps due to more continuous variation in size. The authors do acknowledge the lower accuracy for size and treat it carefully.

      Response: We thank the reviewer for this observation and agree with their interpretation. We have already acknowledged in the manuscript that predicting tissue size is more challenging than predicting tissue presence/absence, and we believe we have treated these results with appropriate caution in the revised version of the manuscript.

      Latency vs. Determination: While the authors narrow down the time window of fate determination, it remains somewhat unclear whether the times at which the model reaches high confidence truly correspond to the biological "decision point" or are just the earliest detection of its consequences. The manuscript discusses this caveat, but it's an inherent limitation that the predictive time point might lag the actual internal commitment event. Further work might be needed to link these predictions to molecular events of commitment.

      Response: We agree with the reviewer. As noted in the Discussion, the time points identified by our models likely reflect the earliest detectable morphological consequences of fate determination, rather than the exact molecular commitment events themselves. Establishing a direct link between predictive signals and underlying molecular mechanisms will require future experimental work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript by Afting et al. addresses the challenge of heterogeneity in retinal organoid development by using deep learning to predict eventual tissue outcomes from early-stage images. The central hypothesis is that deep learning can forecast which tissues an organoid will form (specifically retinal pigmented epithelium, RPE, and lens) well before those tissues become visibly apparent. To test this, the authors assembled a large-scale time-lapse imaging dataset of ~1,000 retinal organoids (~100,000 images) with expert annotations of tissue outcomes. They characterized the variability in organoid morphology and tissue formation over time, focusing on two tissues: RPE (which requires induction) and lens (which appears spontaneously). The core finding is that a deep learning model can accurately predict the emergence and size of RPE and lens in individual organoids at very early developmental stages. Notably, a convolutional neural network (CNN) ensemble achieved high predictive performance (F1-scores ~0.85-0.9) hours before the tissues were visible, significantly outperforming human experts and classical image-analysis-based classifiers. This approach effectively bypasses the issue of stochastic developmental heterogeneity and defines an early "determination window" for fate decisions. Overall, the study demonstrates a proof-of-concept that artificial intelligence can forecast organoid differentiation outcomes non-invasively, which could revolutionize how organoid experiments are analyzed and interpreted.

      Recommendation:

      While this manuscript addresses an important and timely scientific question using innovative deep learning methodologies, it currently cannot be recommended for acceptance in its present form. The authors must thoroughly address several critical limitations highlighted in this report. In particular, significant issues remain regarding the generalizability of the predictive models across different experimental conditions, the interpretability of deep learning predictions, and the use of Euclidean distance metrics in high-dimensional morphometric spaces, potentially leading to distorted interpretations of organoid heterogeneity. These revisions are essential for validating the general applicability of their approach and enhancing biological interpretability. After thoroughly addressing these concerns, the manuscript may become suitable for future consideration.
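
      The concern about Euclidean distances in high-dimensional spaces can be illustrated with a minimal sketch. It uses synthetic, uniformly distributed points rather than the authors' morphometric features, and the helper name `relative_contrast`, the sample size, and the seed are assumptions chosen for illustration. It compares how strongly the nearest and farthest neighbours of one query point differ in 2 versus 165 dimensions.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """(max - min) / min of Euclidean distances from one query point to the rest."""
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query, others = points[0], points[1:]
    dists = [math.dist(query, p) for p in others]
    return (max(dists) - min(dists)) / min(dists)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
contrast_2d = relative_contrast(200, 2, rng)
contrast_165d = relative_contrast(200, 165, rng)

# A small contrast means nearest and farthest neighbours are almost equally far away.
print(f"2-D: {contrast_2d:.2f}   165-D: {contrast_165d:.2f}")
```

      Real morphometric features are correlated, so the effect in the actual data may be weaker than for independent uniform coordinates; the sketch only shows the direction of the concern.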

      Major Issues (with Suggestions):

      1. Generalization to Other Batches or Protocols: The drop in performance on independent validation experiments suggests the model may partially overfit to specific experimental conditions. A major concern is how well this approach would work on organoids from a different batch or produced by a slightly different differentiation protocol. Suggestion: The authors should clarify the extent of variability between their "independent experiment" and training data (e.g., were these done months apart, with different cell lines or minor protocol tweaks?). To strengthen confidence in the model's robustness, I recommend testing the trained model on one or more truly external datasets, if available (for instance, organoids generated in a separate lab or under a modified protocol). Even a modest analysis showing the model can be adapted (via transfer learning or re-training) to another dataset would be valuable. If new data cannot be added, the authors should explicitly discuss this limitation and perhaps propose strategies (like domain adaptation techniques or more robust training with diverse conditions) to handle batch effects in future applications.
      2. Biological Interpretation of Early Predictive Features: The study currently concludes that the CNN picks up on complex, non-intuitive features that neither human experts nor conventional analysis could identify. However, from a biological perspective, it would be highly insightful to know what these features are (e.g., subtle texture, cell distribution patterns, etc.). Suggestion: I encourage the authors to delve deeper into interpretability. They might try complementary explainability techniques (for example, occlusion tests where parts of the image are masked to see if predictions change, or activation visualization to see what patterns neurons detect) beyond GradientSHAP. Additionally, analyzing false predictions might provide clues: if the model is confident but wrong for certain organoids, what visual traits did those have? If possible, correlating the model's prediction confidence with measured morphometrics or known markers (if any early marker data exist) could hint at what the network sees. Even if definitive features remain unidentified, providing the reader with any hypothesis (for instance, "the network may be sensing a subtle rim of pigmentation or differences in tissue opacity") would add value. This would connect the AI predictions back to biology more strongly.
      3. Expansion to Other Outcomes or Multi-Outcome Prediction: The focus on RPE and lens is well-justified, but these are two outcomes within retinal organoids. A major question is whether the approach could be extended to predict other cell types or structures (e.g., presence of certain retinal neurons, or malformations) or even multiple outcomes at once. Suggestion: The authors should discuss the generality of their approach. Could the same pipeline be trained to predict, say, photoreceptor layer formation or other features if annotated? Are there limitations (like needing binary outcomes vs. multi-class)? Even if outside the scope of this study, a brief discussion would reassure readers that the method is not intrinsically limited to these two tissues. If data were available, it would be interesting to see a multi-label classification (predict both RPE and lens presence simultaneously) or an extension to other organoid systems in future. Including such commentary would highlight the broad applicability of this platform.
      4. Curse of high dimensionality: Using Euclidean distance in a 165-dimensional morphometric space likely suffers from the curse of dimensionality, which diminishes the meaning of distances as dimensionality increases. In such high-dimensional settings, the range of pairwise distances tends to collapse, undermining the ability to discern meaningful intra- vs. inter-organoid differences. Suggestion: To address this, I would encourage the authors to apply principal component analysis (PCA) in place of (or prior to) tSNE. PCA would reduce the data to a few dominant axes of variation that capture most of the morphometric variance, directly revealing which features drive differences between organoids. These principal components are linear combinations of the original 165 parameters, so one can examine their loadings to identify which morphometric traits carry the most information, yielding interpretable axes of biological variation (e.g., organoid size or shape complexity). In addition, I would like to mention an important cautionary remark regarding tSNE embeddings. tSNE does not preserve global geometry of the data. Distances and cluster separations in a tSNE map are therefore not faithful to the original high-dimensional distances and should be interpreted with caution. See Chari T, Pachter L (2023), The specious art of single-cell genomics, PLoS Comput Biol 19(8): e1011288, for an enlightening discussion in the context of single cell genomics. Chari and Pachter show that extreme dimensionality reduction to 2D can introduce significant distortions in the data's structure, meaning the apparent proximity or separation of points in a tSNE plot may be an artifact of the algorithm rather than a true reflection of morphometric similarity. Implementing PCA would mitigate high-dimensional distance issues by focusing on the most informative dimensions, while also providing clear, quantitative axes that summarize organoid heterogeneity. This change would strengthen the analysis by making the results more robust (avoiding distance artifacts) and biologically interpretable, as each principal component can be traced back to specific morphometric features of interest.
      5. Statistical Reporting and Significance: The manuscript focuses on F1-score as the metric to report accuracy over time, which is appropriate. However, it's not explicitly stated whether any statistical significance tests were performed on the differences between methods (e.g., CNN vs human, CNN vs classical ML). Suggestion: The authors could report statistical significance of the performance differences, perhaps using a permutation test or McNemar's test on predictions. For example, is the improvement of the CNN ensemble over the Random Forest/QDA classifier statistically significant across experiments? Given the n of organoids, this should be assessable. Demonstrating significance would add rigor to the analysis.
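
      As one possible form of the significance test suggested in point 5, the following is a minimal sketch of an exact two-sided McNemar test on paired predictions. The function name `mcnemar_exact`, the variable names, and the toy labels are hypothetical and not taken from the manuscript; the test assumes one prediction per organoid at a fixed time point, so that the pairs are independent.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts samples that only classifier A
    gets right and c counts samples that only classifier B gets right.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the classifiers never disagree in correctness
    # Under the null hypothesis the discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy example: 10 organoids; classifier A is always right, B is wrong on the last 6.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
pred_cnn = list(y_true)
pred_rf = y_true[:4] + [1 - y for y in y_true[4:]]

b, c, p = mcnemar_exact(y_true, pred_cnn, pred_rf)
print(b, c, p)
```

      Only the discordant pairs enter the test, which is why it suits a comparison of two classifiers evaluated on the same organoids.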

      Minor Issues (with Suggestions):

      1. Data Availability: Given the resource-intensive nature of the work, the value to the community will be highest if the data is made publicly available. I understand that this is of course at the discretion of the authors, and they do mention that they will make the data available upon publication of the manuscript. For the time being, the authors can consider sharing at least a representative subset of the data or the trained model weights. This will allow others to build on their work and test the method in other contexts, amplifying the impact of the study.
      2. Discussion - Future Directions: The Discussion does a good job of highlighting applications (like guiding molecular analysis). One minor addition could be speculation on using this approach to actively intervene: for example, could one imagine altering culture conditions mid-course for organoids predicted not to form RPE, to see if their fate can be changed? The authors touch on reducing variability by focusing on the window of determination; extending that thought to an experimental test (though not done here) would inspire readers. This is entirely optional, but a sentence or two envisioning how predictive models enable dynamic experimental designs (not just passive prediction) would be a forward-looking note to end on.

      I believe with the above clarifications and enhancements - especially regarding generalizability and interpretability - the paper will be suitable for broad readership. The work represents an exciting intersection of developmental biology and AI, and I commend the authors for this contribution.

      Novelty and Impact:

      This work fills an important gap in organoid biology and imaging. Previous studies have used deep learning to link imaging with molecular profiles or spatial patterns in organoids, but there remained a "notable gap" in predicting whether and to what extent specific tissues will form in organoids. The authors' approach is novel in applying deep learning to prospectively predict organoid tissue outcomes (RPE and lens) on a per-organoid basis, something not previously demonstrated in retinal organoids. Conceptually, this is a significant advance: it shows that fate decisions in a complex 3D culture model can be predicted well in advance, suggesting the existence of subtle early morphogenetic cues that only a sophisticated model can discern. The findings will be of broad interest to researchers in organoid technology, developmental biology, and biomedical AI.

      Methodological Rigor and Technical Quality:

      The study is methodologically solid and carefully executed. The authors gathered a uniquely large dataset under consistent conditions, which lends statistical power to their analyses. They employ rigorous controls: an expert panel provided human predictions as a baseline, and a classical machine learning pipeline using quantitative image-derived features was implemented for comparison. The deep learning approach is well-chosen and technically sound. They use an ensemble of CNN architectures (DenseNet121, ResNet50, and MobileNetV3) pre-trained on large image databases, fine-tuning them on organoid images. The use of image segmentation (DeepLabV3) to isolate the organoid from background is appropriate to ensure the models focus on the relevant morphology. Model training procedures (data augmentation, cross-entropy loss with class balancing, learning rate scheduling, and cross-validation) are thorough and follow best practices. The evaluation metrics (primarily F1-score) are suitable for the imbalanced outcomes and emphasize prediction accuracy in a biologically relevant way. Importantly, the authors separate training, test, and validation sets in a meaningful manner: images of each organoid are grouped to avoid information leakage, and an independent experiment serves as a validation to test generalization. The observation that performance is slightly lower on independent validation experiments underscores both the realism of their evaluation and the inherent heterogeneity between experimental batches. In addition, the study integrates interpretability (using GradientSHAP-based relevance backpropagation) to probe what image features the network uses. Although the relevance maps did not reveal obvious human-interpretable features, the attempt reflects a commendable thoroughness in analysis. Overall, the experimental design, data analysis, and reporting are of high quality, supporting the credibility of the conclusions.

      Significance

      Scientific Significance and Conceptual Advances:

      Biologically, the ability to predict organoid outcomes early is quite significant. It means researchers can potentially identify when and which organoids will form a given tissue, allowing them to harvest samples at the right moment for molecular assays or to exclude organoids that will not form the desired structure. The manuscript's results indicate that RPE and lens fate decisions in retinal organoids are made much earlier than visible differentiation, with predictive signals detectable as early as ~11 hours for RPE and ~4-5 hours for lens. This suggests a surprising synchronization or early commitment in organoid development that was not previously appreciated. The authors' introduction of deep learning-derived determination windows refines the concept of a developmental "point of no return" for cell fate in organoids. Focusing on these windows could help in pinpointing the molecular triggers of these fate decisions. Another conceptual advance is demonstrating that non-invasive imaging data can serve a predictive role akin to (or better than) destructive molecular assays. The study highlights that classical morphology metrics and even expert eyes capture mainly recognition of emerging tissues, whereas the CNN detects subtler, non-intuitive features predictive of future development. This underlines the power of deep learning to uncover complex phenotypic patterns that elude human analysis, a concept that could be extended to other organoid systems and developmental biology contexts. In sum, the work not only provides a tool for prediction but also contributes conceptual insights into the timing of cell fate determination in organoids.

      Strengths:

      The combination of high-resolution time-lapse imaging with advanced deep learning is innovative. The authors effectively leverage AI to solve a biological uncertainty problem, moving beyond qualitative observations to quantitative predictions. The study uses a remarkably large dataset (1,000 organoids, >100k images), which is a strength as it captures variability and provides robust training data. This scale lends confidence that the model isn't overfit to a small sample. By comparing deep learning with classical machine learning and human predictions, the authors provide context for the model's performance. The CNN ensemble consistently outperforms both the classical algorithms and human experts, highlighting the value added by the new method. The deep learning model achieves high accuracy (F1 > 0.85) at impressively early time points. The fact that it can predict lens formation just ~4.5 hours into development with confidence is striking. Performance remained strong and exceeded human capability at all assessed times. Key experimental and analytical steps (segmentation, cross-validation between experiments, model calibration, use of appropriate metrics) are executed carefully. The manuscript is transparent about training procedures and even provides source code references, enhancing reproducibility. The manuscript is generally well-written with a logical flow from the problem (organoid heterogeneity) to the solution (predictive modeling) and clear figures referenced.

      Weaknesses and Limitations:

      Generalizability Across Batches/Conditions: One limitation is the variability in model performance on organoids from independent experiments. The CNN did slightly worse on a validation set from a separate experiment, indicating that differences in the experimental batch (e.g., slight protocol or environmental variations) can affect accuracy. This raises the question of how well the model would generalize to organoids generated under different protocols or by other labs. While the authors do employ an experiment-wise cross-validation, true external validation (on a totally independent dataset or a different organoid system) would further strengthen the claim of general applicability.

      Interpretability of the Predictions: Despite using relevance backpropagation, the authors were unable to pinpoint clear human-interpretable image features that drive the predictions. In other words, the deep learning model remains somewhat of a "black box" in terms of what subtle cues it uses at early time points. This limits the biological insight that can be directly extracted regarding early morphological indicators of RPE or lens fate. It would be ideal if the study could highlight specific morphological differences (even if minor) correlated with fate outcomes, but currently those remain elusive.

      Scope of Outcomes: The study focuses on two particular tissues (RPE and lens) as the outcomes of interest. These were well-chosen as examples (one induced, one spontaneous), but they do not encompass the full range of retinal organoid fates (e.g., neural retina layers). It's not a flaw per se, but it means the platform as presented is specialized. The method might need adaptation to predict more complex or multiple tissue outcomes simultaneously.

      Requirement of Large Data and Annotations: Practically, the approach required a very large imaging dataset and extensive manual annotation: each organoid's RPE and lens outcome, plus manual masking for training the segmentation model. This is a substantial effort that may be challenging to reproduce widely. The authors suggest that perhaps ~500 organoids might suffice to achieve similar results, but the data requirement is still high. Smaller labs or studies with fewer organoids might not immediately reap the full benefits of this approach without access to such imaging throughput.

      Medaka Fish vs. Other Systems: The retinal organoids in this study appear to be from medaka fish, whereas much organoid research uses human iPSC-derived organoids. It is not fully clear from the manuscript how the findings translate to mammalian or human organoids. If there are species-specific differences, the applicability to human retinal organoids (which are important for disease modeling) might need discussion. This is a minor point if the biology is conserved, but worth noting as a potential limitation.

      Predicting Tissue Size is Harder: The model's accuracy in predicting how much tissue (relative area) an organoid will form, while good, is notably lower than for simply predicting presence/absence. Final F1 scores for size classes (~0.7) indicate moderate success. This implies that quantitatively predicting organoid phenotypic severity or extent is more challenging, perhaps due to more continuous variation in size. The authors do acknowledge the lower accuracy for size and treat it carefully.

      Latency vs. Determination: While the authors narrow down the time window of fate determination, it remains somewhat unclear whether the times at which the model reaches high confidence truly correspond to the biological "decision point" or are just the earliest detection of its consequences. The manuscript discusses this caveat, but it's an inherent limitation that the predictive time point might lag the actual internal commitment event. Further work might be needed to link these predictions to molecular events of commitment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary: Afting et al. present a computational pipeline for analyzing timelapse brightfield images of retinal organoids derived from medaka fish. Their pipeline processes images along two paths: 1) morphometrics (based on computer vision features from skimage) and 2) deep learning. They discovered, through extensive manual annotation of ground truth, that their deep learning method could predict retinal pigmented epithelium and lens tissue emergence at earlier time points than either morphometrics or expert predictions. Our review is formatted based on the Review Commons recommendation.

      Major comments:

      Are the key conclusions convincing?

      Yes, the key conclusion that deep learning outperforms morphometric approaches is convincing. However, several methodological details require clarification. For instance, were the data splitting procedures conducted in the same manner for both approaches? Additionally, the authors note in the methods: "The validation data were scaled to the same range as the training data using the fitted scalers obtained from the training data." This represents a classic case of data leakage, which could artificially inflate performance metrics in traditional machine learning models. It is unclear whether the deep learning model was subject to the same issue. Furthermore, the convolutional neural network was trained with random augmentations, effectively increasing the diversity of the training data. Would the performance advantage still hold if the sample size had not been artificially expanded through augmentation?
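
      For reference, the distinction that matters for the scaling question is which data the scaler statistics are estimated from. The following minimal sketch contrasts the two variants, assuming min-max scaling (the quoted sentence does not name the scaler) and made-up toy values; the function names `fit_minmax` and `apply_minmax` are hypothetical.

```python
def fit_minmax(values):
    """Return the (min, max) statistics of one feature."""
    return min(values), max(values)


def apply_minmax(values, stats):
    """Scale values with previously fitted statistics; results may leave [0, 1]."""
    lo, hi = stats
    return [(v - lo) / (hi - lo) for v in values]


train = [2.0, 4.0, 6.0, 10.0]  # toy feature values, not data from the manuscript
valid = [3.0, 12.0]

# Variant 1: statistics estimated from the training split only.
stats_train_only = fit_minmax(train)
valid_v1 = apply_minmax(valid, stats_train_only)

# Variant 2: statistics estimated from the pooled data, so that
# validation values influence the scaler.
stats_pooled = fit_minmax(train + valid)
valid_v2 = apply_minmax(valid, stats_pooled)

print(stats_train_only, valid_v1)
print(stats_pooled, valid_v2)
```

      In variant 1 a validation value can fall outside the training range after scaling, whereas in variant 2 the validation split has shaped the scaler itself.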

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Their claims are currently preliminary, pending increased clarity and additional computational experiments described below.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • The authors discretize continuous variables into four bins for classification. However, a regression framework may be more appropriate for preserving the full resolution of the data. At a minimum, the authors should provide a stronger justification for this binning strategy and include an analysis of bin performance. For example, do samples near bin boundaries perform comparably to those near the bin centers? This would help determine whether the discretization introduces artifacts or obscures signals.
      • The relevance backpropagation interpretation analysis is not convincing. The authors argue that the model's use of pixels across the entire image (rather than just the RPE region) indicates that the deep learning approach captures holistic information. However, only three example images are shown out of hundreds, with no explanation for their selection, limiting the generalizability of the interpretation. Additionally, it is unclear how this interpretability approach would work at all in earlier time points, particularly before the model begins making confident predictions around the 8-hour mark. It is also not specified whether the input used for GradSHAP matches the input used during CNN training. The authors should consider expanding this analysis by quantifying pixel importance inside versus outside annotated regions over time. Lastly, Figure 4C is missing a scale bar, which would aid in interpretability.
      • The authors claim that they removed technical artifacts to the best of their ability, but it is unclear if the authors performed any adjustment beyond manual quality checks for contamination. Did the authors observe any illumination artifacts (either within a single image or over time)? Any other artifacts or procedures to adjust?
      • In lines 434-436 the authors state "In this work, we used 1,000 organoids in total, to achieve the reported prediction accuracies. Yet, we suspect that as little as ~500 organoids are sufficient to reliably recapitulate our findings." It is unclear what evidence the authors use to support this claim. The authors could perform a downsampling analysis to determine the tradeoff between performance and sample size.
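
      The downsampling analysis requested in the last point could look roughly like the sketch below. Everything in it is an assumption made for illustration: the data are synthetic (one noisy feature per organoid), a simple threshold classifier stands in for the authors' CNN ensemble, and all function names are hypothetical. The point is only the structure: train on random subsets of increasing size, evaluate on a fixed held-out set, and report mean and spread of the F1-score per subset size.

```python
import random
import statistics


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def make_organoids(n, rng):
    """Synthetic stand-in: one noisy feature per organoid with label 0 or 1."""
    labels = [rng.randint(0, 1) for _ in range(n)]
    return [(rng.gauss(float(y), 1.0), y) for y in labels]


def fit_threshold(train):
    """Nearest-centroid rule in 1-D: threshold halfway between the class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (statistics.fmean(xs0) + statistics.fmean(xs1)) / 2


def downsampling_curve(pool, test, sizes, repeats, rng):
    """Mean and standard deviation of test F1 for each training-subset size."""
    y_test = [y for _, y in test]
    curve = {}
    for n in sizes:
        scores = []
        for _ in range(repeats):
            threshold = fit_threshold(rng.sample(pool, n))
            preds = [1 if x > threshold else 0 for x, _ in test]
            scores.append(f1_score(y_test, preds))
        curve[n] = (statistics.fmean(scores), statistics.stdev(scores))
    return curve


rng = random.Random(42)
data = make_organoids(1000, rng)
pool, test = data[:800], data[800:]  # fixed held-out set of 200 organoids
sizes = [50, 100, 200, 400, 800]
curve = downsampling_curve(pool, test, sizes, repeats=20, rng=rng)
for n in sizes:
    mean_f1, sd_f1 = curve[n]
    print(f"n={n:4d}  F1={mean_f1:.3f} +/- {sd_f1:.3f}")
```

      For the real data, subsets would need to be drawn per organoid (not per image) so that all time points of one organoid stay on the same side of the split.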

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes, we believe all experiments are realistic in terms of time and resources. We estimate all experiments could be completed in 3-6 months.

      Are the data and the methods presented in such a way that they can be reproduced?

      No, the code is not currently available. We were not able to review the source code.

      Are the experiments adequately replicated and statistical analysis adequate?

      • The experiments are adequately replicated.
      • The statistical analysis (deep learning) is lacking a negative control baseline, which would be helpful to observe if performance is inflated.

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes.

      Are the text and figures clear and accurate?

      The authors must improve clarity on terminology. For example, they should define "comprehensive dataset" and "significant", and provide clarity on their morphometric feature space. They should elaborate on what they mean by "confounding factor of heterogeneity".

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • Figure 2C shows a distance, but between what? The y-axis label is likely too simple. The same confusion applies to Figure 2D. Was distance computed based on tSNE coordinates?
      • The authors perform a Herculean analysis comparing dozens of different machine learning classifiers. They select two, but they should provide justification for this decision.
      • It would be good to get a sense for how these retinal organoids grow - are they moving all over the place? They are in Matrigel so maybe not, but are they rotating? Can the authors' approach predict an entire non-emergence experiment? The authors tried to standardize the protocol, but if it still yields this much heterogeneity, then how well the approach will actually generalize to a different lab is a limitation.
      • The authors should dampen claims throughout. For example, in the abstract they state, "by combining expert annotations with advanced image analysis". The image analysis pipelines use common approaches.
      • The authors state: "the presence of RPE and lenses were disagreed upon by the two independently annotating experts in a considerable fraction of organoids (3.9 % for RPE, 2.9% for lenses).", but it is unclear why there were two independently annotating experts. The supplements say images were split between nine experts for annotation.
      • Details on the image analysis pipeline would be helpful to clarify. For example, why did they choose to measure these 165 morphology features? Which descriptors were used to quantify blur? Did the authors apply blur metrics per FOV or per segmented organoid?
      • The description of the number of images is confusing and distracts from the number of organoids. The number of organoids and number of timepoints used would provide a better description of the data with more value. For example, does this image count include all five z slices?
      • The authors should consider applying a maximum projection across the five z slices (rather than the middle z) as this is a common procedure in image analysis. Why not analyze three-dimensional morphometrics or deep learning features? Might this improve performance further?
      • There is a lot of manual annotation performed in this work, the authors could speculate how this could be streamlined for future studies. How does the approach presented enable streamlining?
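      As a minimal illustration of the maximum projection suggested above, the sketch below collapses a z-stack into a single 2D image by taking the per-pixel maximum across slices and contrasts it with picking the middle slice. The function names and the tiny nested-list "images" are assumptions made for illustration; a real pipeline would more likely operate on array data.

```python
def max_projection(stack):
    """Collapse a z-stack (list of equally sized 2D images) into one 2D image
    by taking the per-pixel maximum across all z slices."""
    if not stack:
        raise ValueError("stack must contain at least one z slice")
    height, width = len(stack[0]), len(stack[0][0])
    for z_slice in stack:
        if len(z_slice) != height or any(len(row) != width for row in z_slice):
            raise ValueError("all z slices must share the same shape")
    return [
        [max(z_slice[r][c] for z_slice in stack) for c in range(width)]
        for r in range(height)
    ]


def middle_slice(stack):
    """Return the middle z slice, the alternative the review describes."""
    return stack[len(stack) // 2]


# Hypothetical 3-slice stack of 2x2 images: each slice holds signal
# that the other slices miss.
stack = [
    [[1, 0], [0, 0]],
    [[0, 2], [0, 0]],
    [[0, 0], [3, 4]],
]
projected = max_projection(stack)
middle = middle_slice(stack)
print(projected)
print(middle)
```

      In this toy stack the projection retains signal from every slice, whereas the middle slice keeps only what happens to lie in that focal plane.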

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper's advance is technical (providing new methods for organoid quality control) and conceptual (providing proof of concept that earlier time points contain information to predict specific future outcomes in retinal organoids).

      Place the work in the context of the existing literature (provide references, where appropriate).

      • The authors do a good job of placing their work in context in the introduction.
      • The work presents a simple image analysis pipeline (using only the middle z slice) to process timelapse organoid images. So not a 4D pipeline (time and space), just 3D (time). It is likely that more and more of these approaches will be developed over time, and this article is one of the early attempts.
      • The work uses standard convolutional neural networks.

      State what audience might be interested in and influenced by the reported findings.

      • Data scientists performing image-based profiling for time lapse imaging of organoids.
      • Retinal organoid biologists
      • Other organoid biologists who may have long growth times with indeterminate outcomes.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Image-based profiling/morphometrics
      • Organoid image analysis
      • Computational biology
      • Cell biology
      • Data science/machine learning
      • Software

      This is a signed review: Gregory P. Way, PhD; Erik Serrano; Jenna Tomkinson; Michael J. Lippincott; Cameron Mattson. Department of Biomedical Informatics, University of Colorado

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study presents predictive modeling for developmental outcome in retinal organoids based on high-content imaging. Specifically, it compares the predictive performance of an ensemble of deep learning models with classical machine learning based on morphometric image features and predictions from human experts for four different tasks: prediction of RPE presence and lens presence (at the end of development) as well as the respective sizes. It finds that the DL model outperforms the other approaches and is predictive from early timepoints on, strongly indicating a time-frame for important decision steps in the developmental trajectory.

      Major comments: I find the paper overall well written and easy to understand. The findings are relevant (see significance statement for details) and well supported. However, I have some remarks on the description and details of the experimental set-up, the data availability and reproducibility / re-usability of the data.

      1. Some details about the experimental set-up are unclear to me. In particular, it seems like there is a single organoid per well, as the manuscript does not mention any need for instance segmentation or tracking to distinguish organoids in the images and associate them over time. Is that correct? If yes, it should be explicitly stated so. Are there any specific steps in the organoid preparation necessary to avoid multiple organoids per well? Having multiple organoids per well would require the aforementioned image analysis steps (instance segmentation and tracking) and potentially add significant complexity to the analysis procedure, so this information is important to estimate the effort for setting up a similar approach in other organoid cultures (for example cancer organoids, where multiple organoids per well are common / may not be preventable in certain experimental settings).
      2. The terminology used with respect to the test and validation set is contrary to the field, and reporting the results on the test set (should be called validation set), should be avoided since it is used to select models. In more detail: the terms "test set" and "validation set" (introduced in 213-221) are used with the opposite meaning to their typical use in the deep learning literature. Typically, the validation set refers to a separate split that is used to monitor convergence / avoid overfitting during training, and the test set refers to an external set that is used to evaluate the performance of trained models. The study uses these terms in an opposite manner, which becomes apparent from line 624: "best performing model ... judged by the loss of the test set.". Please exchange this terminology, it is confusing to a machine learning domain expert. Furthermore, the performance on the test set (should be called validation set) is typically not reported in graphs, as this data was used for model selection, and thus does not provide an unbiased estimate of model performance. I would remove the respective curves from Figures 3 and 4.
      3. The experimental set-up for the human expert baseline is quite different to the evaluation of the machine learning models. The former is based on the annotation of 4,000 images by seven experts, the latter on cross-validation experiments on a larger dataset. First of all, the details on the human expert labeling procedure are very sparse; I could only find a very short description in the paragraph 136-144, but did not find any further details in the methods section. Please add a methods section paragraph that explains in more detail how the images were chosen, how they were assigned to annotators, and if there was any redundancy in annotation, and if yes how this was resolved / evaluated. Second, the fact that the set-up for human experts and ML models is quite different means that these values are not quite comparable in a statistical sense. Ideally, human estimators would follow the same set-up as in ML (as in, evaluate the same test sets). However, this would likely be prohibitive in the required effort, so I think it's enough to state this fact clearly, for example by adding a comment on this to the captions of Figure 3 and 4.
      4. It is unclear to me where the theoretical time window for the Latent Determination Horizon in Figure 5 (also mentioned in line 350) comes from? Please explain this in more detail and provide a citation for it.
      5. The interpretability analysis (Figure 4, 634-639) based on relevance backpropagation was performed based on DenseNet121 only. Why did you choose this model and not the ResNet / MobileNet? I think it is quite crucial to see if there are any differences between these models, as this would show how much weight can be put on the evidence from this analysis, and I would suggest adding an additional experiment and supplementary figure on this.
      6. The code referenced in the code availability statement is not yet present. Please make it available and ensure a good documentation for reproducibility. Similarly, it is unclear to me what is meant by "The data that supports the findings will be made available on HeiDoc". Does this only refer to the intermediate results used for statistical analysis? I would also recommend to make the image data of this study available. This could for example be done through a dedicated data deposition service such as BioImageArchive or BioStudies, or with less effort via zenodo. This would ensure both reproducibility as well as potential re-use of the data. I think the latter point is quite interesting in this context; as the authors state themselves it is unclear if prediction of the TOIs isn't even possible at an earlier point that could be achieved through model advances, which could be studied by making this data available.
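      To make the split terminology in point 2 concrete, the sketch below assigns images to training, validation, and test sets at the level of whole organoids, so that the validation set can be used for model selection while the test set is held back for reporting. The function name `split_by_organoid`, the split fractions, and the image and organoid identifiers are hypothetical and not taken from the manuscript.

```python
import random


def split_by_organoid(image_ids, organoid_of, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split images into train / validation / test so that every image of a
    given organoid lands in exactly one split (no leakage across splits)."""
    organoids = sorted({organoid_of[i] for i in image_ids})
    rng = random.Random(seed)
    rng.shuffle(organoids)

    n_train = int(round(fractions[0] * len(organoids)))
    n_val = int(round(fractions[1] * len(organoids)))
    train_groups = set(organoids[:n_train])
    val_groups = set(organoids[n_train:n_train + n_val])

    splits = {"train": [], "validation": [], "test": []}
    for image_id in image_ids:
        group = organoid_of[image_id]
        if group in train_groups:
            splits["train"].append(image_id)       # used to fit weights
        elif group in val_groups:
            splits["validation"].append(image_id)  # used for model selection
        else:
            splits["test"].append(image_id)        # used only for final reporting
    return splits


# Hypothetical data: 20 organoids imaged at 3 timepoints each.
organoid_of = {
    f"org{o:02d}_t{t}": f"org{o:02d}" for o in range(20) for t in range(3)
}
image_ids = sorted(organoid_of)
splits = split_by_organoid(image_ids, organoid_of)
groups = {name: {organoid_of[i] for i in ids} for name, ids in splits.items()}
print({name: len(ids) for name, ids in splits.items()})
```

      Grouping by organoid matters because images of the same organoid at different timepoints are highly correlated; splitting them across sets would overstate performance.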

      Minor comments:

      Line 315: Please add a citation for relevance backpropagation here.

      Line 591: There seems to be a typo: "[...] classification of binary classification [...]"

      Line 608: "[...] where the images of individual organoids served as groups [...]" It is unclear to me what this means.

      Significance

      General assessment: This study demonstrates that (retinal) organoid development can be predicted from early timepoints with deep learning, where these cannot be discerned by human experts or simpler machine learning models. This fact is very interesting in itself due to its implication for organoid development, and could provide a valuable tool for molecular analysis of different organoid populations, as outlined by the authors. The contribution could be strengthened by providing a more thorough investigation of what features in the image are predictive at early timepoints, using a more sophisticated approach than relevance backprop, e.g. Discover (https://www.nature.com/articles/s41467-024-51136-9). This could provide further biological insight into the underlying developmental processes and enhance the understanding of retinal organoid development.

      Advance: similar studies that predict developmental outcome based on image data, for example cell proliferation or developmental outcome, exist. However, to the best of my knowledge, this study is the first to apply such a methodology to organoids, and it convincingly shows its efficacy and argues its potential practical benefits. It thus constitutes a solid technical advance that could be especially impactful if it could be translated to other organoid systems in the future.

      Audience: This research is of interest to a technical audience. It will be of immediate interest to researchers working on retinal organoids, who could adapt and use the proposed system to support experiments by better distinguishing organoids during development. To enable this application, code and data availability should be ensured (see above comments on reproducibility). It is also of interest to researchers in other organoid systems, who may be able to adapt the methodology to different developmental outcome predictions. Finally, it may also be of interest to image analysis / deep learning researchers as a dataset to improve architectures for predictive time series modeling.

      My research background: I am an expert in computer vision and deep learning for biomedical imaging, especially in microscopy. I have some experience developing image analysis for (cancer) organoids. I don't have any experience on the wet lab side of this work.

      Constantin Pape

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02830

      Corresponding author(s): Julien, Sage

      1. General Statements

      We thank the Reviewers for a fair review of our work and helpful suggestions. We have significantly revised the manuscript in response to these suggestions. We provide a point-by-point response to the Reviewers below but wanted to highlight in our response a recurring concern related to the strong cell cycle arrest observed upon the acute FAM53C knock-down being different than the limited phenotypes in other contexts, including the knockout mice and DepMap data.

      First, we now show that we can recapitulate the strong G1 arrest resulting from the FAM53C knock-down using two independent siRNAs in RPE-1 cells, supporting the specificity of the effects.

      Second, the G1 arrest that results from the FAM53C knock-down is also observed in cells with inactive p53, suggesting it is not due to a non-specific stress response due to “toxic” siRNAs. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype.

      Third, we have performed experiments in other human cells, including cancer cell lines. As would be expected for cancer cells, the G1 arrest is less pronounced but is still significant, indicating that the G1 arrest is not unique to RPE-1 cells.

      Fourth, it is not unexpected that compensatory mechanisms would be activated upon loss of FAM53C during development or in cancer – which may explain the lack of phenotypes in vivo or upon long-term knockout. This has been true for many cell cycle regulators, either because of compensation by other family members that have overlapping functions, or by a larger scale rewiring of signaling pathways.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays they then show that FAM53C interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments:

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53C has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C, are overstated. In addition, some choices made in methodology and data representation need further attention.

      1. The authors do note that P21 levels increase upon FAM53C knockdown. They show convincing evidence that this is not a P53-dependent response. But the claim that "p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells" (line 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.

      The Reviewer raises a great point. Our initial statement needed to be clarified and also needed more experimental support. We have performed experiments where we knocked down FAM53C and p21 individually, as well as in combination, in RPE-1 cells. These experiments show that p21 knock-down is not sufficient to negate the cell cycle arrest resulting from the FAM53C knock-down in RPE-1 cells (Figure 4B,C and Figure S4C,D).

      We now extended these experiments to conditions where we inhibited DYRK1A, and we also compared these data to experiments in p53-null RPE-1 cells. Altogether, these experiments point to activation of p53 downstream of DYRK1A activation upon FAM53C knock-down, and indicate that p21 is not the only critical p53 target in the cell cycle arrest observed in FAM53C knock-down cells (Figure 4 and Figure S4).

      The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points I honestly don't understand how the authors can conclude from S4C that FAM53C knockdown causes (DYRK1A-dependent) increases in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.

      We appreciate these comments from the Reviewer and have significantly revised the manuscript to address them.

      The analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We removed previous panel 4B from the revised manuscript. For panels 4E and S4B (now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      The representative Western blot images for 5C-D (now 5F-G) in the original submission are shown in Figure 5E; we apologize if this was not clear. The differences are small, which we acknowledge in the revised manuscript. Note that several factors can affect Cyclin D levels in cells, including the growth rate and the stage of the cell cycle. Our FACS analysis shows that normal organoids have ~63% of cells in G1 and ~13% in S phase; the overall lower proportion of S-phase cells in organoids may make the immunoblot difference appear smaller, with fewer cycling cells resulting in decreased Cyclin D phosphorylation.

      Nevertheless, the Reviewer brings up a good point and comments from this Reviewer and the others made us re-think how to best interpret our results. As discussed above, we re-read carefully the Meyer paper and think that FAM53C’s role and DYRK1A activity in cells may be understood when considering levels of both CycD and p21 at the same time in a continuum. While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is likely that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also perform cell counting experiments as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells?

      We repeated the experiments with the DYRK1A inhibitor and counted the cells. In p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells.

      The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.

      We apologize for these duplicated panels in the original submission. We now replaced the wrong panel with the correct data (Fig. 5F,G).

      Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and the body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to take away claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.

      We agree with the Reviewer that, although we observed significant p-values, this original statement may not be appropriate in the biological sense. We made sure in the revised manuscript to carefully present these data.

      Minor comments:

      Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased.

      To address this point, we updated Table S1 (2nd tab) to provide a better rationale for the 38 factors chosen. Our focus was on the canonical RB pathway and we included RB binding proteins whose function had suggested they may also be playing a role in the G1/S transition. We do agree that there is some bias in this selection (e.g., there are more RB binding factors described) but we hope the Reviewer will agree with us that this list and the subsequent analysis identified expected factors, including FAM53C. Future studies using this approach and others will certainly identify new regulators of cell cycle progression.

      Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain in more detail.

      We agree with the Reviewer that this panel was not necessarily useful and possibly in the wrong place, and we removed it from the manuscript. We replaced it with a cartoon of top hits in the screen.

      The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks.

      We re-graphed these panels.

      Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header.

      We changed the header to “Consequences of FAM53C inactivation in human cortical organoids in culture”.

      Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?

      Thank you for your feedback. The subG1 population in the original Figure S5F represents cells that died during the dissociation step of the organoids for FACS analysis. To address this point, we performed live & dead staining to exclude dead cells and provide clearer data. We refined the gating strategy for better clarity in the new S5F panel.

      Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant.

      We fixed this mistake, thank you.

      Reviewer #1 (Significance (Required)):

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field.

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this study Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein, FAM53C, as a potential regulator of G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted increased levels of CDKN1A transcript and p21 protein that could explain G1 arrest of FAM53C-depleted cells but, surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. Authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1 but this was true only in cells lacking functional p53. This is quite confusing as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of IPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids but whether that reflects an increased activity of DYRK1 or is just an off target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. Authors did not observe any significant changes in survival nor in organ development but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified a previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in current form the data is too preliminary to justify publication.

      Major points

      1. Whole study is based on one siRNA to Fam53C and its specificity was not validated. Level of the knock down was shown only in the first figure and not in the other experiments. The observed phenotypes in the cell cycle progression may be affected by variable knock-down efficiency and/or potential off target effects.

      We thank the Reviewer for raising this important point. First, we need to clarify that our experiments were performed with a pool of siRNAs (not one siRNA). Second, commercial antibodies against FAM53C are not of the best quality and it has been challenging to detect FAM53C using these antibodies in our hands – the results are often variable. In addition, to better address the Reviewer’s point and control for the phenotypes we have observed, we performed two additional series of experiments: first, we have confirmed G1 arrest in RPE-1 cells with individual siRNAs, providing more confidence for the specificity of this arrest (Fig. S1B); second, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (Fig. S1E,F and Fig. 4F).

      Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types.

      As mentioned above, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (three cancer cell lines) (Fig. S1E,F and Fig. 4F).

      Authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203), however this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations when phosphorylation of cyclin D is reduced. It rather suggests that DYRK1 is not inhibited by FAM53C but perhaps FAM53C competes with cyclin D. Further, authors should address if the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?

      We revised the text of the manuscript to include the possibility that FAM53C could act as a competitive substrate and/or an inhibitor.

      We removed most of the Cyclin D phosphorylation/stability data from the revised manuscript. As the Reviewers pointed out, some of these data were statistically significant but the biological effects were small. As discussed above in our response to Reviewer #1, the analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We note, however, that we used specific Thr286 phospho-antibodies, which have been used extensively in the field. Our data in Figure 1 with palbociclib place FAM53C upstream of Cyclin D/CDK4,6. We performed Cyclin D overexpression experiments but RPE-1 cells did not tolerate high expression of Cyclin D1 (T286A mutant) and we have not been able to conduct more ‘genetic’ studies.

      At many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistics was used in Fig 4C? Impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does DYRK1 inhibitor prevent modification of cyclin D?

      As discussed above, we removed some of these data and re-focused the manuscript on p53-p21 as a second pathway activated by loss of FAM53C.

      Validation of SM13797 compound in terms of specificity to DYRK1 was not performed.

      This is an important point. We had cited an abstract from the company (Biosplice) but we agree that providing data is critical. We have now revised the manuscript with a new analysis of the compound’s specificity using kinase assays. These data are shown in Fig. S3F-H.

      A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off.

      The Reviewer made a good point. As discussed in our response to Reviewer #1, with p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells. These data indicate that G1 entry by flow cytometry will not always translate into proliferation.

      Other points:

      Fig. 2C, 2D, 2E graphs should begin with 0

      We remade these graphs.

      Fig. 5D shows that the difference in p21 levels is not significant in FAM53C-KO cells but difference is mentioned in the text.

      We replaced the panel by the correct panel; we apologize for this error.

      Fig. 6D comparison of datasets of extremely different sizes does not seem to be appropriate

      We agree and revised the text. We hope that the Reviewer will agree with us that it is worth showing these data, which are clearly preliminary but provide evidence of a possible role for FAM53C in the brain.

      Could there be alternative splicing in mice generating a partially functional protein without exon 4? Did authors confirm that the animal model does not express FAM53C?

      We performed RNA sequencing of mouse embryonic fibroblasts derived from control and mutant mice. We clearly identified fewer reads in exon 4 in the knockout cells, and no other obvious change in the transcript (data not shown). However, immunoblot with mouse cells for FAM53C never worked well in our hands. We made sure to add this caveat to the revised manuscript.

      Reviewer #2 (Significance (Required)):

      Main problem of this study is that the advanced experimental models in IPSCs and mice did not confirm the observations in the cell lines and thus the whole manuscript does not hold together. Although I acknowledge the effort the authors invested in these experiments, the data do not contribute to the main conclusion of the paper that FAM53C/DYRK1 regulates G1/S transition.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper identifies FAM53C as a novel regulator of cell cycle progression, particularly at the G1/S transition, by inhibiting DYRK1A. Using data from the Cancer Dependency Map, the authors suggest that FAM53C acts upstream of the Cyclin D-CDK4/6-RB axis by inhibiting DYRK1A.

      Specifically, their experiments suggest that FAM53C Knockdown induces G1 arrest in cells, reducing proliferation without triggering apoptosis. DYRK1A Inhibition rescues G1 arrest in P53KO cells, suggesting FAM53C normally suppresses DYRK1A activity. Mass Spectrometry and biochemical assays confirm that FAM53C directly interacts with and inhibits DYRK1A. FAM53C Knockout in Human Cortical Organoids and Mice leads to cell cycle defects, growth impairments, and behavioral changes, reinforcing its biological importance.

      Strength of the paper:

      The study introduces a novel cell cycle control signalling module upstream of CDK4/6 in G1/S regulation which could have significant impact. The identification of FAM53C using a depmap correlation analysis is a nice example of the power of this dataset. The experiments are carried out mostly in a convincing manner and support the conclusions of the manuscript.

      Critique:

      1) The experiments rely heavily on siRNA transfections without the appropriate controls. There are so many cases of off-target effects of siRNA in the literature, and specifically for a strong phenotype on S-phase as described here, I would expect to see solid results by additional experiments. This is especially important since the ko mice do not show any significant developmental cell cycle phenotypes. Moreover, FAM53C does not show a strong fitness effect in the depmap dataset, suggesting that it is largely non-essential in most cancer cell lines. For this paper to reach publication in a high-standard journal, I would expect that the authors show a rescue of the S-phase phenotype using an siRNA-resistant cDNA, and show similar S-phase defects using an acute knock out approach with lentiviral gRNA/Cas9 delivery.

      We thank the Reviewer for this comment. Please refer to the initial response to the three Reviewers, where we discuss our use of single siRNAs and our results in multiple cell lines. Briefly, we can recapitulate the G1 arrest upon FAM53C knock-down using two independent siRNAs in RPE-1 cells. We also observe the same G1 arrest in p53 knockout cells, suggesting it is not due to a non-specific stress response. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype. Human cancer cell lines also arrest in G1 upon FAM53C knock-down, not just RPE-1 cells. Finally, we hope the Reviewer will agree with us that compensatory mechanisms are very common in the cell cycle – which may explain the lack of phenotypes in vivo or upon long-term knockout of FAM53C.

      2) The S-phase phenotype following FAM53C should be demonstrated in a larger variety of TP53WT and mutant cell lines. Given that this paper introduces a new G1/S control element, I think this is important for credibility. Ideally, this should be done with acute gRNA/Cas9 gene deletion using a lentiviral delivery system; but if the siRNA rescue experiments work and validate an on-target effect, siRNA would be an appropriate alternative.

      We now show data with three cancer cell lines (U2OS, A549, and HCT-116 – Fig. S1E,F and Fig. 4F), in addition to our results in RPE-1 cells and in human cortical organoids. We note that the knock-down experiments are complemented by overexpression data (Fig. 1G-I), by genetic data (our original DepMap screen), and our biochemical data (showing direct binding of FAM53C to DYRK1A).

      3) The western blot images shown in the MS appear heavily over-processed and saturated (See for example S4B, 4A, B, and E). Perhaps the authors should provide the original un-processed data of the entire gels?

      For several of our panels (e.g., 4E and S4B, now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      Data in 4A are also not a western blot but a radiograph.

      For immunoblots, we will provide all the source data with uncropped blots with the final submission.

      4) A critical experiment for the proposed mechanism is the rescue of the FAM53C S-phase reduction using DYRK1A inhibition shown in Figure 4. The legend here states that the data were extracted from BrdU incorporation assays, but in Figure S4D only the PI histograms are shown, and the S-phase population is not quantified. The authors should show the BrdU scatterplot and quantify the phenotype using the S-phase population in these plots. G1 measurements from PI histograms are not precise enough to allow for conclusions. Also, why are the intensities of the PI peaks so variable in these plots? Compare, for example, the HCT116 upper and lower panels where the siRNA appears to have caused an increase in ploidy.

      We apologize for the confusion and have fixed these errors. For most of the analyses, we used PI to measure G1 and S-phase entry. We added relevant flow cytometry plots to supplemental figures (Fig. S1G, H, I, as well as Fig. S4E and S4K, and Fig. S5F).

      5) There's an apparent contradiction in how RB deletion rescues the G1 arrest (Figure 2) while p21 seems to maintain the arrest even when DYRK1A is inhibited. Is p21 not induced when FAM53C is depleted in RB ko cells? This should be measured and discussed.

      This comment and comments from the two other Reviewers made us reconsider our model. We carefully re-read the Meyer paper and think that DYRK1A activity may be understood when considering levels of both CycD and p21 at the same time in a continuum (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is obvious that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      Reviewer #3 (Significance (Required)):

      In conclusion, I believe that this MS could potentially be important for the cell cycle field and also provide a new target pathway that could be relevant for cancer therapy. However, the paper has quite a few gaps and inconsistencies that need to be addressed with further experiments. My main worry is that the acute depletion phenotypes appear so strong, while the gene is non-essential in mice and shows only a minor fitness effect in the depmap screens. More convincing controls are necessary to rule out experimental artefacts that misguide the interpretation of the results.

      We appreciate this comment and hope that the Reviewer will agree it is still important to share our data with the field, even if the phenotypes in mice are modest.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This paper identifies FAM53C as a novel regulator of cell cycle progression, particularly at the G1/S transition, by inhibiting DYRK1A. Using data from the Cancer Dependency Map, the authors suggest that FAM53C acts upstream of the Cyclin D-CDK4/6-RB axis by inhibiting DYRK1A.

      Specifically, their experiments suggest that FAM53C Knockdown induces G1 arrest in cells, reducing proliferation without triggering apoptosis. DYRK1A Inhibition rescues G1 arrest in P53KO cells, suggesting FAM53C normally suppresses DYRK1A activity. Mass Spectrometry and biochemical assays confirm that FAM53C directly interacts with and inhibits DYRK1A. FAM53C Knockout in Human Cortical Organoids and Mice leads to cell cycle defects, growth impairments, and behavioral changes, reinforcing its biological importance.

      Strength of the paper:

      The study introduces a novel cell cycle control signalling module upstream of CDK4/6 in G1/S regulation which could have significant impact. The identification of FAM53C using a depmap correlation analysis is a nice example of the power of this dataset. The experiments are carried out mostly in a convincing manner and support the conclusions of the manuscript.

      Critique:

      1. The experiments rely heavily on siRNA transfections without the appropriate controls. There are so many cases of off-target effects of siRNA in the literature, and specifically for a strong phenotype on S-phase as described here, I would expect to see solid results by additional experiments. This is especially important since the ko mice do not show any significant developmental cell cycle phenotypes. Moreover, FAM53C does not show a strong fitness effect in the depmap dataset, suggesting that it is largely non-essential in most cancer cell lines. For this paper to reach publication in a high-standard journal, I would expect that the authors show a rescue of the S-phase phenotype using an siRNA-resistant cDNA, and show similar S-phase defects using an acute knock out approach with lentiviral gRNA/Cas9 delivery.
      2. The S-phase phenotype following FAM53C should be demonstrated in a larger variety of TP53WT and mutant cell lines. Given that this paper introduces a new G1/S control element, I think this is important for credibility. Ideally, this should be done with acute gRNA/Cas9 gene deletion using a lentiviral delivery system; but if the siRNA rescue experiments work and validate an on-target effect, siRNA would be an appropriate alternative.
      3. The western blot images shown in the MS appear heavily over-processed and saturated (See for example S4B, 4A, B, and E). Perhaps the authors should provide the original un-processed data of the entire gels?
      4. A critical experiment for the proposed mechanism is the rescue of the FAM53C S-phase reduction using DYRK1A inhibition shown in Figure 4. The legend here states that the data were extracted from BrdU incorporation assays, but in Figure S4D only the PI histograms are shown, and the S-phase population is not quantified. The authors should show the BrdU scatterplot and quantify the phenotype using the S-phase population in these plots. G1 measurements from PI histograms are not precise enough to allow for conclusions. Also, why are the intensities of the PI peaks so variable in these plots? Compare, for example, the HCT116 upper and lower panels where the siRNA appears to have caused an increase in ploidy.
      5. There's an apparent contradiction in how RB deletion rescues the G1 arrest (Figure 2) while p21 seems to maintain the arrest even when DYRK1A is inhibited. Is p21 not induced when FAM53C is depleted in RB ko cells? This should be measured and discussed.

      Significance

      In conclusion, I believe that this MS could potentially be important for the cell cycle field and also provide a new target pathway that could be relevant for cancer therapy. However, the paper has quite a few gaps and inconsistencies that need to be addressed with further experiments. My main worry is that the acute depletion phenotypes appear so strong, while the gene is non-essential in mice and shows only a minor fitness effect in the depmap screens. More convincing controls are necessary to rule out experimental artefacts that misguide the interpretation of the results.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this study Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein FAM53C as a potential regulator of G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted increased levels of CDKN1A transcript and p21 protein that could explain G1 arrest of FAM53C-depleted cells but, surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. Authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1 but this was true only in cells lacking functional p53. This is quite confusing as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of IPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids but whether that reflects an increased activity of DYRK1 or is just an off target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. Authors did not observe any significant changes in survival nor in organ development but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified a previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in its current form the data is too preliminary to justify publication.

      Major points

      1. Whole study is based on one siRNA to Fam53C and its specificity was not validated. Level of the knock down was shown only in the first figure and not in the other experiments. The observed phenotypes in the cell cycle progression may be affected by variable knock-down efficiency and/or potential off target effects.
      2. Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types.
      3. Authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203), however this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations when phosphorylations of cyclin D is reduced. It rather suggests that DYRK1 is not inhibited by FAM53C but perhaps FAM53C competes with cyclin D. Further, authors should address if the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?
      4. At many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistics was used in Fig 4C? Impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does DYRK1 inhibitor prevent modification of cyclin D?
      5. Validation of SM13797 compound in terms of specificity to DYRK1 was not performed.
      6. A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off.

      Other points

      1. Fig. 2C, 2D, 2E graphs should begin with 0
      2. Fig. 5D shows that the difference in p21 levels is not significant in FAM53C-KO cells but difference is mentioned in the text.
      3. Fig. 6D comparison of datasets of extremely different sizes does not seem to be appropriate
      4. Could there be alternative splicing in mice generating a partially functional protein without exon 4? Did authors confirm that the animal model does not express FAM53C?

      Significance

      Main problem of this study is that the advanced experimental models in IPSCs and mice did not confirm the observations in the cell lines and thus the whole manuscript does not hold together. Although I acknowledge the effort the authors invested in these experiments, the data do not contribute to the main conclusion of the paper that FAM53C/DYRK1 regulates G1/S transition.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays they then show that FAM53 interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments:

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53 has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C are overstated. In addition, some choices made in methodology and data representation need further attention.

      1. The authors do note that P21 levels increase upon FAM53C knockdown. They show convincing evidence that this is not a P53-dependent response. But the claim that "p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells" (line 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.
      2. The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points I honestly don't understand how the authors can conclude from S4C that knockdown of FAM53C causes (DYRK1A-dependent) increases in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.
      3. The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also perform cell counting experiments as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells?
      4. The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.
      5. Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and the body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to remove claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.

      Minor comments:

      1. Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased.
      2. Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain in more detail.
      3. The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks.
      4. Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header.
      5. Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?
      6. Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant.

      Significance

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field.

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Since we are at the stage of simply proposing a Revision Plan to an affiliate journal, there is not a revised version of the manuscript yet. But we sincerely thank the three reviewers for their important input, which we are taking into consideration very seriously.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Major Comments:

      It is an interesting case study, but the main problem with the study is the use of an unsuitable tardigrade model species. It was shown in the past that Hypsibius exemplaris is not a good model species to test tardigrade survival under extreme stress. Of course, results for Hypsibius exemplaris can be published, but all general comments that tardigrades react in this or that way need to be removed from the entire manuscript. This is characteristic only of Hypsibius exemplaris, which is a poor model for studies focused on environmental stress. To present general conclusions, a few different tardigrade species, or at least a suitable tardigrade species with confirmed high resilience to different kinds of stress, such as Milnesium, Ramazzottius, Paramacrobiotus or similar, must be tested. Based on the present study I can only propose to publish this manuscript as a case study for one poorly stress-resistant eutardigrade species, without any general conclusions about other tardigrades. See: Poprawa, I., Bartylak, T., Kulpla, A., Erdmann, W., Roszkowska, M., Chajec, Ł., Kaczmarek, Ł., Karachitos, A. & Kmita, H. (2022) Verification of Hypsibius exemplaris Gąsiorek et al., 2018 (Eutardigrada; Hypsibiidae) application in anhydrobiosis research. PLoS ONE 17(3): e0261485.

      Minor comments:

      1. General comment on the entire manuscript. Please do not start sentences with abbreviations, i.e. The DNA instead of DNA, Caenorhabditis instead of C. etc. In the bibliography, DOI numbers are lacking for many publications and the citation styles are inconsistent. Do not use capital letters for words inside the article title, e.g. change "Tardigrades as a Potential Model Organism in Space Research." to "Tardigrades as a potential model organism in space research.", or use capital letters in all citations. Use italics for Latin names of species and genera. In the figures, please try to arrange the specimens so that they are situated horizontally and in the middle of the figure.
      2. Introduction, Lines 80-96: I do not understand why this section is in the Introduction. This description of the results of the study could be minimal, and details could be moved to the proper chapters.
      3. Results: In this section results are mixed with methods. Please put all parts in the correct chapters.
      4. Line 227 and 235: Based on what you interpreted: "fully-grown adults" and "juveniles" that they were adult and fully grown? Please explain in the text.
      5. Line 315: You wrote "These findings demonstrate that even a transient exposure to zeocin causes irreversible DNA damage, leading to delayed mortality." but not to all specimens as you marked above.
      6. Line 461-462: You wrote: "In this study, we probed why tardigrades-despite their impressive DNA repair capacity and extremotolerance-still succumb to genotoxic stress." But only one tardigrade species with poor resilience to stress conditions has been tested in this study. What if more repair mechanisms are activated in tardigrades when tardigrades leaving the state of anhydrobiosis? Authors tested only active animals and in such mechanisms maybe not activated or are activated on lower level. What is even more problematic, and what I marked this in one of the first comments, the species used in study is incorrect because is not very resilient to extreme conditions. This species is also a poor anhydrobiotic species with almost zero ability to anhydrobiosis (during which repair mechanisms are activated).
      7. Line 609: "..actively searching for food.." - How you know that they were looking for food? What was a difference between normal crawling around and looking for food?
      8. Line 635: "In sum, tardigrades illustrate that..." - Only in case of Hypsibius. This is not characteristic for tardigrades. See my previous comments. This conclusion is too strong without adequate proof.
      9. Lines 666-667: "Adults measured {greater than or equal to}240 μm in length, while juveniles ranged between 120-180 μm." - Why such measurements? It was connected with something or is it arbitrary? Please explain.
      10. Lines: 673-677: "For each timepoint, fertility was calculated by dividing the total number of eggs laid by the number of live animals at that time (using the last recorded number of live animals when all animals had died). In Fig. 5A-B, fertility is presented as the mean cumulative number of eggs laid per animal over time; in Fig. S9, it is shown as the mean number of eggs laid per animal at each timepoint." - This method of calculating fertility may be valid only if you know that all the females laid the same number of eggs. It is obvious that some females produced less and some others more eggs. Hence, fertility can not be accurately calculated in this way.
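The calculation quoted in comment 10 can be sketched as follows. This is only an illustration under stated assumptions: the egg and animal counts are invented, the function names are hypothetical, and the fallback to the last non-zero animal count follows the quoted Methods text rather than the authors' actual scripts. The second part shows the reviewer's point that two groups with very different per-female output can share the same mean.

```python
from statistics import mean, pstdev


def per_animal_fertility(eggs_per_timepoint, live_per_timepoint):
    """Eggs laid at each timepoint divided by the number of live animals.

    When no animals are alive, the last recorded non-zero count is used,
    as described in the quoted Methods text.
    """
    rates = []
    last_live = None
    for eggs, live in zip(eggs_per_timepoint, live_per_timepoint):
        if live > 0:
            last_live = live
        if last_live is None:
            raise ValueError("no live animals recorded yet")
        rates.append(eggs / last_live)
    return rates


def cumulative(values):
    """Running total, i.e. cumulative eggs per animal over time."""
    total = 0.0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


# Hypothetical counts, for illustration only.
eggs = [0, 6, 9, 3, 0]
live = [10, 10, 6, 3, 0]
rates = per_animal_fertility(eggs, live)
cum = cumulative(rates)

# Two hypothetical groups of four females with the same mean egg count
# but very different individual contributions.
uniform = [3, 3, 3, 3]
skewed = [0, 0, 0, 12]
same_mean = mean(uniform) == mean(skewed)
spread_uniform = pstdev(uniform)
spread_skewed = pstdev(skewed)
```

A population-level ratio of this kind reports only the mean; the spread between females is recoverable only if eggs are tracked per individual.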

      Significance

      The studies described in the manuscript will be very interesting for many potential readers; however, the manuscript needs to be reframed as a case study of one tardigrade species, without generalizing the results to all tardigrades. It is very important not to suggest that all tardigrades react in the same way, especially since the species used is not a good candidate for this type of study (see my major comments).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript studies the effects of genotoxic stress using zeocin, a bleomycin-family drug, in the tardigrade species H. exemplaris. In a first experimental set, the authors evaluate the survival of the organisms as well as the levels of DNA damage.

      A RT-qPCR analysis of a set of DNA repair genes identified in a previous study by another group (Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol. 34, Issue 9, 1819-1830.e6) and a comet assay reveal the damage observed during treatment.

      Experiments on fasting animals show variations in animal size that overlap with those seen in groups of animals treated with the genotoxic drug. Physiological variations are also observed, such as lipid loss and cuticle alteration.

      In a subsequent experimental set, the authors indicate that the genotoxic drug blocks DNA replication and activates DNA repair systems in various tissues, particularly the digestive tissue, which appears to be specifically targeted in terms of its replicative capacity following DNA damage caused by the drug. A sensitivity study of tardigrade embryo development then shows that their proliferative capacity, which is highly dependent on replication, mobilizes different sets of DNA repair genes that may be more closely associated with replication than in adults.

      Finally, a comparative study of the development of two organisms (C. elegans and planarian) also shows sensitivity to drugs that disrupt the replication process during development.

      The authors conclude from all of this work that the cells of the animals' intestines are the main target of the genotoxic stress induced by the drug. The effects of disruption of the normal replication process in intestinal cells are thought to be the cause of the observed loss of tissue homeostasis (loss of lipids and tissue renewal capacity).

      Major comments:

      1. Zeocin is a drug derived from bleomycin but has not yet been extensively studied. Could you give examples of the use/validation of zeocin as a radiomimetic in other biological systems?

      2. Similarities in transcriptional responses between UV and dehydration genotoxic stresses have already been observed (Yoshida et al., 2022; BMC Genomics 23, 405) in a tardigrade species closely related to H. exemplaris (R. varieornatus). However, no correlation in transcriptional responses could be observed after treating H. exemplaris with genotoxic stresses such as desiccation and 500 Gy gamma ray irradiation (Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol 34, Issue 9, 1819 - 1830.e6). These results indicate that, depending on the type of genotoxic stress, transcriptomic responses can appear to be very different and sometimes uncorrelated, particularly in the species H. exemplaris. Bleomycin has been studied in previous reports (refs Yoshida Y, et al. Proc Jpn Acad Ser B Phys Biol Sci. 2024 100(7):414-428; Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol 34, Issue 9, 1819 - 1830.e6; Marwan Anoud et al., 2024, eLife 13:RP92621), which used a transcriptomic study to confirm that it behaves as a radiomimetic for the species H. exemplaris.

      On the other hand, since zeocin is a bleomycin-family drug, it is possible that its effects may differ slightly from those of bleomycin, exhibiting specific effects as observed by comparison of chemical radiomimetic and radiation treatments.

      A control experiment comparing the effects of bleomycin and zeocin using RNAseq would validate that their use is equivalent.

      3. A major conclusion of the manuscript is that DNA damage induced by the genotoxic drug disrupts replication mechanisms and leads to the observed effects. Is the induction of the subset of repair genes measured by RT-qPCR caused solely by the drug itself, or by its indirect effect on replication?

      It would be interesting to block replication in embryos and assess whether the same sets of DNA repair genes are induced as with zeocin treatment alone. Additionally, it would be interesting to repeat the same DNA replication block experiments with additional zeocin treatment, in order to compare the induced sets of DNA repair genes. This would help to identify the effects that are directly attributable to zeocin.

      Minor comments:

      The data are well presented, and the experiments are well described for general understanding. Previous studies in this field have been well referenced. However, the link between DNA damage caused by the drug and its impact on replication needs to be better explained.

      Finally, the use of the drug zeocin should be validated in this system by comparison with bleomycin.

      Significance

      This study evaluates the resistance of a tardigrade species to genotoxic stress. Several previous studies have conducted this type of experiment in the same species, with consistent results, using the same type of genotoxic drug, bleomycin. In this study, a new genotoxic drug is evaluated for its effects on DNA damage as well as on the survival of organisms and their embryonic development. Definitive validation experiments for this new genotoxic chemical tool are necessary to determine its similarities with drugs whose effects are already known from the literature.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript concerns tardigrade sensitivity to genotoxic stress. Using the radiomimetic drug Zeocin to induce DNA breaks, the authors show that continuous exposure progressively kills tardigrades, accompanied by striking body shrinkage and lipid depletion. The authors show that germ cells and embryos, with their high proliferation rates, show heightened sensitivity. In summary, their findings pinpoint DNA replication as an Achilles' heel of organismal survival under genotoxic stress.

      Major comments:

      The claims and conclusions in this article are not sufficiently supported by the data. They require additional experiments or analyses.

      The fundamental problem with this paper is the use of a single molecule, Zeocin, as a radiomimetic. It is absolutely essential to compare the results with those obtained with radiation. In the literature, researchers compare a drug with radiation. Bob Goldstein, for example, uses radiation and bleomycin in his 2024 Current Biology paper. The same is true for Concordet in his 2024 eLife paper. Zeocin has been used very little on tardigrades. It cannot be used alone to draw conclusions from this study.

      Additionally, at the beginning of the paper, the authors tested different concentrations of Zeocin. They showed results at two concentrations: 100 µg/ml and 1 mg/ml. In the remainder of the paper, only the latter concentration is used. This is not sufficient. The analyses should have been conducted in parallel at several concentrations in order to compare and analyze a potential dose-dependent effect.

      Finally, the authors focused on two types of cells that have the particularity of replicating themselves: gut cells and storage cells. It would have been necessary to work on other cell types to compare the results.

      Carrying out these additional experiments is entirely realistic.

      The data and methods are presented in a reproducible manner. But experiments sometimes lack independent replicates and need to be reproduced.

      The legend to Figure 1, for example, indicates that the experiment was conducted with 3 to 7 biological replicates and 60 to 120 animals. These are still very different numbers. And this can lead to significant bias.

      For the other figures, no biological replicates were indicated and the "n" values are sometimes very different, as in Figure 4 with n=107 and n=166. More homogeneous sample sizes would make the results more robust, and biological replicates are essential.

      Sometimes there are some unclear elements in the figures. In Figure 3, if I understand correctly, A and B show the gut cells (adult) and C and D the storage cells (juvenile). The size difference is not very clear in this image. How old is the juvenile compared to this adult?

      Significance

      This study, if confirmed by additional experiments that are absolutely essential to validate these conclusions, will be interesting for the community of researchers working on tardigrades, even if the effects of genotoxic stress on tardigrades are already widely studied.

      This study is relatively complete for only one molecule, Zeocin, at a concentration of 1 mg/ml. To be relevant, another genotoxic stress should be included in the study. The study should also be conducted at the concentration of 100 µg/ml, which did show effects but was abandoned for the rest of the study. Similarly, only storage cells and gut cells were studied, given their replication capacity. Other cell types should have been included in the study for comparison.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We would like to thank all the reviewers for their valuable comments and criticisms. We have thoroughly revised the manuscript and the resource to address all the points raised by the reviewers. Below, we provide a point-by-point response for the sake of clarity.

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary: This manuscript, "MAVISp: A Modular Structure-Based Framework for Protein Variant Effects," presents a significant new resource for the scientific community, particularly in the interpretation and characterization of genomic variants. The authors have developed a comprehensive and modular computational framework that integrates various structural and biophysical analyses, alongside existing pathogenicity predictors, to provide crucial mechanistic insights into how variants affect protein structure and function. Importantly, MAVISp is open-source and designed to be extensible, facilitating reuse and adaptation by the broader community.

      Major comments: - While the manuscript is formally well-structured (with clear Introduction, Results, Conclusions, and Methods sections), I found it challenging to follow in some parts. In particular, the Introduction is relatively short and lacks a deeper discussion of the state-of-the-art in protein variant effect prediction. Several methods are cited but not sufficiently described, as if prior knowledge were assumed. OPTIONAL: Extend the Introduction to better contextualize existing approaches (e.g., AlphaMissense, EVE, ESM-based predictors) and clarify what MAVISp adds compared to each.

      We have expanded the introduction to cover the state of the art in protein variant effect predictors and to explain how MAVISp departs from them.

      - The workflow is summarized in Figure 1(b), which is visually informative. However, the narrative description of the pipeline is somewhat fragmented. It would be helpful to describe in more detail the available modules in MAVISp, and which of them are used in the examples provided. Since different use cases highlight different aspects of the pipeline, it would be useful to emphasize what is done step-by-step in each.

      We have added a concise, narrative description of the data flow for MAVISp, as well as improved the description of modules in the main text. We will integrate the results section with a more comprehensive description of the available modules, and then clarify in the case studies which modules were applied to achieve specific results.

      OPTIONAL: Consider adding a table or a supplementary figure mapping each use case to the corresponding pipeline steps and modules used.

      We have added a supplementary table (Table S2) to guide the reader on the modules and workflows applied in each case study.

      We also added Table S1 to map the toolkit used by MAVISp to collect the data that are imported and aggregated in the webserver for further guidance.

      - The text contains numerous acronyms, some of which are not defined upon first use or are only mentioned in passing. This affects readability. OPTIONAL: Define acronyms upon first appearance, and consider moving less critical technical details (e.g., database names or data formats) to the Methods or Supplementary Information. This would greatly enhance readability.

      We revised the usage of acronyms following the reviewer’s suggestion to define them at first appearance.

      • The code and trained models are publicly available, which is excellent. The modular design and use of widely adopted frameworks (PyTorch and PyTorch Geometric) are also strong points. However, the Methods section could benefit from additional detail regarding feature extraction and preprocessing steps, especially the structural features derived from AlphaFold2 models. OPTIONAL: Include a schematic or a table summarizing all feature types, their dimensionality, and how they are computed.

      We thank the reviewer for noticing and praising the availability of the tools of MAVISp. Our MAVISp framework utilizes methods and scores that incorporate machine learning features (such as EVE or RaSP), but does not employ machine learning itself. Specifically, we do not use PyTorch and do not utilize features in a machine learning sense. We do extract some information from the AlphaFold2 models that we use (such as the pLDDT score and their secondary structure content, as calculated by DSSP), and those are available in the MAVISp aggregated csv files for each protein entry and detailed in the Documentation section of the MAVISp website.

      • The section on transcription factors is relatively underdeveloped compared to other use cases and lacks sufficient depth or demonstration of its practical utility. OPTIONAL: Consider either expanding this section with additional validation or removing/postponing it to a future manuscript, as it currently seems preliminary.

      We have removed this section and included a mention in the conclusions as part of the future directions.

      Minor comments: - Most relevant recent works are cited, including EVE, ESM-1v, and AlphaFold-based predictors. However, recent methods like AlphaMissense (Cheng et al., 2023) could be discussed more thoroughly in the comparison.

      We have revised the introduction to accommodate the proper space for this comparison.

      • Figures are generally clear, though some (e.g., performance barplots) are quite dense. Consider enlarging font sizes and annotating key results directly on the plots.

      We have revised Figure 2 so that it presents only one case study, which makes it easier to read. We have also changed Figure 3, while retaining the other figures, since they seemed less problematic.

      • Minor typographic errors are present. A careful proofreading is highly recommended. Below are some of the issues I identified:
      Page 3, line 46: "MAVISp perform" -> "MAVISp performs"
      Page 3, line 56: "automatically as embedded" -> "automatically embedded"
      Page 3, line 57: "along with to enhance" -> unclear; please revise
      Page 4, line 96: "web app interfaces with the database and present" -> "presents"
      Page 6, line 210: "to investigate wheatear" -> "whether"
      Page 6, lines 215-216: "We have in queue for processing with MAVISp proteins from datasets relevant to the benchmark of the PTM module." -> unclear sentence; please clarify
      Page 15, line 446: "Both the approaches" -> "Both approaches"
      Page 20, line 704: "advantage of multi-core system" -> "multi-core systems"

      We have proofread the entire article and addressed the points above.

      Significance

      General assessment: the strongest aspects of the study are the modularity, open-source implementation, and the integration of structural information through graph neural networks. MAVISp appears to be one of the few publicly available frameworks that can easily incorporate AlphaFold2-based features in a flexible way, lowering the barrier for developing custom predictors. Its reproducibility and transparency make it a valuable resource. However, while the technical foundation is solid and the effort substantial, the scientific narrative and presentation could be significantly improved. The manuscript is dense and hard to follow in places, with a heavy use of acronyms and insufficient explanation of key design choices. Improving the descriptive clarity, especially in the early sections, would greatly enhance the impact of this work.

      Advance

      to the best of my knowledge, this is one of the first modular platforms for protein variant effect prediction that integrates structural data from AlphaFold2 with bioinformatic annotations and even clinical data in an extensible fashion. While similar efforts exist (e.g., ESMfold, AlphaMissense), MAVISp distinguishes itself through openness and design for reusability. The novelty is primarily technical and practical rather than conceptual.

      Audience

      this study will be of strong interest to researchers in computational biology, structural bioinformatics, and genomics, particularly those developing variant effect predictors or analyzing the impact of mutations in clinical or functional genomics contexts. The audience is primarily specialized, but the open-source nature of the tool may diffuse its use among more applied or translational users, including those working in precision medicine or protein engineering.

      Reviewer expertise: my expertise is in computational structural biology, molecular modeling, and (rather weak) machine learning applications in bioinformatics. I am familiar with graph-based representations of proteins, AlphaFold2, and variant effects based on Molecular Dynamics simulations. I do not have any direct expertise in clinical variant annotation pipelines.

      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: The authors present a pipeline and platform, MAVISp, for aggregating, displaying and analysis of variant effects with a focus on reclassification of variants of uncertain clinical significance and uncovering the molecular mechanisms underlying the mutations.

      Major comments: - On testing the platform, I was unable to look-up a specific variant in ADCK1 (rs200211943, R115Q). I found that despite stating that the mapped refseq ID was NP_001136017 in the HGVSp column, it was actually mapped to the canonical UniProt sequence (Q86TW2-1). NP_001136017 actually maps to Q86TW2-3, which is missing residues 74-148 compared to the -1 isoform. The Uniprot canonical sequence has no exact RefSeq mapping, so the HGVSp column is incorrect in this instance. This mapping issue may also affect other proteins and result in incorrect HGVSp identifiers for variants.

      We would like to thank the reviewer for pointing out these inconsistencies. We have revised all the entries and corrected them. If needed, the history of the corrected cases can be found in the closed issues of the GitHub repository that we use for communication between biocurators and data managers (https://github.com/ELELAB/mavisp_data_collection). We have also revised the protocol we follow in this regard, as well as the MAVISp toolkit, to include better support for isoform matching in our pipelines, both for future entries and for the revision and monitoring of existing ones. In particular, we introduced a tool, uniprot2refseq, which helps the biocurator identify the correct match between RefSeq and UniProt in terms of sequence length and sequence identity. More details are included in the Methods section of the paper. The two relevant scripts for this step are available at: https://github.com/ELELAB/mavisp_accessory_tools/
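The kind of check described above (matching a UniProt isoform to a RefSeq entry by sequence length and identity) can be sketched as follows. This is a minimal illustration and not the uniprot2refseq implementation: the function names, the toy sequences and the hypothetical identifiers are all assumptions, and a real tool would likely retrieve sequences from the databases and may use a proper alignment.

```python
def compare_isoforms(uniprot_seq, refseq_seq):
    """Report whether two sequences have equal length and, if so,
    the fraction of identical positions (no alignment is performed)."""
    same_length = len(uniprot_seq) == len(refseq_seq)
    identity = None
    if same_length and uniprot_seq:
        matches = sum(a == b for a, b in zip(uniprot_seq, refseq_seq))
        identity = matches / len(uniprot_seq)
    return {"same_length": same_length, "identity": identity}


def exact_refseq_match(uniprot_seq, refseq_candidates):
    """Return the ID of the first candidate that is identical in length
    and sequence to the UniProt isoform, or None if there is none."""
    for refseq_id, seq in refseq_candidates.items():
        result = compare_isoforms(uniprot_seq, seq)
        if result["same_length"] and result["identity"] == 1.0:
            return refseq_id
    return None


# Toy sequences and hypothetical identifiers, for illustration only.
canonical = "MSTARLKQRVE"
candidates = {
    "NP_HYPOTHETICAL_1": "MSTAVE",       # shorter isoform, missing residues
    "NP_HYPOTHETICAL_2": "MSTARLKQRVE",  # identical to the canonical sequence
}
matched = exact_refseq_match(canonical, candidates)
unmatched = exact_refseq_match(canonical, {"NP_HYPOTHETICAL_1": "MSTAVE"})
```

Returning None when no exact match exists mirrors the situation the reviewer describes, where a canonical UniProt sequence has no exact RefSeq counterpart and the HGVSp identifier should not be assigned automatically.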

      - The paper lacks a section on how to properly interpret the results of the MAVISp platform (the case-studies are helpful, but don't lay down any global rules for interpreting the results). For example: How should a variant with conflicts between the variant impact predictors be interpreted? Are specific indicators considered more 'reliable' than others?

      We have added a section in Results to clarify how to interpret results from MAVISp in the most common use cases.

      • In the Methods section, GEMME is stated as being rank-normalised with 0.5 as a threshold for damaging variants. On checking the data downloaded from the site, GEMME was not rank-normalised but rather min-max normalised. Furthermore, Supplementary text S4 conflicts with the methods section over how GEMME scores are classified, S4 states that a raw-value threshold of -3 is used.

      We thank the reviewer for spotting this inconsistency. This part of the main text was left over from an earlier, preliminary version of the preprint, and we have now revised it. Supplementary Text S4 includes the correct reference for the value, in light of the benchmarking reported therein.
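The distinction the reviewer draws between rank normalisation and min-max normalisation matters because the same cut-off selects different variants under each scheme. The sketch below illustrates this with invented scores; the values are not GEMME outputs, the function names are hypothetical, and no assumption is made here about which direction of the score indicates damage.

```python
def min_max_normalise(scores):
    """Rescale scores linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalise(scores):
    """Replace each score by its fractional rank in [0, 1]; ties share the average rank."""
    n = len(scores)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / (n - 1)
        i = j + 1
    return ranks


# Invented raw scores, for illustration only.
raw = [-6.0, -3.0, -1.0, -0.5, 0.0]
mm = min_max_normalise(raw)
rk = rank_normalise(raw)

# The same 0.5 cut-off selects a different subset under each scheme.
above_mm = [m >= 0.5 for m in mm]
above_rk = [r >= 0.5 for r in rk]
```

With a skewed score distribution, min-max scaling keeps the original spacing while rank normalisation spreads the values evenly, so a threshold calibrated for one scheme cannot be reused for the other.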

      • Note. This is a major comment as one of the claims is that the associated web-tool is user-friendly. While functional, the web app is very awkward to use for analysis on any more than a few variants at once. The fixed window size of the protein table necessitates excessive scrolling to reach your protein-of-interest. This will also get worse as more proteins are added. Suggestion: add a search/filter bar. The same applies to the dataset window.

      We have changed the structure of the webserver in such a way that now the whole website opens as its own separate window, instead of being confined within the size permitted by the website at DTU. This solves the fixed window size issue. Hopefully, this will improve the user experience.

      We have refactored the web app by adding filtering functionality, both for the main protein table (that can now be filtered by UniProt AC, gene name or RefSeq ID) and the mutations table. Doing this required a general overhaul of the table infrastructure (we changed the underlying engine that renders the tables).

      • You are unable to copy anything out of the tables.
      • Hyperlinks in the tables only seem to work if you open them in a new tab or window.

      The table overhaul fixed both of these issues.

      • All entries in the reference column point to the MAVISp preprint even when data from other sources is displayed (e.g. MAVE studies).

      We clarified the meaning of the reference column in the Documentation on the MAVISp website, as we realized it had confused the reviewer. The reference column is meant to cite the papers in which the computationally generated MAVISp data are used, not external sources. Since the most recent release also includes the experimental data module, we have refactored the MAVISp website by adding a “Datasets and metadata” page, which details metadata for key modules. These include references to data from external sources that we include in MAVISp on a case-by-case basis (for example, the results of a MAVE experiment). Additionally, we have verified that the papers using MAVISp data are up to date in https://elelab.gitbook.io/mavisp/overview/publications-that-used-mavisp-data and in the csv files of the relevant proteins.

      The publications using MAVISp data that are currently referenced are listed below:

      SMPD1: "ASM variants in the spotlight: A structure-based atlas for unraveling pathogenic mechanisms in lysosomal acid sphingomyelinase". Biochim Biophys Acta Mol Basis Dis. PMID 38782304. https://doi.org/10.1016/j.bbadis.2024.167260

      TRAP1: "Point mutations of the mitochondrial chaperone TRAP1 affect its functions and pro-neoplastic activity". Cell Death & Disease. PMID 40074754. https://doi.org/10.1038/s41419-025-07467-6

      BRCA2: "Saturation genome editing-based clinical classification of BRCA2 variants". Nature. PMID 39779848. https://doi.org/10.1038/s41586-024-08349-1

      TP53, GRIN2A, CBFB, CALR, EGFR: "TRAP1 S-nitrosylation as a model of population-shift mechanism to study the effects of nitric oxide on redox-sensitive oncoproteins". Cell Death & Disease. PMID 37085483. https://doi.org/10.1038/s41419-023-05780-6

      KIF5A, CFAP410, PILRA, CYP2R1: "Computational analysis of five neurodegenerative diseases reveals shared and specific genetic loci". Computational and Structural Biotechnology Journal. PMID 38022694. https://doi.org/10.1016/j.csbj.2023.10.031

      KRAS: "Combining evolution and protein language models for an interpretable cancer driver mutation prediction with D2Deep". Brief Bioinform. PMID 39708841. https://doi.org/10.1093/bib/bbae664

      OPTN: "Decoding phospho-regulation and flanking regions in autophagy-associated short linear motifs". Communications Biology. PMID 40835742. https://doi.org/10.1038/s42003-025-08399-9

      DLG4, GRB2, SMPD1: "Deciphering long-range effects of mutations: an integrated approach using elastic network models and protein structure networks". JMB. PMID 40738203. https://doi.org/10.1016/j.jmb.2025.169359

      Entering multiple mutants in the "mutations to be displayed" window is time-consuming for more than a handful of mutants. Suggestion: Add a box where multiple mutants can be pasted in at once from an external document.

      During the table overhaul, we have revised the user interface to add a text box that allows free copy-pasting of mutation lists. While we understand having a single input box would have been ideal, the former selection interface (which is also still available) doesn’t allow copy-paste. This is a known limitation in Streamlit.

      Minor comments

      • Grammar. I appreciate that this manuscript may have been compiled by a non-native English speaker, but I would be remiss not to point out that there are numerous grammar errors throughout, usually sentence order issues or non-pluralisation. The meaning of the authors is mostly clear, but I recommend very thoroughly proof-reading the final version.

      We have proofread the final version of the manuscript.

      • There are numerous proteins that I know have high-quality MAVE datasets that are absent in the database e.g. BRCA1, HRAS and PPARG.

      Yes, we are aware of this. It is far from trivial to properly import the datasets from multiplex assays, and they often need to be treated on a case-by-case basis. We are in the process of carefully compiling all the MAVE data locally before releasing it in the public version of the database, which is why these datasets are missing. We are giving priority to the datasets that can be correlated with our predictions of changes in structural stability, and we will then cover the rest in batches. Having said this, we have checked the datasets for BRCA1, HRAS, and PPARG. We have imported the ones for PPARG and BRCA1 from ProteinGym, referring to the studies published in 10.1038/ng.3700 and 10.1038/s41586-018-0461-z, respectively. For HRAS, after checking both the available data and the literature in detail, we identified a suitable dataset (10.7554/eLife.27810) but struggled to determine a sensible cut-off for discriminating between pathogenic and non-pathogenic variants, so we have not included it in the MAVISp dataset for now. We will contact the authors to clarify which thresholds to apply before importing the data.

      • Checking one of the existing MAVE datasets (KRAS), I found that the variants were annotated as damaging, neutral or given a positive score (these appear to stand-in for gain-of-function variants). For better correspondence with the other columns, those with positive scores could be labelled as 'ambiguous' or 'uncertain'.

      In the KRAS case study presented in MAVISp, we used the protein abundance dataset reported in (http://dx.doi.org/10.1038/s41586-023-06954-0) and made available in the ProteinGym repository (specifically referenced at https://github.com/OATML-Markslab/ProteinGym/blob/main/reference_files/DMS_substitutions.csv#L153). We adopted the precalculated thresholds provided by the ProteinGym authors. We are therefore not sure whether the reviewer is referring to this dataset or to another KRAS dataset.

      • Numerous thresholds are defined for stabilizing / destabilizing / neutral variants in both the STABILITY and the LOCAL_INTERACTION modules. How were these thresholds determined? I note that (PMC9795540) uses a ΔΔG threshold of 1/-1 for defining stabilizing and destabilizing variants, which is relatively standard (though they also say that 2-3 would likely be better for pinpointing pathogenic variants).

      We improved the description of our classification strategies for both modules in the Documentation page of our website. We also explained more clearly the possible sources of ‘uncertain’ annotations for the two modules in both the web app (Documentation page) and the main text. Briefly, in the STABILITY module, we consider FoldX and either Rosetta or RaSP to achieve a final classification. We first classify the results of each method independently, according to the following strategy:

      If DDG ≥ 3, the mutation is Destabilizing. If DDG ≤ −3, the mutation is Stabilizing. If −2 […]

      We then compare the classifications obtained by the two methods: if they agree, that is the final classification; if they disagree, the final classification is Uncertain. The thresholds were selected based on previous studies, in which variants with changes in stability below 3 kcal/mol did not show a markedly different abundance at the cellular level [10.1371/journal.pgen.1006739, 10.7554/eLife.49138].
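The two-method consensus described above can be sketched in Python as follows. This is a minimal illustration and not the MAVISp implementation: the function names are hypothetical, and because the third rule is cut off in the text above, the Neutral interval (−2 < DDG < 2) and the assignment of all remaining values to Uncertain are assumptions made only for this sketch.

```python
def classify_stability(ddg: float) -> str:
    """Classify one method's DDG (kcal/mol) for the STABILITY module."""
    if ddg >= 3.0:
        return "Destabilizing"   # stated in the response
    if ddg <= -3.0:
        return "Stabilizing"     # stated in the response
    if -2.0 < ddg < 2.0:
        return "Neutral"         # ASSUMPTION: rule is truncated in the text
    return "Uncertain"           # ASSUMPTION: everything between the bins


def consensus(label_a: str, label_b: str) -> str:
    """Final call: the shared label if both methods agree, else Uncertain."""
    return label_a if label_a == label_b else "Uncertain"


def stability_call(foldx_ddg: float, other_ddg: float) -> str:
    """Combine FoldX with a second method (Rosetta or RaSP)."""
    return consensus(classify_stability(foldx_ddg),
                     classify_stability(other_ddg))
```

For example, `stability_call(3.5, 4.0)` yields a consensus label, whereas `stability_call(3.5, 0.5)` falls back to Uncertain because the two methods disagree.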

      The LOCAL_INTERACTION module works similarly to the STABILITY module: Rosetta and FoldX are considered independently, and an implicit classification is performed for each according to the following rules (values in kcal/mol):

      If DDG > 1, the mutation is Destabilizing. If DDG […]

      Each mutation is therefore classified by both methods. If the methods agree (i.e., they classify the mutation in the same way), their consensus is the final classification for the mutation; if they do not agree, the final classification is Uncertain.

      If a mutation does not have an associated free energy value, the relative solvent accessible area is used to classify it: if SAS > 20%, the mutation is classified as Uncertain, otherwise it is not classified.

      The thresholds here were selected according to the best practices followed by the tool authors and, more generally, in the literature, as the reviewer also noted.
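The LOCAL_INTERACTION rules, including the solvent-accessibility fallback, can be sketched in the same way. Again, this is only an illustration under stated assumptions: the function names are hypothetical, the Stabilizing cut-off (DDG < −1) and the Neutral interval are assumed by symmetry because the rule is truncated in the text above, and a value missing from either method is treated here as "no associated free energy value".

```python
from typing import Optional


def classify_binding(ddg: float) -> str:
    """Classify one method's binding DDG (kcal/mol)."""
    if ddg > 1.0:
        return "Destabilizing"   # stated in the response
    if ddg < -1.0:
        return "Stabilizing"     # ASSUMPTION: symmetric cut-off
    return "Neutral"             # ASSUMPTION: remaining interval


def local_interaction_call(foldx_ddg: Optional[float],
                           rosetta_ddg: Optional[float],
                           rel_sas_percent: Optional[float]) -> Optional[str]:
    """Return the final label, or None when the mutation is not classified."""
    if foldx_ddg is None or rosetta_ddg is None:
        # No free energy value: fall back on relative solvent accessibility.
        if rel_sas_percent is not None and rel_sas_percent > 20.0:
            return "Uncertain"
        return None
    a = classify_binding(foldx_ddg)
    b = classify_binding(rosetta_ddg)
    return a if a == b else "Uncertain"
```

Returning `None` for the unclassified case keeps it distinct from an explicit Uncertain label, which mirrors the distinction drawn in the text.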

      • "Overall, with the examples in this section, we illustrate different applications of the MAVISp results, spanning from benchmarking purposes, using the experimental data to link predicted functional effects with structural mechanisms or using experimental data to validate the predictions from the MAVISp modules."

      The last of these points is not an application of MAVISp, but rather a way in which external data can help validate MAVISp results. Furthermore, none of the examples given demonstrate an application in benchmarking (what is being benchmarked?).

      We have revised the statements to avoid confusing the reader.

      • Transcription factors section. This section describes an intended future expansion to MAVISp, not a current feature, and presents no results. As such, it should be moved to the conclusions/future directions section.

      We have removed this section and included a mention in the conclusions as part of the future directions.

      • Figures. The dot-plots generated by the web app, and in Figures 4, 5 and 6 have 2 legends. After looking at a few, it is clear that the lower legend refers to the colour of the variant on the X-axis - most likely referencing the ClinVar effect category. This is not, however, made clear either on the figures or in the app.

      The reviewer’s interpretation of the second legend is correct: it refers to the ClinVar classification. We understand, however, that the position of the legend makes it unclear what it refers to. We have revised the captions of the figures in the main text. On the web app, we have moved the legend for the ClinVar effect category and added a label to make clear what the classification refers to.

      • "We identified ten variants reported in ClinVar as VUS (E102K, H86D, T29I, V91I, P2R, L44P, L44F, D56G, R11L, and E25Q, Fig.5a)" E25Q is benign in ClinVar and has had that status since first submitted.

      We have corrected this in the text and the statements related to it.

      Significance

      Platforms that aggregate predictors of variant effect are not a new concept, for example dbNSFP is a database of SNV predictions from variant effect predictors and conservation predictors over the whole human proteome. Predictors such as CADD and PolyPhen-2 will often provide a summary of other predictions (their features) when using their platforms. MAVISp's unique angle on the problem is in the inclusion of diverse predictors from each of its different modules, giving a much wider perspective on variants and potentially allowing the user to identify the mechanistic cause of pathogenicity. The visualisation aspect of the web app is also a useful addition, although the user interface is somewhat awkward. Potentially the most valuable aspect of this study is the associated gitbook resource containing reports from biocurators for proteins that link relevant literature and analyse ClinVar variants. Unfortunately, these are only currently available for a small minority of the total proteins in the database with such reports. For improvement, I think that the paper should focus more on the precise utility of the web app / gitbook reports and how to interpret the results rather than going into detail about the underlying pipeline.

      We appreciate the interest in the GitBook resource, which we also see as very valuable and one of the strengths of our work. We have now implemented a new strategy based on a Python script, introduced in the mavisp toolkit, that generates a template Markdown file of the report, which can be further customized and imported directly into GitBook (https://github.com/ELELAB/mavisp_accessory_tools/). This should allow us to streamline the production of more reports. To expand their coverage, we are currently assigning proteins in batches to biocurators for reporting through the mavisp_data_collection GitHub repository. We also revised the text and added a section on the interpretation of MAVISp results, with a focus on the utility of the web app and reports.

      In terms of audience, the fast look-up and visualisation aspects of the web-platform are likely to be of interest to clinicians in the interpretation of variants of unknown clinical significance. The ability to download the fully processed dataset on a per-protein basis would be of more interest to researchers focusing on specific proteins or those taking a broader view over multiple proteins (although a facility to download the whole database would be more useful for this final group).

      While our website only displays the dataset per protein, the whole dataset, including all the MAVISp entries, is available at our OSF repository (https://osf.io/ufpzm/), which is cited in the paper and linked on the MAVISp website. We have further modified the MAVISp database to add a link to the repository in the modes page, so that it is more visible.

      My expertise. - I am a protein bioinformatician with a background in variant effect prediction and large-scale data analysis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Evidence, reproducibility and clarity:

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work correctly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window. In ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would like to explore the data myself and provide feedback on the user experience and utility.

      We tried to reproduce the issue reported by the reviewer using the exact same Ubuntu and Firefox versions, but were unable to do so: the website worked fine for us in that environment. The issue may have been due either to a temporary problem with the web server or to something specific to the reviewer’s browser environment. It would be useful to know the date on which this happened, so that we can verify whether downtime on the DTU IT services side made the web server inaccessible.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      We appreciate the reviewer’s concerns about long-term sustainability. This is a fair point, and one that we consider within our steering group, which oversees and plans the activities and meets monthly. Adding entries to MAVISp is moving increasingly towards automation as we grow, and we aim to minimize manual work where possible. Still, expert intervention is genuinely needed in some of the steps, and we do not want to give it up. We intend to keep working on MAVISp to make the process of adding and updating entries as automated as possible, and to streamline it where manual intervention is necessary. The biocurators have three core workflows to use for the default modules, which also automatically cover the source of annotations. We are currently working to streamline the procedures behind LOCAL_INTERACTION, which is the most challenging module. On the data managers' and maintainers' side, we have workflows and protocols that help with automation, quality control, etc., and we keep working to improve them. Among these, we have workflows for updating old entries. As an example, the correction of the erroneously attributed RefSeq data (pointed out by reviewer 2) took us only one week overall (from assigning revisions to importing into the database), because we have a reduced version of the Snakemake automation that acts only on the affected modules. In addition, we have streamlined the generation of the templates for the GitBook reports (see also the answer to reviewer 2).

      Old entries are updated regularly according to a plan. We also deposit the old datasets on OSF for transparency, in case someone needs to explore the changes. Every year, between May and August, we have activities planned to update the old entries in response to changes in the module protocols and updates in the core databases that we interact with (COSMIC, ClinVar, etc.). In case of major changes, the update activities continue in the fall. Other revisions can happen outside these time windows if an entry is needed for a specific research project and requires updating.

      Furthermore, the community of people contributing to MAVISp as biocurators or developers is growing, and scientists from other groups are contributing in relation to their research interests. For this resource to scale up, our team cannot be the only one producing data and depositing them in the database. To facilitate this, we launched a pilot online training event (see the Event page on the website), and we will repeat it once per year. We also organize regular meetings with all the active curators and developers to plan the activities in a sustainable manner and address the challenges we encounter.

      As stated in the manuscript, with the current team, automation, and resources that we have gathered around this initiative, we can provide updates to the public database every three months, and we have regularly met this schedule. Additionally, we are able to process 20 to 40 proteins every month, depending also on the need to revise or expand analyses of existing proteins. We also depend on these data for our own research projects, and we are fully committed to the resource.

      Additionally, we are planning future activities in these directions to improve scale up and sustainability:

      • Streamlining manual steps so that they are as convenient and fast as possible for our curators, e.g. by providing custom pages on the MAVISp website
      • Streamlining and automating the generation of useful output, for instance the reports, by using a combination of simple automation and large language models
      • Implementing ways to share our software and scripts with third parties, for instance by providing ready-made (or close to ready-made) containers or virtual machines
      • For a future version 2, if the database grows in a direction that is not compatible with Streamlit, the web data science framework we are currently using, we will rewrite the website using a framework that allows better flexibility and performance, for instance Django with a proper database backend.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      We thank the reviewer for this comment; we are aware of the upcoming EOL of Python 3.9. We tested MAVISp, both the software package and the web server, using Python 3.10 (the minimum supported version going forward) and Python 3.13 (the latest stable release at the time of writing), and updated the instructions in the README file on the MAVISp GitHub repository accordingly.
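One simple way to enforce a minimum interpreter version at start-up is sketched below. This is a generic illustration, not code from the MAVISp repository: the helper names are hypothetical, and only the 3.10 minimum is taken from the response above.

```python
import sys

# Minimum supported version as stated in the response (Python 3.10).
MIN_PYTHON = (3, 10)


def is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def require_supported() -> None:
    """Raise a clear error when running on an unsupported interpreter."""
    if not is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in MIN_PYTHON)
        raise RuntimeError(f"Python {wanted}+ is required, found {found}")
```

Calling `require_supported()` at the top of an entry point fails early with an explicit message instead of a later, harder-to-diagnose syntax or import error.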

      We plan on keeping track of Python and library versions during our testing and updating them when necessary. In the future, we also plan to deploy Continuous Integration with automated testing for our repository, making this process easier and more standardized.

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      Since 2024, we have been reporting all previous versions of the dataset on OSF, the repository linked to the MAVISp website, at https://osf.io/ufpzm/files/osfstorage (folder: previous_releases). We prefer to keep everything under OSF, as we also use it to deposit, for example, the MD trajectory data.

      Additionally, on the GitHub page that we use as a space for interaction between biocurators, developers, and data managers within the MAVISp community, we also report all the changes in the NEWS space: https://github.com/ELELAB/mavisp_data_collection

      Finally, the individual tools are all available in our GitHub repository, where version control is in place (see Table S1, where we have now mapped all the resources used in the framework).

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation and it's quite muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      We revised the introduction in light of these suggestions. We have split the paragraph as recommended and added a longer second paragraph about VEPs and using structural data in the context of VEPs. We have also added the citation that the reviewer kindly recommended.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we can classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      We revised the statement in light of this comment from the reviewer.

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.

      We have revised the text making the two intervals explicit, for better clarity.

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset, and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      We have included the data from Mighell’s phosphatase assay, as provided by MaveDB, in the MAVISp database within the experimental_data module for PTEN. We have also revised the case study to include them and to better explain the decision to support both the ProteinGym and MaveDB classifications in MAVISp (when available). See the revised Figure 3, Table 1, and corresponding text.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      The reviewer is correct. We have revised the terminology used in the manuscript and now refer to VEPs (Variant Effect Predictors).

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      We have revised the website, adding a filtering option. In detail, we have refactored the web app by adding filtering functionality, both for the main protein table (that can now be filtered by UniProt AC, gene name, or RefSeq ID) and the mutations table. Doing this required a general overhaul of the table infrastructure (we changed the underlying engine that renders the tables).
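The kind of text filtering described above can be illustrated with a few lines of plain Python. This is a sketch only and does not reflect how the MAVISp web app implements it: the column names (`uniprot_ac`, `gene_name`, `refseq_id`) and the `filter_rows` helper are hypothetical, and the three rows are illustrative examples.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]

# Illustrative rows; column names are hypothetical.
TABLE: List[Row] = [
    {"uniprot_ac": "P60484", "gene_name": "PTEN", "refseq_id": "NP_000305"},
    {"uniprot_ac": "P01116", "gene_name": "KRAS", "refseq_id": "NP_004976"},
    {"uniprot_ac": "P38398", "gene_name": "BRCA1", "refseq_id": "NP_009225"},
]


def filter_rows(rows: List[Row], query: str,
                columns: Sequence[str] = ("uniprot_ac", "gene_name",
                                          "refseq_id")) -> List[Row]:
    """Keep rows where any searched column contains the query text.

    Matching is case-insensitive; an empty query returns every row.
    """
    needle = query.strip().lower()
    if not needle:
        return list(rows)
    return [row for row in rows
            if any(needle in row.get(col, "").lower() for col in columns)]
```

A substring match across all three identifier columns lets a user type a gene symbol, an accession, or a RefSeq ID into the same box without choosing a field first.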

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      We have revised and updated the data sources on the website, adding a metadata section with relevant information, including MaveDB references where applicable.

      Figure 2 is somewhat confusing, as it partially interleaves results from two different proteins. This would be nicer as two separate figures, one on each protein, or just of a single protein.

      As suggested by the reviewer, we have now revised the figure and corresponding legends and text, focusing only on one of the two proteins.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      We have revised Figure 3 to address these issues and integrated new data from the comparison with the phosphatase assay.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address these kinds of issues.

      We have carefully proofread the paper for these inconsistencies.

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      We have added the reference that the reviewer recommended.

      On page 11, lines 338-339 the authors mention some interesting proteins including BCL2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      The assay mentioned in the paper refers to an experimental setup designed to investigate mutations that may confer resistance to the drug venetoclax. We have taken the first steps towards implementing a MAVISp module that evaluates the impact of mutations on drug binding using alchemical free energy perturbations (ensemble mode), but it is far from complete. We expect to import these data once the module is finalized, since they can be used to benchmark it, and BCL2 is one of the proteins that we are using to develop and test the new module.

      Reviewer #3 (Significance (Required)):

      Significance:

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately as I wasn't able to use the site I can't comment further on MAVISp's position in the landscape.

      We have expanded the conclusions section to compare MAVISp with, and cite, previously published work, and linked to a review we published last year that frames MAVISp in the context of computational frameworks for the prediction of variant effects. In brief, the Genomics 2 Proteins portal (G2P) includes data from several sources, some of which overlap with MAVISp, such as Phosphosite or MaveDB, as well as features calculated on the protein structure. ProtVar also aggregates mutations from different sources and includes both variant effect predictors and predictions of changes in stability upon mutation, as well as predictions of complex structures. These approaches overlap only partially with MAVISp. G2P is primarily focused on structural and other annotations of the effect of a mutation; it does not include features about changes in stability, binding, or long-range effects, and does not attempt to classify the impact of a mutation according to its measurements. It also does not include information on protein dynamics. Similarly, ProtVar does not include information on binding free energies, long-range effects, or dynamics.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work properly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window, and in ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would have liked to be able to explore the data myself and provide feedback on the user experience and utility.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation and it's quite muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we are able to classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      I found Figure 2 to be a bit confusing in that it partially interleaves results from two different proteins. I think this would be nicer as two separate figures, one on each protein, or just of a single protein.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address these kinds of issues.

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      On page 11, lines 338-339 the authors mention some interesting proteins including BLC2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      Significance

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately, as I wasn't able to use the site, I can't comment further on MAVISp's position in the landscape.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors present a pipeline and platform, MAVISp, for aggregating, displaying and analysing variant effects, with a focus on reclassification of variants of uncertain clinical significance and uncovering the molecular mechanisms underlying the mutations.

      Major comments:

      • On testing the platform, I was unable to look up a specific variant in ADCK1 (rs200211943, R115Q). I found that despite stating that the mapped RefSeq ID was NP_001136017 in the HGVSp column, it was actually mapped to the canonical UniProt sequence (Q86TW2-1). NP_001136017 actually maps to Q86TW2-3, which is missing residues 74-148 compared to the -1 isoform. The UniProt canonical sequence has no exact RefSeq mapping, so the HGVSp column is incorrect in this instance. This mapping issue may also affect other proteins and result in incorrect HGVSp identifiers for variants.
      • The paper lacks a section on how to properly interpret the results of the MAVISp platform (the case-studies are useful, but don't lay down any global rules for interpreting the results). For example: How should a variant with conflicts between the variant impact predictors be interpreted? Are certain indicators considered more 'reliable' than others?
      • In the Methods section, GEMME is stated as being rank-normalised with 0.5 as a threshold for damaging variants. On checking the data downloaded from the site, GEMME was not rank-normalised but rather min-max normalised. Furthermore, Supplementary text S4 conflicts with the Methods section over how GEMME scores are classified: S4 states that a raw-value threshold of -3 is used.
      • Note. This is a major comment as one of the claims is that the associated web-tool is user-friendly. While functional, the web app is very awkward to use for analysis on any more than a few variants at once.
        • The fixed window size of the protein table necessitates excessive scrolling to reach your protein-of-interest. This will also get worse as more proteins are added. Suggestion: add a search/filter bar.
        • The same applies to the dataset window.
        • You are unable to copy anything out of the tables.
        • Hyperlinks in the tables only seem to work if you open them in a new tab or window.
        • All entries in the reference column point to the MAVISp preprint even when data from other sources is displayed (e.g. MAVE studies).
        • Entering multiple mutants in the "mutations to be displayed" window is time-consuming for more than a handful of mutants. Suggestion: Add a box where multiple mutants can be pasted in at once from an external document.
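
The normalisation discrepancy raised above for GEMME can be illustrated with a minimal sketch. The raw scores below are invented for illustration, the function names are hypothetical, and nothing here reproduces the MAVISp or GEMME code; the direction of the 0.5 cut-off is also an assumption. The point is only that rank normalisation and min-max normalisation map the same raw values to different numbers, so a fixed 0.5 threshold selects different variants under each scheme.

```python
def min_max_normalise(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalise(scores):
    """Replace each score by its fractional rank in [0, 1]; ties share the average rank."""
    n = len(scores)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / (n - 1)
        i = j + 1
    return ranks


# Hypothetical raw scores with one strong outlier (not real GEMME output).
raw = [-6.0, -1.0, -0.5, -0.2, 0.0]

mm = min_max_normalise(raw)
rk = rank_normalise(raw)

# The same 0.5 cut-off flags a different number of variants under each scheme.
above_mm = sum(1 for v in mm if v > 0.5)
above_rk = sum(1 for v in rk if v > 0.5)
print(mm, above_mm)
print(rk, above_rk)
```

With a skewed score distribution, min-max normalisation compresses most values towards one end of the scale, whereas rank normalisation spreads them evenly, which is why the two cannot share a threshold without re-calibration.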

      Minor comments

      • Grammar. I appreciate that this manuscript may have been compiled by a non-native English speaker, but I would be remiss not to point out that there are numerous grammar errors throughout, usually sentence order issues or non-pluralisation. The meaning of the authors is mostly clear, but I recommend very thoroughly proof-reading the final version.
      • There are numerous proteins that I know have high-quality MAVE datasets that are absent in the database e.g. BRCA1, HRAS and PPARG.
      • Checking one of the existing MAVE datasets (KRAS), I found that the variants were annotated as damaging, neutral or given a positive score (these appear to stand in for gain-of-function variants). For better correspondence with the other columns, those with positive scores could be labelled as 'ambiguous' or 'uncertain'.
      • Numerous thresholds are defined for stabilizing / destabilizing / neutral variants in both the STABILITY and the LOCAL_INTERACTION modules. How were these thresholds determined? I note that (PMC9795540) uses a ΔΔG threshold of 1/-1 for defining stabilizing and destabilizing variants, which is relatively standard (though they also say that 2-3 would likely be better for pinpointing pathogenic variants).
      • "Overall, with the examples in this section, we illustrate different applications of the MAVISp results, spanning from benchmarking purposes, using the experimental data to link predicted functional effects with structural mechanisms or using experimental data to validate the predictions from the MAVISp modules."

      The last of these points is not an application of MAVISp, but rather a way in which external data can help validate MAVISp results. Furthermore, none of the examples given demonstrate an application in benchmarking (what is being benchmarked?).
      • Transcription factors section. This section describes an intended future expansion to MAVISp, not a current feature, and presents no results. As such, it should probably be moved to the conclusions/future directions section.
      • Figures. The dot-plots generated by the web app, and in Figures 4, 5 and 6 have 2 legends. After looking at a few, it is clear that the lower legend refers to the colour of the variant on the X-axis - most likely referencing the ClinVar effect category. This is not, however, made clear either on the figures or in the app.
      • "We identified ten variants reported in ClinVar as VUS (E102K, H86D, T29I, V91I, P2R, L44P, L44F, D56G, R11L, and E25Q, Fig.5a)"

      E25Q is benign in ClinVar and has had that status since first submitted.
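
The threshold question raised in the minor comments above (a ΔΔG cut-off of 1/-1 versus a stricter 2-3) can be made concrete with a small sketch. The helper name, the example value and the sign convention (positive ΔΔG taken as destabilizing, values in kcal/mol) are all assumptions made for illustration; this is not the MAVISp STABILITY module.

```python
def classify_ddg(ddg, threshold=1.0):
    """Label a predicted ddG value using a symmetric cut-off.

    Assumption: positive ddG means destabilizing, negative means stabilizing.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if ddg >= threshold:
        return "destabilizing"
    if ddg <= -threshold:
        return "stabilizing"
    return "neutral"


# The same hypothetical variant changes class when the cut-off is raised.
example_ddg = 2.5
label_at_1 = classify_ddg(example_ddg, threshold=1.0)
label_at_3 = classify_ddg(example_ddg, threshold=3.0)
print(label_at_1, label_at_3)
```

Because a single variant can move between classes as the cut-off changes, stating how each threshold was chosen matters for interpreting the module output.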

      Significance

      Platforms that aggregate predictors of variant effect are not a new concept, for example dbNSFP is a database of SNV predictions from variant effect predictors and conservation predictors over the whole human proteome. Predictors such as CADD and PolyPhen-2 will often provide a summary of other predictions (their features) when using their platforms. MAVISp's unique angle on the problem is in the inclusion of diverse predictors from each of its different modules, giving a much wider perspective on variants and potentially allowing the user to identify the mechanistic cause of pathogenicity. The visualisation aspect of the web app is also a useful addition, although the user interface is somewhat awkward. Potentially the most valuable aspect of this study is the associated gitbook resource containing reports from biocurators for proteins that link relevant literature and analyse ClinVar variants. Unfortunately, such reports are currently available for only a small minority of the proteins in the database.

      For improvement, I think that the paper should focus more on the precise utility of the web app / gitbook reports and how to interpret the results rather than going into detail about the underlying pipeline.

      In terms of audience, the fast look-up and visualisation aspects of the web-platform are likely to be of interest to clinicians in the interpretation of variants of unknown clinical significance. The ability to download the fully processed dataset on a per-protein basis would be of more interest to researchers focusing on specific proteins or those taking a broader view over multiple proteins (although a facility to download the whole database would be more useful for this final group).

      My expertise.

      • I am a protein bioinformatician with a background in variant effect prediction and large-scale data analysis.
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript, "MAVISp: A Modular Structure-Based Framework for Protein Variant Effects," presents a significant new resource for the scientific community, particularly in the interpretation and characterization of genomic variants. The authors have developed a comprehensive and modular computational framework that integrates various structural and biophysical analyses, alongside existing pathogenicity predictors, to provide crucial mechanistic insights into how variants affect protein structure and function. Importantly, MAVISp is open-source and designed to be extensible, facilitating reuse and adaptation by the broader community.

      Major comments:

      • While the manuscript is formally well-structured (with clear Introduction, Results, Conclusions, and Methods sections), I found it challenging to follow in some parts. In particular, the Introduction is relatively short and lacks a deeper discussion of the state-of-the-art in protein variant effect prediction. Several methods are cited but not sufficiently described, as if prior knowledge were assumed. OPTIONAL: Extend the Introduction to better contextualize existing approaches (e.g., AlphaMissense, EVE, ESM-based predictors) and clarify what MAVISp adds compared to each.
      • The workflow is summarized in Figure 1(b), which is visually informative. However, the narrative description of the pipeline is somewhat fragmented. It would be helpful to describe in more detail the available modules in MAVISp, and which of them are used in the examples provided. Since different use cases highlight different aspects of the pipeline, it would be useful to emphasize what is done step-by-step in each. OPTIONAL: Consider adding a table or a supplementary figure mapping each use case to the corresponding pipeline steps and modules used.
      • The text contains numerous acronyms, some of which are not defined upon first use or are only mentioned in passing. This affects readability. OPTIONAL: Define acronyms upon first appearance, and consider moving less critical technical details (e.g., database names or data formats) to the Methods or Supplementary Information. This would greatly enhance readability.
      • The code and trained models are publicly available, which is excellent. The modular design and use of widely adopted frameworks (PyTorch and PyTorch Geometric) are also strong points. However, the Methods section could benefit from additional detail regarding feature extraction and preprocessing steps, especially the structural features derived from AlphaFold2 models. OPTIONAL: Include a schematic or a table summarizing all feature types, their dimensionality, and how they are computed.
      • The section on transcription factors is relatively underdeveloped compared to other use cases and lacks sufficient depth or demonstration of its practical utility. OPTIONAL: Consider either expanding this section with additional validation or removing/postponing it to a future manuscript, as it currently seems preliminary.

      Minor comments:

      • Most relevant recent works are cited, including EVE, ESM-1v, and AlphaFold-based predictors. However, recent methods like AlphaMissense (Cheng et al., 2023) could be discussed more thoroughly in the comparison.
      • Figures are generally clear, though some (e.g., performance barplots) are quite dense. Consider enlarging font sizes and annotating key results directly on the plots.
      • Minor typographic errors are present. A careful proofreading is highly recommended. Below are some of the issues I identified:

      Page 3, line 46: "MAVISp perform" -> "MAVISp performs"

      Page 3, line 56: "automatically as embedded" -> "automatically embedded"

      Page 3, line 57: "along with to enhance" -> unclear; please revise

      Page 4, line 96: "web app interfaces with the database and present" -> "presents"

      Page 6, line 210: "to investigate wheatear" -> "whether"

      Page 6, lines 215-216: "We have in queue for processing with MAVISp proteins from datasets relevant to the benchmark of the PTM module." -> unclear sentence; please clarify

      Page 15, line 446: "Both the approaches" -> "Both approaches"

      Page 20, line 704: "advantage of multi-core system" -> "multi-core systems"

      Significance

      General assessment: the strongest aspects of the study are the modularity, open-source implementation, and the integration of structural information through graph neural networks. MAVISp appears to be one of the few publicly available frameworks that can easily incorporate AlphaFold2-based features in a flexible way, lowering the barrier for developing custom predictors. Its reproducibility and transparency make it a valuable resource. However, while the technical foundation is solid and the effort substantial, the scientific narrative and presentation could be significantly improved. The manuscript is dense and hard to follow in places, with a heavy use of acronyms and insufficient explanation of key design choices. Improving the descriptive clarity, especially in the early sections, would greatly enhance the impact of this work.

      Advance: to the best of my knowledge, this is one of the first modular platforms for protein variant effect prediction that integrates structural data from AlphaFold2 with bioinformatic annotations and even clinical data in an extensible fashion. While similar efforts exist (e.g., ESMFold, AlphaMissense), MAVISp distinguishes itself through openness and design for reusability. The novelty is primarily technical and practical rather than conceptual.

      Audience: this study will be of strong interest to researchers in computational biology, structural bioinformatics, and genomics, particularly those developing variant effect predictors or analyzing the impact of mutations in clinical or functional genomics contexts. The audience is primarily specialized, but the open-source nature of the tool may extend its use to more applied or translational users, including those working in precision medicine or protein engineering.

      Reviewer expertise: my expertise is in computational structural biology, molecular modeling, and (rather weak) machine learning applications in bioinformatics. I am familiar with graph-based representations of proteins, AlphaFold2, and variant effects based on Molecular Dynamics simulations. I do not have any direct expertise in clinical variant annotation pipelines.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Summary:

      Miyamoto et al. report that importin α1 is highly enriched in a subfraction of micronuclei (about 40%), which exhibit defective nuclear envelopes and compromised accessibility of factors essential for the damage response associated with homologous recombination DNA repair. The authors suggest that the unequal localization and abnormal distribution of importin α1 within these micronuclei contribute to the genomic instability observed in cancer.


      Major comments:

      1.) It is crucial to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells compared to transformed cell lines (MCF7, HeLa, and MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.

      We appreciate the reviewer's thoughtful suggestion to compare non-transformed and transformed cell lines to evaluate importin α1 localization in MN. Given that HeLa cells are derived from cervical cancer rather than the mammary epithelium, we considered it inappropriate to directly compare them with non-transformed mammary epithelial MCF10A cells. Therefore, HeLa cells were analyzed separately to assess the effects of reversine treatment on importin α1 localization. The results indicated no significant difference between the treated and untreated HeLa cells (Supplemental Fig. S2F in the revised manuscript). Regarding the comparison between MCF10A and the two cancer cell lines, MCF7 and MDA-MB-231, the proportion of importin α1-positive MN did not significantly differ across the cell lines, regardless of reversine treatment (Supplemental Fig. S3B, Untreated: p = 0.9850 and 0.5533; Reversine: p = 0.2218 and 0.9392). These results suggest that there is no clear difference in the localization of importin α1 in MN between the transformed and non-transformed cell lines tested. However, we acknowledge that this does not exclude the possibility that importin α1 localization to MN is linked to genomic instability under specific conditions.

      2.) While the authors provide some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Furthermore, according to the figure legends, the data presented in both figures stem from a single experiment. Current literature suggests that compromised nuclear envelope integrity is one of the major contributors to genomic instability, mediated through mechanisms such as chromothripsis and cGAS-STING-mediated inflammation arising from MN. Therefore, a more comprehensive quantification of nuclear envelope integrity, ideally comparing non-transformed MCF10A cells with transformed cell lines (MCF7, HeLa, and MDA-MB-231), is necessary to substantiate the connection between aberrant importin α1 behavior in MN and chromothripsis processes, as well as regulation of the cGAS-STING pathway linked to genomic instability in cancer cells.

      We thank the reviewer for the constructive suggestion to quantify nuclear envelope integrity more comprehensively. In response, we compared laminB1 localization at the MN membrane between importin α1-positive and -negative MN in MCF10A, MCF7, MDA-MB-231, and HeLa cells, and included these results in the revised manuscript (Fig. 4C). For each cell, the laminB1 intensity in the MN was normalized to that of the primary nucleus (PN). This analysis showed that laminB1 intensity was significantly lower in importin α1-positive MN across all cell lines, including non-transformed MCF10A cells. These findings support a close association between aberrant importin α1 accumulation and compromised nuclear envelope integrity, a key factor potentially linking MN to chromothripsis and cGAS-STING-mediated genomic instability.
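
The per-cell normalization described above (MN laminB1 intensity divided by the PN intensity of the same cell, then compared between importin α1-positive and -negative MN) can be sketched as follows. The intensity values, field names and function names are hypothetical and chosen only for illustration; the statistical test used in the study is not reproduced here.

```python
from statistics import mean


def normalised_mn_intensity(mn_intensity, pn_intensity):
    """Express the MN signal as a fraction of the PN signal from the same cell."""
    if pn_intensity <= 0:
        raise ValueError("PN intensity must be positive")
    return mn_intensity / pn_intensity


def group_means(cells):
    """Average the normalised MN intensity separately for each importin α1 group."""
    groups = {True: [], False: []}
    for cell in cells:
        ratio = normalised_mn_intensity(cell["mn"], cell["pn"])
        groups[cell["importin_positive"]].append(ratio)
    return {status: mean(values) for status, values in groups.items() if values}


# Invented measurements (arbitrary units), not data from the study.
cells = [
    {"mn": 40.0, "pn": 100.0, "importin_positive": True},
    {"mn": 30.0, "pn": 120.0, "importin_positive": True},
    {"mn": 90.0, "pn": 100.0, "importin_positive": False},
    {"mn": 88.0, "pn": 110.0, "importin_positive": False},
]

means = group_means(cells)
print(means)
```

Normalizing within each cell removes cell-to-cell differences in staining and acquisition, so the group comparison reflects the MN envelope rather than overall signal level.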

      3.) The schematic illustration presented in Figure 8 does not adequately summarize all findings from this study nor does it clarify how the localization of importin α1 within MN might hypothetically influence genome stability. Although it is reasonable to propose that "importin α can serve as a molecular marker for characterizing the dynamics of MN" (Line 344), the authors assert (Line 325) that their findings, along with others, have "potential implications for the induction of chromothripsis processes and regulation of the cGAS-STING pathway in cancer cells." However, they fail to provide a clear or even hypothetical explanation regarding how their findings contribute to these molecular events. To address this gap, it would be essential for them to contextualize their results within existing literature that explores and links structural integrity deficits or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation (e.g., PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889).

      We agree that the previous schematic illustration (former Fig. 8) did not adequately summarize our findings and may have overstated our conclusions. Accordingly, we have removed this figure from the revised manuscript.

      To address the reviewer's concern, we performed additional analyses and included the results in the new Figure 8. These data show that, in addition to RAD51, both RPA2 and cGAS display mutually exclusive localization with importin α1 in MN. RPA2, a single-stranded DNA-binding protein, stabilizes damaged DNA and enables RAD51 filament assembly during homologous recombination repair. Previous studies have demonstrated that RPA2 accumulates in ruptured MN in a CHMP4B-dependent manner (PMID: 32601372). Likewise, cGAS is a cytosolic DNA sensor that localizes to ruptured MN and activates innate immune signaling through the cGAS-STING pathway, as widely reported (PMID: 28738408; 28759889; see also PMID: 32494070; 27918550).

      Our findings suggest an alternative scenario: even when nuclear envelope rupture occurs, importin α1-positive MN may remain inaccessible to DNA repair and sensing factors such as RPA2 and cGAS. This supports the view that importin α1 defines a distinct MN subset, separate from those characterized by the canonical DNA damage response or innate immune signaling factors. Furthermore, our overexpression experiments with EGFP-importin α1 (Fig. 7G, 7H) raise the possibility that importin α1 enrichment may impede the recruitment of DNA-binding proteins.

      Taken together, these results support the conclusion that importin α1 marks a unique MN state and provides a molecular framework for distinguishing between different MN environments. At the reviewer's suggestion, we have cited all the recommended references (PMID: 32601372, 32494070, 27918550, 28738408, and 28759889) in the revised manuscript to better contextualize our findings. We are grateful for the reviewer's thoughtful suggestions and literature recommendations, which helped us clarify the implications of our findings within the broader context of chromothripsis and cGAS-STING-mediated genomic instability.

      4.) Fig. 4D does not support the idea that importin α1 is euchromatin enriched: H3K9me3, H3K4me3 and H3K37me3 seem to be all deeply blue.

      We sincerely thank the reviewer for pointing out the important limitations of the original version of Fig. 4D, as also raised in minor comment #5. As the reviewer correctly noted, this figure was intended to demonstrate that importin α1 preferentially localizes to euchromatin regions (H3K4me3 and H3K36me3) rather than heterochromatin (H3K9me3 and H3K27me3). However, we acknowledge that in the original figure, the predominantly blue tone of the heatmap made this interpretation unclear and that the Spearman's correlation coefficient for H3K36me3 was missing. In response, we have substantially revised the figure (now shown as Fig. 5E in the revised manuscript). Specifically, we improved the color scale for better visual distinction, added the missing Spearman's coefficients for H3K36me3, and strengthened the analysis by incorporating ChIP-seq data obtained with two independent antibodies against importin α1 (Ab1 and Ab2). We believe that these revisions provide a clearer and more accurate representation of the euchromatin enrichment of importin α1, as originally intended.

      Indeed, the data presented by the authors do not adequately support a direct link between the presence of importin α1 in MN and genomic instability in human cancer cells. While the experimental correlations provided may not substantiate this connection definitively, they do lay a foundation for a grounded hypothesis and suggest the need for further research to explore this topic in greater depth. Additionally, it is worth noting that the evidence contributes to the growing list of nuclear proteins exhibiting abnormal behavior in micronuclei (MN). This highlights the significance of studying such proteins to understand their roles in genomic stability and cancer progression.

      Following the reviewer's suggestion, we carefully revised the manuscript to ensure that our statements are consistent with the scope of the data and do not overstate our conclusions. As part of this effort, we removed the schematic illustration (former Fig. 8), which might have overstated our findings, and refined the relevant text to prevent overinterpretation.

      To our knowledge, this study is the first to report the specific accumulation of importin α in MN. Our results suggest a previously unrecognized function of importin α beyond its canonical transport role and add to the growing list of nuclear proteins that exhibit abnormal behavior in MN. We hope that these findings will provide a conceptual and experimental basis for future studies aimed at clarifying the biological significance of MN heterogeneity and quality control in cancer biology.


      Additional experiments are necessary to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells and transformed cell lines (MCF7, HeLa, MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.

      As part of our response to Major Comment 1, we conducted additional experiments to quantitatively compare importin α1 localization in MN between non-transformed MCF10A cells, breast cancer cell lines (MCF7 and MDA-MB-231), and HeLa cells. These results have been included in the revised manuscript (Supplemental Fig. S2F and Fig. S3B). The analyses showed no significant differences in the proportion of importin α1-positive MN among these cell lines, consistent with the reviewer's request for a more comprehensive evaluation.

      The authors claim that importin α1 preferentially localizes to euchromatic areas rather than heterochromatic regions within MN. While this assertion is supported by the immunofluorescence (IF) images presented in Figures 4A/B and S5A/B, it remains less clear for Figure S5C/B. To strengthen this claim, providing averages of IF distributions from multiple cells across independent experiments would be beneficial to draw more robust conclusions.

      We have quantified the co-localization of importin α1 with the euchromatin marker H3K4me3 and the heterochromatin marker H3K9me3 in micronuclei (MN) across four human cell lines (MCF10A, MCF7, MDA-MB-231, and HeLa). The results of this statistical analysis are included in the revised manuscript in Fig. 5C. These data provide quantitative evidence from independent experiments showing that importin α1 preferentially localizes to euchromatic regions within the MN, thereby supporting our initial observation.

      Furthermore, ChIP-seq data are presented to support the idea that importin α1 preferentially distributes over euchromatin areas in MN. However, as described, the epigenetic chromatin status indicated by these ChIP-seq experiments reflects that of the principal nucleus (PN), not specifically the status within MN in MCF7 cells. Given that MN represent only a small fraction of the cell population under normal culture conditions (likely less than 5% for HeLa cells, as shown in Figure S2D), the relevance of these data is limited. Additionally, according to data presented in Figure 1B, importin α1 does not localize or distribute within the PN as it does in MN in MCF7 cells. Therefore, further experiments should be conducted to substantiate that importin α1 preferentially targets euchromatin areas within MN and to compare this distribution with that observed in the principal nucleus. Such studies could reveal potential abnormalities regarding the correlation between epigenetic chromatin status and importin α distribution in MN.

      As noted, these experiments were performed on whole-cell populations of MCF7 cells and therefore reflect the overall chromatin landscape, not specifically that of the MN. We fully acknowledge that MN constitute only a small fraction of the cell population under standard culture conditions (Supplemental Fig. S2D), and thus, the relevance of ChIP-seq data to MN must be interpreted with caution.

      Nevertheless, our intention in presenting these data was to illustrate that importin α1 preferentially associates with euchromatin regions marked by H3K4me3. To examine this more directly, we analyzed importin α1 localization in MN using immunofluorescence with histone modification markers across multiple cell lines. These analyses, together with the quantitative results now included in the revised manuscript (Fig. 5C), confirm that importin α1 preferentially localizes to euchromatic regions within MN.

      Taken together, although the ChIP-seq data were derived from whole-cell populations, the combined results from IF imaging and quantitative analysis support our interpretation that importin α1 retains its euchromatin-associating property within MN. We hope that these additional data will address the reviewer's concerns.

      To support the hypothesis that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should be supplemented with thorough quantification and statistical analysis based on at least three independent experiments. This additional data would enhance confidence in their findings regarding RAD51 accessibility inhibition by importin α1.

      Following the reviewer's suggestion, we have added a new graph (Fig. 7F) in the revised manuscript. This figure presents the quantified frequency of RAD51-positive MN among importin α1-negative and importin α1-positive MN, analyzed across six microscopy fields (n = 6) from three independent experiments.

      To improve clarity and consistency, we reorganized the panels: representative RAD51 images are now shown in Fig. 7B, and the Cell #1 (low RAD51) vs. Cell #2 (high RAD51) classification with etoposide responsiveness is summarized in Fig. 7C. As illustrated in Figs. 7D and 7E, importin α1 and RAD51 exhibit mutually exclusive localization in MN. Fig. 7F provides a unified statistical summary at the population level.

      The results showed that the proportion of RAD51-positive MN was significantly lower among importin α1-positive MN than among importin α1-negative MN, providing robust quantitative support for the proposed mutual exclusivity between importin α1 localization and RAD51 accessibility in MN.

      We are grateful to the reviewer for this constructive suggestion, which helped us clarify and better support the central message of our study.


      The additional experiments proposed are controls and direct comparisons using the same techniques and experimental designs used by the authors, so it is reasonable that the authors can carry them out within a realistic timeframe.

      We appreciate the reviewer's thoughtful consideration of the feasibility of the additional experiments.

      Given the importance of reproducibility and the need to evaluate results based on imaging and quantitation, I strongly recommend that the authors include a detailed description of the optical microscopy procedures utilized in their study. This should encompass imaging conditions, acquisition settings, and the specific equipment used. Providing this information will enhance transparency and facilitate reproducibility. For reference, some valuable guidance on essential parameters for reproducibility can be found in Heddleston et al. (2021) (doi:10.1242/jcs.254144). Incorporating these details will not only strengthen the manuscript but also support other researchers in reproducing the findings accurately.

      Following the reviewer's suggestion, we have substantially revised the Materials and Methods sections in the main and supplemental manuscripts to provide detailed descriptions of the optical microscopy procedures, including the specifications of the imaging equipment, acquisition settings, and image processing parameters. These revisions follow the best practices recommended by Heddleston et al. (2021, J. Cell Sci., doi:10.1242/jcs.254144).

      We have also expanded the description of our quantitative image analysis using ImageJ, providing details on the parameters for MN identification and the measurement of colocalization rates between importin α and histone modifications. These additions ensured reproducibility and clarity.

      We believe that these modifications will enhance the reproducibility of our results and increase the value of our study for the research community. We sincerely appreciate the reviewer's helpful suggestions.


      Many of the plots and values in the manuscript lack appropriate statistical analysis, including p-values, which are not detailed in the figures or their legends. Furthermore, the Statistical Analysis section does not provide adequate information regarding the specific statistical tests employed or the criteria used to determine which analyses were applied in each case. To enhance the rigor and clarity of the study, it is essential that these issues be addressed prior to publication. A comprehensive presentation of statistical analysis will improve the reliability of the findings and allow readers to better understand the significance of the results. I recommend that the authors revise this section to include detailed explanations of all statistical methods used, along with corresponding p-values for all relevant comparisons.

      We sincerely appreciate the reviewer's constructive comments highlighting the importance of transparent and rigorous statistical analyses. In response, we have carefully revised all figure panels, figure legends, and the Materials and Methods (Statistical Analysis) section in both the main and the supplementary manuscripts.

      In the revised figure legends, we now provide the number of independent experiments and sample sizes (n), statistical tests applied (e.g., unpaired or paired two-tailed t-test, one-way ANOVA with Tukey's post-hoc test, two-way ANOVA with Sidak's multiple comparisons), data presentation format (mean ± SD), and corresponding p-values or significance indicators (*, **, ***). The Statistical Analysis section was also expanded to explain the rationale for selecting each statistical test, the criteria for significance, and the reporting of the replicates. These revisions ensure clarity, reproducibility, and transparency throughout the manuscript, directly addressing the reviewers' concerns. We are grateful for this valuable suggestion, which has significantly improved the rigor of our study.

      Minor comments:

      The authors claim that importin α1 exhibits remarkably low mobility in the micronuclei (MN) compared to its mobility in the principal nucleus (PN), as illustrated in Figure 1. However, based on the experimental design, this conclusion may not be appropriate. In the current setup, the FRAP experiment conducted in the PN measures the mobility of importin α1 molecules within the cell nucleus, where the influence of nuclear transport is likely negligible. Conversely, in the MN experiments shown, all molecules of importin α1 are bleached within a given MN. Consequently, what is being measured here primarily reflects the effects of nuclear transport rather than intrinsic molecular mobility. To accurately compare kinetics of nuclear transport, it would be essential to completely bleach the entire PN. If measuring molecular mobility between MN and PN is desired, only a small fraction of either MN or PN area/volume should be bleached during FRAP analysis. Additionally, it would be beneficial to include measurements of mobility for other canonical nuclear transport factors (e.g., RAN, CAS, RCC1) for comparative purposes. This broader context would allow for a more comprehensive understanding of importin α1 behavior relative to other factors involved in nuclear transport. Finally, utilizing cells that exhibit importin α1 signals in both PN and MN could further strengthen comparisons and provide more robust conclusions regarding its mobility dynamics.

      We thank the reviewer for their constructive suggestions regarding our FRAP analysis. To address the concern that the original comparison between PN and the micronuclei (MN) might have been biased by differences in bleaching areas, we performed new experiments in which both PN and MN were fully bleached within the same cells (Fig. 3A and 3C). This approach allowed for a more direct comparison of importin α1 dynamics under equivalent conditions.

      These experiments revealed a markedly slower fluorescence recovery in MN than in PN, indicating reduced nuclear import and/or recycling efficiency of importin α1 in MN. In addition, we retained our original analysis to further characterize the heterogeneous mobility patterns of importin α1 in MN, identifying three distinct mobility classes: high, intermediate, and low (Fig. 3B and 3D). Together, these results support our observation that importin α1 mobility is restricted in MN, likely due to altered nuclear transport dynamics.

      As suggested by the reviewer, we attempted partial bleaching of MN to assess intranuclear mobility. However, owing to the small size of MN, partial bleaching is technically challenging and inconsistent, with some MN recovering even during the bleaching process. Therefore, reliable quantification was not possible. For transparency, these data are provided as a Reviewer-only Figure but were not included in the revised manuscript.

      Finally, while we agree that examining other nuclear transport factors (e.g., RAN, CAS, RCC1) would be informative, our study focused on importin α1 dynamics. We consider these additional factors to be important directions for future investigations.


      Prior studies are referenced appropriately in general, but the authors missed some references (PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889) that I consider key to put the present findings in frame with previous works which link the lack of structural integrity and/or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation.

      We thank the reviewer for carefully pointing out the key references that are highly relevant to framing our findings in the context of previous studies on micronuclear instability, chromothripsis and inflammation. We fully agree with this suggestion.

      In the revised manuscript, we have cited these studies in both the Introduction and Discussion sections. Specifically, we incorporated these studies when discussing the structural fragility of MN, aberrant DNA replication, and the exposure of micronuclear DNA to cytoplasmic sensors, which mechanistically link MN rupture to chromothripsis and cGAS-STING-mediated immune activation. For example, we now refer to the study demonstrating RPA2 recruitment to ruptured MN in a CHMP4B-dependent manner (PMID: 32601372), reports showing defective replication and DNA damage responses in MN (PMID: 32494070; 27918550), and seminal studies establishing cGAS localization to ruptured MN and activation of innate immune signaling (PMID: 28738408; 28759889).

      By incorporating these references, we more clearly position our findings that importin α1 defines a distinct subset of MN lacking access to DNA repair and sensing factors such as RAD51, RPA2, and cGAS. This contextualization emphasizes that our data add to and extend the established view that compromised MN integrity underlies chromothripsis and inflammation by identifying importin α1 as a novel marker of an alternative MN microenvironment. We are grateful for this constructive recommendation, which has allowed us to strengthen the framing of our study in the existing literature.


      The figures presented in the manuscript are clear; however, where plots are included, they require appropriate statistical analysis. It is essential to display p-values on the plots or within their legends to provide readers with information regarding the significance of the results. Including this statistical information will enhance the interpretability of the data and strengthen the overall findings of the study. I recommend that the authors revise these sections accordingly before publication.

      In response, we have revised the relevant figure panels and their legends to clearly display the statistical significance, including p-values, where appropriate. Specifically, we added statistical annotations (p-values or significance markers such as asterisks) directly on the plots or in the corresponding legends, and clarified the number of replicates, statistical tests used, and definitions of error bars (mean ± SD). We believe that these revisions improve the interpretability and transparency of our results and strengthen the overall presentation of the data.

      1.) In lines 134-135, it is stated that "up to 40% of the MN showed importin α1 accumulation under both standard culture conditions and the reversine treatment (Fig. S2F)." However, Figure S2F only displays percentages for reversine-treated cells, and there is no mention in the text or figures regarding the percentage of importin α1-positive MN determined by immunofluorescence (IF) under standard culture conditions. This discrepancy should be addressed.

      Following the reviewer's comments, we revised Supplemental Fig. S2F to show a direct comparison of the proportion of importin α1-positive MN between untreated and reversine-treated HeLa cells based on indirect IF analysis. The Results section was updated accordingly (page 8, Lines 148-150): "We then examined whether reversine treatment affected the proportion of importin α1-positive MN. The results revealed that the MN formation rate for either untreated or treated cells was 36.2% ± 7.8 or 38.3% ± 8.8, respectively, with no significant difference (Fig. S2F)."

      We believe that this revision addresses the reviewer's concern by providing relevant quantitative data for the untreated condition.

      2.) In line 170, the authors state that "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." It is unclear why this exclusion was made. The authors should clarify whether they are referring to all constructs or only to the wild-type (WT) construct when mentioning EGFP-importin α1 localization solely in PN. This clarification is important as it may affect the results highlighted in line 173.

      In this section, we aimed to clarify that the quantitative analysis focused exclusively on cells harboring MN, as the purpose of the analysis was to compare the localization of EGFP-importin α1 between MN and PN. We excluded cells that contained no MN and showed EGFP-importin α1 localization only in the PN. This criterion was consistently applied to both wild-type and mutant constructs. To avoid confusion, we have removed the sentence "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." from the revised manuscript.

      3.) The statement in line 191 ("However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)") is somewhat misleading. While it hints at a technical issue, it does not provide additional relevant information for understanding its implications for the rationale of the research. Moreover, Figure S4 is referenced but appears to refer specifically to panels S4D and E, which are not mentioned in the text. I recommend clarifying this point or removing it altogether.

      We agree with the reviewer that the statement "However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)" was not essential for understanding the rationale of our study and could be misleading. In response, we have removed this sentence from the revised manuscript, along with the corresponding Supplementary Fig. S4.

      4.) Lines 197-199 contain a sentence that could be misleading and would benefit from clearer explanation. Although Figure 3D provides some clarity on this matter, no statistical analysis is included-only a bar plot is presented. A proper statistical analysis should be provided here to enhance understanding.

      In the revised manuscript, we performed one-way ANOVA followed by Holm-Sidak's multiple comparisons test to evaluate the MN localization ratio of EGFP-NES between Imp-α1-negative and Imp-α1-positive MN. This analysis revealed a statistically significant difference (**p

      5.) In lines 218-221, it states that importin α1 associates with euchromatin regions characterized by H3K4me3 and H3K36me3; however, Figure 4D lacks the Spearman's correlation coefficient value for H3K36me3 within the matrix. This omission needs correction.

      We thank the reviewer for this insightful comment. As addressed in response to Major comment #4, we have substantially revised Fig. 5 and added the missing Spearman's correlation coefficient value for H3K36me3 (now shown in Fig. 5E). These revisions, together with the overall improvements to the figure, more clearly illustrate the euchromatin enrichment of importin-α1.

      6.) For consistency in the experimental design aimed at identifying potential importin α1-interacting proteins, it would be more appropriate for Figures 5C/D to show IF data from MCF7 cells rather than HeLa cells.

      We sincerely apologize for the misstatements in the legends of the original Fig. 5C. The correct description is that this experiment was performed using MCF7 cells, and we have revised the legend accordingly in the revised manuscript (now Fig. 6C). In addition, because the original data in Fig. 5D were obtained from HeLa cells, we repeated this experiment using MCF7 cells and replaced the panel with new data (now Fig. 6D).

      7.) To substantiate claims that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should include thorough quantitation and statistical analysis based on at least three independent experiments.

      As described above, we addressed this point by adding a new quantification and statistical analysis in Fig. 7F, based on six microscopy fields across three independent experiments. This analysis directly supports our claim that importin α1 inhibits RAD51 accessibility in the MN.

      We would also like to clarify that although the reviewer referred to Figs 7D and 7E, these two panels were designed to illustrate the same phenomenon-the mutually exclusive localization of importin α1 and RAD51 to distinct MN-shown in different contexts. Specifically, Fig. 7D presents examples from separate cells, each with MN containing either importin α1 or RAD51, while Fig. 7E shows a single cell containing two distinct MN, one enriched with importin α1 and the other with RAD51. Because both panels serve as illustrative examples of the same phenomenon, it would not be meaningful to quantify them independently as parallel datasets. Instead, we integrated the statistical analysis into a unified graph (Fig. 7F), which summarizes the frequency of RAD51-positive MN in relation to importin α1 status across the cell population, thereby supporting our interpretation that importin α1-positive MN represent a distinct subset that is less accessible to RAD51.

      8.) The meaning of lines 336-338-"Therefore, the enrichment of importin α1 in MN, along with its interaction with chromatin, may regulate the accessibility of RAD51 to DNA/chromatin fibers in MN and protect its activity"-is unclear. I suggest rephrasing this sentence for improved clarity and comprehension.

      We appreciate the reviewer's comment regarding the clarity of our statement in the Discussion (former lines 336-338). We agree that the original phrasing is ambiguous. To improve clarity and align with our results, we revised this section to emphasize that importin α1-positive MN represent a restricted environment from which DNA repair and sensing factors are excluded. Specifically, RAD51, RPA2, and cGAS showed mutually exclusive localization with importin α1, indicating that these MN are largely inaccessible to DNA-binding proteins (pages 20-21). This rephrasing removes the unclear phrase "protect its activity" and directly reflects our experimental findings, presenting a clearer interpretation that is consistent with the Results.

      9.) Fig. 1D: Numbers on the y-axis are missing, x-axis labeling is too small

      We appreciate the reviewer's careful examination of the figure. In the revised manuscript, we added numerical tick labels to both the x- and y-axes and increased the label font size to ensure clear readability, as shown in Fig. 1D. We also applied the same improvements to other fluorescence intensity plots, including Figs. 4A, 4B, 5A, 5B, 7H, and Supplemental Fig. S4C and S5A-S5F to ensure consistency in readability across the manuscript. We thank the reviewer for helping us improve the clarity and accuracy of our figure presentations.

      10.) Fig. 1F: As the PN/MN values of the three experiments are seemingly identical (third column) the distribution of the three individual data of the PN (first column) should mirror the distribution of the three individual data of the MN (second column). The authors might want to check why this is not the case.

      Upon re-examination of the source data, we identified and corrected a minor calculation error in one subset and regenerated the panel. After correction, the three independent PN/MN ratios were 3.1%, 2.9%, and 2.6%, rather than being identical. These corrected values were proportional to the corresponding PN and MN measurements and preserved the expected relationship between their distributions. Although the numerical differences were small, they demonstrated high reproducibility across independent experiments. These corrections do not alter the interpretation of Fig. 1F, and the distribution of PN/MN values is now consistent with the paired PN and MN data presented in the revised manuscript.
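      The consistency check described here can be sketched as follows: for each independent experiment, the ratio is recomputed from the paired PN and MN values and compared with the reported figure. The function names, the tolerance, and the paired intensities are hypothetical and serve only as an illustration; the sketch also assumes that the ratio is the PN value divided by the MN value, expressed in percent.

```python
import math


def pn_mn_ratio_percent(pn_values, mn_values):
    """Recompute the PN/MN ratio (in percent) for each paired experiment."""
    if len(pn_values) != len(mn_values):
        raise ValueError("PN and MN values must be paired one-to-one")
    return [pn / mn * 100 for pn, mn in zip(pn_values, mn_values)]


def ratios_match(pn_values, mn_values, reported_percent, tol=0.05):
    """True if every recomputed ratio agrees with the reported one within tol."""
    computed = pn_mn_ratio_percent(pn_values, mn_values)
    if len(computed) != len(reported_percent):
        return False
    return all(
        math.isclose(c, r, abs_tol=tol) for c, r in zip(computed, reported_percent)
    )


# Hypothetical paired intensities (arbitrary units), NOT the source data of Fig. 1F.
pn_values = [6.2, 5.8, 5.2]
mn_values = [200.0, 200.0, 200.0]
reported = [3.1, 2.9, 2.6]

print(ratios_match(pn_values, mn_values, reported))
```

      A check of this kind flags the situation raised by the reviewer: if the plotted ratios were identical while the paired PN and MN values were not proportional, the comparison would fail for at least one experiment.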

      Significance

      Micronuclei (MN) primarily arise from defects in mitotic progression and chromatin segregation, often associated with chromatin bridges and/or lagging chromosomes. MN frequently exhibit DNA replication defects and possess a rupture-prone nuclear envelope, which has been linked to genomic instability. The nuclear envelope of MN is notably deficient in crucial factors such as lamin B and nuclear pore complexes (NPCs). This deficiency may be attributed to the influence of microtubules and the gradient of Aurora B activity at the mitotic midzone, which inhibits the recruitment of proper nuclear envelope components. Additionally, several other factors may contribute to this process: for instance, PLK1 controls the assembly of NPC components onto lagging chromosomes; chromosome size and gene density positively correlate with the membrane stability of MN; and abnormal accumulation of the ESCRT complex on MN exacerbates DNA damage within these structures, triggering pro-inflammatory pathways.

      The work presented by Dr. Miyamoto and colleagues reveals the abnormal behavior of importin α1 in MN during interphase. According to their findings, it is reasonable to consider importin α1 as a molecular marker for characterizing MN dynamics. Furthermore, it could serve as a potential clinical marker if the authors provide additional experiments demonstrating significantly different localization patterns of importin α1 in transformed cells (e.g., MCF7, HeLa, MDA-MB-231) compared to non-transformed cells (e.g., MCF10A).

      While the authors present some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Moreover, according to the figure legends, data for both figures originate from a single experiment. As such, convincing evidence linking the aberrant behavior of importin α1 in MN with chromothripsis processes or regulation of the cGAS-STING pathway-and its implications for genomic instability in cancer cells-remains lacking.

      Overall, it is not entirely clear what significance this advance holds for the field; while there are conceptual contributions made by this work, they do not appear sufficiently robust at this time. Further research is needed to clarify these connections and strengthen their conclusions regarding importin α1's role in MN dynamics and genomic instability.

      We sincerely appreciate the reviewer's thoughtful and constructive evaluation of the significance of our study. We agree that in the original submission, the conceptual contribution was not fully supported by sufficient evidence. In the revised manuscript, we have substantially strengthened our findings by incorporating new data on RPA2 and cGAS, in addition to RAD51. These results consistently show that importin α1-positive MN are largely inaccessible to multiple DNA-recognizing proteins-including DNA repair factors (RAD51 and RPA2) and the innate immune sensor cGAS-whereas importin α1-negative MN readily recruit these proteins. This broader dataset reinforces the concept that importin α defines a distinct and restricted MN subset, extending beyond our initial observation of RAD51 exclusion.

      By framing importin α as a molecular marker that discriminates between functionally distinct MN environments, our study conceptually advances the understanding of MN heterogeneity. This adds to the prior literature showing that defective nuclear envelope integrity underlies chromothripsis and cGAS-STING activation and positions importin α as a new marker for identifying MN that are refractory to these DNA repair and sensing pathways. While we agree that further work is necessary to directly link importin α enrichment to downstream genomic instability or inflammation in cancer, we believe that our revised data now provide a robust foundation for future investigations.

      Taken together, the revised manuscript presents a clearer and more comprehensive conceptual advance: importin α-positive MN represent a previously unrecognized molecular environment distinct from MN characterized by canonical DNA repair or sensing factors. We are grateful to the reviewer, whose constructive comments greatly improved the clarity, robustness, and overall impact of our study. We believe that these findings will be of particular interest to researchers studying the mechanisms of genomic instability, chromothripsis, and cancer biology.


      Reviewer #2

      Summary:

      The authors have shown that Importin α1, a nuclear transport factor, is enriched in subsets of micronuclei (MN) of cancer cells (MCF7 and HeLa) and, using FRAP, has an altered dynamics in MN. Moreover, the authors have shown that these levels of Importin α1 in the MN are likely not due to its traditional role for signal-dependent protein transport, as suggested by immunofluorescence of other factors important for this function. Additionally, cargo dynamics carrying NLS or NES signals were disrupted in Importin α1-positive micronuclei. Importin α1-positive micronuclei also appear to have a disrupted nuclear envelope, potentially explaining some of these cargo disruptions. The authors also demonstrated that Importin α colocalizes with proteins important for DNA replication, and p53 signaling using RIME, followed by immunofluorescence. Lastly, the authors show that Importin α and RAD51 have mutual exclusivity in the micronuclei.

      Major comments:

      1) A key issue is that very few statistical tests are used in this study, although they are crucial to the interpretation of the data. We strongly urge the authors to re-analyze the data using appropriate statistical analyses. Along those lines, in many figures 1 or 2 images are shown without stating how many biological or technical replicates they are representative of or showing quantification of the analyses. In general, the authors' statements would be strengthened by showing more examples and/or stating "N" in the figure legends or supplement.

      We sincerely thank the reviewer for emphasizing the importance of including sufficient statistical analyses and replication information. As noted in our response to Reviewer #1, we have carefully revised the manuscript to enhance statistical rigor and transparency throughout. Specifically, we expanded the Statistical Analysis section in the Materials and Methods section to provide a clear description of the statistical approaches used. In addition, all figure legends have been revised to explicitly state the number of biological replicates, sample sizes, statistical tests applied, and corresponding p-values or significance indicators. Representative images are consistently accompanied by quantitative analyses derived from multiple independent experiments.

      We believe that these comprehensive revisions directly address the reviewer's concerns and substantially improve the rigor, clarity, and interpretability of our manuscript.

      2) Using RIME and immunofluorescence, the authors identify factors that co-localize with Importin α1 in subsets of micronuclei (Figure 5), which is interesting, but there is no functional data associated with this result. Are the authors stating that these differences account for altered DNA damage or replication? It is unclear what the conclusion is beyond "some MN are different than others." Could the authors knockdown/knockout these factors to determine if they recruit Importin α1 into MN or the reciprocal? For many of these factors, they appear to be broadly present throughout the entire primary nucleus as well, indicating there is nothing unique about their MN localization.

      We agree that our original RIME and indirect IF analyses were primarily descriptive and lacked functional validation. To strengthen this aspect, we added new IF and quantification data (now presented in Fig. 8) showing that importin α1-positive MN are largely mutually exclusive with DNA repair and sensing factors such as RAD51, RPA2, and cGAS, whereas importin α1 frequently co-localizes with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings indicate that importin α1-positive MN define a distinct molecular environment enriched in replication- and chromatin-associated regulators but inaccessible to canonical DNA repair and sensing proteins.

      This combination of mutual exclusivity with DNA repair/sensing factors and frequent co-localization with chromatin regulators underscores the biological significance of importin α1 localization in MN, as it may contribute to localized chromatin stabilization through association with chromatin regulators while simultaneously restricting access to DNA repair and sensing factors. Thus, importin α1-positive MN represent a restricted subset with potential implications for genome stability and immune signaling, going beyond the descriptive notion that "some MN are different than others."

      Moreover, many chromatin regulators identified by RIME contain classical nuclear localization signals (NLSs), raising the possibility that importin α1 interacts with these proteins via their NLS sequences. We fully agree with the reviewer that knockdown or knockout experiments would be highly valuable to clarify whether such interactions actively recruit importin α1 into MN or occur reciprocally, and we regard this as an important direction for future investigations.

      3) In line 274, the authors state that MN highly enriched for Importin α1 inhibits RAD51 accessibility but this is an overstatement of the data. Instead, the authors show that RAD51 and importin α1 do not colocalize in micronuclei, albeit without quantification which weakens their argument. Also, the consequence of this "mutual exclusivity" is unclear. Can the authors inhibit or knock down Importin α1 and show that RAD51 goes to all micronuclei? And how is this different than the data shown for factors in Figure 5? Some of those show colocalization with Importin α1-positive micronuclei and others do not. Could you perform live imaging of labeled Importin α1 and RAD51 and show that as Importin α1 accumulates in MN that RAD51 or other DNA repair factors are exported? An alternative experiment would be to show that the C-mutant, which is defective in nuclear export, now colocalizes with RAD51 in MN. Please reconcile this or show experiments to prove the statement above.

      We agree that our original wording "inhibits RAD51 accessibility" was not sufficiently supported by direct evidence, as it was based solely on the immunofluorescence data. Therefore, we have removed this statement from the Results section of the revised manuscript. To strengthen this point, we added a quantitative analysis (Fig. 7F) showing that RAD51 signals were significantly reduced in importin α1-enriched MN.

      Regarding the suggestion to perform knockdown experiments, we note that the depletion of KPNA2 (gene name of importin α1) has been reported to cause severe cell-cycle arrest (Martinez-Olivera et al, 2018; Wang et al, 2012). Consistent with these reports, we also found that siRNA-mediated knockdown of KPNA2 in our system strongly reduced MN induction upon reversine treatment, making it technically unfeasible to analyze RAD51 localization under these conditions. We also sincerely thank the reviewer for suggesting the live imaging experiments. We fully agree that such experiments would provide valuable mechanistic insights, and we regard this as an important direction for future research.

      In addition, to address the reviewer's concern about other DNA repair factors, we added new data (Fig. 8) showing that importin α1-positive MN are mutually exclusive with RPA2 and cGAS. RPA2 is a canonical single-strand DNA (ssDNA)-binding protein that stabilizes exposed ssDNA and facilitates RAD51 recruitment. It has been reported to accumulate in ruptured MN in a CHMP4B-dependent manner (Vietri et al, 2020). cGAS is a cytosolic DNA sensor that detects ruptured MN and activates innate immune signaling via the cGAS-STING pathway. Together with our RAD51 results, these data show that importin α1-positive MN are consistently segregated from multiple DNA-recognizing factors, including RAD51. Simultaneously, importin α1 co-localizes with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings support the view that importin α1-positive MN define a distinct molecular environment enriched in chromatin regulators but largely inaccessible to DNA repair and sensing factors. While the precise mechanism remains unclear, one possibility is that importin α1-associated chromatin interactions limit the access of DNA repair and sensing proteins. However, this interpretation is speculative and requires further investigation.

      4) In the Discussion, lines 343-344 state that "importin α1 is uniquely distributed and alters the nuclear/chromatin status when enriched in MN," however this is not currently supported by the present data. The data presented show a correlation (albeit weak) between euchromatic modifications and Importin α1, and they do not definitively show that importin α1 is sufficient to alter the nuclear-chromatin status when enriched in the MN. More substantial experiments would be required to show whether Importin α1 plays an active role in these modifications.

      Following the reviewer's suggestion, in the revised manuscript, we removed this overstatement and rephrased the relevant sections of the Discussion. Rather than implying a causal role, we now describe the mutually exclusive localization of importin α1 with DNA repair and sensing factors (RAD51, RPA2, and cGAS), emphasize its preferential association with euchromatin regions marked by H3K4me3, and note its frequent co-localization with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings suggest that importin α1-positive MN define a distinct subset characterized by limited accessibility to DNA repair and sensing proteins, whereas cGAS-positive ruptured MN exemplify a state in which these proteins can accumulate.

      We also added a concluding statement that frames importin α1 as defining a previously unrecognized MN subset that is distinct from conventional ruptured MN. This revision provides a more accurate and appropriately cautious interpretation of our data while underscoring the conceptual advance of our study by clarifying how importin α1 localization reveals MN heterogeneity.

      Minor Comments

      1) Summary statement (page 3 Line 40): The use of "their" is confusing. Whose microenvironment are you referring to?

      We have rephrased the sentence as follows: The accumulation of importin α in micronuclei, followed by modulation of the microenvironment of the micronuclei, suggests the non-canonical function of importin α in genomic instability and cancer development. Thank you for this useful suggestion.

      2) In Abstract and introduction (page 4, Line 44 and page 5, line 59) it states that MN are membrane enclosed structures, but this is not always the case (see https://doi.org/10.1038/nature23449 as one example).

      While MN are typically surrounded by a nuclear envelope at the time of their formation during mitosis, we agree that this envelope can later rupture or fail to assemble completely, thereby exposing micronuclear DNA to the cytoplasm. To clarify this point, we revised the Introduction to explicitly acknowledge that MN may lose nuclear envelope integrity, which can have important consequences for genomic instability, immune activation, and inflammation. Specifically, we have added the following sentence to the Introduction (page 4, lines 77-80): "The nuclear envelope of MN can be partially or completely disrupted, allowing cytoplasmic DNA sensors, such as cyclic GMP-AMP synthase (cGAS), to access micronuclear DNA and trigger innate immune responses via the cGAS-STING pathway (Harding et al, 2017; Li & Chen, 2018; Mackenzie et al, 2017)."

      We hope this addition appropriately addresses the concerns raised by Reviewer #2 while incorporating the valuable suggestions from Reviewer #1 without altering the overall structure and flow of the manuscript.

      3) Given the fact that the RIME result identified proteins involved in DNA replication to be enriched with Importin α1, are these MN enriched in factors described in Fig. 5 simply localizing to MN that are in S phase, as described previously (doi: 10.1038/nature10802)?

      We sincerely thank the reviewer for raising this constructive perspective regarding the potential relationship between importin α1 enrichment in micronuclei (MN) and the S phase. Our RIME analysis identified chromatin-associated proteins, such as PARP1 and SUPT16H/FACT, which are often activated during replication stress and frequently function in the S phase. However, importin α1-positive MN were not exclusively associated with S-phase-specific molecules, and our data do not indicate that these MN are restricted to the S phase.

      Previous studies [e.g., (Crasta et al, 2012)] have established that MN are prone to replication defects and represent hotspots of genomic instability. The recovery of replication stress-responsive molecules, such as PARP1 and FACT, by RIME is therefore consistent with the biology of MN. Based on this valuable suggestion, we have revised the Discussion (page 19) to explicitly mention the potential involvement of replication-related proteins in importin α1-positive MN, as well as the possibility that importin α1 accumulation may contribute to replication defects in these structures.

      We are grateful to the reviewer for this important comment, which has allowed us to place our findings in a broader mechanistic context and outline directions for future research, including testing the relationship between importin α1-positive MN and established S-phase markers such as PCNA.

      4) The FRAP data is not very compelling. While it is clear there are differences between the PN and MN dynamics, what is driving these differences? Are these differences meaningful to the biology of the MN or PN? It is unclear what this data is contributing to the conclusions of the paper. Also, if the mobility of the MN is plotted on the same graph as the PN, the differences in MN mobility might not look as compelling.

      We respectfully emphasize that FRAP analysis is a key component of our study, as it provides important insights into the distinct dynamics of importin α1 in MN compared to PN.

      In the revised manuscript, we included new experiments (now shown in Fig. 3A and 3C) that directly compare the recovery kinetics of importin α1 in PN and MN in the same cells. By plotting the PN and MN recovery curves side by side, we aimed to improve clarity and provide a direct visualization of the pronounced differences in importin α1 dynamics between these compartments.

      Our FRAP results showed that importin α1 accumulated in both PN and MN but exhibited markedly reduced mobility in MN. These findings suggest that, unlike in the PN, canonical nucleocytoplasmic recycling of importin α1 is impaired in MN. Furthermore, the reduced mobility indicates that importin α1 is stably associated with chromatin or chromatin-associated factors in MN, consistent with our additional biochemical and imaging data showing preferential association with euchromatin (e.g., H3K4me3) and chromatin regulators.

      Taken together, the FRAP data provide functional evidence that complements our structural and molecular analyses, supporting our central conclusion that importin α1 accumulation in MN defines a restricted chromatin environment that influences the accessibility of DNA repair and sensing factors.

      5) In Results (line 117), you state that "the cytoplasm of those cell lines emitted quite strong signals" for Importin α1, but that phrasing is a little confusing. Yes, Importin α1 is present in the cytoplasm in most cells, but it appears you are referring to the enrichment in MN. I would recommend re-phrasing this statement to make your intent clearer.

      As the reviewer rightly noted, the original phrasing, "the cytoplasm of those cell lines emitted quite strong signals," was misleading, as it could suggest a broad cytoplasmic distribution of importin α1. Our observations showed that importin α1 accumulated specifically in MN located within the cytoplasm, rather than throughout the cytoplasm itself. To clarify this, we revised the Results section (page 7, lines 125-127) to read: "Next, we performed indirect immunofluorescence (IF) analysis on human cancer cell lines, including MCF7 and HeLa cells. Notably, we found that importin α1 accumulated prominently in MN located within the cytoplasm (MCF7 cells, Fig. 1B; HeLa cells, Fig. 1C; yellow arrowhead)."

      We believe that this revised wording more accurately reflects our findings and addresses the reviewer's concerns.

      6) In Results (line 135, Figure S2E,F), the ratio of high, low or no Importin α1 intensity is confusing. Is this percentage relative to the total number of MN? It is unclear what is meant by "whole number" of MN. Is Importin α1 intensity quantified or is it subjective?

      We apologize for the confusing terminology used in the original manuscript for Supplemental Fig. S2 and thank the reviewer for pointing it out. Although the reviewer did not specifically comment on the classification of importin α1 signal intensity as "high" or "low," we recognized that this approach relied on subjective visual assessment and lacked clearly defined thresholds. To improve clarity and objectivity, we have removed this classification and now analyze importin α1 localization in MN as simply positive or negative (revised Supplemental Fig. S2E). The previous graph (original Fig. S2F) was deleted. In addition, the frequency of Importin α1-positive MN has been reported in the Results section of the main text (page 8). We believe that these revisions have improved the clarity and reproducibility of our data presentation.

      7) Figure 2C is confusing. Are you counting MN with co-localization of Importin α1 and these factors? Please clarify.

      Figure 2C shows the percentage of importin α1-positive MN that displayed localization of importin β1, CAS, or Ran based on IF analysis. In other words, it represents the co-localization rates of these transport factors specifically within the subset of MN positive for importin α1. To improve clarity, we revised the y-axis label in Fig. 2C to "Localization in Impα1-positive MN (%)" and modified the figure legend accordingly. We have clarified this point in the Results section (page 9). We believe that these revisions resolve the confusion and clarify the scope of the analysis.

      8) Figure S3D quantification is very confusing and unclear. Also, how is this normalized? Are you controlling for total signal in each cell? And can the results of this experiment give you any mechanistic insight as to what is regulating MN localization beyond the interpretation of "MN localization is distinct from PN localization"? The "C-mutant" appears quite a bit different than the others. What might that indicate about the role of CAS/CSE1L in MN enrichment?

      We apologize for the confusion caused by the quantification in the Supplemental Fig. S3D (now revised as Fig. S4D). This figure shows the relative enrichment of EGFP-importin α1 in MN compared with that in PN for wild-type and mutant constructs. To control for nuclear size, fluorescence intensity was measured using a fixed circular ROI (1.5-2.0 µm in diameter) placed in both the MN and PN of the same cell, and MN/PN intensity ratios were directly plotted for individual cells (n = 8 per condition). This procedure is described in detail in the Results section (page 10).
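      As an illustration only, and not the authors' actual analysis pipeline, the per-cell MN/PN ratio described above can be sketched as follows. The function name `mn_pn_ratios` and all intensity values are hypothetical; the sketch assumes that the mean fluorescence inside the fixed ROI has already been measured and background-subtracted for the MN and the PN of each cell.

```python
def mn_pn_ratios(cells):
    """Return one MN/PN intensity ratio per cell.

    `cells` is a list of (mn_intensity, pn_intensity) pairs, each measured
    with the same fixed-size ROI in the MN and PN of a single cell.
    """
    ratios = []
    for mn_intensity, pn_intensity in cells:
        if pn_intensity <= 0:
            raise ValueError("PN intensity must be positive")
        ratios.append(mn_intensity / pn_intensity)
    return ratios


# Hypothetical ROI measurements (arbitrary units), not data from the study.
example_cells = [(820.0, 410.0), (900.0, 450.0), (600.0, 400.0)]
example_ratios = mn_pn_ratios(example_cells)

# Same MN intensity as the first cell, but a brighter PN: the ratio drops
# even though the MN signal itself is unchanged.
brighter_pn_ratio = mn_pn_ratios([(820.0, 820.0)])[0]
```

      Because the ratio has the PN signal in its denominator, a lower ratio on its own cannot distinguish reduced MN signal from increased PN signal, so the two intensities also need to be inspected separately.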

      Regarding the C-mutant, the reduced MN/PN ratio primarily reflects increased importin α1 accumulation in the PN rather than a reduced retention in the MN. As discussed in the revised manuscript (page 18), this suggests that CAS/CSE1L-mediated nuclear export is active in the PN but may be impaired or uncoupled in the MN, possibly due to differences in nuclear envelope integrity or chromatin context. We believe that this clarification addresses the reviewer's concerns and highlights the mechanistic implications of the C-mutant phenotype.

      9) For Figures 3A,B and S4, are these images of single z-slices or projections? It would be helpful to clarify for your interpretations as to whether they are truly partial or diffuse or the membrane is in another z-plane. Also, how is the localization of Importin α1 different from or similar to that of other factors that localize to MN with a compromised nuclear envelope, such as cGAS? If it is based on epigenetic marks, it should be different than cGAS, which primarily binds non-chromatinized DNA.

      We thank the reviewer for this valuable suggestion. All images shown in Figs 3A, 3B, and S4 in the original manuscript (now revised as Fig. 4A and 4B, with the original Fig. S4 omitted) were derived from single optical sections rather than projections. We would like to emphasize that similar discontinuities in signals for lamin proteins (including laminB1 and laminA/C) were consistently observed across multiple cells and independent experiments, indicating that these observations are not due to an artifact of image acquisition or a missing z-plane, but rather reflect a genuine partial loss of the MN membrane.

      In contrast to cGAS, which predominantly binds non-chromatinized DNA in ruptured MN, our data indicate that importin α1 preferentially localizes to MN regions enriched in euchromatin-associated histone modifications, such as H3K4me3. The new data presented in Fig. 8 further strengthen this point by directly comparing importin α1 with DNA-recognizing proteins such as cGAS and RPA2, which preferentially localize to MN lacking importin α1. Together, these results highlight that importin α1-positive MN constitute a distinct subset characterized by chromatin-associated localization and reduced accessibility to DNA repair and sensing proteins.

      10) In Results, it is unclear how Fig. 7B was calculated. Are the authors qualitatively assessing if RAD51 is there or looking for MN enrichment relative to PN? Additionally, in Fig. 7C, RAD51 localization is diffuse. It should be enriched in foci. I would recommend the authors repeat this experiment using pre-extraction then quantify RAD51 foci number and/or intensity.

      For the quantification shown in Fig. 7B of the original manuscript, we acquired images containing approximately 15-50 cells per condition and counted all the micronuclei (MN) in those fields. The percentage of RAD51-positive MN relative to the total MN was calculated. In the revised manuscript, we further refined this analysis by classifying RAD51-positive MN into two categories based on signal intensity: weak (Cell #1 type) and strong (Cell #2 type). For each condition, nine independent fields were analyzed (302 MN in untreated cells and 213 MN in etoposide-treated cells). This quantification revealed that etoposide treatment preferentially increased the proportion of MN with strong RAD51 accumulation (Fig. 7C, right panels), indicating enhanced DNA damage in MN. Thus, our analysis was quantitative rather than qualitative, based on systematic counting across multiple fields.
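      The counting scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `category_percentages`, the label strings, and the example labels are all hypothetical, and each MN is assumed to have already been scored by eye as RAD51-negative, weak, or strong.

```python
from collections import Counter


def category_percentages(labels):
    """Return the percentage of MN in each category, relative to all MN scored."""
    total = len(labels)
    if total == 0:
        raise ValueError("no MN were scored")
    counts = Counter(labels)
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical scoring of ten MN pooled across fields (not data from the study).
scored_mn = ["negative"] * 6 + ["weak"] * 3 + ["strong"] * 1
percentages = category_percentages(scored_mn)

# Total RAD51-positive fraction is the sum of the weak and strong categories.
rad51_positive = percentages["weak"] + percentages["strong"]
```

      Pooling MN across fields before dividing, as done here, weights every MN equally; averaging per-field percentages instead would weight every field equally, and the two can differ when fields contain different numbers of MN.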

      Regarding the reviewer's suggestion of pre-extraction, we believe that this approach is technically difficult because MN are structurally fragile. Importantly, in the subset of MN with strong RAD51 accumulation, RAD51 was clearly present in foci rather than diffuse signals, as shown in the high-magnification images (Fig. 7E).

      Finally, in response to Reviewer #1, we performed a new quantitative analysis (Fig. 7F) focusing on the frequency of strongly RAD51-positive MN in relation to importin α1 status. This analysis confirmed the mutually exclusive relationship between RAD51 and importin α1 in MN and further strengthened our conclusions.

      11) In line 264, "notably" is misspelled.

      Thank you for pointing this out. We have corrected the spelling.

      12) In line 303, "scenarios" should be changed to the singular form.

      Thank you for this confirmation. We have corrected this to "scenario".

      13) In Figure legend, lines 571-582, H3K27me3 is shown in Figure 4D, but the written legend does not mention this mark.

      We have added the marks in the legend for Fig. 5E.


      Significance: Overall, this paper shows compelling evidence for micronuclear localization of regulators of nuclear export, notably Importin α1. Of note, this occurs in subsets of MN that lack an intact nuclear envelope. And while it has been appreciated that compromised micronuclear envelopes lead to genomic instability, this is one of the first studies to demonstrate that alteration in the nuclear envelope may disrupt import or export of nuclear proteins into micronuclei.

      A limitation of the study is that much of the work is based on immunofluorescence and lacks mechanism. While there is much correlative data showing that Importin α1 localizes to micronuclei with compromised envelopes, it is unclear whether Importin α1 drives micronuclear collapse or it is downstream of this process. Additionally, Importin α1 micronuclear localization anti-correlates with RAD51 but does colocalize with other DNA replication factors, yet it is unclear whether their localization is dependent on Importin α1 or its role in nuclear export. Currently, the audience for this manuscript would be focused to those interested in micronuclei. If these concerns about an active role for Importin α1 in micronuclear export are resolved, it would greatly increase the impact of this manuscript to those interested more broadly in genomic instability, DNA repair, and cancer.

      We thank the reviewer for positively evaluating our study and highlighting the importance of defining the biological significance of our findings. In the revised manuscript, we incorporated new data (Fig. 8) demonstrating that importin α1-positive MN are mutually exclusive not only with RAD51 but also with RPA2 and cGAS. These results clearly establish importin α1-positive MN as a distinct subset, defined by the enrichment of chromatin-associated proteins, while being largely inaccessible to canonical DNA repair and DNA-sensing factors.

      Consistent with this, our FRAP experiments and analysis of the CAS/CSE1L-binding mutant (C-mut) further indicated that the recycling dynamics of importin α1 were altered in MN compared to PN. In addition, importin α1 was enriched in lamin-deficient areas of MN, where electron microscopy revealed a fragile nuclear envelope morphology. Together with prior evidence, discussed in the revised manuscript, that recombinant importin α can inhibit nuclear envelope assembly in Xenopus egg extracts (Hachet et al, 2004), these findings raise the possibility that high local concentrations of importin α1 may actively contribute to impaired nuclear envelope formation or stability in MN.

      Such a distinct MN state may have important biological consequences. By limiting the access of DNA repair and DNA-sensing proteins, importin α1 accumulation may influence chromothripsis and immune activation, which, in turn, could play a role in tumor progression and genome instability. We believe that the identification of importin α1 as a marker defining such a restricted MN environment represents a conceptual advance that extends the relevance of our study beyond the MN field to the broader areas of genome instability, DNA repair, and cancer biology. We are grateful to the reviewer for encouraging us to strengthen the framing of our work, which has helped us clarify the novelty and impact of our findings.

      Reviewer #3

      Summary:

      This study reports that importin alpha isoforms enrich strongly in a subset of micronuclei in cancer cells and uses mutagenesis and immunostaining to define how this localization relates to importin alpha's nuclear transport function. This enrichment occurs even though importin-alpha-positive micronuclei also contain Ran and the importin alpha export factor CSE1L, indicating that importin alpha enrichment is not simply a consequence of the absence of components of the nuclear transport machinery that control its localization. Mutagenesis of importin alpha indicates that Mn enrichment persists even when the importin beta binding and NLS binding capacities of importin alpha are impaired. Potential importin alpha interacting proteins are identified by proteomics, although the relationship of these potential binding partners to micronucleus localization is unclear.


      1. In Figure S3, the authors show that mutagenesis of importin alpha's CSE1L binding domain decreases the ratiometric enrichment in Mn vs. Pn. However, is this effect occurring because the CSE1L binding mutant decreases Mn enrichment, or increases Pn enrichment? It seems that the latter is possible based on the images shown. If the Pn specifically becomes brighter on average in cells expressing the C-mut, while Mn remain similar in fluorescence intensity, that might suggest that CSE1L has less of an effect on importin alpha export in Mn compared to Pn.

      We appreciate the reviewer's insightful observations. In the revised analysis (now presented in Supplemental Fig. S4D), we quantified EGFP-importin α1 intensities in both PN and MN using fixed circular regions of interest. This revealed that the reduced MN/PN ratio observed in the CSE1L-binding mutant (C-mut) was mainly due to an increase in the PN signal rather than a decrease in the MN signal. These results are consistent with the reviewer's suggestion and indicate that CSE1L-mediated nuclear export is functional in PN but has a limited impact on MN.

      Importantly, this interpretation is supported by our FRAP experiments (Fig. 3), which show that importin α1 recycles normally in the PN but exhibits markedly reduced mobility in the MN. Together with our proteomic and colocalization analyses (Fig. 6), which identified importin α1 association with chromatin regulators such as PARP1 and SUPT16H/FACT, these findings suggest that importin α1 accumulates in MN not only because the recycling machinery is uncoupled but also because it forms stable interactions with chromatin-associated proteins. As discussed in the revised manuscript, this dual mechanism provides a plausible explanation for the persistent retention of importin α1 in MN and its role in defining a distinct MN environment.

      It is unclear from the text or the methods whether RIME identification of importin-alpha binding partners is performed in reversine-treated cells, which would increase the proportion of importin alpha in Mn, or in untreated cells. In either case, it seems likely that the majority of interactors identified would be cargoes that rely on importin alpha for import into the Pn. The rationale for linking these potential interactions to the Mn is unclear. While some of these factors are indeed shown enriched in Mn in Figure 5, the significance of this is also unclear. These points should be clarified.

      We thank the reviewer for raising this important point. The RIME assay was performed using whole-cell extracts from untreated wild-type MCF7 cells, which primarily identified importin α1-associated nuclear cargo proteins. To assess their potential relevance to MN, we screened the RIME candidates using immunofluorescence data provided by the Human Protein Atlas database and experimentally validated those showing clear MN localization by colocalization with importin α1. This two-step approach enabled us to highlight importin α1 interactors that are functionally relevant to MN biology rather than general nuclear cargoes.

      In response to the reviewer's concerns, we revised the Results section to clarify this rationale. Specifically, we added the explanation that "As importin α1 interactors are typically nuclear proteins, it is plausible that they reside not only in the primary nucleus but also in the MN. To test this possibility, we screened the identified candidates for MN localization using immunofluorescence images provided by the Human Protein Atlas (HPA) database (Pontén et al, 2008; Thul et al, 2017)." (page 14, lines 294-297).

      This is consistent with the idea that a wide range of nuclear proteins carrying NLS motifs can recruit importin α1 into the micronuclei, where they reside. This protein-driven enrichment of importin α1 may create a restricted microenvironment in which canonical DNA repair and sensing proteins, including RAD51, RPA2, and cGAS, are excluded, thereby defining a distinct subset of micronuclei with limited genome surveillance capacity.

      In Figure 6, the authors perform FRAP of importin alpha in Mn and show that it recovers much more slowly in Mn than in Pn. However, it appears from the images shown that the entire Mn was photobleached in each FRAP experiment. It thus is unclear whether the slow FRAP recovery is limited by slow diffusion of importin alpha within Mn/on Mn chromatin or impaired trafficking of importin alpha into and out of Mn. These distinct outcomes have distinct implications: either importin alpha is immobilized on Mn (eu)chromatin, or alternatively importin alpha is poorly transported into / out of Mn. This ambiguity could be resolved by bleaching a portion of a Mn and testing whether importin alpha diffuses within a single Mn.

      We thank the reviewer for this insightful comment regarding the interpretation of FRAP data. As the reviewer rightly pointed out, the original FRAP design, in which the entire MN was photobleached, does not allow for a clear discrimination between the intranuclear immobilization of importin α1 and impaired trafficking into or out of the MN.

      In line with a similar suggestion from Reviewer #1, we attempted partial photobleaching of MN to evaluate whether importin α1 can diffuse within MN independently of nucleocytoplasmic transport. However, due to the small size of MN, precise targeting is technically challenging and recovery is often unreliable, with some MN even exhibiting partial recovery during the bleaching process itself. These data were not included in the revised figures; however, we provide representative examples as reviewer-only figures to illustrate these technical limitations.

      To further clarify the nuclear transport dynamics of importin α1, we redesigned our FRAP experiments to fully photobleach both the PN and MN within the same cells under identical conditions. These results, presented in revised Fig. 3A and 3C, demonstrate a markedly slower recovery of importin α1 in MN compared to PN, strongly suggesting that nucleocytoplasmic recycling of importin α1 is impaired in MN. Moreover, the reduced mobility of importin α1 in the MN is consistent with stable chromatin binding, limiting its ability to diffuse freely within the nuclear space.
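      For readers unfamiliar with how such recovery curves are usually compared, the following is a minimal sketch of one common FRAP normalization, assuming a single pre-bleach value and a single immediate post-bleach value per compartment. It is not the authors' analysis: the function names, time points, and intensity values are hypothetical, and no correction for photobleaching during acquisition is applied.

```python
def normalize_frap(intensities, pre_bleach, post_bleach):
    """Rescale a recovery curve so that post-bleach = 0 and pre-bleach = 1."""
    span = pre_bleach - post_bleach
    if span <= 0:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    return [(value - post_bleach) / span for value in intensities]


def half_recovery_time(times, normalized):
    """Time at which the curve first reaches half of its final value.

    Uses linear interpolation between sampled time points and returns None
    if the half-way level is never crossed.
    """
    target = normalized[-1] / 2.0
    for k in range(1, len(times)):
        low, high = normalized[k - 1], normalized[k]
        if low < target <= high:
            fraction = (target - low) / (high - low)
            return times[k - 1] + fraction * (times[k] - times[k - 1])
    return None


# Hypothetical curves (arbitrary units); the first sample is the post-bleach frame.
times_s = [0, 10, 20, 30, 40]
pn_curve = normalize_frap([20, 60, 80, 90, 92], pre_bleach=100, post_bleach=20)
mn_curve = normalize_frap([20, 24, 28, 32, 36], pre_bleach=100, post_bleach=20)

pn_recovered = pn_curve[-1]  # fraction recovered at the last time point
mn_recovered = mn_curve[-1]
pn_half_time = half_recovery_time(times_s, pn_curve)
```

      With whole-compartment bleaching, the recovered fraction reflects exchange with the unbleached pool outside the compartment, so a low value is compatible with both impaired transport and stable binding inside the compartment.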

      We believe that this additional analysis, prompted by the reviewer's comment, significantly strengthens the mechanistic interpretation of our FRAP data.

      References

      Crasta K, Ganem NJ, Dagher R, Lantermann AB, Ivanova EV, Pan Y, Nezi L, Protopopov A, Chowdhury D, Pellman D (2012) DNA breaks and chromosome pulverization from errors in mitosis. Nature 482: 53-58

      Hachet V, Kocher T, Wilm M, Mattaj IW (2004) Importin α associates with membranes and participates in nuclear envelope assembly in vitro. EMBO J 23: 1526-1535

      Martinez-Olivera R, Datsi A, Stallkamp M, Köller M, Kohtz I, Pintea B, Gousias K (2018) Silencing of the nucleocytoplasmic shuttling protein karyopherin a2 promotes cell-cycle arrest and apoptosis in glioblastoma multiforme. Oncotarget 9: 33471-33481

      Vietri M, Schultz SW, Bellanger A, Jones CM, Petersen LI, Raiborg C, Skarpen E, Pedurupillay CRJ, Kjos I, Kip E, Timmer R, Jain A, Collas P, Knorr RL, Grellscheid SN, Kusumaatmaja H, Brech A, Micci F, Stenmark H, Campsteijn C (2020) Unrestrained ESCRT-III drives micronuclear catastrophe and chromosome fragmentation. Nat Cell Biol 22: 856-867

      Wang CI, Chien KY, Wang CL, Liu HP, Cheng CC, Chang YS, Yu JS, Yu CJ (2012) Quantitative proteomics reveals regulation of karyopherin subunit alpha-2 (KPNA2) and its potential novel cargo proteins in nonsmall cell lung cancer. Mol Cell Proteomics 11: 1105-1122

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This study reports that importin alpha isoforms enrich strongly in a subset of micronuclei in cancer cells and uses mutagenesis and immunostaining to define how this localization relates to importin alpha's nuclear transport function. This enrichment occurs even though importin-alpha-positive micronuclei also contain Ran and the importin alpha export factor CSE1L, indicating that importin alpha enrichment is not simply a consequence of the absence of components of the nuclear transport machinery that control its localization. Mutagenesis of importin alpha indicates that Mn enrichment persists even when the importin beta binding and NLS binding capacities of importin alpha are impaired. Potential importin alpha interacting proteins are identified by proteomics, although the relationship of these potential binding partners to micronucleus localization is unclear.

      Significance

      1. In Figure S3, the authors show that mutagenesis of importin alpha's CSE1L binding domain decreases the ratiometric enrichment in Mn vs. Pn. However, is this effect occurring because the CSE1L binding mutant decreases Mn enrichment, or increases Pn enrichment? It seems that the latter is possible based on the images shown. If the Pn specifically becomes brighter on average in cells expressing the C-mut, while Mn remain similar in fluorescence intensity, that might suggest that CSE1L has less of an effect on importin alpha export in Mn compared to Pn.
      2. It is unclear from the text or the methods whether RIME identification of importin-alpha binding partners is performed in reversine-treated cells, which would increase the proportion of importin alpha in Mn, or in untreated cells. In either case, it seems likely that the majority of interactors identified would be cargoes that rely on importin alpha for import into the Pn. The rationale for linking these potential interactions to the Mn is unclear. While some of these factors are indeed shown enriched in Mn in Figure 5, the significance of this is also unclear. These points should be clarified.
      3. In Figure 6, the authors perform FRAP of importin alpha in Mn and show that it recovers much more slowly in Mn than in Pn. However, it appears from the images shown that the entire Mn was photobleached in each FRAP experiment. It thus is unclear whether the slow FRAP recovery is limited by slow diffusion of importin alpha within Mn/on Mn chromatin or impaired trafficking of importin alpha into and out of Mn. These distinct outcomes have distinct implications: either importin alpha is immobilized on Mn (eu)chromatin, or alternatively importin alpha is poorly transported into / out of Mn. This ambiguity could be resolved by bleaching a portion of a Mn and testing whether importin alpha diffuses within a single Mn.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors have shown that Importin α1, a nuclear transport factor, is enriched in subsets of micronuclei (MN) of cancer cells (MCF7 and HeLa) and, using FRAP, has altered dynamics in MN. Moreover, the authors have shown that these levels of Importin α1 in the MN are likely not due to its traditional role for signal-dependent protein transport, as suggested by immunofluorescence of other factors important for this function. Additionally, cargo dynamics carrying NLS or NES signals were disrupted in Importin α1-positive micronuclei. Importin α1-positive micronuclei also appear to have a disrupted nuclear envelope, potentially explaining some of these cargo disruptions. The authors also demonstrated that Importin α colocalizes with proteins important for DNA replication and p53 signaling using RIME, followed by immunofluorescence. Lastly, the authors show that Importin α and RAD51 have mutual exclusivity in the micronuclei.

      Major comments:

      1. A key issue is that very few statistical tests are used in this study. Statistical testing is crucial to the interpretation of the data. We strongly urge the authors to re-analyze the data using appropriate statistical analyses. Along those lines, in many figures one or two images are shown without stating how many biological or technical replicates they represent and without quantification of the analyses. In general, the authors' statements would be strengthened by showing more examples and/or stating "N" in the figure legends or supplement.
      2. Using RIME and immunofluorescence, the authors identify factors that co-localize with Importin α1 in subsets of micronuclei (Figure 5), which is interesting, but there is no functional data associated with this result. Are the authors stating that these differences account for altered DNA damage or replication? It is unclear what the conclusion is beyond "some MN are different than others." Could the authors knockdown/knockout these factors to determine if they recruit Importin α1 into MN or the reciprocal? For many of these factors, they appear to be broadly present throughout the entire primary nucleus as well, indicating there is nothing unique about their MN localization.
      3. In line 274, the authors state that MN highly enriched for Importin α1 inhibit RAD51 accessibility, but this is an overstatement of the data. Instead, the authors show that RAD51 and Importin α1 do not colocalize in micronuclei, albeit without quantification, which weakens their argument. Also, the consequence of this "mutual exclusivity" is unclear. Can the authors inhibit or knock down Importin α1 and show that RAD51 goes to all micronuclei? And how is this different from the data shown for factors in Figure 5? Some of those show colocalization with Importin α1-positive micronuclei and others do not. Could you perform live imaging of labeled Importin α1 and RAD51 and show that as Importin α1 accumulates in MN, RAD51 or other DNA repair factors are exported? An alternative experiment would be to show that the C-mutant, which is defective in nuclear export, now colocalizes with RAD51 in MN. Please reconcile this or show experiments to prove the statement above.
      4. In the Discussion, line 343-344 states that "importin α1 is uniquely distributed and alters the nuclear/chromatin status when enriched in MN," however this is not currently supported by the present data. The data presented shows correlation (albeit weak) between euchromatic modifications and Importin α1, and it does not definitively show that importin α1 is sufficient to alter the nuclear-chromatin status when enriched in the MN. More substantial experiments would be required to show whether Importin α1 plays an active role in these modifications.

      Minor Comments

      1. Summary statement (page 3 Line 40): The use of "their" is confusing. Whose microenvironment are you referring to?
      2. In Abstract and introduction (page 4, Line 44 and page 5, line 59) it states that MN are membrane enclosed structures, but this is not always the case (see https://doi.org/10.1038/nature23449 as one example).
      3. Given the fact that the RIME result identified proteins involved in DNA replication to be enriched with Importin α1, are these MN enriched in factors described in Fig. 5 simply localizing to MN that are in S phase, as described previously (doi: 10.1038/nature10802)?
      4. The FRAP data is not very compelling. While it is clear there are differences between the PN and MN dynamics, what is driving these differences? Are these differences meaningful to the biology of the MN or PN? It is unclear what this data is contributing to the conclusions of the paper. Also, if the mobility of the MN is plotted on the same graph as the PN, the differences in MN mobility might not look as compelling.
      5. In Results (line 117), you state that "the cytoplasm of those cell lines emitted quite strong signals" for Importin α1, but that phrasing is a little confusing. Yes, Importin α1 is present in the cytoplasm in most cells, but it appears you are referring to the enrichment in MN. I would recommend re-phrasing this statement to make your intent clearer.
      6. In Results (line 135, Figure S2E,F), the ratio of high, low or no Importin α1 intensity is confusing. Is this percentage relative to the total number of MN? It is unclear what is meant by "whole number" of MN. Is Importin α1 intensity quantified or is it subjective?
      7. Figure 2C is confusing. Are you counting MN with co-localization of Importin α1 and these factors? Please clarify.
      8. Figure S3D quantification is very confusing and unclear. Also, how is this normalized? Are you controlling for total signal in each cell? And can the results of this experiment give you any mechanistic insight as to what is regulating MN localization beyond the interpretation of "MN localization is distinct from PN localization"? The "C-mutant" appears quite a bit different than the others. What might that indicate about the role of CAS/CSE1L in MN enrichment?
      9. For Figures 3A,B and S4, are these images of single z-slices or projections? It would be helpful to clarify this so that readers can judge whether the signals are truly partial or diffuse or whether the membrane is in another z-plane. Also, how is the localization of Importin α1 different from or similar to that of other factors that localize to MN with a compromised nuclear envelope, such as cGAS? If it is based on epigenetic marks, it should be different from cGAS, which primarily binds non-chromatinized DNA.
      10. In Results, it is unclear how Fig. 7B was calculated. Are the authors qualitatively assessing if RAD51 is there or looking for MN enrichment relative to PN? Additionally, in Fig. 7C, RAD51 localization is diffuse. It should be enriched in foci. I would recommend the authors repeat this experiment using pre-extraction then quantify RAD51 foci number and/or intensity.
      11. In line 264, "notably" is misspelled.
      12. In line 303, "scenarios" should be changed to the singular form.
      13. In Figure legend, line 571-582, H3K27me3 is shown in Figure 4D, but the written legend does not mention this mark.

      Significance

      Overall, this paper shows compelling evidence for micronuclear localization of regulators of nuclear export, notably Importin α1. Of note, this occurs in subsets of MN that lack an intact nuclear envelope. And while it has been appreciated that compromised micronuclear envelopes lead to genomic instability, this is one of the first studies to demonstrate that alteration in the nuclear envelope may disrupt import or export of nuclear proteins into micronuclei.

      A limitation of the study is that much of the work is based on immunofluorescence and lacks mechanism. While there is much correlative data showing that Importin α1 localizes to micronuclei with compromised envelopes, it is unclear whether Importin α1 drives micronuclear collapse or it is downstream of this process. Additionally, Importin α1 micronuclear localization anti-correlates with RAD51 but does colocalize with other DNA replication factors, yet it is unclear whether their localization is dependent on Importin α1 or its role in nuclear export. Currently, the audience for this manuscript would be focused to those interested in micronuclei. If these concerns about an active role for Importin α1 in micronuclear export are resolved, it would greatly increase the impact of this manuscript to those interested more broadly in genomic instability, DNA repair, and cancer.

      Reviewer's areas of expertise: Genomic instability, cancer epigenetics, and mitosis

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Miyamoto et al. report that importin α1 is highly enriched in a subfraction of micronuclei (about 40%), which exhibit defective nuclear envelopes and compromised accessibility of factors essential for the damage response associated with homologous recombination DNA repair. The authors suggest that the unequal localization and abnormal distribution of importin α1 within these micronuclei contribute to the genomic instability observed in cancer.

      Major comments:

      Are the key conclusions convincing?

      The conclusions drawn by the authors would benefit from additional supportive experiments and a more detailed explanation.

      1. It is crucial to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells compared to transformed cell lines (MCF7, HeLa, and MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.
      2. While the authors provide some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Furthermore, according to the figure legends, the data presented in both figures stem from a single experiment. Current literature suggests that compromised nuclear envelope integrity is one of the major contributors to genomic instability, mediated through mechanisms such as chromothripsis and cGAS-STING-mediated inflammation arising from MN. Therefore, a more comprehensive quantification of nuclear envelope integrity, ideally comparing non-transformed MCF10A cells with transformed cell lines (MCF7, HeLa, and MDA-MB-231), is necessary to substantiate the connection between aberrant importin α1 behavior in MN and chromothripsis processes, as well as regulation of the cGAS-STING pathway linked to genomic instability in cancer cells.
      3. The schematic illustration presented in Figure 8 does not adequately summarize all findings from this study, nor does it clarify how the localization of importin α1 within MN might hypothetically influence genome stability. Although it is reasonable to propose that "importin α can serve as a molecular marker for characterizing the dynamics of MN" (Line 344), the authors assert (Line 325) that their findings, along with others, have "potential implications for the induction of chromothripsis processes and regulation of the cGAS-STING pathway in cancer cells." However, they fail to provide a clear or even hypothetical explanation regarding how their findings contribute to these molecular events. To address this gap, it would be essential for them to contextualize their results within existing literature that explores and links structural integrity deficits or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation (e.g., PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889).
      4. Fig. 4D does not support the idea that importin α1 is euchromatin enriched: H3K9me3, H3K4me3 and H3K37me3 seem to be all deeply blue.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Indeed, the data presented by the authors do not adequately support a direct link between the presence of importin α1 in MN and genomic instability in human cancer cells. While the experimental correlations provided may not substantiate this connection definitively, they do lay a foundation for a grounded hypothesis and suggest the need for further research to explore this topic in greater depth. Additionally, it is worth noting that the evidence contributes to the growing list of nuclear proteins exhibiting abnormal behavior in micronuclei (MN). This highlights the significance of studying such proteins to understand their roles in genomic stability and cancer progression.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Additional experiments are necessary to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells and transformed cell lines (MCF7, HeLa, MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells. The authors claim that importin α1 preferentially localizes to euchromatic areas rather than heterochromatic regions within MN. While this assertion is supported by the immunofluorescence (IF) images presented in Figures 4A/B and S5A/B, it remains less clear for Figure S5C/B. To strengthen this claim, providing averages of IF distributions from multiple cells across independent experiments would be beneficial to draw more robust conclusions.

      Furthermore, ChIP-seq data are presented to support the idea that importin α1 preferentially distributes over euchromatin areas in MN. However, as described, the epigenetic chromatin status indicated by these ChIP-seq experiments reflects that of the principal nucleus (PN), not specifically the status within MN in MCF7 cells. Given that MN represent only a small fraction of the cell population under normal culture conditions (likely less than 5% for HeLa cells, as shown in Figure S2D), the relevance of these data is limited. Additionally, according to data presented in Figure 1B, importin α1 does not localize or distribute within the PN as it does in MN in MCF7 cells. Therefore, further experiments should be conducted to substantiate that importin α1 preferentially targets euchromatin areas within MN and to compare this distribution with that observed in the principal nucleus. Such studies could reveal potential abnormalities regarding the correlation between epigenetic chromatin status and importin α distribution in MN. To support the hypothesis that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should be supplemented with thorough quantification and statistical analysis based on at least three independent experiments. This additional data would enhance confidence in their findings regarding RAD51 accessibility inhibition by importin α1.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The additional experiments proposed are controls and direct comparisons using the same techniques and experimental designs used by the authors, so it is reasonable that the authors can carry them out within a realistic timeframe.

      Are the data and the methods presented in such a way that they can be reproduced?

      Given the importance of reproducibility and the need to evaluate results based on imaging and quantitation, I strongly recommend that the authors include a detailed description of the optical microscopy procedures utilized in their study. This should encompass imaging conditions, acquisition settings, and the specific equipment used. Providing this information will enhance transparency and facilitate reproducibility. For reference, some valuable guidance on essential parameters for reproducibility can be found in Heddleston et al. (2021) (doi:10.1242/jcs.254144). Incorporating these details will not only strengthen the manuscript but also support other researchers in reproducing the findings accurately.

      Are the experiments adequately replicated and statistical analysis adequate?

      Many of the plots and values in the manuscript lack appropriate statistical analysis, including p-values, which are not detailed in the figures or their legends. Furthermore, the Statistical Analysis section does not provide adequate information regarding the specific statistical tests employed or the criteria used to determine which analyses were applied in each case. To enhance the rigor and clarity of the study, it is essential that these issues be addressed prior to publication. A comprehensive presentation of statistical analysis will improve the reliability of the findings and allow readers to better understand the significance of the results. I recommend that the authors revise this section to include detailed explanations of all statistical methods used, along with corresponding p-values for all relevant comparisons.

      Minor comments:

      Specific experimental issues that are easily addressable.

      The authors claim that importin α1 exhibits remarkably low mobility in the micronuclei (MN) compared to its mobility in the principal nucleus (PN), as illustrated in Figure 1. However, based on the experimental design, this conclusion may not be appropriate. In the current setup, the FRAP experiment conducted in the PN measures the mobility of importin α1 molecules within the cell nucleus, where the influence of nuclear transport is likely negligible. Conversely, in the MN experiments shown, all molecules of importin α1 are bleached within a given MN. Consequently, what is being measured here primarily reflects the effects of nuclear transport rather than intrinsic molecular mobility. To accurately compare kinetics of nuclear transport, it would be essential to completely bleach the entire PN. If measuring molecular mobility between MN and PN is desired, only a small fraction of either MN or PN area/volume should be bleached during FRAP analysis. Additionally, it would be beneficial to include measurements of mobility for other canonical nuclear transport factors (e.g., RAN, CAS, RCC1) for comparative purposes. This broader context would allow for a more comprehensive understanding of importin α1 behavior relative to other factors involved in nuclear transport. Finally, utilizing cells that exhibit importin α1 signals in both PN and MN could further strengthen comparisons and provide more robust conclusions regarding its mobility dynamics.

      Are prior studies referenced appropriately?

      Prior studies are referenced appropriately in general, but the authors missed some references (PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889) that I consider key to placing the present findings in the context of previous work linking the lack of structural integrity and/or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation.

      Are the text and figures clear and accurate?

      The figures presented in the manuscript are clear; however, where plots are included, they require appropriate statistical analysis. It is essential to display p-values on the plots or within their legends to provide readers with information regarding the significance of the results. Including this statistical information will enhance the interpretability of the data and strengthen the overall findings of the study. I recommend that the authors revise these sections accordingly before publication.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      1. In lines 134-135, it is stated that "up to 40% of the MN showed importin α1 accumulation under both standard culture conditions and the reversine treatment (Fig. S2F)." However, Figure S2F only displays percentages for reversine-treated cells, and there is no mention in the text or figures regarding the percentage of importin α1-positive MN determined by immunofluorescence (IF) under standard culture conditions. This discrepancy should be addressed.
      2. In line 170, the authors state that "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." It is unclear why this exclusion was made. The authors should clarify whether they are referring to all constructs or only to the wild-type (WT) construct when mentioning EGFP-importin α1 localization solely in PN. This clarification is important as it may affect the results highlighted in line 173.
      3. The statement in line 191 ("However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)") is somewhat misleading. While it hints at a technical issue, it does not provide additional relevant information for understanding its implications for the rationale of the research. Moreover, Figure S4 is referenced but appears to refer specifically to panels S4D and E, which are not mentioned in the text. I recommend clarifying this point or removing it altogether.
      4. Lines 197-199 contain a sentence that could be misleading and would benefit from clearer explanation. Although Figure 3D provides some clarity on this matter, no statistical analysis is included-only a bar plot is presented. A proper statistical analysis should be provided here to enhance understanding.
      5. In lines 218-221, it states that importin α1 associates with euchromatin regions characterized by H3K4me3 and H3K36me3; however, Figure 4D lacks the Spearman's correlation coefficient value for H3K36me3 within the matrix. This omission needs correction.
      6. For consistency in the experimental design aimed at identifying potential importin α1-interacting proteins, it would be more appropriate for Figures 5C/D to show IF data from MCF7 cells rather than HeLa cells.
      7. To substantiate claims that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should include thorough quantitation and statistical analysis based on at least three independent experiments.
      8. The meaning of lines 336-338-"Therefore, the enrichment of importin α1 in MN, along with its interaction with chromatin, may regulate the accessibility of RAD51 to DNA/chromatin fibers in MN and protect its activity"-is unclear. I suggest rephrasing this sentence for improved clarity and comprehension.
      9. Fig. 1D: Numbers on the y-axis are missing, x-axis labeling is too small
      10. Fig. 1F: As the PN/MN values of the three experiments are seemingly identical (third column) the distribution of the three individual data of the PN (first column) should mirror the distribution of the three individual data of the MN (second column). The authors might want to check why this is not the case.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.
      • Place the work in the context of the existing literature (provide references, where appropriate).

      Micronuclei (MN) primarily arise from defects in mitotic progression and chromatin segregation, often associated with chromatin bridges and/or lagging chromosomes. MN frequently exhibit DNA replication defects and possess a rupture-prone nuclear envelope, which has been linked to genomic instability. The nuclear envelope of MN is notably deficient in crucial factors such as lamin B and nuclear pore complexes (NPCs). This deficiency may be attributed to the influence of microtubules and the gradient of Aurora B activity at the mitotic midzone, which inhibits the recruitment of proper nuclear envelope components. Additionally, several other factors may contribute to this process: for instance, PLK1 controls the assembly of NPC components onto lagging chromosomes; chromosome size and gene density positively correlate with the membrane stability of MN; and abnormal accumulation of the ESCRT complex on MN exacerbates DNA damage within these structures, triggering pro-inflammatory pathways. The work presented by Dr. Miyamoto and colleagues reveals the abnormal behavior of importin α1 in MN during interphase. According to their findings, it is reasonable to consider importin α1 as a molecular marker for characterizing MN dynamics. Furthermore, it could serve as a potential clinical marker if the authors provide additional experiments demonstrating significantly different localization patterns of importin α1 in transformed cells (e.g., MCF7, HeLa, MDA-MB-231) compared to non-transformed cells (e.g., MCF10A). While the authors present some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Moreover, according to the figure legends, data for both figures originate from a single experiment. As such, convincing evidence linking the aberrant behavior of importin α1 in MN with chromothripsis processes or regulation of the cGAS-STING pathway, and its implications for genomic instability in cancer cells, remains lacking. Overall, it is not entirely clear what significance this advance holds for the field; while there are conceptual contributions made by this work, they do not appear sufficiently robust at this time. Further research is needed to clarify these connections and strengthen their conclusions regarding importin α1's role in MN dynamics and genomic instability.

      • State what audience might be interested in and influenced by the reported findings.

      Scientists and health care professionals who research the mechanisms of genomic instability and cancer.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Mitosis, mitotic chromatin decondensation, nuclear reformation, hematopoietic cancers, light microscopy, image analysis.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study explores chromatin organization around trans-splicing acceptor sites (TASs) in the trypanosomatid parasites Trypanosoma cruzi, T. brucei and Leishmania major. By systematically re-analyzing MNase-seq and MNase-ChIP-seq datasets, the authors conclude that TASs are protected by an MNase-sensitive complex that is, at least in part, histone-based, and that single-copy and multi-copy genes display differential chromatin accessibility. Altogether, the data suggest a common chromatin landscape at TASs and imply that chromatin may modulate transcript maturation, adding a new regulatory layer to an unusual gene-expression system.

      I value integrative studies of this kind and appreciate the careful, consistent data analysis the authors implemented to extract novel insights. That said, several aspects require clarification or revision before the conclusions can be robustly supported. My main concerns are listed below, organized by topic/result section.

      TAS prediction

      * Why were TAS predictions derived only from insect-stage RNA-seq data? Restricting TAS calls to one life stage risks biasing predictions toward transcripts that are highly expressed in that stage and may reduce annotation accuracy for lowly expressed or stage-specific genes. Please justify this choice and, if possible, evaluate TAS robustness using additional transcriptomes or explicitly state the limitation.

      TAS predictions were derived only from insect-stage RNA-seq data because a previous study showed that there are no significant differences in 5’UTR processing between T. cruzi life stages (https://doi.org/10.3389/fgene.2020.00166). We are not testing an additional transcriptome here, because the robustness of the software was already tested in the original article where UTRme was described (Radio S, 2018 doi:10.3389/fgene.2018.00671).

      Results - "There is a distinctive average nucleosome arrangement at the TASs in TriTryps": * You state that "In the case of L. major the samples are less digested." However, Supplementary Fig. S1 suggests that replicate 1 of L. major is less digested than the T. brucei samples, while replicate 2 of L. major looks similarly digested. Please clarify which replicates you reference and correct the statement if needed.

      The reviewer has a good point. We made our statement based on the value of the maximum peak of the sequenced DNA molecules, which in general is a good indicator of the extent of digestion achieved by the sample (Cole H, NAR, 2011).

      As the reviewer correctly points out, we should have also considered the length of the DNA molecules in each percentile. However, in this case both the T. brucei and the L. major samples were gel purified before sequencing, and it is hard to know exactly which fragments were left behind in each case. Therefore, it is better not to draw strong conclusions in that regard.

      We have now commented on this in the main manuscript, and we have clarified in the figure legends which data set we used in each case.

      * It appears you plot one replicate in Fig. 1b and the other in Suppl. Fig. S2. Please indicate explicitly which replicate is in each plot. For T. brucei, the NDR upstream of the TAS is clearer in Suppl. Fig. S2 while the TAS protection is less prominent; based on your digestion argument, this should correspond to the more-digested replicate. Please confirm.

      The replicates used for the construction of each figure are explicitly indicated in Table S1. Although the table details the original publication, the project and the accession number for each data set, the reviewer is correct that it was still not completely clear which length distribution heatmap each sample was associated with. To avoid this confusion, we have now added the accession number for each data set to the figure legends and also clarified this in Table S1. Regarding the reviewer’s comment on the correspondence between the observed TAS protection and the extent of sample digestion, the reviewer is correct that for a more digested sample we would expect a clearer NDR. In this case, the difference in the extent of digestion between these two samples is minor: the length of the main peak in the length distribution histogram of sequenced DNA molecules is the same. These two samples, GSM5363006 (represented in Fig. 1b) and GSM5363007 (represented in Fig. S2), belong to the same original paper (Maree et al. 2017), and both were gel purified before sequencing. Therefore, any difference between them could result not only from a minor difference in the digestion level achieved in each experiment but also from a bias in which fragments were included during gel purification. For this reason, we would not draw strong conclusions about TAS protection from this comparison. We have now included a brief comment on this in the discussion of the figure.

      * The protected region around the TAS appears centered on the TAS in T. brucei but upstream in L. major. This is an interesting difference. If it is technical (different digestion or TAS prediction offset), explain why; if likely biological, discuss possible mechanisms and implications.

      We appreciate the reviewer’s suggestion. We cannot be sure whether this is due to technical or biological reasons, but there is evidence that the L. major genome has a different dinucleotide content, which might have an impact on nucleosome assembly. We have now added a comment about this observation in the final discussion of the manuscript.

      Results - "An MNase sensitive complex occupies the TASs in T. brucei":

      * The definition of "MNase activity" and the ordering of samples into Low/Intermediate/High digestion are unclear. Did you infer digestion levels from fragment distributions rather than from controlled experimental timepoints? In Suppl. Fig. S3a it is not obvious how "Low digestion" was defined; that sample's fragment distribution appears intermediate. Please provide objective metrics (e.g., median fragment length, fraction 120-180 bp) used to classify digestion levels.

      As the reviewer suggests, the ideal experiment would be to perform a time course of the MNase reaction with all the samples in parallel, or to work with a fixed time point while adding increasing amounts of MNase. However, even with controlled experimental timepoints, one needs to check the length distribution histogram of the sequenced DNA molecules to be sure which level of digestion has been achieved.

      In this particular case, we used publicly available data sets for this analysis. We defined low, intermediate and high levels of digestion arbitrarily, not as absolute levels of digestion but as a comparative output among the tested samples. We based our definition on the comparison of __the main peak in the length distribution heatmaps, because this parameter is the best metric to estimate the level of digestion of a given sample. It represents the percentage of the total sequenced DNA that has the predominant length in the sample tested.__ Hence, we considered:

      low digestion: when the main peak is longer than the expected protection for a nucleosome (longer than 150 bp). We expect such a sample to contain additional longer bands that correspond to less digested material.

      intermediate digestion: when the main peak is the one expected for the nucleosome core protection (~146-150 bp).

      high digestion: when the main peak is shorter than that (shorter than 146 bp). This case is normally accompanied by a greater dispersion in fragment sizes.
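
      As an illustration only, the comparative criterion above could be sketched in Python as follows. This is a minimal sketch, not the pipeline used for the manuscript: the function names `main_peak` and `classify_digestion` are hypothetical, the input is assumed to be a plain list of fragment lengths in bp (one per sequenced DNA molecule), and the 146 and 150 bp cutoffs are simply the values stated in the definitions above.

```python
from collections import Counter


def main_peak(fragment_lengths):
    """Return (modal fragment length, percentage of fragments with that length).

    Hypothetical helper: ties are resolved in favour of the shorter length.
    """
    if not fragment_lengths:
        raise ValueError("fragment_lengths must not be empty")
    counts = Counter(fragment_lengths)
    length, n = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return length, 100.0 * n / len(fragment_lengths)


def classify_digestion(fragment_lengths, low_above=150, high_below=146):
    """Label a sample by the position of its main peak.

    low_above / high_below are the cutoffs quoted in the reply (150 and 146 bp);
    the labels are comparative, not an absolute measure of digestion.
    """
    peak, _ = main_peak(fragment_lengths)
    if peak > low_above:
        return "low"
    if peak < high_below:
        return "high"
    return "intermediate"
```

      In practice the fragment lengths would come from the insert sizes of properly paired reads; how they are extracted from the alignments is left out of this sketch.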

      To do this analysis, we chose samples that show different MNase protection of the TAS when all the sequenced DNA molecules are plotted relative to this point, and we used this protection as a predictor of the extent of sample digestion (Figure 2). To corroborate our hypothesis that the degree of TAS protection was indeed related to the extent of MNase digestion of a given sample, we looked at the length distribution histogram of the sequenced DNA molecules in each case. It is the best measurement of the extent of digestion achieved, especially when sequencing the whole sample without any gel purification and representing all the reads in the analysis, as we did. The only caveat concerns the sample called “intermediate digestion 1”, which belongs to the original work of Maree 2017, since only this data set was gel purified.

      The sample used in Figure 1 (from Maree 2017) is also from the same lab and is an MNase-seq data set. Strictly speaking, there is no methodological difference between MNase-seq and the input of a native MNase-ChIP-seq, since the input does not undergo the IP.

      * Several fragment distributions show a sharp cutoff at ~100-125 bp. Was this due to gel purification or bioinformatic filtering? State this clearly in Methods. If gel purification occurred, that can explain why some datasets preserve the MNase-sensitive region.

      The sharp cutoff is due neither to gel purification nor to bioinformatic filtering; it simply reflects the length of the paired-end reads used in each case. In earlier works it was most common to sequence only 50 bp; as the technology improved, read lengths went up to 75, 100 or 125 bp. We have now clarified in Table S1 the length of the paired-end reads used in each case when possible.

      * Please reconcile cases where samples labeled as more-digested contain a larger proportion of >200 bp fragments than supposedly less-digested samples; this ordering affects the inference that digestion level determines the loss/preservation of TAS protection. Based on the distributions I see, "Intermediate digestion 1" appears most consistent with an expected MNase curve - please confirm and correct the manuscript accordingly.

      As explained above, it is a common observation in MNase digestion of chromatin that more extensive digestion can still result in a broad range of fragment sizes, including some longer fragments. This seemingly counter-intuitive result is primarily due to the non-uniform accessibility of chromatin and to the sequence preference of MNase, which preferentially cuts AT-rich sequences.

      The rationale is as follows: when you digest chromatin with MNase with the objective of mapping nucleosomes genome-wide, the ideal situation would be to recover all the material in the mononucleosome band. Given that MNase digests protected DNA less efficiently but, if the reaction proceeds further, always ends up destroying part of it, the result is always far from perfect. The best situation we can achieve is to obtain samples where ~80% of the material is contained in the mononucleosome band. __And here comes the main point:__ even in the best scenario, you always get some additional longer bands, such as those for di- or tri-nucleosomes. If you keep digesting, you will get less than 80% in the nucleosome band, and the remaining DNA fragments that used to contain di- and tri-nucleosomes start getting digested as well, giving rise to a greater dispersion in fragment sizes. How do we explain the persistence of long fragments? The longest fragments (di-, tri-nucleosomes) that persist in a highly digested sample are the ones that were originally most highly protected by proteins or higher-order structure, or that have a low AT content, making their linker DNA extremely resistant to initial cleavage. Once the majority of the genome is fragmented, these few resistant longer fragments become a more visible component of the remaining population, contributing to a broader size dispersion. Hence, you end up observing a greater dispersion in length distributions in the final material. The bottom line is that it is not good practice to work with under- or over-digested samples. Our main point is to emphasize that, especially when comparing samples, it is important to compare those with comparable levels of digestion. Otherwise, a different sampling of the genome will be represented in the remaining sequenced DNA.

      Results - "The MNase sensitive complexes protecting the TASs in T. brucei and T. cruzi are at least partly composed of histones": * The evidence that histones are part of the MNase-sensitive complex relies on H3 MNase-ChIP signal in subnucleosomal fragment bins. This seems to conflict with the observation (Fig. 1) that fragments protecting TASs are often nucleosome-sized. Please reconcile these points: are H3 signals confined to subnucleosomal fragments flanking the TAS while the TAS itself is depleted of H3? Provide plots that compare MNase-seq and H3 ChIP signals stratified by consistent fragment-size bins to clarify this.

      What we have learned from other, deeply studied eukaryotic organisms, such as yeast, is that NDRs are normally generated at regulatory points in the genome. In this sense, yeast tRNA genes have a complex formed by TFIIIC-TFIIIB with a bootprint smaller than a nucleosome (Nagarajavel, doi: 10.1093/nar/gkt611). On the other hand, many promoter regions have an MNase-sensitive complex with a nucleosome-size footprint, but it does not contain histones (Chereji et al. 2017, doi:10.1016/j.molcel.2016.12.009). The reviewer is right that in Figures 1 and S2 the footprint of whatever occupies the TAS region, especially in T. brucei, is nucleosome-size. However, this only shows the size; it does not prove the nature of its components. Those are only MNase-seq data sets and, since they do not include a precipitation with specific antibodies, we cannot confirm that the protecting complex is made up of histones. In parallel, a complementary study by Wedel 2017, from Siegel’s lab, shows that when using a properly digested sample and further immunoprecipitating with an anti-H3 antibody, the TAS is not protected by nucleosomes, at least not when analyzing nucleosome-size DNA molecules. In addition, Briggs et al. 2018 (doi: 10.1093/nar/gky928) showed that, at least at intergenic regions, H3 occupancy goes down while R-loop accumulation increases. We have now added a supplemental figure associated with Figure 3 (new Supplemental Figure 5) replotting R-loops and MNase-ChIP-seq for H3 relative to our predicted TAS, showing this anti-correlation and how it partly correlates with MNase protection as well. As a control, we show that the Rpb9 trend resembles that of H3, as Siegel’s lab has shown in Wedel 2018.

      * Please indicate which datasets are used for each panel in Suppl. Fig. S4 (e.g., Wedel et al., Maree et al.), and avoid calling data from different labs "replicates" unless they are true replicates.

      In most of our analyses we used true replicate experiments. Such is the case for the MNase-seq data used in Figure 1, with the corresponding replicate experiments used in Figure S2, and for the T. cruzi MNase-ChIP-seq data used in Figures 3b and 4a, with the respective replicates used in Figures S4 and S5 (now S6 in the revised manuscript). The only case in which we used experiments from two different laboratories is the MNase-ChIP-seq for H3 from T. brucei. Unfortunately, there are only two public data sets, each from a different laboratory. The samples used in Fig. 3 come from Siegel’s lab, whereas the H3 IP represented in S4 and S5 (S6 in the updated version) comes from another lab (Patterton’s). To be more rigorous, we now call them data 1 and 2 when comparing them in this particular case.

      The reviewer is right that in this particular case one is native chromatin (Patterton’s) while the other is crosslinked (Siegel’s). We have now clarified in the main text that, unfortunately, we do not have a true replicate, but the result remains the same under both conditions. This is compatible with my own experience, where crosslinking does not affect global nucleosome patterns (compare the nucleosome organization from crosslinked chromatin MNase-seq inputs in Chereji, Mol Cell, 2017, doi: 10.1016/j.molcel.2016.12.009, with native MNase-seq in Ocampo, NAR, 2016, doi: 10.1093/nar/gkw068).

      * Several datasets show a sharp lower bound on fragment size in the subnucleosomal range (e.g., ~80-100 bp). Is this a filtering artifact or a gel-size selection? Clarify in Methods and, if this is an artifact, consider replotting after removing the cutoff.

      We have only filtered adapter dimers or overrepresented sequences when needed. In Figures 2 and S3 we represented all the sequenced reads. In other figures, when we sort fragment sizes in silico (for example into nucleosome, dinucleosome or subnucleosome size ranges), we make a note in the figure legends. What the reviewer points out is related to the length of the sequenced DNA fragments in each experiment. As we explained above, the older data sets were generated with 50 bp paired-end reads, while the newer ones use 75, 100 or 125 bp. This information is now clarified in Table S1.
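
      For readers who want to reproduce the in silico size sorting, a minimal Python sketch is given below. It is an illustration under stated assumptions, not our actual code: the names `SIZE_BINS`, `size_class` and `split_by_size` are hypothetical, fragments are assumed to be (chromosome, start, end) tuples with 0-based half-open coordinates, and the size ranges are the ones quoted from the manuscript elsewhere in this reply (50-120 bp, 120-180 bp and 180-300 bp). Treating each range as half-open is a choice made here so that the bins do not overlap.

```python
# Size ranges in bp as quoted from the manuscript; each range is treated as
# half-open [low, high), which is an assumption of this sketch.
SIZE_BINS = {
    "subnucleosome": (50, 120),
    "nucleosome": (120, 180),
    "longer": (180, 300),
}


def size_class(length):
    """Return the name of the size bin for a fragment length, or None."""
    for name, (low, high) in SIZE_BINS.items():
        if low <= length < high:
            return name
    return None


def split_by_size(fragments):
    """Group (chrom, start, end) fragments by size class.

    Coordinates are assumed to be 0-based and half-open, so length = end - start.
    Fragments that fall outside every bin are dropped.
    """
    groups = {name: [] for name in SIZE_BINS}
    for chrom, start, end in fragments:
        name = size_class(end - start)
        if name is not None:
            groups[name].append((chrom, start, end))
    return groups
```

      Each group can then be used to build a separate occupancy profile relative to the TAS, which is how the size-stratified panels should be read.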

      __Results - "The TASs of single and multi-copy genes are differentially protected by nucleosomes":__

      * Please include T. brucei RNA-seq data in Suppl. Fig. S5b as you did for T. cruzi.

      We have shown chromatin organization for T. brucei in S5b to show that there is a similar trend. Unfortunately, we did not obtain a list of multi-copy genes for T. brucei as robust as the one we obtained for T. cruzi; therefore, we do not want to overinterpret by showing the RNA-seq for these subsets of genes. The limitation is related to the fact that UTRme restricts the search and is extremely strict when calling sites in repetitive regions.

      * Discuss how low or absent expression of multigene families affects TAS annotation (which relies on RNA-seq) and whether annotation inaccuracies could bias the observed chromatin differences.

      Mapping reads and annotating sites in repetitive regions is highly complex. UTRme is specially designed to avoid overcalling those sites. In other words, there is a chance that we are underestimating the number of predicted TASs at multi-copy genes. Regarding the impact on chromatin analysis, we cannot rule out an effect, but the observation favors our conclusion: even though some TASs at multi-copy genes may remain elusive, we observe more nucleosome density at those places.

      * The statement that multi-copy genes show an "oscillation" between AT and GC dinucleotides is not clearly supported: the multi-copy average appears noisier and is based on fewer loci. Please tone down this claim or provide statistical support that the pattern is periodic rather than noisy.

      We have fixed this now in the preliminary revised version

      * How were multi-copy genes defined in T. brucei? Include the classification method in Methods.

      This classification was done the same way it was explained for T. cruzi

      Genomes and annotations: * If transcriptomic data for the Y strain was used for T. cruzi, please explain why a Y strain genome was not used (e.g., Wang et al. 2021 GCA_015033655.1), or justify the choice. For T. brucei, consider the more recent Lister 427 assembly (Tb427_2018) from TriTrypDB. Use strain-matched genomes and transcriptomes when possible, or discuss limitations.

      The most appropriate way to analyze high-throughput data is to align it to the genome of the strain in which the experiments were conducted. This was clearly illustrated in a previous publication from our group, where we explained how data from the hybrid CL Brener strain should be analyzed. A common practice in the past was to use only the Esmeraldo-like genome for simplicity, but this resulted in output artifacts. Therefore, we aligned the data to the CL Brener genome and then focused the main analysis on the Esmeraldo haplotype (Beati, PLoS ONE, 2023). Ideally, we would have had transcriptomic data for the same strain (CL Brener or Esmeraldo). Since this was not the case at that moment, we used data from the Y strain, which belongs to the same DTU as Esmeraldo.

      In the case of T. brucei, when we started our analysis and the UTRme code was written, only the previous version of the genome was available. When the 2018 version was released, we checked the chromatin parameters and observed that the main observations did not change. Therefore, we continued working with our previous setup.

      Reproducibility and broader integration: * Please share the full analysis pipeline (ideally on GitHub/Zenodo) so the results are reproducible from raw reads to plots.

      We are preparing a full pipeline on GitHub. We will make it available before the full revision of the manuscript.

      * As an optional but helpful expansion, consider including additional datasets (other life stages, BSF MNase-seq, ATAC-seq, DRIP-seq) where available to strengthen comparative claims.

      We are now including a new supplemental figure with DRIP-seq and Rpb9 ChIP-seq (revised S5). Additionally, we added a new panel c to Figure 4, representing FAIRE-seq data for T. cruzi for single and multi-copy genes.

      We are working on ATAC-seq analysis and BSF MNase-seq

      Optional analyses that would strengthen the study: * Stratify single-copy genes by expression (high / medium / low) and examine average nucleosome occupancy at TASs for each group; a correlation between expression and NDR depth would strengthen the functional link to maturation.

      We have now included a panel in supplemental Figure 5 (now revised S6) showing the concordance of chromatin organization relative to the TAS for genes stratified by RNA-seq levels.
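
      To make the stratification step concrete, a minimal Python sketch follows. It is only an illustration: the function name `stratify_by_expression`, the gene identifiers and the expression values are hypothetical, and splitting genes into three equally sized groups by expression rank is an assumption of this sketch rather than a description of the exact criterion used for the new panel.

```python
def stratify_by_expression(expression, labels=("low", "medium", "high")):
    """Split genes into equally sized groups by expression rank.

    expression: dict mapping gene ID -> expression value (e.g. TPM).
    Genes with equal values are ordered by gene ID so the result is deterministic.
    """
    if not expression:
        raise ValueError("expression must contain at least one gene")
    ranked = sorted(expression, key=lambda gene: (expression[gene], gene))
    groups = {label: [] for label in labels}
    for rank, gene in enumerate(ranked):
        # Map the rank onto one of len(labels) consecutive, equally sized slices.
        index = rank * len(labels) // len(ranked)
        groups[labels[index]].append(gene)
    return groups


# Hypothetical example values, for illustration only.
tpm = {"g1": 0.5, "g2": 120.0, "g3": 8.0, "g4": 35.0, "g5": 2.0, "g6": 400.0}
groups = stratify_by_expression(tpm)
```

      The average nucleosome occupancy around the TAS can then be computed separately for the genes in each group and the three profiles compared.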

      __Minor / editorial comments:__ * In the Introduction, the sentence "transcription is initiated from dispersed promoters and in general they coincide with divergent strand switch regions" should be qualified: such initiation sites also include single transcription start regions.

      We have clarified this in the preliminary revised version

      * Define the dotted line in length distribution plots (if it is not the median, please clarify) and consider placing it at 147 bp across plots to ease comparison.

      The dotted line is just to indicate where the maximum peak is located. It is now clarified in figure legends.

      * In Suppl. Fig. 4b "Replicate2" the x-axis ticks are misaligned with labels - please fix.

      We have now fixed the figure. Thanks for noticing this mistake.

      * Typo in the Introduction: "remodellingremodeling" → "remodeling"

      Thanks for noticing this mistake, it is fixed in the current version of the manuscript

      **Referee cross-commenting** Comment 1: I think Reviewer #2 and Reviewer #3 missed that the authors of this manuscript do cite and consider the results from Wedel et al. 2017. They even re-analysed their data (e.g. Figure 3a). I second Reviewer #2's comment indicating that the inclusion of a schematic figure to help readers visualize and better understand the findings would be an important addition.

      Comment 2: I agree with Reviewer #3 that the use of different MNase digestion procedures in the different datasets has to be considered. On the other hand, I don't think there is a problem with figure 1 showing an MNase-protected TAS for T. brucei as it is based on MNase-seq data and reproduces the reported results (Maree et al. 2017). What the Siegel lab did in Wedel et al. 2017 was MNase-ChIPseq of H3 showing nucleosome depletion at TAS, but both results are not necessarily contradictory: There could still be something else (which does not contain H3) sitting on the TAS protecting it from MNase digestion.

      Reviewer #1 (Significance (Required)):

      This study provides a systematic comparative analysis of chromatin landscapes at trans-splicing acceptor sites (TASs) in trypanosomatids, an area that has been relatively underexplored. By re-analyzing and harmonizing existing MNase-seq and MNase-ChIP-seq datasets, the authors highlight conserved and divergent features of nucleosome occupancy around TASs and propose that chromatin contributes to the fidelity of transcript maturation. The significance lies in three aspects: 1. Conceptual advance: It broadens our understanding of gene regulation in organisms where transcription initiation is unusual and largely constitutive, suggesting that chromatin can still modulate post-transcriptional processes such as trans-splicing. 2. Integrative perspective: Bringing together data from T. cruzi, T. brucei and L. major provides a comparative framework that may inspire further mechanistic studies across kinetoplastids. 3. Hypothesis generation: The findings open testable avenues about the role of chromatin in coordinating transcript maturation, the contribution of DNA sequence composition, and potential interactions with R-loops or RNA-binding proteins. Researchers in parasitology, chromatin biology, and RNA processing will find it a useful resource and a stimulus for targeted experimental follow-up.

      My expertise is in gene regulation in eukaryotic parasites, with a focus on bioinformatic analysis of high-throughput sequencing data

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)):__

      Siri et al. perform a comparative analysis using publicly available MNase-seq data from three trypanosomatids (T. brucei, T. cruzi, and Leishmania), showing that a similar chromatin profile is observed at TAS (trans-splicing acceptor site) regions. The original studies had already demonstrated that the nucleosome profile at TAS differs from the rest of the genome; however, this work fills an important gap in the literature by providing the most reliable cross-species comparison of nucleosome profiles among the tritryps. To achieve this, the authors applied the same computational analysis pipeline and carefully evaluated MNase digestion levels, which are known to influence nucleosome profiling outcomes.

      In my view, the main conclusion is that the profiles are indeed similar-even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms. The manuscript could be improved with some clarifications and adjustments:

      1. The authors state from the beginning that available MNase data indicate altered nucleosome occupancy around the TAS. However, they could also emphasize that the conclusions across the different trypanosomatids are inconsistent and even contradictory: NDR in T. cruzi versus protection-in different locations-in T. brucei and Leishmania.

      We start our manuscript by referring to the first MNase-seq data sets publicly available for each TriTryp, and we point out that one of the main observations in each of them is a change in nucleosome density or occupancy at intergenic regions. In T. cruzi, in a previous publication from our group, we established that this intergenic drop in nucleosome density occurs near the trans-splicing acceptor site. In this work, we extend our study to the other members of the TriTryps: T. brucei and L. major.

      In T. brucei, the papers from Patterton’s lab and Siegel’s lab came out almost simultaneously in 2017. Hence, they do not comment on each other’s work. The first claims the presence of a well-positioned nucleosome at the TAS by using MNase-seq, while the second shows an NDR at the TAS by using MNase-ChIP-seq. However, we do not think they are contradictory or inconsistent. We brought them together throughout the manuscript because we think these works provide complementary information.

      On the one hand, we infer that the data from Patterton’s lab are slightly less digested than the sample from Siegel’s lab. Therefore, we argue that this moderate digestion must be the reason why they managed to detect an MNase-protecting complex sitting at the TAS (Figure 1). On the other hand, Siegel’s lab includes an additional step by performing MNase-ChIP-seq, showing that when analyzing nucleosome-size fragments, histones are not detected at the TAS. Here, we go further with this analysis in Figure 3, showing that only when looking at subnucleosome-size fragments are we able to detect histone H3. This is also true for T. cruzi.

      By integrating every analysis in this work and the previous ones, we propose that TASs are protected by an MNase-sensitive complex (probed in Figure 2). This complex is most likely only partly formed by histones, since we can detect histone H3 only when analyzing subnucleosome-size DNA molecules (Figure 3). To be absolutely sure that the complex is not entirely made up of histones, future studies should perform MNase-ChIP-seq with less digested samples. However, it was previously shown that R-loops are enriched at those intergenic NDRs (Briggs, 2018, doi: 10.1093/nar/gky928) and that R-loops have plenty of interacting proteins (Girasol, 2023, doi: 10.1093/nar/gkad836). Therefore, this MNase-sensitive complex most likely has a hybrid nature, made up of H3 and some other regulatory molecules, possibly involved in trans-splicing. We have now added a new Figure S5 showing R-loop co-localization with the NDR.

      Regarding the comparison between different organisms, after explaining the sensitivity to MNase of the TAS-protecting complex, we discuss that, when comparing equally digested samples, T. cruzi and T. brucei display a similar chromatin landscape with a mild NDR at the TAS (see T. cruzi represented in Figure 1 compared to T. brucei represented in Intermediate digestion 2 in Figure 2, intermediate digestion in the revised manuscript). Unfortunately, we cannot make a good comparison with L. major, since we do not have a sample with a similar level of digestion.

      Another point that requires clarification concerns what the authors mean in the introduction and discussion when they write that trypanosomes have "...poorly organized chromatin with nucleosomes that are not strikingly positioned or phased." On the other hand, they also cite evidence of organization: "...well-positioned nucleosome at the spliced-out region.. in Leishmania (ref 34)"; "...a well-positioned nucleosome at the TASs for internal genes (ref37)"; "...a nucleosome depletion was observed upstream of every gene (ref 35)." Aren't these examples of organized chromatin with at least a few phased nucleosomes? In addition, in ref 37, figure 4 shows at least two (possibly three to four) nucleosomes that appear phased. In my opinion, the authors should first define more precisely what they mean by "poorly organized chromatin" and clarify that this interpretation does not contradict the findings highlighted in the cited literature.

      For a better understanding of nucleosome positioning and phasing I recommend the review by Clark 2010 (doi:10.1080/073911010010524945, Figure 4). Briefly, in a cell population there are different alternative positions that a given nucleosome can adopt. However, some are more favorable. When talking about favorable positions, we refer to the coordinates in the genome that are most likely covered by a nucleosome and are predominant in the cell population. Additionally, nucleosomes can be phased or not. This refers not only to the position in the genome but also to the distance relative to a given point. In yeast, or in highly transcribed genes of more complex eukaryotes, nucleosomes are regularly spaced and phased relative to the transcription start site (TSS) or to the +1 nucleosome (Ocampo, NAR, 2016, doi:10.1093/nar/gkw068). In trypanosomes, nucleosomes show some regular distribution on browser inspection but, given that they are not properly phased with respect to any point, it is almost impossible to estimate spacing from paired-end data. This is also consistent with chromatin that is transcribed in an almost constitutive manner.

      As the reviewer mentions, we do cite evidence of organization. We think the original observations are correct, but we do not fully agree with some of the original statements. In this manuscript, our aim is to take the best of what we learned from the original works and to make a constructive contribution to the original discussions. In this regard, in trypanosomes there are some conserved patterns in the chromatin landscape, but their nucleosomes are far from being well-positioned or phased. For a better understanding, compare the variations observed on the y axis when representing average nucleosome occupancy in yeast with those observed in trypanosomes: the troughs and peaks are much more prominent in yeast than in any TriTryp member.

      Following the reviewer’s suggestion we have now clarified this in the main text

      The paper would also benefit from the inclusion of a schematic figure to help readers visualize and better understand the findings. What is the biological impact of having nucleosomes, di-nucleosomes, or sub-nucleosomes at TAS? This is not obvious to readers outside the chromatin field. For example, the following statement is not intuitive: "We observed that, when analyzing nucleosome-size (120-180 bp) DNA molecules or longer fragments (180-300 bp), the TASs of either T. cruzi or T. brucei are mostly nucleosome-depleted. However, when representing fragments smaller than a nucleosome-size (50-120 bp) some histone protection is unmasked (Fig. 3 and Fig. S4). This observation suggests that the MNase sensitive complex sitting at the TASs is at least partly composed of histones." Please clarify.

      We appreciate the reviewer’s suggestion to make a schematic figure. We are working on this, and it will be added to the manuscript upon final revision.

      Regarding the biological impact of having mono-, di- or subnucleosome fragments, it is important to determine the fragment size of the protected DNA in order to infer the nature of the protecting complex. In the case of tRNA genes in yeast, footprints smaller than a nucleosome were found at pol III promoters, and they turned out to be TFIIIB-TFIIIC (Nagarajavel, doi: 10.1093/nar/gkt611). Therefore, detecting something smaller than a nucleosome might suggest the binding of trans-acting factors other than histones, or of histones in a mixed complex. Such mixed complexes have also been observed, as in the case of the centromeric nucleosome, which has a very peculiar composition (Ocampo and Clark, Cell Reports, 2015). On the other hand, if we instead detect bigger fragments, it could indicate the presence of bigger protecting molecules or that those regions are part of a higher-order chromatin organization still inaccessible to MNase linker digestion.

      Here we show on 2D plots that the complex or components protecting the TAS are nucleosome-size, but we cannot be certain that they are entirely made up of histones, since we are able to detect histone H3 only when looking at subnucleosome-size fragments. We have now added part of this explanation to the discussion.

      By integrating every analysis in this work and the previous ones, we propose that the TAS is protected by an MNase-sensitive complex (Figure 2). This complex is most likely only partly formed by histones, since we can detect histone H3 only when analyzing subnucleosome-size DNA molecules (Figure 3). As explained above, to be absolutely sure that the complex is not entirely made up of histones, future studies should perform MNase-ChIP-seq with less digested samples. However, it was previously shown that R-loops are enriched at those intergenic NDRs (Briggs 2018) and that R-loops have plenty of interacting proteins (Girasol, 2023). Therefore, this MNase-sensitive complex most likely has a hybrid nature, made up of H3 and some other regulatory molecules. We have now added a new Figure S5 showing R-loop co-localization.

      Some references are missing or incorrect:

      We will make a thorough revision.

      "In trypanosomes, there are no canonical promoter regions." - please check Cordon-Obras et al. (Navarro's group). Thank you for the appropriate suggestion.

      We have now added this reference

      Please, cite the study by Wedel et al. (Siegel's group), which also performed MNase-seq analysis in T. brucei.

      We understand that Reviewer #2 missed that we cited this reference and that we did use the raw data from the manuscript of Wedel et al. 2017 from Siegel’s group. We used the MNase-ChIP-seq data set for histone H3 in our analysis for Figures 3, S4b and S5b (S6c in the revised version), as also detailed in Table S1. To be even more explicit, we have now included the accession number of each data set in the figure legends.

      Figure-specific comments: Fig. S3: Why does the number of larger fragments increase with greater MNase digestion? Shouldn't the opposite be expected?

      This is a good observation. As we also explained to Reviewer #1:

      It's a common observation in MNase digestion of chromatin that more extensive digestion can still result in a broad range of fragment sizes, including some longer fragments. This seemingly counter-intuitive result is primarily due to the non-uniform accessibility of chromatin and the sequence preference of the MNase enzyme.

      The rationale is as follows: when you digest chromatin with MNase with the objective of mapping nucleosomes genome-wide, the ideal situation would be to recover all the material in the mononucleosome band. Given that MNase digests protected DNA less efficiently but, if the reaction proceeds further, always ends up destroying part of it, the result is always far from perfect. The best situation we can achieve is to obtain samples where ~80% of the material is contained in the mononucleosome band. __And here comes the main point:__ even in the best scenario, you always have some additional longer bands, such as those for di- or tri-nucleosomes. If you keep digesting, you will get less than 80% in the nucleosome band, and the remaining DNA fragments that used to contain di- and tri-nucleosomes start getting digested as well, giving rise to a greater dispersion in fragment sizes. How do we explain the persistence of long fragments? The longest fragments (di-, tri-nucleosomes) that persist in a highly digested sample are the ones that were originally most highly protected by proteins or higher-order structure, making their linker DNA extremely resistant to initial cleavage. Once the majority of the genome is fragmented, these few resistant longer fragments become a more visible component of the remaining population, contributing to a broader size dispersion. Hence, you end up with a greater dispersion in length distributions in the final material. The bottom line is that it is not good practice to work with under- or over-digested samples. Our main point is to emphasize that, especially when comparing samples, it is important to compare those with comparable levels of digestion. Otherwise, a different sampling of the genome will be represented in the remaining sequenced DNA.

      Fig. S5B: Why not use MNase conditions under which T. cruzi and T. brucei display comparable profiles at TAS? This would facilitate interpretation.

      The reviewer made a reasonable observation. We used MNase-ChIP-seq rather than MNase-seq alone to test occupancy at the TASs of these gene subsets because we wanted to be more certain that the signal reflects the presence of histones rather than something else. By immunoprecipitating histone H3, we can see that this protein is present at multi-copy genes when looking at nucleosome-size fragments. Additionally, as shown in Figure S4b, the length distribution histograms are also similar for the compared IPs.

      Minor points:

      There are several typos throughout the manuscript.

      Thanks for the observation. We will check carefully.

      Methods: "Dinucelotide frecuency calculation."

      We will add the code to GitHub.

      Reviewer #2 (Significance (Required)):

      In my view, the main conclusion is that the profiles are indeed similar, even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms.

      Audience: basic science and specialized readers.

      Expertise: epigenetics and gene expression in trypanosomatids.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors analysed publicly accessible MNase-seq data in TriTryps parasites, focusing on the chromatin structure around trans-splicing acceptor sites (TASs), which are vital for processing gene transcripts. They describe a mild nucleosome depletion at the TAS of T. cruzi and L. major, whereas a histone-containing complex protects the TASs of T. brucei. In the subsequent analysis of T. brucei, they suggest that an MNase-sensitive complex is localised at the TASs. For single-copy versus multi-copy genes, the authors show different di-nucleotide patterns and chromatin structures. Accordingly, they propose this difference could be a novel mechanism to ensure the accuracy of trans-splicing in these parasites.

      Before providing an in-depth review of the manuscript, I note that some missing information would have helped in assessing the study more thoroughly; however, in the light of the available information, I provide the following comments for consideration.

      The numbering of the figures, including the figure legends, is missing in the PDF file. This is essential for assessing the provided information.

      We apologize for not including the figure numbers in the main text, although the figures are located in the right place when called in the text. The omission occurred unintentionally when the figure legends were moved to the end of the main text. This is now fixed in the updated version of the manuscript.

      The publicly available MNase-seq data are manyfold, with multiple datasets available for T. cruzi, for example. It is unclear from the manuscript which dataset was used for which figure. This must be clarified.

      This was detailed in Table S1. We have now replaced the table with an improved version, and we have also included the accession number of each data set used in the figure legends.

      Why do the authors start in figure 1 with the description of an MNase-protected TAS for T. brucei, given that it has been clearly shown by the Siegel lab that there is a nucleosome depletion similar to other parasites?

      We did not want to ignore the paper from Patterton’s lab because it was the first to map nucleosomes genome-wide in T. brucei, and its main finding was the existence of a well-positioned nucleosome at intergenic regions, which we think is a point worth discussing. While Patterton’s work uses MNase-seq from gel-purified samples and provides replicated experiments sequenced at very good depth, Siegel’s lab uses MNase-ChIP-seq of histone H3 but performs only one experiment, and its input was not sequenced. Each work therefore has its own caveats and provides different information; together they contribute to a more comprehensive study. We think that bringing both data sets into the discussion, as we have done in Figures 1 and 3, helps us and the community working in the field to enrich the discussion.

      If the authors re-analyse the data, they should compare their pipeline to those used in the other studies, highlighting differences and potential improvements.

      We are working on this point. We will provide a more detailed description in the final revision.

      Since many figures resemble those in already published studies, there seems little reason to repeat and compare without a detailed comparison of the pipelines and their differences.

      Following the reviewer’s advice, we are now working on highlighting the main differences that justify analyzing the data the way we did; these will be added to the revised Methods section.

      At first glance, some of our figures might look similar to those in the original manuscripts. However, a careful and detailed reading of our manuscript shows that we have added several analyses that unveil information not disclosed before.

      First, we perform a systematic comparison, analyzing every data set the same way from beginning to end; the main difference from previous studies is the thorough and precise prediction of TASs for the three organisms. Second, we represent the average chromatin organization relative to those predicted TASs for TriTryps and discuss their global patterns. Third, by representing the average chromatin organization as heatmaps, we show for the first time that the average nucleosome landscapes are not just an average: a similar organization is maintained across most of the genome. This was not done in any of the previous manuscripts except our own (Beati, PLOS One 2023). Additionally, we introduce the discussion of how the extent of the MNase reaction can affect the output of these experiments, and we show 2D plots and length distribution heatmaps to discuss this point (a point completely ignored in the chromatin literature for trypanosomes). Furthermore, we carried out a far-reaching analysis by considering the contributions of each published work, even when addressed by different techniques. Finally, we discuss our findings in the context of a topic of current interest in the field, namely TriTryps genome compartmentalization.

      Several previous MNase-seq analysis studies addressing chromatin accessibility emphasized the importance of using varying degrees of chromatin digestion, from low to high digestion (30496478, 38959309, 27151365).

      The reviewer is correct, and this point is exactly what we intended to illustrate in Figure 2. We appreciate the suggested references, which we now cite in the final discussion. Just to clarify, using varying degrees of chromatin digestion is useful for drawing conclusions about a given organism, but when comparing samples, strains, histone marks, etc., it is extremely important to select similarly digested samples.

      No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies. This key information cannot be inferred from the length distribution of the sequenced reads.

      The reviewer is correct that “No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies”, and this is another reason why it is important for our analysis to be published and discussed by the scientific community working on trypanosomes. We disagree with the reviewer’s second statement, since the level of digestion of a sequenced sample is in fact assessed by representing the length distribution of the total DNA sequenced. It is true that before sequencing you can, and should, check the level of digestion of the purified samples on an agarose gel and/or a Bioanalyzer. It can also be checked after library preparation but before sequencing, in which case the fragment sizes are expected to increase by the length of the library adapters. However, the final test of success when working with MNase-digested samples is to analyze the length of the DNA molecules by plotting histograms of the length distribution of the sequenced DNA molecules. Remarkably, different samples may on occasion look very similar when run on a gel yet render different length distribution histograms. This is because the nucleosome core may be intact while the associated linker DNA has been trimmed to different extents, or the DNA may even have been cleaved internally (see Cole Hope 2011, section 5.2, doi: 10.1016/B978-0-12-391938-0.00006-9, for a detailed explanation).
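
      As an illustration only, and not part of the original analysis, the kind of summary discussed here (and requested by Referee #1 as "median fragment length, fraction 120-180 bp") could be computed from a list of sequenced fragment lengths roughly as follows. This is a minimal sketch: the function name, the returned keys and the inclusive 120-180 bp boundaries are assumptions made for the example.

```python
from statistics import median


def digestion_summary(lengths, mono_range=(120, 180)):
    """Summarise a fragment-length distribution (lengths in bp).

    mono_range is the nucleosome-size window; both bounds are treated
    as inclusive here, which is an assumption of this sketch.
    """
    if not lengths:
        raise ValueError("no fragment lengths given")
    lo, hi = mono_range
    n = len(lengths)
    return {
        "median_bp": median(lengths),
        "frac_mono": sum(lo <= x <= hi for x in lengths) / n,
        "frac_longer": sum(x > hi for x in lengths) / n,
        "frac_shorter": sum(x < lo for x in lengths) / n,
    }


# Toy input: one sub-nucleosomal, three nucleosome-size and one longer fragment.
summary = digestion_summary([100, 147, 150, 160, 320])
```

      Comparing such summaries across samples is one possible way to check that they have comparable levels of digestion before their profiles are compared.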

      As the input material are selected, in part gel-purified mono-nucleosomal DNA bands. Furthermore the datasets are not directly comparable, as some use native MNase, while others employ MNase after crosslinking; some involve short digestion times at 37 °C, while others involve longer digestion at lower temperatures. Combining these datasets to support the idea of an MNase-sensitive complex at the TAS of T. brucei therefore may not be appropriate, and additional experiments using consistent methodologies would strengthen the study's conclusions.

      In my opinion, describing an MNase-sensitive complex based solely on these data is not feasible. It requires specifically designed experiments using a consistent method and well-defined MNase digestion kinetics.

      As the reviewer suggests, the ideal experiment would be to perform a time course of the MNase reaction with all the samples in parallel, or to work with a fixed time point while adding increasing amounts of MNase. However, the information obtained from a detailed analysis of the length distribution histogram of the sequenced DNA molecules is the best test of the real outcome. In fact, the samples with different digestion levels were probably not generated on purpose.

      The only data sets that were gel purified are those from Maree 2017 (Patterton’s lab), used in Figures 1, S1 and S2, and those from L. major shown in Fig. 1. Gel purification was common practice in those years; we have since learned that it is not necessary, because fragment sizes can be sorted later in silico when needed.
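
      For readers unfamiliar with in silico size selection, the idea can be sketched as below. This is an illustrative example, not the authors' pipeline: the size classes (50-120, 120-180 and 180-300 bp) are those quoted from the manuscript elsewhere in this review, while the class names, the half-open treatment of the boundaries and the `(chrom, start, end)` fragment representation are assumptions.

```python
from collections import defaultdict

# (name, lower bound inclusive, upper bound exclusive), in bp.
SIZE_CLASSES = (
    ("sub-nucleosomal", 50, 120),
    ("nucleosomal", 120, 180),
    ("longer", 180, 300),
)


def size_class(length):
    """Return the size-class name for a fragment length, or None if outside all classes."""
    for name, lo, hi in SIZE_CLASSES:
        if lo <= length < hi:
            return name
    return None


def sort_by_size(fragments):
    """Group (chrom, start, end) fragments by size class.

    Coordinates are assumed 0-based and half-open, so length = end - start.
    Fragments outside every class are dropped.
    """
    binned = defaultdict(list)
    for chrom, start, end in fragments:
        name = size_class(end - start)
        if name is not None:
            binned[name].append((chrom, start, end))
    return dict(binned)
```

      Each size class can then be profiled separately around the TASs, which is how a single non-gel-purified sample can yield nucleosome-size and sub-nucleosome-size tracks.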

      As we explained to reviewer #1, to avoid this conflict we decided to remove these data from Figures 2 and S3. In summary, the three remaining samples come from the same lab and belong to the same publication (Maree 2022). These samples are the inputs of native MNase-ChIP-seq, were obtained in the same way, and are fully comparable with each other.

      Reviewer #3 (Significance (Required)):

      Due to the lack of controlled MNase digestion, use of heterogeneous datasets, and absence of benchmarking against previous studies, the conclusions regarding MNase-sensitive complexes and their functional significance remain speculative. With standardized MNase digestion and clearly annotated datasets, this study could provide a valuable contribution to understanding chromatin regulation in TriTryps parasites.

      As explained in the previous point, our conclusions are valid because we do not compare, in any figure, samples that underwent different treatments. The only exception could be Figure 3, when discussing MNase-ChIP-seq. We have now added a clear and explicit comment in that section and in the discussion stating that, despite subtle differences in experimental procedures, we arrive at the same results. This is the case for the T. cruzi IP, performed on crosslinked chromatin, compared with the T. brucei IP, performed on native chromatin.

      Over the years, it has been observed in the chromatin field that nucleosomes are so tightly bound to DNA that crosslinking is not necessary. However, it is still common practice, especially when performing IPs. In our own hands, we did not observe any difference at the global level, either in T. cruzi or in my previous work with yeast.

      ...

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors analysed publicly accessible MNase-seq data in TriTryps parasites, focusing on the chromatin structure around trans-splicing acceptor sites (TASs), which are vital for processing gene transcripts. They describe a mild nucleosome depletion at the TAS of T. cruzi and L. major, whereas a histone-containing complex protects the TASs of T. brucei. In the subsequent analysis of T. brucei, they suggest that an MNase-sensitive complex is localised at the TASs. For single-copy versus multi-copy genes, the authors show different di-nucleotide patterns and chromatin structures. Accordingly, they propose this difference could be a novel mechanism to ensure the accuracy of trans-splicing in these parasites.

      Before providing an in-depth review of the manuscript, I note that some missing information would have helped in assessing the study more thoroughly; however, in the light of the available information, I provide the following comments for consideration.

      The numbering of the figures, including the figure legends, is missing in the PDF file. This is essential for assessing the provided information.

      The publicly available MNase-seq data are manyfold, with multiple datasets available for T. cruzi, for example. It is unclear from the manuscript which dataset was used for which figure. This must be clarified.

      Why do the authors start in figure 1 with the description of an MNase-protected TAS for T. brucei, given that it has been clearly shown by the Siegel lab that there is a nucleosome depletion similar to other parasites?

      If the authors re-analyse the data, they should compare their pipeline to those used in the other studies, highlighting differences and potential improvements. Since many figures resemble those in already published studies, there seems little reason to repeat and compare without a detailed comparison of the pipelines and their differences.

      Several previous MNase-seq analysis studies addressing chromatin accessibility emphasised the importance of using varying degrees of chromatin digestion, from low to high digestion (30496478, 38959309, 27151365).

      No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies. This key information cannot be inferred from the length distribution of the sequenced reads. As the input material are selected, in part gel-purified mono-nucleosomal DNA bands. Furthermore the datasets are not directly comparable, as some use native MNase, while others employ MNase after crosslinking; some involve short digestion times at 37 °C, while others involve longer digestion at lower temperatures. Combining these datasets to support the idea of an MNase-sensitive complex at the TAS of T. brucei therefore may not be appropriate, and additional experiments using consistent methodologies would strengthen the study's conclusions.

      In my opinion, describing an MNase-sensitive complex based solely on these data is not feasible. It requires specifically designed experiments using a consistent method and well-defined MNase digestion kinetics.

      Significance

      Due to the lack of controlled MNase digestion, use of heterogeneous datasets, and absence of benchmarking against previous studies, the conclusions regarding MNase-sensitive complexes and their functional significance remain speculative. With standardized MNase digestion and clearly annotated datasets, this study could provide a valuable contribution to understanding chromatin regulation in TriTryps parasites.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Siri et al. perform a comparative analysis using publicly available MNase-seq data from three trypanosomatids (T. brucei, T. cruzi, and Leishmania), showing that a similar chromatin profile is observed at TAS (trans-splicing acceptor site) regions. The original studies had already demonstrated that the nucleosome profile at TAS differs from the rest of the genome; however, this work fills an important gap in the literature by providing the most reliable cross-species comparison of nucleosome profiles among the tritryps. To achieve this, the authors applied the same computational analysis pipeline and carefully evaluated MNase digestion levels, which are known to influence nucleosome profiling outcomes.

      In my view, the main conclusion is that the profiles are indeed similar, even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms.

      The manuscript could be improved with some clarifications and adjustments:

      1. The authors state from the beginning that available MNase data indicate altered nucleosome occupancy around the TAS. However, they could also emphasize that the conclusions across the different trypanosomatids are inconsistent and even contradictory: NDR in T. cruzi versus protection (in different locations) in T. brucei and Leishmania.
      2. Another point that requires clarification concerns what the authors mean in the introduction and discussion when they write that trypanosomes have "...poorly organized chromatin with nucleosomes that are not strikingly positioned or phased." On the other hand, they also cite evidence of organization: "...well-positioned nucleosome at the spliced-out region.. in Leishmania (ref 34)"; "...a well-positioned nucleosome at the TASs for internal genes (ref37)"; "...a nucleosome depletion was observed upstream of every gene (ref 35)." Aren't these examples of organized chromatin with at least a few phased nucleosomes? In addition, in ref 37, figure 4 shows at least two (possibly three to four) nucleosomes that appear phased. In my opinion, the authors should first define more precisely what they mean by "poorly organized chromatin" and clarify that this interpretation does not contradict the findings highlighted in the cited literature.
      3. The paper would also benefit from the inclusion of a schematic figure to help readers visualize and better understand the findings. What is the biological impact of having nucleosomes, di-nucleosomes, or sub-nucleosomes at TAS? This is not obvious to readers outside the chromatin field. For example, the following statement is not intuitive: "We observed that, when analyzing nucleosome-size (120-180 bp) DNA molecules or longer fragments (180-300 bp), the TASs of either T. cruzi or T. brucei are mostly nucleosome-depleted. However, when representing fragments smaller than a nucleosome-size (50-120 bp) some histone protection is unmasked (Fig. 3 and Fig. S4). This observation suggests that the MNase sensitive complex sitting at the TASs is at least partly composed of histones." Please clarify.

      Some references are missing or incorrect:

      "In trypanosomes, there are no canonical promoter regions." - please check Cordon-Obras et al. (Navarro's group).

      Please, cite the study by Wedel et al. (Siegel's group), which also performed MNase-seq analysis in T. brucei.

      Figure-specific comments:

      Fig. S3: Why does the number of larger fragments increase with greater MNase digestion? Shouldn't the opposite be expected?

      Fig. S5B: Why not use MNase conditions under which T. cruzi and T. brucei display comparable profiles at TAS? This would facilitate interpretation.

      Minor points:

      There are several typos throughout the manuscript.

      Methods: "Dinucelotide frecuency calculation."

      Significance

      In my view, the main conclusion is that the profiles are indeed similar, even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms.

      Audience: basic science and specialized readers.

      Expertise: epigenetics and gene expression in trypanosomatids.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study explores chromatin organization around trans-splicing acceptor sites (TASs) in the trypanosomatid parasites Trypanosoma cruzi, T. brucei and Leishmania major. By systematically re-analyzing MNase-seq and MNase-ChIP-seq datasets, the authors conclude that TASs are protected by an MNase-sensitive complex that is, at least in part, histone-based, and that single-copy and multi-copy genes display differential chromatin accessibility. Altogether, the data suggest a common chromatin landscape at TASs and imply that chromatin may modulate transcript maturation, adding a new regulatory layer to an unusual gene-expression system.

      I value integrative studies of this kind and appreciate the careful, consistent data analysis the authors implemented to extract novel insights. That said, several aspects require clarification or revision before the conclusions can be robustly supported. My main concerns are listed below, organized by topic/result section.

      TAS prediction:

      • Why were TAS predictions derived only from insect-stage RNA-seq data? Restricting TAS calls to one life stage risks biasing predictions toward transcripts that are highly expressed in that stage and may reduce annotation accuracy for lowly expressed or stage-specific genes. Please justify this choice and, if possible, evaluate TAS robustness using additional transcriptomes or explicitly state the limitation.

      Results

      • "There is a distinctive average nucleosome arrangement at the TASs in TriTryps":
      • You state that "In the case of L. major the samples are less digested." However, Supplementary Fig. S1 suggests that replicate 1 of L. major is less digested than the T. brucei samples, while replicate 2 of L. major looks similarly digested. Please clarify which replicates you reference and correct the statement if needed.
      • It appears you plot one replicate in Fig. 1b and the other in Suppl. Fig. S2. Please indicate explicitly which replicate is in each plot. For T. brucei, the NDR upstream of the TAS is clearer in Suppl. Fig. S2 while the TAS protection is less prominent; based on your digestion argument, this should correspond to the more-digested replicate. Please confirm.
      • The protected region around the TAS appears centered on the TAS in T. brucei but upstream in L. major. This is an interesting difference. If it is technical (different digestion or TAS prediction offset), explain why; if likely biological, discuss possible mechanisms and implications.

      Results

      • "An MNase sensitive complex occupies the TASs in T. brucei":
      • The definition of "MNase activity" and the ordering of samples into Low/Intermediate/High digestion are unclear. Did you infer digestion levels from fragment distributions rather than from controlled experimental timepoints? In Suppl. Fig. S3a it is not obvious how "Low digestion" was defined; that sample's fragment distribution appears intermediate. Please provide objective metrics (e.g., median fragment length, fraction 120-180 bp) used to classify digestion levels.
      • Several fragment distributions show a sharp cutoff at ~100-125 bp. Was this due to gel purification or bioinformatic filtering? State this clearly in Methods. If gel purification occurred, that can explain why some datasets preserve the MNase-sensitive region.
      • Please reconcile cases where samples labeled as more-digested contain a larger proportion of >200 bp fragments than supposedly less-digested samples; this ordering affects the inference that digestion level determines the loss/preservation of TAS protection. Based on the distributions I see, "Intermediate digestion 1" appears most consistent with an expected MNase curve - please confirm and correct the manuscript accordingly.

      Results

      • "The MNase sensitive complexes protecting the TASs in T. brucei and T. cruzi are at least partly composed of histones":
      • The evidence that histones are part of the MNase-sensitive complex relies on H3 MNase-ChIP signal in subnucleosomal fragment bins. This seems to conflict with the observation (Fig. 1) that fragments protecting TASs are often nucleosome-sized. Please reconcile these points: are H3 signals confined to subnucleosomal fragments flanking the TAS while the TAS itself is depleted of H3? Provide plots that compare MNase-seq and H3 ChIP signals stratified by consistent fragment-size bins to clarify this.
      • Please indicate which datasets are used for each panel in Suppl. Fig. S4 (e.g., Wedel et al., Maree et al.), and avoid calling data from different labs "replicates" unless they are true replicates.
      • Several datasets show a sharp lower bound on fragment size in the subnucleosomal range (e.g., ~80-100 bp). Is this a filtering artifact or a gel-size selection? Clarify in Methods and, if this is an artifact, consider replotting after removing the cutoff.

      Results

      • "The TASs of single and multi-copy genes are differentially protected by nucleosomes":
      • Please include T. brucei RNA-seq data in Suppl. Fig. S5b as you did for T. cruzi.
      • Discuss how low or absent expression of multigene families affects TAS annotation (which relies on RNA-seq) and whether annotation inaccuracies could bias the observed chromatin differences.
      • The statement that multi-copy genes show an "oscillation" between AT and GC dinucleotides is not clearly supported: the multi-copy average appears noisier and is based on fewer loci. Please tone down this claim or provide statistical support that the pattern is periodic rather than noisy.
      • How were multi-copy genes defined in T. brucei? Include the classification method in Methods.
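
      To make the dinucleotide point above concrete, a positional dinucleotide profile over sequences aligned at the TAS could be computed along the following lines. This is only a sketch under stated assumptions: the function name is hypothetical, the sequences are assumed to be equal-length windows centred on each TAS, and the AT-type and GC-type groupings shown are one common convention, not necessarily the one used in the manuscript.

```python
# Assumed groupings of AT-type and GC-type dinucleotides.
AT_GROUP = {"AA", "AT", "TA", "TT"}
GC_GROUP = {"GG", "GC", "CG", "CC"}


def dinucleotide_profile(seqs, group):
    """Fraction of sequences carrying a dinucleotide from `group` at each position.

    seqs: equal-length DNA strings aligned on the same reference point.
    Returns a list of length len(seq) - 1 (one value per dinucleotide start).
    """
    if not seqs:
        raise ValueError("no sequences given")
    width = len(seqs[0])
    if any(len(s) != width for s in seqs):
        raise ValueError("sequences must have equal length")
    profile = []
    for i in range(width - 1):
        hits = sum(s[i:i + 2].upper() in group for s in seqs)
        profile.append(hits / len(seqs))
    return profile
```

      Because each position is an average over loci, a profile built from few loci (as for multi-copy genes) is expected to be noisier, so an apparent oscillation would need a separate periodicity test before being interpreted.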

      Genomes and annotations:

      • If transcriptomic data for the Y strain was used for T. cruzi, please explain why a Y strain genome was not used (e.g., Wang et al. 2021 GCA_015033655.1), or justify the choice. For T. brucei, consider the more recent Lister 427 assembly (Tb427_2018) from TriTrypDB. Use strain-matched genomes and transcriptomes when possible, or discuss limitations.

      Reproducibility and broader integration:

      • Please share the full analysis pipeline (ideally on GitHub/Zenodo) so the results are reproducible from raw reads to plots.
      • As an optional but helpful expansion, consider including additional datasets (other life stages, BSF MNase-seq, ATAC-seq, DRIP-seq) where available to strengthen comparative claims.

      Optional analyses that would strengthen the study:

      • Stratify single-copy genes by expression (high / medium / low) and examine average nucleosome occupancy at TASs for each group; a correlation between expression and NDR depth would strengthen the functional link to maturation.

      Minor / editorial comments:

      • In the Introduction, the sentence "transcription is initiated from dispersed promoters and in general they coincide with divergent strand switch regions" should be qualified: such initiation sites also include single transcription start regions.
      • Define the dotted line in length distribution plots (if it is not the median, please clarify) and consider placing it at 147 bp across plots to ease comparison.
      • In Suppl. Fig. 4b "Replicate2" the x-axis ticks are misaligned with labels - please fix.
      • Typo in the Introduction: "remodellingremodeling" → "remodeling."

      Referee cross-commenting

      Comment 1: I think Reviewer #2 and Reviewer #3 missed that the authors of this manuscript do cite and consider the results from Wedel et al. 2017. They even re-analysed their data (e.g. Figure 3a). I second Reviewer #2's comment that the inclusion of a schematic figure to help readers visualize and better understand the findings would be an important addition.

      Comment 2: I agree with Reviewer #3 that the use of different MNase digestion procedures in the different datasets has to be considered. On the other hand, I don't think there is a problem with figure 1 showing an MNase-protected TAS for T. brucei, as it is based on MNase-seq data and reproduces the reported results (Maree et al. 2017). What the Siegel lab did in Wedel et al. 2017 was MNase-ChIP-seq of H3 showing nucleosome depletion at TAS, but the two results are not necessarily contradictory: there could still be something else (which does not contain H3) sitting on the TAS and protecting it from MNase digestion.

      Significance

      This study provides a systematic comparative analysis of chromatin landscapes at trans-splicing acceptor sites (TASs) in trypanosomatids, an area that has been relatively underexplored. By re-analyzing and harmonizing existing MNase-seq and MNase-ChIP-seq datasets, the authors highlight conserved and divergent features of nucleosome occupancy around TASs and propose that chromatin contributes to the fidelity of transcript maturation.

      The significance lies in three aspects:

      1. Conceptual advance: It broadens our understanding of gene regulation in organisms where transcription initiation is unusual and largely constitutive, suggesting that chromatin can still modulate post-transcriptional processes such as trans-splicing.
      2. Integrative perspective: Bringing together data from T. cruzi, T. brucei and L. major provides a comparative framework that may inspire further mechanistic studies across kinetoplastids.
      3. Hypothesis generation: The findings open testable avenues about the role of chromatin in coordinating transcript maturation, the contribution of DNA sequence composition, and potential interactions with R-loops or RNA-binding proteins. Researchers in parasitology, chromatin biology, and RNA processing will find it a useful resource and a stimulus for targeted experimental follow-up.

      My expertise is in gene regulation in eukaryotic parasites, with a focus on bioinformatic analysis of high-throughput sequencing data.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Reply to the Reviewers

      We thank the reviewers for their positive assessments overall and for many helpful suggestions for clarification to make the manuscript more accessible to a broader audience. We made minor text changes and added more labels to the figures to address these comments.

      Referee #1

      Summary: In this study, the authors show a genetic interaction of the lipid receptors Lpr-1, Lpr-3 and Scav-2 in C. elegans. They show that Lpr-1 loss-of-function specifically affects aECM localization of Lpr-3 and attribute the lethality of Lpr-1 mutants to this phenotype. The authors performed a mutagenesis screen and identified a third lipid receptor, Scav-2, as a modulating factor: loss of scav-2 partially rescues the Lpr-1 phenotype. The authors created a variety of tools for this study, notably CRISPR-Cas9-mediated knock-ins for endogenous tagging of the receptors.

      Major comments:

      1. While the authors provide a nice diagram showing the potential roles and interplay of lpr-1, lpr-3 and scav-2, it remains unclear what their respective cargo is. The nature of interaction between the proteins remains unclear from the data.

      Response

      • We agree that identifying the relevant cargo(s) will be key to understanding the detailed mechanisms involved and that the lack of such information is a limitation of our study. However, the impact of our study is to show that these lipid transporters functionally interact to affect aECM organization, a role that could be relevant to many systems, including humans.

      As an optional (since time-consuming) experiment I would suggest trying more tissue-specific lipidomics.

      Response

      • This would be an interesting future experiment but is outside our current technical capabilities.

      The lipidomics data should be presented in the figures, even if there were no significant changes. Importantly, show the lipid abundance at least of total lipids, better of individual classes, normalized to the material input (e.g. number of embryos, protein).

      Response

      • The reviewer is right to point out that lipid variations could occur at different levels, and that we should exercise caution. However, the unsupervised lipidomics analysis would have detected not only individual lipid variations, but also variations in the total or subgroup lipid content. Indeed, the eggs were weighed prior to extraction and each sample was extracted with the same precise volume of solvent before analysis. Furthermore, the LC-MS/MS injection sequence included blanks and quality control (QC) samples. The blanks were the extraction solvent, which allowed us to control for features unrelated to the biological samples. The QC sample was a mixture of all the samples included in the injection sequence, reflecting the central values of the model. If a subclass of samples, such as the lpr-1 mutant, had been characterized by a decrease in one lipid, a subgroup of lipids, or all lipids, it would have clustered separately. Instead, our PCA showed that the variation between samples of the same genotype (wild type, lpr-1 mutant, or lpr-1; scav-2) was similar to the variation between samples from two different genotypes. This means that we did not detect modifications to lipid quantity specifically or in total. A figure illustrating the lipid contents would show no difference between groups.

      Figure 1g: I do not understand what the lpr3:gfp signal is: the puncta in the overview image? And where are they in the zoom image showing annuli and alae? Also, how were the annuli and alae structures labeled? Please provide more information.

      Response

      • All of the fluorescent signal shown in this figure panel corresponds to the indicated LPR fusion - no other labelling method was used. SfGFP::LPR-3 labels the matrix structures (alae and annuli) as well as some puncta – the ratio of matrix to puncta changes over developmental stages. We edited the figure legend to make this more clear.

      One point that is not sufficiently addressed is that the authors deduce from the inability of the scav-2 gfp knock-in to suppress lpr1 lethality that scav2 function is not impaired. This is quite indirect. Can the authors provide more convincing evidence that the scav-2 ki has normal function?

      Response

      • Suppression of lpr-1 (or other aECM mutant) lethality is the only known phenotype caused by loss of scav-2. Therefore, this is the only phenotype for which we can do a rescue experiment to test functionality of the knock-in. The data presented do indicate that the knock-in fusion retains significant function.

      In general, the data is clearly presented and the statistical analyses look sound.

      Response

      • Thank you

      Minor comments:

      Please provide page and line numbers!

      Response:

      • done

      Avoid contractions like "don't" in both text and figure legends

      Response:

      • changed one instance of “don’t” to “do not”

      Page 12: I do not understand the meaning of the sentence "This transgene also caused more modest lethality in a wild-type background"

      Response:

      • Wording changed to “This transgene caused very little lethality in a wild-type background (Fig. 6C), indicating it is not generally toxic.”

      Figure 7: what is meant with "Dodt"?

      Response:

      • Dodt gradient contrast imaging is a method for transmitted light imaging similar to DIC and is used on some confocal microscopes. It is now explained in the Methods section. We removed the Dodt label from Figure 7 since it seems to be confusing and it is not really important whether the brightfield image is DIC or Dodt.

        Reviewer #1 (Significance (Required)):

        The study is experimentally sound and uses numerous novel tools, such as endogenously tagged lipid receptors. It is an interesting study for researchers in basic research studying lipid receptors and ECM biology. It provides insights on the genetic interaction of lipid receptors. My expertise is in lipid biochemistry, inter-organ lipid trafficking and imaging. I am not very familiar with C. elegans genetics.

      Referee #2

      1. The manuscript is very well written; the documentation is fine, but some more details are needed for better following the subject for readers not familiar with nematode anatomy.

      For instance, while alae are somehow explained, annuli are not - structures that look abnormal in lpr1 and lpr1-scav2 mutants (Fig. 5B).

      Response

      • Apologies for this oversight. We added annuli labels to Figure 1 and Figure 5 panels and added descriptions of annuli to the Figure 1 legend and the Results text.

      Moreover, the authors show in Fig. 1 the puncta etc. in the epidermis, whereas in Fig. 2 they show Lpr3 accumulation or not in the duct and the pore (lpr1). How do they localize in the cells of these structures at high magnification? It is also important to see the Lpr3 localisation in lpr1 mutants shown in Fig. 2A with the quality of the images shown in Fig. 1F. This applies also to Figs. 4 and 5.

      Responses:

      • The embryonic duct and pore cells are very small and we have not reliably seen puncta within them. In Figs 2 and 5, we supplemented the duct and pore images with those from the epidermis, which is a much larger tissue, allowing us to resolve puncta and matrix structures with better resolution.
      • The laser settings in Figs 2,4,5 (as opposed to Fig. 1) were chosen to avoid saturation of the matrix signal so that we could do accurate quantifications as shown. The images are unmodified with respect to brightness and therefore appear relatively dim – but we think they convey the observations very accurately.

      I would like to see puncta in lpr1-scav2 doubles.

      Response:

      • Puncta in this genotype are shown for the epidermis in Figure 5. It has not been possible to see puncta specifically within the embryonic duct and pore.

      Regarding the central mechanism, one possibility is, as the authors describe, that Lpr1 is needed for Lpr3 accumulation in ducts and tubes. Alternatively, Lpr1 is needed for duct and tube expansion, in the absence of which Lpr3 is unable to reach its destination, that is, the lumina. Scav2, in this scenario, might be an antagonist of tube and duct expansion, and thereby rescue the Lpr1 mutant phenotype independently. Admittedly, the non-accumulation of Lpr3 in scav2 mutants argues against a lpr1-independent function of scav2.

      Responses:

      • LPR-1 is indeed needed to maintain duct and pore tube integrity as the tubes grow, but in mutants the tubes appear to collapse at a later stage than we imaged here (Stone et al 2009). The ~normal accumulation of LET-4 and LET-653 further argues that the duct and pore tubes are still intact at the 1.5-to-2-fold stages. Therefore, we conclude that the defect in LPR-3 accumulation precedes duct and pore collapse.
      • The changes we document in the epidermis also show that the lpr-1 mutant affects LPR-3 accumulation in another (non-tube) tissue.

      In any case, to underline the aspect of Lpr1-Scav2 dosage relationship, the authors may also have a look at Lpr3 distribution in lpr1 heterozygous, and lpr1-scav2 double heterozygous worms. In this spirit, it would be interesting to see the semi-dominant effects of scav2 on Lpr3 localisation in lpr1 mutants by microscopy.

      Response:

      • Because of the hermaphroditism of C. elegans, it would be technically challenging to confidently identify heterozygous (vs. homozygous) embryos for confocal imaging. We do not think that the results would be informative enough to warrant the effort, given that we’ve already shown that scav-2 heterozygosity can partly suppress lpr-1. The expectation is that LPR-3 levels would be partially restored in the scav-2 het, but it might take a very large sample size to confidently assess that partial effect.

      One word to the overexpression studies: it is surprising that the amounts of Scav2 delivered by the expression through the grl-2 promoter in the lpr1, scav2 background are almost matching those by the opposite effect of scav2 mutations on lpr1 dysfunction.

      Response:

      • The reviewer refers to the transgenic rescue experiment with the grl-2pro::SCAV-2 transgene. Because the scav-2 mutant phenotype being tested is suppression of lpr-1 lethality, the expected result from scav-2 rescue is to restore the lpr-1 lethal phenotype to the strain. This is exactly the result we see. We have revised the text to more clearly explain the logic.

      One issue concerns the localization of scav2-gfp "rarely" in vesicles: what are these vesicles?

      Response

      • Only a handful of vesicles were seen across all the images we collected, and we have not yet identified them. They could be associated with either SCAV-2 delivery or removal from the plasma membrane, as now stated in the text. SCAV-2 trafficking would be an interesting area for further study but is beyond the scope of this paper.

      One comment to the Let653 transgenes/knock-ins: the localization of transgenic Let653-gfp may be normal in lpr1 mutants because there are wild-type copies in the background.

      Response

      • There are wild type copies of LET-653 in the background, but no wild type copies of LPR-1. Even if the untagged LET-653 would be recruiting the tagged LET-653 as the reviewer suggests, we can still conclude that lpr-1 loss does not prevent the untagged LET-653 (and thus also the tagged LET-653) from accumulating in the duct lumen matrix.

      One thought to the model: if Scav2 has a function in a lpr1 background, this means that yet another transporter X delivers the substrate for Scav2, isn't it?

      Response

      • Yes, we completely agree with this interpretation and have revised the discussion and Figure 8 legend to more explicitly make this point.

      A word on the term haploinsufficient that is used in this study: scav2 mutants would be haploinsufficient if the heterozygous worms died in an otherwise wild-type background.

      Response

      • We disagree with this comment. The term “haploinsufficient” simply means that heterozygosity for a deletion or other loss of function allele can cause a mutant phenotype – the term is not restricted to lethal phenotypes.

        Reviewer #2 (Significance (Required)):

        Alexandra C. Belfi and colleagues wrote the manuscript entitled "Opposing roles for lipocalins and a CD36 family scavenger receptor in apical extracellular matrix-dependent protection of narrow tube integrity" in which they report on their findings on the genetic and cell-biological interaction between the lipid transporters Lpr1 and scav2 in the nematode C. elegans. In principle, these two proteins are involved in shaping the apical extracellular matrix (aECM) of ducts by regulating the amounts of Lpr3 in the extracellular space. While scav2 seems to act cell autonomously, Lpr1 has a non-cell autonomous effect on Lpr3.


      Referee #3

      Summary: Using a powerful combination of genetic and quantitative imaging approaches, Belfi et al. describe novel findings on the roles of several lipocalins (secreted lipid carrier proteins) in the production and organization of the apical extracellular matrix (aECM) required for small diameter tube formation in C. elegans. The work comprises a substantial extension of previous studies carried out by the Sundaram lab, which has pioneered studies into the roles of aECM and accessory proteins in creating the duct-pore excretion tube and which also plays a role in patterning of the epidermal cuticle. One core finding is that the lipocalin LPR-1 does not stably associate with the aECM but is instead required for the incorporation of another lipocalin, LPR-3. A second major finding is that reduction of function in SCAV-2, a SCARB family membrane lipid transporter, suppresses lpr-1 mutant lethality along with associated duct-pore defects and mislocalization of LPR-3. Likewise, loss of scav-2 partially suppresses defects in two other aECM proteins and restores defects in LPR-3 localization in one of them (let-653). Additional genetic and protein localization studies lead to the model that LPR-1 and SCAV-2 may antagonistically regulate one or more lipid or lipoprotein factors necessary for LPR-3 localization and duct-pore formation. A role for LPR-1 and LPR-3 at lysosomes is clearly implicated based on co-localization studies, although a specific role for lysosomes (or related organelles) is not defined. Finally, MS data suggest that neither LPR-1 nor SCAV-2 grossly affects lipid composition in embryos, consistent with dietary interventions failing to affect mutant phenotypes. Ultimately, a plausible schematic model is presented to explain much of the data.

      Major comments:

      1. The studies are very thorough, convincing, and generally well described. Conclusions are logical and well grounded. Additional experiments are not required to support the authors' major conclusions, and the data and methods are described in sufficient detail to allow replication. As such my comments are minor and should be addressable at the authors' discretion in writing.

      Response

      • Thank you for these positive comments

        Minor comments:

        2) In the abstract, "tissue-specific suppression" made me think that there was going to be a tissue-specific knockdown experiment, which was not the case. Rather scav-2 suppression is specific to the duct-pore, which corresponds to where scav-2 is expressed. Consider rewording this.

      Response

      • Wording was changed to “duct/pore-specific suppression”

        3) Page 5. Suggest wording change to, "Whereas LPR-3 incorporates stably into the precuticle, suggesting a structural role in matrix organization, LPR-1..."

      Response

      • Done

        4) LIMP-2 versus LIMP2. Both are used. Uniprot lists LIMP2, but some papers use LIMP-2. Choose one and be consistent.

      Response

      • Everything changed to LIMP2.

        5) Some of the data for S6 Fig wasn't referred to directly in the text. Namely results regarding pcyt-1 and pld-1. I'd suggest incorporating this into the results section possibly using, "As a control for our lipid supplementation experiments..."

      Response

      • These experiments are now described on page 11.

        6) Page 12 bottom. I understand the use of "oppose", but another way to put it is that SCAV-2 and LPR-1 (antagonistically or collectively) modulate aECM composition. Other terms that might confuse some readers are "upstream" and "downstream", although I am OK with their use in the context of this work.

      Response

      • The genetics indicate that lpr-1 and scav-2 have opposite effects on tube shaping and LPR-3 localization, so they do function antagonistically rather than collectively/cooperatively; we decided to keep this terminology.

        7) Page 16. I understand the logic that SCAV-2 is unlikely to directly modulate LPR-3 given its presumed molecular function. But is it possible that LPR-3 levels are already maxed out in the aECM so that loss of SCAV-2 doesn't lead to any increase? Conversely, one could argue that even if acting indirectly, SCAV-2 could have led to increased LPR-3 levels, unless they were already maxed.

      Response

      • This is a good point and the possibility is now mentioned in the Results page 9. We also changed our wording in the Abstract and Discussion to acknowledge the possibility that LPR-3 could be the SCAV-2 cargo, though we still don’t favor this model.

        8) Figure legend 1. I did not see an asterisk in figure 1B.

      Response

      • thanks for catching this error, text removed

        9) Figure 1C. Might want to define the "degree" term in the legend for people outside the field.

      Response

      • We added an explanation to the figure legend.

        10) Fig 1 G. I was just wondering if cuticle autofluorescence was an issue for taking these images.

      Response

      • Cuticle autofluorescence is generally quite dim in L4s with our settings, and it was not an issue at this mid/late L4 stage, which corresponds to when both LPR fusions are at their brightest. Note that both large panels are MAX projections and yet you can’t see any cuticle autofluorescence in the LPR-1 panel.

        11) Fig 2 and others. Please define error bars.

      Response

      • These correspond to the standard deviation; this information is now added to the Methods.

        12) Fig 5. From the images, it looks like lpr-1; scav-2 doubles might have a worse (pre)cuticle defect in LPR-3 localization than lpr-1 singles. If so that would be interesting and would suggest that their relationship with respect to the modulation of LPR-3 is context dependent. Admittedly, the lack of obvious scav-2 expression in the epidermis would not be consistent with an effect (positive or negative).

      Response

      • The lpr-1 scav-2 strain is certainly not improved over lpr-1 but we have not noted any consistent worsening of the phenotype either.

        13) Consider defining Dodt in the first figure legend where it appears.

      Response

      • Dodt gradient contrast imaging is a method of transmitted light imaging similar to DIC and is used on some confocal microscopes. It is now explained in the Methods section. We removed the term from Figure 7 since it seems to be confusing.

        14) For Mander's, is there a reason to report just one of the two findings (M1 or M2) versus both?

      Response

      • We now include the 2nd Manders value in the figure legend and note that value is much lower (0.25) because much of the red signal is lysosomes (where green would be quenched by acidity).
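      As an illustration of why the two Manders values can differ so much, the sketch below computes both overlap fractions for a pair of channels. It is not the authors' analysis code: the function name `manders_fractions`, the zero thresholds, and the toy pixel values are all assumptions made for this example, and which fraction is called M1 or M2 depends on the convention of the software used.

```python
def manders_fractions(red, green, red_thresh=0.0, green_thresh=0.0):
    """Return (fraction of red overlapping green, fraction of green overlapping red).

    `red` and `green` are equally long sequences of pixel intensities.
    A pixel counts as "containing" a channel when its intensity exceeds
    that channel's threshold (hypothetical default: zero).
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of pixels")

    total_red = sum(red)
    total_green = sum(green)

    # Red intensity located in pixels that also contain green signal.
    red_in_green = sum(r for r, g in zip(red, green) if g > green_thresh)
    # Green intensity located in pixels that also contain red signal.
    green_in_red = sum(g for r, g in zip(red, green) if r > red_thresh)

    frac_red = red_in_green / total_red if total_red else 0.0
    frac_green = green_in_red / total_green if total_green else 0.0
    return frac_red, frac_green


# Toy data: red is present in four pixels, green survives in only one of them
# (as if green were quenched in the other three).
red_channel = [10, 10, 10, 10]
green_channel = [0, 0, 0, 5]
print(manders_fractions(red_channel, green_channel))  # (0.25, 1.0)
```

      In this toy case all of the green signal sits where red is present, but only a quarter of the red signal sits where green is present, which is the kind of asymmetry expected when one fluorophore is quenched in part of the other's compartment.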

        15) Consider referring to specific panels (A, B...) within references to the supplemental files.

      Response

      • done

        16) Fig S6E. Neither "increasing nor increasing" to "increasing nor decreasing".

      Response

      • fixed

        Referees cross-commenting

        I thought that Reviewers 1 and 2 brought up some good points. My sense is that Belfi and colleagues can address most of these in writing, but are of course welcome to add new data as they see fit. I get that it's not a "perfect" paper where everything is explained fully or comes together, but I don't see that as a flaw that needs to be fixed. I think that the manuscript represents a good deal of work (as it is) and provides a sufficient advance while also suggesting an interesting link to disease. It will be up to individual journals to decide if the findings meet their criteria.

        Reviewer #3 (Significance (Required)):

        Significance: The work carried out in this paper, and more generally by the Sundaram lab, always has a ground-breaking element because very few labs in the field have studied in detail the developmental roles and regulation of the aECM, in large part because it can be challenging to dissect. The core findings in this study are rather novel and unexpected, namely the opposing roles of the paralogous LPR-1 and LPR-3 lipocalins and their functional interactions with SCAV-2. The study does stop short of finding specific molecules (lipid or lipoprotein) that would mediate the effects they report, and it wasn't yet clear how the lysosomal co-loc plays a role, but this is not a criticism of the work presented or the forward progress. I was particularly intrigued by the idea, presented in the discussion, that disruption of vascular aECM could potentially account for some of the (complex) observations regarding the role of lipocalins and SCARB proteins in human disease. This would represent a new avenue for researchers to consider and underscores the power of using non-biased approaches in model systems.

        As for all my reviews, this is signed by David Fay.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Using a powerful combination of genetic and quantitative imaging approaches, Belfi et al. describe novel findings on the roles of several lipocalins (secreted lipid carrier proteins) in the production and organization of the apical extracellular matrix (aECM) required for small diameter tube formation in C. elegans. The work comprises a substantial extension of previous studies carried out by the Sundaram lab, which has pioneered studies into the roles of aECM and accessory proteins in creating the duct-pore excretion tube and which also plays a role in patterning of the epidermal cuticle. One core finding is that the lipocalin LPR-1 does not stably associate with the aECM but is instead required for the incorporation of another lipocalin, LPR-3. A second major finding is that reduction of function in SCAV-2, a SCARB family membrane lipid transporter, suppresses lpr-1 mutant lethality along with associated duct-pore defects and mislocalization of LPR-3. Likewise, loss of scav-2 partially suppresses defects in two other aECM proteins and restores defects in LPR-3 localization in one of them (let-653). Additional genetic and protein localization studies lead to the model that LPR-1 and SCAV-2 may antagonistically regulate one or more lipid or lipoprotein factors necessary for LPR-3 localization and duct-pore formation. A role for LPR-1 and LPR-3 at lysosomes is clearly implicated based on co-localization studies, although a specific role for lysosomes (or related organelles) is not defined. Finally, MS data suggest that neither LPR-1 nor SCAV-2 grossly affects lipid composition in embryos, consistent with dietary interventions failing to affect mutant phenotypes. Ultimately, a plausible schematic model is presented to explain much of the data.

      Major comments:

      The studies are very thorough, convincing, and generally well described. Conclusions are logical and well grounded. Additional experiments are not required to support the authors' major conclusions, and the data and methods are described in sufficient detail to allow replication. As such my comments are minor and should be addressable at the authors' discretion in writing.

      Minor comments:

      1) In the abstract, "tissue-specific suppression" made me think that there was going to be a tissue-specific knockdown experiment, which was not the case. Rather scav-2 suppression is specific to the duct-pore, which corresponds to where scav-2 is expressed. Consider rewording this.

      2) Page 5. Suggest wording change to, "Whereas LPR-3 incorporates stably into the precuticle, suggesting a structural role in matrix organization, LPR-1..."

      3) LIMP-2 versus LIMP2. Both are used. Uniprot lists LIMP2, but some papers use LIMP-2. Choose one and be consistent.

      4) Some of the data for S6 Fig wasn't referred to directly in the text. Namely results regarding pcyt-1 and pld-1. I'd suggest incorporating this into the results section possibly using, "As a control for our lipid supplementation experiments..."

      5) Page 12 bottom. I understand the use of "oppose", but another way to put it is that SCAV-2 and LPR-1 (antagonistically or collectively) modulate aECM composition. Other terms that might confuse some readers are "upstream" and "downstream", although I am OK with their use in the context of this work.

      6) Page 16. I understand the logic that SCAV-2 is unlikely to directly modulate LPR-3 given its presumed molecular function. But is it possible that LPR-3 levels are already maxed out in the aECM so that loss of SCAV-2 doesn't lead to any increase? Conversely, one could argue that even if acting indirectly, SCAV-2 could have led to increased LPR-3 levels, unless they were already maxed.

      7) Figure legend 1. I did not see an asterisk in figure 1B.

      8) Figure 1C. Might want to define the "degree" term in the legend for people outside the field.

      9) Fig 1 G. I was just wondering if cuticle autofluorescence was an issue for taking these images.

      10) Fig 2 and others. Please define error bars.

      11) Fig 5. From the images, it looks like lpr-1; scav-2 doubles might have a worse (pre)cuticle defect in LPR-3 localization than lpr-1 singles. If so that would be interesting and would suggest that their relationship with respect to the modulation of LPR-3 is context dependent. Admittedly, the lack of obvious scav-2 expression in the epidermis would not be consistent with an effect (positive or negative).

      12) Consider defining Dodt in the first figure legend where it appears.

      13) For Mander's, is there a reason to report just one of the two findings (M1 or M2) versus both?

      14) Consider referring to specific panels (A, B...) within references to the supplemental files.

      15) Fig S6E. Neither "increasing nor increasing" to "increasing nor decreasing".

      As for all my reviews, this is signed by David Fay.

      Referees cross-commenting

      I thought that Reviewers 1 and 2 brought up some good points. My sense is that Belfi and colleagues can address most of these in writing, but are of course welcome to add new data as they see fit. I get that it's not a "perfect" paper where everything is explained fully or comes together, but I don't see that as a flaw that needs to be fixed. I think that the manuscript represents a good deal of work (as it is) and provides a sufficient advance while also suggesting an interesting link to disease. It will be up to individual journals to decide if the findings meet their criteria.

      Significance

      The work carried out in this paper, and more generally by the Sundaram lab, always has a ground-breaking element because very few labs in the field have studied in detail the developmental roles and regulation of the aECM, in large part because it can be challenging to dissect. The core findings in this study are rather novel and unexpected, namely the opposing roles of the paralogous LPR-1 and LPR-3 lipocalins and their functional interactions with SCAV-2. The study does stop short of finding specific molecules (lipid or lipoprotein) that would mediate the effects they report, and it wasn't yet clear how the lysosomal co-loc plays a role, but this is not a criticism of the work presented or the forward progress. I was particularly intrigued by the idea, presented in the discussion, that disruption of vascular aECM could potentially account for some of the (complex) observations regarding the role of lipocalins and SCARB proteins in human disease. This would represent a new avenue for researchers to consider and underscores the power of using non-biased approaches in model systems.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript is very well written; the documentation is fine, but some more details are needed for better following the subject for readers not familiar with nematode anatomy. For instance, while alae are somehow explained, annuli are not - structures that look abnormal in lpr1 and lpr1-scav2 mutants (Fig. 5B). Moreover, the authors show in Fig. 1 the puncta etc. in the epidermis, whereas in Fig. 2 they show Lpr3 accumulation or not in the duct and the pore (lpr1). How do they localize in the cells of these structures at high magnification? It is also important to see the Lpr3 localisation in lpr1 mutants shown in Fig. 2A with the quality of the images shown in Fig. 1F. This applies also to Figs. 4 and 5. I would like to see puncta in lpr1-scav2 doubles. Regarding the central mechanism, one possibility is, as the authors describe, that Lpr1 is needed for Lpr3 accumulation in ducts and tubes. Alternatively, Lpr1 is needed for duct and tube expansion, in the absence of which Lpr3 is unable to reach its destination, that is, the lumina. Scav2, in this scenario, might be an antagonist of tube and duct expansion, and thereby rescue the Lpr1 mutant phenotype independently. Admittedly, the non-accumulation of Lpr3 in scav2 mutants argues against a lpr1-independent function of scav2. In any case, to underline the aspect of Lpr1-Scav2 dosage relationship, the authors may also have a look at Lpr3 distribution in lpr1 heterozygous, and lpr1-scav2 double heterozygous worms. In this spirit, it would be interesting to see the semi-dominant effects of scav2 on Lpr3 localisation in lpr1 mutants by microscopy. One word to the overexpression studies: it is surprising that the amounts of Scav2 delivered by the expression through the grl-2 promoter in the lpr1, scav2 background are almost matching those by the opposite effect of scav2 mutations on lpr1 dysfunction.

      One issue concerns the localization of scav2-gfp "rarely" in vesicles: what are these vesicles?

      One comment to the Let653 transgenes/knock-ins: the localization of transgenic Let653-gfp may be normal in lpr1 mutants because there are wild-type copies in the background.

      One thought to the model: if Scav2 has a function in a lpr1 background, this means that yet another transporter X delivers the substrate for Scav2, isn't it?

      A word on the term haploinsufficient that is used in this study: scav2 mutants would be haploinsufficient if the heterozygous worms died in an otherwise wild-type background.

      Significance

      Alexandra C. Belfi and colleagues wrote the manuscript entitled "Opposing roles for lipocalins and a CD36 family scavenger receptor in apical extracellular matrix-dependent protection of narrow tube integrity" in which they report on their findings on the genetic and cell-biological interaction between the lipid transporters Lpr1 and scav2 in the nematode C. elegans. In principle, these two proteins are involved in shaping the apical extracellular matrix (aECM) of ducts by regulating the amounts of Lpr3 in the extracellular space. While scav2 seems to act cell autonomously, Lpr1 has a non-cell autonomous effect on Lpr3.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: In this study, the authors show a genetic interaction of the lipid receptors Lpr-1, Lpr-3 and Scav-2 in C. elegans. They show that Lpr-1 loss-of-function specifically affects aECM localization of Lpr-3 and attribute the lethality of Lpr-1 mutants to this phenotype. The authors performed a mutagenesis screen and identified a third lipid receptor, Scav-2, as a modulating factor: loss of scav-2 partially rescues the Lpr-1 phenotype. The authors created a variety of tools for this study, notably Crispr-Cas9-mediated knock-ins for endogenous tagging of the receptors.

      Major comments: While the authors provide a nice diagram showing the potential roles and interplay of lpr-1, lpr-3 and scav-2, it remains unclear what their respective cargo is. The nature of the interaction between the proteins remains unclear from the data. As an optional (since time-consuming) experiment, I would suggest trying more tissue-specific lipidomics. The lipidomics data should be presented in the figures, even if there were no significant changes. Importantly, show the lipid abundance at least of total lipids, better of individual classes, normalized to the material input (e.g. number of embryos, protein). Figure 1g: I do not understand what the lpr3:gfp signal is: the puncta in the overview image? And where are they in the zoom image showing annuli and alae? Also, how were the annuli and alae structures labeled? Please provide more information. One point that is not sufficiently addressed is that the authors deduce from the inability of the scav-2 gfp knock-in to suppress lpr1 lethality that scav2 function is not impaired. This is quite indirect. Can the authors provide more convincing evidence that the scav-2 knock-in has normal function? In general, the data is clearly presented and the statistical analyses look sound.

      Minor comments: Please provide page and line numbers! Avoid contractions like "don't" in both text and figure legends. Page 12: I do not understand the meaning of the sentence "This transgene also caused more modest lethality in a wild-type background". Figure 7: what is meant by "Dodt"?

      Significance

      The study is experimentally sound and uses numerous novel tools, such as endogenously tagged lipid receptors. It is an interesting study for researchers in basic research studying lipid receptors and ECM biology. It provides insights on the genetic interaction of lipid receptors.

      My expertise is in lipid biochemistry, inter-organ lipid trafficking and imaging. I am not very familiar with C. elegans genetics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      _Below we address all the comments by the reviewers. However, the figures that were used in our response are unfortunately not displayed in this format._

      Reviewer #1

      Evidence, reproducibility and clarity

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. Although the authors' report provides a step forward in our understanding of translational buffering, this reviewer found a series of concerns in this paper. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      Major comments: 1. This paper heavily relies on the reference 18. However, this paper was not properly stated (no page or journal number); the study in Bioinformatics is nowhere to be found on the website, despite being out in 2024 apparently. Either title is wrong (yet a biorxiv can be found). This reviewer guessed that the reference 18 may be accepted. However, without a proper reference, this paper could not be judged since nearly all the parts of this work have been based on the reference 18. Also, the Ribobase data used in this manuscript comes from this reference, so it had better be well defined, especially when another Ribobase data set seems to be available online: http://www.bioinf.uni-freiburg.de/~ribobase/index.html

      We apologize for the citation issue. The citation Liu et al., 2024 (18) referred to a bioRxiv preprint. That manuscript is now published in Nature Biotechnology, and the reference has been updated in the revised version of the manuscript. In the revised manuscript it appears as Liu et al., 2025 (23).

      In the Discussion, the authors mentioned "TE is based on a compositional regression model (18) rather than the commonly applied approach of using a logarithmic ratio of ribosome occupancy to mRNA abundance." This important information should be mentioned in the early section of the manuscript. Related to this, there are other published methods for exploring change in translation efficiency (e.g., 10.1093/bioinformatics/btw585; 10.1093/nar/gkz223) that could also be suitable in this context. It is not entirely clear if their approach is better than before. Again, the improper reference to 18 made our assessment of this work difficult.

      We apologize and acknowledge the impact of the citation issue on this point. In Liu et al (2025), we have provided a comparison between our approach and the log-ratio strategy. We also agree that additional context was needed within the current study. Hence, we have now included more detailed information about the TE calculations in the initial results section (line 94).

      As noted by the reviewer, several other methods have been developed previously for measuring changes in translation efficiency. These methods are designed to be used in cases of paired designs where there is a treatment or manipulation that is assayed along with controls. While these methods are highly valuable in assessing differential TE, they are unable to accommodate the type of meta-analyses described in our study. In particular, we do not report changes/differential TE with respect to a control sample but instead focus on the coordinated patterns of TE across experiments. We now note this important distinction in the manuscript in the discussion section (line 494).

      The paper mainly relies on detecting a set of buffered genes using mRNA-TE correlation and MAD ratios (Ribo-Seq/RNA-Seq). While the concept seems sound, the authors should ensure that this method is reliable. Several controls could be used to confirm this. First, if any studies in humans or mice have described a set of genes as buffered, it would be worth checking for overlap between the authors' set of 'TB high' genes and the previously established list. Furthermore, the authors could use packages explicitly developed for translational buffering detection, such as anota2seq (https://academic.oup.com/nar/article/47/12/e70/5423604?login=true). Not all of the data used by the authors may be suitable for such packages, but the authors could at least partially use them on some of their datasets and see whether the buffered genes reported by these packages match their predictions.

      We thank the reviewer for this constructive suggestion. To the best of our knowledge, no prior study in humans or mice has systematically analyzed translational buffering across a wide range of conditions. As a result, defining a gold-standard set for benchmarking is currently not feasible.

      While packages such as anota2seq have proven highly valuable for identifying buffering effects in controlled experimental designs (e.g., comparing a treatment to a matched control), they are not readily applicable to the type of large-scale meta-analysis we present here. Our study integrates ribosome profiling and RNA-seq data across diverse datasets and conditions, which lies outside the design scope of such tools.

      The most relevant point of comparison to our work is Wang et al. 2020 Nature, which examined a related but distinct form of translational buffering across species for a given tissue. We now present the overlap of genes identified as buffered in our study vs Wang et al. 2020. The details are presented in the reviewer's comment 5-2.

      The threshold of 'TB high' or 'TB low' (top and bottom 250) is somewhat arbitrary. Why not top 100 or 500? The authors should provide a rationale for this choice. Also, they could include a numeric measure of buffering (the sum of the two rankings is probably suitable for this purpose). Several of the authors' explorations are suitable for numerical quantification (GO enrichment can be turned into GSEA, and the boxplot can be shown as correlations)

      Thanks for these suggestions. We agree that the thresholds used to define TB high and TB low are somewhat subjective. Changing this cutoff as suggested is easily achievable with the provided R script, which can be used to reproduce all of the reported analyses of translational buffering with different cutoffs.
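
As a rough illustration of how such a cutoff enters the classification, the following Python sketch ranks genes by two metrics, sums the ranks, and labels the top groups. It is not the authors' R script. The function names, the toy values, and the choice to treat TB low as the tier immediately after TB high are assumptions made for illustration only.

```python
def rank_ascending(values):
    """Return 1-based ranks; the smallest value gets rank 1 (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def classify_buffering(genes, rna_te_corr, dispersion_ratio, cutoff):
    """Label genes by the sum of two ranks, with an adjustable cutoff.

    Lower mRNA-TE correlation and lower ribosome/RNA dispersion ratio both
    give lower (better) ranks; a lower rank sum means stronger buffering.
    """
    corr_rank = rank_ascending(rna_te_corr)
    ratio_rank = rank_ascending(dispersion_ratio)
    rank_sum = [c + r for c, r in zip(corr_rank, ratio_rank)]
    ordered = sorted(range(len(genes)), key=lambda i: rank_sum[i])
    labels = {}
    for position, idx in enumerate(ordered):
        if position < cutoff:
            labels[genes[idx]] = "TB high"
        elif position < 2 * cutoff:
            # Assumption: TB low is the tier that follows TB high.
            labels[genes[idx]] = "TB low"
        else:
            labels[genes[idx]] = "others"
    return labels


# Hypothetical toy values, not data from the study.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
corr = [-0.8, -0.5, 0.1, 0.4, -0.6, 0.7]
ratio = [0.2, 0.5, 0.9, 1.1, 0.3, 1.0]
labels = classify_buffering(genes, corr, ratio, cutoff=2)
```

Re-running the same call with a different `cutoff` is all that is needed to test how sensitive downstream comparisons are to the threshold.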

      To further assess whether our conclusions are robust to the selection of these thresholds, we tested several different values to define the TB high and TB low groups. As an example, we show here that the effects on protein variation and the association of intrinsic features such as UTR length with the buffering potential of genes remain similar to those at the current cutoff of 250 when different thresholds are used (i.e. TB high = top 100 or TB high = top 200). However, if we increase the cutoff of TB high to 2000 and TB low to top 2000-4000, the difference between the various features is diminished (Figure A & B). Further, protein variation (human cancer cell line and tissue) also becomes more similar across the three categories, possibly indicating a reduced regulatory potential of genes as their rank increases (Figure C & D). Our analyses reveal that highly ranked genes show associations with particular features, indicating an underlying hierarchy in translational buffering potential. This point is now discussed in the manuscript (line 177).

      Legend: Effect of different thresholds on A. length features, B. median RNA expression, C. protein variation in human cancer cell lines and D. protein variation in primary human tissues.

      In response to the reviewer's suggestion of presenting data using numerical quantitation, we incorporated several additional inclusions in the manuscript.

      i. We now report the association of CDS / UTR length with translational buffering as a function of translational buffering rank, with highly ranked genes showing associations with particular features, indicating an underlying hierarchy in translational buffering potential (Sup Fig 3 A-B).

      ii. We now include scatter plots which show that highly ranked genes have lower variation at the protein level in both cancer cell lines and primary tissues (Sup Fig. 6 A-C).

      iii. We have now carried out modified GO enrichment analyses. Specifically, Gene Ontology enrichment analysis was performed for the TB high genes in human and mouse using the clusterProfiler R package. Lists of TB high genes in human or mouse were analyzed against the Gene Ontology (GO) database using the enrichGO() function, with the organism-specific annotation database (org.Hs.eg.db for human or org.Mm.eg.db for mouse) as reference. Gene identifiers were supplied as gene symbols, and all genes in the current study were used as the background universe. Enrichment was carried out for the Biological Process (BP) ontology, with significance assessed by the hypergeometric test. P-values were adjusted for multiple testing using the Benjamini–Hochberg method, and terms with an adjusted p-value [...]

      Legend: Gene Ontology (GO) enrichment analysis of the TB high gene set, performed with the clusterProfiler R package. Enriched GO Biological Process terms are shown after redundancy reduction using clusterProfiler::simplify. Each dot represents a GO term, with dot size indicating the number of genes associated with the term and color reflecting the adjusted p-value (Benjamini–Hochberg correction). Only the top non-redundant terms are displayed.
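
The two statistical steps named in this reply, a hypergeometric over-representation test followed by Benjamini–Hochberg adjustment, can be sketched in plain Python. This is a minimal illustration of the general technique and not a reimplementation of enrichGO(). The function names and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k) for an over-representation test.

    N: genes in the background universe
    K: background genes annotated to the term
    n: genes in the query list (e.g. a TB high set)
    k: query genes annotated to the term
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four GO terms.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Using all genes in the study as the universe, as described above, corresponds to choosing `N` as the number of analyzed genes rather than the whole genome.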

      Additionally, we performed Gene set enrichment analysis using the list of genes ordered according to their RNA-TE correlation. Hence lower ranks have lower RNA-TE correlations. The GSEA plots show significantly enriched Gene Ontology Biological Process (GO:BP) terms at the lower ranks of the ordered gene list. Together, these analyses further emphasize the observation that genes involved in macromolecular complexes are translationally buffered.

      Legend: Curves represent the enrichment score (ES) across the ranked gene list, with vertical bars indicating the positions of pathway-associated genes. The enrichment was identified using the gseGO() function from clusterProfiler.
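
For readers unfamiliar with such curves, the running enrichment score can be illustrated with a simplified, unweighted running sum over a ranked gene list. This sketch is an assumption-laden simplification: tools such as gseGO() typically also weight hits by the ranking metric and estimate significance, both of which are omitted here. The gene names are hypothetical.

```python
def running_enrichment(ranked_genes, gene_set):
    """Return (enrichment_score, curve) for an unweighted running sum.

    The sum steps up at each gene that belongs to the set and steps down
    otherwise; the score is the point of maximum deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    curve = []
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        curve.append(running)
    score = max(curve, key=abs)
    return score, curve


# Hypothetical list ordered by RNA-TE correlation, lowest first.
ranked = ["a", "b", "c", "d"]
score_top, curve_top = running_enrichment(ranked, {"a", "b"})
score_bottom, _ = running_enrichment(ranked, {"c", "d"})
```

A set concentrated at the start of the list gives a positive score, and a set concentrated at the end gives a negative one.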

      Several of the statements of the authors in the Introduction or Discussion sections are not entirely true regarding the literature on the topics, or lack major papers on the topic, and therefore, they are a bit misleading. Among others, here are some:

      We thank the reviewer for the suggestions, which have now been incorporated into the revised manuscript.

      5-1 "In addition, genetic differences arising from aneuploidy, cell type differences or variability observed in the natural population can further determine the amplitude of variation (4-7). The effect of mRNA variation under these conditions is mostly reflected at the protein levels (2, 4-8).". Several recent or more ancient papers suggest that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level: DOI: 10.1038/s41586-024-07442-9 DOI: 10.1073/pnas.2319211121 DOI: 10.1016/j.cels.2017.08.013 DOI: 10.15252/msb.20177548

      We agree that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level for some genes. This point has now been revised in the introduction. We have incorporated all the suggested literature into the revised manuscript (line 38).

      5-2: The authors should also consider mentioning these studies and softening their initial statement. "Similarly, translational buffering of certain genes have been reported in mammalian cells, specifically under estrogen receptor alpha (ERα) depletion conditions (16).". Translational buffering has been deeply explored in mammalian tissues and even across several mammalian species in this study (DOI: 10.1038/s41586-020-2899-z). In this, the authors also provide a nice exploration of the gene characteristics that are associated with translational buffering. The authors should mention it and compare the study's findings to theirs ultimately.

      We thank the reviewer for this suggestion. We have now cited the recommended study in the revised manuscript (line 65). Here, we provide a comparison of its findings with ours. While this related work offers important insights into translational buffering, its focus is on buffering across species within a given tissue, whereas our study emphasizes buffering across conditions, cell types, and treatments within a species. Despite this difference in focus, the comparison is highly informative, and we now highlight both the similarities and distinctions between the two studies in the relevant section of the revised manuscript.

      Wang et al. calculate the variation at the transcriptome level versus the translatome level and represent it as a delta (∆) value for each gene. A lower value represents lower variation at the ribosome occupancy level than at the mRNA level across various species. We classified the genes in the Wang et al. study as TB high, TB low or others, as identified in the current study, while indicating the delta ∆ calculated by Wang et al. Many of the genes with a lower delta value (are delta ∆ [...]

      Legend: Dot plot highlighting the delta value of all genes in the Wang et al. study (also present in RiboBase), grouped as TB high, TB low or others, in (A) brain and (B) liver.

      5-3: "Differences in species evaluated and statistical methods have resulted in conflicting interpretations (13, 28).". These conflicting results have been previously discussed in reviews on the topic that would be worth mentioning: DOI: 10.1016/j.cell.2016.03.014 DOI: 10.1038/s41576-020-0258-4

      We have added these reviews at the appropriate location of the manuscript.

      1. In addition to the p-values stated in the main text, the authors should annotate their plots when they find significant differences between groups to greatly facilitate the visual interpretation of the graphs.

      We have now annotated many of the relevant graphs with p-values to facilitate visual interpretation, adding them where space and figure design allow.

      Based on the data of Figure 4D, apparently, ribosome occupancy was not buffered even in high TB sets. The authors may argue that translational buffering may not cope with such a strong mRNA reduction. In that case, how big a difference in mRNA level does the buffering system adjust in protein synthesis? The authors should test gradual gene knockdown and/or overexpression and conduct Ribo-Seq/RNA-Seq to survey the buffering range.

      We appreciate the reviewer’s suggestion regarding the experiment to determine the buffering range. To understand this for multiple genes, we attempted a series of knockdowns using a CRISPR-based MultiCas12a approach. We targeted 8 buffered and 2 non-buffered genes using a 10-plex crRNA along with 10-plex gRNA serving as a negative control (Figure below). The fold change at the mRNA level of the targeted genes was within the variation range observed in replicates for other non-targeted genes. The challenge in performing a gradual knockdown is that the subtle changes in RNA expression fall within the margin of error of estimation, making it difficult to understand the implications of mRNA levels for buffering. Hence, the precise experimental manipulation of mRNA expression levels that would be conducive to translational buffering remains technically highly challenging. As noted in our manuscript (Figure 4D), the conventional approaches for manipulation of transcript abundance lead to larger changes than typically observed as a result of natural variation.

      *Legend: Validation of translational buffering by targeted knockdown of genes. A. The scatter plot shows the coefficient of variation of mRNA and ribosome occupancy between HEK293T cells targeted with sgRNA of different efficiencies. The genes indicated in blue are buffered and those in green are non-buffered. B. The plot shows the fold change in mRNA abundance and ribosome occupancy compared to cells that were infected with a non-targeting crRNA array control (ratio of cpm in test vs control). Each color represents a gene and each point of a gene represents cells targeted by one of the four CRISPR arrays.*

      "differential transcript accessibility model" could not be functional if mRNA is reduced beyond the accessible pool (i.e., less than the threshold, all the mRNAs are translated without buffering). The authors should carefully reconsider this model and the effective range of mRNAs.

      We agree with the reviewer that according to the 'differential transcript accessibility model,' transcripts with abundances below a certain threshold should be completely accessible to the translational pool. Further, this could also be true for the other model, wherein the initiation rate cannot increase beyond a particular threshold for transcripts of very low abundance. However, our observations from the haploinsufficiency analysis (Figure 4 B & C) and the siRNA knockdown analysis from RiboBase (Figure 4 D) suggest that buffering might be possible within a given range of transcript abundance. Testing the buffering range by serial knockdowns might help in determining the threshold at which transcripts exhibit buffering. However, the challenges of serial knockdown discussed above make this analysis difficult with ribosome profiling and matched RNA-seq. An alternative approach could involve imaging translating and non-translating mRNA of buffered genes in different cells, which may help distinguish the two models. However, this falls outside the scope of the manuscript.

      Minor comments:

      1. Some figures are of poor quality as they seem to have points outside of the panel representations... Like Figure 3C, one point is out of the square, same for Figure 4E. Similarly, on figure 5F, some outliers seem to be clearly cut from the figure (maybe not, but then the author should put a larger space between the end of the figure and the max y points). Same for panel S2D and S6D, this does not sound so rigorous.

      We agree and apologize for this issue. The axes of the figures have been annotated appropriately to indicate the presence of outliers in the figures.

      1. There are several typos or weird sentences. Here are some (but maybe not all): 2-1: [...]with lower sums corresponding to higher final ranks. "two rankings". Based on these final ranks[...] 2-2: For each dataset, median absolute deviation (MAD) "i" protein abundance was calculated across samples 2-3: [...]neighbor method implemented in the MatchIT package (38) Differences in protein[...] a point is missing here. 2-4: Additionally a second dataset providing predictions of haploinsufficiency (pHaplo score) and triplosensitivity (pTriplo score) for all autosomal genes (25) was used to asses the distribution of these score"S" across buffered and non-buffered gene sets . There is a missing "s" at "score" and there is a space between the last word and the final point.

      The necessary corrections have been incorporated in the revised version of the manuscript.

      1. In the "Lymphoblastoid cell line data analysis:" section, this reviewer wonders why the authors used a different method to calculate buffering compared to before.

      The main reason is the limited sample of the lymphoblastoid cell line data. In our larger analyses, we could use median absolute deviation as a robust metric of dispersion across heterogeneous samples. However, given the smaller dataset in that study we decided CV would be a better indicator of dispersion. To evaluate the potential for translational buffering of genes from RiboBase, we used two metrics. The first was the negative correlation between translation efficiency and RNA abundance across samples. The second metric relied on the ratio of variation in ribosome occupancy to variation in RNA levels. Given the limited sample size of the lymphoblastoid cell line dataset, we used the coefficient of variation (CV) instead of the median absolute deviation (MAD), as the data in this study were normalized using counts per million (CPM) rather than the centered log-ratio (clr) normalization used in RiboBase. This CV ratio allowed us to assess the effect of natural variation in RNA abundance on ribosome occupancy.
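
The two dispersion-ratio variants described in this reply can be compared side by side in a short Python sketch. It assumes one vector of ribosome occupancy and one of RNA abundance per gene, measured across the same samples. The function names and the toy values are hypothetical, and the sketch does not reproduce the clr or CPM normalization steps.

```python
from statistics import mean, median, stdev


def mad(values):
    """Median absolute deviation from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def cv(values):
    """Coefficient of variation; only meaningful for positive-scale data such as CPM."""
    return stdev(values) / mean(values)


def dispersion_ratio(ribo, rna, metric):
    """Ratio of variation in ribosome occupancy to variation in RNA abundance.

    Values well below 1 suggest that RNA variation is dampened at the
    level of ribosome occupancy.
    """
    return metric(ribo) / metric(rna)


# Hypothetical gene: RNA varies widely, ribosome occupancy barely moves.
rna = [10, 20, 30, 40, 50]
ribo = [28, 29, 30, 31, 32]
mad_ratio = dispersion_ratio(ribo, rna, mad)
cv_ratio = dispersion_ratio(ribo, rna, cv)
```

One reason the choice of metric can follow the normalization is that centered log-ratio values can have a mean near zero, which makes the CV unstable, whereas CPM values are strictly positive.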

      1. "Samples which had R2 less than 0.2 were removed as the residuals calculated for these samples could be unreliable". These samples for which the correspondence between RNA-Seq and Ribo-Seq is low wouldn't be the ones most impacted by translational buffering? Is it sure that the authors are not missing something here?

      We agree with the reviewer that genes that show translational buffering may not conform to linear relationships between the two parameters. However, the proportion of genes exhibiting this buffering effect is not expected to significantly influence the overall regression fit. Instead, we hypothesized that low quality samples or truly different relationships between the two parameters can make this relationship nonlinear, rendering it unsuitable for linear regression analysis for calculation of TE.

      To address these possibilities, we first analysed a commonly used proxy for data quality. Given the characteristic movement of ribosomes across mRNAs, periodicity of sequencing reads is a useful metric to assess whether reads are randomly fragmented, as in RNA-seq, or specifically represent ribosome-protected footprints. For this, we compared two groups: samples that were removed (~30) and those retained for analysis. We plotted the distribution of periodicity scores for all samples in both groups. For the calculation of periodicity scores, first the percentage of reads mapped to the dominant frame position across the dynamic ribosome footprint read length range was calculated for each sample. The periodicity score was calculated by taking the weighted sum of these dominant percentages, with weights based on the total read counts at each length.
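
The periodicity score described above can be sketched as follows. The input layout (per read length, counts of reads in each of the three frames) and the normalization of weights to fractions of total reads are assumptions made for this illustration and may differ from the authors' implementation.

```python
def periodicity_score(frame_counts_by_length):
    """Weighted sum of dominant-frame percentages across read lengths.

    frame_counts_by_length maps a footprint length to a tuple of read
    counts in frames 0, 1 and 2. Each length contributes the percentage
    of its reads in the dominant frame, weighted by its share of all reads.
    """
    total_reads = sum(sum(counts) for counts in frame_counts_by_length.values())
    if total_reads == 0:
        raise ValueError("no reads supplied")
    score = 0.0
    for counts in frame_counts_by_length.values():
        length_total = sum(counts)
        if length_total == 0:
            continue
        dominant_pct = 100.0 * max(counts) / length_total
        score += dominant_pct * (length_total / total_reads)
    return score


# Hypothetical counts: strong periodicity at 28 nt, weaker at 29 nt.
sample = {28: (80, 10, 10), 29: (50, 25, 25)}
score = periodicity_score(sample)
uniform = periodicity_score({30: (10, 10, 10)})
```

Under this definition, reads spread evenly across frames, as expected for randomly fragmented RNA, give a score near 33, while well-phased footprints approach 100.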

      The results indicate that the removed samples did not have lower periodicity scores, suggesting that their quality in terms of periodicity was comparable to the retained samples.

      To assess the second possibility, we checked whether the studies involved major perturbations, which may skew the relationship towards nonlinearity. The 30 samples that were removed came from 14 unique studies, and 18 of these samples involved a perturbation that possibly affected either of the two parameters. In addition to the genetic/pharmacological perturbations specific to each study, the overall conditions of the cells during an experiment could influence this relationship. Another point to note is that many of the filtered-out samples are HeLa and HEK293T cells, which show a normal relationship between ribosome occupancy and RNA abundance in the majority of cases.

      These considerations suggest that removing these samples is most appropriate, as their inclusion could bias the TE calculations.

      For Figure 4B and 4C, the authors should provide statistical tests and p-values to confirm the observed trends.

      The haploinsufficiency and triplosensitivity analyses are now supported by a chi-squared test. The details of the statistical test are now mentioned in the text and the p-values have been noted on the respective figures.
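
A chi-squared test of independence on a contingency table of gene category by haploinsufficiency status can be sketched with the Python standard library. The counts below are hypothetical and are not the study's data. The p-value helper covers only 1 or 2 degrees of freedom, where closed forms exist, which is a deliberate simplification.

```python
from math import erfc, exp, sqrt


def chi_squared_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def chi_squared_pvalue(statistic, dof):
    """Upper-tail p-value, closed form for 1 or 2 degrees of freedom only."""
    if dof == 1:
        return erfc(sqrt(statistic / 2.0))
    if dof == 2:
        return exp(-statistic / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


# Hypothetical counts: rows are TB high, TB low, others;
# columns are haploinsufficient, not haploinsufficient.
table = [[20, 30], [25, 25], [30, 20]]
statistic, dof = chi_squared_statistic(table)
p_value = chi_squared_pvalue(statistic, dof)
```

For larger tables a statistics library would normally be used to obtain the p-value.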

      In Figure 2A, the "all genes" color doesn't correspond to the point color.

      The color in the figure has been modified in the revised version of the manuscript.

      1. "To understand if codon usage patterns are[...]". This comes slightly out of the blue. The authors could maybe explain why codon usage should be explored for translational buffering. The authors should cite recent key works in the fields: DOI: 10.1016/j.celrep.2023.113413 DOI: 10.1101/2023.11.27.568910

      We would like to thank the reviewer for their suggestion. The references have been incorporated in the revised version of the manuscript. We have now explained why codon usage could be a contributor in determining the translational buffering potential (line 190).

      "The change in each metric was calculated by subtracting the mean value in the control samples from that in the knockdown samples. This yielded the differential mRNA abundance and ribosome occupancy resulting from gene knockdown.". This looks statistically weak. The authors should consider using more robust methods like DESeq.

      We thank the reviewer for the suggestion. We reanalyzed the selected studies using edgeR and the modified figure is included in the revised version of the manuscript (Figure 4D). The conclusion after this analysis remains essentially the same. In particular, translational buffering is ineffective when mRNA abundance is perturbed drastically. Additionally, the limited number of experiments with direct perturbation of buffered genes limits the generalizability of this observation. This limitation is included in the results section (line 342).

      Legend: Scatter plot represents log2 fold change in RNA abundance and ribosome occupancy. Each point represents a gene and the fold change in its RNA and ribosome occupancy with respect to their controls. The line represents the line of equivalence. Buffered genes do not show less change in ribosome occupancy upon reduction in their RNA levels than other genes.

      1. "Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set, indicating that candidates in the buffered gene set are relatively well expressed due to the presence of a higher proportion of the codons observed in highly expressed genes". What do the authors mean by "relatively well expressed"? Abundantly expressed? This sentence and the causality under it is unclear and should be modified or better explained.

      We thank the reviewer for pointing out the lack of clarity in the sentence. We have now quantitatively measured the CAI in the three categories and modified the sentence to better explain the rationale in the revised version (line 183). “To understand if codon usage patterns are associated with translational buffering, we next analyzed codon properties across buffered and non-buffered human gene sets. The codon adaptation index quantifies how closely a gene’s codon usage aligns with that of highly expressed genes. Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set. Specifically, 28.4% of TB high genes, 14% of TB low genes and 9.3% of genes in the other category fall within the top decile (>90th percentile) of codon adaptation index.”

      The panel 4D is unclear. Is one point associated with one gene? Or is it the average of several genes? If it's one point for one gene, it is important to clearly state it because the number of cases is therefore quite low, especially for the TB high and low.

      Each point and line are associated with a single gene. This is now clarified in the legend of the figure (line 364). The number of genes in this analysis is limited to the available ribosome profiling data with gene knockdown experiments.

      1. In Figure 2J, GGU (Gly), AAG (Lys), and ACU (Arg) provide negative effects on prediction, although these were enriched in the high TB set (Figure 2E). This contradiction should be explained.

      While this appears to be a seeming contradiction, it is in line with what we expected. In particular, the objective of Figure 2J is to illustrate the features that predict the mRNA–TE correlation of genes, as identified using a LGBM model. The Spearman correlation shown reflects the relationship between each feature and the mRNA–TE correlation values. A negative correlation for codons such as GGU (Gly), AAG (Lys), and ACU (Thr) suggests that enrichment of these codons is associated with lower mRNA–TE correlation. This is in agreement with our observation in Figure 2E which suggests that high TB genes are enriched in these codons. In contrast, transcript size exhibits a positive correlation, indicating that shorter transcripts tend to have lower mRNA–TE correlation values.

      Given that the choice of colors is a potential source of confusion, we have revised the text (line 230) and the figure (& legend) to try to clarify this relationship.

      The subtitle of "Translationally buffered genes exhibit variable association kinetics with the translational machinery in response to mRNA variation" sounds unfair to this reviewer. Since the authors did not work on kinetics directly, the use of this word is misleading.

      We agree and revised the subtitle to “The association of translationally buffered genes with the translational machinery varies in response to changes in mRNA abundance"

      1. The explanation of Figure 5A "We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the "differential transcript accessibility model", mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the "initiation rate model", the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, the proportion of mRNA entering the translational pool would be comparable across buffered and non-buffered genes (Fig 5A)." is hard to understand. The authors should rewrite for a better understanding of the readers.

      This section has been rewritten in the revised version of the manuscript. The text now reads as

      “We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the “differential transcript accessibility model”, mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the “initiation rate model”, the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, as mRNA abundance increases, translation initiation on each transcript is reduced, thereby lowering the number of ribosomes per transcript. However, this mechanism allows a proportional increase in transcripts entering the translational pool for buffered genes, similar to non-buffered genes”
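To make the distinction between the two models concrete, here is a toy numerical sketch in Python. It is not taken from the manuscript: the power-law forms, the function names, and every parameter value are illustrative assumptions, chosen only so that both models damp ribosome occupancy to the same degree while differing in how many transcripts enter the translating pool.

```python
# Toy illustration only: functional forms and parameters are assumptions,
# not the authors' model.

def accessibility_model(mrna, exponent=0.5, ribosomes_per_transcript=5.0):
    """Differential transcript accessibility: pool entry grows sublinearly
    with mRNA abundance, while ribosome load per transcript stays constant."""
    translating = mrna ** exponent
    occupancy = translating * ribosomes_per_transcript
    return translating, occupancy


def initiation_rate_model(mrna, accessible_fraction=0.5, exponent=0.5, base_load=5.0):
    """Initiation rate: pool entry stays proportional to mRNA abundance,
    but ribosome load per transcript falls as abundance rises."""
    translating = accessible_fraction * mrna
    load = base_load / mrna ** exponent
    occupancy = translating * load
    return translating, occupancy


def fold_changes(model, low, high):
    """Fold change in translating transcripts and in ribosome occupancy
    when mRNA abundance rises from `low` to `high`."""
    t_low, o_low = model(low)
    t_high, o_high = model(high)
    return t_high / t_low, o_high / o_low


if __name__ == "__main__":
    for model in (accessibility_model, initiation_rate_model):
        pool_fc, occupancy_fc = fold_changes(model, 100.0, 400.0)
        print(model.__name__, pool_fc, occupancy_fc)
```

Under these assumed parameters, a fourfold increase in mRNA doubles ribosome occupancy in both models, but the translating pool doubles in the accessibility model and quadruples in the initiation rate model. This is the kind of difference that fractionation measurements could in principle distinguish.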

      Significance

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. A group of mRNAs whose expression variance is buffered at the translation level was comprehensively surveyed in humans and mice. The authors found a series of features in the translationally buffered genes, including high GC contents in the 5′ UTR, optimal codon usage, and mRNA length. The depletion or increase of one allele of the genes in the group may be particularly detrimental to cells. The authors' report provides a step forward in our understanding of translational buffering, appealing to the broad scientific community in basic and applied biology. However, this reviewer found a series of concerns in this paper, including clarity in the methods, experimental validation, referring the earlier works, etc. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      We thank the reviewer for noting the significance of the work and for their constructive feedback.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Rao and colleagues present a comprehensive analysis of translational buffering in human and mouse by mining 1515 matched ribosome profiling and RNAseq datasets from diverse tissues and cell lines. They define translational buffering as genes whose TE is negatively correlated with mRNA abundance across conditions, and further identify candidates by comparing median absolute deviations of ribosome occupancy versus mRNA levels. The authors find a conserved set of buffered genes enriched for components of multiprotein complexes, demonstrate that buffered genes exhibit lower protein variability and greater dosage sensitivity, and propose two non-mutually exclusive mechanistic models (differential accessibility and initiation rate modulation). Finally, they perform complementary fractionation experiments in HEK293T cells to support these models.

      These findings propose a novel, conserved mechanism of translational buffering that tunes gene expression in mouse and human, showing how intrinsic sequence features and cellular context cooperate to stabilize protein output across diverse conditions. However, further evidence is required to fully support the authors' conclusions, particularly direct validation of the proposed models of buffering.

      We thank the reviewer for their positive assessment and thoughtful suggestions that we address below.

      Below are my main concerns:

      1. The choice of the top 250 genes by Spearman correlation and MAD ratio as "TB high" seems arbitrary. The authors should justify these cutoffs (via permutation analysis or FDR control) and show that conclusions are robust to different thresholds.

      We agree that the threshold used to define TB high and TB low is somewhat subjective, and we now clearly acknowledge this in the discussion section (line 485). We now provide an R script that reproduces all analyses of translational buffering, where changing this cutoff to higher or lower values is straightforward.

      To ensure the robustness of our conclusions, we evaluated several thresholds for defining TB high and TB low. We observed that the conclusions hold within a reasonable range of values (100–250). For example, the effects on protein variation and the association of intrinsic features such as UTR lengths with buffering potential remain consistent when TB high is defined as the top 100 or the top 200 genes, compared with the current cutoff of 250. In contrast, when we define TB high as the top 2000 and TB low as ranks 2000–4000, the differences in these features between groups are diminished (Figure A & B). Further, protein variation (human cancer cell lines and tissues) also becomes more similar across the three categories, possibly indicating a reduced regulatory potential of genes as their rank increases (Figure C & D). Our results show that highly ranked genes consistently associate with specific features, suggesting an underlying hierarchy in translational buffering potential.

      Legend: Effect of different thresholds on A. length features, B. median RNA expression, C. protein variation in human cancer cell lines, and D. protein variation in primary human tissues.
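The threshold sweep described in this response can be sketched in a few lines of Python. This is a minimal illustration and not the authors' R script: it assumes each gene is ranked by its mRNA–TE Spearman correlation (more negative ranks first) and by the ratio of the MAD of ribosome occupancy to the MAD of mRNA abundance (smaller ranks first), and that the two ranks are summed. The function names, the direction of the MAD ratio, and the toy data are assumptions.

```python
from statistics import median


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mad(values):
    """Median absolute deviation around the median."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def buffering_order(genes):
    """genes maps name -> (mrna, te, ribo) per-sample lists.
    Returns gene names, most buffered first (lowest rank sum)."""
    names = list(genes)
    rho = [spearman(genes[g][0], genes[g][1]) for g in names]
    ratio = [mad(genes[g][2]) / mad(genes[g][0]) for g in names]
    rank_sum = [a + b for a, b in zip(average_ranks(rho), average_ranks(ratio))]
    return [names[i] for i in sorted(range(len(names)), key=lambda i: rank_sum[i])]


def tb_high(genes, cutoff):
    """Top `cutoff` genes by combined rank; vary `cutoff` to test robustness."""
    return set(buffering_order(genes)[:cutoff])


# Hypothetical toy data: (mRNA, TE, ribosome occupancy) across five samples.
toy_genes = {
    "geneA": ([1, 2, 3, 4, 5], [2.0, 1.1, 0.0, -0.9, -2.0], [3.0, 3.1, 3.0, 3.1, 3.0]),
    "geneB": ([1, 2, 3, 4, 5], [0.0, 0.1, 0.0, 0.1, 0.0], [1.0, 2.1, 3.0, 4.1, 5.0]),
    "geneC": ([1, 2, 3, 4, 5], [0.0, -0.5, -1.0, -1.5, -2.0], [1.0, 1.5, 2.0, 2.5, 3.0]),
}
```

Calling `tb_high(toy_genes, n)` for several values of `n` and repeating a downstream comparison for each resulting set is one simple way to check that a conclusion does not depend on a single cutoff.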

      The modified compositional regression approach for TE and imputation of missing values are central to the study, but details are relegated to supplemental methods. The manuscript would benefit from a clear comparison of this method against standard log-ratio TE estimates, including sensitivity analyses to missing-data imputation strategies

      We thank the reviewer for the feedback. We have now added further description of the modified compositional regression and the imputation strategy in the results section (line 94). Comparison to standard log-ratio TE estimates and their limitations has already been detailed in Liu et al. 2025, Nature Biotechnology. Therefore, in the current manuscript we specifically focus on the effect of the imputation strategy.

      Specifically, the modified imputation slightly improved concordance between the sets of genes identified as translationally buffered using the negative RNA–TE relationship and using the RNA–ribosome occupancy correlation (0.91 to 0.94). Further, we assessed the correlation between TE and protein abundance as measured by mass spectrometry from seven human cell lines (A549, HEK293, HeLa, HepG2, K562, MCF7 and U2OS). The protein measurements were obtained from PaxDb. The new imputation strategy slightly increased the mean correlation between TE and proteome abundance compared with the naive strategy. It specifically showed improved correlation for the HepG2, A549 and HeLa cell lines. This analysis used the 3,507 genes common to PaxDb, Liu et al., 2025 and the current study.
      

      Legend: Proteomics vs. TE correlation across cell types, without or with the imputation strategy. Spearman correlation between protein abundance and compositional TE, either as calculated by Liu et al., 2025 from 68 samples from 11 studies (HEK293), 86 samples from 10 studies (HeLa), 58 samples from four studies (U2OS), 29 samples from five studies (A549), five samples from two studies (MCF7), seven samples from two studies (K562) and 10 samples from two studies (HepG2), or as calculated in the current study from 57 samples from 10 studies (HEK293), 82 samples from nine studies (HeLa), 58 samples from four studies (U2OS), 29 samples from five studies (A549), five samples from two studies (MCF7), one sample from one study (K562) and nine samples from two studies (HepG2). This analysis used the 3,507 genes common to PaxDb, Liu et al., 2025 and the current study.

      Human data are derived mainly from immortalized cell lines, whereas mouse data are from primary tissues. Pooling these heterogeneous sources may conflate cell type-specific regulation with intrinsic buffering. The authors should either stratify analyses by context or demonstrate buffering signatures remain consistent within more homogeneous subsets

      We thank the reviewer for the suggestion and agree that heterogeneity could potentially mask cell type-specific buffering effects. The TB-high genes we report are those that show consistent and robust expression across diverse contexts. However, unlike RNA-seq datasets, the current number of ribosome profiling samples per cell type is still limited, and a more comprehensive assessment of context-specific buffering will require larger datasets that will accumulate over time.

      Nonetheless, we have stratified the analysis by cellular context. Specifically, we grouped samples of the same cell type and repeated the buffering analysis. We provide a new table on GitHub listing the TB ranks of genes for the five cell types with the largest sample sizes.

      https://github.com/CenikLab/Translational-buffering/blob/Translational-Buffering/combined_tables.xlsx

      As an additional control, we compared buffering patterns between related and unrelated cell lines. For example, the correlation of TB ranks between related cell lines HEK293T (n = 98) and HEK293 (n = 57) is higher (0.46) than between either and an unrelated cell line, HeLa (n = 82). Similarly, the correlation between two liver cell lines, Huh7 (n = 39) and HepG2 (n = 9), is higher (0.20) than between Huh7 and a similarly sampled but unrelated lymphoblastoid cell line (LCL, n = 9; correlation = 0.05). While these analyses suggest that cell type-specific patterns may exist, their exploration is currently limited by sample size, as detecting buffering requires substantial variability in mRNA expression. We now highlight this as a limitation in the Discussion section (line 573).

      Legend: Spearman correlation between TB ranks of different pairs of cell lines. The first set indicates comparisons with HEK293T. The second set indicates comparisons between liver cells (HepG2 and Huh7).

      The HEK293T fractionation experiments offer preliminary support for both the "accessibility" and "initiation" models, but only slope analyses are shown. To validate these models, the authors should perform targeted reporter assays (dual luciferase constructs with 5′UTR swaps) or manipulations of initiation factors (eIF4E knockdown) to directly test how transcript abundance alters initiation rates versus pool entry

      We thank the reviewer for suggesting experiments to validate the proposed models. In the luciferase reporter experiments, constructs bearing the endogenous UTRs from non-buffered genes would be expected to result in expression that is proportional to transcript abundance. In contrast, swapping in a 5′UTR from a buffered gene would be expected to dampen this proportionality through translational buffering via the “initiation rate model”, depending on the 5′UTR sequence of the transcript. However, as outlined below, this experiment has important caveats:

      1. Role of coding sequence: Such assays primarily test the contribution of the 5′UTR and do not address potential cooperative effects between the 5′UTR and the coding sequence (CDS). Thus, if a 5′UTR fails to recapitulate translational buffering, it would be unclear whether the buffering requires coordinated action of the 5′UTR and CDS or whether the gene in question simply does not conform to the initiation-rate model.
      2. Sensitivity of measurements: Reporter-based measurements often rely on RT-qPCR to quantify expression changes. While suitable for large fold-changes, small shifts may fall within the assay’s technical margin of error, limiting the interpretability of the results.
      3. Gene-to-gene variability: Buffered and non-buffered transcripts likely span a wide range of intrinsic initiation rates. Selecting only a few “representative” transcripts for 5′UTR swapping could yield results that are not broadly generalizable.

      Similarly, knockdown of general initiation factors will likely affect both buffered and non-buffered genes, which could limit the ability to distinguish the effect of transcript abundance on translational buffering via either of the proposed models. We envision an alternative future approach involving single-molecule imaging of translating and non-translating mRNAs of buffered and non-buffered genes under varying abundance conditions in a physiological context. Such experiments are likely the most suitable for disentangling the contributions of accessibility versus initiation. While we find this an exciting direction for future work, it lies beyond the scope of the present manuscript.

      The conclusion that buffering reduces protein variability relies on mass-spec comparisons, but ribosome occupancy does not always reflect functional protein output (due to elongation stalling or co-translational degradation). Incorporating orthogonal measures, such as pulse-labeling or western blots for key buffered versus non-buffered genes, would strengthen the link between buffering and proteome stability

      We agree with the reviewer’s concern and have acknowledged it as a limitation in the discussion section. To address this with orthogonal approaches, we carried out several additional experiments. Specifically, using RNA-seq data, we identified a study in RiboBase (GSE132703) that exhibited significant variation in the abundance of the FUS transcript (a translationally buffered gene) across conditions, namely HEK293T wild type, LARP1A single knockout (SKO), and LARP1A/B double knockout (DKO). We reached out to the authors of the study and obtained these knockout cell lines. We reanalyzed RNA abundance under the different conditions by RT-qPCR and assessed protein levels by Western blot. Despite the differences in RNA abundance, FUS protein levels did not change correspondingly.

      We also selected a non-buffered gene, DNAJC6, that showed RNA-level differences. However, the change in RNA expression was not consistently reflected at the protein level. Caveats of Western blotting include its limited sensitivity, which may prevent detection of subtle changes, and the fact that it measures steady-state protein levels, which cannot resolve whether differences arise from altered synthesis or degradation.

      Legend: Validation of a buffered gene by Western blot. A. Plot showing the RNA abundance and ribosome occupancy of the buffered gene FUS and the non-buffered gene DNAJC6 in HEK293T wild type, LARP1A single knockout and LARP1A/B double knockout. B. Validation of the RNA-seq data by qPCR. C. Western blot showing FUS, DNAJC6 and Actin in wild type and the different mutants. D. Bar plot showing the quantification of the Western blot.

      In addition to this targeted analysis, we performed quantitative mass spectrometry to evaluate the effect of mRNA variation at the protein level on a global scale.
      

      LC-MS/MS analysis was performed on the above samples in triplicate at the Proteomics facility of the University of Texas. A total of 4,048 proteins were identified using a peptide confidence threshold of 95% and a protein confidence threshold of 99%, with a minimum of two peptides required for identification. Total precursor intensities for all peptides of a protein were summed and used for protein quantification with the DEP (Differential Enrichment of Proteomics Analysis) package in Bioconductor, R (https://rdrr.io/bioc/DEP/man/DEP.html). DEP was used for variance normalization and statistical testing of differentially expressed proteins. As expected, LARP1 protein was identified in the control cells but not in the single or double knockouts.

      We then plotted the fold change in RNA, as determined by edgeR analysis of RNA-seq data from Philippe et al. (2020), against the fold change in protein abundance from our mass spectrometry data. Linear regression analysis showed that genes in the TB high group exhibit smaller changes at the protein level than TB low or other genes in both the single and double LARP1 KO mutants. This result is consistent with our finding that buffered genes show lower variation in protein abundance in response to changes in mRNA expression.

      Legend: Scatter plot showing the log2 fold change in RNA and protein levels, as determined by RNA-seq from Philippe et al. (2020) or by mass spectrometry. Differential analysis of RNA was done using the edgeR package, and the DEP (Differential Enrichment of Proteomics Analysis) package was used for mass spectrometry analysis. Only genes with an FDR

      We have not included this data in the manuscript given the deviation of the approach from our original analysis, but we are happy to reconsider the inclusion of this data to supplement our proteomic analysis.
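The per-group slope comparison described in this response can be sketched as follows. This is a hedged illustration in plain Python, not the authors' analysis: it assumes the regression is an ordinary least-squares fit of protein log2 fold change on RNA log2 fold change within each gene group, and the group labels and toy values are hypothetical.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b*x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def slopes_by_group(records):
    """records: iterable of (group, rna_log2fc, protein_log2fc).
    Returns {group: slope}; a slope near 0 means protein changes little
    when RNA changes, and a slope near 1 means protein tracks RNA."""
    grouped = {}
    for group, rna_fc, protein_fc in records:
        xs, ys = grouped.setdefault(group, ([], []))
        xs.append(rna_fc)
        ys.append(protein_fc)
    return {group: ols_slope(xs, ys) for group, (xs, ys) in grouped.items()}


# Hypothetical toy values, for illustration only.
toy_records = (
    [("TB high", r, 0.25 * r) for r in (-2.0, -1.0, 0.0, 1.0, 2.0)]
    + [("TB low", r, 0.75 * r) for r in (-2.0, -1.0, 0.0, 1.0, 2.0)]
)
toy_slopes = slopes_by_group(toy_records)
```

A shallower slope in the TB high group than in the other groups is the pattern expected if buffered genes damp RNA-level changes before they reach the proteome.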

      While the LGBM modeling shows modest predictive power of sequence features alone, the manuscript stops short of exploring what cellular factors might drive context dependence. Integrating public datasets on RNA-binding protein expression or mTOR pathway activity across samples could illuminate trans-acting determinants of buffering and move beyond correlative sequence analyses.

      We thank the reviewer for this suggestion. To investigate potential trans-acting determinants of buffering, we focused on 1,394 human RBPs as classified by Hentze et al. (2018), reasoning that some of these factors may facilitate translational buffering. Specifically, we examined correlations between the RNA expression of each RBP and the TE of all other genes across samples. P-values were corrected using the Bonferroni procedure. For each RBP, we then performed a Fisher’s exact test to assess whether the number of significant correlations was enriched among buffered versus non-buffered genes.

      This analysis revealed that the expression levels of many RBPs are significantly enriched for either positive or negative correlations with the TE of buffered genes. In particular, we note that RNA expression of many buffered RBPs is enriched for negative correlations with the TE of other buffered transcripts. These results suggest that, rather than considering translational buffering in isolation for each transcript, buffering effects may be coordinated at the translational level and influenced by shared trans-acting factors such as RBPs. Network-based approaches have been valuable for RNA co-expression and are only now being applied to TE covariation. However, the correlative nature of these analyses limits causal inference. For example, although many ribosomal proteins appear to influence the buffering of other ribosomal proteins, they themselves may be regulated by a non-ribosomal RBP—so the apparent effects could reflect upstream regulatory influences. This analysis is now included as a supplementary figure (Sup. Fig. 5) of the revised manuscript.

      Legend: Scatter plot of the log odds ratio of the number of significant correlations (RNA abundance of RBPs :: TE of genes) against the p-value from Fisher’s exact test. The vertical dashed line represents the threshold odds ratio, above which RBPs exhibit a higher number of significant correlations with buffered genes. P-values were corrected using the Bonferroni procedure, and the horizontal dashed line represents the adjusted p-value cutoff.
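The enrichment test in this response can be illustrated with a short, self-contained Python sketch. It is not the authors' code: the response does not state whether the Fisher's exact test was one-sided or two-sided, so the one-sided (enrichment) form is an assumption here, as are the function names and the layout of the 2×2 table.

```python
from math import comb


def bonferroni(p_values):
    """Bonferroni correction: multiply each p-value by the number of tests, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


def fisher_enrichment(sig_buffered, nonsig_buffered, sig_other, nonsig_other):
    """One-sided Fisher's exact test on the 2x2 table
        [[sig_buffered, nonsig_buffered],
         [sig_other,    nonsig_other]].
    Returns (odds_ratio, p_value), where p_value is the hypergeometric
    probability of at least `sig_buffered` significant correlations
    falling among buffered genes."""
    n_buffered = sig_buffered + nonsig_buffered
    n_sig = sig_buffered + sig_other
    total = n_buffered + sig_other + nonsig_other
    upper = min(n_buffered, n_sig)
    tail = sum(
        comb(n_buffered, k) * comb(total - n_buffered, n_sig - k)
        for k in range(sig_buffered, upper + 1)
    )
    p_value = tail / comb(total, n_sig)
    cross = nonsig_buffered * sig_other
    odds_ratio = (sig_buffered * nonsig_other) / cross if cross else float("inf")
    return odds_ratio, p_value


# Hypothetical counts for a single RBP, for illustration only.
example_odds, example_p = fisher_enrichment(8, 2, 1, 9)
```

Running this once per RBP gives one odds ratio and one p-value per RBP, which are the two quantities plotted in the scatter plot described in the legend above.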

      Reviewer #2 (Significance (Required)):

      Overall, this manuscript leverages an unprecedented compendium of matched ribosome profiling and RNAseq datasets across human cell lines and mouse tissues, combined with improved TE estimation, to robustly catalog genes exhibiting translational buffering, a clear methodological and conceptual strength. The main limitations stem from heterogeneous sample sources, largely correlative analyses, and a lack of targeted mechanistic validation. Compared to prior yeast focused studies, it fills a key gap by demonstrating conservation of buffering in mammals and linking it to dosage sensitivity and protein stability, representing a conceptual advance in understanding post-transcriptional homeostasis and a methodological step forward in TE analysis. This work will interest researchers in RNA biology, gene expression regulation, systems biology, and cancer proteomics, as well as those studying dosage-sensitive pathways and translational control. My expertise is on translational control in cancer.

      We thank the reviewer for noting the broader significance of the work and for their constructive feedback.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Rao and colleagues present a comprehensive analysis of translational buffering in human and mouse by mining 1515 matched ribosome profiling and RNAseq datasets from diverse tissues and cell lines. They define translational buffering as genes whose TE is negatively correlated with mRNA abundance across conditions, and further identify candidates by comparing median absolute deviations of ribosome occupancy versus mRNA levels. The authors find a conserved set of buffered genes enriched for components of multiprotein complexes, demonstrate that buffered genes exhibit lower protein variability and greater dosage sensitivity, and propose two non-mutually exclusive mechanistic models (differential accessibility and initiation rate modulation). Finally, they perform complementary fractionation experiments in HEK293T cells to support these models.

      These findings propose a novel, conserved mechanism of translational buffering that tunes gene expression in mouse and human, showing how intrinsic sequence features and cellular context cooperate to stabilize protein output across diverse conditions. However, further evidence is required to fully support the authors' conclusions, particularly direct validation of the proposed models of buffering. Below are my main concerns:

      1. The choice of the top 250 genes by Spearman correlation and MAD ratio as "TB high" seems arbitrary. The authors should justify these cutoffs (via permutation analysis or FDR control) and show that conclusions are robust to different thresholds
      2. The modified compositional regression approach for TE and imputation of missing values are central to the study, but details are relegated to supplemental methods. The manuscript would benefit from a clear comparison of this method against standard log-ratio TE estimates, including sensitivity analyses to missing-data imputation strategies
      3. Human data are derived mainly from immortalized cell lines, whereas mouse data are from primary tissues. Pooling these heterogeneous sources may conflate cell type-specific regulation with intrinsic buffering. The authors should either stratify analyses by context or demonstrate buffering signatures remain consistent within more homogeneous subsets
      4. The HEK293T fractionation experiments offer preliminary support for both the "accessibility" and "initiation" models, but only slope analyses are shown. To validate these models, the authors should perform targeted reporter assays (dual luciferase constructs with 5′UTR swaps) or manipulations of initiation factors (eIF4E knockdown) to directly test how transcript abundance alters initiation rates versus pool entry
      5. The conclusion that buffering reduces protein variability relies on mass-spec comparisons, but ribosome occupancy does not always reflect functional protein output (due to elongation stalling or co-translational degradation). Incorporating orthogonal measures, such as pulse-labeling or western blots for key buffered versus non-buffered genes, would strengthen the link between buffering and proteome stability
      6. While the LGBM modeling shows modest predictive power of sequence features alone, the manuscript stops short of exploring what cellular factors might drive context dependence. Integrating public datasets on RNA-binding protein expression or mTOR pathway activity across samples could illuminate trans-acting determinants of buffering and move beyond correlative sequence analyses

      Significance

      Overall, this manuscript leverages an unprecedented compendium of matched ribosome profiling and RNAseq datasets across human cell lines and mouse tissues, combined with improved TE estimation, to robustly catalog genes exhibiting translational buffering, a clear methodological and conceptual strength. The main limitations stem from heterogeneous sample sources, largely correlative analyses, and a lack of targeted mechanistic validation. Compared to prior yeast focused studies, it fills a key gap by demonstrating conservation of buffering in mammals and linking it to dosage sensitivity and protein stability, representing a conceptual advance in understanding post-transcriptional homeostasis and a methodological step forward in TE analysis. This work will interest researchers in RNA biology, gene expression regulation, systems biology, and cancer proteomics, as well as those studying dosage-sensitive pathways and translational control. My expertise is on translational control in cancer.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. Although the authors' report provides a step forward in our understanding of translational buffering, this reviewer found a series of concerns in this paper. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      Major comments:

      1. This paper relies heavily on reference 18. However, that reference was not properly cited (no page or journal number); the study in Bioinformatics is nowhere to be found on the website, despite apparently being published in 2024. Either the title is wrong, or the paper has not yet appeared (a bioRxiv preprint can be found). This reviewer guessed that reference 18 may have been accepted. However, without a proper reference, this paper could not be judged, since nearly all parts of this work are based on reference 18. Also, the Ribobase data used in this manuscript comes from this reference, so it had better be well defined, especially when another Ribobase data set seems to be available online: http://www.bioinf.uni-freiburg.de/~ribobase/index.html
      2. In the Discussion, the authors mentioned "TE is based on a compositional regression model (18) rather than the commonly applied approach of using a logarithmic ratio of ribosome occupancy to mRNA abundance." This important information should be mentioned in an early section of the manuscript. Related to this, there are other published methods for exploring change in translation efficiency (e.g., 10.1093/bioinformatics/btw585; 10.1093/nar/gkz223) that could also be suitable in this context. It is not entirely clear if their approach is better than before. Again, the improper reference to 18 made our assessment of this work difficult.
      3. The paper mainly relies on detecting a set of buffered genes using mRNA-TE correlation and MAD ratios (Ribo-Seq/RNA-Seq). While the concept seems sound, the authors should ensure that this method is reliable. Several controls could be used to confirm this. First, if any studies in humans or mice have described a set of genes as buffered, it would be worth checking for overlap between the authors' set of 'TB high' genes and the previously established list. Furthermore, the authors could use packages explicitly developed for translational buffering detection, such as anota2seq (https://academic.oup.com/nar/article/47/12/e70/5423604?login=true). Not all of the data used by the authors may be suitable for such packages, but the authors could at least partially use them on some of their datasets and see whether the buffered genes reported by these packages match their predictions.
      4. The threshold of 'TB high' or 'TB low' (top and bottom 250) is somewhat arbitrary. Why not top 100 or 500? The authors should provide a rationale for this choice. Also, they could include a numeric measure of buffering (the sum of the two rankings is probably suitable for this purpose). Several of the authors' explorations are suitable for numerical quantification (GO enrichment can be turned into GSEA, and the boxplot can be shown as correlations)
      5. Several of the statements of the authors in the Introduction or Discussion sections are not entirely true regarding the literature on the topics, or lack major papers on the topic, and therefore, they are a bit misleading. Among others, here are some:

      5-1 "In addition, genetic differences arising from aneuploidy, cell type differences or variability observed in the natural population can further determine the amplitude of variation (4-7). The effect of mRNA variation under these conditions is mostly reflected at the protein levels (2, 4-8).". Several recent or more ancient papers suggest that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level:

      DOI: 10.1038/s41586-024-07442-9 DOI: 10.1073/pnas.2319211121 DOI: 10.1016/j.cels.2017.08.013 DOI: 10.15252/msb.20177548

      5-2: The authors should also consider mentioning these studies and softening their initial statement. "Similarly, translational buffering of certain genes have been reported in mammalian cells, specifically under estrogen receptor alpha (ERα) depletion conditions (16).". Translational buffering has been deeply explored in mammalian tissues and even across several mammalian species in this study (DOI: 10.1038/s41586-020-2899-z). In this, the authors also provide a nice exploration of the gene characteristics that are associated with translational buffering. The authors should mention it and compare the study's findings to theirs ultimately.

      5-3: "Differences in species evaluated and statistical methods have resulted in conflicting interpretations (13, 28).". These conflicting results have been previously discussed in reviews on the topic that would be worth mentioning: DOI: 10.1016/j.cell.2016.03.014 DOI: 10.1038/s41576-020-0258-4

      6. In addition to the p-values stated in the main text, the authors should annotate their plots when they find significant differences between groups to greatly facilitate the visual interpretation of the graphs.
      7. Based on the data of Figure 4D, apparently, ribosome occupancy was not buffered even in high TB sets. The authors may argue that translational buffering may not cope with such a strong mRNA reduction. In that case, how big a difference in mRNA level does the buffering system adjust in protein synthesis? The authors should test gradual gene knockdown and/or overexpression and conduct Ribo-Seq/RNA-Seq to survey the buffering range.
      8. "differential transcript accessibility model" could not be functional if mRNA is reduced beyond the accessible pool (i.e., less than the threshold, all the mRNAs are translated without buffering). The authors should carefully reconsider this model and the effective range of mRNAs.

      Minor comments:

      1. Some figures are of poor quality, as points appear to fall outside the panel boundaries. In Figure 3C, one point lies outside the square; the same is true for Figure 4E. Similarly, in Figure 5F, some outliers seem to be cut off from the figure (if not, the authors should leave more space between the top of the panel and the maximum y values). The same applies to panels S2D and S6D; this does not appear rigorous.
      2. There are several typos or weird sentences. Here are some (but maybe not all):

      2-1: [...]with lower sums corresponding to higher final ranks. "two rankings". Based on these final ranks[...]

      2-2: For each dataset, median absolute deviation (MAD) "i" protein abundance was calculated across samples

      2-3: [...]neighbor method implemented in the MatchIT package (38) Differences in protein[...] a point is missing here.

      2-4: Additionally a second dataset providing predictions of haploinsufficiency (pHaplo score) and triplosensitivity (pTriplo score) for all autosomal genes (25) was used to asses the distribution of these score"S" across buffered and non-buffered gene sets . There is a missing "s" at "score" and there is a space between the last word and the final point.

      3. In the "Lymphoblastoid cell line data analysis:" section, this reviewer wonders why the authors used a different method to calculate buffering compared to before.
      4. "Samples which had R2 less than 0.2 were removed as the residuals calculated for these samples could be unreliable". Wouldn't these samples, for which the correspondence between RNA-Seq and Ribo-Seq is low, be the ones most impacted by translational buffering? Are the authors sure they are not missing something here?
      5. For Figure 4B and 4C, the authors should provide statistical tests and p-values to confirm the observed trends.
      6. In Figure 2A, the "all genes" color doesn't correspond to the point color.
      7. "To understand if codon usage patterns are[...]". This comes slightly out of the blue. The authors could maybe explain why codon usage should be explored for translational buffering. The authors should cite recent key works in the fields: DOI: 10.1016/j.celrep.2023.113413 DOI: 10.1101/2023.11.27.568910
      8. "The change in each metric was calculated by subtracting the mean value in the control samples from that in the knockdown samples. This yielded the differential mRNA abundance and ribosome occupancy resulting from gene knockdown.". This looks statistically weak. The authors should consider using more robust methods like DESeq.
      9. "Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set, indicating that candidates in the buffered gene set are relatively well expressed due to the presence of a higher proportion of the codons observed in highly expressed genes". What do the authors mean by "relatively well expressed"? Abundantly expressed? This sentence and the causality underlying it are unclear and should be modified or better explained.
      10. Panel 4D is unclear. Is one point associated with one gene? Or is it the average of several genes? If it's one point for one gene, it is important to clearly state it because the number of cases is therefore quite low, especially for the TB high and low.
      11. In Figure 2J, GGU (Gly), AAG (Lys), and ACU (Arg) provide negative effects on prediction, although these were enriched in the high TB set (Figure 2E). This contradiction should be explained.
      12. The subtitle of "Translationally buffered genes exhibit variable association kinetics with the translational machinery in response to mRNA variation" sounds unfair to this reviewer. Since the authors did not work on kinetics directly, the use of this word is misleading.
      13. The explanation of Figure 5A "We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the "differential transcript accessibility model", mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the "initiation rate model", the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, the proportion of mRNA entering the translational pool would be comparable across buffered and non-buffered genes (Fig 5A)." is hard to understand. The authors should rewrite it so that readers can follow it more easily.

      Significance

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, its systematic investigation has remained challenging. Employing a database of published Ribo-Seq and matched RNA-Seq data, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. A group of mRNAs whose expression variance is buffered at the translation level was comprehensively surveyed in humans and mice. The authors found a series of features in the translationally buffered genes, including high GC content in the 5′ UTR, optimal codon usage, and mRNA length. The depletion or increase of one allele of the genes in the group may be particularly detrimental to cells. The authors' report provides a step forward in our understanding of translational buffering, appealing to the broad scientific community in basic and applied biology. However, this reviewer found a series of concerns in this paper, including clarity in the methods, experimental validation, and referencing of earlier work. These points could be tackled to improve the reliability of the findings, the strength of the main message, and the overall understandability of the paper.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      SECTION A - Evidence, Reproducibility, and Clarity

      Summary

      The study investigates the neurodevelopmental impact of trisomy 21 on human cortical excitatory neurons derived from induced pluripotent stem cells (hiPSCs). Key findings include a modest reduction in spontaneous firing, a marked deficit in synchronized bursting, decreased neuronal connectivity, and altered ion channel expression, particularly a downregulation of voltage‐gated potassium channels and HCN1. These conclusions are supported by a combination of in vitro calcium imaging, electrophysiological recordings, viral monosynaptic tracing, RNA sequencing, and in vivo transplantation with two‐photon imaging.

      Major Comments

      • Convincing Nature of Key Conclusions: The study's conclusions are generally well supported by a diverse set of experimental approaches. However, certain claims regarding the intrinsic properties of the excitatory network would benefit from further qualification. In particular, the assertion that reduced synchronization is solely attributable to altered ion channel expression might be considered somewhat preliminary without additional corroborative experiments.

      1.1) We agree with the reviewer and now write in the abstract: 'Together, these findings demonstrate long-lasting impairments in human cortical excitatory neuron network function associated with Trisomy 21.' And in the Introduction: 'Collectively, the observed changes in ion channel expression, neuronal connectivity, and network activity synchronization may contribute to functional differences relevant to the cognitive and intellectual features associated with Down syndrome.'

      • One major limitation of the current experimental design is the reliance on predominantly excitatory neuronal cultures derived from hiPSCs. Although the authors convincingly demonstrate differences in network synchronization and connectivity between trisomic (TS21) and control neurons, the almost exclusive focus on excitatory cells limits the physiological relevance of the in vitro network. In the developing cortex, interneurons and astrocytes play crucial roles in modulating network excitability, synaptogenesis, and plasticity. Therefore, incorporating these cell types, either through co-culture systems or through directed differentiation protocols that yield a more heterogeneous neuronal population, could help to determine whether the observed deficits are intrinsic to excitatory neurons or are compounded by a lack of proper inhibitory regulation and glial support.

      1.2) Thank you for this thoughtful comment. We agree that interneurons and astrocytes are crucial for network function. To clarify, astrocytes are generated in this culture system, as we previously reported in our characterisation of the timecourse of network development using this approach (Kirwan et al., Development 2025). However, our primary goal was to first isolate and define the cell-autonomous defects intrinsic to TS21 excitatory neurons, minimizing the complexity introduced by additional neuronal types. This focused approach was also chosen because engineering a stable co-culture system with reproducible excitatory/inhibitory (E/I) proportions is a significant undertaking that extends beyond the scope of this initial investigation, and has proven challenging to date for the field. By establishing this foundational phenotype, our work complements prior studies on interneuron and glial contributions. Future studies building on this work will be essential to dissect the more complex, non-cell-autonomous effects within a heterogeneous network.

      Importantly, since our initial submission, two highly relevant preprints have emerged, including a notable study from the Geschwind laboratory at UCLA (Vuong et al., bioRxiv, 2025; Risgaard et al., bioRxiv, 2025), as well as our own complementary study (Lattke et al., under revision), that highlight widespread transcriptional changes in excitatory cells of the human fetal DS cortex, providing strong validation for our central findings. This convergence of results from multiple groups underscores the timeliness and importance of our work.

      • Furthermore, the assessment of neuronal connectivity via pseudotyped rabies virus tracing, while innovative, has inherent limitations. The quantification of connectivity as a ratio of red-to-green fluorescence pixels may be influenced by differential viral infection efficiencies, variations in the expression levels of the TVA receptor, or even by the lower basal activity levels observed in TS21 cultures. Complementary approaches, such as electron microscopy for synaptic density analysis or functional connectivity measurements using multi-electrode arrays (MEAs), could provide additional structural and functional insights that would validate the rabies tracing data.

      1.3) Thank you for this constructive feedback. While we cannot formally exclude that TS21 cells might express the TVA receptor at lower levels due to generalized gene dysregulation, we infected all WT and TS21 cultures in parallel using identical virus preparations and titers to minimize technical variability. Crucially, we also addressed the potential confound of differential basal activity by performing the rabies tracing under TTX incubation (see Suppl. Fig. 7), which blocks network activity and ensures that viral spread reflects structural connectivity alone.

      While complementary methods like EM or MEA could provide additional insight, they fall outside the scope of the current study. We are confident that our rigorous controls validate our use of the rabies tracing method to assess structural connectivity.

      • Qualification of Claims: Some conclusions, particularly those linking specific ion channel dysregulation (e.g., HCN1 loss) directly to network deficits, might be better presented as preliminary. The authors could temper their language to indicate that while the evidence is suggestive, the mechanistic link remains to be fully established.

      1.4) We have revised the text to more clearly indicate that the link between HCN1 dysregulation and network deficits is correlative and remains to be fully established. While our ex vivo recordings suggest altered Ih-like currents consistent with reduced HCN1 expression, we now present these findings as preliminary and hypothesis-generating, pending further functional validation. We write in the discussion: 'However, further targeted functional validation will be needed to confirm a causal link.'

      • Need for Additional Experiments: Additional experiments that could further consolidate the current findings include:

      o Inclusion of Inhibitory Neurons or Co-culture Systems: Incorporating interneurons or astrocytes would help determine whether the observed deficits are solely intrinsic to excitatory neurons.

      See 1.2

      o Alternative Connectivity Assessments: Complementing the rabies virus tracing with electron microscopy or multi-electrode array (MEA) recordings would add structural and functional validation of the connectivity differences.

      See 1.3

      o Extended Temporal Profiling: Monitoring network activity over a longer developmental window would clarify whether the observed deficits represent a delay or a permanent alteration in network maturation.

      1.5) In vivo, we were able to track the cells for up to five months post-transplantation, supporting the interpretation of a permanent alteration.

      • Reproducibility and Statistical Rigor: The methods and data presentation are largely clear, with adequate replication and appropriate statistical analyses. Nonetheless, a more detailed description of the experimental replicates, particularly regarding the viral tracing and in vivo transplantation studies, would enhance reproducibility. The availability of raw data and scripts for calcium imaging analysis would also further support independent verification.

      We thank the reviewer for these suggestions and we now provide a more detailed description of replicates. We have also added the raw data.

      Minor Comments

      • Experimental Details: Minor revisions could include clarifying the infection efficiency and expression levels of the viral constructs used in connectivity assays to rule out technical variability.

      See 1.3

      • Literature Context: The authors reference prior studies appropriately; however, integrating a brief discussion comparing their findings with alternative DS models (e.g., organoids or other hiPSC-derived systems) would improve contextual clarity.

      We thank the reviewer for this helpful suggestion. We have now added a brief discussion comparing our findings with those reported in alternative Down syndrome models, including brain organoids and other hiPSC-derived systems. This addition helps to contextualize our results within the broader field and highlights the unique strengths and limitations of our in vitro and in vivo xenograft approach. We write: 'Our findings align with and extend previous studies using alternative Down syndrome models, such as brain organoids and other hiPSC-derived systems. Organoid models have provided valuable insights into early neurodevelopmental phenotypes in DS, including altered interneuron proportions (Xu et al Cell Stem Cell 2019) but also suggest that variability across isogenic lines can overshadow subtle trisomy 21 neurodevelopmental phenotypes (Czerminski et al Front in Neurosci 2023). However, these systems often lack the structural complexity, vascularization, and long-term maturation achievable in vivo. By using a xenotransplantation model, we were able to assess the maturation and functional properties of human neurons within a physiologically relevant environment over extended time frames, offering complementary insights into DS-associated circuit dysfunction (Huo et al Stem Cell Reports 2018; Real et al., 2018).'

      • Presentation and Clarity: Figures are generally clear, but the manuscript contains a minor labeling error. On page 13, the figure is erroneously labeled as "Fig6A", whereas, based on the context and corresponding data, it should be "Fig5A". I recommend that the authors correct this mistake to ensure consistency and avoid potential confusion for readers.

      Thank you for pointing this out. This has been corrected in the revised manuscript.

      Reviewer #1 (Significance (Required)):

      SECTION B - Significance

      • Nature and Significance of the Advance: The work offers a substantial conceptual advance by providing a mechanistic link between trisomy 21 and impaired neuronal network synchronization. Technically, the study integrates state-of-the-art imaging, electrophysiology, and transcriptomic profiling, thereby offering a multifaceted view of DS-related neural dysfunction. Clinically, the findings have the potential to inform future therapeutic strategies targeting network connectivity and ion channel function in Down syndrome.

      We thank the reviewer for this very supportive comment.

      • Context in the Existing Literature: The study builds on previous observations of altered network activity in DS patients and DS mouse models (e.g., altered EEG synchronization and reduced synaptic connectivity). It extends these findings to human-derived neuronal models, thus bridging a gap between clinical observations and molecular/cellular mechanisms. Relevant literature includes studies on DS neurodevelopment and the role of ion channels in synaptic maturation.

      • Target Audience: The reported findings will be of interest to researchers in neurodevelopmental disorders, Down syndrome, and ion channel physiology. Additionally, the study may attract the attention of those working on hiPSC-derived models of neurological diseases, as well as clinicians interested in the pathophysiology of DS.

      • Keywords and Field Contextualization: Keywords: Down syndrome, trisomy 21, neuronal connectivity, synchronized network activity, hiPSC-derived cortical neurons, ion channel dysregulation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      The manuscript by Peter et al. reports on the neuronal activity and connectivity of iPSC-derived human cortical neurons from Down syndrome (DS), which is caused by trisomy of human chromosome 21 (TS21).

      Major points: Although the manuscript is potentially interesting, the results appear somewhat preliminary and need to be corroborated by control experiments and quantifications of effects to fully sustain the conclusions.

      (1) The authors have not assessed the percentage of WT and TS21 cells that acquire a neuronal or glial identity in their cultures. Indeed, the alterations in network activity and connectivity observed in TS21 neurons could simply derive from a reduced number of neurons arising from TS21 iPSCs. Alternatively, the same alterations in network activity and connectivity could derive from a multitude of other factors, including deficits in neuronal development, neurite extension, or intrinsic electrophysiological properties. In the current version of the manuscript, none of these has been investigated.

      2.1) We thank the reviewer for this thoughtful comment. In response, we included an in vivo characterization of cell-type proportions at the same time points where we observed network activity defects using in vivo calcium imaging (see Supplementary Fig. 6).

      Previous work has identified several cellular and molecular phenotypes in human cells, postmortem tissue, and mouse models-including those mentioned by the reviewer. In this study, our focus was on investigating neural network activity, intrinsic electrophysiological properties both in vitro and in vivo, and preliminary bulk RNA sequencing. We have also independently measured cell proportions in the human fetal cortex and conducted a more extensive transcriptomic analysis of Ts21 versus control cells in a separate study (Lattke et al., under revision). We observed a reduction of RORB/FOXP1-expressing Layer 4 neurons in the human fetal cortex at midgestation, as well as increased GFAP+ cells, reduced progenitors and a non significant reduction of Cux2+ cells in late stage DS human cell transplants, along with a gene network dysregulation specifically affecting excitatory neurons (Lattke et al., under revision). Here, we provide complementary findings, demonstrating reduced excitatory neuron network connectivity in vitro and decreased neural network synchronised activity in both in vitro and in vivo models (see also 2.8). We agree with the reviewer that this could be for a number of reasons, both cell autonomous (channel expression and/or function) or non-autonomous (connectivity and/or network composition - as reflected in differences in proportions of SATB2+ neurons generated in TS21 cortical differentiations).

      (2) Electrophysiological properties of TS21 and WT neurons at day 53/54 in vitro indicate an extremely immature stage of development (i.e. RMP between -36 and -27 mV with most of the cells firing a single action potential after current injection) in the utilized culture conditions: This is far from ideal for in vitro neuronal-network studies. Finally, reduced activity of HCN1 channels should be confirmed by specific recordings isolating or blocking the related current.

      2.2) Thank you for this thoughtful comment. We have also conducted ex vivo electrophysiological recordings and found that the neurons exhibit relatively immature properties, consistent with the known slow developmental trajectory of human neuron cultures. In light of this and the absence of direct confirmatory evidence, we now refer to the observed reduction in HCN1 as preliminary.

      Main points highlighting the preliminary character of the study.

      1) In Figure 1, immunofluorescence images of the neuronal differentiation markers (Tbr1, Ctip2 and Tuj1) are shown. However, no quantification of the percentage of cells expressing these markers for WT and TS21 neurons is reported. On the other hand, simple inspection of the representative images clearly seems to indicate a difference between the two genotypes, with TS21 cultures showing a lower number of cells expressing neuronal markers. This quantification should be corroborated by a similar staining for an astrocyte marker (GFAP, but not S100b since it is triplicated in DS). This is an extremely important point since it is obvious that any change in the percentage of neurons (or the neuron/astrocyte ratio) in the cultures will strongly affect the resulting network activity (shown in Figure 2) and the connectivity (shown in Figure 4). If possible, the quantification should be done at the same time points as the calcium imaging experiments.

      2.3) See 2.1. We included an in vivo characterization of cell-type proportions at the same time points where we observed network activity defects using in vivo calcium imaging. (see Supplementary Fig. 6).

      2) In Figure 2, the authors show some calcium imaging traces of WT and TS21 cultures at different time points. However, they again do not show any quantification of neuronal activity. A power spectra analysis is shown in Supplementary Figure 2, but only for WT cultures, while in Supplementary Figure 3 a comparison between WT and Ts21 power spectra is made, but only at the 50 day time point, while differences in synchrony are assessed at 60 days. At minimum, the authors should include in main Figure 2 the quantification of the mean calcium event rate and mean event amplitude at the different time points and the power spectra analysis for both WT and TS21 cultures at the same time points.

      2.4) We thank the reviewer for this comment. We now add the power spectra analysis in the main Figure 2 and quantification of the mean calcium burst rate and mean event amplitude in SuppFig. 4.

      Of note, the synchronized neuronal activity is present in WT cultures at day 60, but totally lost at subsequent time points (70 and 80 days). The results at these later time points differ from previous data from the same lab (Kirwan et al., 2015). How might these data be explained? It would be important to rule out any potential issues with the health of the culture that could explain the loss of neuronal activity. It would be beneficial to check cell viability at the different time points to exclude possible confounding factors. A propidium staining or an MTT assay would strongly improve the soundness of the calcium data.

      2.5) We thank the reviewer for this important observation. The difference from the findings reported in Kirwan et al., 2015 is due to the use of a different neuronal differentiation medium in the current study (BrainPhys versus N2B27). BrainPhys medium supports robust early network activity compared to N2B27 (onset before day 60 in BrainPhys, post-day 60 in N2B27), resulting in an earlier decline in synchrony at later stages (day 70-80 in BrainPhys, compared with day 90-100 in N2B27). Importantly, in our in vivo xenograft model, burst activity is sustained up to at least 5 months post-transplantation (mpt), indicating that the neurons retain the capacity for network activity over extended periods in a more physiological environment. We adapted the text accordingly.

      3) In Figure 3 there is no quantification of the number and/or density of transplanted neurons for WT and TS21, but only representative images. As above, inspection of the representative images seems to show a decrease in cells labeled by the Tbr1 neuronal marker for TS21 cells. Moreover, the in vivo calcium imaging of transplanted WT and TS21 cells lacks most of the quantification normally done in calcium imaging experiments. Are the event rate and event amplitude different between WT and TS21 neurons? The measure of neuronal synchrony by mean pixel correlation is not well explained, and it looks somewhat simplistic. Neuronal synchrony can be more precisely measured by cross-correlation analysis or spike time tiling coefficients on the traces from single-neuron ROIs rather than on all pixels in the field of view, as apparently was done here.

      2.6) We thank the reviewer for these valuable points. We now include quantification of the number and density of transplanted neurons for both WT and Ts21 grafts in Extended Data Figure 5 (see 2.1).

      Regarding the in vivo calcium imaging, we appreciate the reviewer's suggestion to include additional standard metrics. We have quantified the event rate in Real et al 2018. These analyses reveal that Ts21 neurons show a reduction in event rate.

      We agree that our initial description of the synchrony analysis using mean pixel correlation was not sufficiently detailed. We have now clarified this in the Methods and Results, and we acknowledge its limitations. Importantly, we note that the reduced synchronisation is a highly consistent phenotype, observed across at least six independent donor pairs, different differentiation protocols, and both in vitro (and in two independent labs) and in vivo settings. As suggested, future studies using ROI-based approaches-such as cross-correlation or spike-time tiling coefficients-would provide a more refined characterization of synchrony at the single-neuron level (Sintes et al, in preparation). We now include this point in the discussion.
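
      For readers unfamiliar with the ROI-based alternative discussed in this exchange, the following is a minimal sketch and not the authors' analysis pipeline. It assumes fluorescence traces have already been extracted per single-neuron ROI into an array of shape (n_rois, n_frames); the function name `synchrony_index` is hypothetical. It summarises synchrony as the mean zero-lag Pearson correlation over all ROI pairs, which is one simple form of the cross-correlation analysis the reviewer suggests. Spike time tiling coefficients would additionally require event detection and are not shown.

```python
import numpy as np


def synchrony_index(traces):
    """Mean pairwise zero-lag Pearson correlation across ROI traces.

    traces: array-like of shape (n_rois, n_frames), one fluorescence
    trace per single-neuron ROI (assumed already extracted and detrended).
    """
    traces = np.asarray(traces, dtype=float)
    if traces.ndim != 2 or traces.shape[0] < 2:
        raise ValueError("expected an array of shape (n_rois >= 2, n_frames)")
    corr = np.corrcoef(traces)                      # (n_rois, n_rois) matrix
    upper = corr[np.triu_indices_from(corr, k=1)]   # each ROI pair once
    return float(np.nanmean(upper))                 # ignore NaNs from flat traces


if __name__ == "__main__":
    # Synthetic illustration only: five ROIs sharing one signal versus
    # five ROIs with independent noise.
    rng = np.random.default_rng(0)
    shared = rng.normal(size=500)
    synchronous = np.stack([shared + 0.1 * rng.normal(size=500) for _ in range(5)])
    independent = rng.normal(size=(5, 500))
    print("synchronous:", synchrony_index(synchronous))
    print("independent:", synchrony_index(independent))
```

      Averaging over ROI pairs rather than over all pixels keeps neuropil and background pixels from contributing to the estimate, which is the main motivation for the ROI-based approach.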

      4) The results on reduced neuronal connectivity in Figure 3 look very striking. However, these results should be accompanied by control experiments to verify the number of neuronal cells and neurite extension in WT and Ts21 cultures. These two parameters could indeed strongly influence the results. As the cultures appear to grow in clusters, bright-field images and TuJ1 staining of the cultures will also greatly help to understand the degree of morphological interconnection between the clusters.

      We now add Tuj1 staining in Supplementary figure 10.

      5) The authors performed RNA-seq experiments on day 50 cultures. Why do the authors not show the complete differential gene expression analysis, but only a small subset of genes? A comprehensive volcano plot and the complete list of identified genes with logFC and FDR values would be helpful. If possible, comparison of the present data (particularly on KCN and HCN expression changes) with published and publicly available expression datasets of other human or human Down syndrome iPSC-derived neurons or human Down syndrome brains will greatly increase the soundness of the present findings. In addition, the gene ontology (GO) results are mentioned in the text, but are not presented. Showing the complete GO analysis for both up- and downregulated genes will help the reader to better understand the RNA-seq results. Notably, the results shown in Supplementary Figure on GRIN2A and GRIN2B expression (with values of 300-700 counts versus 2000-4000 counts, respectively) clearly indicate that in both WT and TS21 cultures the NMDA developmental switch has not occurred yet at the 50 days time point.

      We now show volcano plots in Supplementary Fig. 11.

      6) The measurement of hyperpolarization-activated currents shown in Figure 5 lacks proper control experiments. First, the hyperpolarizing current in TS21 cells does not reach a steady state as in the controls. The two curves are therefore hard to compare. To exclude possible differences in activation kinetics, the authors should have prolonged the current injection period (1-2 seconds). Second, to ultimately prove that such currents are mediated by HCN channels in WT cells, the authors should perform some control experiments with a specific HCN blocker. A good example of a suitable protocol, which also uses current blockers to exclude all other possible current contributions, is the one reported in Matt et al Cell. Mol. Life Sci. 68, 125-137 (2011).

      2.7) We thank the reviewer for this detailed and helpful comment. We agree that to definitively identify the recorded currents as Ih, it would be necessary to isolate them pharmacologically using specific HCN channel blockers and appropriate controls, such as those described in Matt et al., Cell. Mol. Life Sci. Unfortunately, because we no longer have access to the animals used in this study and cannot allocate the necessary time or resources, we are unable to perform the additional experiments at this stage.

      However, our goal here was to use electrophysiological recordings as an indication of altered HCN channel activity, which we then support with molecular evidence. We now emphasize this point more clearly in the revised manuscript.

      7) The manuscript lacks information on the statistical analysis used. Also, the number of samples is not clear. Were the dots shown in some graphs technical replicates from a single neuronal induction, or were all independent neuronal inductions, or a mix of the two? Please clarify.

      We now clarify the numbers in the Figure legend.

      8) The methods section lacks important information needed to guarantee reproducibility. Just a few examples:

      • Only electrophysiology methods for slices are reported, but not for in vitro culture.

      We now clarify these details in the methods.

      • Details on Laminin coating are lacking. What concentration was used? Was poly-ornithine or poly-lysine used before Laminin coating?

      We now clarify these details in the methods.

      • How long were cells switched to BrainPhys medium before calcium imaging?

      We now clarify these details in the methods.

      Minor point/typos etc.

      Introduction

      • Page 4 line 6: in the line "Trisomy 21 in humans commonly results in a range in developmental and morphological changes in the forebrain ..." "in" could be replaced by "of". We have fixed this.

      • Page 5 line 2: please remove "an" before the word "another". We have fixed this.

      • Page 5 line 2: please replace "ecitatory" with "excitatory". We have fixed this typo.

      Results

      • Page 10 line 25: The concept of "pixel-wise" appears for the first time in this section and could be better introduced to facilitate the understanding of the experiment.
      • In the "results" section, page 11 line 1 and 4, references are made to "Figure 4D" and "4F," but these figures do not appear to be present in the figure section. Upon reviewing the rest of the section, the data seem to refer to "Figure 3D" and "3E." We have fixed this.

      Discussion

      • Page 15 line 20: please replace "synchronised" with "synchronized". We have fixed this typo.
      • Page 16 line 11: please replace "T21" with "TS21". We have fixed this typo.

      Methods

      • Page 19 line 12: "Pens/Strep" has to be replaced by Pen/Strep. We have fixed this typo.
      • Page 20 line 20: "Tocris Biocience" has to be replaced by "Tocris Bioscience". We have fixed this typo.
      • Page 21 line 2: "Addegene" has to be replaced by "Addgene". We have fixed this typo.

      Figures

      • Figure 3: the schematic experimental design (Fig. 3A) could be enlarged to match the width of the images/graphs below. We have fixed this.
      • Figure 5: the reviewer suggests resizing/repositioning the graphs in Fig. 1A so that they match the width of those below. We have fixed this.
      • Figure S1D: In all the figures of the paper, the respective controls for the TS21 1 and TS21 2 lines are labelled as "WT1/WT2," while in these graphs, they are called "Ctrl1" and "Ctrl2." To ensure consistency throughout the paper, it is suggested to change the names in these graphs. We have fixed this.
      • Figure S4L: The graph is not very clear, especially regarding the significance reported at -50 pA, please modify the graphical visualization and/or add a legend in the caption. We have fixed this.

      Reviewer #2 (Significance (Required)):

      Nature and significance of the advance for the field. The results presented in the manuscript are potentially interesting and useful, but not completely novel (currents deregulation has already been highlighted in mouse models of Down Syndrome).

      2.8) We thank the reviewer for this comment. While we agree that current deregulation has been observed in mouse models of Down syndrome, the novelty and significance of our study lie in demonstrating these alterations directly in human neurons using both in vitro and in vivo xenograft models.

      This is a critical advance because the human cortex has distinct developmental and functional properties not fully recapitulated in mice. In fact, three recent studies have already highlighted significant defects mainly in excitatory neurons within the fetal human DS cortex (Vuong et al., bioRxiv, 2025; Risgaard et al., bioRxiv, 2025; Lattke et al, under revision). Our work builds directly on these observations by providing, for the first time, an electrophysiological and network-level characterization of these human-specific deficits.

      Our findings thus provide translationally relevant insight that is not merely confirmatory but extends previous work by grounding it in a human cellular context.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Peter et al. reports on the neuronal activity and connectivity of iPSC-derived human cortical neurons from Down syndrome (DS), which is caused by trisomy of human chromosome 21 (TS21).

      Major points:

      Although the manuscript is potentially interesting, the results appear somewhat preliminary and need to be corroborated by control experiments and quantifications of effects to fully sustain the conclusions.

      (1) The authors have not assessed the percentage of WT and TS21 cells that acquire a neuronal or glial identity in their cultures. Indeed, the alterations in network activity and connectivity observed in TS21 neurons could simply derive from a reduced number of neurons arising from TS21 iPSCs. Alternatively, the same alterations in network activity and connectivity could derive from a multitude of other factors, including deficits in neuronal development, neurite extension, or intrinsic electrophysiological properties. In the current version of the manuscript, none of these has been investigated.

      (2) Electrophysiological properties of TS21 and WT neurons at day 53/54 in vitro indicate an extremely immature stage of development (i.e. RMP between -36 and -27 mV with most of the cells firing a single action potential after current injection) in the utilized culture conditions: This is far from ideal for in vitro neuronal-network studies. Finally, reduced activity of HCN1 channels should be confirmed by specific recordings isolating or blocking the related current.

      Main points highlighting the preliminary character of the study.

      1) In Figure 1, immunofluorescence images of the neuronal differentiation markers (Tbr1, Ctip2 and Tuj1) are shown. However, no quantification of the percentage of cells expressing these markers for WT and TS21 neurons is reported. On the other hand, simple inspection of the representative images clearly seems to indicate a difference between the two genotypes, with TS21 cultures showing a lower number of cells expressing neuronal markers. This quantification should be corroborated by a similar staining for an astrocyte marker (GFAP, but not S100b since it is triplicated in DS). This is an extremely important point since it is obvious that any change in the percentage of neurons (or the neuron/astrocyte ratio) in the cultures will strongly affect the resulting network activity (shown in Figure 2) and the connectivity (shown in Figure 4). If possible, the quantification should be done at the same time points as the calcium imaging experiments.

      2) In Figure 2 the authors show some calcium imaging traces of WT and TS21 cultures at different time points. However, they again do not show any quantification of neuronal activity. A power spectra analysis is shown in Supplementary Figure 2, but only for WT cultures, while in Supplementary Figure 3 a comparison between WT and TS21 power spectra is done, but only at the 50 day time point, while differences in synchrony are assessed at 60 days. At minimum, the authors should include in main Figure 2 the quantification of the mean calcium event rate and mean event amplitude at the different time points and the power spectra analysis for both WT and TS21 cultures at the same time points.

      Of note, the synchronized neuronal activity is present in WT cultures at day 60, but totally lost at subsequent time points (70 and 80 days). The results at these later time points are different from previous data from the same lab (Kirwan et al., 2015). How might these data be explained? It would be important to rule out any potential issues with the health of the culture that could explain the loss of neuronal activity. It would be beneficial to check cell viability at the different time points to exclude possible confounding factors. A propidium staining or an MTT assay would strongly improve the soundness of the calcium data.

      3) In Figure 3 there is no quantification of the number and/or density of transplanted neurons for WT and TS21, but only representative images. As above, inspection of the representative images seems to show a decrease in cells labeled by the Tbr1 neuronal marker for TS21 cells. Moreover, the in vivo calcium imaging of transplanted WT and TS21 cells lacks most of the quantification normally done in calcium imaging experiments. Are the event rate and event amplitude different between WT and TS21 neurons? The measure of neuronal synchrony by mean pixel correlation is not well explained, and it looks somewhat simplistic. Neuronal synchrony can be more precisely measured by cross-correlation analysis or spike time tiling coefficients on the traces from single-neuron ROIs rather than on all pixels in the field of view, as apparently was done here.

      4) The results on reduced neuronal connectivity in Figure 3 look very striking. However, these results should be accompanied by control experiments to verify the number of neuronal cells and neurite extension in WT and TS21 cultures. These two parameters could indeed strongly influence the results. As the cultures appear to grow in clusters, bright-field images and TuJ1 staining of the cultures will also greatly help to understand the degree of morphological interconnection between the clusters.

      5) The authors performed RNA-seq experiments on day 50 cultures. Why do the authors not show the complete differential gene expression analysis, but only a small subset of genes? A comprehensive volcano plot and the complete list of identified genes with logFC and FDR values would be helpful. If possible, comparison of the present data (particularly on KCN and HCN expression changes) with published and publicly available expression datasets of other human or human Down syndrome iPSC-derived neurons or human Down syndrome brains will greatly increase the soundness of the present findings. In addition, the gene ontology (GO) results are mentioned in the text, but are not presented. Showing the complete GO analysis for both up- and downregulated genes will help the reader to better understand the RNA-seq results. Notably, the results shown in Supplementary Figure on GRIN2A and GRIN2B expression (with values of 300-700 counts versus 2000-4000 counts, respectively) clearly indicate that in both WT and TS21 cultures the NMDA developmental switch has not occurred yet at the 50 day time point.

      6) The measurements of hyperpolarization-activated currents shown in Figure 5 lack proper control experiments. First, the hyperpolarizing current in TS21 cells does not reach a steady state as it does in the controls. The two curves are therefore hard to compare. To exclude possible differences in activation kinetics, the authors should have prolonged the current injection period (1-2 seconds). Second, to ultimately prove that such currents are mediated by HCN channels in WT cells, the authors should perform some control experiments with a specific HCN blocker. A good example of a suitable protocol, which also uses current blockers to exclude all other possible current contributions, is the one reported in Matt et al Cell. Mol. Life Sci. 68, 125-137 (2011).

      7) The manuscript lacks information on the statistical analysis used. Also, the numerosity of samples is not clear. Were the dots shown in some graphs technical replicates from a single neuronal induction, independent neuronal inductions, or a mix of the two? Please clarify.

      8) The method section lacks important information to guarantee reproducibility. Just a few examples:

      • Only electrophysiology methods for slice are reported, but not for in vitro culture.
      • Details on Laminin coating are lacking. What concentration was used? Was poly-ornithine or poly-lysine used before Laminin coating?
      • How long were cells switched to BrainPhys medium before calcium imaging?
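
As a hedged illustration of the per-ROI synchrony measure suggested in point 3 above, the sketch below computes the mean zero-lag Pearson correlation across all pairs of single-neuron traces. It is a simplification of a full cross-correlation analysis (no time lags are scanned) and it is not the spike time tiling coefficient. The function names and the toy traces are assumptions made for illustration and are not taken from the manuscript.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Zero-lag Pearson correlation between two equal-length traces."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("traces must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        # A flat trace carries no co-fluctuation information.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def mean_pairwise_correlation(traces):
    """Average Pearson correlation over all pairs of ROI traces."""
    pairs = list(combinations(traces, 2))
    if not pairs:
        raise ValueError("need at least two ROI traces")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical dF/F traces, one list per single-neuron ROI.
synchronous = [[0, 1, 0, 1], [0, 2, 0, 2], [0, 1, 0, 1]]
alternating = [[0, 1, 0, 1], [1, 0, 1, 0]]
print(mean_pairwise_correlation(synchronous))
print(mean_pairwise_correlation(alternating))
```

Working on per-ROI traces rather than on all pixels keeps neuropil and background from dominating the average, which is the concern raised in the comment.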

      Minor point/typos etc.

      Introduction

      • Page 4 line 6: in the line "Trisomy 21 in humans commonly results in a range in developmental and morphological changes in the forebrain ..." "in" could be replaced by "of".
      • Page 5 line 2: please remove "an" before the word "another".
      • Page 5 line 2: please replace "ecitatory" with "excitatory"

      Results

      • Page 10 line 25: The concept of "pixel-wise" appears for the first time in this section and could be better introduced to facilitate the understanding of the experiment.
      • In the "results" section, page 11 line 1 and 4, references are made to "Figure 4D" and "4F," but these figures do not appear to be present in the figure section. Upon reviewing the rest of the section, the data seem to refer to "Figure 3D" and "3E."

      Discussion

      • Page 15 line 20: please replace "synchronised" with "synchronized".
      • Page 16 line 11: please replace "T21" with "TS21".

      Methods

      • Page 19 line 12: "Pens/Strep" has to be replaced by Pen/Strep.
      • Page 20 line 20: "Tocris Biocience" has to be replaced by "Tocris Bioscience".
      • Page 21 line 2: "Addegene" has to be replaced by "Addgene".

      Figures

      • Figure 3: the schematic experimental design (Fig. 3A) could be enlarged to match the width of the images/graphs below.
      • Figure 5: the reviewer suggests resizing/repositioning the graphs in Fig. 1A so that they match the width of those below.
      • Figure S1D: In all the figures of the paper, the respective controls for the TS21 1 and TS21 2 lines are labelled as "WT1/WT2," while in these graphs, they are called "Ctrl1" and "Ctrl2." To ensure consistency throughout the paper, it is suggested to change the names in these graphs.
      • Figure S4L: The graph is not very clear, especially regarding the significance reported at -50 pA, please modify the graphical visualization and/or add a legend in the caption.

      Significance

      Nature and significance of the advance for the field. The results presented in the manuscript are potentially interesting and useful, but not completely novel (currents deregulation has already been highlighted in mouse models of Down Syndrome).

      Work in the context of the existing literature. This work follows the line of evidence that characterizes Down Syndrome in human neurons (Huo, H.-Q. et al. Stem Cell Rep. 10, 1251-1266 (2018); Briggs, J. A. et al. Etiology. Stem Cells 31, 467-478 (2013)), both in vitro and in xenotransplanted mice, by corroborating some important findings already found in animal models (Stern, S., Segal, M. & Moses, E. EBioMedicine 2, 1048-1062 (2015); Cramer, N. P., Xu, X., F. Haydar, T. & Galdzicki, Z. Physiol. Rep. 3, e12655 (2015); Stern, S., Keren, R., Kim, Y. & Moses, E. http://biorxiv.org/lookup/doi/10.1101/467522 (2018) doi:10.1101/467522).

      Audience. Scientists in the field of pre-clinical biomedical research, especially those working on neurodevelopmental disorders and iPSC-based non-animal models.

      Field of expertise. In vitro electrophysiology, neurodevelopmental disorders, Down Syndrome, iPS cells.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The study investigates the neurodevelopmental impact of trisomy 21 on human cortical excitatory neurons derived from induced pluripotent stem cells (hiPSCs). Key findings include a modest reduction in spontaneous firing, a marked deficit in synchronized bursting, decreased neuronal connectivity, and altered ion channel expression, particularly a downregulation of voltage-gated potassium channels and HCN1. These conclusions are supported by a combination of in vitro calcium imaging, electrophysiological recordings, viral monosynaptic tracing, RNA sequencing, and in vivo transplantation with two-photon imaging.

      Major Comments

      • Convincing Nature of Key Conclusions: The study's conclusions are generally well supported by a diverse set of experimental approaches. However, certain claims regarding the intrinsic properties of the excitatory network would benefit from further qualification. In particular, the assertion that reduced synchronization is solely attributable to altered ion channel expression might be considered somewhat preliminary without additional corroborative experiments.
      • One major limitation of the current experimental design is the reliance on predominantly excitatory neuronal cultures derived from hiPSCs. Although the authors convincingly demonstrate differences in network synchronization and connectivity between trisomic (TS21) and control neurons, the almost exclusive focus on excitatory cells limits the physiological relevance of the in vitro network. In the developing cortex, interneurons and astrocytes play crucial roles in modulating network excitability, synaptogenesis, and plasticity. Therefore, incorporating these cell types, either through co-culture systems or through directed differentiation protocols that yield a more heterogeneous neuronal population, could help to determine whether the observed deficits are intrinsic to excitatory neurons or are compounded by a lack of proper inhibitory regulation and glial support.
      • Furthermore, the assessment of neuronal connectivity via pseudotyped rabies virus tracing, while innovative, has inherent limitations. The quantification of connectivity as a ratio of red-to-green fluorescence pixels may be influenced by differential viral infection efficiencies, variations in the expression levels of the TVA receptor, or even by the lower basal activity levels observed in TS21 cultures. Complementary approaches, such as electron microscopy for synaptic density analysis or functional connectivity measurements using multi-electrode arrays (MEAs), could provide additional structural and functional insights that would validate the rabies tracing data.
      • Qualification of Claims: Some conclusions, particularly those linking specific ion channel dysregulation (e.g., HCN1 loss) directly to network deficits, might be better presented as preliminary. The authors could temper their language to indicate that while the evidence is suggestive, the mechanistic link remains to be fully established.
      • Need for Additional Experiments: Additional experiments that could further consolidate the current findings include:
        • Inclusion of Inhibitory Neurons or Co-culture Systems: Incorporating interneurons or astrocytes would help determine whether the observed deficits are solely intrinsic to excitatory neurons.
        • Alternative Connectivity Assessments: Complementing the rabies virus tracing with electron microscopy or multi-electrode array (MEA) recordings would add structural and functional validation of the connectivity differences.
        • Extended Temporal Profiling: Monitoring network activity over a longer developmental window would clarify whether the observed deficits represent a delay or a permanent alteration in network maturation.
      • Reproducibility and Statistical Rigor: The methods and data presentation are largely clear, with adequate replication and appropriate statistical analyses. Nonetheless, a more detailed description of the experimental replicates, particularly regarding the viral tracing and in vivo transplantation studies, would enhance reproducibility. The availability of raw data and scripts for calcium imaging analysis would also further support independent verification.
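
To make the concern about the red-to-green pixel ratio concrete, the following minimal sketch counts supra-threshold pixels in each channel and divides the counts. The thresholds, function names, and toy images are hypothetical; the manuscript's actual image-analysis pipeline is not described here, so this should be read only as one plausible form of such a metric.

```python
def count_above(image, threshold):
    """Count pixels whose intensity exceeds the threshold."""
    return sum(1 for row in image for value in row if value > threshold)


def red_to_green_ratio(red, green, red_threshold, green_threshold):
    """Ratio of supra-threshold red pixels to supra-threshold green pixels."""
    green_count = count_above(green, green_threshold)
    if green_count == 0:
        raise ValueError("no green pixels above threshold")
    return count_above(red, red_threshold) / green_count


# Toy 2x2 images (assumed intensity units), one nested list per channel.
red = [[0, 5], [9, 9]]
green = [[9, 0], [0, 9]]

print(red_to_green_ratio(red, green, 4, 4))  # 3 red pixels / 2 green pixels
print(red_to_green_ratio(red, green, 6, 4))  # a dimmer red pixel drops out
```

The second call shows why expression level matters: the same field of view yields a different ratio once a weakly labelled pixel falls below threshold, independent of any change in connectivity.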

      Minor Comments

      • Experimental Details:

      Minor revisions could include clarifying the infection efficiency and expression levels of the viral constructs used in connectivity assays to rule out technical variability.

      • Literature Context:

      The authors reference prior studies appropriately; however, integrating a brief discussion comparing their findings with alternative DS models (e.g., organoids or other hiPSC-derived systems) would improve contextual clarity.

      • Presentation and Clarity:

      Figures are generally clear, but the manuscript contains a minor labeling error. On page 13, the figure is erroneously labeled as "Fig6A", whereas, based on the context and corresponding data, it should be "Fig5A". I recommend that the authors correct this mistake to ensure consistency and avoid potential confusion for readers.

      Significance

      • Nature and Significance of the Advance:

      The work offers a substantial conceptual advance by providing a mechanistic link between trisomy 21 and impaired neuronal network synchronization. Technically, the study integrates state-of-the-art imaging, electrophysiology, and transcriptomic profiling, thereby offering a multifaceted view of DS-related neural dysfunction. Clinically, the findings have the potential to inform future therapeutic strategies targeting network connectivity and ion channel function in Down syndrome.

      • Context in the Existing Literature:

      The study builds on previous observations of altered network activity in DS patients and DS mouse models (e.g., altered EEG synchronization and reduced synaptic connectivity). It extends these findings to human-derived neuronal models, thus bridging a gap between clinical observations and molecular/cellular mechanisms. Relevant literature includes studies on DS neurodevelopment and the role of ion channels in synaptic maturation.

      • Target Audience:

      The reported findings will be of interest to researchers in neurodevelopmental disorders, Down syndrome, and ion channel physiology. Additionally, the study may attract the attention of those working on hiPSC-derived models of neurological diseases, as well as clinicians interested in the pathophysiology of DS.

      • Keywords and Field Contextualization:

      Keywords: Down syndrome, trisomy 21, neuronal connectivity, synchronized network activity, hiPSC-derived cortical neurons, ion channel dysregulation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the referees for taking time to review our manuscript. These reviews are positive, highlighting the novelty of our findings. The majority of comments are cosmetic, and we have added data in response to some technical points. We feel that some of the additional experiments proposed would not add significant methodological depth, and cross-commenting suggests that our referees agree. At present we are attempting antibody staining to quantify Tk peptide retention in the midgut, as per suggestion by reviewer #2.

      We enclose our point-by-point response to each referee's points, below.



      Reviewer #1

      • Can the authors state in the figure legends the numbers of flies used for each lifespan and whether replicates have been done?
      • We have incorporated the requested information into legends for lifespan experiments.

      • Do the interventions shorten lifespan relative to the axenic cohort? Or do they prevent lifespan extension by axenic conditions? Both statements are valid, and the authors need to be consistent in which one they use to avoid confusing the reader.

      • We read these statements differently. The only experiment in which a genetic intervention prevented lifespan extension by axenic conditions is neuronal TkR86C knockdown (Figure 6B-C). Otherwise, microbiota shortened lifespan relative to axenic conditions, and genetic knockdowns blocked this effect (e.g. see lines 131-133). We have ensured that the framing is consistent throughout, with text edited at lines 198-199, 298-299, 311-312, 345-347, 408-409, 424-425, 450, 497-503.

      • TkRNAi consistently reduces lipid levels in axenic flies (Figs 2E, 3D), essentially phenocopying the loss of lipid stores seen in control conventionally reared (CR) flies relative to control axenic. This suggests that the previously reported role of Tk in lipid storage - demonstrated through increased lipid levels in TkRNAi flies (Song et al (2014) Cell Rep 9(1): 40) - is dependent on the microbiota. In the absence of the microbiota TkRNAi reduces lipid levels. The lack of acknowledgement of this in the text is confusing.

      • We have added text at lines 219-222 to address this point. We agree that this effect is hard to interpret biologically, since expressing RNAi in axenics has no additional effect on Tk expression (Figure S7). Consequently we can only interpret this unexpected effect as a possible off-target effect of RU feeding on TAG, specific to axenic flies. However, this possibility does not void our conclusion, because an off-target diminution of TAG cannot explain why CR flies accumulate TAG following TkRNAi. We hope that our added text clarifies.

      • I have struggled to follow the authors' logic in ablating the IPCs and feel a clear statement on what they expected the outcome to be would help the reader.

      • We have added the requested statement at lines 423-424, explaining that we expected the IPC ablation to render flies constitutively long-lived and non-responsive to A. pomorum.

      • Can the authors clarify their logic in concluding a role for insulin signalling, and qualify this conclusion with appropriate consideration of alternative hypotheses?

      • We have added our logic at lines 449-454. In brief, we conclude that insulin signalling is involved because FoxO mutant lifespan does not respond to TkRNAi, and the mutation diminishes the lifespan-shortening effect of A. pomorum. However, we cannot state that the effects are direct because we do not have data that mechanistically connects Tk/TkR99D signalling directly in insulin-producing cells. The current evidence is most consistent with insulin signalling priming responses to microbiota/Tk/TkR99D, as per the newly-added text.

      • Typographical errors

      • We have remedied the highlighted errors, at lines 128-140.

      • I'd encourage the authors to provide lifespan plots that enable comparison between all conditions

      • We have plotted our figures in faceted boxes, because the number of survival curves that would need to be presented on the same axis (e.g. 16 for Figure 5) would not be intelligible. However, we have ensured that axes on faceted plots are equivalent and include grid lines for comparison. Moreover, our approach using statistical coefficients (EMMs) enables direct quantitative comparison of the differences among conditions.

      Reviewer #2

      • Not…essential for publication…is it possible to look at Tk protein levels?
      • We have acquired a small amount of anti-TK antibody and we will attempt to immunostain guts associated with A. pomorum and L. brevis. We are also attempting the equivalent experiment in mouse colon reared with/without a defined microbiota. These experiments are ongoing, but we note that the referee feels that the manuscript is a publishable unit whether these stainings succeed or not.

      • it would be good to show that the bacterial levels are not impacted [by TkRNAi]

      • We have quantified CFUs in CR flies upon ubiquitous TkRNAi (Figure S5), finding that the RNAi does not affect bacterial load. New text at lines 138-139 articulates this point.

      • The effect of Tk RNAi on TAG is opposite in CR and Ax or CR and Ap flies, and the knockdown shows an effect in either case (Figure 2E, Figure 3D). Why is this?

      • As per response to Reviewer #1, we have added text at lines 219-222 to address this point.

      • Is it possible to perform at least one lifespan repeat with the other Tk RNAi line mentioned?

      • We have added another experiment showing longevity upon knockdown in conventional flies, using an independent TkRNAi line (Figure S3).

      • Is it possible that this driver is simply not resulting in an efficient KD of the receptor? I would be inclined to check this

      • This comment relates to Figure 7G. We do see an effect of the knockdown in this experiment, so we believe that the knockdown is effective. However the direction of response is not consistent with our hypothesis so the experiment is not informative about the role of these cells. We therefore feel there is little to be gained by testing efficacy of knockdown, which would also be technically challenging because the cells are a small population in a larger tissue which expresses the same transcripts elsewhere (i.e. necessitating FISH).

      • Would it be possible to use antibodies for acetylated histones?

      • The comment relates to Figure 4C-E. The proposed studies would be a significant amount of work because, to our knowledge, the specific histone marks which drive activation in TK+ cells remain unknown. On the other hand, we do not see how this information would enrich the present story, rather such experiments would appear to be the beginning of something new. We therefore agree with Reviewer #1 (in cross-commenting) that this additional work is not justified.

      Reviewer #3

      • In Line 243, the manuscript states that the reporter activity was not increased in the posterior midgut. However, based on the presented results in Fig4E, there is seemingly no apparent regional specificity. A more detailed explanation is necessary.
      • We thank the reviewer sincerely for their keen eye, which has highlighted an error in the previous version of the figure. In revisiting this figure we have noticed, to our dismay, that the figures for GFP quantification were actually re-plots of the figures for (ac)K quantification. This error led to the discrepancy between statistics and graphics, which thankfully the reviewer noticed. We have revised the figure to remedy our error, and the statistics now match the boxplots and results text.

      • Fig1C uses Adh for normalization. Given the high variability of the result, the authors should (1) check whether Adh expression levels changed via bacterial association

      • We selected Adh on the basis of our RNAseq analysis, which showed it was not different between AX and CV guts, whereas many commonly-used “housekeeping” genes were. We have now added a plot to demonstrate (Figure S2).

      • The statement in Line 82 that EEs express 14 peptide hormones should be supported with an appropriate reference

      • We have added the requested reference (Hung et al, 2020) at line 86.

      • Tk+ EEC activity should be assessed directly, rather than relying solely on transcript levels. Approaches such as CaLexA or GCaMP could be used.

      • We agree with reviewers 1-2 (in cross-commenting) that this proposal is non-trivial and not justified by the additional insight that would be gained. As described above, we are attempting to immunostain Tk, which if successful will provide a third line of evidence for regulation of Tk+ cells. However we note that we already have the strongest possible evidence for a role of these cells via genetic analysis (Figure 5).

      • While the difficulty of maintaining lifelong axenic conditions is understandable, it may still be feasible to assess the induction of Tk (i.e. Tk transcription or EE activity upregulation) by the microbiome in males.

      • As the reviewer recognises, maintaining axenic experiments for months on end is not trivial. Given the tendency for males either to simply mirror female responses to lifespan-extending interventions, or to not respond at all, we made the decision in our work to only study females. We have instead emphasised in the manuscript that results are from female flies.

      • TkR86C, in addition to TkR99D, may be involved in the A. pomorum-lifespan interaction. Consider revising the title to refer more generally to the "tachykinin receptor" rather than only TkR99D.

      • We disagree with this interpretation: the results do not show that TkR86C-RNAi recapitulates the effect of enteric Tk-RNAi. A potentially interesting interaction is apparent, but the data do not support a causal role for TkR86C. A causal role is supported only for TkR99D, knockdown of which recapitulates the longevity of axenic flies and TkRNAi flies. We therefore feel that our current title is justified by the data, and a more generic version would misrepresent our findings.

      • The difference between "aging" and "lifespan" should also be addressed.

      • The smurf phenotype is a well-established metric of healthspan. Moreover, lifespan is the leading aggregate measure of ageing. We therefore feel that the use of “ageing” in the title is appropriate.

      • If feasible, assessing foxo activation would add mechanistic depth. This could be done by monitoring foxo nuclear localization or measuring the expression levels of downstream target genes.

      • Foxo nuclear localisation has already been shown in axenic flies (Shin et al, 2011). We have added text and citation at lines 402-403.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Marcu et al. demonstrate a gut-neuron axis that is required for the lifespan-shortening effects mediated by gut bacteria. They show that the presence of commensal bacteria, particularly Acetobacter pomorum, promotes Tk expression in the gut, which then binds to neuronal tachykinin receptors to shorten lifespan. Tk has also recently been reported to extend lifespan via adipokinetic hormone (Akh) signaling (Ahrentløv et al., Nat Metab 7, 2025), but the mechanism here appears distinct. The lifespan shortening by Ap via Tk seems to be partially dependent on foxo and independent of both insulin signaling and Akh-mediated lipid mobilization. Although the detailed mechanistic link to lifespan is not fully resolved, the experiments and their results clearly show the involvement of the molecules tested. This work adds a valuable dimension to our growing understanding of how gut bacteria influence host longevity. However, there are some points that should be addressed.

      1. Tk+ EEC activity should be assessed directly, rather than relying solely on transcript levels. Approaches such as CaLexA or GCaMP could be used.
      2. In Line 243, the manuscript states that the reporter activity was not increased in the posterior midgut. However, based on the results presented in Fig 4E, there is seemingly no apparent regional specificity. A more detailed explanation is necessary.
      3. If feasible, assessing foxo activation would add mechanistic depth. This could be done by monitoring foxo nuclear localization or measuring the expression levels of downstream target genes.
      4. Fig 1C uses Adh for normalization. Given the high variability of the result, the authors should (1) check whether Adh expression levels changed via bacterial association and/or (2) compare the results using different genes as the internal standard.
      5. While the difficulty of maintaining lifelong axenic conditions is understandable, it may still be feasible to assess the induction of Tk (i.e. Tk transcription or EE activity upregulation) by the microbiome in males.
      6. We also had some concerns regarding the wording of the title. Fig 6B and C suggest that TkR86C, in addition to TkR99D, may be involved in the A. pomorum-lifespan interaction. Consider revising the title to refer more generally to the "tachykinin receptor" rather than only TkR99D. The difference between "aging" and "lifespan" should also be addressed. While the study shows a role for Tk in lifespan, assessment of aging phenotypes (e.g. climbing assay, ISC proliferation) beyond the smurf assay is required to make conclusions about aging.
      7. The statement in Line 82 that EEs express 14 peptide hormones should be supported with an appropriate reference, if available.

      Referees cross-commenting

      I agree with the other reviewers that the study has been done very well and hence additional experiments, such as calcium imaging, are not mandatory for publication. However, I still believe that testing Tk's elevation by Ap in males would greatly increase the generality of the finding, whatever the outcome. Too many studies use only females.

      Significance

      General assessment

      The main strength of this study is the careful and extensive lifespan analyses, which convincingly demonstrate the role of gut microbiota in regulating longevity. The authors clarify an important aspect of how microbial factors contribute to lifespan control. The main limitation is that the study primarily confirms the involvement of previously reported signaling pathways, without identifying novel molecular players or previously unrecognized mechanisms of lifespan regulation.

      Advance

      The lifespan-shortening effect of Acetobacter pomorum (Ap) has been reported previously, as has the lifespan-shortening effect of Tachykinin (Tk). However, this study is the first to link these two factors mechanistically, which represents a significant and original contribution to the field. The advance is primarily mechanistic, providing new insight into how microbial cues converge on host signaling pathways to influence ageing.

      Audience

      This work will be of particular interest to a specialized audience of basic researchers in ageing biology. It will also attract interest from microbiome researchers who are investigating host-microbe interactions and their physiological consequences. The findings will be useful as a conceptual framework for future mechanistic studies in this area.

      Field of expertise

      Drosophila ageing, lifespan, microbiome, metabolism

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The main finding of this work is that microbiota impacts lifespan through regulating the expression of a gut hormone (Tk) which in turn acts on its receptor expressed on neurons. This conclusion is robust and based on a number of experimental observations, carefully using techniques in fly genetics and physiology: 1) microbiota regulates Tk expression, 2) lifespan reduction by microbiota is absent when Tk is knocked down in gut (specifically in the EEs), 3) Tk knockdown extends lifespan and this is recapitulated by knockdown of a Tk receptor in neurons. These key conclusions are very convincing. Additional data are presented detailing the relationship between Tk and insulin/IGF signalling and Akh in this context. These are two other important endocrine signalling pathways in flies. The presentation and analysis of the data are excellent.

      There are only a few experiments or edits that I would suggest as important to confirm or refine the conclusions of this manuscript. These are:

      1. When comparing the effects of microbiota (or single bacterial species) in different genetic backgrounds or experimental conditions, I think it would be good to show that the bacterial levels are not impacted by the other intervention(s). For example, the lifespan results observed in Figure 2A are consistent with Tk acting downstream of the microbes but also with Tk RNAi having an impact on the microbiota itself. I think this simple, additional control could be done for a few key experiments. Similarly, the authors could compare the two bacterial species to see if the differences in their effects come from different ability to colonise the flies.
      2. The effect of Tk RNAi on TAG is opposite in CR and Ax or CR and Ap flies, and the knockdown shows an effect in either case (Figure 2E, Figure 3D). Why is this? Better clarification is required.
      3. With respect to insulin signalling, all the experiments bar one indicate that insulin is mediating the effects of Tk. The one experiment that does not is using dilpGS to knock down TkR99D. Is it possible that this driver is simply not resulting in an efficient KD of the receptor? I would be inclined to check this, but as a minimum I would be a bit more cautious with the interpretation of these data.
      4. Is it possible to perform at least one lifespan repeat with the other Tk RNAi line mentioned? This would further clarify that there are no off-target effects that can account for the phenotypes.

      There are a few other experiments that I could suggest as I think they could enrich the current manuscript, but I do not believe they are essential for publication:

      5. The manuscript could be extended with a little more biochemical/cell biology analysis. For example, is it possible to look at Tk protein levels, Tk levels in circulation, or even TkR receptor activation or activation of its downstream signalling pathways? Comparing Ax and CR or Ap and CR one would expect to find differences consistent with the model proposed. This would add depth to the genetic analysis already conducted. Similarly, for insulin signalling - would it be possible to use some readout of the pathway activity and compare between Ax and CR or Ap and CR?
      6. The authors use a pan-acetyl-K antibody but are specifically interested in acetylated histones. Would it be possible to use antibodies for acetylated histones? This would have the added benefit that one can confirm the changes are not in the levels of histones themselves.
      7. I think the presentation of the results could be tightened a bit, with fewer sections and one figure per section.

      Referees cross-commenting

      Reviewer 1

      I generally agree with this reviewer but for

      "I'm convinced by the data showing that FOXO is required for TkRNAi to prevent lifespan shortening by Ap, but FOXO doesn't only respond to insulin signalling and can't be taken by itself to indicate a role for insulin signalling which the authors appear to do here."

      To the best of my knowledge, Foxo has only been shown to be required for lifespan extension/modulation by a reduction in insulin-like signalling. I.e. it does respond to other pathways but this is the only one where Foxo activity is known to modulate lifespan.

      Reviewer 3

      I agree with reviewer 1 that the point raised under (1) does not appear strictly required for the conclusions of the manuscript.

      Both reviewers 1 and 3:

      I have a different take on the results of experiments where IPCs are manipulated. To me, Figure 7D and E show that ablating the IPCs removes the difference between Ax and Ap, i.e. the IPCs are involved and insulin-like signalling is likely involved. The fact that RNAi against the TkR99D receptor does not have the same effect does not matter (the sensing could happen in different neurons). Similarly, dilp expression is only a minor readout of what is happening with insulin-like signalling - dilps are controlled at the level of secretion.

      However, I would be happy for the authors to present different arguments and make a reasonable conclusion, which may differ from mine. But I think the arguments I present above should be taken into account.

      Significance

      The main contribution of this manuscript is the identification of a mechanism that links the microbiota to lifespan. This is very exciting and topical for several reasons:

      1) The microbiota is very important for overall health but it is still unclear how. Studying the interaction between microbiota and health is an emerging, growing field, and one that has attracted a lot of interest, but one that is often lacking in mechanistic insight. Identifying mechanisms provides opportunities for therapies. The main impact of this study comes from using the fruit fly to identify a mechanism.

      2) It is very interesting that the authors focus on an endocrine mechanism, especially with the clear clinical relevance of gut hormones to human health recently demonstrated with new, effective therapies (e.g. Wegovy).

      3) Tk is emerging as an important fly hormone and this study adds a new and interesting dimension by placing Tk between microbiota and lifespan.

      I think the manuscript will be of great interest to researchers in ageing, human and animal physiology and in gut endocrinology and gut function.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this study the authors use a Drosophila model to demonstrate that Tachykinin (Tk) expression is regulated by the microbiota. In Drosophila, conventionally reared (CR) flies are typically shorter lived than those raised without a microbiota (axenic). Here, knockdown of Tk expression is found to prevent lifespan shortening by the microbiota and the reduction of lipid stores typically seen in CR flies when compared to axenic counterparts. It does so without reducing food intake or fecundity, which are often seen as necessary trade-offs for lifespan extension. Further, the strength of the interaction between Tk and the microbiota is found to be bacteria-specific and is stronger in Acetobacter pomorum (Ap) monoassociated flies compared to Levilactobacillus brevis (Lb) monoassociation. The impact on lipid storage was also only apparent in Ap-flies. Building on these findings, the authors show that gut-specific knockdown is largely sufficient to explain these phenotypes. Knockdown of the Tk receptor, TkR99D, in neurons recapitulates the lifespan phenotype of intestinal Tk knockdown, supporting a model whereby Tk from the gut signals to TkR99D-expressing neurons to regulate lifespan. In addition, the authors show that FOXO may have a role in lifespan regulation by the Tk-microbiota interaction. However, they rule out a role for insulin-producing cells or Akh-producing cells, suggesting the microbiota-Tk interaction regulates lifespan through other, as yet unidentified, mechanisms.

      Major comments:

      Overall, I find the key conclusions of the paper convincing. The authors present an extensive amount of experimental work, and their conclusions are well founded in the data. In particular, the impact of TkRNAi on lifespan and lipid levels, the central finding in this study, has been demonstrated multiple times in different experiments and using different genetic tools. As a result, I don't feel that additional experimental work is necessary to support the current conclusions. However, I find it hard to assess the robustness of the lifespan data from the other manipulations used (TkR99DRNAi, TkRNAi in dFoxo mutants etc.) because information on the population size and whether these experiments have been replicated is lacking. Can the authors state in the figure legends the numbers of flies used for each lifespan and whether replicates have been done? For all other data it is clear how many replicates have been done, and the methods give enough detail for all experiments to be reproduced.

      Minor comments:

      While I feel the conclusions of this study are well supported by the data, I found this to be a complex read and in places hard to follow. I feel some work is necessary in the writing to help the reader follow the authors' logic. Below I describe some of the issues that confused me and provide some suggestions that I hope the authors will find helpful.

      Survival curves

      The authors state that the lifespan difference between CR and axenic flies disappears with TkRNAi because TkRNAi CR flies are longer lived, rather than because TkRNAi axenic flies are shorter lived. Is this consistent in every TkRNAi experiment? It's hard for the reader to assess this because the relevant lifespan curves are presented on separate plots. I'd encourage the authors to provide lifespan plots that enable comparison between all conditions. For example, in figures 2 and 6 the reader wants to directly compare between RU- and RU+ but can't easily do so. Additional plots could be made available in the supplementary figures showing the comparisons that are not easy to make on the main figures.

      Consistent framing of the data

      Do the interventions shorten lifespan relative to the axenic cohort? Or do they prevent lifespan extension by axenic conditions? Both statements are valid, and the authors need to be consistent in which one they use to avoid confusing the reader. For example, line 325 says TkR86CRNAi prevents lifespan extension in axenic flies. Given the framing in the previous sections, it might be clearer to say that TkR86CRNAi shortens the lifespan of axenic flies to that of CR flies in contrast to TkRNAi and TkR99DRNAi which don't.

      The impact of TkRNAi on lipid levels in axenic flies

      TkRNAi consistently reduces lipid levels in axenic flies (Figs 2E, 3D), essentially phenocopying the loss of lipid stores seen in control conventionally reared (CR) flies relative to control axenic. This suggests that the previously reported role of Tk in lipid storage - demonstrated through increased lipid levels in TkRNAi flies (Song et al (2014) Cell Rep 9(1): 40) - is dependent on the microbiota. In the absence of the microbiota TkRNAi reduces lipid levels. The lack of acknowledgement of this in the text is confusing for the reader because it is inconsistent with the microbiota driving both higher Tk expression and higher lipid storage. If the microbiota increases Tk expression and this results in reduced lipid storage, why does reduced Tk expression also result in reduced lipid storage in axenic flies? This could further highlight the unique impact that the interaction between TkRNAi and the microbiota has on lipid storage, given it reverses both the impact of the microbiota alone and TkRNAi alone. I feel this aspect of the data should be given more attention in the text both for clarity and because it may be telling us something important about the function of Tk. The current framing around pleiotropic effects is valid, and the impact of Tk on lipid storage is clearly independent of its impact on lifespan and so is not central to this study. However, I feel a short additional paragraph to acknowledge this nuance of the data is needed. It can be made clear in the text that further exploration is beyond the scope of the current study.

      Role of insulin signalling and insulin producing cells

      I'm convinced by the data showing that FOXO is required for TkRNAi to prevent lifespan shortening by Ap, but FOXO doesn't only respond to insulin signalling and can't be taken by itself to indicate a role for insulin signalling which the authors appear to do here.

      I would expect ablation of IPCs to have the opposite effect to foxo mutation and to increase FOXO activity throughout the organism due to a reduction in Dilp levels and so reduced insulin signalling. So, I have struggled to follow the authors' logic in ablating the IPCs and feel a clear statement on what they expected the outcome to be would help the reader. They find that TkRNAi still prevents lifespan shortening by Ap when IPCs are ablated and that TkR99DRNAi in IPCs also doesn't block lifespan shortening by Ap despite reducing the expression of dilp3 and dilp5. To me these data rule out a role for insulin signalling despite the requirement for FOXO and yet the authors conclude that insulin signalling is involved in the response to Ap and TkRNAi, although not obligately (lines 420 - 422 and 511 - 512). Can the authors clarify their logic in concluding a role for insulin signalling, and qualify this conclusion with appropriate consideration of alternative hypotheses? The potential involvement of other signalling inputs to FOXO activity, e.g. immune signalling and JNK, should be acknowledged and warrants some discussion.

      Typographical errors:

      Incomplete sentence line 121 to 122 - starting "Cox proportional hazards.... and posthoc tests (Fig 2b)".

      Line 123 "EMMs" - define abbreviation on first use

      References to Fig 2b (first given on line 122), should be capitalised to Fig 2B for consistency.

      Lines 231 and 317 - the phrase "steady state (microbiota independent) expression" in reference to flyATLAS 2 data could be misleading. The term "microbiota independent" could suggest that expression levels have been shown not to be regulated by the microbiota and this is not the case. The authors should change this to simply state they are referring to steady state expression in conventionally reared flies.

      Referees cross-commenting

      Below are brief comments on the revision suggestions that reviewers 2 and 3 have requested.

      Reviewer 2

      1. I agree that confirmation that TkRNAi doesn't impact microbial levels could be helpful and would be straightforward for the authors to do. However, I don't feel it's essential to support the central claims of the paper.
      2. I agree.
      3. I don't feel that any of these experiments supports a role for insulin signalling, so I don't feel that this additional control is needed.
      4. It would be a good addition to have lifespan data from a separate knockdown line for corroboration. However, this has already been done in several different genetic backgrounds through crosses with different driver lines in multiple tissues, so I feel it's unnecessary given the time and resources that lifespan experiments take. There's also the caveat that different RNAi lines can knockdown to different extents so that would have to be assessed as well and if there's a difference it may mean that ultimately not much can be concluded from this additional experiment.
      5. A good suggestion, but not straightforward and depends on the availability of the necessary tools, or possibly the generation of new tools. One for a follow up study.
      6. I feel this is not important enough to the central findings of the study to warrant the extra work.
      7. I agree.

      Reviewer 3

      1. Imaging calcium signalling is not straightforward unless a lab already has the tools available and optimised. If Tk+ EEs show changes in calcium signalling I'm not convinced that this tells us anything specific to the Tk-microbiota interaction. The point is the role of Tk itself, not the broader activity of the cells that express it.
      2. I agree this needs clarification.
      3. I agree that this would add depth, if feasible, but feel it's not essential to support the current conclusions.
      4. This is a minor point and given the RT-qPCR data and the RNAseq data corroborate each other I'm convinced that Tk levels are elevated.
      5. I feel exploring this in males is opening an additional line of enquiry beyond the scope of the current study. Either the phenotypes are the same - in which case what is added? - or they are different but there's no scope to assess why. A good suggestion for a follow up study.
      6. No comment.
      7. Agreed.

      One final comment. It's true that FOXO has only been shown to regulate lifespan in the context of insulin signalling. However, as far as I'm aware it hasn't been shown not to regulate lifespan downstream of its other activators; this simply hasn't been explored due to the historical focus on insulin signalling in this field. In the context of host-microbiota interactions, considering other pathways that activate FOXO, such as immune and JNK signals, would make sense.

      Reviewed by Dr Rebecca Clark, Department of Biosciences, Durham University

      Significance

      Overall, I find the key conclusions of the paper convincing. The authors present an extensive amount of experimental work, and their conclusions are well founded in the data. We have known that the microbiota influence lifespan for some time but the mechanisms by which they do so have remained elusive. This study identifies one such mechanism and as a result opens several avenues for further research. The Tk-microbiota interaction is shown to be important for both lifespan and lipid homeostasis, although it's clear these are independent phenotypes. The fact that the outcome of the Tk-microbiota interaction depends on the bacterial species is of particular interest because it supports the idea that manipulation of the microbiota, or specific aspects of the host-microbiota interaction, may have therapeutic potential.

      These findings will be of interest to a broad readership spanning host-microbiota interactions and their influence on host health. They move forward the study of microbial regulation of host longevity and have relevance to our understanding of microbial regulation of host lipid homeostasis. They will also be of significant interest to those studying the mechanisms of action and physiological roles of Tachykinins.

      Field of expertise: Drosophila, gut, ageing, microbiota, innate immunity

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for providing thoughtful and constructive feedback, which will help us improve the clarity and rigor of the paper. On balance, the reviews were positive. Reviewer 1 mentioned that “This is a strong manuscript with few problems and all important findings well justified, indeed this is a nicely polished…..high-quality manuscript,” and that “this paper makes a major breakthrough, showing that cell autonomous defects in hTSCs are very likely at the heart of the pathology observed in GIN-prone murine mutants.” Reviewer 3 stated that “The study is well designed, and the manuscript is very well written. The conclusions are supported by the evidence presented.” Reviewer 2 was less enthusiastic, with the main concern being that “The paper is mostly descriptive and often quite confusing leaving one not much closer to understanding the mechanistic basis for the interesting sex-biased semi-lethal phenotype.” Reviewer 2 also felt that figure titles/section headers overstated the results, and recommended improving some technical aspects and tempering conclusions. We think the proposed edits address most issues raised by the reviewers, either through rewriting or by adding data as described below.

      In response to reviewer #1 comments:

      Major comments:

      • I am confused as to the basis of the sex-skewing phenomenon? Is the problem that lack of maternally loaded WT Mcm4 worsens the phenotype, or is the issue that Mcm4C3/C3 dams are less able to retain pregnancies, perhaps being a more inflammatory environment? Also, while there is quite consistent evidence for reduced viability of Mcm4C3/C3McmGt/+ progeny, especially for female progeny, how confident can we be that the genotype of the dam vs. sire is important? Notably on a Ddx58 background, the progeny of the Mcm4C3/C3 sire included seven live male Mcm4C3/C3McmGt/+ but no female.

      Regarding the first point (sex skewing only when female is C3/C3), we also suspected either: 1) the maternal uterine environment, or 2) reduced oocyte quality. Although not reported in this manuscript, we tested #1 by performing embryo transfer experiments. Transferring 2-cell stage embryos from sex-skewing mating to WT females did not rescue the sex-bias. We then examined oocytes from C3/C3 females. We found evidence for compromised mitochondria and transcriptome disruption. However, we are not sure why this happens (poor follicle support? Oocyte intrinsic phenomenon?). We are reserving these results and additional experiments for another paper, especially since this one mainly deals with GIN and placenta development. If the reviewers feel strongly that the embryo transfer data is crucial, we can include it.

      Regarding how confident we are that the genotype of the dam vs. sire is important, this stems from our previous paper by McNairn et al 2019 (the percentage of female C3/C3 M2/+ from sex-skewing mating is 20% compared to 60% from the reciprocal mating), which was quite dramatic. Consistent with this, MCM levels were significantly reduced in the placentae only when the dam was C3/C3 and the sire C3/+ M2/+, but not in the reciprocal cross. The reviewer makes a good observation about the Ddx58 cross; we can only hypothesize that the mutation somehow sensitizes females in this scenario and will make mention of it in the revision. We also realize that we neglected to write in Methods that the Ddx58 allele was coisogenic in the C3H background.

      • I'm not sure what Supplementary Figure 6 is showing (faster differentiation of C3 but less TGC?). Regardless, it's hard to draw much of a conclusion from one not-very-pretty Western blot. This figure requires both additional replicates and a better explanation of how it fits with the other conclusions of the paper.

      We hypothesized that the JZ defect observed in the semi-lethal genotype placentas could arise either from impaired maintenance of the progenitor pool or from reduced capacity of mutant trophoblast progenitors to differentiate into the JZ lineage. The blot in Supplementary Figure 6 was intended as a qualitative demonstration that mutant trophoblast stem cells can differentiate into JZ lineages. We recognize that the figure is not definitive and will revise the text to clarify its purpose. A replicate(s) of the Western will be performed as suggested.

      • Supplementary Figure 7F-G is puzzling. Half of the mESCs have gamma-H2AX at all times, including most in S or G2 phase? In Figure S7E, do the quadrants correspond to being negative or positive for gamma-H2AX? At very least, IF images showing clear gamma-H2AX foci would be much more convincing.

      The gates for γH2AX FACS analysis were established using negative controls lacking primary antibody. As reported previously, embryonic stem cells display high basal levels of γH2AX staining (Chuykin et al., Cell Cycle 2008; Turinetto et al., Stem Cells 2012; Ahuja et al., Nat Comm 2016), which likely explains the broad signal observed across cell cycle phases. Regardless, we will provide immunofluorescence staining of γH2AX and foci counts in our revision.

      • The methods section is well detailed, but it would be ideal to clarify how many replicates each Western Blot or flow cytometry experiment is representative of.

      Thanks for the suggestion. We will update this for Fig. 4 and Fig. 5.

      Minor comments:

      • Is it possible that cGAS-STING and RIG pathways act redundantly to cause inflammation and lethality, or that other innate immune components are involved? I don't expect the authors to make compound mutants to test this but at least this possibility should be discussed textually.

      We appreciate the reviewer’s point, and had the same suspicion. Supporting this, new RNA-seq analysis of Tmem173 KO placentas, which we will add, revealed elevated inflammatory gene expression compared to C3/C3 M2/+ controls, consistent with potential redundancy or feedback regulation. We will update the supplementary figures to reflect this.

      In response to reviewer #2 comments:

      Major comments:

      A major concern throughout the paper is that conclusions are often overstating their data. The title of figure 2 is "placentae with replication stress have smaller junctional and labyrinth zones". However, there is no measure of replication stress in this figure, just a histological evaluation of the placentae from the different mutants. The title of figure 3 is "Impact of GIN on LZ is less than JZ," but there is no measure of GIN, but instead measurement of number of cells in cell cycle and some bulk RNA-seq analysis. Title of figure 4 is "TSCs with increased genomic instability exhibit abnormal phenotypes." Again there is no measure of GIN, but instead staining of derived TSCs for proliferation, cell death, and a TSC marker. Title of figure 5 is "DNA damage responses and G2/M checkpoint activation drive premature TSC differentiation." However, there does not appear to be a difference in gH2AX between the two mutant genotypes. Checkpoint proteins might be up, but need quantification and reproduction. > 4C is the only marker of differentiation. Importantly, all the analyses here are associations, not connections, so cannot use the word "drive". Similar issues can be raised with a number of the supplementary figures.

      The Chaos3 (chromosome aberrations occurring spontaneously 3) model is a well-established system of intrinsic chronic replication stress and GIN. It is characterized by ~20-fold elevation of blood micronuclei (Shima et al., Nature 2007), a hallmark of GIN (Soxena et al., Mol Cell 2022); a destabilized MCM2-7 helicase prone to replication fork collapse (Bai et al., PLoS Genet 2016); and increased mitotic chromosome abnormalities and decreased dormant origins (Kawabata et al., Mol Cell 2011; Chuang et al., Nucleic Acid Res 2012) that are known to cause GIN and replication stress (Ibarra et al., PNAS 2008). Also, in our previous work (McNairn et al Nature 2019), we showed that placentae from C3/C3 dams exhibit significantly elevated γH2AX as well as reduced MCM2 and MCM4 protein levels. In our current study, we also observe elevated γH2AX in mutant TSCs (C3/C3 and C3/C3 M2/+), consistent with genomic instability. Nevertheless, we acknowledge that in TSCs, we did not formally demonstrate replication stress (RS), so where appropriate, we will revise figure titles, for example to say “cells/placentae with a GIN or RS genotype.”

      We acknowledge the reviewer's concern regarding western blots. We will provide quantification and statistics in our revision.

      1) A deeper analysis of the cell lines is likely to be the most fruitful path to reveal interesting mechanisms. It is very surprising that there is no phenotype in ESCs. Authors should check for increased apoptosis. Maybe the phenotypic cells are lost. Or do ESCs use different MCMs/mechanisms of DNA replication or are they better able to handle replication stress and GIN? How many passages were the TSCs and ESCs cultured for? Does GIN (i.e. aneuploidy, CNVs) develop in TSCs and ESCs with passaging? How do the MCM mutations impact the molecular identity of the ESC and TSC cells, including their heterogeneity in the population?

      We assessed apoptosis using cleaved caspase 3 flow cytometry in mutant ESCs and observed no difference compared to controls (we will add this data as Supplementary Fig. 7).

      We believe there are intrinsic differences between TSCs and ESCs in their ability to respond to and counteract replication stress and DNA damage. ESCs are known to license more replication origins, and at a higher rate, than somatic cells, which protects them from short G1-induced replication stress (Ahuja et al., Nat Comm 2016; Ge et al., Stem Cell Rep 2015; Matson et al., eLife 2017). Human placental cells physiologically exhibit high mutation rates and chromosomal instability in vivo (Coorens et al., Nature 2021). Supporting this, Wang et al. (Nat Comm 2025) reported that several cell cycle and DDR regulators are differentially expressed in human TSCs vs. human pluripotent stem cells. Whether such transcriptional differences directly contribute to functional outcomes remains to be determined.

      All experiments in this study were conducted using early-passage ESCs and TSCs (i.e. Finally, we showed that close to 90% of mutant ESCs are KLF4+ (a naive pluripotency marker), whereas EOMES+ cells were significantly reduced in TSCs carrying the GIN genotype (Fig. 4E–F and Supplementary Fig. 7), highlighting lineage-specific differences.

      Minor Comments:

      1) There is a lack of quantification and repeats for all Westerns. At minimum there should be three repeats for each experiment, quantification including normalization to a reference protein, and stats confirming any proposed differences between conditions.

      We will update our revision with quantification and statistics for western blots.

      2) I would recommend moving the results in supp table 1 to figure 1. While negative, they are the newer results. The results shown in current figure 1 are essentially a reproduction of their previous work.

      The placental observations presented in Fig. 1 are new. In particular, the placental and embryonic weight measurements graphed in Fig. 1B and C have not been published by our group. Fig. 1A reproduces our previous observation on embryo viability in GIN mutants (McNairn et al., Nature 2019), while the schematic was provided for better flow and readability given the complex mating schemes. We are agnostic about Supplementary Table 1. It could be changed to a new Table 1 in the main section, depending on the journal.

      In response to reviewer #3 comments:

      Major Comments

      While the inclusion of bulk RNAseq data of whole placental tissue is appreciated, the interpretation of the results is somewhat problematic, as it is acknowledged that the cell type composition of the placentas is drastically different between groups. Making conclusions based upon GSEA analysis of two different groups with drastically different cell type composition is somewhat misleading, as based on the results, it is a direct reflection of the cell types present. It would be more helpful to perform cell type deconvolution of the RNAseq data to estimate the proportion of each cell type within the bulk samples and compare that to what is seen histologically and not dive too deeply into the pathways since the results could just be a reflection of the cell types e.g. angiogenesis pathways from more endothelial cells. Additionally, the RNAseq data can be leveraged to look at expression of inflammatory genes by sex, which may show interesting patterns based on the other results.

      We agree that the representation of cell types in the placenta is problematic, especially for underrepresented genes. We propose to use the BayesPrism tool (Chu et al., Nat Cancer 2022) to deconvolve the bulk RNA-seq data for better representation of transcriptional changes in the placenta.

      Section: GIN impairs trophoblast stem cell establishment and maintenance. To support the assertion in the first paragraph, beyond measuring apoptosis, it would be helpful at this stage to look at RNA expression levels indicative of the activation of DNA damage checkpoint genes

      We have performed RNA-seq on mutant ESCs and TSCs and are in the process of analyzing the data. We will include these results in the revision.

      Please include additional methodological details in the methods section on the statistical analysis done for differential expression analysis. Specifically, what type of normalization was used, if lowly expressed genes were filtered out and at what cutoff, what statistical model was used (did you include covariates?), what comparisons were made? Did you stratify by sex? What cutoff was used for statistical significance? Did you perform multiple testing correction?

      We will update the RNA-seq data analysis methods in our full revision.

      2. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1 comments:

      • Supplementary Table 1 would be greatly enhanced by showing comparable tables for Mcm4C3/C3 x Mcm4C3/+McmGt/+ in mice without the Tmem173 or Ddx58 mutations. It is fine to recycle data from McNairn 2019 here, as long as the source is indicated, but a comparison is needed.

      Thanks for pointing this out. We have incorporated this suggestion into Supplementary Table 1.

      • In Figure S3E-F, is the box above each graph supposed to show the genotype of the dam?

      Yes. Thanks for pointing this out. We have added a description in the figure legend to make it clear.

      • "Indeed, the placenta and embryo weights of E13.5 Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ animals were significantly improved vs. Mcm4C3/C3 Mcm2Gt/+ animals, rendering them similar to Mcm4C3/C3 littermates (Fig. 6A-C). The JZ (but not LZ) area in Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ placentae also increased to the level of Mcm4C3/C3 littermates (Fig. 6D-H)." There are two problems here. First, the figure calls are wrong. Second, the description of the data is not quite right: it looks like the C3/C3 and C3/C3 M2/+ M3/+ LZs are a similar size to each other and are statistically indistinguishable.

      Thanks for catching this. We have updated these in the main text.

      Reviewer #2 comments:

      Minor comment

      • Need to review citations to figures. For example, no citations are made to figure 4a and 4c.

      Thanks for catching this. We have updated the text.

      Reviewer #3 comments:

      Define the first use of >4C DNA content to help readers understand this potentially unfamiliar term.

      We have edited this part to indicate cells with more than 4C DNA content for better clarity.

      iDEP tool - please include citation to manuscript instead of link

      We have updated this citation.

      Check citations. Some citations are to bioRxiv preprints that are now published, e.g. 13.

      We have updated this citation.

      3. Description of analyses that authors prefer not to carry out

      Reviewer 2

      2) Along similar lines, most of the in vivo phenotypic analyses are performed at E13.5, long after defects are likely beginning to express themselves, especially given that they see phenotypes in the TSCs, which represent the polar TE of an E4.5 embryo. To understand the primary defects of the in vivo phenotype, they should be looking much earlier. Supplemental figure 5 is a start but represents a rather superficial analysis.

      The peri-implantation period, namely E4.5, represents a “black box” of embryonic development given that this is a critical stage for implantation. Aside from being an extremely difficult stage to analyze technically, we don’t think it is essential to the conclusions (or doable in a timely manner), especially given the use of TSCs. If we complete EdU studies on E6.5 embryos, we will include them.

      3) Fig. 6 would benefit from evidence that MCM3 mutant is rescuing MCM4 levels in the chromatin fraction of cells and the DNA damage phenotype.

      The genetic evidence presented is strong, and although we didn’t do the suggested experiment, we feel that our previous studies (McNairn et al., Nature 2019 and Chuang et al., PLoS Genet 2010) on the effects of MCM3 as a nuclear export factor (as it is in yeast (Liku et al., Mol Biol Cell 2005)) are a reasonable basis for not repeating such experiments. Furthermore, we are no longer maintaining the Mcm3 line and it would take over a year to reconstitute and rebreed triple mutants.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript examines chronic replication stress-mediated genomic instability in placental development and concludes that it disrupts placental development in mice. The study is well designed and the manuscript is very well written. The conclusions are supported by the evidence presented. The manuscript would be improved by addressing the comments below.

      Major Comments:

      • While the inclusion of bulk RNAseq data of whole placental tissue is appreciated, the interpretation of the results is somewhat problematic, as it is acknowledged that the cell type composition of the placentas is drastically different between groups. Making conclusions based upon GSEA analysis of two different groups with drastically different cell type composition is somewhat misleading, as based on the results, it is a direct reflection of the cell types present. It would be more helpful to perform cell type deconvolution of the RNAseq data to estimate the proportion of each cell type within the bulk samples and compare that to what is seen histologically and not dive too deeply into the pathways since the results could just be a reflection of the cell types e.g. angiogenesis pathways from more endothelial cells. Additionally, the RNAseq data can be leveraged to look at expression of inflammatory genes by sex, which may show interesting patterns based on the other results.

      • Section: GIN impairs trophoblast stem cell establishment and maintenance. To support the assertion in the first paragraph, beyond measuring apoptosis, it would be helpful at this stage to look at RNA expression levels indicative of the activation of DNA damage checkpoint genes

      Minor Comments:

      • Define the first use of >4C DNA content to help readers understand this potentially unfamiliar term.

      • Please include additional methodological details in the methods section on the statistical analysis done for differential expression analysis. Specifically, what type of normalization was used, if lowly expressed genes were filtered out and at what cutoff, what statistical model was used (did you include covariates?), what comparisons were made? Did you stratify by sex? What cutoff was used for statistical significance? Did you perform multiple testing correction?

      • iDEP tool - please include citation to manuscript instead of link

      • Check citations. Some citations are to bioRxiv preprints that are now published, e.g. 13.

      Significance

      The manuscript concludes that replication stress-induced genomic instability impairs placental development in mice. This is a significant advance in the field, as it mechanistically links genomic instability to placental development, with further study needed in human trophoblast to establish clinical relevance. Strengths of this manuscript include solid study design, interpretation and presentation (both writing and figures). Weaknesses of the manuscript reside primarily in the RNAseq analysis results, methods and interpretation. The manuscript is of interest to audiences with interests in genome maintenance, development and placental biology. To contextualize this reviewer's point of view, this review is based on expertise in genomics, computational biology and placental biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript, "Chronic replication stress-mediated genomic instability disrupts placenta development in mice" by Munisha et al., follows up a 2019 paper in Nature by the same group where they show that mutations to the MCM genes lead to a sex-skewed semi-lethal phenotype starting after embryonic day 9.5 and extending to birth. In the paper, they hypothesized that the semi-lethality is secondary to genomic instability (GIN)-driven inflammation due to activation of the innate immune pathways sensing cytoplasmic DNA. In this paper, they start by disproving that hypothesis and then go on to present data arguing lethality is due to a placental development defect rather than inflammation. The paper is mostly descriptive and often quite confusing, leaving one not much closer to understanding the mechanistic basis for the interesting sex-biased semi-lethal phenotype that was described in their original paper. The most interesting aspect of the paper is the derivation of TSCs and ESCs and the initial analysis suggesting that the TSCs are more sensitive to the MCM mutations, but the analysis is rather shallow. Importantly, it is unclear how the phenotype explains the sex-skewing. Are the TSC phenotypes sex-skewed and if so why? Also, why are the JZ and especially GlyTCs most affected?

      A major concern throughout the paper is that conclusions are often overstating their data. The title of figure 2 is "placentae with replication stress have smaller junctional and labyrinth zones". However, there is no measure of replication stress in this figure, just a histological evaluation of the placentae from the different mutants. The title of figure 3 is "Impact of GIN on LZ is less than JZ," but there is no measure of GIN, but instead measurement of number of cells in cell cycle and some bulk RNA-seq analysis. Title of figure 4 is "TSCs with increased genomic instability exhibit abnormal phenotypes." Again there is no measure of GIN, but instead staining of derived TSCs for proliferation, cell death, and a TSC marker. Title of figure 5 is "DNA damage responses and G2/M checkpoint activation drive premature TSC differentiation." However, there does not appear to be a difference in gH2AX between the two mutant genotypes. Checkpoint proteins might be up, but need quantification and reproduction. > 4C is the only marker of differentiation. Importantly, all the analyses here are associations, not connections, so cannot use the word "drive". Similar issues can be raised with a number of the supplementary figures.

      Major Comments:

      1) A deeper analysis of the cell lines is likely to be the most fruitful path to reveal interesting mechanisms. It is very surprising that there is no phenotype in ESCs. Authors should check for increased apoptosis. Maybe the phenotypic cells are lost. Or do ESCs use different MCMs/mechanisms of DNA replication or are they better able to handle replication stress and GIN? How many passages were the TSCs and ESCs cultured for? Does GIN (i.e. aneuploidy, CNVs) develop in TSCs and ESCs with passaging? How do the MCM mutations impact the molecular identity of the ESC and TSC cells, including their heterogeneity in the population?

      2) Along similar lines, most of the in vivo phenotypic analyses are performed at E13.5, long after defects are likely beginning to express themselves, especially given that they see phenotypes in the TSCs, which represent the polar TE of an E4.5 embryo. To understand the primary defects of the in vivo phenotype, they should be looking much earlier. Supplemental figure 5 is a start but represents a rather superficial analysis.

      3) Fig. 6 would benefit from evidence that MCM3 mutant is rescuing MCM4 levels in the chromatin fraction of cells and the DNA damage phenotype.

      Minor Comments:

      1) There is a lack of quantification and repeats for all Westerns. At minimum there should be three repeats for each experiment, quantification including normalization to a reference protein, and stats confirming any proposed differences between conditions.

      2) I would recommend moving the results in supp table 1 to figure 1. While negative, they are the newer results. The results shown in current figure 1 are essentially a reproduction of their previous work.

      3) Need to review citations to figures. For example, no citations are made to figure 4a and 4c.

      Significance

      As is, the study does not provide much new insight or understanding of how the MCM mutants are driving the sex-skewed semi-lethal phenotype. It would likely take much effort (months) to reach such a goal. However, without such effort, it is unclear what the significance of the story is. It does make the observation that the placenta appears to be impacted more severely and earlier than the embryo, and that within the placenta, certain zones and cell types are more vulnerable. The reasons for these differential impacts are unclear though.

      If the authors choose not to dig deeper as suggested in the major comments, then at a minimum it would be important to soften their conclusions as raised in the summary and at least perform experiments/edits proposed in minor comments.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In a previous paper (McNairn et al. 2019 "Female-biased embryonic death from inflammation induced by genomic instability" Nature), the Schimenti lab demonstrated that mouse embryos with hypomorphic mutations of the heterohexameric minichromosome maintenance complex, mutations that cause increased genomic instability (GIN), show reduced embryonic viability, with greater loss of female embryos and some parent-of-origin effect. Treatment with immunosuppressants, including ibuprofen and testosterone, partially rescued the observed lethality.

      In this new manuscript, the Schimenti lab demonstrates that these GIN-prone mutants feature smaller placentas with fewer cells. Mutations that interfere with the ability of the innate immune system to respond to micronuclei (a consequence of GIN) have no protective effect. Munisha and colleagues then demonstrate that MCM-mutant TSCs are harder to derive and show elevated apoptosis and a greater propensity for differentiation. The mutant TSCs show CHK1 phosphorylation, P53 phosphorylation and higher P21 levels, all consistent with a response to DNA damage. Downstream of this, they also show loss and inhibition of CDK1, which is already established to cause G2/M arrest (generally) and endoreduplication (specifically in trophoblast). The authors advance a model in which GIN results in loss of the TSC pool by apoptosis, cell cycle arrest and premature differentiation, resulting in smaller placentas and particularly fewer junctional zone cells. How this causes inflammation is less clear, but inflammation appears to be a downstream effect rather than cause of poor placentation.

      Major comments:

      This is a strong manuscript with few problems and all important findings well justified; indeed, this is a nicely polished manuscript for something just entering peer review. There are a few unclear points textually and a couple of places in the supplementary figures where better data quality would help, but generally it is a high-quality manuscript.

      • I am confused as to the basis of the sex-skewing phenomenon. Is the problem that lack of maternally loaded WT Mcm4 worsens the phenotype, or is the issue that Mcm4C3/C3 dams are less able to retain pregnancies, perhaps being a more inflammatory environment? Also, while there is quite consistent evidence for reduced viability of Mcm4C3/C3McmGt/+ progeny, especially for female progeny, how confident can we be that the genotype of the dam vs. sire is important? Notably on a Ddx58 background, the progeny of the Mcm4C3/C3 sire included seven live male Mcm4C3/C3McmGt/+ but no female.

      • I'm not sure what Supplementary Figure 6 is showing (faster differentiation of C3 but less TGC?). Regardless, it's hard to draw much of a conclusion from one not-very-pretty Western blot. This figure requires both additional replicates and a better explanation of how it fits with the other conclusions of the paper.

      • Supplementary Figure 7F-G is puzzling. Half of the mESCs have gamma-H2AX at all times, including most in S or G2 phase? In Figure S7E, do the quadrants correspond to being negative or positive for gamma-H2AX? At the very least, IF images showing clear gamma-H2AX foci would be much more convincing.

      • The methods section is well detailed, but it would be ideal to clarify how many replicates each Western Blot or flow cytometry experiment is representative of.

      The required additional experiments re: Supplementary Figure 6 and 7 could be conducted in a couple of months.

      Minor comments:

      • Supplementary Table 1 would be greatly enhanced by showing comparable tables for Mcm4C3/C3 x Mcm4C3/+McmGt/+ in mice without the Tmem173 or Ddx58 mutations. It is fine to recycle data from McNairn 2019 here, as long as the source is indicated, but a comparison is needed.

      • Is it possible that cGAS-STING and RIG pathways act redundantly to cause inflammation and lethality, or that other innate immune components are involved? I don't expect the authors to make compound mutants to test this but at least this possibility should be discussed textually.

      • In Figure S3E-F, is the box above each graph supposed to show the genotype of the dam?

      • "Indeed, the placenta and embryo weights of E13.5 Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ animals were significantly improved vs. Mcm4C3/C3 Mcm2Gt/+ animals, rendering them similar to Mcm4C3/C3 littermates (Fig. 6A-C). The JZ (but not LZ) area in Mcm4C3/C3 Mcm2Gt/+ Mcm3Gt/+ placentae also increased to the level of Mcm4C3/C3 littermates (Fig. 6D-H)." There are two problems here. First, the figure calls are wrong. Second, the description of the data is not quite right: it looks like the C3/C3 and C3/C3 M2/+ M3/+ LZs are a similar size to each other and are statistically indistinguishable.

      Significance

      I partially discussed the above in the summary, but this paper makes a major breakthrough, showing that cell-autonomous defects in TSCs are very likely at the heart of the pathology observed in GIN-prone murine mutants.

      Some questions remain unresolved. Why are TSCs more prone to die in response to GIN than mESCs, particularly in light of the general observation that karyotypic abnormality is more common in the placental lineage? How does the placental abnormality give rise to inflammation? No manuscript can answer every question, and I think this is a mature manuscript that can be published in a good journal with limited modifications.

      I am an expert on gene regulation in placental development, with somewhat less expertise in the DNA damage field. The placenta field will find this paper interesting, as will the DNA damage field. There are also ramifications for cancer research. The question of why some cells tolerate high levels of DNA damage and others die is very relevant to cancer.