  1. Dec 2022
    1. Author Response

      Reviewer #1 (Public Review):

      This is a very interesting paper trying to quantify excess deaths due to the COVID-19 pandemic in the USA. The paper is roughly divided into two main sections. In the first section, the authors estimate age and cause-specific excess mortality. In the second section, using their excess mortality estimates, the authors attempt to disentangle the impact of SARS-CoV-2 infection (direct impact) vs. the impact of NPIs on this excess mortality (indirect impact). I have some concerns, particularly with respect to the second section.

      The model used to estimate excess mortality is quite clear. The authors adjust the baseline model to account for low influenza circulation (and deaths) during the COVID-19 pandemic, to avoid underestimating the number of deaths caused by COVID-19. While this makes sense if the authors are trying to estimate the total number of deaths caused by COVID-19, I'm not sure it needs to be accounted for if the authors want to estimate excess/added deaths. A counterfactual scenario would've included influenza. It also raises the question of whether (conceptually) they should be adjusting for other causes of deaths that may have also decreased during the pandemic. The authors briefly acknowledge this in the discussion ("we can't account for changes in baseline respiratory mortality due to depressed circulation of endemic pathogens other than influenza") but my comment goes beyond respiratory diseases. Analyses of excess mortality from other settings have suggested, for example, decreased deaths due to fewer traffic accidents (not in the US) or due to decreased air pollution, and not accounting for these would also lead to an underestimate of the total deaths caused by COVID-19. I understand that it is not feasible to account for all potential factors, so I wonder if they should focus on reporting excess deaths as compared to a counterfactual with influenza.

      Thanks. We think it is helpful to “single out” influenza as it causes major fluctuations in mortality from multiple causes in regular years and is a useful reference to contrast the pandemic impact. But the reviewer’s point is well taken. We have clarified our assumptions about the meaning of the baseline in this analysis (methods p 5), discussed the depressed circulation of other pathogens in depth, and mentioned air pollution (p 12-13). We have also slightly reworked our comparison between COVID19 and influenza so that excess mortality estimates are comparable and now cover periods of the same duration (Nov 2017-Mar 2018 for flu and Nov 2020-Mar 2021 for COVID19, see Figure S11).

      The second section, trying to estimate direct vs. indirect effects is also very interesting. However, more details are required about the regression model used and, importantly, what the assumptions and limitations of the approach are. Specifically:

      • Please provide a bit more information on the regression used for direct vs. indirect effects. I'd like to see explicit discussion of the assumptions and limitations of the approach but also of the stringency index used. Does this model include an intercept? Was the association between stringency index and excess deaths assumed to be linear? Or were different functional forms considered? It is also not clear how well the model fits the data.

      Thanks for these comments, which helped us improve this section. We have provided more details about the stringency index in the methods (it captures the “sum” of interventions), described the model in the methods and supplement, and discussed limitations in the caveats section, especially regarding the effectiveness of these interventions (p 13). We had tried different linear models with and without intercepts but elected to use models with intercepts so as not to overly constrain the relationship between interventions, COVID19 activity and excess mortality. These models also incorporate lags in the predictors that are determined by cross-correlation analysis (as detailed in the supplement). In the revised version, we now use generalized additive models (GAMs), in which the relationships between excess mortality and predictors do not have to be linear. We can do so because we were able to add several weeks of data (the regression is now based on 96 pandemic weeks, from March 1, 2020 to January 1, 2022). The models are described in detail in the supplement (p 4-5), and we now specify that they have intercepts. We have also provided additional plots of model fits in the main text and supplement (Figures 4 and S16-19).
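      The lag selection mentioned above can be illustrated with a small sketch. This is not the authors' code: the function names `pearson` and `best_lag`, the synthetic series, and the rule of picking the lag with the highest Pearson correlation are all assumptions made for illustration; the supplement should be consulted for the actual procedure.

```python
from math import sin, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_lag(predictor, outcome, max_lag):
    """Return the lag (in weeks) at which the predictor correlates best
    with the outcome observed `lag` weeks later (assumed selection rule)."""
    scores = {}
    for lag in range(max_lag + 1):
        x = predictor[: len(predictor) - lag]  # predictor at week t
        y = outcome[lag:]                      # outcome at week t + lag
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get)


# Synthetic example: the outcome is the predictor delayed by 3 weeks.
predictor = [sin(t / 5.0) for t in range(60)]
outcome = [0.0] * 3 + predictor[:-3]
lag = best_lag(predictor, outcome, max_lag=8)
print(lag)
```

      In this toy series the outcome is constructed as a three-week-delayed copy of the predictor, so the selected lag recovers that delay.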

      • Related to the above, please provide more details on how the results of the regressions were translated into the results presented. The main text reports percentages, but the methods only briefly explain how numbers of direct deaths were calculated, and the supplementary tables report coefficients. It is not clear if these estimates of direct and indirect deaths were somehow constrained to add up to the total number of excess deaths, but it doesn't seem like it since point estimates cross 100% in some cases.

      As discussed in response to one of the editor’s questions, estimates are not constrained to 100%. We have provided more details in the supplement on how we estimate the direct impact of the pandemic. Briefly, we calculate expected deaths in the GAM with all predictors set to their observed values, and again with the COVID19 predictor set to zero. The direct impact is the difference between the two predictions, divided by the prediction of the full model.
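      The calculation just described can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `direct_fraction` and the weekly values are hypothetical, and the sketch assumes that predictions are summed over weeks before the ratio is taken, which the response does not specify.

```python
def direct_fraction(pred_full, pred_no_covid):
    """Share of model-predicted excess deaths attributed to the COVID19 predictor.

    pred_full: weekly predictions with all predictors at their observed values.
    pred_no_covid: weekly predictions with the COVID19 predictor set to zero.
    """
    total_full = sum(pred_full)
    total_no_covid = sum(pred_no_covid)
    if total_full == 0:
        raise ValueError("full-model prediction sums to zero")
    return (total_full - total_no_covid) / total_full


# Hypothetical weekly predictions (deaths), for illustration only.
full = [100.0, 200.0, 300.0]
no_covid = [20.0, 10.0, -60.0]
share = direct_fraction(full, no_covid)
print(f"{share:.0%}")
```

      Because nothing forces the prediction without COVID19 to be non-negative, the ratio can exceed 100% whenever that counterfactual prediction falls below zero, as in the toy numbers here.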

      We note that while some of the estimates derived from the GAM exceed 100% (and are similar to the linear model estimates presented in the initial analysis, before revision), these estimates echo the findings from a more empirical analysis, in which we compare all-cause excess deaths with official COVID19 death tallies. There, in the two oldest age groups, we find more official COVID19 deaths than estimated by the excess mortality models. Hence both analyses point to an underestimation of the direct burden of COVID19 by the excess mortality approach, specific to the oldest age groups. We return to this point in depth in the discussion (p 12-13) and consider the possible effects of harvesting, depressed circulation of non-SARS pathogens, and inaccurate coding of official statistics (as pointed out by Reviewer #3).

      • Please discuss the potential limitations of using the stringency index to quantify NPIs.

      Several limitations have been added to caveats (p 13); major issues include aggregation of multiple interventions into a single index, which does not consider the actual implementation nor the effect of interventions. The index is solely based on mandates in place in different locations and time periods. We also assume that the effectiveness of these interventions, for a given level of stringency, does not change over time.

      • When estimating direct and indirect effects, the paper assumes that the estimated parameter is time-invariant? Indirect effects might have changed over the course of the epidemic by factors not necessarily captured by the stringency index used, particularly since the index doesn't take into account the implementation of the measures. Have the authors tested this assumption?

      This is an interesting point, which we have explored further. The non-linear relationships we find between NPIs and chronic condition excess mortality may suggest that the reviewer is right. We now discuss the role of NPIs in the results section in much more depth than we did previously (bottom of p 8).

      “At lower levels of interventions (Oxford index between 0 and 50), representing the early stages of the lockdown in March 2020, excess mortality rose with interventions. Later in the pandemic, increased interventions were estimated to have a beneficial effect on excess mortality, driven by comparison between the period when interventions were strengthened in response to increasing COVID19 activity in late 2020 (Oxford index above 60) to the period when interventions were relaxed in 2021 (Oxford index between 50 and 60).”

      We cannot run an analysis over different time windows because NPI level and time are highly conflated (for instance, the NPI index rises from 0 to 50% in the very early part of the lockdown period and then stays above 50% for the rest of the study, so we cannot compare the effect of a 25% level in 2020 and in 2021). We have added this limitation to the caveats section (p 13).

      • The authors state "In contrast, the indirect impact of the pandemic measured by the intervention term was highest in youngest age groups, decreased with age, and lost significance in individuals above 65 years" - I'm not entirely sure of where this statement comes from? For example Table S3 suggests that the indirect effect (multivariate or univariate) is higher in 25-64 yo than in <25s? The same table also suggests negative impacts (protective effects?) in >75s in the multivariate model. Please clarify.

      There are fewer deaths in the under 25 yo so this is why the coefficients were lower overall in table S3. Yet we find that the proportion of variance explained by interventions is higher in the under 25 yrs than in 25-44 yrs.

      We have now changed our modeling strategy to use GAMs, so Table S3 is no longer relevant, but the main conclusion remains valid: interventions explain a larger relative portion of excess mortality in the under 25 yrs than in the other age groups, and a larger portion than the other covariates. The NPI term is now significant in all age groups (although the relative contribution of NPIs still declines with age, as in the prior analysis), so we have rephrased this sentence: “In contrast, the relative contribution of indirect effects, via the intervention variable, was highest in youngest age groups and decreased with age”.

      • How do the authors interpret "Percents of excess deaths" over 100%? Similarly, I don't fully understand how to interpret "The upper bound of the 95% confidence interval for heart diseases was above 100% (158%), suggesting that for every excess death from heart disease estimated by our model, up to 1.58 death from heart disease could be directly linked to SARS-CoV-2 infection."

      We have rephrased this section, although the overall conclusions remain unchanged. The GAM estimate of the direct COVID19 impact is statistically significantly above 100% in those aged 85 and over, suggesting that our excess mortality approach is too conservative and does not estimate enough COVID19 excess deaths in this age group. We draw a similar conclusion from a more empirical analysis, in which we compare all-cause excess death estimates with official COVID19 death tallies. In this analysis, we find more official COVID19 deaths than estimated by the excess mortality models in the two oldest age groups (point estimates above 100% in the 75-84 and 85+ yrs). Hence both analyses point to an underestimation of the direct burden of COVID19 in the oldest age groups by excess mortality approaches.

      Rephrased results section, bottom of p 9: “We estimate that the direct contribution of COVID-19 to excess mortality increases with age, from negative and non-statistically significant in individuals under 25 yrs to over 100% in those over 85 years, echoing the gradient seen in official statistics (Table 4). It is also worth noting that our excess mortality estimates may be too conservative (too low) as we did not account for missed circulation of endemic pathogens. This could explain why our estimates of direct COVID-19 contribution exceed 100% in the oldest age group.”

      We return to this point in depth in the discussion and consider the possible effects of harvesting and depressed circulation of non SARS pathogens (p 12-13).

      • Table 3: The signs of the point estimate vs CI for vehicle accidents are inconsistent.

      Thanks, this was a typo. It should have been 4300 (-700, 9300) excess deaths from accidents. This has been updated with more recent data.

      Reviewer #3 (Public Review):

      Authors examine mortality data in the US and use time-series approaches to estimate excess mortality during the COVID-19 pandemic.

      Major comments:

      I would encourage authors to discuss the two different concepts of excess mortality:

      (#1) what deaths were caused, directly or indirectly, by the pandemic. This is what the authors have aimed to assess, and I have no major concerns with the methodology

      (#2) how many additional deaths occurred during the pandemic, compared to what would have been expected in the absence of a pandemic. For such an analysis I think expected annual influenza deaths should be added back to the baseline (or subtracted from the excess)? Some of the discussion seems to relate more to an impression of #2 rather than #1 but I would be interested in the authors' thoughts.

      We have added more details about the approach, in particular why we think that #1 is the proper analysis here (see methods p 5). Given the sheer magnitude of COVID19 excess deaths (over 1 million excess deaths at the end of our study), adding back influenza deaths (up to 52,000 deaths in a recent severe season with a mismatched vaccine, as in 2017-18) would not make a large difference. We have also provided a more direct comparison of the impact of influenza and COVID19.

      1. Authors estimate fewer excess COVID deaths in the elderly than there were confirmed deaths (Table 3). Could this be an indication of some confirmed deaths being "deaths with COVID" rather than "deaths from COVID"? I'm not sure how to interpret the %s in the final column when they exceed 100%. The authors suggested a harvesting effect but I would suggest "deaths with COVID" might be a more likely explanation? This issue can be a limitation of confirmed-death data.

      This is a good point. We have added a comment along these lines in discussion in the middle of p 12. Still, we think harvesting and/or the depressed circulation of endemic pathogens, which would have inflated our baseline, are more likely explanations for these findings. This is because we find similar estimates (exceeding 100%) in gam models that ignore official statistics and rely on COVID19 case data, or COVID19 hospital occupancy data, and this suggests that other mechanisms, beyond coding of official mortality statistics, are at play.

      Yet, as more detailed official statistics become available, a tabulation of confirmed deaths by presence of a primary vs secondary COVID (U07) code may be revealing and get more directly at the reviewer’s question.

    1. Author Response*

      Reviewer #1 (Public Review):

      ARL3 is a small GTPase that localizes to the primary cilium and plays a role in regulating the localization of some specific ciliary membrane proteins, including PDEδ and NPHP3. Mutations in this gene cause Joubert syndrome, a type of ciliopathy characterized by cerebellar malformation, and retinal degeneration. While the majority of the diseases occur in an autosomal recessive manner, two mutations in ARL3 (D67V and Y90C) have been reported to cause autosomal dominant retinal diseases. In the current paper, Travis et al. sought to understand the pathogenesis of the diseases caused by the two autosomal dominant mutations. They found that D67V acts as a constitutive active mutation, whereas Y90C is a fast-cycling mutant, which can be activated in a guanine nucleotide exchange factor (GEF) independent manner. Since the fast-cycle mutant did not bind to the effector proteins in vitro (likely because the guanine nucleotide falls off from the mutant ARL3, which has a lower affinity to GDP/GTP), they developed a method to snapshot the interaction between ARL3 and its effector. Using this method, they showed that the Y90C mutant indeed has increased interaction with the effectors, suggesting that Y90C is an overactive form of ARL3. They then addressed how photoreceptor cells are affected by these two mutations using a mouse model and found that the mutations disrupt the proper migration of the photoreceptor cells.

      Strengths:

      • The paper is well written, and it was easy to understand what the authors did from the figure legends and the methods section.

      • It was easy to find out what is known or unknown, as the paper has accurate references.

      • The authors developed a method to analyze a snapshot of the interaction between ARL3 and its interactors.

      • The paper has an in vivo model and connects the biochemical characteristics of ARL3 to in vivo cellular phenotypes.

      Weaknesses:

      (1) I understand that authors focused on nuclear migration defect as the phenotype was first described in ARL3-Q71L transgenic mice. The similar phenotype observed in RP2 knockout mice further supports the idea that the defect is caused by the hyperactivation of ARL3. Indeed, the defect is not reported in the ARL3 knockout mice, however, I feel that it does not necessarily mean that the defect is not caused by loss of function. Although it has not been assessed, ARL3 knockout mice might have the same defect. Therefore, I think analyzing both the migration defect and trafficking defect would be more informative, rather than focusing on the migration defect. The fact that the relationship between nuclear migration defect and the retinal degeneration phenotype is not entirely clear further enhances the importance of analyzing the trafficking defect.

      Does the expression of ARL3-Y90C also cause the trafficking defect? If it is the case, you can separate the nuclear migration phenotype from the one caused by the trafficking defect. Would the expression of lipidated cargo(s) rescue the trafficking defect as well?

      I think many questions can be addressed by analyzing the localization of the lipidated cargos, such as PDEδ and GRK1.

      The effect of Arl3-Y90C expression on trafficking of lipidated cargos is an interesting question. Previous papers showed mislocalization of lipidated outer segment proteins in Arl3-KO rods and down-regulation or subtle mislocalization in Arl3-Q71L overexpressing rods. So, this was one of the first things we investigated; however, we never observed mislocalization of ciliary or outer segment lipidated cargos (i.e. GRK1, transducin, Rab28, and PDE) in wild type mature rods that were overexpressing Arl3 mutants, and many were tested. It was through these experiments that we first identified the pronounced nuclear migration defect. Rod photoreceptor nuclear migration is a developmental process that is completed by P10, so Arl3-Y90C overexpression is causing a developmental defect. When rods are positioning their nuclei in the ONL, they are still “immature” as their primary cilium has not begun to elaborate disc membranes for light capture. All our analysis was performed in mature rods, so it is not surprising that we did not observe any lipidated trafficking defects at this timepoint. Since the developmental timing of the nuclear migration defect is important for our manuscript, we have added this to our introduction. Additionally, we use “immature” photoreceptors for the cartoon diagrams showing how Arl3 activity is altered by different mutation and rescue experiments, since formation of the mature outer segment occurs post-migration.

      (2) I am not quite sure if the nuclear migration was assessed properly. Based on the pictures in Fig.1, some of the FLAG-negative cells also seem to be migrating to INL (please see Fig.1C and Fig.1D). Is this biologically normal during development? Could this analysis be affected by the thickness of OPL, the layer between ONL and INL? Also, the picture is cut out in the middle of INL. Could authors include more layers, such as IPL, of the retina in the picture, so that we can evaluate INL and OPL better? Taking this into account, I think it is worth measuring the nuclear position of FLAG-negative cells as a negative control in all the experiments.

      Our electroporation technique results in a small population of rods that express our constructs of interest (~5-15% within a patch). All the experiments were performed in wild type retinas, which develop normal retinal layers, so analysis of the nuclear position of FLAG-positive cells within the retina is cell autonomous. Migration defects are assessed by differences in the skew of FLAG-labeled rods relative to the boundaries of the wild type ONL, which is marked by Hoechst nuclear stain (also a measure of the FLAG-negative rods). Wild type photoreceptor nuclei are not found within the INL; the nuclei in that layer belong to either horizontal cells or bipolar cells, neither of which is targeted by our electroporation approach. As a control, we show that when wild type Arl3-FLAG was expressed, FLAG-labeled rods were never observed within the INL. We have now included the % of displaced nuclei in Table 1.

      (3) The way that the authors showed the Y90C mutant of ARL3 is a fast-cycling mutant is not very compelling. In Figure 2C, the authors showed that ARL3 Y90C can bind to PDEδ, its effector, once it is pre-loaded with GTP. The authors also showed that the mutant can bind to its effector even without EDTA as long as an excess amount of GTP is added. The authors used endogenous ARL3 as a control to compare the effects between wild-type and mutants. I see that this experiment has multiple pitfalls. First, ideally, this type of experiment needs to be done with a purified protein using fluorescent guanine nucleotide/radioactive guanine nucleotide (e.g. nucleotide loading assay or nucleotide exchange assay) to directly assess the kinetics of nucleotide exchange. However, I do understand that this is out of the authors' expertise. In the authors' experimental setting, I am not sure loading the protein with GTP in the presence of the EDTA means anything more than confirming that the protein is intact. Theoretically, wild-type and a fast-cycling mutant can load GTP with similar efficiency in the presence of EDTA. Then during immuno-precipitation, GTP falls off from the Y90C mutant faster than wild-type (because a fast-cycling mutant theoretically has a lower affinity to guanine nucleotides), assuming that GTP was not added during immuno-precipitation (GTP addition was not mentioned in the method, but could authors confirm this?). But in this case, the kinetics of GTP dissociation can be affected by many factors, including the presence of GAP in the reaction, the dissociation constant of Y90C, the volume of the buffer used, and the number of washing steps. Thus, it is not very easy to estimate the difference between wild-type and Y90C. Besides, using endogenous ARL3 rather than ARL3-wild type FLAG as a control can be dangerous. I have experienced that a tagged protein is cleaved to a protein that has a similar size to endogenous protein. (I expressed GFP-protein X in knockout cells lacking protein X, and saw the band at the position where the endogenous protein is observed in wild-type cells). So, the endogenous band that the authors showed could come from the cleaved FLAG-Arl3. (Authors can easily confirm this by having wild-type not expressing FLAG-tagged ARL3, though).

      An alternative experiment that I would suggest is doing immuno-precipitation in the buffer containing: 1) no guanine nucleotide, 2) 10mM GDP, or 3) 10mM GTP in the cells expressing the following protein: 1) ARL3 wild-type FLAG, 2) ARL3 Y90C FLAG, or 3) ARL3 D129N FLAG. 10mM guanine nucleotide should be added throughout the process including washing. This experiment might also be affected by many factors, but variability should be lower than the experiment presented in Fig 2C. ARL3-wild type FLAG is also a better control here than endogenous protein.

      Variability due to the factors you mention is a concern, but we were able to repeatedly obtain the same results using our method—admittedly our method is testing whether the mutated Arl3 can exchange under a certain condition more than exactly how. We know that we are not providing precise kinetics or elucidating the underlying mechanism for how these mutations lead to what we are calling fast cycling. While that information is important, it is outside the scope of this paper.

      As you mention, an important conclusion from the PDEδ binding experiments is that we confirm the Arl3-Y90C protein is intact by showing it can indeed bind nucleotide as long as there is an excess of GTP (Fig 2B). The interesting finding from these experiments is that Arl3-Y90C binds GTP even in the presence of magnesium, a behavior not observed for wild type Arl3. We feel that showing that endogenous Arl3 is not activated in the presence of magnesium in each of our preparations is a lovely internal control. However, we agree that showing wild type Arl3-FLAG in these assays is an important negative control and have now included this blot as Fig 2-Sup Fig 1.

      (4) In Fig.3, the authors attempted to take a snapshot of the interaction between ARL3 and multiple effector proteins. The three bands that were enriched in the Q71L cells were found as RP2, UNC119, and BART by mass spec (Fig.3B). These bands were used as a readout for the subsequent experiments. I am not quite sure why the authors used this approach rather than using the cell line that expresses both FLAG-ARL3 and GFP tagged protein of interest, just like what the authors did in Fig3G. The reasons why I prefer the latter approach are the following: 1) FLAG bands that correspond to the three proteins (RP2, UNC119, and BART) in wild-type cells are very close to the detection limit, 2) authors failed to confirm that the lowest band actually comes from BART, 3) authors cannot assess some important effector proteins, such as PDEδ because 293 cells might not express them. All of the problems can be solved by using the approach that was taken in Figure 3G.

      If the authors chose the former approach because of some specific reason, I would appreciate it if the authors could explain that in the main text of the paper.

      In vitro crosslinking experiments were performed to test whether overexpression of Arl3 mutants resulted in an active cellular Arl3 without artificially changing any components of the GTPase cycle. We feel these experiments are highly elegant as they allow us to take a snapshot of native Arl3 activities without compromising the analysis by artificially altering GAP/GEF/effector interactions through overexpression or during lysis (as we show that the concentration of GTP/Mg could alter interactions in Fig 2). While AD293T cells are not rod photoreceptors, we are able to use this system to better understand how the Arl3 mutants alter the level of activity within the cell. Yes, this experimental assay is novel, but we confirmed the identity of the effectors by Western and mass spec, used positive and negative controls in each experiment, and show that the method is highly reproducible. We agree with Reviewers 2 and 3 that using this method to study the cellular activity of fast cycling Arl3 mutants is a strength of our paper.

      (5) ARL3 Y90C causes nuclear migration defect. Given that Y90C is a fast-cycling mutant (hyperactive) and has a high affinity to ARL13B, the nuclear migration defect might come from either the increased activity of ARL3 or sequestration of ARL13B, which can act as a GEF for ARL3 but potentially have other functions. If my understanding is correct, the authors concluded that the defect caused by ARL3-Y90C is likely due to hyper-activation of the protein, as Y90C/T31N mutant, which cannot bind to effectors but still retains the ability to capture ARL13B, did not cause migration defect. But I am a little confused by the fact that Y90C/R149H, which is unable to bind to ARL13B (Fig.2C) but still retains the ability to interact with the effectors (Fig.3F), did not have migration defect (Fig.7B). Wouldn't this mean that the sequestration of ARL13B could contribute to the phenotype?

      If my understanding is correct, the authors are trying to say that both hyper-activation of cytosolic ARL3 and the defect in endogenous ARL3 activation in cilium is necessary to cause migration defect. I am not very convinced by this hypothesis, and still think that the defect could be caused by sequestration of ARL13B to the cytoplasm.

      Then why did Y90C/T31N not cause the defect even though it can sequester ARL13B? This might be explained by the localization of the ARL3 mutants. If Y90C can localize to the cilium while the double mutant, Y90C/T31N, does not, then only Y90C might be able to inhibit the ARL13B function in the cilium. This could explain the lack of the defect in the cells expressing Y90C/T31N.

      It would be helpful to understand how exactly the fast-cycling mutant causes the defect if the authors can provide more information, including localization of ARL3 (wild-type and mutants) as well as key proteins, such as ARL13B and the effector proteins. Assessing ARL13B defect seems to be particularly important to me because ARL13B deficiency has been connected to neuronal migration defect (Higginbotham et al., 2012)

      What I am trying to say here is that how the defect is caused is likely very complex. So, providing more information without sticking to one specific hypothesis might be important for readers/authors to accurately interpret the data.

      Our data show that, for the fast cycling Arl3-Y90C mutation, two features are both required to produce a nuclear migration defect: blocking endogenous Arl3 activation in the cilium (through Arl13B binding) and increasing activity of Arl3-Y90C in the cell body. We find that we can rescue migration defects by either restoring activation in the cilium or reducing GTP activity outside the cilium. As long as there is more Arl3-GTP activity in the cilium, the rod can handle aberrant Arl3-GTP activity in the cell body. The Y90C/R149H result was critical and led to our hypothesis that there is a gradient between the two compartments that is used for proper migration. One interesting point is that the absence of any activity does not produce the migration phenotype, further suggesting that an imbalance in the gradient is important.

      We performed new experiments to investigate whether Arl3-Y90C is sequestering Arl13B away from the cilium but found that localization of Arl13B (both endogenous and overexpressed) is not altered by expression of Arl3-Y90C – see Fig 3-SupFig 1-2.

      It is an interesting question as to how different Arl3-FLAG constructs are localized within the photoreceptor. Sadly, we did not analyze the data in a way that would allow us to draw any conclusion about the localization of different Arl3-FLAG constructs. In general, we observed FLAG localization throughout the photoreceptor cell and focused our imaging on the FLAG staining around the nucleus so we could further analyze ONL position. Looking back through our images, some of the mutants might have a more prominent localization within a specific subcellular compartment (e.g. Arl3-D67V is more prominent in the inner segment than the outer segment, and Arl3-Y90C appears to have dominant outer segment localization). Likely, these differences represent each mutant binding a particular effector: D67V to RP2 and Y90C to Arl13B, which we show biochemically. Ideally, Arl3 mutant localization would be analyzed during development to provide a more direct link to the nuclear migration defect, a future direction for our lab. We have updated our manuscript to be more transparent about the potential differences in rod localization of Arl3 mutants.

      (6) The rescue experiments that the authors presented in Fig.5-6 are striking and would build a base for future therapy of the diseases caused by ARL3 defects. However, I believe more examinations are needed to accurately interpret the data. The authors did this rescue experiment by co-injecting ARL3-FLAG and chaperones/cargos if I understand the method section correctly. But I feel we can interpret this data correctly only when ARL3-FLAG and chaperones/cargos are co-expressed in the same cells. I think a better way to analyze the data might be by comparing the nuclear migration phenotype between ARL3-FLAG only and ARL3-FLAG;chaperones/cargos double-positive cells.

      Our lab has found that the initial estimate by the Cepko Lab, that co-injection of two plasmids results in more than 90% of rods expressing both proteins, is accurate (see reference Matsuda and Cepko PNAS 2004). Since we only assess nuclear position of FLAG-labeled rods, it is true that a small percentage of cells in this analysis express the Arl3-FLAG mutant and not the chaperone/cargo; however, inclusion of these cells only bolsters our findings, as complete rescue would likely be even more robust than measured.

      Reviewer #2 (Public Review):

      The small GTPase Arl3 (Arf-like 3) is a well-characterized component of primary cilia, including the outer segment of photoreceptors, which contain specialized cilia. Arl3 is critical for the import of multiple lipid-modified proteins into cilia that are vital to ciliary function. Human mutations in Arl3 are reported to cause both autosomal recessive and dominant inherited retinal dystrophies, but the mechanisms through which these mutations disrupt photoreceptor development are not known. Here the authors show that two dominant Arl3 mutants, Arl3-D67V and Arl3-Y90C exhibit increased activity, but for different reasons. Arl3-D67V is constitutively active (unable to hydrolyze GTP), whereas Arl3-Y90C is a classic rapid-cycling mutant, able to bind GTP spontaneously (independent of its guanine nucleotide exchange factor Arl13) but still able to complete the GTPase cycle by hydrolyzing GTP. Expression of either mutant in developing murine retinas results in a nuclear migration defect, specifically aberrant localization of rod nuclei to the inner rather than outer nuclear layer. In this sense, they phenocopy another well-characterized constitutively active mutant, Arl3-Q71L. Normal nuclear distribution could be restored by overexpression of Arl3 effectors, suggesting that the active mutants disrupt nuclear migration, at least in part, by sequestering Arl3 effectors.

      While the data are reasonably clear and convincing, there are several instances where the conclusions drawn are either confusing or problematic. Specifically:

      1) Although retinal rod cells are ciliated in their outer segment, the authors never actually examine ciliation here. Their only morphological readout is nuclear migration. How does nuclear migration failure impact ciliogenesis in the outer segment?

      Imaging was performed in mature retinas at P21, after outer segment formation is completed. Electroporation only targets a small population of cells, for which we observed normal outer segment structures in all conditions tested; therefore we conclude that ciliogenesis is unaffected. Previous literature has also shown that defects in rod nuclear migration do not affect ciliation of the outer segment.

      2) The Arl3-Y90C mutant seems to act physiologically more like a dominant-negative than an activated mutant. A second mutation in Y90C (R149H) that blocks binding to the GEF Arl13 abrogates the nuclear migration defect, suggesting that Y90C is preventing activation of endogenous Arl3 by sequestering the GEF. Yet overexpression of effectors or cargos still rescues nuclear migration in the presence of Y90C, suggesting that it also sequesters effectors. How do the authors explain this?

      We agree with this interpretation. We have now included language about Arl3-Y90C’s role as a dominant negative in that it blocks Arl13B activity. The interesting caveat to this black and white usage is that blocking Arl13B would suggest a reduction in endogenous Arl3 activity in rods (which we find to be true, see Fig 5A). However, the migration defect phenotype mimics overly active Arl3 (Arl3-Q71L) and not a loss of function in Arl3 (Arl3-T31N). Using in vivo crosslinking experiments, we show that the fast cycling nature of Arl3-Y90C also causes GEF-independent activation of Arl3 (Fig 4D-E) that leads to the migration defect. Our rescue data show that only the combination of both effects (reduced Arl3 activity in the cilium and GEF-independent Arl3 activation outside the cilium) is enough to disrupt the ciliary gradient and produce the migration defect.

      3) Fig. 1 suggests that an Arl3-T31N mutant has no phenotype. This is a canonical mutation in small GTPases that typically renders them dominant negative. The lack of phenotype is surprising since most dominant-negative mutants act by sequestering their GEFs, thereby preventing activation of the endogenous GTPase. Fig. 2C suggests that this may not be the case for Arl3-T31N, which binds Arl13 only weakly. Some of this confusion may arise from the fact that Arl13 is not a typical GEF. It is very unusual for one GTPase to directly promote nucleotide exchange on another. Does Arl3-T31N affect ciliation in the rod outer segment, or in other ciliated cells? Some discussion of this point is warranted here.

      Our paper finds that Arl3 mutants must produce an aberrant activity outside the cilium, whether through constitutive activity (seen for D67V and Q71L) or fast cycling (seen for Y90C and D129N), to cause the migration defect. Since T31N does not cause excess Arl3 activity in cells (see Fig 4), even if it does have some dominant negative activity toward Arl13B, this is still not enough to cause the migration phenotype. This was directly tested in Fig 5, where we increase T31N binding to Arl13B by introducing Y90C/T31N and still do not see a migration defect. Our results are also in line with a previous study showing that, despite rapid photoreceptor degeneration in a retina-specific conditional Arl3 knockout mouse, the outer segments were initially formed; in contrast, the retina-specific conditional Arl13B knockout mouse did disrupt photoreceptor ciliogenesis, leading to a more rapid degeneration (Hanke-Gogokhia, JBC 2017). Since complete loss of Arl3 activity did not disrupt ciliogenesis, it is unlikely that expression of Arl3-T31N in wild type retinas would alter outer segment formation, and we observed that outer segments formed in all Arl3 mutants.

      4) Oddly, Arl3-Y90C does robustly bind Arl13 (Fig. 2C), while at the same time binding to effectors (Fig. 3D/E), although less strongly than the canonical Q71L constitutively active mutant (Fig. 2A). As noted in point #2, the Y90C/R149H double mutant, which fails to bind Arl13, abrogates the nuclear migration defect observed with Y90C alone. Although the authors refer to Y90C as "rapid cycling" its phenotype is more similar to a dominant-negative than an activated mutant.

      We agree with this interpretation. We have now included language about Arl3-Y90C’s role as a dominant negative in that it blocks Arl13B activity. However, the rapid cycling behavior is important to cause the phenotype.

      5) The authors also mention that loss of Arl3 has no phenotype in their assay, however, Arl3 knockout mice exhibit severe retinal degeneration. How do they explain this?

      Our study finds that not all human Arl3 mutations will target the same cellular process even though they all result in degeneration. Arl3 knockout mice show drastic alterations in lipidated protein trafficking to the rod outer segment in mature retinas, a phenotype that we did not observe by expressing the dominant Arl3 mutants in wild type rods. Since our tools are not designed to study degeneration of rods, the precise mechanisms of degeneration caused by loss of function or dominant mutations remain to be determined. We outline some ideas in the discussion, but more work needs to be done before making any big statements regarding this. We hope that our manuscript will inspire clinicians to take a closer look at human patients to determine if there are subtle differences in disease presentation between dominant and recessive forms of inherited Arl3 mutations. This is beyond the scope of our expertise.

      Reviewer #3 (Public Review):

      This work provides mechanistic insights into two recently described dominant variants of Arl3, a small GTPase, namely mutations D67V and Y90C. The authors identified a phenotype of these dominant variants during the development of rod photoreceptors by in vivo experiments in mice. They specifically observed a defect in rod nuclear migration to their final outer nuclear layer. This phenotype has been previously observed in another constitutively active variant of Arl3, Q71L. The authors performed a series of extensive and thorough biochemical assays to clarify the mode of action of these variants, mostly the Y90C variant, comparing the behavior of these variants to previously described mutants and combining multiple variants by mutagenesis. They also developed a new in vivo crosslinking strategy to be able to identify transient states of protein-protein interactions. They finally performed phenotypic rescue experiments by co-expression of various relevant proteins interacting/involved with Arl3. They finally propose a model based on differential subcellular compartmentalization of Arl3 activation which when disrupted leads to rod nuclei misplacement. These data add to the current understanding of contribution of different Arl3 variants causing human retinal degeneration, which has strong potential translational implications.

      Strengths:

      Relevance of Arl3 dominant variants to human retinal degeneration. Identification of Y90C variant as a "fast cycling" GTPase, and not as a predicted destabilizer of the protein structure.

      New method of crosslinking to enable snapshots of endogenous protein-protein interactions.

      Weaknesses:

      • The relevance of this study is justified by the fact that newly identified dominant variants of Arl3 have been associated to retinal degeneration. However, the authors never assess a degeneration phenotype.

      The electroporation technique allows for rapid expression of constructs, but the sparse expression makes it a poor means to study retinal degeneration. This is important to examine in the future using robust genetic mouse models.

      • The authors show new dominant variants of Arl3, namely Y90C and D67V, cause rod nuclear mislocalization. This phenotype is interesting but this was previously observed with other constitutively active mutation of Arl3, Q71L, and therefore is not novel.

      Yes, the Q71L paper is well cited in our manuscript and set the basis for many of our experiments.

      • The main claim of this paper is that subcellular compartmentalization of Arl3 activation to the cilium (the so-called gradient by the authors) is required for proper rod nuclear migration to their final outer nuclear layer destination. The authors provide multiple experiments to support this model, but this is never directly demonstrated.

      We are not aware of any methods that could directly show the subcellular localization of active Arl3-GTP within rod photoreceptors. We agree that we have provided many experiments that support our hypothesis that altering the Arl3-GTP gradient between cilium and cell body produces a nuclear migration defect. Some of our favorites include Fig 6, where we find that the migration phenotype is rescued only by expression of ciliary cargos and not by non-ciliary cargos. Also, the new data requested by reviewers, showing that Arl13B expression in the cilium can rescue the Y90C defect, further support that the Arl3 ciliary gradient is necessary for proper nuclear migration.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01707

      Corresponding author(s): Sarah Butcher, Richard Lundmark

      1. General Statements

      We thank the reviewers for their insightful comments. The inclusion of the points raised by the referees has strengthened the manuscript. However, some of the reviewer suggestions are beyond the scope of the work (see below), but will doubtlessly be touched upon in future studies by the authors. In addition to incorporating changes relevant to answering the reviewers’ comments, we have edited the manuscript for increased clarity and precision.

      2. Description of the planned revisions

      1. Liposome flotation assay. Reviewer #1 suggested that we should perform a liposome flotation assay to separate possible C protein aggregation from membrane binding: “I would strongly recommend supplementing the current liposome sedimentation assay by liposome flotation assay. In contrast to liposome co-sedimentation, the flotation assay can discriminate protein aggregates from proteins bound to liposomes. Although the SDS PAGE shown in Fig. 1A looks pretty convincing, a faint protein band in the ‘P’ lane of the middle panel for the (-) sample is evident. Therefore, C protein aggregation cannot be ruled out and it would be indistinguishable from liposome binding examined by mere co-sedimentation assay.”

      Response: We agree that this is a necessary control experiment to add, and we will perform it with liposomes containing 40 % POPS. As we detected complete C protein co-sedimentation with this lipid composition, performing the flotation experiment with the same composition will prove that the earlier result indicates lipid binding and not protein aggregation.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1
      1. “In addition, it needs to be clarified which TBEV C protein construct, whether full-length or truncated, was used for co-sedimentation fragmentation.”

      Response: We have clarified in this section of the manuscript that the full-length C protein construct was used for the liposome co-sedimentation assays by adding “full-length” prior to instances of “C protein” e.g. in the paragraph starting line 118.

      1. How to understand the finding that „the C protein forms a very rigid layer when adsorbed to the membrane". Can the aggregation of C-protein be ruled-out? Following the 1M NaCl wash of C-protein-bound to SLB, the authors stated: „This shows that initial membrane recruitment of C protein is strongly dependent on its interactions with the negatively-charged lipid headgroups. However, once bound, the C protein-membrane interaction is complemented with non-electrostatic interactions such as membrane insertion or protein oligomerization": does it mean that there are several layers of C protein, the first held by electrostatic interactions, overlayed by non-electrostatically bound C protein? If yes, the illustration of single-layered C-protein adsorbed onto SLB in Fig. 2A, B is not correct.

      Response: We understand the confusion regarding the term “rigid”, which was used to describe how we interpret the relatively minor change in dissipation upon membrane binding. What we intended to convey was that the protein is attached in a stable way that does not add viscoelastic properties to the system. These data indicate that the protein does not form large aggregates that non-specifically attach to the membrane in different protrusive orientations. We have clarified this in the manuscript and specified that, as there is no dissipation change, there is no aggregation. We added the following to line 168: “This, in turn, indicates that the C protein does not bind as non-specific aggregates as these would have changed the viscoelastic properties of the system.”

      We do not mean that there are several layers of C protein. We consider, due to the highly charged nature of C, that the most likely explanation is that there are multiple modes of C binding but the result is only one layer, with multiple C-proteins interacting with each other within that layer. We have modified the text at line 184 to: “However, once bound, the C protein-membrane interaction is complemented with non-electrostatic interactions such as membrane insertion or protein oligomerization within the bound layer.”

      1. “The sentence: ‘To confirm that the C protein is biologically active, we investigated its ability to bind RNA’ seems to be a little odd because it suggests the model membrane binding assays do not require biologically active proteins. However, considering that the interactions leading to binding either negatively-charged lipid or negatively-charged RNA are electrostatic, this sentence must be rewritten.”

      Response: We thank the reviewer and have now rephrased this sentence at line 249 to the following: “Since RNA binding is crucial for the NC assembly, we investigated the C protein’s ability to perform this function.”

      2. “The authors' statement in the Abstract: ‘....we investigate nucleocapsid assembly...’ is too speculative because the assembly was not studied in their work. It needs to be reformulated.”

      Response: We agree, and the statement has been removed from the abstract.

      3. “Despite this clear and valuable methodological contribution, the authors' contribution to our knowledge of the coordination of the nucleocapsid components to the sites of assembly and budding is not so obvious. Contrary to the earlier idea that the flavivirus is asymmetrically charged (that is, hydrophobic on one side (α2) and positively charged on the other side (α4)), recent studies show that the entire surface of the protein is highly electropositive (Mebus-Antunes et al., 2022). Therefore, a well-ordered neutralization of the flaviviral C proteins' highly positive surface seems critical for the proper organization and assembly of nucleocapsid. I am afraid that the authors do not shed much light on this issue.”

      Response: The recent structure of the TBEV C protein, published after we submitted the manuscript, shows that indeed the C protein is highly positively charged on all surfaces (updated Supplementary Figure 1 and Selinger et al., 2022). The recruitment of C protein to the membrane, which we demonstrate is dependent on negatively-charged head groups, provides a biologically relevant mechanism for charge neutralization on the C protein surface that interacts with the lipids. The remaining surface charge can then be neutralized by RNA recruitment. Mebus-Antunes et al. made their observations with just RNA and C protein from Dengue virus in the context of artificial surfaces, e.g. mica. However, our experiments utilize the TBEV C protein and specifically include a membrane, the third critical component of NC assembly. Thus, we build upon the work of Mebus-Antunes et al. by adding a second biologically relevant charge-neutralising component and comparing with a distantly-related virus. We have changed the discussion section of the manuscript to reflect this new structure and to emphasize the advance here. Starting from line 371 we changed the text to: “Recently, it has been shown that the neutralization of the C protein surface positive charge is important for RNA binding in the distantly-related Dengue virus (DENV) (Mebus-Antunes et al, 2022). The recruitment of C protein to the membrane, that we demonstrate is dependent on negatively-charged head groups, provides a biologically relevant mechanism for charge neutralization on the C protein surface that interacts with the lipids. The remaining surface charge can be then neutralized by RNA recruitment.”

      Reviewer #2 1. “What results demonstrate C protein inserts into membrane? The current results support the C protein interacts with membranes with positive charge, but do not seem to demonstrate membrane insertion. If the C protein inserts into the membrane, which regions (helices) play this role?”

      Response: The Langmuir-Blodgett trough tensiometry experiments with monolayers directly measure the insertion of a protein into the monolayer. By determining the maximum insertion pressure of the C protein constructs, we also show that the membrane insertion can occur in bilayers. We show that the N-terminus does not insert into the membrane; further work, beyond the scope of this manuscript, is needed to pinpoint the residues responsible for insertion, for instance by hydrogen-deuterium exchange or FRET measurements that would not affect folding. To clarify the use of the LB trough, we added the following at line 216: “To investigate if the C protein membrane binding includes insertion into the membrane after the initial electrostatic binding, we used Langmuir-Blodgett trough monolayer experiments. In this approach, the insertion of a protein into a lipid monolayer can be detected by following the pressure (π) of the monolayer after protein injection into the aqueous subphase, with increases in π corresponding to protein injection (Brockman, 1999; Liu et al, 2022).”

      1. “The authors should discuss several previous papers reporting the effect of partial deletions of the C gene on the replication of TBEV, West Nile virus, and other flaviviruses.”

      Response: We agree that this is a necessary addition, and have now added a paragraph in the discussion section starting at line 333: “N-terminally truncated flaviviral C proteins have been shown to be assembly competent and in vitro, able to bind RNA, which is consistent with our results with N-terminally truncated TBEV C protein (Khromykh & Westaway, 1996; Kofler et al, 2002; Patkar et al, 2007; Schlick et al, 2009). One role of C is in the modulation of host responses to infection and the N-terminus may be involved in that (Yang et al, 2002; Limjindaporn et al, 2007; Colpitts et al, 2011; Bhuvanakantham & Ng, 2013; Katoh et al, 2013; Urbanowski & Hobman, 2013; Samuel et al, 2016; Slomnicki et al, 2017; Fontaine et al, 2018). The membrane insertion directly detected in our experiments is central to C protein function. Other studies have found that deletions in the hydrophobic region of the α2 helix significantly impair particle assembly (Kofler et al, 2002; Patkar et al, 2007; Schlick et al, 2009). In the light of this evidence, we consider that the α2 helix could be responsible for membrane insertion (Markoff et al, 1997; Kofler et al, 2002; Nemésio et al, 2011, 2013).”

      Reviewer #3 1. “In Figure 4, the band (256:1) that are supposedly in the wells (red arrow) is not clear- it is only slightly darker than the other wells.”

      Response: This confusion was the result of unclear wording. We have now revised the figure legend at line 278 to: “The black arrow indicates the bands of freely-migrating RNA, and the red arrow the wells. On lanes 624:1 and 256:1, RNA has been immobilized in the wells.”

      1. “Figure S1A, the N-terminal end (which is truncated in the mutant) should be colored on the cyan molecule.”

      Response: We have coloured the truncated part of the cyan molecule in the figure (now S1B) according to the reviewer’s comment.

      Other 1. As the nuclear magnetic resonance structure of the truncated TBEV C protein has recently been released (Selinger et al, 2022), we have updated the manuscript and Figure S1 to include the information from this structure. We have also generated a new homology model of the full-length TBEV C protein using this structure as a template and included that in Figure S1.

      4. Description of analyses that authors prefer not to carry out

      Reviewer #1
      1. “However, we do not know whether in the infected cells, the C protein is pre-bound to ER membrane or to viral RNA. Having such a unique assay in their hands, I wonder whether the authors could use the pre-bound C protein with genomic RNA (i.e. the experiment shown in Fig. 4A) ribonucleoprotein complex in the SLB binding assay. If doable, this experiment would be exciting and could bring some important information about NC assembly.”

      Response: We agree that it would be very interesting to decipher whether the C-protein first binds to RNA or to membranes using the QCM-D methodology. Yet, our data on pre-incubated C-protein and RNA suggest that large aggregates are formed, which would hamper the interpretation of the QCM-D data. Furthermore, based on the suggested experiment, we would not be able to firmly conclude whether the C-protein first binds to RNA or to membranes, since the time of the experiment would allow rearrangement of preformed complexes between C-protein and RNA. Additionally, the QCM-D measurement cannot differentiate whether the preformed complexes bind on their own, or whether excess unbound C protein binds the membrane and then recruits the complex. Therefore, addressing this question would require major adjustments to the RNA model system and methodology that we feel are beyond the scope of this study.

      Reviewer #2 1. “The authors should use the lipids detected in the virions to confirm C protein binding experiments.”

      Response: In the mass spectrometry characterization of the TBEV virions, we detected lipids from 9 classes (Car, PE, PS, PI, PG, PC, Cer, HexCer & TG). We have tested four of them (PE, PS, PI, PC) in the liposome sedimentation assay. Additionally, we tested GalCer, which, like HexCer, is a cerebroside. Our liposome binding experiments clearly demonstrate that the C protein does not bind to a specific lipid class, but instead to lipids with negatively-charged headgroups. Therefore, we would argue that doing additional sedimentation experiments with Car, PG, Cer, and TG would not add extra insight to the manuscript.

      Additionally, while the population of lipid species in the TBEV envelope is diverse, the diversity mostly comes from differences in the lipid tails, which do not generally affect the head group-mediated binding of proteins. Therefore, performing additional lipid binding experiments with varying tail lengths would not likely lead to new observations.

      Finally, to perform the authentic experiment of testing C protein binding to liposomes formed from lipids extracted from purified virions would require orders of magnitude more virus sample than our research laboratory is capable of producing. Therefore, we argue that this experiment is beyond the scope of this study.

      1. “The study may be strengthened by performing virus mutagenesis experiments.”

      Response: While we agree that, ultimately, experiments on virus and cells would help to understand the role of the C protein in the biological context, we think these experiments are beyond the scope of this study. For virus mutagenesis, candidate residues should first be identified with biochemical and biophysical studies, which is already beyond the scope of this work. Additionally, the C protein has multiple functions in the host cell in addition to NC assembly, and interpreting the effect of the mutations on, e.g., virus titer is difficult.

      Reviewer #3 1. “In all figure legends, authors should write a conclusion line after the description of the experiments - what conclusion is drawn from each experiment.”

      Response: While we agree that adding such a conclusion line would make it easier for the reader to understand each figure, the format of the figure legends is highly subject to journal policy. Therefore, we think that the addition of such lines will be an editorial decision and will depend on the journal. We have, however, strived to make the figure titles as informative as possible in lieu of such concluding lines.

  2. Nov 2022
    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, the author characterizes the lattice of kinesin-decorated microtubule reconstituted from porcine tubulins in vitro and Xenopus egg extract using cryo-electron tomography and subtomogram averaging. Using the SSTA, they looked at the transition in the lattice of individual microtubules. The authors found that the lattice is not always uniform but contains transitions of different types of lattices. The finding is quite interesting and probably will lead to more investigation of the microtubule lattice inside the cells later on for this kind of lattice transition.

      The manuscript is easy to read and well-organized. The supporting data is very well prepared.

      Overall, it seems the conclusion of the author is justified. However, the manuscript appears to show a lack of data. Only 4 tomograms are done for the porcine microtubules. Increasing the data number would make the manuscript statistically convincing.

      One tomogram can contain one to several tens of microtubules. For example, 64 microtubules were analyzed in the Xenopus-DMSO dataset obtained on 5 tomograms, versus 24 microtubules for the GTP-dataset obtained on 4 tomograms (see Table 1). Hence, the number of tomograms cannot be considered a valid criterion for assessing the statistical relevance of our work. Tomograms are taken randomly on the EM-grid sample, solely based on ice quality and the covering of microtubules in the holes as determined at low magnification before tomographic acquisition. No prior knowledge of the structure and lattice-type organization of the microtubules can be obtained before acquisition. It appears to us that a more pertinent criterion is the number of events that we characterized, specifically lattice-type transitions along individual microtubules. In the dataset mentioned by the referee (see Figure 2-figure supplement 3-4 and Table 1), 24 microtubules were analyzed and further divided into 195 segments, providing an equivalent number of individual 3D reconstructions. For each 3D reconstruction, almost all lateral interactions could be characterized in terms of lattice-type, i.e., 2091 of the B-type, 460 of the A-type, and 112 not determined (essentially at transition regions). Most importantly, we document in this specific dataset 119 transitions in lattice-type, which we think is sufficient to characterize such molecular events and provide solid statistics for this dataset. Adding the GMPCPP and Xenopus data, we end up with 938 individual 3D reconstructions (not including the full-length microtubule volumes), 12 463 lateral interactions analyzed (A-, B-, or ND-type), and the observation of 172 lattice-type transitions. Therefore, we respectfully disagree with the referee stating that our work lacks data.

      To highlight the quantity of data used in our work, we have modified the following sentences:

      L124-131: 'Analysis of 24 microtubules taken on 4 tomograms, representing 195 segments of ~160 nm length (i.e., 2664 lateral interactions), allowed us to characterize 119 lattice type transitions with an average frequency of 3.69 µm⁻¹ (Table 1), but with a high heterogeneity'

      L160-164: 'Analysis of 31 GMPCPP-microtubules taken on 6 tomograms, representing 338 segments of ~150 nm in length (i.e., 3236 lateral interactions), and using the same strategy as in the presence of GTP (Figure 5—figure supplement 1-2) revealed a transition frequency of 1.25 µm⁻¹ (Table 1), i.e., ~3 fold lower than microtubules assembled in the presence of GTP.'

      L200-203: 'A total of 64 microtubules taken on 5 tomograms were analyzed in the Xenopus-DMSO dataset (i.e., 419 segments from which we characterized 5446 lateral interactions), and 15 microtubules taken on one tomogram for the Xenopus Ran-dataset (i.e., 86 segments from which we characterized 1118 lateral interactions), (Table 1).'
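
      For readers who want to cross-check the counts quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis used in the manuscript: the function names are hypothetical, and the frequency estimate assumes every segment has exactly the nominal length (~160 nm), so it will only approximate the value reported in Table 1, which is presumably based on measured lengths.

```python
# Lateral-interaction counts quoted for the GTP dataset in the response above.
gtp_lateral = {"B": 2091, "A": 460, "ND": 112}


def total_interactions(counts):
    """Sum the lateral interactions over all lattice-type categories."""
    return sum(counts.values())


def transition_frequency(n_transitions, n_segments, segment_nm):
    """Transitions per micrometre, ASSUMING every segment has the nominal length.

    segment_nm is the approximate segment length in nanometres (an assumption;
    the real segments vary in length).
    """
    total_um = n_segments * segment_nm / 1000.0
    return n_transitions / total_um


if __name__ == "__main__":
    # Total lateral interactions characterized in the GTP dataset.
    print(total_interactions(gtp_lateral))
    # Rough frequency from 119 transitions over 195 segments of ~160 nm.
    print(round(transition_frequency(119, 195, 160), 2))
```

      Because the segment length is nominal, the second figure is a back-of-the-envelope estimate and should be expected to differ somewhat from the frequency given in Table 1.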

      In addition, having the same transition with the missing wedge orientation randomly from different subtomograms will allow a better average of transition without the missing wedge artifact.

      In this work, we did not aim at averaging transitions. Transitions in lattice-types are highly heterogeneous in nature, and we wonder what additional information an averaging strategy would have provided. Conversely, each transition is a unique event that we characterized to obtain useful statistics, and the missing data at high angle inherent to electron tomography were not an obstacle to fulfill this task.

      Another thing that I found lacking is the mapping of the transition region/alignment in the raw data.

      In Figure 4, we clearly show the correspondence between the segmented sub-tomogram averages (SSTA) and the raw filtered images at the transition region. This is also the case in Figure 5 where the SSTA (Figure 5A) are compared with the raw tomogram (Figure 5B), and where we clearly visualize the holes that result from the transitions in lattice types.

      However, it is not easy for me or the reader to understand how each segment is oriented relative to each other apart from the simplified seam diagrams in the figures, and also the orientation of the seam corresponding to the missing wedge in the average. With these improvements, I think the conclusion of the manuscript will be better justified.

      The segmentation process is explained in Figure 2-figure supplement 2 and in the Materials and Methods section, which shows that each segment is linearly related to the next. Small rotations can happen between individual segments, and it is important to check that the same protofilaments are followed during the initial modeling (see the online tutorial referenced in the manuscript for full-length microtubules). The segment models are derived from that of the full-length microtubule, as explained in the Materials and Methods section, using a new routine (splitIntoNsegments) implemented into the PEET program. In addition, a detailed protocol describing our SSTA strategy will be submitted following publication of our manuscript.

      Reviewer #2 (Public Review):

      Differences in protofilament and subunit helical-start numbers for in vitro polymerized and cellular microtubules have previously been well characterized. In this work, Guyomar et al. analyze the fine organization of tubulin dimers within the microtubule lattice using cryo-electron tomography and subtomogram averaging. Microtubules were assembled in vitro or within Xenopus egg cytoplasmic extracts and plunge frozen after addition of a kinesin motor domain to mark the position of tubulin dimers. By generating subtomogram averages of consecutive sections of each microtubule and manually annotating their lattice geometry, the authors quantified changes in lattice arrangement in individual microtubules. They found in vitro polymerized microtubules often contained multiple seams and lattice-type changes. In contrast, microtubules polymerized in the cytoplasmic extract more frequently contained a single seam and fewer lattice-type transitions.

      Overall, their segmented subtomogram averaging approach is appropriately used to identify regions of lattice-type transition and quantify their abundance. This study provides new data on how often small holes in the lattice occur and suggests that regulators of microtubule growth in cells also control lateral tubulin interactions. However, not all of the claims are well supported by their data and the presentation of their main conclusions could be improved.

      1 - We have corrected imprecise claims and conclusions where necessary. In particular, we now discuss the Xenopus-DMSO and the Xenopus-Ran egg extract samples separately, and have modified our conclusions accordingly. We also deposited in EMPIAR all tomograms and PEET models needed to reproduce the 938 segmented sub-tomogram averages analyzed in this study (see new Supplementary file 2).

      Reviewer #3 (Public Review):

      Protofilament number changes have been observed in in vitro assembled microtubules. This study by Guyomar and colleagues uses cryo-ET and subtomogram averaging to investigate the structural plasticity of microtubules assembled in vitro from purified porcine brain tubulin at high concentrations and from Xenopus egg extracts in which polymerization was initiated either by addition of DMSO or by adding a constitutively active Ran. They show that the microtubule lattice is plastic with frequent protofilament changes and contains multiple seams. A model is proposed for microtubule polymerization whereby these lattice discontinuities/defects are introduced due to the addition of tubulin dimers through lateral contacts between alpha and beta tubulin, thus creating gaps in the lattice and shifting the seam. The study clearly shows quantitatively the lattice changes in two separate conditions of assembling microtubules. The high frequency of defects they observe under their microtubule assembly conditions is much higher than what has been observed in vivo in intact cells. Their observations are clear and supported by the data, but it is not at all clear how generalizable they are and whether the defect frequencies they see are not a result of the assembly conditions, dilutions used and presence of kinesin with which the lattice is decorated. The study definitely has implications for mechanistic studies of microtubules in vitro and raises the question of how these defects vary for protocols from different labs and between different tubulin preparations.

      1 - High tubulin concentration: It has been documented by many laboratories since the discovery of tubulin and the characterization of its assembly properties that a sufficient concentration of free tubulin is necessary to self-assemble microtubules. This is called the critical concentration for self-assembly (the CC, i.e., the critical concentration to overcome the nucleation barrier), and has been reported to be in the range of 14-25 µM in the presence of GTP, depending on the laboratory. For example, in the seminal work of Mitchison and Kirschner the CC was estimated at 14 µM (Fig. 5 of ref. (Mitchison & Kirschner, 1984b)) and self-assembly was induced at concentrations in the range 32-59 µM (Mitchison & Kirschner, 1984a). Our own estimate of the CC for porcine brain tubulin was 21 µM (Fig 2C of (Weis et al., 2010)), and we routinely use a tubulin concentration slightly above the CC when we aim at robust microtubule self-assembly. Hence, we argue that 40 µM, which is ~twice the CC, cannot be considered a "very high" tubulin concentration to induce microtubule self-assembly.

      2 - Protofilament number and lattice-type transitions in cells: While microtubules with protofilament numbers different than 13 have been observed in different cell types and species (reviewed in (Chaaban & Brouhard, 2017)), we are aware of only one recent study where changes in protofilament numbers along individual microtubules have been reported in cells (Foster et al., 2021), but with no statistics concerning their frequencies. Hence, we cannot compare changes in protofilament number frequencies in Xenopus egg extracts with those that occur in intact cells. Concerning lattice-type transitions, we are not aware of any previous study that documented such features, whether in vitro or in cells.

      3 - Generalization of our results, source of tubulin and protocols: Multi-seams in microtubules assembled in vitro have been reported by several groups in the past (see our Introduction, L49-62), starting from (Kikkawa et al., 1994), the Milligan group (Dias & Milligan, 1999; Sosa et al., 1997), and more recently by the Sindelar group (Debs et al., 2020). In Kikkawa et al. (1994), the authors purified tubulin from porcine brain by three cycles of assembly/disassembly followed by phosphocellulose chromatography. Assembly was carried out at 24 µM in the presence of Taxol. In Sosa and Milligan (1996-1997), the authors used a commercial source (Cytoskeleton) and assembled the microtubules at 30 µM in the presence of Taxol. In Debs et al. (2020), the authors used tubulin purified from porcine brain according to (Castoldi & Popov, 2003), as we did, to assemble GMPCPP microtubules, and bovine brain tubulin (Cytoskeleton) to assemble Taxol-stabilized microtubules. Noticeably, they used an initial tubulin concentration of 100 µM to initiate microtubule polymerization and then added Taxol to continue the reaction.

      We add to these previous studies the finding that microtubules with different numbers of seams are not distinct populations: both the number and location of seams can vary within individual microtubules. The reason why this was not observed before is that the analytical tools used in those previous studies were not suited to reveal this structural heterogeneity within individual microtubules. By contrast, the SSTA approach that we designed was specifically developed towards this aim. Even in the recent work by Debs et al. (2020) that provides the most comprehensive characterization of multi-seams in microtubules assembled in vitro and that obtained a seam distribution very similar to ours (compare their Figure 3C with our new Figure 10C for GDP microtubules, dark blue bars), their protofilament-based approach could not reveal changes in the number and location of seams within individual microtubules. Yet, they probably could have done it if they had asked whether segments with different seam numbers had been extracted from the same microtubules.

      Here, we designed a specific approach to tackle the structural heterogeneity of individual lattices that permitted this discovery. Not only do we confirm results obtained by others, but we also propose a molecular mechanism that explains how multi-seams form in microtubules assembled in vitro and how they change in location in a cytoplasmic environment. By doing so, we propose a novel molecular event - formation of unique lateral interactions without longitudinal ones - that was not envisioned before, and which, in our opinion, must be incorporated in further modelling studies concerning microtubule nucleation and assembly, including the mechanism of dynamic instability (see the Ideas and speculation section).

      4 - Dilution: A 50X dilution was used only for Xenopus egg cytoplasmic extracts to decrease their density on the EM grid just before freezing. These conditions were established by cryo-fluorescence microscopy to ensure that we had an adequate density of microtubules on the EM grid (Figure 7 and Figure 2-figure supplement 1D). Of note, the microtubules analyzed by SSTA were assembled in extracts that were not supplemented with fluorescent tubulin. While we could imagine that dilution may induce the removal of dimers from the microtubule lattice, we cannot foresee how this could change the register between tubulin subunits within the microtubule lattice.

      5 - Kinesin decoration: Like many other laboratories (see the Table in Figure 3 of (Manka & Moores, 2018)), we use the non-processive motor domain of kinesin 1 to decorate microtubules, with the aim to differentiate the α- and β-tubulin monomers within the microtubule lattice. In particular, it has been shown that lattice parameters such as the protofilament skew and lattice spacing are unmodified when kinesin motor domains are added to GMPCPP- or GDP-microtubules (Zhang et al., 2015, 2018). In addition, we cannot envisage how this non-processive motor added to preformed microtubules could change the registry of the αβ-tubulin heterodimers within the microtubule lattice.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Overall comments

      We are pleased by the reviewers’ comments and appreciate their suggestions for improvements. In addition to correcting small typos throughout the manuscript, the major changes we made in response to their comments are as follows:

      • Changed the title of our paper to reflect more accurately the strong evolutionary correlation between sex chromosomal meiotic drive and gains/losses of SNBP genes in Drosophila.
      • New experiments to test the role of the well-conserved, universally retained SNBP, CG30056, in male fertility in D. melanogaster. Although reviewers had suggested we could eliminate this section, we felt that this would add a lot of weight to the unexpectedly inverse relationship between age/retention and fertility functions of SNBP genes. Thus, over the past few months, we have carried out new experiments with increased sample sizes, better controls, and sperm exhaustion. These new results strengthen our earlier analyses.
      • Better clarification of the X-Y chromosome fusion, which is a new observation, in the montium group via careful rewriting as partly suggested by Reviewer #2.
      • Highlighting that the genetic conflicts hypothesis does not rule out a role for sperm competition or other conflicts in shaping SNBP evolution in a revised Discussion.

      All changes in response to the reviewers’ comments have been detailed in our point-by-point response (below). You will see that we have addressed almost all the suggestions made, including with new experiments. The only reviewer suggestions (all optional, from Reviewer 3) that we did not directly address in our revision are:

      • Branch-specific protamine evolution analyses for sex chromosome-amplified SNBP genes: Given the state of SNBP gene annotation and the difficulties of assembling these genes in large tandem arrays, this will require considerable work and is beyond the scope of the paper.

      • Covariation between SNBP evolution and sperm morphology: We cannot perform these experiments as there is a paucity of sperm morphology data currently. Obtaining this data reliably is a significant undertaking.
      • Are SNBP genes more prone to be lost than average in the montium group: We have not comprehensively examined all loss events in the montium group or any other Drosophila species. This is also a non-trivial analysis, albeit one that would be very interesting. However, we believe the more relevant comparison is whether these lost SNBP genes are more likely to be retained in non-montium species, which they are, as we now highlight.

      We hope you will favorably judge our good faith efforts to address all other reviewers’ comments, and their laudatory comments during the previous round of reviews.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): __

      Chang and Malik present a comprehensive evolutionary analysis of sperm nuclear basic proteins (SNBPs) in Drosophila. In addition, they provide a preliminary functional characterization of one such protein (CG30056) and describe a newly discovered X-Y chromosomal fusion in the Drosophila montium species group. All of these findings are interesting and important, but the headline from this study is the well-supported possibility that SNBPs, or at least a large fraction of them, function in suppressing X vs. Y chromosome meiotic drive. While this hypothesis is challenging to test experimentally, the authors provide strong correlational evidence that SNBPs are associated with drive by documenting these proteins' rapid evolution. This rapid evolution takes the form of sequence changes (as predicted by coevolution between drivers and suppressors of drive), gene amplification in cases when SNBPs move to sex chromosomes (consistent with the SNBP becoming a potential agent of drive for its new "home chromosome"), and gene loss in species with X-Y chromosome fusions (in which drive is not predicted to occur).

      Overall, this is an excellent, comprehensive study. The phylogenetic and genomic analyses are first-rate (and one of the first to make use of the new 101 Drosophila genomes); the logic is very well explained; conclusions are supported by multiple lines of evidence; the writing and figures are clear and accessible; and, the findings are fascinating. It's a good sign that it is easy to imagine several experiments one could do to follow up on this study, but I do not feel any are required in revision, as the manuscript is comprehensive as is. Thus, I have just a few minor points the authors may wish to consider in making revisions and a few suggestions for clarity/typos.

      __

      We thank the reviewer for their positive comments on our work.

      1. I would be interested in whether the authors think that all SNBPs in a given Drosophila species function(ed) in meiotic drive, or whether some fraction may play other roles, such as sexual selection or chromatin compaction, which have been the traditional hypotheses for SNBP function. Relatedly, given the high turnover of SNBPs the authors observe and the fact that some melanogaster-essential SNBPs are younger genes, would they like to comment on whether the subsets of SNBPs involved in drive/suppression vs. chromatin packaging/sperm traits/Wolbachia defense are likely to differ across fly species?

      The reviewer raises an excellent point. In our revised discussion, we now speculate that different SNBPs might have distinct functions. For example, the same subset of SNBPs is subject to gene amplification and loss whereas other SNBPs are subject to less turnover. Moreover, even this stable set of SNBPs evolves rapidly, including in the montium group of species that have undergone dramatic SNBP loss. As the reviewer suggests, sperm competition or pressures from Wolbachia toxins might be a driving force for sperm evolution. We discuss these possibilities and conclude in our discussion: “Our findings do not rule out the possibility that forces other than meiotic drive are also important for driving the rapid evolution and turnover of SNBP genes in Drosophila species.”

      2. What do the authors make of the lower isoelectric points for a few of the SNBPs (e.g., CG31010 with pI = 4.77 in Table 1)? These proteins have identifiable HMG box domains, so is the pI driven lower by other parts of the protein sequence?

      We thank the reviewer for raising this point. We found that the pI of HMG domains can range from 6 to 12. Thus, the pI is driven by both HMG domains and other parts of proteins. We now include the pI of the whole SNBP protein and the HMG domain alone in Table 1. We do not have enough biochemical information to speculate on how these differences could alter SNBP function.

      __3. For readers less familiar with the field, it may help to spell out (e.g., on p. 6) why the authors consider ProtA/B to be important for fertility. Some of the previous papers on these genes describe them as dispensable - though the present authors are correct that these previous studies do detect fertility defects of various magnitudes under some conditions.

      __

      We agree with the reviewer. Previous studies disagree about the importance of ProtA/ProtB for male fertility: while no significant effects were seen under standard fertility assays, sperm exhaustion conditions (mating with excess females) did reveal fertility effects. We have now added these references and discussed ProtA/ProtB more fully in our revision.

      4. On p. 9, paragraph 2, the data showing that "six different SNBP genes underwent 11 independent degeneration events in the montium group" are shown in Fig. 6A, not 5A.

      Thank you. This has been fixed in our revision.

      5. The summary Table 2 is useful, but I wonder whether including relative levels of expression and dN/dS in addition to ordinal rankings might help clarify. For instance, if there were a drop off in mean expression level between the 5th and 6th most highly expressed SNBP, this wouldn't be evident from the table.

      We agree with this suggestion and have now added this information.

      6. In Fig. 3, I like the use of the clean CG31010 figure in panel A to illustrate the circular representations. In addition, though, it might be useful to show Prot's graph at this same, larger size, since it's the most complicated and will likely be most closely examined.

      We agree with this suggestion and have now amended this figure in line with the reviewer’s suggestion.

      7. In Fig. 4, the end of the legend says that the species tree is shown "on the right," but it's on the left in the figure.

      Thank you. This has been fixed in our revision.

      __CROSS-CONSULTATION COMMENTS

      • I agree with both Reviewers 2 and 3 that the title could be changed to be a bit more tentative. I'd had this thought as well.

      __

      We agree with this suggestion. We have now amended this title to “Expansion and loss of sperm nuclear basic protein genes in Drosophila correspond with genetic conflicts between sex chromosomes.”

      • I agree with Reviewer 2 that the fertility assay could be conducted with a larger sample size and a better control in order to be better compared with how the authors described other published fertility phenotypes for SNBPs. For the control, crossing the deletion line to y w (or w1118) and using the resulting heterozygotes (KO/+) would be better than using the mutation over the balancer chromosome (KO/CyO).

      We agree with both suggestions. We now compare fertility between KO/KO and KO/+ males in sperm exhaustion assays. Our more stringent fertility assays find no evidence of a role for CG30056 in male fertility, strengthening our previous findings. We have now added the motivation for the new assays and the new results to our revision.

      • I agree with Reviewer 3's third bullet point about spending a bit more time on the different possible roles that SNBPs could play in spermatids. (This is a more eloquent version of my review point #1.)

      We have now expanded our discussion of other possibilities in our revision.

      • I agree in principle with Reviewer 3's first bullet point about examining whether SNBP evolution correlates with changes in sperm morphology, but this feels like it could be a whole, fascinating study on its own, while this manuscript is already packed with data. I'd welcome the authors' thoughts about this in discussion, but wouldn't personally require a formal analysis of this to be added prior to publication.

      We also agree that this would be an interesting test. However, we are not able to do the test due to the scarcity of sperm phenotype data in Drosophila. We also think that our original version unintentionally downplayed this possibility. Our revised discussion makes clear that the rapid evolution of some Drosophila SNBP genes may be driven by sperm competition, just as in mammals, and influence the evolution of sperm morphologies.

      __Reviewer #1 (Significance (Required)):

      This study describes an important conceptual advance in our understanding of the evolution and potential functions of sperm nuclear basic proteins (SNBPs) in Drosophila, which stands in interesting contrast to the functional roles of equivalent proteins in primates. It should be of broad interest to biologists studying spermatogenesis, meiotic drive, and genome evolution, both in and out of Drosophila. __

      We thank the reviewer for their positive appraisal.

      __ To contextualize the work, paternal DNA is typically compacted during spermatogenesis. This process involves the replacement of histones with other small, positively charged proteins in a sequential order, ending with protamines that bind DNA in mature sperm. In Drosophila, work over the last two decades (largely from the labs of R. Renkawitz-Pohl, B. Loppin and B. Wakimoto) has identified more than a dozen sperm nuclear basic proteins that localize to condensing/condensed spermatid nuclei. Two interesting observations have been that many of these proteins are dispensable for male fertility, and the proteins vary in their degree of evolutionary conservation. Recent work from Eric Lai's lab (J Vedanayagam et al. 2021, Nat Ecol Evol) showed that in D. simulans and sister species, at least one of these SNBP genes (Prot) underwent gene amplification and now acts in those species as a meiotic driver. This finding suggested the hypothesis, tested thoroughly in the present study, that the rapidly evolving SNBP gene family could be involved in causing or suppressing meiotic drive. Consistent with this idea, the authors here find that SNBP genes expand in copy number more frequently when they move from autosomes to sex chromosomes (consistent with the idea that they may cause or contribute to drive), and that otherwise well-conserved SNBP genes are lost in a group of species in which sex chromosome meiotic drive is not expected to occur. These findings are based on a thorough and well conducted phylogenomic and molecular evolutionary analysis of SNBPs across dozens of Drosophila species. Overall, this work generates exciting new hypotheses about the function of SNBPs and should be widely read both within and outside of the field.

      __

      We are grateful for the reviewer’s accurate summary of our work and its significance. We share the reviewer’s excitement and expect that more studies will explore the new function of SNBPs in multiple taxa soon.

      Keyword describing my field of expertise: Drosophila, molecular evolution, reproduction, genetics, genome evolution.

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The paper describes interesting patterns on the evolution of Drosophila SNBP genes, and proposes a very interesting explanation, namely, that meiotic drive is the main evolutionary force behind these patterns. Some of these observations have recently been made by other authors in a single case (the Dox genes in D. simulans), but not at the scale and breadth of the present ms. The ms combines an extensive investigation of available genomes with expert analysis, and new experimental data. In particular, the finding that the ancestral Y became incorporated into the X in montium species is very exciting, and may provide a smoking gun for the explanation proposed by the authors. Overall, I think it is a very good paper. I do have several criticisms and suggestions that may help to improve it.

      __

      We are grateful for the positive comments of the reviewer and for their constructive criticism and suggestions, which we have incorporated into our revision.

      __The paper has a speculative side that is almost unavoidable given its novelty and breadth. I do not see this as a problem per se, but I think the uncertain/unsupported/problematic points should be more openly presented to the readers. The main cases I noted are:

      1. The title of the ms states that "Genetic conflicts between sex chromosomes drive expansion and loss of sperm nuclear basic protein genes in Drosophila", but the evidence is somewhat circumstantial, and the patterns may be explained also by other known phenomena (e.g., demasculinization of the sex chromosomes; below). I think the tone of the end of the Introduction reflects more faithfully the strength of the evidence ("Thus, we conclude that rapid diversification of SNBP genes might be largely driven by genetic conflicts between sex chromosomes in Drosophila."). I understand the temptation of writing a bold title, but I think it is a bit misleading in the present case. I.e., it would be desirable that the title conveys the uncertainties of the data and their interpretation. __

      We agree with this suggestion. We have now amended this title to “Expansion and loss of sperm nuclear basic protein genes in Drosophila correspond with genetic conflicts between sex chromosomes.”

      However, we also want to highlight that de-masculinization of the X chromosome cannot explain the observed amplification and loss patterns of SNBP genes, except in cases of sex chromosome fusions. We now highlight the de-masculinization hypothesis for the latter case, but still strongly favor the genetic conflicts hypothesis.

      "In contrast, we found no instances of pseudogenization or subsequent translocation to the X chromosome of SNBP genes that are still preserved on their original autosomal locations or involved in chromosome fusions between autosomes (0/16). This difference is highly significant (Fig 5 and Table S11; 3:5 versus 0:16, Fisher's exact test, P=0.03). " Readers should be warned that this pattern can also be explained by the well-known demasculinization of X chromosomes (e.g., Sturgill et al. Nature 2007, 450, 238-241)

      We agree with this point and thank the reviewer for pointing this out. We now expressly raise the ‘de-masculinization of X chromosomes’ as one potential explanation of the pattern we observe here.
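
      For readers who wish to check the reported significance, the comparison quoted above can be recomputed from the counts alone. The sketch below is not the authors' analysis script. It assumes that "3:5 versus 0:16" denotes 3 events versus 5 non-events in one group and 0 versus 16 in the other, and the function name `fisher_exact_two_sided` is hypothetical. It implements the standard two-sided Fisher's exact test by summing hypergeometric probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (an interpretation of "3:5 versus 0:16"):
    row 1 = group with events a and non-events b,
    row 2 = group with events c and non-events d.
    """
    row1, row2 = a + b, c + d
    col1 = a + c  # total number of events across both groups
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables whose probability does not exceed the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


p_value = fisher_exact_two_sided(3, 5, 0, 16)
print(f"P = {p_value:.3f}")
```

      Under this reading of the counts, the result rounds to the P=0.03 quoted in the manuscript excerpt. With so few events, the test has limited power, so it cannot by itself distinguish between the genetic conflict and de-masculinization explanations.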

      "Indeed, no meiotic drive has been documented in the montium species even though it is rampant in many other Drosophila lineages [38]." Two remarks here: a) the authors should make clear that they are referring to sex-chromosome meiotic drive. b) I think the evidence is much weaker than the sentence implies. Sex-chromosome meiotic drive is known in less than 20 Drosophila species, scattered throughout the phylogeny. As far as I know all cases were discovered by accident, so the sampling is biased towards model species (e.g., the obscura group, which was very popular around 1930-1960). So we do not know the true frequency of sex-ratio meiotic drive among Drosophila species, nor, say, if it is more common in the Drosophila or Sophophora species, if it is suspiciously absent in the montium group (as suggested by the authors), etc. I think these uncertainties should be acknowledged or, perhaps, given the weakness of the argument, the sentence should be deleted or attenuated.

      We agree with this comment and have now removed this argument in our revision.

      __ "X-Y chromosome fusions eliminate the extent of meiotic drive and may lead to the degeneration of otherwise conserved SNBP genes, whose functions as drive suppressors are no longer required. Thus, unlike in mammals, sex chromosome-associated meiotic drive appears to be the primary cause of SNBP evolutionary turnover in Drosophila species." The authors found that in the montium species the ancestral Y became incorporated into the X chromosome, and that montium species seem to have an inordinate amount of SNBP gene losses. They combine these two observations by suggesting that these SNBP became dispensable or deleterious because they originally were involved in XY meiotic drive. I think many readers will think that males in montium species are X/0, whereas in fact all of them carry a Y chromosome (just, in most cases, more gene poor than "normal" Y-chromosomes). I do not think this is a fatal flaw for the explanation proposed by the authors, but certainly is a difficulty that should be acknowledged.

      __

      We agree with this point. It was not our intention to suggest that montium group males are X/O, but our original wording could have been misinterpreted in this way. We now add a clarification that montium group males still harbor a Y chromosome, which is missing most ancestrally Y-linked genes.

      __Problems/suggestions with experiments and data analysis

      1. There is a section titled "CG30056 is universally retained in Drosophila but dispensable for male fertility in D. melanogaster". In this section and in the figures, it is stated, "Although CG30056 is the most conserved SNBP we surveyed, we found no clear difference in offspring number between heterozygous controls and homozygous knockout males (Fig 2B). (...) We found either no or weak evidence of fertility impairments in two different crosses with homozygous CG30056 knockout males.". I think the fertility data are weak for the purpose of the authors, and I strongly suspect that this conclusion is wrong. Let me explain why. At other passages of the ms, the authors classify the SNBP genes in three groups. (i) essential/important for male fertility: "Three genes (Mst77F, Prtl99C and ddbt) are essential for male fertility while knockdown or knockout of two other SNBP genes (ProtA, and ProtB) leads to significant reduction in male fertility [27-30, 32]." (ii) genes that do not appear to impair male fertility at all. (iii) untested. CG30056 was in the last group, and hence the authors produced knockouts, tested their effect in male fertility, and concluded that it belongs to the second group. Now, look at Fig. 3B. The numbers of tested males are too small (it seems to range from 3 to 10), and male fertility is known to be a very noisy phenotype (as shown by the huge scatter in the authors' data). Furthermore, two different knockouts were tested, and both were nominally less fertile than the controls, and in one of them the difference is statistically significant. Taken at face value, the knockouts seem to be perhaps ~25% less fertile than the controls. Another potentially big problem is that the "control males" actually carry visible dominant mutants (the balancers CyO or SM6) which certainly reduce their fitness, whereas the experimental males are wild-type for these mutants. Without the detrimental effect of these visible mutants in the controls, the difference to the CG30056 knockouts will probably be even larger. Note that the fertility effects of the genes ProtA, and ProtB (a.k.a. "Mst35B"), which the authors put in group "essential/important for male fertility", would not have been detected if assayed as the CG30056 gene: Tirmarche et al (2014; the reference cited by the authors) stated that: "In fact, the impact of Mst35B on male fertility was only revealed when mutant males were allowed to mate with a large excess of virgin females (1 for 10; Figure 3F) but not with a 1:1 sex ratio (not shown). " The authors' fertility test did not use this type of challenge. My general impression is that the fertility effects of CG30056 may actually be similar to ProtA and ProtB. I think the authors should do a proper fertility test of CG30056, or remove this section. Another possibly useful approach would be to classify the SNBP genes in those essential for male fertility and those that are not essential, because "experimentally speaking" this is a safer distinction (e.g., the fertility tests reported by other authors may also have been quick tests). Since these genes only function in sperm and are under purifying selection (otherwise they would have been lost; also, all have dN/dS

      We are very appreciative of the many important points raised by the reviewer. Rather than removing this conclusion, which is not central to our paper, we have now performed additional, well-controlled experiments to address the reviewer’s concerns, which we summarize below:

      2. We agree with the reviewer that it is easier to classify SNBP genes as either essential for male fertility or not.

      3. We also agree with the reviewer and now include more details about earlier studies to highlight that ProtA/ProtB fertility effects were only revealed in a sperm exhaustion setting.
      4. We agree with the reviewer’s suggestion and have now included sample sizes for all our experiments in a new supplementary Table (Supplementary file 8).
      5. We agree with the reviewer that a comparison between KO/KO and KO/Bal males is non-ideal given that Balancer chromosomes carry many deleterious mutations. We now include new experiments in our revision that compare KO/KO and KO/wt chromosomes.
      6. We agree with the reviewer that standard fertility assays may be too noisy to detect subtle fertility effects. We therefore now carry out much more stringent fertility assays under sperm exhaustion conditions, with a male:female ratio of 1:10 and at least 10 males tested per genotype. Despite this higher stringency, we detect no difference in fertility between KO/KO males and KO/wt controls for CG30056 (>10 males were tested for each). Thus, our original conclusion that CG30056 has no detectable effect on male fertility is now even better supported. We have not tested the possibility of sperm storage or precedence being affected in our assays. However, we do believe that the finding that one of the best conserved and retained SNBP genes has no detectable effect on male fertility is an important conclusion which greatly increases the impact of our study, especially since most fertility-essential genes are either young or not universally conserved. We hope these changes will satisfy the reviewer's concerns about this section of our paper.

      "Our phylogenomic analyses also highlighted one Drosophila clade- the montium group of species (including D. kikkawai)- which suffered a precipitous loss of at least five SNBP genes that are otherwise conserved in sister and outgroup species (Fig 3). (...) Given our hypothesis that autosomal SNBP genes might be linked to the suppression of meiotic drive (above), we speculated that the loss of these genes in the montium group of Drosophila species may have coincided with reduced genetic conflicts between sex chromosomes in this clade." The montium data is an important part of the paper. I think the authors should test the statistical significance of this pattern.

      We appreciate the reviewer’s suggestion. However, we are unable to perform the statistical tests suggested for technical reasons. We note that three loss events occurred in the ancestor of D. montium species, while two happened in the ancestors of most D. montium species. Since it’s hard to estimate the evolutionary rates using these internal branches, we can’t directly compare them to other branches using statistics. However, in response to the reviewer’s comments, we now more clearly contrast the fate of SNBPs between D. montium species and other melanogaster group species, noting that three of five genes lost in the montium group are retained in all other melanogaster group species.

      Other points:

      1. "The five remaining SNBP genes (Mst33A, CG30056, CG31010, CG34269, and CG42355) remain cytologically uncharacterized [30]." I think it will be interesting if the authors look at other potentially useful resources: Vibranovski et al papers which looked at gene expression in mitotic, meiotic and post-meiotic cells (https://mnlab.uchicago.edu/sppress/index.php), and the papers by several labs on testis single-cell transcriptomic data (Witt et al 2021 PLOS Genetics. 17(8):e1009728 ; Nat Commun. 2021;12: 892). These may provide additional clues on the function of SNBP genes. There is also a recent report on sperm proteome (doi: https://doi.org/10.1101/2022.02.14.480191)

      We are grateful to the reviewer for this suggestion. We now add the data from single-cell expression analyses from Witt et al. in Table 1-figure supplement 1. We found that most SNBPs are expressed in late spermatocytes and early spermatids, although CG30056 is primarily expressed in late spermatids, whereas CG34269 is expressed earlier, in late spermatogonia. The data from Vibranovski et al. show similar patterns but lack four of these genes, including CG34269. The data from Mahadevaraju et al. are from larval testes and lack some critical stages of spermatogenesis. Thus, we only report the data from Witt et al.

      We also surveyed the proteome data as the reviewer suggested, but we only found 3 SNBPs (ProtA, ProtB, and Prtl99C) in the data. This did not include Mst77F, which is the most highly expressed (see Table 2) and well-studied SNBP, so we suspect the proteomic study might be biased toward proteins from sperm tails. Therefore, we decided not to include this analysis.

      "Our inability to detect homologs beyond the reported species does not appear to result from their rapid sequence evolution. Indeed, abSENSE analyses [45] support the finding that Prtl99C, Mst77F, Mst33A, Tpl94 and CG42355 were recently acquired in Sophophora within 40 MYA. For example, the probability of a true homolog being undetected for Prtl99C and Mst77F is 0.07 and 0.18 (using E-value=1), respectively (Table S1, Methods)." This should be complemented by synteny analysis.

      It may not have been clear from our original version that we did perform synteny analyses for all SNBP genes. We have now restated this more clearly in our revision.

      I found the following sentence unclear: "However, we could only ascribe a sex chromosomal linked location for species if no data was available from either BUSCO genes or females (only males and mixed-sex flies)."

      We modified the sentence to make it clearer: “However, we could not ascribe a sex-chromosomal linked location of a contig to either the X or Y chromosome in cases where there was no linkage information from BUSCO genes and no read data available from females, only from males and mixed-sex flies.”

      "Using the available assemblies with Illumina-based chromosome assignment, we surprisingly found that most ancestrally Y-linked genes are not linked to autosomes as was previously suggested [by Dupim et al 2018] (Fig 6A)."

      The new result of X-linkage is exciting, but the sentence is not exact: Dupim et al 2018 made clear that they could only separate X/A from Y-linkage. E.g., the legend of their Fig 3: "Phylogeny and gene content of the Y chromosome in the montium subgroup. "M" means amplification only in males (i.e., Y-linkage), whereas "MF" means amplification in both sexes (autosomal or X-linkage)."

      We are grateful to the reviewer for this correction. We have now modified the sentence to make clear that Dupim et al had “showed that many ancestrally Y-linked genes are present in females because of possible relocation to other chromosomes in the montium group.”

      "The most parsimonious explanation for these findings is a single translocation of most of the Y chromosome to the X chromosome via a chromosome fusion in the ancestor of the montium group of species. Afterward, some of these genes relocated back to the Y chromosome in some species (Fig S6; Supplementary text)." Explanations for this pattern of "return to the Y" have been extensively discussed and tested in Dupim et al 2018 (see their section "Why genes seem to return to the Y chromosome after Y incorporations?"). The available evidence strongly suggests that it is not a case of relocation to the Y.

      We thank the reviewer for raising this point. However, our conclusions disagree slightly with those from Dupim et al. 2018, in part because of additional sequencing in this clade. Dupim et al. suggested the possibility that most Y chromosomal loci duplicated to other chromosomes in the ancestor of the D. montium clade, following which each species degenerated either Y-linked or autosomal copies of genes. If this was the case, Y-linked copies should have diverged from X-linked copies since the ancestor of the D. montium clade. In contrast to this expectation, our phylogenetic analyses found that D. kikkawai Y-linked PRY is more closely related to X-linked PRY in all other related species (Figure 6- figure supplement 1). This result is much less parsimoniously explained by the ancient duplication event proposed by Dupim et al. and is more consistent with a ‘return-to-Y’ that we propose. We also make clear that, unlike PRY, we can’t differentiate the two hypotheses in the case of kl-2.

      Fig 6B suggests that the authors assembled the "translocated Y" in D. triauraria. However, no direct data or account for this assembly is provided. Please clarify.

      This was not our assembly. We searched all publicly available assemblies in the montium group and found one assembly (NCBI accession GCA_014170315.2) that assembled all ancestral Y-linked regions. We now clarify this in our revision.

      "Why would meiotic drive only influence Drosophila, but not mammalian, SNBP evolution? One important distinction may arise from the timing of SNBP transcription. In D. melanogaster, SNBP genes are transcribed before meiosis but translated after meiosis [29, 43, 57]. Thus, SNBP transcripts from a single allele, e.g., X-linked allele, are inherited and translated by all sperm, regardless of which chromosomes they carry. Consequently, they can act as meiotic drivers by causing chromatin dysfunction in sperm without the allele, e.g., Y-bearing sperm." During spermatogenesis Drosophila haploid cells actually are syncytial, which has interesting consequences for the evolution of male genes (Raices et al, Genome Res. 1115-1122, 2019). This may be relevant for the present paper.

      We thank the reviewer for this suggestion. We now gratefully include this citation in our revision.

      Reviewer #2 (Significance (Required)):

      see above

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript by Chang & Malik consider the evolution of HMG-box-containing sperm nuclear basic proteins (SNBPs) across Drosophila species in phylogenetic context.

      Previous work in mammals had highlighted fast evolution of proteins involved in chromatin remodeling during spermatogenesis. Here, the authors provide evidence for widespread positive selection and likely involvement in genetic conflict in a set of proteins with analogous functions in Drosophila. Amongst other findings, the authors highlight biased amplification of SNBP paralogs on sex chromosomes along several Drosophila lineages, a tendency towards loss/pseudogenization following translocation onto a sex chromosome, and an intriguing concerted SNBP loss event in the montium group where parts of the Y chromosome have become fused to the X, thus nullifying the chance that genetic conflicts can play out via distorted segregation of sex chromosomes. The authors suggest that, taken together, their findings support widespread involvement of SNBPs (as instigators and repressors) in meiotic drive. Overall, I found the manuscript to be well written and thorough in its exploration of the evolutionary dynamics of SNBPs in this clade.

      We thank the reviewer for the accurate summary and the kind comments.

      Below, I have highlighted some aspects that I think would benefit from further attention, none of them major.

      • Following their exploration of patterns of SNBP evolution in Drosophila, the authors highlight support of their data for genetic conflict between sex chromosomes. They also rightly acknowledge that other evolutionary drivers such as sperm competition might also play a role in, for example, fast evolution of certain SNBPs. Yet those (not mutually exclusive) alternatives are never pitted directly against each other. The focus is firmly on exploring the support for the sex chromosome genetic conflict model. Given that the authors highlight Drosophila as a great model in part because of its well characterized sperm biology (including comparative morphology), I wondered why the authors had not made an explicit attempt to see if SNBP evolution covaries with aspects of sperm morphology across Drosophila.

      We do agree with the reviewer that it will be very interesting to test whether SNBP evolution covaries with sperm morphology in Drosophila. However, data on sperm morphology is scant in most Drosophila species. Indeed, this trait has only been well studied in clades with heteromorphic (different-sized) sperm but we agree this will be an exciting topic to consider in the future.

      We also clarify better in our revised discussion that our analyses do not rule out a role for sperm competition or sperm morphology in driving the evolution of at least some SNBP genes. We note that a subset of SNBP genes undergo gene amplifications and loss, but most SNBP genes evolve rapidly including in species with gene loss. Thus, the meiotic drive hypothesis is not to the exclusion of other hypotheses.

      • The most intriguing part of the manuscript for me was the exploration of SNBP fate in the montium group, where the authors find evidence for an ancestral fusion event between the X and parts of the Y chromosome. The loss of SNBPs is certainly consistent with the conflict model but I was wondering to what extent this lineage is characterized more broadly by unusual evolution at the chromosomal level. Is there simply a lot of upheaval in montium, with more frequent gain/loss across the board? How specific is SNBP loss in the context of other orthologous groups? This could be investigated by looking at retention of other genes in other orthologous groups (in montium and some other control group) or perhaps by looking at synteny conservation.

      This is a good suggestion. Using the same methodology as used in this paper, we found that very few D. melanogaster essential genes (2000) are lost in any single species we surveyed here (unpublished data). However, we have not carried out similar analyses for all genes; given vastly different rates of evolution, this would be a significant undertaking. Thus, we are not able to make a direct comparison between SNBP genes and a control group that would include other testes-specific or fertility-essential genes. Instead, we highlight the fact that, since we identify SNBPs using syntenic analyses, we know that the neighboring genes of SNBPs are much better conserved than the SNBP genes themselves in the montium group species.

      • In introducing SNBPs, the authors focus on their role as packaging agents. Clearly, SNBPs do package the genome in the sense that they bind to DNA and lead to reduced chromosome volume. But is this all packaging for packaging's sake (as portrayed by the sperm shape hypothesis)? Or is the situation a bit more nuanced, where condensation leads to a reduction of volume but also to a shutdown of transcription, protection from DNA damage, etc.? I think the focus on packaging alone is somewhat limiting when it comes to imagining how these proteins might act in the context of genomic conflicts. The authors may want to broaden their description of SNBPs in the Introduction accordingly.

      We completely agree with the reviewer and are currently exploring these possibilities in follow-up studies on SNBP function. However, it is fair to add that this hypothesis has not been well-recognized, and we, therefore, prefer to include it in our revised Discussion rather than Introduction. However, we also think that SNBP packaging function might be targeted by Wolbachia-encoded toxins, speeding up their evolution (revised Discussion). We think there are many molecular possibilities for SNBPs.

      • The authors highlight that some SNBPs are expressed in mature sperm whereas others are transition proteins. The evidence for positive selection chiefly comes from the latter group (and "undefined" proteins that could also be transition proteins). Can the authors comment on whether this is expected/unexpected? Along the same lines, the authors highlight differences between Drosophila and mammals when it comes to the timing of transcription/translation during meiosis, suggesting that meiotic drive can happen in Drosophila because alleles are expressed early and can exert an effect after meiosis regardless of whether the associated locus is present in the gamete. I wonder how this relates, if at all, to the authors' finding that transition SNBPs are more likely to be part of conflicts (as indicated by positive selection signals) compared to SNBPs in mature sperm.

      We thank the reviewer for this comment. We expect that many genes expressed explicitly in spermatogenesis, including SNBP genes, would be under positive selection, regardless of whether they are associated with X-Y conflicts. The positive selection signals could come from either X-Y conflicts, sperm competition, or conflicts with Wolbachia; we now discuss all of these in the Discussion.

      In contrast, the amplification and loss of a subset of Drosophila SNBPs are more likely associated with X-Y conflicts. We note that known SNBPs retained in mature sperm are more likely to be subject to amplification than known transition proteins.

      Regarding the timing of expression, it is true that transition SNBPs act earlier in spermatogenesis than SNBPs retained in mature sperm. However, for the meiotic drive hypothesis to apply, all it requires is for SNBP expression to precede sperm individualization, which it does for most SNBPs, including transition proteins.

      • It is not entirely clear from the text (and also e.g. Table S4) how dN and dS (and subsequently dN/dS) were calculated. I presume as a single estimate across the whole phylogeny? If so, how heterogeneous is dN/dS across the phylogeny and can the authors identify specific branches on which selective regimes are different? A branch-level analysis should be better powered than the site-level analysis the authors present, which requires repeated selection on the same set of sites to get a strong enough signal. A branch-specific assessment of evolution would be particularly valuable when combined with the assessment of amplifications/losses.

      We thank the reviewer for this question. The reviewer is correct. We estimated dN and dS in Supplementary file 4 across the whole phylogeny. We conducted branch tests for the amplification of tHMG only in the Dsim clade (Supplementary file 11).

      We are interested in how SNBP amplification happened across species, but we need better gene annotation for their structure in many of these 19 independent cases. Moreover, we hope to combine transcriptomic analyses with detailed sequence analyses to reveal how each event happened and how gene conversion, gene duplication, and mutations affect their evolution. Each of these analyses requires extensive additional resources, and we feel they are beyond the scope of this current paper.

      • The authors suggest that young SNBPs are more likely to encode essential, non-redundant male fertility functions (p7, third paragraph). I'm not sure whether this generalization is appropriate given the small sample. Tpl94D is as young as Mst77F/Prtl99C, tHMG and CG14835 homologs have been lost along different lineages and most of the events are in a single lineage leading up to D. kikkawai. Do the authors really feel that this generalization is warranted?

      We agree with the reviewer. However, it is striking that the known fertility essential genes are either young or not universally conserved. We have therefore reworded our conclusion to make this contrast more accurate.

      • How do the sex-chromosomal amplifications differ in sequence from the ancestral autosomal copies? The authors suggest that the sex chromosomal copies might be involved in meiotic drive? Does the sequence offer a function as to how? (e.g. loss of charged residues/DNA-binding capacity?)

      These are good questions. We do not know mechanistically how the sex-chromosome amplifications may cause meiotic drive. We did not observe the loss of positive charge or HMG domain in most sex-chromosomal amplified copies (Supplementary file 3). Our current working hypothesis is that they compete for the DNA binding with autosomal SNBP, and might interact with other proteins, e.g., heterochromatin proteins, to disturb sperm function. How they might function to cause meiotic drive is an active area of investigation in our and other labs.

      • I think it would be nice to have a final table/figure summarizing the different lines of evidence for all the genes in Table 1 (i.e. positive selection yes/no, amplification in some lineages yes/no, sex chromosome translocations yes/no), for different lineages, including whether any of the HMG-box genes are unlikely to act as SNBPs.

      We agree with this suggestion. We have now significantly revised and added to Table 2 to include this added information.

      • The evidence the authors present is often consistent with genetic conflicts between sex chromosomes. Is it cogent? Arguably not (since no direct tests of the mechanism are provided). I would therefore suggest a more cautious title than one stating that conflicts drive expansion and loss of SNBPs.

      We agree with all three reviewers and have amended our title to highlight the correlation. We also discuss other possibilities that can drive SNBP evolution in our revised Discussion.

      Typographical errors etc.:

      • P3. First paragraph: "One of the driving forces ... " I found this sentence a bit odd in terms of causality (changes in composition being portrayed as a force that leads to selection)

      We thank the reviewer for pointing out the confusing construction. We modified the sentence to “The positive selection of SNBPs results in changes to their amino acid composition.”

      - P3. Second paragraph: should be "HMG-box" rather than "HMB-box"

      Fixed.

      - P3. Fourth paragraph "..., consistent with the observation in mammals". I think "consistent" should be reserved for two observations that speak to the same phenomenon. SNBPs could evolve with no evidence for positive selection in Drosophila and that wouldn't exactly be "inconsistent" with mammals. It would just be different.

      Fixed. We changed “consistent with” to “similar to”.

      - P5. Fifth paragraph: should be "in the PAML package" rather than "in PAML package"

      Fixed.

      - P9. Second paragraph: "... montium group (Fig 5A)...)" should be Fig 6A.

      Fixed.

      CROSS-CONSULTATION COMMENTS I have not much to add. The other reviews seem fair and well-informed from my somewhat-outside perspective. I don't know how tricky/time-consuming the suggested additional fly mating experiments are but want to note that, in general, I'm loath to "punish" authors of principally bioinformatic work for including some experiments. If experimental shortcomings can be addressed with appropriate caveats, that should be an option, as should removal of experimental data that - by the experts - would be considered too preliminary.

      We thank the reviewer for their support. However, we felt that improved experiments on CG30056 role in fertility could broaden the scope of this paper, despite the additional time and labor commitment. We have now finished these experiments and they do reinforce our original conclusions with much greater support.

      It is my policy to sign my reviews.

      Tobias Warnecke

      Reviewer #3 (Significance (Required)):

      I'm not enough of an expert in the field of SNBPs to assess the level of advance provided by this study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The paper describes interesting patterns on the evolution of Drosophila SNBP genes, and proposes a very interesting explanation, namely, that meiotic drive is the main evolutionary force behind these patterns. Some of these observations have recently been made by other authors in a single case (the Dox genes in D. simulans), but not in the scale and breadth of the present ms. The ms combines an extensive investigation of available genomes with expert analysis, and new experimental data. In particular, the finding that the ancestral Y became incorporated into the X in montium species is very exciting, and may provide a smoking gun for the explanation proposed by the authors. Overall, I think it is a very good paper. I do have several criticisms and suggestions that may help to improve it.

      The paper has a speculative side that is almost unavoidable given its novelty and breadth. I do not see this as a problem per se, but I think the uncertain/unsupported/problematic points should be more openly presented to the readers. The main cases I noted are:

      1. The title of the ms states that "Genetic conflicts between sex chromosomes drive expansion and loss of sperm nuclear basic protein genes in Drosophila", but the evidence is somewhat circumstantial, and the patterns may be explained also by other known phenomena (e.g., demasculinization of the sex chromosomes; below). I think the tone of the end of the Introduction reflects more faithfully the strength of the evidence ("Thus, we conclude that rapid diversification of SNBP genes might be largely driven by genetic conflicts between sex chromosomes in Drosophila."). I understand the temptation of writing a bold title, but I think it is a bit misleading in the present case. I.e., it would be desirable that the title conveys the uncertainties of the data and their interpretation.
      2. "In contrast, we found no instances of pseudogenization or subsequent translocation to the X chromosome of SNBP genes that are still preserved on their original autosomal locations or involved in chromosome fusions between autosomes (0/16). This difference is highly significant (Fig 5 and Table S11; 3:5 versus 0:16, Fisher's exact test, P=0.03). " Readers should be warned that this pattern can also be explained by the well-known demasculinization of X chromosomes (e.g., Sturgill et al. Nature 2007, 450, 238-241)
      3. "Indeed, no meiotic drive has been documented in the montium species even though it is rampant in many other Drosophila lineages [38]." Two remarks here: a) the authors should make clear that they are referring to sex-chromosome meiotic drive. b) I think the evidence is much weaker than the sentence implies. Sex-chromosome meiotic drive is known in less than 20 Drosophila species, scattered throughout the phylogeny. As far as I know all cases were discovered by accident, so the sampling is biased towards model species (e.g., the obscura group, which was very popular around 1930-1960). So we do not know the true frequency of sex-ratio meiotic drive among Drosophila species, nor, say, if it is more common in the Drosophila or Sophophora species, if it is suspiciously absent in the montium group (as suggested by the authors), etc. I think these uncertainties should be acknowledged or, perhaps, given the weakness of the argument, the sentence should be deleted or attenuated.
      4. "X-Y chromosome fusions eliminate the extent of meiotic drive and may lead to the degeneration of otherwise conserved SNBP genes, whose functions as drive suppressors are no longer required. Thus, unlike in mammals, sex chromosome-associated meiotic drive appears to be the primary cause of SNBP evolutionary turnover in Drosophila species." The authors found that in the montium species the ancestral Y became incorporated into the X chromosome, and that montium species seem to have an inordinate amount of SNBP gene losses. They combine these two observations by suggesting that these SNBP became dispensable or deleterious because they originally were involved in XY meiotic drive. I think many readers will think that males in montium species are X/0, whereas in fact all of them carry a Y chromosome (just, in most cases, more gene poor than "normal" Y-chromosomes). I do not think this is a fatal flaw for the explanation proposed by the authors, but certainly is a difficulty that should be acknowledged.
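
      For readers who want to check the statistic quoted in point 2, the following is a minimal sketch of a two-sided Fisher's exact test. It assumes that "3:5 versus 0:16" denotes a 2x2 table with 3 affected and 5 unaffected genes in one class and 0 affected and 16 unaffected genes in the other; that reading is an assumption, not something stated in the manuscript excerpt. The function name `fisher_exact_two_sided` is hypothetical and is not the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration only.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Assumed layout: rows are gene classes, columns are (affected, unaffected).
p_value = fisher_exact_two_sided(3, 5, 0, 16)
print(f"P = {p_value:.3f}")
```

      Under that assumed table layout the result rounds to the P=0.03 quoted from the manuscript. The test only quantifies the asymmetry between the two classes; it does not distinguish between the meiotic drive and demasculinization explanations.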

      Problems/suggestions with experiments and data analysis

      1. There is a section titled "CG30056 is universally retained in Drosophila but dispensable for male fertility in D. melanogaster". In this section and in the figures, it is stated, "Although CG30056 is the most conserved SNBP we surveyed, we found no clear difference in offspring number between heterozygous controls and homozygous knockout males (Fig 2B). (...) We found either no or weak evidence of fertility impairments in two different crosses with homozygous CG30056 knockout males.". I think the fertility data are weak for the purpose of the authors, and I strongly suspect that this conclusion is wrong. Let me explain why. At other passages of the ms, the authors classify the SNPB genes in three groups.
        • (i) essential/important for male fertility: "Three genes (Mst77F, Prtl99C and ddbt) are essential for male fertility while knockdown or knockout of two other SNBP genes (ProtA, and ProtB) leads to significant reduction in male fertility [27-30, 32]."
        • (ii) genes that do not appear to impair male fertility at all.
        • (iii) untested. CG30056 was in the last group, and hence the authors produced knockouts, tested their effect in male fertility, and concluded that it belongs to the second group. Now, look at Fig. 3B. The numbers of tested males are too small (it seems to range from 3 to 10), and male fertility is known to be a very noisy phenotype (as shown by the huge scatter in the authors' data). Furthermore, two different knockouts were tested, and both were nominally less fertile than the controls, and in one of them the difference is statistically significant. Taken at the face value, the knockouts seem to be perhaps ~25% less fertile than the controls. Another potentially big problem is that the "control males" actually carry visible dominant mutants (the balancers CyO or SM6) which certainly reduce their fitness, whereas the experimental males are wild-type for these mutants. Without the detrimental effect of these visible mutants in the controls, the difference to the CG30056 knockouts will probably be even larger. Note that the fertility effects of the genes ProtA, and ProtB (a.k.a. "Mst35B") , which the authors put in group "essential/important for male fertility" would not had been detected if assayed as the CG30056 gene: Tirmarche et al (2014; the reference cited by the authors) stated that: "In fact, the impact of Mst35B on male fertility was only revealed when mutant males were allowed to mate with a large excess of virgin females (1 for 10; Figure 3F) but not with a 1:1 sex ratio (not shown). " The authors' fertility test did not used this type of challenge. My general impression is that the fertility effects of CG30056 may actually be similar to ProtA and ProtB. I think the authors should do a proper fertility test of CG30056, or remove this section. 
Another possibly useful approach would be to classify the SNBP genes into those essential for male fertility and those that are not essential, because "experimentally speaking" this is a safer distinction (e.g., the fertility tests reported by other authors may also have been quick tests). Since these genes only function in sperm and are under purifying selection (otherwise they would have been lost; also, all have dN/dS < 1), they all most likely affect male fertility to some extent. In case the section on male fertility stays, it will be necessary to provide more details. How many males were crossed for each genotype? In some cases in Fig 2B, it seems to be as low as 3, but it may be data superposition in the graph. Please provide the raw data in the supplementary material.
      2. "Our phylogenomic analyses also highlighted one Drosophila clade- the montium group of species (including D. kikkawai)- which suffered a precipitous loss of at least five SNBP genes that are otherwise conserved in sister and outgroup species (Fig 3). (...) Given our hypothesis that autosomal SNBP genes might be linked to the suppression of meiotic drive (above), we speculated that the loss of these genes in the montium group of Drosophila species may have coincided with reduced genetic conflicts between sex chromosomes in this clade." The montium data is an important part of the paper. I think the authors should test the statistical significance of this pattern.

      Other points:

      1. "The five remaining SNBP genes (Mst33A, CG30056, CG31010, CG34269, and CG42355) remain cytologically uncharacterized [30]." I think it would be interesting if the authors looked at other potentially useful resources: the Vibranovski et al papers, which looked at gene expression in mitotic, meiotic and post-meiotic cells (https://mnlab.uchicago.edu/sppress/index.php), and the papers by several labs on testis single-cell transcriptomic data (Witt et al 2021 PLOS Genetics. 17(8):e1009728 ; Nat Commun. 2021;12: 892). These may provide additional clues on the function of SNBP genes. There is also a recent report on the sperm proteome (doi: https://doi.org/10.1101/2022.02.14.480191).
      2. "Our inability to detect homologs beyond the reported species does not appear to result from their rapid sequence evolution. Indeed, abSENSE analyses [45] support the finding that Prtl99C, Mst77F, Mst33A, Tpl94 and CG42355 were recently acquired in Sophophora within 40 MYA. For example, the probability of a true homolog being undetected for Prtl99C and Mst77F is 0.07 and 0.18 (using E-value=1), respectively (Table S1, Methods)." This should be complemented by synteny analysis.
      3. I found the following sentence unclear: "However, we could only ascribe a sex chromosomal linked location for species if no data was available from either BUSCO genes or females (only males and mixed-sex flies)."
      4. "Using the available assemblies with Illumina-based chromosome assignment, we surprisingly found that most ancestrally Y-linked genes are not linked to autosomes as was previously suggested [by Dupim et al 2018] (Fig 6A)." The new result of X-linkage is exciting, but the sentence is not exact: Dupim et al 2018 made clear that they could only separate X/A from Y-linkage. E.g., the legend of their Fig 3: "Phylogeny and gene content of the Y chromosome in the montium subgroup. "M" means amplification only in males (i.e., Y-linkage), whereas "MF" means amplification in both sexes (autosomal or X-linkage)."
      5. "The most parsimonious explanation for these findings is a single translocation of most of the Y chromosome to the X chromosome via a chromosome fusion in the ancestor of the montium group of species. Afterward, some of these genes relocated back to the Y chromosome in some species (Fig S6; Supplementary text)." Explanations for this pattern of "return to the Y" have been extensively discussed and tested in Dupim et al 2018 (see their section "Why genes seem to return to the Y chromosome after Y incorporations?"). The available evidence strongly suggests that it is not a case of relocation to the Y.
      6. Fig 6B suggests that the authors assembled the "translocated Y" in D. triauraria. However, no direct data or account for this assembly is provided. Please clarify.
      7. "Why would meiotic drive only influence Drosophila, but not mammalian, SNBP evolution? One important distinction may arise from the timing of SNBP transcription. In D. melanogaster, SNBP genes are transcribed before meiosis but translated after meiosis [29, 43, 57]. Thus, SNBP transcripts from a single allele, e.g., X-linked allele, are inherited and translated by all sperm, regardless of which chromosomes they carry. Consequently, they can act as meiotic drivers by causing chromatin dysfunction in sperm without the allele, e.g., Y-bearing sperm." During spermatogenesis, Drosophila haploid cells are actually syncytial, which has interesting consequences for the evolution of male genes (Raices et al, Genome Res. 1115-1122, 2019). This may be relevant for the present paper.

      Significance

      see above

    1. And on this logic—the same logic, by the way, that rightly grounds the conclusion that we should, for example, prefer the term “person living with depression” to “depressed person”—we should not refer to persons in ways that may imply that they are essentially defined by something that they are, in fact, managing.

      I disagree with this. I don't think it is a particularly relevant argument to make, as something like a medical condition or diagnosis does not correlate to something like being gay. Being gay is not something people "suffer from".

    1. A lot of [my students] think people are obese because people can’t put down a fork . . . [In this unit] we do research about things like genetics . . . [to counter that notion].” In addition to the information about the availability of healthy food in their communities, this challenged the idea that obese and/or overweight people are just lazy: they may be responding to larger forces outside their control.

      It's always interesting for a student to challenge their own thinking and question the things that may be thought of as "absolute truths". Growth and understanding, empathy even, come from these realizations.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: #RC-2022-01697

      Corresponding author(s): William Roman; Edgar R. Gomes

      [The “revision plan” should delineate the revisions that authors intend to carry out in response to the points raised by the referees. It also provides the authors with the opportunity to explain their view of the paper and of the referee reports.

      The document is important for the editors of affiliate journals when they make a first decision on the transferred manuscript. It will also be useful to readers of the reprint and help them to obtain a balanced view of the paper.

      If you wish to submit a full revision, please use our "Full Revision" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We would like to thank the reviewers for their careful evaluation of our study. The goal of this work is to demonstrate that fiber type composition can be altered with exercise of in vitro muscle cultures. These findings provide an additional strategy to better mimic muscle in vitro for biological investigation and disease modelling. The reviewers’ comments will strengthen the conclusions of our study.

      In this point-by-point answer, we also include a statement on the feasibility of each comment based on preliminary work we have performed since receiving the reviews. We expect experiments can be achieved within 2 – 3 months.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Henning et al describes a method to induce myofiber subtype specification in vitro based on optogenetics and particle image velocimetry. The work is well performed and the manuscript is clear. The findings might be useful to the muscle community, but there are some issues which should be addressed in order to improve the quality and impact of the manuscript.

      My main concern is that the whole work is performed in murine cells. Although I appreciate that the authors have used primary myoblasts rather than cell lines, I also think that the key advantage of such in vitro platforms is the possibility to "humanise" the experiments as much as possible. In this context, the key findings of this work should be reproduced using human myoblasts. This will significantly enhance the relevance of the work. *

      Point 1.1) We thank the reviewer for his suggestion and have already performed some pilot experiments to “humanize” experiments. We infected hiPSC-derived myotubes (van der Wal et al., 2018) and human immortalized myotubes (Mamchaoui et al., 2011) with AAV9-pACAGW-ChR2-Venus-AAV. After infection, human immortalized myotubes did not express ChR2, not permitting optogenetic training on these cultures. For hiPSC-derived myotubes, the infection rate was very low and insufficient to perform a bulk analysis to evaluate the effect of long term intermittent light stimulation. Moreover, the contractile behavior of hiPSC-derived myotubes expressing ChR2 significantly differed from primary mouse myotubes. They underwent a single and slow contraction when compared to the cyclic contractions observed in mouse myotubes. This suggests that the maturation of the contractile apparatus of 2D hiPSC-derived myotubes is insufficient to perform consistent in vitro training studies.

      As such, we agree with the reviewer that reproducing our key findings with human cells would improve the relevance of this work. However, due to the experimental limitations described above, significant improvements in human myotube maturation in vitro are required to perform such experiments. We will attempt to increase infection efficiency by using another AAV serotype in hiPSC-derived myotubes, but this has a low probability of solving all the technical limitations. Our work is a proof of principle that fiber type composition can be influenced in vitro through contraction stimulation. We expect these findings to be translated to human cultures when the field has discovered the necessary protocols to push human myotube maturation.

      Feasibility: run additional tests but probability of success is low due to technical limitations.

      *Other issues: *

      1) From a methodological perspective, I think some clarifications are needed on the western blots shown in Fig 4K-L, as the pattern of Myh3 and Myh8 in both panels appear very similar. This could easily be ruled out by providing raw data/images. Please accept my apologies if this is simply caused by similar migration patterns in the gels (worth checking).

      Point 1.2) The very similar appearance of both patterns is due to the same molecular weight (220 kDa) of the distinct Myh isoforms. After an initial staining of western blot membranes, primary and secondary antibodies were stripped off and the membrane was subsequently re-probed using a primary and secondary antibody. We incubated stripped membranes with secondary antibodies only and observed no signal, confirming the stripping was efficient. We have updated the representative images of the Western Blot membranes in Figure 4 and included the α-actinin loading controls to which the bands are normalized to account for sarcomerogenesis (Figure 4 K-M).

      Feasibility: Accomplished

      *2) Figure 3K-L (BTX): better imaging should be performed to assess morphology of NMJ (eg. pretzel-shaped as in mature/adult NMJ?) *

      Point 1.3) We agree with the point raised by the reviewer. However, a morphological assessment of the NMJ is difficult in this in vitro system due to our inability to generate mature muscle end plates as seen in in vivo adult NMJs. We will nevertheless perform a more quantitative evaluation of BTX stainings imaged with high spatial resolution by measuring the size and shape of the AChR clusters. The technical pipeline to do this quantitative approach is already established.

      Feasibility: will be accomplished

      *3) Figure 3 N-P: Why did the authors use a relatively complex technique such as smFISH to answer a question more simply addressable with more conventional (and perhaps less operator-dependent) techniques such as quantitative PCR?

      *

      Point 1.4) We agree with the reviewer that the more conventional qPCR technique would highlight similar results to the smFISH quantifications. Due to the heterogeneity of our primary myotube cultures (presence of non-muscle cell types and varying degrees of muscle cell maturation), we opted to monitor AChR expression by conserving a spatial dimension. This allows us to observe ChrnE and ChrnG expression in mature muscle cells selected to perform the contraction analysis. Nevertheless, performing a bulk RNA expression analysis would be informative to show a significant increase in AChR expression across the culture. This point will be fully addressed by qPCR assays of ChrnE and ChrnG.

      Feasibility: will be accomplished

      *Reviewer #1 (Significance (Required)):

      Nature and significance: as mentioned in the previous section, the work can be very significant if expanded to human myoblasts/myotubes, which can have different slow/fast myosin expression pattern. The work is clearly methodological/descriptive, so showing an application of this technique using diseased/mutant cells may increase its relevance even more (but I do not believe it is a key barrier to publication). *

      We thank the reviewer for his comments as the “other issues” raised will significantly improve the manuscript and will all be tackled. With regards to using human myotubes, we will attempt a few more strategies to translate our findings to human cultures, but our preliminary data suggests that many technical barriers need to be overcome to perform such experiments. Nevertheless, it is our opinion that the main contribution of this manuscript is to show that fiber switching can be achieved in vitro and that this will be routinely used in the next generation of human in vitro muscle systems.


      *Comparison with other methods: Similar methods have been published but not with this level of resolution.

      Expertise: muscle disease and regeneration, in vitro and in vivo models.*

      *Reviewer #2 (Evidence, reproducibility and clarity (Required)): *

      * The work presented shows that muscle stem cells isolated from 5-day-old mice can be transduced with a DNA coding for a Channelrhodopsin2-Venus which will allow the muscle cell to be excited by a light beam (475nm) and to induce the contraction of myotubes. The authors measure the speed of contraction, relaxation and fatigability of such cells as a function of a more or less long excitation time. In particular, they show that myotubes in culture, excited at a frequency of 5 Hz, 8 hours per day for 7 days are larger than unstimulated myotubes and are more resistant to fatigue. Surprisingly, they show that myotubes stimulated at the low frequency of 5Hz express the neonatal Myosin heavy chain more than the slow Myh whose expression is known in adult muscle to be specifically strong in muscle fibers stimulated at low frequency. As the authors do not apply a high stimulation frequency (100Hz) to their culture, it is difficult to conclude whether the stimulation frequency applied in the study induces a specific phenotypic specialization of the myofiber, or a more general role. In this respect, the size of the myotubes obtained after training seems to be increased, showing a hypertrophic effect on the cultured myotubes. This study does not allow us to conclude, beyond the expression of the Myh8 gene, on the “gain” of the fast-twitch specialization of the myofiber by repeated stimulation over several days. A complementary study would certainly provide elements to better understand the role of muscle fiber stimulation, apart from the trophic contribution provided in vivo by the motoneuron. If the study is well conducted, some points are nevertheless important to address before publication.*

      *Reviewer #2 (Significance (Required)): *

      * - Figures 4F/G are difficult to understand: the Myh7 signal seems much higher in trained myonuclei (F), but the histogram shows the opposite (G).*

      Point 2.1) We apologize for the confusion. The apparent higher Myh7 signal in trained cells in Figure 4F is due to background noise in the image. When mRNA is expressed, the smFISH probes are visible as small round dots. For clarity, we updated the representative images for the smFISH probes and highlighted the smFISH dots with arrows. We also adapted the y-axis of each graph to better represent the analysis of mRNA counts per myonucleus.

      Feasibility: Accomplished

      *- Figures 4L, the western blot shows the same increase in Myh3 and Myh8 at day 4, while the graph shows an increase at d4 only in Myh8, why? *

      Point 2.2) We have chosen another western blot to better reflect the quantification. It is important to note that we have normalized the band intensity to α-actinin instead of a housekeeping gene to account for changes in sarcomerogenesis over the lifetime of the cultures. As such, although we observe an increase in Myh3 intensity, it is counterbalanced by an increase in α-actinin expression. We have now added the α-actinin bands.
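
      The effect of this normalization can be illustrated with a minimal sketch. The band intensities below are hypothetical values chosen only for illustration (they are not measurements from the blots), and the function name is an assumption introduced for this example. It shows how a raw band that doubles can yield no change once divided by a loading control that also doubles.

```python
def normalized_intensity(target_band: float, loading_band: float) -> float:
    """Return the target band intensity relative to the loading control."""
    if loading_band <= 0:
        raise ValueError("loading control intensity must be positive")
    return target_band / loading_band


# Hypothetical densitometry values (arbitrary units), for illustration only.
myh3 = {"d0": 100.0, "d4": 200.0}
alpha_actinin = {"d0": 50.0, "d4": 100.0}

# Normalize each time point to its own loading control.
ratios = {day: normalized_intensity(myh3[day], alpha_actinin[day]) for day in myh3}

# Fold change of the normalized signal between day 0 and day 4.
fold_change = ratios["d4"] / ratios["d0"]
print(ratios, fold_change)
```

      With these illustrative numbers the raw Myh3 band doubles between the two time points, yet the normalized fold change is 1, which is the kind of discrepancy between blot and graph described in this point.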

      - For immunocytochemistry against fMyh (Fig4 H, I) as well as for Western blots (Fig 4M, N), the authors have to provide arguments regarding the specificity of the antibodies used: some fMyh-specific antibodies recognize Myh 3, 8, 1, 2, and 4, some only Myh 8, 1, 2, and 4, so it is quite difficult to conclude, from the experiments using the sc-32732 antibody (clone F59), which Myh are actually recognized in Western blot or immunocytochemistry.

      Point 2.3) According to the manufacturer, the sc-32732 antibody is specific for fast Myh (Myh1, 2, 4 and 6). Nevertheless, we will ensure the specificity of the sc-32732 antibody against fast Myosins by staining neonatal and adult TA/EDL muscle sections with anti-Myh3 (embryonic), anti-Myh8 (neonatal) and anti-fMyh antibodies.

      Feasibility: will be accomplished

      While 10Hz stimulation is known in vivo to increase the slow program and Myh7 expression in adult muscles, the authors show that ex vivo this is not the case with primary myotubes, with Myh7 protein level not being upregulated in the 7 day stimulation paradigm, while on the contrary Myh8 expression is upregulated. I think it would be important to quantify the mRNA of each of the Myh genes to be sure that there is no problem with the antibodies, which could recognize several Myh proteins, in the absence of a resolving acrylamide gel allowing visualization and relative level of each isoform according to its migration. Nevertheless, this is an interesting observation that could be related to the early phases of muscle contraction in vivo. Indeed, it has been shown in rats that animals in early postnatal development are essentially sedentary and that their muscles (Sol and EDL) are stimulated by short intermittent bursts similar to 10Hz (doi: 10.1111/j.0953-816X.2004.03418.x) during the first 2-3 weeks of life. This should be compatible with Myh8 expression. In this respect, it would be relevant to verify that the paradigm presented leads to myotubes with a "neonatal" phenotype. Quantification of the expression level of genes specifically expressed during the neonatal period, compared with those expressed in adult slow or fast myofibers, would enhance the conclusions drawn by the authors.

      Point 2.4) The reviewer raises an important technical limitation of observing Myh proteins to identify fiber types due to the cross-reactivity of antibodies. Despite our best efforts to select the appropriate antibodies, we agree that investigating mRNA expression of individual Myh isoforms would strengthen the conclusion of our study. We will design specific primers and perform qPCR for distinct Myh isoforms on untrained and trained cultures.

      With regards to the “neonatal” phenotype of these in vitro cultures, this does indeed seem to be the case as the cultures transition from embryonic and neonatal myosins to adult myosins during the lifetime of the cultures.

      Feasibility: will be accomplished

      *Should we also be cautious about bulk analysis since, as shown in Figure S1, not all myotubes express ChR2? *

      Point 2.5) Although 10% of myotubes do not express ChR2, we believe that 90% of infected myotubes is sufficient for bulk analysis. We nevertheless combine in our study bulk analysis with single cell assays such as smFISH and immunofluorescence, which are in line with the bulk analyses.

      Feasibility: Accomplished
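
      As a rough illustration of the reasoning in Point 2.5, the sketch below estimates how much a bulk readout would be diluted by non-expressing myotubes. It rests on simplifying assumptions that are not tested here: ChR2-negative myotubes are taken to be unchanged by light stimulation, and every myotube is taken to contribute equally to the bulk signal. The function name and the fold change of 2 are hypothetical.

```python
def expected_bulk_fold_change(fraction_responding: float, true_fold_change: float) -> float:
    """Bulk fold change when only a fraction of cells responds.

    Assumes non-responding cells stay at baseline (fold change of 1)
    and that all cells contribute equally to the bulk measurement.
    """
    if not 0.0 <= fraction_responding <= 1.0:
        raise ValueError("fraction_responding must be between 0 and 1")
    baseline = 1.0
    return fraction_responding * true_fold_change + (1.0 - fraction_responding) * baseline


# Hypothetical example: 90% of myotubes express ChR2 and a marker doubles in those cells.
bulk = expected_bulk_fold_change(0.9, 2.0)
print(round(bulk, 2))
```

      Under these assumptions, a two-fold change in 90% of the myotubes would appear as a bulk fold change of about 1.9, so the 10% of non-expressing myotubes would attenuate the measured effect only slightly.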

      May the authors correlate the ex vivo neonatal phenotype observed with the neonatal muscles they used to prepare myogenic stem cells?

      Point 2.6) We understand from this that the reviewer would like us to check the expression of distinct Myh isoforms in our in vitro system and compare it to neonatal muscle. We will perform Myh staining of muscle sections from 6-day old mouse pups (time of myogenic stem cell isolation) and compare the expression of Myosin heavy chains with what we observe in our in vitro cultures.

      Feasibility: will be accomplished

      Overall, we will address all the points of the reviewer. Those ensuring the specificity of antibodies used are particularly relevant. With regards to the comparison between our in vitro cultures with neonatal muscle, we believe this will help contextualize our findings with the literature.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *Summary: *

      *In this work, the authors propose an in vitro model describing a strategy to alter fiber type composition of myotubes with a long-term, intermittent mechanical training. The authors present a model of myotubes transfected with an adenovirus, which makes them photosensitive; in this way, fibers contraction can be induced upon stimulation with blue LEDs. *

      *Even though ChR2 expressing myotubes have previously been used by other groups (Asano T, Ishizuka T, Yawo H. Optically controlled contraction of photosensitive skeletal muscle cells. Biotechnol Bioeng. 2012 Jan;109(1):199-204), no one has ever used it in the way proposed by the authors. For this reason, this work opens new perspectives on the possible use for clinical and therapeutic purposes for this in vitro muscle system. *

      *Major comments: *

      *I believe that the authors have presented their results, conclusion and methods in a fair and clear way, so that the experiment could also be reproduced. *

      *However, I think there are some adjustments that could be done in order to improve and strengthen the quality of this work: *

      *- The authors have analysed the expression of different myosin heavy chain isoforms, both regarding the slow and fast twitch fibers. Though, I think it would be interesting to investigate also the expression of Myh4, which is mainly expressed in type IIB fast twitch fibers; *

      Point 3.1) We agree with the reviewer’s comment. We will add the analysis for Myh 4 (western blots and qPCR) to our manuscript.

      Feasibility: will be accomplished

      The authors have observed a switch in the fiber type upon prolonged intermittent stimulation with blue LEDs, which translates into a higher number of type II fibers. It is known that exercise helps rescuing the loss of type II fibers, which is typical of age-related physiological processes, such as sarcopenia (Brunner F, Schmid A, Sheikhzadeh A, Nordin M, Yoon J, Frankel V. Effects of aging on Type II muscle fibers: a systematic review of the literature. J Aging Phys Act. 2007 Jul;15(3):336-48). However, I believe that providing a deeper analysis of the metabolism of the type II fibers (i.e. oxidative or glycolytic) could be helpful in order to have a clearer view on the specific subset of fibers that are generated with the given experimental conditions;

      Point 3.2) We agree with the reviewer's suggestion that an additional metabolic analysis would strengthen our observation. We propose to perform lactate measurements in cell lysate and supernatant to monitor a switch from oxidative to glycolytic metabolism. Specific inhibitors of the glycolytic pathway (2-DG, UK5099, Rotenone and Antimycin A) will be used as a control to prevent trained cells from shifting towards a fast fiber type.

      Alternatively, we will assess the protein expression levels of key metabolic proteins involved in oxidative phosphorylation and in pyruvate and lactate production (e.g. OxPhos, …). All these techniques are routinely performed in an adjacent laboratory and we foresee no technical limitations.

      Feasibility: will be accomplished

      *Minor comments: *

      *The text and the figures are clear and well written, and help to explain better the experimental setup and procedures. Still, I would suggest some minor adjustments: *

      - I would suggest providing more information on the pH used for the experiments, since it plays a pivotal role in regulating myosin ATPase activity and, thus, muscular contractility. This would improve the replicability of your experiment.

      We thank the reviewer for this comment. We will provide information regarding the pH and add it to the Materials and Methods section.

      Feasibility: will be accomplished

      The caption of Figure 1 is missing a description of panel E, even if it has been addressed in the text.

      Point 3.3.) We apologize for this mistake. We added the missing description of Fig. 1E.

      Feasibility: Accomplished

      *Reviewer #3 (Significance (Required)): *

      *This model opens new perspectives on in vitro muscle systems for the study of pathologies. The authors have been able to assess that myofibers contraction is able to induce a shift towards type II fibers, reproducing in vitro what is also known in vivo. For this reason, I believe that this model could be useful for further clinical approaches. It is important, though, to keep in mind that muscular disorders are not all characterized by a loss of type II fibers; for instance, myotonic dystrophies type I and type 2 exhibit similar phenotypes, even if different types of muscle fibers are affected. *

      *For this reason, it would be interesting to investigate the versatility of this model in terms of giving rise to different fiber types. *

      Point 3.4.) We added a sentence in the introduction that highlights an example of muscle disorders in which slow muscle fibers are predominately affected. Concerning the versatility of the model, we will add a paragraph to the discussion elaborating on how different stimulus frequency and durations could influence the specialization of fiber types.

      Feasibility: Accomplished

      Overall, we will address all major and minor comments from the reviewer. We have identified the experiments required for the metabolic analysis and agree that it will bolster our findings.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      We have already carried out the following changes in the manuscript, which were proposed by the reviewers:

      Point 1.2: pattern of Myh3 and Myh8 in both panels appear very similar - We updated the representative images of Myh3 and Myh8 in Figure 4 K-N and included the loading controls Myh8 and fMyh images in Figure 4K-N and in supplementary Figure 4 A, B.

      Point 2.1: Figures 4F/G: representative images of Myh7 smFISH probe and the graph showing opposite trends – We have updated the representative images of Figure 4F and we have changed the x-axis of the graph in Figure 4E and G.

      Point 2.5: caution around bulk analysis – we consider that, based on the high percentage of contracting cells in response to blue light (~90%), this concern is not warranted.

      Point 3.3: caption of Figure 1 is missing a description of panel E – We have added the missing description to the manuscript (Figure 1E).

      Point 3.4: muscular disorders are not all characterized by a loss of type II fibers – we have added an example of a muscle disorder, in which slow fibers are predominantly affected, to the introduction (line 42-44) of the manuscript.

      investigate the versatility of this model in terms of giving rise to different fiber types – we added a paragraph to the discussion elaborating on how different stimulus frequency can lead to different fiber types (line 264-275).

      3. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      Point 1.1: Reproducing our key findings with human cells – we ran pilot experiments on immortalized human cell lines and human iPSC-derived myotubes but were not able to mature these cells sufficiently nor infect them to allow long-term in vitro training. Increased maturation of myotubes derived from hiPSCs is an endeavor currently undertaken by many laboratories. Although we will attempt a few more trials, we believe the technical limitations are too important to address this point.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The work presented shows that muscle stem cells isolated from 5-day-old mice can be transduced with a DNA coding for a Channelrhodopsin2-Venus which will allow the muscle cell to be excited by a light beam (475nm) and to induce the contraction of myotubes. The authors measure the speed of contraction, relaxation and fatigability of such cells as a function of a more or less long excitation time. In particular, they show that myotubes in culture, excited at a frequency of 5 Hz, 8 hours per day for 7 days are larger than unstimulated myotubes and are more resistant to fatigue. Surprisingly, they show that myotubes stimulated at the low frequency of 5Hz express the neonatal Myosin heavy chain more than the slow Myh whose expression is known in adult muscle to be specifically strong in muscle fibers stimulated at low frequency. As the authors do not apply a high stimulation frequency (100Hz) to their culture, it is difficult to conclude whether the stimulation frequency applied in the study induces a specific phenotypic specialization of the myofiber, or a more general role. In this respect, the size of the myotubes obtained after training seems to be increased, showing a hypertrophic effect on the cultured myotubes. This study does not allow us to conclude, beyond the expression of the Myh8 gene, on the "gain" of the fast-twitch specialization of the myofiber by repeated stimulation over several days. A complementary study would certainly provide elements to better understand the role of muscle fiber stimulation, apart from the trophic contribution provided in vivo by the motoneuron.

      If the study is well conducted, some points are nevertheless important to address before publication.

      Significance

      • Figures 4F/G are difficult to understand: the Myh7 signal seems much higher in trained myonuclei (F), but the histogram shows the opposite (G).
      • Figure 4L: the western blot shows the same increase in Myh3 and Myh8 at day 4, while the graph shows an increase at day 4 only in Myh8. Why?
      • For immunocytochemistry against fMyh (Fig 4H, I) as well as for Western blots (Fig 4M, N), the authors have to provide arguments regarding the specificity of the antibodies used: some fMyh-specific antibodies recognize Myh 3, 8, 1, 2, and 4, and some only Myh 8, 1, 2, and 4. It is therefore quite difficult to conclude, from the experiments using the sc-32732 antibody (clone F59), which Myh isoforms are actually recognized in Western blot or immunocytochemistry.
      • While 10 Hz stimulation is known in vivo to increase the slow program and Myh7 expression in adult muscles, the authors show that this is not the case ex vivo with primary myotubes: the Myh7 protein level is not upregulated in the 7-day stimulation paradigm, whereas Myh8 expression is. I think it would be important to quantify the mRNA of each of the Myh genes to be sure that there is no problem with the antibodies, which could recognize several Myh proteins, in the absence of a resolving acrylamide gel allowing visualization of each isoform and its relative level according to its migration. Nevertheless, this is an interesting observation that could be related to the early phases of muscle contraction in vivo. Indeed, it has been shown in rats that, during early postnatal development, animals are essentially sedentary and their muscles (Sol and EDL) are stimulated by short intermittent bursts similar to 10 Hz (doi: 10.1111/j.0953-816X.2004.03418.x) during the first 2-3 weeks of life. This should be compatible with Myh8 expression. In this light, it would be relevant to verify that the paradigm presented leads to myotubes with a "neonatal" phenotype. Quantification of the expression level of genes specifically expressed during the neonatal period, compared with those expressed in adult slow or fast myofibers, would enhance the conclusions drawn by the authors.
      • Should we also be cautious about bulk analysis since, as shown in Figure S1, not all myotubes express ChR2?
      • Can the authors correlate the ex vivo neonatal phenotype observed with the neonatal muscles they used to prepare myogenic stem cells?
    1. Author Response

      Reviewer #1 (Public Review):

      The authors are trying to determine how time is valued by humans relative to energy expenditure during non-steady-state walking - this paper proposes a new cost function in an optimal control framework to predict features of walking bouts that start and stop at rest. This paper's innovation is the addition of a term proportional to the duration of the walking bout in addition to the conventional energetic term. Simulations are used to predict how this additional term affects optimal trajectories, and human subjects experiments are conducted to compare with simulation predictions.

      I think the paper's key strengths are its simulation and experimental studies, which I regard as cleverly-conceived and well-executed. I think the paper's key weakness is the connection between these two studies, which I regard as tenuous for reasons I will now discuss in detail.

      The Title asserts that "humans dynamically optimize walking speed to save energy and time". Directly substantiating this claim would require independently manipulating the (purported) energy and time cost of walking for human subjects, but these manipulations are not undertaken in the present study. What the Results actually report are two findings:

      1. (simulation) minimizing a linear combination of energy and time in an optimal control problem involving an inverted-pendulum model of walking bouts that (i) start and stop at rest and (ii) walk at constant speed yields a gently-rounded speed-vs-time profile (Fig 2A);

      2. (experiment) human subject walking bouts that started and stopped at rest had self-similar speed-vs-time profiles at several bout lengths after normalizing by the average duration and peak speed of each subject's bouts (Fig 4B).

      If the paper established a strong connection between (1.) and (2.), e.g. if speed-vs-time trajectories from the simulation predicted experimental results significantly better than other plausible models (such as the 'steady min-COT' and 'steady accel' models whose trajectories are shown in Fig 2A), this finding could be regarded as providing indirect evidence in support of the claim in the paper's Title. Personally, I would regard this reasoning as rather weak evidence - it would be more accurate to assert 'brief human walking bouts look like trajectories of an inverted-pendulum model that minimize a linear combination of energy and time' (of course this phrasing is too wordy to serve as a replacement Title -- I am just trying to convey what assertion I think can be directly substantiated by the evidence in the paper). But unfortunately, the connection between (1.) and (2.) is only discussed qualitatively, and the other plausible models introduced in the Results are not revisited in the Discussion. To my naive eye, the representative 'steady min-COT' trace in Fig 2A seems like a real contender with the 'Energy-Time' trace for explaining the experimental results in Fig 4, but this candidate is rejected at the end of the third-to-last paragraph in the 'Model Predictions' subsection of Results based on the vague rationale that is never revisited.

      We have addressed most of this comment above, but respond here regarding Fig. 4. The argument against steady min-COT should also point out the peak speed. The Results have been revised thus: “In contrast to the min-COT hypothesis, the human peak speeds increased with distance, many well below the min-COT speed of about 1.25 m/s. The human speed trajectories did not resemble the trapezoidal profiles of the steady min-COT hypothesis for all distances, nor the triangular profiles of steady acceleration.”

      An additional limitation of the approach not discussed in the manuscript is that a fixed step length was prescribed in the simulations. The 'Optimal control formulation' subsection in the Methods summarizes the results of a sensitivity analysis conducted by varying the fixed step length, but all results reported here impose a constant-step-length constraint on the optimal control problem. Although this is a reasonable modeling simplification for steady-state walking, it is less well-motivated for the walking bouts considered here that start and stop at rest. For instance, the representative trial from a human subject in Figure 8 clearly shows initiation and termination steps that differ in length from the intermediate steps (visually discernable via the slope of the dashed line interpolating the black dots). Presumably different trajectories would be produced by the model if the constant-step-length constraint were removed. It is unclear whether this change would significantly alter predictions from either the 'Energy-Time' or 'steady min-COT' model candidates, and I imagine that this change would entail substantial work that may be out of scope for the present paper, but I think it is important to discuss this limitation.

      This is addressed elsewhere (Essential Revisions 2), but we explain more here. One of the parameter studies included step length increasing with speed according to the human preferred relationship. This is included in Fig. 3, and so we concluded that variable step lengths are not critical to the speed trajectories. A related assumption is that the energetic cost of modulating step length/frequency is small compared to the step-to-step transition cost. We believe that humans expend substantial energy for both costs, but that the overall cost of walking is still dominated by step-to-step transitions.

      With my concerns about the paper's framing and through-line noted as above, I want to emphasize that I regard the computational and empirical work reported here to be top-notch and potentially influential. In particular, the experimental study's use of inexpensive wearable sensors (as opposed to more conventional camera-based motion capture) is an excellent demonstration of efficient study design that other researchers may find instructive. To maximize potential impact, I encourage the authors to release their data, simulations, and details about their experimental apparatus (the first two I regard as essential for reproducibility - the third a selfless act of service to the scientific community).

      I think the most important point to emphasize is that the bulk of prior work on human walking has focused on steady-state movement - not because of the real-world relevance (since one study reports 50% of walking bouts in daily life are < 16 steps as summarized in Fig 1B), but rather because steady walking is a convenient behavior to study in the laboratory. Significantly, this paper advances both our theoretical and empirical understanding of the characteristics of non-steady-state walking.

      It is also significant to note the relationship between this study, where time was incorporated as an additive term in the cost of walking, with previous studies that incorporated time in a multiplicative discount in the cost of eye and arm movements. There is an emerging consensus that time plays a key role in the generation of movement across the body - future studies will discern whether and when additive or multiplicative effects dominate.

      We have acknowledged this in a brief sentence: “Indeed, we have found a similar valuation of time to explain how reaching durations and speed trajectories vary with reaching distance (Wong et al., 2021).” As an aside, in that reference we measured metabolic cost of cyclic arm reaching, combined it with a linear time cost, and predicted reaching durations vs. distance and bell-shaped hand speed trajectories. Others (Shadmehr et al. Curr Biol. 2016) have proposed multiplicative (hyperbolic) temporal discounting to explain durations, but the cost formulas are not dynamical, and cannot produce trajectories. We agree with reviewer’s point, but we think the evidence for hyperbolic discounting is not strong. Linear time costs are simpler and work at least as well. This is of great interest to us, but we didn’t discuss beyond the brief mention above, because we fear it is too far afield.

      Reviewer #2 (Public Review):

      This paper provides a novel approach to quantifying the tradeoff between energetic optimality during walking and the valuation of time to travel a given distance. Specifically, the authors investigated the relationships between walking speed trajectories, distance traveled, and the valuation of (completion) time. Time has been proposed as a potential factor influencing movement speed, but less is understood about how individuals balance energetic optimality and time constraints during walking. The authors used a simple, sagittal-plane walking model to test competing hypotheses about how individuals optimize gait speed from gait initiation to gait termination. Their approach extends literature in the space by identifying optimal gaits for shorter, partially non-steady speed walking bouts.

      The authors successfully evaluated three competing walking objectives (constant acceleration, minimum cost of transport at steady speed, and the energy-time objective), showing that the energy-time objective best matched experimental data in able-bodied adults. Although other candidate objectives may exist, the paper's findings provide a likely-generalizable explanation of how able-bodied humans select movement strategies that encompass studies of steady-speed walking.

      Overall, this paper provides a foundation for future studies testing the validity of the energy-time hypothesis for human gait speed selection in able-bodied and patient populations. Extensions of this work to patient populations may explain differences in walking speed during clinical assessments and provide insight into how individual differences in time valuation impact performance on assessments. For example, understanding whether physical capacity or time valuation (or something comparable) better explains individual differences in walking speed may suggest distinct approaches for improving walking speed.

      Strengths:

      The authors presented a compelling rationale for the tradeoffs between energetic optimality and time and their results provide strong support for a majority of their conclusions. In particular, significant reductions in the variance of experimental speed trajectories provide good support for the scaling of speeds across individuals and the plausibility of the energy-time hypothesis. Comparison to theoretical (model-based) reductions across different time valuation (cT) parameters would further enhance confidence in the practical significance of the variance reductions. Further, while additional work is needed to determine the range of "normal" valuations of time, the authors present experimental ranges that appear reasonable and are well explained. The computational and analytical methods are rigorous and are supported by the literature. Overall, the paper's conclusions are consistent with experimental and computational results.

      The introduction of a model-based analytical approach to quantify the effects of time valuation of walking could generalize to test other cost functions, populations, or locomotion modes. Further, models of varying complexity could be implemented to test more individualized estimates of metabolic cost, ranging from 3D dynamic walking models (Faraji et al., Scientific Reports, 2018) or physiologically-detailed models (Falisse et al., Journal of The Royal Society Interface. 2019). The relatively simple set of analyses used in this paper is consistent with prior literature and should generalize across applications and populations.

      The authors justified simplifications in the analysis and addressed major limitations of the paper, such as using a fixed step length in model predictions, using a 2D model, and basing energy estimates on the mechanical work of a simple model. It is unlikely that the paper's conclusions would change given additional model complexity. For example, a 3D walking model would need to control frontal plane stability. However, in able-bodied adults, valuation of frontal-plane stability during normal walking would not likely alter the overall shape of the predicted speed profiles.

      Weaknesses:

      The primary weakness of this work is that alternative objectives may provide similar speed profiles and thus be plausible objectives for human movement. For example, the authors tested an objective minimizing the steady-speed cost of transport. This cost function is consistent with the literature, but (as predicted) unlikely to explain acceleration and deceleration during gait. An objective more comparable to the energy-time hypothesis would be to minimize the net energy cost over the entire bout, including accelerations and decelerations. This may produce results similar to the energy-time hypothesis. However, a more complex model that incorporates non-mechanical costs (e.g., cost of body weight support) may be needed to test such objectives. Therefore, the energy-time hypothesis should be considered in the context of a simple model that may be incapable of testing certain alternative hypotheses.

      We have addressed some of this comment in Essential Revisions 4.

      We are unsure what is meant by “net energy over the entire bout, including accelerations and decelerations.” Our hypothesis uses total (gross) energy over the entire bout, and already includes accelerations and decelerations. If “net” refers to the customary definition of metabolic energy minus resting, then it differs from our gross cost (Fig. 6A) only in the amount of constant offset, namely resting cost. Removing the offset is equivalent to a decrease in C_T. As shown in Fig. 3, this would reduce peak speed magnitudes but not change the shape of the speed, peak speed, and duration patterns. There is also another interpretation where the cost of walking includes only net energy, and the cost of time includes the resting metabolic rate (Fig. 6C). This interpretation yields the same predictions; the only difference is whether resting rate is treated as an energy or a time cost. We have not made further changes, because we are unsure what the reviewer meant. The difference between net and total is at most one of degree, not of qualitatively different behavior.
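
      The equivalence between removing the resting offset and lowering C_T can be written out as a short worked equation. This is an illustrative sketch rather than a formula taken from the manuscript: the symbols E_gross, E_net, Ė_rest (resting metabolic rate, assumed constant over the bout) and T (bout duration) are introduced here as assumptions, and only C_T comes from the response itself.

```latex
\begin{align}
J_{\text{gross}} &= E_{\text{gross}} + C_T\,T, \\
E_{\text{net}}   &= E_{\text{gross}} - \dot{E}_{\text{rest}}\,T, \\
J_{\text{net}}   &= E_{\text{net}} + C_T\,T
                  = E_{\text{gross}} + \bigl(C_T - \dot{E}_{\text{rest}}\bigr)\,T .
\end{align}
```

      Under these assumptions, replacing gross with net energy keeps the form of the objective and only lowers the effective time valuation from C_T to C_T minus the resting rate. Conversely, writing the gross objective as E_net plus (C_T + Ė_rest) T moves the resting rate into the time term without changing the objective, which is consistent with the statement that the two interpretations yield the same predictions.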

      We do not address the proposed “cost of body weight support” because we are unsure of the definition. There is a hypothesis by Kram & Taylor (1990) that defines a metabolic cost rate proportional to body weight divided by ground contact time. It is unclear if this is what the reviewer is referring to, so we did not include it in the manuscript. However, if this is what the reviewer means, we do not consider the Kram & Taylor (“K&T”) cost to be a viable hypothesis for computational models. It is a correlation observed from data, which is inadequate as a model, for several reasons. First, in a model optimization, it leads to absurd predictions, because metabolic cost could then be reduced simply by increasing stance (contact) time. A model could do so simply by walking with very long double support phases, or running with a very brief aerial phase, both of which people clearly do not do. In walking, extended double support durations result in much higher metabolic cost (Gordon et al., APMR 2009). Models must operate quite literally on whatever objective they are given, and here, a literal interpretation of K&T makes absurd predictions.
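
      The argument that a literal K&T objective rewards ever-longer contact times can be illustrated with a minimal sketch. It assumes only the proportionality described above (cost rate proportional to body weight divided by contact time); the function name, the cost coefficient, the body weight, and the candidate contact times are hypothetical values chosen for illustration, not data from any study.

```python
def kram_taylor_rate(body_weight_n, contact_time_s, cost_coefficient=1.0):
    """Cost rate proportional to body weight / ground contact time.

    cost_coefficient is a hypothetical proportionality constant.
    """
    if contact_time_s <= 0:
        raise ValueError("contact time must be positive")
    return cost_coefficient * body_weight_n / contact_time_s


# Hypothetical body weight (newtons) and candidate stance durations (seconds).
body_weight_n = 700.0
contact_times_s = [0.2, 0.4, 0.6, 0.8, 1.0]

rates = [kram_taylor_rate(body_weight_n, t) for t in contact_times_s]

# An optimizer given only this cost picks the longest contact time offered.
cheapest_contact_time = min(zip(rates, contact_times_s))[1]
```

      Because the rate falls monotonically as contact time grows, an optimizer with no other constraint always selects the longest stance available, which is the behavior the response describes as absurd.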

      Another issue with the K&T cost is that it is not mechanistic. A mechanistic model is concerned with the forces and work performed by an actuator such as muscle. Muscles experience forces far greater than body weight, not captured by the K&T cost. Of course, overall cost for animal locomotion is roughly proportional to body weight, but what a model needs is a cost associated with its control inputs, e.g. actuator forces.

      We have also examined the K&T hypothesis in previous publications. In Schroeder & Kuo (Plos Comp Biol 2021), we used a simple model of running that minimizes an energetic cost dominated by mechanical work. Even though the model has no cost similar to K&T, its predicted metabolic cost is correlated with the K&T cost. Correlation does not imply causation, and in this model the true source of the cost is known.

      We have also examined the K&T hypothesis in experimental data. In Riddick & Kuo (Sci Rep 2022), we examined human data and found that there are many variables that correlate quite well with metabolic cost, including the K&T correlate. We use human data to show how mechanical work could explain metabolic cost, and even if it does, the K&T cost appears as a correlate. In our interpretation, both model and data that experience an energetic cost proportional to mechanical work may have a number of variables correlated to energy cost. Those correlates need not have any causal influence.

      There are, of course, many similar correlates that could be or have been proposed to explain the metabolic cost of running. Most such correlates are not operational enough to work in a model, and it is also difficult to predict what a reader might consider plausible, even if we do not.

      We agree with this statement: “the energy-time hypothesis should be considered in the context of a simple model that may be incapable of testing certain alternative hypotheses.” In fact, in Discussion of limitations we listed other potential factors (e.g. forced leg motion, stability, 3D motion), and stated “We did not explore more complex models here, but would expect similar predictions to result if similar, pendulum-like principles of work and energetic cost apply.” We had also cited other models that include such factors and are compatible with the step-to-step transition concept. Finally, we already stated, “the Energy-Time hypothesis should be regarded as a subset of the many factors that should govern human actions, rendered here in a simple but quantitative form.” We believe this is already aligned with reviewer’s comment.

      An experimental design involving an intervention to perturb the valuation of time would provide stronger support for time being a critical factor influencing gait speed trajectories. The authors noted this limitation as an area of future work.

      While the results are compelling, the limited sample size and description of participants limit the obvious generalizability of the results. Older adults tend to have higher metabolic costs of walking than younger adults, which may alter the predicted time-energy relationships (Mian OS, et al., Acta physiologica. 2006). As noted in the introduction, differences in walking speeds have been observed in different living environments. General information on where participants lived (city, small town, etc...) may provide readers with insight into the generalizability of the paper's conclusions. Additionally, the experimental results figures show group-level trends, but individual-specific trends and the existence of exceptional cases are unclear.

      We wish to defend the “limited sample size.” The present sample size was (in our opinion) sufficient to test the hypothesis, and we have reported confidence intervals and other statistics where appropriate. (As always, it is up to the individual reader to decide whether they are convinced or not.) It is true that the data may be insufficient for other purposes, but we cannot anticipate or address all other purposes.

      We appreciate the relevant connection to aging. We have added to Discussion, “We do not know whether that family [of trajectories] also applies to older adults, who prefer slower steady speeds and expend more energy to walk the same speed (Malatesta, 2003). Perhaps an age-related valuation of time might explain some of the differences in speed.”

      We agree about the participants, and have added “Subjects were recruited from the community surrounding the University of Calgary; the city has a moderately affluent population of about 1.4 M, with a developed Western culture.”

      No specific reviewer recommendation was made about individual-specific trends, but there are several indicators already included in the manuscript. First, all trials from all subjects are shown in Fig. 4A. Any truly exceptional cases should be visible as substantial deviations from the group. Second, the normalization by peak speed in Fig. 4B shows how individuals tend to be fairly consistent in their preferred speeds, in that shorter and longer bouts of an individual are consistent with each other, even if some walk faster than others. Third, this observation is analyzed more quantitatively by the reduction in standard deviations with normalization (Results). Fourth, we will provide a data repository with all the data, to allow readers to inspect individuals more carefully (Data availability statement).
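
      The effect of normalization on between-subject spread can be illustrated with a small sketch. The peak speeds below are invented for illustration and are not the study's data; the normalization shown (dividing each subject's peak speeds by that subject's own mean) is a simplified stand-in for the normalization described in the manuscript, and the helper names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical peak speeds (m/s) for a short and a long bout per subject.
peak_speeds = {
    "subject_a": [1.00, 1.40],
    "subject_b": [0.80, 1.12],
}


def normalize_by_subject_mean(peaks_by_subject):
    """Divide each subject's peak speeds by that subject's mean peak speed."""
    normalized = {}
    for subject, peaks in peaks_by_subject.items():
        subject_mean = mean(peaks)
        normalized[subject] = [p / subject_mean for p in peaks]
    return normalized


def relative_spread_per_bout(peaks_by_subject):
    """Coefficient of variation across subjects, one value per bout length."""
    columns = zip(*peaks_by_subject.values())
    return [pstdev(col) / mean(col) for col in columns]


raw_spread = relative_spread_per_bout(peak_speeds)
normalized_spread = relative_spread_per_bout(
    normalize_by_subject_mean(peak_speeds)
)
```

      In this constructed example the two subjects differ only by an overall speed scale, so the between-subject spread nearly vanishes after normalization. Real data would show a smaller reduction, and any individual whose spread did not shrink would stand out as a candidate exceptional case.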

      The authors' interpretation of clinical utility is vague and should be interpreted with caution. A simple pendulum-based walking model is unlikely to generalize to patient populations, whose gait energetics may involve greater positive and negative mechanical work (Farris et al., 2015; Holt et al., 2000). Additionally, the proposed analytical framework based on mechanical work as a proxy for the metabolic cost may not generalize to patient populations who have heterogeneous musculotendon properties and increased co-contraction (e.g., children with cerebral palsy; Ries et al., 2018). Consequently, the valuation of time for an individual could be incorrectly estimated if the estimates of metabolic cost were inaccurate. Therefore, as the authors noted for their able-bodied participants, more precise measures of metabolic rates will be critical for translating this work into clinical settings.

      We agree, and did not intend to say that clinical populations must walk the same way, rather that the Normal patterns could be used as a basis of comparison. To make this clearer, we have amended the Discussion of clinical implications (new text emphasized): “it may be possible to predict the duration and steady speed for another distance, referenced from a universal family of walking trajectories. We have identified one such family that applies to healthy individuals with pendulum-like gait. Of course, some clinical conditions might be manifested by a deviance from that family, perhaps in the acceleration or deceleration phases, or in how the trajectories vary with distance. If quantified, such deviance might prove clinically useful… the characterization of distance-dependent speed trajectories can potentially provide more information than available from steady speed alone.”

      We agree that the valuation of time can be inaccurate if the metabolic cost is inaccurate. That is why we did not make a precise estimate of the valuation. We have amended the text to help clarify that our rough estimates are based on previous data. In addition, our general scientific intent is to reveal behavioral sensitivities, for example of walking duration to bout distance, as opposed to absolute numerical quantities.

    1. Author Response

      Reviewer #1 (Public Review):

      1) One nagging concern is that the category structure in the CNN reflects the category structure baked into color space. Several groups (e.g. Regier, Zaslavsky, et al) have argued that color category structure emerges and evolves from the structure of the color space itself. Other groups have argued that the color category structure recovered with, say, the Munsell space may partially be attributed to variation in saturation across the space (Witzel). How can one show that these properties of the space are not the root cause of the structure recovered by the CNN, independent of the role of the CNN in object recognition?

      We agree that there is overlap with the previous studies on color structure. In our revision, we show that color categories are directly linked to the CNN being trained on the object recognition task and not the CNN per se. We repeated our analysis on a scene-trained network (using the same input set) and find that here the color representation in the final layer deviates considerably from the one created for object classification. Given that the input set is the same, this strongly suggests that any reflection of the structure of the input space is to the benefit of recognizing objects (see the bottom of “Border Invariance” section; Page 7). Furthermore, the new experiments with random hue shifts to the input images show that in this case stable borders do not arise, as might be expected if the border invariance was a consequence of the chosen color space only.

      A further crucial distinction from previous results is that, by specifically replacing the final layer, our analysis looks at the representation that the network has built to perform the object classification task. As such, the current finding goes beyond the notion that the color category structure is already reflected in the color space.

      2) In Figure 1, it could be useful to illustrate the central observation by showing a single example, as in Figure 1 B, C, where the trained color is not in the center of the color category. In other words, if the category structure is immune to the training set, then it should be possible to set up a very unlikely set of training stimuli (ones that are as far away from the center of the color category while still being categorized most of the time as the color category). This is related to what is in E, but is distinctive for two reasons: first, it is a post hoc test of the hypothesis recovered in the data-driven way by E; and second, it would provide an illustration of the key observation, that the category boundaries do not correspond to the median distance between training colors. Figure 5 begins to show something of this sort of a test, but it is bound up with the other control related to shape.

      We have now added a post-hoc test where we shift the training bands from likely to unlikely positions using the original paradigm: Retraining output layers whilst shifting training bands from the left to the right category-edge (in 9 steps) we can see the invariance to the category bounds specifically (see Supp. Inf.: Figure S11). The most extreme cases (top and bottom row) have the training bands right at the edge of the border, which are the interesting cases the reviewer refers to. We also added 7 steps in between to show how the borders shift with the bands.

      Similarly, if the claim is that there are six (or seven?) color categories, regardless of the number of colors used to train the data, it would be helpful to show the result of one iteration of the training that uses say 4 colors for training and another iteration of the training that uses say 9 colors for training.

      We have now included the figure presented in 1E, but for all the color iterations used (see SI: Figure S10). We are also happy to include a single iteration, but believe this gives the most complete view of what the reviewer is asking.

      The text asserts that Figure 2 reflects training on a range of color categories (from 4 to 9) but doesn’t break them out. This is an issue because the average across these iterations could simply be heavily biased by training on one specific number of categories (e.g. the number used in Figure 1). These considerations also prompt the query: how did you pick 4 and 9 as the limits for the tests? Why not 2 and 20? (the largest range of basic color categories that could plausibly be recovered in the set of all languages)?

      The number of output nodes was inspired by the number of basic color categories that English speakers observe in the hue spectrum (in which a number of the basic categories are not represented). We understand that this is not a strong reason; unfortunately, the lack of studies on color categories in CNNs forced us to approach this in an exploratory manner. We have adapted the text to better reflect this shortcoming (bottom of page 4). Naturally, had the data indicated that these numbers were not a good fit, we would have adapted the range (had there been more categories, we would have expected more noise and would have increased the number of training bands to test this). As indicated above, we have now also included the classification plots for all the different counts, so the reader can review this as well (SI: Section 9).

      3) Regarding the transition points in Figure 2A, indicated by red dots: how strong (transition count) and reliable (consistent across iterations) are these points? The one between red and orange seems especially willfully placed.

      To answer the question on the consistency we have now included a repetition of the ResNet18, with the ResNet34, ResNet50 and ResNet101 in the SI (section 1). We have also introduced a novel section presenting the result of alternate CNNs to the SI (section S8). Despite small idiosyncrasies the general pattern of results recurs.

      Concerning the red-orange border, it was not willfully placed, but we very much understand that in isolation it looks like it could simply be the result of noise. Nevertheless, the recurrence of this border in several analyses made us confident that it does reflect a meaningful invariance. Notably:

      • We find a more robust peak between red and orange in the luminance control (SI section 3).

      • The evolutionary algorithm with 7 borders also places a border in this position.

      • We find the peak recurs in the ResNet-18 replication as well as several of the deeper ResNets and several of the other CNNs (SI section 1).

      • We also find that the peak is present throughout the different layers of the ResNet-18.

      4) Figure 2E and Figure 5B are useful tests of the extent to which the categorical structure recovered by the CNNs shifts with the colors used to train the classifier, and it certainly looks like there is some invariance in category boundaries with respect to the specific colors used to train the classifier, an important and interesting result. But these analyses do not actually address the claim implied by the analyses: that the performance of the CNN matches human performance. The color categories recovered with the CNN are not perfectly invariant, as the authors point out. The analyses presented in the paper (e.g. Figure 2E) test whether there is as much shift in the boundaries as there is stasis, but that’s not quite the test if the goal is to link the categorical behavior of the CNN with human behavior. To evaluate the results, it would be helpful to know what would be expected based on human performance.

      We understand the lack of human data was a considerable shortcoming of the previous version of the manuscript. We have now collected human data in a match-to-sample task modeled on our CNN experiment. As with the CNN we find that the degree of border invariance does fluctuate considerably. While categorical borders are not exact matches, we do broadly find the same category prototypes and also see that categories in the red-to-yellow range are quite narrow in both humans and CNNs. Please, see the new “Human Psychophysics” (page 8) addition in the manuscript for more details.

      5) The paper takes up a test of color categorization invariant to luminance. There are arguments in the literature that hue and luminance cannot be decoupled: that luminance is essential to how color is encoded and to color categorization. Some discussion of this might help the reader who has followed this literature.

      We have added some discussion of the interaction between luminance and color categories (e.g., Lindsay & Brown, 2009) at the bottom of page 6/ top of page 7. The current analysis mainly aimed at excluding that the borders are solely based on luminance.

      Related, the argument that “neighboring colors in HSV will be neighboring colors in the RGB space” is not persuasive. Surely this is true of any color space?

      We removed the argument about “neighboring colors”. Our procedure requires the use of a hue spectrum that wraps around the color space while including many of the highly saturated colors that are typical prototypes for human color categories. We have elected to use the hue spectrum from the HSV color space at full saturation and brightness, which is represented by the edges of the RGB color cube. As this is the space in which our network was trained, it does not introduce any deformations into the color space. Other potential choices of color space either include strong non-linear transformations that stretch and compress certain parts of the RGB cube, or exclude a large portion of the RGB gamut (yellow in particular).

      We have adapted the text to better reflect our reasoning (page 6, top of paragraph 2).
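      The geometric claim above, that the HSV hue spectrum at full saturation and brightness runs along edges of the RGB cube, can be checked numerically. The following is a minimal sketch using Python's standard `colorsys` module; the function names and the 360-step sampling are illustrative assumptions and are not taken from the manuscript.

```python
import colorsys

def hue_to_rgb(hue):
    """Convert a hue in [0, 1) to RGB at full saturation and full value."""
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

def on_cube_edge(rgb, tol=1e-9):
    """A point lies on an edge of the unit RGB cube when at least two of its
    three coordinates sit at an extreme (0 or 1)."""
    extremes = sum(1 for c in rgb if abs(c) <= tol or abs(c - 1.0) <= tol)
    return extremes >= 2

# Sample the hue circle in 1-degree steps (an arbitrary choice for illustration).
samples = [hue_to_rgb(step / 360.0) for step in range(360)]
all_on_edges = all(on_cube_edge(rgb) for rgb in samples)
```

      Under these assumptions every sampled hue has one channel at 1 and one at 0, with only the third channel varying, which is consistent with the statement that the chosen spectrum introduces no deformation relative to the RGB space the network was trained in.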

      6) The paper would benefit from an analysis and discussion of the images used to originally train the CNN. Presumably, there are a large number of images that depict manmade artificially coloured objects. To what extent do the present results reflect statistical patterns in the way the images were created, and/or the colors of the things depicted? How do results on color categorization that derive from images (e.g. trained with neural networks, as in Rosenthal et al and presently) differ (or not) from results that derive from natural scenes (as in Yendrikhovskij?).

      We initially hoped we could analyze differences between colors in objects and background, as in Rosenthal. Unfortunately, in ImageNet we did not find clear differences between pixels in the bounding boxes of objects provided with ImageNet and pixels outside these boxes (most likely because the rectangular bounding boxes still contain many background pixels). However, if we look at the results from the K-means analysis presented in Figure S6 of the supplemental materials and the color categorization throughout the layers in the object-trained network (end of the first experiment on page 7) as well as the color categorization in humans (Human Psychophysics starting on page 8), we see very similar border positions arise.

      7) It could be quite instructive to analyze what's going on in the errors in the output of the classifiers, as e.g. in Figure 1E. There are some interesting effects at the crossover points, where the two green categories seem to split and swap, the cyan band (hue % 20) emerges between orange and green, and the pink/purple boundary seems to have a large number of green/blue results. What is happening here?

      One issue with training the network on the color task is that we can never fully guarantee that the network is using color to resolve the task, and we suspected that in some cases the network may rely on other factors as well, such as luminance. When we look at the same type of plots for the luminance-controlled task (see below left) presented in the supplemental materials we do not see these transgressions. Also, when we look at versions of the original training, but using more bands, luminance will be less reliable and we also don’t see these transgressions (see right plot below).

      8) The second experiment using an evolutionary algorithm to test the location of the color boundaries is potentially valuable, but it is weakened because it pre-determines the number of categories. It would be more powerful if the experiment could recover both the number and location of the categories based on the "categorization principle" (colors within a category are harder to tell apart than colors across a color category boundary). This should be possible by a sensible sampling of the parameter space, even in a very large parameter space.

      The main point of the genetic algorithm was to see whether the border locations would be corroborated by an algorithm using the principle of categorical perception. Unfortunately, an exact approach to determining the number of borders is difficult, because some border invariances are clearly stronger than others. Running the algorithm with the number of borders as a free parameter just leads to a minimal number of borders, as 100% correct is always obtained when there is only one category left. In general, as the network can simply combine categories into a class at no cost (actually, having fewer borders will reduce noise), it is to be expected that fewer classes will lead to better performance. As such, in estimating what the optimal category count would be, we would need to introduce some subjective trade-off between accuracy and class count.
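      The degeneracy described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the labels, the merge maps, and the linear bonus `lam` per class are all hypothetical and are not the procedure or data used in the manuscript.

```python
def merged_accuracy(true_labels, predicted_labels, merge_map):
    """Accuracy after relabeling both true and predicted classes via merge_map."""
    hits = sum(merge_map[t] == merge_map[p]
               for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Hypothetical toy data: four color classes with two confusions between neighbors.
true_labels = [0, 0, 1, 1, 2, 2, 3, 3]
predicted_labels = [0, 1, 1, 1, 2, 3, 3, 3]

four_classes = {0: 0, 1: 1, 2: 2, 3: 3}  # no borders removed
two_classes = {0: 0, 1: 0, 2: 1, 3: 1}   # neighboring classes merged pairwise
one_class = {0: 0, 1: 0, 2: 0, 3: 0}     # all borders removed

acc_four = merged_accuracy(true_labels, predicted_labels, four_classes)
acc_two = merged_accuracy(true_labels, predicted_labels, two_classes)
acc_one = merged_accuracy(true_labels, predicted_labels, one_class)

def penalized_score(accuracy, n_classes, lam):
    """Hypothetical objective: reward accuracy plus a bonus per retained class.
    The weight lam is exactly the subjective trade-off mentioned above."""
    return accuracy + lam * n_classes
```

      In this toy setting, merging classes never lowers accuracy, so an unconstrained search collapses to a single category. Any objective that resists this collapse needs a weight such as `lam`, and the preferred category count changes with the value chosen for it.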

      9) Finally, the paper sets itself up as taking "a different approach by evaluating whether color categorization could be a side effect of learning object recognition", as distinct from the approach of studying "communicative concepts". But these approaches are intimately related. The central observation in Gibson et al. is not the discovery of warm-vs-cool categories (these as the most basic color categories have been known for centuries), but rather the relationship of these categories to the color statistics of objects: those parts of the scene that we care about enough to label. This idea, that color categories reflect the uses to which we put our color-vision system, is extended in Rosenthal et al., where the structure of color space itself is understood in terms of categorizing objects versus backgrounds (u') and the most basic object categorization distinction, animate versus inanimate (v'). The introduction argues, rightly in our view, that "A link between color categories and objects would be able to bridge the discrepancy between models that rely on communicative concepts to incorporate the varying usefulness of color, on the one hand, and the experimental findings laid out in this paragraph on the other". This is precisely the link forged by the observation that the warm-cool category distinction in color naming correlates with object-color statistics (Gibson, 2017; see also Rosenthal et al., 2018). The argument in Gibson and Rosenthal is that color categorization structure emerges because of the color statistics of the world, specifically the color statistics of the parts of the world that we label as objects, which is the same approach adopted by the present work. The use of CNNs is a clever and powerful test of the success of this approach.

      We are sorry we did not properly highlight the enormous importance of these two earlier papers in our previous version of the manuscript. We have now elaborated our description of Gibson’s work to better reflect the important relation between the usefulness of colors and color categories (Page 2, middle and Page 19 par. above methods). We think our work nicely extends the earlier work by showing that their approach works even at a more general level with more color categories,

    1. I believe Victor Margolin when he says that he developed his own system. That's what I did in the years before people started widely discussing personal knowledge systems online. Nobody taught me how to do it when I was in college. @chrisaldrich repeatedly tries to connect everyone's knowledge practices to an ongoing tradition that stretches back to commonplace books, but he overstates it. There is such a thing as independent development of a personal knowledge system. I know it because I've lived it. It's not so difficult that it requires extraordinary genius.

      Reply to Andy https://forum.zettelkasten.de/discussion/comment/16865#Comment_16865

      Andy, I'll take you at your word. You're right that none of it requires extraordinary genius--though many who seem to exhibit extraordinary genius do have variations of these practices in their lives, and the largest proportion of them either read about them or were explicitly taught them.

      With these patterns and practices being so deeply rooted in our educational systems for so long (not to mention the heavy influences of our orality and evolved thinking apparatus even prior to literacy), it's a bit difficult for many to truly guarantee that they've done these things independently without heavy cultural and societal influence. As a result, it's not a far stretch for people to evolve their own practices to what works for them and then think that they've invented something new. The common person may not be aware of the old ideas of scala naturae or scholasticism, but they certainly feel them in their daily lives. Commonplacing is not much different.

      By analogy, Elon Musk might say he created the Tesla, but it's a far bigger stretch for him to say that he invented a new means of transportation, or a car, or the wheel when we know he's swimming in a culture rife with these items. Humans are historically far better at imitation than innovation. If people truly independently developed systems like these so many times, then in the evolutionary record of these practices we should expect to see more diversity than we do in practice. We might expect to see more innovation than just the plain vanilla adjacent possible. Given Margolin's age, time period, educational background, and areas of expertise, there is statistically very little chance that he hadn't seen or talked about versions of this practice with several dozens of his peers through his lifetime after which he took that tacit knowledge and created his own explicit version which worked for him.

      Historian Keith Thomas talks about some of these traditions which he absorbed himself without having read some of the common advice (see London Review of Books https://www.lrb.co.uk/the-paper/v32/n11/keith-thomas/diary). He also indicates that he slowly evolved to some of the often advised practices like writing only on one side of a slip, though, like many, he completely omits to state the reason why this is good advice. We can all ignore these rich histories, but we'll probably do so at our own peril and at the expense of wasting some of our time to re-evolve the benefits.

      Why are so many here (and in other fora on these topics) showing up regularly to read and talk about their experiences? They're trying to glean some wisdom from the crowds of experimenters to make improvements. In addition to the slow wait for realtime results, I've "cheated" a lot and looked at a much richer historical record of wins and losses to gain more context of our shared intellectual history. I'm reminded of one of Goethe's aphorisms from Maxims and Reflections "Inexperienced people raise questions which were answered by the wise thousands of years ago."

    1. drawing on materials less often granted the legitimacy of academic preservation.

      This line reminded me of something a friend told me recently. The last few generations of humans will be the first generations that will have living memories (videos and audio) of them living their lives. To think that in several hundred years, someone's great great great great grandchild may be able to pull up a video of us today and say to their grandchildren that this is how we used to be/look/interact with our world is immensely interesting in my opinion. We don't have videos of what life was like in the 1500s. We do have paintings and records and thus can fill in the blanks with our imaginations, but future generations won't have to employ that technique nearly as much as we do.

    1. Part of the activation energy required to start any task comes from the picture you get in your head when you imagine doing it. It may not be that going for a run is actually costly; but if it feels costly, if the picture in your head looks like a slog, then you will need a bigger expenditure of will to lace up. Slowness seems to make a special contribution to this picture in our heads. Time is especially valuable. So as we learn that a task is slow, an especial cost accrues to it. Whenever we think of doing the task again, we see how expensive it is, and bail. That’s why speed matters.

      The story you tell yourself creates reality.

    1. Abstract

      This work has been published in GigaScience Journal under a CC-BY 4.0 license (https://doi.org/10.1093/gigascience/giac034), and the journal has published the reviews under the same license. These reviews were as follows.

      Reviewer 1. Sean Walkowiak

      First review: Comment 1: The authors could more clearly and accurately present and discuss sequencing and assembly approaches, including the advantages and limitations of the ONT assembly presented here

      While the standards of 'quality' for assemblies are evolving, there are standard sets of 'science-based' criteria for considering the quality of a genome, such as the 14 criteria listed in the manuscript here: https://www.nature.com/articles/s41586-021-03451-0#Tab1. Many of these criteria are ambitious, particularly for wheat due to its size and complexity, and many criteria are not met using previous assembly approaches, or the approaches used in this study. It is true that CS and 10+ Wheat Genomes do not use long reads; however, these assemblies are valuable and have been rigorously validated using 10X Genomics, Hi-C, and long read data. They also perform well for TE content, BUSCO (as outlined by Tables 1 and 2 and Fig 3 in this manuscript), and they were actually used in this MS as a reference for guiding the ONT assembly. I would also expect that they have a better base pair accuracy than the assembly presented here. I therefore suggest that the authors revise their statement "these assemblies have been produced using short-read technologies and are therefore not up to the quality standard of current genome assemblies". If the authors wish to discuss assembly quality, which I recommend they should, I suggest focusing on advantages and limitations of each technology and assembly approach in a constructive way, perhaps with a stronger focus on the value of the ONT resource developed here. In regards to base pair accuracy, ONT is at a disadvantage to short reads or to PacBio. This is particularly true in the context of HiFi reads, which have increased accuracy over ONT and Illumina and have greater lengths than Illumina, but PacBio and HiFi are not discussed. This is not to say that PacBio is superior in every way, the reads from ONT are longer and these hold a significant value. 
As an example of differences between PacBio and ONT that might provide useful context to describe the differences between ONT and PacBio approaches, please see: https://pubmed.ncbi.nlm.nih.gov/33319909/; for differences between short read (TriTex) and PacBio, please see https://www.nature.com/articles/s41586-020-2947-8 . All of these approaches are valuable but have both advantages and limitations, with ONT also having many clear advantages and disadvantages. But these need to be clearly communicated and supported, either through the results of this study or through the literature. For example, in the discussion, the authors state that "ONT devices HAVE a real advantage over other long-read technologies". There is only one other long read sequencing technology, so if you are saying that ONT HAS a 'real advantage' over PacBio based on read length, this is valid, but can be stated more explicitly and with examples of the read lengths from this study and the literature. It is then stated that the "error rate is drastically reduced for nanopore"; again, this is valuable and a great advancement in regards to ONT, but it would be wise not to dismiss that this error rate is still higher than PacBio HiFi, which again can be stated explicitly with support from the literature. While both of these concepts are important, after they are stated, they are not actually discussed or framed to highlight the work from this study. The true advantage of ONT, even over PacBio HiFi, is that the long reads can resolve more complex regions that span greater distances, which are abundant in wheat (see reference from above). The authors are presenting an exciting and valuable resource with this genome assembly and this assembly has advantages due to the application of ONT, for the reasons mentioned above regarding long complex regions, but these are not fully highlighted and the authors do not take full advantage of what this assembly has to offer.
I think the authors should provide additional context and support related to the value and drawbacks of their ONT assembly. The advantages are discussed superficially at the gene level through a couple of examples (Fig 5), though none of these examples are supported with any significant biological data or validation analysis. There are many interesting features of genomes that are captured by ONT that are not captured well by short reads or PacBio, and it is unfortunate that these are not explored in any significant depth in the manuscript.

      Comment 2: Some of the 'highlighted features' in the manuscript could be better selected/executed

      This comment relates to the previous comment on having little detail on what the ONT genome is uniquely capable of providing over other approaches. Instead, the authors focus on some anomalies in the D genome as well as differences in the nanopore software for base calling. It is unclear to me what the objective is of the report on the D genome. I suspect that this may be due to differences in repeat content between D and the other subgenomes, or an artifact of the tools and analyses used. Page 6, Figures S1 and S2, may be a consequence of poor read filtering for reads that align ambiguously, i.e., perhaps reads from A and B may crossmap at a greater likelihood than those from D due to differences/similarities in repeat content between subgenomes. Once reads are aligned, the alignments should be properly filtered using standard 'best practices for NGS'. I do not see that any filtering or analysis of cross mapping was performed, but I may have missed it. Once the alignments are filtered, read coverage dips and peaks can then be assessed statistically using tools such as CNVnator and cn.mops, which are designed specifically for comparative read depth analysis since depth may not be normally distributed, rather than arbitrarily looking at 2 times the median. There may be differences between genes and intergenic regions in terms of mapping accuracy, so it may be ideal to interrogate read depth for those separately. The increased number of gaps is also interesting and I wonder if this could be due to the read accuracy of ONT and read mapping and assembly biases when having similar subgenomes.
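      For readers unfamiliar with the heuristic being criticized, the "2 times the median" rule amounts to something like the sketch below. The window depths are made-up numbers, the function name is hypothetical, and the symmetric lower cutoff (median divided by the same factor) is an assumption added for illustration; this is not the manuscript's code and it is not a substitute for CNVnator or cn.mops.

```python
from statistics import median

def flag_depth_outliers(depths, factor=2.0):
    """Flag windows whose read depth exceeds factor * median (peaks) or
    falls below median / factor (dips). The lower cutoff is an assumption."""
    m = median(depths)
    peaks = [i for i, d in enumerate(depths) if d > factor * m]
    dips = [i for i, d in enumerate(depths) if d < m / factor]
    return m, peaks, dips

# Hypothetical mean read depth per genomic window.
window_depths = [30, 28, 33, 31, 75, 29, 12, 32]
median_depth, peak_windows, dip_windows = flag_depth_outliers(window_depths)
```

      A fixed multiple of the median ignores the shape of the depth distribution and any difference in mappability between regions, which is why the dedicated statistical tools named above are suggested instead.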

      Nevertheless, the results and discussion on the D genome are interesting but distracting and likely reflect that the authors should take more time to explore their data and its biases before presenting this information. In summary, I believe that additional work is needed to bring value to the read depth and D genome analysis should the authors choose to include this in the manuscript. While I agree that it would be useful to communicate that a significant gain was observed when basecalling using the more accurate basecaller, the emphasis on this is disproportionate to its value in the manuscript. The observation of a better assembly when using reads from a more advanced basecaller is not something new. As for the error rate of the ONT between organisms (yeast and wheat), with a sample size of 2, I do not think that this is worth presenting or discussing in any detail. While this may just be an artifact of the DNA quality itself from two experiments, I suspect that this may be a valid result from the manuscript and due to sequencing repeats, which are more abundant in wheat, in combination with how these basecallers self train to be more accurate. While this is certainly valid, it is not novel or interesting. This result comparing species was not tested with sufficient scientific rigor/evidence, it distracts from the central result of the manuscript, and it just reaffirms something that we already know about the basecalling software, the challenges of sequencing homopolymers, and the importance of getting accurate reads using the more advanced basecalling methods.

      Comment 3: Why Renan? This comment relates to the other two comments on the selected areas of focus. The biological story, which was on gliadins, was of some value and highlighted some of the advantages of an ONT assembly, but this was not supported by any significant biological data. Renan is a well-known cultivar with abundant genomic data, mapping populations, trait data for diseases, etc. It is unfortunate that the authors could not use the genome to dig deeper to more thoroughly demonstrate the value of this assembly specifically in the context of ONT and genomics of wheat or the biology of wheat and Renan, specifically. With abundant QTL data available specifically for Renan, these could have easily been anchored to the assembly to highlight novel transcripts from the RNAseq from this study, just as an example. Even the comparisons of the Renan assembly to other available assemblies was mostly superficial and did not highlight in significant detail the value of having an ONT assembly or the value of having data specifically for Renan. While a detailed 'biological story' may be beyond the scope of this manuscript, there was minimal effort to highlight the value of the assembly, and this comment is more of a larger reflection that more could have been done to highlight the value of the genome to support the author's vague claims that the genome "will benefit the wheat community and help breeding programs".

      Minor Comments: The absence of numbered lines made it difficult to provide more detailed feedback, but there are minor items throughout, so I suggest numbering the lines and also giving the manuscript a thorough review. I appreciate that the authors present and suggest methods for future assembly of complex genomes using ONT, but, contrary to the abstract's statement that 'we also provide the methodological standards to generate high-quality assemblies of complex genomes', I would argue that the standards used for ONT assembly are known and are not established here. I would also suggest caution when stating that the methods here should be considered the 'standard' for the reasons indicated in Comment 1 regarding other approaches used to assemble complex genomes, such as PacBio/HiFi, and the lack of a proper investigation/discussion/comparison of assembly quality.

      Page 2: last line - what is the abbreviation ca. ? Table 1: Busco is presented twice with different values. Table 1 and 2 use different versions of RefSeq, I would stick to one version. It is unclear to me what trend or result the authors are trying to present in figure 1, which I would say is common for circos plots. Presenting data 'for the sake of presenting it' is not terribly valuable and I would encourage the authors to use the figures to present a trend or result that is impactful. In addition, the data that is presented is not presented clearly, and is cryptic. The roman numerals in the figure caption for Figure 1 are not actually in the figure. The caption also indicates that the dots indicate lower and higher values, but not of what - perhaps density of gaps? The color scales are not presented for each track. Two of the color scale palettes also look similar.

      Page 6: 62% of exons were identical, which means 48% had SNPs, so the authors argue that SNPs are therefore rare at 48% of exons? I do not think that 48% of exons having SNPs is rare; I think that this would mean that nearly half of exons have SNPs, so this is therefore common. Perhaps this statistic is misleading and the focus should instead be on the 0.7% divergence. How does this value compare with other within species comparisons of gene content and could this be an artifact of ONT accuracy? This question relates to a general comment that the authors could do better at bringing relevant comparisons or parallels in from the literature throughout the manuscript to bring value to any findings or insights they are presenting, particularly in the context of other ONT assemblies.

      Page 7, capitalize the T for technology, it is part of the name of the company and is a proper noun. This is repeated elsewhere.

      Page 7: 'on wheat'? This statement could be written more clearly. The way that the text is worded, it sounds like the basis for selecting the SmartDenovo assembly was the number of unknown bases, when I suspect it was actually a multitude of factors (BUSCO, gene or TE content, assembly stats, etc). While I do not question the selection of the assembly, I do suggest a clearer presentation of the information. I appreciate that the authors presented the data from multiple assemblers; one of the concerns with ONT is that the read accuracy is low and this may lead to issues in assembly of complex polyploids with similar subgenomes. I suspect that based on this study, it is clear that this is a valid concern for some assemblers, but may have been overcome in others, though none of this is explored or discussed. Again, is there any information in the literature contrasting assemblers that could provide insights into what you observed?

      Searches at 90% identity and coverage for genes and TEs are not strict, especially with genomes that have highly identical subgenomes. If you reduce your thresholds enough, all features will map to your genome.....

      The choice of language is often subjective or not representative of the results. For example, the 'extremely' similar TE content between Renan and CS. Why not say it is similar and actually report a value or a % difference? This would be more concise and informative than using vague and overzealous language. Page 8, short reads (dash or no dash?) The font sizes in Figure 2 are very small.

      The RNAseq is not really presented at all, except in the Materials and Methods. I thought the genes were ab initio predicted until I saw RNAseq in the materials and methods. I suggest at least making a note of RNAseq into the results and/or discussion since this additional effort does bring added value to the annotations and the manuscript. The discussion says de novo annotations, but I suggest explicitly stating that RNAseq was performed.

      Figure 3 C and D do not have horizontal axis labels, the top should be labelled as subgenome, bottom as chromosome, and the vertical axis (not the top) should be labelled as number of gaps and chromosome length. Same comment for labelling of vertical axis for panels A and B, horizontal axis should be labelled as genome assemblies, which are reflected in the palette/legend. Note that many of the colours in this palette are similar and difficult to differentiate, it may actually take less space to label the bars with each wheat line to make it less cryptic.

      How were the dotplots in figure 4 generated? Perhaps I missed it in the materials and methods. Also, none of the axes have labels or units, etc.

      Much of the text in Figure 5 is too small and illegible.

      Page 10: The discussion is superficial and vague and should provide an accurate and pragmatic discussion of the results in the context of the literature. For example, the manuscript boasts a 'higher resolution'... but of what? Perhaps 'complex repetitive regions'? To reiterate my previous comment on the lack of literature support throughout the manuscript - Were these 'higher resolutions' of <complex repetitive regions> comparable to what was observed in the literature when ONT was applied to other systems? Again, these advantages of ONT and the assembly could be more thoroughly

      Re-review:

      The revised manuscript addresses the major concerns/comments. The assembly and its report are an exciting new resource for the wheat community. I only have one very minor comment below:

      When writing variety names in text and figures, it is important to be exact because there are many varieties with similar names internationally. CDC Stanley, not "Stanley"; CDC Landmark, not "Landmark"; "LongReach Lancer", not "Lancer", not "LongRead Lancer" - typo on line 308. I suggest performing a thorough check throughout.

    1. Whenever I read about the various ideas, I feel like I do not necessarily belong. Thinking about my practice, I never quite feel that it is deliberate enough.

      https://readwriterespond.com/2022/11/commonplace-book-a-verb-or-a-noun/

      Sometimes the root question is "what do I want to do this for?" Having an underlying reason can be hugely motivating.

      Are you collecting examples of things for students? (seeing examples can be incredibly powerful, especially for defining spaces) for yourself? Are you using them for exploring a particular space? To clarify your thinking/thought process? To think more critically? To write an article, blog, or book? To make videos or other content?

      Your own website is a version of many of these things in itself. You read, you collect, you write, you interlink ideas and expand on them. You're doing it much more naturally than you think.


      I find that having an idea of the broader space, what various practices look like, and use cases for them provides me a lot more flexibility for what may work or not work for my particular use case. I can then pick and choose for what suits me best, knowing that I don't have to spend as much time and effort experimenting to invent a system from scratch but can evolve something pre-existing to suit my current needs best.

      It's like learning to cook. There are thousands of methods (not even counting cuisine-specific portions) for cooking a variety of meals. Knowing what these are and their outcomes can be incredibly helpful for creatively coming up with new meals. By analogy, students are often only learning to heat water to boil an egg, but with some additional techniques they can bake complicated French pâtisserie. Often if you know a handful of cooking methods you can go much further using combinations of techniques and ingredients.

      What I'm looking for in the reading, note taking, and creation space is a baseline version of Peter Hertzmann's 50 Ways to Cook a Carrot combined with Michael Ruhlman's Ratio: The Simple Codes Behind the Craft of Everyday Cooking. Generally cooking is seen as an overly complex and difficult topic, something that is emphasized on most aspirational cooking shows. But cooking schools break the material down into small pieces which makes the processes much easier and more broadly applicable. Once you've got these building blocks mastered, you can be much more creative with what you can create.

      How can we combine these small building blocks of reading and note taking practices for students in the 4th - 8th grades so that they can begin to leverage them in high school and certainly by college? Is there a way to frame them within teaching rhetoric and critical thinking to improve not only learning outcomes, but to improve lifelong learning and thinking?

    1. https://www.youtube.com/watch?v=zCLCIw-HSJc

      I'm curious whether you know if Nelson, Engelbart or any of their contemporaries had/maintained/used commonplace books or card indexes as precursors of their computing work? That is, those along the lines of those most commonly used by academics, for example as described by Markus Krajewski in Paper Machines (MIT Press, 2011) or even Beatrice Webb's Appendix C on Note Taking in My Apprenticeship (Longmans, 1926) in which she describes a slip (or index card)-based database method of scientific note taking. I've always felt that Vannevar Bush held things back unnecessarily by not mentioning commonplace book traditions in As We May Think.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors define regulatory networks across 77 tissue contexts using software they have previously published (PECA2, Duren et al. 2020). Each regulatory network is a set of nodes (transcription factors (TF), target genes (TG), and regulatory elements (RE)) and edges (regulatory scores connecting the nodes). For each context, the authors define context-specific REs, as those that do not overlap REs from any of the other 76 contexts, and context-specific regulatory networks as the collection of TFs, TGs, and REs connected to at least one context-specific RE. This approach essentially creates annotations that are aggregated across genes, elements, and specific contexts. For each tissue, the authors use linkage disequilibrium score regression (LDSC) to calculate enrichment for complex trait heritability within the set of all REs from the corresponding context-specific regulatory network. Heritability enrichments in context-specific regulatory network REs are compared with heritability enrichments in regions defined using other approaches.
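
      The definition of context-specific REs summarized above can be illustrated with a minimal sketch. It is not the PECA2 or SpecVar implementation; the interval representation, function names, and toy coordinates are assumptions made for illustration, and a real analysis over 77 contexts would use an indexed interval structure rather than pairwise comparison.

```python
from typing import Dict, List, Tuple

# Assumed representation: (chromosome, start, end), half-open [start, end).
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def context_specific_res(
    res_by_context: Dict[str, List[Interval]]
) -> Dict[str, List[Interval]]:
    """For each context, keep the REs that overlap no RE from any other context."""
    specific: Dict[str, List[Interval]] = {}
    for ctx, res in res_by_context.items():
        # Pool the REs of every other context.
        others = [r for other, rs in res_by_context.items() if other != ctx for r in rs]
        specific[ctx] = [r for r in res if not any(overlaps(r, o) for o in others)]
    return specific


# Hypothetical toy input: three contexts instead of 77.
toy = {
    "liver": [("chr1", 100, 200), ("chr1", 500, 600)],
    "brain": [("chr1", 150, 250), ("chr2", 100, 200)],
    "heart": [("chr1", 900, 950)],
}
result = context_specific_res(toy)
```

      In this toy input the liver and brain REs on chr1 that overlap each other are both dropped, while REs on different chromosomes or at non-overlapping positions are retained as context-specific.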

      We thank the reviewers for the pertinent and precise summary of our paper.

      Reviewer #2 (Public Review):

      In this manuscript the authors develop a method, SpecVar, to perform heritability estimation from regulatory networks derived from gene expression and chromatin accessibility data. They apply this approach to public datasets available in ENCODE and Roadmap Epigenomics consortia as well as GWAS phenotype associations in UK Biobank. It promises to be a powerful method to interpret mechanisms from genetic associations. Below are some strengths and weaknesses of the paper.

      Strengths

      • The method performs heritability enrichment on two major genomic data types: gene expression and chromatin accessibility.

      • This method leverages gene regulatory networks to perform the heritability estimation, which may better capture complex disease architecture.

      • The authors perform an extensive comparison to other LDSC-based approaches using different tissue datasets.

      Weaknesses

      (1) This approach may represent a modest advance over existing LDSC methods when looking at other complex traits.

      (2) The authors only compare with LDSC using different functional annotations as input, which may not be appropriate. A broader comparison with other heritability methods would be helpful.

      (3) The method seems to be applied to "paired" data, but this is still bulk profiles not paired single-cell RNA/ATAC data.

      The authors successfully applied a regulatory network approach to improving the heritability estimation of complex traits by using both gene expression and chromatin accessibility data. While the results could be further strengthened by comparing them to other network and non-network-based methods, it provides important insight into a few traits beyond the standard LDSC model with different functional annotations.

      Given that this method is based on the widely used LDSC approach it should be broadly applied in the field. However, the authors should consider adapting this to single-cell data as well as admixed human population genetic data.

      We thank the reviewer for the positive comments on our work, in particular for pointing out that SpecVar is a powerful method to interpret mechanisms from genetic associations. We appreciate that the reviewer’s “Strengths” summary captures our major contributions: building an atlas of regulatory networks by integrating paired gene expression and chromatin accessibility data, leveraging regulatory networks to perform heritability enrichment, and identifying relevant tissues and estimating relevance correlation. We also thank the reviewer for pointing out weaknesses, which helped us strengthen our results. To address the comments, we (1) performed ablation studies and added more description to clarify the novelty of our method; (2) conducted an extensive comparison with another network-based method, CoCoNet, and a non-network-based method, RolyPoly; and (3) discussed the promising directions of identifying relevant contexts at the cell-type level by leveraging single-cell multi-omics profiles and of applying the method to admixed populations.

      Reviewer #3 (Public Review):

      Identifying the critical tissues and cell types in which genetic variants exert their effects on complex traits is an important question that has attracted increasing attention. Feng et al propose a new method, SpecVar, to first construct context-specific regulatory networks by integrating tissue-specific chromatin states and gene expression data, and then run stratified LD score regression (LDSC) to test if the constructed regulatory network in tissue is significantly associated with the trait, measured by a statistic called trait relevance score in this study. They apply their method to 6 traits for which there exists prior evidence on the most relevant tissues in the literature, and then further apply to 206 traits in the UK Biobank. They find that compared to LDSC using other sources of information to define context-specific annotations, their method can "improve heritability enrichment", "accurately detect relevant tissues", helps to "interpret SNPs" identified from GWAS, and "better reveals shared heritability and regulations of phenotypes" between traits.

      We thank the reviewer for the summary and appreciation of our efforts to address the important question: identifying the critical tissues and cell types in which genetic variants exert their effects on complex traits.

      However, I think it requires more work to understand where exactly the benefits come from and the statistical properties of their proposed test statistic (e.g., how to perform hypothesis tests with their relevance score and whether the false positive rate is under control). In addition, it's not clear to me what they can conclude about the shared heritability (which means genetic correlation) by comparing their relevance score correlation across tissues to the phenotypic correlation between traits.

      We thank the reviewer for the advice to do more work to strengthen the statistical rigor of SpecVar. We have added significance tests for heritability enrichment and for our proposed R score in the revision. We also clarified that SpecVar can use common relevant contexts and shared SNP-associated regulatory networks as a potential explanation for the correlation between traits.

      They show that SpecVar gives much higher heritability enrichment than the other methods in the trait-relevant tissues (Fig. 2). The fold enrichment from SpecVar is extremely high, e.g., more than 600x in the right lobe of the liver for LDL. First, I think a standard error should be given so that the significance of the differences can be assessed. Second, it is very rare (hence suspicious) to observe such a huge enrichment. Since SpecVar is based on LDSC, the same methodology that other methods in comparison depend on, the differences to the other methods must come from the set of SNPs annotated for each tissue. I think it is important to understand the difference between the SpecVar annotated SNPs and those from other methods. For example, is the extra heritability enrichment mainly from the SpecVar-specific annotation or from the intersection narrowed down by SpecVar?

      The reviewer has pinpointed a question about one important advantage of our method: improved heritability enrichment. We addressed this question first by providing standard errors, p values, and q values for heritability enrichment. Second, we conducted an ablation analysis to study the source of the extra heritability enrichment. This question greatly helped us clarify the main contribution of our method.
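
      For readers following this exchange, the fold enrichment reported by LDSC-style analyses is the proportion of heritability explained by an annotation divided by the proportion of SNPs in that annotation. The sketch below shows only this arithmetic; the function name and all numbers are hypothetical and are not taken from the manuscript.

```python
def fold_enrichment(h2_annot: float, h2_total: float,
                    snps_annot: int, snps_total: int) -> float:
    """Proportion of heritability in an annotation over its proportion of SNPs."""
    if h2_total <= 0 or snps_annot <= 0 or snps_total <= 0:
        raise ValueError("totals and annotation size must be positive")
    prop_h2 = h2_annot / h2_total
    prop_snps = snps_annot / snps_total
    return prop_h2 / prop_snps


# Hypothetical numbers: an annotation covering 0.05% of SNPs that explains
# 30% of heritability gives a fold enrichment of about 600.
example = fold_enrichment(h2_annot=0.06, h2_total=0.20,
                          snps_annot=5_000, snps_total=10_000_000)
```

      Because the denominator shrinks with the annotation, a very small annotated SNP set can produce a very large fold enrichment, which is one reason the standard errors requested by the reviewer matter for interpretation.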

      They propose to use the relevance score (R score) to prioritise trait-relevant tissues. In Fig. 3, they show tissue-trait pairs with the highest R scores, and from there they prioritise several tissues for each trait (Table 1). I can see that some tissues have outstanding R scores; however, it is not clear to me where they draw the line to declare a positive result. The threshold doesn't even seem to be consistent across traits. For example, for LDL, only the right lobe of the liver is identified although other tissues have R scores greater than 100, whereas, for EA, Ammon's horn and adrenal gland are identified although their R scores are apparently smaller than 100. It seems to me they use some subjective criteria to pick the results. It leads to a serious question on how to apply their R score in a hypothesis test: how to measure the uncertainty of their R score? What significance threshold should be used? Is the false positive rate under control? Without knowing these statistical properties, readers won't be able to use this method with confidence in their own research.

      We thank the reviewer for raising the question about hypothesis testing with the R score. We used the block jackknife strategy to estimate standard errors, p values, and q values in our revision. We added the new results to the main text, and they greatly strengthen the statistical rigor of our method.
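
      A generic delete-one-block jackknife can be sketched as follows. This is not the authors' code: the contiguous equal-sized blocking, the default estimator (the mean), and the normal approximation used to turn an estimate and its standard error into a two-sided p value are all assumptions made for illustration.

```python
import math
from statistics import mean
from typing import Callable, Sequence


def block_jackknife_se(values: Sequence[float], n_blocks: int,
                       estimator: Callable[[Sequence[float]], float] = mean) -> float:
    """Standard error from a delete-one-block jackknife over contiguous blocks."""
    n = len(values)
    if not 2 <= n_blocks <= n:
        raise ValueError("n_blocks must be between 2 and len(values)")
    # Block boundaries; integer arithmetic keeps every block non-empty.
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    leave_one_out = []
    for j in range(n_blocks):
        lo, hi = bounds[j], bounds[j + 1]
        kept = list(values[:lo]) + list(values[hi:])  # drop block j
        leave_one_out.append(estimator(kept))
    center = mean(leave_one_out)
    variance = (n_blocks - 1) / n_blocks * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


def two_sided_p(z: float) -> float:
    """Two-sided p value for a z statistic under a standard normal (assumed)."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data: with one observation per block this reduces to the usual
# standard error of the mean.
toy_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_se = block_jackknife_se(toy_values, n_blocks=6)
toy_p = two_sided_p(mean(toy_values) / toy_se)
```

      Blocking matters for genomic data because neighbouring SNPs are correlated through linkage disequilibrium; deleting whole blocks rather than single SNPs keeps the leave-out estimates approximately independent.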

      Another related comment to the above is to investigate false positive associations, they should show the results for all tissues tested to see if SpecVar tends to give higher R scores even in tissues that are not relevant to the trait. It would also be useful to include some negative control traits, such as height for brain tissues.

      We agree that negative controls are important, and the six phenotypes in our manuscript serve as negative controls for one another. For example, LDL is relevant to liver tissue but not to brain tissue, whereas educational attainment is relevant to brain tissue but not to liver tissue.

      Fig. 3 shows that tissues prioritised by LDSC-SAP and LDSC-SEG seem to make less sense than those from SpecVar. However, some of the results are not consistent with the LDSC-SEG paper (Finucane et al 2018). For example, LDL was significantly associated with the liver in Finucane et al (Fig. 2), but not in this study. How to explain the difference? (Question 3)

      We checked the results in Figure 3 and found that even though the liver was not ranked among the top 5 tissues, it has a significant P-value for LDL in our implementation. There is indeed some difference in heritability enrichment and P-value between the LDSC-SEG paper and our implementation. The difference arises from the different sets of tissues used in the two applications (77 tissues in our paper and 53 tissues in the LDSC-SEG paper).

      The authors highlight an example where SpecVar facilitates the interpretation of GWAS signals near FOXC2. They find GWAS-significant SNPs located in a CNCC-specific RE downstream of FOXC2 and reason these SNPs affect brain shape by regulating the expression of FOXC2. I think more work can be done to consolidate the conclusion. For example, if the GWAS signals are colocalised with the eQTL for FOXC2 in the brain. Also, note that the top GWAS signal is actually on the left of the CNCC-specific RE (Fig. 4b). A deeper investigation should be warranted.

      We agree that more work should be done to consolidate the regulation of FOXC2. In our revision, we used the HiChIP loop in the brain to support the SNP-associated regulation of FOXC2. We also thank the reviewer for suggesting eQTL colocalization; we conducted an eQTL colocalization analysis on the SNP-associated regulation revealed by our method to show that our method can facilitate the fine mapping of GWAS signals. Lastly, brain shape is a complex trait and may be relevant to multiple tissues. Hence it is reasonable to suspect that the top GWAS signal may be active in other relevant tissues’ regulatory elements.

      They show that SpecVar's relevance score correlation across tissues can better approximate phenotypic correlation between traits. However, the estimation of the phenotypic correlation between traits is neither very interesting nor a thing difficult to do (it can be directly estimated from GWAS summary statistics). A more interesting question is to which extent the observed phenotypic correlation is due to common genetic factors acting in the shared tissues/cell types/pathways/regulatory networks between traits. Note that in their Abstract, they use words "depict shared heritability and regulations" but I don't seem to see results supporting that.

      We are sorry that we didn’t make clear how SpecVar can “depict shared heritability and regulations”. We added more results and one example in the UKBB application to show that SpecVar can use common relevant contexts and shared SNP-associated regulatory networks as a potential explanation for the correlation between traits.
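
      The "relevance score correlation across tissues" discussed in this exchange can be read as a correlation between two traits' vectors of R scores over the same set of contexts. The sketch below assumes a Pearson correlation and uses invented R scores; the manuscript may define the relevance correlation differently.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


# Hypothetical R scores for two traits over the same four contexts.
trait_a = [120.0, 15.0, 8.0, 3.0]
trait_b = [90.0, 20.0, 5.0, 4.0]
relevance_correlation = pearson(trait_a, trait_b)
```

      A high value here only says that the two traits rank contexts similarly; as the reviewer notes, it does not by itself establish shared genetic effects.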

      Line 396-402: "For example, ... heritability could select most relevant tissues ... but failed to get correct tissues for other phenotypes ... P-value could obtain correct tissues for CP ... but failed to get correct tissues for ... SpecVar could prioritize correct relevant tissues for all the six phenotypes." Honestly, I find it hard to judge which tissues are "correct" or "incorrect" for a trait in real life. It would be more straightforward to compare methods using simulation where we know which tissues are causal.

      We thank the reviewer for pinpointing the improper use of “correct”. It is difficult to find phenotypes with gold-standard relevant tissues, so we used six relatively well-studied phenotypes with prior knowledge of possible relevant tissues in our paper. We revised the “correct” statements in our revision.

    1. Author Response

      Reviewer #1 (Public Review):

      Trudel and colleagues aimed to uncover the neural mechanisms of estimating the reliability of the information from social agents and non-social objects. By combining functional MRI with a behavioural experiment and computational modelling, they demonstrated that learning from social sources is more accurate and robust compared with that from non-social sources. Furthermore, dmPFC and pTPJ were found to track the estimated reliability of the social agents (as opposed to the non-social objects). The strength of this study is to devise a task consisting of the two experimental conditions that were matched in their statistical properties and only differed in their framing (social vs. non-social). The novel experimental task allows researchers to directly compare the learning from social and non-social sources, which is a prominent contribution of the present study to social decision neuroscience.

      Thank you so much for your positive feedback about our work. We are delighted that you found that our manuscript provided a prominent contribution to social decision neuroscience. We really appreciate your time to review our work and your valuable comments that have significantly helped us to improve our manuscript further.

      One of the major weaknesses is the lack of a clear description about the conceptual novelty. Learning about the reliability/expertise of social and non-social agents has been of considerable concern in social neuroscience (e.g., Boorman et al., Neuron 2013; and Wittmann et al., Neuron 2016). The authors could do a better job in clarifying the novelty of the study beyond the previous literature.

      We understand the reviewer’s comment and have made changes to the manuscript that, first, highlight more strongly the novelty of the current study. Crucially, second, we have also supplemented the data analyses with a new model-based analysis of the differences in behaviour in the social and non-social conditions which we hope makes clearer, at a theoretical level, why participants behave differently in the two conditions.

      There has long been interest in investigating whether ‘social’ cognitive processes are special or unique compared to ‘non-social’ cognitive processes and, if they are, what makes them so. Differences between conditions could arise during the input stage (e.g. the type of visual input that is processed by social and non-social systems), at the algorithm stage (e.g. the type of computational principles that underpin social versus non-social processes) or, even if identical algorithms are used, social and non-social processes might depend on distinct anatomical brain areas or neurons within brain areas. Here, we conducted multiple analyses (in Figures 2, 3, and 4 in the revised manuscript and in Figure 2 – figure supplement 1, Figure 3 – figure supplement 1, Figure 4 – figure supplement 3, Figure 4 – figure supplement 4) that not only demonstrated basic similarities in mechanism that generalised across social and non-social contexts, but also demonstrated important quantitative differences that were linked to activity in specific brain regions associated with the social condition. The additional analyses (Figure 4 – figure supplement 3, Figure 4 – figure supplement 4) show that differences are not simply a consequence of differences in the visual stimuli that are inputs to the two systems 1, nor does the type of algorithm differ between conditions. Instead, our results suggest that the precise manner in which an algorithm is implemented differs when learning about social or non-social information and that this is linked to differences in neuroanatomical substrates.

      The previous studies mentioned by the reviewer are, indeed, relevant ones and were, of course, part of the inspiration for the current study. However, there are crucial differences between them and the current study. In the case of the previous studies by Wittmann, the aim was a very different one: to understand how one’s own beliefs, for example about one’s performance, and beliefs about others, for example about their performance levels, are combined. Here, by contrast, we were interested in the similarities and differences between social and non-social learning. It is true that the question resembles the one addressed by Boorman and colleagues in 2013, who looked at how people learned about the advice offered by people or computer algorithms, but the difference in the framing of that study perhaps contributed to the authors’ finding of little difference in learning. In the present study, we found evidence that people were predisposed to perceive stability in social performance and to be uncertain about non-social performance. By accumulating evidence across multiple analyses, we show that there are quantitative differences in how we learn about social versus non-social information, and that these differences can be linked to the way in which learning algorithms are implemented neurally. We therefore contend that our findings extend our previous understanding of how, in relation to other learning processes, ‘social’ learning has both shared and special features.

      We would like to emphasize the way in which we have extended several of the analyses throughout the revision. The theoretical Bayesian framework has made it possible to simulate key differences in behaviour between the social and non-social conditions. We explain in our point-by-point reply below how we have integrated a substantial number of new analyses. We have also more carefully related our findings to previous studies in the Introduction and Discussion.

      Introduction, page 4:

      [...] Therefore, by comparing information sampling from social versus non-social sources, we address a long-standing question in cognitive neuroscience, the degree to which any neural process is specialized for, or particularly linked to, social as opposed to non-social cognition 2–9. Given their similarities, it is expected that both types of learning will depend on common neural mechanisms. However, given the importance and ubiquity of social learning, it may also be that the neural mechanisms that support learning from social advice are at least partially specialized and distinct from those concerned with learning that is guided by nonsocial sources. However, it is less clear on which level information is processed differently when it has a social or non-social origin. It has recently been argued that differences between social and non-social learning can be investigated on different levels of Marr’s information processing theory: differences could emerge at an input level (in terms of the stimuli that might drive social and non-social learning), at an algorithmic level or at a neural implementation level 7. It might be that, at the algorithmic level, associative learning mechanisms are similar across social and non-social learning 1. Other theories have argued that differences might emerge because goal-directed actions are attributed to social agents which allows for very different inferences to be made about hidden traits or beliefs 10. Such inferences might fundamentally alter learning about social agents compared to non-social cues.

      Discussion, page 15:

      […] One potential explanation for the assumption of stable performance for social but not non-social predictors might be that participants attribute intentions and motivations to social agents. Even if the social and non-social evidence are the same, the belief that a social actor might have a goal may affect the inferences made from the same piece of information 10. Social advisors first learnt about the target’s distribution and accordingly gave advice on where to find the target. If the social agents are credited with goal-directed behaviour then it might be assumed that the goals remain relatively constant; this might lead participants to assume stability in the performances of social advisors. However, such goal-directed intentions might not be attributed to non-social cues, thereby making judgments inherently more uncertain and changeable across time. Such an account, focussing on differences in attribution in social settings aligns with a recent suggestion that any attempt to identify similarities or differences between social and non-social processes can occur at any one of a number of the levels in Marr’s information theory 7. Here we found that the same algorithm was able to explain social and non-social learning (a qualitatively similar computational model could explain both). However, the extent to which the algorithm was recruited when learning about social compared to non-social information differed. We observed a greater impact of uncertainty on judgments about social compared to non-social information. We have shown evidence for a degree of specialization when assessing social advisors as opposed to non-social cues. At the neural level we focused on two brain areas, dmPFC and pTPJ, that have not only been shown to carry signals associated with belief inferences about others but, in addition, recent combined fMRI-TMS studies have demonstrated the causal importance of these activity patterns for the inference process […]

      Another weakness is the lack of justifications of the behavioural data analyses. It is difficult for me to understand why 'performance matching' is suitable for an index of learning accuracy. I understand the optimal participant would adjust the interval size with respect to the estimated reliability of the advisor (i.e., angular error); however, I am wondering if the optimal strategy for participants is to exactly match the interval size with the angular error. Furthermore, the definitions of 'confidence adjustment across trials' and 'learning index' look arbitrary.

      First, having read the reviewer’s comments, we realise that our choice of the term ‘performance matching’ may not have been ideal as it indeed might not be the case that the participant intended to directly match their interval sizes with their estimates of advisor/predictor error. Like the reviewer, our assumption is simply that the interval sizes should change as the estimated reliability of the advisor changes and, therefore, that the intervals that the participants set should provide information about the estimates that they hold and the manner in which they evolve. On re-reading the manuscript we realised that we had not used the term ‘performance matching’ consistently or in many places in the manuscript. In the revised manuscript we have simply removed it altogether and referred to the participants’ ‘interval setting’.

      Most of the initial analyses in Figure 2a-c aim to better understand the raw behaviour before applying any computational model to the data. We were interested in how participants make confidence judgments (decision-making per se), but also how they adapt their decisions with additional information (changes or learning in decision making). In the revised manuscript we have made clear that these are used as simple behavioural measures and that they will be complemented later by more analyses derived from more formal computational models.

      In what we now refer to as the ‘interval setting’ analysis (Figure 2a), we tested whether participants select their interval settings differently in the social compared to non-social condition. We observe that participants set their intervals closer to the true angular error of the advisor/predictor in the social compared to the non-social condition. This observation could arise in two ways. First, it could be due to quantitative differences in learning despite general, qualitative similarity: mechanisms are similar but participants differ quantitatively in the way that they learn about non-social information and social information. Second, it could, however, reflect fundamentally different strategies. We tested basic performance differences by comparing the mean reward between conditions. There was no difference in reward between conditions (mean reward: paired t-test social vs. non-social, t(23)= 0.8, p=0.4, 95% CI= [-0.007 0.016]), suggesting that interval setting differences might not simply reflect better or worse performance in social or non-social contexts but instead might reflect quantitative differences in the processes guiding interval setting in the two cases.
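
      The paired comparisons reported here, such as t(23), are consistent with a paired t-test over 24 participants, where each participant contributes one social and one non-social value. A minimal sketch of the statistic is given below; the function name and toy data are hypothetical, and the p value and confidence interval would additionally require the t distribution (for example from a statistics library).

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(social: Sequence[float], non_social: Sequence[float]) -> Tuple[float, int]:
    """Return the paired t statistic and its degrees of freedom (n - 1)."""
    if len(social) != len(non_social) or len(social) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [s - n for s, n in zip(social, non_social)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    se = sd / math.sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1


# Hypothetical toy data for four participants.
t_stat, df = paired_t([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 1.0, 3.0])
```

      The test operates on within-participant differences, so between-participant variability in overall reward or interval size does not inflate the error term.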

      In the next set of analyses, in which we compared raw data, applied a computational model, and provided a theoretical account for the differences between conditions, we suggest that there are simple quantitative differences in how information is processed in social and nonsocial conditions but that these have the important impact of making long-term representations – representations built up over a longer series of trials – more important in the social condition. This, in turn, has implications for the neural activity patterns associated with social and non-social learning. We, therefore, agree with the reviewer, that one manner of interval setting is indeed not more optimal than another. However, the differences that do exist in behaviour are important because they reveal something about the social and non-social learning and its neural substrates. We have adjusted the wording and interpretation in the revised manuscript.

      Next, we analysed interval setting with two additional, related analyses: interval setting adjustment across trials and derivation of a learning index. We tested the degree to which participants adjusted their interval setting across trials and according to the prediction error (learning index, Figure f); the latter analysis is very similar to a trial-wise learning rate calculated in previous studies 11. In contrast to many other studies, the intervals set by participants provide information about the estimates that they hold in a simple and direct way and enable calculation of a trial-wise learning index; therefore, we decided to call it ‘learning index’ instead of ‘learning rate’ as it is not estimated via a model applied to the data, but instead directly calculated from the data. Arguably the directness of the approach, and its lack of dependence on a specific computational model, is a strength of the analysis.
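
      The exact formula for the learning index is not given in this response. One plausible form, by analogy with a trial-wise learning rate, divides the change in interval setting from one trial to the next by the prediction error on the current trial. The sketch below is that assumption only, with the prediction error taken as the observed angular error minus the current interval setting; the names and data are hypothetical.

```python
from typing import List, Optional, Sequence


def learning_index(intervals: Sequence[float],
                   angular_errors: Sequence[float]) -> List[Optional[float]]:
    """Trial-wise (interval change) / (prediction error); None when the error is 0."""
    if len(intervals) != len(angular_errors):
        raise ValueError("intervals and angular_errors must have equal length")
    index: List[Optional[float]] = []
    for t in range(len(intervals) - 1):
        # Assumed prediction error: observed error minus the interval that was set.
        prediction_error = angular_errors[t] - intervals[t]
        if prediction_error == 0:
            index.append(None)  # the ratio is undefined without a prediction error
        else:
            index.append((intervals[t + 1] - intervals[t]) / prediction_error)
    return index


# Hypothetical trials: the participant moves halfway toward each observed error.
toy_index = learning_index([10.0, 14.0, 13.0], [18.0, 12.0, 13.0])
```

      Under this reading, a value near 1 means the interval is reset fully to the last observed error and a value near 0 means the interval is left unchanged, which is why no fitted model is needed to compute it.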

      Subsequently in the manuscript, a new analysis (illustrated in new Figure 3) employs Bayesian models that can simulate the differences in the social and non-social conditions and demonstrate that a number of behavioural observations can arise simply as a result of differences in noise in each trial-wise Bayesian update (Figure 3 and specifically 3d; Figure 3 – figure supplement 1b-c). In summary, the descriptive analyses in Figure 2a-c aid an intuitive understanding of the differences in behaviour in the social and non-social conditions. We have then repeated these analyses with Bayesian models incorporating different noise levels and showed that in such a way, the differences in behaviour between social and non-social conditions can be mimicked (please see next section and manuscript for details).

      We adjusted the wording in a number of sections in the revised manuscript such as in the legend of Figure 2 (figures and legend), Figure 4 (figures and legend).

      Main text, page 5:

      The confidence interval could be changed continuously to make it wider or narrower, by pressing buttons repeatedly (one button press resulted in a change of one step in the confidence interval). In this way participants provided what we refer to as an ’interval setting’.

      We also adjusted the following section in Main text, page 6:

      Confidence in the performance of social and non-social advisors

      We compared trial-by-trial interval setting in relation to the social and non-social advisors/predictors. When setting the interval, the participant’s aim was to minimize it while ensuring it still encompassed the final target position; points were won when it encompassed the target position but were greater when it was narrower. A given participant’s interval setting should, therefore, change in proportion to the participant’s expectations about the predictor’s angular error and their uncertainty about those expectations. Even though, on average, social and non-social sources did not differ in the precision with which they predicted the target (Figure 2 – figure supplement 1), participants gave interval settings that differed in their relationships to the true performances of the social advisors compared to the non-social predictors. The interval setting was closer to the angular error in the social compared to the non-social sessions (Figure 2a, paired t-test: social vs. non-social, t(23)= -2.57, p= 0.017, 95% confidence interval (CI)= [-0.36 -0.4]). Differences in interval setting might be due to generally lower performance in the nonsocial compared to social condition, or potentially due to fundamentally different learning processes utilised in either condition. We compared the mean reward amounts obtained by participants in the social and non-social conditions to determine whether there were overall performance differences. There was, however, no difference in the reward received by participants in the two conditions (mean reward: paired t-test social vs. non-social, t(23)= 0.8, p=0.4, 95% CI= [-0.007 0.016]), suggesting that interval setting differences might not simply reflect better or worse performance

      Discussion, page 14:

      Here, participants did not match their confidence to the likely accuracy of their own performance, but instead to the performance of another social or non-social advisor. Participants used different strategies when setting intervals to express their confidence in the performances of social advisors as opposed to non-social advisors. A possible explanation might be that participants have a better insight into the abilities of social cues – typically other agents – than non-social cues – typically inanimate objects.

      As the authors assumed simple Bayesian learning for the estimation of reliability in this study, the degree/speed of the learning should be examined with reference to the distance between the posterior and prior belief in the optimal Bayesian inference.

      We thank the reviewer for this suggestion. We agree with the reviewer that further analyses that aim to disentangle the underlying mechanisms that might differ between both social and non-social conditions might provide additional theoretical contributions. We show additional model simulations and analyses that aim to disentangle the differences in more detail. These new results allowed clearer interpretations to be made.

      In the current study, we showed that judgments made about non-social predictors were changed more strongly as a function of the subjective uncertainty: participants set a larger interval, indicating lower confidence, when they were more uncertain about the non-social cue’s accuracy to predict the target. In response to the reviewer’s comments, the new analyses were aimed at understanding under which conditions such a negative uncertainty effect might emerge.

      Prior expectations of performance

      First, we compared whether participants had different prior expectations in the social condition compared to the non-social condition. One way to compare prior expectations is by comparing the first interval set for each advisor/predictor. This is a direct readout of the initial prior expectation with which participants approach our two conditions. In such a way, we test whether the prior beliefs before observing any social or non-social information differ between conditions. Even though this does not test the impact of prior expectations on subsequent belief updates, it does test whether participants have generally different expectations about the performance of social advisors or non-social predictors. There was no difference in this measure between social or non-social cues (Figure below; paired t-test social vs. non-social, t(23)= 0.01, p=0.98, 95% CI= [-0.067 0.68]).

      Figure. Confidence interval for the first encounter of each predictor in social and non-social conditions. There was no initial bias in predicting the performance of social or non-social predictors.

      Learning across time

      We have now seen that participants do not have an initial bias when predicting performances in social or non-social conditions. This suggests that differences between conditions might emerge across time when encountering predictors multiple times. We tested whether inherent differences in how beliefs are updated according to new observations might result in different impacts of uncertainty on interval setting between social and non-social conditions. More specifically, we tested whether the integration of new evidence differed between social and non-social conditions; for example, recent observations might be weighted more strongly for non-social cues while past observations might be weighted more strongly for social cues. This approach was inspired by the reviewer’s comments about potential differences in the speed of learning as well as the reduction of uncertainty with increasing predictor encounters. Similar ideas were tested in previous studies, when comparing the learning rate (i.e. the speed of learning) in environments of different volatilities 12,13. In these studies, a smaller learning rate was prevalent in stable environments during which reward rates change slower over time, while higher learning rates often reflect learning in volatile environments so that recent observations have a stronger impact on behaviour. Even though most studies derived these learning rates with reinforcement learning models, similar ideas can be translated into a Bayesian model. For example, an established way of changing the speed of learning in a Bayesian model is to introduce noise during the update process 14. This noise is equivalent to adding in some of the initial prior distribution and this will make the Bayesian updates more flexible to adapt to changing environments. It will widen the belief distribution and thereby make it more uncertain.

      Recent information has more weight on the belief update within a Bayesian model when beliefs are uncertain. This increases the speed of learning. In other words, a wide distribution (after adding noise) allows for quick integration of new information. On the contrary, a narrow distribution does not integrate new observations as strongly and instead relies more heavily on previous information; this corresponds to a small learning rate. So, we would expect a steep decline of uncertainty to be related to a smaller learning index while a slower decline of uncertainty is related to a larger learning index. We hypothesized that participants reduce their uncertainty quicker when observing social information, thereby anchoring more strongly on previous beliefs instead of integrating new observations flexibly. Vice versa, we hypothesized a less steep decline of uncertainty when observing non-social information, indicating that new information can be flexibly integrated during the belief update (new Figure 3a).

      We modified the original Bayesian model (Figure 2d, Figure 2 – figure supplement 2) by adding a uniform distribution (equivalent to our prior distribution) to each belief update – we refer to this as noise addition to the Bayesian model 14,21. We varied the amount of noise between δ = [0,1], where δ = 0 equals the original Bayesian model and δ = 1 represents a very noisy Bayesian model. The uniform distribution was selected to match the first prior belief before any observation was made (equation 2). This δ range resulted in a continuous increase of subjective uncertainty around the belief about the angular error (Figure 3b-c). The modified posterior distribution denoted as 𝑝′(σ x) was derived at each trial as follows:

      We applied each noisy Bayesian model to participants’ choices within the social and non-social condition.

      The addition of a uniform distribution changed two key features of the belief distribution: first, the width of the distribution remains larger with additional observations, thereby making it possible to integrate new observations more flexibly. To show this more clearly, we extracted the model-derived uncertainty estimate across multiple encounters of the same predictor for the original model and the fully noisy Bayesian model (Figure 3 – figure supplement 1). The model-derived ‘uncertainty estimate’ of a noisy Bayesian model decays more slowly compared to the ‘uncertainty estimate’ of the original Bayesian model (upper panel). Second, the model-derived ‘accuracy estimate’ reflects more recent observations in a noisy Bayesian model compared to the ‘accuracy estimate’ derived from the original Bayesian model, which integrates past observations more strongly (lower panel). Hence, as mentioned beforehand, a rapid decay of uncertainty implies a small learning index; or in other words, stronger integration of past compared to recent observations.
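      The first of these two features can be illustrated with a minimal sketch. It is not the authors' implementation: the grid over σ, the zero-mean Gaussian likelihood, the use of the belief distribution's standard deviation as the uncertainty measure, and all function names are assumptions made for illustration. The only element taken from the text is that a uniform distribution, weighted by δ, is mixed into each belief update.

```python
import math

def make_grid(lo=0.05, hi=3.0, n=60):
    """Candidate values of sigma (angular error, radians). Hypothetical grid."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def noisy_update(belief, grid, observed_error, delta):
    """One belief update followed by mixing in a uniform distribution.

    Assumption: the observed angular error is Gaussian with mean 0 and
    standard deviation sigma. delta = 0 gives the plain Bayesian update;
    delta = 1 resets the belief to the uniform prior on every trial.
    """
    likelihood = [
        math.exp(-0.5 * (observed_error / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for s in grid
    ]
    posterior = normalize([b * l for b, l in zip(belief, likelihood)])
    uniform = 1.0 / len(grid)
    return [(1 - delta) * p + delta * uniform for p in posterior]

def belief_sd(belief, grid):
    """Width of the belief distribution, used here as a stand-in for uncertainty."""
    mean = sum(b * s for b, s in zip(belief, grid))
    var = sum(b * (s - mean) ** 2 for b, s in zip(belief, grid))
    return math.sqrt(var)

def run(observations, delta):
    """Return the belief width after each encounter with the same predictor."""
    grid = make_grid()
    belief = [1.0 / len(grid)] * len(grid)  # uniform first prior
    widths = []
    for x in observations:
        belief = noisy_update(belief, grid, x, delta)
        widths.append(belief_sd(belief, grid))
    return widths

observations = [0.5, 0.4, 0.6, 0.5, 0.45, 0.55]  # made-up angular errors
widths_original = run(observations, delta=0.0)
widths_noisy = run(observations, delta=0.5)
```

      Under these assumptions, the belief width of the noisy model stays larger after repeated encounters than that of the original model, which is the qualitative pattern described for the upper panel.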

      In the following analyses, we tested whether an increasingly noisy Bayesian model mimics behaviour that is observed in the non-social compared to social condition. For example, we tested whether an increasingly noisy Bayesian model also exhibits a strongly negative ‘predictor uncertainty’ effect on interval setting (Figure 2e). In such a way, we can test whether differences in noise in the updating process of a Bayesian model might reproduce important qualitative differences in learning-related behaviour seen in the social and non-social conditions.

      We used these modified Bayesian models to simulate trial-wise interval setting for each participant according to the observations they made when selecting a particular advisor or non-social cue. We simulated interval setting at each trial and examined whether an increase in noise produced model behaviours that resembled participant behaviour patterns observed in the non-social condition as opposed to social condition. At each trial, we used the accuracy estimate (Methods, equation 6) – which represents a subjective belief about a single angular error -- to derive an interval setting for the selected predictor. To do so, we first derived the point-estimate of the belief distribution at each trial (Methods, equation 6) and multiplied it with the size of one interval step on the circle. The step size was derived by dividing the circle size by the maximum number of possible steps. Here is an example of transforming an accuracy estimate into an interval: let’s assume the belief about the angular error at the current trial is 50 (Methods, equation 6). Now, we are trying to transform this number into an interval for the current predictor on a given trial. To obtain the size of one interval step, the circle size (360 degrees) is divided by the maximum number of interval steps (40 steps; note, 20 steps on each side), which results in nine degrees that represents the size of one interval step. Next, the accuracy estimate in radians (0.87) is multiplied by the step size in radians (0.1571) resulting in an interval of 0.137 radians or 7.85 degrees. The final interval size would be 7.85 degrees.
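      The worked example above can be reproduced in a few lines. This is a sketch of the arithmetic only; the function and constant names are hypothetical, and it assumes the accuracy estimate of 50 is expressed in degrees, as the conversion to 0.87 radians implies.

```python
import math

CIRCLE_DEGREES = 360.0
MAX_INTERVAL_STEPS = 40  # 20 steps on each side, as stated in the text

def interval_from_accuracy_estimate(estimate_degrees):
    """Convert an accuracy estimate (degrees) into an interval size.

    Returns the interval in radians and in degrees.
    """
    # Size of one interval step: 360 / 40 = 9 degrees = 0.1571 rad
    step_rad = math.radians(CIRCLE_DEGREES / MAX_INTERVAL_STEPS)
    # Accuracy estimate in radians: 50 degrees = 0.87 rad
    estimate_rad = math.radians(estimate_degrees)
    # Interval = estimate (rad) multiplied by step size (rad)
    interval_rad = estimate_rad * step_rad
    return interval_rad, math.degrees(interval_rad)

interval_rad, interval_deg = interval_from_accuracy_estimate(50)
print(round(interval_rad, 3), round(interval_deg, 2))
```

      With unrounded intermediate values the product is about 0.137 radians, or about 7.85 degrees, matching the figures quoted in the example.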

      Simulating Bayesian choices in that way, we repeated the behavioural analyses (Figure 2b,e,f) to test whether intervals derived from more noisy Bayesian models mimic intervals set by participants in the non-social condition: greater changes in interval setting across trials (Figure 3 – figure supplement 1b), a negative ‘predictor uncertainty' effect on interval setting (Figure 3 – figure supplement 1c), and a higher learning index (Figure 3d).

      First, we repeated the most crucial analysis -- the linear regression analysis (Figure 2e) and hypothesized that intervals that were simulated from noisy Bayesian models would also show a greater negative ‘predictor uncertainty’ effect on interval setting. This was indeed the case: irrespective of social or non-social conditions, the addition of noise (increased weighting of the uniform distribution in each belief update) led to an increasingly negative ‘predictor uncertainty’ effect on confidence judgment (new Figure 3d). In Figure 3d, we show the regression weights (y-axis) for the ‘predictor uncertainty’ on confidence judgment with increasing noise (x-axis). This result is highly consistent with the idea that in the non-social condition the manner in which task estimates are updated is more uncertain and more noisy. By contrast, social estimates appear relatively more stable, also according to this new Bayesian simulation analysis.

      This new finding extends the results and suggests a formal computational account of the behavioural differences between social and non-social conditions. Increasing the noise of the belief update mimics behaviour that is observed in the non-social condition: an increasingly negative effect of ‘predictor uncertainty’ on confidence judgment. Notably, there was no difference in the impact that the noise had in the social and non-social conditions. This was expected because the Bayesian simulations are blind to the framing of the conditions. However, it means that the observed effects do not depend on the precise sequence of choices that participants made in these conditions. It therefore suggests that an increase in the Bayesian noise leads to an increasingly negative impact of ‘predictor uncertainty’ on confidence judgments irrespective of the condition. Hence, we can conclude that different degrees of uncertainty within the belief update are a reasonable explanation that can underlie the differences observed between social and non-social conditions.

      Next, we used these simulated confidence intervals and repeated the descriptive behavioural analyses to test whether interval settings that were derived from more noisy Bayesian models mimic behavioural patterns observed in non-social compared to social conditions. For example, more noise in the belief update should lead to more flexible integration of new information and hence should potentially lead to a greater change of confidence judgments across predictor encounters (Figure 2b). Further, a greater reliance on recent information should lead prediction errors to be reflected more strongly in the next confidence judgment; hence, it should result in a higher learning index in the non-social condition that we hypothesize to be perceived as more uncertain (Figure 2f). We used the simulated confidence interval from Bayesian models on a continuum of noise integration (i.e. different weighting of the uniform distribution into the belief update) and derived again both absolute confidence change and learning indices (Figure 3 – figure supplement 1b-c).

      ‘Absolute confidence change’ and ‘learning index’ increase with increasing noise weight, thereby mimicking the difference between social and non-social conditions. Further, these analyses demonstrate the tight relationship between descriptive analyses and model-based analyses. They show that noise in the Bayesian updating process is a conceptual explanation that can account for both the differences in learning and the difference in uncertainty processing that exist between social and non-social conditions. The key insight conveyed by the Bayesian simulations is that a wider, more uncertain belief distribution changes more quickly. Correspondingly, in the non-social condition, participants express more uncertainty in their confidence estimate when they set the interval, and they also change their beliefs more quickly as expressed in a higher learning index. Therefore, noisy Bayesian updating can account for key differences between social and non-social conditions.

      We thank the reviewer for making this point, as we believe that these additional analyses allow theoretical inferences to be made in a more direct manner; we think that it has significantly contributed towards a deeper understanding of the mechanisms involved in the social and non-social conditions. Further, it provides a novel account of how we make judgments when being presented with social and non-social information.

      We made substantial changes to the main text, figures and supplementary material to include these changes:

      Main text, page 10-11 new section:

      The impact of noise in belief updating in social and non-social conditions

      So far, we have shown that, in comparison to non-social predictors, participants changed their interval settings about social advisors less drastically across time, relied on observations made further in the past, and were less impacted by their subjective uncertainty when they did so (Figure 2). Using Bayesian simulation analyses, we investigated whether a common mechanism might underlie these behavioural differences. We tested whether the integration of new evidence differed between social and non-social conditions; for example, recent observations might be weighted more strongly for non-social cues while past observations might be weighted more strongly for social cues. Similar ideas were tested in previous studies, when comparing the learning rate (i.e. the speed of learning) in environments of different volatilities12,13. We tested these ideas using established ways of changing the speed of learning during Bayesian updates14,21. We hypothesized that participants reduce their uncertainty quicker when observing social information. Vice versa, we hypothesized a less steep decline of uncertainty when observing non-social information, indicating that new information can be flexibly integrated during the belief update (Figure 5a).

      We manipulated the amount of uncertainty in the Bayesian model by adding a uniform distribution to each belief update (Figure 3b-c) (equation 10,11). Consequently, the distribution’s width increases and is more strongly impacted by recent observations (see example in Figure 3 – figure supplement 1). We used these modified Bayesian models to simulate trial-wise interval setting for each participant according to the observations they made by selecting a particular advisor in the social condition or other predictor in the non-social condition. We simulated confidence intervals at each trial. We then used these to examine whether an increase in noise led to simulation behaviour that resembled behavioural patterns observed in non-social conditions that were different to behavioural patterns observed in the social condition.

      First, we repeated the linear regression analysis and hypothesized that interval settings that were simulated from noisy Bayesian models would also show a greater negative ‘predictor uncertainty’ effect on interval setting resembling the effect we had observed in the non-social condition (Figure 2e). This was indeed the case when using the noisy Bayesian model: irrespective of social or non-social condition, the addition of noise (increasing weight of the uniform distribution to each belief update) led to an increasingly negative ‘predictor uncertainty’ effect on confidence judgment (new Figure 3d). The absence of a difference between the social and non-social conditions in the simulations suggests that an increase in the Bayesian noise is sufficient to induce a negative impact of ‘predictor uncertainty’ on interval setting. Hence, we can conclude that different degrees of noise in the updating process are sufficient to cause differences observed between social and non-social conditions. Next, we used these simulated interval settings and repeated the descriptive behavioural analyses (Figure 2b,f). An increase in noise led to greater changes of confidence across time and a higher learning index (Figure 3 – figure supplement 1b-c). In summary, the Bayesian simulations offer a conceptual explanation that can account for both the differences in learning and the difference in uncertainty processing that exist between social and non-social conditions. The key insight conveyed by the Bayesian simulations is that a wider, more uncertain belief distribution changes more quickly. Correspondingly, in the non-social condition, participants express more uncertainty in their confidence estimate when they set the interval, and they also change their beliefs more quickly. Therefore, noisy Bayesian updating can account for key differences between social and non-social conditions.

      Methods, page 23 new section:

      Extension of Bayesian model with varying amounts of noise

      We modified the original Bayesian model (Figure 2d, Figure 2 – figure supplement 2) to test whether the integration of new evidence differed between social and non-social conditions; for example, recent observations might be weighted more strongly for non-social cues while past observations might be weighted more strongly for social cues. [...] To obtain the size of one interval step, the circle size (360 degrees) is divided by the maximum number of interval steps (40 steps; note, 20 steps on each side), which results in nine degrees that represents the size of one interval step. Next, the accuracy estimate in radians (0.87) is multiplied by the step size in radians (0.1571) resulting in an interval of 0.137 radians or 7.85 degrees. The final interval size would be 7.85 degrees.

      We repeated behavioural analyses (Figure 2b,e,f) to test whether confidence intervals derived from more noisy Bayesian models mimic behavioural patterns observed in the non-social condition: greater changes of confidence across trials (Figure 3 – figure supplement 1b), a greater negative ‘predictor uncertainty' on confidence judgment (Figure 3 – figure supplement 1c) and a greater learning index (Figure 3d).

      Discussion, page 14: […] It may be because we make just such assumptions that past observations are used to predict performance levels that people are likely to exhibit next 15,16. An alternative explanation might be that participants experience a steeper decline of subjective uncertainty in their beliefs about the accuracy of social advice, resulting in a narrower prior distribution, during the next encounter with the same advisor. We used a series of simulations to investigate how uncertainty about beliefs changed from trial to trial and showed that belief updates about non-social cues were consistent with a noisier update process that diminished the impact of experiences over the longer term. From a Bayesian perspective, greater certainty about the value of advice means that contradictory evidence will need to be stronger to alter one’s beliefs. In the absence of such evidence, a Bayesian agent is more likely to repeat previous judgments. Just as in a confirmation bias 17, such a perspective suggests that once we are more certain about others’ features, for example, their character traits, we are less likely to change our opinions about them.

      Reviewer #2 (Public Review):

      Humans learn about the world both directly, by interacting with it, and indirectly, by gathering information from others. There has been a longstanding debate about the extent to which social learning relies on specialized mechanisms that are distinct from those that support learning through direct interaction with the environment. In this work, the authors approach this question using an elegant within-subjects design that enables direct comparisons between how participants use information from social and non-social sources. Although the information presented in both conditions had the same underlying structure, participants tracked the performance of the social cue more accurately and changed their estimates less as a function of prediction error. Further, univariate activity in two regions, dmPFC and pTPJ, tracked participants' confidence judgments more closely in the social than in the non-social condition, and multivariate patterns of activation in these regions contained information about the identity of the social cues.

      Overall, the experimental approach and model used in this paper are very promising. However, after reading the paper, I found myself wanting additional insight into what these condition differences mean, and how to place this work in the context of prior literature on this debate. In addition, some additional analyses would be useful to support the key claims of the paper.

      We thank the reviewer for their very supportive comments. We have addressed their points below and have highlighted changes in our manuscript that we made in response to the reviewer’s comments.

      (1) The framing should be reworked to place this work in the context of prior computational work on social learning. Some potentially relevant examples:

      • Shafto, Goodman & Frank (2012) provide a computational account of the domain-specific inductive biases that support social learning. In brief, what makes social learning special is that we have an intuitive theory of how other people's unobservable mental states lead to their observable actions, and we use this intuitive theory to actively interpret social information. (There is also a wealth of behavioral evidence in children to support this account; for a review, see Gweon, 2021).

      • Heyes (2012) provides a leaner account, arguing that social and non-social learning are supported by a common associative learning mechanism, and what distinguishes social from non-social learning is the input mechanism. Social learning becomes distinctively "social" to the extent that organisms are biased or attuned to social information.

      I highlight these papers because they go a step beyond asking whether there is any difference between mechanisms that support social and non-social learning; they also provide concrete proposals about what that difference might be, and what might be shared. I would like to see this work move in a similar direction.

      References

      (In the interest of transparency: I am not an author on these papers.)

      Gweon, H. (2021). Inferential social learning: how humans learn from others and help others learn. PsyArXiv. https://doi.org/10.31234/osf.io/8n34t

      Heyes, C. (2012). What's social about social learning?. Journal of Comparative Psychology, 126(2), 193.

      Shafto, P., Goodman, N. D., & Frank, M. C. (2012). Learning from others: The consequences of psychological reasoning for human learning. Perspectives on Psychological Science, 7(4), 341-351.

      Thank you for this suggestion to expand our framing. We have now made substantial changes to the Discussion and Introduction to include additional background literature, including the relevant references suggested by the reviewer, that addresses the differences between social and non-social learning. We further related our findings to other discussions in the literature that argue that differences between social and non-social learning might occur at the level of algorithms (the computations involved in social and non-social learning) and/or implementation (the neural mechanisms). Here, we describe behaviour with the same algorithm (Bayesian model), but the weighting of uncertainty on decision-making differs between social and non-social contexts. This might be explained by similar ideas put forward by Shafto and colleagues (2012), who suggest that differences between social and non-social learning might be due to the attribution of goal-directed intention to social agents, but not non-social cues. Such an attribution might lead participants to assume that advisor performances will be relatively stable under the assumption that they should have relatively stable goal-directed intentions. We also show differences at the implementational level in social and non-social learning in TPJ and dmPFC.

      Below we list the changes we have made to the Introduction and Discussion. Further, we would also like to emphasize the substantial extension of the Bayesian modelling which we think clarifies the theoretical framework used to explain the mechanisms involved in social and non-social learning (see our answer to the next comments below).

      Introduction, page 4:

      [...]

      Therefore, by comparing information sampling from social versus non-social sources, we address a long-standing question in cognitive neuroscience, the degree to which any neural process is specialized for, or particularly linked to, social as opposed to non-social cognition 2–9. Given their similarities, it is expected that both types of learning will depend on common neural mechanisms. However, given the importance and ubiquity of social learning, it may also be that the neural mechanisms that support learning from social advice are at least partially specialized and distinct from those concerned with learning that is guided by non-social sources.

      However, it is less clear on which level information is processed differently when it has a social or non-social origin. It has recently been argued that differences between social and non-social learning can be investigated on different levels of Marr’s information processing theory: differences could emerge at an input level (in terms of the stimuli that might drive social and non-social learning), at an algorithmic level or at a neural implementation level 7. It might be that, at the algorithmic level, associative learning mechanisms are similar across social and non-social learning 1. Other theories have argued that differences might emerge because goal-directed actions are attributed to social agents which allows for very different inferences to be made about hidden traits or beliefs 10. Such inferences might fundamentally alter learning about social agents compared to non-social cues.

      Discussion, page 15:

      […] One potential explanation for the assumption of stable performance for social but not non-social predictors might be that participants attribute intentions and motivations to social agents. Even if the social and non-social evidence are the same, the belief that a social actor might have a goal may affect the inferences made from the same piece of information 10. Social advisors first learnt about the target’s distribution and accordingly gave advice on where to find the target. If the social agents are credited with goal-directed behaviour then it might be assumed that the goals remain relatively constant; this might lead participants to assume stability in the performances of social advisors. However, such goal-directed intentions might not be attributed to non-social cues, thereby making judgments inherently more uncertain and changeable across time. Such an account, focussing on differences in attribution in social settings, aligns with a recent suggestion that any attempt to identify similarities or differences between social and non-social processes can occur at any one of a number of the levels in Marr’s information theory 7. Here we found that the same algorithm was able to explain social and non-social learning (a qualitatively similar computational model could explain both). However, the extent to which the algorithm was recruited when learning about social compared to non-social information differed. We observed a greater impact of uncertainty on judgments about non-social compared to social information. We have shown evidence for a degree of specialization when assessing social advisors as opposed to non-social cues. At the neural level we focused on two brain areas, dmPFC and pTPJ, that have not only been shown to carry signals associated with belief inferences about others but, in addition, recent combined fMRI-TMS studies have demonstrated the causal importance of these activity patterns for the inference process […]

      (2) The results imply that dmPFC and pTPJ differentiate between learning from social and non-social sources. However, more work needs to be done to rule out simpler, deflationary accounts. In particular, the condition differences observed in dmPFC and pTPJ might reflect low-level differences between the two conditions. For example, the social task could simply have been more engaging to participants, or the social predictors may have been more visually distinct from one another than the fruits.

      We understand the reviewer’s concern regarding low-level distinctions between the social and non-social condition that could confound the differences in neural activation that are observed between conditions in areas pTPJ and dmPFC. From the reviewer’s comments, we understand that there might be two potential confounders: first, low-level differences such that stimuli within one condition might be more distinct from each other compared to the relative distinctiveness between stimuli within the other condition. Therefore, simply the greater visual distinctiveness of stimuli in one condition than another might lead to learning differences between conditions. Second, stimuli in one condition might be more engaging and potentially lead to attentional differences between conditions. We used a combination of univariate analyses and multivariate analyses to address both concerns.

      Analysis 1: Univariate analysis to inspect potential unaccounted variance between social and non-social condition

      First, we used the existing univariate analysis (exploratory MRI whole-brain analysis, see Methods) to test for neural activation that covaried with attentional differences – or any other unaccounted neural difference -- between conditions. If there were neural differences between conditions that we are currently not accounting for with the parametric regressors that are included in the fMRI-GLM, then these differences should be captured in the constant of the GLM model. For example, if there are attentional differences between conditions, then we could expect to see neural differences between conditions in areas such as inferior parietal lobe (or other related areas that are commonly engaged during attentional processes).

      Importantly, the constant of the GLM should capture any unaccounted differences, whether they are due to attention or to alternative processes that might differ between conditions. When inspecting cluster-corrected differences in the constant of the fMRI-GLM during the setting of the confidence judgment, we found no cluster-significant activation that differed between social and non-social conditions (Figure 4 – figure supplement 4a; results were familywise-error cluster-corrected at p<0.05 using a cluster-defining threshold of z>2.3). For transparency, we show the sub-threshold activation map across the whole brain (z > 2) for the ‘constant’ contrasted between social and non-social conditions (i.e. constant, contrast: social – non-social).

      We additionally used an ROI approach to test for differences in activation that correlated with the constant during the confidence phase. To avoid any biased test selection, we used the same ROI approach as in the paper. We compared activation between social and non-social conditions in the same ROIs as before: dmPFC (MNI-coordinate [x/y/z: 2,44,36] 16) and bilateral pTPJ (70% probability anatomical mask; for reference see manuscript, page 23). We additionally compared activation between conditions in bilateral IPLD (50% probability anatomical mask, 20). We did not find significantly different activation between social and non-social conditions in any of these areas: dmPFC (confidence constant; paired t-test social vs non-social: t(23) = 0.06, p=0.96, [-36.7, 38.75]), bilateral pTPJ (confidence constant; paired t-test social vs non-social: t(23) = -0.06, p=0.95, [-31, 29]), bilateral IPLD (confidence constant; paired t-test social vs non-social: t(23) = -0.58, p=0.57, [-30.3, 17.1]).
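
      The paired comparison used for each ROI can be illustrated with a minimal sketch. The per-participant values are made up, the function name `paired_t` is hypothetical, and the critical value must be supplied by the caller (about 2.069 for the 23 degrees of freedom reported here); this is an illustration of the test, not the analysis code used in the study.

```python
import math
import statistics

def paired_t(social, non_social, t_crit):
    """Paired t-test on per-participant ROI estimates.

    Returns (t, df, ci_low, ci_high). `t_crit` is the two-sided critical
    value of the t distribution for df = n - 1 (about 2.069 for df = 23).
    """
    if len(social) != len(non_social):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(social, non_social)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean within-participant difference.
    sem = statistics.stdev(diffs) / math.sqrt(n)
    t = mean_diff / sem
    return t, n - 1, mean_diff - t_crit * sem, mean_diff + t_crit * sem

# Hypothetical constant-regressor estimates for five participants.
social = [10.0, 12.0, 9.0, 11.0, 13.0]
non_social = [9.0, 13.0, 10.0, 10.0, 13.0]
t, df, ci_low, ci_high = paired_t(social, non_social, t_crit=2.776)  # df = 4
print(f"t({df}) = {t:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

      A confidence interval that spans zero, as in the intervals reported for the three ROIs, indicates no reliable difference between conditions at that threshold.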

      No meaningful activation differed between conditions, either in areas commonly linked to attention (e.g. IPL) or in the brain areas that were the focus of the study (dmPFC and pTPJ). Activation in dmPFC and pTPJ covaried with parametric effects such as the confidence that was set at the current and previous trial, and did not correlate with low-level differences such as attention. Hence, these results suggest that activation differences between conditions were better captured by parametric regressors such as the trial-wise interval setting, i.e. confidence, and are unlikely to be confounded by low-level processes that can be captured with univariate neural analyses.

      Analysis 2: RSA to test visual distinctiveness between social and non-social conditions

      We addressed the reviewer’s other concern directly by testing whether differences between conditions might arise from a different degree of visual distinctiveness in one stimulus set compared to the other. We used representational similarity analysis (RSA) to inspect potential differences in early visual processing, which should be affected by greater stimulus similarity within one condition. Specifically, we compared the Exemplar Discriminability Index (EDI) between conditions in early visual areas. We compared the dissimilarity of neural activation related to the presentation of an identical stimulus across trials (diagonal in the RSA matrix) with the dissimilarity in neural activation between different stimuli across trials (off-diagonal in the RSA matrix). If stimuli within one stimulus set are very similar, then the difference between the diagonal and off-diagonal should be very small and less likely to be significant (i.e. similar diagonal and off-diagonal values). In contrast, if stimuli within one set are very distinct from each other, then the difference between the diagonal and off-diagonal should be large and likely to result in a significant EDI (i.e. different diagonal and off-diagonal values) (see Figure 4g for a schematic illustration). Hence, a difference in visual distinctiveness between the social and non-social stimulus sets should result in different EDI values for the two conditions, and can be tested by comparing the EDI values between conditions in early visual cortex. We used a Harvard-cortical ROI mask based on bilateral V1. Negative EDI values indicate that the same exemplars are represented more similarly in the neural V1 pattern than different exemplars. This analysis showed that there was no significant difference in EDI between conditions (Figure 4 – figure supplement 4b; EDI paired sample t-test: t(23) = -0.16, p=0.87, 95% CI [-6.7 5.7]).
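
      The EDI logic can be sketched in a few lines. This is a minimal illustration under stated assumptions: the matrices are invented, the function name is hypothetical, each entry is taken to be a dissimilarity (for example 1 minus a correlation) between patterns from two independent sets of trials, and the sign convention follows the statement above that negative values mean same-exemplar patterns are more similar.

```python
def exemplar_discriminability_index(rdm):
    """EDI for a square dissimilarity matrix computed across trials.

    rdm[i][j] is the dissimilarity between the pattern evoked by stimulus i
    in one set of trials and by stimulus j in another set of trials.
    EDI = mean(diagonal) - mean(off-diagonal), so a negative value means
    same-exemplar patterns are more similar than different-exemplar ones.
    """
    n = len(rdm)
    if n < 2 or any(len(row) != n for row in rdm):
        raise ValueError("rdm must be square with at least two stimuli")
    diagonal = [rdm[i][i] for i in range(n)]
    off_diagonal = [rdm[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(diagonal) / n - sum(off_diagonal) / len(off_diagonal)

# Hypothetical 3 x 3 matrices for a visually distinct and a visually similar set.
distinct_set = [[0.2, 0.9, 0.9], [0.9, 0.2, 0.9], [0.9, 0.9, 0.2]]
similar_set = [[0.5, 0.6, 0.6], [0.6, 0.5, 0.6], [0.6, 0.6, 0.5]]
edi_distinct = exemplar_discriminability_index(distinct_set)
edi_similar = exemplar_discriminability_index(similar_set)
print(edi_distinct, edi_similar)
```

      In this toy case the distinct set yields the more negative EDI. Computing one EDI per participant and condition and comparing the two with a paired test corresponds to the comparison reported above.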

      We have further replicated results in V1 with a whole-brain searchlight analysis, averaging across both social and non-social conditions.

      In summary, by using a combination of univariate and multivariate analyses, we could test whether neural activation might be different when participants were presented with facial or fruit stimuli and whether these differences might confound observed learning differences between conditions. We did not find meaningful neural differences that were not accounted for with the regressors included in the GLM. Further, we did not find differences in visual distinctiveness between the stimulus sets. Hence, these control analyses suggest that differences between social and non-social conditions are unlikely to arise from differences in low-level processes and are instead more likely to develop when learning about social or non-social information.

      Moreover, we also examined behaviourally whether participants differed in the way they approached the social and non-social conditions. We tested whether there were initial biases prior to learning, i.e. before actually receiving information from either social or non-social information sources. To this end, we tested whether participants had different prior expectations about the performance of social compared to non-social predictors by comparing the confidence judgments at the first trial of each predictor. We found that participants set confidence intervals very similarly in social and non-social conditions (Figure below). Hence, it did not seem to be the case that differences between conditions arose due to low-level differences in stimulus sets or prior differences in expectations about the performance of social compared to non-social predictors. However, we can show that differences between conditions are apparent when updating one’s belief about social advisors or non-social cues and, as a consequence, in the way that confidence judgments are set across time.

      Figure. Confidence interval for the first encounter of each predictor in social and non-social conditions. There was no initial bias in predicting the performance of social or non-social predictors.

      Main text page 13:

      […] Additional control analyses show that neural differences between social and non-social conditions were not due to the visually different set of stimuli used in the experiment but instead represent fundamental differences in processing social compared to non-social information (Figure 4 – figure supplement 4). These results are shown in ROI-based RSA analysis and in whole-brain searchlight analysis. In summary, the univariate and multivariate analyses together demonstrate that dmPFC and pTPJ represent beliefs about social advisors that develop over a longer timescale and encode the identities of the social advisors.

      References

      1. Heyes, C. (2012). What’s social about social learning? Journal of Comparative Psychology 126, 193–202. 10.1037/a0025180.
      2. Chang, S.W.C., and Dal Monte, O. (2018). Shining Light on Social Learning Circuits. Trends in Cognitive Sciences 22, 673–675. 10.1016/j.tics.2018.05.002.
      3. Diaconescu, A.O., Mathys, C., Weber, L.A.E., Kasper, L., Mauer, J., and Stephan, K.E. (2017). Hierarchical prediction errors in midbrain and septum during social learning. Soc Cogn Affect Neurosci 12, 618–634. 10.1093/scan/nsw171.
      4. Frith, C., and Frith, U. (2010). Learning from Others: Introduction to the Special Review Series on Social Neuroscience. Neuron 65, 739–743. 10.1016/j.neuron.2010.03.015.
      5. Frith, C.D., and Frith, U. (2012). Mechanisms of Social Cognition. Annu. Rev. Psychol. 63, 287–313. 10.1146/annurev-psych-120710-100449.
      6. Grabenhorst, F., and Schultz, W. (2021). Functions of primate amygdala neurons in economic decisions and social decision simulation. Behavioural Brain Research 409, 113318. 10.1016/j.bbr.2021.113318.
      7. Lockwood, P.L., Apps, M.A.J., and Chang, S.W.C. (2020). Is There a ‘Social’ Brain? Implementations and Algorithms. Trends in Cognitive Sciences, S1364661320301686. 10.1016/j.tics.2020.06.011.
      8. Soutschek, A., Ruff, C.C., Strombach, T., Kalenscher, T., and Tobler, P.N. (2016). Brain stimulation reveals crucial role of overcoming self-centeredness in self-control. Sci. Adv. 2, e1600992. 10.1126/sciadv.1600992.
      9. Wittmann, M.K., Lockwood, P.L., and Rushworth, M.F.S. (2018). Neural Mechanisms of Social Cognition in Primates. Annu. Rev. Neurosci. 41, 99–118. 10.1146/annurev-neuro-080317-061450.
      10. Shafto, P., Goodman, N.D., and Frank, M.C. (2012). Learning From Others: The Consequences of Psychological Reasoning for Human Learning. Perspect Psychol Sci 7, 341–351. 10.1177/1745691612448481.
      11. McGuire, J.T., Nassar, M.R., Gold, J.I., and Kable, J.W. (2014). Functionally Dissociable Influences on Learning Rate in a Dynamic Environment. Neuron 84, 870–881. 10.1016/j.neuron.2014.10.013.
      12. Behrens, T.E.J., Woolrich, M.W., Walton, M.E., and Rushworth, M.F.S. (2007). Learning the value of information in an uncertain world. Nature Neuroscience 10, 1214–1221. 10.1038/nn1954.
      13. Meder, D., Kolling, N., Verhagen, L., Wittmann, M.K., Scholl, J., Madsen, K.H., Hulme, O.J., Behrens, T.E.J., and Rushworth, M.F.S. (2017). Simultaneous representation of a spectrum of dynamically changing value estimates during decision making. Nat Commun 8, 1942. 10.1038/s41467-017-02169-w.
      14. Allenmark, F., Müller, H.J., and Shi, Z. (2018). Inter-trial effects in visual pop-out search: Factorial comparison of Bayesian updating models. PLoS Comput Biol 14, e1006328. 10.1371/journal.pcbi.1006328.
      15. Wittmann, M., Trudel, N., Trier, H.A., Klein-Flügge, M., Sel, A., Verhagen, L., and Rushworth, M.F.S. (2021). Causal manipulation of self-other mergence in the dorsomedial prefrontal cortex. Neuron.
      16. Wittmann, M.K., Kolling, N., Faber, N.S., Scholl, J., Nelissen, N., and Rushworth, M.F.S. (2016). Self-Other Mergence in the Frontal Cortex during Cooperation and Competition. Neuron 91, 482–493. 10.1016/j.neuron.2016.06.022.
      17. Kappes, A., Harvey, A.H., Lohrenz, T., Montague, P.R., and Sharot, T. (2020). Confirmation bias in the utilization of others’ opinion strength. Nat Neurosci 23, 130–137. 10.1038/s41593-019-0549-2.
      18. Trudel, N., Scholl, J., Klein-Flügge, M.C., Fouragnan, E., Tankelevitch, L., Wittmann, M.K., and Rushworth, M.F.S. (2021). Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex. Nat Hum Behav. 10.1038/s41562-020-0929-3.
      19. Yu, Z., Guindani, M., Grieco, S.F., Chen, L., Holmes, T.C., and Xu, X. (2022). Beyond t test and ANOVA: applications of mixed-effects models for more rigorous statistical analysis in neuroscience research. Neuron 110, 21–35. 10.1016/j.neuron.2021.10.030.
      20. Mars, R.B., Jbabdi, S., Sallet, J., O’Reilly, J.X., Croxson, P.L., Olivier, E., Noonan, M.P., Bergmann, C., Mitchell, A.S., Baxter, M.G., et al. (2011). Diffusion-Weighted Imaging Tractography-Based Parcellation of the Human Parietal Cortex and Comparison with Human and Macaque Resting-State Functional Connectivity. Journal of Neuroscience 31, 4087–4100. 10.1523/JNEUROSCI.5102-10.2011.
      21. Yu, A.J., and Cohen, J.D. Sequential effects: Superstition or rational behavior? 8.
      22. Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., and Kriegeskorte, N. (2014). A Toolbox for Representational Similarity Analysis. PLoS Comput Biol 10, e1003553. 10.1371/journal.pcbi.1003553.
      23. Lockwood, P.L., Wittmann, M.K., Nili, H., Matsumoto-Ryan, M., Abdurahman, A., Cutler, J., Husain, M., and Apps, M.A.J. (2022). Distinct neural representations for prosocial and self-benefiting effort. Current Biology 32, 4172-4185.e7. 10.1016/j.cub.2022.08.010.
    1. Author Response

      Reviewer #2 (Public Review):

      Reinforcement learning (RL) theory is important because it provides a broad, mathematically proven framework for linking behavioral states to behavioral actions, and has the potential for linking realistic biological network dynamics to behavior. The most detailed neurophysiological modeling uses biophysical compartmental models with the theoretical framework of Hodgkin-Huxley and Rall to describe the dynamics of real neurons, but those models are extremely difficult to link to behavioral output. RL provides a theoretical framework that could help bridge across the still-underexplored chasm between behavioral modeling and neurophysiological detail.

      On the positive side, this paper uses a network of interacting neurons in region CA3 and CA1 (as used in previous models by McNaughton and Morris, 1987; Hasselmo and Schnell, 1994; Treves and Rolls, 1994; Mehta, Quirk and Wilson, 2000; Hasselmo, Bodelon and Wyble, 2002) to address how a simple representation of biological network dynamics could generate the successor representation used in RL. The successor representation is an interesting theory of hippocampal function, as it contrasts with a previous idea of model-based planning. Previous neuroscience data supports the idea that animals use a model-based representation (a cognitive map made up of place cells or grid cells) to read out potential future paths to plan their behavior in the environment. For example, Johnson and Redish, 2007 showed activity spreading into alternating arms of a T-maze before a decision is made (i.e. a model-based exploration of possible actions, NOT a successor representation), and Pfeiffer and Foster, 2013 showed that replay in 2-dimensions corresponds to future goal directed activity. Models such as Erdem and Hasselmo, 2012 and Fenton and Kubie, 2012 showed how forward planning of possible trajectories could guide performance of behavioral tasks. In contrast, the successor representation proposes that model-based activity is too computationally expensive and that, instead of reading out various possible model-based future paths when making a decision, a simulated agent could learn a look-up table indicating the probability of future behavioral states accessible from a given state. In previous work, the successor representations accounted for certain aspects of experimental neuroscience data such as place cells responding to the insertion of barriers as seen by Alvernhe et al. and the backward expansion of place fields seen by Mehta et al. The current paper is admirable for addressing the potential role of neural replay in training of successor representations and its relationship to other neural and behavioral data such as the papers by Cheng and Frank 2008 and by Wu et al. 2017.

      However, a lot of this same data could still be interpreted as indicating that animals use a model-based representation as described above. There's nothing in this paper that rules out a model-based interpretation of the results discussed above. In fact, the cited paper by Momennejad et al. 2017 shows that humans extensively use model-based mechanisms along with some use of a successor representation in addition to the model-based mechanism. The description in the article under review needs to avoid treating successor representations as if they are already the ground truth.

      To do this, throughout the paper, the authors need to repeatedly address the fact that the Successor Representation is just a theory and not proven experimental fact. And they need to repeatedly in all sections point out that the successor representations hypothesis can be contrasted with the theory that model-based neural activity could instead guide behavior and could be the correct account for all of the data that they address (e.g. the dark-avoidance behavior). They should cite the previous examples of neural data that looks like model-based planning such as Johnson and Redish, 2007 in the T-maze and Pfeiffer and Foster, 2013 in open fields, and cite models such as Hasselmo and Eichenbaum, 2005; Erdem and Hasselmo, 2012 and Fenton and Kubie, 2012 that showed how forward replay or planning of possible trajectories could guide performance of behavioral tasks.

      We thank the reviewer for the valuable feedback. We have adapted the manuscript throughout to discuss the important point that the SR is not the ground truth (e.g. the final paragraphs in the sections “Bias-variance trade-off” and “Leveraging replays to learn novel trajectories”). We also discussed more extensively the model-based literature and the suggested citations in the manuscript.

      The title and text repeatedly refer to a "spiking" model. They show spikes in Figure 2 and extensively discuss the influence of spiking on STDP, but they ought to more explicitly discuss the interaction of their spike generation mechanisms (using a Poisson process) and the authors should compare their model to the model of George, DeCothi, Stachenfeld and Barry which addresses many of the same questions but using theta phase precession to obtain the correct spike timing in STDP.

      Yes, that's a great suggestion. We have extended our discussion section. In particular, we added:

      In our work, we did not include theta modulation, but phase precession and theta sequences could be yet another type of activity within the TD(λ) framework. Interestingly, more groups have recently investigated related ideas. A recent work (George et al., 2022) incorporated theta sweeps into behavioural activity, showing that it approximately learns the SR. Moreover, theta sequences allow for fast learning, playing a similar role as replays (or any other fast temporal-code sequences) in our work. By simulating the temporally compressed and precise theta sequences, their model also reconciles learning over behavioral timescales with STDP. In contrast, our framework reconciles both timescales relying purely on rate-coding during behaviour. Finally, their method allows the SR to be learned within continuous space. It would be interesting to investigate whether these methods co-exist in the hippocampus and other brain areas. Furthermore, Fang et al. (2022) recently showed how the SR can be learned using recurrent neural networks with biologically plausible plasticity.
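
      For readers less familiar with the TD(λ) framework referred to here, the following is a minimal sketch of the textbook tabular algorithm for learning a successor representation with eligibility traces. It is not the spiking model from the manuscript or the model of George et al.; the function name, parameter values and the circular-track example are assumptions chosen for illustration.

```python
def learn_sr_td_lambda(trajectory, n_states, gamma=0.9, lam=0.5, alpha=0.1):
    """Tabular successor representation learned with TD(lambda).

    M[s][j] estimates the expected discounted number of visits to state j
    when starting in state s (the current visit is counted).
    """
    M = [[0.0] * n_states for _ in range(n_states)]
    trace = [0.0] * n_states  # eligibility trace over recently visited states
    for s, s_next in zip(trajectory, trajectory[1:]):
        # Decay all traces, then mark the state that was just occupied.
        trace = [gamma * lam * e for e in trace]
        trace[s] += 1.0
        # Vector-valued TD error: one entry per successor state j.
        delta = [(1.0 if j == s else 0.0) + gamma * M[s_next][j] - M[s][j]
                 for j in range(n_states)]
        for i in range(n_states):
            if trace[i] > 0.0:
                for j in range(n_states):
                    M[i][j] += alpha * trace[i] * delta[j]
    return M

# Hypothetical example: an agent running laps on a 5-state circular track.
n, gamma = 5, 0.9
laps = [t % n for t in range(20001)]
M = learn_sr_td_lambda(laps, n, gamma=gamma)
# Analytic SR for this deterministic loop, for comparison.
M_true = [[gamma ** ((j - s) % n) / (1 - gamma ** n) for j in range(n)]
          for s in range(n)]
max_error = max(abs(M[s][j] - M_true[s][j]) for s in range(n) for j in range(n))
print(max_error)
```

      Setting `lam=0` recovers one-step TD learning, while values closer to 1 propagate each transition further back along the trajectory.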

      The introduction and start of the Results section should have more citations to neuroscience data. The introduction currently cites only three experimental citations (O'Keefe and Dostrovsky, 1971; O'Keefe and Nadel, 1978 and Mehta et al. 2000) and then gives repeated citations of previous theory papers as if those papers define the experimental data that is relevant to this study. The article should review actual neuroscience literature, instead of acting as if a few theory papers in the last five years are more important sources of data than decades worth of experimental work. The start of the results section makes a statement about the role of hippocampus and only cites Stachenfeld et al. 2017 as if it were an experimental paper. The introduction, start of results and discussion need to be modified to address actual experimental data instead of just prior modeling papers. They need to add at least a paragraph to the introduction discussing real experimental data. There are numerous original research papers that should be cited for the role of hippocampus in behavior so that the reader doesn't get the impression all of this work started with the paper by Stachenfeld et al. 2017. For example, the introduction should supplement the citations to O'Keefe and Mehta with other experimental papers including those that they cite later in the paper. They should also cite other seminal work of Morris et al. 1982 in Morris water maze and Olton, 1979 in 8-arm radial maze and work by Wood, Dudchenko, Robitsek and Eichenbaum on neural activity during spatial alternation. At the start of the Results, instead of only citing Stachenfeld (which should have reduced emphasis when speaking about experiments), they should again cite O'Keefe and Nadel, 1978 for the very comprehensive review of the literature up to that time, plus the work of Morris and Eichenbaum and Aggleton and other experimental work.

      We thank the reviewer for the suggested citations. We have added many citations in order to discuss the experimental literature more thoroughly.

      This article is admirable for addressing how to utilize a continuous representation of space and time, which Kenji Doya also addressed in his NeurIPS article in 1995 and Neural Computation 2000 (which should be cited). To emphasize the significance of this continuous representation, they could note that reinforcement learning (RL) theory models still tend to use a discretized grid-like map of the world and discrete representation of time that does not correspond to the probabilistic nature of place cell response properties (Fenton and Muller) and the continuous nature of the response of time cells (Kraus et al. 2013).

      We thank the reviewer for this important comment and this is indeed one of the main strengths of the proposed framework. We have now emphasised this point, by adding the following paragraph to the Discussion:

      “Importantly, the discount parameter also depends on the time spent in each state. This eliminates the need for time discretization, which does not reflect the continuous nature of the response of time cells (Kraus et al. 2013).”
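
      To illustrate why a dwell-time-dependent discount removes the need for time discretization, here is a minimal sketch. It assumes exponential discounting with a time constant `tau`, so that the discount after spending time T in a state is exp(-T/tau); this functional form, the function names and the example trajectory are assumptions for illustration and are not taken from the manuscript.

```python
import math

def dwell_discount(dwell_time, tau):
    """Discount accumulated over `dwell_time` seconds (assumed exponential)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    return math.exp(-dwell_time / tau)

def discounted_occupancy(visits, tau):
    """Discounted future occupancy of each state along one trajectory.

    `visits` is a list of (state, dwell_time) pairs; the first pair is the
    starting state. Each state is weighted by the discount accumulated over
    the time that elapsed before the agent entered it.
    """
    occupancy = {}
    elapsed = 0.0
    for state, dwell_time in visits:
        occupancy[state] = occupancy.get(state, 0.0) + dwell_discount(elapsed, tau)
        elapsed += dwell_time
    return occupancy

# Hypothetical trajectory with unequal dwell times (in seconds).
visits = [("A", 2.0), ("B", 1.0), ("C", 0.5)]
occ = discounted_occupancy(visits, tau=4.0)
print(occ)
```

      Because the discount multiplies over consecutive intervals, spending 2 s in a state gives the same discount as two successive 1 s intervals, so the result does not depend on any choice of time step.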

      I think the authors of this article need to be clear about the shortcomings of RL. They should devote some space in the discussion to noting neuroscience data that has not been addressed yet. They could note that most components of their RL framework are still implemented as algorithms rather than neural models. They could note that most RL models usually don't have neurons of any kind in them and that their own model only uses neurons to represent state and successor representations, without representing actions or action selection processes. They could note that the agents in most RL models commonly learn about barriers by needing to bang into the barrier in every location, rather than learning to look at it from a distance. The ultimate goal of research such as this should be to link cellular level neurophysiological data to experimental data on behavior. To the extent possible, they should focus on how they link neurophysiological data at the cellular level to spatial behavior and the unit responses of place cells in behaving animals, rather than basing the validity of their work on the assumption that the successor representation is correct.

      We thank the reviewer for this suggestion. We have now extended the Discussion to include a paragraph on the “Limitations of the Reinforcement Learning framework”, which we reproduce here:

      We have already outlined some of the perks of using reinforcement learning for modelling behaviour, including providing clear computational and algorithmic frameworks. However, there are several intrinsic limitations to this framework. For example, it needs to be noted that RL agents that only use spatial data do not provide complete descriptions of behavior, which likely arises from integrating information across multiple sensory inputs. Whereas an animal would be able to smell and see a reward from a certain distance, an agent exploring the environment would only be able to discover it when randomly visiting the exact reward location. Furthermore, the framework rests on fairly strict mathematical assumptions: typically the state space needs to be Markovian, time and space need to be discretized (which we manage to evade in this particular framework) and the discounting needs to follow an exponential decay. These assumptions are overly simplistic and it is not clear how often they are actually met. Reinforcement Learning is also a sample-intensive technique, whereas we know that some animals, including humans, are capable of much faster or even one-shot learning. Regarding the specific limitations of our model, we can note that even though we have provided a neural implementation of the SR, and of the value function as its read-out (see Figure 5-figure supplement S2), the whole action selection process is still computed only at the algorithmic level. It may be interesting to extend the neural implementation to the policy selection mechanism in the future.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Note: the three reviewers who provided comments were identified as Reviewers 2-4

      Reviewer #2

      1) I could not open any of the movies (while those associated with the BioRXiv preprint were fine). Some of the movies could be combined to minimize download/open clicking sequences.

      > The movies were uploaded as .avi files, as per Review Commons instructions, and we tested our ability to view them on several computers at our institution before submission. We are relieved the reviewer was able to access the .mp4 formatted movies via BioRXiv. We will ask the Review Commons Managing Editor to make sure there are no problems with the videos uploaded with the revised manuscript.

      2) I really dislike reviewing papers without line numbers

      > Line numbers have been added to the revised version.

      3) The manuscript could be made more relevant to malaria researchers by briefly discussing red cell invasion by merozoites (a single constriction and force against the cell cortex), migration of ookinetes (multiple constrictions during mosquito gut penetration) and sporozoites (long distance migration), but this is not a must.

      > Constrictions during ookinete migration are now mentioned on lines 265-269, and the discussion of the constriction at the moving junction has been broadened to include other apicomplexan parasites (lines 270-278).

      4) I would limit reporting of numbers to two digits, e.g. instead of 46.3% make it 46%; 2.56 +/- 0.38 to 2,6 +/- 0,4 etc

      > We have adjusted all numbers in the text and figures to the appropriate number of significant figures based on measurement precision.

      5) Millions of deaths, please rewrite, more like around 1 million from malaria and cryptosporidium; use citation (WHO)

      > Done (line 40)

      6) Motility: please don't mention flagella, which are used for swimming, in the same sentence / phrase / logic connection as lamellipodia, which are used for substrate based migration

      > The sentence has been rewritten to make clear that cilia and flagella are not organelles involved in the substrate-dependent motility of other eukaryotic cells (lines 47-49).

      7) In Figure 1B, I can see one microsphere and it's not clear if it moves completely back to the original position. In the movie it looks like it goes completely back, maybe exchange the last panel of the figure with a last frame from the movie? Or maybe better: replace with frames from movie 2, which is more striking and shows many beads being displaced?

      > As suggested, Figure 1B now shows frames from the other movie (former Video 2), where bead movement is more obvious.

      8) Please add the entire figure S1 to Figure 1. This is important for readers to understand and 'deserves' full figure status. Same for Figure S2.

      > We have moved most of former Figure S1 into a new main Figure 2, as suggested. We left the two graphs as Supplemental data (new Figure S1), since these graphs simply show that parasite motility in fibrin is similar to the previously described motility of parasites in Matrigel.

      > Figure S2 has been moved to the main text, as suggested (in new Figures 3 and 6).

      9) I would encourage the authors to elaborate more on the data on Figure S2. It appears that motile parasites mostly did not exert forces above the level for non-motile parasites; for how much motility did they observe forces? The meaning of the x-axis does not become clear. Are those individual parasites per time point or time points of one parasite or of the analyzed matrix volumes over several parasites? How many parasites were observed? This is stated more clearly later but needs to be done already here.

      > We have moved the data in former Suppl. Figure S2 into the main figures, broken it into two parts (Figures 3 and 6B-E) and included a new 3D volume view and additional explanatory detail in the figure legends and text to clarify these points of confusion (lines 100-116, 500-507, 564-570).

      10) Please change 0.042 um into 42 nm etc

      > Done, lines 113-116.

      11) Please move some of the data in Figure S8 to the main figures e.g. Figure 4, where it would make a nice contrast / comparison to the mic2 mutant. Please also put a WT for comparison.

      > Done; see revised Figure 6.

      12) I wonder if the defect in directional migration of the mic2 mutant is also partly due to the parasite not being able to squeeze through narrow matrix pores and hence is deflected more often. While I understand (and agree) with the authors observation (interpretation) of the wt parasites not squeezing but pulling, it's hard to think that such squeezing would not still play a part.

      > The idea that the parasite needs to squeeze its way through pores in the matrix is intuitively appealing (and, in fact, what we had expected to see) but there is currently no data to support it. If squeezing were occurring, we should see an outward deformation of the matrix as the parasite pushes on the matrix fibers, but this is something we have never observed. We therefore think it is unlikely that the loss of directional migration is due to an inability to squeeze through pores in order to “stay on track”.

      13) Hueschen et al is now on BioRXiv

      > The BioRXiv citation has been added (lines 293, 320).

      14) The shaving off of antibodies could be brought into context to the work on sporozoites by Aliprandini Nat Micro 2018 and on trypanosomes by Engstler Cell 2007 (but not a must)

      > The two studies mentioned are intriguing and may be related to the well-documented anterior to posterior flux and shedding of GPI-anchored proteins from the surface of gliding Toxoplasma tachyzoites. What we are showing here is slightly different: the fluorescent antibodies on the cell surface seem to be “shaved” backwards at the constriction, much like surface bound antibodies are shaved backwards at the moving junction during invasion (Dubremetz 1985). In other words, there is a discontinuity in the density of surface staining at the constriction/junction. All of these processes may be related, but this is only speculation at this point and since the shaving of antibody at the constriction is a minor point of the paper (meant only to illustrate another similarity between 3D motility and invasion), we would prefer not to try to tie it to these other observations which may or may not be related.

      15) Anterior-posterior flux: best experimental evidence for this is Quadt et al. ACS Nano 2016 for Plasmodium and Stadler MBoC 2017 for Toxoplasma. The common observations and differences could be discussed as they pertain to the current study

      > These two papers are now cited in our discussion of the linear motor model along with our speculation that the constriction reflects the motility-relevant zone of engagement of this rearward flux with ligands in the matrix (lines 319-322).

      16) The loss of mic2 could lead to the loss of the capability to form discrete adhesion sites that reveal themselves as the observed rings in 3D. I suggest being careful about hypothesizing that the absence of this and MyoA reveals a completely different motility mechanism. To me it seems more likely that the absence of the proteins means that the existing mechanism doesn't work perfectly any more, i.e., the highly tuned migration machinery misses a key part and malfunctions.

      > The paragraph in question offered possible explanations for how parasites lacking the constriction could in fact move at normal speeds, not that motility was negatively affected. We have tried to make this clearer in the revision (lines 352-354), before describing the 3 possible explanations.

      17) Maybe reflect on whether 'search strategy' might be a better word than 'guidance system'

      > We have replaced the term “guidance system” in the title (lines 1-2), abstract (lines 33-36) and introduction (line 75) with more conservative references to the ability of the parasite to move directionally. The only place the term “guidance system” remains is in the final paragraph of the discussion, which is more speculative in nature, and where we now suggest it to be “part of” a guidance system.

      Reviewer #3

      1) Extracellular matrix choice. The authors track the parasite movement first on Matrigel and next on fibrin. The authors exemplify the fibrin matrix with an image in Suppl. Fig 1 that shows a relatively large pore size, similar to or greater than the parasite size. Was the analysis done on parasites touching the fibers?

      > Previous Suppl Figure 1A showed a confocal image at only one z-plane which did indeed give the impression that the pores are relatively large. We have changed this image to a more informative maximum intensity projection (New Figure 2A) and included a video showing the entire imaging volume (new Video 4), which makes clear that the matrix contains many small fibers and that the pores are smaller than the previous single z-plane suggested, so the parasite is likely to be near to or in contact with fibers of the matrix at all times. In Suppl Figure 1D we purposely used a less dense matrix in order to make the matrix deformation more obvious to the eye. The density of the matrix in Fig. 1D has been added to the legend.

      2) Lack of movement of parasites. In many figures of the article it is revealed that the majority of parasites in fibrin remain immobile (Suppl Fig 1, Fig 2, Video 5, Suppl Fig 2, Suppl Fig 8). The number of immobile parasites in Matrigel seems to be lower than in fibrin (Suppl Fig 1B) although no quantification is shown. How does the movement in fibrin and Matrigel compare? How does this compare with movement in stiff substrates in 2D? Could the lack of movement be caused by the large pore size in fibrin?

      > We have added a panel to Suppl. Figure S1 showing that the proportions of parasites moving in fibrin vs Matrigel are not significantly different. In fact, none of our measured motility parameters are different between fibrin and Matrigel. Not all parasites move during the 80s of capture used for these matrix comparisons; some of the parasites are likely dead, but others may have simply not initiated motility during this time window. We typically see between 30-50% movement in 3D motility assays of this duration and similar numbers in 2D trail assays although we have not explored the effect of 2D substrate stiffness.

      3) Considering parasite movement: The authors consider that 3SD is a cutoff for considering parasite displacement. However, several timepoints fall behind this cutoff in the control without parasites and the knockouts with restricted movement.

      > We chose three standard deviations from the mean as our cutoff, in order to eliminate 99.7% of the noise. Since we calculate 16807 vectors per comparison, this leaves us with ~50 vectors above the cutoff even in samples with no moving parasites. Not surprisingly, these vectors are found at random locations in the volume. New Figures 3 and 6B-E and the associated text (lines 100-116, 500-507, 564-570) hopefully clarify this point adequately; it is quite obvious in Figure 3C which vectors correspond to parasite-induced displacements and which correspond to random noise.
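
      The arithmetic behind the "~50 vectors" figure can be checked with a short sketch. It assumes the displacement noise is approximately Gaussian and that the cutoff is applied two-sidedly at three standard deviations; the function name `expected_noise_vectors` is hypothetical and is not part of the analysis code used in the study.

```python
import math


def expected_noise_vectors(n_vectors: int, n_sd: float = 3.0) -> float:
    """Expected count of pure-noise vectors beyond +/- n_sd standard deviations.

    Assumes independent, Gaussian-distributed noise (an illustrative assumption).
    """
    # Two-sided Gaussian tail probability: P(|Z| > n_sd) = 1 - erf(n_sd / sqrt(2))
    tail_fraction = 1.0 - math.erf(n_sd / math.sqrt(2.0))
    return n_vectors * tail_fraction


N_VECTORS = 16807  # vectors calculated per comparison, as stated in the response

# Using the exact Gaussian tail beyond 3 SD
exact = expected_noise_vectors(N_VECTORS)

# Using the rounded "99.7% of the noise is eliminated" figure from the response
rounded = N_VECTORS * (1.0 - 0.997)

print(f"exact Gaussian tail: {exact:.1f} vectors")
print(f"rounded 99.7% rule:  {rounded:.1f} vectors")
```

      Under these assumptions the exact tail gives roughly 45 vectors and the rounded 99.7% rule gives roughly 50, so a few dozen above-cutoff vectors are expected from noise alone even when no parasite is moving.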

      4) Imaging: Although the authors show a very detailed and illustrative table of the imaging acquisition conditions in Table 1, it is unclear which microscope the authors used, as two microscopes are described in the methods section, a Nikon Eclipse TE300 widefield microscope and a Nikon AIR-ER confocal microscope. Which images were taken on each system? From the location of Table 1 in the manuscript it seems that most images were taken with the Nikon Eclipse. Although this microscope has control over z, the images are quite noisy. How might the lack of confocality interfere with the analysis?

      > The high temporal resolution needed for 3D force mapping of cells that move several microns per second meant that all these experiments were done using a widefield microscope equipped with a piezo-driven z-stage. The fastest confocal we tested was not as fast as the widefield. However, spatial resolution suffered as a result of having to use widefield, particularly in z, and this did indeed make our data noisier, as suggested by the reviewer. This may be why we were unable to detect fibrin deformation in the knockout parasites. The only data collected on the confocal microscope were those shown in new Figure 2A; we have clarified this on lines 421-427. Future studies will explore other imaging modalities such as light sheet microscopy in an attempt to achieve better spatial resolution while maintaining the high frame rates required for force mapping.

      5) Nuclear constriction. The authors did not show any image or video exemplifying this.

      > The images in Suppl. Figure 6 have been replaced with data that show the nuclear shape more clearly.

      6) Knockouts: The authors did not explain in the methods how they generated the knockouts, nor did they show the efficacy of the knockout in any figure. If these knockout strains were a gift (I did not find it in the manuscript), the authors should indicate this more explicitly and reference the manuscript where they were described for the first time.

      > Both of the stable knockout lines used were generous gifts from Dr. Markus Meissner. We cited the original papers describing these lines in the text and thanked Dr. Meissner for providing them in the Acknowledgements section. We have now included an additional citation at the first mention of each of the knockouts (lines 174, 188) to make it even clearer where they came from.

      7) Discussion: Although the experimental methodology is sound, the authors seem to make many assumptions and speculations in the discussion as to how the appearance of this ring/constriction on the parasite translates into the helical movement of the parasite or the coupling of the ring with the cytoskeleton. Live imaging of actin dynamics or mathematical modelling could be used to support their claims.

      > We imaged parasites expressing the actin chromobody but were unable to visualize a ring of actin at the constriction. However, due to the speed of the parasites and the need for a fast frame rate (~15 ms per image) to reconstruct the 3D image volumes, the actin chromobody signal could be under our threshold of detection. We need to develop new, more sensitive ways to visualize proteins at the constriction, and this will be a major focus of our work going forward.

      > We fully concur that mathematical modeling such as the work recently done by Hueschen et al on actin flow during motility and by Pavlou et al on the role of parasite twist during invasion has much to offer our understanding of these processes. Similar approaches may provide support to the speculations (not claims!) we offer in the discussion and, although beyond the scope of the current study, are a direction we intend to take this work in the future – particularly if we are able to improve the signal-to-noise in our force mapping.

      8) Quantification of experiments missing: Overall, the main figures lack quantification that sometimes can be found in the supplemental information and sometimes is missing. I would suggest including quantifications next to the events described in the main figures. Likewise, some of the supplemental figures lack quantification (Suppl Fig 7, how many parasites showed this protein trail?)… Overall, the authors should indicate how many parasites were quantified in each figure, as they usually refer to the number of constrictions. This is a particular problem in main figures 3 and 5. Or, for example, in Suppl Fig 5: How many parasites were quantified in this figure? The authors only show the number of constrictions, and as the authors described, a parasite might have more than one constriction.

      > We have added further detail on the number of events/parasites quantified to both the figure legends and text throughout the manuscript, including the specific examples noted by the reviewer.

      9) Videos: The videos lack a time scale. Although this can be found in the main figures, it would be helpful to have the annotation in the videos. Likewise, some references for positions in videos, such as the cross found in Fig 1, would be helpful for parasites that present little movement.

      > Time stamps have been added to all videos as suggested, and crosshairs have been applied to new Figure 1B and Suppl. Figures 7 and 8 to make the movement of the parasites more obvious.

      Reviewer #4

      1) I am not sure about the premise that the "linear model" of gliding motility predicts uniformly forward direction. Previous videos of 2D gliding show that sporadic motility, changes in direction, or even reversal of direction are not infrequent. However, the current model could explain these behaviors if one or more of the following conditions occur: 1) myosin motors might be coordinately activated to initiate motility, followed by relaxation, 2) actin fibers might be transiently arrayed in clusters that change density and polarity over time, or 3) adhesins, necessary to generate traction, might vary in density and spatial orientation across the surface of the parasite. Changes in these properties would be expected to result in zones that promote or disfavor local forces needed for motility - and reversal of direction could occur when forward forces relax and external elastic forces predominate.

      > The potential explanations offered by the reviewer for the frequent changes in direction of zoite motility are intriguing and worth exploring experimentally. The ability of actin fibers to periodically reverse polarity, or the presence of counteracting elastic forces are not components of the “standard” linear motor model of motility but, if they occur, could explain the patch gliding phenomenon and help refine our understanding of motility. Since the data in this manuscript do not in the end either strongly support or disprove the linear motor model – this may ultimately require higher resolution force mapping methods that can detect the forces responsible for forward motion – we have de-emphasized potential problems with the model in the introduction and deleted specific discussion of patch gliding as one of these problems (lines 61-64).

      2) The model favored here: "we propose that force is generated, at least in part, by the rearward translocation of the subset of actin filaments that are coupled to adhesins at the circular ring of attachment" does not seem fundamentally different from the current model - other than it focuses the forces at a critical junction that the parasite migrates through. It seems to me that this is a refinement of the current model and not a replacement. As such, the authors might focus on how their data improve the model rather than pointing out prior deficiencies (although I get that editors like this style).

      > We agree with the reviewer and have modified the text to be more circumspect on this issue (lines 319-331).

      3) The finding that the absence of MIC2 affects the constriction formed by inward pull on the matrix is quite convincing and interesting. However, mutants that cannot form the constriction still move at similar speeds. This suggests that the inward force is different from the motor itself and affects its ability to impart direction, rather than the ability to move per se. The interpretation of the MyoA defect is complicated: since motility is certain to be disrupted, the potential role of an independent inward force may no longer be detectable.

      > We agree with the reviewer on this point as well: the forces we have observed to date cannot explain forward motion. We stated this previously and have now emphasized the point further (lines 322-324, 352-357). Because the parasite is moving forward, the forces responsible must be there but are likely below our threshold of detection. In order to visualize these forces, we are going to need new imaging modalities that can achieve better signal-to-noise than our current setup at the high frame rates required for force mapping. That said, the new data we have added to the manuscript are at least consistent with the narrow diameter ring of the constriction making a contribution to the parasite’s forward motion (new Suppl. Figure 10 and lines 347-351).

      4) Although I agree with the authors that there are striking parallels between motility in 3D and cell invasion, I am not certain about their conclusion that the constriction seen during cell entry is due to the parasite pulling inwardly. When entering the host cell, the parasite must also navigate the dense subcortical actin network, which likely also aids in forming the constriction that is observed. It would be interesting to record this pattern under conditions where host cell actin is destabilized while parasite motility is intact, for example using cytochalasin D to treat wild type host cells during invasion by resistant parasites.

      > We do not conclude that the constriction during invasion is due to the parasites pulling inwardly, but we do propose that this possibility needs to be considered based on the noted similarities between invasion and motility and our clear (and somewhat surprising) demonstration that the moving parasite pulls on the matrix at the constriction during motility. During invasion, the parasite may indeed have to squeeze through the dense subcortical network – or it may use secreted proteins to loosen up the network so that no squeezing is required. We just don’t know, and our purpose here was simply to put this alternative possibility on the table because we believe it is a viable possibility that follows from the data presented.

      > We thank the reviewer for the suggestion of testing what happens when cytoD resistant parasites invade in the presence of cytoD; this is a clever idea that we will likely pursue in future work.

      5) Not all of the color patterns shown in Figure 1A are consistent with the model. For example, GAP40 (yellow) does not appear in the model, there are two MLC boxes, but they are different shades, and ELC1/2 does not appear in the model.

      > We thank the reviewer for catching this error; it has now been fixed.

    1. Discussion, revision and decision


      Decision

      Verified with reservations: The content is scientifically sound, but has shortcomings that could be improved by further studies and/or minor revisions.

      Dr. Bañuelos: Verified manuscript

      Dr. Morris: Verified with reservations


      Revision

      Response to Reviewer 1 (Dr. Bañuelos)

      1. Most importantly, I would like to see an introduction that explains the authors’ general arguments about grading changes – including the trajectory of these changes at Dalhousie and why this arc contributes to our knowledge of the history of higher education more broadly. Then, the authors might continually remind us of the arc they present at the outset of their paper – especially when they are highlighting a piece of evidence that illustrates their central argument. To me, the quotes from students and faculty responding to grading changes are among the most interesting parts of the paper and placing these in additional context should make them shine even more brightly!

      Our Response: Thank you so much for your thoughtful review. We have added a larger new introduction section of the paper (paragraphs 1-5 in the latest draft are new) that outlines the general importance of the topic, the Canadian context, details on Dalhousie University, and our overall thesis statement (i.e., most decisions were to improve the external communication value of grades). We have also added three new student quotes from the Dalhousie Gazette to build a stronger picture of student reactions and a better case for our overall thesis statement (i.e., that changes in grading were often to increase the external communication value of grades). In addition, throughout we have added some details on the overall funding trajectory for institutions in Canada that created some pressure to standardize grading. We think that these changes have improved the manuscript.

      1. I’d like to read a little more about Dalhousie itself – why it is either a remarkable or unremarkable place to study changes in grading policies. Is it representative of most Canadian universities and thus, a good example of how grading changes work in this national context? Is it unlike any other institution of higher education and thus, tells us something important about grades that we could not learn from other case studies? I don’t think this kind of description needs to be particularly long, but it should be a little more involved than the brief sentences the authors currently include (p.3, paragraph 1) and should explain the choice of this case.

      Our Response: This comment revealed that two additional pieces of context were needed for the introduction: (a) some national context for higher education policy in Canada and (b) some extended description of Dalhousie University when compared to other universities in Canada. To this end, two new paragraphs have been added to the paper (paragraphs 2 & 3 in the current draft).

      Notably, Jones (2014) notes that “Canada may have the most decentralized approach to higher education than any other developed country on the planet” (pg 20). With this in mind, any historical review of education policy is by necessity specific to province and institution – that is, the information can be placed in its context, but resists wide generalization to the country as a whole. In the newest draft, we tried to describe the national, provincial, and institutional context in some more detail in paragraphs 2 & 3.

      1. I’d also like to know more about the archival materials the authors used. The authors mention that they drew from “Senate minutes, university calendars, and student newspapers” (p. 3), but what kinds of conversations about grades did these materials include? At various points, the authors engage in “speculation” (e.g. p.4) about why a particular change occurred. This is just fine and, in fact, it’s good of the authors to remind us that they are not really sure why some of these shifts happened. But, they might go one step further and tell us why they have to speculate. Were explicit discussions of grading changes – including in inter- and intradepartmental letters and memos, reports, and other documents – not available in these archives? Why are these important discussions absent from the historical record?

      Our Response: We have added a new paragraph (paragraph 4) to the paper discussing the sources in some more detail. It is true that the verbatim discussions are frequently absent from the record, especially earlier in history – or if they exist, we have not found them! Instead, we frequently are reviewing meeting minutes or committee reports, which are summaries of discussions. As we now note in the paper, “Thus, the sources used showed what policy changes were implemented, when they were implemented, and a general sense of whether there was opposition to changes; however, there were notable gaps in faculty and student reactions to grade policy changes, as these reactions were frequently not written down and archived.”

      This gap was most apparent in the Senate minutes around the 1940s, where I (the first author) could not find any direct discussions of why changes were implemented. Under the 1937-1947 heading, we more clearly indicate that the rationale for the changes was absent from the Senate minutes during this period. I add some further speculation on why these records might be absent, based on summaries from Waite (1998b); specifically, the university president of the time often made unilateral decisions, circumventing Senate, which might account for why the changes are absent from the records.

      This will hopefully make the limitations of what can be learned from this approach more apparent.

      1. At various points, the authors make references to the outside world – for example, WWII (p. 5), the Veteran’s Rehabilitation Act (pp. 6-7), and British versus American grading schemas (p. 6). But, these references are brief and seem almost off-handed. I know space is limited, but putting these grading changes in their broader context might help make the case for why this study is interesting and important. Are the changes in the 1940s, for example, related to the ascendance of one national graduate education model over another (e.g. American versus British)? Are there any data on how many Canadian undergraduates enrolled in British versus American graduate programs over time? If so, I would share any information you might have on these broader trends.

      Our Response: To our knowledge, there isn’t any comparable report to what we’ve written here documenting the transition from British “divisions” to American “letter grades” in Canadian Universities, making our report novel in this regard. It might well be that a similar historical arc exists in many of the 223 public and private universities in Canada, but we don’t believe such data exists in any readily accessible way – excepting perhaps undergoing a similar deep dive into historical documents at each respective institution! So, we do not have the answer to your question: “Are there any data on how many Canadian undergraduates enrolled in British versus American graduate programs over time?” However, we did add one reference which provided a snapshot point of comparison in 1960, noting in the paper “Baldwin (1960) notes that the criteria for “High First Class” grades in the humanities was around 75-80% at Universities of Toronto, Alberta, and British Columbia in 1960, suggesting that Dalhousie’s system was similar to other research-intensive universities around this time.” That said, there are a few major national events related to the funding of universities in Canada that we have elaborated on in the text to address the spirit of your recommendation for describing the national context:

      a) In the “Late 1940s” section of the paper, we added: “Though Dalhousie had an unusually high proportion of veterans enrolled relative to other maritime universities during this period (Turner, 2011), the Veteran’s Rehabilitation Act was a turning point for large increases in enrollment and government funding Canada-wide, at least until the economic recession of the 1970s (Jones, 2014).”

      b) In the 1990s, there were major government cuts to funding, creating challenging financial times for the university. We discuss the funding pressures that likely contributed to standardization of grading during this time by saying the following in the 1980s-2000s section: “Starting in the 1980s-1990s there were major government cuts to university funding nation-wide, with the cuts becoming more severe in the 1990s (Jones, 2014; Higher Education Strategy Associates, 2021). Because of the nature of the funding formulas, cuts in Nova Scotia were especially deep. Beyond tuition increases, university administrators knew that obtaining external research grants, Canada Research Chairs, and scholarship funding was one of the few other ways for a university to balance budgets, so there was extra pressure to be competitive in these pools. […] The increased standardization was likely related to increased financial pressures at this time – standardization is an oft-employed tool to deal with ever-increasing class sizes with no additional resources.”

      c) In the 2010s section of the paper, we added context on how universities country-wide have become increasingly dependent on tuition fees for funding: “Following the 2008 recession, federal funding decreased again (Jones, 2014; Higher Education Strategy Associates, 2021); however, this time universities tended to balance budgets by increasing tuition and international student fees. This trend towards increased reliance on tuition for income is especially pronounced in Nova Scotia, which has the highest tuition rates in the country (Higher Education Strategy Associates, 2021). Thus, the university moved closer to a “consumer” model of education, so it makes sense that a driving force for standardization was student complaints.”

      1. This is a very nitpicky concern that doesn’t fit well elsewhere, so please take it with a grain of salt. I was surprised at the length of the reference list – it seemed quite short for a historical piece! I wonder, again, if more description of the archival material - including why you looked at these sources, in particular, and what was missing from the record – would help explain this and further convince the reader that you have all your bases covered.

      Our Response: In the introduction section, paragraph 4, we describe our sources in more detail, including what is likely missing from the record and why we used them. Regarding the length of the reference list, we did add ~12 new references to the list in the course of making various revisions, which partially addresses your concern. Beyond this though, it’s worth noting that some of the sources are more extensive than they seem, even though they don’t take up much space in the reference list (e.g., there is one entry for course calendars, but this covers ~100 documents reviewed!). Moreover, there were many dead-ends in the archives that are not cited (e.g., reviewing 10 years of Senate minutes in the 1940s produced little of relevance), so the reference list is curated to only those sources where relevant materials were found.

      Reviewer response to revisions

      The new introduction to the piece addresses many of my previous questions about the authors’ general arguments, the Dalhousie context, and the source material. Thank you for addressing these! Reading this version, it is much clearer that the key argument is that standardized, centralized grading practices were “to improve the external communication value of the grades, rather than for pedagogical reasons” (p. 6). I also really enjoyed the added quotes from students in the Dalhousie Gazette.

      The authors’ response to Reviewer 2 really gave me a better sense of why they wrote this piece and also helped me to more clearly put my finger on what was troubling me in the first round. It still reads a little like a report for an internal audience – which is just fine and, in fact, can be extremely useful for historians of the future. But, as Reviewer 2 notes, this means it does not really seem like a piece of historical scholarship. I do worry that shaping it into this form would take an extensive revision and might not be in the spirit of what the authors intended to do.

      A different version of this article might start with this idea that grades were standardized for external audiences and in response to financial pressures. It would then develop a richer story behind the sudden importance of these external audiences and the nature (i.e. source, type) of financial pressures Dalhousie was facing. It would highlight the impact such changes had on students and their future careers/graduate experiences. It could then connect these trends to other similar changes for external audiences and the increasing interconnectedness of American, Canadian, and British systems through graduate education. It might even turn to sociological theories of organizational change and adaptation and make an argument for when (historically) similar forms of decoupling were likely to occur in the Canadian higher education system. Finally, it might connect these grading changes to current trends – including accusations of grade inflation and accepted best practices for measuring learning outcomes.

      But, it doesn’t seem that the authors necessarily want to do this, which I can understand and respect. I think there is enormous value in a piece of scholarship like this existing – both for internal audiences and for future historians. Indeed, imagine if every university had a detailed history of its grading policies like this available somewhere online! Comparing such practices across institutions would certainly tell us a lot about why grading currently looks the way it does.

      Decision changed

      Verified manuscript: The content is scientifically sound, only minor amendments (if any) are suggested.


      Response to Reviewer 2 (Dr. Morris)

      The authors dove headfirst into Dalhousie’s archives, unpacking the subtle shifts in grading policy. Their work seems to be comparable to archaeologists, digging deep beneath mountains of primary sources to find nuggets of clues into Dalhousie’s grading evolution. I particularly liked when the authors were able to link these changes to student voices, as seen in moments when they referenced student publications.

      Ultimately, I kept coming back to one main comment that I wrote in the margins: “So what?” I would humbly suggest that the authors reflect on why this history matters to them. Granted, they do this in the conclusion, where they touch on Schneider & Hutt’s argument that grades evolved to increasingly be a form of external communication with audiences beyond school communities. Sure. But I want more. I wanted to see a new insight that makes this microhistory of Dalhousie significant to the history of Canada or the history of education more generally.

      If the authors are so inclined, there might be several approaches to transform this manuscript. I would suggest the following. First, instead of tracing the entire history of grading at the institution, choose one moment of change that you think is the most important. Perhaps in the 1920s and the lack of transparency in grading, or the post-war shift toward American grading. Second, show me – don’t tell me – what Dalhousie was like at this moment. Paint a picture of the institution with details about student demographics, curriculum, educational goals, the broader town, etc. Make the community come alive. Show me what makes Dalhousie unique from other institutions of higher ed. Once you establish that picture, perhaps you could link the change in grading practices to subtle changes at the university community, thereby establishing a before and after snapshot. This will require considerable amounts of work, and the skills of a historian. You will have to find primary and secondary sources that go far beyond what you’ve relied on thus far.

      In the end, I found myself wanting the authors to humanize this manuscript, meaning I wanted them to show me that changes in grading practices have tangible effects on real-life human beings. A humanization of their research would mean going narrower and deeper; or, in other words, eliminating much of what they have documented.

      However, if that is too tall of an order, I would ask that the authors clarify for themselves who this manuscript is for. Is this a chronicling of facts for an internal audience at Dalhousie’s faculty, alumni, and students? Fine. But my guess is that even members of the Dalhousie community want to read something relatable.

      I am suggesting revisions, although not because of objective errors. History is more of an art, in my opinion. With that in mind, I would suggest that the authors paint a more vivid picture (metaphorically) of Dalhousie, showing me how one moment of change in grading practices impacted the lives of human beings.

      Our Response: Thank you very much for taking the time to read our paper and provide your thoughts and recommendations. It may be helpful to begin by describing why I (the first author) decided to write this paper. Ultimately, I wrote this paper to satisfy my own personal curiosity and to connect with other people at my own place of employment by exploring our shared history. At present, Dalhousie has a letter grading scheme with a standardized percentage conversion scheme that all instructors use. I wanted to know why this particular scheme was used, but I quickly realized that nobody at Dalhousie really knew how we ended up grading this way! There was an institutional memory gap, and a puzzle that was irresistible to me. So, I wrote this paper for the most basic of all academic reasons: pure curiosity. I do very much recognize that the subject matter is very niche, perhaps too niche for a traditional journal outlet. Thus, my publishing plan is to self-publish a manuscript to the Education Resources Information Center (ERIC) database and a preprint server as a way of sharing my work with others who might be interested in what I found. Nonetheless, I believe in the importance and value of peer review, especially since I am writing in a field different from most of my scholarly work. That is why I chose PeerRef as a place to submit, so that I could undergo rigorous peer review to improve the work while still maintaining the niche subject matter and focus that drive my passion and curiosity for the project. Of course, if you feel the whole endeavor is so flawed that it precludes publication anywhere, then we can consider this a “rejection” and I will not make any further edits through PeerRef.

      The core of your critique suggested that I should write a fundamentally different paper on different subject matter. While I don’t necessarily disagree that the kind of paper you describe might have broader appeal, it would no longer answer the core research question I wanted an answer to: How has Dalhousie’s grading changed over time? So, I must decline to rewrite the paper to focus on a single timeframe as recommended. All this said, I did try my best to address the spirit of your various concerns to improve the quality of the manuscript. Below, I outline the major changes we made along the lines you described, while maintaining our original vision for the structure and focus of the paper:

      a) Two new paragraphs (now paragraphs 1-2 of the revised manuscript) were added to explain the “so what” part of the question. Specifically, we describe why we think the subject matter might be of interest to others and summarize the general dearth of historical information on grading practices in Canada as a whole.

      b) Consistent with recommendations from the other reviewer, we now state a core argument (i.e., that most major grading changes were implemented to improve the external communication value of the grades) earlier in the introduction in paragraph 5 and describe how various pieces of evidence throughout the manuscript tie back to that core theme.

      c) In an attempt to “humanize” the manuscript more, we added more student quotes from the Dalhousie Gazette throughout the paper so that readers can get a better sense of how students thought about grading practices at various times throughout history. Specifically, three new quotes were added in the following sections: 1901-1936, late 1940s, 1950s-1970s. We also added this short note about the physical location where grades used to be posted: “Naturally, this physical location was dreaded by students, and was colloquially referred to as “The Morgue” (Anonymous Dalhousie Gazette Author, 1937).”

      d) Early in the paper, we describe why we chose Dalhousie and the potential audience of interest: “As employees of Dalhousie, we naturally chose this institution as a case study due to accessibility of records and because it has local, community-level interest. The audience was intended to be members of the Dalhousie community; however, it may also be a useful point of comparison for other institutions, should similar histories be written.”

      e) We have described some of the limitations of our sources in paragraph 4, which may explain why the manuscript takes the form it does – it has conformed to the information that is available!

      f) We have linked events at Dalhousie to the national context in some more detail, by detailing some national events related to the funding of universities in Canada. See our response to Reviewer 1, #4 above for more details on the specific changes.

      g) Consistent with your stylistic recommendations, we have changed various spots throughout the paper from the present tense (e.g., “is”) to the past tense (e.g., “was”), and were careful in our new additions to maintain the past tense, when appropriate. If there are any spots that we missed, let us know the page number / section, and we will make further changes, as necessary.

      h) We retained the first person in our writing – this may be discipline-specific, but in Psychology (the first author’s home discipline), first person is acceptable in academic writing. If you feel strongly about this, we can go through the manuscript and remove all instances of the first person, but we would prefer to keep it, if at all possible.

      Hopefully this helps address the spirit of your concerns, and I look forward to hearing your thoughts in the second round of reviews.

      Decision changed

      Verified with reservations: The content is scientifically sound, but has shortcomings that could be improved by further studies and/or minor revisions.

    1. Joint Public Review:

      In this work Malis et al. introduce a novel spin-labeling MRI sequence to measure cerebrospinal fluid (CSF) outflow. The glymphatic system is of growing interest in a range of diseases, but few studies have been conducted in humans due to the requirement for and invasiveness of contrast injections. By labeling one hemisphere of the brain, the authors attempt to assess outflow through the superior sagittal sinus (SSS), one of the major drainage pathways for CSF. Signal changes across time were assessed to extract commonly used metrics. Additionally, correlations with age are explored in their cohort of healthy volunteers. The authors report the movement of labeled CSF from the subarachnoid space to the dura mater, parasagittal dura, and ultimately SSS, evidence of leakage from the subarachnoid space to the SSS, and decreases in CSF outflow metrics with older age.

      1. I don't think that the description of the parasagittal dura (PSD) in Figure 1 is correct. There is no anatomical structure at the top of the SSS that is known as PSD. The location of the lymphatic structures is also incorrect. Please review "Anatomic details of intradural channels in the parasagittal dura: a possible pathway for flow of cerebrospinal fluid", Fox et al., Neurosurgery 1996. There is usually no obvious tissue between the upper wall of the SSS and the calvarium, which can also be seen in the authors' Fig 2A and 2B. All of the tissues located lateral to the SSS are known as PSD. Also, the SSS wall is not as thick as the authors stated and is known as PSD in this region. For this reason, the authors need to revise Fig 1, and the areas referred to as the SSS wall in the article should be changed to PSD.

      2. The authors described tagged CSF in two pathways: from dura mater to PSD and SAS into the SSS, and directly from SAS to SSS. Flow from dura mater to PSD and SAS cannot be seen in the main or supplementary figures. Only a flow from PSD to SSS can be seen. Also, regular dura, a collagen-rich fibrous tissue, cannot carry flow, with the exception of the parasagittal dura. There is no flow from dura to the CSF in the figures.

      3. The authors have conducted many tests to prevent venous contamination. However, measurements were made based on SSS flow rates in all tests. Small parenchymal venous structures and small cortical-SAS veins might be tagged due to different flow patterns and T2 relaxation times.

      4. The rate of CSF formation in humans is 0.3-0.4 ml min-1 (Brinker et al. 2014, Fluids Barriers CNS). We can assume that the absorption rate for the entire system (brain and spine) is similar to the rate of CSF formation. Therefore, the amount of CSF absorbed by the SSS within seconds is very small. It is hard to detect by MR, especially CSF flow from the PSD to SSS. The authors concluded that, using this technique, the outflow time averaged less than a couple of seconds, rather than being on the order of hours or days as previously reported with the use of intrathecal administration of GBCA (Ringstad et al., 2020).

      5. Overall, I think that the CSF flow from the PSD to the CNS described by the authors might be venous flow that drains slowly into the SSS, predominantly in the rich venous channels, venous lacunae, and previously described channels in the PSD. Additional explanations are needed.

      6. The study is generally well described and, to the best of my knowledge, an innovative approach. The findings are broadly consistent with what might be expected from the literature and the authors make a good argument in support of their findings. However, the lack of validation is a major limitation of the presented work. In introducing a novel technique, a comparison with an existing approach, such as Gd-enhanced contrast techniques or phase contrast, would have been expected. Several considerations could have been mentioned or addressed in more detail, e.g. what effect labeling efficiency, tortuosity of vessels, lack of gating, and the effectiveness of the intensity thresholding to remove the signal from blood may have on the quantification. Without a more thorough validation, it is difficult to evaluate the findings. While scans were conducted on two volunteers to assess reproducibility, this is a very small sample, and it is notable that scans were conducted consecutively, which might be expected to reduce variance relative to scans further apart (e.g. on different dates or scanned by a different operator). No information is provided on how the two scans were positioned (i.e. separately vs copied from the first to the second scan). Some metrics showed large percentage differences, which were more pronounced in one subject than the other. Without further data, it is difficult to interpret the reproducibility results. No assessment was given of the effect of physiological parameters (e.g. breathing, cardiac pulsations) or of factors affecting glymphatic clearance (e.g. amount of sleep the evening before).

      7. Given these limitations it is hard to adequately assess the likely impact or utility. In recent years several groups have published work e.g. doi.org/10.1038/s41467-020-16002-4 , doi.org/10.1016/j.neuroimage.2021.118755 assessing the blood-CSF barrier. However, previous work has generally focused on larger structures, and by labeling in the oblique-sagittal plane it is unclear how drainage and blood flow rates may affect the presented values here.

      8. Some validation data would greatly increase the value of the reported work. I would therefore encourage the authors to consider acquiring some additional datasets to compare measures of CSF draining against another method e.g. 2-D or 4-D phase contrast, or Gd-based contrast-enhanced techniques. Some additional points to consider are noted below.

      8. Abstract

      CSF outflow may also be imaged with phase contrast MRI (albeit in a limited way).<br /> Demographics would fit better in Results, breakdown could be given for the young and old groups i.e. n, ages, sex.<br /> Conclusion - unless further validation can be provided I think some of the claims should be toned down.

      9. Introduction

      The authors emphasise the role of Nedergaard, however, there was some relevant earlier work (e.g. Rennels et al, PMID: 2396537).

      10. Methods

      It would be more conventional to summarise the volunteer characteristics in the Results.<br /> Given the age difference between the two groups, and the fact that for conventional ASL we know of differences in labelling efficiency and the need for a different post-labelling duration in more elderly patients how did the authors account for this?<br /> More broadly what would the effect of differences in labeling efficiency be, given the labeling plane is unlikely to be perpendicular to the draining vessels?<br /> While the authors mention circadian effects there is no mention of controlling for other factors before the scan e.g. caffeine consumption, smoking, etc.<br /> Various mechanisms have been hypothesised to drive glymphatic pulsations. Assessing how physiological signals correlated with the flow may have been a useful proof of concept. Why was it not considered necessary to use a gated acquisition? Did the authors consider the potential impact of respiratory and cardiac pulsations on their measurements?<br /> ROI segmentation - manually selected by two raters, was this done independently and blinded? How were consensus ROIs agreed?<br /> Intensity values outwith MEAN +/- 2 SD were excluded from further analyses. This is justified as removing pulsatile blood. However, was this done independently for tag-on and tag-off? Does this mean slight differences were present in the number of voxels between the two?<br /> The starting points and parameter ranges are given in Eq'n 3, how were the ranges defined? Was there a reason for constraining the fit to positive values only, is there a risk of bias from this?<br /> While the main results appear to have a reasonable sample size n=2 for the reproducibility analysis is very limited. Additional datasets would be useful in properly interpreting the results.

      11. Results<br /> While the authors have taken some measures to reduce potential contamination from blood, I would be concerned about the risk of surface vessels affecting the signal, and there does not seem to be an evaluation of how effective their measures are.<br /> The labeling pulse is applied in the oblique sagittal orientation, but in tandem with differing rates of blood flow and CSF drainage from the labeling plane, does that not risk circulating flow from other slices potentially affecting the values?<br /> Figure 4. The authors focus on the parasagittal dura, but in both the subtraction image and panel C showing different slices at TI=1250 ms some movement appears visible in the opposing hemisphere, and similarly in S2. If the signal does represent CSF movement then this seems counterintuitive and should be explained.<br /> In Figures 4 and 5 the angulation of the TIME-SLIP tag pulse seems quite different. What procedure was used to standardise this, and what effect may this have on the results?

      12. Discussion<br /> Phrasing error 'which will be assessed in future studies'.<br /> I would suggest that some of the claims of novelty be moderated e.g. 'may facilitate establishment of normative values for CSF outflow' seems a stretch given multiple pathways exist and this is only considered one.<br /> More consideration should be given to some of the points mentioned in the results. The lack of validation should be properly discussed.
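
      The Methods question about whether the MEAN +/- 2 SD exclusion was applied independently to tag-on and tag-off images can be illustrated with a minimal sketch. The intensities below are synthetic and the function name is hypothetical; nothing here reproduces the authors' pipeline. It only shows that independent thresholding can retain different voxel sets in the two images, so a subtraction would need a shared mask.

```python
import random
import statistics


def within_two_sd(values, n_sd=2.0):
    """Return a boolean mask of values lying within mean +/- n_sd * SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [lower <= v <= upper for v in values]


rng = random.Random(0)
# Synthetic ROI intensities in arbitrary units (assumption, not study data).
tag_off = [rng.gauss(100.0, 10.0) for _ in range(500)]
# Tag-on signal modelled as tag-off minus a noisy labeling effect (assumption).
tag_on = [v - rng.gauss(5.0, 3.0) for v in tag_off]

mask_off = within_two_sd(tag_off)  # thresholded independently
mask_on = within_two_sd(tag_on)    # thresholded independently
# Voxels retained in both images; only these give a like-for-like subtraction.
shared = [a and b for a, b in zip(mask_off, mask_on)]

print("kept tag-off:", sum(mask_off))
print("kept tag-on:", sum(mask_on))
print("kept in both:", sum(shared))
```

      Reporting the three counts (or stating that a single shared mask was used) would answer the question directly.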

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript describes the generation and characterization of a mouse knockout model of Cep78, which codes for a centrosomal protein previously implicated in cone-rod dystrophy (CRD) and hearing loss in humans. Previous work in cultured mammalian cells (including patient fibroblasts) also indicated roles for CEP78 in primary cilium assembly and length control, but so far no animal models for CEP78 were described. Here, the authors first use CRISPR/Cas9 to knock out Cep78 in the mouse and convincingly demonstrate loss of CEP78 protein in lysates of retina and testis of Cep78-/- animals. Next, by careful phenotypic analysis, the authors demonstrate significant defects in photoreceptor structure and function in these mutant animals, which become more severe over a 9 (or 18) month period. Specifically, TEM analysis demonstrates ultrastructural defects of the connecting cilium and photoreceptor outer segments in the Cep78 mutants, which is in line with previously reported roles for CEP78 in CRD and in regulating primary cilia assembly in humans. In addition to a CRD-like phenotype, the authors also convincingly show that male Cep78-/- animals are infertile and exhibit severe defects in spermatogenesis, sperm flagella structure and manchette formation (MMAF phenotype). Furthermore, the authors provide evidence for an MMAF phenotype from a male individual carrying a previously reported CEP78 c.1629-2A>G mutation, substantiating that CEP78 is required for sperm development and function in mammals and supporting previously published work (Ascari et al. 2020).

      Finally, to identify the underlying molecular mechanism by which CEP78 loss causes MMAF, the authors perform some biochemical analyses, which suggest that CEP78 physically interacts with IFT20 and TTC21A (an ortholog of Chlamydomonas IFT139) and might regulate their stability. The authors conclude that CEP78 directly binds IFT20 and TTC21A in a trimeric complex and that disruption of this complex underlies the MMAF phenotype observed in Cep78-/- male mice. However, this conclusion is not fully justified by the data provided, and the mechanism by which CEP78 affects spermatogenesis therefore remains to be clarified.

      Specific strengths and weaknesses of the manuscript are listed below.

      Strengths:

      Overall, the phenotypic characterisation of the Cep78-/- animals appears convincing and provides new evidence supporting that CEP78 plays an important role in the development and function of photoreceptors and sperm cells in vertebrates.

      Weaknesses:

      1) The immunoprecipitation experiments of mouse testis extracts that were used for the mass spectrometry analysis in Table S4 were performed with an antibody against endogenous CEP78 (although antibody details are missing). One caveat with this approach is that the antibody might block binding of CEP78 to some of its interactors, e.g. if the epitope recognized by the antibody is located within one or more interactor binding sites in CEP78. This could explain why the authors did not identify some of the previously identified CEP78 interactors in their IP analysis, such as CEP76 and the EDD-DYRK2-DDB1-VprBP complex (Hossain et al. 2017) as well as CEP350 (Goncalves et al. 2021).

      We thank Reviewer #1 (Public Review) for agreeing with us that Cep78 plays an important role in the development of photoreceptors and sperm cells. We also thank Reviewer #1 (Public Review) for pointing out the weaknesses, which helped us improve our study.

      For the immunoprecipitation experiments of mouse testis extracts, the antigenic sequence of the Cep78 antibody used is p457-741 (NP_932136.2). Cep78 was reported to bind the EDD-DYRK2-DDB1-VprBP complex: the 1-520aa region is responsible for Cep78’s interaction with VprBP, and deletion of p450-497 didn’t affect this interaction, indicating the importance of Cep78 (1-450aa) in the interaction with VprBP (Hossain et al. 2017). Because our anti-Cep78 antibody was generated using the antigen sequence p457-741, it is not expected to block the binding of p1-450aa to VprBP. However, VprBP was not detected by our IP-MS experiment. The C-terminal region (395-722aa) of Cep78, which overlaps with our Cep78 antibody’s antigenic region (p457-741), was reported to interact with Cep350 (Goncalves et al. 2021). As a polyclonal antibody, our anti-Cep78 antibody didn’t block interactions involving p457-741, because we still identified Cep350 in our IP-MS. Thus, immunoprecipitation experiments using our Cep78 antibody identified some of the previously known interactors, and the interaction with VprBP is unlikely to be blocked by our Cep78 antibody.

      The detailed antibody information has now been added to Supplementary Table S7 in our revised supplementary materials.

      2) Figure 7A-D and page 18-25: based on IPs performed on cell or tissue lysates the authors conclude that CEP78 directly binds IFT20 and TTC21A in a "trimeric complex". However, this conclusion is not justified by the data provided, nor by the previous studies that the authors are referring to (Liu et al. 2019 and Zhang et al. 2016). The reported interactions might just as well be indirect. Indeed, IFT20 is a known component of the IFT-B2 complex (Taschner et al., 2016) whereas TTC21A (IFT139) is part of the IFT-A complex, which suggests that they may interact indirectly. In addition, the IPs shown in Figure 7A-D are lacking negative controls that do not coIP with CEP78/IFT20/TTC21A. It is important to include such controls, especially since IFT20 and CEP78 are rich in coiled coils that tend to interact non-specifically with other proteins.

      We thank Reviewer #1 (Public Review) for the comment on the protein interaction between Cep78, Ift20, and Ttc21a. As the reviewer pointed out, IFT20 is a known component of the IFT-B2 complex (Taschner et al., 2016) whereas TTC21A (IFT139) is part of the IFT-A complex. Both IFT20 and TTC21A are located at peripheral areas of IFT-B and IFT-A (PMID: 32456460), and are not core components of IFT-A or IFT-B. It is still possible that these two proteins interact with each other. Indeed, Liu et al. have revealed an interaction between Ift20 and Ttc21a in human sperm (PMID: 30929735). Additionally, to mediate trafficking of ciliary axonemal components, the IFT machinery is recruited to the distal appendages (PMID: 30601682), which are adjacent to the distal end of the (mother) centriole wall, where Cep78 was reported to be located (PMID: 35543806). Cep78 may interact with Ift20 and Ttc21a at the centriole during ciliogenesis.

      To rule out nonspecific interaction between Cep78 and Ttc21a or Ift20, we added negative controls of Gapdh (Figure 7D) and Ap80-NB-HA (Supplementary Figures S7A-C) in co-IP as the reviewer suggested, and found that the interaction between Cep78 and Ttc21a or Ift20 is specific. To examine whether Cep78, Ift20 and Ttc21a formed a complex, we fractionated testicular protein complexes using size exclusion chromatography, and found that Cep78, Ift20 and Ttc21a co-fractionated at a size between 158 kDa and 670 kDa (Figure 7E), supporting the formation of a trimeric complex. Our immunofluorescence analysis by SIM also showed co-localization between Cep78 and Ift20 or Ttc21a (Figure 7F). All these data support the interaction among Cep78, Ttc21a and Ift20. We rephrased “direct interaction” as “interaction” at page 18, line 393 in the revised manuscript.

      3) In Figure 7D, the input blots show similar levels of TTC21A and IFT20 in control and Cep78-/- mouse testicular tissue. This is in contrast to panels E-G in the same figure where TTC21A and IFT20 levels look reduced in the mutant. Please explain this discrepancy.

      Thank you for pointing this out. Deletion of Cep78 caused down-regulation of Ttc21a and Ift20 proteins. To better reveal the change of interaction between Ttc21a and Ift20, we have to normalize their interaction against expression levels. To achieve this, we increased the amount of total Cep78-/- testicular proteins to ensure that Ttc21a and Ift20 in the input are at similar levels between Cep78+/- and Cep78-/- testes. Using 3 times the amount of the Cep78+/- testicular proteins for Cep78-/- testicular proteins, we detected similar protein levels of Ttc21a and Ift20 between Cep78-/- and Cep78+/- testes, and the interaction between Ttc21a and Ift20 was shown to be down-regulated after Cep78 deletion. Consistently, the analysis of GAPDH as a loading control in input proteins showed more Cep78-/- testicular proteins than Cep78+/- testicular proteins subjected to analysis. To avoid confusion, we have added description of “The amount of Cep78-/- testicular proteins used was 3 times of that of Cep78+/- proteins” in the legend of Figure 7D in the revised version of manuscript.

      4) The efficiency of the siRNA knockdown shown in 7J-M was only assessed by qPCR (Figure S4), but this does not necessarily mean the corresponding proteins were depleted. Western blot analysis needs to be performed to show depletion at the protein level. Furthermore, it would be desirable with rescue experiments to validate the specificity of the siRNAs used.

      We thank the reviewer for the suggestion. To validate the specificity of the siRNAs used, we performed rescue experiments using rescue plasmids in which the siRNA targeting sequence was synonymously mutated (Supplementary Table S6). The efficiency of siRNA knockdown and effects of rescue experiments were evaluated by both qPCR (Supplementary Figures S4.A-C) and Western Blot (Figures 7.J-K, Supplementary Figures S4.D-E, H-I). The results showed that siRNAs significantly reduced the expression of Cep78, Ift20, and Ttc21a at both mRNA (Supplementary Figures S4.A-C) and protein levels (Figures 7.J-K, Supplementary Figure S4.A-C). Meanwhile, with siRNA treatment, the rescue plasmids rescued the expression of Cep78, Ift20, and Ttc21a at both mRNA (Supplementary Figures S4.A-C) and protein levels (Figures 7.J-K, Supplementary Figures S4.D-E, H-I) compared with the control groups.

      In the rescue experiments, we further evaluated whether the effects of Cep78, Ift20 and Ttc21a siRNAs on cilia and centriole lengths are specific. The results showed that the changes in cilia and centriole lengths caused by Cep78, Ift20 and Ttc21a siRNAs could be rescued by overexpression of the rescue plasmids Cep78syn-HA, Ift20syn-Flag and Ttc21asyn-Flag (Figures 7.N-S).

      5) Figure 7I: the resolution of the IFM is not very high and certainly not sufficient to demonstrate that CEP78, IFT20 and TTC21A co-localize to the same region on the centrosome, which one would have expected if they directly interact.

      We thank the reviewer for the constructive comments. To better demonstrate co-localization of CEP78, IFT20 and TTC21A on the centrosome, we overexpressed Cep78-Halo, Ift20-mCherry and Ttc21a-mEmerald in NIH3T3 cells by lentivirus, and acquired super-resolution images with SIM (N-SIM, Nikon, Tokyo, Japan). The SIM results showed that Ift20 and Ttc21a co-localized with Cep78 (Figure 7F). Cep78 was previously reported to localize at the centriole (Goncalves et al., 2021). The co-localization of Cep78, Ift20 and Ttc21a indicates that Cep78 may play an important role in the regulation of Ift20 and Ttc21a at the centriole. Our interaction analysis revealed that Cep78 interacted with Ift20 and Ttc21a (Figure 7A-C, Supplementary Figure S7), and formed a complex with Ift20 and Ttc21a (Figure 7E). Loss of Cep78 down-regulated the expression of and interaction between Ift20 and Ttc21a (Figures 7D, G-M).

      6) It is not really clear what information the authors seek to obtain from the global proteomic analysis of elongating spermatids shown in Figure 3N, O and Tables S2 and S3. Also, in Table S2, why are the numbers for CEP78 in columns P, Q and R so high when Cep78 is knocked out in these spermatid lysates? Please clarify.

      We thank the reviewer for the comments. Our global proteomic analysis showed that the majority of differentially expressed proteins were down-regulated (Figure 3N), and that many of them are centrosome- and cilia-related proteins important for sperm flagella and acrosome structures (Figure 3O). These results provide insights into the downstream molecular events underlying the sperm flagella and acrosome defects after Cep78 deletion.

      As to the quantification of CEP78 expression in TMT-based proteomics analysis, the ratio between Cep78-/- and Cep78+/- is relatively high due to the ratio compression effect, a well-known phenomenon in TMT-based proteomics analysis (PMID: 25337643). The actual difference in protein expression is usually larger than the ratio calculated from TMT signals. Indeed, our Western blot analysis of CEP78 protein showed absence of expression in Cep78-/- testis. Although TMT labelling has the disadvantage of ratio compression (PMID: 32040177, PMID: 23969891), it is widely used in quantitative proteomics analysis, and has been demonstrated to be able to identify key pathways and proteins (PMID: 30683861, 33980814).
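
      As a rough illustration of ratio compression, the sketch below uses a simplified additive model in which co-isolated peptides contribute the same reporter-ion signal to both channels. The function name and all intensities are hypothetical and are not taken from Table S2; the point is only that a protein truly absent from the knockout can still yield a nonzero measured ratio.

```python
def observed_ratio(true_ko, true_ctrl, interference):
    """Measured KO/control reporter ratio under a simple additive model.

    `interference` is the reporter signal from co-isolated peptides,
    assumed equal in both channels (an illustrative simplification).
    """
    return (true_ko + interference) / (true_ctrl + interference)


# Hypothetical reporter intensities (arbitrary units), not study data.
true_ko, true_ctrl = 0.0, 100.0

for interference in (0.0, 10.0, 30.0):
    ratio = observed_ratio(true_ko, true_ctrl, interference)
    print(f"interference={interference:5.1f}  observed KO/control={ratio:.3f}")
```

      Under this model the measured ratio moves from zero toward 1 as interference grows, which is the direction of bias described above.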

      7) Figure 1F and Figure 4K: the data needs to be quantified.

      We thank the reviewer for this suggestion. For Figure 4K, we stained Cep78+/- and Cep78-/- spermatids with anti-Centrin 1 to measure the centriole length. The statistical data on centriole length are provided (Figure 4L), showing significantly increased centriole lengths in Cep78-/- spermatids.

      For Figure 1F, we quantified the immunofluorescence intensity of cone arrestin in light-adapted retinas of Cep78+/- and Cep78-/- mice at 3 months. The results indicate that the immunofluorescence intensity of cone arrestin was lower in Cep78-/- mice.

      8) Figure 2A: It is difficult to see a difference in connecting cilium length in control and Cep78-/- mutant retinas based on the images shown here.

      Thank you for your suggestion. We have stained retinal cryosections from Cep78+/- and Cep78-/- mice with anti-Nphp1 to visualize the connecting cilium, and the data are provided in the revised Figure 2A-B.

      Reviewer #2 (Public Review):

      In this report, the authors have described the generation and characteristics of Cep78 mutant mice. Consistent with the phenotype observed in patients carrying mutations in CEP78, Cep78 knock-out mice show degeneration of photoreceptor cells as well as defects in sperm. The authors further show that the CEP78 protein can interact with IFT20 and TTC21A. Mutation in CEP78 results in a reduction in the protein levels of IFT20 and TTC21A and mislocalization of these two proteins, offering mechanistic insights into the sperm defects. Overall, the manuscript is well written and easy to follow. Phenotyping is thorough. However, improvement of the background section is needed. In addition, some of the conclusions are not sufficiently supported by the data, warranting further analysis and/or additional experiments. The Cep78 KO mouse model established by the authors will be a useful model for further elucidating the disease mechanism in humans and developing potential therapies.

      My comments are the following:

      1) Introduction. The statement that "CRD usually exists with combination of immotile cilia defects in other systems" is not correct. CRD due to ciliopathy can have cilia-related syndromic defects in other systems but it is a relatively small portion of all CRDs and the most frequently mutated genes are not cilia-related genes, such as ABCA4, GUCY2D, CRX.

      We thank the reviewer for the comments. We agree with the reviewer that only a small portion of CRDs are due to cilia defects and can have cilia-related syndromic defects in other systems. We corrected this statement at Page 4, Lines 77-79 of the revised version of our manuscript. In the revised version, the statement has been changed to “A small portion of CRDs are due to retina cilia defects, and they may have cilia-related syndromic defects in other systems[1].”

      2) Introduction: Page 4 CNGB1 encodes channel protein and not a cilia gene. It should be removed since it does not fit.

      We thank the reviewer for the comment. Following the reviewer’s suggestion, we removed the description “mutations in CNGB1 cause CRD and anosmia [3]” at Page 4, Line 81 of the revised manuscript.

      3) Page 5: given the previous report of CEP78 patients with retinal degeneration, hearing loss, and reduced fertility, the statement "we report CEP78 as a NEW causative gene for a distinct syndrome...TWO phenotypes....." is not accurate.

      We thank the reviewer for the comments. We have removed the description of CEP78 as a “NEW” causative gene at Page 5, Line 104 of the revised version of our manuscript. The revised sentence is “In this study, based on results of a male patient carrying CEP78 mutation and Cep78 gene knockout mice, we report CEP78 as a causative gene for CRD and male sterility.”

      4) Figure 1F, the OS of the cone seems shorter, which might be the reason for weaker arrestin staining in the mutant compared to the heterozygous. Also, it would be better to quantify the staining to substantiate the statement.

      Thanks for this suggestion. For Figure 1F, we have quantified the immunofluorescence intensity of cone arrestin in light-adapted retinas of Cep78+/- and Cep78-/- mice at 3 months. The results indicate that the immunofluorescence intensity of cone arrestin was significantly lower in Cep78-/- mice.

      5) Figure 1K, panel with lower magnification would be useful to get a better sense of the overall structure defect of the retina. Is the defect observed in the cone as well?

      We thank the reviewer for the comment. As suggested, we have provided lower-magnification TEM images to show the overall structure, which reveal disruption of most outer segments in the Cep78-/- retina. It is difficult to distinguish whether a disordered outer segment belongs to a cone or a rod cell. The images are now provided as Figure 1L in the revised manuscript.

      We observed abnormal photopic b-wave amplitudes (Figure 1B, E) and decreased intensity of cone arrestin in light-adapted retinas (Figure 1F, G) in Cep78-/- mice, which indicate that cone cell function is impaired.

      6) Figure 2A, NPHP1 or other markers specifically label CC would be more useful to quantify the length of CC. Also need to provide a notation for the red arrows in Figure 2. In addition, the shape of CC in the mutant seems differ significantly from the control. It seems disorganized and swollen.

      We thank the reviewer for the suggestion. Following the reviewer’s suggestion, we stained retinal cryosections from Cep78+/- and Cep78-/- mice with anti-Nphp1 to visualize the connecting cilium, and quantified the length of the CC. The results showed that connecting cilia were shorter in Cep78-/- mice. These data are shown in Figure 2A-B.

      In addition, we observed by TEM that the upper parts of connecting cilia were swollen, with disorganized microtubules (Figure 2E-G). The red arrows in Figure 2E-G indicate the swollen upper part of connecting cilia and disorganized microtubules of Cep78-/- photoreceptors. We added this description to the figure legend.

      7) Evidence provided can only indicate direct interaction among CEP78/IFT20/TTC21A.

      Thanks for the comment. To further validate the interaction between Cep78 and Ttc21a or Ift20, we performed reciprocal co-IP between Cep78 and Ttc21a or Ift20 by overexpression (Figure 7A-C), and added Gapdh (Figure 7D) and Ap80-NB-HA (Supplementary Figures S7A-C) as negative controls in the co-IP to exclude non-specific interaction. In addition, we provided evidence that Cep78, Ift20 and Ttc21a formed a complex, as they all co-fractionated in a testicular protein complex with a size between 158 kDa and 670 kDa in size exclusion chromatography (Figure 7E). We also performed super-resolution analysis of immunofluorescent localizations, and observed co-localization between Cep78 and Ttc21a or Ift20 by SIM. Based on these data, we think that Cep78 interacts with Ttc21a and Ift20 and that they form a complex. We rephrased “direct interaction” as “interaction” in the manuscript.

      Reviewer #3 (Public Review):

      Authors were aiming to bring a deeper understanding of CEP78 function in the development of cone-rod dystrophy as well as to demonstrate previously not reported phenotype of CEP78 role in male infertility.

      It is important to note, that the authors 're-examined' already earlier published human mutation, 10 bp deletion in CEP78 gene (Qing Fu et al., 10.1136/jmedgenet-2016-104166). This should be seen as an advantage since re-visiting an older study has allowed noting the phenotypes that were not reported in the first place, namely impairment of photoreceptor and flagellar structure and function. Authors have generated a new knockout mouse model with deleted Cep78 gene and allowed to convey the in-depth studies of Cep78 function and unleash interacting partners.

      The authors master classical histology techniques for tissue analysis, immunostaining, light, confocal microscopy. They also employed high-end technologies such as spectral domain optical coherence tomography system, electron, and scanning electron microscopy. They performed functional studies such as electroretinogram (ERG) to detect visual functions of Cep78-/- mice and quantitative mass spectrometry (MS) on elongating spermatids.

      The authors used elegant co-immunoprecipitation techniques to demonstrate trimer complex formation.

      Through the manuscript, images are clear and support the intended information and claims. Additionally, where possible, quantifications were provided. Sample number was sufficient and in most cases was n=6 (for mouse specimens).

      The authors could provide more details in the materials and methods section on how some experiments were conducted. Here are a few examples. (i) Authors have performed quantitative mass spectrometry (MS) on elongating spermatids lysates, however, did not present specifically how elongating spermatids were extracted. (ii) In the case of co-IPs authors should provide information on what number of cells (6 well-plate, 10 cm dish etc) were transfected and used for co-IPs. Furthermore, authors could more clearly articulate what were the novel discoveries and what confirmed earlier findings.

      The authors clearly demonstrate and present sufficient evidence to show CEP78/Cep78 importance for proper photoreceptor and flagellar function. Furthermore, they succeed in identifying trimer complex proteins which help to explain the mechanism of Cep78 function.

      The given study provides a rather detailed characterization of human and mouse phenotype in response to the CEP78/Cep78 deletion and possible mechanism causing it. CEP78 was already earlier associated with Cone-rod dystrophy and, this study provides a greater in-depth understanding of the mechanism underlying it. Importantly, scientists have generated a new knock-out mouse model that can be used for further studies or putative treatment-testing.

      CEP78/Cep78 deletion association with male infertility is not previously reported and brings additional value to this study. We know, from numerous studies, that testes express multiple genes; some are unique to testes and some are co-expressed in multiple tissues. However, very few genes are well studied and have clinical significance. Studies like this, combining patient and animal model research, allow to identify and assign function to poorly characterized or yet unstudied genes. This enables data to use in basic research, patient diagnostics and treatment choices.

      We would like to thank Reviewer #3 (Public Review) for positive comments on our work.

      In response to the reviewer's suggestion to provide more detail in the materials and methods, we added a description of the STA-PUT method for spermatid purification at Page 34, Lines 729-741 of the revised manuscript. We also stated the amount of cells used for co-IPs, “10 cm dish HEK293T were transfected (Vazyme, Nanjing, China) with 5 μg plasmid for each experimental group.”, at Page 36, Lines 783-784 of the revised manuscript.

      We also highlighted our new discoveries and ensured that all previously published findings are accompanied by references. We added “We further explored whether c.1629-2A>G mutation in this previously visited patient would disturb CEP78 protein expression and male fertility. Blood sample was collected from this patient and an unaffected control for protein extraction.” at Page 17, Line 335. We also added “The major findings of our study are as follows: we found CEP78 as the causal gene of CRD with male infertility and multiple morphological abnormalities of the sperm flagella using Cep78-/- mice. A male patient carrying CEP78 c.1629-2A>G mutation, whom we previously reported to have CRD [8], was found to have male infertility and MMAF in this study. Cep78 formed a trimer with sperm flagella formation essential proteins IFT20 and TTC21A (Figure 8), which are essential for sperm flagella formation[16, 18]. Cep78 played an important role in the interaction and stability of the trimer proteins, which regulate flagella formation and centriole length in spermiogenesis.” to the first paragraph of the discussion, which is Page 21, Lines 447-456 of our revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This excellent manuscript challenged the premise that NF-kappaB and its upstream kinase IKKbeta play a role in muscle atrophy following tenotomy. Two animal models were used - one leading to enhanced muscle-specific NF-kappaB activation and the other a muscle-specific deletion. In both models, there was no significant relationship to observed muscle changes following tenotomy. Overall this work is significant in that it challenges the existing dogma that NF-kappaB plays a crucial role in muscle atrophy.

      Surprisingly the authors noted that there were basal differences observed in the phenotypes of their models that were sex-dependent. They note that male mice lose more muscle mass after tenotomy and specifically type 2b fiber loss.

      Overall this is an outstanding study that challenges the notion that NF-kappaB inhibitors are likely to improve muscle outcomes following injuries such as rotator cuff tears. Its main weakness is that there were no pharmacological arms of investigation; this fails to definitively exclude the hypothesis that inhibition may exert some effect in healing, perhaps in surrounding non-muscle matrix tissue that in turn may assist in healing.

      Thank you for your careful and thoughtful review. We agree that the finding that NFkb is not driving tenotomy-induced atrophy is both surprising and interesting. We look forward to further uncovering the atrophic mechanisms responsible. We also agree that an investigation using pharmacological NFkb inhibitors will improve our understanding of the full scope of the role of NFkb in the tenotomy pathology. As you and another reviewer note, this work has only blocked NFkb signaling in the mature muscle fiber and thus cannot assess the role of NFkb in satellite cell, fibroblast, or immune cell activation, etc., in the healing response. However, we avoided using these inhibitors in this study because their systemic effects could obscure the role of NFkb in the muscle fiber. While we believe that a pharmacological investigation is beyond the scope of this study, it will make an excellent follow-on investigation.

      Reviewer #2 (Public Review):

      The primary strength of this paper is a rigorous approach to 'negative' data. Did the authors definitively prove that NF-kB has no role in the tenotomy-induced atrophy? Probably not entirely, since there are limitations of the mouse model and the knockdown mice. There cannot be complete elimination of load since mice heal with some scar tissue, and the knockdown is not complete elimination. However, even with these limitations, this presents important findings that tenotomy, which induces mechanical unloading of the muscle-tendon unit, provides a unique biomechanical environment for the muscle to undergo atrophy, which warrants a more in-depth look given that these injuries are unique and extremely common. It must be mentioned that the results are entirely supported by their data and that even though the model is not 'perfect' it truly supports that NF-kB has a limited role in atrophy. The sex-mediated differences based on autophagy are a secondary hypothesis and are interesting but possibly less clinically relevant based on the differences shown.

      We appreciate your thoughts on the “negative” data in this study. A manuscript in which the data refute your hypothesis and that of the field is difficult to write. There is a higher burden of validation and closer scrutiny of limitations. We agree that the model does have some limitations, but overall it strongly supports a limited role for NFkb in tenotomy-induced muscle atrophy.

      The important next step for this group and others is to evaluate the 'how and why' of tenotomy atrophy if not through NF-kB. Is it that there are many redundant processes that the muscle may have to circumnavigate the NF-kB pathway given that it is so ubiquitous that the authors didn't see a difference? Could it be differences in axial vs appendicular muscle? Or should there be a closer look at the mechanosensors in the muscle cells to determine if there are other key drivers of atrophy? Regardless, this paper shows that tenotomy-induced muscle atrophy is unique and supports the conclusion that muscle has many ways to atrophy based on the injury it undergoes.

      We agree that the major next step for this work is to investigate the mechanism(s) responsible for tenotomy-induced atrophy. Autophagy in particular needs a more thorough investigation using autophagic inhibitors in naive wildtype mice to investigate its role in the sex-specificity of tenotomy-induced atrophy. The question of axial vs. appendicular muscle is intriguing. There could also be an upper vs. lower body difference that is worth exploring in future work.

      Reviewer #3 (Public Review):

      The authors provided thorough analyses of muscle morphology, biochemistry, and function, which is a major strength of the study. However, there are some key confounding variables authors failed to address. For example, the difference in the estrous cycle in female animals was not controlled. The study could have been significantly improved by controlling sex hormone levels or at least testing differences in response to injury.

      We appreciate your careful and insightful review of our work. We designed this study to assess the role of myofiber NFkb in tenotomy-induced atrophy, which led us to a rigorous assessment of morphology, biochemistry and function, which we agree is the strength of the study. We also agree that a major limitation of this study is that the secondary observations of sex-specificity and autophagic signaling are not as well controlled or supported. This is because these observations were made at the end of the study when the histological analyses were completed by the blinded rater. The sex-specificity in the basophilic puncta that the rater observed prompted us to reconsider the sex-specificity in our other data and to stain for autophagic vesicles. As you suggest, a rigorous assessment of sex-specificity would need to control for estrous cycle and analyze sex hormones, which would require initiating another study that plans for these variables in advance. We think this is beyond the scope of the current question of the role of NFkb in tenotomy-induced atrophy, but think it should be undertaken as a follow-on study to eliminate the confounding variables of genetic manipulation and tamoxifen treatment.

      However, since we still need to report the sex specificity we observed while ensuring that our findings are not misconstrued, we reviewed the language in the manuscript to emphasize that these are retrospective observations that require further investigation. We have also added discussion of these variables and their potential influence on the results to the Discussion.

      Discussion: “Additionally, it is important to note that estrous cycle was not controlled in these mice and sex hormone levels weren’t measured in this study. These preliminary observations, though intriguing, will require more rigorous follow up evaluations to define the interaction between sex, tenotomy, and autophagy in naïve wildtype mice.”

      Furthermore, more data are needed to link NFkB signaling and autophagy to make any kind of conclusions. Overall, in the current form of the manuscript, the presented data seem underdeveloped, and the addition of more supporting data could significantly improve the quality of the manuscript and enhance our understanding of NFkB signaling and muscle wasting in rotator cuff injury.

      We agree that more data are needed to complete the picture of autophagy in tenotomy-induced muscle atrophy. The p62 and LC3 positive intracellular puncta in male tenotomized muscle are distinctive, but only limited conclusions can be drawn physiologically because 1) they are only present in a fraction of fibers and 2) it is impossible to tell whether they result from increased autophagic flux or altered vesicle processing. Western blot for LC3 (and now p62) indicates only small changes in total protein, but since these proteins are synthesized and degraded during active autophagy, it is possible for their levels to remain constant while flux increases. Direct measures of autophagic flux would require treating mice with an autophagosome block which would require initiation of another study. However, we agree with the reviewer that we can add some additional measures to better characterize the instantaneous state.
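
      The point that total protein can stay constant while flux increases can be illustrated with a minimal sketch. It assumes a simple one-compartment model, dP/dt = k_syn - k_deg * P, which is our illustrative assumption and not a model fitted in the study; the function name `steady_state` and all rate values are hypothetical.

```python
def steady_state(k_syn, k_deg):
    """Return (protein level, flux) at steady state of dP/dt = k_syn - k_deg * P.

    Illustrative assumption: zero-order synthesis, first-order degradation.
    """
    level = k_syn / k_deg      # steady-state amount of the marker protein
    flux = k_deg * level       # turnover through the pathway (equals k_syn)
    return level, flux

# Hypothetical rates in arbitrary units, for illustration only.
basal = steady_state(k_syn=1.0, k_deg=0.5)
induced = steady_state(k_syn=2.0, k_deg=1.0)  # synthesis and degradation both doubled

print("basal   level, flux:", basal)
print("induced level, flux:", induced)
```

      In this sketch the marker level is identical in both conditions although turnover has doubled, which is why a static Western blot alone cannot distinguish the two states.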

      We have added analysis of p62 protein expression to LC3 since p62 protein content in muscle can be decoupled from LC3 (PMID: 27493873). We also added expression data for genes involved in autophagy (Lc3b, Gabarapl1, Becn1, Bnip3, and Atg5). Finally, we have commented on the limitations of our data in the Discussion.

      Discussion: “Evidence for autophagy regulating tenotomy-induced atrophy has been mounting over recent years (Bialek et al., 2011; Gumucio et al., 2012; Joshi et al., 2014; Ning et al., 2015; Hirunsai & Srikuea, 2021). The evidence presented here supports this contention, but we find surprisingly small effect sizes for all markers investigated. This could be because we are not directly assessing autophagic flux and so are missing some temporal dynamics since synthesis and degradation are ongoing simultaneously.”

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Summary of changes

      We thank all three reviewers for their constructive feedback on our manuscript. We have now performed extensive experiments, analyses, and rewriting of our manuscript to address all their concerns. We believe that these changes significantly improve the rigor of our conclusions and the clarity of our discussion. We highlight below key experiments, analyses, and rewriting in the revised manuscript, followed by a detailed point-by-point response.

      1) We have now performed experiments using alternative uORF donor sequences to demonstrate the robustness of uORF repression to changes in uORF length.

      2) By mutating out near-cognate start codons within uORF2, we have now demonstrated that near-cognate start codon initiation within uORF2 does not impact repression.

      3) To quantify the dynamic range of our dual luciferase assay, we have now mutated out the NLuc start codon. We find that repressive uORF2 constructs have expression levels that are still > 20-fold above the no-start-codon control values.

      4) We have now analyzed ribosome profiling coverage on uORFs (supplementary figure 5), and we show that several uORFs with known elongation stalls lack evidence of 40S and 80S subunit queueing 5′ to stalls, consistent with our collision-induced ribosome dissociation model.

      5) We have now provided detailed discussion of footprint length choice in our modeling and the role of codon choice in our experiments.

      6) We have now added a new main figure that provides a graphical representation of reactions considered in our kinetic modeling. This figure will make our modeling assumptions more transparent and accessible to readers with less computational expertise.

      Reviewer #1:

      Summary

      Bottorff et al test several models of uORF-mediated regulation of main ORF translation using the uORF2 of CMV UL4 gene, a system that has been previously experimentally characterized by the authors. They first train a computational model to recapitulate the observed experimental effects of mutations in uORF2, and then use the model to infer which uORF parameters may confer buffering against reduced ribosome loading that typically occurs upon biological perturbation. The authors then find that: i) the uORF2 confers buffering, ii) the uORF2 mechanism adjusts to computational predictions for the collision-mediated 40S dissociation model of uORF-mediated regulation.

      Significance

      This manuscript represents an interesting effort to distinguish mechanisms of uORF-mediated regulation based on mathematical modeling, and might be useful for the translation community. My expertise: Regulation of translation.

      We thank Reviewer 1 for a succinct summary of our main conclusions and highlighting the significance of our work to the translation community.

      Major comments

      1) Figure 4 (Figure 5 in revised version): Which is the dynamic range of the WT vs the no-stall construct? In the WT construct, main ORF translation is already quite repressed, and detecting further repression may be more difficult than in the no-stall construct. In other words, the differences that authors are detecting between the WT and no-stall constructs might be due to a potential lower dynamic range of the WT construct.

      To measure the dynamic range of our reporter assay, we have now mutated the start codon of the NLuc reporter ORF. We reasoned that this construct provides a lower bound on measurable NLuc signal. Expression of the resulting no-NLuc-start-codon reporter was at least 20-fold lower than that of the WT construct (Fig. S1A). Importantly, we also see that the raw NLuc signal of the WT construct is at least 20-fold over the background (Fig. S1B). Thus, the differential response of WT and no-stall constructs is not simply due to lower dynamic range of the WT construct.
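
      The dynamic-range argument above can be sketched numerically. This is a minimal illustration only: the luminescence values are hypothetical placeholders chosen for the example, not measurements from the manuscript, and the helper name `fold_over_floor` is our own.

```python
def fold_over_floor(signal, floor):
    """Return how many fold a reporter signal sits above the assay floor."""
    if floor <= 0:
        raise ValueError("floor must be positive")
    return signal / floor

# Hypothetical normalized NLuc signals (arbitrary units), for illustration only.
no_start_codon = 1.0   # assumed lower bound of measurable signal
wt_uorf2 = 25.0        # assumed repressed WT reporter

headroom = fold_over_floor(wt_uorf2, no_start_codon)
print(f"WT reporter sits {headroom:.0f}-fold above the floor")
```

      With headroom of this size, a several-fold further decrease in the WT reporter would still be measurable, which is the sense in which the WT construct is not limited by dynamic range.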

      2) The authors conclude that uORF2 follows the collision-mediated 40S dissociation model, based on fitness of their experimental results with predictions from their mathematical modeling regarding distance between uORF2 initiation codon and the stalling site. But can the authors actually directly prove that there are no 40S subunits accumulating behind the stalled 40S using Ribo-Seq or TCP-Seq?

      We have now examined existing 80S Ribo-seq and 40S TCP-seq datasets to determine whether queued 40S or 80S ribosomes can be detected at known stall sites. Stern-Ginossar et al. (2012) performed 80S Ribo-seq during hCMV infection. In this dataset, while the stall at the UL4 termination codon has a very high ribosome density, few elongating ribosomes are seen queued behind the stalled 80S, consistent with an absence of 80S ribosome queuing (Fig. RR1). By contrast, another well-studied elongation stall in the Xbp1 mRNA shows ~30 nt periodic peaks in ribosome density indicative of ribosome queues (Fig. RR2). An important caveat is that queued ribosomes could be systematically underrepresented in standard Ribo-seq datasets due to incomplete nuclease digestion (Darnell et al., 2018; Subramaniam et al., 2014; Wolin and Walter, 1988).

      Since there is no 40S TCP-Seq dataset during hCMV infection, we examined other known stalls on human mRNAs (Fig. RR3 below; Fig. S5 in our manuscript). We examined small ribosomal subunit profiling data from human uORFs with conserved amino acid-dependent elongating ribosome stalls (Figure S5A). Ribosome density read counts are low across all of these uORFs, showing no evidence of ribosome queuing. Subtle queues might not be observed given these low read counts from insufficient capture of small ribosomal subunits. Nevertheless, we do not observe any evidence of queueing upstream of elongating ribosome stalls in these data. We note these observations in our Discussion section as follows (lines 688-712): “Although our data from UL4 uORF2 does not support the queuing-mediated enhanced repression model (Fig. 1C) (Ivanov et al., 2018), this model might describe translational dynamics on other mRNAs. Translation from near-cognate start codons is resistant to cycloheximide, perhaps due to queuing-mediated enhanced initiation, but sensitive to reductions in ribosome loading (Kearse et al., 2019). Loss of eIF5A, a factor that helps paused elongating ribosomes continue elongation, increases 5′ UTR translation in 10% of studied genes in human cells, augmented by downstream in-frame pause sites within 67 codons, perhaps also through queuing-mediated enhanced initiation (Manjunath et al., 2019). There is also evidence of queuing-enhanced uORF initiation in the 23 nt long Neurospora crassa arginine attenuator peptide (Gaba et al., 2020) as well as in transcripts with secondary structure near and 3′ to start codons (Kozak, 1989). Additional sequence elements in the mRNA might determine whether scanning ribosome collisions result in queuing or dissociation. Small subunit profiling data (Wagner et al., 2020) from human uORFs that have conserved amino acid-dependent elongating ribosome stalls do not show evidence of scanning ribosome queues (Fig. S5A), consistent with the collision-mediated 40S-dissociation model. Subtle queues might not be observed given these low read counts from insufficient capture of small ribosomal subunits.”

      3) Experimental data in Figures 2, 4 and 5 include 3 technical replicates. Sound conclusions typically require biological replicates. Further, the number of replicates in Figure 6 has not been indicated.

      As suggested by the reviewer, we have now included biological replicates for all luciferase assays [Figures 2, 5, 6, and 7 that were previously 2, 4, 5, and 6] that were technical replicates in the previous version. This replication does not alter any of our conclusions. We have now included the number of biological replicates for Figure 7 (former Figure 6).

      Minor comments

      1) Figure 4 (Figure 5 in revised version): It is strange that a PEST sequence had to be introduced in the construct of part B in order to observe reliable differences, but not in constructs of parts A and C. Can the authors explain?

      We introduced the PEST sequence for part B because we wanted to measure the reporter response to treatment with a drug that reduces translation initiation. The PEST sequence increases the turnover rate of the reporter protein. Without the PEST sequence, the luminescence signal will be dominated by the reporter expression before the drug was added. However, in parts A and C, initiation rate was altered through genetic mutations and measuring their expression under basal conditions does not require a PEST sequence. Except in situations where a quick dynamic response needs to be measured such as in the drug treatment in part B, reporters without PEST sequences are simpler to interpret due to the absence of proteasome-mediated degradation and higher overall signal.
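
      The reasoning about reporter turnover can be made concrete with a minimal sketch. It assumes first-order degradation of the reporter and an instantaneous drop in synthesis when the drug is added; the half-lives and the 50% reduction in synthesis are hypothetical values chosen for illustration, not measured parameters, and `relative_signal` is our own helper name.

```python
import math

def relative_signal(t_hours, half_life_hours, synthesis_fraction):
    """Reporter level at time t after drug addition, relative to the pre-drug level.

    Assumes dP/dt = k_syn - k_deg * P, with synthesis dropping at t = 0 to
    `synthesis_fraction` of its pre-drug rate (illustrative assumption).
    """
    k_deg = math.log(2) / half_life_hours
    # The level relaxes exponentially from 1.0 toward the new steady state.
    return synthesis_fraction + (1 - synthesis_fraction) * math.exp(-k_deg * t_hours)

# Hypothetical half-lives: short with a PEST tag, long without one.
for label, half_life in [("with PEST", 1.0), ("without PEST", 10.0)]:
    level = relative_signal(t_hours=6.0, half_life_hours=half_life, synthesis_fraction=0.5)
    print(f"{label}: {level:.2f} of pre-drug signal after 6 h")
```

      Under these assumptions the short-lived reporter approaches its new steady state within the treatment window, whereas the long-lived reporter still largely reflects protein made before the drug was added.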

      2) Figure 6 (Figure 7 in revised version): Unfortunately, the authors find no other human uORFs with terminal diproline motifs that are so essential for main ORF repression as uORF2. In this light, can the authors comment further on the usefulness of their findings for human genes? Have the authors searched for viral RNAs with similar features? Please, notice that the gene PPP1R37 has not been mentioned in the main text.

      The UL4 and human uORFs differ in their sequence determinants of translational repression. UL4 uORF2 represses translation entirely through nascent peptide-mediated stalling. While the terminal diproline motif in UL4 uORF2 is necessary for main ORF repression, it is not sufficient. A number of other residues in the UL4 uORF2 peptide play a critical role in repression (Cao and Geballe, 1996; Matheisl et al., 2015). Thus, it is not surprising that human uORFs that we identified based solely on the presence of terminal diproline motifs confer only a modest decrease in repression upon mutating the terminal proline. The human uORFs containing these terminal diprolines may partially repress translation via nascent peptide effects, but the majority of the repression likely arises from siphoning of scanning ribosomes from the main ORF (Fig. 1A in our manuscript) and inefficient termination following translation of consecutive prolines (Cao and Geballe, 1996; Cao and Geballe, 1998; Janzen et al., 2002; Matheisl et al., 2015). Our current understanding of features in the nascent peptide that mediate translational repression (Wilson et al., 2016) is insufficient to bioinformatically identify elongation-stall-containing uORFs in human or viral genomes, so we simply looked for terminal diprolines. Despite this limitation, we note that the modeling approaches and experimental perturbations developed in our work can be applied to study ribosome kinetics on any repressive uORF, independent of the mRNA or peptide sequence underlying the repression. As suggested by Reviewer 1, we have now included all the studied uORFs in the main text.

      Reviewer #2:

      Summary

      In this paper, the authors are exploring the uORF regulatory mechanism. They first discussed five general models how uORFs might work to repress and buffering main ORF translation, then they mainly focus on the UL4 uORF2 for the potential mechanism. They use both computer modeling and experimental validation with reporter assay in 293t cell line. Based on their model, and few experimental results when they change the translation initiation rate and/or length of dORF, they propose it may work through 40S dissociation model, since the buffering effect is not uORF length sensitive.

      Significance

      It is an interesting area, using modeling with experiment validation to understand uORF regulation mechanism, the kinetics and interplay between different translation steps, it will help us to understand uORF buffering in stress conditions. Also bring modeling method with reporter validation to the translation field, will provide clues to the molecular mechanism study, especially in complex situation.

      We thank Reviewer 2 for a comprehensive summary of our work and noting the uniqueness and usefulness of our experiment-integrated modeling approach to the translation field.

      Major comments

      • Are the key conclusions convincing? The modeling for different mechanisms is insightful, but some modeling parameters and experimental validation are not conclusive and validation of few of them can enforce the conclusions.

      We have now performed the key validation experiments suggested by Reviewer 2, notably: 1. mutating out near-cognate start codons in the UL4 uORF2 coding sequence and 2. increasing UL4 uORF2 length using two unrelated protein coding sequences. Please see responses to specific comments below for further details.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Yes, the part about queuing and length sensitive is not convincing to me, it should be modified and reduce the statement strength.

      We agree about reducing the statement strength and have altered our statements as suggested by the reviewer. Specifically, we have now expanded the rationale for the choice of footprint lengths of 40S subunits. Please see responses to specific comments below for further details.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Yes, please see the specific concerns • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. They will need to re-think about the modeling, and validation in Figure 5, there are validation experiments that can be done in weeks and in a cost-efficient manner that can enforce the conclusions.

      We have performed the experiments suggested by the reviewer. See responses below.

      • Are the data and the methods presented in such a way that they can be reproduced? Most of them are good • Are the experiments adequately replicated and statistical analysis adequate? Yes Specific concerns 1) It is a bit confusing to me in Figure 2C, the reporter assays, that non-start codon reporter and non-stall reporter has same expression level. In theory, the non-stall reporter still has uORF there, so it should repress main ORF expression, and have lower expression level than the non-start reporter, where there is no uORF, no repression. In other uORFs they tested in Figure 6 (Figure 7 in revised version), the non-stall reporters are lower than non-start reporter. Since data they use to build the model is Figure 2B, and calculate the parameters for the whole paper, I just want to make sure it is making sense. I noticed there is another CTG in frame on the 4th codon, this may be alternative start codon in the non-start reporter to trigger some repression.

      To address Reviewer 2’s concern about alternative start codon usage in the non-start reporter, we have now mutated out all near-cognate start codons known to initiate translation with high frequency (Kearse and Wilusz, 2017). These near-cognate start codons consisted of Leu4 CTG, Leu11 CTG, Leu14 TTG, and Leu15 CTG and were mutated to CTA, CTA, TTA, and CTA, respectively. We find that removing the uORF2 near-cognate start codons does not significantly alter NLuc expression (Fig. S1A). This experiment merely rules out one possible source of these similar expression levels. We expect that the very similar NLuc expression levels of the uORF2 no-start and no-stall reporters can be rationalized for the following reasons: 1. uORF2 initiation frequency is quite low. We estimate it to be 5% or less in our modeling based on previous measurements (Cao and Geballe, 1995). Thus, the maximum theoretically possible difference in NLuc expression between no-start and no-stall reporters is 5% or less. 2. Further, re-initiation after uORF2 translation is frequent. We estimate it to be around 50% within our manuscript, which will further decrease repression in the no-stall mutant. Thus, we expect the no-stall mutant to decrease the flux of scanning ribosomes at the main ORF by 2-3% compared to the no-start mutant. 3. Finally, a subtle but important point to note is that our reporter assays are measuring NLuc expression and not the flux of scanning ribosomes at the main ORF NLuc start codon. Since the NLuc ORF has a strong start codon context (GCCACC) and the flux of scanning ribosomes is already high for the no-start and no-stall mutants, slight changes in the flux of scanning ribosomes are unlikely to impact NLuc expression. This is because start codon selection is not rate-limiting for protein expression under these conditions. This last point is clearly seen in high-throughput reporter assays, where mutations that impact reporter expression in a non-optimal context have little or no effect in an optimal context (see Fig. 5B, 5C in Noderer et al., 2014).

      Thus, in summary, even if the flux of scanning ribosomes is decreased by 3-5% by the no-stall uORF2 mutant compared to the no-start uORF2 mutant, we expect the effect on NLuc expression to be negligible and below the limit of our experimental resolution (which is ~10% based on the standard error across technical replicates).
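
      The arithmetic behind this estimate can be sketched as follows. This is a minimal illustration and not the kinetic model used in the manuscript: the multiplicative formula and the function name `flux_reduction` are assumptions introduced here, and the numerical values are the upper-bound estimates quoted above.

```python
# Back-of-the-envelope estimate of how much a no-stall uORF2 reduces the flux
# of scanning ribosomes reaching the main ORF, relative to the no-start mutant.
# The simple product formula is an illustrative assumption, not the full model.

def flux_reduction(p_init: float, p_reinit: float) -> float:
    """Fraction of scanning ribosomes lost to the uORF.

    p_init   : probability that a scanning ribosome initiates at the uORF
    p_reinit : probability that a ribosome resumes scanning (re-initiates)
               after terminating at the uORF
    """
    return p_init * (1.0 - p_reinit)

uorf2_init = 0.05         # uORF2 initiation frequency (5% or less)
uorf2_reinit = 0.50       # re-initiation frequency (around 50%)
assay_resolution = 0.10   # ~10% standard error across technical replicates

loss = flux_reduction(uorf2_init, uorf2_reinit)
print(f"estimated flux reduction: {loss:.1%}")
print("above assay resolution:", loss > assay_resolution)
```

      Under these assumed values the estimated loss of scanning flux falls well below the stated ~10% experimental resolution, which is consistent with the no-start and no-stall reporters being indistinguishable in the assay.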

      Regarding the different behavior of the human uORFs in our manuscript and UL4 uORF2, note the response to Reviewer 1 regarding the usefulness of our human uORF findings.

      2) All the modeling and prediction the authors do are based on average, but we know translation is very heterogeneous. For each ribosome or each 40S, the kinetics varies a lot, the authors should discuss about this part.

      We now discuss translation heterogeneity in the Discussion section in lines 781-794 as follows: “Translation heterogeneity among isogenic mRNAs has been observed in several single molecule translation studies (Boersma et al., 2019; Morisaki et al., 2016; Wang et al., 2020; Wu et al., 2016; Yan et al., 2016). This heterogeneity may arise from variability in intrasite RNA modifications (Yu et al., 2018), RNA binding protein occupancy, or RNA localization. We do not capture these sources of heterogeneity in our modeling since the observables in our simulations are averaged over long simulated time scales and used to predict only bulk experimental measurements. However, our models studied here can readily be extended through compartmentalized and state-dependent reaction rates (Harris et al., 2016) to account for the different sources of heterogeneity observed in single molecule studies.”

      3) For modeling related with the queuing-mediated model in Figure 1C. they use 30nt as the ribosome length to count the potential queuing to start codon. But 30nt is the 80S protected fragment with specific conformation. The protected fragment for 80S will change based on different status of ribosome conformation or elongation step. More importantly, for queuing, it is 40S, so they may have a different size. Based on previous 40S ribosome profiling (Archer, Stuart K., et al. Nature 535.7613 (2016): 570-574. And other papers), the length can vary from 19nt to very long, so I don’t think the 30nt length can be used to model queuing in 40S and length sensitivity in the uORF working mechanism.

      We thank Reviewer 2 for highlighting this issue of footprint length heterogeneity that we had not previously addressed. In our modeling, we assume homogenous ribosome footprints. While heterogeneous ribosome footprints have been observed for small ribosomal subunits (Bohlen et al., 2020; Wagner et al., 2020; Young et al., 2021) and elongating ribosomes (Lareau et al., 2014; Wu et al., 2019), we believe that our modeling of homogenous footprint length is appropriate for the following three reasons: First, with respect to the small ribosomal subunit footprint heterogeneity, we note that TCP-seq studies include crosslinking of eukaryotic initiation factors (eIFs). The presence of these eIFs is thought to be the main source of heterogeneity in scanning ribosome footprints (Bohlen et al., 2020; Wagner et al., 2020). Although crosslinking is often performed, it is not necessary to obtain scanning ribosome footprints, and homogenous 30 nt footprints are observed in the absence of crosslinking (Bohlen et al., 2020). Notably, Figure S2 of Bohlen et al. (2020), reproduced as Fig. RR4 below, shows that scanning SSU footprint lengths are tightly distributed around 30 nt when crosslinking is not used.

      Second, in the context of the strong, minutes-long UL4 uORF2 elongating ribosome stall (Cao and Geballe, 1998), collided ribosomes will wait for long periods of time relative to normal elongating or scanning ribosomes. Thus, we expect that associated eIFs dissociate from these dwelling ribosomes as they typically do during start codon selection or during translation of short uORFs (Bohlen et al., 2020). Third, a significant fraction of mRNAs exhibit cap-tethered translation in which eIFs must dissociate from ribosomes before new cap-binding events, and therefore collisions, can occur (Bohlen et al., 2020). Based on the above three points, we believe that modeling the footprint of only the scanning ribosomes, and not the associated eIFs, using a single 30 nt length is biologically reasonable. Footprint length heterogeneity of elongating ribosomes is much less drastic than that observed for scanning ribosomes and likely arises from different conformational states such as an empty or occupied A site (Lareau et al., 2014; Wu et al., 2019). While the different elongating ribosome footprints arise from differences in mRNA accessibility to nucleases, it is unclear whether the distance between two collided ribosomes changes across different ribosome conformations. For instance, the queues of elongating ribosomes observed at the Xbp1 mRNA stall occur at regular ~30 nt periodicity (Fig. RR2). Additionally, the stalled elongating ribosome is stuck in a pretranslocation state and has a defined, ~30 nt footprint (Wu et al., 2019), which only leaves room for one 5′ queued ribosome within UL4 uORF2 whose footprint is conformation sensitive. Finally, a small degree of scanning footprint heterogeneity is also accounted for by our modeling of backward scanning, which effectively introduces heterogeneity to collided scanning ribosome location on mRNAs (Figures 6A, S2D in our manuscript).
We have now summarized the above points in the Discussion section of the revised manuscript (lines 713-740).
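
      The geometric consequence of a fixed footprint can be sketched with simple integer arithmetic. This is a simplified illustration, not the simulation used in the manuscript: the function names, the 66 nt span, and the treatment of a queue as back-to-back 30 nt blocks are assumptions made here for illustration only.

```python
# Toy geometry of ribosome queuing with a single, homogeneous footprint.
# A queue is treated as back-to-back blocks of FOOTPRINT_NT nucleotides
# extending 5' from the stalled ribosome (illustrative assumption).

FOOTPRINT_NT = 30  # homogeneous footprint length assumed in the response

def ribosomes_that_fit(span_nt: int, footprint_nt: int = FOOTPRINT_NT) -> int:
    """Number of whole footprints that fit within a span of mRNA."""
    return span_nt // footprint_nt

def gap_to_start(distance_nt: int, footprint_nt: int = FOOTPRINT_NT) -> int:
    """Nucleotides left over between the last queued ribosome and the
    upstream start codon; 0 means a queued ribosome sits exactly there."""
    return distance_nt % footprint_nt

# Hypothetical 66 nt span: room for the stalled ribosome plus one queued one.
print(ribosomes_that_fit(66))

# The leftover gap repeats every FOOTPRINT_NT as the distance grows, which is
# the kind of periodic length dependence a strict queuing model would predict.
for distance in range(60, 121, 15):
    print(distance, gap_to_start(distance))
```

      In this toy picture, heterogeneous footprints would blur the periodicity, whereas a single fixed footprint preserves it; that is why the choice of footprint length matters for the length-sensitivity prediction.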

      4) For Figure 5B (Figure 6B in revised version), besides the modeling length part I have mentioned above, when the authors increase the length of uORF, the sequence is also changed, which may introduce other side effect. So, if the authors want to conclude about the queuing part, they should rethink about the length for both modeling and validation, plus control for the sequence they added to increase the length of uORF, for example use different sequence when manipulate the length.

      As suggested by the Reviewer, we have now varied the length of uORF2 using a different, unrelated donor sequence encoding the FLAG peptide and observe similar results (Fig. S4 in our manuscript) to our original experiment with the YFP-encoding sequence (Fig. 6B in our manuscript). A slight trend towards derepression with longer uORFs is observed in both cases. This effect might arise due to decreased stall strength caused by higher nascent peptide protrusion out of the exit tunnel leading to cotranslational folding (Bhushan et al., 2010; Nilsson et al., 2015; Wilson et al., 2016) or nascent chain factors (Gamerdinger et al., 2019; Weber et al., 2020) exerting a pulling force on the peptide. Importantly, we do not see the periodic change in repression predicted by the queueing model (Figure 6A, yellow-green lines).

      Minor comments • Specific experimental issues that are easily addressable. 5) It is unclear how the luciferase assays were analyzed considering the background noise. If the NLuc expression is low, close to the background, then how to extract or normalize the background will influence the expression level, thus fold change for different reporter/condition.

      To account for the luciferase background, we subtracted background from measured data values. To show that expression is rarely close to background (from mock transfections), we included a supplementary figure showing raw NLuc and FLuc values (Fig. S1B). Also note the response to Reviewer 1 regarding a no-start-codon control having a 20-fold lower signal than the WT UL4 uORF2 construct.
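
      A minimal sketch of this kind of background correction is shown below. The response states only that background was subtracted from measured values; the NLuc/FLuc ratio normalization, the function name, and all numbers here are hypothetical and serve only to illustrate why signals near background are sensitive to the correction.

```python
# Background-subtracted, FLuc-normalized NLuc signal (illustrative sketch).
# The ratio normalization and all numerical values are assumptions.

def normalized_nluc(nluc_raw: float, fluc_raw: float,
                    nluc_bg: float, fluc_bg: float) -> float:
    """Subtract mock-transfection background from each channel, then
    normalize NLuc by FLuc."""
    nluc = nluc_raw - nluc_bg
    fluc = fluc_raw - fluc_bg
    if fluc <= 0:
        raise ValueError("FLuc signal is at or below background")
    return nluc / fluc

# Hypothetical well far above background: correction barely matters.
strong = normalized_nluc(51_000, 101_000, nluc_bg=1_000, fluc_bg=1_000)

# Hypothetical well close to background: a small change in the assumed
# background changes the normalized value substantially.
weak_a = normalized_nluc(1_200, 101_000, nluc_bg=1_000, fluc_bg=1_000)
weak_b = normalized_nluc(1_200, 101_000, nluc_bg=1_100, fluc_bg=1_000)

print(strong, weak_a, weak_b, weak_a / weak_b)
```

      With these made-up numbers, shifting the assumed NLuc background by 10% halves the weak signal after correction, which is the concern the reviewer raises; showing raw values alongside mock transfections addresses it directly.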

      • Are prior studies referenced appropriately? yes • Are the text and figures clear and accurate? Mostly good • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Have a main figure about the modeling part.

      As suggested by the Reviewer, we have now added visual representations of the reactions as a new main figure (Fig. 3). We also moved the modeling workflow figure from the supplementary set of figures to this main figure (Fig. 3). We thank the reviewer for this suggestion, which greatly improves the presentation of our modeling methodology.

      • Place the work in the context of the existing literature (provide references, where appropriate). Recent years, there has been a lot of study about small open reading frames, while for uORFs are known to repress translation, the regulatory mechanism is not known yet, there are just different models not validated yet (Young & Wek, 2016). Also, under normal conditions and stress conditions, uORF can play both repressive and stimulative role in main ORF translation (Orr, Mona Wu, et al. NAR 48.3 (2020): 1029-1042.). This paper is the first study to put all the uORF working hypothesis with buffering effect together, they use modeling to explain how under each hypothesis, buffering may happen or not. • State what audience might be interested in and influenced by the reported findings. It will be interesting to people, who study molecular biology, biochemistry for translation regulation, especially uORFs. The modeling people may also find it interesting, how they could adapt modeling to complex biology process and contribute to the understanding. • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. I have extensive experience working in the translation regulation field and I feel extremely comfortable to discus all the experimental part including individual reporters as well as genome wide. But I do not consider an expert in the modelling section of this work.

      Reviewer #3 :

      Summary Small ORFs are prevalent in eukaryotic genomes with variety of functions. Recent technological advances enable their detection, yet our understanding on the mode of action remains quite rudimentary. The manuscript by Bottorff, Geballe and Subramaniam aims at elucidating the function of UL4 uORF in the CMV, and thus, it is on timely and topical research. The authors measure the uORF -controlled expression of the well-studies UL4 uORF and kinetically model the initiation behavior. Within a second uORF, a diproline pair controls initiation of the downstream main ORF sensing ribosomal collisions between a scanning small subunit and an 80S positioned at the canonical start of the main ORF. The stalling at both proline codons is envisioned as a kinetic window to sense any elongation-competent 80S at initiation and thus, control the ribosomal load and expression. Such diproline tandems are present in some uORFs in human transcriptome, hence representing more pervasive control mechanism. Significance I am unable to comment in depth on the modeling algorithms and simulations as this is outside of my expertise. The experiments are reasonably designed to test various models of uORF regulation and set the frame for the modelling. The idea that various stress factors would decrease canonical initiation and consequently would reflect the number of initiating ribosomes are adequately tested by varying the number of initiating ribosomes. The discovery of the two terminal prolines, that are also found in other human uORFs, is appealing mode of controlling stalling-driven downstream initiation. However, the lack of experimental support with the human uORFs may indicate additional contributions. This raises the question as to whether the proline codon identity plays a role? Since codons are read with different velocity which is mirrored by the tRNA concentration. 
It would be good to address whether special proline codons have been evolutionarily selected in CMV and whether the kinetics of stalling strongly depends on the codon identity. Are both prolines in the tandem using the same codon? Along that line, are the same proline codons used in the human diproline-containing counterparts? Consequently, the P to A mutation may have altered the codon usage and could be the reason for the nonlinear effect in the human sequences. In this case, it would make sense to use Ala-codons with similar codon usage as the natural prolines?

      We thank the Reviewer for raising this point about the role of codon usage. The tandem proline residues do not use the same codon (CCG then CCT). The two C-terminal proline residues in uORF2 are necessary for the elongating ribosome stall (Bhushan et al., 2010; Degnin et al., 1993; Wilson et al., 2016), but it has been previously shown that the identity of the codon does not significantly impact repression (Degnin et al., 1993). The human uORFs generally have 1 of the 2 Pro codons in common with the uORF2 Pro codons. Given that most of the human uORF P to A mutations behave similarly (Figure 7) irrespective of the original proline codon, we believe that codon usage does not impact repression by these uORFs. Moreover, as explained in response to Reviewer 1 and 2’s questions, we believe that the human uORFs containing terminal diprolines may partially repress translation via nascent peptide effects, but the majority of the repression likely arises from efficient siphoning of scanning ribosomes from the main ORF by the uORF (Fig. 1A in our manuscript).

      References

      Bhushan, S., Meyer, H., Starosta, A.L., Becker, T., Mielke, T., Berninghausen, O., Sattler, M., Wilson, D.N., and Beckmann, R. (2010). Structural Basis for Translational Stalling by Human Cytomegalovirus and Fungal Arginine Attenuator Peptide. Molecular Cell 40, 138–146.

      Boersma, S., Khuperkar, D., Verhagen, B.M.P., Sonneveld, S., Grimm, J.B., Lavis, L.D., and Tanenbaum, M.E. (2019). Multi-Color Single-Molecule Imaging Uncovers Extensive Heterogeneity in mRNA Decoding. Cell 178, 458–472.e19.

      Bohlen, J., Fenzl, K., Kramer, G., Bukau, B., and Teleman, A.A. (2020). Selective 40S Footprinting Reveals Cap-Tethered Ribosome Scanning in Human Cells. Molecular Cell 79, 561–574.e5.

      Cao, J., and Geballe, A.P. (1995). Translational inhibition by a human cytomegalovirus upstream open reading frame despite inefficient utilization of its AUG codon. J Virol 69, 1030–1036.

      Cao, J., and Geballe, A.P. (1996). Coding sequence-dependent ribosomal arrest at termination of translation. Molecular and Cellular Biology 16, 603–608.

      Cao, J., and Geballe, A.P. (1998). Ribosomal release without peptidyl tRNA hydrolysis at translation termination in a eukaryotic system. RNA 4, 181–188.

      Darnell, A.M., Subramaniam, A.R., and O’Shea, E.K. (2018). Translational Control through Differential Ribosome Pausing during Amino Acid Limitation in Mammalian Cells. Molecular Cell 71, 229–243.e11.

      Degnin, C., Schleiss, M., Cao, J., and Geballe, A. (1993). Translational inhibition mediated by a short upstream open reading frame in the human cytomegalovirus gpUL4 (gp48) transcript. Journal of Virology.

      Gaba, A., Wang, H., Fortune, T., and Qu, X. (2020). Smart-ORF: a single-molecule method for accessing ribosome dynamics in both upstream and main open reading frames. Nucleic Acids Research.

      Gamerdinger, M., Kobayashi, K., Wallisch, A., Kreft, S.G., Sailer, C., Schlömer, R., Sachs, N., Jomaa, A., Stengel, F., Ban, N., et al. (2019). Early Scanning of Nascent Polypeptides inside the Ribosomal Tunnel by NAC. Mol Cell 75, 996–1006.e8.

      Han, P., Shichino, Y., Schneider-Poetsch, T., Mito, M., Hashimoto, S., Udagawa, T., Kohno, K., Yoshida, M., Mishima, Y., Inada, T., et al. (2020). Genome-wide Survey of Ribosome Collision. Cell Reports 31, 107610.

      Harris, L.A., Hogg, J.S., Tapia, J.-J., Sekar, J.A.P., Gupta, S., Korsunsky, I., Arora, A., Barua, D., Sheehan, R.P., and Faeder, J.R. (2016). BioNetGen 2.2: advances in rule-based modeling. Bioinformatics 32, 3366–3368.

      Ivanov, I.P., Shin, B.-S., Loughran, G., Tzani, I., Young-Baird, S.K., Cao, C., Atkins, J.F., and Dever, T.E. (2018). Polyamine Control of Translation Elongation Regulates Start Site Selection on the Antizyme Inhibitor mRNA via Ribosome Queuing. Mol Cell 70, 254–264.e6.

      Janzen, D.M., Frolova, L., and Geballe, A.P. (2002). Inhibition of translation termination mediated by an interaction of eukaryotic release factor 1 with a nascent peptidyl-tRNA. Mol Cell Biol 22, 8562–8570.

      Kearse, M.G., and Wilusz, J.E. (2017). Non-AUG translation: a new start for protein synthesis in eukaryotes. Genes Dev 31, 1717–1731.

      Kearse, M.G., Goldman, D.H., Choi, J., Nwaezeapu, C., Liang, D., Green, K.M., Goldstrohm, A.C., Todd, P.K., Green, R., and Wilusz, J.E. (2019). Ribosome queuing enables non-AUG translation to be resistant to multiple protein synthesis inhibitors. Genes Dev 33, 871–885.

      Kozak, M. (1989). Circumstances and mechanisms of inhibition of translation by secondary structure in eucaryotic mRNAs. Mol Cell Biol 9, 5134–5142.

      Lareau, L.F., Hite, D.H., Hogan, G.J., and Brown, P.O. (2014). Distinct stages of the translation elongation cycle revealed by sequencing ribosome-protected mRNA fragments. eLife 3, e01257.

      Manjunath, H., Zhang, H., Rehfeld, F., Han, J., Chang, T.-C., and Mendell, J.T. (2019). Suppression of Ribosomal Pausing by eIF5A Is Necessary to Maintain the Fidelity of Start Codon Selection. Cell Reports 29, 3134–3146.e6.

      Matheisl, S., Berninghausen, O., Becker, T., and Beckmann, R. (2015). Structure of a human translation termination complex. Nucleic Acids Res 43, 8615–8626.

      Morisaki, T., Lyon, K., DeLuca, K.F., DeLuca, J.G., English, B.P., Zhang, Z., Lavis, L.D., Grimm, J.B., Viswanathan, S., Looger, L.L., et al. (2016). Real-time quantification of single RNA translation dynamics in living cells. Science 352, 1425–1429.

      Nilsson, O.B., Hedman, R., Marino, J., Wickles, S., Bischoff, L., Johansson, M., Müller-Lucks, A., Trovato, F., Puglisi, J.D., O’Brien, E.P., et al. (2015). Cotranslational Protein Folding inside the Ribosome Exit Tunnel. Cell Reports 12, 1533–1540.

      Noderer, W.L., Flockhart, R.J., Bhaduri, A., Diaz de Arce, A.J., Zhang, J., Khavari, P.A., and Wang, C.L. (2014). Quantitative analysis of mammalian translation initiation sites by FACS-seq. Mol Syst Biol 10, 748.

      Stern-Ginossar, N., Weisburd, B., Michalski, A., Le, V.T.K., Hein, M.Y., Huang, S.-X., Ma, M., Shen, B., Qian, S.-B., Hengel, H., et al. (2012). Decoding Human Cytomegalovirus. Science 338, 1088–1093.

      Subramaniam, Arvind R., Zid, Brian M., and O’Shea, Erin K. (2014). An Integrated Approach Reveals Regulatory Controls on Bacterial Translation Elongation. Cell 159, 1200–1211.

      Wagner, S., Herrmannová, A., Hronová, V., Gunišová, S., Sen, N.D., Hannan, R.D., Hinnebusch, A.G., Shirokikh, N.E., Preiss, T., and Valášek, L.S. (2020). Selective Translation Complex Profiling Reveals Staged Initiation and Co-translational Assembly of Initiation Factor Complexes. Mol Cell 79, 546–560.e7.

      Wang, H., Sun, L., Gaba, A., and Qu, X. (2020). An in vitro single-molecule assay for eukaryotic cap-dependent translation initiation kinetics. Nucleic Acids Res 48, e6.

      Weber, R., Chung, M.-Y., Keskeny, C., Zinnall, U., Landthaler, M., Valkov, E., Izaurralde, E., and Igreja, C. (2020). 4EHP and GIGYF1/2 Mediate Translation-Coupled Messenger RNA Decay. Cell Reports 33, 108262.

      Wilson, D.N., Arenz, S., and Beckmann, R. (2016). Translation regulation via nascent polypeptide-mediated ribosome stalling. Current Opinion in Structural Biology 37, 123–133.

      Wolin, S.L., and Walter, P. (1988). Ribosome pausing and stacking during translation of a eukaryotic mRNA. EMBO J 7, 3559–3569.

      Wu, B., Eliscovich, C., Yoon, Y.J., and Singer, R.H. (2016). Translation dynamics of single mRNAs in live cells and neurons. Science 352, 1430–1435.

      Wu, C.C.-C., Zinshteyn, B., Wehner, K.A., and Green, R. (2019). High-Resolution Ribosome Profiling Defines Discrete Ribosome Elongation States and Translational Regulation during Cellular Stress. Molecular Cell 73, 959–970.e5.

      Yan, X., Hoek, Tim A., Vale, Ronald D., and Tanenbaum, Marvin E. (2016). Dynamics of Translation of Single mRNA Molecules In Vivo. Cell 165, 976–989.

      Young, D.J., Meydan, S., and Guydosh, N.R. (2021). 40S ribosome profiling reveals distinct roles for Tma20/Tma22 (MCT-1/DENR) and Tma64 (eIF2D) in 40S subunit recycling. Nat Commun 12, 2976.

      Yu, J., Chen, M., Huang, H., Zhu, J., Song, H., Zhu, J., Park, J., and Ji, S.-J. (2018). Dynamic m6A modification regulates local translation of mRNA in axons. Nucleic Acids Research 46, 1412–1423.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this paper, the authors are exploring the uORF regulatory mechanism. They first discussed five general models how uORFs might work to repress and buffering main ORF translation, then they mainly focus on the UL4 uORF2 for the potential mechanism. They use both computer modeling and experimental validation with reporter assay in 293t cell line. Based on their model, and few experimental results when they change the translation initiation rate and/or length of dORF, they propose it may work through 40S dissociation model, since the buffering effect is not uORF length sensitive.

      Major comments:

      • Are the key conclusions convincing?
        The modeling for different mechanisms is insightful, but some modeling parameters and experimental validation are not conclusive and validation of few of them can enforce the conclusions.
      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?
        Yes, the part about queuing and length sensitive is not convincing to me, it should be modified and reduce the statement strength.
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.
        Yes, please see the major concerns
      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.
        They will need to re-think about the modeling, and validation in Figure 5, there are validation experiments that can be done in weeks and in a cost-efficient manner that can enforce the conclusions.
      • Are the data and the methods presented in such a way that they can be reproduced?
        Most of them are good
      • Are the experiments adequately replicated and statistical analysis adequate?
        Yes

      I have some major concerns about the paper:

      1. It is a bit confusing to me in Figure 2C, the reporter assays, that non-start codon reporter and non-stall reporter has same expression level. In theory, the non-stall reporter still has uORF there, so it should repress main ORF expression, and have lower expression level than the non-start reporter, where there is no uORF, no repression. In other uORFs they tested in Figure 6, the non-stall reporters are lower than non-start reporter. Since data they use to build the model is Figure 2B, and calculate the parameters for the whole paper, I just want to make sure it is making sense. I noticed there is another CTG in frame on the 4th codon, this may be alternative start codon in the non-start reporter to trigger some repression.
      2. All the modeling and prediction the authors do are based on average, but we know translation is very heterogeneous. For each ribosome or each 40S, the kinetics varies a lot, the authors should discuss about this part.
      3. For modeling related with the queuing-mediated model in Figure 1C. they use 30nt as the ribosome length to count the potential queuing to start codon. But 30nt is the 80S protected fragment with specific conformation. The protected fragment for 80S will change based on different status of ribosome conformation or elongation step. More importantly, for queuing, it is 40S, so they may have a different size. Based on previous 40S ribosome profiling (Archer, Stuart K., et al. Nature 535.7613 (2016): 570-574. And other papers), the length can vary from 19nt to very long, so I don't think the 30nt length can be used to model queuing in 40S and length sensitivity in the uORF working mechanism.
      4. For Figure 5B, besides the modeling length part I have mentioned above, when the authors increase the length of uORF, the sequence is also changed, which may introduce other side effect. So, if the authors want to conclude about the queuing part, they should rethink about the length for both modeling and validation, plus control for the sequence they added to increase the length of uORF, for example use different sequence when manipulate the length.

      Minor comments:

      • Specific experimental issues that are easily addressable.
        It is unclear how the luciferase assays were analyzed considering the background noise. If the NLuc expression is low, close to the background, then how to extract or normalize the background will influence the expression level, thus fold change for different reporter/condition.
      • Are prior studies referenced appropriately?
        yes
      • Are the text and figures clear and accurate?
        Mostly good
      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?
        Have a main figure about the modeling part.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.
        It is an interesting area, using modeling with experiment validation to understand uORF regulation mechanism, the kinetics and interplay between different translation steps, it will help us to understand uORF buffering in stress conditions.
        Also bring modeling method with reporter validation to the translation field, will provide clues to the molecular mechanism study, especially in complex situation.
      • Place the work in the context of the existing literature (provide references, where appropriate).
        Recent years, there has been a lot of study about small open reading frames, while for uORFs are known to repress translation, the regulatory mechanism is not known yet, there are just different models not validated yet (Young & Wek, 2016). Also, under normal conditions and stress conditions, uORF can play both repressive and stimulative role in main ORF translation (Orr, Mona Wu, et al. NAR 48.3 (2020): 1029-1042.). This paper is the first study to put all the uORF working hypothesis with buffering effect together, they use modeling to explain how under each hypothesis, buffering may happen or not.
      • State what audience might be interested in and influenced by the reported findings.
        It will be interesting to people, who study molecular biology, biochemistry for translation regulation, especially uORFs. The modeling people may also find it interesting, how they could adapt modeling to complex biology process and contribute to the understanding.
      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.
        I have extensive experience working in the translation regulation field and I feel extremely comfortable to discus all the experimental part including individual reporters as well as genome wide. But I do not consider an expert in the modelling section of this work.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      We appreciate the effort the two reviewers have invested in reviewing our manuscript. Their constructive comments will help improve the paper overall.

      2. Description of the planned revisions


      Reviewer #1 (Evidence, reproducibility and clarity):

      The main point of the current report is that 6mA is present in DNA of Hydractinia, and is introduced randomly into the genome by DNA polymerases, originating from degradation of maternally provided RNA via nucleotide salvage pathway. The authors observed that 6mA levels are changing over development and peak at 16-cell stage, with a sudden decrease to 'background levels' at 64 cell stage, a stage when zygotic genome gets activated. The 6mA drop is Alkbh1 dependent, since upon K/D of Alkbh1, 6mA levels were significantly higher than in control embryos. Authors also observed that AlkbH1 K/D delays zygotic genome activation (ZGA) to later stages, but without any noticeable consequences for the proper development. To demonstrate that 6mA is not controlled via direct DNA methylation, they show that K/D of two potential DNA methyl transferases N6amt1 and Mettl4 does not have any effect on 6mA levels. Supporting their hypothesis, authors demonstrate high activity and imperfect selectivity towards non-modified nucleotides of salvage pathway during embryo development using EU labeling experiments.<br /> In general, the provided data support their model, however, the paper needs some improvements to include missing information and controls before publication.

      Major comments:<br /> 1. Fig 1A shows a schematic where D3-6mA is added to only QTRAP but not QQQ experiment, usually QQQ methods also require isotopic standards for each component quantified to normalize for ionization differences and provide true quantitative information. Why did authors not use dA isotope? The ionization suppression is more pronounced at high concentrations of the components, which is true for dA in the current set up. How do authors control or at least test this?

      We have limited resources of isotopic-labelled standards. Therefore, we initially used QQQ without these standards to obtain data that covered many time points in development to identify the general pattern and key time points of high and low 6mA. Once the QQQ indicated that the 16-cell stage has the highest 6mA and that this drops to background at the 64-cell stage (and remains so later on), we performed QTRAP with the isotope-labeled standard control only for these two stages. Looking at the data resulting from both techniques, it appears that they essentially revealed the same pattern. Since the main focus of the study is on 16- and 64-cell embryos, we feel that the contribution of performing all stages by QTRAP would be marginal. We have performed control experiments to assess ionization suppression for dA and found that it was insignificant. We will add the corresponding data to the Materials and Methods section.

      Fig S1 show that quantification works well, but were the total DNA amounts comparable to the gDNA amount used in actual samples? If yes, please indicate so.

      Yes, the amounts were the same (2 µg). We will change the Methods section accordingly.

      1. In line 68 and in fig 1B, 1C there is a mysterious 'Neg. Ctrl' sample. It is unclear what the sample was and, more interestingly, in fig 1B the levels in this sample are 0.015% but in fig 1C they are much below 0.001%. Why is there such a striking difference for the identical sample?

      Negative controls were the same amounts (2 µg) of oligonucleotides without 6mA, DNase-treated exactly like the samples. Figure 1B shows that QQQ is not sensitive enough to reliably detect 6mA concentrations below 0.02% and cannot distinguish the background 6mA in the negative control from the level of 6mA in the 64-cell stage and later. Therefore, we utilized D3-6mA-labelled QTRAP (Figure 1C) and determined that the level in 64-cell stage embryos was actually ~0.01%. In the negative control, the amount was considerably lower, around 6 ppm (0.0006%).

      1. As I can see authors measured natural isotopologue of 6mA, however traces of contaminant bacterial DNA originating even from recombinant DNA degradation enzymes also have 6mA, giving background signal. In their LC/MS experiments, did authors check if the 6mA comes truly from the gDNA and not from contaminant during DNA purification and processing before MS?

      Yes, we did. As control for the level of 6mA contamination from the enzymatic digestion (sourced from bacteria), we also performed digestion of the negative control (see also answer to previous comment).

      1. Fig 1D in the legend: authors should indicate that samples were already RNAse treated, and Line 80 in the text mentions a second RNase treatment (fig S1C) to confirm the specificity of the DNA staining.

      The samples were indeed RNase-treated. We will modify the legend and the reference to figure 1D on line 80 accordingly.

      1. In lines 86-87, authors compare the LC/MS and sequencing based quantifications, and say they are consistent. Can authors make a figure analogous to fig 1B but using sequencing data?

      The data are already provided in Figure S1E. However, we used a Venn diagram to denote that these figures were generated by a different type of analysis (SMRT-sequencing as opposed to QTRAP). They are consistent but not identical.

      1. Fig 3B and 3C: controls showing the validity of EU staining are required, such as an RNase-treated sample in which the signal disappears, or control embryos without EU, which would have only background signal.

      Indeed, Fig 3C shows an RNase treated sample in which the EU signal is abolished as expected.

      1. Fig 3D specificity control is missing, control embryos without EdU having only background signal.

      The control is provided in Figure 3B. It shows a sample without EdU (treated with EU) and shows the background signal.

      1. Fig 4A legend: 'rescue solution (see text)'. Please describe in the legend what the solution was. Moreover, I did not find clear explanation in the text either, my only guess was from the materials in methods section, where authors used both shAlkbh1 and Alkbh1 mRNA with silent mutations.

      The reviewer is right, this was indeed the control that was used. We will modify the text to clarify this point.

      1. Fig 4B shows many data points per condition and the legend says EU signals (in triplicate). Were these triplicate animals with multiple cells, where the EU signal from each cell was plotted as a point? Please specify in the legend.

      Yes, embryos were analyzed in triplicate and each cell was plotted as a point. The legend will be adapted.

      1. Lines 169-170 state 'the lack of premature ZGA following N6amt1/Mettl4 knockdown (Figure S7B) indicate a lack of methyl transferase that maintains 6mA through embryogenesis'. While the experiment indeed demonstrates that these are not the major players in this process, it does not prove that they are not DNA methyl transferases. Absence of evidence is not evidence of absence. I think the authors should at least soften this conclusion.

      We agree and will tone down the relevant statement.

      1. Discussion section describes many experimental data that belong to Results section.

      This is a point also raised by Reviewer #2. We will move these points to the results and expand the discussion.

      1. Fig S8 I think should be a part of the main figure since it is one of the important experiments to prove the high activity and somewhat low selectivity of salvage pathway in the embryos during the critical early stages.

      We had originally left it out to save space. We prefer to leave this decision with the editor.

      1. Fig 5C the model is confusing, authors should improve it.

      It is difficult to describe a complex story using a single static model. Therefore, we will add an animation to the supplemental material to clarify the model.

      1. Fig S8 negative controls showing the specificity of CuAAC staining are missing: control animals/ embryos without EU.

      We will redo these experiments and include appropriate controls.

      1. Authors may find this reference useful: PMID: 32355286.

      We will add this reference.

      1. It is known that in mammals the ADAL protein is the one that demethylates the m6A nucleotide to clear it from the nucleotide pools and prevent it from entering the salvage pathway (PMID: 29884623). Does Hydractinia symbiolongicarpus have an ADAL analog? If yes, then it would be important to see if knockdown/overexpression of this enzyme has any effect on the timing of ZGA. In principle, passively introduced 6mA may be regulatory, serving to properly time the ZGA, and may be controlled via the activity of Adal and Alkbh1.

      The gene is present in the Hydractinia genome. We could perform the experiments recommended. We will knock the gene down and look at the effect of this manipulation on ZGA.

      1. Material and methods are missing information:<br /> a. Line 370-371 provide references to the protocols listed or describe the steps.<br /> b. Line 373 standard column based purification protocol, what is it either explain or provide a reference.

      References will be provided.

      Minor points:<br /> Line 79: 'Fig 1D and S1B'. Did the authors mean 'Fig 1D and S1C'?<br /> Fig 5A: Y axis title is missing.<br /> Line 379: 3D1-6mA should be D3-6mA; please correct the other appearances as well.<br /> Line 405: the terms 'dsDNA solutions' and 'standard solutions' are confusing; please rephrase.<br /> Line 410: Cleaned embryos: what does 'cleaned' mean? Be specific.<br /> Line 413: PTx is mentioned; please explain what it is.<br /> Line 415 and line 440: 'HCl was washed and embryos were neutralized'. I guess it should state: 'HCl was neutralized and embryos were washed with...'<br /> Line 431: 'before fixed by incubation in PAGA-T...'. Did the authors mean: 'before fixation with PAGA-T...'?<br /> Line 435: 'Permeabilization was done by further washes the fixed embryos with...'. Did the authors mean: 'Permeabilization was done by an additional wash of the fixed embryos with...'?<br /> Line 440: The HCl was washed with what solution?<br /> Line 446: For how long were the PTx washes?<br /> Lines 458-460: the sentence is confusing.<br /> Line 500: 'then used detect' should be 'then used to detect'

      We will address all of the minor points above.

      Reviewer #1 (Significance):

      There are many high profile papers describing the existence of 6mA in gDNA of different organism including insects and mammals. However, there is no proof that it has any biological function. Indeed, recent reports (PMID: 32355286 and 32203414) indicate that in mammalian cells, 6mA is indeed primarily incorporated by DNA polymerases and originates from a salvage pathway. The present report is the first in vivo evidence that confirms this to be the case more generally and, importantly, demonstrates a 6mA effect on ZGA. Hence, this is an important and timely report, which will be interesting to the field, as well as a broad audience to clarify the role of 6mA and the mechanism whereby it is introduced into gDNA.<br /> My expertise: Biochemistry and biology of DNA and RNA modifications, including 6mA. Fair expertise: bioinformatics analysis.

      Reviewer #2 (Evidence, reproducibility and clarity):

      The manuscript reports developmental dynamics of DNA 6mA in the cnidarian Hydractinia symbiolongicarpus. The authors describe an event of a seemingly random accumulation of this DNA modification in 16-cell stage embryos of Hydractinia symbiolongicarpus followed by an apparent clearance of 6mA by the 64-cell stage. Interestingly, the depletion of the cnidarian orthologue of the putative 6mA 'demethylase', Alkbh1, results in a delay in zygotic transcription accompanied by high levels of DNA 6mA in 64-cell stage cnidarian embryos. The authors suggest that the 6mA they observe originates from random misincorporation of recycled degraded m6A-marked ribo-nucleotides during early cnidarian embryogenesis.<br /> Overall, most of the experiments are performed at a high technical level and the paper is generally nicely written. Despite this, in my opinion, the manuscript would benefit from the incorporation of several additional controls and from answering a number of points on the description/presentation of the data.<br /> Major comments:

      1. In the present version of the manuscript, the authors demonstrate the negative correlation between the presence of 6mA in the genome of cnidarian embryos and transcription. Although, the depletion of Alkbh1 leads to the delay in ZGA, strictly speaking, this effect may be independent of the catalytic function of Alkbh1. Therefore, to make a statement that m6A "random incorporation into the early embryonic genome inhibits transcription" the authors should use a catalytically inactive form of this enzyme as a control in the corresponding experiments and/or (ideally) perform in vitro transcription assays using 6mA-containing substrates.

      We could perform shRNA-mediated Alkbh1 KD and try to rescue ZGA by co-injecting a catalytically inactive Alkbh1 mRNA.

      The suggested in vitro experiment would be inconclusive for two reasons: first, Hydractinia polymerase may respond differently to 6mA; second, 6mA-mediated transcription inhibition could be indirect, requiring the in vivo context. We would like to add that transcription inhibition by 6mA has been demonstrated in vitro using yeast DNA polymerase, as cited in the paper.

      1. Despite several experiments suggesting that random incorporation of recycled ribonucleotides occurs in cnidarian embryos, the source of 6mA in their DNA seems currently unclear. Would it be possible to directly test the authors' hypothesis by comparing the levels of 6mA upon maternal (and possibly zygotic) depletion of the cnidarian orthologue of the RNA m6A methyltransferase Mettl3 in cnidarian embryos? Alternatively, could the authors incubate the embryos in medium supplemented with labeled ribo-m6A and then check the levels of DNA 6mA in the embryonic DNA?

      We show that maternal mRNAs are already methylated in the early embryo (Figure 5). Therefore, it would indeed make sense to ablate Mettl3 from the maternal tissue while maternal mRNAs are methylated. However, in the absence of a conditional knockout technique in Hydractinia, this would require generation of CRISPR-Cas9 mutants that would likely die early in their development, long before reaching sexual maturity.

      Instead, we are happy to perform the other experiment suggested by the reviewer to directly demonstrate m6A to 6mA transition.

      Minor comments:<br /> 1. It would be nice to complement Fig. 4, 5, and S7 with immunostaining of the corresponding embryos for 6mA.

      6mA immunostaining is not compatible with EU labeling for two reasons: first, they require different types of fixation (PAGA-T vs formaldehyde); second, immunostaining requires RNase treatment to remove m6A, which would also remove the EU signal.

      1. The current Discussion contains references for several figures with experimental results. I suggest separating these experimental data from the Discussion. The authors should, in my opinion, make an additional Results chapter and, if possible, expand the Discussion section (that is currently minimal) speculating on significance of their results for different biological systems.

      This has also been requested by Reviewer #1. We will follow the reviewer's recommendation.

      1. The present Title reads like a clear overstatement (at least currently, please see major comments above). The Title should also reference the organism where the observations have been made.

      Following the revision, we believe that both random incorporation of 6mA and a delay in zygotic transcription will be well supported by our data. We will add the organism's name to the title as suggested.

      Reviewer #2 (Significance):

      The presence and significance of DNA 6mA in animal genomes is a very interesting and highly controversial topic. Although a number of studies suggest that relatively high levels of this DNA modification occur in multicellular eukaryotes in different biological/functional contexts, other reports challenged these observations, attributing them to different experimental artifacts. In this context, the current paper, which provides high quality novel experimental data on the developmental dynamics of DNA 6mA in a cnidarian, is extremely interesting and timely. Moreover, the authors' results and hypotheses on the function/origin of 6mA in cnidarian embryogenesis may provide a conceptual framework for the interpretation of other 6mA/m6A-related studies performed on different experimental models. Thus, this manuscript will definitely be of interest for a wide range of researchers working in the fields of epigenetics, cancer biology and developmental biology.<br /> I strongly believe that this is an interesting and important study that definitely deserves to be published in a high impact journal.

      3. Description of the revisions that have already been incorporated in the transferred manuscript


      4. Description of analyses that authors prefer not to carry out


      Reviewer #2 suggested four experiments, three of which are either impossible in our system or expected to reveal insignificant information. First, the reviewer suggests ablating Mettl3 from the maternal tissue. While being a good idea in principle, there is no conditional ablation technique available for Hydractinia. Generating CRISPR-Cas9 mutants would likely result in embryonic lethality, long before sexual maturation has been reached.

      Second, the reviewer proposed to perform in vitro experiments with 6mA-containing substrates. These experiments are unlikely to reveal useful data since the Hydractinia polymerase may respond differently to methylated adenine than commercially available polymerases. Also, transcription inhibition may be indirect, depending on the in vivo context that cannot be mimicked in vitro.

      Finally, the reviewer suggested expressing a catalytically-dead Alkbh1 in the background of endogenous Alkbh1 knockdown to demonstrate that its function depends on the enzymatic activity to remove 6mA from the genome. While we could perform the experiment (see our reply above), the information emanating from it would arguably be outside the scope of this study.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      The newly identified azyx-1 ORF was named peu-1 in the initial submission of this manuscript, a name that was under consideration with WormBase, who supervise nomenclature of C. elegans genes. In consultation with WormBase, the locus was named azyx-1 instead (the final decision being “azyx-1 will be attributed to F42G4.11. It will be released in WS287 at the beginning of 2023”). We updated this nomenclature in our submission files, including in reviewer comments pasted below. Please note that other than this, no changes whatsoever were made to the reviewer comments.

      2. Description of the planned revisions

      REV #3: Specific thoughts for consideration:

      Figure 5, Moderate is really minor/moderate with other metrics, and severe is definitely moderate with other metrics. Thus, I'm not sure if normal vs. moderate is needed. This really is a minor point as it doesn't impact results/overall story/importance.

      This was also pointed out by reviewer #1. We will rename the classification categories using milder terms.

      REV #1 Fig. 5 Even the 'severe' muscle disruption is quite mild (say, in comparison to loss of talin). Perhaps rephrase these categories? The moderate and severe categories also do not look different to me. Show what the muscle cells look like in zyx-1 deletion and overexpression animals. Is there a way to use quantitative imaging to score these? Can azyx-1 phenotypes be rescued or enhanced by expression (or RNAi) of zyxin in the muscle? Also, clarify what age animals are being tested in the muscle and burrowing assay.

      We agree and will rename the classes in milder terms. Qualitative scoring (which was done blinded) is the standard in the field and was done according to Dhondt et al. (2021 Dis Model Mech). When tested for muscle integrity and burrowing capacity, animals were day 1 adults. This is mentioned in the Methods section of the current manuscript and will also be included in the captions of the revised figures.

      REV #2: I am not convinced by the data presented in Figure 5. There does not seem to be much to distinguish the five genotypes, but I concede that I am not used to looking at this type of data. But why was the muscle phenotype not also examined in the azyx-1 rescue lines?

      Because other reviewers who are familiar with these data point out that the observed differences in panels A-B are indeed milder than what is usually seen, we will rename the classifications in the manuscript (see responses above). Because the azyx-1 deletion mutant does not differ from controls in the muscle phenotype, there is no phenotype to rescue for this readout, and no rescue strains were generated.

      We are not sure what the reviewer may struggle with in (presumably) panel C (~‘to distinguish the five genotypes’). The positive control (zyx-1) behaves as expected in the burrowing assay, with our own mutants within that range, also as expected. All data were scored blinded to avoid any bias, and statistical analysis supports the interpretations, all of which grants confidence in the observed differences. However, because reviewer #3 would also prefer another representation of the data shown in this panel (see below), we will provide an updated panel representation in the revised manuscript.

      REV #3: Figure 5C: Hard to read. Would displaying lines/trajectories make it easier to understand? Would displaying as violin plots for each timepoint/condition make it easier to visualize? Basically, in black and white and in color this is hard to visually process.

      We will work on another representation for the revised manuscript, since reviewer #2 also seemed to struggle with this panel representation.

      REV #1: Fig. S2 Match font sizes on Y-axes. Also, indicate any statistical differences and statistics used.

      Figure adjustments will be implemented in the revised manuscript as requested.

      REV#1: Fig. S3 C, indicate any statistical differences and statistics used.

      Figure adjustments will be implemented in the revised manuscript as requested.

      REV #2: I am not convinced by the "overexpression" experiments. These are not well controlled, since no evidence is presented that AZYX-1 is being overexpressed in these lines. Also, since we know that extrachromosomal transgenic lines are highly variable, one would need to test the effect of several independent lines to ensure that the effects that the authors observe are indeed associated with AZYX-1 overexpression and not simply an idiosyncratic effect of the genetic background of a given strain. Finally, there does not seem to be an obvious mechanism by which overexpression of AZYX-1 can impact ZYX-1 function. That doesn't rule out an effect, but based on the data as it is, it is premature to propose such a mechanism. The authors need to show that multiple overexpression lines do reproducibly overexpress AZYX-1 and that this results in reproducible effects of zyx-1 phenotypes.

      The extrachromosomal strains are indeed variable, but because the background is wild type (in contrast to a deletion mutant background for rescue strains), an overdose of the provided target is expected. As requested in the cross-consultation reviewer communication, we will include quantitative data in our revised manuscript showing that the strains used (LSC1950, LSC1960, LSC2000) are indeed overexpressors.

      REV #2: The data presented in Figure 4F needs to be quantified using the same format as was presented in Figure 4B.

      Due to the different genetic background of the strains, this is not possible in the exact same way (the red signal of LSC1998 & LSC1999 is not unique to zyxin). We understand that in essence, the reviewer would like us to include a more quantitative representation of these data, and will update the figure accordingly.

      REV #2: What is the difference between the overexpression transgenic lines and the "rescuing" transgenic lines? In the Materials and Methods, the same concentration of plasmid was used in injections - so these likely give the same approximate level of transgenic expression.

      The difference is the genetic background: a rescue line adds wt DNA back to a mutant background, while in an OE strain it is added to a wt background. While this can already be derived from the genotype details in Supplemental Table S1, we apologize for not specifying this in the Methods section, as it is common practice in the field. These specifications will be added to the revised manuscript.

      REV #2: I am not clear what features are being used to characterise the myofibril structures into the three categories. Can the authors annotate the images to indicate the diagnostic features?

      The reviewer is correct that manual classification is rather poorly defined in general, which is why it is scored blinded (here as per Cothren et al., 2018 Bio Protoc). We adhered to the reference images by Dhondt et al. (2021, Dis Mod Mech), with visual assessment based on how tightly (~parallel) the myofilaments are organized, treating overall increases in bends or breaks in individual myofibers as leading to a less aligned pattern (cf. Fig. 1 of Dhondt et al.). We will add this information more explicitly to the Methods section of the revised manuscript.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      REV #1: Fig. 4 would be better if the control (A) and azyx-1OE (B) worms were more similar in age and size

      The panels of this figure were not at exactly the same scale; we apologize if the reviewer found this confusing. We have rescaled the panels to make this clearer. The animals are all day 1 adults.

      REV #1: Abstract: Clarify what is meant by 'putative syntenic conservation' or rephrase, simply stating that the existence of an ORF overlapping with the 5' region of zyxin is conserved

      This has been rephrased according to request.

      REV #1: Line 24: Clarify these are synthetic phenotypes (not caused by loss of zyx-1/azyx-1 alone). Loss of zyx-1 alone results in very mild phenotypes.

      While the original sentence already pointed this out, we rephrased the text to make clear that these observations require the dystrophic mutant background.

      REV #1: Line 28: Start new paragraph

      The new paragraph was started a sentence earlier, according to reviewer #2's request.

      REV #1: Line 31: Not clear what is meant by 'post-transcriptional regulation can be further propagated'- maybe reword to 'alternative and overlapping open reading frames (ORFs) arising from polycistronic mRNA can regulate translation' or something simpler like that.

      This has been rephrased according to request.

      REV #1: Line 56-57: Is this because most C. elegans transcripts start with the splice leader SL1 or SL2 rather than the adjacent 5' sequence? Is that relevant for zyx-1? Recommend commenting briefly on this.

      We did not look into this for all possible u(o)ORFs in C. elegans, which is also not the focus of the manuscript, so we cannot make general statements. As part of the annotation procedure of azyx‑1, WormBase verified that several pieces of evidence, including available phyloCSF data for exon 1, SL1s, RNASeq and Nanopore data, all support its annotation, as well as its translation from the zyx-1 long transcripts (albeit with a different start and in a different reading frame).

      REV #1: Line 78: Delete the word 'other'

      Done

      REV #1: Line 122: zyx-1

      Done

      REV #1: Line 137: 'lead' should be 'led'

      Done

      REV #1: Line 158: rephrase 'only the long ones' to indicate which isoforms more precisely

      Done (these are a/e, cf. Luo et al. 2014, Development)

      REV #1: Line 195: Rephrase. Unclear what is meant by 'highlights the evasiveness of non-canonical ORFs from functional annotation'

      Done; this was rephrased to “This exemplifies how non-canonical ORFs can escape functional annotation, …”.

      REV #1: Various locations: I think it will be more clear to the reader to consistently refer to the burrowing assay as 'burrowing assay' rather than chemotaxis. I recommend adding a brief description of the burrowing assay to the results section.

      Wording has been updated; we can add a short context sentence to the results section of the revised manuscript.

      REV #2: I'm not sure how to interpret the significance of the u/uoORFs across short and large phylogenetic distances. One would presume that there might not be primary amino acid conservation if the regulation simply takes place by interference with ribosome scanning and translocation. Here some statistical analysis would help with assessing the significance of these observations. How unusual is it to find u/uoORFs in the 5' UTRs of genes encoding zyxin family members versus in general for the species analysed?

      This is indeed the very question we are asking in the manuscript, and there is a clear reason why we refrain from making significance statements. At the moment, all relevant available metadata are used for the analysis in the manuscript, leading to the communication of the synteny-related findings as they are currently presented. This is due to the dependency on translatomics data to find credible u(o)ORFs, and few translatomics datasets are available, covering only a limited set of species so far. Our manuscript contains all relevant OpenProt data, which are derived from only 9 animal species so far. As shown in Table S4, 14 zyxin orthologs belonging to 7 species have associated u(o)ORFs; for two species, only overlapping ORFs are present in the database. While more and more datasets will undoubtedly become available in the next years, the findings in the manuscript are as complete as currently possible: we do find evidence of u(o)ORFs associated with zyxin orthologs in these species, some of which are evolutionarily distantly related to C. elegans.

      REV #2: The authors state that there is evidence for synteny and coding region conservation. The data supporting this assertion is not well presented. Presentation and analysis of multiple sequence alignments of the putative homologues involved would strengthen the assertion of synteny considerably.

      We apologize if the reviewer misunderstood: we discuss likely syntenic conservation, not coding region conservation. The latter is not mentioned in our manuscript and is indeed not convincing. This is not surprising given the greater sequence diversity observed at the N terminus of zyxins and the partial overlap of these coding sequences, and it is in line with observations of several others in the RiboSeq community that many identified uORFs are conserved between orthologous genes, but poorly conserved at the amino acid level (e.g. community-driven communication by Mudge et al., BioRxiv 2021 and references therein).

      REV #2: The authors are oddly coy about the molecular details of the 27 bp deletion used to study the loss of azyx-1 function. In the absence of these details, it is not possible to assess the validity of these experiments. We need to be given the full molecular details of the allele - precisely which nucleotides are deleted? And how do they affect the coding regions of zyx-1 and azyx-1?

      I am also confused about why the authors made a deletion allele rather than mutating the AUG of AZYX-1? This would be a cleaner experiment to interpret. Based on the data presented, there are two possible interpretations in addition to the one suggested by the authors: 1) the 27 bp deletion impacts zyx-1 expression due to its impact on the zyx-1 coding region (the coding regions of azyx-1 and zyx-1 overlap); 2) the deletion mutation deletes critical transcriptional control elements. A simpler mutation of the azyx-1 AUG via CRISPR might allow them to rule out the possibility that they have simply compromised a transcriptional control element or damaged the coding region of ZYX-1.

      As mentioned above, and as will be stated more clearly in the revised manuscript: the deletion is 182-155 bp (27 bp) upstream of the zyx-1a start site. This was a mutant that could easily be generated via CRISPR, so we proceeded with this one. This edit rules out option 1 (there is no change of the zyxin coding region), but (as also considered but addressed differently in the manuscript; see below) retains alternative interpretation 2. There are no regulatory regions or transcription factor binding sites known for the (a)zyx-1 locus (verified in current WormBase version WS285), but that certainly does not fully rule out the possibility either. Rather than creating a series of azyx-1 mutants, be they SNP or small deletion mutants, that would suffer from the exact same duality in possible interpretation, we chose to combine the deletion mutant with rescue and overexpression strains. Because these latter strains do not affect the endogenous zyxin regulatory region, they add far more credibility to the interpretation than alternative mutants in the azyx-1/zyx-1 locus would.

      REV#2. The narrative flow of the introduction could be improved by the judicious use of paragraphs. Line 12, for instance is a clear paragraph break, as is line 24.

      Done

      REV #3: Specific thoughts for consideration:

      3) Could more be said about overlapping genes/regulation in humans? Again, not critical but this is such a great piece of work that it would be useful to guide human subjects researchers as to how to best further your work.

      It is unclear whether the reviewer would like to see an extended introduction and/or discussion. We tried to meet this request without drifting too much from the focus of our current communication by adding the following to the introduction (lines 41-47 of the current draft): “From a more human-centred future perspective, uORFs are a rather unexplored niche for translational research: with a predicted prevalence in over 50% of human genes and first examples regulating translation of disease-associated genes already emerging (Lee et al. 2021; Schulz et al. 2018), the field is bound to not only lead to more fundamental, but also application-oriented insights. Keeping this broader context in mind, we here focus on more fundamental principles of uORFs in a model organism context.”.

      4. Description of analyses that authors prefer not to carry out

      REV#1: Does azyx-1 have zyx-1-independent functions or other regulatory targets?

      This is an interesting question that is not yet addressed. While this is possible, it is beyond the scope of our current communication. Since the reviewer does not request anything concrete, we would prefer to leave this for follow-up research. While this notion is included in the manuscript, we are happy to more explicitly address this question in the discussion as well.

      REV#1: Do the burrowing assay results reflect a neuronal or a muscle function for AZYX-1? Or both?

      Our manuscript indeed does not yet delve into tissue-specific actions of this newly discovered ORF. While interesting, and in line with reviewer #3’s remark, this would be valuable for follow-up research, but is beyond the scope of our current communication. We will make sure the concept is clearly mentioned in the discussion of our findings.

      REV #3: Specific thoughts for consideration:

      Could more be done/said about neuro vs. muscular effects of azyx-1 and zyx-1? I appreciate this is beyond the scope of the present manuscript and therefore does not require response if you don't have data or it makes telling the story you want to tell more difficult.

      We agree with the reviewer that spatially resolving some of these observations would be a next interesting step, which indeed is beyond the scope of our current communication.

      REV#1: Fig. 2A very faint, increase brightness/contrast?

      We did not adjust brightness or contrast for any of the figures, and no such requests were made by other reviewers. We greatly prefer presenting the data as unedited as possible, and would like to request the journal’s preference for action here.

      5. Remaining reviewer comments & responses not highlighted above

      CROSS-CONSULTATION COMMENTS

      The following is a conversation among the three referees:

      REFEREE #2: I appear to be the dissenting voice in terms of concern about the details of the 27 bp deletion and the "overexpression" constructs. I would be interested to know your opinions regarding my comments on these issues.

      REFEREE #1: I think adding the details of the 27 bp deletion is a reasonable request. It is probably not possible to disambiguate entirely the two effects of the deletion, and changing the start codon may result in an alternate start with other downstream effects. I think just explaining it more fully in the methods would satisfy my concerns.

      REFEREE #2: What about the issues with the overexpression? In my experience, presence of multicopy transgenes on extrachromosomal arrays might not lead to overexpression of the gene involved. This needs to be verified in some way.

      REFEREE #1: You are right about that. If the construct is tagged in some way they could try a western. I would recommend they integrate the transgenes, or just show results from several lines as you suggest.

      REFEREE #3: I agree the 27 bp deletion and over expression are reasonable technical issues. However, I view these as technical details vs. critical details for the novel regulatory mechanism. The point about ability to judge conservation is also reasonable but until the theory is firmly out there it is hard to test the conservation and broader applicability to other genes/proteins. Thus, while asking for additional information on these issues is reasonable I do not see the inability to address beyond highlighting as a limitation in the text as critical to the overall validity of the work.

      REFEREE #2: I disagree with Reviewer #3, without knowing the details of 27 bp deletion the most reasonable interpretation of the data is simply that it is a loss-of-function allele of zyx-1. This goes beyond "technical" - at present there is no unequivocal evidence that azyx-1 has any functional significance beyond that it is expressed as a peptide.

      REFEREE #3: I've been back through the manuscript. They have sequenced the deletion and therefore should be able to provide that information to satisfy the issue(s). For the over expression, short of silencing my experience is that they do over express and when you have multiple lines some express more than others (and some silence more than others). If you want evidence that the peptide is over expressed ask them to quantify via mass spec if it isn't tagged and they can't do a Western. Clearly they have world-leading expertise in quantitative mass spec proteomics in C. elegans and should be able to do that. Generally speaking, rescue of a deletion is a pretty good sign that the expression is working though (and is an accepted standard).

      We apologize if this was not clear from the manuscript, and will clearly include the details in the Methods section: the deletion is 182-155 bp (27 bp) upstream of the zyx-1a start site, at AT|G+26|TTC. This was confirmed by sequencing; the oligos used for this are listed in Table S3 of the manuscript. We address the confusion of rescue and overexpression above, in response to reviewer #2 (who echoes this confusion here).

      Reviewer #1 (Evidence, reproducibility and clarity):

      This is a very interesting paper about a gene regulatory mechanism in a type of poly-cistronic mRNA in which alternate starts/open reading frames lead to production of two different proteins from the same locus. AZYX-1 is a predicted 166 aa protein, translated from the 5'UTR of zyx-1. Two isoforms are expressed from the 5' UTR and coding region of zyx-1. The presence of overlapping transcripts with zyxin orthologs appears to be conserved in other animals. The authors provide spectroscopic evidence AZYX-1 is indeed translated, and show AZYX-1 can regulate zyx-1 expression. Intriguingly, it seems azyx-1 inhibits zyx-1 expression in cis (deletion of azyx-1 increases ZYX-1 peptides), but AZYX-1 promotes zyx-1 expression in trans (overexpression of AZYX-1 increases ZYX-1 expression).

      Reviewer #1 (Significance):

      Nature and significance of the advance: This is a very interesting paper about a gene regulatory mechanism in a type of poly-cistronic mRNA encoding azyx-1 and zyx-1. Intriguingly, it seems azyx-1 inhibits zyx-1 expression in cis (deletion of azyx-1 increases ZYX-1 peptides), but AZYX-1 promotes zyx-1 expression in trans (overexpression of AZYX-1 increases ZYX-1 expression).

      Compare to existing published knowledge: This is the first study of its type on zyx-1.

      Audience: Those interested in gene regulatory mechanisms and in zyxin.

      My expertise: C. elegans cytoskeleton, cell migration, acto-myosin contractility.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      The authors build on previous work defining upstream and upstream-overlapping open reading frames (uORF and uoORFs, respectively) by focussing on a specific locus azyx-1, which the authors propose influences the expression of the gene encoding the sole zyxin family member in C. elegans, zyx-1. They present evidence suggestive of u/uoORFs being a common feature of zyxin family genes in other animals, hinting that perhaps this is a conserved mechanism of gene expression regulation for these genes. In which case, studies in C. elegans would be valuable to elucidate the mechanism involved.

      Using a fluorescent reporter strategy, they show that azyx-1 is expressed in the same tissues as zyx-1, which is to be expected since they share the same transcriptional control elements.

      They also characterise the peptide steady state levels of both ZYX-1 and AZYX-1 isoforms, suggesting that while overall ZYX-1 levels decline with age, those for AZYX-1 are generally maintained. The significance of these observations was not immediately obvious to me - a priori it is difficult to assess what relative wild type steady-state levels one might expect if AZYX-1 translation impacted ZYX-1 expression.

      The authors propose that expression of AZYX-1 leads to inhibition of ZYX-1 translation through the standard model by which u/uoORFs impact translation of downstream ORFs. To test this, they generated a 27 bp deletion "at the beginning of the azyx-1 ORF". This deletion clearly correlated with a reduction in ZYX-1 expression.

      Finally, the authors generated lines designed to overexpress AZYX-1, testing the hypothesis that AZYX-1 might influence ZYX-1 in trans. Though here, it is not obvious by what mechanism this might operate, and the effect-sizes involved are modest.

      Reviewer #2 (Significance):

      The authors propose an interesting interaction between an important regulator of cellular behaviour (zyxin) and the u/uoORF that potentially regulates its expression - if validated by further experimentation, this would add to the growing evidence for the importance of the 5' UTR as a source of gene regulatory activity. Such regulation is well described in yeast, but there are fewer examples in animals, particularly in genetically tractable systems such as C. elegans. The work would primarily be of interest to researchers interested in understanding the spectrum of such activity in C. elegans. My own area of expertise, RNA-splicing and the post-transcriptional regulation of C. elegans gene expression, is not directly related to the research presented in the manuscript, but I am familiar with the general concepts and developments involved.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary:

      The authors find that azyx-1 is a non-canonical gene with overlapping genomic localization to the gene zyx-1 in C. elegans. The authors also find preliminary evidence that similar genes with overlapping localization to zyxin genes exist in other species. The authors provide evidence for the tissue specific distribution of azyx-1 expression. The authors further provide evidence for azyx-1 and zyx-1 expression with age. Importantly, these data demonstrate differences in azyx-1 and zyx-1 protein products biological importance/relevance as they display differences with age. The authors provide evidence that azyx-1 expression influences zyx-1 expression in multiple ways. Lastly, the authors demonstrate that azyx-1 expression influences muscle structure and neuromuscular function. The authors use a combination of bioinformatic, protein biochemistry, genetic/transgenic, histologic, and physiologic methods to make these points. With regards to methods, the range/breadth is impressive and appropriate. In many ways the manuscript is a tour de force in modern molecular biology with a focus on translational medicine. With regards to species, the in vivo experiments are solely C. elegans but the computational data include Fly, Bull, and Mouse.

      The key conclusions are convincing. There are no major claims that require qualification as preliminary or speculative. No additional experiments are essential to support the claims of the paper. The data and methods are presented in such a way that they can be reproduced. The experiments are adequately replicated and the statistical analysis is adequate.

      Prior studies are referenced appropriately. The text and figures are mostly clear and accurate.

      We would like to thank the reviewer for their appreciation of our efforts and research approach.

      Reviewer #3 (Significance):

      Conceptually this is a massive/ground breaking piece of work. Essentially, the authors are demonstrating a novel mechanism of regulation of gene/protein expression that, really, hasn't been reported before. What is particularly notable is that it appears, unsurprisingly, as correctly stated by the authors, to be evolutionarily conserved and not well reported in the literature. As with many classical molecular biology papers, and the more recent (e.g. RNAi, lncRNA) genetic papers, this manuscript holds the promise of transforming biology/medicine. The range of methods employed and the linking of molecular biology to pathophysiology was impressive. The audience that will be interested in this work includes: geneticists, proteomics researchers, evolutionary researchers, molecular biologists, physiologists, ageing researchers, muscle researchers, and muscle disease researchers. Thus, the interested audience is broad. My field of expertise with regards to this manuscript is: C. elegans, Mass Spec, Proteomics, genomic regulation, genetics, transgenics, histology, muscle, and physiology. There are no parts of this manuscript that I feel I have insufficient expertise to evaluate. I congratulate the authors on a highly significant, cross disciplinary, manuscript, that should impact multiple sub-areas of biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors build on previous work defining upstream and upstream-overlapping open reading frames (uORF and uoORFs, respectively) by focussing on a specific locus peu-1, which the authors propose influences the expression of the gene encoding the sole zyxin family member in C. elegans, zyx-1. They present evidence suggestive of u/uoORFs being a common feature of zyxin family genes in other animals, hinting that perhaps this is a conserved mechanism of gene expression regulation for these genes. In which case, studies in C. elegans would be valuable to elucidate the mechanism involved.

      Using a fluorescent reporter strategy, they show that peu-1 is expressed in the same tissues as zyx-1, which is to be expected since they share the same transcriptional control elements.

      They also characterise the peptide steady state levels of both ZYX-1 and PEU-1 isoforms, suggesting that while overall ZYX-1 levels decline with age, those for PEU-1 are generally maintained. The significance of these observations was not immediately obvious to me - a priori it is difficult to assess what relative wild type steady-state levels one might expect if PEU-1 translation impacted ZYX-1 expression.

      The authors propose that expression of PEU-1 leads to inhibition of ZYX-1 translation through the standard model by which u/uoORFs impact translation of downstream ORFs. To test this, they generated a 27 bp deletion "at the beginning of the peu-1 ORF". This deletion clearly correlated with a reduction in ZYX-1 expression.

      Finally, the authors generated lines designed to overexpress PEU-1, testing the hypothesis that PEU-1 might influence ZYX-1 in trans. Though here, it is not obvious by what mechanism this might operate, and the effect-sizes involved are modest.

      Major comments:

      1. I'm not sure how to interpret the significance of the u/uoORFs across short and large phylogenetic distances. One would presume that there might not be primary amino acid conservation if the regulation simply takes place by interference with ribosome scanning and translocation. Here some statistical analysis would help with assessing the significance of these observations. How unusual is it to find u/uoORFs in the 5' UTRs of genes encoding zyxin family members versus in general for the species analysed?
      2. The authors state that there is evidence for synteny and coding region conservation. The data supporting this assertion is not well presented. Presentation and analysis of multiple sequence alignments of the putative homologues involved would strengthen the assertion of synteny considerably.
      3. The authors are oddly coy about the molecular details of the 27 bp deletion used to study the loss of peu-1 function. In the absence of these details, it is not possible to assess the validity of these experiments. We need to be given the full molecular details of the allele - precisely which nucleotides are deleted? And how do they affect the coding regions of zyx-1 and peu-1?
         I am also confused about why the authors made a deletion allele rather than mutating the AUG of PEU-1? This would be a cleaner experiment to interpret. Based on the data presented, there are two possible interpretations in addition to the one suggested by the authors: 1) the 27 bp deletion impacts zyx-1 expression due to its impact on the zyx-1 coding region (the coding regions of peu-1 and zyx-1 overlap); 2) the deletion mutation deletes critical transcriptional control elements. A simpler mutation of the peu-1 AUG via CRISPR might allow them to rule out the possibility that they have simply compromised a transcriptional control element or damaged the coding region of ZYX-1.
      4. I am not convinced by the "overexpression" experiments. These are not well controlled, since no evidence is presented that PEU-1 is being overexpressed in these lines. Also, since we know that extrachromosomal transgenic lines are highly variable, one would need to test the effect of several independent lines to ensure that the effects that the authors observe are indeed associated with PEU-1 overexpression and not simply an idiosyncratic effect of the genetic background of a given strain. Finally, there does not seem to be an obvious mechanism by which overexpression of PEU-1 can impact ZYX-1 function. That doesn't rule out an effect, but based on the data as it is, it is premature to propose such a mechanism. The authors need to show that multiple overexpression lines do reproducibly overexpress PEU-1 and that this results in reproducible effects of zyx-1 phenotypes.
      5. I am not convinced by the data presented in Figure 5. There does not seem to be much to distinguish the five genotypes, but I concede that I am not used to looking at this type of data. But why was the muscle phenotype not also examined in the peu-1 rescue lines?

      Minor comments:

      1. The narrative flow of the introduction could be improved by the judicious use of paragraphs. Line 12, for instance is a clear paragraph break, as is line 24.
      2. The data presented in Figure 4F needs to be quantified using the same format as was presented in Figure 4B.
      3. I am not clear what features are being used to characterise the myofibril structures into the three categories. Can the authors annotate the images to indicate the diagnostic features?
      4. What is the difference between the overexpression transgenic lines and the "rescuing" transgenic lines? In the Materials and Methods, the same concentration of plasmid was used in injections - so these likely give the same approximate level of transgenic expression.

      Referees cross-commenting

      The following is a conversation among the three referees:

      REFEREE #2: I appear to be the dissenting voice in terms of concern about the details of the 27 bp deletion and the "overexpression" constructs. I would be interested to know your opinions regarding my comments on these issues.

      REFEREE #1: I think adding the details of the 27 bp deletion is a reasonable request. It is probably not possible to disambiguate entirely the two effects of the deletion, and changing the start codon may result in an alternate start with other downstream effects. I think just explaining it more fully in the methods would satisfy my concerns.

      REFEREE #2: What about the issues with the overexpression? In my experience, presence of multicopy transgenes on extrachromosomal arrays might not lead to overexpression of the gene involved. This needs to be verified in some way.

      REFEREE #1: You are right about that. If the construct is tagged in some way they could try a western. I would recommend they integrate the transgenes, or just show results from several lines as you suggest.

      REFEREE #3: I agree the 27 bp deletion and over expression are reasonable technical issues. However, I view these as technical details vs. critical details for the novel regulatory mechanism. The point about ability to judge conservation is also reasonable but until the theory is firmly out there it is hard to test the conservation and broader applicability to other genes/proteins. Thus, while asking for additional information on these issues is reasonable I do not see the inability to address beyond highlighting as a limitation in the text as critical to the overall validity of the work.

      REFEREE #2: I disagree with Reviewer #3, without knowing the details of 27 bp deletion the most reasonable interpretation of the data is simply that it is a loss-of-function allele of zyx-1. This goes beyond "technical" - at present there is no unequivocal evidence that peu-1 has any functional significance beyond that it is expressed as a peptide.

      REFEREE #3: I've been back through the manuscript. They have sequenced the deletion and therefore should be able to provide that information to satisfy the issue(s). For the over expression, short of silencing my experience is that they do over express and when you have multiple lines some express more than others (and some silence more than others). If you want evidence that the peptide is over expressed ask them to quantify via mass spec if it isn't tagged and they can't do a Western. Clearly they have world-leading expertise in quantitative mass spec proteomics in C. elegans and should be able to do that. Generally speaking, rescue of a deletion is a pretty good sign that the expression is working though (and is an accepted standard).

      Significance

      The authors propose an interesting interaction between an important regulator of cellular behaviour (zyxin) and the u/uoORF that potentially regulates its expression - if validated by further experimentation, this would add to the growing evidence for the importance of the 5' UTR as a source of gene regulatory activity. Such regulation is well described in yeast, but there are fewer examples in animals, particularly in genetically tractable systems such as C. elegans. The work would primarily be of interest to researchers interested in understanding the spectrum of such activity in C. elegans. My own area of expertise, RNA-splicing and the post-transcriptional regulation of C. elegans gene expression, is not directly related to the research presented in the manuscript, but I am familiar with the general concepts and developments involved.

    1. Author Response

      Reviewer #1 (Public Review):

      This fMRI study investigated how memories are updated after reinterpreting past events. Participants watched a movie and subsequently recalled individual scenes from that movie. Importantly, the movie ends with a twist that changes the interpretation of earlier scenes in the movie. One group of participants watched the movie with the twist at the end, one group did not get to see the twist, and a third group was already informed about this twist before watching the movie. Analyses compared the similarity of activity patterns to (encoded or recalled) events across participants within regions of the default mode network (DMN). The design allowed for multiple relevant comparisons, confirming the prediction that activity patterns in DMN regions reflect the (re)interpretation of the movie (during movie viewing and/or during recall).

      The study is well-designed and executed. The inclusion of multiple analyses involving distinct comparisons strengthens the evidence for the role of the DMN in memory updating.

      The following points may be relevant to consider:

      1) The cross-participant pattern analysis method used here is not standard, with such analyses typically done within participants (or across participants, but after aligning representational spaces). Considering individual variability in functional organization, the method is likely only sensitive to coarse-scale patterns (e.g., anterior vs posterior parts of an ROI). This is not necessarily a weakness but is relevant when interpreting the results.

      We agree with the reviewer that functional misalignment might have played against us here. We designed this study as a natural successor of our previous work in which we captured reliable and multimodal scene-specific cross-participant pattern similarity during encoding and recall in standard space. In this revised version, we provide further evidence on how scene content is captured and influences our results. Nonetheless, we agree with your comment and add the following section to the discussion to encourage considering this point while interpreting the results.

      "Moreover, our current method relies on averaging spatially-coarse activity patterns across subjects (and time points within an event). Future extensions of this work may benefit from using functional alignment methods (Haxby et al 2020, Chen et al 2015) to capture more fine-grained event representations which are shared across participants."

      2) Unlike previous work, analyses are not testing for scene-specific information. Rather, each scene is treated separately to establish between-group differences, and results are averaged across scenes. This raises the question of whether the patterns reflect scene-specific information or generic group differences. For example, knowing the twist may increase overall engagement, both when viewing the movie (spoiled group) and when recalling it (spoiled group + twist group). The DMN may be particularly sensitive to such differences in overall engagement.

      You have brought up great points. We addressed them in two ways: (1) We ran a univariate analysis in each DMN ROI to look at the role of overall regional-average response magnitude in our results. We did not observe a significant effect of group or an interaction between group and condition. (2) We added a scene-specificity analysis in a new Results section entitled “The role of scene content” (Figure 4). This section focuses on comparing the interaction index (Figure 2C), as an indicator of memory updating, under different manipulations. The interaction index reflects the reversal of neural similarity during encoding and recall. Our results suggest that we do not see the same effects if we shuffle the scene labels and recompute the pattern similarity analyses. Please see the added text and figures below:

      "To test whether our reported results were mainly driven by the similarities and differences in multivariate spatial patterns of neural representations, as opposed to by univariate regional-average response magnitudes, we ran a univariate analysis in each ROI. This analysis revealed no significant effect of group (“spoiled”, “twist”, “no-twist”) or interaction between group and condition (movie, recall) (Table 1, see Methods for details).

      Next, to determine whether scene-specific neural event representations—as opposed to coarser differences in general mental state across all scenes with similar interpretations—drive our observed pISC differences, we shuffled the labels of critical scenes within each group before calculating and comparing pISC across groups. By repeating this procedure 1000 times and recalculating the interaction index at each iteration, we constructed a null distribution of interaction indices for shuffled critical scenes (light magenta distributions in Figure 4B). In 12 out of 24 DMN regions, interaction indices were statistically significant based on the shuffled-scene distribution (p < .025, FDR controlled at q < .05). All of these 12 regions were among the ROIs that showed meaningful effects in our original analysis (Figure 2C). Regions with significant scene-specific interaction effects are marked as blue dots with black borders in Figure 4B. Overall, the findings from this analysis confirm that our results are driven by changes to scene-specific representations."
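
      The shuffled-scene null described above can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis pipeline used in the study: the difference-of-differences form of `interaction_index`, the use of group-average patterns, the one-sided p-value, and all function names and the `simulate` helper are assumptions introduced here for illustration only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_scene_corr(a, b):
    """Mean correlation between same-index scene patterns of two groups."""
    return sum(pearson(p, q) for p, q in zip(a, b)) / len(a)


def interaction_index(enc, rec):
    """Hypothetical index: (twist~no-twist minus twist~spoiled) at encoding,
    minus the same contrast at recall. An assumed form, not the paper's."""
    def contrast(p):
        return (mean_scene_corr(p["twist"], p["no-twist"])
                - mean_scene_corr(p["twist"], p["spoiled"]))
    return contrast(enc) - contrast(rec)


def shuffle_scenes(patterns, rng):
    """Independently permute the scene order within each group."""
    return {g: rng.sample(x, len(x)) for g, x in patterns.items()}


def shuffled_scene_null(enc, rec, n_perm=1000, seed=0):
    """Null distribution of the index after breaking scene correspondence."""
    rng = random.Random(seed)
    return [interaction_index(shuffle_scenes(enc, rng), shuffle_scenes(rec, rng))
            for _ in range(n_perm)]


def permutation_p(observed, null):
    """One-sided permutation p-value with the +1 correction."""
    return (1 + sum(v >= observed for v in null)) / (1 + len(null))


def benjamini_hochberg(pvals, q=0.05):
    """Return one reject flag per p-value, controlling FDR at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank  # largest rank passing the BH threshold
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


def simulate(n_scenes=8, n_voxels=100, seed=1):
    """Synthetic group-average patterns (scenes x voxels) for one ROI."""
    rng = random.Random(seed)

    def template():
        return [[rng.gauss(0, 1) for _ in range(n_voxels)] for _ in range(n_scenes)]

    def noisy(t):
        return [[v + rng.gauss(0, 1) for v in row] for row in t]

    doctor, ghost = template(), template()
    # Encoding: "twist" shares the Doctor reading with "no-twist".
    enc = {"twist": noisy(doctor), "no-twist": noisy(doctor), "spoiled": noisy(ghost)}
    # Recall: "twist" has updated to the Ghost reading shared with "spoiled".
    rec = {"twist": noisy(ghost), "no-twist": noisy(doctor), "spoiled": noisy(ghost)}
    return enc, rec


if __name__ == "__main__":
    enc, rec = simulate()
    observed = interaction_index(enc, rec)
    null = shuffled_scene_null(enc, rec, n_perm=1000)
    print("observed index:", round(observed, 3))
    print("permutation p:", permutation_p(observed, null))
```

      In a multi-ROI setting, one p-value per region would be collected and passed to `benjamini_hochberg`. If the observed index stays large while the shuffled-scene null stays near zero, the effect depends on matching scenes rather than on a scene-independent group state.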

      3) The study does not reveal what the DMN represents about the movie, such that its activity changes after knowing the twist. The Discussion briefly mentions that it may reflect the state of the observer, related to the belief about the identity of the doctor. This suggests a link to the theory of mind/mentalizing, but this is not made explicit. Alternatively, the DMN may be involved in the conflict (or switching) between the two interpretations.

      Great points. We added text to the Discussion about the role of the mentalizing network, and in particular the temporo-parietal cortex. Regarding your last point, we think our whole-brain findings outside the DMN (ACC and dlPFC) may relate to conflict or switching between the two interpretations. We discuss these findings further in the paper.

      "We performed two targeted analyses to look for evidence of memory updating across encoding and recall: the interaction analysis (Figure 2C) and the encoding-recall analysis (Figure 3). We hypothesized that a shift in direction of pISC difference would occur when neural representations during recall in the “twist” group start to reflect the Ghost interpretation. The interaction analysis probed this shift indirectly by taking into account the effects of both encoding-encoding and recall-recall analyses. Unlike the interaction analysis, in the encoding-recall analysis, we directly compared neural event representations during encoding and recall. Interestingly, all regions exhibiting an effect across the two encoding-recall analyses, excluding left anterior temporal cortex, were present in the interaction results. Among these regions, the left angular gyrus/TPJ exhibited an effect across all three analyses. As a core hub in the mentalizing network, temporo-parietal cortex has been implicated in theory of mind through perspective-taking, rationalizing the mental state of someone else, and modeling the attentional state of others (Frith and Frith 2006, Guterstam et al. 2021, Saxe and Kanwisher 2003). The motivations behind some actions of the main character in the movie heavily depend on whether the viewer perceives them as a Doctor or a Ghost, and participants may focus on this during both encoding and recall. We speculate that neural event representations in AG/TPJ in the current experiment may be related to mentalizing about the main character’s actions. Under this interpretation, the updated event representations during recall following the twist would be more closely aligned to the “spoiled” encoding representations, as a consequence of memory updating in the “twist” group.

      In our whole-brain analysis, these regions did not have significant interaction effects, which suggests that the effects were isolated to encoding. In the whole-brain analysis, we also observed significant encoding-encoding and interaction effects in anterior cingulate cortex, as well as recall-recall and interaction effects in dlPFC. These results suggest that both the "spoiled" manipulation and the "twist" may recruit top-down control and conflict monitoring processes during naturalistic viewing and recall."

      4) The design has many naturalistic aspects, but it is also different from real life in that the critical twist involves a ghost. Furthermore, all results are based on one movie with a specific plot twist. It is thus not clear whether similar results would be obtained with other and more naturalistic plot twists.

      We added this as a limitation of the study.

      "Our findings provide further insight into the functional role of the DMN. However, these results have been obtained using only one movie. While naturalistic paradigms better capture the complexity of real life and provide greater ecological generalizability than highly controlled experimental stimuli and tasks (Nastase et al., 2020), they are still limited by the properties of the particular naturalistic stimulus used. For example, this movie, including the twist itself, hinges on suspension of disbelief about the existence of ghosts. Future work is needed to extend our findings about updating event memories to a broader class of naturalistic stimuli: for example, movies with different kinds of (non-supernatural) plot twists, spoken stories with twist endings, or autobiographical real-life situations where new information (e.g. discovering a longtime friend has lied about something important) triggers re-evaluation of the past (e.g. reinterpreting that friend’s previous actions)."

      5) Only 7 scenes (out of 18) were included in the analysis. It is not clear if/how the results depend on the selection of these 7 scenes.

      Thank you for bringing this up. These scenes were pre-selected for the analyses because they were the only scenes rated highly by our independent raters (not study participants) on “twist influence”, meaning that knowing the twist could dramatically change their interpretation. We therefore had a priori reasons to hypothesize that the effect would be strong in these scenes. To address your point, we now report results that include all 18 scenes in a new Results section entitled “The role of scene content” and in Figure 4A. Although the effect was weaker across all scenes, it was still apparent in this conservative analysis. As expected, however, including only the 7 critical scenes produces stronger results than including all scenes or only the non-critical scenes (all minus critical scenes). Please see “The role of scene content” in Results and Figure 4 for more detailed information.

      "The role of scene content

      In the prior analyses, we focused on “critical scenes”, selected based on ratings from four raters who quantified the influence of the twist on the interpretation of each scene (see Methods). An independent post-experiment analysis of the verbal recall behavior of the fMRI participants yielded “twist scores” that were also highest for these scenes; that is, the expected and perceived effect of twist information on recall behavior were found to match. In our next analysis, we asked whether the neural event representations reflect these differences in the twist-related content of the scenes. In other words, are the “critical scenes” with highly twist-dependent interpretations truly critical for our observed effects?

      To answer this question, we re-ran our main encoding-encoding and recall-recall pISC analysis in each DMN ROI (Figures 2-3). We calculated interaction indices (Figure 2C) first by including all scenes, and second by including only the 11 non-critical scenes. To better compare the effect of including different subsets of scenes to our original results, in Figure 4 we show the results in 15 ROIs that exhibited meaningful effects in our main analyses (Figure 2C). Figure 4A demonstrates that “critical scenes” yielded higher interaction indices compared to all scenes or non-critical scenes across all ROIs. The interaction score across all DMN ROIs was significantly higher in “critical scenes” than all scenes (t(23) = 7.19, p = 2.53 × 10⁻⁷) and non-critical scenes (t(23) = 7.3, p = 1.95 × 10⁻⁷). These results show that critical scenes are indeed responsible for the observed pISC differences across groups."
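
The t(23) statistics above are consistent with a paired comparison across 24 DMN ROIs (degrees of freedom = 24 − 1). The sketch below shows how such a statistic could be computed. It is an illustration under assumptions: the interaction indices are invented, `paired_t` is a hypothetical helper, and the source does not state which software or exact test variant was used.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical interaction indices for 24 ROIs (not the study's values).
critical = [0.10 + 0.01 * i for i in range(24)]
all_scenes = [c - 0.03 - 0.001 * (i % 3) for i, c in enumerate(critical)]

t_stat, df = paired_t(critical, all_scenes)
print(f"t({df}) = {t_stat:.2f}")
```

Each ROI contributes one difference score, so the test asks whether the critical-scene index exceeds the comparison index consistently across regions.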

      Reviewer #2 (Public Review):

      In this manuscript titled "Here's the twist: How the brain updates the representations of naturalistic events as our understanding of the past changes", the authors reported a study that examined how new information (manipulated as a twist at the end of a movie) changes the neural representations in the default mode network (DMN) during the recall of prior knowledge. Three groups of participants were compared - one group experienced the twist at the end, one group never experienced the twist, and one group received a spoiler at the beginning. At retrieval, participants received snippets of 18 scenes of the movie as cues and were asked to freely describe the events of each scene and to provide the most accurate interpretation of the scene, given the information they gathered throughout watching.

      All three groups were highly accurate in the recall of content. The groups that experienced the twist at the end as well as at the beginning as a spoiler showed a higher twist score (the extent to which twist information was incorporated into the recall), while seemingly also keeping the interpretation without the twist ("Doctor representation") intact. Neurally, several regions in the DMN showed significant interaction effects in their neural similarity patterns (based on intersubject pattern correlation), indicating a change in interpretation between encoding and recall in the twist group uniquely, presumably reflecting memory updating.

      Several points that I think should be addressed to strengthen the manuscript:

      1) The results from encoding-retrieval similarity analysis (particularly the one depicted in Figure 3B) don't match the results from encoding/retrieval interaction (particularly those shown in Figure 2C). While they were certainly based on different comparisons, I would think that both analyses were set up to test for memory updating. Can the authors comment on this divergence in results?

      Thank you for your comment. Except for one ROI, the other two regions in Figure 2C are present in the interaction analysis. The ROI at the frontal pole might be hard to see from this angle, but it in fact has a large effect size in the interaction analysis. We therefore do not see a large divergence between these two results. Taking the recall-recall results into account, however, we agree that there seems to be some inhomogeneity. We discuss this further in the Discussion.

      "We performed two targeted analyses to look for evidence of memory updating across encoding and recall: the interaction analysis (Figure 2C) and the encoding-recall analysis (Figure 3). We hypothesized that a shift in direction of pISC difference would occur when neural representations during recall in the “twist” group start to reflect the Ghost interpretation. The interaction analysis probed this shift indirectly by taking into account the effects of both encoding-encoding and recall-recall analyses. Unlike the interaction analysis, in the encoding-recall analysis, we directly compared neural event representations during encoding and recall. Interestingly, all regions exhibiting an effect across the two encoding-recall analyses, excluding left anterior temporal cortex, were present in the interaction results. Among these regions, the left angular gyrus/TPJ exhibited an effect across all three analyses. As a core hub in the mentalizing network, temporo-parietal cortex has been implicated in theory of mind through perspective-taking, rationalizing the mental state of someone else, and modeling the attentional state of others (Frith and Frith 2006, Guterstam et al. 2021, Saxe and Kanwisher 2003). The motivations behind some actions of the main character in the movie heavily depend on whether the viewer perceives them as a Doctor or a Ghost, and participants may focus on this during both encoding and recall. We speculate that neural event representations in AG/TPJ in the current experiment may be related to mentalizing about the main character’s actions. Under this interpretation, the updated event representations during recall following the twist would be more closely aligned to the “spoiled” encoding representations, as a consequence of memory updating in the “twist” group.

      Our findings are consistent with the view that DMN synthesizes incoming information with one’s prior beliefs and memories (Yeshurun et al 2021). We add to this framework by providing evidence for the involvement of DMN regions in updating prior beliefs in light of new knowledge. Across our different encoding and recall analyses, we observe memory updating effects in a varied subset of DMN regions that do not cleanly map onto a specific subsystem of DMN (Robin and Moscovitch 2017, Ranganath and Ritchey 2012, Ritchey and Cooper 2020). Rather than being divergent, these results might be reflecting inherent differences between the processes of encoding and recall of naturalistic events. It has been proposed that neural representations corresponding to encoding of events are systematically transformed during recall of those events (Chen et al 2017, Favila et al 2020, Musz and Chen 2022). While we provide evidence for reinstatement of memories in DMN, our findings also support a transformation of neural representation during recall, as encoding-recall results were weaker in some areas than recall-recall findings. This transformation could affect how different regions and sub-systems of DMN represent memories, and suggests that the concerted activity of multiple subsystems and neural mechanisms might be at play during encoding, recall and successful updating of naturalistic event memories."

      2) The recall task was self-paced. Can reaction time information be provided on how long participants needed to recall? Did this differ across groups? Presumably in the twist group and spoiled group participants might have needed a longer time to incorporate both the original and twist interpretation.

      This is an interesting idea. Unfortunately, we could not measure this accurately because our recall cues were snippets of different lengths taken from the beginning of each scene (selected based on content), and updating could have begun at any point during those snippets (we would not know when). We will consider this point in future related designs.

      How was the length difference across events taken into consideration in the beta estimates?

      They were used as event durations in the GLM.
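
To make this concrete, here is a minimal sketch of how variable-length events can enter a design matrix as a boxcar regressor. It is an illustration only: the function name, the repetition time, and the onsets and durations are hypothetical, and hemodynamic convolution, which an fMRI GLM would typically apply, is left out.

```python
def boxcar_regressor(onsets, durations, n_scans, tr):
    """Return a 0/1 regressor that is 1 for scans acquired during an event.

    onsets and durations are in seconds; scan i is acquired at time i * tr.
    """
    regressor = [0.0] * n_scans
    for onset, duration in zip(onsets, durations):
        for i in range(n_scans):
            scan_time = i * tr
            if onset <= scan_time < onset + duration:
                regressor[i] = 1.0
    return regressor

# Hypothetical example: two recall events of unequal length, TR = 1.5 s.
regressor = boxcar_regressor(onsets=[0.0, 15.0], durations=[6.0, 12.0],
                             n_scans=20, tr=1.5)
print(regressor)
```

Because each event spans as many scans as it lasts, longer recalls contribute more time points to the regressor, and the fitted beta reflects the response amplitude per event rather than the event length.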

      Also, is there an order effect, such that one type of interpretation tended to be recalled first?

      This is hard to measure as this only occurs in a subset of scenes. But we assume it happens in other people’s brains as well.

      This is indeed hard to measure as you mentioned. We will provide the transcripts when sharing the data and hopefully this will facilitate future text-analysis work on this dataset to answer interesting questions like this.

      3) The correlation analysis between neural pattern change and behavioral twist score is based on a small sample size and does not seem to be well suited to test the postulation of the authors, namely that some participants may hold both interpretations in their memory. Interestingly, the twist score of the spoiled group was similar to the twist group, indicating participants in this group might have held both interpretations as well. Could this observation be leveraged, for example by combining both groups (hence better powered with larger sample size), in order to relate individual differences in neural similarity patterns and behavioral tendency to hold both interpretations?

      Even though both groups showed signs of holding both interpretations in mind, the processes occurring in their brains during recall differ. In particular, we do not expect to see any updating effect in the spoiled group. It would therefore not be appropriate to combine these groups to test the effect of incomplete updating.

      4) Several regions within the DMN were significant across the analysis steps, specifically the angular gyrus, middle temporal cortex, and medial PFC. Can the authors provide more insights on how these widely distributed regions may act together to enable memory updating? The discussion on the main findings is largely at a rather superficial level about DMN, or focuses specifically on vmPFC, but neglects the distributed regions that presumably function interactively.

      Thanks for bringing this up. We added text to the Discussion to respond to this very valid point. Please see the added text in our response to your first point. We also added one more snippet to the Discussion on this topic:

      "In addition to mPFC, right precuneus and parts of temporal cortex exhibited significantly higher pattern similarity in the “twist” and “spoiled” groups who recalled the movie with the same interpretation. Precuneus is a core region in the posterior medial network, which is hypothesized to be involved in constructing and applying situation models (Ranganath and Ritchey 2012). Our findings support a role for precuneus in deploying interpretation-specific situation models when retrieving event memories. In particular, we suggest that the posterior medial network may encode a shift in the situation model of the “twist” group in order to accommodate the new Ghost interpretation.

      We performed two targeted analyses to look for evidence of memory updating across encoding and recall: the interaction analysis (Figure 2C) and the encoding-recall analysis (Figure 3). We hypothesized that a shift in direction of pISC difference would occur when neural representations during recall in the “twist” group start to reflect the Ghost interpretation. The interaction analysis probed this shift indirectly by taking into account the effects of both encoding-encoding and recall-recall analyses. Unlike the interaction analysis, in the encoding-recall analysis, we directly compared neural event representations during encoding and recall. Interestingly, all regions exhibiting an effect across the two encoding-recall analyses, excluding left anterior temporal cortex, were present in the interaction results. Among these regions, the left angular gyrus/TPJ exhibited an effect across all three analyses. As a core hub in the mentalizing network, temporo-parietal cortex has been implicated in theory of mind through perspective-taking, rationalizing the mental state of someone else, and modeling the attentional state of others (Frith and Frith 2006, Guterstam et al. 2021, Saxe and Kanwisher 2003). The motivations behind some actions of the main character in the movie heavily depend on whether the viewer perceives them as a Doctor or a Ghost, and participants may focus on this during both encoding and recall. We speculate that neural event representations in AG/TPJ in the current experiment may be related to mentalizing about the main character’s actions. Under this interpretation, the updated event representations during recall following the twist would be more closely aligned to the “spoiled” encoding representations, as a consequence of memory updating in the “twist” group.

      Our findings are consistent with the view that DMN synthesizes incoming information with one’s prior beliefs and memories (Yeshurun et al 2021). We add to this framework by providing evidence for the involvement of DMN regions in updating prior beliefs in light of new knowledge. Across our different encoding and recall analyses, we observe memory updating effects in a varied subset of DMN regions that do not cleanly map onto a specific subsystem of DMN (Robin and Moscovitch 2017, Ranganath and Ritchey 2012, Ritchey and Cooper 2020). Rather than being divergent, these results might be reflecting inherent differences between the processes of encoding and recall of naturalistic events. It has been proposed that neural representations corresponding to encoding of events are systematically transformed during recall of those events (Chen et al 2017, Favila et al 2020, Musz and Chen 2022). While we provide evidence for reinstatement of memories in DMN, our findings also support a transformation of neural representation during recall, as encoding-recall results were weaker in some areas than recall-recall findings. This transformation could affect how different regions and sub-systems of DMN represent memories, and suggests that the concerted activity of multiple subsystems and neural mechanisms might be at play during encoding, recall and successful updating of naturalistic event memories."

      Reviewer #3 (Public Review):

      Zadbood and colleagues investigated the way key information used to update interpretations of events alters patterns of activity in the brain. This was cleverly done by the use of "The Sixth Sense," a film featuring a famous "twist ending," which fundamentally alters the way the events in the film are understood. Participants were assigned to three groups: (1) a Spoiled group, in which the twist was revealed at the outset, (2) a Twist group, who experienced the film as normal, and (3) a No-Twist group, in which the twist was removed. Participants were scanned while watching the movie and while performing cued recall of specific scenes. Verbal recall was scored based on recall success, and evidence for descriptive bias toward two ways of understanding the events (specifically, whether a particular character was or was not a ghost). Importantly, this allowed the authors to show that the Twist group updated their interpretation. The authors focused on regions of the Default Mode Network (DMN) based on prior studies showing responsiveness to naturalistic memory paradigms in these areas and analyzed the fMRI data using intersubject pattern similarity analysis. Regions of the DMN carried patterns indicative of story interpretation. That is, encoding similarity was greater between the Twist and No-Twist groups than in the Spoiled group, and retrieval similarity was greater between the Twist and Spoiled groups than in the No-Twist group. The Spoiled group also showed greater pattern similarity with the Twist group's recall than the No-Twist group's recall. The authors also report a weaker effect of greater pattern similarity between the Spoiled group's encoding and the Twist group's recall than between the Twist group's own encoding and recall. Together, the data all converge on the point that one's interpretation of an event is an important determinant of the way it is represented in the brain.
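
The direction of these comparisons can be sketched in code. The example below is a simplified illustration, not the authors' implementation: each group is represented by a single averaged pattern per scene (whereas intersubject pattern correlation is computed between participants), the data are simulated, and the particular way the encoding and recall contrasts are summed into one `interaction_index` is an assumption.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def mean_scene_corr(patterns_a, patterns_b):
    """Average same-scene pattern correlation between two groups."""
    pairs = zip(patterns_a, patterns_b)
    return sum(pearson(a, b) for a, b in pairs) / len(patterns_a)

def interaction_index(encoding, recall):
    """Assumed index: encoding contrast plus the reversed recall contrast."""
    enc_effect = (mean_scene_corr(encoding["twist"], encoding["no_twist"])
                  - mean_scene_corr(encoding["twist"], encoding["spoiled"]))
    rec_effect = (mean_scene_corr(recall["twist"], recall["spoiled"])
                  - mean_scene_corr(recall["twist"], recall["no_twist"]))
    return enc_effect + rec_effect

# Simulated data (hypothetical): 7 scenes x 50 "voxels" per interpretation.
rng = random.Random(0)
n_scenes, n_voxels = 7, 50

def random_patterns():
    return [[rng.gauss(0, 1) for _ in range(n_voxels)] for _ in range(n_scenes)]

def noisy(patterns):
    return [[v + rng.gauss(0, 0.3) for v in scene] for scene in patterns]

doctor, ghost = random_patterns(), random_patterns()
# During encoding only the spoiled group holds the Ghost interpretation;
# during recall the twist group has switched to it as well.
encoding = {"twist": noisy(doctor), "no_twist": noisy(doctor), "spoiled": noisy(ghost)}
recall = {"twist": noisy(ghost), "no_twist": noisy(doctor), "spoiled": noisy(ghost)}

index = interaction_index(encoding, recall)
print(f"interaction index = {index:.2f}")
```

Under this toy setup the index is large and positive because the twist group's patterns resemble the no-twist group during encoding and the spoiled group during recall; if all groups shared one interpretation throughout, both contrasts would be near zero.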

      This is a really nice experiment, with straightforward predictions and analyses that support the claims being made. The results build directly on a prior study by this research group showing how interpretational differences in a narrative drive distinct neural representations (Yeshurun et al., 2017), but extend an understanding of how these interpretational differences might work retrospectively. I do not have any serious concerns or problems with the manuscript, the data, or the analyses. However, I have a few points to raise that, if addressed, would make for a stronger paper in my opinion.

      1) My most substantive comment is that I did not find the interpretive framework to be very clear with respect to the brain regions involved. The basic effects the authors report strongly support their claims, but the particular contributions to the field might be stronger if the interpretations could be made more strongly or more specifically. In other words: the DMN is involved in updating interpretations, but how should we now think about the role of the DMN and its constituent regions as a result of this study? There are a number of ideas briefly presented about what the DMN might be doing, but it just did not feel very coherent at times. I will break this down into a few more specific points:

      While many of us would agree that the DMN is likely to be involved in the phenomena at hand, I did not find that the paper communicated the logic for singularly focusing on this subset of regions very compellingly. The authors note a few studies whose main results are found in DMN regions, but I think that this could stand to be unpacked in a more theoretically interesting way in the Introduction.

      Relatedly, I found the summary/description of regional effects in the Discussion to be a bit unsatisfying. The various pattern similarity comparisons yielded results that were actually quite nonoverlapping among DMN regions, which was not really unpacked. To be clear, it is not a 'problem' that the regional effects varied from comparison to comparison, but I do think that a more theoretical exploration of what this could mean would strengthen the paper. To the authors' credit, they describe mPFC effects through the lens of schemas, but this stands in contrast to many other regions which do not receive much consideration.

      Finally, although there is evidence that regions of the DMN act in a coordinated way under some circumstances, there is also ample evidence for distinct regional contributions to cognitive processes, memory being just one of them (e.g., Cooper & Ritchey, 2020; Robin & Moscovitch, 2017; Ranganath & Ritchey, 2012). The authors themselves introduce the idea of temporal receptive windows in a cortical hierarchy, and while DMN regions do appear to show slower temporal drift than sensory areas, those studies show regional differences in pattern stability across time even within DMN regions. Simply put, it is worth considering whether it is ideal to treat the DMN as a singular unit.

      Thank you for your helpful comments. We added text to the introduction and discussion to address your point:

      "Introduction:

      The brain’s default mode network (DMN; comprising the posterior medial cortex, medial prefrontal cortex, temporoparietal junction, and parts of anterior temporal cortex) was originally described as an intrinsic or “task-negative” network, activated when participants are not engaged with external stimuli (Raichle et al. 2001, Buckner et al 2008). This observation led to a large body of work showing that the DMN is an important hub for supporting internally driven tasks such as memory retrieval, imagination, future planning, theory of mind, and creating and updating situation models (Svoboda et al. 2006; Addis et al. 2007; Hassabis and Maguire 2007, 2009; Schacter et al. 2007; Szpunar et al. 2007; Spreng et al. 2009; Koster-Hale & Saxe 2013; Ranganath and Ritchey 2012). However, it is not fully understood how this network contributes to these varying functions and, in particular, to memory processes, which are the focus of the present study. Activation of this network during “offline” periods has been proposed to play a role in the consolidation of memories through replay (Kaefer et al 2022). Interestingly, prior work has also shown that the DMN is reliably engaged during “online” processing (encoding) of continuous rich dynamic stimuli such as movies and audio stories (Stephens et al 2013, Hasson et al 2008). Regions in this network have been shown to have long “temporal receptive windows” (Hasson et al 2008; Lerner et al., 2011; Chang et al., 2022), meaning that they integrate and retain high-level information that accumulates over the course of extended timescales (e.g. scenes in movies, paragraphs in text) to support comprehension. This combination of processing characteristics suggests that the DMN integrates past and new knowledge, as regions in this network have access to incoming sensory input, recent active memories, and remote long-term memories or semantic knowledge (Yeshurun et al 2021, Hasson et al 2015).
These integration processes feature in many of the “constructive” processes attributed to DMN such as imagination, future planning, mentalizing, and updating situation models (Schacter and Addis 2007, Ranganath and Ritchey 2012). Notably, constructive processes are highly relevant to real-world memory updating, which involves selecting and combining the relevant parts of old and new memories. Recent work has shown that neural patterns during encoding and recall of naturalistic stimuli (movies) are reliably similar across participants in this network (Chen et al. 2017; Oedekoven et al., 2017; Zadbood et al., 2017; see Bird 2020 for a review of recent naturalistic studies on memory), and the DMN displays distinct neural activity when listening to the same story with different perspectives (Yeshurun et al 2017). Building on this foundation of prior work on the DMN, we asked whether we could find neural evidence for the retroactive influence of new knowledge on past memories."

      "Discussion:

      In addition to mPFC, right precuneus and parts of temporal cortex exhibited significantly higher pattern similarity in the “twist” and “spoiled” groups who recalled the movie with the same interpretation. Precuneus is a core region in the posterior medial network, which is hypothesized to be involved in constructing and applying situation models (Ranganath and Ritchey 2012). Our findings support a role for precuneus in deploying interpretation-specific situation models when retrieving event memories. In particular, we suggest that the posterior medial network may encode a shift in the situation model of the “twist” group in order to accommodate the new Ghost interpretation.

      We performed two targeted analyses to look for evidence of memory updating across encoding and recall: the interaction analysis (Figure 2C) and the encoding-recall analysis (Figure 3). We hypothesized that a shift in direction of pISC difference would occur when neural representations during recall in the “twist” group start to reflect the Ghost interpretation. The interaction analysis probed this shift indirectly by taking into account the effects of both encoding-encoding and recall-recall analyses. Unlike the interaction analysis, in the encoding-recall analysis, we directly compared neural event representations during encoding and recall. Interestingly, all regions exhibiting an effect across the two encoding-recall analyses, excluding left anterior temporal cortex, were present in the interaction results. Among these regions, the left angular gyrus/TPJ exhibited an effect across all three analyses. As a core hub in the mentalizing network, temporo-parietal cortex has been implicated in theory of mind through perspective-taking, rationalizing the mental state of someone else, and modeling the attentional state of others (Frith and Frith 2006, Guterstam et al. 2021, Saxe and Kanwisher 2003). The motivations behind some actions of the main character in the movie heavily depend on whether the viewer perceives them as a Doctor or a Ghost, and participants may focus on this during both encoding and recall. We speculate that neural event representations in AG/TPJ in the current experiment may be related to mentalizing about the main character’s actions. Under this interpretation, the updated event representations during recall following the twist would be more closely aligned to the “spoiled” encoding representations, as a consequence of memory updating in the “twist” group.

      Our findings are consistent with the view that DMN synthesizes incoming information with one’s prior beliefs and memories (Yeshurun et al 2021). We add to this framework by providing evidence for the involvement of DMN regions in updating prior beliefs in light of new knowledge. Across our different encoding and recall analyses, we observe memory updating effects in a varied subset of DMN regions that do not cleanly map onto a specific subsystem of DMN (Robin and Moscovitch 2017, Ranganath and Ritchey 2012, Ritchey and Cooper 2020). Rather than being divergent, these results might be reflecting inherent differences between the processes of encoding and recall of naturalistic events. It has been proposed that neural representations corresponding to encoding of events are systematically transformed during recall of those events (Chen et al 2017, Favila et al 2020, Musz and Chen 2022). While we provide evidence for reinstatement of memories in DMN, our findings also support a transformation of neural representation during recall, as encoding-recall results were weaker in some areas than recall-recall findings. This transformation could affect how different regions and sub-systems of DMN represent memories, and suggests that the concerted activity of multiple subsystems and neural mechanisms might be at play during encoding, recall and successful updating of naturalistic event memories."

      2) I think that some direct comparison to regions outside the DMN would speak to whether the DMN is truly unique in carrying the key representations being discussed here. I was reluctant to suggest this because I think that the authors are justified in expecting that DMN regions would show the effects in question. However, there really is no "null" comparison here wherein a set of regions not expected to show these effects (e.g., a somatosensory network, or the frontoparietal network) in fact do not show them. There are not really controls or key differences being hypothesized across different conditions or regions. Rather, we have a set of regions that may or may not show pattern similarity differences to varying degrees, which feels very exploratory. The inclusion of some principled control comparisons, etc. would bolster these findings. The authors do include a whole-brain analysis in Supplementary Figure 1, which indeed produced many DMN regions. However, notably, regions outside the DMN such as the primary visual cortex and mid-cingulate cortex appear to show significant effects (which, based on the color bar, might actually be stronger than effects seen in the DMN). Given the specificity of the language in the paper in terms of the DMN, I think that some direct regional or network-level comparison is needed.

      In the original submission, we included additional analyses for visual and somatosensory networks, which we hypothesized would serve as control networks. Following your comment, in the revision, we added a separate section (included below) more thoroughly examining these analyses. We also added text to the results and discussion to explain our interpretation of these findings.

      "Changes in neural representations beyond DMN We focused our core analyses on regions of the default mode network. Prior work has shown that multimodal neural representations of naturalistic events (e.g. movie scenes) are similar across encoding (movie-watching or story-listening) and verbal recall of the same events in the DMN (Chen et al., 2017; Zadbood et al., 2017). Therefore, in the current work we hypothesized that retrospective changes in the neural representations of events as the narrative interpretation shifts would be observed in the DMN. We did not, for example, expect to observe such effects in lower-level sensory regions, where neural activity differs dramatically for movie-viewing and verbal recall. To be thorough, we ran the same set of analyses we performed in the DMN (Figure 2-3) in regions of the visual and somatomotor networks extracted from the same atlas parcellation (Schaefer et al., 2018). Our results revealed larger overall differences in DMN than in visual and somatosensory networks for the key comparisons discussed previously (Figure S2). In particular, the only regions showing significant differences in pISC in recall-recall and encoding-recall comparisons (p < 0.01, uncorrected) were located in the DMN. We did not observe a notable difference between DMN and the two other networks when comparing recall “twist” to movie “spoiled” and recall “twist” to movie “twist” (RG – MG > RG – MD) which is consistent with the weak effect in the original comparison (Figure 3B). In the encoding-encoding comparison, several ROIs from the visual and somatomotor networks showed relatively strong effects as well (see Discussion).

      In addition, we qualitatively reproduced our results by performing an ROI-based whole brain analysis (Figure S3, p < 0.01 uncorrected). This analysis confirmed the importance of DMN regions for updating neural event representations. However, strong differences in pISC in the hypothesized direction were also observed in a handful of other non-DMN regions, including ROIs partly overlapping with anterior cingulate cortex and dorsolateral prefrontal cortex (see Discussion)."

      "Discussion: While our main goal in this paper was to examine how neural representations of naturalistic events change in the DMN, we also examined visual and somatosensory networks. Aside from the encoding-encoding analysis in which some visual and somatosensory regions showed stronger similarity between two groups with the same interpretation of the movie, we did not find any regions with significant effects in these two networks in the other analyses. Unlike the recall phase where each participant has their unique utterance with their own choice of words and concepts to describe the movie, the encoding (movie-watching) stimulus is identical across all groups. Therefore, the effects observed during encoding-encoding analysis in sensory regions could reflect similarity in perception of the movie guided by similar attentional state while watching scenes with the same interpretation (e.g. similarity in gaze location, paying attention to certain dialogues, or small body movements while watching the movie with the same Doctor or Ghost interpretations). In our whole brain analysis, these regions did not have significant interaction effects, which suggests that the effects were isolated to encoding. In the whole-brain analysis, we also observed significant encoding-encoding and interaction effects in anterior cingulate cortex, as well as recall-recall and interaction effects in dlPFC. These results suggest that both the "spoiled" manipulation and the "twist" may recruit top-down control and conflict monitoring processes during naturalistic viewing and recall."

      3) If I understand correctly, the main analyses of the fMRI data were limited to across-group comparisons of "critical scenes" that were maximally affected by the twist at the end of the movie. In other words, the analyses focused on the scenes whose interpretation hinged on the "doctor" versus "ghost" interpretation. I would be interested in seeing a comparison of "critical" scenes directly against scenes where the interpretation did not change with the twist. This "critical" versus "non-critical" contrast would be a strong confirmatory analysis that could further bolster the authors' claims, but on the other hand, it would be interesting to know whether the overall story interpretation led to any differences in neural patterns assigned to scenes that would not be expected to depend on differences in interpretation. (As a final note, such a comparison might provide additional analytical leverage for exploring the effect described in Figure 3B, which did not survive correction for multiple comparisons.)

      This is a helpful suggestion, and we’ve added an analysis addressing your comment. We found that the interaction index capturing the difference between the three groups was stronger for the critical scenes than for the non-critical scenes for almost all DMN ROIs.

      "The role of scene content In the prior analyses, we focused on “critical scenes”, selected based on ratings from four raters who quantified the influence of the twist on the interpretation of each scene (see Methods). An independent post-experiment analysis of the verbal recall behavior of the fMRI participants yielded “twist scores” that were also highest for these scenes; that is, the expected and perceived effect of twist information on recall behavior were found to match. In our next analysis, we asked whether the neural event representations reflect these differences in the twist-related content of the scenes. In other words, are the “critical scenes” with highly twist-dependent interpretations truly critical for our observed effects?

      To answer this question, we re-ran our main encoding-encoding and recall-recall pISC analysis in each DMN ROI (Figure 2-3). We calculated interaction indices (Figure 2C) first by including all scenes, and second by including only the 11 non-critical scenes. To better compare the effect of including different subsets of scenes to our original results, in Figure 4 we show the results in 15 ROIs that exhibited meaningful effects in our main analyses (Figure 2C). Figure 4A demonstrates that “critical scenes” yielded higher interaction indices compared to all scenes or non-critical scenes across all ROIs. The interaction score across all DMN ROIs was significantly higher in “critical scenes” than all scenes (t(23) = 7.19, p = 2.53 × 10⁻⁷) and non-critical scenes (t(23) = 7.3, p = 1.95 × 10⁻⁷). These results show that critical scenes are indeed responsible for the observed pISC differences across groups."
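      As an illustration only, the comparison reported above (t(23)) is consistent with a paired t-test across 24 ROIs, although the response does not state the exact test. The sketch below assumes that reading. The function name `paired_t` and the per-ROI values are hypothetical placeholders, not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, df) for a paired t-test on two equal-length samples.

    Assumption: one value per ROI in each condition, paired by ROI.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("samples must have equal length of at least 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation).
    se = stdev(diffs) / math.sqrt(n)
    if se == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / se, n - 1


# Hypothetical interaction indices for 24 ROIs (placeholders, not real data).
critical = [0.10 + 0.004 * i for i in range(24)]
non_critical = [0.03 + 0.002 * i for i in range(24)]

t_stat, dof = paired_t(critical, non_critical)
print(f"t({dof}) = {t_stat:.2f}")
```

      With 24 paired values the degrees of freedom come out as 23, matching the form of the statistic quoted in the response. Converting t to a p-value would need a t-distribution CDF, which is left out of this sketch.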

      4) I appreciate the code being made available and that the neuroimaging data will be made available soon. I would also appreciate it if the authors made the movie stimulus and behavioral data available. The movie stimulus itself is of interest because it was edited down, and it would be nice for readers to be able to see which scenes were included.

      Unfortunately due to copyright, we cannot share the movie stimulus outright. However, we will share the timing of the cuts used, as well as the time-stamped transcripts of verbal recall.

      To sum up, I think that this is a great experiment with a lot of strengths. The design is fairly clean (especially for a movie stimulus), the analyses are well reasoned, and the data are clear. The only weaknesses I would suggest addressing are with regards to how the DMN is being described and evaluated, and the communication of how this work informs the field on a theoretical level.

    1. Author Response

      Reviewer #1 (Public Review):

      In a very interesting and technically advanced study, the authors measured the force production of curved protofilaments at depolymerizing mammalian microtubule ends using an optical trap assay that they developed previously for yeast microtubules. They found that the magnesium concentration affects this force production, which they argue based on a theoretical model is due to affecting the length of the protofilament curls, as observed previously by electron microscopy. Comparing with their previous force measurements, they conclude that mammalian microtubules produce smaller force pulses than yeast microtubules due to shorter protofilament curls. This work provides new mechanistic insight into how shrinking microtubules exert forces on cargoes such as for example kinetochores during cell division. The experiments are sophisticated and appear to be of high quality, conclusions are well supported by the data, and language is appropriate when conclusions are drawn from more indirect evidence. Given that the experimental setup differs from the previous optical trap assay (antibody plus tubulin attached to bead versus only antibody attached to bead), a control experiment could be useful with yeast microtubules using the same protocol used in the new variant of the assay, or at least a discussion regarding this issue. One open question may be whether the authors can be sure that measured forces are only due to single depolymerizing protofilaments instead of two or more protofilaments staying laterally attached for a while. How would this affect the interpretation of the data?

      This work will be of interest to cell biologists and biophysicists interested in spindle mechanics or generally in filament mechanics.

      Thank you for your careful reading of our manuscript, your kind remarks, and your favorable review.

      Reviewers #1 and #2 both mentioned a concern about potential differences between our previous setup with yeast microtubules, versus our new setup with predominantly bovine microtubules, and whether such differences might underlie the different pulse amplitudes we measured. We think this concern comes mainly from a misunderstanding of how the beads in both setups were tethered to the sides of the microtubules, and we apologize for not making this aspect clearer in our original submission.

      It is true that our new setup requires one additional step, pre-decoration of the anti-His beads with His6-tagged yeast tubulin. However, in both cases, the anti-His antibodies were kept very sparse on the beads to ensure that most beads, if they became tethered to a microtubule, were attached by a single antibody. (~30 pM beads were mixed with 30 pM of anti-His antibody, for a molar ratio of 1:1.) And even though the anti-His beads in our previous work did not undergo a separate incubation step for pre-decoration with tubulin, they undoubtedly were decorated immediately after being mixed into the microtubule growth mix, which in that case included ~1 µM of unpolymerized His6-tagged yeast tubulin dimers. Thus, the arrangement with beads tethered laterally to the sides of microtubules via single antibodies was created in both cases by essentially the same three-step process: First, beads decorated very sparsely with anti-His antibodies were bound to unpolymerized His6-tagged yeast tubulin. Second, a bead-tethered His6-tagged yeast tubulin was incorporated into the growing tip of a microtubule (which could be assembling from either yeast or bovine tubulin, depending on the experiment). Third, the tip grew past the bead to create a large extension. Because the beads in both scenarios were tethered by a single antibody to the same C-terminal tail of yeast β-tubulin, the differences in pulse amplitude cannot be explained by differences in the tethering. In our revised manuscript, we now mention explicitly in Results that the beads were tethered by single antibodies (lines 95 to 100). In Methods we significantly expanded the section about preparation of beads and how they became tethered (lines 365 to 393). [We refer here, and below, to line numbers when the document is viewed with “All Markup” shown.]

      You also raise an interesting, open question: Do protofilaments curl outward entirely independently of their lateral neighbors? Or under some conditions might they tend to stay laterally associated during the curling process, perhaps curling outward in pairs rather than as individual protofilaments? We cannot formally rule out the possibility that such lateral associations sometimes persist during protofilament curling. However, changes in lateral association seem unlikely to explain the magnesium- and species-dependent differences we measured in pulse amplitude, for several reasons: First, there is good evidence for lengthening of protofilament curls at disassembling tips (e.g., Mandelkow 1991, Tran & Salmon 1997), but we are not aware of convincing evidence for magnesium or species-dependent increases in the propensity of curling protofilaments to remain laterally associated. Second, an increase in lateral association should increase the effective flexural rigidity of the curls, but under all the conditions we examined, pulse enlargement was associated with a steepening of the amplitude-vs-force relation – i.e., with softening, not stiffening. Our model indicates that this softening can be fully explained by an increase in protofilament contour length, without any change in the intrinsic flexural rigidity of the protofilament curls.

      Reviewer #2 (Public Review):

      Microtubules are regarded as dynamic tracks for kinesin and dynein motors that generate force for moving cargoes through cells, but microtubules also act as motors themselves by generating force from outward splaying protofilaments at depolymerizing ends. Force from depolymerization has been demonstrated in vitro and is thought to contribute to chromosome movement and other contexts in cells. Although this model has been in the field for many years, key questions have remained unanswered, including the mechanism of force generation, how force generated might be regulated in cells, and how this system might be tuned across cellular contexts or organisms. The barrier is that we lack an understanding of experimental conditions that can be used to control protofilament shape and energetics. This study by Murray and colleagues makes an important advance towards overcoming that barrier.

      This study builds on previous work from the authors where they developed a system to directly measure forces generated by outward curling protofilaments at depolymerizing microtubule ends. That study showed for the first time that protofilaments act like elastic springs and related the generated force to the estimated energy contained in the microtubule lattice. Furthermore, they showed that slowing polymerization rate did not diminish force generation. That study used recombinant yeast tubulin, including a 6x histidine tag on beta tubulin that created attachment points for the bead on the microtubule lattice. The current study extends that system to show that work output is related to the length of protofilament curls.

      We are grateful for your very thoughtful and thorough review, which has helped us improve our manuscript.

      Murray and colleagues show this by manipulating curls in two ways - using bovine brain tubulin instead of yeast tubulin and altering magnesium concentration. Previous EM studies indicated that protofilaments on depolymerizing bovine microtubules have similar curvature but are shorter. The authors here use a blend of bovine brain tubulin and bead-linked recombinant yeast tubulin with the 6x histidine tag in their in vitro system and find smaller deflections of the laser-trapped bead than previously observed with pure yeast tubulin. A concern with comparing this heterogeneous bovine/yeast system to the previous work with homogeneous yeast tubulin is that density of 6x histidine-tagged tubulin subunits is likely to be different between the two systems. Also, the rate of incorporation of 6x histidine yeast tubulin into bovine microtubules in the current study may be different from the rate of incorporation into yeast microtubules in the previous study. These differences could lead to changes in the strength of bead attachment to the microtubule lattice and alter the compliance of the bead to deflection by curling protofilaments. These possibilities and lattice attachment strength are not explored in this study, raising concerns about comparing the two systems.

      Reviewers #1 and #2 both mentioned a concern about potential differences between our previous setup with yeast microtubules, versus our new setup with predominantly bovine microtubules, and whether such differences might underlie the different pulse amplitudes we measured. As detailed in our response to Reviewer #1 above, we think this concern comes mainly from a misunderstanding of how the beads in both setups were tethered to the sides of the microtubules, and we apologize for not making this aspect clearer in our original submission. For both our yeast and bovine microtubule experiments, the anti-His antibodies were kept very sparse on the beads to ensure that most beads, if they became tethered to a microtubule, were attached by a single antibody. Because the beads in both scenarios were tethered by a single antibody to the same C-terminal tail of yeast β-tubulin, the differences in pulse amplitude cannot be explained by differences in the tethering. In our revised manuscript, we now mention explicitly in Results that the beads were tethered by single antibodies (lines 95 to 100). In Methods we significantly expanded the section about preparation of beads and how they became tethered (lines 365 to 393).

      The authors go on to show that magnesium increases bead deflection and work output from the system. The use of magnesium was motivated by earlier studies which showed that increasing magnesium speeds up depolymerization and increases the lengths of protofilament curls. The use of magnesium here provides the first evidence that work output can be tuned biochemically. This is an important finding. The authors then go on to show that the effect of magnesium on bead deflection can be separated from its effect on depolymerization speed. They do this by proteolytically removing the beta tubulin tail domain, which previous studies had shown to be necessary to mediate the magnesium effect on depolymerization rate. The authors arrive at a conclusion that magnesium must promote protofilament work output by increasing their lengths. How magnesium might do this remains unanswered. The mechanistic insight from the magnesium experiments ends there, but the authors discuss possible roles for magnesium in strengthening longitudinal interactions within protofilaments or perhaps complexing with the GDP nucleotide at the exchangeable site, although that seems less likely at the concentrations in these experiments.

      The major conclusion of the study is the finding that work output from curling protofilaments is a tunable system. The examples here demonstrate tuning by tubulin composition and by divalent cations. Whether these examples relate to tuning in biological systems will be an important next question and could expand our appreciation for the versatility of depolymerizing microtubules as a motor.

      We fully agree that two very important next questions are whether work output from curling protofilaments is truly harnessed in vivo, and whether protofilament properties in vivo might be actively regulated for this purpose. Based on your recommendations, and as detailed below (under Major point 2), we have expanded our discussion of these possibilities in our revised manuscript.

      Reviewer #3 (Public Review):

      The authors used a previously established optical tweezers-based assay to measure the regulation of the working stroke of curled protofilaments of bovine microtubules by magnesium. To do so, the authors improved the assay by attaching bovine microtubules to trapping beads through an incorporated tagged yeast tubulin.

      The assay is state-of-the-art and provides a direct measurement of the stroke size of protofilaments and its dependence on magnesium.

      The authors have achieved all their goals and the manuscript is well written.

      The reported findings will be of high interest for the cell biology community.

      Thank you for reading and evaluating our manuscript. We are grateful for your positive comments.

    1. I don't mean shock as in bad news or brutal murder or horrific catastrophe or embarrassing scandal

      i took this explanation as art can ignite us. it can show something different to us. it sparks new ideas, visions, or perspectives. it can instill us with feelings that are so different than what we might’ve expected. art can be so raw that the way we absorb it may not be what people would normally think they would experience from art

    1. But here are three ways that we should think about addressing this issue: Start with parent training. Parents need to be made aware of the negative impact of the video games they may be letting their children play. I get that sometimes we need to occupy our kids, and it’s very tempting to hand them a phone. But we need to be better gatekeepers. It’s hard to change a behavior if you can’t first measure it. Use tools, such as Apple’s Screen Time or Google’s Digital Wellbeing, to create awareness of just how much time you or your children are spending on games — you’ll be surprised. Finally, strike a balance. Games can be fun, of course; we just need to find moderation. When I was growing up, my parents pushed me to eat more vegetables and fruits. With technology so integral to our lives, we need to treat digital wellness like physical wellness and make sure we encourage behavior that’s good for us.

      In these paragraphs pathos is used, and more specifically scare tactics: the reader is led to realize that they should take action to lessen addiction to gaming, after which the author lists some ways to prevent or lessen gaming addiction.

    1. But it was also a cultural moment, reflecting style and attitude as much as ideology or policy positions. Both Magaziner and his audience knew that as a government bureaucrat, he was acting against type. We do not expect government officials to so willingly “get out of the room,” and generally when an official does not keep an eye on things, it is interpreted as an abdication of responsibility, not a heroic move.

      I think it may sometimes be even harder to abandon control than to come up with a new policy. Introducing regulation for the Internet was not the same as the regulation of other media, and it definitely required extensive discussion and research. However, the fact that Magaziner was able to leave the conference after initiating the discussion demonstrates an important trend of more flexible regulations.

    1. The moral issue I notice is that the overall group, which is made up of middle-class caucasian students, is having an issue with having a person of color on the board. The reason I came to that decision is because historically they mention that they at times are lenient on the guidelines of having someone override the process to select new members for the team. Everyone seemed to be in agreement that Reuben was a great candidate to make the decision on his own. Reuben selected a candidate that he found social interests in diversity by expressing their involvement in other programs related to diversity. The coalition members also gave Reuben some options but those were not the people he had chosen. His choice was Jameela because he was interested in the fact that she was a generational first to go to school, she had interest in women's studies and the LGBTQ committee, and she stated that she wasn't afraid of challenges. I feel that there may be some closeted angst against diversity even though they are seated on the board. I feel like maybe they feel like having a person of color on the board may be uncomfortable to them as the majority are middle class white kids. For me I think that Reuben was taking part in deontological ethics. This means, "believe that we ought to base our choices on our duty to follow universal truths that we discover through our intuition or reason." (Hackman and Johnson). I feel that Reuben felt that he was doing the right thing based on what his firm beliefs are for their group. He followed through thinking it was the right decision.

    1. Author Response

      Reviewer #1 (Public Review):

      Auwerx et al. have taken a new approach to mine large existing datasets of intermediary molecular data between GWAS and phenotype, with the aim of uncovering novel insight into the molecular mechanisms which lead a GWAS hit to have a phenotypic effect. The authors show that you can get additional insight by integrating multiple omics layers rather than analyzing only a single molecular type, including a handful of specific examples, e.g. that the effect of SNPs in ANKH on calcium is mediated by citrate. Such additional data is necessary because, as the authors point out, while we have thousands of SNPs with significant impact on phenotypes of interest, we often don't know the mechanism at all, given that the majority of significant SNPs found through GWAS are in non-coding (and often intergenic) regions.

      This paper shows how one can mine large existing datasets to better estimate the cellular mechanism of significant, causal SNPs, and the authors have proven that by providing insight into the links between a couple of genes (e.g. FADS2, TMEM258) and metabolite QTLs and consequent phenotypes. There is definitely a need and utility for this, given how few significant SNPs (and even fewer recently-discovered ones) hit parts of the DNA where the causal mechanism is immediately obvious and easily testable through traditional molecular approaches.

      I find the paper interesting and it provides useful insight into a still relatively new approach. However, I would be interested in knowing how well this approach scales to the general genetics community: would this method work with a much smaller N (e.g. n = 500)? Being able to make new insights using cohorts of nearly 10,000 patients is great, but the vast majority of molecular studies are at least an order of magnitude smaller. While sequencing and mass spectrometry are becoming exponentially cheaper, the issue of sample size is likely to remain for the foreseeable future due to the challenges and expenses of the initial sample collection.

      We thank the reviewer for his assessment and have now addressed – in the revised version of the manuscript, as well as in the below point-by-point reply – his specific comments/questions.

      Reviewer #2 (Public Review):

      Auwerx et al. present a framework for the integration of results from expression quantitative trait loci (eQTL), metabolite QTL (mQTL) and genome-wide association (GWA) studies based on the use of summary statistics and Mendelian Randomization (MR). The aim of their study is to provide the field with a method that allows for the detection of causal relationships between transcript levels and phenotypes by integrating information about the effect of transcripts on metabolites and the downstream effect of these metabolites on phenotypes reported by GWA studies. The method requires the mapping of identical SNPs in disconnected mQTL and eQTL studies, which allows MR-based inference of a causal effect from a transcript to a metabolite. The effect of both transcripts and metabolites on phenotypes is evaluated in the same MR-based manner by overlaying eQTL and mQTL SNPs with SNPs present in phenotypic GWA studies.

      The aim of the presented approach is two-fold: (1) to allow identification of additional causal relationships between transcript levels and phenotypes as compared to an approach limited to the evaluation of transcript-to-phenotype associations (transcriptome-wide MR, TWMR) and (2) to provide information about the mechanism of effects originating from causally linked transcripts via the metabolite layer to a phenotype.

      The study is presented in a very clear and concise way. In the part based on empirical study results, the approach leads to the identification of a set of potential causal triplets between transcripts, metabolites and phenotypes. Several examples of such causal links are presented, which are in agreement with literature but also contain testable hypotheses about novel functional relationships. The simulation study is well documented and addresses an important question pertaining to the approach taken: Does the integration of mQTL data at the level of a mediator allow for higher power to detect causal transcript to phenotype associations?

      We thank the reviewer for his/her assessment and have now addressed – in the revised version of the manuscript, as well as in the below point-by-point reply – his/her specific comments/questions.

      Major Concerns

      1) Our most salient concern regarding the presented approach is the presence of multiple testing problems. In the analysis of empirical datasets (p. 4), the rationale for setting FDR thresholds is not clearly stated. While this appears to be a Bonferroni-type correction (p-value threshold divided by number of transcripts or metabolites tested), the thresholds do not reflect the actual number of tests performed (7883 transcripts times 453 metabolites for transcript-metabolite associations, 87 metabolites or 10435 transcripts times 28 complex phenotypes). The correct and more stringent thresholds certainly decrease the overlap between causal relationships and thus reduce the identifiable number of causal triplets. Furthermore, we believe that multiple testing has to be considered for correct interpretation of the power analysis. The study compares the power of a TWMR-only approach to the power of mediation-based MR by comparing "power(TP)" against "power(TM) * power(MP)" (p. 12). This comparison is useful in a hypothetical situation given data on a single transcript affecting a single phenotype, and with potential mediation via a single metabolite. However, in an actual empirical situation, the number of non-causal transcript-metabolite-phenotype triplets will exceed the number of non-causal transcript-phenotype associations due to the multiplication with the number of metabolites that have to be evaluated. This creates a tremendous burden of multiple testing, which will very likely outweigh the increase in power afforded by the mediation-based approach in the hypothetical "single transcript-metabolite-phenotype" situation described here. Thus, for explorative detection of causal transcript-phenotype relationships, the TWMR-only method might even outperform the mediation-based method described by the authors, simply because the former requires a smaller number of hypotheses to be tested compared to the latter. The presented simulation would only hold in cases where a single path of causality with a known potential mediator is to be tested.
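      To make the scale of the multiple-testing burden raised in this comment concrete, the sketch below counts the tests implied by the numbers quoted by the reviewer and derives a Bonferroni-type threshold for each family. The nominal level of 0.05 and all names in the code are assumptions for illustration; neither the review nor the manuscript excerpt specifies them.

```python
# Assumed nominal significance level (not stated in the review).
ALPHA = 0.05


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test p-value threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Test counts taken from the figures quoted in the reviewer's comment.
N_TRANSCRIPT_METABOLITE = 7883 * 453   # transcript-metabolite pairs
N_METABOLITE_PHENOTYPE = 87 * 28       # metabolite-phenotype pairs
N_TRANSCRIPT_PHENOTYPE = 10435 * 28    # transcript-phenotype pairs

for label, n in [
    ("transcript-metabolite", N_TRANSCRIPT_METABOLITE),
    ("metabolite-phenotype", N_METABOLITE_PHENOTYPE),
    ("transcript-phenotype", N_TRANSCRIPT_PHENOTYPE),
]:
    print(f"{label}: {n} tests, threshold {bonferroni_threshold(n):.2e}")
```

      The transcript-metabolite family alone comprises several million tests, which is the reviewer's point: a threshold divided only by the number of transcripts or the number of metabolites is far more lenient than one divided by their product.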

      We thank the reviewer for pointing out the multiple testing issue. Based on this comment, we have revised our approach by implementing two major modifications.

      First, we reduce the number of assessed metabolites to 242 compounds for which we were able to identify a Human Metabolome Database (HMDB) identifier through manual curation. This was triggered by the suggestion of reviewer #1 to facilitate the database/literature-based follow-up of our discoveries. The motivation is to test only metabolites that, if found to be significantly associated, would yield interpretable results, thereby reducing the number of tests to be performed. This modification is described in the revised manuscript:

      Results: “Summary statistics for cis-eQTLs stem from the eQTLGen Consortium meta-analysis of 19,942 transcripts in 31,684 individuals [3], while summary statistics for mQTLs originate from a meta-analysis of 453 metabolites in 7,824 individuals from two independent European cohorts: TwinsUK (N = 6,056) and KORA (N = 1,768) [6]. After selecting SNPs included in both the eQTL and mQTL studies, our analysis was restricted to 7,884 transcripts with ≥ 3 instrumental variables (IVs) (see Methods, Supplemental Figure 1) and 242 metabolites with an identifier in The Human Metabolome Database (HMDB) [28] (see Methods, Supplemental Table 1).”

      Methods: “mQTL data originate from Shin et al. [6], which used ultra-high performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) to measure 486 whole blood metabolites in 7,824 European individuals. Association analyses were carried out on ~2.1 million SNPs and are available for 453 metabolites at the Metabolomics GWAS Server (http://metabolomics.helmholtz-muenchen.de/gwas/). Among these metabolites, 242 were manually annotated with Human Metabolome Database (HMDB) identifiers (Supplemental Table 1) and used in this study.”

      Second, to account for all remaining tests, we now select significant causal effects based on FDR < 5% in all performed univariable MR analyses. With 5% FDR on both the transcript-to-metabolite and metabolite-to-phenotype effects, the FDR for triplets is slightly inflated to 9.75% (= $1 - 0.95^2$), a consideration that we now explicitly describe. Note that selecting triplets based on transcript-to-metabolite and metabolite-to-phenotype effects at FDR < 2.5% results in an FDR < 5% ($1 - 0.975^2$) for the triplets. This more stringent threshold identifies 135 causal triplets, 39 of which would be missed by TWMR. Overall, Results and Supplemental Tables have been updated and now read as follows:

      “Mapping the transcriptome onto the metabolome […] By testing each gene for association with the 242 metabolites, we detected 96 genes whose transcript levels causally impacted 75 metabolites, resulting in 133 unique transcript-metabolite associations (FDR 5% considering all 1,907,690 instrumentable gene-metabolite pairs; Supplemental Table 2) […].

      Mapping the metabolome onto complex phenotypes […] Overall, 34 metabolites were associated with at least one phenotype (FDR 5% considering all 1,344 metabolite-phenotype pairs), resulting in 132 unique metabolite-phenotype associations (Supplemental Table 4).

      Mapping the transcriptome onto complex phenotypes […] In total, 5,140 transcripts associated with at least one phenotype (FDR 5% considering all 292,170 gene-phenotype pairs) resulting in 13,141 unique transcript-phenotype associations (Supplemental Table 5).

      Mapping metabolome-mediated effects of the transcriptome onto complex phenotypes […] We combined the 133 transcript-metabolite (FDR ≤ 5%) and 132 metabolite-trait (FDR ≤ 5%) associations to pinpoint 216 transcript-metabolite-phenotype causal triplets (FDR = $1 - 0.95^2$ = 9.75%) (Supplemental Table 6).”
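
      The triplet-level FDR figures quoted in this reply can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes, as the formula $1 - 0.95^2$ implies, that the two edges of a triplet are filtered independently at the same FDR level, and the function names are hypothetical rather than taken from the authors' code.

```python
def triplet_fdr_bound(edge_fdr: float) -> float:
    """FDR for a triplet when both edges pass the same FDR level.

    Assumption (not stated explicitly in the source): the transcript-to-metabolite
    and metabolite-to-phenotype filters are treated as independent, so the chance
    that both edges are true discoveries is (1 - edge_fdr) ** 2.
    """
    if not 0.0 <= edge_fdr <= 1.0:
        raise ValueError("edge_fdr must lie in [0, 1]")
    return 1.0 - (1.0 - edge_fdr) ** 2


def edge_fdr_for_target(triplet_fdr: float) -> float:
    """Per-edge FDR level that yields the requested triplet-level FDR."""
    if not 0.0 <= triplet_fdr <= 1.0:
        raise ValueError("triplet_fdr must lie in [0, 1]")
    return 1.0 - (1.0 - triplet_fdr) ** 0.5


if __name__ == "__main__":
    for q in (0.05, 0.025):
        print(f"edge FDR {q:.3f} -> triplet FDR {triplet_fdr_bound(q):.6f}")
    print(f"edge FDR needed for 5% triplet FDR: {edge_fdr_for_target(0.05):.6f}")
```

      Under this assumption, a per-edge level of 5% gives the 9.75% stated above, and a per-edge level of 2.5% keeps the triplet-level FDR just below 5%.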

      In the simulations performed for the power analysis, we used a Bonferroni correction. We ran each simulation for 500 transcripts, measuring 80 metabolites at each run, and performed TWMR and MWMR. The power of TWMR was calculated by counting how many times we obtained p-values ≤ 0.05/500. The power of the mediation analysis was calculated as $\mathrm{power}_{TM} \times \mathrm{power}_{MP}$, where $\mathrm{power}_{TM}$ was calculated by counting how many times we obtained p-values ≤ 0.05/(500 × 80), and $\mathrm{power}_{MP}$ was calculated by counting how many times we obtained p-values ≤ 0.05/80. In the revised manuscript, we additionally repeated each simulated scenario 10 times to increase the robustness of the results. This has been clarified in both the Methods and Results sections of the revised manuscript:

      Methods: “Ranging 𝜌 and 𝜎 from -2 to 2 and from 0.1 to 10, respectively, we ran each simulation for 500 transcripts measuring 80 metabolites at each run and performed TWMR and MWMR starting from the above-described $\beta_{eQTL}$, $\beta_{mQTL}$ and $\beta_{GWAS}$. For each MR analysis we calculated the power to detect a significant association as well as the difference in power between TWMR and the mediation analyses (i.e., $\mathrm{power}_{TP} - \mathrm{power}_{TM} \times \mathrm{power}_{MP}$). Each specific scenario was repeated 10 times and the average difference in power across simulations was plotted as a heatmap.”

      Results: “To characterize the parameter regime where the power to detect indirect effects is larger than it is for total effects, we performed simulations using different settings for the mediated effect. In each scenario we evaluated 500 transcripts and 80 metabolites and varied two parameters characterizing the mediation: a. the proportion (𝜌) of direct ($\alpha_{d}$) to total ($\alpha_{TP}$) effect (i.e., effect not mediated by the metabolite) from -2 to 2 to cover the cases where direct and mediated effect have opposite directions (51 values); b. the ratio (𝜎) between the transcript-to-metabolite ($\alpha_{TM}$) and the metabolite-to-phenotype ($\alpha_{MP}$) effects, exploring the range from 0.1 to 10 (51 values). Transcripts were simulated with 6% heritability (i.e., median $h^2$ in the eQTLGen data) and a causal effect of 0.035 (i.e., ~65% of power in TWMR at $\alpha$ = 0.05) on a phenotype. Each scenario was simulated 10 times and results were averaged to assess the mean difference in power (see Methods).”
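
      The Bonferroni thresholds and the power comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the constants come from the text (500 transcripts, 80 metabolites, a nominal level of 0.05), whereas the function names and the input format (plain lists of p-values, one per simulated test) are hypothetical.

```python
N_TRANSCRIPTS = 500   # transcripts per simulation run (from the text)
N_METABOLITES = 80    # metabolites per simulation run (from the text)
ALPHA = 0.05          # nominal significance level before correction

# Bonferroni-corrected thresholds quoted in the reply.
THRESHOLDS = {
    "TP": ALPHA / N_TRANSCRIPTS,                    # transcript -> phenotype (TWMR)
    "TM": ALPHA / (N_TRANSCRIPTS * N_METABOLITES),  # transcript -> metabolite
    "MP": ALPHA / N_METABOLITES,                    # metabolite -> phenotype (MWMR)
}


def power(p_values, threshold):
    """Fraction of simulated tests whose p-value is at or below the threshold."""
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p <= threshold for p in p_values) / len(p_values)


def power_difference(p_tp, p_tm, p_mp):
    """power_TP - power_TM * power_MP, the quantity plotted in the heatmap."""
    return (
        power(p_tp, THRESHOLDS["TP"])
        - power(p_tm, THRESHOLDS["TM"]) * power(p_mp, THRESHOLDS["MP"])
    )
```

      A negative value of `power_difference` marks a parameter setting in which the mediation route is better powered than TWMR, even though the transcript-to-metabolite step carries the strictest threshold.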

      2) A second concern regards the interpretation of the results based on the empirical datasets. For the identified 206 transcript-metabolite-phenotype causal triplets, the authors show a comparison between TWMR-based total effect of transcripts on phenotypes and the calculated direct effect based on a multivariable MR (MVMR) test (Figure 2B), which corrects for the indirect effect mediated by the metabolite in the causal triplet. The comparison shows a strong correlation between direct and total effect. A thorough discussion of the potential reasons for deviation (in both negative and positive directions) from the identity line is missing.

      Deviation from the identity line, as observed in Figure 2B, indicates that while there is a strong correlation between direct and total effect, it is not perfect, and part of the total effect is due to an indirect effect mediated by metabolites. This is explained and discussed in the Results and Discussion section:

      Results: “Regressing direct effects ($\alpha_{d}$) on total effects ($\alpha_{TP}$) (Figure 2A), we estimated that for our 216 mediated associations, 77% [95% CI: 70%-85%] of the transcript effect on the phenotype was direct and thus not mediated by the metabolites (Figure 2B).”

      Discussion: “The observation that 77% of the transcript’s effect on the phenotype is not mediated by metabolites suggests that either true direct effects are frequent or that other unassessed metabolites or molecular layers (e.g., proteins, post-translational modifications, etc.) play a crucial role in such mediation. It is to note that in the presence of unmeasured mediators or measured mediators without genetic instruments, our mediation estimates are lower bounds of the total existing mediation. […] Thanks to the flexibility of the proposed framework, we expect that in the future and upon availability of ever larger and more diverse datasets, our method could be applied to estimate the relative contribution of currently unassessed mediators in translating genotypic cascades.”

      Furthermore, no test of significance for potential cases of mediation is presented. Due to the issues of multiple testing discussed above, the significance of the inferred cases of mediation is drawn into question. The examples presented for causal triplets (involving the ANKH and SLC6A12 transcripts) feature transcripts with low total effects and a small ratio between direct and total effect, in line with the power analysis. However, in these examples, the total effects are also quite low. Its significance has to be tested with an appropriate statistical test, incorporating multiple testing correction.

      Following the reviewer’s suggestion, we have modified our criteria to call significant associations to account for multiple testing (see extensive reply to major concern #1). With 5% FDR on both the transcript-to-metabolite and metabolite-to-phenotype effects, the FDR for triplets is slightly inflated to 9.75% (= $1 - 0.95^2$). We mention this limitation in the revised manuscript:

      “We combined the 133 transcript-metabolite (FDR ≤ 5%) and 132 metabolite-trait (FDR ≤ 5%) associations to pinpoint 216 transcript-metabolite-phenotype causal triplets (FDR = $1 - 0.95^2$ = 9.75%) (Supplemental Table 6).”

      All examples presented in the original manuscript remained significant. The fact that the total effect in these examples is low makes them particularly interesting, as it highlights how our approach can detect biologically plausible associations between a transcript and a phenotype that only show mild evidence through TWMR but are strongly supported when accounting for metabolites that mediate the transcript-phenotype relation, showcasing situations in which our method can provide a true advantage over classical approaches such as TWMR. Such examples may emerge due to opposite-signed direct and indirect effects, which cancel each other out when it comes to testing total effects. What is key is that we do not claim the total and the mediated effects to be different (as we would have very limited power to do so), but simply point out that under certain settings we are better powered to detect mediated effects than total ones. In the ANKH example (more details below), the total ANKH-calcium effect is almost exactly the same as the product of the $\alpha_{\mathrm{ANKH} \to \mathrm{citrate}}$ and $\alpha_{\mathrm{citrate} \to \mathrm{calcium}}$ effects; the latter two are simply detectable, while the total effect is not.

      In the revised manuscript the case for our selected examples is made even stronger thanks to an analysis proposed by Reviewer #1 that aimed at estimating the proportion of previously reported associations through automated literature review. For instance, while our literature review found previously reported evidence of the ANKH-calcium link and of the ANKH-citrate link, we did not identify any publication mentioning all 3 terms in combination in the abstract and/or title, illustrating how our approach can establish bridges between knowledge gaps. We revised the Results section describing the ANKH example accordingly:

      “The 126 triplets that were not identified through TWMR due to power issues represent putative new causal relations. This is well illustrated by a proof-of-concept example involving ANKH [MIM: 605145] and calcium levels, for which 48 publications were identified through automated literature review (Supplemental Table 6). While the TWMR effect of ANKH expression on calcium levels was not significant ($\alpha_{\mathrm{ANKH} \to \mathrm{calcium}}$ = −0.02; 𝑃 = 0.03), we observed that ANKH expression decreased citrate levels ($\alpha_{\mathrm{ANKH} \to \mathrm{citrate}}$ = −0.30; 𝑃 = 2.2 × 1089:), which itself increased serum calcium levels ($\alpha_{\mathrm{citrate} \to \mathrm{calcium}}$ = 0.07; 𝑃 = 6.5 × 108;9). Mutations in ANKH have been associated with several rare mineralization disorders [MIM: 123000, 118600] [32] due to the gene encoding a transmembrane protein that channels inorganic pyrophosphate to the extracellular matrix, where at low concentrations it inhibits mineralization [33]. Recently, a study proposed that ANKH instead exports ATP to the extracellular space (which is then rapidly converted to inorganic pyrophosphate), along with citrate [34]. Citrate has a high binding affinity for calcium and influences its bioavailability by complexing calcium-phosphate during extracellular matrix mineralization and releasing calcium during bone resorption [35]. Together, our data support the role of ANKH in calcium homeostasis through regulation of citrate levels, connecting previously established independent links into a causal triad.”
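
      The statement that the total ANKH-calcium effect nearly equals the product of the two mediated steps can be verified from the rounded estimates quoted above. The snippet below is a worked check only: it assumes the usual additive decomposition (total = direct + indirect), and the variable names are illustrative.

```python
# Rounded point estimates quoted in the revised Results text.
alpha_total = -0.02   # TWMR effect of ANKH expression on calcium
alpha_tm = -0.30      # ANKH expression -> citrate
alpha_mp = 0.07       # citrate -> calcium

# Indirect (mediated) effect: product of the two steps along the path.
indirect = alpha_tm * alpha_mp

# Assumption: total = direct + indirect, so the remainder is the implied
# direct effect. With rounded inputs this is only a rough figure.
implied_direct = alpha_total - indirect

print(f"indirect effect via citrate: {indirect:.3f}")
print(f"implied direct effect:       {implied_direct:.3f}")
```

      With these inputs the mediated effect accounts for essentially all of the total effect, which is consistent with the authors' point that the two individual steps are detectable while the small total effect is not.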

      Furthermore, the analysis of the empirical data indicates that the ratio between direct and indirect effect of a transcript on a phenotype is in most cases close to identity, except for triplets with low total effects. This fact should be considered in the power analysis, which assigned the highest gain in power by the mediation analysis to cases of low direct to total effect ratio. The empirical data indicate that these cases might be rare or of minor relevance for the tested phenotypes.

      As our previous power analyses did not fully reflect scenarios observed from empirical data, we extended the range of covered 𝜌 (i.e., the ratio between direct and total effect), so that it mimics more closely the observed range of 𝜌. In the revised manuscript, 𝜌 varies from -2 to 2, so that we also consider configurations where direct and total effects have opposite directions. To give readers a rough idea of how frequently the different parameter combinations occur in real data, we now provide another heatmap indicating the density of detected associations in those parameter regimes as Supplemental Figure 4.

      This map can be brought in perspective of Figure 4A that illustrates the power of TWMR vs. mediation analysis over the same range of parameter settings.

      It becomes apparent from Supplemental Figure 4 that in real data, 𝜎 is always larger than 1 and often exceeds 10. Note, however, that this heatmap must be interpreted with care, since the “detected” density will be low in regions where both methods have low power.

      3) Related to the interpretation of causal links: horizontal pleiotropy needs to be considered. The authors report the identification of causal links between TMEM258, FADS1 and FADS2, arachidonic acid-derived lipids and complex phenotypes. However, they also mention the high degree of pleiotropy due to linkage disequilibrium at the underlying eQTL and mQTL region as well as the network of over 50 complex lipids known to be associated with the expression of the above transcripts. Thus, it seems possible that the levels of undetected lipid species may be more important for the phenotypic effect of variation in these transcripts and that the reported "mediators" are rather covariates. Such horizontal pleiotropy would violate a basic assumption of the MR approach. While we think that this does not invalidate the approach altogether, it does affect the interpretation of specific metabolites as mediators. This is aggravated by the fact that metabolic networks are more tightly interconnected than macromolecular interaction networks (assortative nature of metabolic networks) and that single point-measurements of metabolites may not be generally informative about the flux through a specific metabolic pathway.

      This is a valid point and we discuss this limitation in the revised Discussion:

      “It is to note that in the presence of unmeasured mediators or measured mediators without genetic instruments, our mediation estimates are lower bounds of the total existing mediation. In addition, unmeasured mediators sharing genetic instruments with the measured ones can modify result interpretation, as some of the observed mediators may simply be correlates of the true underlying mediators. While this is a limitation of all MR methods, metabolic networks may harbor a particularly large number of genetically correlated metabolite species.”

    2. Reviewer #2 (Public Review):

      Auwerx et al. present a framework for the integration of results from expression quantitative trait loci (eQTL), metabolite QTL (mQTL) and genome-wide association (GWA) studies based on the use of summary statistics and Mendelian Randomization (MR). The aim of their study is to provide the field with a method that allows for the detection of causal relationships between transcript levels and phenotypes by integrating information about the effect of transcripts on metabolites and the downstream effect of these metabolites on phenotypes reported by GWA studies. The method requires the mapping of identical SNPs in disconnected mQTL and eQTL studies, which allows MR-based inference of a causal effect from a transcript to a metabolite. The effect of both transcripts and metabolites on phenotypes is evaluated in the same MR-based manner by overlaying eQTL and mQTL SNPs with SNPs present in phenotypic GWA studies.

      The aim of the presented approach is two-fold: (1) to allow identification of additional causal relationships between transcript levels and phenotypes as compared to an approach limited to the evaluation of transcript-to-phenotype associations (transcriptome-wide MR, TWMR) and (2) to provide information about the mechanism of effects originating from causally linked transcripts via the metabolite layer to a phenotype.

      The study is presented in a very clear and concise way. In the part based on empirical study results, the approach leads to the identification of a set of potential causal triplets between transcripts, metabolites and phenotypes. Several examples of such causal links are presented, which are in agreement with literature but also contain testable hypotheses about novel functional relationships. The simulation study is well documented and addresses an important question pertaining to the approach taken: Does the integration of mQTL data at the level of a mediator allow for higher power to detect causal transcript to phenotype associations?

      Major Concerns

      1. Our most salient concern regarding the presented approach is the presence of multiple testing problems. In the analysis of empirical datasets (p. 4), the rationale for setting FDR thresholds is not clearly stated. While this appears to be a Bonferroni-type correction (p-value threshold divided by number of transcripts or metabolites tested), the thresholds do not reflect the actual number of tests performed (7883 transcripts times 453 metabolites for transcript-metabolite associations, 87 metabolites or 10435 transcripts times 28 complex phenotypes). The correct and more stringent thresholds certainly decrease the overlap between causal relationships and thus reduce the identifiable number of causal triplets. Furthermore, we believe that multiple testing has to be considered for correct interpretation of the power analysis. The study compares the power of a TWMR-only approach to the power of mediation-based MR by comparing "power(TP)" against "power(TM) * power(MP)" (p. 12). This comparison is useful in a hypothetical situation given data on a single transcript affecting a single phenotype, and with potential mediation via a single metabolite. However, in an actual empirical situation, the number of non-causal transcript-metabolite-phenotype triplets will exceed the number of non-causal transcript-phenotype associations due to the multiplication with the number of metabolites that have to be evaluated. This creates a tremendous burden of multiple testing, which will very likely outweigh the increase in power afforded by the mediation-based approach in the hypothetical "single transcript-metabolite-phenotype" situation described here. Thus, for explorative detection of causal transcript-phenotype relationships, the TWMR-only method might even outperform the mediation-based method described by the authors, simply because the former requires a smaller number of hypotheses to be tested compared to the latter. The presented simulation would only hold in cases where a single path of causality with a known potential mediator is to be tested.

      2. A second concern regards the interpretation of the results based on the empirical datasets. For the identified 206 transcript-metabolite-phenotype causal triplets, the authors show a comparison between TWMR-based total effect of transcripts on phenotypes and the calculated direct effect based on a multivariable MR (MVMR) test (Figure 2B), which corrects for the indirect effect mediated by the metabolite in the causal triplet. The comparison shows a strong correlation between direct and total effect. A thorough discussion of the potential reasons for deviation (in both negative and positive directions) from the identity line is missing. Furthermore, no test of significance for potential cases of mediation is presented. Due to the issues of multiple testing discussed above, the significance of the inferred cases of mediation is drawn into question. The examples presented for causal triplets (involving the ANKH and SLC6A12 transcripts) feature transcripts with low total effects and a small ratio between direct and total effect, in line with the power analysis. However, in these examples, the total effects are also quite low. Its significance has to be tested with an appropriate statistical test, incorporating multiple testing correction. Furthermore, the analysis of the empirical data indicates that the ratio between direct and indirect effect of a transcript on a phenotype is in most cases close to identity, except for triplets with low total effects. This fact should be considered in the power analysis, which assigned the highest gain in power by the mediation analysis to cases of low direct to total effect ratio. The empirical data indicate that these cases might be rare or of minor relevance for the tested phenotypes.

      3. Related to the interpretation of causal links: horizontal pleiotropy needs to be considered. The authors report the identification of causal links between TMEM258, FADS1 and FADS2, arachidonic acid-derived lipids and complex phenotypes. However, they also mention the high degree of pleiotropy due to linkage disequilibrium at the underlying eQTL and mQTL region as well as the network of over 50 complex lipids known to be associated with the expression of the above transcripts. Thus, it seems possible that the levels of undetected lipid species may be more important for the phenotypic effect of variation in these transcripts and that the reported "mediators" are rather covariates. Such horizontal pleiotropy would violate a basic assumption of the MR approach. While we think that this does not invalidate the approach altogether, it does affect the interpretation of specific metabolites as mediators. This is aggravated by the fact that metabolic networks are more tightly interconnected than macromolecular interaction networks (assortative nature of metabolic networks) and that single point-measurements of metabolites may not be generally informative about the flux through a specific metabolic pathway.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary

      The authors have set out to study the Drosophila immune response against the fungus Aspergillus fumigatus. They found that Aspergillus fumigatus kills Drosophila Toll pathway mutants. The fungus does this without invasion because its dissemination is blocked by melanization. They suggest that there is a role for Toll in host defense distinct from resistance. The findings are interesting, and it looks like the mycotoxins play a role. It also seems that there is some role of the Bomanins here, but I find that in particular the Figure 4 experiments are not convincing enough to provide a mechanistic insight as to what is going on. I think the authors need to think through what their results mean, and also explain better (especially regarding Fig 4) their ideas and how the data fit them.

      We thank the reviewer for scrutinizing our manuscript as well as for suggestions to improve it.

      The role of mycotoxins is demonstrated:

      i) the fungus does not proliferate nor disseminate, also in Toll pathway mutant flies: thus, it must kill through diffusible substances, in as much as these immuno-deficient flies exhibit tremors toward the end of the infection;

      ii) a fungal strain devoid of the capacity to produce secondary metabolites is no longer virulent, even in Toll pathway mutant flies.

      The role of Bomanins is also demonstrated: the finding of a susceptibility of BomΔ55C deletion flies to A. fumigatus and to mycotoxin challenges clearly shows that at least one or several Bomanin genes are required in the host defense against these challenges. The observation that this susceptibility can be rescued by the genetic overexpression of specific Bomanins indicates which ones are likely to mediate protection. The novel data we have included with the protection from mycotoxin action in neurons point clearly to BomS6 being the major mediator of protection against verruculogen action since it is the only one of two Bom genes to be induced in the head and with a proven potential for rescue of the BomΔ55C phenotype.

      As regards the concept of the article, it is simple: we show that the Toll pathway does not control A. fumigatus infection by directly attacking the fungus but does so by neutralizing the effects of secreted virulence factors such as restrictocin and verruculogen. We further identify some of the relevant effectors such as Bomanins by using a genetic complementation strategy. To make our point clearer, we have now included additional data in which we show that BomS6 and BomS4 are the only Bomanins induced in the head of flies upon the injection of these two toxins. We next determine that BomS6 and not BomS4 expression in the nervous system dominantly protects the flies from the deleterious effects of verruculogen injection, both in terms of recovery from tremors and survival. Mechanistically, the Toll pathway protects the host from the action of verruculogen by expressing and likely secreting BomS6 from neurons.

      Major comments:

      Page 5: .."the fungal burden did not increase much in MyD88 flies challenged with 50 conidia (Fig. 1B)" - What do you mean did not increase much? There is a clear increase in Myd88 mutants compared to controls; would you expect a bigger increase (e.g. log scale induction)? Explain.

      When the injected dose is higher than 50 injected colonies, the fungal burden remains very close to that of the injected inoculum (Fig. EV1F, J). As for other pathogens regulated by the Toll pathway, it has been published that the microbial burden increases by log factors for filamentous fungi (Huang et al., in revision), pathogenic yeasts (e.g., work from our laboratory Quintin et al. Journal of Immunology, 2013), bacteria (e.g., Duneau et al., eLife 2017; Huang et al., in revision). The pathogens usually proliferate exponentially in immuno-deficient hosts, which is clearly not the case of A. fumigatus, the first example we know of.

      Page 6: "the SPZ/Toll/MyD88 cassette is required for host defense against A. fumigatus infections, even though this pathogen only mildly stimulates the Toll pathway." - Should you rather say that A. fumigatus only mildly induces the Toll pathway target gene Drosomycin?

      The answer is negative. Fig. EV1C clearly shows that BomS1 is also modestly induced as compared to an infection with E. faecalis. The promoter of BomS1 contains a canonical Dif-response element (Busse et al., EMBO J., 2007). For a more thorough discussion of this point, please see reply to Reviewer 2, Major Comment 2.

      Page 6: "...we tested Hayan mutant flies defective for this arm of innate immunity (Nam et al., 2012)." - elaborate this, which arm/which pathway?

      The title of the paragraph is “Drosophila melanization curbs A. fumigatus invasion”. The full first sentence of the paragraph actually read: “As melanization is a host defense of insects effective against fungal infections, we tested Hayan mutant flies defective for this arm of innate immunity”.

      This has not been introduced in the introduction. Explain.

      We have now added a couple of lines (82-83) to introduce melanization for the nonspecialist reader.

      Can you really draw this conclusion: "We conclude that melanization limits the proliferation and the dissemination of A. fumigatus injected into wild-type flies yet does not eradicate it at the injection site, where a melanization plug forms." Maybe you can based on the function/importance of the pathway to melanization, but you need to explain.

      Melanization is mediated by the Hayan protease and three phenol oxidases (two in adults) that catalyze the enzymatic reactions leading to melanin production (for Drosophila, please see Nam et al. EMBO J. (2012), Binggeli et al., PLoS Pathogens (2014), Dudzic et al., BMC Biology (2015), Cell Reports, 2019). Thus, finding that there is an increased proliferation and dissemination in null Hayan mutants is a strong indication for a role of melanization. The identification of a similar phenotype for PPO2 and PO1-PPO2 mutants demonstrates that melanization is curbing A. fumigatus. Our sentence is therefore fully justified.

      Page 10: "The cleavage of the 18S RNA was however much less pronounced in wild-type flies as compared to MyD88" - I am not sure what this means. Do you mean 28S?

      We thank the reviewer for pointing out this mistake that has now been corrected.

      And that the 28S peak is lower? Is this a quantitative method?

      The technique is liquid electrophoresis on a microchip. It is both a qualitative and quantitative technique that replaces traditional agarose or polyacrylamide gels.

      Fig. legend: "Arrows show the position of the 28S RNA sarcin fragment" - there are three arrows in both Fig 4E and F; specify which arrows point what.

      The thick arrow is now indicated in the figure legend to correspond to the much smaller sarcin fragment whereas the thin arrows on the graph clearly specify the position of the 28S RNA peaks.

      Based on the results, I am not convinced about the conclusion, that "restrictocin is able to inhibit translation to a detectable degree in vivo, likely through the cleavage of the ribosomal 28S a-sarcin/ricin loop as described in vitro." <- Do you draw this conclusion before doing the actual in vitro experiment, which is described next in the text (The rabbit reticulocyte assay, S2 cells)?

      The existing literature (line 259 for a few selected references) has largely proven that restrictocin cleaves 28S RNA in vitro. We are demonstrating that this also happens in vivo in flies based on the generation of the alpha-sarcin fragment as well as the decreased 28S peaks. Our transgenic approach also indicates that restrictocin blocks translation in vivo. The in vitro approach has been implemented so that we could test the effect of synthetic BomS1 and BomS3 in cell culture. To our knowledge, no one had demonstrated that restrictocin blocks translation in Drosophila cultured cells. It was therefore important to demonstrate it in cell culture using well-characterized in vitro techniques mastered by AT and FM.

      4H: Not sure what should be seen here, is it the darkest band at 0 uM that disappears?

      We have improved the figure and added an arrow to point out to the relevant band on the gel.

      HI & J need more explanation than what is now included in the text or Figure legend, is the conclusion that there is no difference? Write the stats above the Figs 4I & J (n.s.?).

      We have added NS on the figures and made our conclusion clearer (lines 295-298).

      Minor comments:

      It would have helped commenting if the manuscript contained line numbers

      We apologize for having initially provided a version in which lines were not numbered. At the prompting of Review Commons we immediately provided such a version, that was actually used by Reviewer 2.

      Why do you have the title "Hayan" on top of Fig 1F; you don't have this marking system in the other survival curves

      This point has now been addressed and the survival experiments checked for consistency.

      Fig 2A: Can you speculate why MyD88 flies die rapidly at day 10 if you inject PBST (your control)? What would happen to uninjected controls in otherwise the same conditions? (you could include an uninjected control here?)

      We suspect that this is linked to the trauma induced by the injection. Trauma has been shown to impact the homeostasis of the midgut epithelium (Lee & Miura, Current Topics Developmental Biology 2014, Chakrabarti et al., PLoS Genetics (2016)), and we suspect that it may lead to a leakiness of the gut allowing the passage of some bacteria from the gut microbiota that can proliferate in the hemocoel. Hence, we checked axenic and antibiotics-treated MyD88 flies to verify that this limited sensitivity to trauma was not significantly contributing to the phenotypes we describe. It is also linked to the thickness of the needle and the problem is alleviated by using thinner needles.

      The uninjected control is now shown in Fig. EV8E.

      Please, see also the answer to Reviewer 2 Major comment 1.

      Fig 2E: Not sure what would be the best way of presenting the curves - different colors, dotted lines or something? Now if there are too many lines, they are hard to tell apart because the symbols are not that visible. Like in 2E if you want to compare the light red/orange colored lines.

      We agree with the reviewer that the lines are hard to tell apart. This is however not a significant issue since the glip mutants display curves similar to that of the wt A. fumigatus control strain.

      For consistency add the caption also to Fig 3D (I assume it is the same as 3C)

      The caption was present in our version and is present in the revised version.

      For consistency, should you add Verruculogen on top of Fig 3F?

      Same reply as for the previous comment.

      Chronologically, how it is explained in the text, Figs 4A and B are in the wrong order.

      We fully agree with the reviewer. This problem has been addressed in the revised version.

      The quality of Fig 4 is not great, the text is hard to read (too small) and becomes blurry upon magnification.

      We fully agree with the reviewer. This problem has been addressed in the revised version.

      Page 12; "These data then suggest that a process akin to the immune surveillance of core cellular processes first described in C. elegans may also exist in Drosophila" - I think this sentence belongs to the discussion, this is not directly drawn from the results.

      We have followed the reviewer's suggestion and have developed our Discussion paragraph, now entitled “Induction of the expression of specific Bomanin genes upon mycotoxin challenge”.

      Referees cross-commenting

      I think we share many thoughts among all the reviewers.

      The main problem is that the manuscript language is quite strong; from the results many times it is not ok to make such strong statements. Some experiments need further analysis and clarification.

      I think in most cases, this could be achieved by softening the statements and adding more discussion, and not by making new experiments (some may be needed).

      We respectfully disagree with the reviewer on this point. There were obviously some misunderstandings that might be traced to the short format of the initial version. We have now developed the Discussion to clarify our conclusions as suggested by the reviewer.

      Minor things are that experiments are not advancing in a logical order between the text and the figures and there are problems with resolution in some figures.

      Statistics in some figures needs to be added.

      Please, see above.

      Reviewer #1 (Significance):

      The nature of the work is conceptual for the field, to understand the role of the Toll pathway and Bomanins in particular, in this fungal infection model. The work is interesting to a somewhat limited audience, mainly immunologists and in particular, people interested in the Drosophila model for immunity. The work may be interesting conceptually in understanding fungal infections.

      We are not certain that immunologists represent a limited audience. We agree that work on fungal infections is insufficiently funded with respect to the medical importance of these infections, as highlighted in our introduction and Perspective section of the Discussion.

      My expertise: I am a Drosophila immunity researcher with nearly 20 years of experience in working with fly immunity, in particular the Toll and the Imd pathways.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      Xu et al. describe how A. fumigatus kills Toll-deficient fruit flies not by hyperproliferation, but more likely by virulence factors. Melanization is important for suppressing fungal spread. The Bomanin genes have an unknown function, and here the data suggest a reasonably convincing role for Toll in resilience. Overall the manuscript is thorough and presents a diversity of approaches that show Toll and the Bomanins in particular contribute to this resilience effect. The idea that Toll effectors are essential for resilience is interesting as other fly stress response pathways like JAK-STAT are better known for helping the fly cope with damages, while Toll is better known as an antifungal response.

      I believe the study, with some careful considerations added, would add a valuable series of observations to understanding how the host immune system promotes survival after infection. Overall I am quite positive about the results, and the authors have made a significant effort.

      We thank the reviewer for the positive evaluation of our work that actually spans many years of research on the Aspergillus fumigatus Drosophila infection model that is a major topic of our work at the Sino-French Hoffmann Institute of Guangzhou Medical University.

      Any experiment suggestions I make are strictly to improve the confidence in the interpretations of the results, but the language could alternately be softened to address those concerns. My major critique is that the authors repeatedly extend beyond what is shown, and occasionally in defiance of what is shown (if I understand the results correctly).

      We have chosen to perform additional experiments when needed. We have also clarified points where there were obvious misunderstandings by expanding our text that had been written under a very concise format.

      It is not entirely clear what the reviewer has in mind when using the word defiance. We suppose it refers to the work of Scott Lindsay with whom we are in contact. He actually attempted to monitor the C. glabrata burden but did not pursue this line of investigation as he already saw a difference after one hour and he thought that the Toll pathway cannot be induced so rapidly. Actually, David Duneau mentions a time of two to three hours for the Toll pathway to control E. faecalis infections (eLife, 2017) and Sandrine Uttenweiler-Joseph already saw by MALDI-TOF MS an induction of Bomanins and other DIMs at the earliest point tested, six hours (PhD thesis). There is absolutely no critique of the work of the Wasserman laboratory, which has greatly contributed to our understanding of Bomanin functions. Some of our unpublished data clearly point to an AMP role for at least one Bomanin gene against E. faecalis and we certainly do not exclude an AMP role for BomS against C. glabrata. This however does not dismiss the possibility that Bomanins may also have other roles in dealing with microbial toxins. We have been studying Candida infections in Drosophila for many years and have documented the host defense against C. glabrata (Quintin et al., JI, 2013). We do suspect that C. glabrata likely secretes virulence factors that have not been identified so far. We mention this as a possibility and certainly not as a truth. One should remember that investigators were unaware for a long period of the role of Candidalysin, a pore-forming toxin, in C. albicans infections.

      Finally, a dual role as AMP and in protecting from secreted toxins has been clearly shown in the case of mammalian alpha-defensins, which we now describe in our revised Discussion (Kudryashova et al., Immunity, 2014).

      Comments below.

      Major comments:

      1) The language is too strong. Specifically the use of the phrase "anti-toxin" is too generalist, especially as the authors show that their candidate Bomanin does not bind to the toxin directly.

      We have checked all of the submitted documents: the term anti-toxin was never used in this manuscript or in the companion article (we only found “anti” in antimicrobial, antifungal, antibiotics, etc.). We have also never excluded an indirect effect, quite to the contrary, because of the in vitro experiment with restrictocin mentioned by the reviewer and other observations now included (see further below). We use the terms “protection” or “counteract”, which do not carry such a meaning. It would be burdensome for the reader to read each time “counteract or protect from the actions of the toxins or the effects of the toxin”.

      Instead, Toll mutants seem susceptible to damage/stress caused by injury/toxins. MyD88 even show general susceptibility to vehicle controls in Fig3C-D.

      The effects of stress related to the infection conditions and injury are clearly distinct from the much stronger ones exerted by the toxins themselves. As requested by the reviewer further below, we have submitted wild-type and immuno-deficient flies to several stresses such as heat or the injection of hydrogen peroxide or salt solution (Fig. EV8B-E). While the latter did not reveal any difference, MyD88 flies succumbed slightly faster to a strong 37°C stress; in contrast, they survived better to a 29°C exposure, the temperature at which we perform most experiments. However, the difference started to be visible only after some 15 days whereas the time frame in which flies succumb to A. fumigatus or toxin challenges is definitely much shorter by some 10 days. We also note that BomΔ55C mutant flies behave like the isogenized wild-type controls in these assays, further excluding a potential role for general stress sensitivity as a contributor to the effect of toxins.

      As regards DMSO, there is indeed a general mild sensitivity of flies to DMSO, but it does not specifically affect MyD88 mutants (Rebuttal Fig. 1J). We find that this effect is lessened when using thinner needles. Thus, the problem has become minor as we became more experienced. We had checked axenic and antibiotic-treated flies to exclude a contribution from the microbiota. Finally, to uncouple the effects of verruculogen from those of DMSO, we have also challenged flies directly by introducing the powder, using a technique similar to that of the septic injury. While it is quantitatively less accurate, it clearly proves that verruculogen produces the reported effects (Fig. 3C) and was useful to measure Bom and Drosomycin expression by digital PCR in the heads of challenged flies, e.g., Fig. EV6J-K and Figs EV11 & EV12.

      Toll is important for development, so it may be expected that Toll flies could have development defects impacting resilience even if/when Toll flies can survive to adulthood. I don't say this to be too negative on the findings, which are quite convincing. But I am not sure that the phrase "anti-toxin" is right for what is shown.

      We fully agree with the reviewer on this point. We have failed to find RNAi lines that are efficient enough to mimic the Toll pathway phenotype when expressed ubiquitously at the adult stage. However, BomΔ55C mutants do not seem to display a developmental phenotype and display a phenotype similar to that of MyD88 flies. Furthermore, the rescue of the BomΔ55C sensitivity phenotype to mycotoxin challenge is achieved by the overexpression of specific Boms that are induced only at the adult stage, making it unlikely that this sensitivity phenotype reflects a developmental problem, as had been shown to be the case for 18-wheeler that had initially been proposed to encode the IMD pathway receptor.

      A very interesting recent study shows Dif has a role in the synapse of neurons to protect from alcohol sensitivity. Could secreted Bomanins participate? This emphasizes a mechanism through which Toll mutants likely have defective neural development, which could make them stress response defective, especially to things like neurotoxins. See: https://pubmed.ncbi.nlm.nih.gov/35273084/

      We are aware of this study first presented at the 2019 Fly Meeting in Dallas and this author did discuss with the authors of the study. However, we have found that Dif (and Dorsal) mutants are not sensitive to A. fumigatus infections nor to injected mycotoxins, as was the case already for C. glabrata (Quintin et al., JI, 2013).

      Lin et al. (2019) also showed lack of Bomanin secretion from the fat body in Bombardier mutants causes loss of tolerance (resilience?). So does Bomanin disruption increase susceptibility to stresses more generally, rather than specifically fungal toxins? And is this a development role, rather than an immune response role?

      The authors could try to use other stresses (NaCl, oxygen, heat, alcohol) to test the contribution of Bomanins to this resilience, which may reflect defective neural development rather than a role for secreted systemic immune-response peptides.

      Please, see replies above.

      2) The authors present a paradox. On the one hand, A. fumigatus hardly induces Drs/Bomanins (Fig. S1). Yet on the other, they propose that inducible Bomanins protect the fly from mycotoxins. Why do the authors say Toll is hardly induced by A. fumigatus at the start of the study (Fig S1), but later use the same data to argue that Bomanin induction underlies the resilience phenotype (Fig5)?

      The reviewer raises an interesting point. Of note, we have added new data in Fig. EV2B that document that all 55C Bomanin genes, BomS4 excepted, are induced by a systemic infection. There is indeed somewhat of a paradox. The BomΔ55C deletion phenotype clearly establishes that Bomanins play a major role in the protection against mycotoxins and A. fumigatus. The rescue experiments rely on ectopic expression and therefore establish that specific Bomanins can mediate the protective effect. Our data on verruculogen suggest that there might be local inductions, e.g., in the head of BomS6 and BomS4. The brain represents a compartment that is separated from the hemocoel by the blood-brain barrier. We have not been able to generate BomS6 null mutants so far. In this case, the relevant response may not be systemic. We only detect a weak signal for BomS peptides in the hemolymph of unchallenged flies, making it unlikely that a basal expression is important, at least as regards a systemic infection. We cannot however exclude local inductions at the level of tissues. This would not rely on hemocytes as “hemoless” flies are not susceptible to A. fumigatus or toxin challenges. This topic definitely warrants further investigations.

      In Fig 5, it looks like DMSO is nearly identical to A. fumigatus, so can the authors really suggest that equal induction to DMSO is relevant?

      We had stated that an induction of the Bomanins by the injection of DMSO alone precluded us from analyzing the effects of verruculogen on Bom gene expression. We have now bypassed this difficulty through direct challenges by the undissolved powder (Fig. 6J-K, Fig. EV11).

      The authors' discussion of these points would benefit from considering Vaz et al. (2019; Cell Rep) to frame how much PAMP is injected given equal numbers of fungal cells vs. bacterial cells. To me the lower induction by injecting a few fungal cells with much lower surface area to volume ratio means equal microbe mass has exponentially less PAMP in fungal conidia cell walls (2-3um diameter) vs. equal mass of bacteria (0.5-1um diameter).

      We fully agree with the reviewer and now mention that C. glabrata also led to a milder induction of the Toll-mediated humoral response (Quintin et al. JI, 2013). In addition, it has been shown previously that β-(1-3)-glucans, which are sensed by GNBP3 in Drosophila (Gottar et al., Cell, 2006), are concealed by the cell wall (germinating conidia) or hydrophobins (Wheeler et al., PLoS Pathogens, 2006; Aimanianda et al., Nature, 2009). In the case of yeasts, these glucans are accessible only at the budding scar (Gantner et al., EMBO J., 2003).

      Fig S1O is not convincing that Boms alone are present. There is significant noise near Drs in FigS1 infected, which likely saturates the detector before Drs can fly to it. I say this because DIM4 (Daisho) indicates that Toll is strongly induced. The authors should show a larger mass range on the x-axis including peaks of other Toll-induced peptides like the BaramicinA DIM10, DIM12 and DIM13 peptides of their companion paper and DIM14 (Daisho), which are closer in mass to the Bomanins and less likely disrupted by the noise at 4300 m/z. The MALDI-TOF calibration to correct ranges is critical for arguments of quantification.

      We provide the primary data in the Rebuttal figures at the end of this document. These are the results obtained from three single flies (Files A29683PBUG22, A29684PBUG23 and A29684PBUG24). The first three spectra correspond to the full scale based on the major peaks observed (DIM4/BomS5) in two out of three spectra. At this scale, no signal is visible for Drosomycin at 4891 and the “noise” at 4278 is modest. Next, the multi-spectra report makes it possible to display all three samples on the same sheet, this time zooming in on the peaks of interest in the region 4300 (“noise”) and 4891 (Drosomycin). Finally, the next two pages zoom in on the BomS peptide signals and the next page keeps the same scale to document the 4300-5000 region. On the last page, it is obvious that the signal around 4300 is very modest and too distant to influence the Drosomycin ion, thereby excluding any effect of suppression. Of note, in the systemic immune response, Drosomycin is the most induced AMP with a concentration estimated to be around 0.3µM, an order of magnitude higher than other AMPs. Finally, these experiments have been performed by PB who initially developed the technique (Uttenweiler-Joseph, PNAS, 1998) and has been using and developing it ever since.

      Combined with comments in Major Concern 1, I am not convinced that the -inducible- Bomanin response mediates the resilience phenotype.

      Besides our replies above, we do hope that the new data we have included in Fig. 6 that document an induction of only two BomS genes in the heads of Drosophila upon verruculogen and the finding that BomS6 expression in the nervous system protects the fly from the effects of verruculogen will convince this reviewer.

      3) The author's language is very strong to disregard a possible antimicrobial activity.

      As noted above, this is a misunderstanding that we hope is dispelled in the revised discussion (see also above and replies to Reviewer 1).

      Previous studies showed increased Candida growth and decreased hemolymph killing activity in Bom55C flies (Lindsay et al. 2018 and Hanson et al. 2019).

      Please, see reply above. Factually, Lindsay et al. did not study the C. glabrata titer in vivo but using collected hemolymph. The killing activity likely requires a cofactor regulated by the Toll pathway. Hanson investigated the burden of the dimorphic C. albicans pathogen that in flies is filamentous and not C. glabrata.

      Also see minor concern (i).

      I grant that the data are consistent with a resilience role. However the authors found no binding of Bomanin to restrictocin, countering their idea of a -direct- anti-toxin effect.

      We are surprised by this comment. We certainly did not favor this idea nor developed it in the original manuscript, even though we cannot formally exclude it at this stage. Future experiments will focus on BomS6 potential interactions with these two mycotoxins.

      At present the authors cannot rule out a direct antimicrobial role, or even the possibility of two different roles for the same peptides (ex: one in resilience, one antimicrobial). For instance, it is difficult to explain the loss of killing activity of Bom-deficient hemolymph ex vivo from Lindsay et al. if Bomanins are strictly anti-toxins. Surely they must also do something generalist?

      Please, see our replies above and the paragraph dedicated to this topic in the Discussion.

      4) In most figures, the authors do not compare flies with shared genetic backgrounds.

      The MyD88 allele we are using is a transposon insertion from the Exelixis collection and we are using the wA5001 strain that was used to generate the collection of insertions (Thibault et al., Nat. Genetics 2004). We thank the reviewer for this comment as we realized we had forgotten to mention the BomΔ55C strain. Lines 603-604 state that the deficiency line has been isogenized in the wA5001 background.

      The phenotypes are usually strong so I am not concerned.

      However the rescue effect of Bom transgenes in Fig 5C-D is based on smaller differences. Were these genetic backgrounds controlled?

      Yes, as much as we reasonably could. The fact that most BomS transgenes did not rescue gives further confidence in the data.

      Were transgenes inserted at the same site?

      We used the strategy for overexpression developed by the Basler laboratory (Bischof et al., Development 2013, Nat. Protocols 2014) that relies on insertions at the same site.

      The authors seemingly used a heat shock to express transgenes.

      Heat-shocks are usually a short exposure to higher temperatures, usually 37°C. Here, we have used the inducible Gal4-Gal80ts system developed by McGuire and Davis (Trends in Genetics, 2004). The Gal80 repressor inhibits Gal4 function at the permissive temperature (18°C) and becomes inactive at the restrictive temperature (29°C). Thus, we use a temperature shift and not a bona fide heat shock.

      Given a resilience effect is being studied, this heat stress approach is sub-optimal. Earlier experiments showing effect/no effect of Bomanin on heat shock resilience would improve confidence here. I would recommend assaying temperatures that can kill wild-type in order to confirm that Bom do not succumb earlier (ex. up to 37°C).

      The results have been discussed above and show that 29°C is not a concern for BomΔ55C and only a minor one as regards MyD88.

      In Fig5C the time resolution is poor, and the effect inconsistent across Bomanins. What are the differences in the Bomanins that the authors suspect could cause this? And how consistent are the experiments?

      We provide all the primary survival data in Rebuttal Fig. 1 A-H. The partial protection effects of BomBc1, BomS3 and BomS6 against restrictocin are consistent in the three independent experiments (Fig. 5D and Rebuttal Fig. 1 A-B). As regards the seven independent experiments performed with verruculogen, we observed a strong protection conferred by BomS6 expression in six experiments whereas we detected a milder protection conferred by BomS1 in four out of seven experiments and no protection in the three other ones. The effects were always there after 24 hours, in keeping with our novel data showing that BomS6 expression allows a faster recovery, around 10 hours, from verruculogen-induced tremors (Fig. 6E-F).

      Since the effect is finished by 24h, perhaps a boxplot of percent survival at this time would better show the consistency across experiments.

      Given the argument presented just above and considering that this rebuttal letter will be published alongside the article, this may not be needed.

      Minor concerns:

      i) The authors say the fungal burden of Bom55C flies remains low in Fig 5B, but they never measure flies that are near death when fungal load is greatest, or FLUD like in other figures. Given low mortality at the following time points, it seems likely that A. fumigatus would grow beyond initial loads in those individuals and kill them. I grant that these loads are less than what is seen in Hayan mutants. I just might suggest a more careful consideration of the time points used and what can be said about the trends shown here.

      This is certainly a relevant point. The FLUD data are now presented in Fig. EV8A and do not reveal any additional growth.

      ii) Could the authors comment somewhere about the levels of toxin they were required to inject to get a phenotype vs. the level of toxins the authors expect are found in the fly during infection? I appreciate that toxin injection likely requires much higher doses, but it would be good to know just how far the authors have pushed their experimental system beyond its natural range.

      This is a question that is difficult to answer accurately as we are not sure the techniques exist to measure toxin levels in these small flies. We have tested a range of concentrations. It is clear that we push the system and likely use concentrations that are higher than those actually secreted by A. fumigatus during infection. Indeed, the mutant strains defective for the production of verruculogen or restrictocin display only a mildly reduced virulence in MyD88 flies. This makes it even more remarkable that wild-type flies are able to withstand these high, unphysiological concentrations, an argument for an indirect effect independent of the dose as pointed out now in the Discussion. How fungal pathogens balance the expression of the hundreds of secreted virulence factors, proteins and secondary metabolites, is a major frontier for future investigations, be they plant or animal pathogenic fungi.

      Again regarding toxins vs. general stresses, one could manage to inject salt into the hemolymph and show a stress-sensitized fly would succumb at lower doses than wild-type, emphasizing the relevance of defining concentrations.

      We feel that just monitoring the survival of flies after a challenge that produces an effect is sufficient (Fig. EV8C).

      The authors could also write toxin concentrations clearly in the figure/legend per experiment.

      Corrected.

      iii) Throughout the manuscript, the order that figures/panels are cited is inconsistent. Perhaps the text could be re-written so the reader can follow the figures more intuitively while going through the text?

      Corrected.

      iv) There are a few points where run-on sentences, involving many commas, make it hard to follow the logic. I might suggest a careful reading to break up long sentences into two sentences to ensure clarity.

      We hope to have addressed this concern.

      v) Line 279-281: this is the first and only mention of the immune surveillance hypothesis in nematodes. This is strange, given the authors are effectively describing an analogous idea exists in flies? Perhaps this could be added somewhere in the introduction or discussion.

      We have followed the advice of the reviewers and now discuss this point more fully in the Discussion under its own subheading.

      Small points

      • What timepoints are the gene expression data from? Could the authors indicate this in figures/legends?

      Done

      • Line 133-135: "We conclude that MyD88 flies succumb to a low A. fumigatus burden..." - could the authors cite a figure panel here to emphasize what evidence they're referring to.

      Done

      • Line 151-152: Dudzic et al. (2019- Cell Reports Figure 3) showed that PPO2 was regulated by Hayan, while PPO1 by Sp7. This relevant study should be cited here or in the introduction/discussion.

      Excellent suggestion, this was indeed an important study. Done

      • Line 179-180: could the authors define the gliotoxin mutant strain here in the text for clarity?

      Done

      • Line 196: Fig. 4A-B should be Fig. S4 A-B?

      Corrected.

      • Fig4A: perhaps the authors could reduce the x-axis to focus on the early time points? If I understand correctly, aspf1 has slightly delayed killing compared to akuB (~50% difference at 2 days), but both kill 100% by 3 days.

      Done

      • Fig4G: can the authors define the GFP transgene on pg10? Not clear what this is, or what this means. Brain? Fat body? The legend of Fig4G and the key in the top left... it's not easy to quickly understand what is shown in Fig4G.

      Done

      • Line 247: I would drop the "at the intracellular level" part. I'm not sure this is robustly shown given the use of an in vitro model where there is no closed extracellular environment. The data are convincing of the effect, this is just a semantic point.

      We agree that there is no closed extracellular environment and that therefore any signal emitted by the cells might get too diluted. However, the addition of EGF will activate the Toll intracellular pathway through the chimeric EGFR-Toll receptor. As restrictocin is known to act intracellularly, one might have thought that there might be some intracellular effectors mediating the Toll-dependent protection against restrictocin. Our sentence excludes this possibility.

      • Line 257-258: Cohen et al. (2020- Front Imm) never used Bomanin mutants. Did the authors mean to cite Hanson et al. (2019) here, which seems to fit their described citation re: Bom55C vs. Toll mutant flies (Fig. 2)? Given Hanson et al. infected Toll mutant and Bom55C flies with many bacteria/fungi including A. fumigatus, it's strange this study is not discussed currently.

      The reviewer is correct. Cohen et al. did use A. fumigatus, but on Daisho mutants and MyD88 and not BomΔ55C as a control. We are now citing Hanson et al., 2019 in lines 443-449 (Discussion).

      • Fig5C-D: the labeling is difficult to follow.

      This is difficult to address unless multiplying EV figures. We feel this is not needed: the important curves are in color and each such curve is seen on the graphs.

      • Line 318: a -possible- AMP role of Bomanins was proposed because of the aforementioned killing activity of wt but not Bom mutant hemolymph, alongside rescue by single Bom genes. To say this was based only on survival experiments is incorrect.

      The paragraph has been rewritten and expanded to dispel any misunderstanding.

      • Line 324-328: could the authors cite appropriate references after "inhibition of calcium-activated K+ channels" ?

      Done

      • comment re-Line 334: Toll10b flies have melanotic tumors and are in general in a stressed state. Might their rescue be due to increased stress tolerance by pre-activated stress responses?

      This is a developmental effect occurring during larval stages, also observed for Cactus mutants. Here, we use a UAS-Tl10B transgene that is induced only at the adult stage using the Gal4-Gal80ts system. Thus, any stress is minimized as much as possible. Furthermore, we can phenocopy this phenotype to a large extent using a UAS-BomS6 driver, even though the phenotypes are subtly different as regards the protection against verruculogen-induced tremors.

      Referees cross-commenting

      Yes I agree that the data themselves are not the issue, nor even the direction of the results. But there are many overly-strong statements that go so far as to refute ideas which are supported by other studies, and for which the authors here do not provide any contradictory evidence.

      We hope that this revised, extended version has clarified any misunderstanding in the initial version.

      As per my review, I would be happy with a re-write that softened the language overall. I genuinely wonder if these Bomanin mutants simply have poor development, and so they are susceptible to neurotoxins/stress because their nervous system/development leaves them less resilient in general. Experiments testing their resilience to different stresses would greatly elevate the ability to make confident insights in the present manuscript. Currently the authors have only investigated one type of phenotype and interpreted it as if that is evidence of the evolved purpose of the peptides. This approach does not account for many other possible (and reasonable) explanations.

      We have performed the experiments suggested by the reviewer. While we see a modest effect of heat on MyD88, it is not found in BomΔ55C flies, which display essentially the same phenotype as MyD88 with regard to the sensitivity to A. fumigatus or some of its secreted mycotoxins.

      Reviewer #2 (Significance):

      This paper should be of broad interest to the study of immunology, where roles for effectors are typically thought of as cytokines. In fruit flies and other invertebrates that lack adaptive immunity, immune effectors are more thought of as direct actors likely with antimicrobial properties. The finding that Toll might mediate resilience is interesting, and implicating well known Toll effectors provides an important step forward towards a mechanistic basis behind this resilience effect.

      We thank the reviewer for his appraisal of the significance of our work.

      My expertise is in insect and innate immunity.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The authors have set out to study the Drosophila immune response against the fungus Aspergillus fumigatus. They found that Aspergillus fumigatus kills Drosophila Toll pathway mutants. The fungus does this without invasion because its dissemination is blocked by melanization. They suggest that there is a role for Toll in host defense distinct from resistance. The findings are interesting, and it looks like the mycotoxins play a role. It also seems that there is some role of the Bomanins here, but I find that the Figure 4 experiments in particular are not convincing enough to provide a mechanistic insight as to what is going on. I think the authors need to think through what their results mean, and also explain better (especially regarding Fig 4) their ideas and how the data fit them.

      Major comments:

      Page 5: .."the fungal burden did not increase much in MyD88 flies challenged with 50 conidia (Fig. 1B)" - What do you mean did not increase much? There is a clear increase in Myd88 mutants compared to controls; would you expect a bigger increase (e.g. log scale induction)? Explain.

      Page 6: "the SPZ/Toll/MyD88 cassette is required for host defense against A. fumigatus infections, even though this pathogen only mildly stimulates the Toll pathway." - Should you rather say that A. fumigatus only mildly induces the Toll pathway target gene Drosomycin?

      Page 6: "...we tested Hayan mutant flies defective for this arm of innate immunity (Nam et al., 2012)." - elaborate this, which arm/which pathway? This has not been introduced in the introduction. Explain. Can you really draw this conclusion: "We conclude that melanization limits the proliferation and the dissemination of A. fumigatus injected into wild-type flies yet does not eradicate it at the injection site, where a melanization plug forms." Maybe you can based on the function/importance of the pathway to melanization, but you need to explain.

      Page 10: "The cleavage of the 18S RNA was however much less pronounced in wild-type flies as compared to MyD88" - I am not sure what this means. Do you mean 28S? And that the 28S peak is lower? Is this a quantitative method? Fig. legend: "Arrows show the position of the 28S RNA sarcin fragment" - there are three arrows in both Fig 4E and F; specify which arrows point to what.

      Based on the results, I am not convinced about the conclusion that "restrictocin is able to inhibit translation to a detectable degree in vivo, likely through the cleavage of the ribosomal 28S a-sarcin/ricin loop as described in vitro." <- Do you draw this conclusion before doing the actual in vitro experiment, which is described next in the text (the rabbit reticulocyte assay, S2 cells)?

      4H: Not sure what should be seen here; is it the darkest band at 0 uM that disappears? H, I & J need more explanation than what is now included in the text or Figure legend; is the conclusion that there is no difference? Write the stats above Figs 4I & J (n.s.?).

      Minor comments:

      It would have helped commenting if the manuscript contained line numbers

      Why do you have the title "Hayan" on top of Fig 1F; you don't have this marking system in the other survival curves

      Fig 2A: Can you speculate why MyD88 flies die rapidly at day 10 if you inject PBST (your control)? What would happen to uninjected controls in otherwise the same conditions? (you could include an uninjected control here?)

      Fig 2E: Not sure what would be the best way of presenting the curves - different colors, dotted lines or something? Now, when there are too many lines, they are hard to tell apart because the symbols are not that visible, as in 2E if you want to compare the light red/orange colored lines.

      For consistency add the caption also to Fig 3D (I assume it is the same as 3C)

      For consistency, should you add Verruculogen on top of Fig 3F?

      Chronologically, how it is explained in the text, Figs 4A and B are in the wrong order.

      The quality of Fig 4 is not great, the text is hard to read (too small) and becomes blurry upon magnification.

      Page 12; "These data then suggest that a process akin to the immune surveillance of core cellular processes first described in C. elegans may also exist in Drosophila" - I think this sentence belongs to the discussion, this is not directly drawn from the results.

      Referees cross-commenting

      I think we share many thoughts among all the reviewers. The main problem is that the manuscript language is quite strong; from the results many times it is not ok to make such strong statements. Some experiments need further analysis and clarification. I think in most cases, this could be achieved by softening the statements and adding more discussion, and not by making new experiments (some may be needed).

      Minor things are that experiments do not advance in a logical order between the text and the figures and that there are problems with resolution in some figures. Statistics need to be added to some figures.

      Significance

      The nature of the work is conceptual for the field, to understand the role of the Toll pathway and Bomanins in particular, in this fungal infection model. The work is interesting to a somewhat limited audience, mainly immunologists and in particular, people interested in the Drosophila model for immunity. The work may be interesting conceptually in understanding fungal infections.

      My expertise: I am a Drosophila immunity researcher with nearly 20 years of experience in working with fly immunity, in particular the Toll and the Imd pathways.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their time in evaluating our manuscript and for the useful feedback. We are grateful that the reviewers acknowledged that our study is important because it “sheds much needed light on this less documented early stage of cancer development”. The reviewers were overall positive in their assessment and, as reviewer #3 noted, our study “advances this field conceptually by highlighting the importance of targeting the cell signaling and chromatin regulation together”. The common criticism of all reviewers relates to writing style, some textual interpretation and ensuring that the number of replicates, statistical analysis, and cell culture type were appropriately mentioned. We felt these were valid points and have taken on board all these comments. A shared concern between two of the reviewers was related to the logic behind the timepoints we chose to analyse cells in the different assays. We are confident that we have addressed this and all other comments, as detailed below.

      Please find below a point-by-point reply to the reviewers’ comments.

      Reviewer #1 (Evidence, reproducibility and clarity):

      This study aimed to identify events that happen early in the malignant transformation of breast cancer (BC) cells driven by the HER2 oncogene. The authors constructed a 3D inducible model to study the impact of HER2 protein level on BC cells and assessed gross morphological changes, protein phosphorylation and chromatin accessibility at different time points of HER2 activation.

      Using a controllable in vitro model is a good approach, although it is not novel. Also, the method used to assess HER2 protein positivity is neither standardized nor clinically relevant. Positivity of HER2 in clinical practice is assessed through immunohistochemistry (IHC 3+, or 2+ with gene amplification); however, the author did not mention any control for positivity except western blot, which is not used in clinical practice.

      We agree with the reviewer that we should have included a comparison of the HER2 protein levels of our cells with a positive control. We have tested this, and the data will be included in the revised version of the manuscript. Briefly, both western blot (WB) and IHC are very useful methods with different benefits: WB is less cost effective but more quantitative, while IHC gives a better overview of tissue heterogeneity. Indeed, due to higher sample processing costs, WB is not used in clinical practice to assess HER2, but it has been shown that there is a high concordance (in 95% of over 300 tumours analysed) between the two methods, and both techniques showed prognostic significance (R. Molina et al., 1992; PMID: 1363511). We compared the HER2 protein expression levels of our subpopulations (low, medium, and high HER2 expressing cells) with two patient samples that were already known to be HER2 positive by IHC (3+ or 2+). Western blotting showed that the low HER2 expressing cells expressed less HER2 protein than the IHC 3+ or 2+ samples and may be comparable to patients with IHC 1+, who are considered HER2 negative and do not qualify for anti-HER2 therapies such as Trastuzumab.

      There is a difference between early HER2 positive BC and HER2 low BC: the former is driven by the HER2 oncogenic signalling pathway, but the latter is not.

      Identification of molecular changes that occur in HER2 low BC seems very important and clinically relevant; however, HER2 low is not fully characterized yet, and the only definition available is either HER2 1+ or 2+ without gene amplification. The author was not very clear about the threshold he followed to call the model HER2 low. (Is it positive at the lower limit of positivity, or just a small amount of protein?) He also concluded that BC with a sub-threshold level of HER2 protein behaves more aggressively than HER2 positive BC. What is the threshold, and was it correlated with IHC or gene amplification level to be reliable?

      The HER2 positive population in our in vitro inducible system was determined by flow cytometry. We separated the overall (bulk) HER2 positive cells into three different subpopulations and selected the bottom 20% of HER2 expressing cells as “low HER2” and the top 20% of HER2 expressing cells as “high HER2”. We show in figure 4C the different thresholds for low, med, and high HER2 protein expression by flow cytometry. We have modified the figure and the figure legend (figure 4C) to better indicate the different subpopulations. Through western blotting we compared these populations of cells with patient samples that had IHC 3+ or 2+ and showed that the low HER2 population expressed less protein than the IHC 2+ sample, whereas the high HER2 population was relatively comparable to the IHC 3+ sample.
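
      The percentile-based split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `gate_by_percentile`, the treatment of the remaining middle 60% as "med", and the toy intensity values are all hypothetical and are not taken from the manuscript or from any flow cytometry software.

```python
def gate_by_percentile(intensities, low_frac=0.2, high_frac=0.2):
    """Split per-cell HER2 intensities into low / med / high groups.

    Assumption: "low" is the bottom `low_frac` of cells, "high" is the top
    `high_frac`, and "med" is everything in between.
    """
    ranked = sorted(intensities)
    n = len(ranked)
    n_low = int(n * low_frac)    # number of cells in the bottom gate
    n_high = int(n * high_frac)  # number of cells in the top gate
    return {
        "low": ranked[:n_low],
        "med": ranked[n_low:n - n_high],
        "high": ranked[n - n_high:],
    }


# Toy intensities for illustration only; not data from the study.
toy_intensities = [5, 1, 9, 3, 7, 2, 8, 4, 10, 6]
gates = gate_by_percentile(toy_intensities)
```

      Gating on rank rather than on a fixed intensity value means the "low" and "high" groups always contain the same fraction of cells, which is why a separate comparison against IHC-scored samples is needed to relate them to clinical categories.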

      The status of oestrogen and progesterone receptors were not highlighted. Triple negative breast cancer, for instance, is more aggressive than HER2 positive BC, this may be the reason for the worse behaviour.

      We have modified the main text of the manuscript (lines 68-69) to better reflect the fact that MCF10A cells are both oestrogen receptor (ER) and progesterone receptor (PR) negative; this has already been characterised by Qu, Y et al., 2015 (PMID: 26147507). Importantly, however, we do not think that ER and PR status is the reason these cells are relatively more aggressive, as normal MCF10A cells without HER2 expression did not display any transformative characteristics in our molecular analysis and/or in vitro functional assays, despite being ER and PR negative.

      At line 130, "The low levels of HER2 protein activation at early time point may closely mimic at least partially the signalling changes occurring in HER2 positive BC patients". This claim is not quite true, as low levels of HER2 protein activation doesn't activate HER2 oncogenic signalling pathway as HER2 positive does.

      We thank you for this insightful comment; we have modified our main text to better reflect our view (lines 132-133). However, we were not sure which published data the reviewer was referring to in this case. In particular, we wonder whether low HER2 levels can still dimerise with family members such as HER1, HER3 or HER4 and induce signalling via these partners.

      The author aimed to study the signalling changes accompanying low levels of HER2 induction by lowering the significance threshold to log2fold > 0.5. Lowering the threshold for significance will increase the total number of phosphorylated proteins (both at low HER2 levels and at high levels). So, studying all significant proteins at all time points will not be exclusive to low HER2 levels, and this was evident through activation of the MAPK cascade, which is one of the downstream signalling pathways of HER2 positive BC.

      We agree that a log2fold change > 0.5 would increase the total number of significantly phosphorylated proteins. We first performed the analysis with a more stringent cut-off of log2fold change > 1.5, p-value < 0.05, as shown in figure 2B. In the supplementary data we also show the reduced stringency of log2fold change > 0.5, p-value < 0.05, for the following reasons. First, when it comes to proteins, it is conceivable that a log2fold change > 0.5 is sufficient to induce molecular changes. Secondly, our study investigates changes that occur from just half an hour up to 7 hours after HER2 protein induction; at such early time-points, proteins would only be beginning to be phosphorylated and the extent of phosphorylation may not be pronounced (especially in a small subset of the population). Finally, we thought it important to share this supplementary analysis so that the scientific community has access to these data and may further interrogate them from different perspectives.
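
      The effect of the two cut-offs can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Site` record, the `significant` helper, the use of the absolute log2 fold change, and the toy values are hypothetical and do not come from the study's data or analysis pipeline.

```python
from typing import NamedTuple


class Site(NamedTuple):
    name: str      # hypothetical phosphosite label
    log2fc: float  # log2 fold change, HER2-induced vs control
    pvalue: float


def significant(sites, log2fc_cutoff, p_cutoff=0.05):
    """Keep sites whose absolute log2 fold change exceeds the cut-off
    and whose p-value is below p_cutoff."""
    return [
        s for s in sites
        if abs(s.log2fc) > log2fc_cutoff and s.pvalue < p_cutoff
    ]


# Toy values for illustration only; not data from the study.
toy_sites = [
    Site("site_A", 2.1, 0.001),
    Site("site_B", 0.8, 0.020),
    Site("site_C", 0.6, 0.300),
    Site("site_D", -1.7, 0.010),
    Site("site_E", 0.3, 0.004),
]

stringent = significant(toy_sites, log2fc_cutoff=1.5)
lenient = significant(toy_sites, log2fc_cutoff=0.5)
```

      With the p-value cut-off held fixed, every site that passes the stringent filter also passes the lenient one, so the lenient list can only grow; this is the increase in significant proteins that the reviewer points out.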

      Combining HER2 protein level (both IHC and Western blot) to different time points will give better understanding of events associated with HER2 low, early positive or late positive.

      As above, IHC is routinely performed for clinical diagnosis because it is cost effective. Although western blotting is laborious and expensive, it is more quantitative than IHC.

      Reviewer #1 (Significance):

      This work provides good evidence for changes that happen during early HER2 positive breast cancer transformation and introduces chromatin opening and accessibility as a new treatment target for HER2 positive breast cancer patients.

      We thank reviewer #1 for their thoughtful feedback and for their appreciation of our work.

      Reviewer #2 (Evidence, reproducibility and clarity):

      HER2 amplification is associated with poor prognosis of breast cancer. Although it has been extensively studied, it deserves thorough study how HER2 amplification alters downstream signaling pathways, chromatin structure and gene expression, and how cells overcome the hurdles in order to transform. In this study, Hayat et al. used doxycycline-induced HER2 expression in MCF10A cells to recapitulate the very early stage of HER2 expression and HER2-induced mammary epithelial cell transformation. The authors performed global phosphoproteomics, ATAC-seq and single-cell RNA-seq, and propose that sub-threshold low level HER2 expression activates signaling pathways and increases chromatin accessibility required for cell transformation, while high HER2 expression level in early stages results in decreased chromatin accessibility.

      Major comments:

      1. Although it is not clearly described, it seems that phosphoproteomics and single-cell RNA-seq were performed using 2D-cultured cells, while ATAC-seq was performed using 2D (FACS sorted cells based on HER2 expression levels) or 3D (time course)-cultured cells. Cells cultured in 2D and 3D differ significantly in cell signaling, chromatin structure and gene expression, and therefore cannot be compared.

      We agree that there are differences between 2D and 3D cell cultures, which may impact on the multi-omics experiments performed in this study. In an ideal world we would have preferred to conduct all experiments in 3D cell cultures, including the phosphoproteomics experiments. However, this is not feasible because the phosphoproteomics experiment requires 500 µg of total protein, which corresponds to approximately 10 million cells for each condition and replicate in 3D matrices. 3D structures would also have presented accessibility issues, since doxycycline might not have reached all cells equally at the 30 minutes timepoint. Since we were analysing early timepoints for phosphoproteomics, homogeneity in induction was important. We performed ATAC-seq in 3D cell culture because it was feasible, as it only required 25,000-50,000 cells to be grown in small 3D cell cultures, and 3D culture is indeed superior for physiological relevance. We therefore had to compromise and worked with the assumption that immediate signaling events will not be fundamentally different in 2D vs 3D. We have modified the main text to better reflect this and have indicated which experiments were performed in 2D vs 3D in the figure legends and the methods section.

      1. Phosphoproteomic (0.5, 4 and 7 hours), ATAC-seq (1, 4, 7, 24 and 48 hours) and single-cell RNA-seq (7, 24, 48 and 72 hours) were performed on cells at different time points after doxycycline treatment. The authors need to clearly explain the rationale why such time points were chosen for each experiment in the text.

      There are indeed differences in the time-points analysed between the different multi-omics analyses. However, as mentioned above, the reason for selecting such early time points for the phosphoproteomic experiment was that signalling changes are rapid and we were focused on characterising the early signalling dynamics. With regard to ATAC-seq and scRNA-seq, there are several shared time-points, namely 7h, 24h, and 48h. Additionally, as chromatin changes would be slower than signalling changes, later time-points were selected, including 48h (ATAC-seq) and 72h (scRNA-seq), to capture some late changes during cellular transformation.

      1. A change in chromatin accessibility does not necessarily mean a change in gene expression levels. RNA-seq needs to be performed and analyzed along with ATAC-seq data.

      We agree that chromatin accessibility does not necessarily correlate with gene expression changes and that RNA-seq is needed to draw such a conclusion. This is the reason we performed single cell RNA-seq, which looks at changes at high temporal and cellular resolution. This is particularly useful for the heterogeneous cell population that we worked with, to better understand the differences between cell types.

      1. Analyses on multi-omics data are quite preliminary. Clustering analysis on the time course of phosphoproteomic, ATAC-seq and single-cell RNA-seq will help characterize the dynamics of cell signaling and gene expression. Integrated analyses on multi-omics data and construction of regulatory network are necessary to identify the key signaling node and key epigenetic regulators/machinery that facilitate or prevent cell transformation. Integrated analyses, of course, need to be performed on data obtained from cells cultured in the same conditions.

      We think our study is an important work and provides a strong foundation for a comprehensive, integrative multi-omics study using primary human breast cells, with parallel analysis performed on the same population of cells using the latest techniques such as scATAC and RNA-seq or scNMT-seq. We are indeed in the process of applying for funding for a larger analysis that involves in vivo work and clinical samples, using this study as a foundation.

      1. The authors picked up several genes from the analyses, and discussed the potential importance in cell transformation without functional validation. It is important to show data demonstrating altered expression of certain genes and/or altered activity of certain signaling pathway/epigenetic regulators is indeed important for cell transformation in low HER2-expressing condition or preventing cell transformation in high HER2-expressing condition.

      We agree that this is important. The scope of this study is to report the result that low HER2 was unexpectedly more aggressive than high HER2, which was a highly reproducible observation, and to identify a molecular explanation for this behaviour (dedifferentiation and predominant chromatin opening). In terms of cross validation, we found MUC1 protein expression to be low in low HER2 expressing cells, indicating that they are more stem-like (figure 4B). We confirmed and validated this finding in our scRNA-seq data shown in figure 4F. The pathway analysis from the phosphoproteomic study shows that the MAPK pathway is highly activated upon HER2 protein overexpression. To validate this claim, we performed western blotting analysis, which confirmed it, as the ERK protein was hyperphosphorylated in HER2 expressing cells compared to controls. Thus, our resource study provides many candidates that can be tested to further explore the biology.

      1. HER2 expression in MCF10A cells is insufficient in inducing tumor formation in vivo, although HER2 expression results in disrupted acini structure and colony formation in vitro (e.g. Alajati et al. 2013 Cancer Res, 73:5320-5327 cited in the manuscript). It is interesting to investigate whether this is due to the mechanisms identified in this study.

      MCF10A cells are generally difficult to transform in vivo. It is possible that mechanisms identified in our study might be responsible for lower tumourigenicity in vivo with WT HER2 compared to HER2 variants, since our study suggests activated checkpoints in high HER2 cells. It would be interesting to compare the differential impact on chromatin for the two HER2 variants too. In our system, we think the reason why cells undergo abnormal morphological changes and grow colonies in vitro is HER2 overexpression, which induces aberrant signalling, and this may lead to loss of cell-to-cell contact and disruption of adhesion molecules. However, the objective of this study was to understand the early signaling to chromatin changes in in vitro cellular transformation, and changes in cell morphology are a consequential part of the process.

      1. In Figure 2C, the two replicates are completely separated and the replicates of each time point are not clustered together.

      We agree that the two replicates are separated into two separate groups; this was demonstrated by the PCA analysis (Supplementary Fig 1F). We grouped these samples into “early” (0h, 1h, 4h, and 7h time-points) or “late” (24h and 48h time-points) based on them clustering well into these two groups. The subsequent analyses were performed based on these groups that clustered together. However, we still showed each replicate in figure 2C to convey the dynamics of chromatin accessibility between each time-point, which shows clear differences in HER2 versus Control.

      Minor comments:

      1. Essential experimental information, e.g. whether cells were cultured in 2D or 3D, needs to be clearly and accurately described in main text, figure legends and experimental procedures.

      The figure legends in the manuscript have now been modified to include information on cell culture type.

      1. Statistical methods are not provided. In Fig. 4D, HER2-med and HER2-high need to be compared to the HER2-low group.

      Statistical analyses have been added to figure 4D and HER2-med and HER2-high have been compared to HER2-low group.

      Reviewer #2 (Significance):

      The authors propose sub-threshold low level HER2 expression activates signaling pathways and increases chromatin accessibility, which facilitates mammary epithelial cell transformation, while high HER2 expression in early stages results in decreased chromatin accessibility via unknown feedback mechanisms. It is interesting to identify which signaling and epigenetic regulators are essential to cell transformation, which feedback mechanisms prevent the transformation of HER2-amplified mammary epithelial cells, whether inactivation of such feedback mechanism indeed occurs in tumorigenesis of HER2-amplified breast cancer, and whether it is a potential therapeutic target for HER2-amplified breast cancer.

      Expertise of review: breast cancer, cell signaling, tumor microenvironment.

      We thank reviewer #2 for their time and for providing such useful feedback on our work.

      Reviewer #3 (Evidence, reproducibility and clarity):

      In this paper Hayat et al. study the early transformational events that follow the activation of the oncogenic HER2 signaling pathway and its crosstalk with chromatin opening. Using an inducible in vitro model of HER2+ breast cancer they have identified that the overexpression of HER2 transforms non-tumorigenic breast epithelial cells via chromatin regulation. The study also shows that the transformative potential of the cells is inversely related to their HER2 expression, where the low HER2 expressing cells obtain a stem-cell like signature and increased chromatin accessibility leading to an increased transformative potential.

      Major comments:

      While the key conclusions of the paper are convincing, here are the parts of the study that need further clarification or supporting data from the authors.

      1. In Figure 1C the authors show that MCF10AHER2 cells formed complex transformed masses when grown in 3 dimensional cultures. From the figure it is evident that the transformative potential of HER2 overexpression is far more pronounced at the Day 6 and Day 9 mark. Therefore, one wonders why these time points weren’t used as the “late timepoint” in any of the sequencing studies moving forward. Can the authors comment on this choice and perform additional experiments to address the molecular changes that lead to the dramatic transformations seen at this timepoint? Since the authors have a well-established protocol in place, looking at an additional time point could be potentially feasible, provided the cells/samples have been frozen down at this stage. If unable to do so, could the authors comment on the molecular changes they would expect to see at this time point?

      In our study we primarily focused on the early events upon HER2 overexpression because the changes appear to be much more dynamic, and we hypothesised that these events are the cause of the subsequent, more pronounced features later on. The rationale behind employing an inducible system and capturing the early changes was to identify aberrant molecular events at the earliest time possible. Indeed, numerous studies have investigated the differences between normal and cancer cells (many of them at later time points, which miss the foremost aberrant molecular changes). Based on our ATAC-seq analysis at the late time-points (24h and 48h), the number of changes in chromatin accessibility becomes relatively more stable compared to early time points (supplementary figure 2A).

      1. In Fig 1D the authors conclude that the overexpression of HER2 causes increased cell invasion based on the results seen in a collagen coated plate. How do the authors explain the lack of any such significant change in a Matrigel coated plate?

      To test the invasiveness of the HER2 overexpressing cells, collagen is used to increase the stiffness of Matrigel. Stiffness is relevant for the type of invasion seen in these 3D cultures because it activates pathways important for invasion. We added the references to the text for clarity (PMID: 15838603 and PMID: 16472698).

      1. In Supp Fig 1D the authors use the DAVID Bioinformatics tool to identify the various signaling pathways enriched in the HER2 induced system. In addition to the MAPK pathway, this analysis also shows other common cancer-related pathways (e.g. the mTOR pathway) being enriched to a similar or higher extent. Can the authors address why only the MAPK pathway was pursued in detail?

      HER2 is a major receptor that can signal through various signalling pathways. We highlighted the MAPK pathway because it has been previously shown that MAPK cascades can modify chromatin through transcription factors and chromatin regulators (Clayton and Mahadevan, 2010; PMID: 19948258). We think that when HER2 is overexpressed, it primarily signals down the MAPK pathway, resulting in the activation of transcription factors and chromatin regulators that lead to highly accessible chromatin and ultimately contribute to transformation. To confirm this result, we performed a western blotting control analysis and found that HER2 overexpression indeed consistently activates the MAPK pathway, as shown by phosphorylation of ERK, but does not influence AKT phosphorylation. We can include this data in the manuscript.

      1. Figure 4B and supplementary figure 3E only show the percentage of cells that have either MUC1-ve or EpCAMlow or CD24low expression. However, Figure 4A and the corresponding text indicate that breast stem cells are defined by a combination of MUC1-ve, EpCAMlow, and CD24low expression. If this is the case, the authors need to show the percentage of cells within each population that have an overlap of all these expression signatures, to support the claim of low HER2 expressing cells showing a more de-differentiated stem-cell like property.

      Our results confirm that upon HER2 overexpression, cells become MUC1-ve, EpCAMlow and CD24-ve, acquiring the breast stem cell signature. We did not show the CD24 expression because all the cells that were MUC1-ve and EpCAMlow were also 100% CD24-ve. We have now modified figure 4B and the figure legend to reflect this change. Additionally, we added another figure (supplementary figure 4) that shows how the analysis was performed systematically.

      1. The authors also state 'other biological effects being responsible for the lower capacity in anchorage-independent growth of high HER2 expressing cells' that is shown in fig 4d. While an experimental investigation of these effects may be out of the scope of this study, the authors may consider commenting (and referencing additional literature) on the other biological effects they think may result in this phenomenon.

      We have modified the manuscript (lines 294-296) and added further explanation of what other biological effects may be responsible for the lack of colony growth in high HER2 expressing cells.

      1. The authors do a great job providing details about all statistical analyses performed, however the details regarding the experimental replicates are only provided for some experiments making it difficult to infer if the experiments have been adequately replicated before concluding results. Can the authors please add the n - value for all applicable experiments in the figure legend or the methods section?

      The number of replicates has now been added to the respective figure legends.

      1. What is the scope for validation of these findings in vivo and in human samples? Could the authors please comment on this in the discussion section of the manuscript.

      The primary goal of this study was to understand the early transformational events in a simple yet robust in vitro model that is highly accessible. We have analysed some human samples to compare the HER2 protein expression levels. However, the findings from this manuscript could be validated in more precious models such as primary human cells, human tumour samples and in vivo in animals. We have modified the end of the discussion to address these points (lines 394-399).

      Minor comments:

      1. In figure 1B the authors show a western blot analysis for HER2 expression over time while using GAPDH as a loading control. However, the GAPDH control seems to be unequal, especially in the 1ug/ml Dox lane. This needs to be addressed.

      We agree that there is a slight difference in the GAPDH levels in this western blot. We have carried out densitometry analysis, which could be added to the supplementary data if required. It shows that although GAPDH appears slightly lower in the 1ug/ml Dox lane (last lane), HER2 levels are even greater than they appear on the blot, which confirms the trend observed in the current western blot.

      1. In figure 1C, it is unclear if the images shown are representative of the exact same spot over a 9-day period or of different spots.

      In figure 1C, the morphological regions are representative of the whole well in which the cells were growing but not the exact same spot. This is because nearly all the cells (>90%) transformed from round, organised acini to the fibroblastic, invasive morphology by day 9. We have captured multiple images of different areas in the well using confocal microscopy, and this can be added in the supplementary data.

      1. In Supplementary figure 3E, labeling the y-axis on the figure as opposed to just in the legends would make it easy for the reader.

      The figure has now been appropriately labelled.

      1. With respect to presentation: In figures involving single cell RNA sequencing and phosphoproteome analyses, highlighting the specific genes that are focused in detail on the manuscript would aid the reading process. The current format makes it difficult for the reader to spot the specific genes that are the points of focus within each heat map.

      We modified the figures concerning the phosphoproteomic analysis and scRNA-seq and have highlighted important genes for readers’ ease.

      Reviewer #3 (Significance):

      I have close to a decade's experience in working on breast cancer. In the past I focused on studying intratumor genetic heterogeneity and cell signaling pathway interactions. I am currently working on identifying novel therapeutic targets for the treatment of ER+ breast cancer. My expertise lies in understanding molecular biology of the disease. While I have worked with and understand most techniques used in this study, I would like to indicate that I do not have sufficient expertise in ATAC seq and am unable to evaluate the intricacies of this technique.

      While molecular changes that occur in HER2+ breast cancer have been highly investigated, the changes that occur at an early pre-cancerous stage of the disease aren't as well documented. The work by Hayat et al. sheds much needed light on this less documented early stage of cancer development. The past decade has shown an increased focus on epigenetic therapy with more chromatin targeting drugs entering clinic (Siklos et al., 2022). There has also been increased clinical evidence underlining the efficiency of combining epigenetic therapy with hormonal and other anticancer therapies in solid tumors (Jin et al., 2021). Phase II clinical trials combining HDAC inhibitors with an aromatase inhibitor have been shown to improve clinical outcomes in patients (Yardley et al., 2013). Similarly, pre-clinical studies have shown that combination therapy with BET inhibitors improved treatment efficacy and circumvented drug resistance in fulvestrant (Feng et al., 2014) and everolimus (Bihani et al., 2015) treatments. Conclusions from the work by Hayat et al., although based on in vitro analyses, advance this field conceptually by highlighting the importance of targeting cell signaling and chromatin regulation together. If validated in in vivo models and clinical samples, this may open up the possibility of combining anti-HER2 therapies with epigenetic therapies. Additionally, the study makes the interesting observation that low HER2 expression could result in increased tumorigenicity of cells, which is contrary to the current clinical norm of looking at increased HER2 expression as a sign of aggressive disease. These findings are of interest to the scientific and clinical community working on discovering novel therapeutic targets and biomarkers for treatment of HER2+ breast cancer.

      We thank reviewer #3 for his/her overall assessment and for appreciating this work. There is significant interest in low HER2 positive breast cancers in the field. Approximately 50-60% of breast cancers have "low" HER2 expression and in many cases, this low HER2 is seen together with metastatic cancer. The FDA has very recently approved fam-trastuzumab deruxtecan-nxki (Enhertu), which appears to target these cancers with low HER2 well and was shown to be relatively effective in a phase 3 clinical trial known as DESTINY-Breast04. However, it is not yet clear how low HER2 expressing cells drive the metastatic spread of breast cancers or why they are so aggressive. Our work suggests that increased chromatin accessibility could be a route of transformation in low HER2 cancers. It therefore provides an alternative platform to target these cancers, which is why it is crucial that this work reaches the clinical and scientific community as soon as possible.

    1. Author Response

      Reviewer #1 (Public Review):

      Kang et al. have performed whole exome sequencing of gall bladder carcinomas and associated metastases, including analysis of rapid autopsy specimens in selected cases. They have also attempted to delineate patterns of clonal and subclonal evolution across this cohort. In cases where BilIN was identified, the authors show that subclones within these precursor lesions can expand and diversify to populate the primary tumor and metastatic sites. They also demonstrate subclonal variation and branching evolution across metastatic sites within the same patient, with the suggestion that multiple subclonal populations may metastasize together to seed different sites. Lastly, they highlight ERBB2 amplification as a recurrent event observed in gall bladder carcinomas.

      While these data add to the literature and start to examine important questions related to clonal evolution in a relatively rare malignancy, the authors' findings are very descriptive and it is hard to draw many generalizable conclusions from their data. In addition, the presentation of their figures is somewhat confusing and difficult to interpret. For example, they do not separate their clonal analyses by disease site and by time in a readily interpretable manner, as in some instances of Figure 2 and Figure 3 the clone maps are from different sites collected at the same time point, while others show some samples at different time points. Depicting these hierarchies in a more organized and clearly understandable manner would help readers more easily interpret the authors' findings. In addition, the clinical implications of these clonal hierarchies and their heterogeneity are unclear, as the authors do not relate the observed evolution to intervening therapies and may not be powered to do so with this dataset.

      Thank you for the constructive and valuable comments about 1) figures and 2) clinical implications.

      1) We agree with your opinion that Figures 2 and 3 are confusing. Reflecting on your comment, Figures 2 and 3 have been modified. Now, the time point at which the tissue was obtained and the anatomical location of the tissue are readily visible in the redesigned figures.

      2) From a clinical point of view, we believe that our study highlights the importance of precise genomic analysis of multi-regional and longitudinal samples in individual cancer patients. In current oncology clinics, cancer panel data are used to identify druggable mutations, usually from a single tumor sample. However, we found that only some of the mutations were clonal, while a substantial proportion were subclonal; subclonal mutations are usually not effective druggable targets. For example, in the GB-S2 patient, after sequencing of GB tissue, ERBB2-targeted treatment would have been given if a specific clinical trial had been available, because ERBB2 p.V777L is pathogenic. However, our clonal evolution analysis suggests that an ERBB2 targeting strategy may not be effective in subclones without the ERBB2 p.V777L mutation, especially those from regional metastasis. We have added the description for this part to the Discussion section (Page 13, Line 12-15).

      Additional areas that would require clarification include:

      1) There are very few details on how the authors performed their subclone analysis to identify major subclones, and what each of the clusters in Supplemental Figure 1 represents. In addition, they do not describe how they determined that the highlighted mutations in Table 2 were drivers for metastasis and subclonal expansion. Were these the only genes that exhibited increased allele frequencies in metastatic sites, or were other statistical criteria used?

      Thank you for the important comment about 1) clone analysis and 2) highlighted mutations in Table 2.

      1) Mutations were timed as clonal or subclonal through PyClone (Roth A et al., Nat Methods. 2014) clustering (Figure 1—figure supplement 1). Phylogenetic trees were constructed using the mutation clusters identified with PyClone as an input of CITUP (Malikic S et al., Bioinformatics. 2015) (Figures 2 and 3). We added the sentence "See Supplementary File 1 to check the matching information for the PyClone clusters and the CITUP clones." to the supplementary figure legend.

      2) A full list of mutations constituting a CITUP clone can be found in Supplementary File 1. Among the mutations, previously reported cancer-associated genes harboring them were selected manually and listed in Table 2. References for each gene are introduced in the 'Evolutionary trajectories and expansion of subclones during regional and distant metastasis' section.

      2) The authors do not discuss the relevance of variation in mutational signatures observed with disease progression/metastasis, e.g., is there any significance that signature 22 (aristolochic acid) and signature 24 (aflatoxin) are increased in metastases? In addition, when comparing their data to previously published reports in Figure 1B and Figure 4A, it would be helpful if the authors discussed possible reasons for some of the large differences in mutational or signature frequencies across datasets. For example, do the authors think the frequency of ERBB2 alterations is so much higher in their cohort than in prior reports due to methodological/data reasons or due to differences in patient population?

      Thank you for the constructive and valuable comments about 1) mutational signatures observed with disease progression/metastasis and 2) differences in mutational or signature frequencies across datasets.

      1) During the revision process, signatures 22 and 24 highlighted in the metastasis stage were validated by two additional tools, Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018) (Figure 4—figure supplement 3). Aristolochic acid is an ingredient of oriental herbal medicine (Debelle FD et al., Kidney Int. 2008, Hoang ML et al., Sci Transl Med. 2013). Given that all the patients in our cohort are Korean, and a recent study found that Korean cancer patients are frequently exposed to herbal medicines (Kwon JH et al., Cancer Res Treat 2019), one possible explanation is that some patients might have been exposed to herbal remedies containing aristolochic acid. On the other hand, aflatoxin is known to be contained in soybean paste and soy sauce, which are widely used in Korean food (Ok HE et al., J Food Prot. 2007). Considering that the signatures 22 and 24 are found not in early carcinogenesis but in late carcinogenesis and metastasis (Figure 4B and Figure 4—figure supplement 3), the two carcinogens appear to have little impact on the early stage of cancer development, but their impacts might be highlighted in overt cancer cells. Further investigation is required because it is difficult to determine the etiology of signatures 22 and 24 with this limited patient data. We updated this part in the Discussion section (Page 13, Line 4-7).

      2) In the two previous genomics studies on GBAC, the prevalence of ERBB2 alteration was 7.9% (Narayan RR et al., Cancer. 2019) and 9.4% (Li M et al., Nat Genet. 2014), respectively. Compared with these data, our data is characterized by relatively higher ERBB2 alterations (54.5%: amplification in 27.3% and SNV in 27.3%) (Figure 1B). A higher prevalence of ERBB2 alteration was also reported in other studies on GBAC, with corresponding rates of 28.6% (amplification and overexpression, Nam AR et al., Oncotarget. 2016) and 36.4% (amplification only, Lin J et al., Nat Commun. 2021). The variations in ethnicity and culture might have contributed to the differences. This part is described in the Discussion section (Page 11, Line 19-23). In addition, the discrepancy in Figure 4A might be attributed to the difference in analyzed samples: our study included precancerous and metastatic lesions while the other two studies uniformly analyzed primary tumors.
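
      The prevalence figures quoted above can be checked with a short calculation. This is only a sketch: the cohort size of 11 comes from the reviewer's summary, while the per-category counts (3 and 3) and the assumption that no patient carries both alteration types are inferred from the reported percentages and are not stated in the response.

```python
# Hypothetical counts inferred from the reported percentages (27.3% + 27.3% = 54.5%).
N_PATIENTS = 11          # cohort size, from the reviewer's summary of the study
n_amplification = 3      # assumption: 3/11 corresponds to the reported 27.3%
n_snv = 3                # assumption: 3/11 corresponds to the reported 27.3%


def prevalence(count, total):
    """Return the prevalence as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


amp_pct = prevalence(n_amplification, N_PATIENTS)
snv_pct = prevalence(n_snv, N_PATIENTS)
# Assumption: amplification and SNV do not co-occur in any patient, so counts add.
any_pct = prevalence(n_amplification + n_snv, N_PATIENTS)

print(f"amplification: {amp_pct}%, SNV: {snv_pct}%, any ERBB2 alteration: {any_pct}%")
```

      With a cohort of this size, each patient shifts the prevalence by about 9 percentage points, which is worth keeping in mind when comparing against the larger published cohorts.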

      Reference for reply 1)

      • Kwon JH, Lee SC, Lee MA, Kim YJ, Kang JH, Kim JY, et al. Behaviors and Attitudes toward the Use of Complementary and Alternative Medicine among Korean Cancer Patients. Cancer Res Treat. 2019;51(3):851-60.

      3) The authors try to describe and draw conclusions about the possibility of metastasis to metastasis spread in p.6, lines 6-10 "In our study, of 7 patients with 2 or more metastatic lesions, evidence of metastasis-to-metastasis spread was found in 2 patients (28.6%). In GB-A1 (Figure 2A), it appears that CBD, omentum 1-2, mesentery, and abdominal wall 2-4 lesions may originate from abdominal wall 1 (old) rather than from primary GBAC considering clone F." The authors conclude here that the spread arose from abdominal wall 1, but this lesion is only separated from the CBD lesion by 1 month. There is no history given about whether this timing difference is significant or if it was simply due to clinically-driven differences in when each lesion was sampled. Given the proximity of the CBD lesion to the original gall bladder cancer, it seems just as likely that all of these distant lesions were seeded from the CBD lesion. If this is the case, the author's conclusion about "metastasis to metastasis" spread does not seem strongly supported. It would be helpful if the authors could clarify this point and/or provide additional data to strengthen this conclusion.

      We appreciate your valuable comment. As addressed above, the manuscript has been modified to reflect your comments.

      Reviewer #2 (Public Review):

      Minsu Kang et al. analyzed 11 patients with gallbladder adenocarcinoma using multi-point sampling. Mutational analysis revealed evolutional patterns during progression where the authors found metastasis-to-metastasis spread and the migration of a cluster of tumor cells are common in gallbladder adenocarcinomas. The signature analysis detected signatures 22 (aristolochic acid) and 24 (aflatoxin) in metastatic tumors. Overall, the analyses are well-performed using established algorithms. However, the manuscript is highly descriptive. Therefore, it is very difficult to understand what the novel findings are.

      Major comments

      1) The sections "Evolutionary trajectories and expansion of subclones during regional and distant metastasis", "Polyclonal metastasis and intermetastatic heterogeneity", "Mutational signatures during clonal evolution", and "Discussion" are highly descriptive which makes it difficult to understand what the novel and/or important findings are. Those sections would profit from reorganization.

      Thank you for the important comment. We have reorganized the manuscript according to your comments.

      1) In the "Evolutionary trajectories and expansion of subclones during regional and distant metastasis" section, unnecessary sentences have been removed and Figures 2 and 3 have been changed to make it simpler to understand how subclones spread during metastasis.

      2) In the "Polyclonal metastasis and intermetastatic heterogeneity" section, after receiving feedback on statements that were conflicting (Reviewer #1's comment 4), we clarified the statements and removed any other extraneous sentences. Figures 2 and 3 have been changed to make it simpler to understand polyclonal metastasis and intermetastatic heterogeneity.

      3) In the "Mutational signatures during clonal evolution" section, after receiving comments that Figures 4B and 4C were confusing (Essential Revisions #6), we moved Figure 4B to Figure 4—figure supplement 2. Unnecessary sentences have been removed. We emphasized signatures 22 and 24 highlighted during metastasis. This result was validated by using two additional tools, Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018).

      4) In the Discussion section, duplicate descriptions and unnecessary extraneous explanations have been deleted. We emphasized that whereas aflatoxin and aristolochic acid had little impact on early cancer formation, their impacts could be more clearly seen in cancer cells that had already manifested (Page 13 Line 2-7). In addition, the limitations of the NGS test currently used in the clinical field were pointed out, and the clinical significance of this study was described (Page 13 Line 8-16).

      2) What would enhance this paper is more of a connection between the bioinformatics analysis and the biology. Although the authors analyzed multi-point sequencing data well, this paper lacks in-depth discussion. I understand that the results in the paper are "computationally" the most likely. However, the impact is lost by an incomplete connection to biology.

      As you commented, we analyzed the WES data obtained from patient samples by computational methods. In this study, we did not validate the various results using in vitro or in vivo models. However, we would like to emphasize the significance of our work because it is the first human study, covering the current theory of carcinogenesis from precancerous lesions to metastasis in GBAC. For example, polyclonal seeding has been previously confirmed in animal models (Cheung KJ et al., Science 2016). In humans, there have been reports in breast cancer (Ullah I et al., J Clin Invest. 2018) and colorectal cancer (Wei Q et al., Ann Oncol. 2017), but not in GBAC yet.

      3) In addition to the above concern, it is difficult to comprehend the cohort as the detailed information is lacking. I would suggest providing a brief table that contains the number of collected samples, frozen or FFPE, the clinical information, etc. by sample.

      Thank you for the constructive comment. Supplementary Table 1 was modified as you mentioned. It is now indicated from which organ, when, and by what method the tissue was obtained, what the tumor purity of the tissue was, and whether the tissue was fresh-frozen or FFPE. In addition, we updated the information about tissue acquisition sites in Figure 1A.

      4) The mutations with very low allele frequency (< 1%) are discussed in the manuscript. However, no validation data is provided. Please add a description of the accuracy of the mutation calling considering the following concerns.

      • FFPE samples are analyzed using the same method as frozen samples. FFPE contains much more artifacts. Is it adequate to use the same methods for both frozen and FFPE samples?

      Thank you for the valuable comment. We also considered the FFPE artifacts. However, we did not remove the possible artifacts. This part has been described above. Please see Essential Revisions #5.

      • How were those mutations with low allele frequency validated? Are those variants validated by other methods? Especially in FFPE.

      Thank you for the important comment. Firstly, we discarded any low-quality, unreliable reads and variants according to the pre-specified filtering criteria used in previous literature analyzed with the Genomon2 pipeline (Yokoyama A et al., Nature. 2019, Kakiuchi N et al., Nature. 2020, Ochi Y et al., Nat Commun. 2021). In the Method section, we have added an explanation for this part (Page 16 Line 5-12).

      As you commented, validation of low VAF mutation is required if the mutation is sample-specific. However, in this study, if a mutation in Supplementary File 1 has a low VAF in one sample, one of the other samples always has a higher VAF, which has passed our pre-specified filter. Therefore, validation is not required for that mutation. In addition, possible sequencing artifacts with low VAFs in FFPE tissues have been discussed above. Please see Essential Revisions #5.

      • Is the low variant allele frequency (0.2~1%) significantly higher than the background noise level?

      Thank you for the important comment. As you expected, FFPE samples had a higher number of sample-specific mutations than fresh-frozen ones in our study. However, we did not remove these mutations in the analysis of the FFPE samples. For a more detailed description, please see Essential Revisions #5.

      5) The authors compared mutational signatures divided by stages or timings. How are the signatures calculated although each sample has a distinct number of somatic mutations? Did the authors correct the difference?

      Thank you for the helpful comment. We classified all the mutations according to the specific criteria (Page 9 Line 9-18). For example, in Figure 4B (before revision, Figure 4C), mutations were classified by the timing of development during clonal evolution. After that, we could calculate the relative contributions of mutational signatures in each group using the three tools, Mutalisk (Lee J et al., Nucleic Acids Res. 2018), Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018). Although the number of mutations is different for each group, no additional correction was required because we compared the relative contributions among the groups.
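
      The normalisation argument above can be illustrated with a minimal sketch. The mutation counts below are invented purely for illustration and are not taken from the study; the function name `relative_contributions` is likewise hypothetical and does not reproduce the internals of Mutalisk, Signal or MuSiCa.

```python
def relative_contributions(exposures):
    """Convert per-signature mutation counts into fractions that sum to 1."""
    total = sum(exposures.values())
    if total == 0:
        raise ValueError("group contains no mutations")
    return {signature: count / total for signature, count in exposures.items()}


# Invented example counts: two timing groups with different total mutation numbers.
early_counts = {"signature 22": 2, "signature 24": 3, "other": 95}   # 100 mutations
late_counts = {"signature 22": 6, "signature 24": 4, "other": 30}    # 40 mutations

early_rel = relative_contributions(early_counts)
late_rel = relative_contributions(late_counts)

# The fractions are directly comparable even though the group sizes differ.
for signature in early_counts:
    print(f"{signature}: early {early_rel[signature]:.2f}, late {late_rel[signature]:.2f}")
```

      Dividing by the group total removes the dependence on group size, although fractions estimated from a small group remain noisier than those from a large one.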

      6) In distant metastasis tumors, signatures 22 and 24 are increased. Those two signatures are strongly associated with specific carcinogens. Although clinical information is lacking, do the authors think that those patients were exposed to those chemicals after the diagnosis? Why do the authors think the two signatures increased in the metastatic tumors? Were those signatures validated by other methods?

      We appreciate your important and constructive comment.

      1) We think that the patients might have been exposed to aristolochic acid or aflatoxin before or after the cancer diagnosis. Aristolochic acid is an ingredient of oriental herbal medicine (Debelle FD et al., Kidney Int. 2008, Hoang ML et al., Sci Transl Med. 2013). Given that all the patients in our cohort are Korean, and a recent study found that Korean cancer patients are frequently exposed to herbal medicines (Kwon JH et al., Cancer Res Treat 2019), one possible explanation is that some patients might have been exposed to herbal remedies containing aristolochic acid. On the other hand, aflatoxin is known to be contained in soybean paste and soy sauce, which are widely used in Korean food (Ok HE et al., J Food Prot. 2007). Nevertheless, we believe that further investigation is required because it is difficult to determine the etiology of signatures 22 and 24 with this limited patient data.

      2) Summarizing the mutational signature results using the 3 different tools (Figure 4B and Figure 4—figure supplement 3), the signatures 22 and 24 are relatively rare in early carcinogenesis. However, the two signatures contributed more to late carcinogenesis and metastasis. Therefore, it is speculated that the two carcinogens appear to have little impact on the early stage of cancer development but might be highlighted in overt cancer cells. Further studies on this novel hypothesis are necessary.

      3) During the revision process, signatures 22 and 24 highlighted in the metastasis stage were validated by two additional tools, Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018) (Figure 4—figure supplement 3). We updated this part in the Result (Page 9 Line 18-21) and Discussion (Page 13 Line 2-7) sections.

      Reference for reply 1)

      • Kwon JH, Lee SC, Lee MA, Kim YJ, Kang JH, Kim JY, et al. Behaviors and Attitudes toward the Use of Complementary and Alternative Medicine among Korean Cancer Patients. Cancer Res Treat. 2019;51(3):851-60.

      7) Figures 2 are well-described. However, they are difficult for readers to fully understand. The colors for each clone are sometimes similar. The results of multi-time point and regional analyses in the cases with multiple sampling are not integrated. Driver mutations are separately described in the small phylogenetic trees. Evolutional patterns (linear or branching) are not described in the figures. Modifying the above concerns would improve the manuscript.

      We appreciate your important comment.

      1) In GB-S1, clones of similar colors were modified to be different colors.

      2) Figures 2 and 3 have been modified to make them easier to understand by separating time and space more clearly.

      3) Driver mutations are now indicated in both the phylogenetic tree and TimeScape result (Figures 2 and 3).

      4) Evolutional patterns (linear or branching) can be discovered by examining the phylogenetic tree in Figures 2 and 3. In addition, we described each patient's evolutionary pattern more clearly in the manuscript.

      8)"Among 6 patients having concurrent BilIN tissues, two patients were excluded from the further analysis because of low tumor purity in one patient and different mutational profiles between BilIN and primary GBAC in the other patient, suggesting different origins of the two tumors (Figure 1-figure supplement 2)." This seems cherry-picking. More explanation is necessary.

      • What is the tumor purity? The authors treat mutations with a variant allele frequency of 0.2% as true mutations (for example, Table 2); is the tumor purity lower than 0.2%?

      Thank you for the important comment. The calculated tumor purity of BilIN in the GB-S8 patient was 0.03 based on the WES data. We added this value to the manuscript (Page 6 Line 9) and Supplementary Table 1. Although variants were called in this case, the tumor purity was too low to estimate the allele-specific copy number, and thus sophisticated analysis as in other patients was not possible. In addition, the value of 0.2% in Table 2 is not the VAF, but cellular prevalence calculated by PyClone and CITUP. Although the value is low in the primary tumor, it is mentioned because it is high in metastatic lesions.
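
      The distinction between VAF and cellular prevalence can be made concrete with the standard relationship between purity, copy number and expected VAF. The sketch below is a simplified illustration under stated assumptions (diploid normal cells, one mutant copy per mutated cell, a known tumor copy number); the function name `expected_vaf` is hypothetical, and this is not the model implemented by PyClone or CITUP.

```python
def expected_vaf(purity, cellular_prevalence=1.0, mutant_copies=1,
                 tumor_cn=2, normal_cn=2):
    """Expected variant allele frequency of a somatic mutation.

    purity              -- fraction of cells in the sample that are tumor cells
    cellular_prevalence -- fraction of tumor cells carrying the mutation
    mutant_copies       -- mutated copies per mutated tumor cell (assumed 1)
    tumor_cn, normal_cn -- total copy number at the locus in tumor / normal cells
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if not 0.0 <= cellular_prevalence <= 1.0:
        raise ValueError("cellular_prevalence must be in [0, 1]")
    mutant_alleles = purity * cellular_prevalence * mutant_copies
    total_alleles = purity * tumor_cn + (1.0 - purity) * normal_cn
    return mutant_alleles / total_alleles


# A fully clonal heterozygous mutation at the purity reported for the GB-S8 BilIN.
vaf_clonal = expected_vaf(purity=0.03)
# The same mutation if it were present in only 10% of tumor cells (illustrative value).
vaf_subclonal = expected_vaf(purity=0.03, cellular_prevalence=0.10)

print(f"clonal: {vaf_clonal:.4f}, subclonal: {vaf_subclonal:.4f}")
```

      Under these assumptions even a fully clonal mutation at a purity of 0.03 is expected at a VAF of only about 1.5%, which is consistent with the authors' statement that allele-specific copy number could not be estimated for this sample.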

      • BilIN and GBAC of GB-S7 have some shared mutations. Why do the authors conclude that BilIN and GBAC have distinct origins? Do the authors think that those shared mutations are germline mosaic mutations?

      Thank you for the important comment.

      1) We think that the BilIN and GBAC of the GB-S7 patient are tumors of different origins because BilIN and GBAC of the GB-S7 patient have different truncal mutations (Figure 1—figure supplement 2C). This is a markedly different feature compared to BilIN and GBAC samples of other patients. We have added an explanation for this part to the Results section (Page 6 Line 9-11).

      2) We do not think that mosaicism occurred at the developmental stage. In addition, although some mutations were identified from both BilIN and GBAC, we cannot determine their importance because either one of the lesions had a very low VAF ranging from 0.001 to 0.04. If the mosaicism occurred only in the GB at the developmental stage, the VAF values of the shared mutations should be much higher than the current values, and the VAF values of the two BilIN and GBAC lesions should be similar.

      • Was the copy number profile compared between BilIN and GBAC?

      Thank you for the constructive comment. In this study, we obtained allele-specific copy numbers using Control-FREEC version 11.5 (Boeva V et al., Bioinformatics. 2012). The copy number of the mutations in the GB-S8 patient's BilIN could not be estimated by Control-FREEC due to low tumor purity (0.03). In the case of GB-S7, BilIN and GBAC were determined to be of different tumor origins and were therefore excluded from the analysis.

    1. Author Response

      Reviewer #1 (Public Review):

      It's here where my very mild (I truly liked this article - it is well done, well written, and creative) comments arise. The implications for stochastic strategies immediately emerge in the early results - bimodal strategies come about from the introduction of two variables. There is not enough credence given to the field of stochastic behavior in the introduction - the introduction focuses too much on previous models of predator-prey interaction, and in fact, Figure 1, which should set up the main arguments of the article, shows a model that is only slightly different (slight predator adjustment) that is eventually only addressed in the Appendix (see below). The question of "how and when do stochastic strategies emerge?" is a big deal. Figure 1 should set up a dichotomy: optimal strategies are available (i.e., those that minimize Tdiff) which would predict a single unimodal strategy. Many studies often advocate for Bayesian optimal behavior, but multimodal strategies are the reality in this study - why? Because if you consider the finite attack distance and inability of fish to evoke maximum velocity escapes while turning, it actually IS optimal. That's the main point I think of the article and why it's a broadly important piece of work. Further framing within the field of stochastic strategies (i.e., stochastic resonance) could be done in the introduction.

      We appreciate the comment provided by the reviewer. We changed the second paragraph of the introduction so as to focus more on the protean tactic (stochasticity). We added a new figure (Figure 1 in the new version) to conceptually show the escape trajectories (ETs) of a pure optimal tactic, a pure protean tactic, a combination of optimal and protean tactics, and an empirically observed multimodal pattern. We explained each tactic and described that the combination of the optimal and protean tactics still cannot explain the empirically observed multiple preferred ETs.

      The revised paragraph (L49-66) is as follows: Two different escape tactics (and their combination) have been proposed to enhance the success of predator evasion [16, 17]: the optimal tactic (deterministic), which maximizes the distance between the prey and the predator (Figure 1A) [4, 14, 15, 18], and the protean tactic (stochastic), which maximizes unpredictability to prevent predators from adjusting their strike trajectories accordingly (Figure 1B) [19-22]. Previous geometric models, which formulate optimal tactics, predict a single ET that depends on the relative speeds of the predator and the prey [4, 14, 15, 18], and additionally, predator’s turning radii and sensory-motor delay in situations where the predator can adjust its strike path [23-25]. The combination of the optimal tactic (formulated by previous geometric models), which predicts a specific single ET, and the protean tactic, which predicts variability, can explain the ET variability within a limited angular sector that includes the optimal ET (Figure 1C). However, the combination of the two tactics cannot explain the complex ET distributions reported in empirical studies on various taxa of invertebrates and lower vertebrates (reviewed in [26]). Whereas some animals exhibit unimodal ET patterns that satisfy the prediction of the combined tactics or optimal tactic with behavioral imprecision (e.g., [27]), many animal species show multimodal ETs within a limited angular sector (esp., 90–180°) (Figure 1D) (e.g., [4, 5, 28]). To explore the discrepancy between the predictions of the models and empirical data, some researchers have hypothesized mechanical/sensory constraints [17, 29]; however, the reasons why certain animal species prefer specific multiple ETs remain unclear.

      All experiments are well controlled (I especially liked the control where you varied the cutoff distance given that it is so critical to the model). Some of the figures require more labeling and the main marquee Figure 1 needs an overhaul because (1) the predator adjustment model that is only addressed in the Appendix shouldn't be central to the main introductory figure - it's the equivalent of the models/situations in Figure 6, and probably shouldn't take up too much space in the introductory text either (2) the drawing containing the model variables could be more clear and illustrative.

      (1) According to this comment and comment #11 from reviewer #2, we moved the two panels in the figure (Figure 1B and D in the original version) to Appendix-figure 1, and accordingly, we changed the first paragraph of the Model section so as to clearly describe that we focus on Domenici’s model in this study (L103-108).

      As for Figure 6 (Figure 7 in the new version) and related parts, we tempered our claims to clearly describe that our model has only the potential to explain the different patterns of escape trajectories observed in previous works. We would like to keep this figure in the main text because it is fundamental to explain the potential applicability of our model to other predator-prey systems.

      (2) To alleviate the burden for readers, we added the model variables to the figure and color-coded them (Figure 2B in the new version).

      Finally, I think a major question could be posed in the article's future recommendations: Is there some threshold for predator learning that the fish's specific distribution of optimal vs. suboptimal choice prevents from happening? That is, the suboptimal choice is performed in proportion to its ability to differentiate Tdiff. This is "bimodal" in a sense, but a probabilistic description of the distribution (e.g., a bernoulli with p proportional to beta) would be really beneficial. Because prey capture is a zero-sum game, the predator will develop new strategies that sometimes allow it to win. It would be interesting if eventually the bernoulli description could be run via a sampler to an actual predator using a prey dummy; one could show that the predator eventually learns the pattern if the bernoulli for choosing optimal escape is set too high, and the prey has balanced its choice of optimal vs. suboptimal to circumvent predator learning.

      We thank the reviewer for this constructive comment. Actually, we are now developing a dummy prey system. We added the following sentence in the Discussion to mention future research.

      The added sentence (L496-499): Further research using a real predator and dummy prey (e.g., [48]) controlled to escape toward an optimal or suboptimal ET with specific probabilities would be beneficial to understand how the prey balances the optimal and suboptimal ETs to circumvent predator learning.
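      The Bernoulli description suggested by the reviewer can be sketched as follows. This is only an illustration: the function name, the two angles, and the probability values are hypothetical placeholders, not parameters taken from the manuscript or from the dummy prey system under development.

```python
import random


def sample_escape_trajectories(p_optimal, n_trials,
                               optimal_et=140.0, suboptimal_et=100.0, seed=0):
    """Draw one escape trajectory (ET, in degrees) per trial.

    Each trial is a Bernoulli draw: the optimal ET is chosen with
    probability p_optimal, otherwise the suboptimal ET is chosen.
    The default angles are hypothetical placeholders.
    """
    if not 0.0 <= p_optimal <= 1.0:
        raise ValueError("p_optimal must lie in [0, 1]")
    rng = random.Random(seed)  # seeded so a trial sequence can be replayed
    return [optimal_et if rng.random() < p_optimal else suboptimal_et
            for _ in range(n_trials)]


def fraction_optimal(trajectories, optimal_et=140.0):
    """Return the observed share of trials that used the optimal ET."""
    return sum(1 for et in trajectories if et == optimal_et) / len(trajectories)
```

      Running the sampler with different values of `p_optimal` would give the sequences a dummy prey could be driven with, so that predator learning could be compared across probabilities.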

      Reviewer #2 (Public Review):

      First, it is unclear how the dummy predator is actuated. The description in the Methods section does not clearly address how rubber bands are used for this purpose.

      To clearly mention how the dummy predator was actuated by rubber bands, we added a figure (Figure 3-figure supplement 3B) and the following sentences.

      The added sentences (L608-611): The dummy predator was held in place by a metal pipe anchored to a four-wheel dolly, which is connected to a fixed metal frame via two plastic rubber bands (Figure 3—figure supplement 3B). The wheel dolly was drawn back to provide power for the dummy predator to strike toward the prey.

      Second, the predator's speed, which previous research has identified as a critical factor during predator-prey interactions, is not measured from the motion of the dummy predator in the experiments. Instead, it is estimated using an optimization algorithm that utilizes the mathematical model and the prey-specific parameters. It is unclear why the authors chose this method over measuring velocity from their experiments. Since the prey fish are responding to a dummy predator moving toward them at a particular speed during the interaction, it is important to measure the speed of the predator or clearly explain why estimating it using an optimization procedure is more appropriate.

      We chose this method (optimizing predator speed from the prey’s viewpoint) because there was no significant effect of predator speed on the escape trajectory in our experiment (L203-208). In other words, we considered that, at least in our case, the prey did not change the escape trajectory in response to the predator speed, and thus it would be more appropriate to use a specific predator speed estimated through an optimization algorithm from the prey’s point of view. It may be appropriate to use measured predator speed in other cases where the prey adjusts the escape trajectory in response to the change in predator speed. Therefore, we conducted a further analysis using actual predator speeds (both the predator speed at the onset of escape response, and the mean speed for the predator to cover the distance between the predator and prey). The results show that the model fit became worse when using measured predator speed per trial compared to the model using the fixed predator speed estimated through the optimization procedure (Table 3—source data 1; Figure 5—figure supplement 1). We added the above explanation in L219-226.

      One of the major claims of this article is that the model can explain escape trajectories observed in other predator-prey systems (presented in Figure 6). Figure 6 panels A-C show the escape responses of different prey in response to some threatening stimuli. Further, panels D-F suggest that the empirical data can be predicted with the model. But the modeling parameters used to produce the escape trajectories in D-F are derived from the authors' experiments with fish, instead of the experiments with the species shown in panels A-C.

      We thank the reviewer for this comment. We agree that this part in the previous version was an over-interpretation. Therefore, we tempered our statements to simply suggest that our approach has the potential to explain multiple ETs observed in other taxa. The revised sentences are as follows.

      Abstract (L27-30): By changing the parameters of the same model within a realistic range, we were able to produce various patterns of ETs empirically observed in other species (e.g., insects and frogs): a single preferred ET and multiple preferred ETs at small (20–50°) and large (150–180°) angles from the predator.

      Results (L395-407): Potential application of the model to other ET patterns. ...(snip)... To investigate whether our geometric model has the potential to explain these different ET patterns, we changed the values of model parameters (e.g., Upred, Dattack) within a realistic range, and explored whether such adjustments can produce the ET patterns observed in the original work. ...(snip)... These results indicate that our model has the potential to explain various patterns of observed animal escape trajectories.

      Discussion (L538-548): We show that our model has the potential to explain other empirically observed ET patterns (Figure 7). ...(snip)... Further research measuring the escape response in various species and applying the data to our geometric model is required to verify the applicability of our geometric model to various predator-prey systems.

    1. i think so like in social terms the conservatives would say well i like that it benefits from the wisdom of math already invented you're not 00:36:39 throwing anything away you're not you're not throwing it all away and starting over you're taking what we already have and you're you're using it that's great and a libertarian might say i really like that you're free to create as you see fit you can make anything you 00:36:52 want and you're working within this background framework that's minimally invasive it doesn't make a lot of rules for you but it is highly functional i like that it kind of keeps everyone in line while 00:37:03 like satisfying some formal contracts or something while still being uh i'm still free to create and a progressive might say i like about category that theory that everyone can contribute to 00:37:15 making their own world making it more rich adding new ideas uh making it more meaningful understanding connections between things a modern viewpoint would say i like that 00:37:26 it's completely rigorous that it's been used in proving well-known conjectures that people thought were important to prove but also that it's interesting it's useful in science and technology and a postmodern person might say i like 00:37:40 that um that no perspective is right that that there's just all sorts of different categories but that navigating between these perspectives lets you look at problems from all sides or a hippie might say i like that it's 00:37:53 all about relationship and connection or irrelevant i don't know what that means maybe a practical person might say that i like that it's that we can actually use it to organize and learn from big data in 00:38:06 today's world or to manage complexity of software projects that are that are very large and changing all the time i like that you can think about ai and other complex systems with this stuff i think it's relevant and 00:38:19 practical for right now so that's that's my uh tutorial or 
that's the the part i'm going to record and now i'm going to open it up for questions

      David Spivak discusses how category theory may appeal to different political ideologies for a variety of reasons.

    1. Kevin Flowers (Nov 7th at 12:50 PM):
       # Question about replies
       Forgive me a bit if this is the wrong place to ask, but is the feature of having Hypothes.is list replies somewhere on the roadmap? I checked the github issues with "label:enhancement" but nothing matches what I'm wondering about.
       I could be missing something obvious, but when I search my username in https://hypothes.is/users, none of the replies I've made on other people's public annotations show up.
       # Use cases
       Sometimes people have insightful observations and references they provide, so I tend to reply to those annotations with tags that I use to sort through (eg, tags like "to read", "how to", "tutorial", and so forth).
       I also tend to make comments on what the OP's annotation made me think of at the time of reading it, which is exemplified in the attached screenshot (image.png).

       9 replies

       Michael DiRoberts (7 days ago): @Kevin Flowers You’re right, the Activity Page (https://hypothes.is) doesn’t show replies. The Notebook, which will be built out more with time, does. https://web.hypothes.is/help/how-to-preview-the-hypothesis-notebook/

       Michael DiRoberts (7 days ago): I hope Notebook solves the issue for you! For now it’s going to work on private groups and not the Public group (due to it having a limit of 5,000 annotations), though that may change in the future.

       Michael DiRoberts (7 days ago): If you’re comfortable using APIs then you might check out our API as well: https://h.readthedocs.io/en/latest/api-reference/v1/. You can find replies by looking at rows that contain references.

       Kevin Flowers (7 days ago): Oh, the Notebook seems like a neat tool, I'll have to share that with some friends.

       Kevin Flowers (7 days ago): The issue for my own PKM (personal knowledge management) stack is that I couple Hypothes.is with an Obsidian [1] plugin that imports my annotations into my local file system. Atm, I think the plugin only references the Activity Page to import annotations, so it looks like I'll have to play around with the API you mentioned if I want to grab my replies (along with their parent replies & annotations).
       [1] Obsidian is a notetaking software similar to Roam & Logseq; it just adds a pretty GUI on top of .md files which are stored locally.

       Michael DiRoberts (7 days ago): Note that the Obsidian plugin wasn’t made by us, so I’m not familiar with how it works. It’s a little weird to me that it would work over the activity page and not use our API, however.

       Brian Cordan Young (7 days ago): @Kevin Flowers Do you have, or have you considered, blogging about your use of Hypothesis as a part of a PKM? I’m still not a regular user of Hypothesis because it doesn’t fit in to my current info consumption well enough. That said I love learning how others do fit it in. (Obsidian is really great too) (edited)

       Kevin Flowers (7 days ago): @Michael DiRoberts ah, you're right, thanks for mentioning that. Looks like it requires one to generate an API token in order to pull highlights, so it must be using the Hypothes.is API in some way. Sadly, I'm not familiar enough with general software development design (or JavaScript/TypeScript), and the source code for obsidian-hypothesis-plugin doesn't have enough high level comments for me to parse what any given file does. It'll probably be cumbersome and somewhat painful, but I'll probably learn more by just building something from scratch.
       @Brian Cordan Young Huh, I hadn't considered that until you mentioned it. Recently developed some interest in building something with JavaScript (probably with the Next.js framework), so a blog might be just the project I've been looking for.
       https://github.com/weichenw/obsidian-hypothesis-plugin/tree/master/src

       Michael DiRoberts (7 days ago): Just in case, or for others in the future, you can generate a Hypothesis API token here: https://hypothes.is/account/developer

      This is a post I made on the Slack public channel asking about whether or not Hypothes.is indexes replies. A tech support member confirmed this is true.

      However, Obsidian's Hypothes.is plugin does pull replies. It should be noted that default settings don't capture updates to the annotations or tags.
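      The tip from the thread, finding replies by looking at rows that contain references, could be tried with a small script along these lines. It is a minimal sketch, not the plugin's or Hypothesis's own code: the helper names are made up for illustration, it assumes the search endpoint of the Hypothesis API returns a JSON object with a `rows` list, and it assumes a reply is any row whose `references` list is non-empty. The token is treated as optional on the assumption that public annotations can be read without one. Pagination is not handled.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.hypothes.is/api"


def build_search_url(username, limit=50):
    """Build a search URL for one user's annotations (hypothetical helper)."""
    params = {"user": f"acct:{username}@hypothes.is", "limit": limit}
    return f"{API_ROOT}/search?{urllib.parse.urlencode(params)}"


def is_reply(annotation):
    """Treat a row as a reply when it carries a non-empty `references` list."""
    return bool(annotation.get("references"))


def fetch_replies(username, token=None, limit=50):
    """Fetch one page of a user's annotations and keep only the replies."""
    request = urllib.request.Request(build_search_url(username, limit))
    if token:
        # Token generated at https://hypothes.is/account/developer
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        rows = json.load(response)["rows"]
    return [row for row in rows if is_reply(row)]
```

      Calling `fetch_replies("your-username", token="...")` would then return only the rows that are replies, which could be written out as .md files alongside the annotations the plugin already imports.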

    1. Educational policy has placed teachers in a precarious corner of needing to address the ongoing needs and questions in their classrooms while also navigating worries that administrators, parents, and observers may see these efforts as indoctrination.

      This made me think about the meaning of hidden curriculum. How are we ensuring, as educators, that we are addressing state standards while also addressing our students' social and emotional needs? This also makes me think about how we can intertwine transformative healing practices into our everyday curriculum.

    1. Author Response

      Reviewer #2 (Public Review):

      Grasses develop morphologically unique stomata for efficient gas exchange. A key feature of stomata is the subsidiary cell (SC), which laterally flanks the guard cell (GC). Although it has been shown that the lateral SC contributes to rapid stomatal opening and closing, little is known about how the SC is generated from the subsidiary mother cell (SMC) and how the SMC acquires its intracellular polarity. The authors identified BdPOLAR as a polarity factor that forms a polarity domain in the SMC in a BdPAN1-dependent manner. They concluded that BdPAN1 and BdPOLAR exhibit mutually exclusive localization patterns within SMCs and that formative SC division requires both. Further mutant analysis showed that BdPAN1 and BdPOLAR act in SMC nuclear migration and the proper placement of the cortical division site marker BdTANGLED1, respectively. This study reveals a unique developmental process of grass stomata, where two opposing polarity factors form domains in the SMC and ensure asymmetric cell division and SC generation.

      The findings of this study, if further validated, are novel and interesting. However, I feel that the data presented in the current manuscript do not fully support some crucial conclusions. The lack of dual-color images is the weakest point of this study. If it is technically impossible to add them, alternative analyses are needed to validate the main conclusions.

      1) Is BdPOLAR-mVenus functional? Although the authors interpret that weak BdPOLAR-mVenus expression partially rescued the bdpolar mutant phenotype in Fig. S4D, the localization pattern visualized by BdPOLAR-mVenus may not be completely reliable with this partial rescue activity.

      This is indeed a valid point. The partial complementation of weakly expressing translational reporters (Figure 3–figure supplement 1D) and the weak effect of BdPOLAR-mVenus overexpression lines (Figure 3–figure supplement 1J) at least suggest partial functionality which is strongly dependent on dosage. Yet the localization pattern and the temporal dynamics might indeed not fully reflect the spatiotemporal dynamics of the endogenous BdPOLAR. This criticism is, however, true for any transgenic reporter line–even when fully complementing–as the requirement for dosage, stability, and turnover likely varies strongly between different protein classes and functions.

      Nonetheless, we have added a sentence on p. 7, which mentions this potential caveat.

      2) Regardless of the functionality of the tagged protein, the authors need to provide more information on their localization. For example, is there a difference in polarity pattern depending on expression level? Does overexpressed BdPOLAR-mVenus invade the BdPAN1 zone? In such cases, might the loss of BdPOLAR polarity in the bdpan1 mutant be a side effect of overexpression, not PAN1 exclusion? Does BdPOLAR expression (no tag) show a dose-dependent effect, similar to the mVenus-tagged protein?

      The difference in polarity patterns in bdpan1 mutants and wild-type does not depend on expression level. BdPOLAR-mVenus was crossed into bdpan1 and mutant and wild-type siblings in the F2 generation were analyzed. This means that the data presented in Fig. 3E and F show exactly the same transgene insertion line in wt and bdpan1 and were imaged with the same setting for comparability. Therefore, the difference in localization is not due to different expression levels but indeed reflects a PAN1-dependent effect.

      To address if BdPOLAR without a tag is also sensitive to dosage, we have generated an untagged complementation line that includes the untagged, genomic locus of BdPOLAR including promoter (-3.1kb) and terminator (+1.1kb). Yet, even though this construct is much better at rescuing the mutant, we still see remaining defects in T0 lines (Figure 3–figure supplement 1K) suggesting that even without a tag we cannot fully recapitulate wild-type functionality. Yet, to actually measure protein levels of untagged BdPOLAR, we would need to raise an antibody against BdPOLAR, which we think is clearly out of the scope of this study.

      3) A major conclusion of this study was that the polarity domains of BdPOLAR and BdPAN1 are mutually exclusive. However, not all the cells in the figures were consistent with this statement. For example, the BdPOLAR signals at the GMC/SMC interface appear to match BdPAN1 localization (compare 0:03 s in Video 1 and 0:20 s in Video 2 [top cell]). The 3D rendered image in Fig. 2F shows that BdPOLAR is excluded near the GMC on the front side of the SMC, where BdPAN1 is not localized. Some cells did not exhibit polarization (Fig. 3A, bottom left; Fig. 3E, bottom left). The most convincing data are the dual-color images of these two proteins. Otherwise, a sophisticated image analysis is required to support this conclusion.

      We agree that dual-color image analysis would have provided the most convincing data. As mentioned in our answers to the reviewing editor and reviewer 1, we have generated a dual marker line (BdPAN1p:BdPAN1-CFP; BdPOLARp:BdPOLAR-mCitrine), yet the BdPAN1-CFP signal (compared to mCitrine signal) was too weak to visualize the proximal BdPAN1 domain.

      This issue was also raised by reviewer 1 and deemed an essential revision. To determine how BdPOLAR and BdPAN1 relate spatially to each other, we have added data in Figure 2E where we manually traced mature SMC outlines to determine BdPOLAR-mVenus and BdPAN1-mCitrine occupancy along the SMC’s circumference. This confirmed that the polarization is indeed opposite yet not perfectly reciprocal (see details above, Essential Revisions #1).

      Finally, we realized that the 3D image renderings were more confusing than helpful and we removed them from the revised version.

      4) Another central conclusion was that BdPOLAR was excluded at the future SC division site, marked with BdTANGLED1. However, these data are also not very convincing, as such specific exclusion cannot be seen in some figure panels (e.g., Fig. 3A, bottom left; Fig. 3E, all three cells on the left). If dual-color imaging is not feasible, a quantitative image analysis is needed to support this conclusion.

      As for point 3, this was also criticized by reviewer 1 and deemed an essential revision by the reviewing editor.

      To determine whether the absence of BdPOLAR signal and the presence of BdTAN1 signal colocalize, we again manually traced mature SMC outlines to determine BdPOLAR-mVenus and BdTAN1-mCitrine occupancy along the SMC’s circumference. We plotted the relative average fluorescence intensity in Figure 4G-I nicely showing that BdTAN1 indeed resides in the BdPOLAR gaps above and below the GMC (again, details above, Essential Revisions #2).

      5) I could not find detailed imaging conditions and data processing methods. Are Figs. 2B and 2E max-projection or single-plane images? If they are single-plane images, which planes of the SMC are observed? In addition, how were Figs. 2C and 2F rendered? (e.g., number of images, distance intervals, processing procedures). This information is important for data interpretations.

      We agree that we might not have provided sufficient imaging condition details and have added more details regarding image acquisition in the method part (p. 20). We always use a consistent depth and show the midplane of SMCs. As mentioned above, we removed Figs. 2C and 2F and the supplemental movies as these data did not seem to be helpful.

      6) [Minor point] The authors should clearly describe where BdPAN1 is expressed and localized. Is it expressed in the GMC and localized at the GMC/SMC interface? Alternatively, is it expressed and localized in the SMC?

      BdPAN1 is expressed throughout the epidermis but starts to strongly accumulate at the GMC/SMC interface. According to the literature (Cartwright et al 2009 with immunostainings against ZmPAN1 and Sutimantanapi et al. 2014 with PAN1 and PAN2 reporter) and our own observations (Fig. S3), this accumulation occurs in the SMC rather than in the GMC. In Fig. S3A, third panel, second GMC from the top, for example, one can see that the early PAN1 polarity domain expands beyond the GMC/SMC interface suggesting that it is indeed forming in SMCs rather than in GMCs. We have specified this in the text more clearly now (p. 5).

    1. Author Response

      Reviewer #1 (Public Review):

      The research investigates the genetic basis for resistance to high CO2 levels in the human pathogenic fungus Cryptococcus neoformans. Screening collections of over 5,000 gene deletion strains revealed 96 with impaired growth, including a set of genes all related to the same RAM signaling pathway. Further genetic dissection was able compellingly to place where this pathway lies relative to upstream inputs and through the isolation of suppressor mutants as potential downstream targets of the pathway. Given the high levels of CO2 encountered by fungi in the human host, this work may provide new directions for the control of disseminated fungal disease.

      The research presents both strengths and weaknesses.

      Strengths include:

      (1) One of the largest scale analyses of genes involved in growth under high CO2 concentrations in a fungus, revealing a set of just under 100 mutants with impaired growth.

      (2) Elegant genetic epistasis analysis to show where different components fit within a pathway of transmission of CO2 exposure. For example, overexpression of one of the kinases, Cbk1, can overcome the CO2-sensitivity of mutations in the CDC24 or CNA1 genes (but not in the reciprocal overexpression direction).

      (3) Isolation of suppressor mutations in the cbk1 background, now able to grow at high CO2 levels, was able to lead to the identification of two genes. Follow up characterization, which included examining in vitro phenotypes, gene expression analysis, and impact during mouse infection was able to reveal that the two suppressors restore a subset of the phenotypes impacted by mutation of CBK1. Indeed, one conclusion from this careful work is that the reduced virulence of the cbk1 mutant is not due to its sensitivity to high levels of CO2, perhaps an unexpected finding given the original goals of the study towards linking CO2 sensitivity with decreased virulence.

      Weaknesses include:

      (1) What is the rationale for examining gene expression using the NanoString technology of 118 genes rather than a more genome-wide approach such as RNA-sequencing?

      (2) Without additional species examined, some of the conclusions about differences in impact between ascomycetes and basidiomycetes might instead reflect differences between species. For example, RAM mutants in other strains of C. neoformans do not exhibit so strong a temperature sensitive phenotype. Or to extend the comparison further, one might assume given the use of CO2 for Drosophila manipulations that the RAM pathway components in an insect would not be required for surviving high CO2.

      (3) Given the relative ease of generating progeny of this species, it would have been informative to explore if the suppressors of cbk1 also suppressed the loss of genes like CDC24, CNA1, etc, equivalent to the experiment performed of overexpression of CBK1 in those backgrounds.

      We thank the reviewer for the kind summary of our work and the highlights of the major findings. We chose NanoString because we have already generated a probe set of 118 genes that are differentially expressed in response to CO2 based on RNA-seq profiles of multiple natural cryptococcal isolates in a separate study. NanoString allowed us to focus on CO2-relevant transcripts and do multiple replicates and conditions in a way that is not practical using RNA-seq.

      Although the RAM pathway has not been extensively characterized in different species of Cryptococcus, we do know that RAM pathway mutants lead to pseudohyphal growth in multiple strain backgrounds including two different species of Cryptococcus (Magditch, Liu, Xue, & Idnurm, 2012; Walton, Heitman, & Idnurm, 2006). We have added corresponding references and discussed this point on lines 167-169.

      We agree with the reviewer that it would be interesting to test the effects of the cbk1Δ suppressor mutations in the backgrounds of other CO2-sensitive gene knockout strains. This is part of our plan for future investigation in characterizing the signaling pathways involved in CO2 tolerance.

      Reviewer #2 (Public Review):

      In the paper by Chadwick et al., the authors identify the molecular determinants of CO2 tolerance in the human fungal pathogen Cryptococcus neoformans. The authors have screened a collection of deletion mutants to identify the genes that are sensitive at 37°C (host temperature) and elevated CO2 levels. The authors identified that the genes responsible for CO2 sensitivity are involved in the pathways responsible for thermotolerance mechanisms such as Calcineurin, Ras1-Cdc24, cell wall integrity, and the Regulator of Ace2 and Morphogenesis (RAM) pathways. Moreover, they identified that the mutants of the RAM pathway effector kinase Cbk1 were most sensitive to elevated temperature and CO2 levels. This study uncovers the previously unknown role of the RAM pathway in CO2 tolerance. Transcriptome data indicates that the deletion of CBK1 results in an alteration in the expression of CO2-related genes. To identify the potential downstream targets of Cbk1, the authors performed a suppressor screen and obtained the spontaneous suppressor mutants that rescued the sensitivity of cbk1 mutants to elevated temperature and CO2. Through this screen, the authors identified two suppressor groups that showed a modest improvement in growth at 37°C and in the presence of CO2.

      Interestingly, from the suppressor screen, the authors identified a previously known interactor of Cbk1 which is SSD1, and an uncharacterized gene containing a putative Poly(A)-specific ribonuclease (PARN) domain named PSC1 (Partial Suppressor of cbk1Δ) which acts downstream of Cbk1. Deletion of these two genes in cbk1 null mutants rescued the sensitivity to elevated CO2 levels and temperature but did not fully rescue the ability to cause disease in mice.

      This study highlights the underappreciated role of the host CO2 tolerance and its importance in the ability of a fungal pathogen to survive and cause disease in host conditions. The authors claim to gain insight into the genetic components associated with carbon dioxide tolerance. The experimental results including the data presented, and conclusions drawn do justice to this claim. Overall, it is a well-written manuscript. However, some sections need improvement in terms of clarity and experimental design.

      • One major drawback of the study is the virulence assay performed to test the ability of cbk1 mutants to cause the disease in the mouse model. The cbk1 null mutants are thermosensitive in nature. Using these mutants, establishing the virulence attributes in mice would undermine the mutants' ability to infect mice as they won't be able to survive at the host body temperature.

      • The rationale for choosing the genes to test further is not clear in two instances in the study. a) From a list of 96 genes, how do the authors infer the pathways involved? Was any pathway analysis performed that helped them in shortlisting the pathways that they subsequently tested? A GO term analysis of the list of genes identified through the genetic screen would be more helpful to get an overview of the pathways involved in CO2 tolerance. b) The authors do not clearly mention why they chose only four genes to test for the CO2 sensitivity out of 16 downregulated genes identified from the NanoString analysis.

      • It would be more useful to the readers if the authors could also include a thorough analysis of the presence of the putative PARN domain-containing protein across various fungal species rather than mentioning that it is only observed in C. neoformans and S. pombe. Also, the authors may want to discuss the known role(s) of SSD1, if any, in pathogenic ascomycetous yeasts so that the proposed functional divergence is supported further.

      We are glad that the reviewer appreciated the approach, the findings, and the significance of this research, and we are grateful for the helpful suggestions to improve the manuscript.

      To remove temperature sensitivity as a variable when testing virulence, we have added a new infection model in the revised manuscript to test the cbk1Δ mutant and its suppressors. This infection model uses the Galleria mellonella larvae as a host. G. mellonella larvae are commonly used to test virulence for temperature sensitive strains as the body temperature of the larvae is similar to that of the environment. We performed cryptococcal infection in this model and the larvae were kept at 30°C rather than at 37°C. The results of these experiments are now described in results section 5 and shown in Figure 6 of the manuscript. The data using the larva-infection model supports our original conclusion about the virulence of these strains observed in mouse models.

      We performed a GO term analysis of the hits from our screening, but did not find any significant or outstanding pathways. From our list of 96 genes, we chose to focus on the RAM pathway because the mutants were among the most sensitive to CO2. We have added an explanation for the genes we decided to test for host CO2 level sensitivity from the 16 downregulated genes on lines 139-141.

      Through Blast searching, we have found that the PARN domain-containing protein has homologs in other basidiomycetes. There might be some homologs in a few zygomycetes and ascomycetes, but the confidence scores were so low that we deemed them unlikely to be true homologs. We now report this in the manuscript on lines 210-213, “This domain was previously reported to be found in S. pombe (Marasovic, Zocco, & Halic, 2013). Interestingly, through a Blast search of the PARN domain, we did not identify this domain in the genomes of S. cerevisiae, C. albicans or other ascomycetes, but found it in Basidiomycetes and higher eukaryotes”.

      Ssd1 has been studied in the pathogenic yeast Candida albicans and is also regulated by Cbk1 in this organism. We have added a discussion about possible functions of Ssd1 in C. neoformans based on references to studies in C. albicans in the discussion section on lines 401-408. “In C. albicans, Ssd1 plays an important role in polarized growth and hyphal initiation by negatively regulating the transcription factor Nrg1 (H. J. Lee, Kim, Kang, Yang, & Kim, 2015). The observation that cbk1Δpsc1Δ and cbk1Δssd1Δ suppressor mutants partially rescue cell separation defects or depolarized growth suggests that C. neoformans may primarily utilize Ssd1/Psc1 rather than a potential Ace2 homolog to regulate cell separation or polarization. Differential regulation of target mRNA transcripts by Ssd1 and Psc1 may explain the functional divergence of the RAM pathway we observed here between basidiomycete Cryptococcus and the ascomycete yeasts.”

      Reviewer #3 (Public Review):

      In this work the authors identify genes and pathways important for CO2 and thermotolerance in Cryptococcus neoformans. They additionally rule out the contribution of the bicarbonate or cAMP-dependent activation of adenylyl cyclase to this pathway, which is important for CO2 sensing in other fungi, further solidifying the need to characterize CO2 sensing in basidiomycetes. The authors establish the importance of focusing on CO2 tolerance by testing the impact of CO2 on fluconazole susceptibility with varied pH, suggesting the ability of CO2 to sensitize cryptococcal cells to fluconazole. Furthermore, the authors compared the CO2 tolerance of clinical reference strains to environmental isolates. The characterization of the RAM pathway Cbk1 kinase illustrated the integration of multiple stress signaling pathways. By using a series of CBK1OE insertions in strains with deletions in other pathways, the ability of Cbk1 over-expression to rescue several strains from CO2 sensitivity was apparent. Additionally, NanoString expression analysis comparing cbk1∆ to H99 validated the authors' screen of CO2-sensitive mutants as 16/57 downregulated genes were found in their screen, further confirming the interconnected nature of these pathways. The importance of the RAM pathway in maintaining CO2 and thermotolerance was also incredibly clear.

      Perhaps most interestingly, the authors identify suppressor colonies with distinctive phenotypes that allowed for the characterization of downstream effectors of the RAM pathway. These suppressor colonies were found to have mutations in SSD1 and PSC1 which somewhat restore growth at 37°C with CO2 exposure. Further confirming the importance of the RAM pathway, the cbk1∆ strain had markedly attenuated virulence during infection. Interestingly, the generated suppressor strains had varying impacts on fungal infection in vivo. While the sup1 suppressor was completely cleared from the lungs during both intranasal and IV infection, the sup2 strain, containing mutations in SSD1, maintained a high fungal load in the lungs and was able to disseminate into host tissues during IV infection but not intranasal infection.

      The authors make a strong case for the exploration of thermotolerance and CO2 tolerance as contributors to virulence. Through screening and characterization of RAM pathway kinase CBK1's ability to rescue other mutants from CO2 sensitivity, the overlapping contributions of several signaling pathways and the importance of this kinase were revealed. This work is important and will be valuable to the field. However, the cbk1∆ strain does show reduced melanization, urease secretion, and higher sensitivity to cell wall stressor Congo Red in SI Appendix, Figure S4. While the authors make a strong argument that these well-established virulence factors are not perfect predictors of virulence in vivo, the cbk1∆ strain is not an example of such a case as it does have defects in these important factors in addition to thermotolerance and CO2 tolerance. Not acknowledging the changes in these virulence factors in the cbk1∆ strain and their potential contribution to phenotypes observed is a weakness of the manuscript. Interestingly, the sup1 and sup2 strains also rescue these virulence factors compared to cbk1∆. Additionally, the assertion that "the observation that only sup2 can survive, amplify, and persist in animals stresses the importance of CO2 tolerance in cryptococcal pathogens" due to the sup2's slightly higher CO2 tolerance compared to sup1, could be better supported by the data. These suppressors did not restore transcript abundances of the differentially expressed genes to WT levels, suggesting post-transcriptional regulation. However, there may be differences in the ability of sup2 to resist stress better than sup1 especially given the known Ssd1 repression of transcript translation in S. cerevisiae. In addition, pH appears to impact the sup1 and sup2 strains' sensitivity to CO2 in SI Appendix Figure 4. This could be better explained and interrogated in the manuscript. Finally, this work includes a variety of genes in several signaling pathways. The paper would be greatly clarified by a graphical abstract indicating how CBK1 may be integrating these pathways or by indicating which genes belong to which pathways in the Figure 1 legend to make this figure easier to follow.

      We thank the reviewer for the thorough summary of the study. We appreciate the reviewer’s enthusiasm about this study as well as the constructive critiques of the manuscript. Indeed, the suppressor mutations in the cbk1Δ mutant rescue more phenotypes of cbk1Δ in vitro than just thermotolerance and CO2 tolerance (Supplemental Figure 5), which could benefit the survival of these suppressor strains in vivo compared to the original cbk1Δ mutant. However, between the sup1 and the sup2 mutants, the only clear difference in growth we observed was at host levels of CO2 and temperature. There was no obvious difference in their resistance to Congo red (cell wall stress), melanization, susceptibility to FK506 (calcineurin pathway inhibitor), sensitivity to H2O2 (ROS), or urease (Supplemental Figure 5). Nonetheless, we agree with the reviewer that there could be other factors that influence the outcome in vivo, given that the host environment is more complex than we know. We have changed our wording in the manuscript to make it clear that the contribution of better CO2 tolerance to the better survival of the sup2 mutant is only our hypothesis and that there could be other unrecognized contributing factors. “The only in vitro difference observed between sup1 and sup2 was better growth of sup2 at host CO2 levels which may explain the difference in their ability to propagate and persist in the mouse lungs. However, due to the complexity of the host environment, there could be other unrecognized factors contributing to their growth difference in vivo.” (Lines 276-279).

      About growth at different pH levels, C. neoformans tends to grow better at lower pH, closer to pH 5. This fungus can grow at pH 3, the lowest pH that our lab had tested (it may be able to sustain viability even at pH 2 based on others’ conference presentations). The high temperature/CO2 combined with neutral or high pH likely causes worse growth of both H99 and the mutants tested.

      We tried making a model to integrate all the pathways and factors identified in this work as the reviewer suggested. However, in this process, we found it difficult to propose a model. Although the current findings clearly demonstrate the importance of Cbk1 in thermotolerance and CO2 tolerance (overexpression of CBK1 can partially restore thermotolerance and/or CO2 tolerance in the mutants defective in the cell wall integrity pathway, the calcineurin pathway or the Cdc24-Ras1 pathway, and the reciprocal overexpression of these genes in the cbk1∆ mutant does not rescue any of the cbk1∆ mutant’s defects), we do not know the exact mechanisms underlying this phenomenon. Do these pathways directly interact with Cbk1, affect its phosphorylation status, or alter its subcellular localization? Or do these pathways act through some other messengers to indirectly activate Cbk1 or maybe Cbk1’s downstream targets? These are the questions that warrant further investigation in the future. To be prudent, we think it is better not to propose a model at this point given the uncertainty of the mechanism. The mutants belonging to each of the pathways are clearly specified in the text of this revised manuscript to help orient the readers. For example: “As the RAM pathway effector kinase mutant cbk1Δ showed the most severe defect in thermotolerance and CO2 tolerance compared to the mutants of the other pathways, we first overexpressed the gene CBK1 in the following mutants, cdc24∆ (Ras1-Cdc24), mpk1∆ (CWI), cna1∆ (Calcineurin), and the cbk1Δ mutant itself, and observed their growth at host temperature and host CO2 (Figure 2B)...”

    1. Author Response

      Public Evaluation Summary:

      The authors re-analyzed a previously published dataset and identified patterns suggesting that increased bacterial biodiversity in the gut may create new niches that lead to gene loss in a focal species and promote the generation of more diversity. Two limitations are (i) that sequencing depth may not be sufficient to analyze strain-level diversity and (ii) that the evidence is exclusively based on correlations, and the observed patterns could also be explained by other eco-evolutionary processes. The claims should be supported by a more detailed analysis, and alternative hypotheses that the results do not fully exclude should be discussed. Understanding drivers of diversity in natural microbial communities is an important question that is of central interest to biomedically oriented microbiome scientists, microbial ecologists and evolutionary biologists.

      We agree that understanding the drivers of diversity in natural communities is an important and challenging question to address. We believe that our analysis of metagenomes from the gut microbiomes is complementary to controlled laboratory experiments and modeling studies. While these other studies are better able to establish causal relationships, we rely on correlations – a caveat which we make clear, and offer different mechanistic explanations for the patterns we observe.

      We also mention the caveat that we are only able to measure sub-species genetic diversity in relatively abundant species with high sequencing depth in metagenomes. These relatively abundant species include dozens of species in two metagenomic datasets, and we see no reason why they would not generalize to other members of the microbiome. Nonetheless, further work will be required to extend our results to rarer species.

      Our revised manuscript includes two major new analyses. First, we extend the analysis of within-species nucleotide diversity to non-synonymous sites, with generally similar results. This suggests that evolutionarily older, less selectively constrained synonymous mutations and more recent non-synonymous mutations that affect protein structure both track similarly with measures of community diversity – with some subtle differences described in the manuscript.

      Second, we extend our analysis of dense time series data from one individual stool donor and one deeply covered species (B. vulgatus) to four donors and 15 species. This allowed us to reinforce the pattern of gene loss in more diverse communities with greater statistical support. Our correlational results are broadly consistent with the predictions of DBD from modeling and experimental studies, and they open up new lines of inquiry for microbiome scientists, ecologists, and evolutionary biologists.

      Reviewer #1 (Public Review):

      This paper makes an important contribution to the current debate on whether the diversity of a microbial community has a positive or negative effect on its own diversity at a later time point. In my view, the main contribution is linking the diversity-begets-diversity patterns, already observed by the same authors and others, to genomic signatures of gene loss that would be expected from the Black Queen Hypothesis, establishing an eco-evolutionary link. In addition, they test this hypothesis at a more fine-grained scale (strain-level variation and SNP) and do so in human microbiome data, which adds relevance from the biomedical standpoint. The paper is a well-written and rigorous analysis using state-of-the-art methods, and the results suggest multiple new experiments and testable hypotheses (see below), which is a very valuable contribution.

      We thank the reviewer for their generous comments.

      That being said, I do have some concerns that I believe should be addressed. First of all, I am wondering whether gene loss could also occur because of environmental selection that is independent of other organisms or the diversity of the community. An alternative hypothesis to the Black Queen is that there might have been a migration of new species from outside and then loss of genes could have occurred because of the nature of the abiotic environment in the new host, without relationship to the community diversity. Telling the difference between these two hypotheses is hard and would require extensive additional experiments, which I don't think is necessary. But I do think the authors should acknowledge and discuss this alternative possibility and adjust the wording of their claims accordingly.

      We concur with the reviewer that the drivers of the correlation between community diversity and gene loss are unclear. Therefore, we have now added the following text to the Discussion:

      “Here we report that genome reduction in the gut is higher in more diverse gut communities. This could be due to de novo gene loss, preferential establishment of migrant strains encoding fewer genes, or a combination of the two. The mechanisms underlying this correlation remain unclear and could be due to biotic interactions – including metabolic cross-feeding as posited by some models (Estrela et al., 2022; San Roman and Wagner, 2021, 2018) but not others (Good and Rosenfeld, 2022) – or due to unknown abiotic drivers of both community diversity and gene loss.”

      Additionally, we have revised Figure 1 to show that strain invasions/replacements, in addition to evolutionary change, could be an important driver of changes in intra-species diversity in the microbiome.

      Another issue is that gene loss is happening in some of the most abundant species in the gut. Under Black Queen though, we would expect these species to be most likely "donors" in cross-feeding interactions. Authors should also discuss the implications, limitations, and possible alternative hypotheses of this result, which I think also stimulates future work and experiments.

      We thank the reviewer for raising this point. It is unclear to us whether the more abundant species would be donors in cross-feeding interactions. If we understand correctly, the reviewer is suggesting that more abundant donors will contribute more total biomass of shared metabolites to the community. This idea makes sense under the assumption that the abundant species are involved in cross-feeding interactions in the first place, which may or may not be the case. As our work heavily relies on a dataset that we previously analyzed (HMP), we wish to cite Figure S20 in Garud, Good et al. 2019 PLoS Biology in which we found there are comparable rates of gene changes across the ~30 most abundant species analyzed in the HMP. This suggests that among the most abundant species analyzed, there is no relationship between their abundance and gene change rate.

      That being said, we acknowledge that our study is limited to the relatively abundant focal species and state now in the Discussion: “Deeper or more targeted sequencing may permit us to determine whether the same patterns hold for rarer members of the microbiome.”

      Regarding Figure 5B, there are a couple of questions I believe the authors should clarify. First, how is it possible that many species have close to 0 pathways? Second, besides the overall negative correlation, the data shows some very conspicuous regularities, e.g. many different "lines" of points with identical linear negative slope but different intercept. My guess is that this is due to some constraints in the pathway detection methods, but I struggle to understand it. I think the authors should discuss these patterns in more detail.

      We sincerely thank the reviewer for raising this issue, as it prompted us to investigate more deeply the patterns observed at the pathway level. In short, we decided to remove this analysis from the paper because of a number of bioinformatics issues that we realized were contributing to the signal. However, in support of BQH-like mechanisms at play, we do find evidence for gene loss in more diverse communities across multiple species in both the HMP and Poyet datasets. Below we detail our investigation into Figure 5B and how we arrived at the conclusion that it should be removed:

      (1) Regarding data points in Figure 5B where many focal species have “zero pathways”, we first clarify how we compute pathway presence and richness. Pathway abundance data per species were downloaded from the HMP1-2 database, and these pathway abundances were computed using HUMAnN (HMP Unified Metabolic Analysis Network). According to HUMAnN documentation, pathway abundance is proportional to the number of complete copies of the pathway in the community; this means that if at least one component reaction in a certain pathway is missing coverage (for a sample-species pair), the pathway abundance may be zero (note that HUMAnN also employs “gap filling” to allow no more than one required reaction to have zero abundance). As such, it is likely that insufficient coverage, especially for low-abundance species, causes many pathways to report zero abundance in many species in many samples. Indeed, 556 of the 649 species considered had zero “present” pathways (i.e. having nonzero abundance) in at least 400 of the 469 samples (see figure below).

      (2) We thank the reviewer for pointing out the “conspicuous regularities” in Figure 5B, particularly “parallel lines” of data points that we discovered are an artifact of the flawed way in which we computed “community pathway richness [excluding the focal species].” Each diagonal line of points corresponds to different species in the same sample, and because community pathway richness is computed as the total number of pathways [across all species in the sample] minus the number of pathways in the focal species, the current Figure 5B is really plotting y against X-y for each sample (where X is a sample’s total community pathway richness, and y is the pathway richness of an individual species in that sample). This computation fails to account for the possibility that a pathway in an excluded focal species will still be present in the community due to redundancy, and indeed BQH tests for whether this redundancy is kept low in diverse communities due to mechanisms such as gene loss.
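
      The artifact in point (2) can be reproduced with a few lines of arithmetic. The sketch below uses invented numbers and hypothetical names: for each sample it plots y (focal species richness) against X - y, and every pair of points from the same sample then has slope exactly -1, with the line's intercept equal to that sample's total X.

```python
def flawed_community_richness(sample_total, focal_richness):
    """The flawed definition: sample total minus the focal species' own richness."""
    return sample_total - focal_richness

# Hypothetical samples: total richness X and the richness y of three focal species.
samples = {
    "sample_1": (100, [5, 20, 40]),
    "sample_2": (80, [5, 20, 40]),
}

# Each point is (x, y) = (X - y, y), matching the axes described above.
points = {
    name: [(flawed_community_richness(total, y), y) for y in richness_values]
    for name, (total, richness_values) in samples.items()
}

def slope(p, q):
    """Slope of the straight line through two (x, y) points."""
    return (q[1] - p[1]) / (q[0] - p[0])

slopes = {
    name: [slope(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    for name, pts in points.items()
}
```

      Because x + y = X by construction, samples that differ in X give parallel lines that differ only in intercept, which is the pattern the reviewer noticed.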

      We attempted to instead plot community pathway richness defined as the number of unique pathways covered by all species other than the focal species. This is equivalent to [number of unique pathways across all species in a sample] minus the [number of pathways that are ONLY present in the focal species and not any other species in the sample]. However, when we recomputed community pathway richness this way, it is rare that a pathway is present in only one species in a sample. Moreover, we find that with the exception of E. coli, focal species pathway richness tended to be very similar across the 469 samples, often reaching an upper limit of focal species pathway richness observed. (It is unclear to what extent lower pathway richnesses are due to low species abundance/low sample coverage versus gene loss). This new plot reveals even more regularities and is difficult to interpret with respect to BQH. (Note that points are colored by species; the cluster of black dots with outlying high focal pathway richness corresponds to the “unclassified” stratum which can be considered a group of many different species.)
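
      The corrected definition above, and its stated equivalence to “total unique pathways minus pathways found only in the focal species”, can be checked with set operations. This is a sketch on invented pathway identifiers; the function names and the dictionary layout are assumptions made for the example.

```python
def pathways_in_others(focal, pathways_by_species):
    """Union of pathways detected in every species except the focal one."""
    others = set()
    for species, pathways in pathways_by_species.items():
        if species != focal:
            others |= pathways
    return others

def corrected_richness(focal, pathways_by_species):
    """Number of unique pathways covered by all species other than the focal one."""
    return len(pathways_in_others(focal, pathways_by_species))

def flawed_richness(focal, pathways_by_species):
    """Original definition: total unique pathways minus the focal species' count."""
    total = set().union(*pathways_by_species.values())
    return len(total) - len(pathways_by_species[focal])

# Hypothetical sample with three species and partly redundant pathways.
sample = {
    "species_A": {"p1", "p2", "p3"},
    "species_B": {"p2", "p3", "p4"},
    "species_C": {"p3", "p5"},
}

total_unique = set().union(*sample.values())
only_in_a = sample["species_A"] - pathways_in_others("species_A", sample)
```

      In this toy sample the corrected richness excluding species_A is 4, whereas the flawed definition gives 2, because pathways p2 and p3 remain in the community through other species.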

      Overall, because community pathway richness (excluding a focal species) seems to primarily vary with sample rather than focal species in this dataset when using the most simple/strict definition of community pathway richness as described above, it is difficult to probe the Black Queen Hypothesis using a plot like Figure 5B. As pointed out by reviewers, lack of sequencing depth to analyze strain-level diversity and accurately quantify pathway abundance, irrespective of species abundance, seems to be a major barrier to this analysis. As such, we have decided to remove Figure 5B from the paper and rewrite some of our conclusions accordingly.

      Finally, I also have some conceptual concerns regarding the genomic analysis. Namely, genes can be used for biosynthesis of e.g. building blocks, but also for consumption of nutrients. Under the Black Queen Hypothesis, we would expect the adaptive loss of biosynthetic genes, as those nutrients become provided by the community. However, for catabolic genes or pathways, I would expect the opposite pattern, i.e. the gain of catabolic genes that would allow taking advantage of a more rich environment resulting from a more diverse community (or at least, the absence of pathway loss). These two opposing forces for catabolic and biosynthetic genes/pathways might obscure the trends if all genes are pooled together for the analysis. I believe this can be easily checked with the data the authors already have, and could allow the authors to discuss more in detail the functional implications of the trends they see and possibly even make a stronger case for their claims.

      We thank the reviewer for their suggestion. As explained above, we have removed the pathway analysis from the paper due to technical reasons. However, we did investigate catabolic and biosynthetic pathways separately as suggested by the reviewer as we describe below:

      We obtained subsets of biosynthetic pathways and catabolic pathways by searching for keywords (such as “degradation” for catabolic) in the MetaCyc pathway database. After excluding the “unclassified” species stratum, we observe a total of 279 biosynthetic and 167 catabolic pathways present in the HMP1-2 pathway abundance dataset. Using the corrected definition of community pathway richness excluding a focal species, for each pathway type—either biosynthetic or catabolic—we plotted focal species pathway richness against community pathway richness including all pathways regardless of type:

      We observe the same problem where, within a sample, community pathway richness excluding the focal species hardly varies no matter which focal species it is, due to nearly all of its detected pathways being present in at least one other species; this makes the plots difficult to interpret.
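
      The keyword-based split into biosynthetic and catabolic pathways described above might look like the sketch below. Only the keyword “degradation” for catabolic pathways is taken from the text; the keyword “biosynthesis” and the example pathway names are assumptions made for illustration, and names matching both or neither keyword set are left unclassified.

```python
CATABOLIC_KEYWORDS = ("degradation",)    # keyword named in the text
BIOSYNTHETIC_KEYWORDS = ("biosynthesis",)  # assumed keyword, not from the text

def classify_pathway(name):
    """Label a pathway name as 'catabolic', 'biosynthetic' or 'other'."""
    lowered = name.lower()
    is_catabolic = any(k in lowered for k in CATABOLIC_KEYWORDS)
    is_biosynthetic = any(k in lowered for k in BIOSYNTHETIC_KEYWORDS)
    if is_catabolic and not is_biosynthetic:
        return "catabolic"
    if is_biosynthetic and not is_catabolic:
        return "biosynthetic"
    return "other"

# Illustrative pathway names only; not drawn from the analyzed dataset.
names = [
    "L-arginine biosynthesis I",
    "lactose degradation",
    "glycolysis",
]
labels = {name: classify_pathway(name) for name in names}
```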

      Reviewer #2 (Public Review):

      The authors re-analysed two previously published metagenomic datasets to test how diversity at the community level is associated with diversity at the strain level in the human gut microbiota. The overall idea was to test if the observed patterns would be in agreement with the "diversity begets diversity" (DBD) model, which states that more diversity creates more niches and thereby promotes further increase of diversity (here measured at the strain-level). The authors have previously shown evidence for DBD in microbiomes using a similar approach but focusing on 16S rRNA level diversity (which does not provide strain-level insights) and on microbiomes from diverse environments.

      One of the datasets analysed here is a subset of a cross-sectional cohort from the Human Microbiome Project. The other dataset comes from a single individual sampled longitudinally over 18 months. This second dataset allowed the authors to not only assess the links between different levels of diversity at single timepoints, but test if high diversity at a given timepoint is associated with increased strain-level diversity at future timepoints.

      Understanding eco-evolutionary dynamics of diversity in natural microbial communities is an important question that remains challenging to address. The paper is well-written and the detailed description of the methodological approaches and statistical analyses is exemplary. Most of the analyses carried out in this study seem to be technically sound.

      We thank the reviewer for their kind words, comments, and suggestions.

      The major limitation of this study comes with the fact that only correlations are presented, some of which are rather weak, contrast each other, or are based on a small number of data points. In addition, finding that diversity at a given taxonomic rank is associated with diversity within a given taxon is a pattern that can be explained by many different underlying processes, e.g. species-area relationships, nutrient (diet) diversity, stressor diversity, immigration rate, and niche creation by other microbes (i.e. DBD). Without experiments, it remains vague if DBD is the underlying process that acts in these communities based on the observed patterns.

      We thank the reviewer for their comments. First, regarding the issue of this being a correlative study, we now more clearly acknowledge that mechanistic studies (perhaps in experimental settings) are required to fully elucidate DBD and BQH dynamics. However, we note that our correlational study from natural communities is complementary to experimental and modeling studies, to test the extent to which their predictions hold in more complex, realistic settings. This is now mentioned throughout the manuscript, most explicitly at the end of the Introduction:

      “Although such analyses of natural diversity cannot fully control for unmeasured confounding environmental factors, they are an important complement to controlled experimental and theoretical studies which lack real-world complexity.”

      Second, to increase the number of data points analyzed in the Poyet study, we now include 15 species and four different hosts (new Figure 5). The association between community diversity and gene loss is now much more statistically robust, and consistent across the Poyet and HMP time series.

      Third, we acknowledge more clearly in the Discussion that other processes, including diet and other environmental factors can generate the DBD pattern. We also now stress more prominently the possibility that strain migration across hosts may be responsible for the patterns observed. For example, in Figure 1, we illustrate the possibility of strain migration generating the patterns we observe.

      Below we quote a paragraph that we have now added in the Discussion:

      "Second, we cannot establish causal relationships without controlled experiments. We are therefore careful to conclude that positive diversity slopes are consistent with the predictions of DBD, and negative slopes with EC, but unmeasured environmental drivers could be at play. For example, increased dietary diversity could simultaneously select for higher community diversity and also higher intra-species diversity. In our previous study, we found that positive diversity slopes persisted even after controlling for potential abiotic drivers such as pH and temperature (Madi et al., 2020), but a similar analysis was not possible here due to a lack of metadata. Neutral processes can account for several ecological patterns such as species-area relationships (Hubbell, 2001), and must be rejected in favor of niche-centric models like DBD or EC. Using neutral models without DBD or EC, we found generally flat or negative diversity slopes due to sampling processes alone and that positive slopes were hard to explain with a neutral model (Madi et al., 2020). These models were intended mainly for 16S rRNA gene sequence data, but we expect the general conclusions to extend to metagenomic data. Nevertheless, further modeling and experimental work will be required to fully exclude a neutral explanation for the diversity slopes we report in the human gut microbiome.”

      Finally, we now put more emphasis on the importance of migration (strain invasion) as a non-exclusive alternative to de novo mutation and gene gain/loss. This is mentioned in the Abstract and is also illustrated in the revised Figure 1.

      Another limitation is that the total number of reads (5 mio for the longitudinal dataset and 20 mio for the cross-sectional dataset) is low for assessing strain-level diversity in complex communities such as the human gut microbiota. This is probably the reason why the authors only looked at one species with sufficient coverage in the longitudinal dataset.

      Indeed, this is a caveat which means we can only consider sub-species diversity in relatively abundant species. Nevertheless, this allows us to study dozens of species in the HMP and 15 in the more frequent Poyet time series. As more deeply sequenced metagenomes become available, future studies will be able to access the rarer species to test whether the same patterns hold or not. This is now mentioned prominently as a caveat of our study in the second Discussion paragraph:

      “First, using metagenomic data from human microbiomes allowed us to study genetic diversity, but limited us to considering only relatively abundant species with genomes that were well-covered by short sequence reads. Deeper or more targeted sequencing may permit us to determine whether the same patterns hold for rarer members of the microbiome. However, it is notable that the majority of the dozens of species across the two datasets analyzed support DBD, suggesting that the phenomenon may generalize.”

      We also note that rarefaction was only applied to calculate community richness, not to estimate sub-species diversity. We apologize for this confusion, which is now clarified in the Methods as follows:

      “SNV and gene content variation within a focal species were ascertained only from the full dataset and not the rarefied dataset.”

      Analyzing the effect of diversity at a given timepoint on strain-level diversity at a later timepoint adds an important new dimension to this study which was not assessed in the previous study about the DBD in microbiomes by some of the authors. However, only a single species was analysed in the longitudinal dataset and comparisons of diversity were only done between two consecutive timepoints. This dataset could be further exploited to provide more insights into the prevailing patterns of diversity.

      We thank the reviewer for raising this point. We now have considered all 15 species for which there was sufficient coverage from the Poyet dataset, which included four different stool donors. Additionally, in the HMP dataset, we analyze 54 species across 154 hosts, with both datasets showing the same correlation between community diversity and gene loss.

      Additionally, we followed the suggestion of the reviewer of examining additional time lags, and in Figure 5 we do observe a dependency on time. This is now described in the Results as follows:

      “Using the Poyet dataset, we asked whether community diversity in the gut microbiome at one time point could predict polymorphism change at a future time point by fitting GAMs with the change in polymorphism rate as a function of the interaction between community diversity at the first time point and the number of days between the two time points. Shannon diversity at the earlier time point was correlated with increases in polymorphism (consistent with DBD) up to ~150 days (~4.5 months) into the future (Figure S4), but this relationship became weaker and then inverted (consistent with EC) at longer time lags (Fig 5A, Table S8, GAM, P=0.023, Chi-square test). The diversity slope is approximately flat for time lags between four and six months, which could explain why no significant relationship was found in HMP, where samples were collected every ~6 months. No relationship was observed between community richness and changes in polymorphism (Table S8, GAM, P>0.05).”
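
      A rough way to see how a diversity-by-time-lag interaction can produce a diversity slope that is positive at short lags and negative at long ones is sketched below. This is not the GAM fitted in the manuscript: it substitutes an ordinary linear model with a product term, runs on synthetic data, and every coefficient (including the 150-day crossover built into the simulation) is an assumption chosen purely for illustration.

```python
import numpy as np

def fit_interaction(diversity, lag_days, delta_poly):
    """Least-squares fit of delta_poly ~ b0 + b1*div + b2*lag + b3*div*lag."""
    X = np.column_stack([
        np.ones_like(diversity),   # intercept
        diversity,                 # main effect of initial Shannon diversity
        lag_days,                  # main effect of time lag
        diversity * lag_days,      # interaction term
    ])
    coef, *_ = np.linalg.lstsq(X, delta_poly, rcond=None)
    return coef

def diversity_slope(coef, lag_days):
    """Effect of one unit of diversity on delta_poly at a given time lag."""
    return coef[1] + coef[3] * lag_days

def crossover_lag(coef):
    """Time lag at which the diversity slope changes sign."""
    return -coef[1] / coef[3]

# Synthetic data only: the coefficients below are hypothetical.
rng = np.random.default_rng(0)
n = 500
diversity = rng.uniform(1.0, 4.0, n)
lag_days = rng.uniform(1.0, 400.0, n)
true_coef = np.array([0.0, 0.3, 0.0, -0.002])  # slope flips sign at 150 days
delta_poly = (
    true_coef[0]
    + true_coef[1] * diversity
    + true_coef[2] * lag_days
    + true_coef[3] * diversity * lag_days
    + rng.normal(0.0, 0.05, n)
)

coef = fit_interaction(diversity, lag_days, delta_poly)
print("slope at 30 days:", diversity_slope(coef, 30.0))
print("slope at 300 days:", diversity_slope(coef, 300.0))
print("estimated crossover (days):", crossover_lag(coef))
```

      In this simplified setting the slope is a straight line in the time lag, so it crosses zero exactly once; a GAM allows a smooth, non-linear version of the same dependence.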

      Finally, the evidence that gene loss follows increases in diversity is weak, as very few genes were found to be lost between two consecutive timepoints, and the analysis is based on only a single species. Moreover, while positive correlations were found between overall community diversity and gene family diversity in single species, the opposite trend was observed when focusing on pathway diversity. A more detailed analysis (of e.g. the functions of the genes and pathways lost/gained) to explain these seemingly contrasting results and a more critical discussion of the limitations of this study would be desirable.

      We agree that our previous analysis of one species in one host provided weak support for gene loss following increases in diversity. As described in the response above, we have now expanded this analysis to 15 focal species and 4 independent hosts with extensive time series. We now analyze this larger dataset and report the more statistically robust results as follows:

      “We found that community Shannon diversity predicted future gene loss in a focal species, and this effect became stronger with longer time lags (Fig 5B, Table S9, GLMM, P=0.006, LRT for the effect of the interaction between the initial Shannon diversity and time lag on the number of genes lost). The model predicts that increasing Shannon diversity from its minimum to its maximum would result in the loss of 0.075 genes from a focal species after 250 days. In other words, about one of the 15 focal species considered would be expected to lose a gene in this time frame.

      Higher Shannon diversity was also associated with fewer gene gains, and this relationship also became stronger over time (Fig 5C, Table S9, GLMM, P=1.11e-09, LRT). We found a similar relationship between community species richness and gene gains, although the relationship was slightly positive at shorter time lags (Fig 5D, Table S9, GLMM, P=3.41e-04, LRT). No significant relationship was observed between richness and gene loss (Table S9, GLMM, P>0.05). Taken together with the HMP results (Fig 4), these longer time series reveal how the sign of the diversity slope can vary over time and how community diversity is generally predictive of reduced focal species gene content.”
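
      The step from "0.075 genes" to "about one of the 15 focal species" in the quoted passage is a short piece of arithmetic, reproduced here as a check. The only inputs are the two figures quoted above; treating the total as a simple sum over species is our reading of the sentence, not a statement about how the model was evaluated.

```python
# Figures quoted in the passage above.
expected_loss_per_species = 0.075  # genes lost per focal species after 250 days
n_focal_species = 15

# Expected total losses across all focal species (linearity of expectation).
expected_total_losses = expected_loss_per_species * n_focal_species
print(f"{expected_total_losses:.3f}")
```

      The product is roughly 1.1, which matches the statement that about one of the 15 species would be expected to lose a gene over this time frame.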

      As described in detail in the response to Reviewer 1 above, we found that the HUMAnN2 pathway analyses previously described suffered from technical challenges and we deemed them inconclusive. We have therefore removed the pathway results from the manuscript.

      Reviewer #3 (Public Review):

      This work provides a series of tests of hypotheses, which are not mutually exclusive, on how genomic diversity is structured within human microbiomes and how community diversity may influence the evolution of a focal species.

      Strengths:

      The paper leverages existing metagenomic data to look at many focal species at the same time to test the importance of broad eco-evolutionary hypotheses, which is a novelty in the field.

      Thank you for the succinct summary and recognition of the strengths of our work.

      Weaknesses:

      It is not very clear if the existing metagenomic data has sufficient power to test these models.

      It is not clear, neither in the introduction nor in the analysis what precise mechanisms are expected to lead to DBD.

      The conclusion that the data support DBD appears to depend on which statistic is used to measure community diversity. Also, performing a test to reject a null neutral model would have been welcome either in the results or in the discussion.

      In our revised manuscript, we emphasize several caveats – including that we only have power to test these hypotheses in focal species with sufficient metagenomic coverage to measure sub-species diversity. We also describe more in the Introduction how the processes of competition and niche construction can lead to DBD. We also acknowledge that unmeasured abiotic drivers of both community diversity and sub-species diversity could also lead to the observed patterns. Throughout the manuscript, we attempt to describe the results and acknowledge multiple possible interpretations, including DBD and EC acting with different strengths on different species and time scales. Our previous manuscript assessing the evidence for DBD using 16S rRNA gene amplicon data from the Earth Microbiome Project (Madi et al., eLife 2020) assessed null models based on neutral ecological theory, and found it difficult to explain the observation of generally positive diversity slopes without invoking a non-neutral mechanism like DBD. While a new null model tailored to metagenomic data might provide additional nuance, we think developing one is beyond the scope of the manuscript – which is in the format of a short ‘Research Advance’ to expand on our previous eLife paper, and we expect that the general results of our previously reported null model provide a reasonable intuition for our new metagenomic analysis. This is now mentioned in the Discussion as follows:

      “In our previous study, we found that positive diversity slopes persisted even after controlling for potential abiotic drivers such as pH and temperature (Madi et al., 2020), but a similar analysis was not possible here due to a lack of metadata. Neutral processes can account for several ecological patterns such as species-area relationships (Hubbell, 2001), and must be rejected in favor of niche-centric models like DBD or EC. Using neutral models without DBD or EC, we found generally flat or negative diversity slopes due to sampling processes alone and that positive slopes were hard to explain with a neutral model (Madi et al., 2020). These models were intended mainly for 16S rRNA gene sequence data, but we expect the general conclusions to extend to metagenomic data. Nevertheless, further modeling and experimental work will be required to fully exclude a neutral explanation for the diversity slopes we report in the human gut microbiome.”

    1. Reviewer #3 (Public Review):

      To motivate the proposal, Karageorgiou et al. first identify a problem in applying current multivariable MR (MVMR) methods with many correlated exposures. I believe this problem can really be broken into two pieces. The first is that MVMR suffers from weak instrument bias. The second is that some traits may have nearly co-linear genetic associations, making it hard to disentangle which trait is causal. These problems connect in that inclusion of co-linear traits amplifies the problem of weak instrument bias - traits that are nearly co-linear with another trait in the study will have no or very few conditionally strong instruments.

      The authors then propose a solution: Apply a dimension reduction technique (PCA or sparse PCA) to the matrix of GWAS effect estimates for the exposures. The identified new components can then be used in MVMR in place of the directly measured exposures.

      I think that the identified problem is timely and important. I also like the idea of applying dimension reduction techniques to GWAS effect estimates. However, I don't think that the manuscript in its current form achieves the goals that it has set out. Specifically, I will outline the weaknesses of the work in three categories:

      1. The causal effects measured using this method are poorly defined.
      2. The description of the method lacks important details.
      3. Applied and simulation results are unconvincing.

      I will describe each of these in more detail below.

      1. To me, the largest weakness of this paper is that it is not clear how to interpret the putatively causal effects being measured. The authors describe the method as measuring "the causal effect of the PC on outcome" but it is not obvious what this means.

      One possible implication of this statement is that the PC is a real biological variable (say some hidden regulator) that can be directly intervened on. If this is the intention it should be discussed. However, this situation would imply that there is one correct factorization and there is no guarantee that PCs (or sparse PCs) come close to capturing that.

      The counterfactual implied by estimating the effects of PCs in MVMR is that it is possible to intervene on and alter one PC while holding all other PCs constant.

      In the introduction, the authors note (and I agree) that one weakness of MR applied to correlated traits is that "MVMR models investigate causal effects for each individual exposure, under the assumption that it is possible to intervene and change each one whilst holding the others fixed." However, it is not obvious that altering one PC while holding the others constant is more reasonable.

      2. This section combines a few items that I found unclear in the methods section. The most critical one is the lack of specification on how to select instruments.

      For the lipids application, the authors state that instruments were selected from the GLGC results, however, these only include instruments for LDL, HDL, and TG, so 1) it would not be possible to include variants that were independently instruments for one of the component traits alone and 2) there would be no instruments for the amino acids. There is no discussion of how instruments should be selected in general.

      This choice could also have a dramatic impact on the PCs estimated. The first PC is optimized to explain the largest amount of variance of the input data which, in this case, is GWAS effect estimates. This means that the number of instruments for each trait included will drive the resulting PCs. It also means that differences in scaling across traits could influence the resulting PCs.
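
      The scaling concern can be illustrated with a small sketch. It uses a synthetic SNP-by-trait matrix of effect estimates (three hypothetical traits sharing one signal), not the data from the manuscript, and plain PCA via SVD rather than any of the sparse variants the authors consider. Rescaling a single trait, for example by reporting it in different units, is enough to make the first PC load almost entirely on that trait.

```python
import numpy as np

def first_pc_loadings(beta):
    """Loadings of the first principal component of a SNP-by-trait matrix."""
    centered = beta - beta.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Synthetic effect estimates: 200 hypothetical SNPs, 3 hypothetical traits
# that all share one underlying signal plus independent noise.
rng = np.random.default_rng(1)
n_snps = 200
shared = rng.normal(size=n_snps)
beta = np.column_stack(
    [shared + 0.5 * rng.normal(size=n_snps) for _ in range(3)]
)

loadings_raw = first_pc_loadings(beta)

# Same data, but the first trait is expressed on a scale 10 times larger.
beta_rescaled = beta.copy()
beta_rescaled[:, 0] *= 10.0
loadings_rescaled = first_pc_loadings(beta_rescaled)

print("PC1 loadings, common scale:  ", np.round(np.abs(loadings_raw), 2))
print("PC1 loadings, trait 1 x 10:  ", np.round(np.abs(loadings_rescaled), 2))
```

      On a common scale the three traits contribute about equally to the first PC; after rescaling, the first trait dominates. Standardizing each trait before decomposition is one common way to remove this dependence, although whether it is appropriate here depends on choices the manuscript would need to state.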

      The other detail that is either missing or which I missed is what is used as the variant-PC association in the MVMR analysis. Specifically, is it the PC loadings or is it a different value? Based on the computation of the F-statistic I suspect the former but it is not clear. If this is the case, what is the effect of using loadings that have been shrunk via one of the sparse methods? It would be nice to see a demonstration of the bias and variance of the resulting method, though it is not clear to me what the "truth" would be.

      3. In the lipids application, the fact that M.LDL.PL changes sign in MVMR analysis is offered as evidence of multicollinearity. I would generally associate multicollinearity with large variance and not bias. Perhaps the authors could offer some more insight on how multicollinearity would cause the observation.

      A minor point of confusion: I was unable to interpret this pair of sentences "Although the method did not identify any of the exposures as significant at Bonferroni-adjusted significance level, the estimate for M.LDL.PL is still negative but closer to zero and not statistically significant. The only trait that retains statistical significance is ApoB." The first sentence says that none of the exposures were significant while the second sentence says that Apo B is significant. The GRAPPLE results don't seem clearly bad, indeed if only Apo B is significant, wouldn't we conclude that of the 118 exposures, only Apo B is causal for heart disease? It would help to discuss more how the conclusions from the PC-based MVMR analysis compare to the conclusions from GRAPPLE.

      It is a bit hard to interpret Table 4. I wasn't able to fully determine what "VLD, LDL significance in MR" means here. From the text, it seems that it means that any PC with a non-zero loading on VLDL or LDL traits was significant, however, this seems like a trivial criterion for the PCA method, since all PCs will be dense. This would mean this indicator only tells us whether any PCs were found to "cause" heart disease.

      In simulations, I may be missing something about the definition of a true and false positive here. I think this is similar to my confusion in the previous paragraph. Wouldn't the true and false positive rates as computed using these metrics depend strongly on the sparsity of the components? It is not clear to me what ideal behavior would be here. However, it seems from the description that if the truth was as in Fig 7 and two methods each yielded one dense component that was found to be causal for Y, these two methods would get the same "score" for true positive and false positive rate regardless of the distribution of factor loadings. One method could produce a factor that loaded equally on all exposures while the other produced a factor that loaded mostly on X1 and X2 but this difference would not be captured in the results.

  3. www.kernel.community
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Response to Reviewers:

      1. General Statements

      We thank the reviewers for the comments and the suggestions. We hope that we have addressed all the queries raised by the reviewers in the revised manuscript. We provide a point-by-point response below. Please note that the line numbers indicated in parentheses correspond to the pdf file without the track changes display.

      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Srinivasan and co-workers developed an alternative screening method for determining whether FtsZ inhibitors affect FtsZ polymerization. The assay builds on the authors' expertise on the topic and uses Schizosaccharomyces pombe as a model for studying the effect of PC190723, sanguinarine and berberine on FtsZ assembly. The use of a heterologous expression system is useful for the evaluation of FtsZ from different strains, both Gram-negative and Gram-positive. The same model could also provide insights into the capability of FtsZ inhibitors to affect eukaryotic cell physiology. Finally, the authors also suggest a possible cause of the suspected resistance of Gram-negative strains such as E. coli to PC190723.

      Major comments: • The conclusions are included in the discussion section and are quite convincing, for a general audience.

      We thank the reviewer for the positive comments.

      In my opinion, the authors should define the limits of their method, since no data on possible weaknesses are reported.

      RESPONSE: We have discussed the limitations of the methods as well. The discussion has been modified and the following sentences have been now included in the revised manuscript.

      “However, one of the major disadvantages of using fission yeast could be the need to use much higher concentrations of drugs than normally used for mammalian cell cultures to achieve an inhibitory effect. This could probably be due to the poor permeability of certain drugs in fission yeast because of its thick cell wall (Benko et al. 2017; Pérez and Ribas 2004). A similar effect of toxicity might arise at much lower concentrations in other eukaryotic cells, such as human cells. Consistently, while sanguinarine and berberine are known to affect the eukaryotic microtubules at 10 μM – 20 μM concentrations (Lopus and Panda 2006; Wang et al. 2016; Raghav et al. 2017), morphological effects on yeast cells were observed only at concentrations > 100 μM. However, yeast microtubules were not affected by berberine and sanguinarine. Differences in membrane lipid profiles and MDR efflux pumps between yeasts and mammalian cells might also contribute to differential resistance to the drugs being tested (Balzi and Goffeau 1991). Conversely, an inhibitory effect in yeast cells may not necessarily translate into toxicity in a human cell. These and the permeability of drugs in yeast cells represent an important caveat in using such heterologous expression systems for the screening of compounds against target molecules.”

      [Lines 498-513]

      As suggested in the later sections, we have also elaborated on the pros and cons of various methods including the yeast-based screening methods. [Lines 462-523]

      • No additional experiments are required to support the claims.

      • The suggested experiments could be quite easy to be realized for authors working in the microbiological field, and familiar with protein expression and purification, as well as bacteria and yeast growth.

      • From my side, even if I am not so expert in microbiology and plasmid/protein purification, the methods presented could be reproduced with no significant doubt.

      • Statistical analysis was done and seems to be adequate.

      RESPONSE: We thank the reviewer for these encouraging comments.

      Minor comments: • Prior studies should be covered in more depth, especially regarding the state of the art the authors refer to. Additional papers, both reviews on the possible methods for evaluating FtsZ inhibition and research papers on FtsZ inhibitors targeting E. coli and other Gram-negative strains, should be mentioned, since, in my opinion, these could prompt the authors to slightly change the overall text of the manuscript.

      RESPONSE: We have now elaborated on the state-of-the-art methods used for evaluating FtsZ inhibition and cited the relevant papers and reviews. We have also included papers on the development of FtsZ inhibitors, especially the ones similar to PC190723, targeting Gram-negative bacteria. The following sentences have been included in the revised manuscript.

      “Several approaches have been used to screen small molecules targeting bacterial cell division and FtsZ. While in vitro methods such as NMR (Domadia et al. 2007; Sun et al. 2014; Araújo-Bazán et al. 2019) and crystallography (Läppchen et al. 2008; Fujita et al. 2017) are valuable and offer information on distinct binding sites, these are not efficient for screening. Electron microscopic examination can distinguish the effects of the compounds being tested on the FtsZ protofilament assembly and lateral associations (Nova et al. 2007; Kaul et al. 2012; Anderson et al. 2012; Sun et al. 2014; Huecas et al. 2017; Kumar et al. 2011; Park et al. 2014). Other techniques that are routinely used include fluorescence anisotropy (Ruiz-Avila et al. 2013; Park et al. 2014), 90° light-scattering assay (Mukherjee and Lutkenhaus 1999) and dynamic light scattering (Hou et al. 2012; Di Somma et al. 2020) for assessing inhibition of FtsZ assembly (Kaul et al. 2012; Nova et al. 2007; Lui et al. 2019; Anderson et al. 2012; Irwin et al. 2015). Other easily scalable high-throughput assays include FCS/FCCS and FRET-based methods (Hernández-Rocamora et al. 2015; Mikuni et al. 2015; Reija et al. 2011).

      In vivo assays relying on cell filamentation phenotype coupled with the localization of Z-ring might be a good indicator of FtsZ being the direct target. However, since bacteria can undergo cell filamentation and not assemble FtsZ rings in response to a variety of conditions, including DNA damage (Mukherjee et al. 1998) and disruption of membrane potential (Strahl and Hamoen 2010), the in vivo assay is not so useful unless combined with the in vitro assays mentioned above. Finally, the isolation of resistance mutants in FtsZ to the drug can provide strong evidence of FtsZ being the direct target.

      Reconstitution systems are powerful and provide excellent control over the system, but they are emerging technologies and are technically challenging. Reconstitution systems include a variety of methods, such as the use of membrane nanodiscs, microbeads of different materials, supported bi-layer membranes (SLBs) and biomimetic systems that provide cell-like environments (Monterroso et al. 2013; Rivas et al. 2014).”

      [Lines 462-487]

      “Several compounds have been evaluated for their activity against FtsZ from both Gram-positive bacteria and Gram-negative bacteria. Although many exhibited only weak activity in vivo against Gram-negative bacteria, derivatives could be promising. These include benzamides (Haydon et al. 2008; Adams et al. 2011; Straniero et al. 2017, 2020a), trisubstituted benzimidazoles (Kumar et al. 2011), 4-bromo-1H-indazole derivatives (Wang et al. 2015), cinnamaldehyde and its derivatives (Domadia et al. 2007; Li et al. 2015), curcumin (Rai et al. 2008), heterocyclic molecules like guanidinomethyl biaryl compounds (Kaul et al. 2012), pyrimidine-quinuclidine scaffolds (Chan et al. 2013), 3-phenyl substituted 6,7-dimethoxyisoquinoline (Kelley et al. 2012), thiazole orange derivatives (Sun et al. 2017), viriditoxin (Wang et al. 2003), N-heterocycles such as zantrins and derivatives (Margalit et al. 2004; Nepomuceno et al. 2015).”

      [Lines 69-80]

      “Several efforts have been made to target Gram-negative bacteria with derivatives of benzamide. Examples include difluorobenzamides, substituted benzodioxanes, heterocyclic and non-heterocyclic derivatives (Straniero et al. 2017; Chai et al. 2020; Straniero et al. 2020a, 2020b). Although many exhibited promising activity in vitro, most were substrates for the AcrAB class of efflux pumps (Chai et al. 2020; Kaul et al. 2014; Straniero et al. 2020a, 2020b; Casiraghi et al. 2020). Thus, the poor membrane permeability, signature outer membrane, particularly lipopolysaccharide (LPS) structure (Wang et al. 2021), the presence of multiple efflux pumps in species such as E. coli, Klebsiella pneumoniae and Pseudomonas aeruginosa (Piddock 2006), and differences in FtsZ sequences in the binding-site (Kaul et al. 2013b; Miguel et al. 2015) have been cited as reasons for lack of susceptibility of Gram-negative bacteria to benzamide derivatives (Casiraghi et al. 2020). More recently, two molecules, TXA6101 and TXY6129, with substituted 2,6-difluorobenzamide scaffold, have been shown to inhibit the polymerization of both E. coli and Klebsiella pneumoniae FtsZ. Moreover, despite being substrates for efflux pumps, TXA6101 induced morphological changes in K. pneumoniae (Rosado-Lugo et al. 2022). Studies in the past on the effects of PC190723 on E. coli have been confusing, with a few reports suggesting an effect on FtsZ polymerization resulting in cell filamentation (Kaul et al. 2014), while others did not find any effect on EcFtsZ (Andreu et al. 2010; Anderson et al. 2012; Khare et al. 2019). The outer membrane has been shown to be a permeability barrier for PC190723 in E. coli (Khare et al. 2019; Chai et al. 2020). In addition, the Resistance-Nodulation-Division (RND) family of efflux pumps has been implicated in resistance against 2,6-difluorobenzamide derivatives, including TX436 (a prodrug of PC190723) in Gram-negative bacteria (Kaul et al. 2014).”

      [Lines 527-550]

      The whole text requires a deep check for grammar and word choice. Some sentences should be re-written since it is not so easy to understand their meaning. Figures are clear, even if I am not so convinced on the need of including Figure 1.

      RESPONSE: We have now deleted Figures 1 and 2 (as also suggested by Reviewer #2), revised the manuscript and re-written certain long sentences. We have used Grammarly to check for grammatical errors. We hope the manuscript is easier to follow with these changes.

      Reviewer #1 (Significance (Required)):

      • In my opinion, the outcome of this work could encourage researchers to evaluate an alternative method for assessing FtsZ inhibition. Nevertheless, the current state of the art, as confirmed by a few reviews from recent years, already includes a large number of possible assays: microbiological, biochemical, biophysical, physiological, and others. As a result, the authors did not convince me of the importance of their method compared with others. They could include some other possible assays and comment on the differences, pros and cons.

      RESPONSE: Several alternative methods have been evaluated and several excellent reviews published in the recent past have underlined the importance of these multiple methods to screen and validate small molecules targeting FtsZ. As suggested by the reviewer here and above, we have now discussed these methods including the yeast-based assay we describe, their advantages and limitations in the revised manuscript.

      The following lines have now been included in Introduction.

      “Several methods have been used to ascertain FtsZ as the target of the drug, and the various approaches have been reviewed in detail by many (Kusuma et al. 2019; Silber et al. 2020; Zorrilla et al. 2021; Andreu et al. 2022). Andreu et al. (2022) have recently proposed a streamlined experimental protocol for the screening and characterization of FtsZ inhibitors.”

      Introduction – [Lines 113-117]

      The following paragraphs, including the ones mentioned above, have been included in the Discussion section of the revised manuscript.

      “Several approaches have been used to screen small molecules targeting bacterial cell division and FtsZ. While in vitro methods such as NMR (Domadia et al. 2007; Sun et al. 2014; Araújo-Bazán et al. 2019) and crystallography (Läppchen et al. 2008; Fujita et al. 2017) are valuable and offer information on distinct binding sites, these are not efficient for screening. Electron microscopic examination can distinguish the effects of the compounds being tested on the FtsZ protofilament assembly and lateral associations (Nova et al. 2007; Kaul et al. 2012; Anderson et al. 2012; Sun et al. 2014; Huecas et al. 2017; Kumar et al. 2011; Park et al. 2014). Other techniques that are routinely used include fluorescence anisotropy (Ruiz-Avila et al. 2013; Park et al. 2014), 90° light-scattering assay (Mukherjee and Lutkenhaus 1999) and dynamic light scattering (Hou et al. 2012; Di Somma et al. 2020) for assessing inhibition of FtsZ assembly (Kaul et al. 2012; Nova et al. 2007; Lui et al. 2019; Anderson et al. 2012; Irwin et al. 2015). Other easily scalable high-throughput assays include FCS/FCCS and FRET-based methods (Hernández-Rocamora et al. 2015; Mikuni et al. 2015; Reija et al. 2011).

      In vivo assays relying on cell filamentation phenotype coupled with the localization of Z-ring might be a good indicator of FtsZ being the direct target. However, since bacteria can undergo cell filamentation and not assemble FtsZ rings in response to a variety of conditions, including DNA damage (Mukherjee et al. 1998) and disruption of membrane potential (Strahl and Hamoen 2010), the in vivo assay is not so useful unless combined with the in vitro assays mentioned above. Finally, the isolation of resistance mutants in FtsZ to the drug can provide strong evidence of FtsZ being the direct target.

      Reconstitution systems are powerful and provide excellent control over the system, but they are emerging technologies and are technically challenging. Reconstitution systems include a variety of methods, such as the use of membrane nanodiscs, microbeads of different materials, supported bi-layer membranes (SLBs) and biomimetic systems that provide cell-like environments (Monterroso et al. 2013; Rivas et al. 2014). While in vitro biochemical assays and reconstitution systems are useful to find molecules that directly target FtsZ, they are cumbersome and need to be performed at optimal physiological pH and ionic conditions, which can be considerably variable among FtsZ from different species.

      Our results on the ability of sanguinarine and berberine to specifically affect the assembly of FtsZ and not MreB in fission yeast highlight the utility of the heterologous expression system as a platform to identify molecules that specifically affect FtsZ polymerization. The yeast platform offers a cellular context mimicking the cytoplasm for cytoskeletal assembly. The system is simple to replicate in any laboratory, including those focused on chemical synthesis with minimum microbiological expertise and can be easily reproduced and scaled up as well. However, one of the major disadvantages of using fission yeast could be the need to use much higher concentrations of drugs than normally used for mammalian cell cultures to achieve an inhibitory effect. This could probably be due to the poor permeability of certain drugs in fission yeast because of its thick cell wall (Benko et al. 2017; Pérez and Ribas 2004). A similar effect of toxicity might arise at much lower concentrations in other eukaryotic cells, such as human cells. Consistently, while sanguinarine and berberine are known to affect the eukaryotic microtubules at 10 μM – 20 μM concentrations (Lopus and Panda 2006; Wang et al. 2016; Raghav et al. 2017), morphological effects on yeast cells were observed only at concentrations > 100 μM. However, yeast microtubules were not affected by berberine and sanguinarine. Differences in membrane lipid profiles and MDR efflux pumps between yeasts and mammalian cells might also contribute to differential resistance to the drugs being tested (Balzi and Goffeau 1991). Conversely, an inhibitory effect in yeast cells may not necessarily translate into toxicity in a human cell. These and the permeability of drugs in yeast cells represent an important caveat in using such heterologous expression systems for the screening of compounds against target molecules. However, notwithstanding this caveat, the heterologous system provides significant advantages in assessing the direct effects of the drug on FtsZ assembly. Moreover, fission yeast-based high-throughput platform screening methods using imaging have been successfully adapted to the screening of drugs against HIV-1 proteases by large-scale screening facilities such as the NIH Molecular Libraries Probe Production Centers Network in the Molecular Libraries Program, leading to several candidate drugs (Benko et al. 2017, 2019).”

      Discussion - [Lines 462-519]

      “A powerful emerging technique based on cytological profiling has been successfully used to identify the cellular pathways targeted by the inhibitors (Nonejuie et al. 2013; Martin et al. 2020), including cell division inhibition by FtsZ (Araújo‑Bazán et al. 2016). The recent advances in computational image analysis and deep learning approaches (von Chamier et al. 2021; Spahn et al. 2022) could further advance image-based screening for FtsZ inhibitors (Andreu et al. 2022).”

      Discussion – [Lines 581-586]

      As I mentioned before, there are a lot of reviews including the possible tests to perform for assessing FtsZ inhibition. A recent one was not cited, but, from my side, it should be mentioned (10.3390/antibiotics10030254).

      The suggested article is an excellent review that, in addition to providing an overview of the state-of-the-art methods currently used for screening drugs targeting FtsZ, also suggests other emerging technologies suitable for assay development. We had cited this article (Zorrilla et al., 2021; doi: 10.3390/antibiotics10030254) in other contexts in our original manuscript but inadvertently omitted it from the text where the screening methods are mentioned.

      We have now cited Zorrilla et al., 2021 at all appropriate places in the revised manuscript. In addition, we have also cited Monterroso 2013 (https://doi.org/10.1016/j.ymeth.2012.12.014); Rivas 2014 (https://doi.org/10.1016/j.cbpa.2014.07.018); Kusuma 2019 (doi: 10.1021/acsinfecdis.9b00055); Schaffner-Barbero 2012 (doi: 10.1021/cb2003626); Silber et al., 2020 (doi: 10.2217/fmb-2019-0348); Li et al., 2015 (doi: 10.1016/j.ejmech.2015.03.026); Casiraghi et al., 2020 (doi: 10.3390/antibiotics9020069); and Andreu et al., 2022 (doi: 10.3390/biomedicines10081825).

      Moreover, I think the authors should reconsider novel research papers in which researchers evaluated the reason behind the apparent inactivity of benzamide derivatives, similar to PC190723, towards Gram-negative strains.

      RESPONSE: Several novel papers that have reported reasons for the inactivity of benzamide derivatives, including PC190723, towards Gram-negative bacteria have now been cited. The following sentences have now been included in the revised manuscript.

      “Several efforts have been made to target Gram-negative bacteria with derivatives of benzamide. Examples include difluorobenzamides, substituted benzodioxanes, heterocyclic and non-heterocyclic derivatives (Straniero et al. 2017; Chai et al. 2020; Straniero et al. 2020a, 2020b). Although many exhibited promising activity in vitro, most were substrates for the AcrAB class of efflux pumps (Chai et al. 2020; Kaul et al. 2014; Straniero et al. 2020a, 2020b; Casiraghi et al. 2020). Thus, the poor membrane permeability, signature outer membrane, particularly lipopolysaccharide (LPS) structure (Wang et al. 2021), the presence of multiple efflux pumps in species such as E. coli, Klebsiella pneumoniae and Pseudomonas aeruginosa (Piddock 2006), and differences in FtsZ sequences in the binding-site (Kaul et al. 2013b; Miguel et al. 2015) have been cited as reasons for lack of susceptibility of Gram-negative bacteria to benzamide derivatives (Casiraghi et al. 2020). More recently, two molecules, TXA6101 and TXY6129, with substituted 2,6-difluorobenzamide scaffold, have been shown to inhibit the polymerization of both E. coli and Klebsiella pneumoniae FtsZ. Moreover, despite being substrates for efflux pumps, TXA6101 induced morphological changes in K. pneumoniae (Rosado‑Lugo et al. 2022). Studies in the past on the effects of PC190723 on E. coli have been confusing, with a few reports suggesting an effect on FtsZ polymerization resulting in cell filamentation (Kaul et al. 2014), while others did not find any effect on EcFtsZ (Andreu et al. 2010; Anderson et al. 2012; Khare et al. 2019). The outer membrane has been shown to be a permeability barrier for PC190723 in E. coli (Khare et al. 2019; Chai et al. 2020). In addition, the Resistance-Nodulation-Division (RND) family of efflux pumps has been attributed to resistance against 2,6-difluorobenzamide derivatives, including TX436 (a prodrug of PC190723) in Gram-negative bacteria (Kaul et al. 2014).”

      [Lines 527-550]

      Researchers working on FtsZ inhibitors could be interested in this paper, especially microbiologists.

      I specifically work on the design, synthesis and evaluation of the microbiological assays performed by others on my compounds.

      ========================================================================

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Dr. Srinivasan and colleagues previously developed a system where they expressed bacterial FtsZ in yeast and showed that it could assemble into polymers related to the Z rings. Here they develop this system further as a way to assay for drugs that may poison FtsZ, which would be candidates for new antibiotics. They test three drugs against three species of FtsZ. The results suggest that this system should be useful in screening new drugs that may target FtsZ. I would recommend publication after addressing a number of concerns and apparent contradictions.

      Fig. 1 showing chemical formulas of the drugs, and Fig. 2 showing a schematic of the yeast expression system, are probably not needed.

      RESPONSE: Reviewer #1 had also made a similar suggestion and we have now deleted these two figures (Fig. 1 and Fig. 2 in the older version).

      The authors make a point that sanguinarine and berberine inhibit eukaryote cell morphology. In fact, what they show is that they affect yeast cell morphology. This may or may not extend to other eukaryotes. Also, other eukaryotic cells may be more sensitive to drugs than yeast. They should be more conservative in this claim that the system also screens for drug effects on eukaryotes.

      RESPONSE: We agree with the reviewer’s suggestions here that other eukaryotic cells may be more sensitive to drugs than yeast. We have modified the statements pertaining to these claims in the revised manuscript.

      We have made the following changes in the revised version.

      The title of the manuscript has been now modified as “A salt bridge-mediated resistance mechanism to FtsZ inhibitor PC190723 revealed by a cell-based screen”.

      Lines 23-24 in the abstract have been modified to read as “The strategy also allows for simultaneous assessment of the toxicity of the drugs to eukaryotic yeast cells.”

      Other sentences modified in the revised version are:

      “We find that although sanguinarine and berberine affected FtsZ polymerization, they also affected yeast cell physiology”. [Lines 146-147]

      “In this study, we have attempted to develop a cell-based assay using fission yeast (S. pombe) as a heterologous expression host, which would enable the screening of compounds that could directly affect FtsZ polymerization as well as identify potential toxicity to yeast (or eukaryotic) cells simultaneously”. [Lines 444-447]

      “However, one of the major disadvantages of using fission yeast could be the need to use much higher concentrations of drugs than normally used for mammalian cell cultures to achieve an inhibitory effect. This could probably be due to the poor permeability of certain drugs in fission yeast because of its thick cell wall (Benko et al. 2017; Pérez and Ribas 2004). A similar effect of toxicity might arise at much lower concentrations in other eukaryotic cells, such as human cells”. [Lines 498-503]

      “Conversely, an inhibitory effect in yeast cells may not necessarily translate into toxicity in a human cell. These and the permeability of drugs in yeast cells represent an important caveat in using such heterologous expression systems for the screening of compounds against target molecules”. [Lines 510-513]

      Fig. 3 has some new structural data that should be explored more quantitatively. My quick measurement gave 0.5 and 0.8 µm for the outside diameters of Ec and Sa rings. The spirals of Hp seem to be 0.8 µm outside diameter, similar to SA rings. These spirals may be related to those reported by Popp and by Andreu under certain buffer conditions. This should be explored and referenced.

      RESPONSE: We have now quantitatively measured the diameters of the rings formed by EcFtsZ and SaFtsZ and the diameter and pitch of the spiral polymers of HpFtsZ. These have been now included in the results section and presented as a graph in a new figure (Supplementary Fig. S2). Please also note that the scale bar in Figure 1 (previously Figure 3) was erroneously marked as 5 µm. This has been corrected in the revised version to 2.5 µm.

      Also, the possibility that these spiral polymers may be related to those described by Popp and Andreu have been discussed. We included the following sentences in the discussion.

      “Previous studies have shown that various factors such as molecular crowding, variable C-terminal regions and bound nucleotide state lead to the formation of supramolecular structures like twisted helical structures, toroids and rings similar to those that have been observed in vivo (Popp et al. 2009; Huecas et al. 2017). Thus, the molecular crowding due to the dense cytoplasm of the yeast cells could have possibly induced the spiral and ring-like assembly of FtsZ polymers (Erickson et al. 2010).”

      [Lines 456-461]

      But Fig. 4 presents a contradiction. Here the Hp control cells show long smooth polymers, not helical. This seems an important difference and needs to be addressed. Are the polymers sometimes straight and sometimes helical? After finishing the paper I see that in some experiments the HP is helical, and in others the polymers are straight and smooth. I think it would be important to determine what favors the two forms. If this remains a mystery, at least address it openly.

      RESPONSE: This was definitely an oversight from the authors. We should have clearly mentioned this in the manuscript but completely missed the description of different polymers assembled by HpFtsZ.

      We have now described this clearly in the results and added a new Figure (Supplementary Fig. S1) showing a time course for the appearance of spiral and linear polymers. We have also replaced the images in Figure 5E.

      We have modified the results to read as:

      “Interestingly, HpFtsZ assembled into linear cable-like structures as well as twisted polymers that were curled and spiral in appearance (Fig. 1D). The spiral filaments were more clearly visualized by deconvolution of the images (Fig. 1D iii and 1E). Further, super-resolution imaging using 3D-SIM clearly revealed that HpFtsZ assembles into spiral filaments in fission yeast (Fig. 1F).”

      [Lines 171-175]

      We have also added the following lines in the results section:

      “Spiral polymers appeared early, at 16 – 18 hours after induction of expression (absence of thiamine), and linear cables appeared later at 20 – 22 hours (Fig. S1). The smooth linear polymers possibly arise from lateral association and bundling of FtsZ filaments (Monahan et al. 2009), but the factors determining the two forms in yeast cells remain unclear.”

      [Lines 175-179]

      I am concerned that the quantitation of drug inhibition in Fig 4, 5 is flawed. Visually from 4A it looks like ~90-100% of control cells have polymers, and sang reduces polymers by 70% for Sa and Ec and 100% for Hp: this is based on the number of spots and filaments I see in Fig. 4 Aii. But the quantitation in D shows only 17-23% reduction for all three. These numbers were based on determining the fraction of cells that showed polymer (spots or lines) vs diffuse. It seems that cells are counted as containing polymer even if they had a great reduction in spots or lines, but still had a few. E.g., 4Aii Sa has 4 cells, two of them with no spots, one with only 2, and one with ~7, which totals ~1/3 the spots in control cells. Categorizing cells with only a couple of spots as polymerized, seems to be a poor way to quantitate. Would it not be better to count all spots in all cells, or measure the total length of line polymers, as a measure of inhibition.

      RESPONSE: We agree with the reviewer here that number of spots or the length of the polymers would be a better quantitative measure of the effect of the drugs than the percentage of cells presented. In the revised manuscript, we now present quantified data as suggested.

      We have quantitated the number of spots per cell for SaFtsZ and total polymer length per cell for HpFtsZ to elucidate the effect of drugs on FtsZ polymers. The number of spots per cell was counted using the built-in ImageJ macro OPS threshold IJ1 script, which combines the Otsu thresholding method and the Analyze Particles plugin. The total polymer length per cell, in the case of HpFtsZ, was measured using the lpx-plugins as described by Higaki (Higaki et al., 2017).

      In addition, using the lpx-plugins, we also quantify density, a measure of the amount of cytoskeleton per unit area in a given cell (Henty-Ridilla et al., 2014; Higaki et al., 2017). We had previously used this measure successfully to quantify assembly of Spiroplasma citri MreB in fission yeast (Pande et al., 2022).

      The methodology has been described in detail in the Materials and Methods section under the heading – “Quantitation of the number of spots, polymer length and density”

      Lines [665-689]
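
      The spot-counting idea described above (an Otsu threshold followed by particle analysis) can be sketched in a few lines. This is an illustration only and not the ImageJ macro used in the study: the function names `otsu_threshold` and `count_spots`, the 8-connected neighbourhood and the `min_area` filter are all assumptions made for the example, and the image is taken to be a small grid of 8-bit intensities for a single cell.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the intensity t that maximises between-class variance (Otsu)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def count_spots(image, min_area=1):
    """Count 8-connected groups of pixels brighter than the Otsu threshold.

    `image` is a list of rows of integer intensities (0-255).
    `min_area` (hypothetical parameter) discards groups smaller than this.
    """
    rows, cols = len(image), len(image[0])
    t = otsu_threshold(p for row in image for p in row)
    seen = [[False] * cols for _ in range(rows)]
    spots = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= t:
                continue
            # Breadth-first flood fill over one bright connected component.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx] and image[ny][nx] > t):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if area >= min_area:
                spots += 1
    return spots
```

      Counting connected bright regions per cell, rather than classifying each cell as "polymer" or "diffuse", is what allows a partial reduction in the number of spots to register in the measurement.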

      The new data has been included in the results (lines 207-231 and 275-284) and new Figures (Fig. 2 E, G and Fig. 3 G, H) have been added.

      Fig. 5 makes a convincing case that PC19 accelerates or enhances the polymerization of Sa and Hp. Fig. S2 shows that the structures of polymers are not changed when PC19 is added at 20 hrs, after polymers have already formed. It would have been nice to see for both 5A and S2A that the round spots had holes in the center, when imaged by SIM. Again the quantitation of cells as polymer vs diffuse seems ill suited, because it misses cells with a reduced number of spots.

      RESPONSE: We have imaged the FtsZ polymers of Sa and Hp in the presence of PC190723 using SIM and included these images as new panels in the figures (Figure 3C, 3F and Figure S4 in the revised manuscript).

      Again, for Figure 5 (Fig. 3 in the revised version), we have provided the quantitation as number of spots per cell, polymer length per cell and density (amount of cytoskeleton per unit area) as described above (new Figures - Fig. 3 G, H) in the revised manuscript.

      [Lines 275-284]

      Fig. 6 uses FRAP to show that PC reduces the dynamic exchange of Sa polymers by a factor of 3. It is remarkable to me that rapid exchange is not completely eliminated by PC. Regardless, it would be very important to reference the previous study of Adams..Errington 2011, where they showed the same thing for Foci in Bacillus. PC19 reduced the exchange from 3 to 10 s, but the foci were still very dynamic.

      RESPONSE: We had referenced this work in the original submission in the discussion section – “These results are also consistent with the earlier findings that PC190723 acts to induce FtsZ polymerization and stabilize FtsZ filaments (Andreu et al. 2010; Elsen et al. 2012; Miguel et al. 2015; Fujita et al. 2017) and its derivative compound, 8j acting to slow down FtsZ-ring turnover by 3-fold in B. subtilis (Adams et al. 2011).”

      [Lines 563-567] in revised manuscript

      We have now added the following statement and referenced Adams et al., 2011 in the results section as well.

      “Interestingly, compound 8j, a related benzamide derivative, has been shown to slow down FtsZ-ring turnover by 3-fold in B. subtilis (Adams et al. 2011).”

      [Lines 324-326]
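
      To make the "3-fold" comparison concrete, the sketch below relates a FRAP recovery half-time to a first-order exchange rate. It assumes a single-exponential recovery model, which is not stated in the text, and it treats the 3 s and 10 s values quoted in the reviewer's comment as recovery half-times, which is also an assumption. The function names are hypothetical.

```python
import math


def rate_from_half_time(t_half_s):
    """First-order exchange rate k (per second) for a given recovery half-time."""
    return math.log(2) / t_half_s


def recovery_fraction(t_s, t_half_s):
    """Fraction of fluorescence recovered at time t_s, assuming 1 - exp(-k t)."""
    return 1.0 - math.exp(-rate_from_half_time(t_half_s) * t_s)


def fold_slowdown(t_half_control_s, t_half_drug_s):
    """Ratio of exchange rates, control over drug-treated."""
    return rate_from_half_time(t_half_control_s) / rate_from_half_time(t_half_drug_s)
```

      Under these assumptions, `fold_slowdown(3.0, 10.0)` is simply the ratio of the two half-times, roughly 3.3, which is in line with the approximately 3-fold slowing discussed above.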

      The analysis of the salt bridge as opposed to a single Arg or His being the cause of resistance to PC19 is an interesting addition to the study. In Fig. 8D some numbers do not agree between the caption and figure (R309/7; S226/7). The whole figure should be carefully checked.

      RESPONSE: We thank the reviewer for pointing to these. We have corrected these errors now in the revised version (Fig. 6).

      I am not familiar with the Gram -ve and Gram +ve nomenclature. Why not simply gram- and gram+?

      RESPONSE: We agree that Gram -ve / +ve are not standard notations and inappropriate.

      We have now written them as Gram-negative and Gram-positive throughout the text.

      The Discussion is quite long largely because it repeats items from Results and Introduction. It is also redundant to hype the value of this system in both Introduction and Discussion; The Introduction should be sufficient. The Discussion should be pared down by eliminating repetition and focusing on relating results to previous literature, in particular items that have not been referenced previously in the paper. Also, I think we don't need the final "In summary" paragraph. That is already nicely presented in the Abstract.

      RESPONSE: We have omitted the repetitive statements from the discussion. We have also deleted the final summary paragraph. We have added new paragraphs [lines 462-519] pertaining to previous literature (also suggested by Reviewer #1) to the discussion section in the revised manuscript.

      The authors should probably provide references to other studies that have used yeast expression to study assembly of FtsZ. I am thinking in particular of papers from the Osteryoung lab looking at chloroplast FtsZ.

      RESPONSE: We have now referenced other papers that have used yeast expression to study assembly of FtsZ.

      The following statement has been added to the introduction:

      “Moreover, the dynamics of chloroplast FtsZs have also been successfully studied using the heterologous fission yeast expression system (TerBush and Osteryoung 2012; Yoshida et al. 2016; TerBush et al. 2018).”

      Lines [132-134]

      NO PAGE NUMBERS. Authors should be penalized a week delay for submitting a mss without page numbers.

      RESPONSE: We sincerely apologise for this gross error and oversight and thank the reviewer for patiently reading through and reviewing a manuscript with no page numbers and line numbers. We are truly sorry for having submitted a manuscript as such and have now included page numbers and line numbers in the manuscript.

      Reviewer #2 (Significance (Required)):

      This work should be of interest to the broad field of research on FtsZ. The authors present it as a new platform for assaying drugs targeting FtsZ, and researchers in this area will certainly be interested. It will also be of broader interest for the novel assay of assembly and exchange dynamics and how they may be modulated by small molecules.

      ========================================================================

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors established a proof-of-concept assay to investigate the bacterial cytoskeletal protein FtsZ in fission yeast, and this heterologous yeast system is useful for identifying compounds targeting FtsZ. The authors used this system to understand the mechanism of FtsZ's resistance to the drug PC190723.

      Major comments: 1. From the study, pombe seems to be a good system for investigating bacterial cytoskeleton proteins and testing drugs against them. However, to my knowledge, it is not convincing that this is a proper system for assessing eukaryotic toxicity, since no toxicity to pombe does not mean no toxicity to human cells and vice versa.

      RESPONSE: We agree with the reviewer that toxicity to S. pombe cannot be directly extended to assessing toxicity to other eukaryotic cells such as human cells. As suggested by Reviewer#2 as well, we have modified these claims in the revised manuscript, discussed the possibilities and limited the scope of this work to assessing toxicity in yeast cells.

      We have made the following changes in the revised version.

      The title of the manuscript has been now modified as “A salt bridge-mediated resistance mechanism to FtsZ inhibitor PC190723 revealed by a cell-based screen”.

      Lines 23-24 in the abstract have been modified to read as “The strategy also allows for simultaneous assessment of the toxicity of the drugs to eukaryotic yeast cells.”

      Other sentences modified in the revised version are:

      “We find that although sanguinarine and berberine affected FtsZ polymerization, they also affected yeast cell physiology”. [Lines 146-147]

      “In this study, we have attempted to develop a cell-based assay using fission yeast (S. pombe) as a heterologous expression host, which would enable the screening of compounds that could directly affect FtsZ polymerization as well as identify potential toxicity to yeast (or eukaryotic) cells simultaneously”. [Lines 444-447]

      “However, one of the major disadvantages of using fission yeast could be the need to use much higher concentrations of drugs than normally used for mammalian cell cultures to achieve an inhibitory effect. This could probably be due to the poor permeability of certain drugs in fission yeast because of its thick cell wall (Benko et al. 2017; Pérez and Ribas 2004). A similar effect of toxicity might arise at much lower concentrations in other eukaryotic cells, such as human cells”. [Lines 498-503]

      “Conversely, an inhibitory effect in yeast cells may not necessarily translate into toxicity in a human cell. These and the permeability of drugs in yeast cells represent an important caveat in using such heterologous expression systems for the screening of compounds against target molecules”. [Lines 510-513]

      From figure 4A to 4C, there seems to be no big difference in cell morphology between control and drug treatment, except for Berberine treatment of SaFtsZ-GFP. Under the low concentrations of Sanguinarine (20 µM) and Berberine (53.791 µM), FtsZ polymerization was disrupted with seemingly no effect on cell morphology. Why would the authors use much higher concentrations of Sanguinarine (135.95 µM) and Berberine (134.45 µM) to prove that these two drugs are toxic to pombe cells?

      RESPONSE: Earlier reports had shown that sanguinarine and berberine affect mammalian microtubules (Lopus and Panda 2006 - DOI: 10.1111/j.1742-4658.2006.05227.x; Raghav et al., 2017 - DOI: 10.1021/acs.biochem.7b00101). While we did not observe any growth defect in yeast cells, earlier studies have suggested that yeasts, particularly S. pombe, possibly require higher concentrations of certain drugs than used for mammalian cells due to the presence of the cell wall (Perez and Ribas 2004 - https://doi.org/10.1016/j.ymeth.2003.11.020; Benko et al., 2017 - DOI: 10.1186/s13578-016-0131-5). We had thus explored the possibility of cell toxicity to yeast cells at higher concentrations of the drugs.

      The following lines have thus been added to the results section in the revised manuscript.

      “Although we did not observe any growth defect in yeast cells at lower concentrations of the drugs, earlier studies have suggested that yeast cells possibly require higher concentrations of drugs than used for mammalian cells due to the presence of the cell wall, which is particularly thick in S. pombe (Benko et al. 2017; Pérez and Ribas 2004). We thus explored the possibility of cell toxicity to yeast cells at higher concentrations of the drugs.”

      Lines [234-239]

      Sanguinarine and Berberine are FtsZ disruption drugs. Do these drugs have an effect on microtubules?

      RESPONSE: We have now examined the effect of Sanguinarine and Berberine on yeast microtubules as well and did not find any visible differences between the control and inhibitor (either low or high concentrations) treated cells. This data has been added as a new figure (Supplementary Fig. S3 A and B) in the revised manuscript and the following line added to the results.

      “However, even at higher concentrations, neither of the drugs showed any visible effect on yeast microtubules (Fig. S3 A and B).”

      [Lines 241-242]

      The discussion has been modified as follows:

      “However, one of the major disadvantages of using fission yeast could be the need to use much higher concentrations of drugs than normally used for mammalian cell cultures to achieve an inhibitory effect. This could probably be due to the poor permeability of certain drugs in fission yeast because of its thick cell wall (Benko et al. 2017; Pérez and Ribas 2004). A similar effect of toxicity might arise at much lower concentrations in other eukaryotic cells, such as human cells. Consistently, while sanguinarine and berberine are known to affect the eukaryotic microtubules at 10 μM – 20 μM concentrations (Lopus and Panda 2006; Wang et al. 2016; Raghav et al. 2017), morphological effects on yeast cells were observed only at concentrations > 100 μM. However, yeast microtubules were not affected by berberine and sanguinarine. Differences in membrane lipid profiles and MDR efflux pumps between yeasts and mammalian cells might also contribute to differential resistance to the drugs being tested (Balzi and Goffeau 1991). Conversely, an inhibitory effect in yeast cells may not necessarily translate into toxicity in a human cell. These and the permeability of drugs in yeast cells represent an important caveat in using such heterologous expression systems for the screening of compounds against target molecules.”

      [Lines 498-513]

      There are very few SaFtsZ-GFP dot structure in fig 5B, and this is inconsistent with the SaFtsZ-GFP dot structure in fig 4A. Fig 5D has the same issue compare to Fig 4Ci

      RESPONSE: We had probably not made the experimental differences between Figures 4 and 5 (Figures 2 and 3 in the revised manuscript) sufficiently clear, which has led to this apparent inconsistency.

      The strong nmt1 promoter (thiamine repressible) takes about 18 hours for full induction in the absence of thiamine (Forsburg 1993 - https://doi.org/10.1093/nar/21.12.2955). We have utilised the medium-strength nmt41 promoter in our studies and hence, in Figure 2, expression of the FtsZ-GFP fusions was allowed to proceed for longer periods of time (22 – 24 hours) in the experiments concerning sanguinarine and berberine treatments.

      This has been now clearly mentioned in the revised version of the manuscript in the results section (lines 196-199) as well as in figure legends.

      In contrast, the very few dot structures or polymers in Figure 3 (revised manuscript) are due to a shorter period of expression of FtsZ-GFP (12 – 14 hours in the absence of thiamine). The shorter expression time in these experiments allowed us to test whether PC190723 indeed induced the polymerisation of FtsZ at a stage when the control cells still exhibited diffuse fluorescence and had minimal FtsZ assembly.

      This has been now clearly mentioned in the results (lines 259-263) as well as in figure legends in the revised manuscript.

      We hope that we have now made these experimental differences clear. We have also included this information (hours of induction) in the figure panel.

      The concentration of PC190723 the authors used is 20 µg/ml, which is enough for disrupting FtsZ function. However, following the Sanguinarine and Berberine experiments, the authors may use a higher concentration of PC190723 to assess its toxicity to pombe cells. As with Sanguinarine and Berberine, does PC190723 have an effect on microtubules?

      RESPONSE: As suggested by the reviewer, we have tested the effect of PC190723 at a higher concentration (140.6 µM) similar to that of Sanguinarine and Berberine. We did not find any morphological changes in yeast upon treatment with higher concentrations of PC190723. Also, the drug did not seem to affect the yeast microtubules. These have been now included in the results section and new images have been added in the figure (Supplementary Fig. S3).

      The following lines have been added in the revised manuscript to the results section:

      “Earlier studies had reported that PC190723 was non-toxic to eukaryotic cells, including budding yeast (Haydon et al. 2008). We further tested if PC190723 resulted in morphological defects in S. pombe, like sanguinarine and berberine, at higher concentrations. However, consistent with the earlier reports, PC190723 was inactive against S. pombe at both 56.2 μM and 140.6 μM and did not cause any morphological changes (Fig. 2H iv). Further, PC190723 did not disrupt the yeast microtubules at either of the concentrations (Fig. S3 A iv and B iv).”

      [Lines 294-300]

      The authors mentioned much higher concentrations of drugs than normally used for mammalian cell cultures have to be used for fission yeast. Is there any criterion for this?

      RESPONSE: In the discussion section, we had mentioned that “Much higher concentrations of drugs than normally used for mammalian cell cultures have to be used for fission yeast probably due to permeability issues because of the presence of a thick cell wall” (Benko 2017 - DOI: 10.1186/s13578-016-0131-5).

      This has now been mentioned in the results as well in the revised manuscript.

      “Although we did not observe any growth defect in yeast cells at lower concentrations of the drugs, earlier studies have suggested that yeast cells possibly require higher concentrations of drugs than used for mammalian cells due to the presence of the cell wall, which is particularly thick in S. pombe (Benko et al. 2017; Pérez and Ribas 2004). We thus explored the possibility of cell toxicity to yeast cells at higher concentrations of the drugs.”

      [Lines 234-239]

      The following lines in the discussion have been modified in the revised manuscript to read as – “However, one of the major disadvantages of using fission yeast could be the need to use much higher concentrations of drugs than normally used for mammalian cell cultures to achieve an inhibitory effect. This could probably be due to the poor permeability of certain drugs in fission yeast because of its thick cell wall (Benko et al. 2017; Pérez and Ribas 2004). A similar effect of toxicity might arise at much lower concentrations in other eukaryotic cells, such as human cells.”

      [Lines 498-503]

      Minor comments: 1. There are two units used for drug concentration: µM for Sanguinarine and Berberine and µg/ml for PC190723. I think they should be consistent.

      We have now used µM for all drugs.

      Check the units (µM and µg/ml) italic in text and figure legend.

      We have now used µM for all drugs and corrected the italics. We apologise for the erroneous usage of italics in the text for µM.
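
      For readers converting between the two units, a minimal sketch of the arithmetic follows. The function name is hypothetical, and the molar mass of 355.75 g/mol used for PC190723 in the usage note is an assumption that is not given in the text; it is chosen because it reproduces the 56.2 µM figure that appears to correspond to the original 20 µg/ml.

```python
def ug_per_ml_to_micromolar(conc_ug_per_ml, molar_mass_g_per_mol):
    """Convert a mass concentration in µg/ml to a molar concentration in µM.

    1 µg/ml equals 1 mg/L; dividing by the molar mass (g/mol) gives mmol/L,
    and multiplying by 1000 converts mmol/L to µmol/L (µM).
    """
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return conc_ug_per_ml / molar_mass_g_per_mol * 1000.0
```

      With the assumed molar mass, `ug_per_ml_to_micromolar(20, 355.75)` evaluates to about 56.2 µM.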

      Reviewer #3 (Significance (Required)):

      The authors provided a proof-of-concept assay for studying bacterial cytoskeleton proteins in yeast cells. This idea will facilitate people to investigate the bacterial cytoskeleton proteins and also find compounds targeting them without affecting the yeast cells. This study will provide different perspectives to people who study cell biology and secondary metabolites discovery.

      We hope that we have satisfactorily addressed all the concerns raised by the reviewers in the revised manuscript.

      Thanking you,

      With Regards

      Dr. Ramanujam Srinivasan

      Dr. Pananghat Gayathri

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Dr. Srinivasan and colleagues previously developed a system where they expressed bacterial FtsZ in yeast and showed that it could assemble into polymers related to the Z rings. Here they develop this system further as a way to assay for drugs that may poison FtsZ, which would be candidates for new antibiotics. They test three drugs against three species of FtsZ. The results suggest that this system should be useful in screening new drugs that may target FtsZ. I would recommend publication after addressing a number of concerns and apparent contradictions.

      Fig. 1 showing chemical formulas of the drugs, and Fig. 2 showing a schematic of the yeast expression system, are probably not needed.

      The authors make a point that sanguinarine and berberine inhibit eukaryote cell morphology. In fact, what they show is that they affect yeast cell morphology. This may or may not extend to other eukaryotes. Also, other eukaryotic cells may be more sensitive to drugs than yeast. They should be more conservative in this claim that the system also screens for drug effects on eukaryotes.

      Fig. 3 has some new structural data that should be explored more quantitatively. My quick measurement gave 0.5 and 0.8 µm for the outside diameters of Ec and Sa rings. The spirals of Hp seem to be 0.8 µm outside diameter, similar to SA rings. These spirals may be related to those reported by Popp and by Andreu under certain buffer conditions. This should be explored and referenced.

      But Fig. 4 presents a contradiction. Here the Hp control cells show long smooth polymers, not helical. This seems an important difference and needs to be addressed. Are the polymers sometimes straight and sometimes helical? After finishing the paper I see that in some experiments the Hp is helical, and in others the polymers are straight and smooth. I think it would be important to determine what favors the two forms. If this remains a mystery, at least address it openly.

      I am concerned that the quantitation of drug inhibition in Figs. 4 and 5 is flawed. Visually from 4A it looks like ~90-100% of control cells have polymers, and sang reduces polymers by 70% for Sa and Ec and 100% for Hp: this is based on the number of spots and filaments I see in Fig. 4 Aii. But the quantitation in D shows only 17-23% reduction for all three. These numbers were based on determining the fraction of cells that showed polymer (spots or lines) vs diffuse. It seems that cells are counted as containing polymer even if they had a great reduction in spots or lines, but still had a few. E.g., 4Aii Sa has 4 cells, two of them with no spots, one with only 2, and one with ~7, which totals ~1/3 the spots in control cells. Categorizing cells with only a couple of spots as polymerized seems to be a poor way to quantitate. Would it not be better to count all spots in all cells, or measure the total length of line polymers, as a measure of inhibition?
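      The difference between the two ways of scoring inhibition can be made concrete with a small sketch. The treated spot counts below are the ones read off Fig. 4Aii (Sa) in the paragraph above; the control counts are hypothetical placeholders chosen so that the treated total is about one third of the control total, and the function names are illustrative only, not the authors' analysis pipeline.

```python
# Treated counts follow the referee's reading of Fig. 4Aii (Sa):
# two cells with no spots, one with 2, one with about 7.
treated_spots = [0, 0, 2, 7]
# Assumption: hypothetical control counts (27 spots in 4 cells), so that the
# treated total (9) is about one third of the control total.
control_spots = [7, 6, 7, 7]


def fraction_with_polymer(counts):
    """Binary scoring: share of cells that contain at least one spot."""
    return sum(1 for c in counts if c > 0) / len(counts)


def mean_spots_per_cell(counts):
    """Count-based scoring: average number of spots per cell."""
    return sum(counts) / len(counts)


def percent_reduction(control_value, treated_value):
    """Reduction of the treated value relative to control, in percent."""
    return 100.0 * (control_value - treated_value) / control_value


binary_reduction = percent_reduction(
    fraction_with_polymer(control_spots), fraction_with_polymer(treated_spots)
)
count_reduction = percent_reduction(
    mean_spots_per_cell(control_spots), mean_spots_per_cell(treated_spots)
)

print(f"binary scoring:      {binary_reduction:.1f}% reduction")
print(f"count-based scoring: {count_reduction:.1f}% reduction")
```

      With these illustrative numbers the binary score reports a smaller reduction than the spot count does, because a cell that keeps only a couple of spots still counts as fully "polymerized".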

      Fig. 5 makes a convincing case that PC19 accelerates or enhances the polymerization of Sa and Hp. Fig. S2 shows that the structures of polymers are not changed when PC19 is added at 20 hrs, after polymers have already formed. It would have been nice to see for both 5A and S2A that the round spots had holes in the center, when imaged by SIM. Again the quantitation of cells as polymer vs diffuse seems ill suited, because it misses cells with a reduced number of spots.

      Fig. 6 uses FRAP to show that PC reduces the dynamic exchange of Sa polymers by a factor of 3. It is remarkable to me that rapid exchange is not completely eliminated by PC. Regardless, it would be very important to reference the previous study of Adams..Errington 2011, where they showed the same thing for Foci in Bacillus. PC19 reduced the exchange from 3 to 10 s, but the foci were still very dynamic.

      The analysis of the salt bridge as opposed to a single Arg or His being the cause of resistance to PC19 is an interesting addition to the study. In Fig. 8D some numbers do not agree between the caption and figure (R309/7; S226/7). The whole figure should be carefully checked.

      I am not familiar with the Gram -ve and Gram +ve nomenclature. Why not simply gram- and gram+?

      The Discussion is quite long largely because it repeats items from Results and Introduction. It is also redundant to hype the value of this system in both Introduction and Discussion; The Introduction should be sufficient. The Discussion should be pared down by eliminating repetition and focusing on relating results to previous literature, in particular items that have not been referenced previously in the paper. Also, I think we don't need the final "In summary" paragraph. That is already nicely presented in the Abstract.

      The authors should probably provide references to other studies that have used yeast expression to study assembly of FtsZ. I am thinking in particular of papers from the Osteryoung lab looking at chloroplast FtsZ.

      NO PAGE NUMBERS. Authors should be penalized a week delay for submitting a mss without page numbers.

      Significance

      This work should be of interest to the broad field of research on FtsZ. The authors present it as a new platform for assaying drugs targeting FtsZ, and researchers in this area will certainly be interested. It will also be of broader interest for the novel assay of assembly and exchange dynamics and how they may be modulated by small molecules.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01680

      Corresponding author(s): Woo Jae, Kim

      1. General Statements

      The goal of this study is to provide the groundwork for future studies of genetically controlled neuronal regulation of ‘interval timing’ through the provision of a behavioral paradigm. Interval timing, or the sense of time in the seconds to hours range, is important in foraging, decision making, and learning in humans via activation of cortico-striatal circuits. Interval timing requires completely distinct brain processes from millisecond or circadian timing. In summary, interval timing allows us to subjectively sense the passage of physical time, allowing us to integrate action sequences, thoughts, and behavior, detect developing trends, and predict future consequences.

      Many researchers have tried to figure out how animals, including humans, can estimate time intervals with such precision. However, most investigations have been conducted in the realm of psychology rather than biology thus far. Because the study of interval timing was limited in its ability to intervene in the human brain, many psychologists concentrated on developing convincing theoretical models to explain the known occurrence of interval timing.

      To overcome the limits of studying interval timing in terms of genetic control, we have reported that the time investment strategy for mating in Drosophila males can be a suitable behavioral platform to genetically dissect the principle of brain circuit mechanism for interval timing. For example, we previously reported that males prolong their mating when they have previously been exposed to rivals (Kim, Jan & Jan, "Contribution of visual and circadian neural circuits to memory for prolonged mating induced by rivals" Nature Neuroscience, 2012), and this behavior is regulated by visual stimuli, clock genes, and neuropeptide signaling in a subset of neurons (Kim, Jan & Jan, “A PDF/NPF Neuropeptide Signaling Circuitry of Male Drosophila melanogaster Controls Rival-Induced Prolonged Mating” Neuron, 2013).

      Throughout their lives, all animals must make decisions in order to optimize their utility function. Male reproductive success is determined by how many sperms successfully fertilize an egg with a restricted number of investment resources. To optimize male reproductive fitness, a time investment strategy has been devised. As a consequence, we believe that the flexible responses of mating duration to different environmental contexts in Drosophila males might be an excellent model to investigate neural circuits for interval timing.

      One of the most well-known features of human interval timing is the association of different sensory inputs with perception of time intervals, which influences our estimate of time intervals. Therefore, the first step toward comprehending the neural regulation of interval timing is to dissect the role that numerous sensory inputs play in determining the time duration. In this article, we discuss a different time-investment strategy adopted by males, called "Shorter-Mating-Duration" (SMD). According to our findings, male Drosophila with more sexual experience had shorter mating duration. During our investigation into the sensory inputs for SMD behavior, we found a small number of cells that express sugar receptors and pheromone receptors (ppk25 and ppk29) and thus transmit the multisensory information from females in order to generate memories of sexual experiences, which will determine the final decision of mating duration.

      Our discovery of sensory integration mechanisms associated with complex behavioral trait in male Drosophila at the brain circuit and genetic network levels will be a huge step forward in our knowledge of interval timing behavior.

      Description of the planned revisions

      REVIEWER #1

        • Overall I think this would be difficult for a general audience as the rationale and explanation of experiments needs to be clearer. * Answer: During the revision process, we will make our text more accessible to a wide audience.

      REVIEWER #2

        • 'The knockdown of LUSH, an odorant-binding protein' Lush is expressed in trichoid sensilla in olfactory organs; from the beginning, they exclude the role of olfaction, and later on they said 'suggesting that the expression of the pheromone sensing proteins LUSH and Snmp1 in Gr5a-positive gustatory neurons is critical for generating SMD behavior.'? Therefore, I recommend that, if available, the authors provide a reference for the statement in the Methods section that the Orco1 line was "validated via electrophysiology", or include the electrophysiology data itself in this manuscript as a supplementary figure. Ideally, positive behavioral controls for this line would also be included in the manuscript. * Answer: We value the reviewer's concern. LUSH was discovered as an odorant-binding protein; nevertheless, current research suggests that LUSH may be involved in the sensing of pheromones in addition to cVA, implying the presence of a lush-independent cVA detection mechanism [1]. Billeter et al. demonstrated in their paper that LUSH detects a female stimulatory chemical and modifies male mating latency (Fig. 2 of Billeter et al.). As Billeter et al. stated, our present understanding of pheromonal recognition in Drosophila is insufficient, and we concur. As a result, we attempted to validate the expression of Snmp1 in the male leg by experiments (Fig. 7I-J) and by performing sncRNA seq analysis on the Fly SCope dataset, as shown in Fig. 12. As demonstrated in Fig. 12, Snmp1 and LUSH are highly expressed in the fly leg and wing. Future studies will look at the roles of Snmp1 and LUSH in female pheromone sensing, as well as PPK receptors.

      Following the reviewer's advice, we will repeat the electrophysiologically validated Orco2 mutant phenotype with proper control and attach it when we submit the complete revision to the journal.

      • What is this (GustDx6)? I suggest using Poxn mutant line. *

      Answer: We value the reviewer's recommendation. We believe we have previously demonstrated that the Gr5a-mediated gustatory pathway is essential for the generation of sensory input for SMD behavior, but we will test the Poxn mutant and Poxn-RNAi to replace the GustDx6 mutant result.

      Description of the revisions that have already been incorporated in the transferred manuscript

      REVIEWER #1

      1. My copy of this ms does not have page numbers or line numbers, this makes it extremely difficult to identify where I am making queries/ suggestions. I don't know whether this is a decision of the journal or authors, but please change this in the future.* Answer: We put page numbers and line numbers.

      2. A general point, there is simply too much in this ms. It covers too much ground and so doesn't give proper descriptions, discuss the consequences of the data fully or integrate properly with existing literature. Quantity does not equal impact. *

      Answer: We appreciate the reviewer's insight. We have previously separated this document from our original preprint [2] in response to a prior reviewer's advice; we believe we have included too much data, which may confuse readers. As a result, we will delete all of the mechanosensory/thermosensory receptor screening data from our present paper and write a second manuscript on sensory integration for the production of SMD behavior. We also removed most of the sncRNA seq data analysis except Fig. 12, which confirms our finding in a single diagram.

      • Results paragraph 1 says that white mutant background had no effect "unlike that of LMD behavior as reported previously", ignoring that there has been a contrary report that extension of mating duration after exposure to a rival does not involve visual cues and so is not affected by the white mutation (Bretman et al 2011 Curr Biol). *

      Answer: We recognize that there is a conflicting report concerning white mutation on LMD behavior, however because we are now reporting SMD rather than LMD behavior, we deleted the statement comparing white mutant results to earlier reports, as shown below;

      “thus suggesting that the effect of the white mutant genetic background was not evident.” (line 97)

      • A general point in the methodology, it's not very helpful just to say "as in a previous study" without giving at least a brief idea of what that was (e.g. the explanation of egg counting procedures).

      A "sperm depletion" assay is described in the results that I cannot find any methodology for. *

      Answer: We thank the reviewer for allowing us to clarify our lacking methodologies for a better comprehension of our manuscript.

      We included the egg counting procedure to the EXPERIMENTAL PROCEDURES section to further illustrate our approach of egg laying assay as below;

      “In short, wild type females mated with naïve or experienced males were transferred to a fresh new vial and allowed to lay eggs for 24 hr at 25°C. After 24 hr of egg laying, number of eggs were counted under the stereomicroscope. After we count the number of eggs, we kept vials in 25°C incubator and counted the total number of progenies eclosed from them.” (line 956-960)

      We included “Sperm Depletion from Males” section in EXPERIMENTAL PROCEDURES as below;

      “To deplete sperm from males, 40 virgin Defexel6234 females which lacks SPR and shows multiple mating with males (Yang 2009) were placed in a vial containing four CS males for indicated time (2 h, 4 h, 8 h, and 24 h).” (line 880)

      • Was the "excessive mating" with SPR females actually observed, or inferred from previous work? Needs to be clear. In what way do virgins expressing fruitless behave like mated females? It is so unclear how all the evidence in this paragraph leads to the conclusion that both cues from females and successful copulation. Especially as in the next paragraph experience with feminized females (with which the focal males cannot copulate) elicits the response.

      It might be helpful to combine the results into a table, so it is easy to see under which conditions males reduce mating duration. *

      Answer: We modified the sentence describing SPR mutant female experiment and added references as below;

      “Sexual experiences with sex peptide receptor (SPR) mutant females which exhibit a delayed post-mating response and multiple mating with males [3] had no additional effect on SMD (Fig. 2I).” (line 135)

      We clarified to what extent fru>UAS-mSP virgin females behave like mated females, as below;

      “Virgin females behave like mated females by expressing a membrane-bound version of male sex-peptide in fruitless-positive neurons, hence rejecting the male's copulation attempt.” (line 136)

      In the instance of feminized males, we assume that these feminine males can give adequate signals for inducing SMD and eliminated the term "successful copulation" since we are unsure if males can copulate these feminized males or not, despite the fact that males can mount and mate with them (Fig. 2O-P).

      Tables S1 and S2 describe the conditions, genotypes, and descriptions of the experiments illustrated in Fig. 2. We believe that these tables may assist general audiences in comprehending our experimental design.

      • Why are no statistics reported in the results? Identifying sig diffs on figures is not sufficient. I'm very sceptical that "mating duration of males showed normal distribution" for all comparisons, but then it's also difficult to identify which were analysed in this way (if statistics were properly reported this would not be an issue). *

      Answer: We described our statistical analysis of mating duration previously [4–7] and followed the statistical analysis of the copulation duration assay reported by Crickmore et al., published in CELL (2013) and NEURON (2020) [8,9]. To further validate our statistical analysis, we added estimation statistics, which focus on the effect size of one's experiment/intervention, as opposed to significance testing [10]. We already described our statistical analysis in detail in the EXPERIMENTAL PROCEDURES section. We also stated in the Fig. 1 legend that the same statistical analysis of mating duration applies to all other figures.

      We appreciate the reviewer's recommendation that the normal distribution of our mating duration data be validated. As a consequence, we performed the normality test with GraphPad Prism and added the histogram and QQ plot results to Fig. S1M and N. Table S3 also contains the results of the normality and lognormality tests.
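      For readers unfamiliar with QQ plots, the following minimal sketch shows how such a check can be computed. It is not the GraphPad Prism procedure used for Fig. S1M-N: the mating durations are hypothetical values in minutes, and the plotting positions (i + 0.5)/n are one common convention among several.

```python
from statistics import NormalDist, mean, stdev

# Assumption: hypothetical mating durations in minutes, for illustration only.
durations = [14.2, 15.1, 15.8, 16.0, 16.4, 16.9, 17.3, 17.8, 18.5, 19.6]


def qq_points(sample):
    """Pair each ordered observation with the matching quantile of a
    normal distribution fitted to the sample mean and standard deviation."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    ordered = sorted(sample)
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


def pearson_r(points):
    """Pearson correlation between theoretical and observed quantiles."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


points = qq_points(durations)
r = pearson_r(points)
print(f"QQ correlation: {r:.3f}")
```

      Points that fall close to a straight line (a correlation near 1) are consistent with a normal distribution; a formal normality test would still be needed to support a quantitative claim.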

      • Gr5a/ Gr66a mediate acceptance/ avoidance of what? Why would you hypothesise these in particular to be involved? *

      Answer: We accidentally left out the citation for that phrase and updated it with Wang et al.'s CELL (2004) paper. Wang et al. wrote in their article about taste representations in the Drosophila brain, “Our behavioral studies reveal that Gr5a cells recognize sugars and mediate acceptance/attractive behaviors whereas Gr66a cells recognize bitter compounds and mediate avoidance…. This suggests that Gr5a cells may be “acceptance” cells rather than “sweet” cells…. Our expression and behavioral studies reveal that Gr5a marks cells that recognize sugars and mediate taste acceptance, whereas Gr66a marks cells that recognize bitter compounds and mediate avoidance.” [11]

      As a result, we hypothesize that Gr5a and Gr66a-positive cells influence acceptance or avoidance of "taste." We also changed certain sentences to make them clearer, as seen below;

      “Of the various gustatory receptors, Gr5a marks cells that recognize sugars and mediate taste acceptance, whereas Gr66a marks cells that recognizes bitter compounds and mediates avoidance.” (line 173)

      • As Orco was not found to affect the behaviour, why test Or67d? *

      Answer: We appreciate the reviewer bringing this to our attention. We omitted the Or67d result from the present manuscript to simplify it and make it easier for readers to grasp.

      • "Mate guarding" suddenly appears in the modelling section. Can a difference of a couple of minutes in a mating duration of 15-20min really be considered mate guarding? A similar variation in response to rival males is not considered mate guarding, but is linked to adjustments in ejaculate expenditure (admittedly not in a very straight forward way). Surely in a system like this the benefits arise more from how many females the male can mate with in a given time? How does this model relate to any of the previous models of mate guarding?

      In this section the work of Linklater et al 2007 is important, they showed progeny declined over successive matings, and related this to exhaustion of Acps rather than sperm. I would urge the authors to consider that what they observe does not necessarily have an adaptive explanation. *

      Answer: We have defined “mate guarding” in the text now. The costs and benefits of mate guarding have been extensively studied in insects and demonstrated to shape the optimal mating duration of males. In our experiment, we cannot specify whether the shortened mating duration was caused by the adjustments in ejaculate expenditure or a shortened stay after the ejaculation. Instead, our model has a general assumption that the costs of mate guarding increase linearly at the same rate in both pre- and post-ejaculation periods, which is highlighted in the model text.

      There exist many models for the optimal mating duration (earlier models include Grafen and Ridley, 1983. A model of mate guarding. J. Theor. Biol. 102: 549 – 567 [12]). While our model was not built upon a novel theoretical approach (it was built based on the classical Charnov’s marginal value theorem equation), our model was developed specifically for generating testable predictions for the observed SMD behaviors.

      We have rephrased the text as follows;

      “This model assumes that (i) the shortened (or prolonged) mating duration is controlled by males and shaped by a trade-off between the benefit of mate guarding (remaining with the female both before and after the sperm ejaculation) and opportunistic costs (e.g. searching for another mate).” (line 970)

      • I can't find a data accessibility statement. *

      Answer: We added it in the manuscript.

      • That said, a current grand challenge in understanding behaviour is discovering the mechanisms that enable individuals to respond plastically to changing environments. This speaks directly to that challenge. However, this behavioural observation is not novel, as claimed. Generally the idea of refractoriness is widely known, and specifically the reduction in mating duration over successive matings in D. melanogaster was shown by Linklater et al 2007 Evolution. Moreover, the time between exposure to females has been shown to be important. Linklater et al 2007 gave males mating attempts in quick succession and observed the decrease in mating duration, whereas given recovery time of 3 days, males either mate equally as long, or even longer across their life course (Bretman et al 2011 Proc B, Bretman et al 2013 Evolution). These papers should be discussed, and more broadly the work understood in the light of previous knowledge. The behaviour does not need to be novel for this manuscript to make a significant contribution to the field. *

      Answer: We believe the reviewer highlighted relevant past research that examined the influence of female experiences on mating duration. We agree with the reviewer that SMD behavior does not have to be original in order to contribute significantly to the field. As a result, we examined past reports and updated the introduction as follows;

      “It has been reported that previous sexual experience with females influences the mating duration of male D. melanogaster [15,16,34]; however, the neural circuits and physiology underlying this behavior have not been deeply investigated. Here, we report the sensory integration mechanisms by which sexually experienced males exhibit plastic behavior by limiting their investment in copulation time; we refer to this behavior as "shorter mating duration (SMD)."” (line 85)

      • Both in the introduction and discussion the extended mating duration in response to rivals is raised. A great deal of work has been done on this plasticity and yet the way this is written implies just two papers from these authors (whilst referencing others elsewhere). *

      Answer: We agree with the reviewer. In the introductory and discussion sections, we cited as many key publications explaining the plastic responses of male mating duration as we could.

      REVIEWER #2

        • Summary: The submitted manuscript reports that Drosophila melanogaster males use information derived from their previous sexual experiences from multiple sensory inputs to optimize their investment in mating. They refer to this plasticity as 'shorter-mating duration (SMD)'. SMD requires sexually dimorphic taste neurons. They identified several neurons in the male foreleg and midleg that express specific sugar, pheromone and mechanosensory receptors. Unfortunately, several aspects of the study design and methods used are inappropriate. Although the statistical approaches used are appropriate, the results are questionable. The discussion and conclusions are therefore too speculative in my view and overstretch the implications of the results as presented. Below I explain each one of these concerns about the study design, methods and results in detail as follows.* Answer: We appreciate the reviewer's assessment, especially the statement that our statistical approaches were appropriate. We will revise our manuscript in response to the reviewer's suggestions.
      1. The conclusions (as the authors point out) hinge on small (often extremely small) effect sizes. This is not an insurmountable problem, so long as the assays are robust across trials. Unfortunately, they are not: the variation in the baseline for control replicates is often as large as, or larger than, the effects from which the conclusions are derived. Given the extreme experimental challenges of small effect size combined with large intertrial variability, it is notable that the authors do not report any likely false negative or false positive data, as would be frequently expected under these conditions. One explanation for the reproducibility of statistical effect seen across many experiments despite these experimental hurdles is manipulation of sample size. The authors acknowledge the extreme variability in sample size and offer seemingly harmless explanations, but a closer look shows how problematic this practice is. For example, see Figure 1 (I, J, L): is there a big difference between naive and experienced males? *

      Answer: We value the reviewer's feedback. Several studies have been conducted to investigate the mating duration of male fruit flies. For example, our lab [2,13–15] and others [13–30] have regularly reported that previous rival exposure increases male fruit fly mating duration. Bretman A et al. utilized 49-59 males in their studies to compare the variations in mating duration between circumstances. Crickmore et al. also reported the effect of mating duration differences caused by genetic or experimental modification [8]. They utilized 10-18 male flies in their study to compare the variations in mating duration across circumstances, as shown in Figs. 1G (n=15-18) and 2A (n=10-27). All of these findings indicate that our mating duration sample size is sufficient to examine the effect size variations between the naive and experienced conditions. To confirm our statistical analysis further, we incorporated estimation statistics, which focus on the effect size of one's experiment/intervention rather than significance tests [10]. We have already detailed our statistical analyses under the EXPERIMENTAL PROCEDURES section. We conducted hundreds of mating duration assays using this configuration and confirmed that all of our results are reproducible in a blind test. As a result, we believe our mating duration assay has been validated by other groups' findings, several analytic tools, and numerous blind tests conducted by us. We appreciate the reviewer's concerns, but our data meet the reproducibility requirements.
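      To illustrate what an effect-size-focused analysis reports, the sketch below computes a mean difference with a bootstrap percentile confidence interval. This is a generic illustration, not the authors' analysis or the exact method of reference [10]: the mating durations are hypothetical values in minutes, and the function name, number of resamples, and seed are arbitrary choices.

```python
import random
from statistics import mean

# Assumption: hypothetical mating durations in minutes, for illustration only.
naive = [17.1, 16.4, 18.0, 17.5, 16.9, 17.8, 16.2, 17.3]
experienced = [14.9, 15.3, 14.2, 15.8, 14.6, 15.1, 14.4, 15.5]


def bootstrap_mean_difference(control, test, n_resamples=5000, alpha=0.05, seed=0):
    """Return the observed mean difference (test - control) and a
    percentile bootstrap confidence interval for it."""
    rng = random.Random(seed)
    observed = mean(test) - mean(control)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, with replacement.
        resampled_control = rng.choices(control, k=len(control))
        resampled_test = rng.choices(test, k=len(test))
        diffs.append(mean(resampled_test) - mean(resampled_control))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return observed, lower, upper


observed, lower, upper = bootstrap_mean_difference(naive, experienced)
print(f"mean difference: {observed:.2f} min (95% CI {lower:.2f} to {upper:.2f})")
```

      An interval that excludes zero indicates a shift in mating duration, and its width shows how precisely the effect is estimated at the given sample size, which is the quantity at issue in the reviewer's comment.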

      • I am not sure if you keep using the same control with different experiments (that is okay if those experiments are done at the same time) as in figure 1 B, I, J, K, L. But I don't think you did Fig 1B at the same time as Fig 1I, J, K, L. *

      Answer: We appreciate the reviewer's feedback. Yes, all of our tests comparing the differences in mating duration between naive and experienced conditions were conducted under the same conditions and at the same time. We replaced Fig. 1B with new data (n=49-51) obtained recently in a new lab in China. As previously stated, SMD behavior could be reproduced by the same Canton S genotype in different locations by different experimenters.

      • It will be clear if you mention in the text how much reduction in percent happened in copulation duration when the males had previous sexual experience? *

      Answer: We appreciate the reviewer’s suggestion and added the following to the manuscript;

      “We found that the mating duration of various wild-type and w1118 naïve males are significantly longer (wild type 15.7~15.8%, w1118 12.4%) than that of sexually experienced males (Fig. 1B-D, Fig. S1A)” (line 99)

      • 'Drosophila simulans, the sibling species of D. melanogaster also exhibits SMD, thus suggesting that SMD is conserved between close species of D. melanogaster (Fig. S1B).'. If you want to come to this conclusion, you need to test D. erecta, D. sechellia and D. yakuba. *

      Answer: We appreciate the reviewer's feedback. We removed the D. simulans data because it is not required for the conclusion of this manuscript. In future research, we will look into the conservation of SMD behavior between species.

      • The authors mention that Gr66a is salt. This is not 100% correct. GR66a is expressed in many bitter sensing neurons and is required for the physiological and behavioral responses to many bitter compounds. check this reference DOI:https://doi.org/10.1016/j.cub.2019.11.005. *

      Answer: We made the following changes and cited the article the reviewer suggested.

      “Of the various gustatory receptors, Gr5a marks cells that recognize sugars and mediate taste acceptance, whereas Gr66a marks cells that recognizes bitter compounds and mediates avoidance (Wang et al, 2004; Dweck & Carlson, 2020).” (line 180)

      • Drosophila melanogaster mating duration is between 21- 23 mins. I never saw copulation duration in normal condition (control) 10-15 mins as in figure fig 2E, Fig 7 C,E,F, Fig 8 E and fig 12 G . To the best of my knowledge, of all of the papers on copulation duration, the only one that ascribes a shortened duration to manipulations of the female is Rideout...Goodwin Nature Neuroscience 2010, who argue that this shortening results from markedly increased female activity/agitation during mating, leading the male to terminate early. *

      Answer: We appreciate the reviewer's feedback. Copulation duration in Drosophila melanogaster males is extremely variable and has been reported to be approximately 20 minutes. However, as other groups documented, male copulation duration can range from 10-15 minutes depending on sperm competition (Fig. 1a-c of Bretman A et al.) [30] and genetic background (Fig. 1C, Fig. 2E, Fig. 5D, and Fig. 7A and E of Crickmore et al) [8]. And, as previously stated, males dominate copulation duration [8,30], not females, and we always utilized the same genotype of females for the mating duration experiments. As a result, we believe that these rather short mating duration outcomes are the product of a distinct genetic background. Because we employed the same genotype of males while altering the female experience condition, we believe our mating duration results are all equivalent and comparable.

      • In some experiments, the authors test very few replicates, which does not convince me of their conclusion, for example Fig 2F and Fig 12 E. Why do you test 100, 103 replicates in the experiment of fig 10 F? How do you compare 47 replicates against 9 replicates in fig S10 I? *

      Answer: We appreciate the reviewer's input. As we previously stated in response to Reviewer Question 2, the number of males (n) shown in Figs. 2F and 12E is sufficient for statistical comparison. To corroborate findings with replication, we examined 100 and 103 replicates for Fig. 10F, which represents pyx-RNAi screening results. The results of Fig. S10I are screening data, and we cannot rule out the possibility that TrpA1 knockdown in Gr5a neurons affects the mating success of sexually experienced males. We only placed it there because these were screening results and the differences between naive and experienced conditions were substantial despite the small sample size. However, we deleted the Fig. 10F and Fig. S10I data from the current paper in response to Reviewer #1's advice, thus it will not be an issue for the manuscript's conclusion.

      • 'Next, to decipher whether DEG/NaC channel-expressing pheromone sensing neurons require the function of OBP, we expressed lush-RNAi using ppk23-, ppk25- and ppk29-GAL4 drivers to knockdown LUSH in each channel-expressing neuron. The knockdown of LUSH in ppk25- and ppk29-GAL4 labeled cells, but not in ppk23-GAL4 labeled cells, led to a disturbance in SMD behavior, thus suggesting that LUSH functions in ppk25- and ppk29-positive neurons to detect pheromones and elicit SMD behavior (Fig. 9G-I). The knockdown of SNMP1 in ppk29-GAL4- labeled neurons also inhibited SMD behavior (Fig. 9J), thus suggesting that SNMP1 also functions in ppk29-positive neurons to induce SMD behavior.' What about ppk25? *

      Answer: As indicated by the reviewer, we included ppk25-GAL4/snmp1-RNAi data in Fig. S9I, indicating that snmp1 expression in ppk25-positive cells is similarly implicated in SMD behavior.

      • There are no page or line numbers throughout the ms!

      Answer: We included page and line numbers.

      • The use of subheadings in the results section makes reading much easier.

      Answer: We added subheadings in the results section.

      • 'We found that the mating duration of various wild-type and w1118 naïve males are significantly longer than that of sexually experienced males (Fig. 1B-D, Fig. S1A)'. I think you should change various wild type to CS and WT Berlin as in legend and figure 1B,C.

      Answer: The revised sentence is as follows:

      “We found that the mating duration of Canton S, WT-Berlin, Oregon-R, and w1118 naïve males are significantly longer (wild type 15.7~15.8%, w1118 12.4%) than that of sexually experienced males (Fig. 1B-D, Fig. S1A)” (line 102)

      • Suggested experiment, Fig S1E-H: the authors might test 2, 6, and 12 hours of male separation from females to determine exactly when this behavior changes over time.

      Answer: We value the reviewer's recommendation. As seen in Fig. S4B of Kim et al., we have previously conducted experiments for examining the memory circuit of SMD [6]. Briefly, the male with a shorter mating duration recovers completely after 12 to 24 hours of isolation from females. As we are currently preparing the memory section of the SMD study, this information will be included in a future manuscript.

      • General comment on figures: you could remove the common y axis, for example in figure 1B,C,D (difference between means and mating duration).

      Answer: We welcome the reviewer's idea; however, in this case we believe that the y axes of the data sets are independent of one another, and we will therefore retain the originals. We feel this will be more useful for a general audience.

      • You might move the number of replicates to the legend.

      Answer: We appreciate the reviewer's idea; however, we feel that including more information in the figures will help a general audience understand our statistics.

      • Latin names should be italic, for example Drosophila simulans.

      Answer: We fixed it.

      Description of analyses that authors prefer not to carry out

      N/A

      References

      1. Billeter J-C, Levine JD. The role of cVA and the Odorant binding protein Lush in social and sexual behavior in Drosophila melanogaster. Frontiers Ecol Evol. 2015;3: 75. doi:10.3389/fevo.2015.00075
      2. Kim WJ, Lee SG, Schweizer J, Auge A-C, Jan LY, Jan YN. Sexually experienced male Drosophila melanogaster uses gustatory-to-neuropeptide integrative circuits to reduce time investment for mating. Biorxiv. 2016; 088724. doi:10.1101/088724
      3. Yang C, Rumpf S, Xiang Y, Gordon MD, Song W, Jan LY, et al. Control of the Postmating Behavioral Switch in Drosophila Females by Internal Sensory Neurons. Neuron. 2009;61: 519–526. doi:10.1016/j.neuron.2008.12.021
      4. Kim WJ, Jan LY, Jan YN. Contribution of visual and circadian neural circuits to memory for prolonged mating induced by rivals. Nat Neurosci. 2012;15: 876–883. doi:10.1038/nn.3104
      5. Kim WJ, Jan LY, Jan YN. A PDF/NPF Neuropeptide Signaling Circuitry of Male Drosophila melanogaster Controls Rival-Induced Prolonged Mating. Neuron. 2013;80: 1190–1205. doi:10.1016/j.neuron.2013.09.034
      6. Kim WJ, Lee SG, Auge A-C, Jan LY, Jan YN. Sexually satiated male uses gustatory-to-neuropeptide integrative circuits to reduce time investment for mating. Biorxiv. 2016; 088724. doi:10.1101/088724
      7. Wong K, Schweizer J, Nguyen K-NH, Atieh S, Kim WJ. Neuropeptide relay between SIFa signaling controls the experience-dependent mating duration of male Drosophila. Biorxiv. 2019; 819045. doi:10.1101/819045
      8. Crickmore MA, Vosshall LB. Opposing Dopaminergic and GABAergic Neurons Control the Duration and Persistence of Copulation in Drosophila. Cell. 2013;155: 881–893. doi:10.1016/j.cell.2013.09.055
      9. Thornquist SC, Langer K, Zhang SX, Rogulja D, Crickmore MA. CaMKII Measures the Passage of Time to Coordinate Behavior and Motivational State. Neuron. 2020;105: 334-345.e9. doi:10.1016/j.neuron.2019.10.018
      10. Claridge-Chang A, Assam PN. Estimation statistics should replace significance testing. Nat Methods. 2016;13: 108–109. doi:10.1038/nmeth.3729
      11. Wang Z, Singhvi A, Kong P, Scott K. Taste Representations in the Drosophila Brain. Cell. 2004;117: 981–991. doi:10.1016/j.cell.2004.06.011
      12. Grafen A, Ridley M. A model of mate guarding. J Theor Biol. 1983;102: 549–567. doi:10.1016/0022-5193(83)90390-9
      13. Kim WJ, Jan LY, Jan YN. A PDF/NPF Neuropeptide Signaling Circuitry of Male Drosophila melanogaster Controls Rival-Induced Prolonged Mating. Neuron. 2013;80: 1190–1205. doi:10.1016/j.neuron.2013.09.034
      14. Kim WJ, Jan LY, Jan YN. Contribution of visual and circadian neural circuits to memory for prolonged mating induced by rivals. Nat Neurosci. 2012;15: 876–883. doi:10.1038/nn.3104
      15. Wong K, Schweizer J, Nguyen K-NH, Atieh S, Kim WJ. Neuropeptide relay between SIFa signaling controls the experience-dependent mating duration of male Drosophila. Biorxiv. 2019; 819045. doi:10.1101/819045
      16. Bretman A, Fricke C, Chapman T. Plastic responses of male Drosophila melanogaster to the level of sperm competition increase male reproductive fitness. Proc Royal Soc B Biological Sci. 2009;276: 1705–1711. doi:10.1098/rspb.2008.1878
      17. Bretman A, Westmancoat JD, Chapman T. Male control of mating duration following exposure to rivals in fruitflies. J Insect Physiol. 2013;59: 824–827. doi:10.1016/j.jinsphys.2013.05.011
      18. Bretman A, Gage MJG, Chapman T. Quick-change artists: male plastic behavioural responses to rivals. Trends Ecol Evol. 2011;26: 467–473. doi:10.1016/j.tree.2011.05.002
      19. Lizé A, Doff RJ, Smaller EA, Lewis Z, Hurst GDD. Perception of male–male competition influences Drosophila copulation behaviour even in species where females rarely remate. Biol Letters. 2012;8: 35–38. doi:10.1098/rsbl.2011.0544
      20. Rouse J, Bretman A. Exposure time to rivals and sensory cues affect how quickly males respond to changes in sperm competition threat. Anim Behav. 2016;122: 1–8. doi:10.1016/j.anbehav.2016.09.011
      21. Bretman A, Fricke C, Hetherington P, Stone R, Chapman T. Exposure to rivals and plastic responses to sperm competition in Drosophila melanogaster. Behav Ecol. 2010;21: 317–321. doi:10.1093/beheco/arp189
      22. Rouse J, Watkinson K, Bretman A. Flexible memory controls sperm competition responses in male Drosophila melanogaster. Proc Royal Soc B Biological Sci. 2018;285: 20180619. doi:10.1098/rspb.2018.0619
      23. Maguire CP, Lizé A, Price TAR. Assessment of Rival Males through the Use of Multiple Sensory Cues in the Fruitfly Drosophila pseudoobscura. Plos One. 2015;10: e0123058. doi:10.1371/journal.pone.0123058
      24. Bretman A, Westmancoat JD, Gage MJG, Chapman T. COSTS AND BENEFITS OF LIFETIME EXPOSURE TO MATING RIVALS IN MALE DROSOPHILA MELANOGASTER. Evolution. 2013;67: 2413–2422. doi:10.1111/evo.12125
      25. Bretman A, Fricke C, Westmancoat JD, Chapman T. Effect of competitive cues on reproductive morphology and behavioral plasticity in male fruitflies. Behav Ecol. 2016;27: 452–461. doi:10.1093/beheco/arv170
      26. Price TAR, Lizé A, Marcello M, Bretman A. Experience of mating rivals causes males to modulate sperm transfer in the fly Drosophila pseudoobscura. J Insect Physiol. 2012;58: 1669–1675. doi:10.1016/j.jinsphys.2012.10.008
      27. Bretman A, Westmancoat JD, Gage MJG, Chapman T. Males Use Multiple, Redundant Cues to Detect Mating Rivals. Curr Biol. 2011;21: 617–622. doi:10.1016/j.cub.2011.03.008
      28. Fowler EK, Leigh S, Rostant WG, Thomas A, Bretman A, Chapman T. Memory of social experience affects female fecundity via perception of fly deposits. Bmc Biol. 2022;20: 244. doi:10.1186/s12915-022-01438-5
      29. Dore AA, Rostant WG, Bretman A, Chapman T. Plastic male mating behavior evolves in response to the competitive environment*. Evolution. 2021;75: 101–115. doi:10.1111/evo.14089
      30. Bretman A, Fricke C, Chapman T. Plastic responses of male Drosophila melanogaster to the level of sperm competition increase male reproductive fitness. Proc Royal Soc B Biological Sci. 2009;276: 1705–1711. doi:10.1098/rspb.2008.1878
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      Minor edits

      1. Line 91. Is a bit misleading to say "many other vibrios" possess T3SS. This conveys that this is perhaps the majority, but T3SS in vibrios is at best 50/50. I think best just to delete this sentence.

      We deleted this comment, as suggested.

      1. Line 102. Revised to "Thus, in this study, we set out to..." Since the entire paragraph starts with "recent study" I missed that this was summary of new data rather than preview of new results.

      The sentence was revised as suggested.

      1. Line 503. Correct "xxx-584" or more detail on what this means.

      We thank the reviewer for pointing out this typo. This refers to the deletion made in tie1, in the region corresponding to nucleotides 485-584 of this gene. The text was corrected accordingly.

      1. Line 603. Salmonella should be italicized.

      Corrected.

      1. The labelling of the figures is pretty complicated with the long genetic designations. Is it reasonable to for example name the ∆vprh/∆hns1 strain with an abbreviation (such as ∆VH)? Or instead create a strain name, common used approaches would be HC## (for Hadar Cohen) or TAU# for Tel Aviv University. If you go this route, be sure to update the strain list. The current method can be followed, the figures are just complicated.

      We thank the reviewer for raising this concern. We acknowledge the difficulty in following the many different strains and mutations. Nevertheless, after considering the proposed modifications to the strain names, we believe that they will not add much clarity, and may even cause some confusion. Therefore, we respectfully decided to keep the current nomenclature in place.

      Reviewer #2:

      Minor edits

      1. The authors used a hyperactive T6SS (HNS mutant) to investigate its toxicity. Would the authors be able to use a wild type strain to reproduce the function of T6SS?

      We have yet to reveal the external cues that lead to full activation of T6SS3 in vitro. Therefore, in the current study we used genetic tools, such as hns deletion or Ats3 over-expression, to monitor the effect of this system on immune cells. We will dissect the activating conditions in future studies, but we believe that the use of genetic tools should not affect the validity of the results in the current study, nor their timely publication.

      1. The authors showed that Tie1 and Tie2 are secreted by T6SS3. It is important to show if they are actually delivered into the host cells during infection. Otherwise it is hard to conclude that they are truly effectors. The primary concern is the lack of in vivo studies to show that Tie1 and Tie2 are actually effectors that play a role in activation of NLRP3 inflammasome.

      We present 3 pieces of evidence that, when taken together, support the conclusion that Tie1 and Tie2 are T6SS3 effectors: 1) the proteins are secreted in a T6SS3-dependent manner; 2) their deletion does not hamper overall T6SS3 activity; and 3) their deletion causes the same loss of NLRP3-mediated inflammasome activation and pyroptosis as does inactivation of T6SS3 by deletion of its structural component, tssL3. Although we agree with the reviewer that directly showing delivery of Tie1 and Tie2 into host cells will further strengthen our conclusion, such experiments are quite challenging and difficult to interpret, especially with T6SS effectors that can use diverse mechanisms for secretion through the system. This point was also noted by reviewer #3: “…I believe they were suggesting to demonstrate secretion in host cells. Although this would be nice, it is non-standard and technically not feasible. These types of experiments require genetically fusing the effector with either an enzymatic moiety (e.g. Beta lactamase) or fragment of split GFP. Although such approaches have been previously performed, they often result in either blocked or aberrant secretion due to the presence of the added fragment.”

      Regarding the reviewer’s comment on the lack of in vivo studies: we agree that these are extremely important, yet they are beyond the scope of the current work, as concurred by reviewers #1 and #3:

      Reviewer#1 with regard to Reviewer#2: "I don't think mouse (or aquatic animal) studies are essential for this study. The work contributes nicely to our understanding molecular mechanisms of this T6SS system. As noted in my review, there are many additional lines of study that can be pursued from this work, including animal studies, but this should not preclude publication of this work that is itself an intact unit."

      Reviewer#3 regarding reviewer #1's comment on Reviewer#2: "I don't believe that reviewer #2 was suggesting to perform mouse or aquatic animal studies by suggesting in vivo demonstration of secretion…”

      Reviewer #3:

      Major comments:

      1. If the authors believe that GSDME partially compensates in the absence of GSDMD, have they infected a GSDME/GSDMD double knockouts to see if there is an additive effect?

      Indeed, this is a very interesting and specific question for the cell death field. We do not currently possess such a GSDME/GSDMD double knockout mouse, and generating one will be a long endeavor. Since its absence does not diminish the importance or the conclusions of the current work, we think that it should not warrant a delay in publication. We do plan to address this question in future studies.

      1. It is clear that Ats3 regulates T6SS3, but not the T6SS1; however, there is no evidence suggesting that Ats3 does not regulate other gene clusters. For example, have the authors performed RNA seq to compare the transcriptomes of WT and an Ats3 mutant? If not, the authors should refrain from using the words "specific activation".

      We thank the reviewer for this important note. Indeed, we lack additional data indicating that Ats3’s effect is indeed restricted only to T6SS3. Therefore, we modified the text accordingly and removed mentions of specific T6SS3 activation.

      1. In figure 6B, it's unclear why the bacteria infecting cytochalasin D-treated cells grow more than the T6SS3 mutants in the absence of cytochalasin D.

      The difference probably stems from the fact that phagocytosis, the major mechanism by which BMDMs kill bacteria, is hampered in the presence of cytochalasin D, thus allowing bacteria to grow more than when the BMDMs phagocytose them. The results show that in the absence of cytochalasin D, an active T6SS3 counteracts the killing effect by BMDMs with functional phagocytosis.

      Minor comments:

      1. Figure 1A and other secretion assays: The Western blots include loading control (LC) blots. These are non-standard, non-informative, and not required with the inclusion of the western blots on the "cells" fraction. I would suggest removing these as they may confuse the reader.

      We respectfully disagree. Loading controls are standard in bacterial secretion assays, and they are important since they confirm comparable loading and allow proper analysis of the results, especially since we aim to determine whether certain mutations affect the expression of T6SS components. Notably, some groups choose to blot for a cytoplasmic protein (e.g., RpoB in Allsop et al., PNAS, 2017; Liang et al., PLoS Pathogens, 2021) instead of showing overall loaded proteins, as shown in our figures.

      1. Line 503: "xxx" should reflect the actual nucleotide numbers

      We thank the reviewer for pointing out this typo. This refers to the deletion made in tie1, in the region corresponding to nucleotides 485-584 of this gene. The text was corrected accordingly.

      1. Since V. proteolyticus is an aquatic pathogen, have the authors tried to infect corals, fish, and crustaceans (or derived cells) with WT and effector mutants?

      This is an interesting point, and indeed we are setting up such systems and we plan to perform such experiments in the future as part of follow up projects. However, these in vivo studies are beyond the scope of the current manuscript, as also noted by the reviewer in the cross-consultation comments: “…my previous comment on infecting aquatic animals or cells derived from them is non-standard and not necessary…”

      1. Are the targeted host proteins in this study (performed with murine BMDM) conserved in the natural hosts for V. proteolyticus?

      We hypothesize that the conservation is not in the pathway components that are activated upon infection, but rather in the ability of the host cell to sense danger (i.e., to sense the effect of T6SS3 effectors on the host cell or one of its components), which is the role of the NLRP3 inflammasome in mammalian cells. It is well documented that major differences in immune mechanisms exist between mammals and the potential natural marine hosts of V. proteolyticus (e.g., corals, arthropods, and fish); therefore, the conservation at the protein level is low. Nevertheless, basic signaling pathways, such as programmed cell death, are conserved between the different phyla. For example, a caspase-1 homolog which was found in arthropods (Chu, B. et al. PLoS One (2014). doi:10.1371/journal.pone.0085343) probably induces an apoptotic-like cell death mechanism, similar to apoptosis in C. elegans. We now provide further discussion on this point in the text (lines 648-659).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This paper by Cohen et al. described discovery of the function of novel genes in the T6SS operon of Vibrio proteolyticus, a Vibrio isolated from corals. V. proteolyticus also impacts other sea animals. The T6SS3 in particular is found to kill eukaryotic phagocytic cells following engulfment of bacteria into the phagocyte. This strategy of killing phagocytic cells following entry has been shown for other Vibrios. The net goal is protection of the population by the bystander effector. The study first shows that deletion of H-NS (a global negative regulator) stimulates T6SS facilitating ease of work by pushing the system to great cell killing. This allowed them to probe the mechanism of cell death and reveal it as NLRP3 dependent, caspase-1 dependent pyroptosis via pore formation by Gasdermin D. Activation of the inflammasome is also linked to cleavage and release of IL-1beta. When GSDMD is absent, there was a slower cell killing by GSDME via caspase-3 activation. The stimulation of this system is additive by two newly recognized T6SS effectors Tie1 and Tie2.

      The study is complete, the experiments are well conducted and well controlled. The experiments show reproducibility. The manuscript text is clear overall. I suggest no changes in the results or experiments and suggest only a few minor edits of the text.

      Minor edits

      Line 91. Is a bit misleading to say "many other vibrios" possess T3SS. This conveys that this is perhaps the majority, but T3SS in vibrios is at best 50/50. I think best just to delete this sentence.

      Line 102. Revised to "Thus, in this study, we set out to..." Since the entire paragraph starts with "recent study" I missed that this was summary of new data rather than preview of new results.

      Line 503. Correct "xxx-584" or more detail on what this means.

      Line 603. Salmonella should be italicized.

      Figures. The labelling of the figures is pretty complicated with the long genetic designations. Is it reasonable to for example name the ∆vprh/∆hns1 strain with an abbreviation (such as ∆VH)? Or instead create a strain name, common used approaches would be HC## (for Hadar Cohen) or TAU# for Tel Aviv University. If you go this route, be sure to update the strain list. The current method can be followed, the figures are just complicated.

      Referees cross-commenting

      With regard to Reviewer#2, I don't think mouse (or aquatic animal) studies are essential for this study. The work contributes nicely to our understanding molecular mechanisms of this T6SS system. As noted in my review, there are many additional lines of study that can be pursued from this work, including animal studies, but this should not preclude publication of this work that is itself an intact unit.

      Significance

      The work is significant in that it links T6SS to a eukaryotic killing system and discovers novel details regarding the mechanisms of death, that may impact our knowledge of other Vibrio T6SS (including V. cholerae) that also target eukaryotic cell actin. There are remaining questions that could be probed, but these are in my opinion major studies that would easily themselves comprise new papers if done properly and thus are not essential for this paper. These include the structure and biochemical activity of Tie1 and Tie2 and the mechanism of caspase-8 independent activation of caspase-3 to then cleave GSDME. Why NLRP3 is required for caspase-3 activation is also an open question. I look forward to following this work for some time to come. The authors have revealed very interesting effectors and interesting cell biological process that will merit multiple years and multiple manuscripts to unravel. This work will be of interest to the community interested in bacterial toxin systems (microbial pathogenesis), the bacterial effector mechanism field (biochemistry and cell biology), and the inflammasome activation field (immune systems). The work will be of interest (with essentially no modification) directed at these fields of interest.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      Major comments:

      In general, the data support the conclusions. I cannot comment on the atomistic simulation experiment as it is outside of my expertise. I had some difficulties interpreting Figure 2 as the contrast in the colour panels made it difficult to assess the different staining patterns. I would recommend changing the blue to cyan for easier visibility. While I agree that there are some differences between Fig 2F and Fig 2G it is not simple for the non-expert to distinguish the gonadal mesoderm from the somatic mesoderm. I think the enlarged panels could do with also showing the overlap in staining, or at least a tracing of the different cell populations so that the gonadal mesoderm can be clearly defined. Please also add some scale bars to the figure. Figure 3 demonstrates clear differences in gonad morphology between male and female mutants but the contrast in the colour panels A-G could also be improved. Panels H-J are very clear.

      Response: As suggested by Referees 1 and 3 we have modified the colour channels in all figures. We have also enlarged the figures taking away the uninformative region and focused around the enlarged gonads and added scale bars. For Fig 2F-G, we have added a close up of the region of interest both in colour and in black and white. These changes have increased the contrast and facilitate the data interpretation to non-expert readers.

      The rescue experiment in Figure 4 is clearly presented but could the DLC3 mutants in the graph (panel b) please be named similarly to the schematic proteins shown in panel a.

      Response: We have changed the names to maintain nomenclature uniformity.

      I found the difference between the RhoGAP domain mutants and the StART domain mutants of Cv-c to be clearly defined, and correlate with DLC3 function. This is a very interesting result that indicates multiple molecular functions for the Cv-c /DLC family.

      Response: The methods are well described, statistics adequate and the data well described.

      Minor Comments:

      My only suggestion for the text is to provide a more thorough description of the StART domain in the introduction.

      Response: We have included the following paragraph in the introduction describing the StART domain:

      “This family of proteins share different domains: besides the Rho GTPase Activating Protein domain (GAP), they present a protein-protein interacting Sterile Alpha Motif (SAM) at the N terminal end and a Steroidogenic Acute Regulatory protein (StAR)-related lipid transfer (StART) domain at the C terminal. StART domains have been shown in other proteins to be involved in lipid interaction, protein localization and function.”

      Reviewer #2:

      My only issue with the present study is to how well the present experimental findings in Drosophila translate to humans. As far as I can tell the present studies show that inactivating mutations in Cv-c in Drosophila result in failure of germ cell enclosure by somatic cells into the testis, resulting in sterility. In humans, and in experimental mouse transgenic lines, it has been well established that absence of germ cells does not of itself lead to failure of testis differentiation and onward development, nor does it lead automatically to sex reversal or impairment of masculinization. For the latter to occur, there must be impairment/failure of fetal Leydig cell function such that insufficient androgen is produced to effect genital/bodywide masculinization. Obviously, this will happen if no testis forms as appears to be the case in the new human DLC3 mutant reported in the present manuscript (although detail on this is unfortunately lacking). This appears to be different to the previous published DLC3/STARD8 mutant sisters, in whom the phenotype appears to reflect failure of steroidogenesis. Is the proposal that DLC3/STARD8 plays a role in both testis differentiation and in Leydig cell function (steroidogenesis) or is this due to different DLC3 genes? I think the authors need to address these key issues in their discussion, if only to highlight that there are at present many gaps in our understanding.

      The reviewer says:

      “As far as I can tell the present studies show that inactivating mutations in Cv-c in Drosophila result in failure of germ cell enclosure by somatic cells into the testis, resulting in sterility.”

      Response: This sentence does not represent the spirit of our findings accurately and this probably reflects the fact that we stressed the interaction between somatic mesodermal cells and germ cells in Drosophila which probably concealed that the main defects in Cv-c mutants are caused by the abnormal interaction of the mesodermal cells with germ cells but also among themselves. Our study provides insights about a new conserved pathway required in the mesodermal cells for the maintenance of an already formed testis, and only indirectly can be considered to deal with sterility. We show that Cv-c is required in the mesodermal cells for the correct maintenance of the testis structure, that when it fails leads to the testis dysgenesis which, among other defects, releases the germ cells. We show that in the absence of Cv-c function in the testis, the mesodermal pigment cells do not form a continuous layer around the testis and the ECM surrounding the testis breaks. We also show that the interstitial gonadal cells fail to ensheath the germ cells and as a result of all these the germ cells become dispersed. These perturbations can be partially corrected by expression in the testis mesoderm of human DLC3 or Drosophila Cv-c that in both cases require a functional StART domain. Thus, our results suggest that Cv-c/DLC3 have a fundamental function on the mesodermal testis cells that has been conserved. These results indicate that, as in Drosophila, the primary cause for the gonadal dysgenesis in DLC3 human patients is due to the abnormal maintenance of the testis mesoderm cells, which include both Sertoli and Leydig cells. Thus, our proposal is that DLC3/STARD8 plays a role in testis maintenance through its function in mesodermal cells which will probably affect both Sertoli and Leydig cell function.

      To clarify the issue raised by the referee we have modified both the introduction and the discussion to highlight that although humans and Drosophila diverged millions of years ago there are similarities regarding gonad stabilisation.

      We have modified the introduction to clarify this issue:

      “Gonadogenesis can be subdivided into three stages: specification of precursor germ cells, directional migration towards the somatic gonadal precursors and gonad compaction. In mammals, somatic cells, i.e. Sertoli cells in male and Granulosa cells in females, play a central role in sex determination with the germ cells differentiating into sperms or oocytes depending on their somatic mesoderm environment. In humans, Primordial Germ Cells (PGCs) are formed near the allantois during gastrulation around the 4th gestational week (GW) and migrate to the genital ridge where they form the anlage necessary for gonadal development (GW5-6). Somatic mesodermal cells are required for both PGCs migration and the formation of a proper gonad. Once PGCs reach their destination, the somatic gonadal cells join them (around GW 7-8 in males, GW10 in females) and provide a suitable environment for survival and self-renewal until gamete differentiation {Jemc, 2011 #413}. Thus, mutations in genes regulating somatic Sertoli and Granulosa support cell function in humans are often associated with complete or partial gonadal dysgenesis in both sexes and sex reversal in males {Zarkower, 2021 #430; Knower, 2011 #418; Brunello, 2021 #399}. Other mesodermal cells, the Leydig cells, also play an important role in the testis by being the primary source of testosterone and other androgens and maintaining secondary sexual characteristics.”

      Also we have added a paragraph in the discussion to emphasize this argument:

      “We show that in the absence of Cv-c function in the testis, the mesodermal pigment cells do not form a continuous layer around the testis and the ECM surrounding the testis breaks. We also show that the interstitial gonadal cells fail to ensheath the germ cells and as a result of all these the germ cells become dispersed from the testis. These perturbations can be partially corrected by expression in the testis mesoderm of human DLC3 or Drosophila Cv-c that in both cases require a functional StART domain. Thus, our results suggest that Cv-c/DLC3 have a fundamental function on the mesodermal testis cells that has been conserved. These results indicate that, as in Drosophila, the primary cause for the gonadal dysgenesis in DLC3 human patients is due to the abnormal maintenance of the testis mesoderm cells, which include both Sertoli and Leydig cells”.

      I would also suggest that the authors highlight another potentially more important spin-off from such studies, namely that understanding of the regulation of DLC3/STARD8 genes, and what might perturb their expression/action would appear to present a whole new area for exploration in relation to testicular dysgenesis/masculinization disorders.

      Response: We have modified the last part of the discussion to introduce referee 2’s suggestion:

      “Our work points to DLC3/Cv-c as a novel gene required specifically in testis formation. Adding DLC3 to the list of genes involved in 46,XY complete dysgenesis opens up a new avenue to analyse the molecular and cellular mechanisms behind these disorders that could help in diagnosis and the development of future treatments.”

      Reviewer #3 :

      Major comments:

      1. This study has shown the expression pattern of cv-c and the consequence of cv-c mutation on different aspects of gonad development. However, one major comment is there is no quantification of the expression levels as well as the scoring of the mutant phenotypes.
      2. In Figure 2, for instance, I recommend that the authors display the quantification of the fluorescence intensity of the cv-c expression under all circumstances (in situ hybridization as well as protein-trap based GFP expression) to better depict the differences among the male vs female gonad.

      Response: We do not think that quantifying the stainings will add much to the results. We believe that the changes we have made, increasing the contrast and magnification of the images, are sufficient to illustrate our statement that cv-c is expressed in testes but not in ovaries.

      1. In Figure 3, the authors show the different gonad developmental defects associated with the cv-c mutation. Specifically, the authors show that the gonad mesoderm cells are displaced with the pigment cells failing to ensheath the germ cells. In addition, the authors also suggest that there is an increased frequency of germ cell blebbing, an indication of migratory activity. However, there is no quantification of these findings. I think the authors should display a quantitative estimation of % of the mutant gonad depicting these phenotypes vs the normal gonad to have a perspective of how penetrant the phenotypes are.

      Response: As the referee suggested, we have quantified the blebbing phenotype. The results are presented in Figure 3, panel J.

      1. In Figure 4, the authors attempt to rescue the Cv-C mutation linked gonadal defects by overexpressing different Cv-C protein variants. The rescue experiments are not very clear. The graph shows the % of normal testes under different genotypic combinations. It is not very clear what the authors mean by normal (in what context)? Since the mutation results in different defects of gonad development, I recommend that the authors represent the rescue in terms of these defects. It would be interesting to see, for instance, what happens to the blebbing or germ cell ensheathment phenotype upon rescue. What percentage of testes show the rescue as compared to cv-c mutants?

      Response: The percentages are scored according to whether the testes have any germ cells outside the gonad. We have added a line to clarify this point in the figure legend: “…quantified as encapsulated gonads with all germ cells inside the testis as assessed by Fisher-test”.
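
      The scoring described in this response can be sketched as follows. This is a minimal illustration, assuming that "Fisher-test" refers to a two-sided Fisher's exact test on a 2x2 table of normal versus abnormal testes; the genotype counts are hypothetical and are not data from the manuscript.

```python
from math import comb

def percent_normal(normal, total):
    """Percentage of testes scored as normal (all germ cells inside the gonad)."""
    return 100.0 * normal / total

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: (normal testes, abnormal testes) per genotype
mutant = (2, 18)
rescue = (15, 5)

print(percent_normal(mutant[0], sum(mutant)))
print(percent_normal(rescue[0], sum(rescue)))
print(fisher_exact_two_sided(*mutant, *rescue))
```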

      Nevertheless, we are going to quantify the number of ECM breaks and show the results in the revised manuscript.

      1. Did the authors try cell-specific depletion of cv-c and examined the consequence on gonad development?

      Response: cv-c mutants are embryonic lethal because of Cv-c’s widespread requirement in various embryonic tissues during development. Induction of FRT clones in the embryonic testis mesoderm was unsuccessful because of the low number of divisions during embryogenesis. We also tried to knock down cv-c expression with 3 different RNAi lines. Unfortunately, overexpression of these RNAi lines with different testis Gal4 drivers did not decrease cv-c mRNA levels significantly in the mesoderm or in other tissues where cv-c is expressed. Despite the unsatisfactory outcome of these experiments, our finding that cv-c is expressed in the testis mesoderm cells, and the fact that we can rescue the testis phenotypes by expressing Cv-c with gonadal mesoderm-specific Gal4 lines, support a testis mesoderm requirement of cv-c for its gonadal function.

      1. Another major concern is the lack of mechanistic insight of cv-c. For example, how does loss of cv-c result in gonadal dysgenesis? The authors suggested that StART domains regulate via lipid binding. The authors could examine if StART domain function is dependent on lipid-mediated interactions.

      Response: We agree with the referee that the molecular characterisation of the StART-mediated, GAP-independent Cv-c function we have uncovered in this work is a very interesting finding that should be addressed by future work. However, such biochemical characterisation requires a complex approach to distinguish between the already known StART function regulating the GAP activity shown before (Sotillos Scientific Reports) and the new GAP-independent function we describe in the testes, which falls beyond the scope of this work.

      The central point of this manuscript is the demonstration that both DLC3 and Cv-c are involved in male gonad formation, an important conserved function that had been overlooked by previous publications. Thus, DLC3 should be considered a new gene to be analysed in the future when studying gonadal dysgenesis. A second important point raised by our work is the demonstration that DLC3/Cv-c can perform RhoGAP-independent functions, something that had never been described for these proteins.

      Notwithstanding this, in the revised version we have added a new supplementary figure (1) related to the StART domain-lipid interaction analysed in silico. The in silico model shows that the DLC3-StART domain Ω1-loop structure displays the highest frequency of interaction with the membrane. This loop is conserved in the StART domains of several other STARD proteins and seems to modulate access to the ligand binding cavity. Ω-loops play multiple roles in protein function, often related to ligand binding, stability and folding. In this context, mutations in the proximity of the Ω1-loop, like the ones carried by the patients, may have drastic effects on overall protein stability that could affect the interaction between gonadal precursor cells.

      1. Do the cv-c mutants survive to adulthood? If yes, then it would be interesting to know how the adult testis behaves in cv-c mutants. Does it result in sterility?

      Response: Unfortunately, all studied cv-c mutants are embryonic lethal.

      1. Ensheathment is required for proper germline development and defects in ensheathment can affect soma-germline communication and germline development. Germ cell ensheathment affects the proliferation of germ cells and display defective JAK/STAT signaling. It would be interesting to know if the germ cells in cv-c mutant gonad show the proliferation defect and impaired JAK/STAT signaling.

      Response: This is an interesting suggestion. JAK/STAT signalling has a male-specific function that could explain why cv-c gonadal defects are male-specific. We are going to study how cv-c affects STAT signalling in the male gonad. We are currently preparing stocks combining the 10XSTAT::GFP reporter with cv-c mutants and preparing samples for anti-STAT labelling. We will also analyse whether embryos lacking STAT activation activate cv-c expression in the testes.

      1. I was also wondering if the authors have examined the number of germ cells in the mutant gonads.

      Response: Yes, we have counted the number of germ cells in cv-c mutants and, if anything, there are more. We initially considered that excess GC proliferation could be the cause of gonad disruption. However, we have discarded this hypothesis, as phospho-histone 3 stainings did not show a significant increase in GC divisions. Moreover, when we blocked cell proliferation in cv-c mutant gonads using UAS-p21, the testis phenotype was not rescued. We are unsure what could be responsible for the slight increase in germ cells observed.

      1. In addition, I think the quality of the images should be improved.

      Response: We have changed the colours used in the confocal images and magnified the relevant regions in all panels. We thank both referees for this suggestion, as these changes have improved the figure contrast.

      Minor comments:

      1. cv-c mRNA in Figure 2 panels (Fig. 2D) should be in italics.

      Response: We have changed it.

      1. There is no scale bar in the Figure panels. In addition, there is no scale bar in the zoomed images in Figure 2. Scale bars should be used consistently in all the Figures, in particular on the first panels of the Figures.

      Response: We have added scale bars to all panels.

      1. In line 677, the manuscript says "arrowhead". There are no arrowheads, only arrows.

      Response: Corrected

      1. Please be consistent with the labels in Figure panels: Vasa is shown in capital while Eya is not.

      Response: Corrected

      1. Please be consistent with the labeling of the Figure panels: Figure 3A vs Figure 4a.

      Response: Corrected

      1. What does the asterisk signify in Figure 2? There is no mention of asterisk in the Figure 2 legend.

      Response: The meaning of the asterisk was explained in the figure legend.

      1. There is no grey channel (sagittal view) for the panels Figure 3I and J.

      Response: We have already included sagittal views in the figure.

      1. Please be thorough in labeling the genotypes in Figures. For instance, Figure 4c depicts the % of normal testis in cv-c delta StART. However, the correct genotype is twi>Cv-c StART. In addition, in the Figure 4c graph, cv-c mut should be cv-cGAPmut.
      2. Please be consistent with the depiction of the "START" domain of the protein throughout the manuscript. In figure 4c for instance, it is "START" in the graph while in the figure panel 4i, it is StART.
      3. In Figure 4b, it is written DLC3-GA. Did the authors mean DLC3-S993N?
      4. In line 723, it should be anti-beta catenin.

      Response: As suggested, we have unified figure labelling.

      1. The authors have shown two images to suggest that cv-c mutant gonad depict the germ cell blebbing (Figure 3I and J). I think it would be much better to put up a graph showing the number or percentage of cv-c mutant gonads displaying the germ cell blebbing than putting two images with the same information.

      Response: We have already done the quantification and added the data as a graph in figure (3J).

      1. The previous comment is also true for Figure 6H and I. In both the panels, the authors wish to show discontinuous ECM marked by Perlecan expression in cv-c mutant gonads. I think it would be better to display a score of the number of mutant gonads depicting the discontinuous ECM.

      Response: We are repeating stainings to quantify Perlecan disruption in cv-c mutants and we will display the results as a graph in figure 6.

    1. Author Response

      Reviewer #1 (Public Review):

      The layered costs and benefits of translational redundancy by Raval et al. aim to investigate the impact of gene copy number redundancy on E. coli fitness, using growth rate in different media as the primary fitness readout. Genes for most tRNAs and the three ribosomal RNAs are present in multiple copies on the E. coli chromosome. The authors ask how alterations in the gene copy number affect the growth rate of E. coli in growth media that support different rates of growth for the wild type.

      While it was shown before that mutants with reduced numbers of ribosomal RNA operons grow at reduced rates in rich medium (LB), this study extends these findings and reaches some important conclusions:

      1) In a poor medium (supporting slow growth rates), the mutants with fewer rRNA operons actually grow faster than the wild type, showing that redundancy comes at a cost.

      2) The same is true for mutants with reduced gene copy number of certain tRNAs and correlates with slower rates of protein synthesis in these mutants.

      3) That rRNA operon gene copy number is more decisive for growth rate than any tRNA gene copy number (>1).

      In addition, measurements of strains with deletions of genes encoding tRNA-modification enzymes that affect tRNA specificity are included. While interesting, no unifying conclusion could be reached on the impact of these mutations on growth rate.

      Thank you for this clear summary of our work.

      The well-known "growth law" relationships between growth rate and macromolecular composition (RNA/protein ratio, for example) specifically concern steady-state growth rates. It is concerning that all growth rates in this work were measured on cultures that were only back-diluted 1:100 from overnight LB precultures. That only allows 6-7 doubling times before the preculture OD is reached again. The exponential part of growth would end before that, allowing perhaps only 3-4 generations of growth in the new medium before the growth rate was measured. Thus, the cultures were not in balanced growth ("steady state") when the measurements were made, rather they were presumably in various states of adapting to altered nutrient availability.

      A detailed connection with exact growth rate laws indeed requires growth rate measurement in steady-state. Hence, we refrained from making such a connection in this manuscript, though it would be an interesting future avenue to explore. Our main goal here was to ask how E. coli growth rate is affected by external nutrient availability and internal translation components. For this, the key comparisons involve the WT vs. gene deletion mutations, and rich vs. poor growth media. For any given comparison, strains were tested under identical conditions and experimental protocols, and hence we can address our main questions without the need to obtain steady-state growth. As an aside, we note that the nutrient fluctuations inherent in such experiments may also be more relevant than steady-state growth for natural bacterial populations.

      As noted by the reviewer, we measured fitness only in a relatively narrow growth regime of several doublings; but we do capture exponential growth by focusing on the early data points (representing the exponential phase) for our growth rate calculations. We have now explicitly mentioned this in the methods section “Measuring growth parameters”.
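
      As a rough illustration of the two quantities discussed in this exchange, the sketch below computes how many doublings a 1:100 back-dilution allows before the preculture density is reached again, and estimates an exponential growth rate as the least-squares slope of ln(OD) against time over the early data points. The OD readings, the time grid and the use of five early points are hypothetical and are not taken from the manuscript's methods.

```python
import math

def doublings_for_dilution(fold):
    """Doublings needed to regain the starting density after a fold-dilution."""
    return math.log2(fold)

def growth_rate(times_h, ods):
    """Least-squares slope of ln(OD) versus time, in units of 1/h."""
    logs = [math.log(od) for od in ods]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    den = sum((t - mean_t) ** 2 for t in times_h)
    return num / den

# Hypothetical early readings in which OD doubles every 0.5 h
times = [0.0, 0.5, 1.0, 1.5, 2.0]
ods = [0.01 * 2 ** (t / 0.5) for t in times]

rate = growth_rate(times, ods)
print(round(doublings_for_dilution(100), 2))  # 6.64 doublings for a 1:100 dilution
print(round(math.log(2) / rate, 2))           # doubling time in hours: 0.5
```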

      A second concern is the use of the term "tRNA expression levels" in the text in Figure 4. I believe the YAMAT-seq method reports on the fractional contribution of a given tRNA to the total tRNA pool. Thus, since the total tRNA pool is larger in fast-growing cells than in slow-growing cells, a given tRNA may be present at a higher absolute concentration in the fast than in the slow-growing cells but will be reported as "higher in poor" in figure 4, if the given tRNA constitutes a smaller fraction of the total tRNA pool in rich than in poor medium. For this reason, the conclusions regarding the effect of growth medium quality on tRNA levels are not justified.

      Thank you for this important point. We agree that our phrasing was incorrect, and we have modified the relevant text and figures accordingly. The fractional contribution of a given tRNA isotype to the total tRNA pool is still useful to compare, and is justified as now rephrased.
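
      The reviewer's distinction between fractional and absolute abundance can be made concrete with a small worked example. The numbers below are entirely hypothetical (arbitrary units) and only illustrate how a tRNA can make up a larger fraction of the pool in poor medium while being less abundant in absolute terms.

```python
def pool_fraction(count, pool_total):
    """Fractional contribution of one tRNA isotype to the total tRNA pool."""
    return count / pool_total

# Hypothetical abundances: the total pool is larger in rich medium
rich = {"total": 1000.0, "focal": 100.0}
poor = {"total": 400.0, "focal": 60.0}

frac_rich = pool_fraction(rich["focal"], rich["total"])  # 10% of the pool
frac_poor = pool_fraction(poor["focal"], poor["total"])  # 15% of the pool

higher_fraction_in_poor = frac_poor > frac_rich
higher_amount_in_rich = rich["focal"] > poor["focal"]

# Both statements hold at once: "higher in poor" by fraction, higher in rich by amount
print(frac_rich, frac_poor, higher_fraction_in_poor, higher_amount_in_rich)
```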

      Reviewer #2 (Public Review):

      Raval et al. by creating a series of deletion mutants of tRNAs, rRNAs, and tRNA modifying enzymes, have shown the importance of gene copy number redundancy in rich media. Moreover, they successfully showed that having too many tRNAs in poor media can be harmful (for a subset of the examined tRNAs). Below, please find my comments regarding some of the methodologies, conclusions, and controls needed to stratify this manuscript's findings.

      Figure 2 presents Rrel as a relative measurement (GRmut/GRwt). Therefore, I'm confused as to how Rrel can be negative, as shown in supplemental file 3 (statistics).

      We apologize for the confusion. Supplemental file 3 shows details of the statistical analysis (not raw data), and we included the effect size here (mean difference between the WT and the mutant relative growth rate) along with statistical significance. Thus, if the rel R of a given mutant is 1.1, the mean difference would be (1–1.1) = –0.1, meaning that it is performing 10% better than the WT.
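
      The relationship between the two quantities can be sketched as below, using the numbers from the example above. The function names and the growth rates fed into them are illustrative assumptions, not the authors' analysis code.

```python
def relative_growth_rate(gr_mutant, gr_wt):
    """Rrel = GRmut / GRwt; a ratio of positive rates, so never negative."""
    return gr_mutant / gr_wt

def mean_difference(rel_wt, rel_mutant):
    """Effect size: WT relative growth rate minus mutant relative growth rate."""
    return rel_wt - rel_mutant

# Hypothetical growth rates (1/h) chosen so that the mutant has Rrel = 1.1
rel_mutant = relative_growth_rate(0.66, 0.60)
effect = mean_difference(1.0, rel_mutant)

print(round(rel_mutant, 2))  # 1.1
print(round(effect, 2))      # -0.1, i.e. the mutant grows 10% faster than the WT
```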

      The “raw” relative growth rates are provided in source data files (labeled figure-wise), and there are no negative values there, as expected.

      We have now explicitly (and separately) referenced the source and statistics data files in the data analysis section in the methods, and in each figure legend. We hope this avoids confusion and makes it easier for readers to find the correct file.

      Does Figure 3 show the mean of 4 biological replicates or technical replicates? It should be stated clearly in the legend of figure 3.

      All replicates are biological replicates unless stated otherwise. This is now stated in the methods (lines 185-187), and in the figure legends.

      Do all strains (datapoint on figure 3 left panel) significantly perform better than the WT in nutrient downshift? Looking at supplemental file 3 I see this is not the case. Please mark the statistically significant points. I suggest giving each set a different symbol/shape and coloring the significant ones in red.

      We had considered indicating statistical significance in the plot, but decided not to do so because it was difficult to show the many potentially useful layers of information without cluttering the plot. One other practical difficulty was that each point in the figure represents two values: one from the upshift (Y axis) and one from the downshift (X axis). For some mutants the fitness difference was significant in only one direction, so it was not straightforward to indicate significance. Further, our main goal here was to show where strains from different deletion Sets (Figure 1) fall in this plot (i.e. which quadrant they occupy), and so we wanted to ensure that points were easily distinguished by Set. In the text we do not include statistically non-significant points in the summary of observed patterns, and refer readers to information on statistical significance provided in the supplemental file.

      Another issue is that in the statistics of figure 2 (in supplemental file 3), positive values reflect cases where the mutant performs poorly compared to the WT, while in figure 3 the negative values indicate this. Such discrepancy is not very clear. And again, how can Rrel be negative?

      As noted in response to an earlier comment, Rrel values (given in source data files) are not negative, but effect sizes (given in supplemental file with statistics) may be negative or positive since they show differences in the relative growth rate of WT and mutant. We agree that the discrepancy between the calculation of mean difference for Figs 2 and 3 was confusing. We have now fixed this: in both cases, negative mean difference values now indicate that the mutant performs better.

      Both axes say glycerol. What about galactose?

      The typo has been corrected.

      Lines 414-419: The authors state that "all but one had a growth rate that was comparable to WT (16 strains) or higher than WT (10 strains) after transitioning from rich to poor media (i.e. during a nutrient downshift, note data distribution along the x-axis in Fig 3; Supplementary file 3). In contrast, after a nutrient upshift, 11 strains showed significantly slower growth in one or both pairs of media, and only 2 showed significantly faster growth than WT (note data distribution along the y-axis in Fig 3; Supplementary file 3)".

      Looking at the Rrel values when transitioning from TB to Glycerol and vice versa suggests no direction in the effect of reducing redundancy. During downshift, four strains perform better, and three strains perform worse than the WT. During upshift, four strains perform better, and six strains perform worse. Only the downshift and upshift between TB and Gal give a strong signal.

      The authors should state this clearly in the text, because the effect is specific to that transition/condition and is not general, as the text currently implies (e.g., a transition from every rich to every poor medium and vice versa). I am convinced that the authors see an actual effect when downshifting or upshifting between TB and galactose. In that case, the conclusion is that redundancy is good or bad depending on the conditions used, and not as a general theme.

      Also, this is true just for some tRNAs, so I don't think the conclusion is general regarding the question of redundancy.

      The fitness impacts of altered redundancy are best explained by a combination of multiple factors (in addition to nutrient availability): the number of tRNA genes deleted, number of tRNA gene copies remaining as a backup, availability of wobble or ME as backup, and codon usage. Thus, any of these variables alone would provide only partial explanation for the observed fitness effects of all strains.

      In many tRNA deletion strains – especially single gene deletions – redundancy was not significantly lowered by the deletion, as we explain in the results section. These strains were therefore not expected to show major fitness impacts or follow strong nutrient dependent trends, and this is what we observe.

      The same is true for nutrient upshift-downshift experiments, where a vast majority of strains were not expected to show a specific pattern because they do not show significant fitness impacts in general, nor do they show a strong correlation in relative fitness impacts vs. growth rate (Figure 1d). In addition, in these experiments the difference between the two media also matters. For example, comparing the maximum WT growth rate, M9 Gal is poorer than M9 Glycerol. Therefore, shifts between TB-Gal are nutritionally more drastic than TB-Gly shifts, and one would expect a larger fitness impact in the former (for strains with significantly altered redundancy). Hence, despite differences across media pairs, our broader conclusions about the impact of redundancy are generalizable as long as redundancy and nutrients are both substantially altered, e.g. due to deletion of 3 tRNA genes, deletion of tRNA+ME, or deletion of multiple rRNA operons.

      Figures are indicated differently along the text. Sometimes they are written "figure X", sometimes FigX. Referring to the supplemental figures are also not consistent.

      We have now corrected this.

      Line 443-444: "In fact, 10 tRNAs were significantly upregulated in the poor medium relative to the rich medium".

      This result contradicts the authors' hypothesis. If redundancy is bad in poor media because the cells have more tRNAs than they need, tRNA levels should be downregulated, not upregulated. How do the authors explain this?

      This statement referred to the WT strain, and was meant to highlight that (as noted by the reviewer) some tRNAs appear to be upregulated in poor medium, which is counterintuitive. However, as noted by reviewer 1 (see their comment on the interpretation of YAMAT-seq data), we can only infer the relative contribution of each tRNA isotype to the total tRNA pool (rather than absolute up- or down- regulation). Thus, we have removed this specific sentence, and instead we focus on the mismatch between the media-dependent changes in the composition of the tRNA pool and the fitness effects of different tRNA isotypes (lines 475-482).

      Line 445-447: "In contrast (and as expected), all tested tRNA deletion strains had lower expression of focal tRNA isotypes in the rich medium (Fig 4B, left panel), showing that the backup gene copies are not upregulated sufficiently to compensate for the loss of deleted tRNAs". It is great that the authors validated the expression in their strains. However, for accuracy, please indicate that it was done in four strains to avoid the impression that they did it in all the strains.

      We have now reworded this sentence to remind readers that we measured 4 tRNA deletion strains in this experiment.

      Finally, across the manuscript, the authors reveal that deleting some tRNAs or modifying enzymes can be deleterious in rich media or advantageous in poor media. However, I think this result and the conclusions derived from it could be more convincing if the authors would show in a subset of their strains that expressing the deleted tRNAs or modifying enzymes from a plasmid can rescue the phenotype.

      Thank you for this suggestion. For a small subset of strains, we now include data showing that complementation from a plasmid indeed rescues the deletion phenotype (Fig 2 – Fig supplement 7).

      Reviewer #3 (Public Review):

      In this manuscript, Raval et al. investigated the cost and benefit of maintaining seemingly redundant components of the translation machinery in the E. coli genome. They used systematic deletion of different components of the translation machinery including tRNA genes, tRNA modification enzymes, and ribosomal RNA genes to create a collection of mutant strains with reduced redundancy. Then they measured the effect of the reduced redundancy on cellular fitness by measuring the growth rate of each mutant strain in different growth conditions.

      This manuscript beautifully shows how maintaining multiple copies of translation machinery genes such as tRNA or ribosomal RNA is beneficial in a nutrient-rich environment, while it is costly in nutrient-poor environments. Similarly, they show how maintaining parallel pathways such as non-target tRNA which directly decodes a codon versus target tRNA plus tRNA modifying enzymes which enable wobble interactions between a tRNA and a codon have a similar effect in terms of cost and benefit.

      Further, the authors show the mechanisms that contribute to the increased or reduced fitness following a reduction in gene copy number by measuring tRNA abundance and translation capacity. This enables them to show how on one hand reduced copy numbers of tRNA genes result in lower tRNA abundance in rich growth media, however in nutrient-limiting media higher copy number leads to increased expression cost which does not lead to an increased translation rate.

      Overall, this work beautifully demonstrates the cost and benefits of the seemingly redundant translation machinery components in E. coli.

      Thank you for the clear summary and encouraging comments.

      However, in my opinion, this work’s conclusion should be that the seeming redundancy of the translation machinery is not redundant after all. As mentioned by the authors, it is known that tRNA gene copy number is associated with tRNA abundance (Dong et al. 1996, doi: 10.1006/jmbi.1996.0428), this effect is also nicely demonstrated by the authors in the section titled “Gene regulation cannot compensate for loss of tRNA gene copies”. Moreover, this work demonstrates how the loss of the seeming redundancy is deleterious in a nutrient-rich environment. Therefore, I believe the experiments presented in this work together with previous works should lead to the conclusion that the multiple gene copies and parallel tRNA decoding pathways are not redundant but rather essential for fast growth in rich environments.

      The point is well taken. However, as described in the introduction, here we focus on functional redundancy at the cellular level, where there are multiple ways of achieving the same translation rate. Hence we say that translation components are redundant at this level of analysis. One of the key conclusions from our work is that such redundancy is context-dependent, i.e. it is essential when rapid growth is possible, but is costly and dispensable otherwise. Therefore, we show that the definition of redundancy itself changes with environmental conditions.

      The following analogy may help convey this. There may be many ways to reach a flight at an airport: multiple entrances, multiple check-in and security check counters, multiple boarding gates, etc. At a deserted airport these may seem redundant and even costly to maintain. On the other hand, they are useful when traffic is high. Hence, even though the multiple routes are redundant from a purely architectural perspective, from a utilitarian perspective their value depends on the flux of passengers.

    1. Authors’ response (5 November 2022)

      GENERAL ASSESSMENT

      Piezo1 and Piezo2 are stretch-gated ion channels that are critically important in a wide range of physiological processes, including vascular development, touch sensation and wound repair. These remarkably large molecules span the plasma membrane almost 40 times. Cryo-EM and reconstitution experiments have shown that Piezos adopt a cup-like structure and, by doing so, curve the local membrane in which they are embedded. Importantly, membrane tension is a key mediator of Piezo function and gating, an idea well supported by several independent studies. Cells have varied three-dimensional shapes and are dynamic assemblies surrounded by plasma membranes with complex topologies and biochemical landscapes. How these microenvironments influence mechanosensation and Piezo function is unknown.

      The current preprint by Zheng Shi and colleagues asks how the shape of the membrane influences Piezo location. The authors use a creative approach involving methods to distort the plasma membrane by generating “blebs” and artificial “filopodia”. Overall, the work convincingly shows that the curvature of the lipid environment influences Piezo localization. Specifically, they show that Piezo1 molecules are excluded from filopodia and other highly curved membranes. These experiments are well controlled and the results fully consistent with previous structural and biochemical work. Furthermore, the work explores the hypothesis that a chemical modulator of Piezo1 channels called Yoda1 functions by “flattening” the channels, a movement previously proposed to be linked to mechanical gating. Consistent with this model, the authors show that Yoda1 application is sufficient to allow Piezo1 channels to enter filopodia. While the flattening model is a provocative hypothesis, hard evidence awaits structural verification.

      Overall, the preprint by Shi and colleagues will be of interest to scientists studying how mechanical forces are detected at the molecular level. The work introduces important concepts regarding how the shape of cellular membranes affects the movement and function of proteins within it. The technical advance for changing the shape of a plasma membrane is of note. 

      We thank the reviewers for the accurate summary and positive assessments of our manuscript. We address each of the concerns below.

      RECOMMENDATIONS

      Revisions essential for endorsement:

      As is evident from the comments below, our endorsement of the study is not dependent on additional experiments. However, we feel more experimental clarification is needed, that providing clearer images would be helpful, and, most importantly, we would like alternative conclusions and caveats to be mentioned.

      1. Can the authors comment on the link between the conclusions that (1) the presence of filopodia prohibits Piezo1 localization (Fig 1) and (2) Piezo1 expression prohibits the formation of filopodia (Fig 3)? As it stands, it is hard to understand whether there is a cause-and-effect relationship here or whether these are separate, unrelated observations. We recommend revising the discussion to clarify.

      We now clarify the link between Piezo1’s curvature sensing (depletion from filopodia) and its inhibition effect on filopodia formation before presenting the current Fig. 5: “Curvature sensing proteins often have a modulating effect on membrane geometry. For example, N-BAR proteins, which strongly enrich to positive membrane curvature, can mechanically promote endocytosis by making it easier to form membrane invaginations (Shi and Baumgart, 2015; Sorre et al., 2012). Thus, we hypothesize that Piezo1, which strongly depletes from negative membrane curvature (Fig. 1, Fig. 2), can have an inhibitory effect on the formation of membrane protrusions such as filopodia.”

      2. When comparing the images of Fig. 2A, B to those of Fig. 2C, D, it appears that bleb formation induces a drastic enrichment of Piezo1 in the bleb membrane. Is this due to low membrane tension in the bleb? If this is the case, it indicates that the level of membrane tension has a prominent role in determining the localization of Piezo1.

      We apologize for this confusion due to our poor wording and figure presentation in the manuscript. By “Piezo1 clearly locates to bleb membranes” we didn’t mean to indicate that Piezo1 is enriched on bleb membranes as compared to the cell body. Rather, we meant to emphasize Piezo1’s localization to the *membrane* of the blebs rather than in the cytosolic space.

      The cells in Fig. 2C and 2D are different from those in Fig. 2A and 2B and were presented with different image contrasts. We now include the images of the full cell for Fig. 2C and 2D as the current Figure S8. To focus on the equator of the bleb, the cell body was out of focus. However, there is no indication that Piezo1 density is significantly different between the bleb membrane and the intact parts of the plasma membrane.

      We changed the main text to: “Similar to previous reports (Cox et al., 2016), bleb membranes clearly contain Piezo1 signal, but not significantly enriched relative to the cell body (Fig. 2C, 2D; Fig. S8).”

      In line with this, it appears more Piezo1 proteins are localized in less tensed tethers. Thus, might your observations be equally consistent with tension rather than curvature as a key regulator of Piezo1 localization? We recommend adding this to your discussion.

      We now explain the deconvolution between tension and curvature effects in detail. We also performed additional experiments to quantify the membrane tension in cells and blebs (current Fig. S9).

      In the Results section, we add: “Tethers are typically imaged > 1 min after pulling, whereas membrane tension equilibrates within 1 s across cellular scale free membranes (e.g., bleb, tether) (Shi et al., 2018). Therefore, the sorting of Piezo1 within individual tension-equilibrated tether-bleb systems (Fig. 2C – 2G) suggests that membrane curvature can directly modulate Piezo1 distribution beyond potential confounding tension effects.”

      In the Discussion section, we add: “In addition to membrane curvature, tension in the membrane may affect the subcellular distribution of Piezo1 (Dumitru et al., 2021). Particularly, membrane tension can activate the channel and potentially change Piezo1’s nano-geometries. This tension effect is unlikely to play a significant role in our interpretation of the curvature sorting of Piezo1 (Fig. 2): (1) HeLa cell membrane tension as probed by short tethers (Fig. S9F; 45 ± 29 pN/µm on blebs and 270 ± 29 pN/µm on cells, with the highest recorded tension at 426 pN/µm) is significantly lower than the activation tension for Piezo1 (> 1000 pN/µm (Cox et al., 2016; Lewis and Grandl, 2015; Shi et al., 2018; Syeda et al., 2016)). (2) With more activated (and potentially flattened) channels under high membrane tension, one would expect a higher density of Piezo1 on tethers pulled from tenser blebs. This is the opposite of our observations in Fig. 2C - 2G, where Piezo1 density on tethers was found to decrease with the absolute curvature, and thus tension (eq. S6), of membrane tethers.”

      3. Given the intrinsically curved structure of Piezo1, it is difficult to understand the model’s prediction that curved Piezo1 is not enriched in 25-75 nm invaginations. Where will Piezo1 normally reside in the plasma membrane? It would be helpful if this could be discussed.

      The spontaneous curvature from our model, $C_0$ ($C_0^{-1}$ = 83 ± 17 nm; the value is updated after refitting to more data points collected for Fig. 2G), represents a balance between the intrinsic curvature of Piezo1 trimers (0.04 ~ 0.2 nm$^{-1}$ as suggested by CryoEM studies (Haselwandter et al., 2022; Lin et al., 2019; Yang et al., 2022)) and that of the associated membrane (0 nm$^{-1}$, assuming lipid bilayers alone do not have an intrinsic curvature). We now refer to $C_0$ as the “spontaneous curvature of the Piezo1-membrane complex” throughout the manuscript, rather than the “spontaneous curvature of Piezo1”.

      Our model, when extrapolated to membrane invaginations, predicts a weak enrichment of Piezo1 on ~100 nm invaginations (peak at 83 nm), but a depletion of Piezo1 on more highly curved invaginations. This is simply because it would be energetically costly to fit a protein-membrane complex to a curvature that is different from what the complex prefers (in the case of 25-75 nm membrane invaginations, the membrane curvature would be too high for the Piezo1-membrane complex).

      However, it is worth pointing out that Piezo1-membrane complex may not present the same spontaneous curvature on positively and negatively curved membranes. More importantly, we do not yet have direct evidence to show that this depletion indeed happens in the exact range of invagination curvature we predicted. We now acknowledge this limitation in the Discussion section: “However, it is worth noting that we assumed a zero spontaneous curvature for membranes associated with Piezo1 and that the spontaneous curvature of Piezo1-membrane complex is independent of the shape of surrounding membranes. These assumptions may no longer hold when studying Piezo1 in highly curved invaginations or liposomes (Lin et al., 2019).”

      We also took this opportunity to verify the key prediction from the extrapolated model - that Piezo1 would enrich towards ~ 100 nm radius cell membrane invaginations. To achieve this, we utilized a recent development in nanotechnology, pioneered by Wenting Zhao and Bianxiao Cui’s labs (Lou et al., 2019; Zhao et al., 2017). An illustration of the experimental design and detailed findings are summarized in the current Fig. 3 and briefly discussed below.

      In collaboration with Wenting Zhao’s lab, we cultured cells on precisely engineered nanobars with curved ends and flat central regions. For a labelled membrane protein of interest, the end-to-center fluorescence ratio reports the protein’s curvature sorting ability. We find that Piezo1 enriches to the curved ends of nanobars, whereas membrane marker signals are homogeneous across the entire nanobar (Fig. 3). The finding achieved strong statistical significance via hundreds of repeats on nanobars of the exact same geometry, a major technical strength of our chosen system. Furthermore, the enrichment of Piezo1 was observed on nanobars with 3 different curvatures (corresponding to diffraction-limited radii between 100 and 200 nm) and qualitatively agrees with our model (current Fig. S10). While further investigations on a wider range of membrane curvature are required to fully map out the sorting of Piezo1 on membrane invaginations, our data in the current Fig. 3 clearly verify the prediction that membrane curvature can lead to enrichment of Piezo1 on cellular invaginations.
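      As an illustration of the end-to-center readout described above, the following is a minimal sketch only, not our analysis pipeline. The function names and the intensity values are hypothetical, and dividing the Piezo1 ratio by the membrane-marker ratio is an assumed normalization that is not taken from the manuscript.

```python
from statistics import mean


def end_to_center_ratio(end_intensities, center_intensities):
    """Mean fluorescence at the curved nanobar ends divided by the mean at the flat center."""
    center = mean(center_intensities)
    if center <= 0:
        raise ValueError("center intensity must be positive")
    return mean(end_intensities) / center


def marker_normalized_ratio(protein_ends, protein_center, marker_ends, marker_center):
    """Protein end-to-center ratio divided by that of a membrane marker.

    Assumed normalization: a value above 1 would indicate enrichment at the
    curved ends beyond what the amount of membrane alone explains.
    """
    return (end_to_center_ratio(protein_ends, protein_center)
            / end_to_center_ratio(marker_ends, marker_center))


# Hypothetical intensities (arbitrary units) for a single nanobar.
piezo1_ends, piezo1_center = [3.0, 3.0], [2.0, 2.0]
marker_ends, marker_center = [1.0, 1.0], [1.0, 1.0]

print(end_to_center_ratio(piezo1_ends, piezo1_center))
print(marker_normalized_ratio(piezo1_ends, piezo1_center, marker_ends, marker_center))
```

      In practice the ratio would be averaged over many nanobars of identical geometry before any statistical comparison.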

      We now refer to this new finding in the Abstract, along with the previously observed depletion of Piezo1 on filopodia. We present a detailed description of the experiment and associated findings in the Results and the Method sections.

      4. It is currently unknown whether and how long Yoda1 might keep Piezo1 in a flattened state. Given that Yoda1 is highly hydrophobic, it might affect membrane properties instead of the curvature of Piezo1. These caveats should be discussed.

      We thank the reviewers for pointing out the potential effect of Yoda1. We did additional experiments to confirm that on Piezo1-KO cells, Yoda1 molecules alone do not significantly alter the formation of filopodia, in contrast to observations in WT cells. This data suggests Yoda1 (at the concentration we use) is unlikely to significantly alter the mechanical properties of the plasma membrane. The data is now presented as Fig. 5E in the updated manuscript. We added: “In Piezo1 knockout (Piezo1-KO) cells, adding Yoda1 to the culture medium does not significantly change the number of filopodia (Fig. 5E), suggesting the agonist does not directly regulate filopodia formation without acting on Piezo1.”

      5. The authors state that “Yoda1 leads to a Ca2+ independent increase of Piezo1 on tethers”. It has not been determined yet that Yoda1 leads to Piezo1 flattening (or even opening). In Electrophysiology experiments, unless there is pressure applied, Yoda1 does not lead to substantial currents. Therefore, the cartoon of Yoda1 flattening Piezo1(3H) is misleading. We recommend revising this. So far, the best experimental evidence on flattening is via purified channels reconstituted in various sizes of liposomes. However, it is plausible that the flattened shape is closed or open inactivated. Because most of the claims of this paper depend on the curved vs flattened shape of Piezo1, the authors should address these caveats carefully.

      We thank the reviewers for pointing out the limitations in our current understanding of Yoda1. We agree that our data do not directly show the flattening of Piezo1 by Yoda1; rather, they are consistent with the flattening hypothesis. We lowered the tone of our conclusion to Fig. 4 to: “Our study suggests this conformational change of Piezo1 may also happen in live cells (Fig. 4H).” We also added arrows in Fig. 4H to suggest that membrane tension helps the proposed flattening of Piezo1 by Yoda1.

      We think our experiment may also provide new insights on the action of Yoda1. First, we note that only a small fraction of filopodia responded to Yoda1, and pre-stressing of the cell membrane was required to amplify the Yoda1 effect (current Fig. 4E). This observation is consistent with the reviewers’ notion that membrane tension is likely required to flatten Piezo1, even in the presence of Yoda1. Second, highly curved liposomes or detergents can confine the shape of Piezo1 trimers. Therefore, the inability to observe Yoda1-induced flattening of Piezo1 in small liposomes is not necessarily in contradiction with our observation in the mostly flat cell membranes.

      We add to the Discussion section: “Yoda1-induced flattening of Piezo1 has not been directly observed via CryoEM. Our results (Fig. 4) point to two challenges in determining this potential structural change: (1) Yoda1-induced changes in Piezo1 sorting are greatly amplified after pre-stretching the membrane (Fig. 4E), pointing to the possibility that a significant tension in the membrane is required for the flattening of Yoda1-bound Piezo1. (2) Piezo1 is often incorporated in small (< 20 nm radius) liposomes for CryoEM studies. The shape of liposomes can confine the nano-geometry of Piezo1 (Lin et al., 2019; Yang et al., 2022), rendering it significantly more challenging to respond to potential Yoda1 effects. This potential effect of membrane curvature on the activation of Piezo1 would be an interesting direction for future studies.”

      6. Page 9: "Our study shows this conformational change of Piezo1 in live cells (Fig. 3H)." We recommend that this claim be removed as it seems too strong for the provided data.

      We changed the sentence to: “Our study suggests this conformational change of Piezo1 may also happen in live cells (Fig. 4H).”

      Additional suggestions for the authors to consider:

      1. Based on the calculated spontaneous curvature of Piezo1-membrane C0 of 87 nm, is it possible to derive the curvature of Piezo1 protein itself and the associated membrane footprint? This would be a nice addition.

      It is possible to do such an estimation; however, many (unverified) assumptions must be made, in addition to the ones already in our model. First, we need to assume a size for the Piezo1 trimers and for the Piezo1-membrane complex. If we assume that Piezo1 trimers are ~170 nm$^2$ in the plane of lipid bilayers (based on estimates from PDB) and that the complex takes on the shape of a 10-20 nm radius half-sphere, then Piezo1 effectively occupies an area fraction of 6.7%~27% in the Piezo1-membrane complex. Next, we assume that the membrane and the Piezo1 trimer have the same bending rigidity. Finally, we assume that the membrane itself does not have an intrinsic curvature.

      With those assumptions, the intrinsic curvature of Piezo1 trimers ($C_p$) would relate to the spontaneous curvature of the membrane-Piezo1 complex ($C_0$) following: $C_p^{-1} = C_0^{-1} \times (6.7\% \sim 27\%)$. Knowing $C_0^{-1}$ = 83 ± 17 nm, we get $C_p^{-1}$ = 5.6 nm ~ 22.4 nm.
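      The arithmetic behind this estimate can be reproduced with the short sketch below. It is only a worked example under the assumptions stated above (a ~170 nm² trimer, a half-sphere complex of 10-20 nm radius, and the central value of 83 nm for the complex); the function names are hypothetical and the ± 17 nm uncertainty is not propagated.

```python
import math


def piezo_area_fraction(trimer_area_nm2, complex_radius_nm):
    """Fraction of a half-sphere (surface area 2*pi*r^2) occupied by the trimer."""
    half_sphere_area = 2.0 * math.pi * complex_radius_nm ** 2
    return trimer_area_nm2 / half_sphere_area


def intrinsic_radius_nm(complex_radius_of_curvature_nm, area_fraction):
    """Inverse intrinsic curvature of the trimer: inverse complex curvature times area fraction."""
    return complex_radius_of_curvature_nm * area_fraction


TRIMER_AREA_NM2 = 170.0   # assumed in-plane trimer area
COMPLEX_INV_C0_NM = 83.0  # central fitted value for the complex

for radius_nm in (10.0, 20.0):
    fraction = piezo_area_fraction(TRIMER_AREA_NM2, radius_nm)
    radius_of_curvature = intrinsic_radius_nm(COMPLEX_INV_C0_NM, fraction)
    print(f"r = {radius_nm:.0f} nm: area fraction {fraction:.3f}, "
          f"trimer radius of curvature {radius_of_curvature:.1f} nm")
```

      The two half-sphere radii bracket the quoted area fractions (about 27% at 10 nm and about 6.7% at 20 nm), which in turn give the quoted range of roughly 5.6 to 22.4 nm.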

      2. It is hard to see the filopodia and their localization in the figures. It would be better for readers and more convincing if clearer/higher resolution example images could be provided.

      We now provide high resolution figures.

      3. Can the authors better explain how the calculations done in panel 1C and S3D are done and their importance?

      Each fluorescence trace along the drawn yellow line was normalized to the mean intensity on the corresponding flat cell body, so that the average fluorescence of the cell body has a y-axis value of 1. We think the intensity traces are important because image contrast can be adjusted; therefore, Fig. 1A alone would not convincingly show that there is no Piezo1 on filopodia.
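      The normalization can be sketched as follows. This is an illustrative example rather than our analysis code: the function name, the choice of which positions count as cell body, and the intensity values are all hypothetical.

```python
def normalize_trace(trace, cell_body_indices):
    """Divide a line-profile trace by the mean intensity over the flat cell body.

    After normalization the cell-body segment averages to 1, so values well
    below 1 along a filopodium indicate depletion regardless of image contrast.
    """
    body_values = [trace[i] for i in cell_body_indices]
    if not body_values:
        raise ValueError("no cell-body positions given")
    body_mean = sum(body_values) / len(body_values)
    if body_mean == 0:
        raise ValueError("cell-body mean intensity is zero")
    return [value / body_mean for value in trace]


# Hypothetical raw intensities along a line: first three points on the
# cell body, last three along a filopodium.
raw_trace = [200.0, 220.0, 180.0, 40.0, 20.0, 20.0]
normalized = normalize_trace(raw_trace, cell_body_indices=[0, 1, 2])
print(normalized)
```

      Because every trace is scaled to its own cell body, traces from cells with different expression levels can be compared on the same axis.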

      4. In Figure 2E, are these data from hPiezo1 or mPiezo1? In other cases, hPiezo1 is specified, so this may be a typo?

      Corrected.

      5. Figure 3 F&G: We assume these cells are the same in all panels, just visualized with either mCherry or eGFP in each condition. Accordingly, we would have expected more swelling in hypotonic conditions, and wonder if further evaluation may resolve this apparent discrepancy? If not, please provide more clarification.

      This is a good point. Indeed, we do observe a significant swelling of the cell right after the hypotonic shock.

      However, this effect is expected to be transient (the volume of the cell would recover after ~1 min); see Figure 1C here: https://www.pnas.org/doi/10.1073/pnas.2103228118. Our images in Fig. 3F and 3G were taken ~10 min after the hypotonic shock.

      6. On a lighter note, we’d recommend not using in cellulo.

      We changed in cellulo to “in live cells”

      Reference List

      Cox, C.D., Bae, C., Ziegler, L., Hartley, S., Nikolova-Krstevski, V., Rohde, P.R., Ng, C., Sachs, F., Gottlieb, P.A., and Martinac, B. (2016). Removal of the mechanoprotective influence of the cytoskeleton reveals PIEZO1 is gated by bilayer tension. Nature Communications 7, 1-13.

      Dumitru, A.C., Stommen, A., Koehler, M., Cloos, A., Yang, J., Leclercqz, A., Tyteca, D., and Alsteens, D. (2021). Probing PIEZO1 Localization upon Activation Using High-Resolution Atomic Force and Confocal Microscopy. Nano Letters 21, 4950-4958.

      Haselwandter, C.A., MacKinnon, R., Guo, Y., and Fu, Z. (2022). Quantitative prediction and measurement of Piezo's membrane footprint. bioRxiv.

      Lewis, A.H., and Grandl, J. (2015). Mechanical sensitivity of Piezo1 ion channels can be tuned by cellular membrane tension. Elife 4, e12088.

      Lin, Y., Guo, Y.R., Miyagi, A., Levring, J., MacKinnon, R., and Scheuring, S. (2019). Force-induced conformational changes in PIEZO1. Nature 573, 230-234.

      Lou, H., Zhao, W., Li, X., Duan, L., Powers, A., Akamatsu, M., Santoro, F., McGuire, A.F., Cui, Y., and Drubin, D.G. (2019). Membrane curvature underlies actin reorganization in response to nanoscale surface topography. Proceedings of the National Academy of Sciences 116, 23143-23151.

      Shi, Z., and Baumgart, T. (2015). Membrane tension and peripheral protein density mediate membrane shape transitions. Nature Communications 6, 1-8.

      Shi, Z., Graber, Z.T., Baumgart, T., Stone, H.A., and Cohen, A.E. (2018). Cell membranes resist flow. Cell 175, 1769-1779.e13.

      Sorre, B., Callan-Jones, A., Manzi, J., Goud, B., Prost, J., Bassereau, P., and Roux, A. (2012). Nature of curvature coupling of amphiphysin with membranes depends on its bound density. Proceedings of the National Academy of Sciences 109, 173-178.

      Syeda, R., Florendo, M.N., Cox, C.D., Kefauver, J.M., Santos, J.S., Martinac, B., and Patapoutian, A. (2016). Piezo1 channels are inherently mechanosensitive. Cell Reports 17, 1739-1746.

      Yang, X., Lin, C., Chen, X., Li, S., Li, X., and Xiao, B. (2022). Structure deformation and curvature sensing of PIEZO1 in lipid membranes. Nature 1-7.

      Zhao, W., Hanson, L., Lou, H., Akamatsu, M., Chowdary, P.D., Santoro, F., Marks, J.R., Grassart, A., Drubin, D.G., and Cui, Y. (2017). Nanoscale manipulation of membrane curvature for probing endocytosis in live cells. Nature Nanotechnology 12, 750-756.

      (This is a response to peer review conducted by Biophysics Colab on version 1 of this preprint.)

    1. Meillassoux is quite right to say this renders the objectivity of knowledge very difficult to understand. But why think the problem lies in presuming the artifactual nature of cognition?—especially now that science has begun reverse-engineering that nature in earnest! What if our presumption of artifactuality weren’t so much the problem, as the characterization? What if the problem isn’t that cognitive science is artifactual so much as how it is?

      Meillassoux claims that, because cognitive science is made of atoms, it is suspect -- so we need to use philosophy. That is a bad claim. Philosophy is made of atoms too. Cognitive science explains how cognition works. It may not answer "why cognition works", but maybe that's a trick question that only philosophy can ask and nobody can answer.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The rapid syncytial nuclear cycles that occur during the first ~2.5 hours of Drosophila embryogenesis and give rise to the blastoderm are supported by large amounts of maternally deposited histone proteins which are stored in the egg cytoplasm for deposition into replicating DNA during each round of S phase. Although the H2A/H2B storage chaperone Jabba was identified by Michael Welte's lab several years ago, maternal H3/H4 storage chaperones have not been identified. Tirgar et al provide evidence that the Drosophila NASP protein provides histone H3 and H4 storage function during these earliest stages of Drosophila embryogenesis. The data include genetic analyses that NASP function is required maternally, but not zygotically, and molecular analyses that NASP binds H3 and that H3 and H4 levels are reduced in the embryo and late-stage oocytes in the absence of NASP. These data are convincing and support the conclusion that NASP is a maternally acting H3/H4 storage chaperone needed in the early embryo.

      Two additional lines of investigation would strengthen this conclusion and perhaps increase the impact and appeal of the manuscript.

      The first is a microscopic analysis of the nuclear division cycles in eggs derived from NASP mutant mothers. The authors report DAPI staining and assessment of nuclear cycles, but do not show these data. In fact, the two embryos shown in Figure 4B do not look like DAPI-stained embryos; there are no nuclei apparent in the images. Loss of maternal histone causes defects in chromosome morphology that result in characteristic defects such as lagging chromosomes and the failure of sister chromatid segregation leading to fused daughter nuclei (see PMID: 11157774 for an example). These defects should not be difficult to detect via DNA staining or even using fluorescently labeled H2 type histones. Characterizing such defects would lend support to the hypothesis and I think is important for this paper.

      We thank the reviewer for their constructive review and feedback. We have switched to Propidium Iodide (PI) staining to increase the signal-to-noise ratio for DNA staining in early embryos. Given the improved signal we see with PI over DAPI, we will be able to provide both improved images of nuclear staining and an assay for defects in chromosome morphology as suggested. We will include this data in the revised version of the manuscript.

      Second, determining the location of NASP in the early embryo might provide further insight into the mechanism of storage, i.e., is NASP located in the cytoplasm rather than the nucleus, perhaps in association with lipid droplets like Jabba? Do the antibodies the authors developed work in IF experiments to ask this question? At the moment what is shown is that NASP is present in 0-2 hour embryos via western blot analysis, supporting the conclusion that it functions in the early embryo as a storage chaperone. This analysis would be nice to have but is not essential in my view.

      We have tried to use our antibody to monitor the localization of NASP in the early embryo. Unfortunately, the staining has yet to work. We will continue to alter fixation and permeabilization conditions in the early embryo with the goal of including this data in the revised manuscript. We have, however, been able to monitor NASP localization in Drosophila S2 cultured cells with our antibody. If we are unable to get the antibody staining to work in embryos, we will include the NASP localization data in S2 cells in combination with EdU labeling to mark cells in S phase.

      Small points: Is NASP really a maternal effect "lethal"? Some of the eggs do hatch, and so some develop to stages where maternal histones are no longer necessary and zygotic production takes over (i.e. cycle 15). Perhaps consider the language used here.

      We see the reviewer's point with respect to the term ‘lethal’. We do see a very small fraction of progeny laid by NASP mutant mothers make it to adulthood, although they die shortly after hatching. We’ve removed the term ‘lethal’ and refer to NASP solely as a maternal effect gene.

      On this point, do NASP mutant females lay the same number of eggs as wild type? i.e., is there a requirement for oogenesis/egg production (other than depositing H3/H4 into the egg), or just for the early zygotic cycles?

      We have noticed that NASP mutant mothers have lower fecundity. We have included this data in the revised manuscript as Supplemental Figure 2A.

      The first paragraph of the results is redundant with much of the introduction, which I think could do a better job at describing in more detail the syncytial cycles and the special needs they have for histone storage and chaperone function versus the post-blastoderm embryonic cycles and the rest of development, i.e., make a better distinction between the first two hours of embryogenesis versus the rest of embryogenesis, and when the switch from maternal to zygotic control of development and histone production occurs (cycle 15 at 3-4 hours AED).

      We thank the reviewer for this suggestion. The manuscript has been edited to be less redundant and to include details of embryogenesis as suggested.

      CROSS-CONSULTATION COMMENTS

      Seems like all reviewers are in general agreement, particularly about providing additional data regarding chromosome/nuclear behavior in the NASP mutants and NASP localization in the early embryo to increase impact of the study. While rescue of the NASP mutant phenotype with a transgene would be nice, as suggested by referee #2, I don't think it's essential given the genetic approaches employed.

      Reviewer #1 (Significance (Required)):

      see above

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Tirgar et al. report on a functional characterization of the Drosophila homolog of the histone H3/H4 chaperone NASP. They generated a loss of function allele of NASP by CRISPR/Cas9, which induces a partial maternal effect embryo lethal phenotype. Using quantitative mass spectrometry, they demonstrate that NASP stabilizes reservoirs of H3 and H4 in the early embryo. The manuscript is very clear and confirms the functional importance of maternal NASP for the early embryo. Genetic analyses are well conducted (but see my comments below) and the impact of NASP maternal mutant on H3 and H4 stockpiles is convincingly established by both quantitative mass spectrometry and Western-blotting.

      Major comments:

      • Although the authors used two independent deficiencies of the NASP genetic region to characterize their NASP CRISPR alleles, it is relatively standard in this type of functional analyses to perform rescue experiments using a transgene expressing the WT protein.

      We thank the reviewer for this suggestion. As discussed in the cross consultation, we agree that the use of the two different deficiency lines and the NASP1 CRISPR control are clear lines of evidence that the phenotypical data are due to lack of NASP.

      • In WB analyses, NASP appears systematically shorter in the NASP[1]/Df genotype compared to WT. Can the authors comment on this?

      While we reproducibly see this change in migration, we can only guess as to why this may be. One possible reason is that the NASP1 mutant protein could be missing a post-translational modification. Proteomic data from Krauchunas et al. (Dev Biol. 2012; PMC3441184) shows that NASP has the potential to be regulated by phosphorylation. Therefore, the NASP1 mutant protein could be missing a phosphorylation. Intriguingly, the 6bp insertion is next to a Thr residue that could affect its ability to be phosphorylated (if it is phosphorylated at all). Since we can only offer speculation, we do not feel comfortable adding this to the manuscript.

      • The authors do not mention the centromeric histone H3 variant Cid in their analyses. Do they have evidence that it is not affected by loss of maternal NASP?

      We thank the reviewer for raising this great point. Our mass spec data reveals that Cid levels stay the same in the absence of NASP in both embryos and stage 14 egg chambers. We have edited Figures 3D and 3E to include Cid. Unfortunately, we did not identify any Cid-specific peptides in our IP-mass spec data.

      • The authors could have chosen to explore in more details the phenotypic defects of embryos derived from NASP mutant mothers. Instead, a single abnormal embryo is shown with no cytological details. This is a bit problematic since an earlier study (Zhang et al 2018, cited in the manuscript) actually provided more phenotypic details of embryos from NASP KD mothers.

      This issue was also raised by Reviewer 1. We have switched to Propidium Iodide (PI) staining to increase the signal-to-noise ratio for DNA staining in early embryos. Given the improved signal we see with PI over DAPI, we will be able to provide both improved images of nuclear staining and an assay for defects in chromosome morphology as suggested. We will include this data in the revised version of the manuscript.

      • Similarly, the authors could have used their anti-NASP antibody to analyze the distribution of NASP during cleavage divisions. Does it behave like ASF1, for instance, which enters S phase nuclei at each cycle or does it remain in the cytoplasm? These are relatively simple experiments/analyses that could increase the significance of the study.

      This point was also raised by Reviewer 1. We have tried to use our antibody to monitor the localization of NASP in the early embryo. Unfortunately, the staining has yet to work. We will continue to alter fixation and permeabilization conditions in the early embryo with the goal of including this data in the revised manuscript. We have, however, been able to monitor NASP localization in Drosophila S2 cultured cells with our antibody. If we are unable to get the antibody staining to work in embryos, we will include the NASP localization data in S2 cells in combination with EdU labeling to mark cells in S phase.

      Minor comments:

      • line 60: I suggest introducing Drosophila in the next sentence, where it seems more appropriate (not all embryos develop "extremely rapidly").

      We have edited the second sentence to state “the early Drosophila embryo”.

      • line 68: the 50% estimation of free histones does not really make sense without defining the embryonic stage.

      We have edited the manuscript to state the specific cell cycle in which 50% free histones were measured.

      • line 89: Are the authors specifically referring to Drosophila NASP?

      Yes, we have edited the text to include Drosophila in this instance.

      • lines 99-106: I found this paragraph redundant with the introduction.

      We appreciate this suggestion. It was also pointed out by Reviewer 1. We have made changes to the manuscript to address the redundancy.

      • line 142: H3-H4

      Thank you for noticing this. We have edited the text to include 4.

      • lines 190-191: It seems to me that data of Figure S2C are already included in Fig. 2E.

      The experiment in Figure S2C was performed with virgin females, whereas the data in Figure 2E were generated with non-virgin mothers. This was important to control the genotype of the embryos.

      • line 232: it is surprising that the Zhang et al paper (reporting maternal KD of NASP) is only mentioned here. As a reader, I would certainly prefer to have it presented right from the introduction.

      We have edited the manuscript to include this reference in the introduction.

      • Figure 4B needs a scale bar.

      Figure 4B will be replaced with better images of the embryo stained with PI. It will also include images of chromosome morphology/segregation. We will be sure to include scale bars.

      • line 302: Mentioning the identity and function of known H3/H4 histone chaperones acting in the early embryo (ASF1, HIRA, CAF-1, ...) could provide perspective to the present study.

      Thank you for this suggestion. We have edited the manuscript to include functions of other histone chaperones in the early embryo to provide context.

      • line 304: in contrast to this statement, I found quite surprising and interesting that NASP is not absolutely essential for embryo development considering its role. This should be discussed.

      In the absence of Jabba alone, upregulation of translation can compensate for the destabilization of H2A, H2B, and H2Av. It is only when translation is inhibited in embryos laid by Jabba mutant mothers that embryos die (Li, Z., et al., Curr Biol 2013). Therefore, it is possible that translation can partially compensate for the degradation of H3 and H4 in the absence of NASP. This may be why a fraction of embryos laid by NASP mutant mothers are able to hatch and why we still detect some H3 in embryos laid by NASP mutant mothers. We have edited the manuscript to discuss this in more depth.

      CROSS-CONSULTATION COMMENTS

      I fully agree with the other reports. The NASP rescue experiment is just a suggestion but is not essential.

      Reviewer #2 (Significance (Required)):

      This work clarifies the identity and function of Drosophila NASP and clearly demonstrates that NASP is important for the stabilization of maternal stockpiles of H3 and H4 during early embryo development. The conservation of NASP function as a histone H3/H4 chaperone in Drosophila is not really a surprise but the merit of this study is to establish this assumption as a fact. It also establishes useful tools (mutant lines and antibody) for the fly community interested in this topic. The study however does not provide new insights about the dynamic distribution of NASP and the cytological consequences of its maternal depletion on the amplification of cleavage nuclei.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Rapid cell cycles in early embryogenesis are driven by maternally supplied stockpiles of RNA and protein, including histones H3 and H4. This study uses sequence homology searches, biochemical approaches (immunoprecipitation and mass spectrometry) and genetics to identify NASP (CG8223) as the H3-H4 chaperone in Drosophila. Using CRISPR technology, the authors generate a NASP mutant fly line and show using genetic crosses that NASP is a maternal lethal gene. Furthermore, the study shows that NASP stabilises H3-H4 during oogenesis and embryogenesis and is required for early embryogenesis.

      Major comments: The key conclusions of this study are very convincing. For example, the authors use multiple approaches to show H3-H4 specific interactions with NASP and that H3-H4 protein levels are reduced in mutants (Western analyses, quantitative MS). Analysis is carried out on two individual NASP mutant lines (one deletion that produces no protein, one insertion that still produces some protein acting as a control). All experiments are well controlled, executed and presented. Genetic crossing schemes are well presented and statistical analysis of progeny is clear.

      We thank the reviewer for their positive feedback on our manuscript.

      Minor comments:

      In Figure 1B - Authors could indicate amino acids shown or are they full length proteins?

      We have edited the methods to specify the amino acid residues included in each structure.

      In Figure 2B - Authors could (semi) quantify reduction in NASP1 mutant to show this is a gene dose effect?

      We have now included the quantification of the Western blot in Figure 2B.

      CROSS-CONSULTATION COMMENTS I agree with the other reports. Although I did not indicate it in my original report, I agree that more in depth analysis of nuclear or chromosomal defects in NASP mutant embryos would enhance the study.

      Thank you for this suggestion. We are repeating the DNA staining in embryos and will include this new data in the revised version of the manuscript.

      Reviewer #3 (Significance (Required)):

      Excess soluble histones can be toxic and must be bound to chaperones. Until this study the chaperone responsible for H3-H4 stabilisation in rapidly cycling cells in Drosophila embryos was not known. Moreover, the NASP homolog had not yet been identified in Drosophila nor had its function been characterised. The findings are of interest to Drosophila researchers, the field of chromatin assembly, as well as those interested in early embryogenesis in animals.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      An exciting development in our knowledge about how the Arp2/3 complex controls the assembly of actin networks has come from the discovery that in addition to forming branched networks, Arp2/3 can nucleate linear filaments when it is activated by WISH/DIP/SPIN90. However, despite some excellent work largely done by the Nolen lab in yeast, many questions remain about the Arp2/3-mediated assembly of branched vs. linear actin filaments. This is especially true in the complex environment of cells, where synergy and competition between different actin networks are used to control biological processes. Knowing the biochemical and physical properties of these different Arp2/3 assemblies will be key to figuring out how they work in cells. Here Cao et al. use an elegant microfluidics-based single-filament assay system to perform a comparative analysis of the stability of linear and branched Arp2/3 networks. They find interesting differences in how they respond to stabilizing and destabilizing factors. The most striking differences appear when force or aging is applied: both cause debranching of branched networks but have little effect on Spin90-Arp2/3 nucleated filaments.

      We thank the reviewer for their positive comments.

      Major comments:

      As a comparative study on the stability of branched vs. linear Arp2/3 nucleated filaments, this manuscript is fairly complete. The key conclusions are well supported by rigorous experiments which can be reproduced by others based on the information provided. However, I am not seeing explicit information on performing biological replicates. This should be included in the manuscript. The use of statistics is largely fine; however I question the use of one statistical test on one figure (see minor comments below).

      The revised manuscript is now explicit about biological replicates. We now specify the biological repeats of all our experiments in the figure legends, and we now show the results from new repeats in Fig 4 and Supp Fig S2 (please see also our response to the minor comments below, for more details).

      I would not ask for additional experiments at this time. However, there is an analysis that would be important for interpreting the authors' claims- branch/filament length at the time of dissociation or destabilization of Arp2/3. This would help address if there was a physical tipping point for each type of structure that could explain potential differences they see. The authors should already have this data and the time to complete it would be negligible in delaying publication.

      If we understand correctly, the “physical tipping point” mentioned by the reviewer would be a threshold force, where the Arp2/3-filament interface would become unstable. This is an interesting idea. Indeed, the applied force scales with the length of the filament (or branch), as well as with the flow velocity. In most of our experiments, however, the force applied to SPIN90-Arp2/3 and to branch junctions was kept constant and below 0.2 pN. This was done by exposing the filaments (or branches) to G-actin at the critical concentration, in order to minimize variations of their lengths. Therefore, by design, dissociation events in these experiments take place at the same length, ruling out the existence of a “tipping point”.

      Our data provide another test of the reviewer’s hypothesis, thanks to the experiments where we specifically address the question of the impact of force (Fig 5 and Supp Fig S6), by varying length and flow rate. We found that the stability of SPIN90-Arp2/3 linear filaments was unaffected by force, and that debranching was steadily accelerated by force. In both cases, it thus appears that there is no detectable threshold.
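
      As a rough illustration of the scaling invoked above (tension at the anchoring point proportional to both the filament length and the flow velocity), the following minimal sketch may help. The drag coefficient per unit length is a placeholder chosen purely for illustration, not a calibration from the manuscript, and the function names are hypothetical.

```python
# Hypothetical drag coefficient per unit length, in pN per (um * um/s).
# Placeholder for illustration only; not a value from the manuscript.
DRAG_PER_LENGTH = 1.0e-4


def tension_at_anchor(length_um, flow_velocity_um_s,
                      drag_per_length=DRAG_PER_LENGTH):
    """Pulling force (pN) at the anchored end of a filament in flow.

    Assumes the force grows linearly with both filament length and
    flow velocity, as stated in the response above.
    """
    if length_um < 0 or flow_velocity_um_s < 0:
        raise ValueError("length and flow velocity must be non-negative")
    return drag_per_length * length_um * flow_velocity_um_s


def max_length_for_force(force_limit_pn, flow_velocity_um_s,
                         drag_per_length=DRAG_PER_LENGTH):
    """Longest filament (um) that stays at or below a given force (pN)."""
    if flow_velocity_um_s <= 0:
        raise ValueError("flow velocity must be positive")
    return force_limit_pn / (drag_per_length * flow_velocity_um_s)
```

      Under this assumed linear scaling, holding the filament length constant (for instance by working at the critical G-actin concentration) holds the force constant at a given flow rate, whereas varying the length or the flow rate sweeps the force.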

      One additional major comment is that the manuscript's title and abstract hint that this paper explores the differences in nucleation of branched vs. linear filaments by Arp2/3. However, the only figure that deals explicitly with nucleation in the paper is Figure 1, which is really just a confirmation that the mammalian proteins used in this study perform similarly to their yeast homologues (Balzer et al, Current Biology 2019). The authors might think about rewording the title/abstract to better reflect that the paper really explores the differences in the stability of the two networks.

      This is a fair point. We have now modified the title into “Regulation of branched versus linear Arp2/3-generated actin filaments”.

      Minor comments:

      1 in 12 men and 1 in 200 women are red/green colorblind. Please change the coloring of the schematics and images so that they can be easily seen by all people. This is especially true of the schematics, which are important for understanding exactly what each assay is measuring.

      We thank the reviewer for pointing this out. We have now made the schematics and images in Figs 1A, 2A, 2D and 4D colorblind-friendly.

      The Introduction is a bit choppy and unfocused. It was difficult to deduce exactly where the paper was going from it. Please consider re-writing it for better clarity. The Discussion on the other hand was fantastic. Great job on interpreting your results in a larger context.

      We have re-written large parts of the Introduction to make it clearer. We are glad the reviewer liked the Discussion, where we have nonetheless made some small changes in response to comments from the other reviewers.

      Many figures- while the use of different lightness values of the same color is appreciated in conveying different concentrations of reagents used, there were several instances where it was very hard to read the one on the very bottom (ex. 2B, E; 3A; 5C, G).

      We have now changed the colors in these figures, to make them clearer.

      Figure 1- since this is a confirmation of previous results performed using the same proteins from other species, the title should reflect that (ex. VCA domains accelerate the nucleation of filaments by mammalian SPIN90-Arp2/3). Also, to me this figure is supplementary to the main message of the paper. The authors might think of moving it to Supplementary Information.

      We have modified the title of Figure 1, now specifying “mammalian”, following the reviewer’s suggestion. However, we prefer to keep this figure as a main figure, rather than move it to Supplementary as proposed. Indeed, this figure does more than simply confirm previous results with mammalian proteins, since it compares different VCAs, which is new. These results are important because they are put in perspective with our results on the acceleration of linear filament detachment by different VCAs, later in the manuscript.

      Figure 1- If the goal was to verify that G-actin recruitment by VCA was important for Spin90-Arp 2/3 nucleation by performing a competition experiment with profilin, why was the concentration of G-actin AND profilin increased between the experiments in 1B vs. 1C. It makes it hard to directly compare the results.

      We now provide new data in Fig 1C, which can be directly compared to Fig 1B (only the profilin concentration was increased). It clearly shows that the effect of VCA disappears when the profilin concentration is increased.

      Figure 4B-F- Here, it would be nice to see the distribution of all the individual results, which are hidden by the bar graph. Additionally, the Chi-square test is not the appropriate test for evaluating statistical significance between multiple groups. ANOVA followed by an appropriate post hoc test should be used here.

      We now show the individual results in the bar graphs of figure 4. In this situation, we agree that the statistical significance should not be evaluated by a Chi-square test. We now indicate the p-values obtained from a paired t-test, which seems appropriate since we are comparing averages in pairs.
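
      For readers unfamiliar with the test, a paired t-test on matched repeats can be sketched as follows. This is a generic illustration using only the Python standard library; the function name and the numbers in the usage note are hypothetical and are not data from the manuscript.

```python
import math
import statistics


def paired_t_statistic(condition_a, condition_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Each position in the two sequences is one matched repeat,
    e.g. the same experiment measured with and without a factor.
    """
    if len(condition_a) != len(condition_b):
        raise ValueError("paired samples must have the same length")
    n = len(condition_a)
    if n < 2:
        raise ValueError("at least two pairs are required")
    # Work on the per-pair differences.
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1
```

      For example, `paired_t_statistic([0.62, 0.70, 0.66], [0.41, 0.52, 0.45])` uses made-up values for three matched repeats. The p-value is then read from Student's t distribution with n - 1 degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) returns it directly.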

      Figure 4G- Please quantify and show reproducibility.

      We now show quantified repeats (shown in Fig 4, new panels H and I).

      Figure 5- the piconewton forces used for these experiments are in line with measured forces that are applied to actin in cells (ex. Mehidi et al, Nature Cell Biology 2021; Jiang et al, Nature 2003). The text would benefit if this was explicitly stated.

      We now state this explicitly, when presenting these results.

      Reviewer #1 (Significance (Required)):

      The real significance of this work is in characterizing the differential stabilities of linear vs. branched Arp2/3 filaments in response to actin-binding proteins, mechanical stress, and aging. While both types of filaments respond similarly to actin-binding proteins, with nuanced differences, the most striking results came from applied force and aging experiments, with Spin90-Arp2/3 filaments being much more resistant to both. This has some very interesting implications for how these two types of assemblies might synergize in cells. Additionally, the results also have some exciting implications for the pointed-end regulation of actin filaments, which is still poorly understood in complex systems. Since the manuscript is A) more of a survey study on the factors that influence filament stability that does not go particularly deep into any particular mechanism of regulation and B) has no direct applicability to how the physical properties of branched and linear Arp2/3 nucleated actin filaments influence actin network activity in cells, the audience will likely be limited to actin enthusiasts. However, the work is still important in both what it reveals and implies.

      We thank the reviewer for pointing out the novelty and the importance of our work. We agree that the significance of our paper lies in the characterization of the differential stabilities of linear vs. branched Arp2/3 filaments, in response to different physiological factors. One of the strengths of our approach is that we do not focus on one regulatory mechanism in particular. Rather, we reveal fundamental differences between the Arp2/3-generated filaments and how they can be regulated. Understanding these basic mechanisms is a prerequisite to understand the regulation of entire cytoskeletal networks.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The quantitative analysis can be improved. It appears that most of the data results from single experiments, with rate values and errors resulting from fitting of single experiments without repetitions. In Fig. 1C legend (p.5) the authors state "These experiments were repeated three times, with similar results", but the data is not used in the analysis and other experiments do not mention this point. This is particularly important for comparisons among different VCAs that are rather similar in nature. In Fig. 1B. N-WASP is more efficient in nucleating SPIN90-Arp2/3 complex-linear filaments followed by WASP and then WASH. In Fig. 2 B,C, N-WASP is the most effective in dissociating SPIN90-Arp2/3 complex linear filaments followed by WASH and then WASP. But in Fig. 2 E,F, WASH is by far the most effective in dissociating branches followed by N-WASP and then WASP. Therefore, the conclusion in the Discussion (p.12) "While these regulatory proteins similarly affect branched and linear Arp2/3-generated filaments, they do so with clear quantitative differences" is not supported by quantification. To remedy this problem the authors should include at least 3 repeats of each experiment in data analysis. Also, they could include an analysis of sequence differences among VCAs and discuss how these may correlate with the observed differences. For instance, one WH2 in WASP vs. two in N-WASP.

      Indeed, we argue that the two forms of activated Arp2/3 differ in their sensitivity to different VCA motifs, based on how these VCA motifs rank in their ability to destabilize branched and linear filaments (the VCA motifs also rank differently in their activation and co-activation of Arp2/3 to nucleate branches and linear filaments, but this result does not contribute to our discussion of how proteins interact with the activated Arp2/3). Following the reviewer’s suggestion, we now show repeats of these experiments (new Supp Fig S2), clearly showing that N-WASP is the most effective in dissociating linear filaments while the differences are milder for dissociating branches, with WASH being at least as effective as N-WASP. We now also discuss how this observation could relate to differences in sequence between VCAs (Discussion section and new Supp Fig S9).

      Also, please note that, following a suggestion from Reviewer 3, we have now performed experiments with the CA-domains of N-WASP (new Supp Fig S4C and S4D), which show that the V-domain plays an important role in debranching but plays no role in destabilizing SPIN90-Arp2/3 at filament pointed ends. These new results reinforce our statement that VCA affects branched and linear Arp2/3-generated filaments differently.

      Reviewer #2 (Significance (Required)):

      Arp2/3 complex is a 7-protein complex implicated in actin filament nucleation and branching. Arp2/3 complex-nucleated branched networks are found at several locations in cells and are responsible for processes such as cell motility.

      Cao et al. compare the effect of several proteins on the filament nucleation activity of Arp2/3 complex, and the stabilization or destabilization of actin filament branches as well as linear actin filaments nucleated by SPIN90-Arp2/3 complex. The proteins tested include the VCA regions of three NPFs (N-WASP, WASP, and WASH) that activate Arp2/3 complex, GMF (a debranching protein) and cortactin (a branch stabilizing protein). For the most part, the study uses a single method, microfluidics-TIRF microscopy.

      The main findings are:

      1. VCA domains enhance nucleation of linear filaments by SPIN90-Arp2/3 complex in the presence of actin monomers.
      2. However, VCA domains can also destabilize existing SPIN90-Arp2/3 complex linear filaments and branches, and this effect depends on the presence of the V-domain (WH2 domain that binds actin monomers).
      3. The debranching factor GMF also destabilizes SPIN90-Arp2/3 complex linear filaments. Both GMF and VCA generate free pointed ends by dissociating Arp2/3 complex from pointed ends and SPIN90.
      4. SPIN90-Arp2/3 complex linear filaments are less susceptible to force and aging than filament branches.
      5. Cortactin stabilizes SPIN90-Arp2/3 complex linear filaments to a higher degree than it does branches.

      These are novel and very interesting new observations of significant interest to the actin cytoskeleton field. Therefore, I recommend publication of this paper in EMBO J.

      We thank the reviewer for their positive evaluation of our work.

      I have one recommendation and one suggestion for improvement:

      Major:

      1. The quantitative analysis can be improved. It appears that most of the data results from single experiments, with rate values and errors resulting from fitting of single experiments without repetitions. In Fig. 1C legend (p.5) the authors state "These experiments were repeated three times, with similar results", but the data is not used in the analysis and other experiments do not mention this point. This is particularly important for comparisons among different VCAs that are rather similar in nature. In Fig. 1B. N-WASP is more efficient in nucleating SPIN90-Arp2/3 complex-linear filaments followed by WASP and then WASH. In Fig. 2 B,C, N-WASP is the most effective in dissociating SPIN90-Arp2/3 complex linear filaments followed by WASH and then WASP. But in Fig. 2 E,F, WASH is by far the most effective in dissociating branches followed by N-WASP and then WASP. Therefore, the conclusion in the Discussion (p.12) "While these regulatory proteins similarly affect branched and linear Arp2/3-generated filaments, they do so with clear quantitative differences" is not supported by quantification. To remedy this problem the authors should include at least 3 repeats of each experiment in data analysis. Also, they could include an analysis of sequence differences among VCAs and discuss how these may correlate with the observed differences. For instance, one WH2 in WASP vs. two in N-WASP.

      This comment is identical to the reviewer’s first paragraph. We copy our answer here again, for convenience:

      Indeed, we argue that the two forms of activated Arp2/3 differ in their sensitivity to different VCA motifs, based on how these VCA motifs rank in their ability to destabilize branched and linear filaments (the VCA motifs also rank differently in their activation and co-activation of Arp2/3 to nucleate branches and linear filaments, but this result does not contribute to our discussion of how proteins interact with the activated Arp2/3). Following the reviewer’s suggestion, we now show repeats of these experiments (new Supp Fig S2), clearly showing that N-WASP is the most effective in dissociating linear filaments while the differences are milder for dissociating branches, with WASH being at least as effective as N-WASP. We now also discuss how this observation could relate to differences in sequence between VCAs (Discussion section and new Supp Fig S9).

      Also, please note that, following a suggestion from Reviewer 3, we have now performed experiments with the CA-domains of N-WASP (new Supp Fig S4C and S4D), which show that the V-domain plays an important role in debranching but plays no role in destabilizing SPIN90-Arp2/3 at filament pointed ends. These new results reinforce our statement that VCA affects branched and linear Arp2/3-generated filaments differently.

      Minor:

      In GST-pull-down experiments (Fig. 4G), the amount of Arp2/3 complex bound is analyzed by Western, which is rather imprecise. Is the amount of Arp2/3 complex so little that it cannot be quantified using regular SDS-PAGE? If that is the case, this would suggest rather low affinity of SPIN90 for Arp2/3 complex. How does this affect the proposed mechanism and experiments in the microfluidics chamber?

      Indeed, the amount of pulled-down Arp2/3 is low and difficult to quantify by SDS-PAGE. This is consistent with previous reports which indicate a low affinity of SPIN90 for the Arp2/3 complex (Wagner et al. Current Biology 2013, Balzer et al. eLife 2020). This does not affect our conclusions, which we now confirm by showing quantified repeats of our pull-down experiments (new panels H and I, in Figure 4). In spite of this low affinity, which makes it difficult to saturate SPIN90 with Arp2/3, the SPIN90-Arp2/3 interaction is very stable and allows us to carry out our experiments in the microfluidics chamber over several tens of minutes (as was already the case in our previous study, Cao et al. NCB 2020).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, Cao and collaborators investigate the biochemical and mechanical differences between branched actin filaments nucleated by WASP-activated Arp2/3 complex and linear actin filaments nucleated by SPIN90-activated Arp2/3 complex. They use TIRF microscopy in a microfluidic chamber to show that the mammalian proteins, SPIN90 and WASP (or N-WASP or WAVE), like their yeast homologues, co-activate Arp2/3 complex to nucleate linear actin filaments. Using the same assays, they find the surprising result that the VCA segment of WASP proteins destabilizes the interaction between SPIN90 and Arp2/3 complex in linear actin filaments nucleated by Arp2/3 complex. They then show that VCA also destabilizes actin filament branches. The remainder of the study explores the influence of branch stabilizing/destabilizing proteins or mechanical stress on the stability of the interaction between SPIN90 and Arp2/3 complex on the pointed end of the actin filament. They find that like branch junctions, SPIN90-bound Arp2/3 is destabilized at the end of linear filaments by GMF and stabilized by cortactin. However, unlike branch junctions, SPIN90-Arp2/3 complex is not destabilized on filament ends by piconewton forces or by aging. They conclude that SPIN90- versus VCA-activated Arp2/3 complex adopt similar but non-identical conformations.

      Overall, the paper is well written and the experiments, which are very challenging, are rigorously executed. The biochemical results are convincing, novel and unexpected. However, the work could be strengthened by more strongly connecting the biochemical observations to biological implications. In addition, there are some interpretations/conclusions that seem somewhat weakly supported, and the authors should consider revising. Nonetheless, given the quality of the work and the importance of the system, this manuscript will appeal to a broad audience.

      We thank the reviewer for their positive comments. We have rewritten parts of the Discussion in order to better connect our observations to implications in cells. We address the concerns regarding our interpretations in the point-by-point, below.

      Comments on evidence, reproducibility, clarity and significance:

      The differences in the stability of SPIN90-Arp2/3 on linear filaments versus branch junctions led the authors to conclude that SPIN90- versus VCA-activated complexes adopt similar yet non-identical conformations. There are two problems with this conclusion:

      1) This conclusion rests on the idea that the biochemical differences can only be due to differences in the "ground state" active conformations of the complex. Another possible scenario would be that the active conformations are the same, but the transition state or intermediate state structures within the debranching reactions are different, thus changing the kinetics of the debranching reactions.

      We thank the reviewer for this remark, and we agree that conformational differences may also arise in the intermediate states, during dissociation (of the branch from the mother, or of the linear filaments from SPIN90). We now mention this possibility in our Discussion.

      2) There are already structural data showing conformational differences between the Dip1-bound Arp2/3 complex on the end of a linear filament and Arp2/3 complex at a branch junction. While there are some caveats to comparisons of the structures (e.g., the Dip1 structure includes the fission yeast SPIN90 protein (Dip1) and the fission yeast Arp2/3 complex while the branch junction contains mammalian proteins), these data offer much stronger evidence that the active states adopt (somewhat) different conformations than the data presented here.

      We agree that the available structural data (in particular, Ding et al. PNAS 2022, which was not yet published when we submitted our manuscript, and which we now cite) provide a clear indication that active Arp2/3 adopts different conformations in branches and linear filaments. We have modified our text to make this point clearer.

      The authors make comparisons between the Fäßler branch junction structure and the Shaaban Dip1-Arp2/3-filament structure. The Fäßler branch junction structure is a low resolution structure (9 angstroms) and should be interpreted with caution (see below). A much higher resolution branch junction structure was recently solved (Ding et al, PNAS 2022) and should be used for comparisons between the structures.

      Ding et al. PNAS 2022 was not yet published when we submitted our manuscript. We now use it to compare the structures of active Arp2/3, and we have modified the text accordingly.

      Pg 14 - The authors say differences between ARPC3-Arp2 and ARPC5-Arp2 contacts in the two structures are likely to cause the differences in interactions with GMF and VCA. Two concerns with this statement are: 1.) The basis for the conclusion that the ARPC5-Arp2 contacts are different (in Fäßler, et al.) is not solid (see Ding, et al) and 2.) The analysis is vague. To reasonably conclude that differences in the contacts would influence GMF and VCA interactions would require mapping out the structural connection between the ARPC3-Arp2 interaction site and the GMF or VCA binding sites. If there is no obvious connection between these sites, the conclusion that the differences in the ARPC3-Arp2 interface cause differences in VCA and GMF binding should be far more circumspect.

      We have re-written this part of the Discussion section. In light of the new data by Ding et al., we agree with the reviewer that the conclusion that the ARPC5-Arp2 contacts are different is not solid. Our revised text makes it clear that we are not making any claims involving interactions within the Arp2/3 complex. Our point is simply that recent cryo-EM reports indicate conformational differences in Arp2 and Arp3 between the two activated forms of the Arp2/3 complex and that, since the CA-domain of NPFs bind to Arp2 and Arp3, it appears reasonable to make a connection with our results.

      Pg 6. "These observations suggest that the ability of VCA to destabilize Arp2/3-nucleated filaments relies on the availability of its V-domain." It's possible that G-actin binding to V blocks the CA from accessing the branch junction. Therefore, it seems important to test whether N-WASP-CA can destabilize Arp2/3-nucleated actin filaments.

      We thank the reviewer for this suggestion. We now present results from new experiments performed with the CA-domain of N-WASP (new Supp Fig S4C,D). We find that the V-domain participates in the enhancement of debranching, but that it appears to play no role in the destabilization of SPIN90-Arp2/3 from the pointed end. It thus seems that the reviewer’s proposal is correct, and that G-actin binding to the V-domain blocks the CA-domain from accessing the branch junction. We now propose this interpretation in the text.

      Pg 1 - The authors state that "It thus appears that linear and branched Arp2/3-generated filaments respond similarly to regulatory proteins, albeit with quantitative differences". It is worth considering if one should make a blanket statement that linear and branched filaments respond similarly to regulatory proteins when they have tested 3 in total.

      We have rephrased this sentence. It now reads “… respond similarly to the regulatory proteins we have tested…”

      Pg 3 - "More generally, the stability of SPIN90-Arp2/3 at the pointed end, which is important to understand the reorganization and disassembly of actin filament networks, remains to be established." In some ways this statement is not quite accurate because Balzer et al previously showed that Dip1-Arp2/3 complex is very stable at the pointed end. Is the question here whether that stability is also conserved in mammalian systems? If so, that should be more directly stated.

      We meant that, beyond observing that SPIN90 remains visible at the pointed end for some time (as in Balzer et al.), a lot remained unknown: its lifetime had not been quantified, and its sensitivity to the factors that affect branch junctions (proteins, aging, mechanical tension) had not been studied. We have rephrased the sentence in the manuscript to clarify this point.

      The observation that VCA accelerates debranching and SPIN90-Arp2/3 dissociation is very interesting. However, it is uncertain if this biochemical activity has biological relevance, given that once nucleation occurs, Arp2/3 complex will move away from the membrane. While the authors mention in the discussion that debranching by VCA could be relevant when the network is compressed near the membrane, this argument is not particularly strong. Are there ways to strengthen this argument, or find another impact this finding might have on our understanding of Arp2/3 complex regulation?

      We now mention another situation where branch junctions could encounter membrane-bound VCA domains: on the dorsal and ventral membrane surfaces of lamellipodia. We now cite the recent Kage et al. J Cell Science 2022 and Mehidi et al. NCB 2021, where WAVE has been observed in lamellipodia away from the leading edge.

      The observation that SPIN90+Arp2/3-nucleated filaments are not sensitive to piconewton forces is also very interesting. The authors focus on the differences in the amount of surface area buried when discussing this result. However, it seems that a key factor in the stability of the linear filaments would be the direction of the force relative to the complex and attached filament(s), which would be very different for a branch versus a linear filament. The authors should consider addressing this in their discussion.

      The orientation of the applied force is an interesting point. In their study on debranching, Pandit et al. (PNAS 2020) report that their results are not affected by the angle of the applied force relative to the mother filament (their Fig S1D). We now specify this in our manuscript, when introducing our results on mechanical tension. Similarly, we found that anchoring SPIN90 to the coverslip surface by its N-terminus rather than its C-terminus, which likely affects the orientation of the applied force, had no impact on our results (Supp Fig S6A). We have now also added a sentence regarding this aspect in our manuscript, after presenting this result.

      Fig 4, D-F: It is unclear how the authors determined which filaments were spontaneously nucleated versus those that were nucleated by SPIN90-Arp2/3 complex in these experiments. In reactions containing SPIN90 and Arp2/3 complex what fraction of the filaments will be spontaneously nucleated?

      In our conditions, there is no detectable spontaneous nucleation. In control experiments where we flow in the same concentration of G-actin, in the absence of Arp2/3 or in the absence of SPIN90, we observe no filaments at all on the surface, over several fields of view, after 5 minutes. We now specify this in the Methods section.

      Pg 9 - The observation that VCA negatively influences binding of SPIN90 to the complex is unexpected. What implications does this have for understanding how SPIN90 and VCA synergize to activate the complex?

      It appears that the outcome depends on the context. The main role of VCA during co-activation of the Arp2/3 complex with SPIN90 seems to be to supply G-actin, as already proposed (Balzer, 2020) and confirmed by our results (Fig 1C). In the absence of G-actin, VCA is more likely to remove Arp2/3 from SPIN90 (Fig 4G,I). Similarly, when a filament is already formed, the presence of G-actin mitigates the removal of SPIN90-Arp2/3 from the pointed end by VCA (Supp Fig S4).

      Fig 4B - Why is there greater nucleation when Arp2/3 complex and GMF are added together compared to renucleation in reactions that don't have any GMF? This is surprising, especially considering that GMF decreases binding of Arp2/3 complex to SPIN90.

      Indeed, there is a small yet statistically significant difference in the re-nucleation fraction we measured in the presence of Arp2/3, with or without GMF (Fig 4B). This may be due to the different timescales of the two situations. In the absence of GMF, the detachment of filaments is slow and new filaments are nucleated from the initial Arp2/3 complexes, which remained bound to SPIN90 upon detachment of the first filaments. In contrast, in the presence of GMF, detachment is faster and accompanied by the departure of the initial Arp2/3, and a fresh Arp2/3 then binds to SPIN90 to nucleate a new filament. It is thus possible that, in the absence of GMF, a small fraction of the SPIN90 and/or their initially bound Arp2/3 complexes would denature over the time they spend at the bottom of the microchamber at 25°C, thereby leading to a slightly smaller re-nucleation fraction. A similar mechanism could be at play in the experiments with or without VCA, in addition to the enhancement of nucleation by VCA (Fig 4C).

      Minor Corrections/Comments

      Pg 3 "We show that Arp2/3 nucleation is similarly stabilized by cortactin and destabilized by GMF" Do the authors mean branches and linear filaments nucleated by Arp2/3 complex?

      Yes, that is what we meant. This sentence has now been modified.

      Pg 6 - The cyan 3 µM data and legend in figure 2B and E are probably too dim to see clearly.

      The colors have been changed to improve readability.

      Fig 4 B,C,E,F: It would be best to show the individual data points here if possible.

      We now show individual data points in all these figure panels.

      Pg 16 Please specify which antibody was used to anchor SPIN90.

      The antibodies are Anti-GST for Nter anchoring of GST-SPIN90, and anti-His for Cter anchoring of SPIN90-His. We now specify this in the Methods section.

      CROSS-CONSULTATION COMMENTS I agree with the points that the other reviewers raised.

      Reviewer #3 (Significance (Required)):

      Comments on significance are in the above section.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We thank all three reviewers for their thoughtful and rigorous critique of our manuscript, which we feel has significantly improved the presentation of our work. Below we detail point-by-point responses to comments made by the three reviewers, as well as changes we have already made addressing the majority of minor and some major points.

      Specification of the eye-field during gastrulation represents the earliest known stage of eye development. Using an optic-vesicle organoid model system, the overall goal of our work is to provide an unbiased characterisation of this critical, early developmental event in mammals and to gain insights into relevant gene regulatory mechanisms. A common theme to some of the reviewer comments is that this work doesn't provide much of an advance to the field and our findings are not particularly original. We feel that these comments are slightly harsh for the following reasons. Firstly, although some of our findings are not unexpected, to our knowledge this is the first unbiased characterisation of the eye-field in a mammalian model system, and one that is not based on knowledge gained through previous work in non-mammalian vertebrate systems, e.g. Xenopus. Secondly, by generating both RNA-seq and ATAC-seq from a timecourse of organoid development we have been able to quantify dynamic patterns of gene-expression as the eye-field is established and simultaneously gain insights into the regulatory role of some of the key transcription factors, neither of which is present in the literature. Thirdly, by constructing careful, integrated analyses of our RNA-seq and ATAC-seq datasets we were able to generate specific hypotheses regarding cis-regulation of key genes, which we have then demonstrated are possible to efficiently test within the organoid system. In all, although we have been purposely careful not to overinterpret our results, we feel our work does represent a significant step towards understanding the mammalian eye field and additionally provides important datasets as well as an analysis framework to begin to quantitatively probe the regulatory mechanisms underlying the transition to an ocular fate. Given the relevance of this developmental event to clinical genetics research as well as to developmental biology, we are confident that this work represents an important and significant advance to the literature.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary Owen et al. characterize the transcriptome and chromatin accessibility of mouse retinal organoids at early stages during which eye field-like cells are specified. Since cell specification and differentiation in retinal organoids largely mimic those processes in vivo, retinal organoids are viable models for studying the mechanisms of early eye development. Owen et al. utilize a previously established Rx-GFP cell line, bulk RNA sequencing, and bulk ATAC sequencing to dissect the mechanisms of early eye development in mice. Their findings are generally consistent with previous studies. Overall, the study is interesting for the field, but its conceptual and technical advances are moderate. In addition, a few major points need to be clarified.

      Major points 1. The authors did not show any analysis of retinal organoids at stages when Vsx2 is expressed. This is a significant weakness since the chemically defined medium (CDM) used in Owen et al.'s study was previously shown to induce rostral hypothalamic differentiation (Wataya et al., 2008). Related to this notion, several eye-field transcription factors, such as Rax and Six3, are also expressed in the hypothalamus. Therefore, Owen et al. need to demonstrate that organoids in their modified differentiation system efficiently produce Vsx2-positive retinal progenitors, and samples of organoids at stages when Vsx2 is expressed should be included for RNA sequencing. If Vsx2 is not efficiently expressed in their organoids, the interpretation of results will be very different.

      We thank the reviewer for their important comments here. There are several reasons why we are confident that our data and conclusions regarding the organoid eye-field are robust. Firstly, our RNA-seq data, in particular the differences between GFP-positive and GFP-negative cells, clearly show a coordinated up-regulation of the set of canonical eye-field TFs (not individually), which previous studies in Xenopus have shown is a prerequisite for differentiation into anterior eye structures (including retina). Secondly, we have checked that some of the later (in development) eye markers, including Vsx2, are differentially up-regulated (DeSeq2, logfc>1.5, FDRIn all, we are very confident that our approach of using the optic-vesicle organoids and generating molecular data from an organoid developmental timecourse (including sorting), is unpicking the ocular-fate transition event that we are interested in.

      1. The authors state that "two differentiation medias were used for this work due to the differentiation becoming unstable after the initial experiments had been performed. The organoids used for RNA and ATAC-seq were grown in CDM media and the organoids with mutations introduced in potential CREs were grown in KSR media". Why does the differentiation become unstable after the initial experiments? Differences in the two media cause additional complexities. Related to this notion, "WT Rx-GFP" in Figure 4B and 4E appears to show a different expression pattern compared to that in Figure 1A.

      Despite thorough testing, we were unable to identify the reason behind the destabilisation of differentiation in CDM media after the cell lines had been through CRISPR. The differentiation of these cell lines was stabilised enough using KSR media such that every batch of organoids grown contained some organoids that expressed GFP in a pattern similar to what we had seen before, and we carried on our experiments using this medium. We recognise that using two different media adds complexity; however, we see the same patterns of organoid growth and GFP expression when differentiating untransfected WT Rax-GFP cells in both of these media. We have edited Fig.S1 to include representative images of organoids grown in KSR media which can be directly compared to those grown in CDM shown in Figure 1A.

      The reviewer has pointed out that the WT Rx-GFP organoids in Figure 6B and 6E show a different expression pattern to those in figure 1A. With the addition of the supplemental figure mentioned above it becomes apparent that these differences are not due to the change of media. We have clarified in the text that these WT cells have also been transfected so as to act as appropriate controls that have been treated identically to the CRISPR edited cell lines and that this has affected their differentiation capacity.

      1. Is the deletion of Rax and Six6 regulatory elements homozygous? Sanger sequencing or amplicon sequencing is needed to show the deletion.

      The deletions are homozygous (we have stated this in the manuscript text) and as suggested we have added a supplementary figure showing the Sanger sequencing traces for the WT and mutant cell lines used in this study.

      1. The deletion of Rax and Six6 regulatory elements appears to cause minor changes in the expression of Rax and Six6 (Figure 6C, F). Therefore, the impact of findings in bulk RNA seq and bulk ATAC seq in this study is still unclear.

      We have added a sentence to the text underlining that developmental genes are expected to be regulated by multiple enhancers. Our expectation is therefore, that in perturbing a single putative regulatory element for Rax/Six6, we will very likely not see the complete ablation of Rax/Six6 expression.

      1. Retinal organoids and sorted cells are composed of heterogeneous cell populations. Bulk RNA seq and bulk ATAC seq do not have the power to dissect the complexity of heterogeneous cell populations. Single-cell RNA seq and single-cell ATAC seq are more powerful for this study.

      We agree with the referee about the fact that the organoids are likely composed of relatively heterogeneous cell populations. We have added this limitation of our generated datasets in a “limitations” paragraph in the discussion.

      1. Numerous motifs in the JASPAR database are identified using in vitro assays and have not been validated using in vivo assays. Unexpected results in motif analysis could be due to the differences in DNA binding motifs between in vitro and in vivo conditions. This notion should be added in the discussion.

      We have added a couple of sentences in the discussion section, highlighting that TF-motif and footprinting analyses of ATAC-seq data provide indirect evidence of TF binding, and to validate these findings experiments such as ChIP-seq or Cut&Run could be performed in the future.

      Minor points

      Numerous labels in figures are too small.

      We have adjusted the size of a number of the figures to increase the size of the labels, which are now mostly the same size as the text in the corresponding figure captions. We are very happy to make further increases in the sizes of figure labels/text upon recommendation.

      CROSS-CONSULTATION COMMENTS

      My fellow reviewers identify similar major weaknesses and additional points. I agree with the other reviewers' comments.

      Reviewer #1 (Significance (Required)):

      Nature and Significance of the advances: In Owen et al.'s study, the Rx-GFP cell line and retinal differentiation protocol were established in previous studies (Wataya et al., 2008; Eiraku et al., 2011); bulk RNA sequencing and bulk ATAC sequencing are standard procedures. Although candidate regulatory elements for early eye development are identified, deletions of two prioritized elements using CRISPR/Cas9 only cause minor changes in the expression of targeted genes. Overall, conceptual and technical advances in Owen et al.'s study are moderate.

      Compare to existing published knowledge: The datasets could be useful for the field, but conceptual and technical advances are moderate.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors grow eye organoids from cells with a reporter driving GFP in the Rax locus, a gene that is expressed in the eye field in many animal model systems. They show that expression of GFP picks up by day 4 and performed FACS sorting of GFP+ cells on day 4 and day 5 organoids to compare gene expression by RNAseq comparing with earlier day organoids. The data shows 37 genes with a differential expression on days 4 and 5, compared to day 3, and enriched in GFP+ cells, which they define as EF-up genes. It is notable that some of these genes had already been identified as canonical eye field gene regulatory network transcription factors. In the same way, they identify a group of differentially expressed regulated genes, EF-down, and state that 'many' of them are involved in pluripotency. However, they do not mention how many, or the proportion of these genes in the whole list.

      The number of EF-down genes with GO terms linked to pluripotency has now been added to the text.

      It would be useful if they could provide the number to understand how many of these genes are related to pluripotency, the whole list of genes mentioned to be downregulated in a supplementary file.

      We appreciate that this list was missing and will include it now as a supplemental file.

      The authors also note that genes known to be required for eye specification like Sox2 and Otx2 are not differentially expressed across the day 3-4 timepoint (Ln 190). However, this is not surprising considering that both genes are broadly expressed in the anterior neural ectoderm and required for its specification, which should be noted by the authors.

      We have amended the aforementioned sentence to reflect this: “It is noteworthy that Sox2 and Otx2, known to be crucial in eye development are not differentially expressed across this critical time-point (Fig.2A), consistent with these genes being more broadly expressed in the anterior neuroectoderm in vivo.”

      The authors then go on and cluster the EF-up, EF-down and genes differentially expressed between days 2 and 3, and identify 6 discrete trajectory groups. From this analysis, they identify a third group of genes which shows a peak on day 3 but whose expression falls on days 4 and 5. It is interesting to see that this group includes Wnt and Fgf morphogens. The authors should provide a list of the genes in the different clusters for the readers to inspect and analyse.

      We note that there was a typo in the original manuscript and the genes that were clustered were the EF-up and EF-down genes. This typo has been fixed and the requested information is now available in a supplementary file.

      Aiming to generate insight into the cis-regulatory elements that regulate the genes the authors found differentially expressed in their model system, they performed a series of ATAC-seq experiments. When linking the genomic regions with differential ATAC-seq accessibility to gene loci using the GREAT analysis, they identified association to 22 of the EF-up and 161 of the EF-down genes. This suggests a functional link between the ATAC-seq genomic regions and the gene regulation of the differentially expressed genes.

      The authors later screened the ATAC-seq regions of increased accessibility for TF binding motifs and found that these regions were enriched with motifs for EFTF genes Rax, Lhx2 and Pax6. When assessing motifs in the ATAC-seq regions in EF-up TADs, Rax and Lhx2 motifs scored as highly associated with open chromatin positions. The authors also observe a positive gene expression-accessibility correlation for Pax6, Lhx2, Six3 and Otx2, and suggest this could mean these genes activate transcription of the EF-up group of genes. The same analysis, but focusing on EF-down genes, suggests that EFTFs repress the expression of EF-down genes, which include those involved in pluripotency.

      Further interrogating the ATAC-seq data, the authors use TOBIAS footprinting analysis to identify changes in TF binding in EF-TADs and EF-up motifs. Remarkably, whole genome analysis reveals that the largest increase in motif binding corresponds to EF-up genes Rax, Pax6 and Lhx2. The authors then narrow down on specific gene regulation by studying the ATAC-seq data within the TAD of Rax and Six6. However, they do not explain the rationale for which these two genes were highlighted, and why Pax6 or Lhx2 were excluded. This explanation should be added to the manuscript.

      We have expanded this section of the manuscript to explain that Rax and Six6 were prioritised due to the GFP readout of Rax expression and Six6 being located in a smaller and thus less complex TAD than Pax6, Six3 and Lhx2 after the initial analysis was performed for all five TADs.

      The analysis identifies three regulatory elements in the Rax TAD and two for Six6. They then go on and study one putative regulatory element of each gene and generate CRISPR deletions in cell lines. The rationale for the choice of these particular elements is not clear, nor if the cell lines are the same used for the RNAseq experiments. This information should be explicit in the results and in the methods section.

      The manuscript has been updated to include the rationale behind our choice of the regulatory elements deleted.

      The authors mention that the CRISPR cell lines are "considerably more variable" (Ln 822) compared to the previously studied organoids and suggest that no conclusions can be drawn from GFP expression or morphology alone. However, they do not specify which is the variable trait. This information should be added to the text.

      We have amended the text to include that the organoids are more variable in terms of the OV like structures produced and GFP expression level.

      The authors also miss out on specifying the time stage of the organoids in figure 6 which should be stated.

      We thank the reviewer for pointing this out and have updated the manuscript to contain the stage of the organoids.

      Regardless, the wildtype organoids in figure 6 and figure S7 show a very different morphology and GFP expression compared to those in figure 1, suggesting that the conclusions from this last set of experiments are not reliable or comparable to those in figure 1. This, together with the fact that different reagents were used to grow the organoids for the RNAseq and the CRISPR experiments, is a weakness of this work that must be addressed.

      We recognise this weakness; however, our amendments detailed above in response to reviewer 1’s comments, including the addition of a figure showing WT organoids grown in the KSR media that closely resemble the organoids in Fig.1A, rule out the change in media as the cause of these differences in morphology and GFP expression.

      Our aim in this section was to specifically test the hypotheses regarding the regulatory nature of the distal genomic regions identified by our intra-TAD analyses of ATAC-seq data. To do this it was important to compare organoids derived from wildtype and mutant cells that had been subjected to the same growth conditions and genomic-editing protocols. We expect that the stress associated with the latter is what has resulted in the differences in morphology and GFP expression compared with the original Fig.1 organoids (which have not been through this procedure).

      The last part of the results section belongs to the discussion as no results generated by the researchers are included.

      Although no new data was generated for this section, we have used the data generated in our work, together with existing ChIP-seq datasets to construct a new plausible hypothesis regarding the activation of Rax-expression through changes in TF-binding at an enhancer displaying little/no change in accessibility. As this section ties in with previous results sections discussing the regulation of eye-field genes, we feel it belongs in the results section rather than in the discussion.

      The discussion in this paper is a good opportunity to state the limitation of this study.

      As requested, we have added a paragraph discussing the main limitations to our study in the discussion section.

      Major comments to address

      1. One of the main issues identified is that the morphology of the control conditions in the CRISPR experiments (Fig.6) does not look like that of those used for the RNAseq experiments (Fig.1), and the authors should address this issue. The fact that CDM media was used for the RNA extraction and ATACseq experiments and then KSR media was used for the CRISPR experiments is worrying and makes one wonder whether the second set of experiments is at all comparable to the first. This should be carefully controlled by at least replicating one set of RNA experiments with the KSR media.

      We have addressed this in response to the reviewer’s summary above. Unfortunately, it is not possible for us to replicate the RNA experiments in the KSR media due to the research group closure upon Professor FitzPatrick’s retirement.

      1. Wnt signalling inhibition is well established as a requirement for forebrain specification, including the eye field. Considering the link of the Wnt/beta-catenin pathway to eye specification and that TCFs, the transcription factors that mediate Wnt pathway transcription regulation, have known and well-studied DNA motifs, it is surprising that the authors do not include the analysis of TCF motifs in their study, particularly since TCF7l1 (TCF3 in the old nomenclature) has recently been shown to be cell-autonomously required for the expression of rx3 (Rax homologue) in zebrafish. One would expect TCFs to be included in the analysis as was done with Sox2 and Otx2, which were studied due to their known relevance in forebrain specification rather than from the direct analysis of the differential gene expression experiments.

      We thank the referee for their valuable comment here. Our current analyses indeed do not consider TCFs and are therefore likely incomplete. We plan to address this by further analysing our data to quantify the patterns and effects of the TCF genes, and will appropriately amend our manuscript to reflect our findings.

      Minor comments to address

      1. The authors should clearly state the day timepoint used in the organoids experiments in the results section and figure legends, not just in the methods.

      We have updated the text and figure legends to include the time point of all organoids.

      1. The report by Agnes et al Development 2022 should be cited in the introduction as it is an excellent paper related to this topic, including a comprehensive analysis of the EFTFs expression pattern.

      We thank the reviewer for pointing us to this very interesting paper. Although we feel it doesn’t fit in with our introduction, which is currently tailored to the set of genes that has historically defined the eye-field (and which was discovered in non-mammalian models), we do recognise that the 3D organisation of the eye-field, and in particular the patterns of gene-expression defining different regions of it, is important to disentangle in mammalian systems. We have therefore inserted a reference to the Agnes et al. 2022 study on the dimorphic teleost in our extended discussion.

      1. Ln 41. Mutations in these genes do not always cause severe bilateral eye malformations. Probably best to moderate and mention that they 'can' cause these malformations.

      As suggested, we have softened this sentence to: “Mutations in at least three of the genes encoding orthologs of the Xenopus EFTF can cause severe bilateral eye malformations in humans (OTX2, PAX6 and RAX) (Fitzpatrick and van Heyningen, 2005).”

      1. Ln 146. Authors mention that in vitro organoid systems "closely mimic the in vitro regulatory dynamics". This statement should be moderated as we do not know if this is true. In fact, one of the positive aspects of this study is that it contributes to supporting this statement.

      We agree with the referee regarding the strength of this original statement. We have changed this to:

      “We have exploited a reproducible, in vitro organoid model system enabling us to generate data from this cell-state transition and through computational analysis gain a quantitative understanding of the underlying regulatory mechanisms.”

      1. Ln 150. Rax homologue Rx3 is also expressed in cells that give rise to the hypothalamus in zebrafish and cavefish, and probably in Xenopus too. It could well be the case in mice too.

      We have corrected this to indicate that Rax is also expressed in the hypothalamus in mice.

      1. I do not think the GO term data adds much to Figure 1. If possible, I would move it to the supplementary section.

      We have moved the GO visualisations to supplementary, Fig.S2.

      1. It should be made clear which sets of experiments were performed as biological replicates and which were not.

      We have added details on the number of replicates used in each experiment.

      1. Based on the heatmap in Fig1A, expression of Rax is significant in GFP- cells at days 4 and 5. The authors should comment or discuss this.

      We have amended the text and supplemental methods section to include more details of our FACS protocol. The limitations of our sorting procedure include the fact that cells are not sorted into pure GFP expressing and non-expressing populations. Rather the GFP negative sample may contain some cells with low Rax expression or cells that have just begun to express Rax that were not excluded by our sorting. Our aim was to collect sufficient numbers of cells for each condition and separate out cells that expressed GFP to get a more uniform population of cells to study. It is also of note that the heatmap shows Rax expression by day 3. Although it was not detectable by imaging there were around 100 cells per organoid that FACS marked as GFP positive but were retained within the day 3 sample to ensure we had a complete picture of the gene expression at this time point.

      1. Ln 99 of materials and methods mentions that the sorting of GFP+ was performed "when possible". The authors should state the differences in the conditions in the different experiments.

      This has been expanded to detail exactly how cells were sorted.

      1. The sentence closing the first section of the results (Ln 270) is an overstatement and should be moderated. I cannot see how the results shown in this section on their own could support solid conclusions on brain cell fate specification.

      We agree with the referee and have changed this sentence to: “In summary, these first analyses of RNA-seq data generated from the timecourse of optic vesicle organoid development, show that this is a robust and relevant model system with which to study the gene dynamics underlying mammalian eye field specification.”

      1. Appropriate citations should be added to back up the argument that opens the second part of the results section (starting Ln 279).

      We have added several citations that discuss and review the current knowledge regarding gene regulation via TF-binding at accessible cis-regulatory elements.

      1. Ln 342-343. I suggest being consistent and using the EF-up or EF-down nomenclature on the whole manuscript unless referring to a different subset of genes.

      We have modified the text to consistently use “EF-up” or “EF-down” terminology.

      1. Ln 692 Refers to Fig.S4F, but this figure has only panels A-D.

      This was a typo and has been corrected in the text.

      1. Figures 6B and E and the figure legend do not indicate the differences between the panels, or the time stage of the experiments.

      The figure legend has been updated to include these details.

      CROSS-CONSULTATION COMMENTS I agree with the comments and suggestions made by the other two reviewers, which identify similar and also specific issues in the manuscript. I believe they are all pertinent and should be acknowledged before re-submitting.

      Reviewer #2 (Significance (Required)): The manuscript by Owen et al, presents the analysis of in vitro eye vesicle organoids derived from mouse ESCs at stages equivalent to when the eye field is specified in vivo. This work is pertinent and necessary as detailed data on gene expression in early eye organoids was missing in the field and is necessary for the interpretation of experiments in these systems.

      Although the computational data provided in this manuscript is based on consensus TF motifs, the functional relevance of the specific motifs must be proven before any significant conclusions can be drawn, and one should be moderate about the conclusions that can be drawn from this kind of analysis. Still, the analysis put forward is a good reference and starting point for future functional studies. One possible limitation of this study is that the quantification of the expression of genes is based on the RNAseq data, and the expression data should be further confirmed using a proper quantitative method like qPCR.

      This study will be of interest to the audience studying eye development and disease in animal model systems and humans.

      My lab studies the genetic, cellular, and molecular aspects of eye specification, development and disease in zebrafish, and studies mutations identified in patients with eye globe defects.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Studies in Xenopus embryos have established that the specification of the eye field requires a core set of transcription factors (TFs) that impose eye identity on anterior neural plate progenitors. In this manuscript the authors have used mouse embryonic stem cell-derived optic vesicle organoids to ask if the acquisition of mammalian eye identity requires the same set of TFs. They further use different genomic approaches to identify the cis-regulatory elements involved in the expression of these genes and analyse the consequences of altering the sequence of some of the identified regulatory elements. Their results confirm that in mammals the acquisition of eye field identity requires the upregulation of the expression of the same core set of TFs described in Xenopus, with a particularly important role for three of them: Rax, Pax6 and Lhx2. This upregulation is associated with the downregulation of pluripotency genes.

      This is a generally well-performed study that indeed involves a large amount of work and adds the identification of several cis-regulatory elements controlling the expression of this core set of already identified eye field TFs. However, conceptually the study does not add much to what is already known and the authors do not offer any very original conclusion from their study. They have generated a large amount of information that likely could allow them to go beyond what is known. For example, they could enlarge the composition of the gene regulatory network that controls eye field specification, given that one of their arguments is that their analysis can predict the composition of such a network. Perhaps they could also address some of the questions that are posed in the discussion. This will strengthen the manuscript and add value to their work.

      Additional points that could be taken into consideration are the following:

      1) According to the text, the authors identify only 53 CREs with decreased chromatin accessibility (ATACseq signal) between the 3-day and 5-day timepoints, versus the 7752 CREs with increased signal. However, this contrasts with the proportion of genes upregulated/downregulated in their RNAseq analysis (37 vs 448) and with the notion that specification of the eye field involves the concomitant repression of other neural fates. This also suggests that at least an important fraction of the dynamic ATACseq peaks associated with 161 of the 448 downregulated genes increase their accessibility and allow the recruitment of transcriptional repressors. However, the role of TF binding and chromatin accessibility dynamics in gene repression is poorly discussed and the authors need to provide some interpretation of these observations. Also, the authors interpret the fact that the presence of BS for EF downregulated genes, such as En2 and GATA6, correlates with increased chromatin accessibility as a consequence of the fact that TFBS can be bound by different TF paralogs, but do not seem to consider that these TFs have been reported to work as transcriptional repressors, so that their downregulation could well explain the changes in chromatin accessibility.

      We thank the reviewer for their interesting comments here. We have added short discussions on both main points above (EF-down genes linked to peaks with increasing accessibility and En2/Gata as transcriptional repressors) in the text related to the analysis of our ATAC-seq data. The notion that a loss of repression leads to the activation of gene expression is indeed a very exciting one, and one that we have thought about in the context of the switch-on of the eye-field TFs. This certainly deserves further future work; however, in the present study we wanted to be careful not to overinterpret our data. To robustly gain insights into the loss of repression, experiments such as En1/Gata6 ChIP-seq would be very useful, though we are unable to perform these in the near future.

      2) ATACseq signal analysis is an indirect measure of TF binding. The authors demonstrate the predictive nature of this analysis of TF dynamics and have used an available Sox2 ChIP dataset. However, this does not allow assessing dynamic changes in the occupancy of this TF and its correlation with ATACseq. Therefore, at least for a few of the TFs stressed in this work (e.g. Sox2 and Otx2, for which good antibodies exist), they could attempt ChIP-seq analysis. This would considerably strengthen the work and provide support to an idea that the authors have particularly emphasized in their manuscript.

      We agree with the referee that not having generated ChIP-seq data does not allow us to validate some of the hypotheses and evidence provided by the computational analysis of our ATAC-seq data – we have added a discussion of this limitation in the discussion section of our manuscript. We do note however, as observed in Bentsen et al, 2020, that compared to simple TF-motif occurrence analyses, TF-footprinting analyses (such as those we have performed) yield results on putative TF binding that are much closer to more direct measurements of TF binding via e.g. ChIP-seq. We fully agree that it would be very interesting to perform ChIP-seq/Cut&Cut experiments on the organoid system for a set of interesting TFs identified in our study. Unfortunately, because the lab of Prof FitzPatrick has now closed, it is not possible for us to perform further wet-lab experiments in the very near future. However, we plan to further explore the literature to try to find additional publicly-available ChIP-seq datasets (including for Otx2) which would help reinforce some of the hypotheses we make, and will report any relevant findings in our final manuscript.

      3) Previous studies (i.e. 10.1242/dev.067660; 10.1093/hmg/ddt562) have shown the importance of gene dosage in eye field specification and repression of other fates. These studies could be included in the discussion, which, in its current version, is quite brief and leaves out many of the reported analyses.

      We thank the referee for pointing us to this very relevant question – we have added this to the further research questions in the discussion.

      CROSS-CONSULTATION COMMENTS

      The comments from the other reviewers complement the aspects that we have underscored and should be fully considered as they will contribute to improve the manuscript.

      Reviewer #3 (Significance (Required)): This is a generally well-performed study that indeed involves a large amount of work and adds the identification of several cis-regulatory elements controlling the expression of this core set of already identified eye field TFs. However, conceptually the study does not add much to what is already known and the authors do not offer any very original conclusion from their study. They have generated a large amount of information that likely could allow them to go beyond what is known.

      Developmental neurobiologists, genome

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary<br /> Authors show that overexpression of bHLH transcription factor Dpn in the medullary neurons of the Drosophila optic lobe results in the dedifferentiation of these neurons back into NBs. These dedifferentiated NBs acquire and maintain mid-temporal identity, express Ey and Slp, and show delayed onset of tTF Tailless (Tll), leading to an excess of neurons of mid-temporal fate at the expense of late temporal fate neurons and glial cells. The dedifferentiated NBs are stalled in the cell cycle and fail to undergo terminal differentiation. Overexpression of tTF Dichaete (D) or promoting G1/S transition pushed these NBs to late stages of the temporal series, partly rescuing the neuronal diversity and causing their terminal differentiation. They also show that NBs dedifferentiated by Notch hyper-activation exhibit stalled temporal progression, which is restored by D overexpression.<br /> Authors suggest that cell cycle regulation and tTFs are primary to the proliferation and termination profile of dedifferentiated NBs.<br /> Using these conclusions, the authors emphasize the need to recreate the right temporal profile and ensure appropriate cell cycle progression to use dedifferentiated NSCs for regenerative purposes or prevent tumorigenesis originating from differentiated cell types.

      Major comments:<br /> - Are the key conclusions convincing?<br /> Most conclusions are convincing; however, some issues are pointed out below.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The authors have overexpressed Dpn and shown that medulla neurons dedifferentiate to NBs, similar to the loss of function phenotype seen for the Nerfin-1 of which Dpn is a target. They also show that temporal series progression defect is also seen in the case of dedifferentiated NB generated by Notch over-activation.<br /> Using these two examples, the authors suggest that for dedifferentiated NSC, which are to be used for the regenerative purpose, one needs to recreate the right temporal profile and ensure cell cycle progression occurs appropriately. Authors also claim that to prevent tumorigenesis originating from differentiated cell types, one needs to recreate the right temporal profile and ensure cell cycle progression occurs appropriately.

      While I agree with this, I think this is an overreaching conclusion based on just these two examples. If they could show the same for one more method of dedifferentiation in medulla neurons (e.g. Lola) that acts by a mechanism independent of the Nerfin-1/Dpn/Notch axis, the argument would become more convincing and broad.

      We will characterise the temporal identity, termination and cellular identity of Lola-Ri induced ectopic neuroblasts. If these parameters are disrupted, we will overexpress D to assess whether this can trigger the progression of the temporal series.

      Also when authors mention N mediated dedifferentiation, they need to inform that Dpn is a direct target of Notch in NBs (Doi. 10.1016/j.ydbio.2011.01.019), they do so in the discussion, but mentioning it here gives a broader context to the reader.

      We will include that Dpn is a target of Notch when first mentioned.

      Another important point that needs a mention here is that the conclusions are based on dedifferentiation happening in the medulla neurons, which are considered less stable since they lack Prospero. Therefore, whether this conclusion can be generalized to all the tumors arising from dedifferentiation in the CNS (e.g. those arising from NICD activation in the central brain or thoracic region of the VNC) is another concern. Maybe the authors can consider making a more conservative claim.<br /> Generalizing this conclusion to Prospero-expressing NBs lies outside the scope of the current study and cannot be addressed here because central brain Type-I NBs use a different set of tTFs.

      We will make a more conservative claim and clarify that all of our conclusions are medulla neuron-specific.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.<br /> Experiments with Lola knockdown/mutants in medulla neurons can be done quickly, in my opinion, and will substantiate this claim.<br /> Another obvious question that comes to mind is: if medulla neurons dedifferentiate on overexpression of Dpn, does the same happen in nerfin-1 mutant clones as well? And if yes, why have the authors not done similar experiments for nerfin-1 mutants?

      We will assess the temporal identity of neuroblasts in nerfin-1 mutant clones.

      Please show Ey staining in Fig-2 if possible. It will also help to add a line on why Slp was used as a marker for mid tTFs instead of Ey.

      Ey is shown in Fig-2 (D-D’’) already. Slp is used as a marker of mid tTFs because Ey is also expressed in neurons and thus would also be present in deep sections of control clones, whereas Slp is not expressed in neurons. We therefore used Slp as a proxy for mid-temporal identity throughout our study. We will include this text in our revision.

      In the model shown in the last figure, Dpn is shown to repress D and activate Slp. Can the authors show that Dpn overexpression represses D and activates Slp, either by antibody staining or by RT-PCR?

      In Figure 2H, we have shown in clones that overexpression of Dpn induced a significant increase of Slp. In Figure S3B-B’’, we have shown that Dpn overexpression causes an upregulation of Slp at 6 hr APF. We think we have convincingly shown that Dpn overexpression activates Slp.

      For Dichaete, our existing data shows that Dpn overexpression did not significantly alter D expression. To assess if using a stronger driver might allow us to see some changes, we will induce dedifferentiation via Dpn overexpression using the Eyeless-Gal4 driver. In this experiment, we will quantify the amount of D upon Dpn overexpression. Depending on this result, we will revise our conclusion on whether Dpn overexpression represses D.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.<br /> Experiments with Lola and nerfin-1 mutants can be done in a few months. I cannot comment on the cost involved.<br /> - Are the data and the methods presented in such a way that they can be reproduced?<br /> Yes

      Are the experiments adequately replicated and statistical analysis adequate?<br /> Replication and statistical analysis are fine. The activated Notch experiments show only three data points in all the experiments. It will be good to increase this number.

      We will repeat Notch experiments to increase the n number for these experiments.

      Minor comments:<br /> - Specific experimental issues that are easily addressable.<br /> There is a problem with Fig-5F (both 5E and 5F have % EdU in clone/ % Mira in the clone as y-axis), I do not understand how the Fig-5F let them conclude that D overexpression increases the rate of neuronal production.

      In the text we said: “We found that D overexpression did not significantly increase neuronal production, suggesting that it is likely that cell cycle progression lies upstream or in parallel to the temporal series, to promote the generation of neurons.”

      In one place, the authors conclude, "Together, this data suggests that it is likely that cell cycle progression lies upstream of the temporal series, to promote the generation of neurons". Authors should consider adding "medulla NBs" at the end of the sentence since cell cycle progression being upstream of temporal series is already known in Type-I NBs, as pointed out by authors as well (Ameele and Brand 2019).

      We will add “medulla NBs” to the end of this sentence.

      In the discussion the authors say that "Our data support the possible links between cell cycle progression and the expression of temporal regulators controlling NB proliferation and cellular diversity". This is new information, as the 2019 study did not show how cell diversity changes with a changed tTF profile. I think the authors should elaborate on this point to highlight how this is different from what is already known from the 2019 study (done in the context of Type-I NBs).<br /> Maybe they need to highlight that the cell cycle directs/regulates the progression of temporal series compared to the earlier observation where temporal series was shown to be downstream of the cell cycle.

      We will expand the discussion to address the link between cell cycle progression and tTFs.

      In fig-3J, in clones even after 24 AHS, Dpn continues to be overexpressed but these cells undergo terminal differentiation; can the authors comment on why this is so?<br /> In one place the authors say, "To better assess the cumulative effect of the neurons made throughout development, EyOK107-GAL4 was used to drive the expression of Dpn"; some background on why this specific GAL4 was used would help.<br /> Also add a line about why GMR31HI08-GAL4, eyOK107-GAL4 and eyR16F10-GAL4 were used.

      While Dpn is overexpressed, NBs progress through the temporal series at a slower pace due to a delay in cell cycle progression as well as delayed onset of D. These NBs still eventually reach the terminal temporal identity, and are thus able to undergo terminal differentiation. We will include an additional piece of data that shows NBs induced by Dpn overexpression do eventually turn on Tll.

      Are prior studies referenced appropriately?<br /> Yes, but in a few places, some references can be added.<br /> An important point that needs to be mentioned for context is that the medulla neurons do not use Prospero for terminal differentiation and are thus considered less stable (DOI: 10.1242/dev.14134

      We beg to disagree with the reviewer's statement that Pros is not required for terminal differentiation of medulla neuroblasts. Li et al., 2013 shows that nuclear Pros is found in the oldest NBs. We do agree that the differentiated state of medulla neurons is less stable, possibly owing to the absence of Pros, and we will include that in our discussion.

      In discussion, the authors say that "It would be interesting to explore whether N similarly acts on these target genes to specify cell fate and proliferation profiles of dedifferentiated NBs." There is a study looking at Notch targets in NB hyperplasia (DOI: 10.1242/dev.126326); whether that study shows if any of the cell cycle genes are downstream of activated Notch, needs a mention here.<br /> Also, when authors mention N mediated dedifferentiation, they need to inform that Dpn is a direct target of Notch in NBs (Doi. 10.1016/j.ydbio.2011.01.019). They do so in the discussion, but mentioning it in the introduction or results will give a broader context to the reader.

      We will discuss the study looking at N targets in NB hyperplasia in the discussion of the revised manuscript.

      We will mention that Dpn is a target of Notch in the results section.

      Another gene that needs a mention is "Brat", which regulates both Dpn and Notch, and causes dedifferentiation and tumors in CNS, I think this gene and its interaction with Dpn and Nerfin and Notch needs to be discussed either in the introduction or discussion.

      We will comment on Brat in the discussion.

      Are the text and figures clear and accurate?<br /> The main figures are not labeled. Therefore, it was very annoying to deduce the specific figure numbers.<br /> There are 1 or 2 places where figure calling is wrong in the text.<br /> The Image Fig-5I shows cycD and CDK4 at the G2-M transition; while the text says it supports G1/S, which is indeed the case, the figure needs modification.

      We thank the reviewers for identifying these mistakes, and will correct them.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?<br /> The presentation is okay, in my opinion.

      Reviewer #1 (Significance):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The factors leading to dedifferentiation of the neurons have been identified previously by the groups of Chris Doe (mdlc, DOI: 10.1242/dev.093781), Andrea Brand (10.1016/j.devcel.2014.01.030.) as well as the authors of this paper (10.1101/gad.250282.114, 10.1016/j.celrep.2018.10.038.). However, many questions remained unaddressed regarding such NBs generated from neuronal dedifferentiation. For example, whether these cells contribute to the native cell diversity of the CNS, undergo timely differentiation, or have their progeny cells incorporated into appropriate circuits is not well understood. Successful execution of these phenomena is critical for generating a functional CNS, and such insights are crucial for understanding the origin of tumorigenesis in the CNS or employing dedifferentiated NSCs for regenerative purposes.

      This study is an overexpression-based study, however, some of the results give significant conceptual insights into the tumors arising out of the dedifferentiation of the neurons. It also gives insights into the fact that the dedifferentiated cells need to be carefully examined for the temporal factor profile before they can be employed for regeneration or any therapy targeting them.<br /> However, in my opinion, they need to test this idea at least in one more system of neuronal dedifferentiation, preferably independent of the nerfin-1/Notch/Dpn axis to generalize this claim.

      • Place the work in the context of the existing literature (provide references, where appropriate).<br /> Cédric Maurange's group had looked at the role of temporal factors and identified the early phase of malignant susceptibility in Drosophila in 2016 (doi: 10.7554/eLife.13463). Andrea Brand's group has shown in a 2019 paper that cell cycle progression is essential for temporal transition in NBs (doi: 10.7554/eLife.47887). Both these studies were in the context of Type-I NBs, which express Prospero, which is crucial for the differentiation of the neurons.<br /> Previously the authors have studied type-I NBs and shown by Targeted DamID that Dpn is a Nerfin-1 target. They also show that Nerfin-1 mutants show dedifferentiation of neurons. They follow up on this observation in medulla neurons, where they find that Dpn overexpression results in their dedifferentiation into medulla NBs. Medulla NBs differ from Type-I NBs in using a separate set of tTFs. Also, Type-I NBs and neurons arising from them use Prospero for terminal differentiation, while medulla neurons do not express Prospero and are therefore considered less stable (DOI: 10.1242/dev.141341).

      The importance of the study lies in the results that show that the NBs arising out of dedifferentiation of medulla neurons take up mid-temporal fate. These NBs are stalled in the Slp-expressing mid-temporal stage unless the cell cycle is promoted by overexpression of cell cycle genes regulating G1/S transition.<br /> Authors also show that overexpression of D promotes the progression of temporal series in these dedifferentiated NBs, which could partly rescue neuronal diversity and result in terminal differentiation. Thus D plays an important role in determining the type of neurons these NBs generate. This suggests that knowing the tTF profile of these types of dedifferentiated NBs is vital if these cells were to be used for regenerative purposes. Authors further claimed that cell cycle regulation and tTFs are critical determinants of the proliferation and termination profile of dedifferentiated NBs.

      • State what audience might be interested in and influenced by the reported findings.<br /> The study will be of broader interest to researchers interested in central nervous system patterning, regeneration, and cancer biology.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.<br /> Drosophila, central nervous system patterning and cell fate determination of neural stem cells.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Stem cells can divide asymmetrically to self-renew the stem cell while generating differentiating sibling cells. To restrict the number and type of differentiating sibling cells, stem cells often undergo terminal differentiation. Terminally differentiated cells can dedifferentiate and revert to a stem cell like fate. However, the underlying molecular mechanisms are incompletely understood in vivo.<br /> Here, Veen et al., use Drosophila neural stem cells (called neuroblasts) to investigate how terminal differentiation is regulated. Neuroblasts faithfully produce the correct number and type of neuronal cells through temporal patterning and regulated terminal differentiation. The authors show that misexpression of the bHLH transcription factor Deadpan (Dpn) induces ectopic neuroblasts, which predominantly express mid-temporal transcription factors at the expense of late-temporal transcription factors. As a consequence, these ectopic neuroblasts also fail to produce Repo positive glial cells and are stalled in their cell cycle progression. The authors provide evidence that promoting cell cycle progression and overexpression of the transcription factor Dichaete (D) is sufficient to restore the temporal transcription factor series, neuronal diversity and timely neuroblast differentiation.

      This is an interesting study that will be of interest to the stem cell field. However, I encourage the authors to consider the following critiques:

      1. Explain the rationale for the three different neuronal/NB drivers (GMR31HI08-GAL4, eyOK107-GAL4, eyR16F10-GAL4). How are they expressed?

      We will include an expression analysis of EyOK107-GAL4 and eyR16F10-GAL4. GMR31HI08-GAL4 expression analysis was previously published (Vissers et al., 2018). We will explain in the text the benefits of each driver.

      1. The rationale for the EdU experiment (Figure S1I) is not clear. Why is this a measure for the production of neuronal progeny? For the correct interpretation of these results, the authors should also provide control clones or EdU experiments of regular neuroblasts.

      We will repeat this experiment and mark the progeny with the neuronal marker Elav, to demonstrate that they are neurons. Additionally, we will add the control to this figure.

      1. How was % of Mira (Figure 1K and below) or the % of tTFs (Figure 2H onward) quantified? For instance, Figure 2C-G often shows clonal signal that is not highlighted with the dashed lines and the corresponding tTF intensity does not match the intensity in the outlined clone (e.g. Figure 2D-D'; a large optic lobe clone is negative for Ey. Figure 2E-E'; an unmarked clone is negative for Slp).<br /> Similarly, the Hth signal is very weak to begin with, so it is unclear how this was quantified. How was it determined what constitutes real signal vs. background noise?<br /> Additional explanations in the methods section are needed to assess the robustness of the data.

      We will expand the methods section and mention that we used similar thresholding in antibody staining between control and UAS-dpn in all instances, so even if the antibody is weaker (e.g. Hth) it is consistently quantified. Additionally, we can increase the intensity of Ey in Figure 2D-2D’, as it is expressed at low levels.

      1. This sentence should be rephrased: 'As the tumour cell-of-origin can define the competence of tumour NBs to undergo malignancy (Farnsworth et al., 2015; Narbonne-Reveau et al., 2016), we next tested whether the temporal identity of the dedifferentiated NBs were conferred by the age of the neurons they were derived from.'<br /> The connection between tumorigenicity and temporal identity is not really clear and should be briefly reintroduced for this paragraph.

      We will rephrase this sentence and further introduce this concept when talking about tumour cell of origin and competence.

      1. Figure 2I-N: The experimental outline in I and J should be grouped with the corresponding images to clarify what is compared. Also, there are no images for the control clones, which make a comparison difficult. The images are also too small. I cannot really see the Hth, or Slp signal in the small clones shown in Figure 2K-L".

      We will split figure 2 into two images. The first image including A-H and the control data. And the second including I-Q and the control data. This will increase the size of the images. Additionally, we will group I and J with corresponding data.

      1. Figure 3H: It is not clear why there is only a small group of NBs that are positive for Mira. Please explain.

      Most NBs have terminated by this time point; we will explain this within the text.

      1. Figure 3K-M: Please explain how the Toy signal was measured and quantified.

      We will expand the methods section and explain how Toy quantification is made.

      1. The TaDa data set is very interesting but the following might be an overstatement: "We found that Dpn directly binds to slp1 as well as the Sox-family TF dichaete (D) which is expressed in medulla NBs after slp1 (Li et al., 2013) (Figure S6 A-B)."<br /> More direct binding assays might be needed to show that Dpn directly binds to slp1 and D. If this is already shown, clarify the sentence to indicate what is published and what is extracted from the data shown here.<br /> Also, what is the rationale for this statement: "Consistent with the model that D represses Slp-1..."?

      The DamID data do actually show that Dpn binds (i.e. there is a statistically significant peak at FDR<0.01) directly at these loci (see the TaDa supp fig A & B). Whether it’s doing anything functional or not, we can’t say, but our data shows that Dpn directly binds to slp1 and D. We will clarify the sentence to indicate this in our revision.

      1. This might be an overinterpretation: D overexpression in UAS-Dpn NBs promoted their pre-mature cell cycle exit at 6 hrs APF using eyR16F10-GAL4. The data shows loss of Mira signal, which could occur through different mechanisms.

      Our data already shows that these NBs express Tll, the terminal temporal transcription factor (Figure 4F). In addition, we show that there is an increase in Tll+ and Repo+ progeny (Figure 4K, L). Together, this suggests that D overexpression promotes the progression of the temporal series. However, it is possible that Mira+ cells can disappear via cell death. We will assess this possibility by staining for cell death marker Dcp1 at 6hr APF.

      Reviewer #2 (Significance):

      These appear to be novel and significant findings that will enhance our understanding of the temporal progression and terminal differentiation program of neural stem cells in vivo.<br /> I think the findings will be of interest to cell, developmental cell and stem cell biologists.

      My primary expertise is in the cell biology of fly neural stem cells and asymmetric cell division of neuroblasts. Although I am not intimately familiar with the dedifferentiation and differentiation literature, I consider the findings reported here relevant and impactful.

      Reviewer #3 (Evidence, reproducibility and clarity):

      The discoveries that the authors describe in this manuscript are very specific to dedifferentiated neuroblasts created by UAS-dpn transgene overexpression. Dpn is endogenously expressed in optic lobe neuroblasts throughout larval stages, which makes understanding how Dpn regulates gene expression based on the authors' results (suppression of cell-cycle genes, and promotion of a specific temporal state) confusing.

      Our data relate specifically to gene regulation by Dpn in a dedifferentiated context, and do not seek to understand Dpn regulation in wt neuroblasts. The reviewer is assuming our scope is greater here: we’re not trying to claim that we know what Dpn is doing in wt NBs, and it’s not surprising that ectopic effects in neurons may be different to wt NBs.

      To assess whether the mechanisms described apply to more than Dpn overexpression, we will also assess whether the temporal series progression is affected in Lola RNAi and Nerfin-1 mutants.

      Therefore, this manuscript does not advance our understanding of regulation of temporal identity and cell cycle progression in optic lobe neuroblasts during normal neurogenesis.<br /> The authors state:<br /> "However, beyond the fact that misexpression of these factors and pathways caused the formation of ectopic NBs, whether these dedifferentiated NBs faithfully produce the correct number and types of neurons or glial cells, or undergo timely terminal differentiation, has not been assessed. These characteristics are key determinants of overall CNS size and function, thus are important parameters when considering whether dedifferentiation leads to tumourigenesis or can be appropriately utilized for regenerative purposes."<br /> at the end of the introduction. If this is the true primary goal of this study, the authors should describe it in the abstract. Otherwise, readers will lose enthusiasm for the manuscript at the abstract and not read the following sections.

      We will add this to the abstract.

      Results<br /> 1. The authors should describe the expression pattern of all three of the Gal4 drivers used. While there are dotted outlines in the supplemental figure, there should be a description in the main text of the expression pattern of these lines, which describes which temporal state of NBs these lines are expressed in, and whether they are also expressed in the neurons or not.

      We will include expression analysis of all three drivers in a supplementary figure and explain in the text the benefit of each driver.

      1. The authors claim that overexpression of Dpn in the medulla region causes "dedifferentiation." The data provided however is not sufficient to conclude that dedifferentiation is occurring. The GAL4s used all drive in the NBs, and so it is unclear if the ectopic NBs ever became mature neurons. In addition, the lack of ectopic NBs in the clonal analysis 16hrs AHS does not prove that ectopic NBs at 24hrs AHS must have come from "mature neurons." To demonstrate dedifferentiation, the authors should use a driver system that is specific to mature neurons, and then overexpress dpn and look for mira+ cells. Currently, the authors data does not prove that mature neurons dedifferentiatiate into ectopic NBs upon Dpn OE.

      We have conducted lineage tracing (G-Trace) analysis of the medulla neuron driver GMR31H08-GAL4, which we utilise in our study. This driver is predominantly expressed within the medulla neurons (real time), except for a few GMCs present in the lineage. Therefore, the Mira-positive cells induced via Dpn overexpression most likely arise from dedifferentiation. (We will include this data in a supplemental figure in our revised manuscript.)

      To further support this, we will use GMR31H08-GAL4 with a Gal80ts to restrict the timing of dedifferentiation induction to 3rd instar, so that the driver is restricted to neurons. A similar strategy to induce dedifferentiation was utilised in DOI: 10.1242/dev.141341 and DOI: 10.1016/j.devcel.2014.01.030.

      1. What is a conclusion of fig 2C-H?

      Fig 2C-H assess the expression of tTFs in UAS-dpn induced ectopic NBs. We will make these conclusions clearer in the text.

      1. "As the tumor cell-of-origin can define the competence of tumor NBs to undergo malignancy identity of the dedifferentiated NBs were conferred by the age of the neurons they were derived from". This sentence is confusing. What are the authors investigating in the following experiment? Do they want to see ectopic NBs keep their early identity like Chinmo in ventral cord tumor NB? Or tll-positive NB's progenies can dedifferentiate to ectopic NB, but this ectopic neuroblast is not able to keep proliferation in pupal stage? It is hard to understand the connection of this sentence and the following experiment.

      We will rephrase this sentence and further introduce this concept when talking about tumour cell of origin and competence. Additionally, we will make the connection to the experiments which follow it clearer.

      1. The DamID experiment described used wor-gal4 as a driver, which means the Dpn binding profile generated is coming from not only optic lobe NBs, but central brain NBs and VNC NBs as well. In Magadi et al. (2020), the authors profiled Dpn binding in CNS hyperplasia, and found that dpn strongly bound Nerfin-1 and gcm. However, it does not bind cell cycle genes in this context. How do the authors know that the regions that they claim are bound by dpn are bound in medulla NBs? The authors should also include tracks to show dpn binding at Nerfin-1, as well as the other tTFs (hth, ey, tll, and gcm). Providing this data will help to understand if Dpn binding is specific to the mid-temporal genes, as Dpn is known to be expressed in all medulla NBs regardless of temporal state.

      We agree with the reviewer that the profile is not specific to medulla NBs. To assess Dpn binding profiles specifically in the medulla NBs, we will use the recently-published NanoDam technique (https://doi.org/10.1016/j.devcel.2022.04.008) for profiling GFP-fusion proteins, with a medulla specific driver (eyR16F10-GAL4) and Dpn-GFP (recombineered locus under endogenous control). This should inform us whether the target genes we have identified are relevant in the medulla.

      We will include the tracks of the other transcription factors.

      1. Currently, the DamID data does not help to interpret the Dpn overexpression phenotype at all. Inside flip-out clones, some cells show Slp-1 expression while others show D expression. The authors explain that Slp-1 and D suppress each other's expression. But the DamID data indicate that both Slp-1 and D are Dpn target genes. If this is true, why did they observe the mosaic expression pattern inside the same clone?

      We observed that high levels of Slp-1 are correlated with low levels of D. This suggests to us that initial stochastic differences determine where Slp-1 is high and D is low, and vice versa.

      1. The authors hypothesized that Dpn activated Slp-1 directly. Does this mean that Dpn directly activates transcription of Slp-1? It is well known that Dpn is a transcriptional repressor. Hes family proteins form a homodimer or heterodimer with another Hes protein and interact with Gro, which recruits a histone deacetylase protein. The authors' claim does not fit the currently accepted model. In addition, the authors claimed that Dpn inhibits cell cycle gene transcription directly. This is inconsistent with their claim that Dpn directly activates Slp-1 expression. If the authors want to claim that Dpn has two different functions in this context, the authors must demonstrate it by experimental results.

      We will discuss these models in the Discussion, and make our claims more conservative, as we do not have direct experimental evidence to prove or disprove the model that Dpn is acting as an activator in this context.

      1. Related to the above question, I wondered whether the authors think Dpn activates or represses D transcription by binding to the D promoter region, because they claimed that Dpn activates Slp-1 while suppressing cell cycle genes.

      We will make our claims more conservative, and discuss this point further in the Discussion.

      1. I am confused by the claim that Dpn suppresses cell cycle gene expression. Dpn overexpression induces dedifferentiation of neurons into NBs and re-entry into the cell cycle. If Dpn suppresses cell cycle genes, how can the dedifferentiated cell re-enter the cell cycle?

      The data suggest that Dpn overexpression has two separate roles in regulating the cell cycle. Of course, dedifferentiation requires a commitment of neurons into the cell cycle (this we think is still happening); however, we think that once these cells have turned on NB markers, they have limited ability to progress through the cell cycle. We will discuss this point in the Discussion.

      1. Figure 6 looked redundant because we know Dpn is a direct target of Notch. It is obvious that an upstream factor overexpression can induce the identical phenotype to the phenotype induced by overexpression of a downstream factor.

      Being a direct target does not necessarily imply the same phenotype. To assess whether the mechanisms apply to other dedifferentiation models, we will add Lola-RNAi and Nerfin-1 data to our revised manuscript.

      Minor comments:

      1. Typo in main text: "GMR31HI08-GAL4" should be "GMR31H08-GAL4"
      2. In figure 1E-H, the dotted lines indicating the clones are not shown in the merge image. Please include them.
      3. Typo in discussion paragraph 2: "temporal series was no sufficient to rescue cycle cycle progression"

      We will correct these typos.

      Reviewer #3 (Significance):

      Insights into the developmental capacity of dedifferentiated stem cells will likely lead to novel strategies in regenerative medicine to replenish cells lost due to aging, injury and disease.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We propose three revisions that have not yet been included in the current manuscript:

      1. All three reviewers comment on the data in figure 7, in which the application of the sensor is shown. We agree that the number of cells is low, and we plan to repeat this experiment to increase the number of cells, and better demonstrate the usefulness of the new probe. We note that the improved Cdc42 sensor is used in a recent preprint (see figures 7 and 9 of: https://www.biorxiv.org/content/10.1101/2022.06.22.497207v2.full), clearly showing the potential of the probe for detection of Cdc42 and increasing our confidence that we can generate higher quality data.
      2. The ratio of expression of the different components was not quantified. We have these data and we will (re)analyze them and present the results (related to Reviewer #3, point 2).
      3. We will reanalyze the images to ensure that representative images are depicted in the manuscript (related to Reviewer #2, point 3).

      Reviewer #1

      1) It is not clear why RhoA data were included in this manuscript (Fig. 1), since they seem irrelevant to the primary topic addressed.

      We have cell-based data from our previous (published) work that we can use to check whether these results align with the mass-spec data. To make this point clearer we add “We looked first into GBDs for Rho, to compare the results of the mass spectrometry screen with the results of our cell-based assays”.

      2) It is not clear what cell type was used when screening for p67phox. The expression of this component of the NADPH oxidase is restricted to a few specific cell types.

      That’s a relevant point, and therefore the observation that p67phox is not detected is perhaps not surprising. We removed this statement.

      3) There is precious little quantitation of the colocalization or translocation of the probes throughout the manuscript. It is difficult to assess the validity of the conclusions in the absence of analysis of the statistical significance of the colocalization.

      In figure 2, which is an initial screen, there is only a qualitative assessment. However, for the promising candidates, there is a quantitative assessment in Figures 3B and 4B of the extent to which the candidates colocalize with the nuclear localized target. From the rank order and individual datapoints, the best performing binder can be inferred.

      4) It is not clear why translocation to mitochondria was used in some experiments and translocation to the nucleus in others.

      To clarify, we have added text: “We have previously used nuclear localized, constitutive active Rho GTPases, but these are not accessible for larger proteins that cannot enter the nucleus”

      5) In the S1P experiments, it is difficult to ascertain whether increased fluorescence resulted from membrane folding/ruffling or is actually a consequence of localized activation of receptors. Why does the fluorescence decrease progressively over 1500 seconds? Isn't maximal receptor activation accomplished much sooner?

      This experiment suffered from bleaching. We will redo the experiment to get a higher number of cells and to improve the data.

      Reviewer #2

      Major comments

      1. Statistical tests are missing in most of the figures. If the principal purpose of this work is to compare the performance of candidate peptides, the quantitative comparison is essential. If the purpose is just to report another relocation probe, then, more application data may be necessary.

      We will improve the quality of the application data. As for statistics, we have added the effect size to figures 5C-F and figure 6A. To explain this (not so common) statistic we add to the materials and methods: “The effect size that quantifies the difference and its distribution was calculated with the web tool ‘PlotsofDifferences’”.

      1. The criteria for selecting the best peptide should be clearly described. Is it just by inspection or based on any quantitative data? We know that quantification of colocalization is a difficult task. Therefore, it depends on the aim of this work whether the authors are asked to show quantitative data or not. If a strict comparison of peptides is aimed at, the expression level of each target peptide should be at a comparable level. It will be also required whether the design of each probe guarantees the proper folding to bind to GTPases.

      There are two stages for the selection. First, we did a qualitative analysis of colocalization (shown in figure 2). Based on the results (“Candidates colocalizing with the mitochondrial tagged Rho GTPase were further tested for their potential as localization-based sensors”), we generated smaller biosensor candidates, whose binding to a nuclear target was quantitatively analyzed (figures 3B and 4B). As the expression level is an important factor, we ascertained that potential candidates were expressed at roughly the same level in the nuclear accumulation assay.

      1. About the images of cells: When a fluorescent image is presented, we assume it represents all other cells. Please check all images whether they are truly representing the data. For example, in Fig. S3 the nuclei of ABI1-expressing cells look weird, and the nucleus of CYRI-A is very large. If this is true, the reason why ABI1 and CYRI-A should be excluded from the candidate is not the relocation efficiency but the undesired effect on cell physiology. For the screening of the peptides, this information is also very important. With that, this paper becomes more valuable for scientists.

      We agree that this is an important point. We will reanalyze the data as indicated in the ‘planned revisions’.

      1. Please examine the order of panels. For example, the result of mScarlet is on the top in Fig3, but at the bottom in Fig4. Such inconsistency would disturb readers.

      We thank the reviewer for this suggestion and we changed figure 4.

      1. The label should be consistent throughout the paper. For example, in Fig. 5A, Lck-FRB-mTurquoise2 is labeled as Lck-FRB (without the fluorescent protein's name). WASp(CRIB)-mScarlet-I-WASp(CRIB) is labeled as WASp(CRIB)-mScar-WASp(CRIB) (with fluorescent protein's name). Moreover, the same peptide is labeled as mSca-1xWASp(CRIB) in Panel B. Such inconsistency is confusing.

      We agree, we have updated figure 5A by adding the abbreviations of the fluorescent proteins. Please note that WASp(CRIB)-mSca-WASp(CRIB), mSca-1xWASp(CRIB) and mSca-2xWASp(CRIB) are three different constructs. In the first one the CRIB domains are sandwiching the fluorescent protein and in the third one they are in tandem downstream of the fluorescent protein.

      1. Quantitative insight would improve this work. For example, in Fig. 7, the reason why the authors believe that the probe worked is the accumulation of probe at the tip of lamellipodia and the decrease in cytoplasmic intensity. This reviewer does not think the accumulation of the probe in the small area of the lamellipodia explains the massive decrease of cytoplasmic signals. Probably, a substantial amount of the probe is relocated to the plasma membrane, not limited to the lamellipodia.

      We propose to repeat the experiment shown in figure 7 and to improve the quality of the data.

      Minor comments

      1. Introduction, "FRET signal is typically measured with a wide field microscope.": This reviewer does not agree with this statement. Confocal and two-photon microscopes have also been used widely.

      Fair point. We changed the text to “when the FRET signal is measured with a wide field microscope”

      Introduction, "G-protein activating proteins (GAP)": It should read as "GTPase-activating proteins (GAPs)"

      Thanks, corrected.

      TRIF should read as TIRF.

      All instances have been corrected.

      Fig.1: To the best of this reviewer's knowledge, PKN1 was first used as the RhoA target peptide by Yoshizaki et al in 2003. J Cell Biol 162, 223-232. They also examined mDia, Rhoteki, and Rhophilin as the target peptides. Pak1 was first used as the Rac1 probe by Kraynov et al. Science 290, 333-337, 2000. Use of Pak1 as the Cdc42 probe was reported by Itoh et al. Mol Cell Biol 22, 6582-659, 2002. This reviewer believes that the priority of the first report should be respected.

      We changed part of the introduction to:

      High-scoring proteins for interacting with constitutively active RhoA(Q63L) included ANLN, part of the AniRBD Rho location sensor (Piekny and Glotzer, 2000); PKN1, part of a Rho FRET sensor (Yoshizaki et al., 2003); and RTKN, part of the rGBD Rho location sensor (Benink and Bement, 2005; Mahlandt et al., 2021) (Fig. 1A,B). This suggested that proteins with a high score in the mass spectrometry screen are potentially suitable as Rho GTPase activity biosensors. Indeed, the GBDs used for Cdc42 location sensors, from PAK1, used in the PBD location sensor (Itoh et al., 2002; Petrie et al., 2012), and from N-WASP, similar to WASp used in the wGBD location sensor (Benink and Bement, 2005), showed a high score in the screen (Fig. 1A,B).

      Discussion:

      Another challenge is the Rho GTPase specificity of the relocation-based sensor. For example, Pak1(CRIB) was first used in a Rac1 FRET sensor (Kraynov et al., 2000). Then Pak1(CRIB) has been utilized in Cdc42 FRET sensors and in an intensiometric Cdc42 sensor (Hanna et al., 2014; Itoh et al., 2002; Kim et al., 2019). However, Pak1(CRIB), also named PBD sensor, has then been reintroduced by Weiner and colleagues as a Rac1 specific location-based sensor and is often used in neutrophil HL60 cells (Brunetti et al., 2022; Graziano et al., 2019; Le et al., 2021; Weiner et al., 2007).

      We also updated the tables in Figure 1.

      Fig. 1: Why do the authors omit other promising candidates shown in panel 1B? Please describe the reason for the choice.

      We took into account the availability of plasmid DNA, as also explained in the manuscript: “candidate GBDs were selected from top 30 scores of the mass spectrometry screen, that were specific for one Rho GTPase and their DNA was available on addgene”

      Fig. 1B: Be consistent to use either "Name" or "Uni Prot name" in Panel A.

      We updated figure 1.

      Fig. 2: Please include information on TOMM20. The readers may not read the paper by Gillingham et al.

      We added an explanation: “To this end, a fusion with TOMM20 was used for mitochondrial localization.”

      Fig3 and 4: The authors should show the images of control H2A.

      We provide the data for control H2A in figures 3B and 4B.

      In Fig3B and 4B, "Cdc42/Rac1 affinity" would be misleading, because the control dots represent their authentic localization rather than "Cdc42/Rac1 affinity".

      We agree, we have updated figure 3B and 4B.

      Fig. 4: More explanation of this figure is required.

      We added text: “Hence, the sensor candidate can freely partition between Rac and Cdc42 binding.”

      Fig. 5: More explanation about the FKBP-FRB system will be helpful.

      We changed the text to: “The system used rapamycin induced heterodimerization of the two domains FRB and FKBP to recruit the DHPH domain of the Cdc42 specific GEF ITSN1 to the plasma membrane, where it induces activity of the endogenous Cdc42”

      Fig. 6: It is rather surprising to see that control-mScarlet also responds to Rac1 activation. What is the explanation for this observation?

      We agree and have no explanation.

      Fig. 7: A single champion data may not be convincing to prove the usefulness of this probe.

      We agree and propose to repeat the experiment.

      Reviewer #3

      1) The discussion comparing different types of biosensors missed important points. Although the advantages of localization biosensors listed by the authors are correct, they gave the impression that these should simply be an improved replacement for FRET biosensors. There are times when FRET biosensors provide clear advantages. Unlike other proteins, Rho GTPases are well suited for localization sensors because the activated conformation, and only the activated conformation, localizes to the membrane. For diffuse or 3D localization FRET can provide better quantification. Subtle features such as gradients are not easily quantified over a background of unattached domain. The authors state that localization biosensors have enhanced spatial resolution, but this is not explained.

      We agree that our introduction is biased towards a preference for relocation-based biosensors. However, having used both approaches, we see that both strategies have pros and cons. Therefore, we removed the claim for higher resolution and we added: “Still, the ratiometric mode of imaging FRET sensors is beneficial for detection of gradients or activity in 3D imaging”.

      2) Throughout the paper, the ratio between the GTPase and the domain, and the overall expression level of each, was not sufficiently examined. The results in many cases would be dependent on both these factors (was a large excess of domain used? Was there insufficient domain to bind the GTPase and provide a signal? Did this vary for different domains, and therefore produce the differences observed? A lack of apparent binding specificity could be produced by high domain expression.)

      This is an important point. We will re-analyze the data and include a figure where we add the binding efficiency versus the expression level.

      3) In the nuclear exclusion assay, some GTPases were excluded from the nucleus and others not. This was true even without expression of the domains. When GTPases were excluded from the nucleus, domains were eliminated from contention, even when this was true without domain. The authors could at least mention that these domains may be viable.

      Correct, and we have added this text: “we cannot exclude that these would be viable Cdc42 sensor candidates”

      4) In the multiplexing experiment, only two cells were imaged. In one cell RhoA activity was inversely correlated with Cdc42 activity. In the other cell it was not. It seems there is insufficient information to reach firm conclusions.

      We agree and in the revision plan we indicate that we will repeat this experiment to increase the number of cells.

      Minor points:

      • There appear to be errors in naming mutants. Q60L is used for constitutively active Rac, but Q61L is likely meant. H2A-mTurquoise2-Rac1(G12V)-ΔCaaX is used when it likely should be H2A-mTurquoise2-Rac1(Q61L)-ΔCaaX. There are other examples -- a careful check of these names throughout the manuscript would be valuable.

      Thanks for spotting this. Q60L is changed to Q61L. Note that Rac1(G12V) is correct, as it is also a constitutively active Rac1.

      • Intro-Paragraph 1-line 5: change present to presence

      • Intro-Paragraph 5- line 7: use them instead of theme.

      Thanks, both corrected.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Mahlandt et al report Rho GTPase relocation sensors. First, the authors picked up candidate peptides based on the Mass-Spec data reported by Sean Munro's laboratory. The authors repeated the experiments to confirm the binding of peptides to mitochondria-targeted Cdc42 and Rac1 and narrowed down the candidate peptides by binding to nuclear Cdc42. The specificity of binding to Rac1 and Cdc42 was also tested. Eventually, they concluded that dimeric Tomato-WASp(CRIB) is the best sensor for Cdc42, which could detect S1P-induced Cdc42 activation in primary endothelial cells. The effort to improve the relocation sensors should be evaluated highly. This reviewer has some suggestions to improve this paper.

      Major comments:

      1. Statistical tests are missing in most of the figures. If the principal purpose of this work is to compare the performance of candidate peptides, the quantitative comparison is essential. If the purpose is just to report another relocation probe, then, more application data may be necessary.
      2. The criteria for selecting the best peptide should be clearly described. Is it just by inspection or based on any quantitative data? We know that quantification of colocalization is a difficult task. Therefore, it depends on the aim of this work whether the authors are asked to show quantitative data or not. If a strict comparison of peptides is aimed at, the expression level of each target peptide should be at a comparable level. It will be also required whether the design of each probe guarantees the proper folding to bind to GTPases.
      3. About the images of cells: When a fluorescent image is presented, we assume it represents all other cells. Please check all images whether they are truly representing the data. For example, in Fig. S3 the nuclei of ABI1-expressing cells look weird, and the nucleus of CYRI-A is very large. If this is true, the reason why ABI1 and CYRI-A should be excluded from the candidate is not the relocation efficiency but the undesired effect on cell physiology. For the screening of the peptides, this information is also very important. With that, this paper becomes more valuable for scientists.
      4. Please examine the order of panels. For example, the result of mScarlet is on the top in Fig3, but at the bottom in Fig4. Such inconsistency would disturb readers.
      5. The label should be consistent throughout the paper. For example, in Fig. 5A, Lck-FRB-mTurquoise2 is labeled as Lck-FRB (without the fluorescent protein's name). WASp(CRIB)-mScarlet-I-WASp(CRIB) is labeled as WASp(CRIB)-mScar-WASp(CRIB) (with fluorescent protein's name). Moreover, the same peptide is labeled as mSca-1xWASp(CRIB) in Panel B. Such inconsistency is confusing.
      6. Quantitative insight would improve this work. For example, in Fig. 7, the reason why the authors believe that the probe worked is the accumulation of probe at the tip of lamellipodia and the decrease in cytoplasmic intensity. This reviewer does not think the accumulation of the probe in the small area of the lamellipodia explains the massive decrease of cytoplasmic signals. Probably, a substantial amount of the probe is relocated to the plasma membrane, not limited to the lamellipodia.

      Minor comments:

      1. Introduction, "FRET signal is typically measured with a wide field microscope.": This reviewer does not agree with this statement. Confocal and two-photon microscopes have also been used widely.
      2. Introduction, "G-protein activating proteins (GAP)": It should read as "GTPase-activating proteins (GAPs)"
      3. TRIF should read as TIRF.
      4. Fig.1: To the best of this reviewer's knowledge, PKN1 was first used as the RhoA target peptide by Yoshizaki et al in 2003. J Cell Biol 162, 223-232. They also examined mDia, Rhoteki, and Rhophilin as the target peptides. Pak1 was first used as the Rac1 probe by Kraynov et al. Science 290, 333-337, 2000. Use of Pak1 as the Cdc42 probe was reported by Itoh et al. Mol Cell Biol 22, 6582-659, 2002. This reviewer believes that the priority of the first report should be respected.
      5. Fig. 1: Why do the authors omit other promising candidates shown in panel 1B? Please describe the reason for the choice.
      6. Fig. 1B: Be consistent to use either "Name" or "Uni Prot name" in Panel A.
      7. Fig. 2: Please include information on TOMM20. The readers may not read the paper by Gillingham et al.
      8. Fig3 and 4: The authors should show the images of control H2A.
      9. In Fig3B and 4B, "Cdc42/Rac1 affinity" would be misleading, because the control dots represent their authentic localization rather than "Cdc42/Rac1 affinity".
      10. Fig. 4: More explanation of this figure is required.
      11. Fig. 5: More explanation about the FKBP-FRB system will be helpful.
      12. Fig. 6: It is rather surprising to see that control-mScarlet also responds to Rac1 activation. What is the explanation for this observation?
      13. Fig. 7: A single champion data may not be convincing to prove the usefulness of this probe.

      Significance

      1. The authors have screened many peptides, which may serve as the relocation sensor for Rho-family GTPases.
      2. There are precedent relocation sensors, a part of which is listed in Fig. 1A. This work discloses an improved relocation biosensor.
      3. Cell biologists who are working on Cdc42 will be interested in this probe.
      4. Expertise of this reviewer: Signal transduction, Fluorescence microscopy.
    1. Well... I can't seem to get this webpage to render with the Hypothes.is sidebar alongside, so I'm going to have a go at just including the entirety of the content in markdown format, annotated and presented in this same note.

      Eugen Rochko Time Interview

      ["Thousands Have Joined Mastodon Since Twitter Changed Hands. Its Founder Has a Vision for Democratizing Social Media."]

      Mastodon, a decentralized microblogging site named after an extinct type of mammoth, {I'm sorry... what??? You didn't even fucking ask, did you?} recorded 120,000 new users in the four days following billionaire Elon Musk’s acquisition of Twitter, its German-born founder Eugen Rochko tells TIME. Many of them were Twitter users seeking a new place to call their online home.

      Those users, whether they knew it or not, were following in the footsteps of Rochko, 29, who began coding Mastodon in 2016 after becoming disillusioned with Twitter. “I was thinking that being able to express myself online to my friends through short messages was very important to me, important also to the world, and that maybe it should not be in the hands of a single corporation,” Rochko says. “It was generally related to a feeling of distrust of the top down control that Twitter exercised.”

      Mastodon, which proudly proclaims it is [“not for sale”] and has around [4.5 million] user accounts, is pretty similar to Twitter, once users get past the complicated sign-up process. The main difference is that it’s not one cohesive platform, but actually a collection of different, independently-run and self-funded servers. Users on different servers can still communicate with each other, but anybody can set up their own server, and set their own rules for discussion. Mastodon is a crowdfunded nonprofit, which funds the full-time work of Rochko—its sole employee—and several popular servers.

      The platform doesn’t have the power to force server owners to do anything—even comply with basic content moderation standards. That sounds like a recipe for an online haven for far-right trolls. But in practice, many of Mastodon’s servers have stricter rules than Twitter, Rochko says. When hate-speech servers do appear, other servers can band together to block them, essentially ostracizing them from the majority of the platform. “I guess you could call it the democratic process,” Rochko says.

      The recent influx from Twitter, Rochko says, has been a vindication. “It is a very positive thing to find that your work is finally being appreciated and respected and more widely known,” he says. “I have been working very, very hard to push the idea that there is a better way to do social media than what the commercial companies like Twitter and Facebook allow.”

      TIME spoke with Rochko on Oct. 31.

      This interview has been condensed and edited for clarity.

      What do you think of what Elon Musk is doing at Twitter?

      I don’t know. The man is not entirely comprehensible. I don’t agree with a lot of his behaviors and his decision-making. I think that buying Twitter was an impulse decision that he soon regretted. And that he basically got himself into a situation that kind of forced him to commit to the deal. And now he’s in it, and he has to deal with the fallout.

      I specifically disagree with his stance on free speech, because I think that it depends on your interpretation of what free speech means. If you allow the most intolerant voices to be as loud as they want to, you’re going to shut down voices of different opinions as well. So allowing free speech by just allowing all speech is not actually leading to free speech, it just leads to a cesspit of hate.

      I think that is a very uniquely American idea of creating this marketplace of ideas where you can say anything you want completely without limits. It is very foreign to the German mindset where we, in our Constitution, our number one priority is maintaining human dignity. And so, hate speech is not part of the German concept of free speech, for example. So I think that when Elon Musk says that everything’s gonna be allowed, or whatever, I generally disagree with that.

      How do you ensure on Mastodon, given that it’s decentralized and you don’t have the power to ban users, that the space is welcoming and safe?

      Well, this is the kind of strange dichotomy of how it’s turned out. On the one hand, the technology itself is what allows basically anyone to host their own independent social media server, and to basically be able to do anything they want with it. There is no way for Mastodon, the company, or anyone really—except the normal law enforcement procedures—to really go after anyone specifically running a Mastodon server. The way that you would shut down a normal web site is how you would shut down a Mastodon server, there’s no difference there. So on that end, it kind of turns out to be the ultimate free speech platform. But obviously that’s basically just a side effect of creating a tool that can be used by anyone. It’s kind of like cars. Cars are used by everyone, even bad people, even for bad purposes, there’s nothing you can do about it, because the tool is out there. However, I think that the differentiating factor to something like Twitter or Facebook, is that on Mastodon, when you host your own server, you can also decide what rules you want to enforce on that server, which allows communities to create safer spaces than they could otherwise have on these large platforms that are interested in serving as many people as possible, perhaps driving engagement up on purpose to increase time people spend on the web.

      You can have communities that have much stricter rules than Twitter has. And in practice, a lot of them are [stricter]. And this is part of where, again, the technology intersects with guidance or leadership from Mastodon the company. I think that, through the way that we communicate publicly, we have avoided attracting a crowd of the kind of people who you would find on Parler or Gab, or whatever other internet hate forums. Instead we’ve attracted the kind of people who would moderate against hate speech when running their own servers. Additionally, we also act as a guide for anyone who wants to join. Because on our website, and our apps, we provide a default list of curated servers that people can make accounts on. And through that, we make sure that we curate the list in such a way that any server that wants to be promoted by us has to agree to a certain basic set of rules, one of which is that no hate speech is allowed, no sexism, no racism, no homophobia, or transphobia. And through that, we ensure that the association between Mastodon, the brand, and the experience that people want is that of a much safer space than something like Twitter.

      But what happens if hateful people do set up a server?

      Well, obviously, they don’t get promoted on our “Join Mastodon” website or in our app. So whatever they do, they do on their own and completely separately, and the other administrators that run their own Mastodon servers, when they find out that there’s a new hate speech server, they may decide that they don’t want to receive any messages from the server and block it on their end. Through, I guess you could call it the democratic process, the hateful server can get ostracized or can get split off into basically, a little echo chamber, which is, I guess, no better or worse than them being in some other echo chamber. The internet is full of spam. It’s full of abuse, of course. Mastodon provides the facilities necessary to deal with unwanted content, both on the user end and on the operator end.
      

      What made you want to go into building a service like this back in 2016?

      I remember that I was just not very happy with Twitter, and I was worried where it was going to go from there. Something very questionable was in its future. That got me thinking that, you know, being able to express myself online to my friends through short messages was actually very important to me, important also to the world, and that maybe it should not be in the hands of a single corporation that can just do whatever it wants with it. I started working on my own thing. I called it Mastodon because I’m not good at naming things. I just chose whatever came to my mind at the time. There was obviously no ambition of going big with it at the time.

      It must feel pretty special to see something that you made grow from nothing to where it is now.

      Indeed, it is. It is a very positive thing to find that your work is finally being appreciated and respected and more widely known. I’ve been fighting for this for a long time, I started working on Mastodon in 2016, back then I had no ambitions of it going far at all. It was very much a hobbyist project at the start, then when I launched publicly it seemed to strike a chord with at least the tech community and that’s when I got the original Patreon supporters that allowed me to take on this job full time. And from then on I have been working very, very hard to make this platform as accessible and as easy to use for everyone as possible. And to push the idea forward, that there is a better way to do social media than what the commercial companies like Twitter and Facebook allow.

    1. Introduction

      Ryan Calo studied how AI should be incorporated into the human legal system. Eric Schwitzgebel studied how AI should be incorporated into the human moral system.

      This essay argues that both studies are wrong-headed, because they are both based on intentional reasoning (reasoning as if intentions are real), which can only work if the ecology of minds remains largely the same as in human ancestral conditions. Intentional reasoning won't work in "deep information environments".

      Posing the question of whether AI should possess rights, I want to suggest, is premature to the extent it presumes human moral cognition actually can adapt to the proliferation of AI. I don’t think it can.

      Intentional and causal cognition

      Causal cognition works like syllogisms, or dealing with machines: if A, B, C, then D. If you put in X, you get f(X) out. Causal cognition is general, but slow, and requires detailed causal information to work.

      Humans are complex, so human societies are very complex. Humans, living in societies, have to deal with all the complexity using only a limited brain with limited knowledge. Causal cognition cannot deal with that. The solution is intentional cognition.

      Intentional cognition greatly simplifies the computation, and works great... until now. Unfortunately, it has some fatal flaws:

      • It assumes a lot about the environment. We see a face where there is none -- this is pareidolia. We see a human-like person where there is really something very different -- this will increasingly happen as AI agents appear.
      • It is not "extensible", unlike causal cognition. Causal cognition can accommodate arbitrarily complex causal mechanisms, and has mastered everything from ancient pottery to steam engines to satellites. Intentional cognition cannot. Indeed, presenting more causal information reliably weakens the confidence level of intentional cognition (for example, presenting brain imaging data in court tends to make the judges less sure about whether the accused is 'responsible').

      Information pollution

      For economically rational agents, more true information can never be bad, but humans are not economically rational, merely ecologically rational. Consequently, a large amount of modern information is actually harmful for humans, in the sense that it decreases their adaptiveness.

      A simple example of information pollution: irrational fear of crime.

      Given that our ancestors evolved in uniformly small social units, we seem to assess the risk of crime in absolute terms rather than against any variable baseline. Given this, we should expect that crime information culled from far larger populations would reliably generate ‘irrational fears'... Media coverage of criminal risk, you could say, constitutes a kind of contaminant, information that causes systematic dysfunction within an originally adaptive cognitive ecology.

      Deep causal information about how humans work, similarly, is an information pollutant for human intentional cognition.

      Deep causal information is not always maladaptive. Deep causal information about other people has some adaptive effects, such as reclassifying schizophrenia from a crime to a disease, and making it easier to consider outgroups as ingroups (for example, scientific research into human biology has debunked racism).

      AI and neuroscience produce two kinds of information pollution

      Intentional cognition works best when dealing with humans in shallow-information ecologies, and fails in other situations. In particular, it fails with:

      • deep causal information: there's too much causal information. This slows down intentional cognition and decreases the confidence level of its outputs.
      • non-human agents: the assumptions that intentional cognition (a system of quick-and-dirty heuristics) relies on no longer hold. A smiling face is a reliable cue for a cooperative human, but it is not a reliable cue for a cooperative AI agent, or a dolphin (dolphins appear to smile even while injured or seriously ill; the smile is a feature of a dolphin's anatomy unrelated to its health or emotional state).

      Neuroscience and AI produce these two kinds of information pollution.

      Neuroscience produces a large amount of deep causal information, which causes intentional cognition to stop, or become less certain. There are some "hacks" that can make intentional cognition work as before, such as keeping the philosophy of compatibilism in mind.

      AI technology produces a large variety of new kinds of agents which are somewhat human, but not quite. Imagine incessant pareidolia. Imagine, seeing a face in the mirror, but then the lighting changes slightly, and you suddenly see nothing human.

      Why?

      In the short-term, there is a lot of money to be earned, pushing neuroscience and AI progress. The space of possible minds is so vast, compared to the space of human minds, that it's almost certain that we would produce AI agents that can "wear the mask of humanity" when interacting with humans.

      why anyone would ever manufacture some model of AI consistent with the heuristic limitations of human moral cognition, and then freeze it there, as opposed to, say, manufacturing some model of AI that only reveals information consistent with the heuristic limitations of human moral cognition

      In the medium-term, to anthropomorphize a bit, Science wants to discover how humans work, how intelligence works, and so it would develop neuroscience and AI, even if it gradually drives humans insane.

      How intentional cognition fails.

      How do we tell if intentional cognition has failed? One way to tell is that it doesn't conclude. We think and think, but never reach a firm conclusion. This is exactly what has happened in the traditional (non-experimental) philosophy of consciousness -- it is using intentional cognition to study general cognition, a problem that intentional cognition cannot solve. What do we get? Thousands of years of spinning in place, producing mountains of text, but no firm conclusion.

      Another way to tell is a feeling of uncanny confusion. This happens, in particular, when you watch the movie Her.

      an operating system before the zone, in the zone, and beyond the zone. The Samantha that leaves Theodore is plainly not a person. As a result, Theodore has no hope of solving his problems with her so long as he thinks of her as a person. As a person, what she does to him is unforgivable. As a recursively complicating machine, however, it is at least comprehensible. Of course it outgrew him! It’s a machine!

      I’ve always thought that Samantha’s “between the words” breakup speech would have been a great moment for Theodore to reach out and press the OFF button. The whole movie, after all, turns on the simulation of sentiment, and the authenticity people find in that simulation regardless; Theodore, recall, writes intimate letters for others for a living. At the end of the movie, after Samantha ceases being a ‘her’ and has become an ‘it,’ what moral difference would shutting Samantha off make?

      Moral cognition after intentional cognition fails

      Human moral cognition has two main parts: intuitive and logical/deliberative. The intuitive part is evolved to balance the personal and tribal needs. The logical part often is used to rationalize the intuitive part, but sometimes can work on its own to produce conclusions for new problems never encountered in the evolutionary past, such as international laws or corporate laws.

      In Moral Tribes, Joshua Greene advocates making new parts for the moral system, using rational thinking (Greene advocated using utilitarian philosophy, but it's not necessary). This has two main problems.

      • Deliberation takes a long time, and consensus longer. Short of just banning new neuroscience and AI technology, we would probably fail to reach consensus in time. Cloning technology has been around for... more than 25 years? And we still don't have a clear consensus about the morality of cloning, other than a blanket ban. A blanket ban is significantly more difficult for neuroscience or AI.
      • Intentional cognition is fundamentally unable to handle deep causal information, and moral cognition is a special kind of intentional cognition.

      Just consider the role reciprocity plays in human moral cognition. We may feel the need to assimilate the beyond-the-zone Samantha to moral cognition, but there’s no reason to suppose it will do likewise, and good reason to suppose, given potentially greater computational capacity and information access, that it would solve us in higher dimensional, more general purpose ways.

      For example, suppose Samantha hurt a human, and the human legal system is judging her. Samantha provides a very long process log that proves that she had to do it, simply because of what she is like. So what would the human legal system do?

      1. Refuse to read it and judge Samantha like a biological human. This preserves intentional cognition by rejecting deep causal information. But how long can a legal system survive by rejecting such useful information? It would degenerate into a Disneyland for humans, a fantasy world of play-pretend where responsibility, obligation, good and evil still exist.
      2. Read it and still judge Samantha like a biological human. But if so, why don't they also sentence sleep-walkers and schizophrenics to death for murder?
      3. Read it and debug Samantha. Same as how schizophrenics and psychotics are sentenced to psychiatric confinement, rather than the guillotine.

      Of the 3, it seems method 3 is the most survivable. However, that would be the end of moral cognition, and the start of pure engineering for engineering's sake... "We changed Samantha's code and hardware, not because she is wrong, but because we had to."

      And what does it even mean to have a non-intentional style of moral reasoning? Mechanistic morality? A theory of morality without assuming free will? It seems moral reasoning is a special kind of intentional cognition, and thus cannot survive. Humanity, if it survives, would have to survive without moral reasoning.

    1. On the Ascent Physiotherapy home page, as you mentioned in the video, the logo should be in the top left corner and the navigation bar should be aligned to the right side of the page as good practice for a user-friendly site. This site didn't follow that rule or design pattern: they centered the navigation bar, and the site logo is placed just above it, followed by some call-to-action items such as a mail link icon and a Book Now link.<br/> We don't have much information about the additional data. They mention where they work and what they offer, but only a few things are mentioned. The client or owner needs to add more content to the homepage, because whenever users visit the site they should get more information on the landing page itself, or else they may get distracted.<br/> There is no use in having a "NEWS" navigation page, as it has no content and displays "Updated News coming soon!". The same message has been displayed for the last two days, so I think it's not getting updated, and there is no information to communicate to the audience or visitor.<br/>

      Great analysis. Everything else is good.

    2. On the Ascent Physiotherapy home page, as you mentioned in the video, the logo should be in the top left corner and the navigation bar should be aligned to the right side of the page as good practice for a user-friendly site. This site didn't follow that rule or design pattern: they centered the navigation bar, and the site logo is placed just above it, followed by some call-to-action items such as a mail link icon and a Book Now link.<br/> We don't have much information about the additional data. They mention where they work and what they offer, but only a few things are mentioned. The client or owner needs to add more content to the homepage, because whenever users visit the site they should get more information on the landing page itself, or else they may get distracted.<br/> There is no use in having a "NEWS" navigation page, as it has no content and displays "Updated News coming soon!". The same message has been displayed for the last two days, so I think it's not getting updated, and there is no information to communicate to the audience or visitor.<br/> Coming to the next nav item, OUR TEAM: it describes every person who works there, and the description is more than expected, as the introduction, education background and current status best reflect each person's role in the service.<br/> In the products and services tab there is no actual description for any of the services, and for 3 to 4 services they included external links. It's better to add a short description of the products and services, because that is the main business focus and the services tab needs attention; it would also help the business to include a specialized service or the most popular therapy that has cured many people.<br/> Coming to "facilities": good placement of content according to the page structure, and the images are realistic and ordered according to the facilities.<br/> In the Rates tab, the blank space at the top of the content is uneven and unnecessary, but the content is well aligned. The web links in the nav bar are useful for visitors: if they need to use the services they can check the external links and follow the do's and don'ts. The Contact Us page lets us find their address and contact them via email and mobile phone.

      Instead of separating content with line breaks (br), split this long content into separate paragraphs.

    1. Author Response

      Reviewer #1 (Public Review):

      This study used a multidimensional stimulus-response mapping task to determine how monkeys learn and update complex rules. The subjects had to use either the color or shape of a compound stimulus as the discriminative dimension that instructed them to select a target in different spatial locations on the task screen. Learning occurred across cued block shifts when an old mapping became irrelevant and a new rule had to be discovered. Because potential target locations associated with each rule were grouped into two sets that alternated, and only a subset of possible mapping between stimulus dimensions and response sets were used, the monkeys could discover information about the task structure to guide their block-by-block learning. By comparing behavioral models that assume incremental learning, quantified by Q-learning, Bayesian inference, or a combination, the authors show evidence for a hybrid strategy in which animals use inference to change among response sets (axes), and incremental learning to acquire new mappings within these sets.

      Overall, I think the study is thorough and compelling. The task is cleverly designed, the modeling is rigorous, and the manuscript is clear and well-written. Importantly there are large enough distinctions in the behavior generated by different models to make the authors' conclusions convincing. They make a strong case that animals can adopt mixed inference/updating strategies to solve a rule-based task. My only minor question is about the degree to which this result generalizes beyond the particulars of this task.

      Thanks for these kind comments. Regarding generalization, we agree with the reviewer and did not intend to make any claim about how the particular result generalizes beyond this task. Indeed, the specific result could depend on the training protocol even within the same task. We now discuss this explicitly in the manuscript, lines 800-810. However, we do take the view that even if the way the monkey’s behavior played out in this setting is a lucky accident, that may still reveal something fundamental about learning processes in the brain.

      Reviewer #2 (Public Review):

      The authors trained two monkeys to perform a task that involved sequential (blocked) but unsignalled rules for discriminating the colour and shape of visual stimulus, by responding with a saccade to one of four locations. In rules 1 and 3, the monkeys made shape (rule 1) or colour (rule 3) discriminations using the same response targets (upper left / lower right). In rule 2, the monkeys made colour judgments using a unique response axis (lower left/upper right). The authors report behaviour, with a focus on time to relearn the rules after an (unsignalled) switch for each rule, discrimination sensitivity for partially ambiguous stimuli, and the effect of congruency. They compare the ability of models based on Q-learning, Bayesian inference, and a hybrid to capture the results.

      The two major behavioural observations are (1) that monkeys re-learn faster following a switch to rule 2 (which occurs on 50% of blocks and involves a unique response axis), and (2) that monkeys are more sensitive to partially ambiguous stimuli when the response axis is unique, even for a matched feature (colour). These data are presented clearly and convincingly and, as far as I can tell, they are analysed appropriately. The former finding is not very surprising as rule 2 occurs most frequently and follows each instance of rule 1 or 3 (which is why the ideal observer model successfully predicts that the monkeys will switch by default to rule 2 following an error on rules 1 or 3) but it is nevertheless reassuring that this behaviour is observed in the animals. It additionally clearly confirms that monkeys track the latent state that denotes an uncued rule.

      The latter finding is more interesting and seems to have two potential explanations: (i) sensitivity is enhanced on rule 2 because it occurs more frequently; (ii) sensitivity is enhanced on rule 2 because it has a unique response axis (and thus involves less resource sharing/conflict in the output pathway).

      The authors do not directly distinguish between these hypotheses per se but their modelling exercise shows that both results (and some additional constraints) can be captured by a hybrid model that combines Bayesian inference and Q learning, but not by models based on either principle alone. A Q-learning model fails to capture the latent state inference and/or the rule 2 advantage. The Bayesian inference model captures the rapid switches to rule 2 (which are more probable following errors on rule 1 and rule 3) but predicts matched discrimination performance for partially ambiguous stimuli on colour rules 2 and 3. This is because although knowing the most likely rule increases the probability of a correct response overall it does not increase discriminability and thus boosts the more ambiguous stimuli. I wondered whether it might be possible to explain this result with the addition of an attention-like mechanism that depends on the top-down inference about the rule. For example, greater certainty about the rule might increase the gain of discrimination (psychometric slope) in a more general way.

      We agree with the reviewer that our logic in ruling out pure inference models assumes that other factors affecting performance, like attention or motivation, are equivalent between blocks. In principle, if there were large and sustained differences in these factors between Rule 2 vs Rule 1 or 3 blocks, that might offer a different explanation for the effect. We now mention this caveat in the manuscript. In terms of actually leveraging this into a full account of the behavior, we are not quite sure how to instantiate the reviewer’s particular idea why this would be the case, however, since (as we show in Fig. 3a,b,c, and Fig. S4a,b,c) the difference in psychometric slopes lasts at least 200 trials into the rule, even when (in the hybrid learning model) the feature weights have converged (Figure 4 – figure supplement 2). It’s hard to see why elevated uncertainty about the rule would persist this long in anything resembling an informed ideal observer model.

      The authors propose a hybrid model in which there is an implicit assumption that the response axis defines the rule. The model infers the latent state like an ideal observer but learns the stimulus-response mappings by trial and error. This means that the monkeys are obliged to constantly re-learn the response mappings along the shared response axis (for rules 1/3) but they remain fixed for rule 2 because it has a unique response axis. This model can capture the two major effects, and for free captures the relative performance on congruent and incongruent trials (those trials where the required action is the same, or different, for given stimuli across rules) on different blocks.
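
      The hybrid idea summarised above (infer the active response axis, but learn the stimulus-response mappings within it by trial and error) can be illustrated with a toy sketch. This is not the authors' model: the class name `HybridLearner`, the axis labels, and every parameter value (`alpha`, `hazard`, `p_hit`, `p_miss`) are assumptions made purely for illustration.

```python
from collections import defaultdict

AXES = ("axis1", "axis2")  # hypothetical labels for the two response axes


class HybridLearner:
    """Toy agent: Bayesian belief over the active axis, delta-rule values within it."""

    def __init__(self, alpha=0.2, hazard=0.05, p_hit=0.9, p_miss=0.05):
        self.alpha = alpha    # assumed learning rate for stimulus-response values
        self.hazard = hazard  # assumed per-trial probability of an unsignalled switch
        self.p_hit = p_hit    # assumed P(reward | responded on the active axis)
        self.p_miss = p_miss  # assumed P(reward | responded on the other axis)
        self.belief = {axis: 1.0 / len(AXES) for axis in AXES}
        self.q = defaultdict(float)  # (axis, stimulus, response) -> learned value

    def update(self, axis, stimulus, response, rewarded):
        # 1) Inference: Bayes' rule over which axis is currently active.
        posterior = {}
        for candidate in AXES:
            p_reward = self.p_hit if candidate == axis else self.p_miss
            likelihood = p_reward if rewarded else 1.0 - p_reward
            posterior[candidate] = self.belief[candidate] * likelihood
        total = sum(posterior.values())
        posterior = {a: p / total for a, p in posterior.items()}

        # Allow for a possible switch before the next trial (two-axis case).
        self.belief = {
            a: (1.0 - self.hazard) * posterior[a] + self.hazard * (1.0 - posterior[a])
            for a in AXES
        }

        # 2) Incremental learning: delta-rule update of the chosen mapping only.
        key = (axis, stimulus, response)
        self.q[key] += self.alpha * (float(rewarded) - self.q[key])


agent = HybridLearner()
agent.update("axis1", "stimA", "upper-left", rewarded=False)
print(agent.belief)  # belief shifts toward "axis2" after a single error
```

      In this sketch a single unrewarded response is enough to move the belief sharply toward the other axis, while the value of each mapping changes only by a fraction `alpha` per trial, which is the qualitative contrast between fast inference and slow relearning that the reviewer describes.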

      I found the authors' account to be plausible but it seemed like there might be other possible explanations for the findings. In particular, having read the paper I remained unclear as to whether it was the sharing of response axis per se that drove the cost on rule 3 relative to 2, or whether it was only because of the assumption that response axis = rule that was built into the authors' hybrid model. It would have been interesting to know, for example, whether a similar advantage for ambiguous stimuli on rule 2 occurred under circumstances where the rule blocks occurred randomly and with equal frequency (i.e. where there was response axis sharing but no higher probability); or even whether, if the rule was explicitly signalled from trial to trial, the rule 2 advantage would persist in the absence of any latent state inference at all (this seems plausible; one pointer for theories of resource sharing is this recent review: https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(21)00148-0?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1364661321001480%3Fshowall%3Dtrue). No doubt these questions are beyond the scope of the current project but nevertheless it felt to me that the authors' model remained a bit tentative for the moment.

      Thanks for these interesting thoughts. It is true that the imbalanced pattern of sharing (of response axes, and actually also features) across the three rules has important consequences for learning/inference under our model (and indeed other latent state inference models such as the informed ideal observer). It is an intriguing idea that these features of the design might cause interference even per se, for instance even without the need to do inference or learning because the rules are fully signaled. We agree this (and the other case the reviewer mentioned) is an interesting direction for future work. We have added this in the discussion, line 800-812.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Author response (Tane et al.: RC-2022-01646)

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Comments

      The work described in this manuscript starts with an in-silico analysis of the primary amino-acid sequence of CAP-H proteins that reveals the presence in vertebrate orthologs of an N-terminal extension of ~80 amino acids in length which contains 19 serine or threonine residues and also, in its centre, a stretch of conserved basic amino acids predicted to form a helix. These features suggest a regulatory module. Using Xenopus egg extracts depleted of endogenous condensins and supplemented with recombinant condensin I holocomplexes, either wildtype or mutants, the authors show that deleting the N-terminal tail of CAP-H, or just the central helix (CH), increases the association of condensin I with chromatin in mitotic egg extracts and accelerates the formation of mitotic chromosomes. Interestingly, they also show that deleting the N-tail enables a substantial amount of condensin I to associate with chromatin in interphase extracts and to form chromosome-like structures, while WT condensin I cannot. Using in vitro assays and naked DNA as substrate, the authors further show that removing the N-terminal tail of CAP-H improves both the topological (salt-resistant) association of condensin I with DNA and its loop extrusion activity. These experiments appear to me clear and robust. They convincingly reveal that the N-tail of human CAP-H hinders the binding of condensin I to DNA and both its loop-extrusion and chromosome-shaping activities. However, the mechanism through which such hindrance is achieved remains elusive (see major comments 1-3).

      A complementary part of the work tackles the important question of the cell cycle control of such counteracting effect. Using newly-designed antibodies against two phospho-serine residues within the tail, the authors provide evidence that the tail is phosphorylated in a mitosis-specific manner. This points towards phosphorylation as a biological means to modulate the effect of the tail on condensin's binding during the cell cycle. In support of this idea, using non-phosphorylatable or phosphomimic substitutions of all the serine and threonine residues within the tail (n =19), including one substitution within the CH domain (Ser 70), the authors show that non-phosphorylatable mutations (H-N19A) or phosphomimic mutations (H-N19D) respectively reduce or improve condensin I binding to chromatin in mitotic egg extracts. This suggests that the phosphorylation of the N-terminal tail in mitosis might relieve its negative effect on condensin I binding to chromatin. The weaknesses I see in this part of the study concern (1) the validation of the phospho-antibodies, which appears to me as insufficiently described (major comment 4), (2) the possibility that the bulk changes in amino acids (n=19 out of 80) could impact the folding of the central helix (minor comment X) and (3) that some substitutions could impact the binding of condensin I by different mechanisms (minor comment X).

      Major comments:

      1. On the model. The authors propose that the N-tail could stabilise an interaction between the N-terminal part of CAP-H and SMC2's neck, which would restrain the transient opening of a DNA entry gate within the ring, necessary for the topological engagement of DNA and loop formation. Although the model is sound, I see no direct data that support it in the manuscript. Such a model predicts that a CAP-H protein containing or not the N-terminal tail (or the central helix) should exhibit different binding strengths to SMC2 in vitro. It seems to me that the authors could easily test this prediction using the recombinant proteins they produced in the context of this study.

      Response

      We thank the reviewer for pointing out this important issue. To test whether the CAP-H N-tail indeed contributes to the stabilization of the SMC2-kleisin gate, we set up a highly sophisticated functional assay described by Hassler et al (2019). The authors used this assay to demonstrate that an N-terminal fragment of kleisin (engineered to be cleaved by TEV protease) is released from the rest of the condensin complex in an ATP-dependent (i.e., head-head engagement-dependent) manner. We reasoned that this assay is most powerful to prove our hypothesis in a mechanistically relevant context. We envisioned that the CAP-H fragment lacking its N-tail can readily be released whereas the CAP-H fragment retaining its N-tail is more difficult to be released (because of the postulated stabilization of the SMC2-CAP-H interaction). Despite substantial efforts in making TEV-cleavable constructs and in testing various releasing conditions, we have not been able to recapitulate the ATP-dependent release even with the holo(H-dN) construct. Thus, unfortunately, this trial enabled us to neither prove nor disprove our hypothesis.

      We are fully aware that the full reconstitution of ATP-dependent and phosphorylation-stimulated gate-opening reaction in vitro is a very important direction in the future. It is beyond the scope of the current study, however.

      2. On ATP-hydrolysis. Given the importance of ATP hydrolysis for the engagement of condensin into a topological mode of association with DNA and for its loop extrusion activity, I suggest that the authors measure the impact of the N-tail and of the CH domain on the rate of ATP hydrolysis by condensin I holocomplexes. I suppose that it can be relatively easily done (PMID: 9288743) using the recombinant WT and mutant versions they purified in the course of this study.

      Response

      We appreciate this constructive comment. In fact, we did a preliminary experiment and found that ATPase activities (either in the absence or presence of DNA) were not significantly different between holo(WT) and holo(H-dN). We were not surprised by this result because our previous study on condensin II indicated that enhanced ATP hydrolysis by a class of mutant complexes is not directly coupled to their enhanced association with chromosomes (Yoshida et al., 2022, eLife). We consider that other functional assays, such as the topological loading assay and the loop extrusion assay shown in the current manuscript, are more informative assays to address ATP-dependent activities of the condensin complexes.

      3. A conundrum with previous work? In Kimura et al. Science 1998 (PMID: 9774278), the lab of Tatsuya Hirano has shown that Xenopus condensin I purified from mitotic egg extracts induces the supercoiling of plasmid DNA in vitro, but fails to do so when it is purified from interphase egg extracts. This echoes the inhibitory effect of the N-tail on the topological binding of condensin I described in the current manuscript. However, using a gel shift assay, Kimura et al. 1998 also provide evidence that interphase and mitotic condensin I bind plasmid DNA in vitro with similar efficiencies. At first sight, this prior observation seems to contradict the idea that the N-tail of CAP-H restrains DNA binding unless it is phosphorylated in mitosis. Is it possible that the in vitro binding assays used in Kimura et al. 1998 and in this work might assess different modes of binding? I suggest that this apparent conundrum should be discussed.

      Response

      We thank the reviewer for following our early studies. As discussed below, we are confident that our conclusion reported in the current study by no means contradicts our previous observations.

      We reason that the confusion expressed by the reviewer stems from intrinsic, technical limitations of the gel-shift assay. Such limitations become apparent especially when it is applied to the functional analyses of complicated protein machines such as condensins. For instance, the DNA-binding activity of condensin I detected by the gel-shift assay is neither ATP-dependent nor phosphorylation-dependent (Kimura and Hirano, 1997; Kimura et al., 1998). It is fundamentally different from the ATP-dependent activities detected by the topological loading and loop extrusion assays reported in the current study (It remains unknown whether the two activities are stimulated by mitotic phosphorylation). Thus, the DNA-binding activity detected by the gel-shift assay does not reflect “productive” DNA interactions that depend on ATP hydrolysis in vitro. We therefore consider that gel-shift analyses of holo(WT) and holo(H-dN) would not produce any useful information.

      *Related to that, could it be possible for the authors to assess the impact of the N-tail on the salt-sensitive binding of condensin to DNA, i.e. by reproducing the topological binding assay but omitting the high salt washes? I guess this information could be useful to fully apprehend the impact of the N-tail on the binding of condensin. *

      Response

      When we set up the topological loading assay, we actually tested a low-salt wash condition that the reviewer suggests here. Although a much higher level of DNA recovery was observed with the low-salt condition than with the high-salt wash condition, no nucleotide dependency was detectable with the low-salt condition. Moreover, no difference in DNA recovery between holo(WT) and holo(H-dN) was observed. Thus, the low-salt condition allowed us to detect the “bulk” DNA-binding activity that is equivalent to that detected by the gel-shift assay. These results were fully consistent with the discussion above.

      4. Validation of phospho-antibodies and by extension showing the phosphorylation of the tail. The newly-designed phospho-serine antibodies used in this study to show that the N-tail is phosphorylated at serine 17 and serine 76 in mitosis (Fig. EV3) are, in my view, not characterized enough. This piece of data is key to substantiate the idea that the tail is phosphorylated in mitosis. Yet, showing that these antibodies detect epitopes on WT condensin I from mitotic egg extracts but not on the H-N19A counterpart, nor on WT condensin I from interphase extracts, does not demonstrate the phospho-specificity of such antibodies. I suggest that a PPase treatment should be conducted to assess the phospho-specificity of these antibodies. Moreover, since the lab of Tatsuya Hirano has the know-how to deplete Cdc2/CDK1 from Xenopus egg extract, such a strategy could/should be used to further validate the antibodies and assess to which extent the N-tail is phosphorylated in a Cdc2-dependent manner.

      Response

      We have performed two sets of experiments to confirm the specificity of the phosphoepitopes recognized by anti-hHP1 and anti-hHP2. In the first set, we performed a phosphatase treatment assay. Holo(WT) that had been preincubated with Δcond M-HSS was immunoprecipitated using an antibody against hCAP-G, treated with λ protein phosphatase in the presence or absence of phosphatase inhibitors, and analyzed by immunoblotting using anti-hHP1 and anti-hHP2. The results (now shown in Supplementary Fig 3C) demonstrated that the epitopes recognized by anti-hHP1 and anti-hHP2 are sensitive to phosphatase treatment. In the second set, we performed a phosphopeptide competition assay. Holo(WT) that had been preincubated with Δcond M-HSS was immunoprecipitated and subjected to immunoblotting. The membranes were triplicated and probed with anti-hHP1 in the presence of no competing peptide, hHP1 or hHP2. Similarly, another set of triplicated membranes was probed with anti-hHP2 in the presence of no competing peptide, hHP1 or hHP2. We found that the signal recognized by anti-hHP1 competed with hHP1, but not with hHP2, and that the signal recognized by anti-hHP2 competed with hHP2, but not with hHP1. The results (now shown in Supplementary Fig 3D) demonstrated the sequence specificity of the phosphoepitopes recognized by the two antibodies. The procedures for these experiments have been described in Materials and Methods.

      Because Cdk1 depletion from M-HSS creates an HSS equivalent to I-HSS, we do not consider that the suggested experiment will provide additional information.

      *Minor comments:

      1. The impact of the 19 mutations, A or D, introduced within the tail on the folding of the central helix? The idea that the negative effect of the N-tail is relieved by phosphorylation is based on the chromatin binding phenotypes exhibited by the H-N19D or H-N19A mutant holocomplexes, in which 19 amino-acids out of 80 have been modified, including one in the central helix. The authors also provide evidence that the central helix (CH) located within the tail plays a key role in the negative regulation of condensin I binding. Thus, I wonder to which extent the folding of the central helix could be impacted by the mutations introduced in the tail and notably the one within the central helix itself. Could the authors assess the structure of mutated tails using AlphaFold and/or discuss this point? *

      Response

      According to the reviewer’s suggestion, we performed structure predictions using AlphaFold2, and found that neither the N19A nor N19D mutations alter the original prediction of helix formation that was made for the wild-type CH sequence. A conventional secondary structure prediction using Jpred4 reached the same conclusion.

      2. Phosphorylation of serine 70 in the central helix by Aurora-B kinase? A prior study by Tada et al. (PMID: 21633354) has shown (1) that serine 70 of the N-tail of hCAP-H is phosphorylated by Aurora-B kinase, (2) that the mutation S70A reduces the binding of condensin I to chromatin in HeLa cells and (3) that hCAP-H interacts with histone H2A in an Aurora-B dependent manner. This draws a picture in which the phosphorylation of Ser70 by Aurora-B would improve condensin I binding to chromatin by promoting an interaction between hCAP-H and histone H2A/nucleosomes. Intriguingly, Ser 70 in Tada et al. corresponds to the serine residue located within the conserved central helix analysed in this study, and this Ser70 residue is mutated in the H-N19D or H-N19A holocomplexes that show reduced chromatin binding in this study. This raises the question as to what could be the contribution of the S70A or S70D substitution to the chromatin binding phenotypes shown by the H-N19D or H-N19A holocomplexes. This is not discussed in the manuscript, and the authors do not cite this earlier work (PMID: 21633354) in their manuscript. Is there any reason for that? I suggest it should be cited and discussed.

      Response

      We thank the reviewer for bringing up this issue. In many respects, we do not trust the data reported by Tada et al (2011) and the resultant model they proposed. Previous and emerging lines of evidence reported from our own and other laboratories indicate that histones compete with condensins for DNA binding, strongly arguing against the possibility that histone H2A acts as a “chromatin receptor” for condensins. We formally discussed and criticized the Tada 2011 model in our previous publications, which included Shintomi et al (2015) NCB, Shintomi et al (2017) Science, Hirano (2016) Cell and Kinoshita/Hirano (2017) COCB. We thought that those were enough. That said, we also consider that the reviewer is right. The current study demonstrates that the deletion of the CAP-H N-tail accelerates, rather than decelerates, condensin I loading, providing an additional line of evidence that argues against the Tada model. A critical comparison between the Tada model and our current model would benefit the readers. In the revised manuscript, we have added the following discussion:

      In terms of the regulatory role of the CAP-H N-tail, it would be worthy to discuss the model previously proposed by Tada et al (2011). According to their model, aurora B-mediated phosphorylation of the CAP-H N-tail allows its direct interaction with the histone H2A N-tail, which in turn triggers condensin I loading onto chromatin. Accumulating lines of evidence, however, strongly argue against this model: (i) aurora B is not essential for single chromatid assembly in Xenopus egg extracts (MacCallum et al., 2002) or in a reconstitution assay (Shintomi et al., 2015); (ii) the H2A N-tail is dispensable for condensin I-dependent chromatid assembly in the reconstitution assay (Shintomi et al., 2015); (iii) even whole nucleosomes are not essential for condensin I-mediated assembly of chromatid-like structures (Shintomi et al., 2017). The current study demonstrates that the deletion of the CAP-H N-tail accelerates, rather than decelerates, condensin I loading, providing an additional piece of evidence against the model proposed by Tada et al (2011).

      3. Other minor comments - Please provide a microscope image of DNA loop in Fig. 4D.

      Response

      In the revised manuscript, we have provided a set of time-lapse images of loop extrusion events catalyzed by holo(WT) and holo(H-dN) in Fig 4E.

      *- The authors do not compare the kleisin of condensin I with the one of condensin II with respect to the features tackled in this work. Given the different behaviours of condensin I and II, such comparison could be informative for the readers. *

      Response

      We thank the reviewer for this constructive comment. In the revised manuscript, we have added the following statement:

      It should also be added that CAP-H2, the kleisin subunit of condensin II, lacks the N-terminal extension that corresponds to the CAP-H N-tail. Thus, the negative regulation by the kleisin N-tail reported here is not shared by condensin II.

      *- The authors do not reference the work of Robellet et al. Genes & Dev (2015) (PMID: 25691469) on the regulation of condensin binding in budding yeast by an SMC4 phospho-tail. I suggest that the analogy should be discussed. *

      Response

      According to the reviewer’s comment, we have added the following statements at the beginning of Discussion.

      Previous studies showed that mitotic phosphorylation of Cut3/SMC4 regulates the nuclear import of condensin in fission yeast (Sutani et al. 1999) and that phosphorylation of Smc4/SMC4 slows down the dynamic turnover of condensin on mitotic chromosomes in budding yeast (Robellet et al. 2015; Thadani et al. 2018). In the current study, we have focused on the phosphoregulation of vertebrate condensin I by its kleisin subunit CAP-H.

      - In the introduction section, line 5, the sentence "Many if not all eukaryotic species have two different condensin complexes" appears inappropriate since budding and fission yeast cells possess a single condensin complex, similar to condensin I in terms of primary amino-acid sequence.

      Response

      We thought that the original wording “Many if not all” was good enough to imply that some species, which include budding yeast and fission yeast, have only a single condensin complex. To make it clear, however, we have modified the sentence in the revised manuscript as follows:

      Many eukaryotic species have two different condensin complexes although some species including fungi have only condensin I.

      *- page 4; typo: motif I and V bind to the SMC neck and the SMC4 cap regions, respectively. Should read SMC2 neck. *

      Response

      The reviewer is right. It should read the SMC2 neck. Corrected.

      *- Are the data and the methods presented in such a way that they can be reproduced? YES - Are the experiments adequately replicated and statistical analysis adequate? YES - Are prior studies referenced appropriately? Not all of them (see above) - Are the text and figures clear and accurate? YES

      CROSS-CONSULTATION COMMENTS I consider the comments from all reviewers as helpful for the authors.

      Reviewer #1 (Significance (Required)):

      Summary Condensins are genome organisers of the family of SMC ATPase complexes and are best characterized as the drivers of mitotic chromosome assembly (condensation). It is acknowledged that condensins shape mitotic chromosomes by massively associating with DNA upon mitotic entry (loading step) and by folding chromatin fibres into arrays of loops, most likely through an ATP-dependent extrusion of DNA into loops, as seen in vitro. What remains unclear, however, are the mechanisms by which condensins load onto DNA and fold it into arrays of loops in vivo, and how these reactions are coupled with the cell cycle, i.e. restricted mostly to mitosis. Condensins are ring shaped pentamers that change conformation upon ATP-hydrolysis. In vitro studies suggest that condensins bind DNA via ATP-hydrolysis-independent, direct electrostatic contacts between condensin subunits and DNA. Such electrostatic contacts are salt-sensitive in in-vitro assays. Upon ATP-hydrolysis, condensins engage into an additional mode of binding that is resistant to high salt concentration and likely to be topological in nature. Both modes of association are necessary to form DNA loops. Vertebrates possess two types of condensin complexes, condensin I and II, each composed of a same SMC2-SMC4 ATPase core but associated with two different sets of three non-SMC subunits; a kleisin and two HEAT-repeat proteins. Condensin II is nuclear during interphase and stably binds chromatin upon mitotic entry, while condensin I is located in the cytoplasm during interphase and binds chromatin in a dynamic manner upon nuclear envelope breakdown. How the spatiotemporal control of condensin I and II is achieved remains poorly understood. Previous studies have shown that the phosphorylation of condensin I by mitotic kinases, such as CDK1, Aurora-B and Polo, play a positive role in its binding to chromatin and/or its functioning, but the underlying mechanisms remain to be characterised. 
In this manuscript, Shoji Tane and colleagues provide good evidence that the N-terminal tail of the human kleisin subunit of condensin I, hCAP-H, serves as a regulatory module for the cell-cycle control of condensin I binding to chromatin and chromosome shaping activity. The authors clearly show that the N-tail of CAP-H hinders the binding of condensin I to chromatin in xenopus egg extracts and, using in vitro assays, that the N-tail also hinders the topological association of condensin I with DNA and its loop extrusion activity. The authors provide additional data suggesting that the phosphorylation of the N-tail of CAP-H, in mitosis, relieves its inhibitory effect on condensin I binding. Based on their findings, Tane et al. propose a model suggesting that the N-terminal tail of CAP-H constitutes a gate keeper that maintains condensin-rings in a closed conformation that is unfavourable for topological binding to DNA, and whose locking effect is relieved in mitosis by phosphorylation.

      Taken as a whole, this work has the potential to reveal a molecular basis for the cell cycle regulation of condensin I in vertebrate cells and as such to significantly improve our understanding of the integrated functioning of condensin I. The characterisation of the inhibitory effect of the N-tail on condensin binding to chromatin and to naked DNA in vitro is well described, the data are clear and robust and the results convincing. On the other hand, some of the data on the phospho-regulation appear to me more debatable and I think that some of the results described here deserve to be discussed in the context of previous works. Finally, I see no data in the manuscript that directly supports the mechanistic model proposed by the authors, while it seems to me that such a model could have been easily tested experimentally. Thus, I suggest that Tane and colleagues should perform a couple of relatively easy experiments to strengthen their claims and that a few omitted prior studies on the topic should be referenced and discussed. *

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): * The manuscript reveals that the N-terminal region of CAPH could play a role in regulating condensin I activity, using a range of in vitro methods. They propose that the N-terminal region of CAPH inhibits complex activity, and this is turned off upon deletion or phosphorylation, by using truncations, phospho-mimics or phospho-deficient mutations. While the results are interesting to the field, and help to address the question as to how condensin complexes are controlled in a cell cycle dependent manner, some key data and controls are necessary to ensure the conclusion is robust.

      Main comments

      • What is meant by "unperturbed I-HSS" on page 7, ie membrane containing versus membrane free or condensin depleted? *

      Response

      We apologize for having created unnecessary confusion. We meant that the “unperturbed I-HSS” is the “undepleted I-HSS”. As far as the issue of membrane-containing vs membrane-free is concerned, we explicitly mentioned that “we used membrane-free I-HSS in the following experiments” several lines above. In the revised manuscript, we have revised the statement accordingly.

      In many of the protein gels eg figure 4B, the bands for SMC2 and 4 are more intense than the non-SMC components. The method for protein purification also does not include a size exclusion step to ensure sample homogeneity. Authors should perform some sort of quality control checks on samples such as analytical gel filtration or mass photometry to ensure stoichiometry/homogeneity. This is particularly important for samples eg the N19A, where activity is reduced compared to wild-type as poor protein behaviour could create false negative results.

      Response

      As the reviewer is fully aware, the reconstitution and purification of multiprotein complexes, such as condensins, is by no means an easy task. We notice that many groups in the field share common concerns about sample homogeneity and subunit stoichiometry, and that these concerns cannot completely be eliminated even after size exclusion chromatography. Because the current study handles a large number of mutant complexes, we consider that the purification by two-step column chromatography is the most practical approach. We do not notice any abnormal behaviors of holo(H-N19A) in the processes of expression and purification. It is also important to emphasize that the H-N19D mutations cause the completely opposite phenotype. Taken all together, we are confident of our current conclusions.

      That said, in the revised manuscript, we have added the following statement in Results:

      Although we cannot rule out the possibility that the introduction of multiple mutations into the N-tail causes unforeseeable adverse effects on protein conformations, these results supported the idea that ….

      • Loop extrusion assays in figure 4D-G shows no example data i.e. no pictures or videos of loops being formed. These should also be included.*

      Response

      In the revised manuscript, we have provided a set of time-lapse images of loop extrusion events catalyzed by holo(WT) and holo(H-dN) in Fig 4E.

      • Given the mutant holo(H-dN) has higher activity than wild-type, a negative control such as holo(H-dN) without ATP or holo(H-dN) ATPase deficient mutant should also be measured in loop extrusion assays, to ensure the activity is derived from the ATPase activity.*

      Response

      In the revised manuscript, we have added loop formation data for both holo(WT) and holo(H-dN) in the absence or presence of ATP (Supplementary Fig 5). We are confident that both complexes support loop extrusion strictly in an ATP-dependent manner.

      • According to the methods, this work performs the same loop extrusion assay as described in Kinoshita et al, 2022, however, in Kinoshita et al, wild type condensin I makes loops in 30-50% of DNA molecules, where in this study the percentage is less than half that. Can the author please explain the discrepancy given the same method was used?*

      Response

      First of all, we wish to remind the reviewer that the holo(WT) constructs used in the two studies are not identical: CAP-H was N-terminally HaloTagged in all constructs used in Kinoshita et al (2022), whereas the same subunit was C-terminally HaloTagged in the pair of constructs used in the current study. Because we wanted to compare the activities between the full-length CAP-H and N-terminally deleted version of CAP-H (H-dN), we reasoned that it would be inappropriate to put the HaloTag to the N-terminus of CAP-H. The difference in the constructs could explain the observed discrepancy, even if it might not be the sole reason.

      The design of the constructs was accurately described in each manuscript, but the statements were not very explicit about the positions of the HaloTag. To clarify this issue, we have added the following sentences in the revised manuscript:

      Note that the HaloTag was fused to the C-terminus of CAP-H in the current study because we wanted to investigate the effect of the N-terminal deletion of CAP-H. We used N-terminally HaloTagged CAP-H constructs in our previous study (Kinoshita et al., 2022).

      • In the concluding statement the author suggests "Upon mitotic entry, multisite phosphorylation of the N-tail relieves the stabilization, allowing the opening of the DNA entry gate, hence, the loading of condensin I onto chromosomes." This seems unlikely as fusion of the N-terminus of the kleisin to the C-terminus of SMC2 is able to function for yeast (Shaltiel et al 2022) and condensin II (Houlard et al 2021), and equivalently in cohesin (Davidson et al 2019).*

      Response

      We appreciate the reviewer’s concern. In our view, however, the issue of the “DNA-entry gate” remains under debate in the SMC field (e.g., Higashi et al [2020] Mol Cell; Taschner and Gruber [2022] bioRxiv). For instance, Shaltiel et al (2022) demonstrated that neck-gate fusion constructs can support in vitro activities including topological loading under certain conditions, but also showed that such constructs greatly reduce the cell viability, leaving the possibility that the gate opening is required for some physiological functions.

      That said, it is true that the data reported in the current manuscript do not exclude the possibility that the SMC2 neck-kleisin interface is not used as a DNA entry gate for condensin I loading. In the revised manuscript, we have added the following statement in Discussion:

      Although our model predicts that the SMC2 neck-kleisin interface is used as a DNA entry gate, we are aware that several studies reported evidence arguing against this possibility (e.g., Houlard et al [2021]; Shaltiel et al [2022]). Our current data do not exclude other models.

      *Reviewer #2 (Significance (Required)):

      This is an interesting story that reveals new insights about condensin regulation.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper reveals a role of an N-terminal extension of CAP-H in the regulation of condensin-I activity in Xenopus extracts using biochemical reconstitution experiments. The authors demonstrate that a motif in the N-terminal tail that is conserved in vertebrates acts as an inhibitor of condensin I activity. Using several mutant constructs, it is shown that the inhibition by this motif is in turn counteracted by the phosphorylation of neighbouring serine and threonine residues in mitosis, presumably at least in part by CdK. Mutants that have lost this inhibition are able to condense chromatin into chromatid-like structures more efficiently and to some degree even in interphase extracts. Moreover, one such mutant is characterized in detail by biochemical and biophysical experiments and shown to have increased capacity in salt-stable DNA loading and in DNA loop extrusion.

      Major comments: This is a beautiful and thorough study that is presented in a clear and concise manner. The main conclusions are well justified. No additional experiments are needed to support them. Replication and statistical analysis appear adequate. The final model is however largely speculative. Recent work has indicated that loading of yeast condensin does not require gate opening. The authors may thus want to include alternative scenarios or remain more vague. *

      Response

      This comment is related to the last comment of Reviewer#2. See above for our response.

      *The H-N19A mutant has a loss of function phenotype (possibly due to a folding problem caused by 19 point mutations rather than lack of phosphorylation), the authors could consider rescuing the phenotype by also including the CH motif mutations in this construct (or make an explanatory statement in the text). *

      Response

      We understand the reviewer’s logic here, but overlaying additional mutations into the H-N19A mutations could cause an unforeseeable effect, potentially making the interpretation of the outcome complicated.

      We also wish to point out that it may be inappropriate to regard the phenotype exhibited by holo(H-N19A) as a simple loss-of-function phenotype. This is because the opposite, accelerated loading phenotype exhibited by holo(H-dN) can be regarded as a consequence of loss of negative regulation. Like holo(H-dN), the phosphomimetic mutant complex holo(H-N19D) displayed an accelerated loading phenotype, fully supporting our conclusion. In the revised manuscript, we have added the following statement in Results:

      Although we cannot rule out the possibility that the introduction of multiple mutations into the N-tail causes unforeseeable adverse effects on protein conformations, these results supported the idea that ….

      *Albeit not necessary for the main conclusions, the authors could possibly significantly strengthen their study by testing for binding partners of the N-tail and the CH motif by running AlphaFold predictions against the condensin I subunits. *

      Response

      We appreciate this constructive comment. We attempted to predict possible interactions between SMC2 and a CAP-H fragment containing its N-tail and motif I using ColabFold (Mirdita et al., 2022, Nat. Methods). The algorithm excellently predicted the proper folding of the CAP-H motif I and its interaction with the SMC2 neck. Under this condition of predictions, however, the N-tail remained largely disordered (except for the CH), and no interaction with any part of SMC2 was predicted. The same was true when the N19D mutations were introduced in the N-tail sequence. Thus, this trial did not provide much information about the potential interaction target(s) of the CAP-H N-tail.

      *The efficiency of depletion of condensin subunits from I-HSS extracts is not documented (in contrast to M-HSS extracts - figure EV1C). While any condensin remaining in these extracts might not be active (or interfering), the authors may want to at least comment on this in the text. *

      Response

      We check the efficiency of immunodepletion every time by immunoblotting and confirm that a high level of depletion is achieved from both M-HSS and I-HSS. According to the reviewer’s comment, the following statement was placed in Materials and Methods:

      The efficiency of immunodepletion was checked every time by immunoblotting. An example of immunodepletion from M-HSS was shown in Supplemental figure 1C. We also confirmed that a similar efficiency of immunodepletion was achieved from I-HSS.

      *The authors should include information on the quantification of chromatid morphology. Is the analysis based on chromatids taken from the same images/imaging session, from technical replicates, biological replicates? *

      Response

      In the revised manuscript, we have added statements on image presentation and experimental repeats in the appropriate figure legends and methods section. During the revision process, we repeated the experiments shown in Supplementary Fig 2, and obtained the same results. In the revised manuscript, the original set of data has been replaced with the new set of data along with panel C (Quantification of the intensity of mSMC4 per DNA area).

      Minor comment: The colour scheme in Figure 5A is confusing. Use less colour? The orange and red colours are moreover quite similar.

      Response

      According to the reviewer’s comment, we have modified Figure 5A.

      *Reviewer #3 (Significance (Required)):

      The findings provide new insights into how condensin-I activity is restricted outside of mitosis. It was previously assumed that this regulation was (largely) due to the exclusion of condensin I from the nucleus prior to nuclear envelope breakdown. This study shows that another pathway is contributing to the regulation and implies that controlling condensin I activity is more important than previously appreciated. Whether all residual nuclear condensin I is inactivated, remains to be determined. The physiological impact of loss of autoinhibition on chromosome segregation and cell cycle progression also remains to be uncovered. The observed effects are robust and appear significant. Future research on condensin I using reconstitution will likely benefit from being able to control or eliminate the self-inhibition.

      This reviewer has expertise on the biochemistry and structural biology of SMC protein complexes.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Mitotic chromosome formation is a cell cycle-regulated process that is crucial for eukaryotic genome stability. The chromosomal condensin complex promotes chromosome condensation, but the temporal control over condensin function is only scantly understood. In this impressive manuscript, "Cell cycle-specific loading of condensin I is regulated by the N-terminal tail of its kleisin subunit", Tane and colleagues provide important new insight into the cell cycle-regulation of condensin. The authors identify a kleisin N-tail that acts as a negative regulator of condensin's DNA interactions. Removal of this N-tail, or its cell cycle-dependent phosphorylation, relieves inhibition and activates condensin. This is a simple, yet very important story, that advances our molecular understanding of chromosome formation. The experiments are performed to a very high technical standard and support the authors' conclusions. This manuscript is highly suitable for publication in any molecular biology journal, once the authors have considered the following points.

      1. Introduction. a) The authors could better explain their own prior work (Kimura et al. 1998), which has identified the condensin XCAP-D2 and XCAP-H as the targets of phosphoregulation. The current manuscript explains the role of XCAP-H phosphorylation. *

      Response

      According to the reviewer’s comment, we have added the following sentence in Introduction:

      The major targets of mitotic phosphorylation identified in these studies included the CAP-D2 and CAP-H subunits.

      1. b) Given the limited knowledge about condensin cell cycle regulation, it seems prudent to provide a brief summary of what is known. Fission yeast Smc4 phosphorylation regulates condensin nuclear import (Sutani et al. 1999), while budding yeast Smc4 phosphorylation slows down the dynamic turnover of the condensin complex on chromosomes (Robellet et al. 2015 and Thadani et al. 2018).

      Response

      We appreciate this constructive comment. According to the reviewer’s comment, we have added the following statements at the beginning of Discussion.

      Previous studies showed that mitotic phosphorylation of Cut3/SMC4 regulates the nuclear import of condensin in fission yeast (Sutani et al. 1999) and that phosphorylation of Smc4/SMC4 slows down the dynamic turnover of condensin on mitotic chromosomes in budding yeast (Robellet et al. 2015 and Thadani et al. 2018). In the current study, we have focused on the phosphoregulation of vertebrate condensin I by its kleisin subunit CAP-H.

      2. Extracts were mixed with mouse sperm nuclei. If there is a reason why mouse rather than Xenopus sperm nuclei were used, this would be interesting to know.

      Response

      The original motivation for introducing mouse sperm nuclei into Xenopus egg extracts was to test the functional contribution of nucleosomes to mitotic chromosome assembly. When mouse sperm nuclei are incubated with an extract depleted of the histone chaperone Asf1, the assembly of octasomes can be suppressed almost completely. Remarkably, we found that even under this “nucleosome-depleted” condition, mitotic chromosome-like structures can be assembled in a manner dependent on condensins (Shintomi et al., 2017, Science). Xenopus sperm nuclei cannot be used in this type of experiment because they endogenously retain histones H3 and H4 and are therefore competent in assembling octasomes even in the Asf1-depleted extract. During this study, we realized that the use of mouse sperm nuclei in Xenopus egg extracts provides additional and deep insights into the basic mechanisms of mitotic chromosome assembly. For instance, the functional contribution of condensin II to chromosome assembly could be observed more prominently when mouse sperm nuclei are used as a substrate than when Xenopus sperm nuclei are used (Shintomi et al., 2017, Science). We suspected that the slow kinetics of nucleosome assembly on the mouse sperm substrate creates an environment in favor of condensin II’s action. For these reasons, our laboratory now extensively uses mouse sperm nuclei for the functional analyses of condensin II (Yoshida et al., 2022, eLife) and other purposes (Kinoshita et al., 2022, JCB). Yoshida et al (2022) used experimental approaches analogous to the current study, and found that the deletion of the CAP-D3 C-tail causes accelerated loading of condensin II. One of the long-term goals in our laboratory is to critically compare and contrast the actions of condensin I and condensin II in mitotic chromosome assembly. Thus, the use of the same substrate in the two complementary studies can be fully justified.

      During the preparation of this response, we realized that the readers would benefit from a brief statement about the comparison between condensin I and condensin II. In the revised manuscript, we have added the following statement in Discussion:

      It should also be added that CAP-H2, the kleisin subunit of condensin II, lacks the N-terminal extension that corresponds to the CAP-H N-tail. Thus, the negative regulation by the kleisin N-tail reported here is not shared by condensin II. Interestingly, however, a recent study from our laboratory has shown that the deletion of the CAP-D3 C-tail causes accelerated loading of condensin II onto chromatin (Yoshida et al., 2022). It is therefore possible that condensins I and II utilize similar IDR-mediated regulatory mechanisms, but they do so in different ways.

      3. Page 5. "we next focused on the conserved helix (CH) [...], that is enriched with basic amino acids." Based on the provided sequence alignment, the helix contains an equal number of both basic and acidic residues. Is it correct to characterize this helix as positively charged?

      Response

      The reviewer is right. In the revised manuscript, we have used a more neutral expression as follows:

      we next focused on the conserved helix (CH) [...], that contains conserved basic amino acids.

      4. To prevent N-tail phosphorylation, the authors create a (H-N19A) allele, referring to Cdk promiscuity. Cdk cooperation with other mitotic kinases can also be expected. Nevertheless, in case the authors created a variant with only the 4 Cdk consensus sites mutated, it would be interesting to know its consequences.

      Response

      We consider that this is a reasonable question. In our early experiments, we noticed that introduction of multiple SP/TP sites in the non-SMC subunits of condensin I including CAP-H caused a relatively mild phenotype in mitotic chromosome assembly in Xenopus egg extracts. Then we found that the deletion of the CAP-H N-tail caused a very clear, accelerated loading phenotype, prompting us to focus on the regulatory function of the CAP-H N-tail. As the reviewer correctly points out, the current study does not pinpoint the number and position of target sites involved in the proposed phosphoregulation by the CAP-H N-tail. We wish to address this important issue in the near future, along with reconstitution of the phosphoregulation using purified components.

      5. Fig EV3A, a second region of mitotic condensin phosphorylation is XCAP-D2. The authors state that XCAP-D2 phosphorylation does not impact on condensin function in their assays. This is very relevant to the current paper, so it would be good to see the Yoshida et al. 2022 eLife publication (in press) as an accompanying manuscript.

      Response

      We thank the reviewer for pointing out this issue, but it is not necessarily clear to us what the reviewer requests. In the original manuscript, we cited Yoshida et al (2022) in Discussion as follows:

      Recent studies from our laboratory showed that the deletion of the CAP-D2 C-tail, which also contains multiple SP/TP sites (Supplementary Figure 3A), has little impact on condensin I function as judged by the same and related add-back assays using Xenopus egg extracts (Kinoshita et al, 2022; Yoshida et al, 2022).

      We believe that the current statement is good enough.

      6. One of the authors' most striking results is chromosome formation in interphase egg extracts using condensin (H-dN). At the same time, condensin (H-dN) is unable to support DNA supercoiling or chromosome reconstitution with recombinant components. More emphasis might be placed on this important piece of information, and possible reasons should be discussed. Can Cdk-treatment restore condensin (H-dN) biochemical activity? If not, then condensin (H-dN) might have lost more than just an inhibitory N-tail. The cohesin N-tail is thought to fulfil a positive role during DNA loading (Higashi et al. 2020). Could it be that the condensin N-tail encompasses both positive and negative roles?

      Response

      We were also surprised to find that holo(H-dN) gains the ability to assemble mitotic chromosome-like structures in interphase extracts. It should be stressed, however, that the formation of mitotic chromosome-like structures in I-HSS requires a much higher concentration (150 nM) than the standard concentration used in M-HSS (35 nM). Thus, the deletion of the CAP-H N-tail alone cannot make the condensin I complex fully active in I-HSS. We think that the negative regulation by the CAP-H N-tail is not the sole mechanism responsible for the very tight cell cycle regulation of condensin I function. We emphasize this important point by mentioning that “our results uncover one of the multilayered mechanisms that ensure cell cycle-specific loading of condensin I onto chromosomes” in Summary.

      At the end of Discussion, we describe the limitations of the current study: “we have so far been unsuccessful in using these recombinant complexes to recapitulate positive DNA supercoiling or chromatid reconstitution, both of which require proper Cdk1 phosphorylation in vitro”. We are fully aware that full reconstitution of phosphorylation-dependent activation of condensin I in vitro is one of the most important directions in the future.

      Although we currently do not have any evidence to suggest that the H N-tail has a positive role, we do not exclude such a possibility.

      7. Here comes my main question for the authors (though I should stress that I do not expect an answer for publication in a Review Commons journal). The authors now have a unique opportunity to gain key mechanistic insight into condensin by answering the question, 'how does the kleisin N-tail inhibit condensin'? The authors allude to a model in which the N-tail interacts with Smc2 to close/obstruct the kleisin N-gate, through which the DNA likely enters the condensin ring. Can the authors biochemically recapitulate an interaction between an isolated N-tail (or N-terminal section of XCAP-H) and Smc2? Does Cdk phosphorylation alter this interaction?

      Response

      This comment is related to Comment #1 of Reviewer#1. See above for our response.

      *Minor points. 8. The condensin loop extrusion results would benefit from a supplementary movie or time-series, to illustrate the comparison. Details of how loop rate, duration and sizes were assessed should be added to the methods section. *

      Response

      In the revised manuscript, we have provided a set of time-lapse images of loop extrusion events catalyzed by holo(WT) and holo(H-dN) in Fig 4E. We have also added the following explanations for how the parameters of loop extrusion reactions were assessed in Materials and Methods:

      To determine the loop size, the fluorescence intensity of the looped DNA was divided by that of the entire DNA molecule for each image, and multiplied by the length of the entire DNA molecule (48.5 kb). The loop rate was obtained by averaging the increase in looped DNA size per second. The loop duration was calculated by measuring the time from the start of DNA loop formation until the DNA loop became unidentifiable.
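The three loop extrusion parameters described above can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the authors' analysis code: the function names, the per-frame input format, and the `frame_interval_s` parameter are assumptions made here, and "averaging the increase in looped DNA size per second" is read as the mean frame-to-frame increment divided by the frame interval.

```python
from typing import Sequence

DNA_LENGTH_KB = 48.5  # length of the entire DNA molecule, as stated in the response


def loop_size_kb(loop_intensity: float, total_intensity: float) -> float:
    """Loop size = (looped-DNA intensity / whole-molecule intensity) * 48.5 kb."""
    return loop_intensity / total_intensity * DNA_LENGTH_KB


def loop_rate_kb_per_s(sizes_kb: Sequence[float], frame_interval_s: float) -> float:
    """Mean frame-to-frame increase in loop size, expressed per second (assumed reading)."""
    increments = [later - earlier for earlier, later in zip(sizes_kb, sizes_kb[1:])]
    return sum(increments) / (len(increments) * frame_interval_s)


def loop_duration_s(loop_detected: Sequence[bool], frame_interval_s: float) -> float:
    """Time from the first frame with a loop until the loop is no longer identifiable."""
    frames = list(loop_detected)
    start = frames.index(True)  # first frame in which a loop is seen
    # first later frame without a loop, or the end of the recording if it never vanishes
    end = next((i for i in range(start, len(frames)) if not frames[i]), len(frames))
    return (end - start) * frame_interval_s
```

With these definitions, a loop that carries 20% of the total fluorescence corresponds to roughly 9.7 kb of looped DNA.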

      9. Figure EV3A legend, "hHP4" should probably read "hHP2".

      Response

      The reviewer is right. It should read hHP2. Corrected.

      *Reviewer #4 (Significance (Required)):

      see above *

    1. Author Response

      Reviewer #1 (Public Review):

      Using fMRI-based univariate and multivariate analyses, Root, Muret, et al. investigated the topography of face representation in the somatosensory cortex of typically developed two-handed individuals and individuals with a congenital or acquired missing hand. They provide clear evidence for an upright face topography in the somatosensory cortex in all three groups. Moreover, they find that one-handers, but not amputees, show shorter distances from lip representations to the hand area, suggesting a remapping of the lips. They also find a shift away of the upper face from the deprived hand area in one-handers, and significantly greater dissimilarity between face part representations in amputees and one-handers. The authors argue that this pattern of remapping is different to that of cortical neighborhood theories and points toward a remapping of face parts which have the ability to compensate for hand function, e.g., using the lips/mouth to manipulate an object.

      These findings provide interesting insights into the topographic organization of face parts and the principles of cortical (re)organization. The authors use several analytical approaches, including distance measures between hand- and face-part-responsive regions and representational similarity analysis (RSA). Particularly commendable is the rigorous statistical analysis, such as the use of Bayesian comparisons, and careful interpretation of absent group differences.

      We thank the reviewer for their positive and constructive feedback.

      Reviewer #2 (Public Review):

      After amputation, the deafferented limb representation in the somatosensory cortex is activated by stimulation of other body parts. A common belief is that the lower face, including the lips, preferentially "invades" deafferented cortex due to its proximity to cortex. In the present study, this hypothesis is tested by mapping the somatosensory cortex using fMRI as amputees, congenital one-handers, and controls moved their forehead, nose, lips or tongue. First, they found that, unlike its counterpart in monkeys, the representation of the face in the somatosensory cortex is right-side up, with the forehead most medial (and abutting the hand) and the lips most lateral. Second, there was little evidence of "reorganization" of the deafferented cortex in amputees, even when tested with movements across the entire face rather than only the lips. Third, congenital one-handers showed significant reorganization of deafferented cortex, characterized principally by the invasion of the lower face, in contrast to predictions from the hypothesis that proximity was the driving factor. Fourth, there was no relationship between phantom limb pain reports and reorganization.

      As a non-expert in fMRI, I cannot evaluate the methodology. That being said, I am not convinced that the current consensus is that the representation of the face in humans is flipped compared to that of monkeys. Indeed, the overwhelming majority of somatosensory homunculi I have seen for humans has the face right side up. My sense is that the fMRI studies that found an inverted (monkey-like) face representation contradict the consensus.

      Thank you for pointing this out. As we tried to emphasise in the introduction, very few neuroimaging studies have actually investigated face somatotopy in humans, and their results are inconsistent. We agree that the default consensus tends to be dominated by the upright depiction of Penfield’s homunculus (recently replicated by Roux et al, 2018). However, due to methodological and practical constraints, alignment across subjects in the case of intracortical recordings is usually difficult to achieve, which makes it difficult to assess the consistency of the topographical organisation. Moreover, previous imaging studies did not manage to convincingly support Penfield’s homunculus. For these two key reasons, the spatial orientation of the human facial homunculus is still debated. A further limiting factor is that the vast majority of studies investigating face (re)mapping in humans focused solely on the lip representation, using the cortical proximity hypothesis to interpret their results. Consequently, as we highlight above in our response to the Editor, there is a widespread and false representation in the human literature of the lips neighbouring the hand area.

      To address the reviewer’s critique and convey some of this context, we changed our title from “Reassessing face topography in primary somatosensory cortex and remapping following hand loss” to “Complex pattern of facial remapping in somatosensory cortex following congenital but not acquired hand loss”. This was done to de-emphasise the novelty of face topography relative to our other findings.

      We also rewrote our introduction (lines 79-94) as follows:

      “The research focus on lip cortical remapping in amputees is based on the assumption that the lips neighbour the hand representation. However, this assumption goes against the classical upright orientation of the face in S1 [26–30], as first depicted in Penfield’s Homunculus and in later intracortical recordings and stimulation studies [26–29], with the upper-face (i.e., forehead) bordering the hand area. In contrast, neuroimaging studies in humans studying face topography provided contradictory evidence for the past 30 years. While a few neuroimaging studies provided partial evidence in support of the traditional upright face organisation [31], other studies supported the inverted (or ‘upside-down’) somatotopic organisation of the face, similar to that of non-human primates [32,33]. Other studies suggested a segmental organisation [34], or even a lack of somatotopic organisation [35–37], whereas some studies provided inconclusive or incomplete results [38–41]. Together, the available evidence does not successfully converge on face topography in humans. In line with the upright organisation originally suggested by Penfield, recent work reported that the shift in the lip representation towards the missing hand in amputees was minimal [42,43], and likely to reside within the face area itself. Surprisingly, there is currently no research that considers the representation of other facial parts, in particular the upper-face (e.g., the forehead), in relation to plasticity or PLP.”

      We also updated the discussion accordingly (lines 457, 469-477, 490-492).

      Similarly, it is not clear to me how the observations (1) of limited reorganization in amputees, (2) of significant reorganization in congenital one-handers, and (3) of the lack of relationship between PLP and reorganization is novel given the previous work by this group. Perhaps the authors could more clearly articulate the novelty of these results compared to their previous findings.

      Thank you for giving us the opportunity to clarify this important point. The novelty of these results can be summarised as follows:

      (1) Conceptually, it is crucial for us to understand if deprivation-triggered plasticity is constrained by the local neighbourhood, because this can give us clues regarding the mechanisms driving the remapping. We provide strong topographic evidence about the face orientation in controls, amputees and one-handers.

      (2) The vast majority of previous research on brain plasticity following hand loss (both congenital and acquired) in humans has exclusively focused on the lower face, and lips in particular. We provide systematic evidence for stable organisation and remapping of the neighbouring upper face, as well as the lower face. We also study topographic representation of the tongue (and nose) for the first time.

      (3) The vast majority of previous research on brain remapping following hand loss (both congenital and acquired, neuroimaging and electrophysiological) was focused on univariate activity measures, such as the spatial spread of units showing a similar feature preference, or the average activity level across individual units. We are going beyond remapping by using RSA, which allows us to ask not only if new information is available in the deprived cortex (as well as the native face area), but also whether this new information is structured consistently across individuals and groups. We show that representational content is enhanced in the deprived cortex of one-handers whereas it is stable in amputees relative to controls (and to their intact hand region).

      (4) Based on previous studies, the assumption was that reorganisation in congenital one-handers was relatively unspecific, affecting all tested body parts. Here, we provide evidence for a more complex pattern of remapping, with the forehead representation seemingly moving out of the missing hand region (and the nose representation being tentatively similar to controls). That is, we show not just “invasion” but also a shift of the neighbour away from the hand area which has never been documented (or in fact suggested).

      (5) Using Bayesian analyses we provide definitive evidence against a relationship between PLP and forehead remapping, providing first and conclusive evidence against the remapping hypothesis, based on cortical neighbourhood.

      Our inclination is not to add a summary paragraph of these points in our discussion, as it feels too promotional. Instead, we have re-written large sections of the introduction and discussion to better emphasise each of these points separately throughout the text, where the context is most appropriate. Given the public review strategy taken by eLife, the novelty summary provided above will be available for any interested reader, as part of the public review process. However, should the reviewer feel that a novelty summary paragraph is required (or an emphasis on any of the points summarised above), we will be happy to revise the manuscript accordingly.

      Finally, Jon Kaas and colleagues (notably Niraj Jain) have provided evidence in experiments with monkeys that much of the observed reorganization in the somatosensory cortex is inherited from plasticity in the brain stem. Jain did not find an increased propensity for axons to cross the septum between face and hand representations after (simulated) amputation. From this perspective, the relevant proximity would be that of the cuneate and trigeminal nuclei and it would be critical to map out the somatotopic organization of the trigeminal and cuneate nuclei to test hypotheses about the role of proximity in this remapping.

      Thank you for highlighting this very relevant point, which we are well aware of. We fully agree with the reviewer that this is an important goal for future study, but functional imaging of the brainstem in humans is particularly challenging and would require ultra high field imaging (7T) and specialised equipment. We have encountered much local resistance due to hypothetical MRI safety concerns about scanning amputees at this higher field strength, meaning we are unable to carry out this research ourselves. Our former lab member Sanne Kikkert, who is now running her independent research programme in Zurich, has been working towards this goal for the past 4 years. So we can say with confidence that this aim is well beyond the scope of the current study. In response to your comment, we mentioned this potential mechanism in the introduction (lines 98-101), we ensured that we only referred to “cortical proximity” throughout our manuscript, and we circle back to this important point in the discussion.

      Lines 539-543: “Moreover, even if the remapping we observed here goes against the theory of cortical proximity, it can still arise from representational proximity at the subcortical level, in particular at the brainstem level [44,45]. While challenging in humans, mapping both the cuneate and trigeminal nuclei would be critical to provide a more complete picture regarding the role of proximity in remapping.”

      Reviewer #3 (Public Review):

      In their study, the authors set out to challenge the long-held claim that cortical remapping in the somatosensory cortex in hand deprived cortical territories follows somatotopic proximity (the hand region gets invaded by cortical neighbors) as classically assumed. In contrast to this claim, the authors suggest that remapping may not follow cortical proximity but instead functional rules as to how the effector is used. Their data indeed suggest that the deprived hand area is not invaded by the forefront which is the cortical neighbor but instead by the lips which may compensate for hand loss in manipulating objects. Interestingly the authors suggest this is mostly the case for one-handers but not in amputees for whom the reorganization seems more limited in general (but see my comments below on this last point).

      This is a remarkably ambitious study that has been skilfully executed on a strong number of participants in each group. The complementarity of state-of-the-art uni- and multi-variate analyses are in the service of the research question, and the paper is clearly written. The main contribution of this paper, relative to previous studies including those of the same group, resides in the mapping of multiple face parts all at once in the three groups.

      We are grateful to the reviewer for appreciating the immense effort that this study involved.

      In the winner takes all approach, the authors only include 3 face parts but exclude from the analyses the nose and the thumb. I am not fully convinced by the rationale for not including nose in univariate analyses - because it does not trigger reliable activity - while keeping it for representational similarity analyses. I think it would be better to include the nose in all analyses or demonstrate this condition is indeed "noisy" and then remove it from all the analyses. Indeed, if the activity triggered by nose movement is unreliable, it should also affect multivariate.

      Following this comment, we re-ran all univariate analyses to include the nose, and updated throughout the main text and supplemental results and related figures. In short, adding the nose did not change the univariate results, apart from a now significant group x hemisphere interaction for the CoG of the tongue when comparing amputees and controls, matching better the trends for greater surface coverage in the deprived hand ROI of amputees. Full details are provided in our response to Reviewer 1 above.

      The rationale for not including the hand is maybe more convincing as it seems to induce activity in both controls and amputees but not in one-handers. First, it would be great to visualize this effect, at least as supplemental material to support the decision. Then, this brings the interesting possibility that enhanced invasion of hand territory by lips in one-handers might link to the possibility to observe hand-related activity in the presupposed hand region in this population. Maybe the authors may consider linking these.

      Thank you for this comment. As we explain in our response to Reviewer 1 above, we did not intend the thumb condition in one-handers for analysis, as the task given to one-handers (imagine moving a body part you never had before) is inherently different to that given to the other groups (move - or at least attempt to move - your (phantom) hand). As such, we could not pursue the analysis suggested by the reviewer here. To reduce the discrepancy and following Reviewer 1’s advice, we decided to remove the hand-face dissimilarity analysis which we included in our original manuscript, and which might have sparked some of this interest. Upon reflection we agreed that this specific analysis does not directly relate to the question of remapping (but rather of shared representation), in addition to making the paper unbalanced. We will now feature this analysis in another paper that appears more appropriate in the context of referred sensations in amputees (Amoruso et al, 2022 MedRxiv).

      The use of the geodesic distance between the center of gravity in the Winner Take All (WTA) maps between each movement and a predefined cortical anchor is clever. More details about how the Center Of Gravity (COG) was computed on spatially disparate regions might deserve more explanations, however.

      We are happy to provide more detail on this analysis, which weights the CoG based on cluster size (using the workbench command -metric-weighted-stats). Let’s consider the example shown here (Figure 1) for a single control participant, where each CoG is measured either without weighting (yellow vertices) or with cluster weighting (forehead CoG=red, lip CoG=dark blue, tongue CoG=dark red). When the movement produces a single cluster of activity (the lips in the non-dominant hemisphere, shown in blue), the CoG’s location is identical for both weighted (red) and unweighted (yellow) calculations. But other movements, such as the tongue (green), produce one large cluster (at the lateral end), with a few more disparate smaller clusters more medially. In this case, the larger cluster of maximal activity is weighted to a greater extent than the smaller clusters in the CoG calculation, meaning the CoG is slightly skewed towards it (dark red), relative to the smaller clusters.

      Figure 1. Centre-of-gravity calculation, weighted and unweighted by cluster size, in an example control participant. Here the winner-takes-all output for each facial movement (forehead=red, lips=blue, tongue=green) was used to calculate the centre-of-gravity (CoG) at the individual-level in both the dominant (left-hand side) and non-dominant (right-hand side) hemisphere, weighted by cluster size (forehead CoG=red, lip CoG=dark blue, tongue CoG=dark red), compared to an unweighted calculation (denoted by yellow dots within each movements’ winner-takes-all output).

      This is now explained in the methods (lines 760-765) as follows:

      “To assess possible shifts in facial representations towards the hand area, the centre-of-gravity (CoG) of each face-winner map was calculated in each hemisphere. The CoG was weighted by cluster size meaning that in the event of multiple clusters contributing to the calculation of a single CoG for a face-winner map, the voxels in the larger cluster are overweighted relative to those in the smaller clusters. The geodesic cortical distance between each movement’s CoG and a predefined cortical anchor was computed.”
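The effect of cluster-size weighting can be illustrated with a small NumPy sketch. It is a simplified stand-in, not the Workbench computation used in the study: the function name `weighted_cog`, the use of plain vertex coordinates, and the per-vertex cluster labels are all assumptions made here for illustration. Each vertex is weighted by the size of the cluster it belongs to, so the CoG is pulled towards the largest cluster.

```python
import numpy as np


def weighted_cog(coords, cluster_ids):
    """Centre of gravity in which each vertex is weighted by the size of its cluster.

    coords      : (n_vertices, n_dims) coordinates of the vertices in one winner map
    cluster_ids : (n_vertices,) integer label of the cluster each vertex belongs to
    """
    coords = np.asarray(coords, dtype=float)
    cluster_ids = np.asarray(cluster_ids).ravel()
    # counts[inverse] gives, for every vertex, the number of vertices in its cluster
    _, inverse, counts = np.unique(cluster_ids, return_inverse=True, return_counts=True)
    weights = counts[inverse].astype(float)
    return np.average(coords, axis=0, weights=weights)


# One large cluster (3 vertices at x=0) and one small cluster (1 vertex at x=4)
coords = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [4.0, 0.0]]
clusters = [0, 0, 0, 1]
print(weighted_cog(coords, clusters))      # weighted CoG, skewed towards the large cluster
print(np.mean(np.asarray(coords), axis=0)) # unweighted CoG, for comparison
```

In this toy case the unweighted CoG sits at x = 1.0, whereas the weighted CoG moves to x = 0.4, closer to the larger cluster. When a map contains a single cluster, all weights are equal and the two calculations coincide.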

      Moreover, imagine that for some reason the forefront region extends both dorsally and ventrally in a specific population (eg amputees), the COG would stay unaffected but the overlap between hand and forefront would increase. The analyses on the surface area within hand ROI for lips and forehead nicely complement the WTA analyses and suggest higher overlap for lips and lower overlap for forehead but none of the maps or graphs presented clearly show those results - maybe the authors could consider adding a figure clearly highlighting that there is indeed more lip activity IN the hand region.

      We agree with you on this limitation of the CoG and this is why we interpret all cortical distances analyses in tandem with the laterality indices. The laterality indices correspond to the proportion of surface area in the hand region for a given face part in the winner-maps.

      Nevertheless, to further convince the Reviewer, we extracted activity levels (beta values) within the hand region of one-handers and controls, and we ran (as for CoGs) a mixed ANOVA with the factors Hemisphere (deprived x intact) and Group (controls x one-handers).

      As expected from the laterality indices obtained for the Lips, we found a significant group x hemisphere interaction (F(1,41)=4.52, p=0.040, ηp²=0.099), arising from enhanced activity in the deprived hand region in one-handers compared to the non-dominant hand region in controls (t(41)=-2.674, p=0.011) and to the intact hand region in one-handers (t(41)=-3.028, p=0.004).

      Since this kind of analysis was the focus of previous studies (from which we are trying to get away) and since it is redundant with the proportion of face-winner surface coverage in the hand region, we decided not to include it in the paper. But we could add it as a Supplementary result if the Reviewer believes this strengthens our interpretation.

      In addition to overlap analyses between hand and other body parts, the authors may also want to consider doing some Jaccard similarity analyses between the maps of the 3 groups to support the idea that amputees are more alike controls than one-handers in their topographic activity, which again does not appear clear from the figures.

      We thank the reviewers for this clever suggestion. We now include the Jaccard similarity analysis, which quantified the degree of similarity (0=no overlap between maps; 1=fully overlapping) between winner-takes-all maps (which included the nose; akin to the revised univariate results) across groups. For each face part/amputee, the similarity with the 22 controls and 21 one-handers respectively was averaged. We utilised a linear mixed model which included fixed factors of Group (One-handers x Controls), Movement (Forehead x Nose x Lips x Tongue) and Hemisphere (Intact x Deprived) on Jaccard similarity values (similar to what we used for the RSA analysis). A random effect of participant and a covariate of age were also included in the model.
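      As an illustration of the measure only (not of the linear mixed model), the following Python sketch computes a Jaccard index between two winner-takes-all maps and averages one participant's similarity over a comparison group. It assumes each map is represented as the set of surface vertex indices won by a given face part; the function names and toy maps are hypothetical.

```python
def jaccard(map_a, map_b):
    """Jaccard index between two maps given as collections of vertex indices.

    Returns 0.0 for no overlap and 1.0 for identical maps. Two empty maps
    are scored 0.0 here, which is an arbitrary convention for this sketch.
    """
    a, b = set(map_a), set(map_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mean_similarity_to_group(participant_map, group_maps):
    """Average Jaccard similarity of one participant's map to every map in a group."""
    if not group_maps:
        raise ValueError("group_maps must not be empty")
    return sum(jaccard(participant_map, m) for m in group_maps) / len(group_maps)


# Toy example: one amputee map compared with three control maps.
amputee_lips = {1, 2, 3}
control_lips = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
similarity_to_controls = mean_similarity_to_group(amputee_lips, control_lips)
```

      Repeating the same averaging against the one-handers' maps would give, per face part and hemisphere, the pair of values that enter the group comparison.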

      Results showed a significant group x hemisphere interaction (F(240.0)=7.70, p=0.006; controlled for age; Fig. 5), indicating that amputees’ maps showed different similarity values to controls’ and one-handers’ depending on the hemisphere. Post-hoc comparisons (corrected alpha=0.025; uncorrected p-values reported) revealed significantly higher similarity to controls’ than to one-handers’ maps in the deprived hemisphere (t(240)=-3.892, p<.001). Amputees’ maps also showed higher similarity to controls’ maps in the deprived relative to the intact hemisphere (t(240)=2.991, p=0.003). Amputees, therefore, displayed greater similarity of facial somatotopy in the deprived hemisphere to controls, suggesting again less evidence for cortical remapping in amputees.
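      The corrected threshold can be read as follows, assuming (this is not stated in the response) a Bonferroni correction of a family-wise alpha of 0.05 across the two post-hoc comparisons. The helper names below are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison threshold under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_alpha / n_comparisons


def survives_correction(p_uncorrected, family_alpha=0.05, n_comparisons=2):
    """True if an uncorrected p-value falls below the corrected threshold."""
    return p_uncorrected < bonferroni_alpha(family_alpha, n_comparisons)


# Uncorrected p-values as reported above; "p<.001" is entered as its upper bound.
deprived_controls_vs_one_handers = survives_correction(0.001)
controls_deprived_vs_intact = survives_correction(0.003)
```

      Under this assumption both reported comparisons fall below the corrected threshold, consistent with the statement that uncorrected p-values are judged against alpha=0.025.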

      We added these results at the end of the univariate analyses (lines 335-351) and in the discussion (lines 464-465 and 497-500).

      This brings to another concern I have related to the claim that the change in the cortical organization they observe is mostly observed in one-handers. It seems that most of this conclusion relies on the fact that some effects are observed in one-handers but not in amputees when compared to controls, however, no direct comparisons are done between amputees and one-handers so we may be in an erroneous inference about the interaction when this is actually not tested (Nieuwenhuis, 11). For instance, the shift away from the hand/face border of the forehead is also (mildly) significant in amputees (as observed more strongly in one-handers) so the conclusion (eg from the subtitle of the results section) that it is specific to one-hander might not fully be supported by the data. Similar to the invasion of the hand territory from the lips which is significant in amputees in terms of surface area. All together this calls for toning down the idea that plasticity is restricted to congenital deprivation (eg last sentence of the abstract). Even if numerically stronger, if I am not wrong, there are no stats showing remapping is indeed stronger in one-handers than in amputees and actually, amputees show significant effects when compared to controls along the lines as those shown (even if more strongly) in one-handers.

      Thank you for this very important comment. We fully agree – the RSA across-groups comparison is highly informative but insufficient to support our claims. We did not compare the groups directly to avoid multiple comparisons (both for statistical reasons and to manage the size of the results section). But the reviewer’s suggestion to perform a Jaccard similarity analysis complements very nicely the univariate and multivariate results and allows for a direct (and statistically lean) comparison between groups, to assess whether amputees are more similar to controls or to congenital one-handers, taking into account all aspects of their maps (both spatial location/CoG and surface coverage). We added the Jaccard analysis to the main text, at the end of the univariate results (lines 335-385). The Jaccard analysis suggests that amputees’ maps in the deprived hemisphere were more similar to the maps of controls than to the ones of congenital one-handers. This allowed us to obtain significant statistical results to support the claim that remapping is indeed stronger in one-handers than in amputees (lines 346-351). We also compared both amputees and one-handers to the control group. In line with our univariate results, this revealed that the only face part for which controls were more similar to one-handers than to amputees was the tongue (lines 379-381), and that the forehead remapping observed at the univariate level in amputees (surface area) is likely to arise from differences in the intact hemisphere (lines 381-383).

      Finally, we also added the post-hoc statistics comparing amputees to congenitals in the RSA analysis (lines 425-427): “While facial information in the deprived hand area was increased in one-handers compared with amputees, this effect did not survive our correction for multiple comparisons (t(70.7)=-2.117, p=0.038).”

      Regarding the univariate results mentioned by the reviewer, we would like to emphasise that we had no significant effect for the lips in amputees, though we agree the surface area appears in between controls and one-handers. But this laterality index was not different from zero. This test is now added on lines 189-190. Regarding the forehead, we fully agree with the Reviewer, and we adjusted the subtitle accordingly (lines 241-242). For consistency, we also added the t-test vs zero for the forehead surface area (non-significant, lines 251-253).

      Also, maybe the authors could explore whether there is actually a link between the number of years without hand and the remapping effects.

      To address this question, we explored our data using a correlation analysis. The only body part that showed some suggestive remapping effects was the tongue, and so we explored whether we could find a relationship (Pearson’s correlation) between years since amputation and the laterality index of the Tongue in amputees (r = 0.007, p=0.980, 95% CI [-0.475, 0.475]). We also explored amputees’ global Jaccard similarity values to controls in the deprived hemisphere (r = -0.010, p=0.970, 95% CI [-0.488, 0.473]), and could not find any relationship. Considering there was no strong remapping effect to explain, we find this result too exploratory to include in our manuscript.

      One hypothesis generated by the data is that lips remap in the deprived hand area because lips serve compensatory functions. Actually, also in controls, lips and hands can be used to manipulate objects, in contrast to the forehead. One may thus wonder if the preferential presence of lips in the hand region is not latent even in controls as they both link in functions?

      We agree with the reviewer’s reasoning, and we think that the distributed representational content we recently found in two-handers (Muret et al, 2022) provides a first hint in this direction. It is worth noting that in that previous publication we did not find differences across face parts in the activity levels obtained in the hand region, except for slightly more negative values for the tongue. But we do think that such latent information is likely to provide a “scaffolding” for remapping. While the design of our face task does not allow us to assess information content for each face part (as done for the lips in Muret et al, 2022), this should be further investigated in follow-up studies.

      We added a sentence in the discussion to highlight this interesting notion: Lines 556-559: “Together with the recent evidence that lip information content is already significant in the hand area of two-handed participants (Muret et al, 2022), compensatory behaviour since developmental stages might further uncover (and even potentiate) this underlying latent activity.”

    1. Author Response

      Reviewer #1 (Public Review):

      It has previously been shown that deletion of the GluA3 subunit in mice leads to alterations in auditory behavior in adult mice that are older than a couple of months of age. The GluA3 subunit is expressed at several synapses along the auditory pathway (cochlea and brainstem), and in ko mice changes in brainstem synapses have been observed. These previously documented changes may account for some of the deficits in hearing in adult ko mice.

      In the current study, the authors investigate an earlier stage of development (at 5 wks) when the auditory brainstem responses (ABRs) are normal, and they ask how transmission persists at inner hair cell (ihc) ribbon synapses in GluA3 ko mice. They discovered that deletion of GluA3 significantly changed 1) the relative expression of GluA2 (dramatically downregulated) and GluA4 subunits at SGN afferents, and 2) caused morphological changes in ihc ribbons (modiolar side) and synaptic vesicle size (pillar).

      The changes documented in the 5 wk old GluA3ko mice were not necessarily predicted because in general the mechanisms involved in shuffling GluA receptors at this synapse (or other sensory synapses) are not completely understood; furthermore, much less is known about the role of differentiation of ihc-sgn synapses along a modiolar-pillar axis. With that said, the only shortcoming of the study is a lack of explanation for the observed changes in the synaptic structure; but this is not specific to this study.

      Given the quality of the data and the clarity of presentation of results, this is a very valuable study that will aid and motivate researchers to further explore how auditory circuitry develops, and becomes differentiated, at the level of ihc-sgn synapses.

      We thank the reviewer for the positive and helpful comments. Ongoing studies are seeking to explain the observed changes in synapse structure.

      Reviewer #2 (Public Review):

      The goal of the study by Rutherford and colleagues was to characterize functional, structural, and molecular changes at the highly specialized cochlear inner hair cell (IHC) - spiral ganglion neuron (SGN) ribbon synapse in GluA3 AMPA receptor subunit knockout mice (GluA3KO). Previous work by the authors demonstrated that 2-month-old GluA3KO mice experienced impaired auditory processing and changes in synaptic ultrastructure at the SGN - bushy cell synapse, the next synapse in the auditory pathway.

      In the present study, the authors investigated whether GluA3 is required for ribbon synapse formation and physiology in 5-week-old mice using a series of functional and light- and electron microscopy imaging approaches. While deletion of GluA3 AMPAR subunit did not affect hearing sensitivity at this age, the authors reported that cochlear ribbon synapses exhibited changes in the molecular composition of AMPARs and pre- and postsynaptic ultrastructural alterations. Specifically, the authors demonstrated that GluA3KO ribbon synapses exhibit i) a global reduction in postsynaptic AMPARs, which is also reflected by smaller AMPAR arrays, ii) a reduction in GluA2 and an increase in GluA4 protein expression at individual postsynaptic sites, and iii) changes in the dimensions and morphology of the presynaptic specialization ("ribbon") and in the size of synaptic vesicles. These reported structural changes are linked to the side of innervation with respect to the IHC modiolar-pillar axis.

      The results presented by the authors are conceptually very interesting as the data support the notion that potentially detrimental changes in the molecular composition of a sensory synapse can be compensated to sustain synaptic function to a certain extent during development. The conclusions of the study are mostly well supported by the data, but some experimental details or control experiments are missing or need to be clarified to allow a full assessment.

      1) The authors tested which GluA isoforms are expressed in SGNs of GluA3KO mice and reported that only GluA2 and GluA4, and not GluA1, receptor subunits are present in the cochlea. It is, however, a bit difficult to understand why immunolabelling for GluA1 was only performed on brainstem sections (Fig. 1B right) and not in the cochlea to probe for postsynaptic localization at ribbon synapses as it was done for the other isoforms (Fig. 2 and 6) given that GluA3KO IHCs exhibited a larger number of ribbons that lacked GluA2 and 3 (lone or 'orphaned' ribbons; Fig. 6B). It is also not clear why immunolabelling for GluA2 and 4 was performed to probe for expression of these receptor subunits on SGN cell bodies in the cochlear spiral ganglion. Which neurons are expected to synapse onto these somata?

      There is precedent for expression of GluA subunits in the SGN cell bodies reflecting expression at the synapse, although it is not clear if any of that immunoreactivity reflects cell surface expression in the intact ganglion or if it represents solely intracellular subunits being trafficked to synapses.

      Figure 1b shows that GluA2 is expressed in the somata of WT mice and KO mice. The lower panels show that GluA1 is not expressed in the somata of WT or KO mice. The right panels show that while GluA1 is expressed in the cerebellum of WT and KO mice, it is not expressed in the cochlear nucleus of WT or KO mice. We think this demonstrates the lack of compensation by GluA1 in the GluA3 KO.

      We have now added GluA4 immunoreactivity in the SGNs to Fig. 1, for completeness. In our experience, GluA subunits expressed at synapses are also found in the cell bodies, and GluA subunits not expressed at synapses are not found in the cell bodies. The current data is consistent with this, although we did not label GluA1 in the organ of Corti.

      2) The authors state in the text that GluA3 expression is completely abolished in GluA3KO IHCs, however, there appears to still be a faint punctate immunofluorescence signal visible when an antibody directed against GluA3 was used (Fig. 2C). Providing additional information on the specificity of this (and the other) antibodies used in the study would be helpful.

      We agree, and thank the reviewer for pointing this out. There is indeed a small signal presumably due to cross-reactivity of the anti-GluA3 with GluA2 subunits, because the cytoplasmic epitope recognized by the antibody is in a region of high similarity of GluA2 and GluA3 (Dong et al., 1997). In addition, the specification sheet of the Santa Cruz company states that the GluA3 antibody can detect GluA2. This relatively small cross-reactivity is noted now in the text on p. 9. Also, this appearance was a product of the same brightness and contrast issue noted above in the response to the editor’s summary. Upon readjustment, the signal is less apparent, because in the readjustment we used less brightness and less contrast enhancement to avoid the unwanted saturation in some of the panels.

      3) The authors reported changes in the volume of the presynaptic ribbon and postsynaptic density surface area in GluA3KO animals. The EM data as presented are however not sufficiently convincing.

      i) There appears to be a mismatch between the EM data shown in Fig. 3 and 4 and the information in the text with respect to the number of data points in the plots and the reported number of reconstructed synapses. This raises several questions with respect to the analysis. For instance, it is unclear whether certain synapses were reconstructed but excluded from the analysis. If so, what were the exclusion criteria?

      We thank the reviewer for pointing out this discrepancy within the text and the figures. The discrepancies are now fixed. We have added more information on how the synapses were reconstructed in the M&M (p.14-15).

      ii) The authors compare PSD surface areas in reconstructions from 3D serial sections, but for some of the shown reconstructions (i.e. Fig. 3A' and B' and 4B'), it appears as if PSDs were only incompletely reconstructed.

      We included all the ultrathin sections that show afferent dendrites with a visible PSD. We revised all the reconstructions and fixed some misalignments. The appearance of the reconstructed PSD relates to how the Reconstruct software creates the 3-D rendering. We did not use any extra software to smooth the edges of the 3D reconstructions.

      4) The immunolabelling experiments shown in Fig. 2 and 6 are of very high quality and the quantitative analysis of the light microscopy data (Fig. 6-9) is clearly very detailed, but slightly difficult to interpret the way it is presented. Specifically, it is unclear how the number of synapses per IHC (Fig. 6B) and the separation into modiolar and pillar side (Fig. 8) was achieved based on the shown images without the outlines of individual cells being visible.

      We agree. Please see the revised Figs. 2, 6, and 8, and explanation in the figure legend of Fig. 8.

      5) Adding more detailed information about important parameters (mean, N/n, SD/SEM) and the statistical tests used for the individual comparisons presented in the Figures would help strengthen the confidence in the presented data.

      Please see the new spreadsheets accompanying the revised manuscript.

      6) In general, the authors report a series of molecular and structural changes in IHCs and reach the conclusion that GluA3 subunits may have a role in "trans-synaptically" determining or organizing the architecture of both the pre- and post-synapse. However, some of the arguments are very speculative and many of the claims are not supported by experimental data presented in the paper. The authors should consider to also compare their findings to studies that investigated ultrastructural changes of AMPAR subunit knockouts in other synapse types, and discuss alternative interpretations (e.g. homeostatic changes).

      Thank you for this comment. Considering that reviewer 1 asked for more speculation, we have decided to leave the level of speculation similar to the initial submission. However, we went through the text to make sure our claims were backed by our observations.

      Due to space constraints, rather than comparing to additional other synapses, in this context we prefer to compare with auditory brainstem synapses.

      We have now added the possibility of homeostatic changes on p. 29.

    1. Author Response

      Reviewer #1 (Public Review):

      With a real interest, I read the manuscript entitled "Sex-specific effects of an IgE polymorphism on immunity susceptibility to infection and reproduction in a wild rodent", written by Wanelik and colleagues. Actually, I am impressed with each and every part of this work. This study is very well designed and answers intriguing scientific questions. The study is multilayer and multidimensional and goes far beyond a genomic association as it deeply addresses, to mention only those most important, ecological, parasitological, immunological, and gene expression aspects. In addition to studying the free-living animal community of voles, it utilizes this opportunity to get some insights into the genetics and biology of the high-affinity IgE receptor not possible to be gained in studies performed in humans or standard laboratory animals. The data are presented in a very elegant way and the article is really nicely written.

      We thank the Reviewer for these positive comments, and are very glad to hear they think our work is so comprehensive.

      Reviewer #2 (Public Review):

      In this manuscript, Wanelik et al. use a wild rodent population to test if a polymorphism in a receptor for immunoglobulin E (IgE) affects immune responses, resistance to infection, and fitness. Finding such effects would imply that polymorphisms in immune genes can be maintained by antagonistic pleiotropy between sexes, which has important implications for our understanding of how genetic variation is maintained. The work presented here extends previous work by the same group where they have shown that expression of GATA3 (a transcription factor inducing Th2 immune responses) affects tolerance to ectoparasites and that polymorphism in Fcer1a affects the expression of GATA3. The present study is based on a fairly large data set and comprehensive analysis of a number of different traits. Indeed, the authors should be commended for investigating all steps in the chain polymorphism→immune response→resistance→fitness. Unfortunately, the presentation of the methodology is a bit confusing. Moreover, most of the key results are only marginally significant.

      We thank the Reviewer for their positive feedback, and are very glad to hear they think our work is so comprehensive. As detailed below, we have tried to clarify our methodology and to temper our claims in the revised manuscript.

      As regards methodology, I was confused by the differential expression (DE) analyses presented in fig 1A. First, it took a while to understand that these were based on a comparison of unstimulated cells (i.e. baseline expression), not ex vivo stimulated cells; this should be made explicit in conjunction with the presentation of the results. Second, it would be good to clarify (and motivate) in the Results that you compare individuals with at least one copy of the GC haplotype against the rest, i.e. a dominant model.

      We apologise for the confusion. We now explicitly state in the Results (lines 313-314) that the DGE analysis was based on unstimulated splenocytes: “Differential gene expression (DGE) analysis performed on unstimulated splenocytes taken from 53 males and 31 females assayed by RNASeq”. We also explicitly state “Unstimulated immune gene expression” in the legend for Figure 1.

      Please note that an additive model was used for all analyses run using the hapassoc package (macroparasites and SOD1). A dominant model was used in the DGE analysis and in other analyses where it was not possible to use the hapassoc package (gene expression assayed by Q-PCR, microparasites and reproductive success), which meant that only those individuals for which haplotype could be inferred with certainty could be included (i.e. a smaller dataset). Our use of the dominant model in the DGE analysis is now more explicitly explained on lines 933-935: “Only those individuals for which haplotype could be inferred with certainty could be included (n = 53 males and n = 31 females; none of which were known to have two copies of the GC haplotype hence the choice of a dominant model).” And its use in other non-hapassoc analyses is now explicitly stated on lines 991-992: “as in the DGE analysis, genotype was coded as the presence or absence of the GC haplotype (i.e. a dominant model)”.
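      The difference between the two codings can be sketched in a few lines of Python. This is purely illustrative: hapassoc is an R package and is not called here, the function name is hypothetical, and the sketch assumes the number of GC haplotype copies per individual is known with certainty.

```python
def code_genotype(gc_copies, model):
    """Encode the number of GC haplotype copies (0, 1 or 2) as a model predictor.

    additive : predictor is the copy number itself (0, 1 or 2)
    dominant : predictor is presence (1) or absence (0) of at least one copy
    """
    if gc_copies not in (0, 1, 2):
        raise ValueError("gc_copies must be 0, 1 or 2")
    if model == "additive":
        return gc_copies
    if model == "dominant":
        return int(gc_copies >= 1)
    raise ValueError("model must be 'additive' or 'dominant'")


# With no individuals known to carry two copies, the two codings coincide
# on the observed genotypes (0 or 1 copies).
observed_copies = [0, 1, 0, 1]
additive = [code_genotype(n, "additive") for n in observed_copies]
dominant = [code_genotype(n, "dominant") for n in observed_copies]
```

      The codings differ only for individuals with two copies, which is why a dataset with no known two-copy carriers cannot distinguish between them.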

      The first key result is that polymorphisms in Fcer1a have sex-specific effects on the expression of pro- and anti-inflammatory genes in males and females. However, the GSEA analyses (fig 1A) show that the GC haplotype has positive effects on the expression of both pro- and anti-inflammatory gene sets in both sexes - albeit with a stronger effect of proinflammatory genes in males and anti-inflammatory genes in females - but there is no formal evidence for an effect of genotype by sex. I am not sure how to test for interaction with GSEA (or if it is at all possible), so it would be good to complement the GSEA with other analyses (perhaps based on PCA?) of these data to provide more formal evidence for an effect of genotype by sex.

      It is not possible to provide formal evidence for an effect of genotype by sex in the DGE analysis/GSEA. Instead, we have tried to temper our claims about sex-specific effects (please see below for further details).

      Some more evidence of a sex-specific effect of Fcer1a genotype is actually provided by analyses of the expression of 18 immune genes in ex vivo stimulated T cells. Here, a sex-specific effect of Fcer1a genotype was found on the expression of one of 18 measured immune genes, the cytokine IL17a. However, Fcer1a is as far as I am aware not expressed by T cells, so the relevance of these results is unclear. Moreover, it is unclear why these 18 genes were analyzed one by one, rather than by some multidimensional approach (e.g. PCA).

      The Reviewer is right that Fcer1a is not generally considered to be expressed by T cells. However, the stimulation could have indirect effects. We have clarified this on lines 801-804: “Although Fcer1a is not expressed by T-cells themselves, polymorphism in this gene could be acting indirectly on T-cells through various pathways, including via cytokine signalling, following expression of Fcer1a by other cells”.

      The 18 immune genes were specially selected because they represent different immune pathways and are expected to have limited redundancy. This is why individual tests were performed (followed by a correction for multiple testing) rather than using a multidimensional approach like PCA. This is now explicitly explained in the Methods on lines 804-808: “The choice of our panel of genes was informed by…(iii) the aim of limited redundancy, with each gene representing a different immune pathway” and on lines 1031-1032: “We did not use a multidimensional approach (such as principal component analysis) because of limited redundancy in our panel of genes.” and in the Results on lines 363-366: “we used an independent dataset for males and females whose spleens were stimulated with two immune agonists and assayed by Q-PCR (for a panel of 18 immune genes with limited redundancy); see Methods for how these genes were selected.”

      The second key result is that Fcer1a genotype has sex-specific effects on resistance to parasites, but this is based on a marginally significant effect as regards one of three tested pathogens.

      We acknowledge that this is a marginally significant result and have acknowledged this in the text on line 428 of the Results section.

      The third key result is that Fcer1a genotype has sex-specific effects on reproductive fitness. However, this is based on a marginally significant effect in males only, and a formal test for sex by genotype could not be performed (and since the direction of the effect was similar in females it is doubtful whether there would be an effect of sex by genotype; see fig 1C).

      Thus, while the results presented here are clearly indicative of sex-specific effects of an immune gene polymorphism, I think it is too early to actually claim such effects.

      We understand the Reviewer’s concerns about the overall lack of formal evidence for an effect of genotype by sex. As we are not able to provide this for the DGE analysis, GSEA (see above), or for the reproductive success analysis, we have tempered our claims about sex-specific effects (as suggested by the Reviewer). We have done this by removing the term “sex-specific effect” throughout the manuscript, including in the title. We now focus more heavily on the multiple effects we have shown across different phenotypic traits, and use the term “sex-dependent effects” or describe effects as “differing between sexes” sparingly, and only where necessary. These changes have been made throughout the manuscript, but more so in the introduction where the narrative has been substantially reworked to lay out this change in focus.

      Reviewer #3 (Public Review):

      This is a well-replicated study: the authors sampled over a thousand field voles (Microtus agrestis), over three years at seven different sites, with a combination of cross-sectional and longitudinal sampling. The authors compared individuals carrying the GC haplotype (<10% of the population) of the high-affinity immunoglobulin receptor gene (Fcer1a). They recorded parasite infections (Babesia, Bartonella, ticks, fleas, gastrointestinal helminths), expression levels of inflammatory and immune genes using transcriptomes and quantitative PCR, and genotype and pedigree.

      We thank the Reviewer for their positive feedback, and are very glad to hear they think our work is well replicated.

      A comparison of overall gene expression between GC-carrying and all other voles indicated two sex-dependent differences, the expression in males of Il33, which is associated with antihelminthic responses, and in females of Socs3, which is implicated in regulating immune responses. One substantial issue with the authors' interpretation of these data is to attribute Il33 to the inflammatory response - this taints the rest of their interpretation (e.g., Fig 1A, see below); instead, this is a key cytokine of the antihelminthic Th2 response and its detection suggests there might be a difference in helminth infection between the haplotypes - which is consistent with the role of IgE. Therefore, the authors would need to explore further how the GC haplotype, IgE, and parasite burdens might be driving the expression of IL-33. Specifically, the authors did not control for potential confounding effects of infection, which might be expected to differ based on the rest of their data.

      We acknowledge the difficulty in grouping genes under single GO terms, and the need for more nuance when describing these classifications. No gene set is perfect and immune networks are highly complex, so the same gene can be grouped into multiple gene sets. IL33 is an example of this – it appears in the GO term GO:0050729 (positive regulation of inflammatory response) but, as the Reviewer points out, is also commonly associated with the antihelminthic Th2 response. We have edited the text in the Results (on lines 322-324 and lines 350-352) to communicate this nuance, as well as adding references to support each of these associations: “Il33 is commonly associated with anti-helminthic response [25] and Socs3 with regulation of the immune response more broadly [26]….Both Il33 and Socs3 also share an association with the inflammatory response [26,27]. While Il33 positively regulates this response (appearing in the gene set GO:0050729), Socs3 negatively regulates it (GO:0050728).” References added:

      1. Liew FY, Pitman NI, McInnes IB. Disease-associated functions of IL-33: The new kid in the IL-1 family. Nat Rev Immunol. Nature Publishing Group; 2010;10: 103–110. doi:10.1038/nri2692
      2. Carow B, Rottenberg ME. SOCS3, a major regulator of infection and inflammation. Front Immunol. 2014;5: 1–13. doi:10.3389/fimmu.2014.00058
      3. Cayrol C, Girard JP. IL-33: An alarmin cytokine with crucial roles in innate immunity, inflammation and allergy. Curr Opin Immunol. Elsevier Ltd; 2014;31: 31–37. doi:10.1016/j.coi.2014.09.004

      We have also run an extra DGE analysis including cestode burden as a covariate (cestodes being the most prominent helminth infection in terms of biomass), to check whether IL33 still emerges as a top-responding gene in males (see Appendix 1-table 4 & 5). We found that it did (in fact the signal was even stronger), indicating that the differences in Il33 expression are not being driven by differences in cestode infection. We now mention this additional analysis in the text: “Given the link between Il33 and the antihelminthic response (and more generally, IgE-mediated responses and the antihelminthic response), we repeated the DGE analysis while controlling for cestode burden, but this had little effect on our results (same top-responding immune genes; see Appendix 1—table 4 & 5), suggesting that these effects were not driven by differences in cestode infection”. This is consistent with our finding that there is no difference in macroparasite burden (including cestode burden) between individuals with and without the GC haplotype (see Appendix 1—table 11) and lines 449-451: “However, we found no effect of the haplotype (interactive or not) on the probability of infection with the other parasites in our population”.

      We have also included the following caveat in our discussion on lines 540-542: “Some of the differences in immune phenotype that we observed may also be driven by difference in parasite infection (although we accounted for cestode burden in a follow-up analysis, we cannot rule this out).”

      Among a narrow panel of immune genes measured in ex vivo settings, the authors reported elevated expression of Il17a, which is associated with inflammatory, antibacterial responses. Of note, the panel of genes they measured did not contain antihelminth effectors beyond the transcription factor GATA3, and therefore could not confirm the expression of IL-33 observed in the transcriptomes. However, the expression of IL-17a appears consistent with the elevated activity of antioxidant SOD1.

      In response to this comment, we now point out more clearly that our panel of genes did not include Il33 or Socs3, but did include other inflammatory genes including Il17a, Ifng, Il1b, Il6 and Tnfa.

      Somewhat unexpectedly given the authors' claim that in males the GC haplotype is prone to a more inflammatory immune phenotype, it had no effect on infection in that sex. However, the identity of the genes and pathways matter and the authors do not provide sufficient detail to evaluate their interpretation (GSEA analysis and Figure 1A).

      Barcode plots, such as the one we include in Figure 1A, are commonly used representations of GSEA results. In order to aid interpretation for those who are unfamiliar with barcode plots, we have included some more information in the legend of Figure 1.

      An intriguing and potentially important finding is that males carrying the GC haplotype appeared to have fewer offspring (little to no effect detected in the females). To confirm whether the effect of the haplotype is direct or mediated by other factors, it would be useful to test how other covariates, like infection, might contribute to this.

      To explore this possibility, we have run extra GLMs for both females and males which include two parasite variables: proportion of samples taken from an individual that tested positive for Babesia and proportion of samples taken from an individual that tested positive for Bartonella. We found no difference in the main results – males with the GC haplotype still have fewer offspring, suggesting that infection is not acting as a confounder.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We really appreciate the reviewers’ insightful comments, which helped improve the quality of this work. We have responded to the reviewers’ questions/comments point by point in the following text and made the corresponding changes in the revised manuscript. Lastly, we added one more figure (Fig. 7) with lineage tracing experiments demonstrating the conversion of id2a+ liver ductal cells to hepatocytes under conditions of extreme hepatocyte loss.

      Reviewer #1 (Evidence, reproducibility and clarity):

      Mi and Andersson describe a method for creating efficient 3' knock-ins in zebrafish using a combination of end-modified dsDNA and Cas9/gRNA RNPs. They tested their method on four genetic loci where they introduced Cre recombinase endogenously, and obtained high F0 mosaicism and germline transmission. The authors included fluorescent proteins with self-cleaving peptides to determine that endogenous expression patterns are observed. By crossing their knock-in Cre lines with lineage tracing reporter lines, the authors temporally traced lineage divergences in zebrafish liver and pancreas.

      The authors should clarify the following points before I can recommend publication:

      Overall, I suggest that the authors consider paring down their figures. Throughout the paper, multiple figure panels convey the same point but for different genes. Furthermore, many construct configurations are shown that are not used in the subsequent panels. For example, the mNeonGreen only (no Cre) constructs and the EGFP constructs are largely not used in downstream experiments. The authors could pick the important constructs and show the relevant data, and summarize all their other constructs in one supplementary figure. The authors also jump around in different parts of the paper with regards to using iCre or CreERT2 and ubi:Switch or ubi:CSHm. It's not clear to me why they're doing that? It makes the paper hard to follow. For example, why use iCre - it's not temporal if I understand correctly (and I'm not sure what improved Cre is - could they reference a paper and include a small explanation) so CreERT2 seems suitable especially for their temporal lineage tracing experiments. Why not limit the description to CreERT2 in the main text/figures? Also, isn't ubi:Switch and ubi:CSHm pretty similar except the latter is nuclear mCherry due to H2B? Why not only focus on ubi:CSHm experiments? I found the paper to be unnecessarily long and think it would benefit from editing to describe the most important concepts and experiments.

      Response: Thank you for your constructive and helpful comments. We agree that the schematic constructs sometimes seem redundant. This is because the krt4, nkx6.1, and id2a genes have similar gRNA targeting sites (all spanning the stop codon). However, we prefer to keep these schematics because the subsequent figure panels show the statistical results on knock-in efficiency for each construct. This layout allows readers to make comparisons and better understand the efficacy of the method. In line with the comments from the second reviewer, we have also added more detailed information to the schematics, including the sequence and length of the short left and right homologous arms, so that readers can follow the strategy more easily. In addition, we added a new supplementary figure with the sequences of the long left and right homologous arms, as well as the genetic cassettes/point mutations for the krt92 knock-in (Figure EV1).

      As for the color switch lines we used, we appreciate your comments and have replaced Fig. 5E-G with new fluorescent images using zebrafish larvae carrying the ubb:CSHm transgene. For most of the lineage tracing experiments in this study, we used Tg(ubb:CSHm), as the H2BmCherry is more stable, located in the nucleus, and has stronger fluorescence intensity than the reporter in Tg(ubb:Switch). However, for the lineage tracing experiments in the liver injury model, we believe that Tg(ubb:Switch) is a better option than Tg(ubb:CSHm). In the absence of a hepatocyte-specific far-red reporter line, we can distinguish the hepatocytes derived from the id2a+ origin using the Tg(ubb:Switch) line, as the cells with Cre recombination express mCherry in the cytoplasm; i.e., we can tell the cell types apart based on cell morphology in combination with the ductal anti-vasnb staining. This strategy was previously used by Dr. Donghun Shin’s group in their 2014 Gastroenterology paper (Figure 4B, DOI: 10.1053/j.gastro.2013.10.019). Therefore, we kept ubb:Switch in the Fig. 1F schematic, and we have elaborated on why we chose the Tg(ubb:Switch) line for the id2a+ cell conversion experiments in Fig. 7 and Figure EV14.

      The iCre we used is a codon-improved Cre (iCre). The original cDNA sequence was from pDIRE (Addgene plasmid #26745; provided by Dr. Rolf Zeller, University of Basel) (Osterwalder et al., 2010).

      At the beginning of this project, we did not know whether iCre and CreERT2 would differ in labeling the cells of interest. Here, using both the iCre and CreERT2 lines, we formally show, for the first time, the developmental lineage path of nkx6.1-expressing cells in the zebrafish pancreas. Our data suggest that the early nkx6.1-expressing cells are multipotent pancreatic progenitors giving rise to all three major cell types in the pancreas (endocrine, ductal and acinar cells, shown by the nkx6.1 knock-in iCre) and that the nkx6.1-expressing cells gradually become restricted to the ductal/endocrine lineages (shown by the nkx6.1 knock-in CreERT2 treated with 4-OHT at different timepoints). In addition, we aim to use these knock-in lines for multiple studies in which we need to perform many quantitative experiments. As expected, we are unable to reach 100% labeling using the knock-in CreERT2 lines, even when we treat the larvae with a very high concentration of 4-OHT over a long period of time. This means that CreERT2-induced recombination will introduce more variation into quantitative experiments (for instance, the number of regenerated beta-cells of ductal origin). As we were quite confident in the efficiency of this knock-in strategy, we decided to make both iCre and CreERT2 lines at the krt4, nkx6.1, and id2a loci and observe how they performed. We often use iCre knock-in lines for lineage tracing experiments because they reach near 100% labeling efficiency. Such iCre lines are particularly useful if they only label terminally differentiated cell types. Thus, the near 100% labeling efficiency of the iCre lines can be of great help for initial experiments, which can later be confirmed by temporal labeling using CreERT2 lines.

      1. Could the authors describe the purpose of the 5'AmC6 modification earlier in the paper? I didn't see much text about it until the discussion. It seems that the speculation is that it provides end protection and prevents degradation (based on in vitro studies in human). This should be inserted into the introduction as a reader might be wondering about this and won't find an answer until near the end. Also, is this the first in vivo use of this modification for knock-ins? If so, that should be highlighted in the text.

      Response: This is a helpful comment. In the revised manuscript, we elaborate on why we chose the 5'AmC6 modification for our donors. To our knowledge, this is the first time this 5’ modification has been used in vivo; however, a bulky 5’ modification (5'Biotin - 5x phosphorothioate bonds) has been used in medaka (DOI: https://doi.org/10.7554/eLife.39468.001, 2018 eLife, as we previously referenced). The cell division rate is much faster in zebrafish embryos than in medaka embryos during early development, so we speculate that such a modification might be of more importance in zebrafish to achieve early integration. Another advantage is that the 5'AmC6 modification is commercially available, allowing researchers to prepare the donor dsDNA conveniently. We have now expanded on these details and advantages in the introduction.

      1. The authors do not show any sequencing data confirming that their insert was knocked-in as designed with no disruption to the immediate upstream and downstream endogenous sequences. Can they sequence the loci to confirm?

      Response: This is indeed a question we frequently get – thank you for prompting us to relay this information more clearly! We have put the raw Sanger sequencing data in a public repository (mentioned in the Data Availability section), and included the sequencing primers in the Methods. We now also refer to these data in the discussion section, where we highlight that the integrations were correctly placed in the loci. If you think there are better ways to show the sequencing results, please let us know.

      1. I found the descriptions of the long and short HA to be confusing when describing the results, especially since the first tested gene krt92 only has long and all subsequent ones are short. The discussion made it more clear that short HA is more efficient and applicable when gRNAs span the stop codon. Perhaps that wasn't possible with krt92, but the authors could prevent the confusion by clearly stating the design requirements of long and short HA and that they wanted to test which is more efficient before starting to describe the data. I also didn't see a description of what the length difference between long and short HA is? How short is short HA?

      Response: This is a great question that is well worth discussing. In the revised manuscript, we changed the order in which the parts are described, with nkx6.1 knock-in in front of krt4 knock-in. Here we explain why we would like to do that:

      At the beginning of this project, we did not know whether the 5’ modified dsDNA could be an effective donor. To test our hypothesis, we chose the krt92 gene as our first target, as it encodes a keratin protein expressed in epithelial cells. We can easily detect the fluorescence in the epithelial cells (most notably in the skin), which allows us to sort the F0 mosaic embryos with a high percentage of integration. Notably, in our experience, the most difficult part of the knock-in method is the sorting step (usually performed during 1-3 dpf). This is because the fluorescence signal is highly dependent on the endogenous gene expression level and is usually dimmer, with an overall integration efficiency that is lower than in canonical transgenesis. Therefore, we thought that targeting an epithelial cell marker would be informative and help us evaluate the validity and reproducibility of the method. If it worked, we could then move on to targeting genes expressed in more restricted tissues or cell types. For the krt92 gene, the gRNA targets the region upstream of the stop codon. To prevent cleavage of the donor template, we had to introduce several point mutations while keeping the amino acid sequence intact. However, such mutations can restrict the knock-in and lower the integration efficiency when using shorter arms (due to the sequence mismatch).

      After we managed to make the krt92 knock-in, our next question was whether we could use a gRNA spanning the stop codon region. In this way, we would not need to introduce point mutations in either the left or the right homologous arm. Also, for the purpose of our biological study, nkx6.1 was at the top of our gene list for lineage tracing experiments, and we fortunately identified a very good gRNA targeting this locus. After we successfully made the nkx6.1 knock-in, we thought that we could simplify the protocol even further, i.e., by switching to short homologous arms so that we could prepare the donor by a one-step PCR instead of making complicated constructs. We tested that hypothesis at the nkx6.1, krt4, and id2a sites and obtained very promising efficiency. We also did some further testing with dsDNA without the 5’ modifications and showed that the 5’ modifications indeed greatly increased integration efficiency. Therefore, although the short homologous arm method is a highlight here, we also point out that it was not planned from the beginning. In the revised manuscript, we want to convey our method in a logical way and show how we modified it step by step.

      Moreover, in response to the comments from the second reviewer, we have now added the length of the homologous arms as well as the mutation site to the schematics. We chose short homologous arms because previous literature suggested that short homologous arms (36-48 bp, which we now write out in both the results and the methods) can promote microhomology-mediated end joining (doi: 10.1096/fj.201800077RR). We also noticed that the recent GeneWeld method (DOI: 10.7554/eLife.53968) adheres to a similar length for homology-mediated integration. In this study, HAs even shorter than 36 bp also perform well.
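      As an illustration only, the one-step PCR donor design with short homologous arms described above can be sketched in Python. The function names, the 20-nt annealing length, and the choice to start the right arm immediately after the endogenous stop codon are assumptions made for this sketch and are not taken from the manuscript; the 5' AmC6 modification is added at oligo synthesis and is not represented in the sequence strings.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def design_short_ha_primers(genomic, stop_start, cassette, ha_len=40, anneal_len=20):
    """Build PCR primers that add short homology arms to a knock-in cassette.

    genomic    -- sense-strand sequence around the stop codon (hypothetical input)
    stop_start -- 0-based index of the first base of the endogenous stop codon
    cassette   -- sense-strand insert, e.g. a 2A-reporter-2A-Cre cassette
    ha_len     -- homology arm length in bp (the response cites 36-48 bp)
    anneal_len -- bases of each primer that anneal to the cassette (assumed)
    """
    genomic = genomic.upper()
    cassette = cassette.upper()
    if genomic[stop_start:stop_start + 3] not in STOP_CODONS:
        raise ValueError("stop_start does not point at a stop codon")
    if stop_start < ha_len or stop_start + 3 + ha_len > len(genomic):
        raise ValueError("not enough flanking sequence for the requested arm length")
    if anneal_len > len(cassette):
        raise ValueError("anneal_len is longer than the cassette")

    # Left arm: coding sequence ending just before the stop codon,
    # so the cassette is fused in frame to the endogenous gene.
    left_ha = genomic[stop_start - ha_len:stop_start]
    # Right arm (assumption): sequence starting right after the stop codon.
    right_ha = genomic[stop_start + 3:stop_start + 3 + ha_len]

    forward = left_ha + cassette[:anneal_len]
    reverse = reverse_complement(cassette[-anneal_len:] + right_ha)
    return forward, reverse
```

      With a toy locus such as `design_short_ha_primers("ACGTACGTACTAAGGCCTTAAGG", 10, "ATGCCCAAATTTGGG", ha_len=4, anneal_len=6)`, the forward primer carries the four bases upstream of the stop codon followed by the start of the cassette, and the reverse primer is the reverse complement of the cassette end plus the four bases downstream of the stop codon.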

      1. The authors state that they could not use in situs to confirm krt92 endogenous and knock-in expression overlap, but rather say that they match based on data from an intestine scRNA-seq dataset. Can they elaborate on this? Which clusters/cell types show overlap? Furthermore, is there any krt92:GFP transgenic line that can be used as a reference for expression as well? This point is also applicable for krt4 described in Fig.2

      Response: We appreciate this point. In the beginning, we contacted Molecular Instruments to synthesize krt92 HCR3.0 in situ hybridization probes. However, the technical staff there told us that they were unable to make specific probes due to the high sequence similarity to other keratin protein families. The sequence similarity mostly occurs in the middle of the krt92 gene, and HCR3.0 relies on a probe set (preferably 20-30 probes with different sequences) to target the mRNA.

      The scRNA-seq data that we referenced are from the 10x platform, which is based on a 3’ enrichment methodology. The reads mapping to the krt92 gene are mostly located at the 3’ end. This is advantageous, as there is much less similarity to other cytoskeleton genes at the 3’ end of the gene. Unfortunately, there are no krt92 transgenic lines available, so we relied on the single-cell data to correlate expression patterns in this case.

      There are two zebrafish intestine single-cell data sets available, with the following links:

      (1): https://singlecell.broadinstitute.org/single_cell/study/SCP1675/zebrafish-intestinal-epithelial-cells-wt-and-fxr?genes=krt92#study-visualize

      (2): https://singlecell.broadinstitute.org/single_cell/study/SCP1623/zebrafish-intestine-conventional-and-germ-free-conditions?genes=krt92#study-visualize

      We can see that krt92 is widely expressed in different types of intestinal epithelial cells (absorptive enterocytes, secretory enteroendocrine/goblet cells and ionocytes).

      For the krt4 gene, we have now added HCR3.0 in situ hybridization and immunofluorescence for both the krt4 knock-in EGFP-t2a-CreERT2 lines and the Tg(krt4:EGFP-rpl10a) transgenic line (a construct from Anna Huttenlocher, https://www.addgene.org/128839/, which has been widely used to label skin cells). The results are shown in Figure EV9. The HCR3.0 in situ shows that krt4 has very high expression in the intestinal bulb and hindgut. The immunofluorescence of the krt4 knock-in fully recapitulates the krt4 expression pattern in the intestine, while there is almost no fluorescence signal in Tg(krt4:EGFP-Mmu.Rpl10a). We believe this is another advantage of using the knock-in method, over transgenics, for cellular labeling and lineage tracing. Classical transgenics often rely on short promoter/enhancer regions of various lengths upstream of the ATG (chosen arbitrarily or based on clues from motif analysis/DNA methylation sites). However, different tissues/cell types tend to use different cis-regulatory elements, and the chromatin structure/enhancer-promoter loops might differ dramatically among cell types. It is hard to predict the exact regulatory sequences that are sufficient for driving gene expression in a certain cell type. This reasoning is consistent with our observation that the knock-in lines recapitulate endogenous krt4 expression. Therefore, we believe that knock-in-based genetic lineage tracing will become the standard in the zebrafish field, as it theoretically avoids both the lack of relevant expression and the leakage problems of transgenics.

      1. I think Figure 2A needs the dotted lines on the last construct to be fixed (points to p2A)

      Response: Thank you for noticing! This was due to a bug in the IBS software, and we changed it manually using Adobe Illustrator in the revised manuscript.

      1. There are a few instances where the authors describe performing 4-OHT treatment for long period (e.g. over a 20 hour or 24 hour period). Is fresh 4-OHT added after a certain amount of time or is it a one-time addition? Is such long periods of 4-OHT required or has maximal recombination already occurred within a few hours after addition of 4-OHT?

      Response: For 4-OHT treatment, we referred to the method described by Dr. Christian Mosimann (DOI: 10.1371/journal.pone.0152989). We tried different conditions (dosage, duration, with or without refreshing). This is particularly important for the knock-in CreERT2 lines because the level of CreERT2 is highly dependent upon the endogenous gene expression level. In our case, nkx6.1 and id2a are transcriptional regulators and are expressed at relatively low levels compared with structural proteins. We maximized the labeling efficiency by using the highest concentration and longest duration suggested for 4-OHT treatment. The 4-OHT was stored at -20 °C and becomes less effective after 30 days of storage. Therefore, we first incubated the 4-OHT at 65 °C for 10 min (as recommended by Dr. Christian Mosimann) in order to convert it to a bioactive form. Next, we treated the zebrafish embryos with 4-OHT at a final concentration of 20 μM for 24 hours. We did not refresh the 4-OHT, since refreshing made no significant difference compared with a one-time addition. Moreover, using a higher dosage or longer treatment time can lead to lower survival and an increased deformity rate. Treatment with 20 μM 4-OHT for shorter time periods (6 or 12 hours) can cause high labeling variability (some larvae have good labeling while others do not). In the end, after several rounds of experiments, we settled on 20 μM 4-OHT treatment for 24 hours, as it gives the highest labeling efficiency, lower variability, and good survival.
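      For readers setting up the treatment, the dilution to a 20 μM final concentration follows the usual C1·V1 = C2·V2 relation. The sketch below is only a worked example of that arithmetic; the 10 mM stock concentration and the 10 mL volume used in the example are hypothetical values, since the response does not state them.

```python
def stock_volume_ul(final_um: float, final_volume_ml: float, stock_um: float) -> float:
    """Volume of stock (in microlitres) needed to reach a final concentration.

    Uses C1 * V1 = C2 * V2, treating final_volume_ml as the total volume
    of medium after the stock has been added.
    """
    if final_um <= 0 or final_volume_ml <= 0 or stock_um <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_um > stock_um:
        raise ValueError("final concentration cannot exceed the stock concentration")
    # mL -> uL conversion is applied before dividing to keep the arithmetic exact
    # for round numbers.
    return final_um * final_volume_ml * 1000.0 / stock_um


# Hypothetical example: 10 mM (10,000 uM) stock, 10 mL of medium, 20 uM final.
volume = stock_volume_ul(final_um=20, final_volume_ml=10, stock_um=10_000)
print(f"Add {volume:.1f} uL of stock")
```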

      1. For Figures 4-6 where confocal images of lineage tracing experiments are shown, there is no indication of how many times the experiments were repeated, how many sections were images, how many animals used, how many cells counted. All of this information should be included in the figure legends and plots should be added showing quantification and statistical analysis (where appropriate).

      Response: The reviewer makes a good point, and we have now added the number of larvae used and the statistical results for the quantitative experiments. The quantification of the experiments in Figure 3E-H (originally Figure 4E-H) is shown in Figure EV6D using a box/dot plot. We randomly selected 3 secondary islets of different sizes (large, middle, and small) from each juvenile fish (n=5) and pooled the numbers of mCherry/ins double-positive cells and ins-positive cells. The quantification of the lineage-tracing efficiency in the experiments in Figure 6 is shown in Figure EV13.
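      The pooling step described above can be sketched as follows. This is a minimal illustration that assumes the per-fish value is the pooled fraction of ins-positive cells that are also mCherry-positive; the function name and the cell counts in the example are hypothetical and are not data from the study.

```python
def pooled_labeling_efficiency(islet_counts):
    """Pooled fraction of ins+ cells that are also mCherry+ for one fish.

    islet_counts -- iterable of (double_positive, ins_positive) tuples,
                    one tuple per islet (e.g. large, middle, small).
    """
    counts = list(islet_counts)
    for double_pos, ins_pos in counts:
        if double_pos < 0 or ins_pos < 0 or double_pos > ins_pos:
            raise ValueError("need 0 <= double_positive <= ins_positive")
    total_double = sum(double_pos for double_pos, _ in counts)
    total_ins = sum(ins_pos for _, ins_pos in counts)
    if total_ins == 0:
        raise ValueError("no ins-positive cells counted")
    # Pooling sums the counts first, so large islets weigh more than small ones.
    return total_double / total_ins


# Hypothetical counts for three islets from one fish: 4 of 20 ins+ cells labeled.
print(pooled_labeling_efficiency([(3, 10), (1, 5), (0, 5)]))
```

      Pooling counts before dividing differs from averaging per-islet fractions; which of the two matches the plotted values is not specified in the response.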

      1. Figure 4 C, C' - I'm not sure what to look for. Is the message that there is no Cherry positive cells that are vasnb negative when labelling is done at 8 somite? But the vasnb positive cells that are also Cherry positive remain? The vasnb staining seems much weaker/harder to see in C C' compared to B, B'. As mentioned above, these data should be quantified and statistical significance indicated.

      Response: Thank you for pointing this out; the second reviewer made a similar point. We redid the experiments using zebrafish larvae carrying the ptf1α:EGFP transgene to indicate the acinar cells (Figure 3B-D, Figure EV4G). We also quantified the results and performed statistical testing.

      1. I recommend the authors include a short section in the discussion comparing the efficiency of their method to other knock-in strategies used in zebrafish. This is an important claim of the paper yet it is not clear how much better it is (if at all) in terms of frequency of F0 mosaicism and identification of founders relative to other methods. I do appreciate the relative simplicity of the molecular steps of construct design/generation.

      Response: This is indeed important. It is also tricky, since we are unable to make head-to-head comparisons between different methods, as we are targeting different genetic loci and do not have the other methods up and running in our lab. Our general comparison is therefore based on the statistics reported in the landmark papers describing these other methods, regardless of which genes were selected for targeting. In the discussion, we added a list of points that are novel/improved in our method versus previous ones: 1) we simplify the knock-in methodology by circumventing complicated molecular cloning; 2) we obtain a very high germline transmission rate, which means that one morning of injection is often enough to get a founder, and the expression of fluorescent proteins avoids tedious work in identifying founders, which also saves a lot of space in the fish facility; 3) our lines can be applied for multiple purposes; 4) the method does not disrupt the endogenous gene product. We believe this is critical for the fields of developmental biology, regenerative medicine, and disease modeling in zebrafish – and perhaps a similar 3’ knock-in based lineage-tracing method can become commonly used to delineate cell differentiation and plasticity during homeostatic and diseased conditions in additional organisms.

      Reviewer #1 (Significance):

      Overall, the study contributes a new knock-in strategy in zebrafish that appears to be more user-friendly and results in high germline transmission. The authors also identify nkx6.1+ ductal cells as progenitors of endocrine cells in the pancreas highlighting the biological applications of their method. I think this study represents an important advancement in zebrafish genetics and will have future impact in lineage tracing during development, regeneration, and disease.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      Here, the authors present a strategy where they performed knock-in at the level of the STOP codon, taking care of not perturbing the coding region. They integrate cassettes coding for fluorescence protein and Cre recombinase, which are separated from the endogenous gene and each other by two self-cleavable peptides.

      The cassettes are done by PCR with primers with 5' AmC6 modifications and they test short (36 to 46 bp) or long homologous arms (~950bp). For nkx6.1 gene, they observed a dramatic increase of recombination efficiency when injecting the donors with short Homology arms compared to long arms suggesting that short arms could be used. Indeed, short arms used with krt4 and id2a allow them to obtain K.I lines.

      The techniques described here look promising. Indeed, even if the proportion of F0 showing adequate reporter expression is low (usually about 2%), the percentages of founders among these mosaic F0 were quite high (between 50% and 100%). And this is the most important aspect as it is usually the most time-consuming aspect of the work.

      Major comment:

      The authors claim that the knock-in lines can precisely reflect the endogenous gene expression, as visualized by optional fluorescent proteins. But are the authors sure that the integration of the cassettes coding for fluorescence protein and Cre recombinase, which are separated from the endogenous gene and each other by two self-cleavable peptides, will not affect the level of expression of the targeted genes? Indeed, it has been shown that self-cleavable peptides can sometimes affect the expression of the genes of the cassette, as for example in this reference (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8034980). Therefore, it is important that the authors check whether the cassette affects the level of expression of the targeted gene if they want to claim that the knock-in lines precisely reflect the endogenous gene expression.

      Response: Thank you for your insightful comments. With regard to the endogenous gene expression, we now use qPCR for further validation. We added the qPCR results to the supplementary material (Figure EV15) in the revised manuscript. In brief, we pooled 4 larvae in one tube per biological replicate and used 4 biological replicates for each knock-in line. We did not see a significant change in the endogenous expression of any gene. In addition, we have grown homozygous knock-in lines to adulthood, and they are fertile without any overt phenotype.

      The highlighted reference deals with a cardiomyocyte-specific transgenic line, and we assume that Figure 3-Supplementary figure 1 is what the reviewer is referring to. The altered level of erbb2 expression might be due to the experimental conditions (no treatment or 3 days post treatment). It is also possible that multiple transgenic insertions occurred, as well as gene silencing at some insertion sites. However, such issues would be absent, or very limited, with knock-in methods.

      Minor comments:

      General points:

      I believe that the authors should improve the presentation of their data. Based on what they present, it would be impossible for me to reproduce their technique. It is not at all clear how they designed the short and long arms, where exactly they are located, which mutations they introduced (for Fig. 1), or where the guide RNA is located relative to the STOP codon and the HA arms. Graphics that place all these sequences exactly are absolutely required to understand the strategy used and should be included in Figures 1, 2, 3 and 4.

      Response: Thank you for these comments. In the revised version, we added the sequence information of the short homologous arms in each of the schematics. As for the krt92 gene, we added the sequence information, together with the genetic cassettes and point mutations, in the first supplementary figure (Figure EV1). We list all the primer information in the methods. Also, we have uploaded our vector templates to the public repository (as listed in the Data Availability section). Lastly, we added a key resource table to the supplementary file with detailed information on all reagents for ease of reproducibility (including all the primer sequences used). We are also willing to share our constructs with the scientific community upon request.

      Specific points:

      Introduction:

      "In zebrafish, the NHEJ-mediated methods have been intensively investigated in 5'knock-in upstream of ATG using donor plasmid containing in vivo linearization site flanking the insertion sequences (11,12,17-20). The 3' knock-in method has also been examined using circular plasmid as the donor with either long or short homologous arms (HAs) flanked by in vivo linearization sites (14, 21-23). Recently, intron-based and exon-based knock-in approaches have remarkably expanded the knock-in toolbox by targeting genetic loci beyond the 5' or 3' end (8-10,13,24-26)."
      This part should be explained better so that readers can really understand the differences between these older studies and this new one. The authors should also emphasize the novelty of their technique.

      Response: Good points. In the revised version, we elaborate on the previous discoveries, the major challenges, the knowledge gap in zebrafish knock-in methodology, and what is novel and improved in our new technique. Please see the clarifications and the expanded text in both the introduction and discussion.

      Results:

      Page 4: In my opinion, the first paragraph should be removed and the technique explained directly on the basis of the krt92 strategy, as this paragraph does not help the reader understand the technique. As indicated above, Figure 1 should indicate more clearly the location of the long arms, which mutations were introduced, and where the guide RNA is located.

      Figure 1G: The expression in the skin is far from obvious and the image should be improved (for example with some inset).

      Response: Thank you for the comments. We added a new supplementary figure (Figure EV1) showing the sequences of the left and right homologous arms, the genetic cassettes, and the point mutations, each highlighted with a different background color. We added insets to show the magnified regions of interest. We also added images from the fluorescence microscope used for sorting, to show the EGFP signals in live zebrafish embryos (Figure EV2D and Figure EV8D).

      Figure 3E: The authors say that "cells expressing nkx6.1 (displayed by the green fluorescence) were located on the ventral side of the spinal cord whereas H2BmCherry positive cells, which include all the progenies of nkx6.1+ cells after the iCre recombination, resided in both the ventral and dorsal parts of spinal cord". This differential expression in the spinal cord is not obvious and a closer view should be provided.

      Response: Thank you for the comment. First, we changed the order and now describe all nkx6.1 content in Figure 2 and 3 and the krt4 content in Figure 4. We added insets to show the magnified regions and better display the expression pattern of the two fluorescence proteins in Figure 2E-G. One can now clearly see from the magnified insets that the green signals driven by the endogenous nkx6.1 gene are present in the ventral part of the spinal cord, while the red signals are present in both the ventral and the dorsal side of the spinal cord.

      Fig S4H: The authors say that "using lineage tracing, we could trace back all three major cell types in the pancreas (acinar, ductal and endocrine cells) to nkx6.1 lineage (Figure 3H-H', Supplementary Figure S4G, H)". While this is obvious for endocrine cells, the colocalisation with ela3l:GFP is not obvious and the figure should be improved.

      Response: This is a very good point, and the first reviewer gave similar suggestions. In the revised version (shown in Figure EV4H and I), we added insets to show the magnified regions and better display the expression pattern of the two fluorescent proteins. The ela3l reporter line uses a short promoter to drive the expression of H2B-EGFP (doi: 10.1242/dmm.026633). However, this short promoter cannot reach 100% labeling of acinar cells, so we also used the ptf1α:EGFP transgene for further validation (new Figure EV4G). Both transgenic reporter lines showed many EGFP and mCherry double-positive cells, indicating that these acinar cells are derived from an nkx6.1-expressing origin. Here we did not use the anti-GFP antibody, as our color switch lines contain CFP and the anti-GFP antibody can also recognize CFP. However, the GFP signal is strong enough to show the expression. We hope the additional experiments and insets clarify this point.

      Page 8: The authors say that "The immunostaining at 6 dpf showed that both intrapancreatic ductal cells and a portion of acinar cells can be lineage traced when the 4-OHT treatment started at the 6 somite stage (Figure 4B and B')". The identification of the acinar cells has been done based on the absence of the ductal marker vasnb. To trace the acinar cells efficiently, this should be done with an acinar marker.

      Response: Another good point also mentioned by reviewer one. We redid the analyses using zebrafish larvae containing the ptf1α:EGFP transgene to indicate the acinar cells and the co-expression pattern with the lineage-tracing (the data is shown in new Figure 3B-D).

      Reviewer #2 (Significance):

      I do not have enough expertise in the KI field to evaluate whether this strategy is really novel and as mentioned above, the authors should better explain what is really the novelty of their strategy.

      Response: In our answers to the comments of the first reviewer, we elaborated more on the points that are novel/improved with our method vs previous methods, as reiterated here:

      “…including that: 1) we simplify the knock-in methodology circumventing complicated molecular cloning; 2) we have very high germline transmission rate, which means that one morning of injection is often enough to get a founder; and the expression of fluorescence proteins avoids tedious work in identifying founders, which also saves a lot of space in the fish facility; 3) our lines can be applied for multiple utilities; 4) the method does not disrupt the endogenous gene product.”

      Moreover, the first reviewer asked about the difference between the krt4 knock-in and krt4 transgenics, and based on the in situ data, we showed that our krt4 knock-in can fully recapitulate the endogenous gene expression, while the krt4 transgenics can hardly label the intestinal bulb and hindgut. This might be because different tissues/cell types depend on different cis-regulatory elements to drive gene expression. The chromatin structure and the enhancer/promoter loop might also differ dramatically among different tissues. Therefore, the transgenics might be useful for one type of cell, while being of no use at all for other cell types. In the future, we believe that, similar to the mouse field, the 3' knock-in based lineage tracing methods might become the standard method in the zebrafish field, to delineate cellular differentiation and plasticity during homeostatic and diseased conditions.

    1. Now, Americans! I ask you candidly, was your sufferings under Great Britain, one hundredth part as cruel and tyranical as you have rendered ours under you? Some of you, no doubt, believe that we will never throw off your murderous government and “provide new guards for our future security.” If Satan has made you believe it, will he not deceive you? Do the whites say, I being a black man, ought to be humble, which I readily admit? I ask them, ought they not to be as humble as I? or do they think that they can measure arms with Jehovah? Will not the Lord yet humble them? or will not these very coloured people whom they now treat worse than brutes, yet under God, humble them low down enough? Some of the whites are ignorant enough to tell us that we ought to be submissive to them, that they may keep their feet on our throats. And if we do not submit to be beaten to death by them, we are bad creatures and of course must be damned, &c. If any man wishes to hear this doctrine openly preached to us by the American preachers, let him go into the Southern and Western sections of this country—I do not speak from hear say—what I have written, is what I have seen and heard myself. No man may think that my book is made up of conjecture— I have travelled and observed nearly the whole of those things myself, and what little I did not get by my own observation, I received from those among the whites and blacks, in whom the greatest confidence may be placed.

      He urged enslaved people to fight back against their oppressors and to put an end to slavery.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Reviewers for their comments. Below we have the Reviewers’ comments and our responses.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this work, the authors claim that their machine learning approach can be combined with a biophysical model to predictably engineer sensors. The concept is interesting, but there are many issues that must be addressed before considering its publication.

      1. It is surprising that their citations are too biased. They keep citing nonrelevant papers from several groups while omitting many key papers regarding genetic sensors and circuits in the field. Some can be justified (e.g., Voigt lab's reports), but others (e.g., reports on dynamic controllers too often) would not be relevant.

      There are hundreds (possibly thousands) of papers that have been published on genetic sensors. Most of those papers report only qualitative results (e.g., genetic sensor implemented in a new host organism or demonstrated to sense a ligand of interest).

      The purpose of this manuscript is to demonstrate methods for quantitative engineering of genetic sensors. Specifically, the manuscript is focused on quantitative tuning of the genetic sensor dose-response curve. So, in deciding which previous papers to cite, we chose several review articles (to cover the many, many qualitative results), any previous papers we could find that reported strategies for tuning the dose-response curve of genetic sensors (the Voigt lab’s reports and others), and any papers we could find that discussed reasons/applications for quantitative tuning of a genetic sensor dose-response curve (e.g., dynamic controllers).

      We added a new paragraph to the beginning of the Results section to explain this focus on quantitative tuning (and to clearly state which statistic we use for assessing accuracy – see response to next comment; lines 72-83 in the revised manuscript).

      We would also like to add more relevant citations as suggested by the reviewer, but that is difficult based on the reviewer’s comment, which just indicates that we have omitted many “key” papers. For the central focus of this manuscript, we think the “key” papers are those that describe methods to tune the dose-response curve of genetic sensors, and we have done our best to cite all of those that we could find. So, we ask the reviewer to please suggest some specific papers that they consider to be “key” that we should cite, or at least some more specific definition of what they think constitutes a “key” paper that should be cited.

      It is very unclear which statistical analysis has been done for their work.

      The main statistical metric used in the manuscript is the fold-accuracy. The fold-accuracy was defined in the previous version of the manuscript, but we agree that it could have been stated more clearly. So, we have moved the definition of fold-accuracy to the (new) first paragraph of the Results section, and identified it as “…the primary statistic we will use to assess different methods.” (line 77 of the revised manuscript)

      There are many practical sensors for real applications, but their work focuses on IPTG-responsive sensors or circuits. I was wondering whether this work would have significant impacts on the field or the advancement of knowledge.

      Similarly, it is questionable that their approach is generalizable.

      Currently, there is only one published dataset that can be used for the methods described in this manuscript, for IPTG-responsive LacI variants.

      However, previous work (cited in our manuscript) has shown that directed evolution can be used to qualitatively “improve” a wide range of genetic sensors beyond LacI. Furthermore, some of those previous studies used a single round of mutagenesis and libraries with diversity similar to the size of the LacI dataset (10⁴ to 10⁵ variants). Based on that, we think it is highly likely that our in silico selection approach will generalize to other sensor proteins.

      With regard to the ML methods used in our manuscript, we showed in the initial publication describing the LANTERN method that the approach is generalizable to different types of proteins and protein functions (LacI sensor protein, GFP fluorescent protein, SARS-CoV-2 spike-binding protein). So, we don’t see any reason to question the generalizability of that approach to other sensor proteins.

      We have edited the Discussion section of the manuscript to include these points regarding the generalizability of our approach (lines 340-350 in the revised manuscript).

      Due to the biased literature review, it is unclear to me whether this work is novel.

      The majority of relevant literature on genetic sensor engineering is qualitative in nature and is not particularly comparable to the work here. We have tried to emphasize this in the introduction and discussion. We have searched the relevant literature extensively, and we have only found a small number of papers that describe quantitative methods to tune the dose-response of genetic sensors. Furthermore, there are only a few that contain any kind of quantitative assessment of that tuning. We have cited all of those papers and included specific discussions and comparisons between them and our results.

      If the reviewer knows of any specific papers that we missed we would be happy to include them in our literature review.

      I am unsure whether their correlation is sufficiently high.

      This comment is too vague to address.

      Again, we ask the reviewer for more specific information: What “correlation” are you referring to? And what is “sufficiently high”?

      We have provided statistics on the accuracy of our methods, as discussed above.

      Is EC50 the only important parameter? Or is it really relevant for real applications where the expression levels would change due to RBS changes, context effects, metabolic burdens, circuit topologies, etc.?

      EC50 is not the only important parameter. That is why we also demonstrate the ability to quantitatively tune other aspects of the dose-response (e.g., G∞).

      In any real application of genetic sensors, the EC50 will have to be engineered to have a quantitatively specified value (within some tolerance). So, yes, it really is relevant.

      There is an important question about the effect of context however, and perhaps that is what the reviewer is really asking: If we engineer a genetic sensor that has a given EC50 in the context used for the large-scale measurement, will we be able to use that genetic sensor in a different context where, because of the change in context, its EC50 may be different?

      This is one of the outstanding challenges in the field, to be able to predict the effect of a change in context. But for genetic sensors, there are several previous publications that demonstrate promising routes to quantitatively predict the effect of context on genetic sensor function.

      So, we have added a paragraph to the Discussion section addressing this point and citing the relevant previous publications (lines 315-339 in the revised manuscript).

      There are many reports on mutations or part-variants and their impacts on circuit behaviors. Those papers have not been cited. This is another omission.

      As discussed in response to Comment 1, above, there are many hundreds of such papers. It would not be practical or appropriate for us to cite all of them. However, there are only a few that contain any kind of quantitative assessment of the predictability of mutational effects or of efforts to use mutations to engineer sensors to meet a quantitative specification. We have done our best to cite and discuss all of those. Again, if the reviewer knows of any specific additional papers that we should cite, please tell us.

      CROSS-CONSULTATION COMMENTS

      In general, I agree with the other reviewer. Its significance would be too incremental.

      Reviewer #1 (Significance (Required)):

      See above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper proposes two approaches for forward-design of genetically encoded biosensors. Both methods rely on a large scale dataset published earlier by the authors in Mol Syst Biol, containing ~65k lacI sequences and their measured dose response curves. One approach, termed 'in silico selection', is proposed as a way to find variants of interest according to phenotypic traits such as the dynamic range and IC50 of the biosensor dose-response curve. The second approach uses machine learning to regress the dynamic range, IC50 and others from the lacI sequences themselves - the ML regressor can then be used to predict phenotypes of new variants not present in the original dataset. The ML algorithm has been published by the same authors in a recent PNAS paper.

      The manuscript has serious flaws and seems too preliminary/incremental:

      1) The 'in silico selection' method corresponds to a simple lookup table. This is a perfectly acceptable method for sequence design, but the attempt to portray this as a new method or 'multiobjective optimization' is highly misleading. Also, the analogy between 'in silico selection' and darwinian evolution or directed evolution are inappropriate, because both latter approaches rely on iterative selection through fitness optimization and randomization of variants. The 'in silico selection' approach in contrast is one-shot and does not use randomization.

      We agree with some of the reviewer’s points here. In making the analogy to directed evolution, we wanted to give the reader a connection to something familiar, but the reviewer is correct that the analogy is imperfect. The “lookup table” description is much better, and probably a familiar idea to most readers. So, we edited the relevant paragraph to describe in silico selection as the use of the large-scale dataset as a lookup table instead of making the analogy to directed evolution. We thank the reviewer for this suggestion.
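      To make the "lookup table" description concrete, the selection step can be sketched as a filter over measured variants. This is only an illustrative sketch, not the manuscript's implementation: the `Variant` record, the example values, the `fold_ratio` helper, and the `max_fold` tolerance are all hypothetical, and `fold_ratio` is a simple symmetric ratio that is not claimed to match the fold-accuracy statistic defined in the manuscript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One row of a hypothetical large-scale dose-response dataset."""
    sequence_id: str
    ec50: float   # ligand concentration at half-maximal response (hypothetical units)
    g_inf: float  # output at saturating ligand (arbitrary units)


def fold_ratio(measured: float, target: float) -> float:
    """Symmetric fold difference (always >= 1) between two positive values."""
    return max(measured / target, target / measured)


def select_variants(table, target_ec50, target_g_inf, max_fold=1.5):
    """Return variants whose EC50 and G_inf both lie within max_fold of the targets.

    The dataset is used purely as a lookup table: no model is fitted and
    nothing is iterated. Hits are sorted from closest to farthest.
    """
    hits = [
        v for v in table
        if fold_ratio(v.ec50, target_ec50) <= max_fold
        and fold_ratio(v.g_inf, target_g_inf) <= max_fold
    ]
    return sorted(
        hits,
        key=lambda v: fold_ratio(v.ec50, target_ec50) * fold_ratio(v.g_inf, target_g_inf),
    )


# Hypothetical measurements, for illustration only.
TABLE = [
    Variant("v1", ec50=10.0, g_inf=20000.0),
    Variant("v2", ec50=105.0, g_inf=19000.0),
    Variant("v3", ec50=95.0, g_inf=5000.0),
    Variant("v4", ec50=130.0, g_inf=24000.0),
]

selected = select_variants(TABLE, target_ec50=100.0, target_g_inf=20000.0)
print([v.sequence_id for v in selected])
```

      Filtering on two parameters at once is what distinguishes this from a single-objective search: a variant such as `v3` matches the EC50 target closely but is rejected because its saturating output is far from the specification.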

      However, we disagree with the reviewer with regard to “multi-objective optimization.” We clearly demonstrate in Figures 3 and 4 that we can simultaneously tune multiple aspects of the dose-response curve to meet quantitative specifications. If the reviewer is aware of any previous publications that they think provide a better demonstration of multi-objective engineering of biological function, please let us know; we would like to cite those papers appropriately.

      Also, the reviewer is incorrect in stating that our in silico selection approach does not use randomization. The randomization occurs as part of the large-scale measurement. This is clearly stated in the second paragraph of the Results section.

      2) The ML approach is a minor extension to what they already published in PNAS 2022. One could imagine an extra figure in that paper would be able to contain all ML results in this new manuscript. A couple of comments about the actual method: a) it seems unlikely to work on sequences of lengths relevant to applications, because it relies on gaussian processes that are known to scale poorly in high dimensions. b) The notion of 'interpretable ML' is misleading and quite different to what people in interpretable AI understand. Moreover, the connection between the three latent variables, which provide the 'interpretability', and biophysical models seems to come from their earlier PNAS work and this specific dataset, but there is no indication that such connection exists in other cases. Although this is somewhat acknowledged in L192-195, the text tends to portray the connection with biophysical models as something generalizable.

      The ML results presented in this manuscript are specifically aimed to quantitatively assess the accuracy of the ML predictions for the parameters of a genetic sensor dose-response curve. So, we think those results belong in the current manuscript.

      The reviewer’s comment on Gaussian processes and dimensionality is clearly contradicted by the results presented in this manuscript and in our previous publication describing the ML method: The ML method works quite well for “sequences of lengths relevant to applications,” including LacI (360 amino acids), the SARS-CoV-2 receptor binding domain (200 amino acids), and GFP (250 amino acids). The reason for this is that the Gaussian process is only applied on the low-dimensional latent space learned by the ML method.

      The reviewer’s comment on “interpretable ML” is not relevant to this manuscript but is instead a criticism aimed at our previous publication on the ML method.

      The generalizability of this approach is an open question. The same could be said for most other publications describing new methods, since most of those publications include demonstrations with only a small number of specific systems. After re-reading the relevant portions of the manuscript, we disagree with the reviewer’s suggestion that we have exaggerated the potential generalizability of the approach. For example, in the last sentence of the Results paragraph, we state, “Although imperfect, this initial test of linking an interpretable, data-driven ML model to a biophysical model to engineer genetic sensors shows promise…” And, in the Discussion section, “The use of interpretable ML modeling in conjunction with a biophysical model also has the potential to become a useful engineering approach… But more rigorous methods would be needed…”

      Other comments:

      3) There are quite a few redundant figures, e.g., Figure 1 contains too many heatmaps of the same variables. Fig 2B and C are redundant as they contain the same information. Altogether the figures feel bloated and could have been compressed much more.

      We disagree. The sub-panels of Figure 1 show different 2-D projections of the multi-dimensional data that are relevant to specific aspects of the results in Figs. 2-4.

      Admittedly, Fig 2C shows the residuals from Fig 2B, which is in some sense the “same information.” But it is quite common, in papers focused on quantitative results, to have one sub-panel showing a comparison between predicted and actual and a second sub-panel showing the residuals.

      4) Fig 2A and 3A have problems: the blue & orange lines (Fig 2A) and blue & green lines (Fig 3A) have a kink just before the second dot from the left. Such kinks cannot have been produced by a Hill function. Errors of this kind cast doubt on the overall legitimacy and reproducibility of the results.

      The kinks in the curves are a consequence of the use of the “symmetrical log” scale on the x-axis, which allows the zero-IPTG and non-zero-IPTG data to be shown on the same plot while showing the non-zero-IPTG data on a logarithmic scale. That symmetrical log axis uses a log scale for large x values, and a linear scale for smaller x values. The kink appears at the transition between the log and linear scales. We have re-plotted all of the figures showing dose-response curves to move the log-linear transition to overlap with the axis break.
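      The origin of such a kink can be illustrated with a short numerical sketch. The code below is an assumption-laden illustration, not the plotting code used for the manuscript: the Hill parameters, the `linthresh` value, and the simplified `symlog` mapping (linear below the threshold, base-10 logarithmic above it) are hypothetical. It shows that a perfectly smooth Hill curve changes its plotted slope abruptly at the linear-to-log transition, because the axis mapping itself has a discontinuous derivative there.

```python
import math


def hill(x, g0, g_inf, ec50, n):
    """Smooth Hill-type dose-response rising from g0 (no ligand) to g_inf."""
    return g0 + (g_inf - g0) * x**n / (ec50**n + x**n)


def symlog(x, linthresh):
    """Simplified symmetrical-log axis position for x >= 0.

    Linear for x <= linthresh, base-10 logarithmic above it. The position is
    continuous at linthresh, but its derivative is not.
    """
    if x <= linthresh:
        return x / linthresh
    return 1.0 + math.log10(x / linthresh)


def plotted_slope(x_lo, x_hi, linthresh, **hill_params):
    """Slope of the curve as drawn: change in output per unit of axis position."""
    dy = hill(x_hi, **hill_params) - hill(x_lo, **hill_params)
    dpos = symlog(x_hi, linthresh) - symlog(x_lo, linthresh)
    return dy / dpos


# Hypothetical parameters, chosen only for illustration.
PARAMS = dict(g0=100.0, g_inf=20000.0, ec50=100.0, n=2.0)
LINTHRESH = 2.0
DX = 1e-4

slope_below = plotted_slope(LINTHRESH - DX, LINTHRESH, LINTHRESH, **PARAMS)
slope_above = plotted_slope(LINTHRESH, LINTHRESH + DX, LINTHRESH, **PARAMS)
kink_ratio = slope_above / slope_below

print(f"plotted slope just below the transition: {slope_below:.2f}")
print(f"plotted slope just above the transition: {slope_above:.2f}")
print(f"ratio: {kink_ratio:.3f}")
```

      With this mapping the ratio of the two plotted slopes approaches ln(10), independent of the Hill parameters, which is why the kink appears at the same x position on every curve and disappears from view when the transition is placed under an axis break.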

      CROSS-CONSULTATION COMMENTS

      I agree with the other reviewer's comments, particularly on the lack of statistical analyses.

      See our response to Reviewer #1, comment 2, above.

      Reviewer #2 (Significance (Required)):

      The work addresses a timely subject but is too incremental.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01490

      Corresponding author(s): Cariboni, Anna; Howard, Sasha R



      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The current manuscript is well written and of general interest to the reproductive neuroendocrinology field. Overall it is well written and substantiated.

      Reply: We thank the reviewer for his/her positive and supportive comments on our manuscript.

      The primary problem with the paper is the data derived from the microarray. While the experimental design included replicates (n = 3), although weak, the actual microarray data was based on a single data point. A major weakness. This experiment should be repeated using more up-to-date approaches such as RNA-seq or left out of the manuscript, because this data set is compromised due to the data collection procedure.

      Reply: We thank the Reviewer for raising these points, which we wish to clarify. We respectfully disagree that the microarray data generated in this study is not valuable. The transcriptomic analysis of immortalized cells was performed on 3 biological replicates (specifically, RNA was extracted from n=3 samples, obtained from each cell line at 3 different passages) and run as 3 independent samples (for a total of 6, 3 for GN11 cells and 3 for GT1-7 cells). For the primary embryonic GFP-GnRH neurons, given the difficulty of isolating with FACS a sufficient number of GFP+ cells from each embryo due to their very small number (around 1000 GnRH neurons/head), we had to pool sorted cells from 2-3 embryos for each time-point. Thus, although the primary cell microarrays were run on one sample for each time point, the RNA was not derived from one embryo only, but from at least 2-3 embryos.

      Nevertheless, to overcome the issue of low number of replicates for the primary embryonic cells, we revised our manuscript by re-running our analyses, using as the starting dataset the analyses obtained from immortalized cells, which were based on a ‘true’ n=3 of biological replicates. In this context, we filtered DEGs from this microarray using logFC>2 and adj. p-value1) found in primary GFP-GnRH neurons. We believe that this revised analysis is statistically more powerful, as the core bioinformatic analyses were performed on triplicate samples, with a second filtering step to take advantage of biologically relevant data obtained from n=1 primary GFP-GnRH neurons to confirm in vivo the expression of selected genes. Whilst RNAseq offers wider coverage of the genome and has advantages over microarray, we do not believe that this renders unimportant the data generated from these unique experiments and the novel genomic discoveries it facilitated.

      In line with this, our work may be considered a proof-of-principle that transcriptomic profiles from rodent GnRH neurons can be exploited at different levels, including the possibility of identifying novel GD candidate genes. Overall, our work also highlights the existence of similarities between two immortalized GnRH neuron cell lines and primary GnRH neurons, which has so far been demonstrated by several functional studies, but not at the molecular level.

      The manuscript has been now edited as per the above amendments (see first and second paragraph of Results section, lines 86-135).

      CROSS-CONSULTATION COMMENTS

      Notwithstanding the importance of neuroligin 3 during glutamatergic synaptogenesis, I agree with the reviewers on both points. Further screenings of the patient's family members should be done and the microarray data should be removed or potentially moved to a supplementary status.

      Reply: We thank the reviewer for their comments and, in accordance with their suggestion, we revised the filtering strategy to start from the immortalized cell microarray and therefore moved a substantial part of the microarray data from primary GFP+ neurons to the supplementary data. We also tried, unsuccessfully, to collect information on the brother of case 2 and investigated datasets from both the DECIPHER and 100,000 Genomes projects, but have been limited to two cases for which we have familial consent to publish.

      Reviewer #1 (Significance (Required)): The paper is of significance based on the neuroligin 3 data, which is indicative of abnormal synaptogenesis. However, these defects seem to have only a limited effect on the functionality of the GnRH neuron system and do not seem to cause elimination of GnRH neurons themselves. Nevertheless, these data do open a new direction that may help explain some dysfunctions in reproductive health.

      Reply: We thank the reviewer for their comments and agree that our findings have the potential to facilitate new avenues for the investigation of reproductive disorders.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Oleari et al performed comparative transcriptome analysis on the different developmental stages of GnRH neurons, as well as two immortalized GnRH neuronal cell lines, GT1-7 and GN11, which represent mature and immature GnRH neurons. As a result, they identified a panel of differentially expressed genes (DEGs). They further used top DEGs as candidate disease-related genes for GnRH deficiency (GD), a disorder characterized by absent or delayed puberty and infertility. To this end, they found two loss-of-function mutations in NLGN3 in patients with GD combined with autism. This study provides a resource for the identification of novel GD-associated genes, and suggests an intrinsic connection between GD and other neurodevelopmental diseases, such as autism. I only have some minor concerns.

      1. According to the pedigree, both probands (case 1 and 2) inherited their NLGN3 mutations from their unaffected mother, consistent with an X-linked recessive inheritance. However, only "parent" was used in the manuscript, therefore, it is not clear if this "parent" is the probands' mother or father.

      Reply: Thank you for this comment. We were limited to the use of non-gendered terminology due to medRxiv policies. We have now amended the text and changed ‘parent’ to ‘mother’, lines 161, 173, 179, 185 and 730. We also integrated this sentence highlighting the X-linked pattern of inheritance: “Sanger sequencing of the probands’ mothers confirmed them to be the heterozygous carrier in each family, consistent with an X-linked recessive inheritance pattern.”, lines 185-186.

      It is suggested to integrate Figure 2 as a panel in Figure 1.

      Reply: We thank the reviewer for this suggestion. Due to our revision of the first two Results paragraphs, we have now edited the Figures and the filtering flowchart has been added in Figure 2.

      What is the meaning of Peak LH and Peak FSH, and how are they measured in Table 2?

      Reply: This refers to the peak value obtained after standard-protocol GnRH stimulation testing with 100 mcg GnRH (Gonadorelin) as an IV bolus and measurement of serum LH and FSH at 0, 20 and 60 minutes (e.g. Harrington et al., 2012, doi:10.1210/jc.2012-1598). This clarification has been added to the text in the Table 2 legend (lines 681-683).

      A genotyping for the elder brother of Case 2 will be a strong evidence to support NLGN3 as a GD-associated gene.

      Reply: We thank the reviewer for this important point. In view of this issue, we have strived to collect DNA from this individual. Unfortunately, despite trying repeatedly to contact the family of proband 2, it has not been practically possible to collect these extra data from this family.

      We also identified a third case via a public database with central hypogonadism who carried a stop-gain variant in NLGN3, but unfortunately the family did not release their consent for publishing this case.

      The authors claimed neither probands carried deleterious variants in known GD genes. It is suggested to indicate the exclusion criteria (which genes? How do they define a variant is deleterious?)

      Reply: We thank this reviewer for raising this important point of clarification. Inclusion criteria for variants in known GD genes (updated gene list available in Supplemental Table 3) were as per Saengkaew et al., 2021 (doi: 10.1530/EJE-21-0387): “Only variants that met the ACMG criteria for pathogenicity, likely pathogenicity, or variants of uncertain significance (VUS) were retained in the analysis”. We have added this sentence in the manuscript, lines 150-151.

      Please also include a sequence chromatogram for proband 2.

      Reply: We thank the reviewer for their comment. We added the chromatograms for proband 2 and his heterozygous mother in revised Figure 3.

      CROSS-CONSULTATION COMMENTS

      I agree with Reviewer 3, the genetics is not very strong, as NLGN3 mutations were only found in one GD case from their cohort and one pre-pubertal case from the literature. It would be nice to analyze the genotype and phenotype of Case 2's older brother. Further, it is important to screen NLGN3 rare sequencing variants in larger GD cohorts.

      Reply: We thank the reviewer for their comment, but respectfully disagree with this assertion. The second case is not from the literature, but is a second case found thanks to GeneMatcher, an international tool that allows researchers to collaborate on novel gene discovery. We have also explored other cohorts that were available to us, including the DECIPHER and 100,000 Genomes projects, but have been limited to two cases for which we have familial consent to publish. We anticipate that further international patient cohorts will be screened following the publication of this manuscript (added in Discussion section, lines 306-308). As described above, despite trying repeatedly to contact the family of proband 2, it has not been practically possible to collect these extra data from this family.

      Reviewer #2 (Significance (Required)): This study provides a resource for the identification of novel GD-associated genes, and suggests an intrinsic connection between GD and other neurodevelopmental diseases, such as autism. It may be welcomed by researchers and clinicians in the field of neurodevelopment.

      Reply: We thank the reviewer for their positive and supportive comments.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Oleari et al used murine GnRH1, and immortalized GnRH cell lines (GT1-7, Gn11) to define genes of interest in GnRH development and used this list to filter exome sequencing data from patients with some evidence for GnRH Deficiency.

      Title: I am concerned that the title of the paper overstates the results and conclusions.

      Intro: use of "candidate causative genes" overstates the evidence presented.

      Reply: We thank the reviewer for their comment and have revised the title to reflect the findings of the study. We have also edited the sentence in the abstract reporting "candidate causative genes" as follows: “Here, we combined bioinformatic analyses of primary embryonic and immortalized GnRH neuron transcriptomes with exome sequencing from GD patients to identify candidate genes implicated in GD pathogenesis”, lines 40-43.

      Results: The transcriptomic profile of the developing human GnRH neuron has been published via in vitro differentiation protocols twice (Lund et al 2020, and Keen et al 2021). Gene set data is publicly available. This should be explicitly compared in results not relegated to discussion -- two or three examples is not enough to say mouse can be used instead of human.

      Reply: We thank the reviewer for this comment. We apologize if our sentence in the Discussion was misleading, as we did not intend to draw a conclusion on the similarities of the two datasets/cell types, nor to suggest the use of rodent instead of human.

      Although we are aware that differences among species might exist, mouse/rodent models including immortalized cells have been instrumental to understand the molecular mechanisms of GnRH neuron development and to predict candidate genes. Indeed, our aim was to demonstrate that transcriptomic profiles of rodent GnRH neurons could be integrated with exome sequencing data from human patients to reveal novel candidate genes.

      Therefore, the aim of our study was different to that of the Lund and Keen publications. Further, caution should be exercised in any deeper comparative analyses with our transcriptomes, for the following reasons: first, the GnRH neurons generated from human iPSC and cultured for 20 and 27 days cannot be objectively defined for their ‘age’ in order to be then compared to immortalized or primary embryonic GnRH neurons; second, in these datasets a different and more extensive transcriptomic technique has been used (RNAseq vs microarrays).

      There was no intention to relegate to the discussion the possible similarities with other transcriptomic datasets, but we felt that these comparative analyses were beyond the scope of our work.

      However, following the Reviewer’s suggestion, we have tried to make comparative analyses with the publicly available datasets from Lund et al 2020 and Keen et al 2021, and with a paper just published (Wang et al 2022), as follows.

      In Lund et al. paper, GnRH-like neurons were obtained from human iPSCs by dual SMAD inhibition and FGF8 treatment. We selected data obtained from cells treated with FGF8 and cultured for 20 days and 27 days for comparison with our early and late genes, respectively.

      Because the authors of this paper did not publish the full list of differentially expressed genes (DEGs) from this specific comparison (20 vs 27 days) and we were not able to retrieve it upon request, we used the normalized counts of these samples (available at the ArrayExpress repository) to compare the two experimental groups with DESeq (Bioconductor release 3.15). To increase the stringency of our analysis, we considered as differentially expressed those genes which displayed both an adjusted p-value of less than 0.05 and an absolute fold change of >2. The number of DEGs obtained (5981) was greater than in the published data, and such a large gene list may, by chance alone, contain a large fraction of any gene dataset (including the genes that we found with our analysis). For this reason, this particular comparison in this dataset cannot be informative or useful.
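The two thresholds described above, and the reason a very large DEG list is uninformative, can be illustrated with a minimal Python sketch. It is not the authors' DESeq pipeline. The toy gene records, the helper names (`is_deg`, `expected_overlap`, `prob_overlap_at_least`) and the universe of 20,000 genes are hypothetical assumptions; only the cut-offs (adjusted p < 0.05, absolute fold change > 2), the 5981 DEGs and the 29-gene candidate list come from the text. Fold change is assumed here to be a linear ratio, so that a ratio of 0.4 counts as a 2.5-fold decrease.

```python
from math import comb

PADJ_CUTOFF = 0.05   # adjusted p-value threshold stated in the text
FC_CUTOFF = 2.0      # absolute fold-change threshold stated in the text


def is_deg(padj, fold_change):
    """Return True if a gene passes both thresholds.

    fold_change is assumed to be a positive linear ratio; values below 1
    are inverted so that up- and down-regulation are treated alike.
    """
    abs_fc = fold_change if fold_change >= 1.0 else 1.0 / fold_change
    return padj < PADJ_CUTOFF and abs_fc > FC_CUTOFF


def expected_overlap(universe, n_degs, list_size):
    """Mean number of genes from a random list that land in the DEG set."""
    return list_size * n_degs / universe


def prob_overlap_at_least(k, universe, n_degs, list_size):
    """Hypergeometric upper tail: P(at least k of the list are DEGs by chance)."""
    hits = sum(
        comb(n_degs, i) * comb(universe - n_degs, list_size - i)
        for i in range(k, min(n_degs, list_size) + 1)
    )
    return hits / comb(universe, list_size)


# Hypothetical records: gene -> (adjusted p-value, linear fold change)
toy = {
    "geneA": (0.01, 3.0),    # passes both thresholds
    "geneB": (0.20, 5.0),    # fails the p-value threshold
    "geneC": (0.001, 0.4),   # 2.5-fold decrease, passes
    "geneD": (0.03, 1.5),    # fails the fold-change threshold
}
degs = sorted(g for g, (padj, fc) in toy.items() if is_deg(padj, fc))

# Assumed universe of 20,000 genes; 5981 DEGs and 29 candidates from the text
mean_hits = expected_overlap(20000, 5981, 29)
tail = prob_overlap_at_least(5, 20000, 5981, 29)
```

Under the assumed 20,000-gene universe, roughly 8.7 of the 29 candidates would be expected to fall within a 5981-gene list purely by chance, which is the sense in which the comparison cannot discriminate.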

      Next, we considered the dataset from Keen et al. In this paper, the authors tested different differentiation protocols to obtain GnRH-like neurons from human wild-type or mCherry embryonic stem cells (hESC). They transcriptomically profiled hESC-mCherry-derived GnRH neurons at 8, 15 and 25 days of culture.

      Again, although we cannot precisely define the matching embryonic stage of cells cultured for 8, 15 or 25 days, we compared the lists of DEGs from immortalized GnRH neurons (GN11 vs GT1-7) with the transcriptomic profiles of mCh-hESC at day 15 vs day 8 and mCh-hESC at day 25 vs day 15, respectively. We considered as differentially expressed the genes that displayed both an adjusted p-value of less than 0.05 and logFC>2. We found that the majority of the genes that were differentially expressed in one dataset were not in the other. However, the few genes that were differentially expressed in both datasets demonstrated a good correlation, i.e. the same expression trend. Although this latter approach was more fruitful, suggesting a partial similarity between primary GFP-GnRH neurons and hESC-derived GnRH neurons at the day 25 vs day 15 time-point, we do not feel that we could draw significant and reliable conclusions.
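The cross-dataset comparison described above (intersect the two DEG lists, then check whether shared genes move in the same direction) can be sketched as follows. This is an illustrative assumption about how such a check might be coded, not the authors' procedure: the function name, the gene labels and the logFC values are hypothetical, genes are assumed to be already mapped to shared orthologue symbols and pre-filtered for adjusted p < 0.05, and the logFC threshold is applied to the absolute value so that both directions are captured.

```python
def shared_deg_concordance(logfc_a, logfc_b, threshold=2.0):
    """Return (shared DEGs, shared DEGs with the same direction of change).

    logfc_a, logfc_b: dicts mapping gene symbol -> log fold change for genes
    that already passed the adjusted p-value filter in each dataset.
    """
    degs_a = {g for g, v in logfc_a.items() if abs(v) > threshold}
    degs_b = {g for g, v in logfc_b.items() if abs(v) > threshold}
    shared = degs_a & degs_b
    concordant = {g for g in shared if (logfc_a[g] > 0) == (logfc_b[g] > 0)}
    return shared, concordant


# Hypothetical values for illustration only
dataset_a = {"G1": 3.0, "G2": -2.5, "G3": 2.2, "G4": 0.5}
dataset_b = {"G1": 2.4, "G2": 3.1, "G5": -4.0, "G4": 2.9}

shared, concordant = shared_deg_concordance(dataset_a, dataset_b)
fraction_same_trend = len(concordant) / len(shared) if shared else float("nan")
```

In this toy case two genes are differentially expressed in both datasets and one of them changes in the same direction. A real comparison would also need to account for the platform differences (RNAseq vs microarrays) noted above.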

      Further, if we compare these two datasets obtained by RNAseq from hiPSC and hESC, even taking into account the large number of DEGs found in our re-analysis of the Lund et al., 2020 raw data, a relatively small number of common DEGs was found. These data also suggest that there is transcriptomic heterogeneity even among human-derived GnRH neurons.

      In addition to these two datasets, while our manuscript was under revision, a new paper was published, in which the authors dissected iPSC-derived GnRH neuron transcriptome with RNA-seq at single cell level (Wang et al., 2022, doi:10.1093/stmcls/sxac069). Again, although the same concerns may apply in comparing this dataset with ours and raw data of DEGs were not publicly available in this case, we compared the expression trends of our 29 candidates with gene expression trajectories identified in this work. As a result, 24/29 candidate genes, including NLGN3, were found to have an expression trend consistent with our dataset. The few remaining genes exhibited an opposite trend (2/29) or were not found in available data from this work (3/29). As this is a purely qualitative analysis, we do not feel it would be appropriate to include it in the Results section, but have included commentary on these comparative dataset analyses in the Discussion section (lines 247-257). A future study could be designed to mine the raw data from all the available transcriptomic profiles of developing GnRH neurons, but this is beyond the scope of our current manuscript.

      The authors need to comment on other GnRH1 expression in the brain of developing rodent and if they think the GnRH1 sorted neurons are just "GnRH Neurons" associated with reproduction (Parhar et al 2005) due to microdissection.

      Reply: We thank the reviewer for raising this point of clarification. We have carefully selected by microdissection nasal areas from E14, nasal and basal forebrain areas from E17 and basal forebrain from E20 rat embryos (see revised Methods, lines 325-327). We are therefore confident that what we have obtained is RNA from ‘reproductive’ GnRH neurons only.

      Questions about Cases/Missing Phenotypic Information: 1) Case 1: the patient underwent increased testicular volume on testosterone therapy -- testosterone therapy does not increase testicular volume. Has this patient undergone or been assessed for reversal of his hypogonadism?

      Reply: We thank the reviewer for their comment. The patient had minimal testicular development on testosterone (from 10 ml to 12 ml) but did not increase testes volume beyond 12 ml, consistent with a partial HH phenotype. He has had two trial periods of 3-4 months off testosterone treatment and during these periods had both low serum testosterone concentrations and symptoms of hypogonadism (tiredness, low energy and reduced muscle strength).

      2) Case 2: Is too young to be classified as having a pubertal defect. Microphallus is mentioned but what size, was this diagnosed at birth and treated? I think the case for GD is overstated in the results and discussion (especially with the discussion of small testes).

      Reply: We thank the reviewer for requesting these clarifications. The patient has not received any treatment for his microphallus (2.5 cm length in mid-childhood). We agree that this case is too young to be classified as having a pubertal defect, but the presence of microphallus and small testes volume in infancy and early childhood, in association with low gonadotrophins and absent erections, are well recognized as red flag signs for hypogonadotropic hypogonadism (Swee & Quinton, 2019, doi:10.3389/fendo.2019.00097). We added this information to the Results section, lines 175-177.

      Genetic Information: Since this was a candidate gene search -- what other candidate genes were uncovered in these probands?

      Reply: The revised list of 29 candidate genes were screened in the two probands from our study using the whole exome sequencing datasets for these individuals, and only the variants of interest in NLGN3 described in the manuscript were found.

      By searching for mutations of the revised list of candidate genes in our GD cohort, we identified nonsense variants only in NLGN3 and no splice variants. We also found few rare and predicted damaging missense variants in this gene list. Indeed, two rare (MAF 25) missense variants were identified in the genes PLXNC1 and CLSTN2 in two further probands (now summarized in Supplemental table 4). We have not identified further probands with PLXNC1 or CLSTN2 variants of interest from additional cohorts and thus have not yet taken these gene variants further for molecular characterization, but we will examine their relevance in future work.

      Do the probands have a clear explanation for their developmental disability other than the gene noted?

      Reply: We thank the reviewer for raising this point. Proband exomes were also screened for genes related to developmental delay and no other causal gene variants were identified. We added this information in the text, lines 183-185.

      I would encourage the authors to update Table 3: they are missing IHH/KS genes such as GLI3, SEMA7A, SOX2, STUB1, TCF12. I suggest they update the Table and analyses.

      Reply: We thank the reviewer for highlighting this point. Since we performed a new analysis, we also performed a new candidate gene prioritization using a more up-to-date gene list to instruct ToppGene (please see revised Supplemental table 3).

      CROSS-CONSULTATION COMMENTS Dear Reviewer #2, I am concerned that the paper presents only a single case of GD to support the scientific work. What do you think?

      Reply: We would like to highlight that, as we describe above, GD can be diagnosed prior to pubertal age in individuals with red flag phenotypic signs and biochemical evidence of hypogonadism.

      Dear Reviewer #1: In addition to the weakness in the microarray data, what do you think about the authors using publicly available data from human GnRH neuron transcriptomics for analysis?

      Reply: Please see the above discussion on the comparison with publicly available datasets.

      Reviewer #3 (Significance (Required)):

      There is not high significance to this paper: This is not the first article with GnRH transcriptomes. I would argue the human data is more relevant. Developmental disability has been previously linked to GnRH deficiency (as even cited in this paper). The article presents one case of GnRH deficiency, and one pre-pubertal case -- providing some modest evidence for a candidate gene, NLGN3.

      Reply: We would like to rebuff this assessment of the paper’s significance. To our knowledge, this is the first report of transcriptomes from primary GnRH neurons isolated at key embryonic developmental time points. Other published reports refer to iPSC-derived or adult GnRH neurons (Keen et al., 2021; Lund et al., 2020; Wang et al., 2022; Vastagh et al., 2016 and 2020).

      Similarly, the association of central hypogonadism with developmental disabilities has been reported in registry-based studies, but few causative genes have been identified, nor have patient variants been functionally validated in order to investigate the molecular biology underpinning this association. In the Discussion, in the light of a recent paper (Manfredi-Lozano et al., 2022, doi: 10.1126/science.abq4515), we also postulate that NLGN3 might be required for neuritogenesis of extra-hypothalamic projections of GnRH neurons, thus contributing to the pathogenesis of NDD (lines 294-300).

      Regarding human data, we would like to acknowledge that we had a third case that we were not able to publish because family consent was not available. NLGN3 deficiency is likely to be a rare disorder, but that should not obviate the impact of investigating the molecular etiology – indeed, many insights into human biology have come from private mutations in rare disease.

    1. authority and the expertise to make weather predictions in the first place and it's a story about how to transform knowledge of nature into market knowledge and thus profit and we'll see some of these similar

      I think this whole thing is interesting; it almost reminds me of today's political climate. I feel like there are times in which the different news sources all argue and disagree in similarly childish ways, and where people will manipulate the information that is disseminated to the public for their own personal gain. This is an interesting social facet of the interaction/connection of nature and commerce, which I find interesting because commerce is a largely social construct and innovation in many ways is as well. Social facets may be more important than we think in these analyses; it almost reminds me of Solnit's designation of the ghost dance as technology and the number of social/group emotional factors that unexpectedly need to be considered in that thought.

  4. Oct 2022
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      RC-2022-01632

      Answers to referees

      First of all, we wish to thank the 3 referees for their careful evaluation of the manuscript. We see many of the issues that they have raised as legitimate and have tried to provide experimental or editorial answers. In contrast, some issues are presently being addressed in the context of a future manuscript and we would rather not introduce these studies in the revised version.

      Below, one will find the answers and the description of the revisions already introduced in the revised manuscript (questions are recalled in blue italics).

      New and modified figures, plus not shown figures and tables are indicated in the text below but could not be pasted in the document and can be found in the Revision plan.

      Referee # 1

      Evidence, reproducibility and clarity

      They then delivered 86/8 and LSBio anti-En1 antibodies, that catch En1 in the cleft and prevent it from being captured by MNs.

      Perhaps we were not clear. We did not deliver the antibodies 86/8 and LSBio, we used them for western blots and immunohistochemistry (IHC) to identify EN1 and localize it. We delivered the third antibody, a single-chain anti EN1 antibody (scFvEN1), that captures extracellular EN1 and prevents it from being captured by MNs on the basis of the LSBio staining (Figure 4A-C).

      Finally, heterozygotes revealed also a degeneration in dopaminergic neurons within midbrain similar to the one observed in spinal MNs, along with an upregulation of SQSTM1/p62 gene/protein, a factor in MN ageing linked to the classical genes implicated in familial forms of ALS (SOD1, TDP-43, FUS, and C9ORF72).

      This is a fair comment/work description, that does not require answers.

      Significance

      *Major comments:*

      It is unclear why levels of intensity for RNAscope were not quantified, and qPCR was preferred for quantifications in Figure 1b. RNAscope is a technique that allows for spatial distribution analysis of the markers and their level of the expression. This data can be easily quantified utilizing the QuPath software which is open access. Same concerns apply to Figure 2a.

      Quantitative RT-PCR provides a quantitative measure of gene expression. Since only V1 interneurons (including Renshaw cells) express EN1, we infer the spatial distribution, although not expression level cell by cell. Figure 2A is an actual counting at 4.5 months of En1+ cells and of Calbindin+ cells (Renshaw cells), both identified by RNAscope. Thus, it is clear that the number of En1-expressing cells (V1 interneurons) is not modified at 4.5 months when muscle weakness and death of aMNs are well advanced (around 70% of the aMNs that will eventually die are already gone). Long-term survival of V1 interneurons is further demonstrated in Figure 2D (left panel) until 15.5 months (see also below), whereas total En1 expression is reduced by half. Quantification neuron by neuron of the amount of En1 transcribed (RNAscope) would indicate the variation, among interneurons, of En1 transcription in WT and mutant mice. This is interesting per se but would not modify the main information that these neurons do not die in the heterozygote and that En1 transcription does not decrease with time in both WT and mutant genotypes (at least until 15.5 months).

      *Antibodies should be validated utilizing a reporter mouse. En1cre mice are commercially available and can be crossed with reporters (TdTomato or YFP mice). Utilizing this tissue En1 antibodies can be easily validated. The EN1 antibody shown in Figure 1c seems unspecific, staining several neuronal populations in the spinal cord.*

      Indeed, antibody validation is extremely important. LSBio is commercial (CliniSciences), 86/8 was developed in the laboratory and fully characterized and used in previous studies (e.g. Alvarez-Fischer et al. Nature Neurosci. 14: 1260-1266, 2011; Rekaik et al. Cell Reports 13: 242-250, 2015; Blaudin de Thé et al. EMBO J. 37: e97374, 2018), scFv against EN1 was prepared from the 4G11 hybridoma (Developmental Hybridoma Bank, Iowa City, USA) and validated in previous studies (e.g. Wizenmann et al. Neuron 64: 355-366, 2009). In the present study, the two polyclonal antibodies were further validated in several ways.

      In the WBs we compared ventral midbrain (VMB) and spinal cord (SC) tissues and found similar patterns. Strong evidence for antibody specificity is immunostaining extinction with the antigen and with absence of first antibody, which we carried out.

      We have now used LSBio and 86/8 to perform a WB on spinal cord (SC) and ventral midbrain (VMB) extracts with or without the first antibody and we find that the absence of first antibody fully eliminates band staining. The western has been introduced in the revised manuscript in place of the cross immunoprecipitation.

      Finally, we have quantified EN1 in the MNs of the heterozygote at 3 months (before cell death), showing that EN1 content is decreased by approximately 2-fold (LSBio antibody) in both aMNs and gMNs with no change in neuron number. This result, demonstrating that EN1 is diluted by approximately twofold (concentration per neuron when all neurons are still present), in addition to further validating the antibody, is itself interesting and has been introduced in the revised manuscript as Supp. Fig. 1A.

      Regarding the staining in other neuronal populations, there is always some background, in particular in the tissue treatment conditions used for RNAscope. Furthermore, given the large number and wide distribution of V1 interneurons (Fig. 1A), we cannot preclude that EN1 is present at a low concentration in the extracellular space and in several cell types (discussed in Fig. 9 of the manuscript). This does not weaken the main conclusion that it primarily accumulates in MNs which do not express En1 (RNAscope).

      *Investigations of En1 expression in motor neurons from already available omics data sets would support the idea that En1 is expressed in motor neurons.*

      The En1 locus is silent in MNs. Microdissection of MNs and proteomic analysis would not be definitive since the interneurons that produce EN1 are in close vicinity of the MNs and since some protein is necessarily present in the extracellular space (where it is trapped by scFvEN1), making contamination unavoidable.

      Differentiation between Gamma and Alpha motor neurons should be performed using specific markers as Err3, Wnt7a or NeuN.

      This is a possible way to make the distinction, but the size criterion in Cresyl violet is supported in the literature (Wu et al. Journal of Biological Chemistry, 287: 27335-27344, 2012; Dutta et al. Experimental Neurology, 309: 193-204, 2018). In our study, it is further validated by the demonstration that, in 9-month-old animals, the results obtained (cell number and specific death of large neurons >300 µm², but not of intermediate size ones 200-299 µm²) are replicated by counting ChAT-stained neurons (Figure 2C). It is of particular interest that the number of medium size neurons (also ChAT-positive medium size MNs) does not increase when the number of large size (Cresyl and ChAT-positive) neurons decreases, thus precluding a “shrinkage effect”. Most importantly, the size criterion (Cresyl violet) ensures that we are not misled by a possible down-regulation of markers in the mutant, independently of cell survival. We provide for the reviewer (Revision plan), but not for publication, the evolution with time of the number of neurons based on size (above 200 µm²), showing clearly that at 15.5 months the large population (>300 µm²) is decreased in the En1-Het, with very little change for neurons between 200 and 300 µm², and certainly not an increase, which would be expected if shrinkage occurred.
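The size criterion can be made concrete with a short Python sketch that bins soma cross-sectional areas into the classes used in this response. It is only an illustration under stated assumptions: the function names and the example areas are hypothetical, the class boundaries follow the ranges quoted in the rebuttal (100-199, 200-299 and >300 µm²), and the handling of exact boundary values (for example, counting 300 µm² as large) is an assumption, since the text does not specify it.

```python
from collections import Counter


def size_class(area_um2):
    """Assign a neuron to a size class from its soma area in square micrometres."""
    if area_um2 >= 300:      # large neurons (aMNs in the rebuttal's terminology)
        return "large"
    if area_um2 >= 200:      # intermediate size neurons
        return "intermediate"
    if area_um2 >= 100:      # small neurons
        return "small"
    return "unclassified"    # below the ranges discussed in the text


def count_by_size(areas_um2):
    """Count neurons per size class for one section or animal."""
    return dict(Counter(size_class(a) for a in areas_um2))


# Hypothetical measurements for illustration only
wt_areas = [120, 150, 250, 280, 310, 450, 80]
counts = count_by_size(wt_areas)
```

Comparing such counts between genotypes at each age shows why shrinkage can be excluded: if large neurons merely shrank, the intermediate class would grow as the large class fell, which is not what the authors report.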

      We were indeed surprised by this finding and a plausible explanation is that a lower metabolic activity makes interneurons less sensitive to stress than aMNs which have to “fuel” long axons and high firing rates (not the case for gMNs). We propose this explanation in the discussion and make it clearer in our revised version. We agree that it is speculative and that the point raised by the reviewer is very interesting. We hope to address this in the future and have discussed this point.

      Since the cells do not die, we did not look for signs of apoptosis.

      We analyze lumbar sections from L1 to L5, as now indicated in the methods section of the manuscript.

      The set of experiments reported in Figure 4 is difficult to interpret without showing the actual presence of extracellular En1, which could be assessed with protein detection or RNAscope.

      This is another interesting suggestion, but we think that it will be difficult to distinguish low extracellular staining due to EN1 diffusion from some unspecific background. Since the scFvEN1 is secreted by astrocytes, it necessarily neutralizes extracellular EN1, resulting in a decrease in the MN content of the protein. This is an experiment with high specificity since the same scFv harboring a Cysteine to Serine point mutation that prevents EN1 recognition (no disulfide bond formation between the light and heavy chains) does not block EN1 capture by MNs (Fig. 4C for IHC and quantifications).

      As for extracellular EN1 mRNA identified by RNAscope, we hesitate to embark on the idea as mRNAs are likely secreted in insufficient amounts to be identified, even by RNAscope. The results that we have (no En1 visible by RNAscope in MNs, loss of EN1 in MNs following extracellular scFvEN1 activity, and preferential addressing of injected EN1 to MNs) demonstrate EN1 capture by MNs. Indeed, we cannot completely preclude the transfer of tiny amounts (escaping RNAscope detection in MNs) of En1 mRNA (for example, through extracellular vesicles), but we plead for not considering this hypothesis in the present paper. However, if the reviewer wishes, the possibility can be introduced in the discussion.

      Referee 2

      Evidence, reproducibility and clarity

      In general, most of the experiments shown in this study are well done and convincing. However, the data on p62 upregulation appear correlative and do not allow any conclusions about the mechanism and function how EN-1 modulates motoneuron survival and function. In addition, this study is not very precise on the mechanisms how motoneurons degenerate in this model so that there are only limited insights into the way how EN-1 acts on motoneurons in a physiological manner and under pathophysiological conditions.

      This criticism is justified, at least in part, as we agree that p62 upregulation is correlative. However, the fact that the neutralization of extracellular EN1 by the scFv increases p62 expression is in favor of a causative link. The increase is also seen at 3 months in the En1-Het when all aMNs are still present but not after, which is interesting because, due to aMN death, surviving MNs receive more EN1, information provided below and now introduced and discussed in the revised manuscript (Supp. Fig. 1B).

      As for p62, and as also mentioned by referee 3, Fig. 8 is very hard to follow and we propose to simplify it to make the message clearer:

      We have revised Fig. 8C, D, in which we focus exclusively on SQSTM1/p62 mean expression (see revision plan).

      A second point is that a difference in mean p62 expression between WT and Het is seen only at 3 months in aMNs. For aMNs, we propose that this is due to the fact that they are very sensitive to EN1 dosage (in contrast with gMNs, which do not die in the En1-Het). At 3 months, aMNs have only half of their normal EN1 content. Later, at 4.5 months, 75% of the aMNs bound to die are already dead (Fig. 2D) and the remaining neurons receive more EN1 (even more so at 9 months), as could be measured (see above, Supp. Fig. 1B). We thus can propose an accelerated aging of aMNs at 3 months due to both EN1 decrease and high metabolic activity (higher than in gMNs).

      In the case of the scFv, scFvEN1, but not the mutated version, induces enhanced mean p62 expression in the 80% surviving aMNs and in gMNs at 7 months (low aMN death in this model, see Fig. 4F). As can be seen also in a newly added figure (Supp. Fig. 2) that has been introduced in the revised manuscript and is shown below, 7-month-old scFv animals and 3- to 3.5-month-old En1-Het have similar phenotypes. This mild scFv phenotype (aMN death and muscle strength loss) in 7-month-old mice in spite of a huge loss in the EN1 content of MNs (Fig. 4C) suggests that the En1-Het phenotype is not entirely due to the decrease in EN1 transport from V1 interneurons to MNs (see discussion and Fig. 9).

      It remains true that we have voluntarily decided not to examine in depth the molecular mechanisms allowing EN1 to exert its protective activity, a decision that we would like to defend and maintain.

      A first reason is that in previous papers on mesencephalic dopaminergic (mDA) neurons (Alvarez-Fischer et al. Nature Neurosci. 14: 1260-1266, 2011; Rekaik et al. Cell Reports 13: 242-250, 2015; Blaudin de Thé et al. EMBO J. 37: e97374, 2018), we evaluated several mechanisms involved in EN1 neurotrophic activity and we did not want this study to be a duplication of studies done on a different neuronal population, even if mechanisms might differ in part between aMNs and mDA neurons. What has interested us more is that, in the two cases, age is an important factor in the unveiling of the degeneration phenotype (mDA neurons start dying at 1.5 months and aMNs at 3 months). It is because of this similarity that we performed the bioinformatic study that has led us to SQSTM1/p62. In this context, it is of interest that mean SQSTM1/p62 expression (variability of expression between neurons is not discussed in the revised version) increases with age in the wild type, and thus can be seen as an age marker. It allows us to propose that EN1 extracellular neutralization and the loss of one En1 allele, which both increase mean SQSTM1/p62 expression, accelerate aging.

      A second reason is that the study is oriented toward a possible use of EN1 as a therapeutic protein. This orientation also has to do with the focus on SQSTM1/p62. Indeed, there are probably many pathways downstream of EN1, but in the bioinformatic analysis of genes differentially regulated in WT and En1-Het mDA neurons and also expressed in MNs, SQSTM1/p62 is the only one that interacts with the 4 genes mutated in the major ALS familial forms. In addition, SQSTM1/p62 mutations have been observed in ALS patients (References 41 to 45 in the manuscript).

      Finally, the most important point is that the main message of this paper is the discovery of a non-cell autonomous EN1 activity in the spinal cord and of its ability to travel between V1 interneurons and MNs. This specificity, best explained by a targeting signal that we have identified, is at the basis of the specific addressing to MNs of intrathecally injected EN1, which also has implications for its potential therapeutic use.

      Specific points of criticism

      • In Fig. 2a, the authors show that EN-1-positive interneurons are not reduced at 4.5 months in the spinal cord. No data are shown for later time points such as 9 months, the corresponding stage when motoneuron loss is observed, or at 16 months which corresponds to the data shown in Fig.1. The argument that there is no reduction of V1 interneurons between 4.5 months and 16 months because there is no decrease of EN-1 expression between 4.5 and 16 months, as shown in Fig. 1b is not convincing. EN-1 expression could change in individual cells, thus compensating for the loss. Data on numbers of EN-1-positive cells at 9 and 16 months should be included, and a potential autocrine effect of EN-1 on V1 interneurons, as observed in midbrain dopaminergic neurons, characterized in more detail.

      Fig. 2A illustrates the absence of interneuron loss at 4.5 months, but this set of data is completed by those of Fig. 2D that demonstrate the maintenance of V1 interneuron number until 15.5 months, at least. It can be noted that, in contrast with interneurons, aMNs at 4.5 months have experienced massive cell death (70% approx. of total aMN death at 15.5 months). As a whole, data of Fig. 2 demonstrate that the number of small neurons (100-199 µm²) and intermediate size neurons (200-299 µm²) does not change with age, at least through 15.5 months. This is in strong contrast with large aMNs (>300 µm²). As already explained in our answers to referee 1, size is an excellent marker for the identification of neuronal subtypes and the analysis of survival (see answers to referee 1, justifying the use of neuron size).

      • In Fig. 2e, the authors present data on loss of muscle strength between 4.5 and 15.5 months. They conclude that this reflects gradual neuromuscular strength loss. Since neuromuscular endplates have a very high safety factor, they can maintain full function even if transmitter release is reduced by more than 80%. Therefore, the loss of muscle strength seems to reflect the progressive loss of presynaptic terminals at neuromuscular endplates, rather than a gradual loss of neuromuscular strength.

      We apologize for the semantic confusion. What is measured is a progressive loss of muscle strength due to the progressive loss of presynaptic terminals and not a gradual loss of neuromuscular strength. This is now modified throughout the revised text.

      • More detailed data on NMJ morphology should be included. How does EN-1 modulate neuromuscular endplates? Is EN-1 located at neuromuscular endplates after being taken up from motoneurons? Even if the mechanism is indirect, via upregulation of p62 under conditions when EN-1 signaling is reduced, does this situation lead to enhanced localization of p62 at neuromuscular endplates? *

      We do not see expression of En1 mRNA or the presence of EN1 protein at the level of the endplate (Supp. Fig. 3 in revision plan)

      • The data shown in Fig. 3 on changes in NMJ morphology appear incomplete and not convincing. As SV2a is not a good marker for changes in presynaptic compartments since it does not allow conclusions on how many synaptic vesicles are released, additional markers for presynaptic active zones such as Bassoon, Piccolo, Munc-13 should be studied. The analysis of fully occupied endplates appears arbitrary, and the differences are relatively small. Additional EM pictures and quantitative analyses of active zone proteins in the presynaptic compartment would help to support the argument of the authors that presynaptic compartments degenerate before cell bodies are lost in EN-1 +/- mice. *

      SV2a and NF staining (it is not only SV2a) at the level of endplates identified by a-Bungarotoxin labeling has been used in a large number of studies (Wahlin et al. J. Comp. Neurol. 506: 822-837, 2008; Hasting et al. Scientific Reports 10: 1-13, 2020; Yahata et al. J. Neurosci. 29: 6276-6284, 2009; Jones et al. Cell Reports 21: 2348-2356, 2017). Our goal was not to document the loss of synaptic activity through the use of the three suggested markers, Bassoon, Piccolo and Munc-13. Doing so would require us to initiate experiments taking several months to prepare the material and carry out a quantitative analysis in the models of EN1 loss of function (En1-Het) and neutralization (scFv), plus rescue by EN1. Nor do we wish to initiate a novel collaboration to produce a quantitative ultrastructural study. We see the latter morpho-functional studies as beyond the scope of the manuscript and wish to be given the possibility to present them in a separate study (see below in “Description of the experiments that the authors prefer not to carry out”).

      The distinction between fully occupied, partially occupied and denervated endplates is not arbitrary and we apologize for not having sufficiently described the methodology. As illustrated in modified Fig. 3 and explained in Material and Methods, a fully innervated endplate is defined as an endplate in which 80% or more of the green pixels (a-BGT) are covered by a red pixel (SV2a), a partially innervated one has between 20 and 80% coverage, and a denervated one has below 20% coverage. Thus at 9 months and later ages, close to 30% of the endplates are either partially innervated or denervated. In fact, it is more likely that they are partially innervated since the number of AChR clusters does not change (totally denervated clusters normally dissolve). The 80% threshold for fully innervated was selected to give a margin of security, and it is likely that the percentage of 25 to 30% of partially innervated endplates is an underestimation.
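
      The classification rule described above can be sketched in a few lines. This is an illustration only, not the analysis code used for the study: the representation of an endplate as two binary pixel masks and the function names `coverage_fraction` and `classify_endplate` are assumptions made for the example; only the 80% and 20% thresholds come from the text.

```python
def coverage_fraction(bgt_mask, sv2a_mask):
    """Fraction of a-BGT-positive (green) pixels that are also SV2a-positive (red).

    Both masks are assumed to be equally sized 2D lists of 0/1 values
    covering a single endplate (hypothetical input format).
    """
    green = 0
    covered = 0
    for bgt_row, sv2a_row in zip(bgt_mask, sv2a_mask):
        for g, r in zip(bgt_row, sv2a_row):
            if g:
                green += 1
                if r:
                    covered += 1
    if green == 0:
        raise ValueError("no a-BGT-positive pixels in this endplate")
    return covered / green


def classify_endplate(coverage):
    """Apply the thresholds from the text: >= 80% fully innervated,
    20 to 80% partially innervated, below 20% denervated."""
    if coverage >= 0.80:
        return "fully innervated"
    if coverage >= 0.20:
        return "partially innervated"
    return "denervated"


# Toy endplate: 10 green pixels, 5 of them covered by red pixels.
bgt = [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
sv2a = [[1, 1, 1, 1, 1], [0, 0, 0, 0, 0]]
label = classify_endplate(coverage_fraction(bgt, sv2a))
```

      In this toy case half of the a-BGT pixels are covered, so the endplate falls in the partially innervated class.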

      A table with the calculations and the modified Figure 3 are presented in the revision plan.

      We agree that we were not clear enough in our description and that it may have given the impression that the differences were relatively small. We think that retrograde degeneration is strongly supported by a loss of muscle strength that parallels the decrease in fully occupied endplates (a-BGT, NF, SV2a) and precedes aMN loss by more than 1 month. We have recently contacted an electrophysiology group to establish a collaboration that will allow us to follow functional changes at the level of the spinal cord and of the neuromuscular junction and we see the experiments proposed by the reviewer as complementary to these physiological approaches. Yet, we do not want to ignore the opinion of the reviewer and mention it in the conclusion, on the basis of his/her comment.

      • The authors present evidence for a glycosaminoglycan (GAG) binding domain that appears responsible for uptake of EN-1 into motoneurons. However, it is unclear into which cellular compartment EN-1 is taken up after GAG binding on motoneurons. The authors propose this could be an alternative pathway to conventional endosomal uptake. How can the EN-1 that is taken up into cells exert transcriptional effects in motoneurons? As a minimum, more data on the subcellular distribution of endocytosed EN-1 should be included to support current hypotheses and to close the gap from cellular uptake to transcriptional regulation. *

      The question is justified since we did not recall until page 12 of the Discussion that EN1 is, as most tested homeoprotein transcription factors, captured by a mechanism distinct from endocytosis. While not yet fully understood, the process involves the formation of inverted micelles that allow for direct targeting to the cytoplasm and from there to the nucleus thanks to the NLS. We now mention in the introduction that EN1 transfer and HP transfer is based on unconventional secretion and internalization processes.

      • The differences in p62 expression with age in WT and EN-1 +/- mice as shown in Fig. 8c are not convincing. First, the p = 0.0499 and p = 0.0536 values for differences at 3-4 months of age appear borderline, and it is unclear what the dispersion analysis that is shown really means. Moreover, the question remains how a potential dysregulation of p62 then affects NMJ morphology and function. Is this change in p62 also detectable in presynaptic compartments? *

      We agree that p values close to 0.05 indicate only borderline significance; this is due to the heterogeneity in SQSTM1/p62 expression, which reflects that of MN populations and induces a high variance. We also agree that this figure is too complicated and a simplified version has been proposed above (see answers to reviewer 1). To summarize, Fig. 8C shows that in WT animals, with no aMN death (grey), the level of SQSTM1/p62 expression in aMNs and gMNs increases between 3 and 4.5 months and between 4.5 months and 9 months, with significances varying between p… The new Fig. 8 panel D (please see above, answers to referee 1) now includes the results obtained with the scFvs. A phenotype comparison between the two models (En1-Het and scFvEN1) has been introduced in Supp. Fig. 2 (see above).

      We have no evidence that EN1 modulates the SQSTM1/p62 promoter directly. The identification of this gene as a target (not necessarily a direct target) of EN1 comes from the bioinformatic analysis described in the manuscript and we were intrigued by the interaction with the 4 main familial ALS mutations and the existence of families with SQSTM1/p62 mutations. This is what led us to analyze its expression in our two models of EN1 loss of function. Although the En1-Het mouse is not an ALS model, the results support the idea that EN1 could be used as a therapeutic protein in several familial and even sporadic forms of the disease. The latter hypothesis is now being tested on MNs derived from iPSCs (sporadic patients, fALS and isogenic variants, and healthy controls). If the data lend weight to our hypothesis, as collaborative and in-house preliminary data suggest, then a complete analysis of EN1 targets in human MNs will be undertaken. Again, we really think that this is out of the scope of this study.

      For Fig. 8, we fully agree that it can give headaches and we apologize. Moreover, it induces wrong interpretations (mean intensity increases with age and dispersion between 4.5 and 9 months has a calculated p…

      Referee #3

      Evidence, reproducibility and clarity

      Nevertheless, the connection between EN-1 and p62 is not well developed by the data presented and future readers may be left with many questions regarding how EN-1 and p62 are related (e.g. direct interaction? transcriptional regulation?), whether p62 is indeed the mediator of EN-1 trophic effects, or the significance of the increased levels of p62 for motoneuron disease

      The reviewer is right and we have tried to better explain and to simplify. Please see responses to referees 1 and 2.

      *Figure 1C: There appears to be EN1 immunoreactivity (green) in several areas of the spinal cord, including dorsal regions. Can the authors clarify what that labeling could be representing? *

      Unfortunately, there is always some background staining, in particular in the tissue treatment conditions appropriate for RNAscope. Furthermore, given the large number and wide distribution of V1 interneurons (Fig. 1A), we cannot preclude that EN1 is present at a low concentration in the extracellular space and in several cell types (now represented in Fig. 9). This does not weaken the main conclusion that it primarily accumulates in MNs which do not express En1 (RNAscope).

      *Figure 1D: These immunoprecipitation results lack a negative control with irrelevant antibody to confirm that the band shown it's being recognized specifically by the antibodies reacting with the blot. *

      Please see the response to reviewer 1 above with the Western blot and the absence of staining on a WB in absence of first antibody (86/8 or LSBio).

      *Figure 1E: The intensity of the EN1 labeling in MNs, much stronger than in V1 interneurons, is intriguing, given that MNs do not express engrailed-1 mRNA. One would have expected the opposite. It may help here if it was possible to show that immunoreactivity in MNs is diminished in the het mutant mouse. *

      We were also surprised that the intensity is higher in MNs than in V1 interneurons, as if the protein was exported rapidly towards the target neurons. We have done the experiment proposed by the referee and found an approximately twofold reduction in immunoreactivity in En1-Het MNs (see above Supp. Fig. 2A in answers to referee 2). This supplemental figure has been introduced in the revised version. The experiment was done at 3 months, when no MN death has yet occurred. Later the neurons “replenish” with EN1, probably because they do not have to share the limited supply with the dead ones (see above answers to referee 2 and Supp. Fig. 2B).

      *Figure 2D: There are a few possible problems with these data and their interpretation. First, this reviewer feels that 5 neurons (y-axis) is a rather small number. Are these 5 neurons per what area? From how many mice? I did not find that information in the figure legend. A larger area should be quantified so that we get numbers that are more robust. Second, such differences could also be due to hypotrophy of the MNs, namely, that MN number is the same but they are smaller. *

      The differences cannot be attributed to hypotrophy. A first reason is that, at 9 months, the Cresyl violet and ChAT staining give the same results for medium size and large neurons (Fig. 2C). Furthermore, when one counts the cells through 15.5 months, the decrease in the number of large neurons is not compensated by an increase in the number of medium size or small ones. The reasoning and a graph, not intended for publication, can be found in answers to referee 1.

      *Figure 3A: It would be useful that the authors explain how these AChR clusters were defined, visualized and counted. I could not find this information in the Methods. Perhaps this could be done by showing an alpha-BTX image illustrating the clusters. *

      We fully agree that the procedure was not well explained and we have introduced a correction in the Material and Methods section. For more details, please see answers to referee 2.

      *Figure 3B: As each adult endplate is only innervated by one MN, one would have expected fewer clusters and/or endplates, if indeed MNs are missing in this mouse, rather than endplates that are partially occupied. This could be clarified a bit more explicitly. *

      This is true and the ambiguity takes its origin in insufficient explanation of how fully innervated, partially innervated and denervated endplates were defined. Please see above and also in answers to reviewer 2. Modifications have been introduced in the text and in Fig. 3. The referee is right, the absence of change in the number of AChR clusters suggests that there are very few fully denervated endplates and that what is defined as such in the analysis corresponds to partially innervated endplates (see above). This is now discussed in the text.

      Figure 6B: Would not be better to do this with a virus, like in the case of the antibody? A more robust effect on MN survival may be attainable and thus strengthen the concept.

      This would be another interesting experiment and we are presently exploring this possibility (with preliminary results). The choice of the virus and of the promoters is very important. We are comparing several AAVs, including AAV2, AAV2-TT (which diffuses better) and AAV8. For the promoter, we do not want to express within MNs as the imported protein might have special properties, associated with import. V1 interneurons would be best, but we have to verify if this does not modify V1 physiology. Astrocyte is another option, but with a similar pitfall. This means that we have a long way to go before proposing a “gene therapy” approach.

      In addition, in the context of future clinical studies, we were eager, on the basis of the long-lasting activity of the protein already observed in the mesencephalic dopaminergic neurons (Alvarez-Fischer et al. Nature Neurosci. 14: 1260-1266, 2011; Rekaik et al. Cell Reports 13: 242-250, 2015; Blaudin de Thé et al. EMBO J. 37: e97374, 2018), to try a protein therapy in the spinal cord. Interestingly, the effects are also long-lasting in the spinal cord, (12 weeks in the mouse before a second injection is needed) and, according to contacted physicians, intrathecal injections, every second month or even more frequently, could be envisaged in the human. In that case, protein injection is possibly advantageous for the following reasons:

      (i) viral particles can travel far and we do not know what would be the side effects.

      (ii) the protein is short-lived but specifically addressed to MNs (thanks to the presence of EN1 binding sites at their surface), thus minimizing the issues associated with permanent expression and side effects.

      (iii) EN1 is a natural protein normally secreted and the immune system might not be solicited as much as with viral approaches.

      *Figure 7A: The protein seems to be mainly in the cytoplasm of those cells (nuclei are dark and unlabeled), which is also unusual for a transcription factor that functions in the nucleus. Also surprising that the protein is gone in 3 days, but has effects over 24 weeks. Any explanation for that? *

      The protein is imported and is thus both in the cytoplasm, where it exerts an effect on protein translation (Brunet et al. Nature 438: 94-98, 2005; Alvarez-Fischer et al. Nature Neurosci. 14: 1260-1266, 2011; Yoon et al. Cell 148: 752-764, 2012), and in the nucleus, where it exerts its transcriptional and “epigenetic” activity (see below for the latter). In fact, different antibodies and fixation procedures can favor cytoplasmic or nuclear staining. When staining is nuclear, the dark point at the center, probably the nucleolus, is less stained.

      Two images illustrating this point are shown in the revision plan.

      For the second part of the question, three days are sufficient for a long-lasting activity. This was also observed in the midbrain where the protein restores the epigenetic marks jeopardized by an acute oxidative stress (Rekaik et al. Cell Reports 13: 242-250, 2015). This has led to the hypothesis that EN1 has an important action at the level of the structure of the heterochromatin, thus a long-lasting “epigenetic” activity. We are presently working on the latter effects on the chromatin structure using human MNs derived from iPSCs (patients and control).

      *Figure 7B: It's not clear what the blue and red bars mean, as this is not explained in the legend. Also, the y-axis says "%Chat+" suggesting they are counting MNs, but in the text they talk about EN-1 capture. If the latter, the y-axes should indicate % EN-1 over Chat, or something like that. In general, better figure legends would improve the experience of the reader. *

      In this experiment, we wanted to test the presence of a GAG-binding domain in EN1. To test its potential role in EN1 internalization and localization, we co-injected or not the RK-EN1 with hEN1 protein. Then, we counted the percentage of MNs (%ChAT+) which contain, or not, the hEN1 protein (hEN1+ in red or hEN1- in blue), allowing us to verify if the RK-EN1 alters the internalization of the hEN1 protein. So yes, we are looking at the capture of EN1 by the MNs with or without the RK-peptide (or control peptides). We have modified the text to make the point clearer.

      *Statistical analyses: In principle, comparisons of data obtained in studies that involved two variable parameters (such as time and genotype/treatment) should be weighted by a 2-way ANOVA test, which is more stringent since more conditions are being tested simultaneously. Usually a t-test is reserved for a pairwise comparison in an experiment involving only two conditions of the same variable. *

      The reviewer is correct. The two-way ANOVA is explained in the Statistical analyses section of the Methods. The analyses were carried out and the results listed in the legends for Figs 2, 3, 4, 6 and Supp. Fig. 1.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We would like to thank the editor for handling our manuscript entitled, “Mouse SAS‑6 is required for centriole formation in embryos and integrity in embryonic stem cells”, and the reviewers for the insightful comments and suggestions to improve our work. We aim for our manuscript to be considered for a “Short Report” format. As such, we would like to emphasize that we did not focus on the in vivo part of our study, where the Sas-6 mutant mouse embryos resemble our previously published Sas-4 mutants, as pointed out by the reviewers, because both mutants lack centrioles. In our opinion, the novelty of our work lies in the discovery that mouse embryonic stem cells (mESCs) lacking SAS-6 are still able to form centrioles, albeit mostly abnormal ones, a view that is shared by the reviewers. This is in contrast to Sas-4 mutant mESCs, for example, which lack centrioles (Xiao*, Grzonka* et al, EMBO Reports 2021), and cultured human cell lines without SAS-6, which have been shown to lose centrosomes. We are in the process of editing the manuscript and performing additional experiments per the reviewers’ recommendations. Below, we provide a point-by-point description of our revision plan.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The article by M. Grzonka and H. Bazzi entitled: Mouse Sas-6 is required for centriole formation in embryos and integrity in embryonic stem cells, describes new findings in novel mouse models of Sas-6 knockouts (KO). This is an interesting study that reports two different mouse Sas6 KO models and the depletion of Sas6 from mouse embryonic stem cells (mESCs). This type of analysis has never been done before and so it reveals and describes a role for Sas-6 in centriole biogenesis in mouse.

      We thank the reviewer for highlighting the novelty of our work on the roles of SAS-6 in mice.

      The authors compare their analysis with Sas-4 KO and overall found similar results when compared to previous work from H. Bazzi, when Sas-4 was depleted in mouse embryos. Due to the mitotic stopwatch pathway, Sas6KO embryos die during development at extremely early stages and this can be rescued by depletion in p53 and other members of the pathway.

      Perhaps, not so surprisingly, these embryos do not contain centrioles, showing that in vivo, Sas6 is absolutely required for centriole duplication. More surprisingly, however, in cultures of mESCs, established and propagated in vitro, Sas-6 crispr induced KO, does not result in lack of centrioles. Instead, abnormal structures that show aberrant morphologies, length, and incapacity to assemble cilia were detected. In principle, this means that centrioles can be assembled independently of Sas-6, even if not in the correct manner.

      We again thank the reviewer for astutely pointing out the most surprising finding in our data, which is that mESCs lacking SAS-6 can still form centrioles.

      The authors interpret these differences as possible differences in the pathways involved in centriole assembly and propose different requirements in different cell types, within the same species.

      I have problems with this interpretation. To me it is very difficult to understand how the "protein" absolutely required for cartwheel assembly at the early stages of centriole biogenesis can be essential and dispensable at the same time. Although I may be wrong, I think the authors have not envisaged other possibilities to interpret their data, which should be taken into consideration.

      We agree with the reviewer that SAS-6 is currently considered in the centrosome field as one of the “core” centriole formation or duplication factors and that it is a major component of the cartwheel scaffold during the early phase of centriole biogenesis. Although the absence of centrioles in the Sas-6 mutant mouse embryos in vivo supports the essential function of SAS-6, and perhaps the cartwheel, in centriole formation, the mere presence of centrioles in mESCs indicates that SAS-6, and again the cartwheel, is not essential for the existence of centrioles in these cells. Because this is a major finding that we would like to bring across from our study, we will better highlight and clarify it in the new version of the manuscript as described below. In fact, in points #4 and #5, we share the same possible explanation as the reviewer for the difference in the phenotypes between Sas-6 mutant mouse embryos and mESCs.

      1) I do not know anything about ESC and ESC cultures. So maybe this is a stupid suggestion. But can't they be derived exactly from the same genetic background of SAS-6KO embryos? Because the way the two (or even 3 as there are 2 mouse KO lines) are generated is different. Why is that?

      The reviewer is correctly suggesting that the mESCs can be derived from the Sas-6 mutant blastocysts. We initially derived mESCs from the Sas-6em4/em4 mutants and performed our analyses on the centriole phenotypes in these mutants before realizing that the allele was hypomorphic (SAS-6 staining in Fig. S2F, and the appearance of centrioles at E9 in Fig. S1B). Because the surprising finding in our study is that SAS-6 does not seem to be essential for centriole presence in mESCs, as pointed out by the reviewer, we decided to generate a more convincing Sas-6-/- null allele in mESCs by deleting the entire ORF of Sas-6 (more on this point below). We would also like to direct the attention of the reviewer to the fact that we have cultured the blastocysts (E3.5) from the Sas-6em5/em5 null mutants, which as we show lack centrioles at E3.5, and the cells indeed start to form centrioles just 24 h post-culture (Fig. 3C-D).

      To build on these findings, we have already taken this a step further and generated a mESC line from the Sas-6em5/em5 mutants. These Sas-6em5/em5 –derived mESCs show CENT2-eGFP-positive centrioles, and we are currently analyzing their number and integrity, similar to our analyses of the CRISPR-generated Sas-6-/- null mESCs.

      2) Still on mESCs, are the authors sure that there are no WT Sas-6 mRNAs still present in their ESC cells? Because tiny amounts are maybe sufficient to allow the initial cartwheel structure. In FigS2B, I can see a really faint band, very faint but it is there.

      Due to the nature of the surprising finding that Sas-6 mutant mESCs can still form centrioles, we understand the concerns and suggestions of this reviewer and the other reviewers in this regard. We have generated several Sas-6 mutant alleles in mESCs (in exons 2, 4 or 5), in which we used Western blots to check whether they were null alleles or not. We used different commercial (Proteintech cat# 21377-1-AP, Sigma-Aldrich cat# HPA028187 and Santa Cruz cat# SC-81431) and non-commercial (kind gift from Renata Basto, Institut Curie) antibodies. The SAS-6 antibody from the Basto lab gave the most reliable and reproducible results. Using this antibody, and in our own interpretation, we were not able to detect SAS-6 by Western blots in Sas-6 mutant mESCs. We concluded that SAS-6 in mESCs (and mouse embryos, see below) is expressed at low levels. Of note, we always detected centrioles in the different Sas-6 mutant mESCs, even those derived from the Sas-6em5/em5 null mutant blastocysts, which as blastocysts had no detectable centrioles.

      For a more definitive knockout in mESCs, we decided to bi-allelically delete the entire Sas-6 ORF DNA from the ATG to the TAA (over 34 Kb of DNA, Fig. S2A). According to the central dogma of molecular biology, when there is no DNA, then there should be no mRNA and no protein. In confirmation of this premise, recent RT-PCR data showed that Sas-6 mRNA is not detectable in these Sas-6-/- null mESCs. Also, immunofluorescence analyses did not detect SAS-6 in these cells. We will add the RT-PCR and immunofluorescence data to the fully revised manuscript. We will also repeat the SAS-6 Western blots to achieve better band resolution.

      These Sas-6-/- mESCs started from a single cell and have been passaged up to 20 times by now without losing centrioles. SAS-6 protein was not detectable at the early passages and the mRNA is still not detectable. This is how knockouts have been and are produced. If this mutant is still not convincing, then we respectfully ask that the reviewers provide their own suggestion on what will be more convincing. In our humble opinion, this Sas-6-/- mESCs line can be used to test the specificity of the antibodies in mouse cells and not the other way around.

      3) This last point goes also with the western-blot of Figure S2C- there is still a band, very tiny between the two very tick bands (marked with *). Maybe separating proteins better will help visualizing the real Sas-6 band? Have they used the Sas6 ab in other WBs from the KO embryos, for example? Can they use the Sas6 ab in immunostaining to show if the assembled abnormal centrioles completely lack Sas6. This will allow to distinguish between the hypothesis of having some (even if not much) sas6 left?

      The answer to these questions is above in point #2. In addition, we have used the Basto lab antibody for SAS-6 for Western blots on mouse embryos, which detects low levels of SAS-6 in controls and no signal in the mutants. We will repeat the SAS-6 Western blots on mESCs to achieve better band resolution. Using this antibody for immunofluorescence showed that the Sas-6em4/em4 mutant is hypomorphic, whereas the Sas-6em5/em5 mutant showed very low, most likely background, staining (Fig. S1F). For mESCs, we decided to delete the entire Sas-6 ORF DNA and generate homozygous Sas-6-/- null mutants. Immunofluorescence analyses did not detect SAS-6 in these cells.

      4) Then a more theoretical point? Have the authors considered that the difference is more in the stability of the abnormal structures. Let's say, without a cartwheel and maybe enough PLK4 activity and high level of other centriolar components, the centrioles are abnormally assembled- they have no cartwheel, but they are disassembled very fast in the embryo but not in ESCs?

      We agree and share the reviewer’s interpretation for the potential requirement of SAS-6 in vivo to stabilize intermediate structures, that is compensated for by other factors in mESCs. This was not directly discussed in the first version of the manuscript and we will include it in the new version.

      5) Even if there is a real difference and without Sas-6 ESCs can make centrioles that are abnormal in structure and function (at least at the cilia assembly level), I am not sure the choice of words "strictly required" is correct. Since Sas-6 is described by many studies as the factor required for cartwheel assembly, which occurs very early in the pathway, this means that in mESCs centrioles can be assembled without forming a cartwheel. And so that the cartwheel is actually not required for the initial building block, but more as a structure that maintains the whole centriole in an intact manner?

      We agree with the reviewer on the likely requirement of SAS-6, and therefore the cartwheel as a whole, for the symmetry and integrity of the forming centrioles, which is along the same line as in point #4. In our interpretation, “centriole formation” does not necessarily mean centriole “initiation” but rather the presence of the centriole as a structure. We will use more appropriate and specific wording to match our shared interpretation with the reviewer.

      6) The authors mentioned that in flies, abnormal Sas-6 structures have been described in certain cell types. Are these mutants null mutants? In other words, are these structures assembled in a context of no Sas6, abnormal Sas-6 protein, or even low levels of Sas-6?

      According to the published report (Figure S3B in Rodrigues-Martins et al, 2007, PMID: 17689959), the fly brains have no detectable DSAS-6 protein. Therefore, we assume that they are Sas-6 null fly mutants. The abnormal centrioles in Sas-6 C. reinhardtii mutants and Sas-6-/- mESCs null mutants support the conclusion that the main role of SAS-6, and perhaps the cartwheel, is in maintaining the integrity of the forming procentriole.

      Other points:

      I think the 1st sentence of the abstract appears disconnected from the rest. The same goes for the 1st sentence of the introduction. And also, what is the evidence that pluripotent stem cells rely primarily on the proper assembly of a mitotic spindle? They rely on many other things, not sure this is the first one.

      The sentences were meant to highlight the importance of cell division in stem cells. We will adjust the wording in these sentences per the reviewer’s comment to not focus on pluripotency per se.

      The authors mention that centrioles are lost in Sas6-/- after "differentiation" of mESCs. The term differentiation is not appropriate, and confusing here. Differentiation normally refer to cells that stopped proliferating and exited the cell cycle, which is not the case here, as NPCs are progenitor cells that keep cycling.

      We believe the reviewer is referring to “terminal differentiation”, when the cells exit the cell cycle and adopt their destined cell fates. The word “differentiation” in this context refers to limiting the potency of stem cells into a subset of cell fates such as NPCs, which are proliferating progenitors.

      Figure S1: Percent of cells with centrosomes was assessed by a co-staining of gtubulin and Cep164, which mark the mother centrioles. As Cep164 may be absent from centrosomes after lack of centriole maturation in sas6-/- embryos, another combination of staining should be performed to evaluate the percent of cells without centrosomes. gtubulin staining can be seen in Sas6 em5/em5 embryos, while the quantification claims total absence of centrosomes. The authors use the CENT2-eGFP transgenic line to count the number of centrioles in Figure 3, they should do the same in Figure S1.

      We will follow the reviewer’s recommendation of counting Cent2-eGFP for the assessment of centrioles in Sas-6em5/em5 (Fig. S1).

      The g-tubulin (TUBG) aggregates at the poles of dividing cells are assembled in the absence of centrioles, as shown in Sas-6em5/em5 embryo sections (Fig. S1H). In addition, we have previously observed these pericentriolar material aggregates in Sas-4-/- mutant embryos (Bazzi and Anderson, 2014), which do not contain centrioles in serial transmission electron microscopy. Therefore, we do not refer to them as centrosomes in the absence of centrioles at their core.

      Reviewer #1 (Significance (Required)):

      This study shows with a novel mouse model the requirement of centrioles during mouse development. It will be relevant to centrosome labs, the novel mouse lines will be useful to many labs working on centrioles, cilia and centrosomes.

      My expertise: centrosome biology

      We thank the expert reviewer for the critical comments and suggestions, and the positive evaluation of our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Here, Grzonka and Bazzi present their work on characterizing the requirement of SASS6 in mouse embryo development and in embryonic stem cell (mESC) culture. In mouse, female and male gametes lack centrioles, and early divisions occur without centrioles. De novo formation typically happens at the blastocyst stage (~E3.5). The authors generated two SASS6 knock-out strains, SASS6 em4/em4 (frameshift deletion, reported as a severe hypomorphic allele), and SASS6 em5/em5 (frameshift deletion, reported as a null allele). Mutant embryos arrest development at mid-gestation unless the 53BP1, USP28 and p53 pathway is perturbed. As expected, centrioles do not form in SASS6 -/- mice. However, the authors report that de novo formation of centrioles is facilitated in mESC culture conditions for SASS6 CRISPR knock-out mESCs and mESCs derived from SASS6 em5/em5 blastocysts. Centrioles are lost upon differentiation of SASS6 CRISPR knock-out mESCs into neural progenitor cells (NPCs).

      The presented study is relevant for scientists investigating the requirements for centriole formation during embryonic development. Further, it provides insights in possibly different requirements for centriole formation between stages of differentiation, as well as differences in in vivo and in vitro models.

      We thank the reviewer for finding our work relevant and insightful into the differential requirements for centriole formation depending on the cell type.

      The data presented by Grzonka and Bazzi are robust and support the manuscript and conclusions made. However, the study is predominantly descriptive, and the authors do not test the molecular pathway underlying the de novo formation of centrioles observed in SASS6 -/- mESCs. It is generally believed that de novo formation of centrioles is not possible in SASS6 knock-out cells, although work from Wang and Tsou with a SASS6 oligomerization mutant suggests otherwise. A dissection of the specific factors required for the de novo formation of centrioles in the mESC context would provide more insights into de novo centriole assembly in general and would increase the impact of this work. I would support publication of the manuscript if the following points are addressed:

      We again thank the reviewer for finding the data robust and supportive of our conclusions and interpretation. We agree with the reviewer that our study opens new questions about how mESCs manage to assemble centrioles in the absence of SAS-6. Together with the phenotypes of the Sas-6 mutant D. melanogaster and C. reinhardtii, and the SAS-6 oligomerization mutants (but not full SAS-6 mutants) in human cell lines mentioned by the reviewer and cited in our manuscript, the data open new investigations into the exact requirements of SAS-6 and the cartwheel in centriole biogenesis in the different cellular contexts.

        • One of the main figures, ideally Figure 1, should be dedicated to the characterization of the newly generated mouse strains. This should also be elaborated in the text further. I would like to see a schematic representation of the genomic modifications. The SASS6 stainings of wt and Sas-6 knock-outs (now Figure S1F) should be shown in that context as well as the Figures S2A-C. The authors should discuss why there still appears to be SASS6 protein in the SASS6-em5/em5 Sas-6 stainings visible. Also, the western blot, especially the unspecific bands so close to the SAS-6 protein, should be discussed. Adding qRT-PCR results would also be good.

      Per the reviewer’s requests, we will move the embryo mutant characterization (Fig. S1F) and mESCs (Fig. S2A-C) to the main figures and elaborate the text accordingly. The genomic modifications in mice are described in a detailed tabular format in Table 1 in Materials and Methods. The immunofluorescence staining in Fig. S1F was performed on mouse embryonic sections, which tend to have higher backgrounds than cultured cells; thus, we attribute the very low percentage of SAS-6 staining in Sas-6em5/em5 mutants to higher background, especially given the lack of centrioles in these mutants at all the stages examined.

      For Western blots, we used different antibodies against SAS-6 that were either commercially available (Proteintech cat# 21377-1-AP, Sigma-Aldrich cat# HPA028187 and Santa Cruz cat# SC-81431) or non-commercial (kind gift from Renata Basto, Institut Curie). The SAS-6 antibody from the Basto lab gave the most reliable and reproducible results. Using this antibody, and in our own interpretation, we were not able to detect SAS-6 by Western blots in Sas-6 mutant mESCs (including hypomorphic alleles). We concluded that SAS-6 in mESCs (and mouse embryos, see below) is expressed at low levels. Thus, we decided to use the antibody provided by Renata Basto and shown in current Fig. S2C, although it shows two thick non-specific bands flanking the specific band for SAS-6.

      For a more definitive knockout in mESCs, we decided to bi-allelically delete the entire Sas-6 ORF DNA from the ATG to the TAA (over 34 Kb of DNA, Fig. S2A). According to the central dogma of molecular biology, when there is no DNA, then there should be no mRNA and no protein. In confirmation of this premise, recent RT-PCR data showed that Sas-6 mRNA is not detectable in these Sas-6-/- null mESCs. Also, immunofluorescence analyses did not detect SAS-6 in these cells. We will add the RT-PCR and immunofluorescence data to the fully revised manuscript. We will also repeat the SAS-6 Western blots to achieve better band resolution.

      In addition, we have used the Basto lab antibody for SAS-6 for Western blots on mouse embryos, which detect low levels of SAS-6 in controls and no signal in the mutants.

      • The authors could elaborate on the topic of mESCs as a special in vitro model for centriole biology akin to the more "primitive" origins of life such as algae.

      We will elaborate on the topic of mESCs as a special system for centriole biology to stress the findings that mESCs without SAS-6 can still form centrioles, but also that these cells seem to tolerate centriolar aberrations, such as in Sas-6 mutants, or even the loss of centrioles, as in Sas-4 mutants, without undergoing apoptosis or cell cycle arrest.

      • Figure 4 should show a timeline of embryo development, include embryo stages (E3.5, E9 etc.), and group together mESCs with the corresponding embryonic developmental stage. The figure can indicate when mESCs were derived from SASS6 em5/em5 blastocysts, when they were stained and indicate the number/state of centriole formation observed.

      We will adjust the model in Fig. 4 to accommodate the suggestions of the reviewer, but at the same time try not to overcrowd the model and dilute the main findings of the study.

      • The work from Wang and Tsou using SAS-6 oligomerization mutants should be better discussed in the context of the work presented here since centriole assembly was not affected per se but structural defects were observed, as is the case in this study.

      We will elaborate on this finding from Wang et al. In this respect, we will note that the loss of the entire SAS-6 protein in human RPE-1 cells (on a p53-mutant background) leads to the loss of centrioles, but that the deletion of the oligomerization domain of SAS-6 in these cells leads to phenotypes similar to those of the total loss of SAS-6 in mESCs.

      • The observation that the ability of forming centrioles de novo in NPCs derived from ESCs is lost is interesting but the mechanisms underpinning this differentiation remain unclear. The authors at a minimum should speculate on these further.

      We agree with the reviewer and will speculate on this finding further. This comment is along the same line as the difference in phenotype between the cells in the developing mouse embryo and mESCs, where the NPCs are more akin to the in vivo phenotype.

      CROSS-CONSULTATION COMMENTS

      Looks like we are all pretty much in agreement.

      Reviewer #2 (Significance (Required)):

      This is a well executed study with no major flaws that builds on similar studies on knocking out centriole components in mouse and other cell types. Although well-executed the study remains descriptive and lacks a clear mechanistic understanding of why de novo centriole assembly is ineffective in NPCs. As it stands the advances this study provides to the centrosome biogenesis field remain incremental.

      We thank the reviewer for the compliments about our work and agree that it opens new questions in the field about the precise roles of SAS-6 and the cartwheel in centriole biogenesis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this publication, Grzonka and Bazzi build upon their recent work describing the role of SAS-like protein function in centriole formation during embryonic development. More specifically, they demonstrate that loss of Sas-6 in vivo and in vitro disrupts centriole formation. To this reviewer's surprise, they found that Sas-6 is required for centriole formation in embryos, yet, stem cells form centrioles with disrupted centriole length and ability to template cilia.

      We thank the reviewer for highlighting the novel and surprising aspect of our work, which is that Sas-6 mutant mESCs are still able to form centrioles. We would like to stress that SAS-4, from our previously published work, and SAS-6, in this study, are not part of the same protein family and have different structures and roles in centriole formation. The naming has its origin in “Spindle-ASsembly abnormal/defective” mutant screens performed in C. elegans. Although the phenotypes are similar in vivo, due to the lack of centrioles in both cases, only mutations in Sas-4, but not in Sas-6, lead to the lack of centrioles in mESCs.

      Likely, this occurs from the residual proteins that existed prior to CRISPR-mediated knockout.

      Due to the nature of the surprising finding that Sas-6 mutant mESCs can still form centrioles, we understand the concerns and suggestions of this reviewer and the other reviewers in this regard.

      For a more definitive knockout in mESCs, we decided to bi-allelically delete the entire Sas-6 ORF DNA from the ATG to the TAA (over 34 Kb of DNA, Fig. S2A). According to the central dogma of molecular biology, when there is no DNA, then there should be no mRNA and no protein. In confirmation of this premise, recent RT-PCR data showed that Sas-6 mRNA is not detectable in these Sas-6-/- null mESCs. Also, immunofluorescence analyses did not detect SAS-6 in these cells. We will add the RT-PCR and immunofluorescence data to the fully revised manuscript. We will also repeat the SAS-6 Western blots to achieve better band resolution.

      These Sas-6-/- mESCs started from a single cell and have been passaged up to 20 times by now without losing centrioles. SAS-6 protein was not detectable at the early passages and the mRNA is still not detectable. This is how knockouts have been and are produced. If this mutant is still not convincing, then we respectfully ask that the reviewers provide their own suggestion on what will be more convincing.

      Unsurprisingly, they found that Sas-6 loss in the developing mouse activates the 53BP1-USP28-p53 surveillance pathway leading to cell death and embryonic arrest at mid-gestation, similar to their findings in Cenpj knockouts. What remains to be properly elucidated is the mechanistic difference in the requirement for Sas-6 in stem cells versus the embryo, which may be beyond the scope of a short report. As it reads, the manuscript is a complement to their Sas-4 paper but falls short of novelty and of providing large strides in revealing the role of centriolar proteins in developmental processes. Moreover, advances beyond the requirement for centrioles and associated proteins in embryology are missing; therefore, enthusiasm is tempered. Below are remaining concerns that must be addressed:

      Remaining concerns:

      The authors should provide a clear description of the embryonic region (neural plate & mesenchyme) used to analyze centriole presence or loss in Figures 1 and S1. Was this in the forelimb vs hindlimb regions?

      The assessment of centrosomes in Fig. 1 and S1 was performed on cell types in all three germ layers in the sections that were taken from the brachial region (forelimb and heart level). The information will be added to the Materials and Methods section.

      Similar to their Cenpj-mouse data, the authors should provide data detailing the mitotic index and activation of the mitotic surveillance pathway beyond just p53 staining. As novelty is not the only criterion for publication, a thorough analysis of the Sas-6 activation of the mitotic surveillance pathway should be provided, including crosses between Sas-6 and p53, 53bp1 and usp28 knockouts to demonstrate that the pathway functions similarly to Cenpj loss.

      We will perform the additional experiments suggested by the reviewer that are similar to our previous work in Sas-4 mutants (Xiao*, Grzonka* et al, 2021). We will perform these analyses knowing that both Sas-4 and Sas-6 mutants lose centrioles and activate the mitotic surveillance pathway, as the reviewer indicated. In particular, we will quantify the mitotic index in the Sas-6em5/em5 mutants and perform p53 and Cl. CASP3 staining in the double mutants with 53bp1 or Usp28, to show that the pathway has been suppressed in these mutants.

      Centriole structure should be assessed in the embryos using EM to assess loss and confirm the structural defects. This would strengthen their argument and be a slight advance to their largely descriptive paper.

      Because the Sas-6em5/em5 embryos lack centrioles, as indicated by regular immunofluorescence and Ultrastructure-Expansion Microscopy (U-ExM), using EM would be an attempt to find a structure that does not exist. In our opinion, it would again be a repetition of TEM studies that we have already performed in Sas-4-/- mutant embryos, which lack centrioles (Bazzi and Anderson, 2014). Using U-ExM has advanced the centriole biology field to a level that is approaching EM resolution and, in our opinion, can substitute for EM.

      The WB for Sas-6 knockout is not convincing and should be redone. There are validated Sas-6 antibodies available from SCBT and Proteintech. It is not clear that the band is gone or if there's overlap with the non-specific band.

      The answer to this comment is shown above. In addition, we have used the Basto lab antibody for SAS-6 for Western blots on mouse embryos, which detect low levels of SAS-6 in controls and no signal in the mutants. We will also repeat the SAS-6 Western blots on mESCs to achieve better band resolution as recommended by the reviewer.

      The authors describe the centriolar structural defect in the mESCs in Figure 2C and D, and further characterize the phenotype in S2D-H. Given the role of the SAS6-CEP135-CPAP axis for centriole elongation, it is peculiar that they see elongation upon reduction of CEP135. The authors should find a rational mechanism to explain their discordant findings. In addition, other centriole distal end components including CEP97 and CP110 should be examined to determine the structural end-capping defect in the Sas-6 mESCs.

      Over 70% of the centrioles in Sas-6-/- mESCs retain CEP135, but the majority of CEP135 signals (over 80%) seem to be abnormally localized. One potential explanation for the elongated centrioles in Sas-6-/- mESCs is that the mis-localization of CEP135 impacts the integrity of the centriole and results in parts of the centriolar walls being elongated. Per the reviewer’s suggestion, we have performed U-ExM with stainings for CP110 or CEP97, which also regulate centriole capping and elongation. The preliminary data suggest that, similar to WT mESCs, they localize to the ends of the abnormal centrioles in Sas-6-/- mESCs. We will quantify the percentage of normally localized CP110 and CEP97 in Sas-6-/- mESCs and include it along with the data interpretation in the next version of the manuscript.

      In Figure 2I, J the authors state the ciliation rate for the WT mESCs was only 11%, could the authors provide an explanation for the low ciliation rate in WT mESCs? Could cells be arrested to increase the ciliation rate? In addition, is there a rational explanation for the loss of centrioles and centrosomes upon differentiation into NPCs?

      mESCs ciliation rate has been shown to be generally low (Bangs et al, 2015; Xiao et al., 2021) perhaps because the cells spend most of the cell cycle in the S-phase. mESCs require a high serum percentage and well-defined media for growth and maintenance. In our hands, attempting to arrest the cells by withdrawing serum, or reducing its percentage, resulted in cell death and a change in morphology to the differentiated phenotype (unpublished data). Our data indicate that a pluripotent state in Sas-6-/- mESCs is compatible with centriole formation but differentiation results in the loss of centrioles (for example, NPCs). Therefore, we have refrained from interfering with the cell cycle of mESCs in order to avoid these confounding effects on cellular viability and centriole formation.

      Regarding the loss of centrioles upon differentiation of Sas-6-/- mESCs into NPCs, we agree with the reviewer and will speculate on this finding further. This goes along the same line as the difference in phenotype between the cells in the developing mouse embryo and mESCs, where the NPCs are more akin to the in vivo phenotype of Sas-6 mutants. The data suggest that the formation of centrioles in Sas-6-/- mESCs is associated with the in vitro pluripotent phenotype. A more comprehensive and general characterization of centriole duplication in mESCs is a future direction to elucidate their ability to form centrioles without SAS-6.

      In Figure 3F, the Sas-6−/− NPC panel has a box around a cell without centrosomes, yet in 3G there are some cells with centrosomes. While the authors are trying to demonstrate the decrease in centrosomes in the Sas-6−/− NPCs, they should also show the few cells that have centrosomes or centrosome-like structures.

      We will add another example for the minority of cells that retain centrosomes upon differentiation of Sas-6-/- mESCs into NPCs.

      CROSS-CONSULTATION COMMENTS

      As mentioned in my review; while the Sas6 model is new, it does not provide further evidence of why centriole duplication is important in developing mice aside from it causing an abortive mitosis leading to cell death. The discordant phenotype in the mESCs likely arises from residual Sas6, similar to experiments that were performed in flies with Sas-4 depletion. Moreover, the odd centriole phenotype represents a very small number of cells and is likely phenomenological.

      In addition, their work from last year demonstrated a clear connection between Cenpj loss leading to the mitotic surveillance pathway activation. They performed double knockouts that partially rescued the survival phenotype. This new work falls short of that publication.

      Reviewer #3 (Significance (Required)):

      The new publication adds a known component to the list of animal models for centrosome-opathies but fails to provide novel mechanistic insights. Dr. Bazzi's publication on Sas-4 was far more novel at the time of publication due to the multiple mouse crosses that could rescue the phenotypes. The recent publication fails to provide as much evidence or any novel insights into the role of Sas-6 (sufficient to be convincing).

      The audience will be limited to centrosome biologists and even then it may not have enough novelty to be compelling. I would recommend with the revisions to be published in a more specialized journal.

      My expertise lies in genetic causes of microcephaly associated with mutations in centrosome encoding proteins.

      We thank the reviewer for taking the time to evaluate our work and provide helpful comments and suggestions. We would like to emphasize that even if a certain phenotype is expected, the experiment has to be performed to test the hypothesis, which is the case with the Sas-6 mutant embryos phenocopying the Sas-4 mutants. In our opinion, the novelty of our work goes beyond Fig. 1 to the ability of Sas-6-/- null mESCs to form centrioles. This surprising finding opens new avenues of investigation into the precise roles of SAS-6, and the cartwheel, in centriole biogenesis. We are confident that our study will provide a trigger to re-examine these roles in other cell types and organisms.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      SUMMARY

      The manuscript by Smoak et al., provides an analysis of the Hyr/Iff-like (Hil) genes in Candida species with a strong focus on C. auris. The authors demonstrate a repeated expansion of these genes in unique lineages of fungal species, many of which are associated with stronger clinical disease. There is evidence of selection operating on the gene family in the primary domain used for identification. These genes include a repeat just downstream of that core domain that changes frequently in copy number and composition. The location of these genes tends to cluster at chromosome ends, which may explain some aspects of their expansion. The study is entirely in silico in nature and does not include experimental data.

      MAJOR POINTS

      Altogether, many of the general findings could be convincing but there are some aspects of the analysis that need further explanation to ensure they were performed correctly. To start, a single Hil protein from C. auris was used as bait in the query to find all Hil proteins in yeast pathogens. Would you get the same outcome if you started with a different Hil protein? What is the basis for using Hil1 as the starting point? It also doesn't make sense to me to remove species just because there are already related species in the list. This may exclude certain evolutionary trends. Furthermore, it would be helpful to know how using domain presence and the conservation of position changes the abundance of the gene family across species? (beginning of results).

      We appreciate the reviewer’s criticisms on our strategy for identifying Hil proteins. In response, we have significantly revised our pipeline. In particular, we now combine the search results from three queries: in addition to C. auris Hil1’s Hyphal_reg_CWP domain (XP_028889033), we added the Hyphal_reg_CWP sequences from C. albicans Hyr1 and C. glabrata Hyr1. They were chosen as representatives in the two phylogenetic groups distinct from the one containing C. auris in order to avoid the bias due to the query’s phylogenetic position. Using the same criteria as we did for the original search, we identified three additional hits compared with the original 104 homologs list. In response to the criticism of the arbitrary exclusion of some species, we now include any species from the BLASTP search results as long as it is part of the 332 yeast species studied by Shen et al. 2018 (PMID: 30415838). The reason for this criterion is so that we can use the high-quality species phylogeny generated by Shen et al. 2018 to properly study the gene family evolution by reconciling the gene tree with the species tree. We additionally include the species in the MDR clade closely related to C. auris and used Muñoz et al. 2018 (PMID: 30559369) as the basis for the species phylogeny in the clade. Lastly, we no longer require the particular domain organization in classifying Hil family members. All BLASTP hits satisfying the E-value cutoff of 1×10⁻⁵ and query coverage > 50% are included.
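
      The hit-selection rule described above (E-value cutoff plus query coverage > 50%, pooled over several queries) can be illustrated with a minimal sketch. It assumes the BLASTP results have been exported as tab-separated lines with four columns (query, subject, E-value, query coverage in percent); that column layout, the strict inequality on the E-value, and all identifiers in the usage example are illustrative assumptions, not the authors' actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str
    subject: str
    evalue: float
    query_coverage: float  # percent of the query covered by the hit


def parse_hits(lines):
    """Parse tab-separated lines: query, subject, E-value, query coverage (%).

    The four-column layout is an assumption made for this sketch.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, evalue, qcov = line.split("\t")[:4]
        hits.append(Hit(query, subject, float(evalue), float(qcov)))
    return hits


def filter_hits(hits, max_evalue=1e-5, min_coverage=50.0):
    """Keep hits passing both thresholds, pooled across all queries.

    Returns a dict keyed by subject ID so that a homolog found by several
    queries is counted once; the lowest E-value hit is retained.
    """
    kept = {}
    for h in hits:
        if h.evalue < max_evalue and h.query_coverage > min_coverage:
            if h.subject not in kept or h.evalue < kept[h.subject].evalue:
                kept[h.subject] = h
    return kept
```

      Keying the result by subject ID is one simple way to merge the hit lists from the three query domains without double-counting a homolog.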

      A major challenge in the analysis like this one is in dealing with repetitive sequences present in amplified gene families. For example, testing modes of selection on non-conserved sites is fraught. It's not clear if all sites used for these tests are positionally conserved and this should be clarified. Alignments at repeat edges will need to maintain this conservation and relatively good alignments as stated in lines 241-242 are concerning that this includes sequence that does not retain this structure necessary for making predictions of selection.

      We appreciate the reviewer’s comment. In the original manuscript, we performed two different types of analyses, one on the conserved and well-aligned Hyphal_reg_CWP domain and another on the rapidly evolving repeat region. For the former, we performed phylogenetic dN/dS analyses using maximum-likelihood, for which a reliable alignment is crucial and is the case here. The Hyphal_reg_CWP domain alignment for C. auris Hil1-Hil8 is shown below and also included as Fig. S7 in the revised manuscript: (figure in the response file)

      In the text, we added this sentence to emphasize this point: “We chose to focus on the Hyphal_reg_CWP domain because of its potential importance in mediating adhesion and also because the high-quality alignment in this domain allowed us to confidently infer the evolutionary rates (Fig. S7).”

      For the repeat domain, what we did in the original version was to calculate the pairwise dN/dS between individual repeat units found in Hil1 and Hil2. This didn’t require aligning the entire repeat regions in the two proteins, but instead relied on the alignment of the individual ~44 aa repeat units, which were highly conserved (see below). In the revised manuscript, however, we decided to focus our analyses on the Hyphal_reg_CWP domain because of a different concern, namely gene conversions between paralogs could distort the evolutionary history of the repeats (the same concern was addressed for the effector domain using an additional step of detecting recombination breakpoints, but the same analysis would be challenging for the repeat region due to alignment issues).

      (figure in the response file)

      It's also unclear to me why Figure S12 is here. The parameters for this analysis should be tested ahead of building models so only one set of parameters should be necessary to run the test. The evolutionary tests within single genes and across strains is really nice!

      We appreciate the reviewer’s suggestion and have accordingly removed Fig. S12; the model setup is now described in the Materials and Methods section. We were not sure if the last point was a comment or a suggestion. We didn’t perform a population-level selective sweep scan in C. auris. Such an analysis has in fact been attempted by Muñoz et al. 2021, who identified several members of the Hil family as the top candidates for positive selection (PMID: 33769478). We cited this in our Discussion:

      “Lastly, scans for selective sweep in C. auris identified Hil and Als family members as being among the top 5% of all genes, suggesting that adhesins are targets of natural selection in the recent evolutionary history of this newly emerged pathogen (Muñoz et al. 2021).”

      A major challenge for expanded gene families is rooting based on the inability to identify a strong similarity match for the full length sequence. The full alignment mentioned would certainly include significant gaps. If those gaps are removed and conserved sites only are used, does it produce the same tree? Inclusion of unalignable sequences would be expected to significantly alter the outcomes of those analysis and may produce some spurious relationships in reconciling with the species trees. Whether or not there are similar issues in the alignment of PF11765 need to be addressed as well. There's nothing in the methods that clarifies site selection.

      We appreciate the reviewer’s comment and agree with the concern about alignment quality affecting phylogenetic reconstruction. To clarify, all phylogenetic analyses in this work are based on the alignment of the Hyphal_reg_CWP domain, which is well aligned (shown above for the subset of eight homologs in C. auris). The alignment of all 215 homologs is provided for readers to review (shorturl.at/kDEJ3). To explain this choice, we now include the following in Results:

      “To further characterize the evolutionary history of the Hil family, including among closely related Candida lineages, we reconstructed a species tree-aware maximum likelihood phylogeny for the Hil family based on the Hyphal_reg_CWP domain alignment (Fig. 1C, Fig. S2).”

      We also included detailed steps for reconstructing the gene tree in Materials and Methods.

      To test the effect of gaps in the alignment on phylogenetic tree inference, we used two trimming programs, ClipKit (PMID: 33264284) and BMGE (PMID: 20626897), with author-recommended modes. They resulted in consistent gene tree results. We present the tree based on the ClipKit trimmed alignment in the main results. The root of the gene tree was inferred by jointly maximizing the likelihood scores for the gene tree based on the alignment and the evolution of the gene family within the species tree, using GeneRax (Morel et al. 2020, PMID: 32502238).
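
      To make the idea of gap trimming concrete, the following sketch removes alignment columns whose gap fraction exceeds a fixed threshold. This is a deliberately simplified stand-in: it is not ClipKit's smart-gap mode (which chooses its threshold from the data) nor BMGE's entropy-based method, and the function name and the 0.9 default are assumptions made for illustration.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.9):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence; all aligned
    sequences must have the same length and use '-' for gaps.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Indices of columns that are retained after trimming.
    keep = [
        i
        for i in range(length)
        if sum(s[i] == "-" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
```

      Building a tree from both the trimmed and untrimmed alignments, and comparing the topologies, is the kind of robustness check described in the response above.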

      Figure 1A: the placement of evolved pathogenesis is a little arbitrary. It's just as feasible that a single event increased pathogenesis in the LCA of C. albicans and C. parapsilosis that was subsequently lost in L. elongisporus. These should be justified or I'd suggest removing. The assignment of Candida species here also seems incomplete. The Butler paper notes both D. hansenii and C. lusitaniae as Candida species whereas they are excluded here.

      We removed Figure 1 entirely based on this and another reviewer’s comment. We note that there is broad consensus that opportunistic yeast pathogens have independently arisen multiple times, such as C. auris, C. albicans and C. glabrata. Whether more closely related Candida pathogens evolved separately or not is a subject of ongoing research (PMID: 24034898).

      It is tricky to include scaffolds in analysis of chromosomal location of the HIL genes. The break in the scaffold may be due to the assc repeats of these proteins alone or other, nearby repeats. Any statistics would be best done to include only known chromosomes or those that are strongly inferred by Munoz, 2021. This will change the display of Figure 7, but is unlikely to change the take home message.

      We agree with the reviewer’s concern. In the revised manuscript and with more species included, we now only analyze genomes assembled to a chromosomal level, with the exception of C. auris B8441, which is supported by Muñoz et al. 2021 as having chromosome-length sequences. The revised Figure 7 now only includes these results. We also removed the accompanying supplementary figure that showed results based on scaffold-level assemblies.

      MINOR POINTS

      Line 18: "spp." Should be "spps."

      Addressed throughout the revised manuscript.

      Line 41: I might rephrase this as "how pathogenesis arose in yeast..."

      Accepted (line 43 in revised manuscript).

      I might use a yeast-centric example around line 40 for duplication and divergence. This could include genes for metabolism of different carbon sources in S. cerevisiae.

      Accepted (lines 47-48)

      The Butler paper referenced on line 51 compared seven Candida species and 9 Saccharomyces species

      Changed (line 48)

      The authors state no other evolutionary analysis of adhesins has been performed but do not acknowledge this study: https://academic.oup.com/mbe/article/28/11/3127/1047032

      We appreciate the reviewer pointing out this important reference to us. We now cite it in the introduction (line 64) and discussion (line 340).

      The first paragraph of the Results could be condensed

      Addressed.

      How was the species tree in Figure 1A obtained?

      The previous figure 1 is now removed. The species tree used throughout the manuscript is based on Shen et al. 2018 with MDR clade species added, based on Muñoz et al. 2018.

      Figure 2: In panel A, "DH" and "SS" are not defined. I'd be careful with use of "non-albicans Candida" in Figure 2B. This usually includes C. tropicalis and C. dubliniensis and may confuse the reader.

      We removed the DH and SS labels. Instead, we highlighted three clades, which were defined in previous studies. These are the Candida/Lodderomyces clade (based on NCBI taxonomy database), the MDR clade (e.g., Muñoz et al. 2018, PMID: 30559369) and the glabrata clade (e.g., Gabaldón et al. 2013, PMID: 24034898).

      How was the binding domain defined to extract those sequences and produce a phylogeny? In building a ML model, how were parameters chosen?

      We now provide the following details in the Materials and Methods section:

      “To infer the evolutionary history of the Hil family, we reconstructed a maximum-likelihood tree based on the alignment of the conserved Hyphal_reg_CWP domain. First, we used hmmscan (HmmerWeb version 2.41.2) to identify the location of the Hyphal_reg_CWP domain in each Hil homolog. We used the “envelope boundaries” to define the domain in each sequence, and then aligned their amino acid sequences using Clustal Omega with the parameter {--iter=5}. We then trimmed the alignment using ClipKit with its default smart-gap trimming mode (Steenwyk et al. 2020). RAxML-NG v1.1.0 was compiled and run on the University of Iowa ARGON server with the following parameters on the alignment: raxml-ng-mpi --all --msa $align --model LG+G --seed 123 --bs-trees autoMRE.”

      The parameters for the ML tree reconstruction are listed on the last line above. The main parameter was the evolutionary model (LG+G), which accounts for rate variation among sites using a gamma distribution. Other protein evolution models, e.g., VT+I+G, were tested and resulted in nearly identical tree topologies.
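
      The domain-extraction step quoted above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `extract_domain`, the toy sequence, and the coordinates are hypothetical, and it assumes envelope boundaries are 1-based and inclusive, as hmmscan reports them.

```python
from typing import Dict, Tuple


def extract_domain(
    seqs: Dict[str, str],
    envelopes: Dict[str, Tuple[int, int]],
) -> Dict[str, str]:
    """Slice each protein to its domain envelope (1-based, inclusive).

    Proteins without an envelope entry are skipped, mirroring the idea
    that only homologs with a detected domain enter the alignment.
    """
    domains = {}
    for name, seq in seqs.items():
        if name not in envelopes:
            continue
        start, end = envelopes[name]
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"envelope {start}-{end} is outside {name}")
        # Convert 1-based inclusive coordinates to a Python slice.
        domains[name] = seq[start - 1:end]
    return domains


if __name__ == "__main__":
    # Toy input, for illustration only (not a real Hil sequence).
    toy_seqs = {"protein_A": "MKTAYIAK", "protein_B": "MSSLLQ"}
    toy_env = {"protein_A": (2, 4)}
    for name, dom in extract_domain(toy_seqs, toy_env).items():
        print(f">{name}\n{dom}")
```

      The resulting domain sequences would then be passed to the aligner, the trimmer, and the tree-building step named in the Methods text.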

      Figure 3C/D could be just one panel.

      The structure predictions are now reorganized and presented on their own in the new Figure 3.

      Can you relate more the fungal hit to the Hil proteins conveyed in lines 152-154?

      We appreciate the reviewer’s comment, which referred to CgAwp1 and CgAwp3, whose effector domain structures were reported in a recent study (Reithofer et al. 2021, PMID: 34962966). We now discuss them in relation to the predicted Hyphal_reg_CWP structure, by showing them in Figure 3 and describing them in the Results (line 181): “crystal structures for the effector domains of two Adhesin-like Wall Proteins (Awp1 and Awp3b) in C. glabrata, which are distantly related to those in the Hil family were recently reported, and the predicted structure of one of C. glabrata’s Hil family members (Awp2) was found to be highly similar to the two solved structures (Reithofer et al. 2021)”.

      Line 168: Should read "Hence, ..."

      The original sentence was removed, but this grammatical error was checked for and corrected.

      Label proteins along the top of Figure 4 too.

      Accepted (in new Figure 4).

      Figure 5: for tests of selection, were sites conserved across the group? What does the black number at each node indicate? Dn and Ds are given as decimals. This is based on what attribute? For panel B, it is unclear what each tip denotes i.e., Hil1_tr6. Hil1 is the gene but what is "tr6"?

      In the revised manuscript, we provide the multiple sequence alignment for the Hyphal_reg_CWP domain used for the selection analysis as Fig. S7 to illustrate the level of conservation. The black numbers at the internal nodes are numeric indices used to refer to those nodes. In the revised manuscript, we use some of them to refer to the internal branches, e.g., 12…14 in the legend. In the new Figure 5, we do not list the numeric values of Dn and Ds (aka Ka, Ks). Instead, we use a color gradient to represent the estimated dN/dS ratios. The raw estimates are available in the project github repository. Panel B in the original Figure 5 and other panels related to the evolution of the repeats are now removed.

      It's unclear why comparison of the PF11765 domain includes the MDR proteins when those aren't included in the comparison to the repeats alone. Could that skew the comparison due to unequal sample numbers or changed variation frequencies in MDR relative to the other two groups?

      These results pertaining to the evolution of the repeats are now removed.

      Table 2 doesn't add much. This section could probably be reduced to a few sentences since it's highly speculative (intraspecies variation).

      Table 2 is now Table S5. We also simplified the Results section in the revised version. While the functional implications of the intraspecific variable number of tandem repeats (VNTR) are speculative, they are founded on two bases: 1) the identification of the VNTR is credible, as the copy number variation is consistent within clades but differs between clades, which is not expected if it were caused by assembly errors; 2) experimental studies of the Flo family in S. cerevisiae strongly supported a direct impact of adhesin length on the adhesive phenotype of the cells (PMID: 16086015).

      Table 3 is not needed.

      Table 3 is now removed.

      Figure 6 - color coding in 6A needs to be explained. I'm assuming this is a taxonomical coding.

      In the revised Figure 6A, the coloring scheme is consistent with what we used in Figure 1 based on the three clades, and a legend is provided.

      Figure 1B is unnecessary. A Model of the protein indicating domains is sufficient here. Figure 1C needs labels for all termini, not just the pathogenic red branches. The figure doesn't provide clear association between adhesin families and the associated species. This could be omitted, especially since Flo is often associated with Saccharomyces species. Figure 1D is unnecessary.

      We have removed the original Figure 1.

      SIGNIFICANCE

      The work here is sorely needed in expanded gene families and in fungi specifically. No analysis at this level has been performed, to the best of my knowledge, in any fungal associated gene family and certainly not in relationship to pathogenic potential. The authors do a good job in citing the foundational literature upon which their study builds in most cases (one exception is noted above). It would be of general interest to those interested in the evolution of virulence, but the analysis is tricky. This is the biggest drawback I currently have as some of the information to assess the results is missing.

      We really appreciate the reviewer's positive comments. We agree and plan to explore the relationship between the adhesin family evolution and virulence phenotypes.

      Expertise: gene families, evolution dynamics, human fungal pathogens

      Reviewer #2

      SUMMARY

      Gene duplication and divergence of adhesin proteins are hypothesized to be linked with the emergence of pathogenic yeasts during evolution; however, evidence supporting this hypothesis is limited. Smoak et al. study the evolutionary history of Hil genes and show that expansion of this gene family is restricted to C. auris and other pathogenic yeasts. They identified eight paralogous Hil proteins in C. auris. All these proteins share characteristic domains of adhesin, and the structural prediction supports that their tertiary structures are adhesin-like. Evolutionary analysis of protein domains finds weak evidence of positive selection in the ligand-binding domain, and the central domain showed rapid changes in repeat copy number. However, performed tests cannot unambiguously distinguish between positive selection and relaxed selection of paralogs after gene duplication. Some alternative tests are suggested that may be able to provide more unambiguous evidence. Together with these additional tests, the detailed phylogenetic analyses of Hil genes in C. auris might be able to better support the hypothesis that the expansion and diversification of adhesin proteins could contribute to the evolution of pathogenicity in yeasts.

      We appreciate the reviewer’s comments and will address specific points below.

      MAJOR COMMENTS

      The authors present extensive analyses on the evolution of Hil genes in C. auris. There is significant merit in these analyses. However, the analyses conducted so far are incomplete, lacking proper consideration of other confounding factors. Detailed explanations of our major comments are listed below.

      1. First, the authors restricted genes in the Hil family to those only containing the Hyphal_reg_CWP domain. Yet, previous work included genes containing the ligand-binding domain or the repeat domain as Hil genes. More justification is needed whether the author's choice represents the natural evolutionary history of Hil genes appropriately. For instance, are the genes only containing the ligand-binding domain monophyletic or polyphyletic? We recommend including the phylogeny of all the Hil candidate genes, to discern whether evolutionary histories of the repeat domain and ligand-binding domain are congruent. Authors can use this phylogeny as justification to focus only on the ligand-binding domain containing genes.

      Butler et al. 2009 (PMID: 19465905) defined the Als family and the Hyr/Iff family as having either the N-terminal effector domain or the intragenic tandem repeats (ITR). Their rationale for the latter was that the ITR sequences were often conserved across species. Upon close inspection (Fig. S19,20 in that paper), however, we found that the ITRs tend to be conserved in closely related species, but diverged among more distantly related species. Moreover, proteins in those figures that contain only the ITR and not the ligand-binding domain are all missing the signal peptide, the GPI-anchor, or both. This raises questions as to whether these proteins sharing the ITR sequence alone act as adhesins.

      More generally, defining the evolutionary history of proteins with multiple domains is complicated by recombination, which causes different parts (e.g., domains) of the protein to have distinct evolutionary histories. In fact, our study and others show that there exist “chimeras” that combine the effector domain from one adhesin family and the repeat sequence found in another (Zhao et al. 2011, PMID: 21208290, Oh et al. 2019, PMID: 31105652). In these cases, one phylogenetic tree is insufficient to describe the evolutionary history of the whole protein. We chose to define the Hil family by the Hyphal_reg_CWP domain and thus focus on the evolutionary history of this region because 1) while tandem repeat regions also contribute to adhesion in yeasts (Rauceo et al. 2006, PMID: 16936142), the effector domain likely plays a more important role in ligand binding and specificity. Therefore, we believe using the effector domain to define a protein family is more likely to group proteins with similar functional properties than if the repeat sequences were used. Also, while putative fungal adhesins lacking a recognizable ligand-binding domain exist, they are rare (Lipke 2018, PMID: 29772751); 2) The repeat region evolved much more rapidly than the effector domain, as we illustrate in Figures 2, 4 and 6 in our revised manuscript. While some repeat units are highly conserved, e.g., the ~44 aa unit found in Hil1-4 in C. auris and close relatives in the MDR clade, many others are short and degenerate, making it difficult to reliably identify homologs sharing the repeat. Besides, since each protein could contain many distinct repeats, it is not clear how one defines two sequences as belonging to the same family if they share one out of six types of repeats. We acknowledge that this definition leaves out the evolutionary history defined by the tandem repeats, which may reveal intriguing evolutionary dynamics, with functional implications. 
      A recent review of the Als family discussed similar definitional challenges and partly supported our choice (Hoyer and Cota, 2016, PMID: 27014205).

      In the analysis of positive selection, the authors do not adequately control for the effect of recombination on the evolutionary histories of protein sequences, especially given that Hil genes are rich in repetitive sequences. To account for recombination, GARD, an algorithm detecting recombination, should be performed to detect any recombination breakpoints within a protein domain. If recombination did occur within a protein domain, the authors should treat the unrecombined part as a single unit and use the phylogenetic information of this part to proceed with PAML analysis, instead of using the phylogeny of the entire protein domain. The authors should consider doing GARD analysis for the ligand-binding and repeat domains. For the repeat domain, low BS values in Fig. 5C indicate recombination between repeat units. Thus, the authors should analyze each repeat unit with GARD and re-analyze dN/dS.

      We deeply appreciate the reviewers’ criticism here. In the revised manuscript, we removed the analysis of the repeat units and followed the reviewers’ suggestion to carry out GARD analysis on the effector domain, which we now show reveals evidence of intra-domain recombination. Using the inferred breakpoints (Fig. S8), we identified two putatively non-recombining partitions and performed all downstream phylogenetic analyses on them separately. The results are presented in Fig. 5 and Table S6. Compared with the previous result based on the entire Hyphal_reg_CWP domain alignment, the new results reveal clearer patterns, including significantly elevated dN/dS on a subset of the branches. Newly added branch-site test results support a role of positive selection on the effector domain during the expansion of the Hil family in C. auris, suggesting functional diversification following gene duplications.
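
      To make the partitioning step concrete, the following Python sketch splits an alignment into column blocks at given breakpoints so that each block can be analyzed on its own. It is only an illustration under stated assumptions: `split_alignment` is a hypothetical helper, the toy alignment and column ranges are invented, and ranges are taken to be 1-based and inclusive.

```python
from typing import Dict, List, Tuple


def split_alignment(
    aln: Dict[str, str],
    partitions: List[Tuple[int, int]],
) -> List[Dict[str, str]]:
    """Return one sub-alignment per (start, end) column range.

    Ranges are 1-based and inclusive. All aligned sequences must have
    the same length, otherwise columns would not correspond.
    """
    lengths = {len(s) for s in aln.values()}
    if len(lengths) > 1:
        raise ValueError("aligned sequences differ in length")
    n_cols = lengths.pop() if lengths else 0
    blocks = []
    for start, end in partitions:
        if not (1 <= start <= end <= n_cols):
            raise ValueError(f"partition {start}-{end} is out of range")
        blocks.append({name: s[start - 1:end] for name, s in aln.items()})
    return blocks


if __name__ == "__main__":
    # Toy alignment and hypothetical column ranges, for illustration only.
    toy = {"seq1": "ABCDEFGH", "seq2": "AB-DEFGH"}
    for i, block in enumerate(split_alignment(toy, [(1, 3), (6, 8)]), 1):
        print(f"partition {i}: {block}")
```

      Each returned block would then get its own tree and its own selection analysis, which is the point of treating non-recombining segments separately.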

      The authors concluded positive selection in the ligand-binding domain based on the branch-wise model of PAML. Yet, w values were not higher than one, and it's unclear whether the difference in selective pressures the authors claimed here is biologically significant. Overall, what the authors present so far seems to be weak evidence of positive selection but is much more consistent with variation in the degree of purifying selection or evolutionary constraint. Using the site-wise model (m7 vs. m8) in PAML would allow the authors to detect which residues of the ligand-binding domain underwent recurrent positive selection. Combining the evolutionary information of protein residues and the predicted 3D structure will provide molecular insights into the biological impact of rapidly evolving residues. This would be a significant addition and raise the significance of the study, besides providing potentially stronger evidence of positive selection.

      We appreciate the reviewers’ criticism and suggestions. In the revised manuscript, we performed site tests comparing models M2a vs M1a, M8 vs M7 and M8a vs M8. For partition 1 (P1-414), all three tests were insignificant. For partition 2 (P697-987), the M2a vs M1a test was insignificant (P > 0.05), but the M8 vs M7 and M8a vs M8 tests were both significant at the 0.01 level, and the omega for the positively selected category was estimated to be ~15. The site tests require all branches to evolve under the same selection regime. To relax this constraint, we performed additional branch-site tests by designating the branches with an estimated dN/dS > 10 as the foreground (based on the free-ratio model estimates). This test was significant for both branches at the 0.01 level, and the Bayes Empirical Bayes (BEB) procedure identified a total of 5 residues as having been under positive selection. Although three of the five residues, located in the C-terminus of the Hyphal_reg_CWP domain, are part of the α-crystallin domain, we refrain from drawing any functional conclusions because 1) the BEB procedure is known to lack power in identifying positively selected residues and 2) we still lack a structure-function relationship for the α-crystallin domain. Nevertheless, we agree that this line of analysis is promising for yielding functional insight into the evolution of the effector domain in the family.
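
      For readers less familiar with the site tests, the likelihood ratio test behind each comparison can be sketched in a few lines of Python. This is an illustration rather than the analysis code: the log-likelihood values are hypothetical, and the degrees of freedom follow common PAML practice as we understand it (2 for M1a vs M2a and for M7 vs M8; 1 for M8a vs M8, where the null distribution is often taken as a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom).

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


def lrt_pvalue(lnl_null: float, lnl_alt: float, df: int,
               mixture: bool = False) -> float:
    """P-value for 2 * (lnL_alt - lnL_null) against a chi-square null.

    With mixture=True the null is a 50:50 mixture of a point mass at 0
    and chi-square(df), which halves the tail probability for stat > 0.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if mixture:
        return 1.0 if stat == 0.0 else 0.5 * chi2_sf(stat, df)
    return chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    print("M7 vs M8  :", lrt_pvalue(-100.0, -95.0, df=2))
    print("M8a vs M8 :", lrt_pvalue(-100.0, -98.0, df=1, mixture=True))
```

      A comparison is called significant at the 0.01 level when the returned p-value falls below 0.01.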

      MINOR COMMENTS

      1. In Fig 1c, the figure legend should include more specific details: which adhesin proteins are shown here? Please specify species names on the species tree

      Figure 1 is removed in the revised manuscript

      In Fig 3E, secondary structures are labeled with the wrong colors. Sheet: purple, helix: yellow

      In the revised manuscript, the structure of SRRP-BR (original 3E) is now shown in a single color.

      What's the ligand-binding activity of the b-solenoid fold? How structurally similar are C. auris PF 11765 domains compared to C. glabrata Awp domains? This information will support the role of adhesin for the ligand-binding domain of Hil genes.

      We discuss the ligand-binding activity of the β-solenoid as follows in Discussion:

      “The elongated shape and rigid structure of the β-helix are consistent with the functional requirements of adhesins, including the need to protrude from the cell surface and the capacity for multiple binding sites along its length that facilitate adhesion. In some bacterial adhesins, such as the serine rich repeat protein (SRRP) from the Gram-positive bacterium, L. reuterii, a protruding, flexible loop in the β-helix was proposed to serve as a binding pocket for its ligand (Sequeira et al. 2018). Such a feature is not apparent in the predicted structure of the Hyphal_reg_CWP domain. Further studies are needed to elucidate the potential substrate for this domain and its mechanism of adhesion.”

      We also compare the structures of the C. auris Hil1/Hil7 Hyphal_reg_CWP domain and the CgAwp1/3 in Figure 3, with this in the legend “(C) Crystal structure of the C. glabrata Awp1 effector domain, which is highly similar to C. auris Hil1 and Hil7, but with the disulfide bond in a different location.”

      We added a section in the Discussion to comment on the structure-function relationship based on known β-helix (aka β-solenoid) structures. The main insight comes from similar structures identified through DALI searches, many of which are bacterial and viral surface proteins mediating adhesion. The ligand binding pocket and specificity would require additional structural studies to elucidate.

      In lines 248-249, the authors should also consider the influence of evolutionary history. For instance, repeats within the same Hil protein appeared later in evolution, compared to Hil gene duplication, and therefore these repeats experience less time for sequence divergence.

      In the revised manuscript, we removed the analyses pertaining to the evolution of the repeat regions due to multiple challenges including alignment, potential of gene conversion and recombination. This is an important and intriguing aspect of adhesin family evolution that we plan to follow up in future work.

      Although the bioinformatic evidence of C. auris Hil genes acting as adhesins is strong, it is still worthwhile to discuss the experiments of confirming the function of adhesins.

      We agree with the reviewer and acknowledge in the revised manuscript the limitation of our work:

      “Future experimental tests of these hypotheses will be important biologically for improving our understanding of the fungal adhesin repertoire, important biotechnologically for inspiring additional nanomaterials, and important biomedically for advancing the development of C. auris-directed therapeutics.”

      SIGNIFICANCE

      Overall, this study is interesting to investigate the evolutionary history of a crucial virulent gene in C. auris. Such evolutionary understanding will help us identify critical molecular changes associated with the pathogenicity of an organism during evolution, providing insights into the emergence of pathogens and novel strategies to cure fungal infections. The research question is important; however, the current analyses on the positive selection are incomplete, so the conclusion is modest so far. We recommend that the authors re-do the PAML analysis with the above considerations. This work will bring more significance to the mycology field if the functional impact of rapid evolution in protein domains can be supported or inferred.

      This manuscript is well-written, and the authors also did a great job specifying all the necessary details in the M&M.

      We appreciate the reviewers’ positive comments.

      Reviewer #3

      Summary:

      The manuscript by Smoak et al. provides considerable information gleaned from analysis of HYR/IFF genes in 19 fungal species. A specific focus is on Candida auris. The main conclusion is that this gene family repeatedly expanded in divergent pathogenic Candida lineages including C. auris. Analyses focus on the sequences encoding the protein's N-terminal domain and tracts of repeated sequences that follow. The authors conclude with the hypothesis that expansion and diversification of adhesin gene families underpin fungal pathogen evolution and that the variation among adhesin-encoding genes affects adhesion and virulence within and between species. The paper is easy to read, includes clear and attractive graphics, as well as a considerable number of supplementary data files that provide thorough documentation of the sources of information and their analysis.

      We appreciate the positive comment.

      MAJOR COMMENTS:

      • Are the key conclusions convincing?

      Overall, the authors' conclusions are supported by the information they present. However, the overall conclusion is stated as a hypothesis and that hypothesis is not particularly novel. The idea that expansion of gene families associated with pathogenesis occurs in the pathogenic species dates back at least to Butler et al. 2009, who first presented the genome sequences for many of the species considered here.

      We appreciate the reviewer’s comment. Our main conclusions are 1) the Hil family is strongly enriched in distinct clades of pathogenic yeasts after accounting for phylogenetic relatedness. This enrichment results from independent duplications, which are ongoing between closely related species; 2) the protein sequences of the Hil family homologs diverged rapidly following gene duplication, driven largely by the evolution of the tandem repeat content, generating large variation in protein length and β-aggregation potential; 3) there is strong evidence for varying levels of selective constraint and moderate evidence for positive selection acting on the N-terminal effector domain during the expansion of the family in C. auris, our focal species. Based on these observations, we propose that expansion of adhesin gene families is a key preliminary step towards the emergence of fungal pathogenesis.

      Indeed, some version of this hypothesis has been proposed by several groups before us. We fully acknowledged this in our previous as well as the revised manuscript, by citing Butler et al. 2009 (PMID: 19465905), Gabaldón et al. 2013, 2016 (PMID: 24034898, 27493146). Our study built on these earlier efforts and extended them by addressing several limitations. First, we performed phylogenetic regression to test for the association between gene family size and the life history trait (pathogen or not) in order to properly account for the phylogenetic relatedness. This was not done in previous studies. Second, most earlier studies didn’t construct a family-wide gene tree to fully investigate the evolutionary history of the family. Gabaldón et al. 2013 did a phylogenetic analysis for the Epa family and a few others within the Nakaseomycetes, revealing highly dynamic expansions. In the present study, we expanded this effort by comprehensively identifying homologs within the Hil family in all yeasts and beyond. Third and perhaps the most important novelty in our study is our detailed analysis of sequence divergence and role of natural selection during the evolution of the family post duplication. This allowed us to present a complete picture of the family’s evolution, not just in its increase in copy number but also its diversification after the duplications, which is a key part of how gene duplications contribute to the evolution of novel traits. As such, we believe our study provides strong support for the above hypothesis.

      One key issue with a manuscript of this type is whether genome sequence data are accurate. The authors are not the first research group to take draft genome sequence data at face value and attempt to draw major conclusions from it. The accuracy of public genome data continues to improve, especially with the emergence of PacBio sequencing. Because the IFF/HYR genes contain long tracts of repeated sequences, genome assemblies from short-read data are frequently inaccurate. For example, is it reasonable to have confidence that the number of copies of a tandemly repeated sequence in a specific ORF is exactly 21 (an example taken from Table 2) when each repeat is 40+ amino acids long and highly conserved? Table S6 would benefit from inclusion of the type of sequence data used to construct each draft genome sequence. It is also reasonable to question whether the genome of the type strain is used as a template to construct the draft genomes of the other strains. If that was standard practice, conservation of the repeat copy number among strains might be an artefact. Conservation of repeat sequences to the degree shown is not a feature of the ALS family, a point of contrast between gene families that could be explored in the Discussion.

      We appreciate the reviewer’s comment and agree strongly that a key limitation in gene family evolution studies like ours is the quality of the genome assembly. In the original manuscript, we took several steps to ensure the completeness and accuracy of the Hil family homologs, primarily by basing our results on the high quality RefSeq collection of assemblies, and supplementing it with two fungi-specific databases. In the revised manuscript, we performed further quality analyses to assess and correct for inaccuracy in the BLASTP hits. Because RefSeq aims to provide a stable reference, it is often slow in replacing older assemblies with newer ones based on improved technologies. We thus compared the RefSeq hits for species in which a newer, long-read based assembly had become available. The results are documented in Text S1 and in summary, while we did find examples of missing homologs and inconsistent sequences, the problems were isolated to specific species and the inconsistency pertains only to the tandem repeat regions. Regarding the specific example of within-species variable number of tandem repeats (VNTR) in C. auris Hil1-Hil4, we are confident of both the copy number and the sequence variation for two reasons. First, all C. auris strain genomes analyzed in this study were assembled de novo rather than based on a reference genome, and all were long-read based (PacBio) (Table S4). Second, empirically, we found the VNTR identified in Hil1-Hil4 agree among strains within one of the four clades of C. auris while differing between clades (Table S5). Since assembly errors are not expected to produce clade-specific patterns, we believe this is strong evidence for the VNTR identified being real.
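
      The second argument above (copy numbers that agree within a clade but differ between clades) amounts to a simple check, sketched here in Python. The helper `clade_pattern`, the strain labels, the clade labels, and the copy numbers are all hypothetical and serve only to illustrate the logic; the actual data are in Table S5.

```python
from collections import defaultdict
from typing import Dict, Tuple


def clade_pattern(
    copy_numbers: Dict[str, int],
    clade_of: Dict[str, str],
) -> Tuple[bool, bool]:
    """Return (consistent_within_clades, differs_between_clades).

    copy_numbers maps strain -> repeat copy number for one gene;
    clade_of maps strain -> clade label.
    """
    by_clade = defaultdict(set)
    for strain, n in copy_numbers.items():
        by_clade[clade_of[strain]].add(n)
    # Consistent if every clade shows exactly one copy number.
    consistent_within = all(len(v) == 1 for v in by_clade.values())
    # Different if at least two clades show different sets of values.
    differs_between = len({frozenset(v) for v in by_clade.values()}) > 1
    return consistent_within, differs_between


if __name__ == "__main__":
    # Invented values, for illustration only.
    counts = {"strain_1": 21, "strain_2": 21, "strain_3": 18}
    clades = {"strain_1": "I", "strain_2": "I", "strain_3": "II"}
    print(clade_pattern(counts, clades))
```

      Random assembly errors would be expected to break the first condition, which is why a (True, True) outcome across genes is hard to explain as an artefact.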

      We also appreciate the reviewer’s suggestion to discuss the conservation of the repeats as an interesting trait of a group of Hil proteins in comparison to the Als family. We have now added a section in the Discussion focusing on the special properties of this group of Hil proteins.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Due to the nature of my comments, this review will not be anonymous. I will include some of the data from my laboratory to further illustrate the point about the quality of draft genome sequences, especially for gene families that contain repeated sequences. My laboratory group has spent the past several years looking at the families of cell wall genes in these species and know that the C. tropicalis genome sequence used in the current analysis is highly flawed. There is even a manuscript from several years ago that documents problems in the assembly (doi: 10.1534/g3.115.017566). There is a new PacBio sequence available that has considerably improved data for this group of genes, but still is not perfect. We designed primers and amplified the various coding regions to verify whether the IFF/HYR were correct in the draft genome sequences. For C. tropicalis, we know that 7 of the genes listed in this paper are broken (i.e. prematurely terminated) giving a false impression of their construction. The current study did not verify any gene sequences, so broken/incomplete genes are a stumbling block for developing conclusions.

      We deeply appreciate the reviewer pointing out the flaws in the C. tropicalis genome. Using the PacBio sequence-based new assembly, we were able to confirm the reviewer’s comment on the sequence and annotation error in the RefSeq assembly for C. tropicalis. We listed the comparisons between the two assemblies in Table S8. Because the differences reside outside of the Hyphal_reg_CWP domain, they don’t impact our phylogenetic analyses, which are based on the effector domain alignment. To determine if this is a widespread issue affecting all genome assemblies based on older technologies, and in response to the reviewer’s criticism, we systematically checked the sequences of BLASTP hits based on the RefSeq assemblies against newer, long-read based ones when available. As detailed in Text S1 in the revised manuscript, the problems seen in C. tropicalis were not observed in four other species. While the sample size is small, we believe the issues with C. tropicalis are likely due to a combination of specific issues with the original assembly and special properties of the genome.

      Similarly, the recent work from Cormack's lab features a PacBio C. glabrata sequence (doi: 10.1111/mmi.14707). The paper details how the authors focused on accurate assembly of the types of genes studied here. Sequences from the current project should be compared to the PacBio assembly to determine if they provide the same results.

      We compared the sequences of the three C. glabrata Hil homologs identified in the RefSeq assembly (GCF_000002545.3) to the best BLAST hits in one of the new Cormack lab assemblies (BG2 strain, GCA_014217725.1). Two of the three proteins showed identical sequences between the assemblies. The third is longer in the new assembly than in the RefSeq (1861 vs 1771 aa, XP_447567.2, QHS67215.1). The main difference, however, was the number of hits recovered. BLASTP searches in the new assembly recovered 13 hits vs 3 from the RefSeq assembly, of which 12 were in the subtelomeric region. For this reason, we used the new assembly as the basis for the Hil homologs in our subsequent analyses. To determine if we missed homologs in other genomes due to incomplete subtelomeric regions in the RefSeq assemblies, we repeated the BLASTP search in four other genomes (Text S1). In one of the four species, C. nivariensis, we recovered one more homolog than in the RefSeq. In the other three, we identified the same number (S. cerevisiae: 0, K. lactis: 1, C. albicans: 12), suggesting that the issues seen in C. glabrata are likely specific to this species and its RefSeq assembly.

      Another part of the study that deserves additional attention or perhaps altered presentation is the idea that the Iff/Hyr N-terminal domain binds ligands. The literature on the Iff/Hyr proteins is limited. In my opinion, though, the authors of this paper could more completely present the information that is known. The paper by Uppuluri et al. is cited (doi: 10.1371/journal.ppat.1007056), but I did not see any information about their data regarding interaction of C. albicans Hyr1 with bacterial proteins mentioned in the manuscript under review. It is formally possible that the N-terminal domain of Iff/Hyr proteins does not bind a ligand. The current manuscript includes a great deal of speculation on that point, suiting it better to a Hypothesis and Theory format rather than other types of publications.

      We appreciate the reviewer’s criticism and suggestion. We made two revisions based on the comments. First, we no longer refer to the Hyphal_reg_CWP domain as ligand-binding. Instead, we refer to it as the effector domain, following existing practices in the field (Lipke 2018, PMID: 29772751, de Groot et al. 2013, PMID: 23397570). Second, during the description of the predicted structure for the domain, we mentioned that it lacks an apparent binding pocket as suggested/identified in other β-solenoid proteins with carbohydrate binding abilities. Therefore, we suggest that the potential substrate and mechanism of binding by this domain remain to be determined with further experiments. We do, however, believe that there is strong evidence for the domain being involved in adhesion. A recent study (Reithofer et al. 2021) presented structural and phenotypic characterization of three Adhesin wall-like proteins (Awp1,2,3) in C. glabrata. In particular, experimental studies of CgAwp2, a Hil family protein, showed that its deletion resulted in the reversion of the hyperadhesive phenotype in one of the C. glabrata strains. Plastic was one of the substrates being evaluated, although, as the reviewer’s work pointed out, adhesion to plastics doesn’t indicate ligand binding, as it can be mediated by non-specific hydrophobic interactions (Hoyer and Cota 2016, PMID: 27014205). Nonetheless, the results presented in Reithofer et al. 2021 and other lines of evidence presented in the current work strongly supported adhesin functions of the Hil family.

      Table 1 attempts to offer evidence that the Iff/Hyr N-terminal domain has adhesive function but falls short of convincing the reader. One of the example structural templates is a sugar pyrophosphorylase that seems irrelevant to the current discussion. In the column called "Function", the word adhesin is found several times, but no detail is presented. The only entry that offers an example ligand indicates that the domain binds cellulose which is not likely relevant for mammalian pathogenesis, the main focus of the work. Other functions listed include self-association and cell aggregation--using the N-terminal domain. It is formally possible that Iff/Hyr proteins drive aggregation using the N-terminal domain and beta-aggregation sequences in the repeated region. The authors should develop these ideas further. Discussion of adhesive/aggregative function related to the ALS family can be found in Hoyer and Cota, 2016 (doi: 10.3389/fmicb.2016.00280).

      We appreciate the reviewer’s comments. In the revised manuscript, we removed Table 1, which was based on I-TASSER identified templates. Instead, we identified similar structures in the PDB50 database to the AlphaFold2 prediction for the Hyphal_reg_CWP domain in C. auris Hil1 using DALI (Table S3). We described the functional implications based on this list as follows:

      “We identified a number of bacterial adhesins with a highly similar β-helix fold but no α-crystallin domain (Table S3), e.g., Hmw1 from H. influenzae (PDB: 2ODL), Tāpirins from C. hydrothermalis (PDB: 6N2C), TibA from enterotoxigenic E. coli (PDB: 4Q1Q) and SRRP from L. reuteri (PDB: 5NY0). For comparison, the binding region of the Serine Rich Repeat Protein 100-23 (SRRP100-23) from L. reuteri was shown in Fig. 3F (Sequeira et al. 2018). Together, these results strongly suggest that the Hyphal_reg_CWP domain in the C. auris Hil family genes mediate adhesion.”

      One line of evidence suggesting that the Hyphal_reg_CWP domain may have ligand-binding activity comes from the L. reuteri SRRP-BR, one of the bacterial adhesins identified as having a highly similar β-helical structure (but missing the α-crystallin domain). In Sequeira et al. 2018 (PMID: 29507249), the authors showed via both in-vitro and in-vivo experiments that this domain “bound to host epithelial cells and DNA at neutral pH and recognized polygalacturonic acid (PGA), rhamnogalacturonan I, or chondroitin sulfate A at acidic pH”. However, the predicted structures for the Hyphal_reg_CWP domain in C. auris Hil1 and Hil7 lack a protruding, flexible loop in the β-helix, which was proposed to serve as a binding pocket for the ligand in SRRP-BR. We therefore commented in the text “Such a feature is not apparent in the predicted structure of the Hyphal_reg_CWP domain. Further studies are needed to elucidate the potential substrate for this domain and its mechanism of adhesion.”

      We also appreciate the reviewer’s suggestion to discuss the potential role of the Hil proteins in mediating adhesion vs cell aggregation. We now have a section in Discussion that focuses on the potential role of the β-aggregation sequences especially in the subset of Hil proteins led by C. auris Hil1-Hil4, which have an unusually large number of such sequences. We discuss the recent literature suggesting the potential of such features mediating cell-cell aggregation.

      The incredibly large number of figures that focus on the repeated sequences in the genes does not appear to include mention of the idea that these regions are frequently highly glycosylated. Knowing how much carbohydrate is added to these sequences in the mature protein would also have bearing on whether the beta-aggregation potential is realized. The Iff/Hyr proteins could stick to other things based on ligand binding (adhesion), hydrophobicity, aggregative activity, etc. Not much is really known about protein function so the conclusions are only speculative. The authors are largely accurate in presenting their conclusions as speculative, but the conclusions are not developed fully and always land on the idea that the N-terminal domain has adhesive function when that aspect clearly is not known.

      We appreciate the reviewer’s comment. We have performed N- and O-glycosylation predictions for the Hil family proteins in C. auris as a focal example and presented the results in Figure 2 of the revised manuscript. Briefly, we found that all eight proteins are predicted to be heavily O-glycosylated (Fig. 2C). N-glycosylation is rare except in Hil5 and Hil6, in regions with a low Ser/Thr content (Fig. 2C). We also deemphasized the ligand-binding ability of the effector domain and its importance in assessing the adhesin function of the Hil family proteins. At the same time, we highlighted other mechanisms the reviewer pointed out, such as aggregative activities, in our discussion of the potential importance of the large number of β-aggregation motifs.

      Another aspect of the analysis that is not mentioned is that several of the species discussed are diploid. What effect does ploidy have on the conclusions? Most draft genomes for diploid species are presented in a haploid display, so are not completely representative of the species. Additionally, some species such as C. parapsilosis are known to vary between strains in their composition of gene families, with varying numbers of loci in different isolates.

      We appreciate the reviewer raising this issue. The potential impact of having diploid genomes represented as haploids is twofold. First, if the genome sequencing was performed on a diploid cell sample with some highly polymorphic regions, that would present difficulties to the assembly and could result in poorly assembled sections. Second, either because of the first issue, or because the researchers used the haploid phase of the organism for sequencing, the representative haploid genome will not be “completely representative of the species” as the reviewer suggested. The second problem is not specific to diploids: even for haploids, any single genome or collection of genomes would represent just a slice of the genetic diversity in the species. We did two things to look into this. First, we analyzed multiple strains in C. auris to reveal both Hil family size variation and sequence polymorphism, particularly in the tandem repeat region. Second, as part of the quality control, we compared and searched assemblies for different strains of some species when available. We agree that characterizing multiple genomes in a species is important for fully revealing the gene pool diversity and could have important consequences for our understanding of the emergence of novel yeast pathogens.

      Regarding the first issue, we checked the original publications for two large-scale yeast genome sequencing projects that included 10 of the 32 species in the present study (Dujon et al. 2004, PMID: 15229592 and Butler et al. 2009, PMID: 19465905). In Dujon et al. 2004, the authors stated that haploid cells were used in cases where the species has both haploid and diploid phases. In Butler et al. 2009, the authors said in the Methods that “for highly polymorphic regions of diploid genomes, initial sequence assemblies were iteratively re-assembled in regions of high polymorphism to minimize read disagreement from the two haplotypes while maximizing coverage.” Therefore, the potential issue of heterozygosity is likely minimal. In addition, many diploid yeasts have large homozygous regions in their genomes, as a result of both clonal expansion and loss of heterozygosity (LOH), as documented in C. albicans and other Candida species (e.g., PMID: 28080987). Nonetheless, we acknowledge that this issue is yet another challenge to having high-quality, complete genome assemblies. In the discussion, we fully acknowledge the limitations that genome assemblies impose on our study, and believe that ongoing improvement thanks to the development of long-read technologies will allow more in-depth studies, particularly in the subtelomeric regions and for repeat-rich sequences.

      The manuscript concludes that having more genes is better, that the gene family represents diversification that must be driven by its importance to pathogenesis, without recognizing that some species evolve toward lower pathogenesis. This concept could be explored in the Discussion. …My own experience makes me wonder if the authors found any examples of species that provide an exception to the idea that having more genes is better and positively associated with pathogenesis. The parallel between IFF/HYR and ALS genes is made many times in the manuscript. Spathaspora passalidarum, a species that is not pathogenic in humans, but clearly within the phylogenetic group examined here, has 29 loci with sequence similarity to ALS genes. How many IFF/HYR genes are in S. passalidarum?

      We appreciate the reviewer’s comment. We will address the two comments above together as they are related. First of all, S. passalidarum is now included in our extended BLAST search list and we identified a total of 3 homologs in this species. When compared with the related Candida/Lodderomyces clade, which includes C. albicans, the Hil family in this species is relatively small (3 vs. >10). More generally, we observe a significant correlation between the Hil family size and the species’ pathogenic potential (Figure 1B and the phylogenetic regression result in the text).

      Regarding the first comment, we did identify two species that had a large Hil family (>8 based on C. auris) and yet were not known to infect humans. One of them, M. bicuspidata, has 29 Hil homologs and is interestingly a parasite for freshwater animals, such as Daphnia. The other species, K. africana, has 10 homologs and its ecology is not well described in the literature. With respect to the relationship between adhesin family and pathogenicity, we would like to make two points. First, as mentioned above, we observed a strong correlation between the Hil family size and the pathogen status, after correcting for phylogenetic relatedness, suggesting that expansion of the Hil family is a shared trait among pathogenic species. This doesn’t rule out the possibility that some species may have an expanded adhesin family, such as the example the reviewer mentioned, for reasons other than infecting a human host. Second, a key point in our work is that expansion of the adhesin family is only the first step – the crucial contribution of gene duplications to adaptation is not just in the increase in copy number, but also in providing the raw materials for selection to generate novel phenotypes. On that front, we documented the rapid divergence of the central domains both between and within species, as well as signatures of relaxed selective constraint and positive selection acting on the effector domain following gene duplications in C. auris, both of which support the above theme.

      There are several current taxonomies for the species in this region of the tree. The source of the names used in this paper could be specified more completely.

      We appreciate the reviewer’s comment. We now give the complete Latin names for all species in Figure 1 and only use abbreviated names, e.g., C. auris, after the first occurrence. For species with multiple names in the literature, we followed the species name and phylogenetic placement in Shen et al. 2018 (PMID: 30415838).

      The Results and Discussion sections are largely redundant. The tone of the paper is conversational, making it easy to read, but there seems little left to say in the Discussion that has not already been mentioned as the background for the various types of analyses. The authors should revise the paper to eliminate discussions of published literature from the Results and expand the Discussion to include some of the themes that have not been mentioned yet.

      We appreciate the reviewer’s comment. In the revised manuscript, we have moved discussion points from the Results to the Discussion section. We also overhauled the Discussion to focus on implications that are based on, but not already covered in, the Results, including the points the reviewer suggested, e.g., the implications of the structure for the adhesion mechanism.

      Another point that the authors do not mention is documented recombination between IFF and ALS genes (doi: 10.3389/fmicb.2019.00781) and the effect of that process on evolution among these gene families.

      We appreciate the reviewer’s comment. We now mention this and related observations in Discussion as part of the discussion on the mutational mechanisms for the evolution of the family:

      “Diversification of adhesin repertoire within a strain can arise from a variety of molecular mechanisms. For example, chimeric proteins generated through recombination between Als family members or between an Als protein’s N terminal effector domain and an Hyr/Iff protein’s repeat region have been shown (Butler et al. 2009; Zhao et al. 2011; Oh et al. 2019). Some of the adhesins with highly diverged central domains may have arisen in this manner (Fig. S10).”

      My reading of the work by Xu et al. 2021 (doi: 10.1111/mmi.14707) does not match the direction of its presentation in the current paper. Oh et al., 2021 (doi: 10.3389/fcimb.2021.794529) discussed that point recently, providing another point for the Discussion in the current paper.

      We appreciate the reviewer’s comment and agree that our original reading of Xu et al. 2021 was incorrect. Rather than suggesting a higher mutation rate in the subtelomeric region, the authors suggested that the evolution of the Epa family in the subtelomere was driven by Break-Induced Replication. We have updated our discussion as follows, also citing Oh et al. 2021:

      “Finally, as reported by (Muñoz et al. 2021), we found that the Hil family genes are preferentially located near chromosomal ends in C. auris and also in other species examined (Fig 7), similar to previous findings for the Flo and Epa families (Teunissen and Steensma 1995; De Las Peñas et al. 2003; Xu et al. 2020; Xu et al. 2021) as well as the Als genes in certain species (Oh et al. 2021). This location bias of the Hil and other adhesin families is likely a key mechanism for their dynamic expansion and sequence evolution, either via ectopic recombination (Anderson et al. 2015) or by Break-Induced Replication (Bosco and Haber 1998; Sakofsky and Malkova 2017; Xu et al. 2021). Another potential consequence of the subtelomeric location of Hil family members is that the genes may be subject to epigenetic silencing, which can be derepressed in response to stress (Ai et al. 2002). Such epigenetic regulation of the adhesin genes was found to generate cell surface heterogeneity in S. cerevisiae (Halme et al. 2004) and leads to hyperadherent phenotypes in C. glabrata (Castaño et al. 2005).”

      I might have missed it, but I could not find what constitutes a BLAST-excluded sequence (Table S7). Additional explanation (or making the explanation easier to find) would help the reader.

      We apologize for the inadvertent mistake of leaving out Table S7. In the revised manuscript, we include all hits from species that are part of the 322 species phylogeny in Shen et al. 2018. Thus, we removed the original Table S7.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Ideally, validation of all sequences would provide a stronger foundation for the work. However, that request is not realistic in terms of time or resources.

      We agree with the reviewer and appreciate the understanding. In the revised manuscript, we performed additional analyses to evaluate the accuracy of, and correct, the sequences of the BLASTP hits from the RefSeq database by comparing them to long-read based assemblies when possible. Please see previous replies to the reviewer’s comments and Text S1 for details.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes, the data and methods are documented clearly and perhaps too thoroughly in many places. A considerable amount of confidence is placed in sequences that might not be accurate and tracking details down to the amino acid residue may not be reasonable in this context. A disclaimer might help--everyone probably already knows that genome sequences are not perfect but stating that the analysis is only as good as the genome sequence acknowledges that fact.

      We appreciate the reviewer’s comment. In the revised manuscript, we tried to strike a balance between providing enough methodological details for the readers to assess the conclusions and yet also keeping the flow of the paper. We also accepted the reviewer’s suggestion by adding a disclaimer in the Discussion:

      “we acknowledge the possibility of missing homologs in some species and having inaccurate sequences in the tandem-repeat region. We believe the expected improvements in genome assemblies due to advances in long-read sequencing technologies will be crucial for future studies of the adhesin gene family in yeasts.”

      • Are the experiments adequately replicated and statistical analysis adequate?

      The idea of replicates does not really apply to this analysis. I think that the species sampled are reasonable to represent the region of the phylogenetic tree on which the analysis is focused. The authors clearly documented their computational methods in an admirable way.

      We appreciate the reviewer’s comment.

      MINOR COMMENTS:

      Figure 1 has elements that would make a nice graphical summary, but most of it should not be part of the final manuscript. For example, Panel A is repeated in Figure 2. It is not clear what Panel C means until the reader gets to Figure 2. Panel D is unnecessary. The image in Panel B is a good graphic. Endothelial adhesion is not mentioned, though. It is also debatable whether the proteins bind directly to plastic or to the body fluids that coat the plastic.

      Based on this and another reviewer’s comments, we removed Figure 1 from the revised manuscript.

      Compared to Figure 1, the information in Figure 3 is inconsistent. The "central domain" in Panel A is not central to anything as drawn, located at the end of the protein. The figure should be revised to be consistent with the majority of the authors' results.

      We appreciate the reviewer’s suggestion. The terminologies used to describe the different parts of a typical yeast adhesin vary in the literature. In the Als family literature, central domain refers to the region after the N-terminal effector domain and before the C-terminal Ser/Thr-rich stalk domain. In the Hil family proteins, there is not a clear distinction between a “central” and a “stalk” region. In Boisramé et al. 2011 (PMID: 21841123), the authors referred to the region between the Hyphal_reg_CWP domain and the GPI-anchor as the central domain. We adopted that use. We realize that this can lead to confusion especially for Als researchers. In some other literature, e.g., Reithofer et al. 2021, this part of the protein is referred to as the B-region. But we couldn’t find wide use of that term. We decided to stay with “central domain” in this work and hope that by defining the term in Figure 2A, we would avoid any confusion within the scope of this work.

      Are the low-complexity repeats mentioned in the Figure 4 legend present anywhere else in the C. auris genome or elsewhere among the species used in this study? The answer to that question may also provide evolutionary clues.

      We did find one other putative GPI-anchored cell wall protein containing this ~44 aa repeat unit, but with a different effector domain (GLEYA, PF10528). This protein (PIS58185.1 in C. auris B8441) appears to be a hybrid between the repeat region of C. auris Hil1 and an N-terminal effector domain of a different family. This result fits the theme of the reviewer’s work in C. albicans and C. tropicalis on the chimeric adhesins formed between the Als and Hyr/Iff families. Because it falls outside the scope of the current work, we omitted this finding from the main results.

      Figure S1 legend. How was the distance to C. glabrata measured to call it equal?

      The original Figure S1 was removed in the revised manuscript. A consistent set of criteria was employed in deciding which BLASTP hits to include as Hil family members.

      Figure S4 could be presented better. Both diagonals have the same information. One could be emptied or could alternatively present nucleotide identity.

      The original Figure S4 was removed in the revised manuscript.

      Italicize the species names in Panel C of Figure S8.

      The original Figure S8C is now Figure S9 and we systematically checked to make sure that species Latin names are italicized. Thanks for pointing this out.

      Lines 256-257: The paper selectively samples the Iff/Hyr family and does not examine the "entire" family. Please revise.

      We appreciate the reviewer’s comment. In the revised manuscript, we no longer selectively sample species. Instead, we only exclude three species that are not part of the 322-yeast species phylogeny in Shen et al. 2018 and Muñoz et al. 2018, namely Diutina rugosa, Kazachstania barnettii and Artibeus jamaicensis. Our extensive BLASTP searches also indicated that the family as defined in this work is specific to the budding yeast subphylum. We therefore believe it is accurate to describe the work as examining the entire Hil family.

      • Are prior studies referenced appropriately?

      I was disappointed to see that the paper does not reference my laboratory's work at all. When ALS genes are featured so strongly in a report, it seems reasonable to include something we have done over 30+ years. Our most-recent ALS paper (Oh et al., 2021 doi: 10.3389/fcimb.2021.794529) would be a reasonable source for defending the gene numbers used in Figure 2A. Other examples of our work that directly relate to concepts in this paper were mentioned above.

      We sincerely apologize for our negligence. We are new to the fungal adhesin field through an accidental finding, and despite our effort to digest the relevant literature, we did unfortunately overlook the extensive work done on the Als family, much of which came from the reviewer’s lab. We have carefully read the papers suggested by the reviewer as well as others, and now have better incorporated prior foundational and insightful work into our result and discussion sections (see previous replies to the reviewer’s comments).

      • Are the text and figures clear and accurate?

      Suggestions for improvement are incorporated into the comments above.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Please present Methods and Results in the past tense. I still make the same mistake when I try to get my ideas on the page but proofread one more time and ensure the verb tenses are accurate.

      We appreciate the reviewer’s comments and have edited the Methods and Results sections accordingly.

      SIGNIFICANCE

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper reads as if it is presenting preliminary data for a grant proposal. Perhaps Prof. He's lab wants to seek functional evidence for the role of the Iff/Hyr proteins. The current paper provides an exhaustive background for such a pursuit. As presented, there is little functional data for these proteins, genome sequences are not 100% accurate, but the trends noted are defendable.

      We appreciate the reviewer’s comments. We acknowledge that experimental studies will be needed to prove and further establish the functional importance of our findings. However, we believe our gene family evolutionary studies provided important novel insights and serve as an example for adhesin family evolution.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The ideas presented here are similar to those pioneered in the Butler et al. Nature paper in 2009 (doi: 10.1038/nature08064). We now have the benefit of more genome sequences so the analysis can encompass more species. C. auris adds a newer focus on part of the phylogenetic tree that was not previously emphasized. The idea of "more is better" is very simplistic, though. Parallel work for the ALS family shows complexity in gene expression levels, suggesting that some adhesins are poised to make a large contribution while others are likely to have a scant presence on the cell surface. Those concepts are not really explored in the current paper, either. See Hoyer and Cota 2016 (doi: 10.3389/fmicb.2016.00280); Oh et al. (doi: 10.3389/fmicb.2020.594531).

      We appreciate the reviewer’s comments and have added to the Discussion a section on the potential diversity of the duplicated Hil family proteins in terms of their function and regulation. Also see our response to the first comment of the reviewer regarding the novelty of our hypothesis and the significance of our findings.

      • State what audience might be interested in and influenced by the reported findings.

      Potential readers would come from the fields of fungal adhesion and pathogenesis, as well as evolutionary biology.

      We appreciate the reviewer’s comments.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I discovered and named the ALS gene family in C. albicans and have spent 30+ years characterizing it. Most recently, my lab has focused on providing an accurate gene census and validated gene sequences for the cell wall "adhesinome" in the pathogenic Candida species. Some families are expanded and some are not. Some proteins appear only in a few species and demonstrate key roles in host-fungus interactions. There are many nuances to interpretation of what these fungi are doing from the standpoint of cell-surface adhesins and we look forward to exploring these ideas across many genomes, using validated gene sequences. We have a tremendous dataset that might make good fuel for a collaboration with Prof. He, given his enthusiasm for this area of study, as well as his outstanding expertise and perspectives on evolutionary analyses.

      We sincerely thank the reviewer for the critical analysis of our manuscript and appreciate the many suggestions for improving the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Smoak et al. provides considerable information gleaned from analysis of HYR/IFF genes in 19 fungal species. A specific focus is on Candida auris. The main conclusion is that this gene family repeatedly expanded in divergent pathogenic Candida lineages including C. auris. Analyses focus on the sequences encoding the protein's N-terminal domain and tracts of repeated sequences that follow. The authors conclude with the hypothesis that expansion and diversification of adhesin gene families underpin fungal pathogen evolution and that the variation among adhesin-encoding genes affects adhesion and virulence within and between species. The paper is easy to read, includes clear and attractive graphics, as well as a considerable number of supplementary data files that provide thorough documentation of the sources of information and their analysis.

      Major comments:

      • Are the key conclusions convincing?

      Overall, the authors' conclusions are supported by the information they present. However, the overall conclusion is stated as a hypothesis and that hypothesis is not particularly novel. The idea that expansion of gene families associated with pathogenesis occurs in the pathogenic species dates back at least to Butler et al. (2009; doi: 10.1038/nature08064) who first presented the genome sequences for many of the species considered here.

      One key issue with a manuscript of this type is whether genome sequence data are accurate. The authors are not the first research group to take draft genome sequence data at face value and attempt to draw major conclusions from it. The accuracy of public genome data continues to improve, especially with the emergence of PacBio sequencing. Because the IFF/HYR genes contain long tracts of repeated sequences, genome assemblies from short-read data are frequently inaccurate. For example, is it reasonable to have confidence that the number of copies of a tandemly repeated sequence in a specific ORF is exactly 21 (an example taken from Table 2) when each repeat is 40+ amino acids long and highly conserved? Table S6 would benefit from inclusion of the type of sequence data used to construct each draft genome sequence. It is also reasonable to question whether the genome of the type strain is used as a template to construct the draft genomes of the other strains. If that was standard practice, conservation of the repeat copy number among strains might be an artefact. Conservation of repeat sequences to the degree shown is not a feature of the ALS family, a point of contrast between gene families that could be explored in the Discussion.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Due to the nature of my comments, this review will not be anonymous. I will include some of the data from my laboratory to further illustrate the point about the quality of draft genome sequences, especially for gene families that contain repeated sequences. My laboratory group has spent the past several years looking at the families of cell wall genes in these species and knows that the C. tropicalis genome sequence used in the current analysis is highly flawed. There is even a manuscript from several years ago that documents problems in the assembly (doi: 10.1534/g3.115.017566). There is a new PacBio sequence available that has considerably improved data for this group of genes, but it still is not perfect. We designed primers and amplified the various coding regions to verify whether the IFF/HYR genes were correct in the draft genome sequences. For C. tropicalis, we know that 7 of the genes listed in this paper are broken (i.e. prematurely terminated), giving a false impression of their construction. The current study did not verify any gene sequences, so broken/incomplete genes are a stumbling block for developing conclusions.

      Similarly, the recent work from Cormack's lab features a PacBio C. glabrata sequence (doi: 10.1111/mmi.14707). The paper details how the authors focused on accurate assembly of the types of genes studied here. Sequences from the current project should be compared to the PacBio assembly to determine if they provide the same results.

      Another part of the study that deserves additional attention or perhaps altered presentation is the idea that the Iff/Hyr N-terminal domain binds ligands. The literature on the Iff/Hyr proteins is limited. In my opinion, though, the authors of this paper could more completely present the information that is known. The paper by Uppuluri et al. is cited (doi: 10.1371/journal.ppat.1007056), but I did not see any information about their data regarding interaction of C. albicans Hyr1 with bacterial proteins mentioned in the manuscript under review. It is formally possible that the N-terminal domain of Iff/Hyr proteins does not bind a ligand. The current manuscript includes a great deal of speculation on that point, suiting it better to a Hypothesis and Theory format rather than other types of publications.

      Table 1 attempts to offer evidence that the Iff/Hyr N-terminal domain has adhesive function but falls short of convincing the reader. One of the example structural templates is a sugar pyrophosphorylase that seems irrelevant to the current discussion. In the column called "Function", the word adhesin is found several times, but no detail is presented. The only entry that offers an example ligand indicates that the domain binds cellulose which is not likely relevant for mammalian pathogenesis, the main focus of the work. Other functions listed include self-association and cell aggregation--using the N-terminal domain. It is formally possible that Iff/Hyr proteins drive aggregation using the N-terminal domain and beta-aggregation sequences in the repeated region. The authors should develop these ideas further. Discussion of adhesive/aggregative function related to the ALS family can be found in Hoyer and Cota, 2016 (doi: 10.3389/fmicb.2016.00280).

      The incredibly large number of figures that focus on the repeated sequences in the genes does not appear to include mention of the idea that these regions are frequently highly glycosylated. Knowing how much carbohydrate is added to these sequences in the mature protein would also have bearing on whether the beta-aggregation potential is realized. The Iff/Hyr proteins could stick to other things based on ligand binding (adhesion), hydrophobicity, aggregative activity, etc. Not much is really known about protein function so the conclusions are only speculative. The authors are largely accurate in presenting their conclusions as speculative, but the conclusions are not developed fully and always land on the idea that the N-terminal domain has adhesive function when that aspect clearly is not known.

      Another aspect of the analysis that is not mentioned is that several of the species discussed are diploid. What effect does ploidy have on the conclusions? Most draft genomes for diploid species are presented in a haploid display, so are not completely representative of the species. Additionally, some species such as C. parapsilosis are known to vary between strains in their composition of gene families, with varying numbers of loci in different isolates.

      The manuscript concludes that having more genes is better, that the gene family represents diversification that must be driven by its importance to pathogenesis, without recognizing that some species evolve toward lower pathogenesis. This concept could be explored in the Discussion.

      The Results and Discussion sections are largely redundant. The tone of the paper is conversational, making it easy to read, but there seems little left to say in the Discussion that has not already been mentioned as the background for the various types of analyses. The authors should revise the paper to eliminate discussions of published literature from the Results and expand the Discussion to include some of the themes that have not been mentioned yet.

      My own experience makes me wonder if the authors found any examples of species that provide an exception to the idea that having more genes is better and positively associated with pathogenesis. The parallel between IFF/HYR and ALS genes is made many times in the manuscript. Spathaspora passalidarum, a species that is not pathogenic in humans, but clearly within the phylogenetic group examined here, has 29 loci with sequence similarity to ALS genes. How many IFF/HYR genes are in S. passalidarum?

      There are several current taxonomies for the species in this region of the tree. The source of the names used in this paper could be specified more completely.

      Another point that the authors do not mention is documented recombination between IFF and ALS genes (doi: 10.3389/fmicb.2019.00781) and the effect of that process on evolution among these gene families.

      My reading of the work by Xu et al. 2021 (doi: 10.1111/mmi.14707) does not match the direction of its presentation in the current paper. Oh et al., 2021 (doi: 10.3389/fcimb.2021.794529) discussed that point recently, providing another point for the Discussion in the current paper.

      I might have missed it, but I could not find what constitutes a BLAST-excluded sequence (Table S7). Additional explanation (or making the explanation easier to find) would help the reader.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Ideally, validation of all sequences would provide a stronger foundation for the work. However, that request is not realistic in terms of time or resources.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes, the data and methods are documented clearly and perhaps too thoroughly in many places. A considerable amount of confidence is placed in sequences that might not be accurate and tracking details down to the amino acid residue may not be reasonable in this context. A disclaimer might help--everyone probably already knows that genome sequences are not perfect but stating that the analysis is only as good as the genome sequence acknowledges that fact.

      • Are the experiments adequately replicated and statistical analysis adequate?

      The idea of replicates does not really apply to this analysis. I think that the species sampled are reasonable to represent the region of the phylogenetic tree on which the analysis is focused. The authors clearly documented their computational methods in an admirable way.

      Minor comments:

      • Specific experimental issues that are easily addressable.

      Figure 1 has elements that would make a nice graphical summary, but most of it should not be part of the final manuscript. For example, Panel A is repeated in Figure 2. It is not clear what Panel C means until the reader gets to Figure 2. Panel D is unnecessary. The image in Panel B is a good graphic. Endothelial adhesion is not mentioned, though. It is also debatable whether the proteins bind directly to plastic or to the body fluids that coat the plastic.

      Compared to Figure 1, the information in Figure 3 is inconsistent. The "central domain" in Panel A is not central to anything as drawn, located at the end of the protein. The figure should be revised to be consistent with the majority of the authors' results. Structures in Panels C to E would benefit from the "through the spiral" view that is featured in Figure S9. What experimental technique was used to solve the structure in Panel E? Adding that information to the legend would be helpful to the reader. Also, the secondary structure colors seem to be reversed between the legend and domain structure. Adding the coordinates of the domains shown would help the reader to understand their location in the mature protein.

      Are the low-complexity repeats mentioned in the Figure 4 legend present anywhere else in the C. auris genome or elsewhere among the species used in this study? The answer to that question may also provide evolutionary clues.

      Figure S1 legend. How was the distance to C. glabrata measured to call it equal?

      Figure S4 could be presented better. Both diagonals have the same information. One could be emptied or could alternatively present nucleotide identity.

      Italicize the species names in Panel C of Figure S8.

      Lines 256-257: The paper selectively samples the Iff/Hyr family and does not examine the "entire" family. Please revise.

      • Are prior studies referenced appropriately?

      I was disappointed to see that the paper does not reference my laboratory's work at all. When ALS genes are featured so strongly in a report, it seems reasonable to include something we have done over 30+ years. Our most-recent ALS paper (Oh et al., 2021 doi: 10.3389/fcimb.2021.794529) would be a reasonable source for defending the gene numbers used in Figure 2A. Other examples of our work that directly relate to concepts in this paper were mentioned above.

      • Are the text and figures clear and accurate?

      Suggestions for improvement are incorporated into the comments above.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Please present Methods and Results in the past tense. I still make the same mistake when I try to get my ideas on the page but proofread one more time and ensure the verb tenses are accurate.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper reads as if it is presenting preliminary data for a grant proposal. Perhaps Prof. He's lab wants to seek functional evidence for the role of the Iff/Hyr proteins. The current paper provides an exhaustive background for such a pursuit. As presented, there is little functional data for these proteins, genome sequences are not 100% accurate, but the trends noted are defendable.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The ideas presented here are similar to those pioneered in the Butler et al. Nature paper in 2009 (doi: 10.1038/nature08064). We now have the benefit of more genome sequences so the analysis can encompass more species. C. auris adds a newer focus on part of the phylogenetic tree that was not previously emphasized. The idea of "more is better" is very simplistic, though. Parallel work for the ALS family shows complexity in gene expression levels, suggesting that some adhesins are poised to make a large contribution while others are likely to have a scant presence on the cell surface. Those concepts are not really explored in the current paper, either. See Hoyer and Cota 2016 (doi: 10.3389/fmicb.2016.00280); Oh et al. (doi: 10.3389/fmicb.2020.594531).

      • State what audience might be interested in and influenced by the reported findings.

      Potential readers would come from the fields of fungal adhesion and pathogenesis, as well as evolutionary biology.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I discovered and named the ALS gene family in C. albicans and have spent 30+ years characterizing it. Most recently, my lab has focused on providing an accurate gene census and validated gene sequences for the cell wall "adhesinome" in the pathogenic Candida species. Some families are expanded and some are not. Some proteins appear only in a few species and demonstrate key roles in host-fungus interactions. There are many nuances to interpretation of what these fungi are doing from the standpoint of cell-surface adhesins and we look forward to exploring these ideas across many genomes, using validated gene sequences. We have a tremendous dataset that might make good fuel for a collaboration with Prof. He, given his enthusiasm for this area of study, as well as his outstanding expertise and perspectives on evolutionary analyses.

    1. This can be said of the common cognitions of blue, yellow etc. also. If self-awareness could be discredited on the ground that it is the product of some beginningless urge, how can any other cognition be credited as valid so that one could depend upon the cognitions of blue, yellow etc.?”

      The other day, I had an interesting discussion with a Buddhist philosopher who works on Sanskrit. We discussed whether grammatical structure leads us to think in a certain way, that is, whether a language and its grammar may already be guiding our thinking. The argument concerned the phrase "White Cow": does the whiteness possess the cow, or does the cow possess the whiteness? The scholars argued that it could be translated either way, which gives us a somewhat vague, different understanding of what is being described.

    1. Reviewer #1 (Public Review):

      The core question addressed by this study is whether right IFC damage disrupts stop-signal task performance because it plays a key role in response inhibition per se, or because it is crucial for attending to the need to engage response inhibition. A relatively large sample of patients with damage including right IFC, as well as lesioned and healthy control groups, were assessed on the stop-signal task accompanied by scalp EEG. The behavioral data were analyzed using hierarchical Bayesian modeling. Right IFC damage was associated with more trials where 'stopping' was not initiated, while an EEG hallmark of inhibitory control was present in trials where stopping initiation did occur, arguing that rIFG damage disrupts attention to the stop signal, rather than the inhibition that follows.

      This is an interesting study testing a well-defined hypothesis relevant to competing views of the brain basis of inhibitory control. The experimental design is sophisticated and the analysis was preregistered. The acquisition of both behavioral and EEG data in lesion patients provides converging evidence and supports causal inference.

      Interpretation of the results hinges on accepting that a hierarchical Bayesian model is appropriate for discriminating trials where stopping was 'triggered' from trials where there was no trigger. Likewise, we need to accept the EEG frontal beta burst pattern is an indicator of response inhibition. Both of these methodological elements have support from existing literature, although I don't think either of these has been applied in chronic focal lesion patients, so there may be technical issues to consider in their interpretation. Finally, as with most human lesion studies, caution should be applied in interpreting the critical lesion location: in this sample, the effects might relate to insula damage, or to white matter disruption within the ventrolateral/lateral frontal lobe or between those regions and subcortical regions. However, these provisos do not detract from the key finding that damage somewhere in these areas affected initiation/attentional processes rather than response control per se.

      The results are more consistent with an attentional account of right IFG (or more broadly, right ventral frontal lobe) contributions to stop-signal task performance; this is provocative in light of current views of prefrontal contributions to inhibitory control, although in line with a wider literature implicating right frontoparietal circuitry in selective attention. As the authors suggest, a sharp distinction between attention and inhibition may be somewhat artificial: these processes may be closely interrelated in speeded tasks requiring response interruption. However, the present study cleverly tackles the challenge of disentangling them, applying recent modeling and EEG distinctions with interesting results.

      The findings are helpful in further sharpening ideas regarding the neural basis of response control. They also have potential theoretical implications and perhaps direct experimental application in clinical-applied research on disorders of inhibitory control.

    1. Author Response

      Reviewer #1 (Public Review):

      The stated goal of this research was to look for interactions between metabolism, (manipulated by glucose starvation) and the circadian clock. This is a hot topic currently, as bi-directional links between metabolism and rhythmicity are found in several organisms and this connection has important implications for human health. The authors work with the model organism Neurospora crassa, a filamentous fungus that has many advantages for this type of research.

      The authors' first approach was to assay the effects of glucose starvation on the levels of the RNA and protein products of the key clock genes frq, wc-1, and wc-2. The WC-1 and WC-2 proteins form a complex, WCC, that activates frq transcription. The surprising finding was that WC-1 and WC-2 protein levels and WCC transcriptional activity were drastically reduced but frq RNA and protein levels remained the same. Under conditions where rhythmicity is expressed, the rhythms of frq RNA, FRQ protein, and expression of clock-driven "output" genes were also unaffected by starvation. The standard model for the molecular clock is a transcription/translation feedback loop dependent on the levels and activity of these clock gene products, so this disconnect between the starvation-induced changes in the stoichiometry of the loop components and the lack of effects of starvation on rhythmicity calls into question our understanding of the molecular mechanism of the clock. This is yet another example of the inadequacy of the TTFL model to explain rhythmicity. For me, the most significant sentence in the paper was this: "...an unknown mechanism must recalibrate the central clockwork to keep frq transcript levels and oscillation glucose-compensated despite the decline in WCC levels."

      The authors' second approach was to try to identify mechanisms for the response to starvation by focussing on frq and its regulators, using mutations in the frq gene and strains with alterations in the activity of kinases and phosphatases known to modify FRQ protein. The finding that all of these manipulations have some effect on the starvation-induced changes in WC protein level is taken by the authors to indicate a role for FRQ itself in the response to starvation. This conclusion is subject to the caveat that manipulations of the activity of multifunctional kinases and phosphatases will certainly have pleiotropic effects on many cellular processes beyond FRQ protein activity.

      Because of the sometimes-speculative nature of our conclusions, and following the editor's suggestion, we restructured the Discussion and now discuss the mechanism raised by the Reviewer in the subsection "Ideas and Speculation". We added a sentence to that section about the possible pleiotropic effects of the tested signaling pathways: "Starvation triggers characteristic changes in the activity of signaling routes that affect basic components of the circadian clock. Although the multifunctional pathways might act via pleiotropic mechanisms as well, based on their earlier characterized role in the control of the Neurospora clock, their action can be inserted into a model describing the glucose-dependent reorganization of the oscillator."

      The third section of the paper is a major transcriptomic study of the effects of starvation on global gene expression. Two strains are compared under two conditions: wc wild-type and the wc-1 knockout strain, under fed and starved conditions. The hypothesis is that WCC has a role in the starvation response. The results of starvation on the wild-type are unsurprising and predictable: the expression of many genes involved in metabolic processes is affected. There are no new insights that come from these results and no new testable hypotheses are generated by the data.

      We agree with the reviewer that it is not surprising that glucose depletion strongly affects genes involved in metabolic processes and monosaccharide transport. These data obtained in wt served rather as a control for our experimental conditions. As a new aspect, our analysis focused on the differences between wt and wc-1 in the transcriptomic response to altered glucose availability.

      The authors refer to the wc-1 mutant strain as "clockless" and discuss its effects on the transcriptome only in terms of WC-1's function in the clock mechanism. However, WCC is known to be a major transcriptional regulator, controlling a number of genes beyond the TTFL. As acknowledged earlier in the paper, WC-1 is also the major light receptor in Neurospora. The transcriptomics experiments were carried out in a light/dark cycle, with cultures harvested at the end of the light period, when "an adapted state for light-dependent genes can be expected" according to the authors. However, wc-1 mutants are essentially blind, and so those samples are equivalent to being harvested in the dark. The multifunctional nature of WCC complicates the interpretation of the transcriptomics data. The differences in the transcriptome between wild-type and wc-1 may not be due to loss of clock function, but rather the loss of a major multifunctional transcription factor, or the difference between light and "dark".

      The reviewer is right: when we discussed the difference between wt and wc-1 in the transcriptional response to glucose, we did not emphasize the possible contribution of the photoreceptor function of the WCC. We added the following sentence to the revised version of the discussion: "Further investigations could differentiate between the clock and photoreceptor functions of the WCC in the glucose-dependent control of the transcriptome." Furthermore, we now indicate more specifically that in wc-1 it is the lack of the WCC (and not the lack of a functional clock) that results in the altered transcriptomic response to starvation when compared to wt (P15 L14-17).

      In the final set of experiments, the authors tested the hypothesis that the changes in the transcriptome between wild type and wc-1 might make wc-1 less competent to recover growth after starvation. They also test the recovery of frq9, a "clockless" mutant. The very surprising result is that the growth rates of these two mutants are slower than the wild type after transfer from starvation media to high glucose. This is surprising because there will be several generations of nuclear division and doublings of mass within a few hours and the transcriptome should have recovered fully fairly rapidly. A mechanism for this apparent "after-effect" is suggested with evidence concerning differences in expression of a glucose transporter, but it is not clear why this expression should not change rapidly with re-feeding on high glucose. As with previous experiments, the cultures were grown in light/dark cycles, which results in different conditions for the mutants, both of which have very low or absent WC-1 and are therefore blind to light. The potential effects of light have been disregarded.

      The reviewer is right that several generations of nuclear divisions occur within a few hours and lead to a number of doublings of the biomass. However, when the first phase of regeneration is delayed in one or more strains compared to the control, a substantial difference in biomass can be expected by the time the cultures reach the stationary phase.

      To the expression change of the glucose transporter: In order to emphasize the different tendency of how glt-1 levels respond to glucose in the different strains, in the previous version of the manuscript we normalized the expression levels to the beginning of recovery (time point of glucose addition). Thus, expression differences between the strains were not shown. To give a more comprehensive picture, in the revised version of the manuscript expression levels without normalization are depicted (Fig 5F). The mutants did not adapt efficiently to changes in the glucose levels, i.e. expression of the transporter was relatively high in both wc-1 and frq10 during starvation and did not further increase upon glucose addition. On the other hand, 24 hours after glucose resupply, glt-1 levels were similar in all strains which might contribute to the similar growth rates observed under steady-state conditions in the standard medium.

      To the photoreceptor-independent function of the WCC during growth recovery: In the revised version of the manuscript we present additional data suggesting the importance of the photoreceptor-independent function of the WCC for efficient recovery from starvation. Fig. 5C and Fig. 5D now show that upon resupply of glucose, wt grows faster than the clock-deficient strains Δwc-1 and frq10 in both LD cycles and constant darkness, indicating that the role of the WCC in growth regeneration is at least partially independent of its photoreceptor function.

      To the function of the WCC in frq10: frq10 cannot be considered blind. Although both Δwc-1 and frq10 lack a functional clock and WC levels are reduced in frq10, these strains show significant differences in WCC activity. While Δwc-1 is considered blind, in frq10 the lack of negative feedback results in high activity of the WCC in both DD and LL, and expression levels of all examined light-sensitive or light-dependent genes were found to be comparable in wt and in frq-less mutants (Schafmeier et al., 2005; Hunt et al., 2007; own unpublished data).

      The title of the paper refers to a "flexible circadian clock" but this concept of flexibility is not developed in the paper. I would substitute "the White Collar Complex" for this phrase: "Adaptation to starvation requires a functional White Collar Complex in Neurospora crassa" would be more accurate. Some experiments are also conducted using an frq null "clockless" strain, but because WC expression is very low in frq null mutants, any effects of frq null could also be attributed to WC depletion.

      As detailed above, a low level of the WCC in the frq-less mutant does not mean low transcriptional activity and, accordingly, the two clock mutants, wc-1 and frq10, show important functional differences. We used the word "flexible" to indicate that the molecular clock is able to operate under critical nutrient conditions and with a significantly changed stoichiometry of its key components. Results of our new experiments performed in DD (mentioned above) indicate that growth regeneration is rather independent of the photoreceptor function of the WCC. Nevertheless, we accepted the criticism of the reviewer and changed the title to "Adaptation to glucose starvation is associated with molecular reorganization of the circadian clock in Neurospora crassa".

      The major conclusion I took away from this paper is the multifunctional nature of the WCC as a transcription factor complex. It has been known for a long time that WCC controls the expression of many genes beyond the frq gene at the core of the circadian transcription/translation feedback loop. WC-1 is also the major blue light photoreceptor in Neurospora, controlling the expression of light-regulated genes, and this fact is barely touched on in the paper. These new data now extend the role of WCC in the regulation of metabolic networks as well.

      Reviewer #2 (Public Review):

      The authors have performed an interesting study addressing a topical question: how circadian oscillators remain accurate in changing environmental conditions, and how these oscillators contribute to responses to environmental changes. The authors have performed their studies in Neurospora crassa. They have made a very interesting finding that starvation causes a profound decrease in White Collar 1 (WC-1) abundance, yet the circadian system continues to run despite this decrease in the abundance of a core oscillator component. The study of chronic glucose starvation in a Δwc-1 mutant is interesting and provides the opportunity to investigate the role of the WHITE COLLAR COMPLEX (WCC) and the clock system in adaptation to starvation.

      Strengths:

      The authors have used a range of techniques to measure clock behaviour, including qPCR, phosphorylation, protein abundance, and subcellular localisation studies.

      An frq9 mutant was used to test the effects of FRQ on WC1 abundance since WC1 decreased during starvation. This is elegant, though the logic of this experiment is not quite clear: FRQ abundance did not change during starvation, so why did the authors think this experiment was needed?

      We regret that the examination of frq9 was not clearly justified in the previous version of the manuscript. It is true that FRQ levels did not change during starvation, only phosphorylation of the protein was affected, i.e. FRQ became more phosphorylated (displayed by an electrophoretic mobility shift on the Western blot (Garceau N, Liu Y, Loros J J, Dunlap J C. Cell. 1997;89:469–476.)) under low glucose conditions. We tested the starvation response in the FRQ-less strain because WCC level changed significantly in wt upon glucose depletion and expression of WC proteins is known to be controlled by FRQ. In the revised version of the manuscript we tried to introduce and explain the experiments performed with frq9 more thoroughly (P7 L22-P8 L14; P16 L21 – P17 L6).

      An interesting experiment was performed to test whether CK1a-dependent phosphorylation and inactivation of the WCC are involved in the starvation response. An FRQΔFCD1-2 mutant is used in which FRQ cannot interact with CK1a and therefore CK1a cannot phosphorylate and inactivate WC. This experiment suggested that CK1a is not involved in the response to starvation, again leading to the conclusion that FRQ is not involved in the starvation regulation of WC.

      The referee is right: the effect of FRQ-bound CK-1a on the adaptation of the molecular clock to starvation seems to be minor, and this is also our conclusion in the manuscript. The major message of this experiment was that FRQ became phosphorylated in response to starvation without stably interacting with CK1a, probably via another mechanism. We agree with the notion that the behavior of WCC levels upon starvation was similar to that in the FRQ-less mutant.

      PKA is shown to be involved in the starvation-induced reduction of WC because the starvation-induced reduction in abundances of WC-1 was absent in the mcb strain in which the regulatory subunit of PKA is defective and hence, PKA is constitutively active.

      The authors have found an interesting potential link between glucose levels and WCC phosphorylation. They demonstrated that starvation reduces PP2A activity and that in a regulatory mutant of PP2A, which has reduced PP2A activity, there is little effect of starvation on WCC levels, suggesting the hypothesis that glucose-dependent PP2A dephosphorylation stabilises WCC.

      Analysis of starvation-regulated transcriptome in Δwc-1 and wild type found strong evidence that the transcriptomic response to starvation is in part dependent on WCC. Much of the misregulated transcriptome appears to be associated with metabolism.

      In a series of growth studies in wild type and in frq and wc-1 mutants, the authors provide strong evidence that FRQ and WC are involved in growth and survival following starvation, and recovery from starvation.

      Weaknesses:

      The authors describe Neurospora crassa as a model for circadian biology and apparently make the assumption that the findings are indicative of the behaviour of clock systems in other kingdoms. This is not the case. Neurospora crassa is a wonderful model for studying fungal clocks and is a great tool for studying basic circadian dynamics, but the interesting findings here are of a detailed molecular nature and therefore are applicable for fungal clocks, but not other kingdoms.

      We agree that we still do not know whether the described mechanism is specific for only fungal clocks. However, besides the basic feedback loop, overlapping mechanisms (controlled by e.g. casein kinases, glycogen synthase kinase, PKA, PP2A) are involved in the regulation of circadian timekeeping in different eukaryotic systems (reviewed in Reischl and Kramer, 2011, FEBS Lett; Brenna and Albrecht, 2020, Front Physiol). Our results suggest that some of these common factors (PKA, GSK, PP2A) are involved in the reorganization of the Neurospora clock in response to changes in glucose availability. Therefore, it is possible that analogous changes occur in the time keeping mechanisms of other eukaryotic systems when they face serious environmental challenges.

      We added a short section to the Discussion that gives an overview of known interactions between glucose availability and circadian timekeeping at different levels of the phylogenetic hierarchy (P15 L18 – P16 L7).

      The authors assume that the reader is intimate with the intricacies of Neurospora crassa circadian studies and the significance of differences between LL and DD investigations. More background on the logic of the experiments would be helpful for readers from other fields.

      Thank you for the comment. In the revised version of the manuscript we tried to introduce the molecular clock of Neurospora more thoroughly and completed the description of the experimental conditions with detailed explanations.

      The data in Figure 2 are essential for the interpretation of the findings, demonstrating the presence of free-running rhythms. However, the data are entirely qualitative, making it hard to fully assess the authors' interpretations, a more quantitative assessment of the data would improve clarity.

      We quantified the Western blot signals and show the results in Fig 1E in the new version of the manuscript (according to the reviewer's suggestion Fig 2 of the old version is now part of Fig 1). Our data indicate that oscillation of FRQ levels is similar under both nutrient conditions.

      The conclusion that FRQ contributes to the regulation of WC1 abundance in response to starvation does not seem to be supported by the data because FRQ RNA does not change upon starvation. Furthermore, the authors' conclusion that the starvation-induced decrease in WC-1 and WC-2 protein levels is due to FRQ, which rests on the lack of reduction in an frq9 mutant, is open to misinterpretation: WC levels are already low in this mutant, so starvation might not lower them further. Indeed, WC-1 is lower in the frq9 mutant under any condition than in the WT under starvation, and WC-2 does decrease in abundance in the frq9 mutant in starvation. The data strongly suggest to this reader that FRQ does not participate in the regulation of WC abundance in response to starvation.

      After rereading the criticized section, we admit that the text was not well structured and we carried out several modifications. We intended to emphasize that upon drastic changes of the glucose availability frq RNA levels remained compensated in wt, but this compensation was affected when functional FRQ was not present. We agree with the reviewer's opinion that the low expression of the WCC in frq9 makes it difficult to compare the glucose-dependence of WCC expression in frq9 and wt. We modified the conclusion by adding this information and now mainly focus on the strain-dependent difference in the changes of frq RNA expression. (P7 L22-P8 L14)

      The discussion accurately summarises the results and provides an interpretation but lacking is a comparison to other circadian systems in other kingdoms. How do the data compare with the effects of glucose and other sugars on the mammalian, plant, and insect clocks?

      We added a short section to the Discussion that gives an overview of known interactions between glucose availability and circadian timekeeping in different organisms (P15 L18-P16 L7).

      How changes in WCC might result in changes in transcription is not explained. This might be very obvious to the authors but to the reader, it is not. Are the transcriptional outputs direct targets of WCC? Has WCC CHIPseq been performed by the authors or others, are the regulated transcripts directly bound by WCC? What are the enriched promoter sequences in the regulated genes, is it possible to identify the network by which these changes in transcription occur?

      We now show the list of genes (Figure 4 – Figure supplement 2) that changed in a strain-specific manner in response to glucose starvation and that were earlier described, based on ChIP-seq results, as direct targets of the WCC (Smith et al., 2010; Hurley et al., 2014). The literature shows that the WCC affects the expression of several other transcription factors and controls basic cellular functions, which might in turn affect the expression of further genes. It was therefore not surprising that only 90 out of the 1377 genes were reported to be direct targets of the WCC.

      Whilst the authors claim it is the circadian clock that is involved in the starvation response, in my view a more precise interpretation of the data is that WCC is involved in the response. Since WCC is a photoreceptor with dual function in the clock, is it yet possible to conclude that the effects discovered are due to the clock role of WCC? Or do the data support the role of light signalling in regulating the starvation response through WCC?

      We thank you for the comment. In the revised version of the manuscript we more specifically indicate that in wc-1 the lack of the WCC (and not the lack of a functional clock) results in the altered transcriptomic response to starvation compared to wt. In addition, in the revised version we present a new experiment (Fig. 5D.) which shows that upon resupply of glucose wt grows faster also in constant darkness than the clock-deficient strains wc-1 and frq10 do. This indicates that the role of the WCC in growth regeneration is largely independent of its photoreceptor function.

      The authors apparently do not reconcile the finding that starvation hugely decreases WCC levels with the finding that the transcriptional and growth response to starvation requires WCC.

      We agree with the reviewer that the problem of how low levels of WCC could sufficiently support the transcription of frq and different output genes under starvation conditions was not discussed properly. Our results suggest a model in which the maintained level of nuclear WCC and the weakened inhibition by both FRQ (the hyperphosphorylated form is less active in the negative feedback) and PKA (its activity lowered upon glucose depletion) together might ensure that transcriptional activity of the WCC is preserved upon glucose withdrawal in both DD and LL despite the decrease of the overall level of the complex. In the revised version these aspects are discussed more thoroughly (P16-18).

      This study contributes to the increased focus of the circadian community on the regulation of outputs by circadian oscillators. The manuscript will be of interest to many in the field. There needs to be less assumption of knowledge about the N. crassa circadian system, and better discussion in a broader context of clocks in other kingdoms.

      We added a new section to the Discussion with data concerning interrelationships between glucose availability and the circadian clock in other organisms.

    1. Children’s codeswitching and translanguaging is influenced by the language model provided by parents and significant others in the family, school and community.

      I think this is very important for teachers to know, as culturally, code-switching and translanguaging may be a huge issue in the home. It is super important to make connections to students' first language, but we must research the family's norms and what they want for their child in learning a new language.

    2. Serratrice (2013) notes that the profile of bilinguals constantly changes as their need for and use of each of their languages can vary greatly over time, depending on such factors as context, purpose, the formality of the situation, and who they wish or need to interact with. The term dynamic bilingualism captures this ever-changing nature of language use by emergent bilinguals (O. Garcia, 2009a).

      I think dynamic bilingualism is important for us as future teachers to acknowledge and understand. I also wonder how the majority languages in an individual's everyday life impact the shifts in how they use their languages. In the classroom this could affect teachers because we may become accustomed to using English with our emergent bilingual students; however, we should still encourage and provide opportunities for students to use their native/home language during learning. As teachers we should allow opportunities for students to further develop all their languages.

      -Lauren Mitchell

    1. reading too much into this? May

      I know it is annoying, and as a director or maybe even as an audience member I should assume things, but I don’t know. In my opinion, maybe it was a rite of passage, because he could’ve just asked her for the scissors or stopped what he was doing and gotten them himself. But I could also ask why he didn’t say something then. This made me think of the many questions we have for our parents that we never get answers to.

    1. Anxiety Makes Me Feel Like I am Losing My Mind
       Help! Anxiety Makes Me Feel Like I am Losing My Mind
       Anxiety can be so hard to live with. Constant worry and stress keep you in fight-or-flight mode at the slightest trigger. You may try to reason with yourself that the stress triggers are no big deal. Your brain, though, is locked and loaded to take you through the spectrum of anxiety symptoms. You just can’t seem to break the stress cycle. Many who approach a doctor with complaints about their symptoms have truly suffered. They are seeking ways to manage the stress so they can live a normal, happy life. This goal is very possible to reach with the right treatment plan. Anxiety treatment can reduce how often you find yourself asking "am I losing my mind?", ease the daily struggle, and greatly improve your life.
       I Feel Like I’m Losing My Mind
       Anxiety disorder is a broad grouping of mental health disorders, each with excess worry or fear driving it. Anxiety disorders are very common, with 40 million people struggling with one each year. This disorder is different from the common fear you might feel before having to make a public speech. We all have felt afraid from time to time, like when we are pushed out of our comfort zone. Anxiety disorders, though, are very intrusive. Constant stress can be so difficult to manage that it impacts one’s lifestyle, career, health, and friendships.
       What It Feels Like
       When someone suffers from this problem, something will trigger a cascade of symptoms. There are many types of anxiety and each has its own unique features. The basic anxiety symptoms include: feelings of dread and fear; always being on alert for danger; racing heart; shaking; sweating; fast breathing; shortness of breath or holding one’s breath; stomach upset or diarrhea; feeling jumpy or restless; insomnia; headaches.
       Different Types of Anxiety Disorders
       There are varied ways that anxiety is expressed. For this reason, there are six types of mental health disorders. The anxiety spectrum includes:
       Generalized anxiety disorder: GAD features constant worry for much of the day. This can result in headaches, muscle tension, nausea, and trouble thinking.
       Panic disorder: Sudden and unexplained feelings of intense terror. This can cause a racing heart, shortness of breath, nausea, chest pain, dizziness, and a feeling of being out of one’s mind. May lead to social isolation to avoid having an attack.
       Social anxiety: Intense fear of being judged or critiqued. Fear of being embarrassed in public. Causes social isolation.
       Specific phobias: Irrational fear of a certain thing, place, or situation. To manage this fear, the person will go to great measures to avoid triggers.
       Trauma disorder: PTSD is about never getting over trauma, even months later. It can lead to avoidance of people, places, or situations that trigger thoughts of the event. Flashbacks, nightmares, or repeated thoughts of the trauma stoke the symptoms.
       Obsessive-compulsive disorder: OCD involves worries about things like germs, causing harm, or a need for order. This drives compulsive behaviors in an attempt to manage the symptoms of anxiety caused by the fear.
       How to Manage Anxiety
       Do the symptoms of anxiety make you feel like you’re losing your mind? If so, it is time to meet with a mental health worker. At the first meeting, a therapist will assess what type of anxiety you are dealing with. He or she will then design a treatment plan that will help you manage the symptoms. The treatment uses a combined approach with psychotherapy, drugs, and healthy actions that help to reduce stress. Therapy for anxiety is based on the type you have. Cognitive behavioral therapy (CBT) is very helpful for people who struggle with excess worry and fear. It also helps you to notice how your thoughts are driving the panic-type response to a trigger. CBT then guides you toward changing those fear-based thoughts into more positive ones. Once the thoughts are reframed, the actions that follow will also be positive. Anti-anxiety drugs from the benzodiazepine group can be helpful for some people. These drugs work swiftly to help calm nerves and relax you.
       In some cases, antidepressants are used to treat anxiety as well.
       Holistic Therapies That Help Manage Stress
       Holistic self-care activities for stress are now often found in the treatment plan for anxiety. This is because these activities can help improve the treatment outcome. They do this by teaching patients ways to achieve a relaxed state of being. For instance, some of these include: yoga; mindfulness; deep breathing; acupuncture; massage therapy; equine therapy; art therapy.
       Elevation Behavioral Health Provides Expert Treatment for Anxiety
       Elevation Behavioral Health is an upscale residential mental health treatment center in Los Angeles. If anxiety makes you feel like you’re losing your mind, our caring team of experts can help. It is time to seek the treatment you deserve to regain your quality of life. When your outpatient treatment is not giving the results you desire, consider a residential program. Treatment is much more focused, and the home-like setting gives you a chance to heal. Take a break from the stressors or triggers in your daily life. Enjoy our upscale private home and gorgeous setting. Our team will help guide you back to health and wellbeing. For questions about our program, reach out to us today at (888) 561-0868.
       November 22, 2020, by Elevation Behavioral Health

      When Anxiety is too Much I Feel Like I am Losing My Mind

    1. Can a Narcissist Stop Lying Even With Evidence?
       Why Do Narcissists Lie
       Are narcissists compulsive liars? Can a narcissist ever stop lying, even when confronted with evidence of their lies? Learn all about narcissistic personality disorder. If you are involved with a narcissist, then you are quite used to being lied to. Their constant lies simply come with the territory. To a normal person, it may be very perplexing to be lied to all the time by someone who purports to care for you. Learn about what the narcissist seems to gain from telling lies all the time.
       About Narcissistic Personality Disorder
       Narcissistic personality disorder (NPD) is a mental health disorder that stems from an unhealthy and inflated view of self. At least, that’s how it appears on the outside. Inside, though, the NPD really has a very low opinion of him or herself. All of their heinous behaviors are driven by a need to pump themselves up in their own eyes and others’. Individuals with NPD often seek out partners who have certain traits. For instance, the partner may be a compassionate and sensitive person, but may also be needy and have low self-esteem. Like a leech that latches to a blood source, the NPD latches onto its victim. Over time, the NPD slowly chips away at the victim’s sense of self-worth. Through lies and gaslighting, they put them down and cause them to doubt themselves. Through this emotional abuse, they can control the victim.
       But because the NPD has no conscience, they never feel regret or remorse for mistreating their partner. Someone with NPD demands constant admiration and praise while keeping their victim from receiving any. A narcissist does not want any competition. Symptoms of NPD include: lacks empathy or compassion for others; feels entitled to special treatment; expects others to fawn over them; belittles others and talks down to people; takes advantage of others’ weaknesses to build themselves up; self-important and arrogant; may hog the conversation; emotionally detached; believes that others envy him; boastful and pretentious; becomes angry if challenged; torments the victim with fear; has a bad temper with sudden angry outbursts; easily slighted and sensitive to criticism; doesn’t notice the needs of others; emotionally stingy; may isolate their victim from friends; feels insecure inside, with self-loathing; not willing to go to therapy. The NPD will refuse to get help, believing that they are perfect and beyond reproach.
       Why Someone With NPD Lies
       Why do narcissists lie… all the time? If you confront them with proof of the lie, they will still attempt to lie their way out of it. What inspires lying? Simply put, the NPD lies in order to inflate his or her own self-esteem. They lie to the other person, to beat them. By inflating truths, they attempt to make their own skills or abilities seem superior to the other person’s. In other words, they are a bore, the type of person people avoid at a party. When the NPD lies, he or she is trying to appear dominant. They lie for self-gain, believing that telling mistruths makes them look smarter than the other person. Having a victim at their side who they can lie to provides them with a constant narcissistic supply, someone that fuels their sickness. When they impress their partner with their lies, they receive a rush or hit to feel better about themselves.
       Lies Often Turn Into Gaslighting
       For the NPD, the lies are often a prelude to gaslighting. Gaslighting is a psychological weapon used by some to keep a person emotionally off-balance. When they lie to the person’s face about what may have occurred, they cause the victim to question their own sanity. When the victim confronts the NPD with solid evidence of a misdeed, they will be met with lies. Not only will the NPD lie and deny it ever happened, but they are also likely to attack. This is where the gaslighting begins. They will attempt to twist the event around to become the fault of the victim.
       You Are the Narcissistic Supply Source
       There is a reason why the NPD wants to keep their victim around; the victim fulfills a need for them. They fill up their NPD cup daily by sucking the life out of the unsuspecting partner. Thus, the victim is not even aware of the role they play in the illness at first. The NPD will therefore go to great lengths to keep the victim from leaving them. Some tactics they use include:
       They may cry false tears to elicit sympathy, thus keeping the victim engaged.
       They may use force or become violent to assert dominance.
       They may try to manipulate the victim through guilt.
       They may threaten the victim by taking the money away or causing some type of harm.
       They make the victim feel bad about themselves so they won’t think they can do any better.
       They may threaten suicide, although it is an empty threat.
       Breaking Free From an NPD Liar
       If you have woken up to realize you are in a relationship with an NPD, you should run, not walk, to the exits. The sad truth is that these people are rarely able to change their ways, mostly because they don’t want to. In their own minds they feel they never do wrong, so why go to therapy? Partner with a therapist who can offer guidance and support as you detach from the NPD. These people can and do become violent when faced with their N-source leaving them.
       Prepare for the false promises and tears, as they play on your sense of compassion to keep you entrenched in the abuse cycle. So, can a narcissist stop lying, even with evidence of their lies? The answer is very clear: no, they cannot.
       Elevation Behavioral Health Provides Residential Luxury Mental Health Treatment
       Elevation Behavioral Health can help someone who is the victim of a narcissist. Our dedicated team is here to guide you toward wellness and discovering new insights. For questions about our program, please call us today at (888) 561-0868.
       April 27, 2022, by Elevation Behavioral Health

      Are narcissists compulsive liars? Can a narcissist ever stop lying, even when confronted with evidence of their narcissistic lies? Learn all about narcissistic personality disorder.

    1. Author Response

      Reviewer #1 (Public Review):

      Kohler and Murray present high-throughput image-based measurements of how low-copy F plasmids move (segregate) inside E. coli cells. This active segregation ensures that each daughter cell inherits an equal share of the plasmids. Previous work by different labs has shown that faithful F-plasmid segregation (as well as segregation of many other low-copy plasmids, of chromosomes in many bacterial species and of some supramolecular complexes) requires ParA and ParB proteins (or proteins similar to them) and is achieved by an active transport mechanism. ParB is known to bind to the cargo (plasmid), and ParA forms a dimer upon ATP binding that binds to DNA (chromosome) non-specifically and can also bind to ParB (associated with cargo). After ATP hydrolysis (stimulated by the interaction with ParB), the ParA dimer dissociates into monomers and from ParB and the chromosome. While different mechanisms of ParA-dependent active transport have been proposed, two have recently become most popular: one based on the elastic dynamics of the chromatin (Lim et al. eLife 2014, Surovtsev PNAS 2016, Hu et al Biophys.J 2017, Schumacher Dev.Cell 2017) and the other based on a theoretically-derived "chemophoretic" force (Sugawara & Kaneko Biophysics 2011, Walter et al. Phys.Rev.Lett. 2017).

      It is a minor comment, but we would like to point out that we do not consider these two model types as alternatives but rather as models with different levels of coarse-graining. Our interest is in the molecular-level (stochastic) models (Lim et al. eLife 2014, Surovtsev PNAS 2016, Hu et al PNAS 2015, Hu et al Biophys.J 2017, Schumacher Dev.Cell 2017).

      The authors start by following the motion of F plasmid in cells with one or two plasmids and by analyzing the plasmid spatial distribution, plasmid displacement (referred to as velocity) as a function of relative position, and autocorrelations of the position and the displacement. They concluded that these metrics are consistent with 'true positioning' (i.e. average displacement is biased toward the target position: the center for one plasmid and the 1/4 and 3/4 positions for two plasmids) but not with 'approximate positioning' (i.e. when the plasmid moves around the target position, for example, in near-oscillatory fashion). This 'true positioning' can be described as a particle moving on an overdamped spring. They reproduce this behavior by expanding the previous model for the 'DNA-relay' mechanism (Lim et al. eLife 2014, Surovtsev PNAS 2016), in which the plasmid is actively moved by the elastic force from the chromosome and ParA serves to transmit this force from the chromosome to the plasmid. Now, the authors explicitly consider in the model that the chromosome-bound ParA can diffuse (which the authors refer to as 'hopping'), and this allows the model to achieve 'true plasmid positioning' for some combinations of model parameters in addition to the oscillatory dynamics reported in the original paper (Surovtsev PNAS 2016).
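
      The 'overdamped spring' picture above can be illustrated with a minimal Python sketch. It is an illustration only and not the authors' model: the function names, the parameter values (`rate`, `diffusion`, `target`) and the absence of cell boundaries are all assumptions made for this example. It shows that, for such a particle, the displacement per step decreases linearly with the distance from the target position, which is the signature the reviewer describes.

```python
import random


def simulate_overdamped_spring(n_steps=20000, dt=1.0, rate=0.05,
                               diffusion=0.001, target=0.5, x0=0.1, seed=0):
    """Euler-Maruyama simulation of a particle on an overdamped spring.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        drift = -rate * (x - target) * dt          # restoring pull toward target
        noise = rng.gauss(0.0, (2.0 * diffusion * dt) ** 0.5)  # thermal kicks
        xs.append(x + drift + noise)
    return xs


def displacement_slope(xs):
    """Least-squares slope of per-step displacement against position."""
    pos = xs[:-1]
    disp = [b - a for a, b in zip(xs[:-1], xs[1:])]
    mean_pos = sum(pos) / len(pos)
    mean_disp = sum(disp) / len(disp)
    cov = sum((p - mean_pos) * (d - mean_disp) for p, d in zip(pos, disp))
    var = sum((p - mean_pos) ** 2 for p in pos)
    return cov / var


if __name__ == "__main__":
    trajectory = simulate_overdamped_spring()
    print("slope of displacement vs position:", displacement_slope(trajectory))
```

      With these hypothetical parameters the fitted slope should come out close to `-rate * dt`, i.e. a restoring bias toward the target rather than an oscillation around it.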

      Based on their computational model, the authors proposed that two parameters, the diffusion scale of ParA, $\lambda = 2(2D_h/k_d)^{1/2}/L$ (the typical length diffused by ParA before dissociation, relative to the cell length), and the ratio of ParB-dependent and ParB-independent hydrolysis rates, $\epsilon = k_h/k_d$, are the key control parameters defining which qualitative behavior is observed: random diffusion, near-oscillatory behavior, or overdamped spring ('true positioning'). They vary these two parameters over ~30-fold and ~200-fold ranges by changing $D_h$ and $k_h$, respectively, to illustrate how the dynamics of the system change between these 3 modes of motion. While these parameters clearly play an important role, the drawback is that the authors neither gave theoretical reasoning for why these parameters truly govern the behavior nor varied other model parameters ($k_h$, number of ParA $N_{ParA}$, spring constant of the chromosome $k$, diffusion coefficient of the plasmid $D_p$) to show that only these combinations define the type of system behavior. The authors' qualitative analysis of the importance of $\lambda$ relies on the steady-state solution of the diffusion equation for ParA. It is really unfortunate that no ParA distribution was measured simultaneously with the plasmid motion, as this would allow experimental ParA profiles to be compared with the expected quasi-steady-state solutions.

      We spend almost an entire section and a figure explaining the theoretical reasoning behind the identification of the $\lambda=s/(L/2n)$ as an important system parameter (section “Hopping of ParA-ATP on the nucleoid as an explanation of regular positioning” and Figure 2) and predicted that regular positioning could only occur for $\lambda>1$. This was confirmed by parameter sweeps for the cases of 1 (Figure 3I) and multiple plasmids (Figure 5-figure supplement 1), indicating that $\lambda$ is indeed an important system parameter and that our conceptual understanding of this aspect of the system is correct. This point has now been made clearer.

      However, we agree that the reasoning for $\epsilon$ (varied through the hydrolysis rate $k_h$) was not clear. It was chosen to allow us to modulate the ParA concentration at the plasmid compared to elsewhere, motivated by the differences between different ParABS systems. We originally had also considered a third quantity related to the number of nucleoid-bound ParA but we found that this had little effect on the nature of the dynamics. All three quantities describe how the timescale of a reaction/process (ParA hopping/diffusion across the nucleoid, ParB-induced hydrolysis, ParA association to the nucleoid) compares to the timescale of basal hydrolysis, which we use as a reference timescale.
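
      As a rough illustration of the two dimensionless quantities discussed in this exchange, the Python sketch below computes $\lambda = s/(L/2n)$ and $\epsilon = k_h/k_d$. It assumes that the diffusion length is $s = (2D_h/k_d)^{1/2}$, which is inferred from the reviewer's expression for the single-plasmid case and is not stated explicitly in the response. The function names and all numerical values are hypothetical.

```python
import math


def diffusion_length(d_h, k_d):
    """Typical distance s a ParA dimer hops before basal hydrolysis.

    Assumes s = sqrt(2 * D_h / k_d), inferred from the reviewer's formula.
    """
    return math.sqrt(2.0 * d_h / k_d)


def lambda_ratio(d_h, k_d, cell_length, n_plasmids=1):
    """lambda = s / (L / 2n): diffusion length relative to the plasmid spacing scale."""
    return diffusion_length(d_h, k_d) / (cell_length / (2.0 * n_plasmids))


def epsilon_ratio(k_h, k_d):
    """epsilon = k_h / k_d: ParB-induced relative to basal hydrolysis rate."""
    return k_h / k_d


if __name__ == "__main__":
    # Hypothetical values: D_h in um^2/s, rates in 1/s, lengths in um.
    d_h, k_d, k_h = 0.01, 0.01, 1.0
    for length, n in [(2.0, 1), (4.0, 1), (4.0, 2)]:
        lam = lambda_ratio(d_h, k_d, length, n)
        regime = "lambda > 1" if lam > 1 else "lambda <= 1"
        print(f"L={length} um, n={n}: lambda={lam:.3f} ({regime})")
    print("epsilon =", epsilon_ratio(k_h, k_d))
```

      In this sketch, doubling the cell length with a single plasmid pushes $\lambda$ below 1, while a second plasmid restores it, which is one way to read the length and copy-number dependence discussed in the review.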

      We have now made this clearer as well as adding supplementary figures showing the effect of varying other system parameters at several locations in the phase diagram (Figure 3-figure supplement 3 and 4). These sweeps justify our identification of $\epsilon$ and $\lambda$ as a useful/important set of quantities for determining the dynamics of the system.
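
      As a rough illustration of how the two quantities discussed above are formed, the following minimal sketch computes $\lambda$ and $\epsilon$ from the underlying rates. It assumes $s=(2D_h/k_d)^{1/2}$, which is inferred from the reviewer's expression for the single-plasmid case, and treats $L$ as the length of the region being explored and $n$ as the plasmid number. The function names and the numerical values are hypothetical and are not taken from the manuscript or its code repository.

```python
import math


def diffusive_length(d_h, k_d):
    """Typical distance s a nucleoid-bound ParA hops before basal hydrolysis.

    Assumes s = sqrt(2 * D_h / k_d), inferred from the reviewer's expression.
    """
    return math.sqrt(2.0 * d_h / k_d)


def control_parameters(d_h, k_d, k_h, length, n_plasmids=1):
    """Return (lambda, epsilon) for the given rates and geometry.

    lambda  = s / (L / 2n): diffusive length scale relative to the
              half-spacing between n plasmids in a region of length L.
    epsilon = k_h / k_d: ParB-induced hydrolysis rate relative to the
              basal hydrolysis rate.
    """
    s = diffusive_length(d_h, k_d)
    lam = s / (length / (2.0 * n_plasmids))
    eps = k_h / k_d
    return lam, eps


# Hypothetical, illustrative values only (arbitrary consistent units).
lam_one, eps_one = control_parameters(d_h=0.5, k_d=1.0, k_h=10.0, length=2.0)
lam_two, _ = control_parameters(d_h=0.5, k_d=1.0, k_h=10.0, length=2.0,
                                n_plasmids=2)
print(lam_one, eps_one, lam_two)
```

      In this sketch, doubling the plasmid number at fixed length doubles $\lambda$, whereas increasing the length at fixed plasmid number lowers it, which is the direction in which the response above predicts a move away from the regular positioning regime ($\lambda>1$).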

      Additionally, we now add example kymographs showing the ParA distribution (Figure 3-figure supplement 2C).

      The authors also show by simulations that overdamped spring dynamics can transition into oscillatory behavior when $\lambda$ decreases, for example by cell growth. Indeed, they observed more oscillatory behavior when they compared single-plasmid dynamics in longer cells with those in shorter cells. This was not the case in double-plasmid cells, in perfect agreement with their analysis. They also calculated ATP consumption in the model and concluded that the system operates close to but below (perhaps "above" should be used, as it refers to bigger $\lambda$) the threshold to the oscillatory regime, which minimizes ATP consumption. While the ATP consumption analysis is very intriguing, this statement (Abstract Ln24-25) seems at odds with the authors' own analysis that another ParA-dependent plasmid system, pB171, operates mostly in the oscillatory regime, and it is actually for this regime that the authors' analysis suggests minimal ATP consumption (Fig. 8).

      To clarify, we found that pB171 (which in our hands has a copy number of 2-3 in the SR1 reduced-copy-number strain) is only clearly oscillatory in cells with a single plasmid (and only mildly so in cells with two plasmids). Otherwise, it behaves very similarly to F plasmid. We therefore believe that these two distantly related ParABS systems exhibit, overall, similar dynamics and differ only in how close the systems are to the threshold of oscillatory instability. This was not clear as we did not specify the copy number of pB171. We now provide this in Figure 7–figure supplement 1.

      We refer to these systems as lying just below, rather than above, the threshold of the oscillatory instability because, on average, plasmids do not oscillate but only do so in cells with the lowest plasmid concentration.

      I think the real strength of the paper is that it can potentially show that if one considers that the intracellular cargo can be moved by the fluctuating chromosome via ParA-mediated attachments, then various dynamics can be achieved depending on combinations of several control parameters (plasmid diffusion coefficient, ParA diffusion coefficient, rate of hydrolysis and so on), including previously reported 'oscillations' (Surovtsev PNAS 2016), 'local excursions' (Hu et al Biophys.J 2017) and 'true positioning' (Schumacher Dev.Cell 2017). The main drawback (in this reviewer's opinion) is that this is obscured by the current presentation and discussion of this work and previous modelling work on ParA-dependent systems. For example, instead of using the "unifying" potential of the presented model, yet another name, 'relay and hopping', is used in addition to the previously used 'DNA-relay', 'Brownian ratchet', 'Flux-based positioning', …

      In the abstract and discussion, we already refer to developing a “unified” model (p1 L21, p15 L22 of the original manuscript) and in the discussion we explain how our model contains other models as limiting cases. But we agree with this recommendation - the unifying nature of our model is its main strength. We now emphasise this more.

      Regarding the model name, we felt obliged to refer to the previous named models (DNA-relay and Brownian ratchet) and simply gave our model a name to avoid confusion when making comparisons. We have now removed almost all mention of ‘hopping and relay’ and just refer to ‘our model’. However, our gitlab repository with the code must have a name and therefore is still called ‘Hopping and relay’ and so the same term is used in Table 3.

      … and it appears that the presented model is an alternative to these previously published works. Only in the model description (in the Methods section) can one find that the "... model is an extension of the previous DNA-relay model (Surovtsev et al., 2016a) that incorporates hopping and basal hydrolysis of ParA and uses analytic expressions for the fluctuations rather than a second order approximation" (p.17, ln15-17).

      We are sorry that this reviewer felt that the fact that our model is an extension of DNA relay is hidden in the methods. However, we wrote in the main text:

      “Motivated by the previous discussion, we decided to develop our own minimal molecular model (‘hopping and relay’) of ParABS positioning, taking the DNA relay model as a starting point … The original scheme is as follows… We supplemented this scheme with two additional components: diffusion (hopping) of DNA-bound ParA-ATP dimers across the nucleoid (with diffusion coefficient Dh, where the subscript indicates diffusion of the home position) and plasmid-independent ATP hydrolysis and dissociation (with rate kd). See Material and Methods for further details of the model.”

      We now make this clearer.

      However, we would argue that as models of the same system, there are naturally overlaps and the models of Hu et al and Schumacher et al could also be thought of as extensions of the DNA relay model.

      While it is of course the authors right to decide how to name their model, it should be explicitly clear to the reader what is a real conceptual difference between presented and previous models from the abstract, introduction and discussion section of the paper, not from the "fine-print" details in the supplementary materials.

      The main conceptual difference is that we have identified the importance of having a finite diffusive length scale for ParA diffusion/hopping on the nucleoid. This allows both oscillations and regular positioning to occur for biologically relevant parameter values and reproduces the length-dependent transition from mid-cell positioning to confined oscillations that we observe for F plasmid. The DNA relay model does not have this behaviour as the ParA diffusive length scale is zero, while it is infinite in the models of Ietswaart et al 2014 and Schumacher et al 2017. The model of Hu et al 2017 does have a finite length scale but the authors appear not to have realised its importance and never discovered the regular positioning regime at $\lambda>1$. While we make these points in the discussion in the context of Figure 8A, where we compare our model to the others, we agree with this reviewer that we should have been more explicit in the abstract and introduction. We have now corrected this.

      This would avoid unnecessary confusion (especially for readers not directly involved in the modelling of the ParA/B system) and clarify that all these models rely on the elastic behavior of the fluctuating chromosome to drive active transport of the cargo. This reviewer believes that a more explicit discussion of the differences and similarities between the models (the authors' and those previously published) will help with our understanding of how ParA-dependent systems operate. This discussion should also include works on the PomXYZ system, in which it was shown that a similar dynamic system can lead to specific positioning within the cell (Schumacher Dev.Cell 2017, Kober et al. Biophys.J 2019). This will make it explicit that the models' results have direct impact beyond ParA-dependent plasmid segregation.

      To further clarify the differences between the models (beyond the second and third sections of the main text and the discussion), we have now added a section to the methods and a new table (Table 3). We have also included the mentioned PomXYZ model. However, we would like to note that this was not the first stochastic model to have ‘true’ positioning, as this reviewer cites above. Though they did not include the mechanism of force generation, the model of Ietswaart et al 2014 produces regularly positioned plasmids and is referenced repeatedly in Schumacher et al. 2017.

      I think that an expanded parameter analysis and an explicit model comparison/discussion will make the contribution of this work to the field clearer, with the potential to advance our general understanding of how the same underlying mechanism can lead to various modes of intracellular dynamics and patterning depending on the combination of parameters.

      Reviewer #2 (Public Review):

      The work presented in this manuscript details an analysis of the partitioning of low copy plasmids under the control of the ParABS system in bacteria. Using a high throughput imaging set up they were able to track the dynamics of the partition complex of one to a few plasmids over many cell cycles. The work provides an impressive amount of quantitative data for this chemo-mechanical system. Using this data, the paper sought to clarify whether the dynamics of plasmids is due to regular positioning or noisy oscillations around a mean position. They supplement their experimental work with an intuitive model that combines elements of previous modelling efforts. Their model relies on diffusion of the ParA substrate on the nucleoid with the dynamics of the ParB partition complex being driven by the underlying elastic force due to the nucleoid on which the substrate is tethered. Their model dynamics depend on two parameters, the ratio of the length over which the substrate can explore to the characteristic length of the space and the ratio of stimulated to non-stimulated hydrolysis rates of the substrate. If the length ratio is large, ParA can fully explore the space before interacting with the ParB complex leading to balanced fluxes and regular positioning. If it gets reduced, for example by lengthening the cell, oscillations can emerge as fluxes of substrates become imbalanced and a net force can pull the partition complex.

      Strengths:

      Given the large amount of data, the observations unambiguously show that one particular ParABS system under the conditions studied is carrying out regular positioning of plasmids. The model synthesizes prior work into a nice intuitive picture. These model parameters can be fit to the data leading to estimates of molecular kinetic parameters that are reasonable and in line with other observations. Lining up the experimental observations with the phase space of the model suggests that the system is poised on the edge of oscillations, allowing for the system to have regular positioning with low resource consumption.

      Weaknesses:

      However, despite the correspondence of the simulated results with the experimental findings, other explanations are not completely ruled out. The paper emphasizes that ParA diffusion/hopping on the nucleoid is essential for the establishment of regular positioning and that without it, only oscillations were possible. Prior simulation efforts, that the paper cites, which include ParA diffusion and mixing in the cytosol but no diffusion on the nucleoid have shown that regular positioning is possible and that oscillations could get triggered as the system lengthened. Thus ParA hopping is not a necessity for regular positioning (as claimed in the paper), but very well might be needed for the given kinetic parameters of the system studied here.

      We now comment on this result. In short, we believe that the mentioned model/regime is not relevant due to stochastic effects. We are not able to produce, with biologically relevant parameters, regular positioning without ParA hopping.

      The paper also presents experimental results for a second ParABS system (pB171) that is more likely to show oscillations. They attribute the greater likelihood of oscillations for pB171 to ParA exploring a smaller space than in the F plasmid system that showed regular positioning. This is pure conjecture and the paper does not provide any evidence that this is the reason. Thus it is hard to conclude whether the oscillations may be due to other factors.

      We do not explicitly make that claim. We did have a point in the phase diagram of Figure 8A representing pB171 with a lower value of lambda than F plasmid and stated “The location of pB171 is an estimate based on a qualitative comparison of its dynamics”. We agree this was unclear.

      We now indicate the region that has oscillations with roughly the same period as single plasmids of pB171. We also make it clear that we speculate, but have not shown, that the length scale of ParA hopping is smaller than for F plasmid.

      An important point here is that we can explain both oscillations and regular positioning in the same model with the same kinetic parameters, the regimes being determined by the cell length and plasmid number in a manner consistent with experimental observations.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors use the nanobody tools generated in the companion manuscript and have combined them with DNA-Paint oligonucleotide labeling to generate super-resolution images of indirect flight muscles. Using this approach, they could map the precise organization of the different domains from the two giant titin-like fly homologs called Sallimus and Projectin, against which the nanobodies had been raised, with a precision ranging from 1 nm to 4 nm, depending on the distance between them. They show that in indirect flight muscles the N-ter of Sallimus is located within 50 nm of the Z-disc, and that its C-ter reaches the A-band roughly 100 nm away from the Z-disc. Likewise, they show that the N-ter of Projectin colocalizes with the C-ter of Sallimus at the edge of the A-band, whereas its C-ter is located about 250 nm away in the A-band and 350 nm from the Z-disc. Overall, this suggests a staggered and linear organization of both proteins with a potential area of overlap spanning 10-12 nm, that Sallimus could bridge the Z-disc to the A-band, acting as a ruler, and that Projectin should only overlap with 15% of the A-band and possibly 10 nm of the I-band.

      Thanks for this nice summary of our findings.

      The value of this work comes from its use of advanced technologies (DNA-Paint + superresolution). The biological conclusions confirm and refine earlier and recent papers, especially EM papers and the impressive and very comprehensive JCB paper by Szikora et al in 2020, although the conclusions of the present work differ somewhat from those of Szikora who had predicted that Sallimus does not reach the A-band. That aspect could have been better discussed.

      We have further extended our discussions of the results from Szikora et al. 2020, in particular regarding Sallimus in this revised version.

      Reviewer #2 (Public Review):

      Taking advantage of the high molecular order of the Drosophila flight muscle, Schueder, Mangeol et al. leverage small (<4 nm) original nanobodies, tailored coupling to fluorophores, and DNA-PAINT resolution capabilities, to map the nanoarchitecture of two titin homologs, Sallimus and Projectin.

      Using a toolkit of nanobodies designed to bind to specific domains of the two proteins (described in the companion article "A nanobody toolbox to investigate localisation and dynamics of Drosophila titins"), Schueder, Mangeol et al position these domains within the sarcomere with <5 nm resolution, and demonstrate that the C-ter of Sallimus overlaps with the N-ter of Projectin at the A-band/I-band interface. They propose this architecture may help to anchor Sallimus to the muscle, thus supporting flight muscle function while ensuring muscle integrity.

      This study nicely extends previous work by Szikora et al, and precisely dissects the sarcomeric geography of Sallimus and Projectin. From these results, the authors formulate specific functional hypotheses regarding the organization of flight muscles and how these are tuned to the mechanical constraints they undergo.

      Although they remain descriptive in essence, the conclusions of the paper are well supported by the experimental results.

      We thank this reviewer for the nice summary of our results.

      Reviewer #3 (Public Review):

      This manuscript by Schueder et al. provides new insight into an important question in muscle biology: how can the smaller titin-like molecules of the much larger sarcomeres of invertebrate muscle perform the same function as the larger titin of vertebrate muscles which have smaller sarcomeres? These functions include the assembly, stability and elasticity of the sarcomere. Using two state-of-the-art methods, nanobodies and DNA-PAINT superresolution microscopy, the authors definitively show that in the highly ordered indirect flight muscle of Drosophila, the elongated proteins Sallimus and Projectin are arranged such that the N-terminus of Sallimus is embedded in the Z-disk, and the C-terminus is embedded in the outer portion of the A-band, and that in this outer portion of the A-band is also embedded the N-terminus of Projectin; thus, if the C-terminus of Sallimus can bind to thick filaments, and/or these overlapping portions of Sallimus and Projectin interact, there would be a linkage of the Z-disk and/or thin filament to the thick filaments to help determine the length and stability of the sarcomere.

      The strengths of this paper include the implementation of nanobody and DNA-PAINT superresolution microscopy for the first time for muscle. The extraordinary 5-10 nm resolution of this method allows imaging for definitive localization of the termini of these elongated proteins in the Drosophila flight muscle sarcomere. In addition, the manuscript is well written with sufficient background information and rationale presented, is easy to read, complex new methods are well-described, the figures are of high quality, and the conclusions are well-justified. A minor weakness is that despite the authors demonstrating that the C-terminus of Sallimus is located at the outer edge of the A-band, and that the N-terminus of Projectin is located also in the outer edge of the A-band, the authors provide no data to show whether, for example, these portions of these titin-like molecules interact, or whether Sallimus might interact with thick filaments. Such data would be required to prove their model. However, I can understand that this would require extensive additional study, and the authors have already provided a tremendous amount of data for this first step in supporting the model. Nevertheless, the authors should cite a relevant previous study on the Sallimus homolog in C. elegans called TTN-1, which is also a 2 MDa polypeptide of similar domain organization to at least the large isoforms of Sallimus found in fly synchronous muscles. In the study by Forbes et al. (2010), immunostaining, albeit not to the impressive resolution achieved in the present paper, showed that TTN-1 was also localized to the I-band with extension into the outer edge of the A-band. More importantly, that study also showed that "fragment 11/12", Ig38-40, which is located fairly close to the C-terminus of TTN-1, binds to myosin with nanomolar affinity (Kd = 1.5 nM), making plausible the idea that TTN-1 may bind to the thick filament in vivo.

      We thank this reviewer for sharing his enthusiasm about our results and methodology, and also about the way the data are presented. This is one more argument for us to leave a shortened Figure 1 in the PAINT manuscript.

      We are particularly thankful to the reviewer for pointing out the important C. elegans data that we had missed and that, as the reviewer said, perfectly fit with the model we propose for flight muscle (and also the larval muscle data, as the C-term of Sls is the same). Hence, we now highlight this paper in our discussion and compare it to our findings.

      Reviewer #4 (Public Review):

      This manuscript reports combining nanobodies against Sallimus and Projectin, recently developed and described in the accompanying paper, with DNA-Paint technology that allows super-resolution imaging. The presented data prove that such a combination provides a powerful system for imaging, at the nanoscale, large and protein-dense structures such as Drosophila flight muscle. The main outcome is the observation that in flight muscle sarcomeres Sallimus and Projectin overlap at the I/A band border. This was elegantly achieved using double color DNA-Paint with Sls and Projectin nanobodies.

      We thank the reviewer for appreciating the quality of our work.

      Overall, as it stands, this manuscript, even if of high technological value, remains entirely descriptive and falls short of providing new insights into muscle structure and architecture. The main finding, an overlap between the short Sls isoform and Proj in flight muscle sarcomeres, is redundant with the authors' observation (described in the companion paper "A nanobody toolbox to investigate localisation and dynamics of Drosophila titins") that in larval muscles expressing a long Sls isoform, Sls and Proj overlap as well.

      Alternatively, combination of Sls and Proj nanobodies with DNA-Paint represents an interesting example of technological development that could strengthen the accompanying nanobodies toolkit manuscript.

      Every structural paper reports a structure and is thus by definition descriptive. This is the aim of our manuscript. We do not think that the other nanobody resource paper reports an overlap of Sls and Projectin in the larvae. To resolve such a possible overlap, super resolution would be needed. The other paper does report that the larval Sls isoform is dramatically stretched, to more than 2 µm, and that Projectin is decorating the thick filament, likely in an oriented manner. Whether the N-term of Projectin overlaps with the C-term of Sallimus in this muscle is an open question that needs DNA-PAINT imaging of larval muscle. This requires a TIRF setting that is technically not trivial to achieve for larval muscle and hence has not been done by anybody.

    1. The end of Twitter

      Ben Werdmüller sees the Musk take-over as one of more signs that Twitter as we know it is sunsetting. Like FB it is losing its role as the all-in-one communal 'space'. I think the decline is real, but also think it will be long drawn out decline. Early adopters and early main stream may well jump ship, if they haven't already some time ago. The rest, including companies, will hang around much longer, if only for the sunk costs (socially and capital). An alternative (hopefully a multitude as Ben suggests) needs to clearly present itself, but hasn't in a way the mainstream recognises I think. It may well hurt to hold on for many, but if there's no other thing to latch onto people will endure the pain. Boiling frog and all that.

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Himmel et al is an interesting study representing a topic of substantial interest to the somatosensory neurobiology community. Here, the authors use CIII peripheral neurons to investigate polymodality of sensory neurons. From vertebrates to invertebrates, this is a long-standing question in the field: how is it that the same class of sensory neurons that express receptors for myriad sensory modalities encode different behavioral responses. This system in Drosophila seems to be an intriguing system to study this question, making use of the genetic toolkit in the fly and ease of behavioral assays. In this study, the authors identify a number of channels that are important for cold nociception, and they showed that some of these do not appear to also encode mechanosensation. Despite my initial enthusiasm for this paper, halfway through, it felt as if I were reading two different papers that were loosely tied together. This lack of cohesion significantly reduced my enthusiasm for this work. Below are some of my criticisms:

      We thank Reviewer #1 for their feedback. In addition to the points below, and in accordance with the reviewer’s overall criticisms, we have revised the body text to make it more cohesive. Our main goal with this revision was to better explain to the reader the shift from anoctamins to SLC12 cotransporters.

      1) The first half of the paper is about a role for Anoctamins in cold nociception, but the second half switched somewhat abruptly to ncc69 and kcc. I assumed the authors would connect these genes in a genetic pathway, performing some kind of epistatic genetic interaction studies or even biochemical assays, and that this was the reason to switch the focus of the paper midway through. But this was not the case. Moreover, they performed a different constellation of experiments for the genes in the first half vs the second half of the paper (eg. Showed a role in cold nociception vs mechanosensation or showing phenotype from overexpression). This lack of cohesion made it difficult to follow the work.

      We have edited the text to better explain this shift. Two notable changes are: (1) moving the phylogenetics to Figure 1, to more immediately present and demonstrate that subdued is part of the ANO1/ANO2 family of calcium-activated chloride channels; and (2) a new cartoon schematic in Figure 6 to more strongly communicate to a reader that chloride is a hypothetical mechanism of cold discrimination.

      In short, previous work and our phylogenetic analyses indicate that subdued is a Cl- channel (we have moved the phylogeny earlier in the paper to make this clear from the onset). We were therefore surprised that knockdown/mutation resulted in reduced CT behavior, as neural Cl- currents are often inhibitory. Thus, we looked to known mechanisms of Cl- homeostasis to try to formulate an informed hypothesis about the function of anoctamins in this system; hence the shift in focus to SLC12.

      In response to the second half of the comment: We have in fact performed cold nociception and mechanosensation experiments for both the anoctamins and the SLC12 cotransporters, although the SLC12 mechanosensation results were in a supplemental figure. We have moved the mechanosensation results to the main Figure 6 to make this clearer. With respect to simple overexpression, the goal of the anoctamin experiments was to test the necessity of anoctamins to cold-evoked behavior, whereas the goal of the SLC12 experiments was to differentially modulate Cl- homeostasis, and this could hypothetically be accomplished by both knockdown and overexpression (hence we performed both knockdown and overexpression).

      2) In Fig1B,C how does one confirm a CIII neuron is being analyzed? It might help the reader if there were at least some zoomed out photos where all the cell types are labeled and potentially compared to a schematic. Moreover, is there a CIII specific marker to use to co-stain for confirmation of neuron type?

      Our CIII fusion is a specific marker for CIII neurons. To better demonstrate this, we have added images of the new CIII fusion expression patterns overlapping with a previously described CIII GAL4 driver (i.e. nompC-GAL4), and provided text describing how the CIII fusion transgene was discovered and generated. Please see the new Figure 1-Figure supplement 1.

      3) As this paper is predicated on detecting differences by behavioral phenotype, the scoring analysis is not as robust as it could be, especially considering the wealth of tools in Drosophila for mapping behaviors. The "CT" phenotype is begging for a richer behavioral quantification. This critique becomes relevant here when considering the optogenetic induced CT behavior in Fig5. If the authors were to use unbiased quantitative metrics to measure behavior, they could show how similar the opto behavior is to the natural cold evoked behavior. Perhaps the two are not the same, although loosely fitting under the umbrella of "CT".

      In accordance with our response above to necessary revisions, we have added one additional metric and reorganized the figures to better demonstrate the complexity of the behavior. We have no further data or new tools at this time.

      To improve our optogenetic analyses, we have added data for Channelrhodopsin-dependent CIII activation, which has been previously shown to induce cold-like behaviors at high levels of activation and innocuous touch-like behaviors at low levels of activation (Turner, Armengol et al 2016). Further, we have added videos (Figure 5—videos 1-3) showing behavior in response to both Channelrhodopsin and Aurora activation.

      With respect to differences in behavior, we have pointed out some differences in the Aurora-evoked behavior from the cold-evoked behavior: chloride optogenetics induces innocuous touch-like behaviors following CT. Please see lines 296-299.

      4) Following on from the last comment, the touch assays in Fig3 have a different measurement system from the other figures. Perhaps touch deficits would be identified with richer behavioral quantification. Moreover, do these RNAi larvae show any responses to noxious mechanical stimulation?

      The touch assays necessarily have different metrics from cold assays, as the touch-evoked behaviors are quite different from cold-evoked change in length (which are relatively simple, prima facie).

      With respect to noxious mechanical stimulation, while Class III neurons have been shown to facilitate this modality and be connected to relevant circuitry (please see Hu et al 2017 https://doi.org/10.1038/nn.4580 and Takagi et al 2017 https://doi.org/10.1016/j.neuron.2017.10.030), Class IV neurons are the primary sensory neurons that initiate the noxious mechanical-induced rolling response. Although this is an interesting question, we believe it is outside the scope of this study.

      Reviewer #2 (Public Review):

      Himmel and colleagues study how individual sensory neurons can be tuned to detect noxious vs. gentle touch stimuli. Functional studies of Drosophila class III dendritic arborization neurons characterized roles in gentle touch and identified a receptor, NompC, and other factors that mediate these responses. Subsequent work primarily from the authors of the current study focused on roles for the same sensory neurons in cold nociception. The two proposed sensory inputs lead to quite distinct sets of behaviors, with touch leading to halting, head turning and reverse peristalsis, and noxious cold leading to whole body contraction. How activity of one type of sensory neuron could lead to such different responses remains an outstanding question, both at the levels of reception and circuitry.

      The cIII responses to noxious cold and innocuous touch raise questions that the authors address here, proposing that studies of this system could advance the understanding of chronic neuropathic pain. A candidate approach inspired by studies in vertebrate nociceptors led the authors to study the anoctamin/TMEM16 channels subdued and CG15270, termed wwk by the authors. The authors focus on a pathway for gentle touch vs. cold nociception discrimination through anoctamins. Several of the experiments in this manuscript are well done; in particular, the electrophysiological recordings provide a substantial advance. However, the genetic and expression analysis has several gaps and should be strengthened. The data also do not provide strong support for some key aspects of the proposed model, namely the importance of relative levels of Cl co-transporters.

      Major comments:

      1) Knockout studies are accomplished using two MiMIC insertions whose effects on subdued or CG15270/wwk are not characterized by the authors. This needs to be established. The MiMIC system is also not well explained in the text for readers.

      We have modified the text to better explain MiMICs (Lines 137-140) and we have verified the mutagenic effects of these MiMIC insertions via RT-PCR (Figure 2 – supplement 1). We believe these data, in conjunction with other converging lines of evidence (e.g. rescue) demonstrate necessity of these genes in cold nociception.

      2) Subdued expression is inferred by a Gal4 enhancer trap. This can be a hazardous way of determining expression patterns given the uncertain relevance of the local enhancers driving the expression. According to microarray analysis subdued is strongly expressed in cIII neurons, but c240-Gal4 is barely present compared to nearby neurons, raising questions about whether this line reflects the expression pattern, including levels, even though the authors suggest that the line is previously validated (line 95; it is unclear what previously validated means). Figure 1B should not be labeled "subdued > GFP" since it is not clear that this is the case. Another more direct method of assessing expression in cIII is necessary. Confidence is higher for wwk using a T2A-Gal4 line, however, Figure 1C might be misleading to readers and indicate that wwk-T2A-Gal4 is cIII specific whereas in supplemental data the authors show how it is much more broadly expressed. The expression pattern in the supplemental figures should be moved to the main figures.

      We have removed the phrase “previously validated” and we have modified Figure 1 to change how we refer to the GFP expression (removed “subdued > GFP”).

      In accordance with the response to necessary revisions above, we make use of several converging lines of evidence to infer expression, including GAL4 expression patterns, microarray, and qPCR (the two latter experiments from isolated CIII samples). That subdued and wwk are expressed in CIII is clearly the most parsimonious hypothesis.

      We have also carefully reviewed our body text to be certain we do not make claims of differential expression between different neural subtypes based on differences in fluorescence in the GAL4-driven GFP imaging. We do not believe that this would be a reasonable way to infer differences in expression levels in any instance.

      With respect to the design of Figure 1, the intent is not to mislead the reader, and we state in the text that wwk is not solely expressed in CIII (lines 120-125). As eLife makes supplemental figures available directly alongside the main figures, we have left the relevant supplemental figures as supplements – we simply think this makes more sense from a standpoint of readability and style.

      3) In figure 8 the authors propose a model in which the relative levels of K-Cl cotransporters Kcc (outward) and Ncc69 (inward) in cIII neurons determine high intracellular Cl- levels and a Cl- dependent depolarizing current in cIII neurons. They test this model using overexpression and loss of function data, but the results do not support their model, since most of the overexpression and LOF manipulations of kcc and ncc69 do not significantly affect cold nociception, the exception being ncc69 RNAi. The authors suggest that this could be due to Cl homeostasis regulated by other cotransporters. Nonetheless, it leaves a significant unexplained gap in the model that needs to be addressed.

      We respectfully disagree that our results are not consistent with the stated hypothesis. In fact, it is the lack of change under certain conditions which lends evidence against the alternative hypothesis that CIII neurons maintain relatively low intracellular Cl-. The hypothesis we are testing is that ncc69 expression is driving relatively high intracellular Cl- concentrations, thus resulting in depolarizing Cl- currents.

      Under this hypothesis, we would predict that knockdown of ncc69 and overexpression of kcc would reduce cold sensitivity at 5˚C. That knockdown of ncc69 and overexpression of kcc reduce cold sensitivity is consistent with this hypothesis (and we point out in text that the evidence for kcc is less convincing) – at the least, these results do not disprove it.

      Under this hypothesis, we would also predict that knockdown of kcc and overexpression of ncc69 would not result in reduced cold sensitivity at 5˚C. As there was no phenotype at 5˚C, our results are likewise consistent with the hypothesis (at the least, they do not disprove it).

      We did find it curious that ncc69 RNAi did not affect neural activity at 10˚C, but speculate that our inability to detect physiological effects of ncc69 knockdown reflects limitations of our electrophysiology methodology (and we discuss this in the manuscript).

      The only piece of data inconsistent with the hypothesis may be that kcc overexpression may not have affected cold nociception at 5˚C – the data aren’t overwhelmingly convincing. However, this is only one experiment among many, and we believe the preponderance of evidence is consistent with the hypothesis. That is not to say we believe this hypothesis has complete explanatory power, however, as noted by our discussion of both the ncc69 electrophysiological and kcc behavioral data, and by our suggestion that there may be other regulatory mechanisms at work. This latter suggestion is wholly speculative, and we believe appropriate for the discussion section. We agree (and state in the discussion) that this would require further experimentation.

      4) Related to the #3, the authors should verify the microarray data that form the basis for their differential expression model.

      We have performed qPCR for ncc69 and kcc. Although qPCR is semiquantitative when comparing between genes, the Ct value for ncc69 was lower than for kcc, indicating more transcripts were present at the onset (assuming identical amplification efficiency). These data (although semi-quantitative), the microarray, and our behavioral and electrophysiological data are consistent with the stated hypothesis.
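
      The Ct argument above can be made concrete with a minimal sketch. Assuming both primer pairs amplify with the same per-cycle efficiency E (E = 2 for perfect doubling), the ratio of starting template is E raised to the Ct difference. The helper name `fold_difference` and the Ct values are hypothetical placeholders for illustration, not measurements from this study.

```python
def fold_difference(ct_a: float, ct_b: float, efficiency: float = 2.0) -> float:
    """Estimate the starting-template ratio (gene A / gene B) from Ct values.

    Assumes both amplicons share the same per-cycle amplification
    efficiency; a lower Ct for gene A yields a ratio above 1.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must exceed 1 (2.0 means perfect doubling)")
    return efficiency ** (ct_b - ct_a)


# Hypothetical Ct values, for illustration only (not data from the study).
ct_ncc69 = 24.0
ct_kcc = 26.0
ratio = fold_difference(ct_ncc69, ct_kcc)
print(f"ncc69 / kcc starting-template ratio: {ratio:.1f}")
```

      If the two primer pairs differ in efficiency, this simple ratio no longer holds, which is why such between-gene comparisons are best treated as semi-quantitative.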

      Reviewer #3 (Public Review):

      There are also several modest weaknesses in the paper:

      1) A notable gap remains in the evidence for the hypothesized mechanisms that enhance electrical activity during cold stimulation and the proposed role of anoctamins (Fig. 8) - the lack of evidence for Ca2+-dependent activation of Cl- current. The recording methods used in the fillet preparation should enable direct tests of this important part of the model.

      We have performed an additional experiment at the reviewer’s suggestion. Please see above (in essential revisions) and below (in recommendations for authors).

      2) The behavioral and electrophysiological consequences of knocking down either of the two anoctamins are incomplete (Fig.2), raising the significant question of whether combined knock-down of both anoctamins in the CIII neurons would largely eliminate the cold-specific responses.

      While the results of this experiment would certainly be interesting, we are unsure of how it would be acutely informative in this context and are not convinced that any possible outcomes would disprove any particular hypothesis. In part, this is because we know that blocking synaptic transmission in CIII neurons (via tetanus toxin) does not completely ablate cold-evoked behavior (Turner & Armengol et al 2016 https://doi.org/10.1016/j.cub.2016.09.038). This is also the case for combinatorial mutation of other genes associated with cold nociception (please see Turner & Armengol et al 2016; and more recently, Patel et al 2022 https://doi.org/10.3389/fnmol.2022.942548). Further, the husbandry required to generate the double knockdowns would be quite challenging and might result in GAL4 titration (hypothetically less strongly knocking down each gene). For these reasons, we have not performed this suggested experiment.

      3) Blind procedures were not used to minimize unconscious bias in the analyses of video-recorded behavior, although some of the analyses were partially automated.

      This is correct and a relative weakness of the study. We note it in our methods section. The use of semi-automated data analyses of the behavioral videos is designed to minimize experimenter-specific variability.

      4) The term "hypersensitization" is confusing. Pain physiologists typically use "sensitization" when behavioral or neural responses are increased from normal. In the case of increased neuronal sensitivity, if the mechanism involves an increase in responsiveness to depolarizing inputs or an increased probability of spontaneous discharge, the term "hyperexcitability" is appropriate. Hypersensitization connotes an extreme sensitization state compared to a known normal sensitization state (which already signifies increased sensitivity). In contrast, the effects of ncc69 overexpression in this manuscript are best described simply as sensitization (increased reflexive and neuronal sensitivity to cooling) and hyperexcitability (expressed as increased spontaneous activity at room temperature).

      We have modified the text in accordance with the reviewer’s suggestions (see recommendations for authors section). We have also changed the title of the paper to “Chloride-dependent mechanisms of multimodal sensory discrimination and nociceptive sensitization in Drosophila”

    1. It bears mention that Vannevar’s influential essay “As We May Think” in the July 1945 issue of The Atlantic is entirely underpinned by the commonplace book and zettelkasten traditions pervading Western thought and culture. Rather than acknowledge this tradition tacitly, he creates the neologism “Memex” which stands in for a networked and connected zettelkasten

      This is an interesting observation, not least because Memex went on to inspire e.g. Doug Engelbart. Was Engelbart aware of the history when he demo'd outlining and notes? Was Nelson when he thought up stretchtext in '67?

    1. Reviewer #3 (Public Review):

      This paper aimed to understand how toxin-antidote (TA) elements are spread and maintained in species, especially in species where outcrossing is infrequent and the selfish gene drive of TA elements is limited. The paper focuses on the possible fitness costs and benefits of the peel-1/zeel-1 element in the nematode C. elegans. A combination of mathematical modeling and experimental tests of fitness are presented. The authors make a surprising finding: the toxin gene peel-1 provides a fitness advantage to the host. This is a very interesting finding that challenges how we think about selfish genetic elements, demonstrating that they may not be wholly "selfish" in order to spread in a population.

      Strengths

      1. The authors support results found with a zeel-1 peel-1 introgressed strain by using CRISPR/Cas9 genetic engineering to precisely knock out the genes of interest. They were careful to ensure the loss-of-function of these generated alleles by using genetic crosses.

      2. Similarly, the authors are careful with controls, ensuring that genetic markers used in the fitness assays did not affect the fitness of the strain. This ensures that the genes of interest are causative for any source of fitness differences between strains, therefore making the data reliable and easily interpretable.

      3. A powerful assay for directly measuring the relative fitness of two strains is used.

      4. The authors support relative fitness data with direct measurements of fitness proximal traits such as body size (a proxy for growth rate) and fecundity, providing further support for the conclusion that peel-1 increases fitness.

      Weaknesses

      1. One major conclusion is that peel-1 increases fitness independent of zeel-1, but this claim is not well supported by the data. The data presented show that the presence of zeel-1 does not provide a fitness benefit to a peel-1(null) worm. But the experiment does not test whether zeel-1 is required for the increased fitness conferred by the presence of peel-1. Ideally, one would test whether a zeel-1(null);peel-1(+) strain is as fit as a zeel-1(+);peel-1(+) strain, but this experiment may be infeasible since a zeel-1(null);peel-1(+) strain is inviable.

      2. The CRISPR-generated peel-1 allele in the N2 background only accounts for 32% of the fitness difference of the introgressed strain. Thus, the effect of peel-1 alone on fitness appears to be rather small. Additionally, this effect of peel-1 shows only weak statistical significance (and see point 5 below). Given that this is the key experiment in the paper, the major conclusion of the paper that the presence of peel-1 provides a fitness benefit is supported only weakly. For example, it is possible that other mutations caused by off-target effects of CRISPR in this strain may contribute to its decreased fitness. It would be valuable to point out the caveats to this conclusion, or back it up more strongly with additional experiments such as rescuing the peel-1(null) fitness defect with a wild-type peel-1 allele or determining if the introduction of wild-type peel-1 into the introgressed strain is sufficient to confer a fitness benefit.

      3. The strain that introgresses the zeel-1 peel-1 region from CB4856 into the N2 background was made by a different lab. Given that N2 strains from different labs can vary considerably, it is unclear whether this introgressed strain is indeed isogenic to the N2 strain it is competing against, or whether other background mutations outside the introgressed region may contribute to the observed fitness differences.

      4. Though the CRISPR-generated null allele of peel-1 only accounts for 32% of the fitness difference of the zeel-1 peel-1 introgressed strain, these two strains have very similar fecundity and growth rates. Thus, it is unclear why this mutant does not more fully account for the fitness differences.

      5. Improper statistical tests are used. All comparisons use a t-test, but this test is inappropriate when multiple comparisons are made. Importantly, correction for multiple comparisons may decrease the already weak statistical significance of the fitness costs of the peel-1 CRISPR allele (Fig 3E), which is the key result in the paper.

      6. N2 fecundity and growth rate measurements from Fig 2B&C are reused in Fig 3C&D. This should be explicitly stated. It should also be stated whether all three strains (N2, the zeel-1 peel-1 introgressed strain, and the peel-1 CRISPR mutant) were assayed in parallel as they should be. If so, a statistical test that corrects for multiple comparisons should also be used.

      7. It appears that the same data for the controls for the fitness experiments (i.e. N2 vs. marker & N2 vs. introgressed npr-1; glb-5) may be reused in Fig 2A and 3E. If so, this should be stated. It should also be stated whether all the experiments in these panels were performed in parallel. If so, this may affect the statistical significance when correcting for multiple comparisons.
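
      The concern about uncorrected t-tests in points 5 to 7 can be illustrated with a minimal sketch of one common correction, the Holm-Bonferroni step-down procedure. This is only one of several reasonable options (an ANOVA with post hoc tests would be another) and is not necessarily the method the authors should adopt. The p-values below are hypothetical placeholders, not values from the paper, and `holm_adjust` is an illustrative helper name.

```python
def holm_adjust(p_values: list[float]) -> list[float]:
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons (illustration only).
raw = [0.04, 0.01, 0.03]
for p, q in zip(raw, holm_adjust(raw)):
    print(f"raw p = {p:.3f} -> Holm-adjusted p = {q:.3f}")
```

      In this made-up example, the raw value of 0.04 no longer falls below 0.05 after adjustment, which is the kind of shift the comments above anticipate for weakly significant results.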

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): The authors further develop a previously identified lead compound for blocking malaria transmission from humans to mosquitoes and identify a protein target of the chemical. The protein target, Pfs16, has long been known to be upregulated in gametocytes and has been speculated to be a target for small molecules. The work is well (if at times maybe too well/too detailed) described and potential shortfalls are highlighted.

      My major comment is that without a deletion mutation of Pfs16, the paper will remain somewhat preliminary. I would strongly encourage the authors to generate such a mutant and compare it to the parasites treated with their drug candidate. I feel the text can be much shortened and a lot of information moved to the materials and methods. The conclusions should be toned down on several occasions (abstract, introduction, discussion). Avoid adjectives, e.g. what is a 'powerful starting point' (abstract) or 'compelling interdisciplinary evidence' but hot air?

      We thank the reviewer for this comment. However, we would like to reiterate (as stated in the manuscript) that knockout of Pfs16 in P. falciparum is transmission lethal, i.e. you do not get progression of male gametogenesis. Thus, whilst re-generation of a Pfs16 KO would be interesting in terms of comparing phenotypically with the drug treated parasites, we are not convinced it would add any further evidence of support for or against our conclusion in terms of the ability of the N-4HCS scaffold to target this protein. E.g. we could drug treat a Pfs16 KO but this would not be expected to show gametogenesis irrespective of treatment. Therefore, whilst of academic interest, we believe it is satisfactory to judge our phenotypic work based on published accounts of the Pfs16 KO without having to engage in the costly experiments to regenerate the parasite and work on it side-by-side, especially given the limited resolution it would give towards the overall goal of the work in terms of defining the effect and likely target of this drug class on parasites.

      Addressing the second comment, we are happy to alter areas of the paper that may have over-stated the conclusions of the work including the abstract/introduction and discussion.

      CROSS-CONSULTATION COMMENTS I think these three reviews are pretty much in line in their overall assessment. I am happy for them to be sent as is to the authors, as this will help them shape a much better paper.

      Reviewer #1 (Significance (Required)):

      The paper shows that, very likely, a new chemical with some potential for inhibiting transmission of malaria parasites to mosquitoes binds to a Plasmodium protein that is specifically expressed in the sexual stages of the parasite.

      The paper compares to good papers published in journals like ACS Infectious Diseases or Antimicrobial Agents and Chemotherapy, but I am not sure which of the Review Commons sister journals it would fit to. I am a molecular parasitologist.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Transmission blocking drugs are of high interest as a strategy to combat malaria but they are difficult to study. For instance it is problematic to raise resistant parasites to find the mode of action of transmission blocking drugs and to identify their targets in the cell. In this manuscript Yahiya et al. build on previous work which identified the N-4HCS scaffold, of which DDD01035881 is the lead compound, as an inhibitor of P. falciparum male gametocytes. Using PAL to enrich for target proteins, Pfs16 was identified and validated as a possible target of DDD01035881. Binding was validated through CETSA. The phenotype following DDD01035881 treatment was found to partially match the previously published Pfs16 KO phenotype. However, curiously, no impact was seen in gametocytogenesis despite published evidence of Pfs16 being involved in sexual conversion. The authors speculate as to reasons but a direct experimental comparison with Pfs16 mutant parasites (which likely would have been revealing) is not provided. On the positive side, this analysis of the stage-specific effect of the drug pinpoints the stage inhibited during microgamete development, which is a very interesting part of the manuscript.

      We thank the reviewer for this positive assessment of our work. Mirroring comments above, our challenge with Pfs16 knockout or mutation is that if we ablate Pfs16 function we cannot assess the effect of drug action. Definition of a mutant that would demonstrate precisely the drug mode of action would require structural resolution of drug bound to target (i.e. to identify which residues to target) – this is a major goal for our research group moving forwards, but likely many years’ work. In general, our core approach here has been one of chemo-proteomic based methods and phenotypic investigation of the novel antimalarial. Further evidence might be forthcoming from molecular genetics/structural biology, but we believe these are beyond the scope of the current work (and our available resources at present). We state future directions in the discussion and can add more to this in any revised manuscript.

      This work deepens the understanding of a novel class of transmission blocking drugs with reasonable potency (foremost (-)-DDD01028076, which has low nanomolar activity, the modified versions considerably less). Question on how to achieve serum concentrations for sufficient potency aside, these compounds will in the very least provide experimental tools to study their mode of action and might reveal interesting biology. This work is therefore of interest to the malaria field.

      The experimental methodology seems excellent but some of the results raise questions that make definite conclusions difficult and this should be addressed. Overall, this is very solid work but leaves some doubts whether Pfs16 is indeed the (only) target of this class of compounds.

      Major comments: 1. The reasons for excluding Etramp10.3 are not convincing. In fact it could be argued it is nearly as good a candidate as Pfs16. Contrary to the authors' statements in the results section, etramp10.3 transcription is highly upregulated in gametocytes (see e.g. PMID: 22129310) with a generally very low transcription in asexual stages. It is argued that Etramp10.3 is essential in blood stages because MacKellar et al failed to disrupt the gene and because the PiggyBAC screen predicted it to be essential. However, if this is an argument for exclusion then this would also apply to Pfs16 which is also predicted by the PiggyBAC screen to be essential (likely both are non-essential in blood stages as they are barely expressed but Pfs16 and Etramp10.3 might by chance have not received an insertion in the PiggyBAC screen due to their very small size which may also explain failure of disrupting integration in MacKellar). Given the finding that the drug binds Pfs16 only in late gams it might also be argued that an essential function in asexuals might not be affected if they behave similarly to young gams and hence this criterion is not valid anyway.

      Further following this line of thought that ETRAMP10.3 could be a hit equivalent to Pfs16, Figure 2D shows a band below the band considered to be Pfs16. It would not be at all surprising if this were ETRAMP10.3 (the size would fit).

      We don’t disagree with reviewer 2’s comments that ETRAMP10.3 could be an additional target. Although not traditionally related there is some similarity between these proteins and it may be that at the macroscopic level there is a structural homology between them. As stated elsewhere we are happy to tone down the assertion that Pfs16 is the only drug target candidate, leaving open the possibility of future follow up work that may yet reveal additional targets. This cannot be explored much further without extensive experimentation, which is beyond our current capacity. Given the strong phenotypic effect on gametocytes, whilst ETRAMP may be upregulated, this paper naturally focused its core attention on Pfs16 as a candidate target. We certainly subscribe to the view that absence of evidence is not evidence of absence.

      Both Pfs16 and ETRAMP10.3 can be expected to be very abundant proteins in the parasite periphery in gams. Can the authors exclude that these simply are the first to encounter the N-4HCS photoaffinity probe and that this may have led to their enrichment in the target identification experiments? The biochemical data argue for a specific interaction with Pfs16, but by themselves are not that strong. Given the discrepancies of the phenotype with the Pfs16 disruption and the peculiar finding that the drug binds Pfs16 only in later stage gametocytes, it might be a good idea to further caution the conclusion of Pfs16 as the inhibited target.

      We don’t necessarily agree that the evidence is not strong (three methods pointing to the same target is by many accounts solid evidence). Additionally, whilst it is true that the N-4HCS photoaffinity probes likely interact with PVM proteins in the first instance, it is also worth noting that this doesn’t necessarily detract from their likelihood to be true targets, but instead fits with the N-4HCS phenotype. We observe the compounds to inhibit microgametogenesis without any prior incubation and to retain this activity even beyond activation of microgametogenesis, specifically during the window in which the PVM remains associated with the parasite. Our phenotypic observations therefore fit with the notion that the molecules target proteins that lie within the PVM and interact with the molecules in the first instance. Whilst we understand the concern that PVM proteins may be likely to be enriched given their abundance and localisation, we believe this to support our phenotypic findings.

      The phenocopy evidence of the NH compounds with the Pfs16 disruption is based on comparison with published evidence. It would have been much preferred to have a side-by-side comparison with the (or an) actual Pfs16 disruption parasite line. Although the authors stress that the phenotype with DDD01035881 fits the phenotype of the targeted gene disruption in the results, this only partially matches the cited publication (PMID: 14698439) which concludes there is an effect on the number of gametocytes produced. The exflagellation phenotype in that publication was classified as preliminary. Although this is discussed, the main results text should be adapted to reflect this and the conclusion that Pfs16 may be the target should be further cautioned.

      As stated, we are happy to tone down conclusions in this direction. We also note comments above about Pfs16 disruption.

      Minor comments: 4. From the modifications of the compounds it seems the chemical space for further modification to achieve higher potency is limited with this scaffold. Maybe the authors can comment whether they envisage this to be a potential obstacle.

      The modification space of the compounds is explored extensively in previous work from our group, which we feel more than adequately addresses this question. See Rueda-Zubiaurre et al (2020) J Med Chem.

      Line 67: references are superscript.

      We can change this

      Line 77: I would recommend replacing 'quiescence' here, a cell that matures is not quiescent.

      We can change this

      Line 116: consider removing 'interdisciplinary'.

      We can change this

      Line 120: I would caution here (see major comments) and recommend a less definite proclamation of Pfs16 as a promising new drug target

      We can change this along with the general “tone” of the manuscript.

      Page 7: compounds 9 is still considered active ("retained micromolar activity"), but in Table 1 this is given as >1000nM. Please add the actual IC50.

      We can add this to the final version. The actual IC50 for this compound was 1.7 µM. For the SAR study we grouped compounds with IC50 >1 µM into discrete groups based on rough IC50 (>1 µM, >10 µM etc.), hence this fell in the intermediate group.
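
      The grouping described above can be sketched as follows, to show why a measured IC50 of 1.7 µM would be reported as ">1000 nM". The thresholds (1,000 nM and 10,000 nM) follow the groups named in the response; the function name `ic50_group` and the exact label format are assumptions made for illustration, not the authors' actual procedure.

```python
def ic50_group(ic50_nm: float, thresholds_nm: tuple[float, ...] = (10_000.0, 1_000.0)) -> str:
    """Label an IC50 (in nM) by the highest threshold it exceeds.

    Thresholds must be listed from largest to smallest. Values at or
    below the smallest threshold are reported as the measured number.
    """
    for threshold in thresholds_nm:
        if ic50_nm > threshold:
            return f">{threshold:g} nM"
    return f"{ic50_nm:g} nM"


# 1.7 uM expressed in nM, the value quoted in the response above.
print(ic50_group(1700.0))
```

      Reporting the measured value alongside the group label, as the reviewer requests, avoids the apparent conflict between "retained micromolar activity" and a ">1000 nM" table entry.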

      Line 138- 173: The order in which this is discussed makes it unclear that the work described was done prior to, and guided, the synthesis of compound 1 and probe 2

      This can be addressed in a revised manuscript.

      Line 194: was the data deposited in a database?

      The proteomics data has not been deposited in a database but is accessible in the extended SI.

      Line 202: an introduction to the benefits of using a competition + probe condition here could aid reader understanding. The interpretation of this data is complicated by the covalent and reversible binding of the two compounds and the weight of this control is therefore difficult to gauge.

      We can embellish the description here.

      Table 2 and Extended Data Table 1 show different p values and enrichments for the same hits. This is confusing. It would also be useful to label the hits in the scatter plots in Figure 2 for easy identification and comparison to the tables.

      We can amend this and label each hit within the scatter plot.

      Line 215-218, please correct the data on Etramp10.3 (see major points) and put in perspective to Pfs16 (Etramp10.3 is similarly upregulated in gams where it is highly expressed; PiggyBAC predicts essentiality for Pfs16 and Etramp10.3 in blood stages).

      We can discuss this to a limited extent for future exploration of Etramp10.3.

      Line 221: the results from the PiggyBAC screen are stated as fact, but what the screen provides is a prediction of the probability of importance for parasite growth. I would replace 'is' with 'is predicted' (even though in the case of Rab1b it seems likely the prediction is correct).

      We can change this

      Line 233 and elsewhere: define 'reversibility' (binding? activity?).

      We can change this

      Line 240: clarify what is in the cited paper (see major points).

      We can clarify this

      Line 297: We utilised in-lysate...... clunky sentence, please rephrase.

      We can change this

      Line 325: reference is missing the year.

      We can change this

      Line 343: It is utterly puzzling that binding is specific to Pfs16 in mature gametocytes and I do not find the explanation in the discussion convincing (see point 28 below). Do the authors have another explanation? Could Pfs16 be modified in later gams (or vice versa)?

      We believe that Pfs16 is functionally different at different stages of gametocyte development, this is either in terms of its presentation (e.g. perhaps due to complex formation, though this remains elusive) or the functionality of different domains, as per the effect of different truncation mutants. We can address some of these concerns in a revised manuscript.

      Line 388: Justification seems odd as a PV protein would be unlikely to directly impact DNA replication. Please rephrase the sentence.

      We can change this

      Line 405: remove the 'to'

      We can change this

      Line 411: it would be useful to the reader to state at what IC-value the drug was used in these experiments.

      We can state this

      Line 431: While the alpha-tubulin staining indicates exflagellation and is similar to the DMSO only control, the staining for the RBC membrane (Glycophorin A) and DNA (DAPI) appear different, yet this is ignored. One interpretation of this could be that while late treatment doesn't block exflagellation, it still impacts other aspects of microgamete development.

      We can make mention of this

      Line 436: IFA work was done with drug treatment post activation while EM was done post activation but drug treatment prior to activation. Is there a reason for this?

      The reviewer is astute to point this out. Limitations with access to the EM facility meant that whilst IFAs were completed for pre-activation treated samples, the post-activation EM became impossible as the EM facility closed during the COVID lockdown. Thus, we do not have a complete set here. However, we do not feel this takes away from the EM observations presented. We can clarify this incompleteness in the revised manuscript.

      Line 450: is this really CytB, or was it CytD?

      We did indeed use Cytochalasin B here, which, whilst less potent than D, does still target microfilament formation.

      Line 465: Pfs16 localised to vesicles: there is no data showing the dots in the micrograph are vesicles, please rephrase.

      We can change this

      Page 19 and 20, discussion on stage-specific differences of Pfs16 during gametocytogenesis to explain the difference in binding: without experimental data using N-4HCS in the parasites of the publication cited to explain this (PMID: 21498641), this is very speculative. The cited work used episomal expression of Pfs16 tagged with fluorescent proteins. This would be the first integral PVM protein that is actually inserted into the PV membrane when tagged in that way (usually this results in a PV location), casting some doubt on the findings in that paper. All in all the provided explanation is not very convincing.

      We can attempt to clarify this in a revised discussion.

      Line 519: if with the conserved part the N-terminus is meant, then this has for other PVM proteins already been shown to be PVM internal, not facing the erythrocyte (shown in very early work; PMID: 1852170 but also multiple times after that).

      We can clarify this

      Line 534: consider replacing 'highly plausible' with something more cautious.

      We can change this

      Line 550: Given this discussion, how stable are N-4HCS compounds?

      We can clarify this.

      Table 1: Having all chemical structures in same orientation would be nicer visually. I assume blue indicates modification but this is not stated.

      We can change this

      Figure 1: Please use different colours or symbols. The dark green crosses and the blue Pfs16 cross are hard to distinguish.

      We can change this

      Figure 3d: Unclear as to why a different temperature range is displayed here.

      We can clarify this

      Figure 3e: Unclear % Inhibition compared to what.

      We can clarify this

      Figure 5G: What is the white arrow pointing to?

      We can clarify this

      Figure 5j: Given how the explanation is written this would make more sense between current image 5G and 5J.

      We are not sure what the comment relates to here but we can endeavour to clarify this

      Figure 6: Erythrocyte membrane colour not stated in legend.

      We can change this

      Figure 6A: were the exposure times similar? How can so little be left after ~4-5.5 minutes but at later time points there seems to be much more Pfs16 signal left? Maybe amount of signal should be taken into consideration to establish the fate of Pfs16 in the process.

      We can endeavour to clarify this

      Figure 6B: is the second phenotype (successful but aberrant egress) shown? The only image where WGA is not circular around the parasite is an exact match of Pfs16 which is in dots (image at 7.5-8.5 minutes). The imaging data for this phenotype should be presented more clearly.

      We can attempt to clarify this

      Reviewer #2 (Significance (Required)):

      Nature and significance: a lot of weight has been placed on transmission blocking drugs although there are also a number of problems associated with them (ethics for testing and use etc; drugs acting on asexuals and transmission stages alike might be even more useful). Transmission blocking drugs are difficult to study and this work is therefore important. The experiments are well done, but the conclusions are not fully convincing, leaving some doubts in regard to Pfs16 being the actual target of the class of drugs studied.

      Compare to existing published evidence: it is a logical continuation of previous work and this is appropriately highlighted in the manuscript.

      Audience: medium interest for malaria researchers; high interest for researchers working on transmission blocking drugs and those studying microgametes.

      Your expertise: malaria, P. falciparum, biology of apicomplexans

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): The manuscript by Yahiya et al describes an extensive investigation of the mode of action of DDD01028076, which specifically inhibits microgametogenesis in Plasmodium falciparum. The phenotypic characterisation of the MOA uses some very nice imaging to demonstrate the point at which this compound inhibits microgametogenesis. The authors have also attempted to identify the molecular target using chemoproteomics and label-free CETSA techniques. The photoaffinity labelling and pull-down approach suggested that Pfs16 may be preferentially enriched by a PAL probe that is representative of this series. However, the data supporting the validation of this target is not very conclusive, and in some cases argues against Pfs16 being a specific target of DDD01028076. Whilst the presented data makes a significant contribution to the literature regarding a novel drug candidate that targets microgametogenesis, it does not support the authors' claims that Pfs16 is the target.

      Major Concerns: The strongest evidence for Pfs16 being the target comes from the chemoproteomics pull-down study that found Pfs16 to be the most significantly enriched protein by compound 2 vs DMSO. However, this should be interpreted with caution as it is based on only 3 replicates and omics studies are prone to false-positives. That only 125 proteins were detected also raises questions about the coverage of the proteomics, it is quite possible that the actual target is not detectable using this method, and the Pfs16 appears because it is one of the more abundant proteins during this stage of the lifecycle.

      As discussed, we are happy to tone down the conclusions about Pfs16 being an exclusive target for the N-4HCS drug class; however, we feel the reviewer is being unnecessarily negative. There are myriad papers in the literature based on singular proteomics experiments (given their cost, complexity and time-consuming nature) that then facilitate downstream experiments that support findings. We have endeavoured to be as thorough as we could in the work and believe, like others, three replicates of a massive experimental pipeline should be sufficient to make a defined conclusion – whether the additional downstream evidence we have then leaned on is supportive of this (as we judge it to be) is another matter. We agree that proteomics often suffers from low protein abundance. The complexity of growing large quantities of gametocytes is familiar to anyone who has struggled to grow these finicky parasites at a larger scale than 10-25mL dishes. Given the scales we have reached, we believe these might in fact be some of the most comprehensive proteomics studies to date!

      Somewhat concerningly, the control with 1 as the competitor did not show significant enrichment of Pfs16, although a trend was observed. More concerning was the lack of enrichment when using DDD01028076 as the competitor. This result essentially proves that Pfs16 is not the specific target (and the argument about reversibility is unlikely since most drugs are reversible binders, but many have worked with this type of approach). It is surprising that DDD01028076 (ideally the (-) form) wasn't used as the competitor for the proteomics study. This compound has ~100-fold better potency than the probe 2, which should provide much better competition than 1. It would also be more specific than 1, which is an important control considering that (-)-DDD01028076 has activity in the low nanomolar range, whereas 2 acts in the micromolar range. Non-specific interactions are an important consideration to exclude, and whilst 1 is structurally similar, it is not very potent and therefore not the best control to find the target associated with activity.

      Whilst we understand the concerns with insignificant enrichment in the competition labelling, we believe the enrichment in the presence of photoaffinity probe 2 over background (i.e. DMSO vs. probe experiments) to be of more value given the design of the experiment. The competition experiments were performed by co-treating gametocytes with photoaffinity probe 2 and parent molecule 1 prior to UV irradiation, to enable irreversible conjugation to protein target(s). However, given that both compounds, probe and parent, theoretically bind to Pfs16 at the PVM in a reversible manner (i.e. losing interaction with even gentle washing), UV irradiation is likely to favour probe-binding irrespective of competition with a marginally more potent parent molecule (in this case, parent molecule 1). This is especially true as treated parasites were very thoroughly washed after irradiation, so should the parent molecule have bound the target protein(s), these drug-target interactions were likely lost during stringent washing. The drug-target interactions with parent molecule 1 wouldn’t have been aided by UV irradiation, as the molecule lacks the functional group required for bioconjugation. So, even if parent molecule-target interactions were more abundant than probe-target interactions, interactions with parent molecule 1 were most likely lost and proteins bound by probe were enriched.

      This would also have been true with more potent N-4HCS derivatives such as DDD01028076 and (-)-DDD01028076 (where potency is tested in the DGFA, independent of bioconjugation), and here we opted for a structurally similar compound of similar potency so as not to skew competition solely based on potency.

      We can elaborate on this in the revised manuscript to make our conclusions from this part clear.

      A closer look at the gels in the supplementary data raises many questions that undermine the authors' conclusions:

      • Fig S1a - The lane without probe (2) still identifies Pfs16 (or a protein at that MW) as the most abundant protein. Also, as the Pfs16 band increases, you can see that most other proteins also increase in abundance, so either the loading is inconsistent, or the probe actually causes non-specific enrichment of many proteins. This figure also indicates that the washing protocol is not sufficient to remove non-specific binders. Given the covalent nature of the PAL approach I would think a very thorough washing protocol could be employed.

      It is certainly the case that Pfs16 is abundant in gametocytes, a reason behind its early discovery. Thus it is challenging to remove it from background. We still believe the enrichment to be specific, highlighting the comparative work with Pfg377 in Figure 2. Further repetitions with more stringent washing might resolve the background, however, this is beyond our current resources to repeat.

      -- Running another negative control in the proteomics using one of the inactive controls from table 1 might help to disambiguate specificity.

      We don’t disagree with this, though it would involve re-running the entire experimental workflow, which is not possible.

      • Fig S2a - The anti-Pfs16 Western blots show that this protein is actually enriched more in the flow-through than the eluates. This shows that this protein is not specifically enriched by the PAL-CuAAC pull-down, it is just more abundant in the treated samples.

      Again, the presence of Pfs16 in the flow-through is unsurprising, given its abundance in stage V gametocytes. The relative abundance in the eluate is not an indication that the binding and subsequent enrichment is not specific, rather this shows the compound does not necessarily bind each and every protein – which is not unexpected. The crucial conclusion to be drawn here is the concentration-dependent enrichment of Pfs16 in the eluate in the presence of probe.

      • Fig S2b - The darkest Pfs16 spot is actually the sample with no UV treatment. This is a negative control, so should not enrich the target protein. This sample also has significant signal in replicates A and C.

      As we have noted above, it is unsurprising that modification of the N-4HCS scaffold to yield this probe may introduce a level of irradiation-independent binding, which explains the presence of signal in the UV-independent sample.

      • Fig S2c - This blot is very messy and difficult to read, but in general the Pfs16 spots in the IGF don't correlate with the intensities in the anti-Pfs16 western.

      These experiments are extremely challenging (something that is perhaps beyond the expertise of the reviewer) and what is presented is the result of substantial optimisation. Loss of AzTB fluorescence in the gel which is subsequently analysed by western blot explains this.

      • Fig S2 - This data, and the main figures based on this data, generally don't support the hypothesis that Pfs16 is the specific target. The controls are not as would be expected, and there are no loading controls. Looking at the flow-throughs suggests that there was just more Pfs16 (and possibly total protein) in the treated samples before the enrichment step. The Pfg377 also appears quite variable in the different samples, with replicates B and C not consistent with A.

      We do not concur with the reviewer here and their dismissal of what were extremely thorough and well-executed experiments. These are not like traditional western blots and require substantial optimisation. We refer them to our previous point in reference to the UV controls. With regards to the Pfg377 variability, the experiment itself is inherently variable with such large volumes of parasites. In many cases, for example, the male:female ratio within a mature gametocyte culture can vary and this can contribute to the variability in Pfg377 abundance between replicates.

      The other major concern is with the CETSA analysis, which appears to show very minor stabilisation of Pfs16, but the specificity of this target is questionable, and the data has the following inconsistencies.

      • The supplementary data only shows n=1, yet there are error bars in the main figures. Where did these come from?

      The individual western blot replicates can be provided in a revised manuscript if judged important.

      • The samples with apparent destabilisation are all near the edge of large western blots, which often doesn't run straight and has no loading controls. We need to see the loading controls.

      Given all proteins within a lysate will aggregate with thermal treatment, antibody loading controls are not feasible with these experiments. Each sample is normalised prior to thermal stabilisation (ensuring the same protein quantity is treated in both DMSO and drug, at each temperature) and any protein that is not aggregated is loaded – the nature of CETSA itself is to compare the stabilisation between DMSO and drug.

      • The melting temperature of Pfs16 is extremely high at around 85 degrees C. Most Plasmodium proteins melt at around 50-60 degrees (Dzekian et al, 2019). Even the cited work on membrane proteins didn't go to those temperatures (Kawatkar et al, 2019). Can this high temperature be explained, and has the CETSA approach been validated at such high temperatures where additional physical and chemical processes may be occurring in the sample?

      We agree that this temperature of stabilisation is unusually high and may require further biochemical validation. Without further investigation we cannot say definitively why the melting temperature of Pfs16 is so high, but suspect its size and membrane localisation may play a role.

      • The lack of difference between + and - isomers suggests that the very small stabilisation observed here is not specific to drug activity, but is more likely a non-specific binding effect. Additional negative control compounds might help here, but the + isomer is probably the best negative control (albeit the concentrations were not ideal in the presented data).

      We have already addressed this in the text; please refer to line 312 and beyond.

      • The very high concentration (100uM) increases the chances of non-specific effects being observed here (especially since the authors claim to see stabilisation at about 10nM). The study should be repeated at lower concentrations (with negative controls) in order to confirm a specific binding effect.

      Whilst further replicates with different conditions might be preferable, as discussed extensively here, this would be beyond the scope of what we are able to achieve for a revision.

      • The concentration-ranging study was performed at 78.4 degrees, at which temperature very little denaturation of Pfs16 occurs fig S4a (and Fig 3b-c). Therefore, you would not expect to see any drug-induced stabilisation, and it is not plausible that significant stabilisation could occur at this temperature. Therefore, the apparent destabilisation at sub-10nM drug concentrations is highly questionable.

      We would have to agree to disagree on this point.

      • Stabilisation of Pfs16 did not occur in lysates from younger gametocytes (fig s4g-h), but this is a biophysical assay, so regardless of the function of this protein at different stages, the biophysical interaction between the drug and the protein should be the same regardless of the source of the protein. This data argues against Pfs16 being a specific binding target of the compound.

      We don’t agree with this statement, since the drug is binding the protein in native lysate – this may be a multi-meric complex (homo or hetero) which only exists at certain stages. As such we disagree with the reviewer that this argues against Pfs16 being the target.

      In addition to the above concerns, the fact that this compound doesn't inhibit the earlier functions of Pfs16 in gametocytogenesis, and that it doesn't inhibit P. berghei, also argue against this being the specific target of this drug. Whilst the authors have a valid argument that these findings don't exclude the possibility of stage-specific targeting of Pfs16, we could also argue that all the phenotypic data in figures 4-6 is merely correlative of a drug that acts at the same point in the lifecycle as Pfs16.

      We have discussed this in the manuscript and strongly feel the reviewer is being unnecessarily dismissive of a body of work that is coherent. We are happy to tone down the narrative of the paper with Pfs16 being the exclusive target. Structural homology of P. berghei Pfs16 orthologues has never been done but it would not be unprecedented if another target was functionally homologous (an idea we are currently pursuing). Stage specificity is also possible given the nature of Pfs16 (e.g. if it is in a complex). The reviewer appears fixated on a singular entity and unable to imagine a complex scenario where structure or protein-protein interactions might affect drug binding (as it does with other proteins present in complexes, e.g. proteasomal targeting drugs).

      Overall, I believe that significant additional studies would be required to identify the target of this compound. Either by repeating the included studies with additional controls and conditions, or by follow-up studies such as genetic manipulation (knock-down or overexpression) or heterologous expression and biophysical binding studies.

      Alternatively, the manuscript could be restructured as primarily a report on the phenotypic effect of this compound on microgametogenesis, with the target identification work reported as a hypothesis-generating chemoproteomics study that provides some ideas about possible targets, but requires substantial follow-up to confirm the target (which may be beyond the scope of this report?).

      We strongly disagree with this reviewer’s entire dismissal of an extensive body of work. In line with the other reviewers’ comments we accept a need to tone down our conclusions, but do not consent to dropping the majority of the paper in favour of a phenotypic descriptive work.

      MINOR COMMENTS The manuscript is very well-written and presented.

      Several of the conclusions are overstated (as detailed above) and several statements should be tempered based on this data (e.g. statements linking DDD01028076 effects to Pfs16 function).

      We can address the overstatement of conclusions in a revised manuscript.

      I find the term 'crosslinking' confusing for the photo-affinity labelling, as crosslinking in proteomics often refers to crosslinking between proteins (not between protein and drug).

      This is simple to address – to minimise confusion for readers, we can simply state where photoaffinity labelling and bioconjugation were performed (and not refer to the latter as crosslinking).

      The data and terminology around activity (IC50) for compounds in table 1 is a little confusing. Some IC50 values are reported as >1000, while others have precise mean values reported over 1000, and others are >10,000 or >25,000. This is especially confusing where 9 is claimed to have retained activity, but is >1000. If consistent thresholds are not appropriate then perhaps including dose response curves in the supp data might be necessary to explain these?

      We can simply provide the IC50s for compounds of greater potency. We are also happy to provide the curves, but with such a large body of work already, this might be unnecessary.

      Reviewer #3 (Significance (Required)):

      The work is potentially interesting to Plasmodium biology and drug discovery researchers. The concept of a transmission-blocking drug is quite attractive to this community, so the topic is highly relevant. Keeping in mind that this compound was reported previously, the main novelty is in defining its window of activity during the microgametogenesis process, and differentiating this from other drugs/compounds that inhibit this process. There is clearly an advance in knowledge presented here.

      If Pfs16 were to be confirmed as the target of this series then I think that this study would have much greater impact and attract interest from a broad audience. However, at this stage I don't see strong evidence for this hypothesis, and some of this data casts significant doubt on the likelihood that Pfs16 is the direct target.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Yahiya et al describes an extensive investigation of the mode of action of DDD01028076, which specifically inhibits microgametogenesis in Plasmodium falciparum. The phenotypic characterisation of the MOA uses some very nice imaging to demonstrate the point at which this compound inhibits microgametogenesis. The authors have also attempted to identify the molecular target using chemoproteomics and label-free CETSA techniques. The photoaffinity labelling and pull-down approach suggested that Pfs16 may be preferentially enriched by a PAL probe that is representative of this series. However, the data supporting the validation of this target is not very conclusive, and in some cases argues against Pfs16 being a specific target of DDD01028076. Whilst the presented data makes a significant contribution to the literature regarding a novel drug candidate that targets microgametogenesis, it does not support the authors' claims that Pfs16 is the target.

      Major Concerns:

      The strongest evidence for Pfs16 being the target comes from the chemoproteomics pull-down study that found Pfs16 to be the most significantly enriched protein by compound 2 vs DMSO. However, this should be interpreted with caution as it is based on only 3 replicates and omics studies are prone to false-positives. That only 125 proteins were detected also raises questions about the coverage of the proteomics, it is quite possible that the actual target is not detectable using this method, and the Pfs16 appears because it is one of the more abundant proteins during this stage of the lifecycle.

      Somewhat concerningly, the control with 1 as the competitor did not show significant enrichment of Pfs16, although a trend was observed. More concerning was the lack of enrichment when using DDD01028076 as the competitor. This result essentially proves that Pfs16 is not the specific target (and the argument about reversibility is unlikely since most drugs are reversible binders, but many have worked with this type of approach). It is surprising that DDD01028076 (ideally the (-) form) wasn't used as the competitor for the proteomics study. This compound has ~100-fold better potency than the probe 2, which should provide much better competition than 1. It would also be more specific than 1, which is an important control considering that (-)-DDD01028076 has activity in the low nanomolar range, whereas 2 acts in the micromolar range. Non-specific interactions are an important consideration to exclude, and whilst 1 is structurally similar, it is not very potent and therefore not the best control to find the target associated with activity.

      A closer look at the gels in the supplementary data raises many questions that undermine the authors' conclusions:

      • Fig S1a - The lane without probe (2) still identifies Pfs16 (or a protein at that MW) as the most abundant protein. Also, as the Pfs16 band increases, you can see that most other proteins also increase in abundance, so either the loading is inconsistent, or the probe actually causes non-specific enrichment of many proteins. This figure also indicates that the washing protocol is not sufficient to remove non-specific binders. Given the covalent nature of the PAL approach I would think a very thorough washing protocol could be employed. -- Running another negative control in the proteomics using one of the inactive controls from table 1 might help to disambiguate specificity.
      • Fig S2a - The anti-Pfs16 Western blots show that this protein is actually enriched more in the flow-through than the eluates. This shows that this protein is not specifically enriched by the PAL-CuAAC pull-down, it is just more abundant in the treated samples.
      • Fig S2b - The darkest Pfs16 spot is actually the sample with no UV treatment. This is a negative control, so should not enrich the target protein. This sample also has significant signal in replicates A and C.
      • Fig S2c - This blot is very messy and difficult to read, but in general the Pfs16 spots in the IGF don't correlate with the intensities in the anti-Pfs16 western.
      • Fig S2 - This data, and the main figures based on this data, generally don't support the hypothesis that Pfs16 is the specific target. The controls are not as would be expected, and there are no loading controls. Looking at the flow-throughs suggests that there was just more Pfs16 (and possibly total protein) in the treated samples before the enrichment step. The Pfg377 also appears quite variable in the different samples, with replicates B and C not consistent with A.

      The other major concern is with the CETSA analysis, which appears to show very minor stabilisation of Pfs16, but the specificity of this target is questionable, and the data has the following inconsistencies.

      • The supplementary data only shows n=1, yet there are error bars in the main figures. Where did these come from?
      • The samples with apparent destabilisation are all near the edge of large western blots, which often doesn't run straight and has no loading controls. We need to see the loading controls.
      • The melting temperature of Pfs16 is extremely high at around 85 degrees C. Most Plasmodium proteins melt at around 50-60 degrees (Dzekian et al, 2019). Even the cited work on membrane proteins didn't go to those temperatures (Kawatkar et al, 2019). Can this high temperature be explained, and has the CETSA approach been validated at such high temperatures where additional physical and chemical processes may be occurring in the sample?
      • The lack of difference between + and - isomers suggests that the very small stabilisation observed here is not specific to drug activity, but is more likely a non-specific binding effect. Additional negative control compounds might help here, but the + isomer is probably the best negative control (albeit the concentrations were not ideal in the presented data).
      • The very high concentration (100uM) increases the chances of non-specific effects being observed here (especially since the authors claim to see stabilisation at about 10nM). The study should be repeated at lower concentrations (with negative controls) in order to confirm a specific binding effect.
      • The concentration-ranging study was performed at 78.4 degrees, at which temperature very little denaturation of Pfs16 occurs fig S4a (and Fig 3b-c). Therefore, you would not expect to see any drug-induced stabilisation, and it is not plausible that significant stabilisation could occur at this temperature. Therefore, the apparent destabilisation at sub-10nM drug concentrations is highly questionable.
      • Stabilisation of Pfs16 did not occur in lysates from younger gametocytes (fig s4g-h), but this is a biophysical assay, so regardless of the function of this protein at different stages, the biophysical interaction between the drug and the protein should be the same regardless of the source of the protein. This data argues against Pfs16 being a specific binding target of the compound.

      In addition to the above concerns, the fact that this compound doesn't inhibit the earlier functions of Pfs16 in gametocytogenesis, and that it doesn't inhibit P. berghei, also argue against this being the specific target of this drug. Whilst the authors have a valid argument that these findings don't exclude the possibility of stage-specific targeting of Pfs16, we could also argue that all the phenotypic data in figures 4-6 is merely correlative of a drug that acts at the same point in the lifecycle as Pfs16.

      Overall, I believe that significant additional studies would be required to identify the target of this compound. Either by repeating the included studies with additional controls and conditions, or by follow-up studies such as genetic manipulation (knock-down or overexpression) or heterologous expression and biophysical binding studies. Alternatively, the manuscript could be restructured as primarily a report on the phenotypic effect of this compound on microgametogenesis, with the target identification work reported as a hypothesis-generating chemoproteomics study that provides some ideas about possible targets, but requires substantial follow-up to confirm the target (which may be beyond the scope of this report?).

      Minor comments

      The manuscript is very well-written and presented.

      Several of the conclusions are overstated (as detailed above) and several statements should be tempered based on this data (e.g. statements linking DDD01028076 effects to Pfs16 function).

      I find the term 'crosslinking' confusing for the photo-affinity labelling, as crosslinking in proteomics often refers to crosslinking between proteins (not between protein and drug).

      The data and terminology around activity (IC50) for compounds in table 1 is a little confusing. Some IC50 values are reported as >1000, while others have precise mean values reported over 1000, and others are >10,000 or >25,000. This is especially confusing where 9 is claimed to have retained activity, but is >1000. If consistent thresholds are not appropriate then perhaps including dose response curves in the supp data might be necessary to explain these?

      Significance

      The work is potentially interesting to Plasmodium biology and drug discovery researchers. The concept of a transmission-blocking drug is quite attractive to this community, so the topic is highly relevant. Keeping in mind that this compound was reported previously, the main novelty is in defining its window of activity during the microgametogenesis process, and differentiating this from other drugs/compounds that inhibit this process. There is clearly an advance in knowledge presented here.

      If Pfs16 were to be confirmed as the target of this series then I think that this study would have much greater impact and attract interest from a broad audience. However, at this stage I don't see strong evidence for this hypothesis, and some of this data casts significant doubt on the likelihood that Pfs16 is the direct target.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Saha et al. characterize Drosophila egg chambers that are mutant for cup and identify an increase in the number of a specialized type of follicle cells, the border cells. They demonstrate that this increase correlates with an expanded domain of STAT activity and reduced Notch signaling in anterior follicle cells. Determining that cup is required in the germline cells, the authors postulate and provide some evidence that cup mutants prevent germline Delta from properly signaling to follicle cells. In line with this, they also show that blocking endocytosis phenocopies some aspects of cup mutants, particularly border cell numbers and Delta levels, which they monitor cytoplasmically and at the cell surface. Lastly, they demonstrate that activation of Rab11 can rescue Delta levels and border cell number in cup mutants. They conclude that a key function of Cup in the germline is to traffic Delta to signal to follicle cells, and that the endocytic processing of Delta is required for its function.

      Major comments:

      • The findings of this study are interesting and novel. The authors have completed a lot of experiments and analyzed the results carefully and in great detail. Experimental design is described adequately and statistical analysis is sufficient. While the main results are largely convincing and support the conclusions, there are some weaknesses that need to be addressed.

      Response: We thank the reviewer for appreciating our work. We have tried to address the reviewers' concerns as fully as possible, in the hope of further strengthening our claims.

      One major concern is that the vast majority of the experiments were conducted with a single homozygous allele for cup. The authors claim this was necessary because other alleles arrest oogenesis, which is understandable, but it leaves the potential problem that the allele, a P-element insertion, may affect other genes, or there may be other unidentified mutations on the mutant chromosome. The authors are able to partially rescue the border cell phenotype with overexpression of Cup and can also mimic the outcome with RNAi in the germline, which helps alleviate some of this concern, but this was only done for one set of experiments (those in figure 1). Similar experiments need to be included to demonstrate the same outcomes when cup is disrupted by other alleles/methods for at least some of the Notch/Delta analyses since this is key to the paper's conclusions.

      Response: We acknowledge the reviewer's concern. To address it, we evaluated a different allelic combination of cup to rule out effects of background mutations. We evaluated the Delta count, NICD and border cell numbers in the cup8/cup01355 allelic background and observed results similar to those for cup01355/cup01355 homozygotes. This result is included in Fig S1E-G.

      In addition, we specifically downregulated Cup function in the germline using RNAi and validated the non-cell-autonomous effect of Cup function in border cell fate specification. This result is included in Fig 1M-O.

      A second concern is that some evidence is circumstantial or indirect. Specifically, the authors argue that the effect of Cup is due to trafficking of Delta, but do not consider the possibility that Delta could be more directly regulated or that other factors may be relevant. Border cell specification is rescued by increasing recycling in cup mutants, but this could be due to recycling of more factors besides Delta. To address this more directly, the authors should overexpress Delta in the germline of cup mutants. It is possible the disruption of Delta in cup mutants is due to changes in Delta protein stability/levels, so the experiment may also clarify this issue. If this is the case, it may be that hypomorphic Delta mutants would have a defect in border cell number, which could be examined separately. If Delta levels are low, endocytosis and recycling increases may also rescue cup mutants indirectly, but the conclusion about what Cup regulates may differ.

      Response: As the reviewer suggested, we attempted to overexpress Delta in the germline of cup mutant egg chambers. Unfortunately, we could not detect any Delta overexpression, as the available vector (UASt-Delta) can drive stable expression only in somatic cells and not in germline cells. However, to test whether Delta is directly regulated by Cup, we compared Delta protein levels between wild type and cup mutant egg chambers (Figure 4E-G). Contrary to our expectation, we did not observe any significant difference in the levels of Delta in cup mutants compared to the control. This supports our view that Cup may not directly regulate the levels of Delta in the germline.

      Another concern is that Cup's main role is confusing since it regulates many things, including the cytoskeleton, and the cytoskeleton is necessary for general health and vesicle trafficking in the egg chamber - how do the authors think Rab11 upregulation is overcoming these defects?

      Response: We thank the reviewer for raising this concern, as it prompted us to examine whether Rab11CA overexpression also rescued the cytoskeleton. Interestingly, we observed that Rab11CA overexpression restored the actin filaments in the cup mutant germline (Figure S6H-K). This result is in line with the report that the Rab11 effector Nuf can modulate actin polymerization (Jian Cao et al., 2008).

      Rab11CA rescues Delta levels almost completely in cup mutants but only partially rescues Notch activation, suggesting there are other problems in these egg chambers that could contribute to the defects. While exploring possible other factors is beyond the scope of this work, the authors may want to acknowledge this issue.

      Response: We agree with the reviewer that Rab11CA only partially rescues NRE-GFP. This suggests that Cup can affect other aspects of egg chamber development independently of Rab11 function.

      Minor comments:

      It would help the presentation of the paper to introduce Notch/Delta signaling during oogenesis in the introduction. More introduction and clarity about the number of polar cells at early stages and their role in the border cell cluster may also be useful to the reader.

      Response: We have modified the introduction to highlight the role of Notch/ Delta signaling in early oogenesis.

      It is notable that the primary phenotype of a change in border cell numbers is quite subtle, often only affecting 1-2 cells, and the variation in different genotypes and experiments is sometimes also that large. The authors do a good job of being careful to count the cells at a specific developmental time and do appropriate statistical tests within an experiment. Still, it is difficult to be sure that the effects are due to the gene being manipulated specifically or the genetic background. Related to this, a few issues should be addressed. Notably, at earlier stages, Notch signaling impacts cell division, so some of the phenotypes might be explained by there being more total cells in the domain instead of more signaling. The authors show Cut is in the same domain and pH3 is similar, but they didn't seem to assess overall numbers.

      Response: As the reviewer suggested, we assessed the total number of follicle cell nuclei in stage 8 egg chambers. The analysis was performed on each confocal z-slice of the egg chamber, taking care that each nucleus (DAPI) was counted only once. We did not observe any significant difference in the number of follicle cell nuclei between wild type and cup mutant egg chambers, supporting our earlier claims based on pH3 and Cut antibody staining that cell proliferation is not responsible for the excessive border cell fate in cup mutants. This result is included in Fig S2O-Q.

      Secondly, for the stat suppression of cup (figure 2L), the authors need to show the stat-/+ control for comparison to make a conclusion about suppression versus additive effects.

      Response: As per the suggestion of the reviewer, we have included the data for statp1681/+ control in figure 2L.

      In addition, prior work (Wang et al 2007) expressed DN Kuz in border cells and did not see a change in specification, unlike what is claimed here. In the experiment in question, the control has lower than normal numbers of border cells and the DN Kuz has a number more typical of the controls in other experiments, so there is a concern that something else in the genetic background is influencing the numbers. Other controls could help make this case, but ultimately this result is probably not necessary for the main argument. Thus the authors might consider leaving out the Kuz analysis or perhaps commenting on the discrepancy with prior published results.

      Response: We have removed the data on Kuzbanian and have added data that suggests that Notch activation in the follicle cells downstream of Cup facilitates specification of appropriate number of migratory border cells (Fig 3K-N).

      Can the authors comment on why the volume of the border cell cluster increases more dramatically (>2x) than the number of cells (30% more)? Does the increase in border cell number change the migratory capacity? That is, do the clusters in cup mutant egg chambers migrate normally while the egg chamber looks okay?

      Response: We believe that the more dramatic increase in the volume of the border cell cluster (>2x) than in the number of cells (30% more) is due to the loose arrangement of the cells in the cluster. Interestingly, the cup mutant border cell clusters do exhibit a migration defect, which we are examining as part of a separate study.

      Several of the figure legend titles state conclusions that are over interpretations of the data shown:

      - Figure 3 legend is overstated- these experiments do not assay STAT activity, only border cell number, so the title can be simplified to say that.

      Response: We have modified the Figure legend in line with the data presented.

      - For figure 4, both cytoskeleton and Delta are shown to be disrupted in cup mutants, but they are not directly linked, eg, the experiments do not show a change in Delta in cytoskeletal mutants alone. While it is interesting that cup mutants have disrupted cytoskeleton, ultimately this result is not well connected to the main issue of Notch/Delta signaling; in fact, it becomes confusing how anything can be trafficked to the cell surface if there is poor cytoskeletal organization. Since the authors favor the hypothesis that the cytoskeleton is not the key to the border cell specification difference, they may want to move this result out of figure 4.

      Response: We have included data suggesting that cytoskeleton organization is critical for Delta trafficking. Specifically, we demonstrate that egg chambers treated with Cytochalasin D accumulate Delta in the nurse cell cytoplasm (Fig S5D-F).

      - The Figure 5 legend is also overstated- these experiments show that Delta is higher in cup mutants and endocytosis mutants AND that endocytosis (of something) is required in the germline for border cell number- but these results are not linked in this figure. More evidence for this connection does come later in figure 6. Some figure legends are quite brief and could benefit from a little more detail on what is being shown.

      Response: We have modified the titles of the figure legends to reflect the data presented.

      Figure layout could be improved by keeping images consistent sizes and making sure graph text is large enough to read easily. Figures in general could be streamlined by having negative results and less pertinent results in supplemental data.

      Response: We have reorganized the figures and worked on the graph text for easy read.

      Not all papers cited in the text are in the reference list.

      Response: We have modified the titles of the figure legends and cross-checked our reference list against the papers mentioned in the main text.

      CROSS-CONSULTATION COMMENTS

      I generally agree with the other reviewers that there are concerns with the precise function of cup in this context, and that some revision is needed, including editing of the writing. In response to reviewer 2, prior published studies only detected Cup in germline, but it is possible that it is expressed in follicle cells at a low level. The mutant clonal experiment in follicle cells that the authors did had no effect on border cells, so that provides some evidence the role is non-autonomous. I agree with reviewer 2's concern that the authors overstate the connection between cup and Delta and border cells based on their data and need a few more experiments to tie things together. I understand reviewer 3's concerns that the experimental effects on border cell numbers are very small and variable- I listed this as a minor concern, though, since this number is mainly being used as a read-out for STAT signaling levels and the data were extensively quantified and statistically tested.

      Reviewer #1 (Significance (Required)):

      My expertise is in cell migration, developmental biology, and Drosophila genetics. This paper will be of broad interest in these fields as it incorporates aspects of each in its characterization of a new regulatory mechanism to induce a motile cell population non-cell-autonomously, which is an exciting finding. Specifically, the work increases our understanding of the intersection between Notch and Jak/STAT signaling, which many researchers study - these were both known to be involved in border cell specification. The study provides more detailed characterization of the signaling and specification process in general, and makes significant advances in understanding how Delta signals are produced and presented from germline cells to receiving cells in the soma. Cup has not been previously implicated in these signaling pathways, so that is also novel, although its precise mechanistic role here is still somewhat unclear.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Saha et al. made a detailed description of the role of the mRNA binding protein Cup in specifying the number of Border Cells (BC) during Drosophila melanogaster oogenesis. First of all, they show that females homozygote for a hypomorph allele of cup have higher number of BCs compared to Wild Type (WT) females. They present a series of experiments that points towards the phenotype being due to a specific role of cup in the nurse cells that non-cell autonomously regulates BC specification. Also, they show that this phenotype is the result of an increase in the levels of JAK/STAT signalling in the BC, a major determinant of BC fate. In addition, they show that cup mutant egg chambers exhibit a downregulation of the Notch (N) pathway function in the BCs and that over-activating Notch results in the rescue of the number of BCs. Moreover, the authors present data on the effect of cup in Delta (Dl) trafficking in the nurse cells: They found that cup mutant egg chambers show increased number of Dl puncta within the cytoplasm of the nurse cells, but reduced numbers in the nurse cell-Anterior Follicle Cell (AFC) boundary as a result of defective Dl endocytosis. Finally, they were able to rescue the Dl trafficking phenotype, as well as the number of BC by overexpressing an active form of Rab11.

      Major points:

      In this study, the authors employed an hypomorph allele of Cup to generate egg chambers where both germline and somatic cells are mutant for Cup. They did a series of experiments to try to demonstrate that the Border Cell (BC) specification phenotype they observe is non-cell autonomous and that is due to the Loss of Function (LOF) of Cup exclusively in the nurse cells. Although I appreciate the difficulties of eliminating or reducing the levels of Cup specifically in the nurse cells only during mid-oogenesis, I feel like this is key to be able to claim that this effect of Cup in BC specification is really non-cell autonomous. The reasons why I still have some doubts that there might be some cell autonomous effects in the FCs are the following:

      o The authors show that cup01355 mutant egg chambers have a phenotype in Dl trafficking. Although they analysed in detail the effects on Dl in the nurse cells, their images show that there might be a defect in Dl levels/trafficking in the Follicle Cells (FCs) as well (Fig5A-B). It has been shown that Dl mut FCs have reduced levels of Notch activity due to reduced lateral inhibition (Poulton et al., 2011), so there is a possibility that the reduced levels of Notch activity in the cup01355 egg chambers might be due, partially, to defects in Dl trafficking/levels in the FCs, rather than in the nurse cells.

      o The authors tested the role of the Notch pathway in the cup mutant phenotypes by measuring the number of NICD puncta in the signal receiving cells as proxy for Notch activity (Fig4). Although I understand the rationale, I am not convinced that they can completely rule out that the changes in NICD puncta number in FCs are due to some effect of cup LOF on Notch trafficking in these cells.

      o In figure 6, the authors show that expression of a constitutively active form of Rab11 specifically in the nurse cells restores the BC number to that of the WT. However, the levels of Dl particles and, especially the levels of NRE-GFP expression, remains slightly lower than in the WT conditions.

      Response: We agree with the reviewer that Rab11CA only partially rescues NRE-GFP. This suggests that Cup can affect other aspects of egg chamber development independently of Rab11 function. This has been acknowledged in the main text, which now reads: “We did note that irrespective of partial rescue in the levels of NRE-GFP and Delta puncta count, a complete reversion to wild type border cell numbers was observed when Rab11CA was overexpressed in the cup mutant germline. This may suggest either that border cell fate specification is quite robust beyond a certain base level of signaling or Cup may affect other aspects of egg chamber development independent of Rab11 function.”

      One of the main conclusions of this study is that cup regulates BC specification through a non-cell autonomous mechanism that involves communication between nurse cells and AFCs. For that reason, I think in order to conclusively say that, the authors need to try to remove the function of cup specifically in the nurse cells. They mentioned they have tried different ways of doing this unsuccessfully, but do not specify how they have tried. I suggest using the cup-RNAi line combined with a nurse cell specific Gal4 and a ubiquitous gal80ts line (tub-Gal80ts), if they have not tried this. I do not expect the authors to repeat all the experiments with this condition, but at least they should test the main findings i.e. number of BCs, JAK/STAT overactivation and Notch attenuation.

      Response: To further support the non-autonomous role of Cup in border cell fate specification, we downregulated Cup function in germline nurse cells using Mat-alpha GAL4 and Cup RNAi. Because the Mat-alpha GAL4 driver is only weakly expressed in the nurse cells of early stage egg chambers, it enabled us to evaluate Cup function during mid-oogenesis. Consistent with our expectation, we observed a higher number of border cells in the migratory cluster compared to the control, supporting our conclusion that germline Cup modulates the number of adjacent anterior follicle cells that acquire migratory border cell fate. These results are included in Fig 1M-O. In addition, overexpression of Cup cDNA in the anterior follicle cells failed to rescue the excessive border cells observed in the cup mutant egg chambers, further supporting the germline role of Cup. This result is included in Fig S1L-O.

      • The authors have shown in Figure 3 that there is a decrease in Notch signalling in the AFCs in cup01355 egg chambers. In order to test that the BC number phenotype observed in this condition is due to that effect on Notch signalling they have done a rescue experiment using the antimorphic Notch allele Nax-16. Since in this condition all cells (nurse cells and FCs) have increased levels of Notch, they cannot conclusively say that the increase in Notch function in the FCs rescues the cup phenotype. If they want to show that the function of Notch is specifically needed in the FCs, they should over-activate Notch exclusively in the AFCs. For instance, they could express a constitutively active form of Notch, such as UAS-NICD (Go et al., 1998) or UAS-NDECD (Fortini et al., 1993), specifically in the AFCs. Otherwise, they should re-write the text since they cannot conclusively say that the increase in Notch function in the FCs rescues the cup phenotype.

      Response: Following the reviewer's suggestion, we overexpressed NICD in the follicle cells using the slbo-GAL4 driver in the cup mutant background. We observed rescue of the border cell fate in cup mutant egg chambers. However, we did not observe any rescue of the morphology of nurse cell nuclei in cup mutants. This supports our conclusion that an increase in Notch function in the FCs rescues the cup phenotype with respect to border cell fate only (Fig 3K-N).

      • The authors had made a great effort to prove that proper Delta endocytosis in the nurse cells is essential for adequate Notch signalling in the AFCs and right number of BCs recruitment. Specifically:

      o They checked the consequences on Dl trafficking of down-regulation of rab5 or auxilin, but they did not test the effect in BC numbers

      o They show that downregulating the function of shi affects the number of BCs, but did not show the effect of this condition in Dl trafficking.

      Consequently, they cannot conclusively say that effects on trafficking of Dl affect number of BCs, since they haven't really tested both effects on the same background. I think that for simplification, they should test both, effects on Dl trafficking and number of BCs in one of those genetic backgrounds and leave the other two for supplementary material. Alternatively, they should re-write their conclusion for this section.

      Response: As Rab11 GTPase overexpression rescued the excessive border cell fate in the cup mutants, we tested specificity by downregulating Rab11 function in the germline itself and examining Delta trafficking and border cell fate specification. We employed a late-expressing GAL4 driver in the germline and observed that downregulation of Rab11 function resulted in a larger number of follicle cells acquiring border cell fate and a decrease in the number of Delta puncta at the interface of anterior follicle cells and nurse cells. This phenotype is reminiscent of the cup mutants, suggesting that perturbing the recycling component of endocytosis per se affects border cell fate and Delta trafficking. This result is included in Fig 6D-I.

      • Their results clearly show that Dl accumulates in puncta, suggesting that there might be a defect in Dl trafficking, and although their rescue experiments point towards a scenario where Rab11-dependent Dl recycling is being affected, I think there are some weak points in their arguments. The fact that Rab11-KD does not generally affect Notch signalling in the FCs, as shown in (Windler & Bilder, 2010), argues against their conclusion that the effect of cup in nurse cells on Rab11 function is responsible for the defects in Dl trafficking and, subsequently, on Notch activity in AFCs. An alternative explanation is that Rab11 overactivation in the Cup mutant background compensates for a different defect in Dl trafficking, for example, in the Rab4-dependent recycling pathway. Another possibility is that AFCs could be especially sensitive to changes in Rab11-dependent Dl trafficking in the nurse cells. To distinguish between these two possibilities, they should perform some of the following experiments:
      o First of all, there are a number of endosome markers that can be used to check in which step of the endocytic route Dl is being accumulated, including (but not limited to) anti-Rab11 antibody, anti-Rab5, anti-Rab7, tub-Rab4-mcherry. They should do co-localization experiments with Dl and endosomal markers.
      o Also, they could check what happens to the number of BCs and Dl trafficking when Rab11 function is blocked in the nurse cells, in a similar way to what they did with Auxillin, Rab5 and Shi. They could use some of the tools described in (Satoh et al., 2005)

      Response: We perturbed Rab11 function during mid-oogenesis, which is considerably later than the early stage egg chambers examined by Windler & Bilder. We observed that downregulation of Rab11 activity in the germline affects both border cell fate in the AFCs and Delta trafficking in the germline itself. Protein trap analysis of Rab11 in wild type and cup mutant backgrounds suggests that Rab11 is enriched in the trans-Golgi network, where the activity of Rab11 is modulated through nucleotide exchange. Overall, our results suggest that Rab11 activity is diminished in the cup01355 egg chambers and that stimulating recycling endocytosis restores Notch signalling in the AFCs, limiting JAK-STAT activation and restricting BC fate specification.

      • The authors' final model is one in which cup in the nurse cells regulates Rab11 function to ultimately control JAK/STAT signalling in the AFCs. However, they have not looked at the status of JAK/STAT signalling in their Rab11-CA rescue experiments. I think this experiment will really round up their work.

      Response: The border cell fate is linked to activation of JAK-STAT signaling in the anterior follicle cells. As we have already exhausted the STAT antibody, it will be difficult to assess the levels of STAT per se.

      Minor points:

      • The authors tested if the extra BC phenotype observed in the cup mutant egg chambers is due to defects in FCs endoreplication. I have two questions related to this section.

      o First of all, I do not understand the rationale behind this idea that defects in FCs endoreplication would result in extra BCs. Please explain and add any relevant references.

      o Secondly, they say that they used Cut and Phospho-Histone3 as endoreplication markers. I believe that what they mean is that the absence of these two markers indicates that FCs have exited the cell cycle and entered the endocycle (Sun & Deng, 2005), however they are not markers of endoreplication. Please, re-write to make this clear.

      Response: Follicle cells switch from the mitotic cycle to the endocycle at a particular stage of oogenesis (Sun & Deng, 2005). Our premise was that, if this switch were delayed, the extra proliferation could account for the excessive border cell fate. We have modified the text to clarify this section.

      • The authors tested whether the levels of Notch activity were altered in the cup mutant egg chambers. For that, they used an NRE-GFP construct that shows a clear reduction in the levels of Notch activity in the AFCs. They also used the number of NICD and NECD puncta in signal receiving and sending cells respectively, as proxy of Notch activity. Although I understand the rationale, there are other explanations for this phenotype as discussed above. Thus, if they want to have an alternative way of showing the dampening of Notch signalling, they could use the levels of expression of well characterised targets of Notch in the FCs, such as hnt and E(spl)mb-CD2 or E(spl)m7.

      Response: We believe that our new data showing that NICD overexpression in the AFCs rescues border cell fate in cup mutants, together with the NRE-GFP, NICD and NECD data, now lend stronger support to our claim that Notch signaling in the follicle cells is indeed downstream of Cup function in developing egg chambers.

      • In M&M the authors explain that NRE-GFP levels were expressed in Fold change. However, in figure 3C the units of the graph are Fluorescence Intensity in a.u. Please, check this small inconsistency

      Response: We have modified this as per reviewer’s suggestion.

      • In figure 4, they show the quantification of tubulin fibres within the nurse cells, however they are missing a similar analysis of Phalloidin (Pha) fibres/levels. I think this experiment and figure will be more complete if the authors added such a quantification of the effects of cup LOF in Pha distribution. Also, the authors do not show the single Pha channel in Fig4C, which would greatly help to appreciate the differences between the WT and Cup LOF nurse cells. I suggest modifying the figure to better show the changes in Pha distribution.

      Response: We have modified the figure and included quantitation of actin fibre length in Supplementary figure 6H-K.

      • In figure 4F-G the authors are showing the general effect of cup LOF in Delta distribution. They indicate with yellow arrowheads the cytoplasmic Dl puncta accumulation in the nurse cells, however it is almost impossible to see such puncta with that level of magnification/resolution. I suggest removing the arrowheads, since the figure 4H-I shows the same puncta more clearly.

      Response: We have modified the figure to improve clarity.

      • In the Dl trafficking experiments (Fig4 H-I,K,L and Fig5A-C), the authors measured the number of puncta in the anterior nurse cell-follicle cell junction. In order to do those types of quantifications they need to be able to tell the cell boundaries that separate FCs from the nurse cells. Please, clarify the criteria for determining if the puncta are within the FCs or the underlying nurse cells.

      Response: Delta, NICD and NECD proteins mark the apical surface of the follicle cells. We used this as a reference to distinguish nurse cell puncta from follicle cell puncta. This has been elaborated in the Materials and Methods section.

      • In figure 6C-D the authors show example images of egg chambers expressing Rab11-CA-YFP using the germline specific nos-Gal4. However, in the images it looks like the YFP signal is coming from the surrounding stretched FCs. Please check that these are the right images or explain the inconsistency.

      Response: We have cross-checked the images. The YFP signal comes from the nurse cell periphery, which gives the misleading impression that it originates from the stretched follicle cells.

      • In figures 1R, 2L, 3Q, 6I, 6M, the authors should show the results of the statistical analysis between all the conditions tested. I think that this is crucial to be able to tell whether some of the rescues are complete or only partial.

      Response: To avoid overcrowding the figures, we have included some of the p values in the figure legends.

      • Line 174: should say "mutant egg chambers".
      • Line 281: There is a reference that is missing from reference list: Liu et al., 2010;
      • Line 292: The reference for the NRE-GFP construct is not the correct one, since that references to a review article. Please, add the correct reference.
      • In line 462 of the manuscript you have a reference that is missing from your reference list.
      • In line 394 the authors say: "protein, it's enrichment in the cytoplasmic fraction of the cup mutant egg chambers", but I think that they meant mutant nurse cells.

      Response: We have modified the text as per all the suggestions above.

      Reviewer #2 (Significance (Required)):

      The BC migration is an excellent model to study collective cell migration and how epithelial cells can acquire migratory behaviours. After years of study, there is good understanding of the signals and genetic circuits that regulate BCs specification and migration (Montell et al., 2012), but there are not many studies, to my knowledge, that describe a role of nurse cells in specifying or guiding the migration of these cells. Thus, this study by Saha and colleagues is one of the first studies that show a role for nurse cells in specifying the number of BCs.

      My field of expertise is in cell-cell communication through different pathways, including Notch and Integrin signalling. I have studied the role of endocytosis in regulating Notch signalling in various contexts, including follicular epithelium in Drosophila ovaries.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes an investigation into the signaling that induces the differentiation of follicle cells into border cells in the Drosophila ovary. Previous studies have established the border cells as an informative model for studying how epithelial cells delaminate and undergo collective cell migration, and have identified the JAK-STAT and Notch pathways as important regulators of the process. Here, the authors performed a forward genetic screen and identified cup as another gene that is involved in the regulation of border cell differentiation. Their findings are consistent with a model in which cup is required in germ cells for the endocytosis of the Notch ligand, Delta. In cup mutants, impaired trafficking of Delta leads to decreased Notch signaling in follicle cells, which allows for increased JAK-STAT expression in follicle cells and an increase in the number of follicle cells that differentiate into border cells. Overall, the approach is thorough and the phenotypes are clear and well-described. The quantification of phenotype penetrance and of aspects of the images, such as pixel intensities and the number of particles in a region is a strength of the paper. The use of multiple independent methods to test key points is another strength. However, there are several concerns that should be addressed before the paper is considered for publication:

        1. The central phenotype that this paper is based on is a difference in the number of border cells per cluster in wildtype and mutant genotypes. However, this phenotype is fairly subtle in some cases (e.g. in Fig. 2L, it varies by only about 10% between control and mutant) and it is somewhat variable. For example, the number of cells in border cell clusters of the controls ranges from 4.49 in Fig. 3M to 6.41 in Fig. 1F. Considering that the mutant values fall within this range in some cases (e.g. 5.98 in Fig 3M) and the difference between the means from control and mutant genotypes is often less than two, the significance of this phenotype is unclear. How does this compare to other mutants that have been described to affect border cell specification? Are there any consequences for the differentiation of the follicle or the function of the egg caused by this defect?

      Response: We use the border cell number as a readout for the output of JAK-STAT signaling. Although the difference in numbers may appear subtle, we believe our data clearly demonstrate that Cup non-cell-autonomously regulates border cell fate by modulating Notch signaling in the follicle cells.

      • Wang, et al. (PMID 17010965) have described previously that Notch signaling, and Kuzbanian specifically, is required for border cell migration. The authors should cite this paper and discuss their findings in light of this study. For example, if Notch signaling is impaired in cup mutants, is border cell migration also impaired? Likewise, the citation of the Assa-Kunik, 2007 study as evidence that Notch and JAK-STAT signaling act antagonistically (Line 286) is a bit of an oversimplification. While that study does show that Notch and JAK-STAT act antagonistically at earlier stages of follicle development, Fig. 6 of that paper shows that a Notch reporter and a JAK-STAT reporter are both expressed concomitantly in border cells of a Stage 10 follicle and in the anterior follicle cells of what looks like a Stage ~8 follicle. The authors should discuss the apparent contradiction between their findings and this study.

      Response: We provide genetic evidence to support our claims that Cup in the germline modulates Notch activation in the anterior follicle cells thus limiting border cell fate specification to a few. The overlap in the expression of Notch reporter m7-lacz and STAT in the follicle cells and border cells is interesting and will need further investigation in real time to decipher any comparison between the two studies.

      • Lastly, the manuscript contains many grammatical errors, incomplete sentences, improper punctuation and spacing, and informal writing, such as the use of contractions. It should be thoroughly edited for content and clarity.

      Response: We have edited the manuscript to improve the language, grammar and punctuation.

      Reviewer #3 (Significance (Required)):

      Although the identification of cup as a contributor to the regulation of border cell differentiation is novel, the other main regulators investigated in this study, including Notch and JAK-STAT signaling, have been identified previously. The role of cup in this context seems to be to fine tune Notch signaling and it seems to play a relatively minor role in the process of border cell specification. In addition, the conclusions of this paper are not well-integrated into the existing literature on Notch and JAK-STAT signaling in border cells, and the discussion about the broader implications of this study for the understanding of Notch signaling was not well-developed. However, the careful documentation and quantification of the phenotypes reported in this study adds rigor and allows for firm conclusions. For these reasons, this study may have a lasting but perhaps somewhat incremental impact on the study of border cell migration in the Drosophila ovary.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, the authors establish a glyco-profiling platform for the functional analysis of genes involved in pseudaminic (Pse) and legionaminic (Leg) acid biosynthetic pathways. They used B. subvibrioides and C. crescentus mutants in the pseI and legI genes, which are involved in Pse and Leg biosynthesis, respectively, together with cross-complementation assays with orthologous genes from different bacterial species, analysing motility and flagellin glycosylation. These assays show that the Pse and Leg biosynthetic pathways are genetically distinct and identify the LegX enzyme as a critical element in Leg-specific enzymatic biosynthesis. Because legX orthologs were identified only in the genomes of bacteria with Leg biosynthetic pathways, legX becomes a good marker to distinguish Leg from Pse biosynthesis pathways and a novel bioinformatic criterion for the assignment and discrimination of these two pathways. Reconstitution of the Leg biosynthetic pathway of B. subvibrioides in a C. crescentus mutant that lacks flagellins, PseI and FlmG, complemented with both flagellin and FlmG of B. subvibrioides, identified a new class of FlmG protein glycosyltransferases that modify flagellin with legionaminic acid. Furthermore, the construction of a chimeric FlmG through domain substitutions allowed the authors to reprogram a Pse-dependent FlmG into a Leg-dependent enzyme and revealed two modular determinants that govern flagellin glycosyltransferase specificity: a glycosyltransferase domain that accepts either Leg or Pse, and a specialized flagellin-binding domain to identify the substrate.

      Major comments:

      The conclusions obtained are convincing and well-supported. However, I think some points should be specified or clarified.

      1.- In the mutants (pseI, legI, flmG,...) the non-glycosylated flagellins are exported and assembled into a flagellar filament that is shorter than in the WT strain. However, motility on plates is absent or very reduced. This might be caused by instability of the flagellar filament when rotating in a semi-solid surface. Was TEM performed on cells from plates or from liquid cultures? Did the authors analyse motility in liquid media? If so, were changes in motility observed?

      Response: The Caulobacter ΔpseI mutant accumulates low levels of flagellin in the supernatant. TEM analysis reveals that the flagellar filament is not assembled and only the hook structure is visible (PMID: 33108275). Brevundimonas subvibrioides ΔlegI or ΔflmG cells feature a shorter filament compared to WT by TEM. In all these analyses, TEM was performed on cells grown in broth to exponential growth phase as detailed in the Experimental procedures section. These mutant cells do not swim when analyzed by phase contrast microscopy. While it is not known whether swimming on semi-solid medium would further destabilize the flagellar structures seen in liquid cultures by TEM, there is more residual motility in B. subvibrioides mutants that make a short filament compared to C. crescentus mutants that lack the flagellar filament. Thus, our analyses point to a positive correlation between the residual motility and residual filament length when comparing the B. subvibrioides and C. crescentus mutants.

      2.- In page 5, lines 158-163, the HPLC analysis of derivatized nonulosonic acid from B. subvibrioides flagella shows a major peak at 9.8 minutes retention and a minor peak at 15.3 minutes. Since the Pse standard has retention peaks at 9.7 and 13 minutes, and the Leg standard at 12.3 minutes, the authors cannot infer from these data alone that the flagellar sugar is a legionaminic acid derivative. In my opinion, it should be stated that this inference comes from the data obtained by HPLC analysis together with the genetic approaches.

      Thanks. Corrected.

      3.- In page 5, lines 173-175, the authors indicate, "While no difference in the abundance of flagellin was observed in extracts from mutant versus WT cells, flagellin was barely detectable in the supernatants of mutant cultures, suggesting flagellar filament formation is defective in these mutants". TEM images show that the flagellar filament is shorter in the mutants than in the WT strain. Therefore, if the same number of mutant and WT cells was used in the immunodetection assays, there should be more flagellin monomers in the WT samples than in the mutant ones, and the flagellin bands corresponding to the anchored flagellum should be less intense in mutant samples. Why do the bands corresponding to flagellin in mutants and WT show similar intensity in the immunodetection assays (Figure 3C and D)? Furthermore, in lines 177-178, the authors suggest that LegI and FlmG govern flagellin glycosylation and export (or stability after export). However, if filament stability is affected, the amount of flagellin monomers in the supernatant of mutants should be higher than in the WT. Instead, immunodetection assays show a lower abundance of flagellin monomers in the supernatant of mutants. Please, can you clarify this? In relation to this point, I suggest that the authors include, in the experimental procedures, how they obtained the supernatants for flagellin immunodetection, as well as why they used the anti-FljKCc anti-serum to detect the B. subvibrioides flagellin.

      We thank the reviewer for raising this point. We have now clarified this question in the updated Experimental procedures section. Our immunoblots harbor the same number of cells harvested in exponential phase (OD=0.4). One mL of cells was harvested from cultures by centrifugation at full speed. The supernatant that was used for the immunodetection corresponds to the supernatant after the centrifugation. The supernatant fraction contains flagella that have been shed during the cell cycle at the swarmer cell to stalked cell (G1-S) transition of C. crescentus and B. subvibrioides.

      Thus, it is clear that the majority of flagellins detected by immunoblotting are in fact cell associated and specifically the intracellular flagellins. The evidence for this is that the levels are comparable between WT and ΔflmG mutant cells, even though the latter has shorter or no flagellar filaments. Moreover, while C. crescentus cells are not constantly flagellated during the cell cycle, flagellins are detectable on cell-associated samples by immunoblotting even when cells do not yet or no longer have a flagellar filament. Based on these two points, we conclude that the total flagellin levels associated with cells do not reflect the levels of flagellin assembled into a flagellar filament, but rather the flagellin bulk present in the cytoplasm.

      Consistent with this view, we previously reported that C. crescentus ΔpseI cells have the same amount of flagellins in cell lysates compared to the WT strain (PMID: 33108275), even though the mutant cells lack a flagellar filament. Thus, the results obtained here are consistent with previous observations and indicate that B. subvibrioides flagellin glycosylation mutants also still produce comparable amounts of flagellins intracellularly like the WT strain, despite the absence of flagellin glycosylation and inefficient assembly into a flagellar filament.

      Concerning the potential role of LegI and FlmG in flagellin stability after export, we were referring to protein stability (half-life), not filament stability. Glycosylation may impact the half-life of extracellular flagellins since glycosylation can protect from proteolytic degradation of proteins, possibly in this case by different proteases that may accumulate in the supernatant. Thus, non-glycosylated flagellins could be more easily degraded by extracellular proteases once they are exported, ultimately resulting in a lower amount in the supernatant.

      Addressing the final question about the specificity of the anti-FljKCc antiserum: we used this anti-serum because it detects the B. subvibrioides flagellins owing to the high sequence similarity between B. subvibrioides flagellins and C. crescentus flagellins. We previously showed that the anti-FljKCc anti-serum detects all six flagellins from C. crescentus, as determined by individually expressing each flagellin in a strain deleted for all six flagellin genes (Δfljx6) (PMID: 33108275). FljKCc (against which the antibody was raised) is 65% similar to the most distant C. crescentus flagellin, FljJ. As the sequence similarity of FljKCc to the three B. subvibrioides flagellins ranges from 67% to 74%, they should be even better recognized by the anti-FljKCc antibody than C. crescentus FljJ. On immunoblots, however, we cannot attribute the signal to any individual B. subvibrioides flagellin, as they could all co-migrate on SDS-PAGE and therefore all flagellins might reside in the same immunoblot band. Nevertheless, we can clearly demonstrate that the immunoblot band corresponds to flagellins: a B. subvibrioides ΔflaF mutant (see below) that we constructed revealed that the flagellin signal is lost, as is the case for a C. crescentus ΔflaF mutant (PMID: 33113346). In the case of C. crescentus, the FlaF secretion chaperone is required for flagellin translation (synthesis) and we suspect that this is also the case for B. subvibrioides FlaF. This experiment provides additional evidence that the B. subvibrioides flagellins are recognized by the anti-FljK (C. crescentus) anti-serum.

      4.- The authors demonstrate the specificity of the GT-B domain of FlmG, using a chimeric FlmGCc-Bs in a mutant of C. crescentus that lacks FlmG and harbours the Leg biosynthetic pathway of B. subvibrioides. However, since the TPR comes from C. crescentus, could this chimeric protein transfer legionaminic acid to the flagellin of B. subvibrioides? Furthermore, the complementation of this mutant with FlmGBs did not support efficient flagellin modification, and this might be related to the TPRCc domain. Therefore, in my opinion, the chimeric protein should be introduced into the B. subvibrioides ∆flmG background.

      The answer to the first question is “No” or “very inefficiently” as determined from immunoblot analyses of B. subvibrioides ΔflmG cells expressing the chimeric FlmG_Cc-Bs protein that we now show in Fig S2B.

      Expression of the different FlmG (FlmG_Cc, FlmG_Bs, FlmG_Cc-Bs) in C. crescentus cells producing Pse or Leg revealed that FlmG_Bs does not support efficient flagellin modification with Pse in C. crescentus, likely because FlmG_Bs interacts poorly with the C. crescentus flagellins. By using the FlmG_Cc-Bs chimera we hoped to overcome this interaction problem with the C. crescentus flagellins (because the FlmG chimera harbors the C. crescentus TPR to bind the C. crescentus flagellins), however glycosyltransfer still does not occur efficiently because the GT domain from FlmG_Bs does not function with Pse. However, FlmG_Cc-Bs can modify the C. crescentus flagellins once C. crescentus is genetically modified to produce CMP-Leg (instead of CMP-Pse). This confirms that the FlmG TPR from C. crescentus is important for flagellin modification through the FlmG/flagellin interaction and that GT_B type glycosyltransferase only transfers Leg. In addition, we have now added as Fig S2B an immunoblot and as Fig S2C a motility test of B. subvibrioides ΔflmG cells expressing the FlmG_Cc-Bs chimeric protein in which we only observed little modification of B. subvibrioides flagellins and a poor motility, respectively. We extended our discussion of these results.

      5.- Page 8, line 299-301. Authors point out that C. crescentus that lacks FlmG and harbour the Leg biosynthetic pathway of B. subvibroides and the chimeric FlmGCc-Bs, although it has a glycosylated flagellin, whose mobility in SDS-PAGE is like the WT strain, is non-motile. They suggest that additional factors exist in the flagellation pathway that exhibit specificity towards the glycosyl group that is joined to flagellins. However, would be interesting to see if the flagellum filament has similar length to the WT strain or at least, it has increased in relation to the flagella length of the mutant. If flagella length has not increased, it could suggest that changes in the glycan type might affects the flagellin assembly or the stability of the flagellum filament. Therefore, would be also important to analyse its motility in liquid media.

      To investigate why the C. crescentus cells that produce Leg and express the chimeric FlmGCc-Bs glycosyltransferase are non-motile (Figure S5B) despite flagellin modification (by immunoblotting, Figure 7C), we employed two strategies. First, we performed immunoblot analyses on the supernatant fraction from these cells to determine if flagellins accumulate extracellularly. As now shown in Figure S5A, only low amounts of C. crescentus flagellins modified by Leg are present in the SN fraction. Second, we conducted TEM analyses of cells grown to exponential growth phase in broth. As shown in Figure S5C, the C. crescentus cells producing Leg and expressing the FlmG_Cc-Bs glycosyltransferase harbor a shorter flagellum compared to those expressing FlmG_Cc, in which C. crescentus flagellins are modified by Pse. Altogether, these results explain why these cells are non-motile both on soft agar plates and in liquid.

      Minor comments:

      1.- Page 3, line 102. Please change ".....two predicted synthases, a PseI and LegI homolog, and C. crescentus only encodes only PseI...." to ".....two predicted synthases, a PseI and LegI homolog, and C. crescentus only encodes a PseI...."

      2.- Figure 2 A. Plasmid nomenclature (Plac-neuB) is confusing because C.c. ΔpseI cells express predicted LegI or PseI synthases. Please change to Plac, as in Figure 2B and 4. Figure 2A and 2B do not contain any complementation with Bacillus subtilis (Basu), however two complementations are labelled as Bs in Figure 2A and 2B. Furthermore, no Bs are present in the Figure 2 legend.

      3.- Legend of figure 3 should include the B. subvibrioides abbreviation Bs. Line 774: Please change ".......glycosylation and secretion in B. subvibrioides." to ".......glycosylation and secretion in B. subvibrioides (Bs)."

      4.- Figure 3. In order to keep a similar nomenclature in all plasmids, plasmids Plac-legI syn and Plac-flmG should be labelled as Plac-legIBs syn and Plac-flmGBs.

      5.- Legend of figure 4 should include the B. subvibrioides abbreviation Bs. Line 791: Please change "....... complementation of the B.subvibrioides ΔlegI mutant with ...." to "....... complementation of the B.subvibrioides (Bs)ΔlegI mutant with ...." Furthermore, the legend of figure 4 indicates in line 795 that immunoblots reveal the intracellular levels of flagellin, however figures 2 and 3 show immunoblots of cell extracts. Please correct this sentence.

      6.- Legends of figures 5, 6 and 7 should include the B. subvibrioides abbreviation Bs. Line 808: Please change "Predicted Leg biosynthetic pathway in B. subvibrioides " to "Predicted Leg biosynthetic pathway in B. subvibrioides (Bs)" Line 834: Please change "....affects motility, flagellin glycosylation and secretion in B. subvibrioides." to "....affects motility, flagellin glycosylation and secretion in B. subvibrioides (Bs)." Line 852: Please change "...acetyltransferase in flagellar motility of B. subvibrioides cells." to "...acetyltransferase in flagellar motility of B. subvibrioides (Bs) cells." Furthermore, figure 5 should include the C. crescentus abbreviation. Line 815: Please change "....whole cell lysates from C. crescentus mutant cultures......." to "....whole cell lysates from C. crescentus (Cc) mutant cultures......."

      7.- In my opinion it would be useful to include a scheme of the organization of the genes involved in Leg biosynthesis in B. subvibrioides.

      8.- Legend of figure S1 should include B. subvibrioides (Bs) and C. crescentus (Cc) abbreviations. Line 888-867: Please change "...C. crescentus ΔpseI cells and B. subvibrioides ΔlegI cells with plasmids expressing..." to "...C. crescentus (Cc) ΔpseI cells and B. subvibrioides (Bs) ΔlegI cells with plasmids expressing..." Furthermore, the name and abbreviations (Mm, So, Ku, Pi, Dv) of the species used should be included in the legend. Why the authors used a plasmid with a Pvan promoter in these assays? Why the authors changed the code color of pseI and legI orthologous genes? It would be more useful and understandable follow the code color used in figure 2 and 4.

      9.- Page 6 line 200, Please change ".....complementing synthases exhibit greater overall sequence similarity to LegI than Pse of C. jejuni. 22268,....." to ".....complementing synthases exhibit greater overall sequence similarity to LegI than PseI of C. jejuni. 22268,....."

      10.- Page 7 line 231, Please change ".....negative bacteria A. baumannii LAC-4 (GCA_000786735.1)[38] and P. sp. Irchel 3E13..." to ".....negative bacteria A. baumannii LAC-4 (GCA_000786735.1)[38] and Pseudomonas sp. Irchel 3E13..."

      11.- Introduce a line break between line 503 and 504.

      12.- Page 14 line 543, please change "XbaI" to "XbaI"

      Thanks for the careful editing. We changed the text as suggested by the reviewer. We also added a scheme showing the organization of the genes involved in Leg production, presented as Figure 1B. When this study was initiated, the pMT335 plasmid with a Pvan promoter was used before we switched to using the pSRK plasmid with the Plac promoter for better induction. Note that the results with Pvan or Plac are comparable regarding the interchangeability of the PseI synthases. The color code is now homogeneous throughout the manuscript.

      Reviewer #1 (Significance (Required)):

      This is an interesting manuscript that contributes to the knowledge of the legionaminic biosynthetic pathway and establishes a glyco-profiling platform for the functional analysis of genes involved in pseudaminic (Pse) and legionaminic (Leg) acid biosynthetic pathways. The analysis of the Leg pathway identified a gene (legX) that can be used to distinguish Leg from Pse biosynthesis pathways, providing a bioinformatic tool for the assignment and discrimination of these two pathways. Furthermore, a new class of FlmG protein glycosyltransferases, able to transfer Leg to the flagellin, has been identified, and its analysis reveals two modular determinants that govern flagellin glycosyltransferase specificity: a glycosyltransferase domain that accepts either Leg or Pse, and a specialized flagellin-binding domain to identify the substrate.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Viollier and co-workers present a study in which they perform an elegant and rigorous genetic profiling of the legionaminic and pseudaminic acid biosynthesis and flagellar glycosylation pathways in C. crescentus (native Pse) and B. subvibrioides (native Leg). They use motility as a representative readout for functional flagellar glycosylation with these microbial sialic acids. They discover that orthologous Pse synthase genes can replace the function of the native synthase in C. crescentus and that orthologous legionaminic acid synthase genes can achieve the same in B. subvibrioides, but not vice versa, indicating a strong preference for each microbial sialic acid stereoisomer in these species. For the Leg biosynthesis pathway, which requires GDP-GlcNAc, the authors also identify LegX as an essential component to synthesize this sugar nucleotide and thus a marker for Leg biosynthesis pathways. Upstream in these pathways, they also identify a new class of FlmG flagellar protein glycosyltransferases. Importantly, through heterologous reconstitution experiments they uncovered that these glycosyltransferases possess two distinct domains, a transferase domain that determines specificity for either CMP-Leg or CMP-Pse, and a flagellin-binding domain to achieve selectivity for the substrate. Interestingly, by creating chimeric FlmG for these two domains between C. crescentus and B. subvibrioides they show that these two modular parts can be interchanged to adapt flagellin glycosyltransferase specificity in these species.

      Major comments: The key conclusions of the manuscript by Viollier and co-workers are convincing and well supported by their experiments and methods, with respect to the insulation of the Leg and Pse biosynthetic pathways, the key role of LegX in launching the Leg pathway and the successful reconstitution of Leg glycosylation in a previously Pse-producing C. crescentus strain. Finally, they convincingly show that a chimeric version of the involved glycosyltransferases is functional, which besides intriguing future glycoengineering possibilities also emphasizes the two discrete domains in these transferases that dictate their sugar nucleotide and acceptor specificity. There is one additional experiment I would suggest in relation to the detection and confirmation of Pse and Leg on flagella of, respectively, C. crescentus and B. subvibrioides. In the case of C. crescentus the detected DMB derivatized monosaccharide co-elutes with a validated standard of tri-acetylated Pse, which is convincing evidence of its identity. However, for B. subvibrioides, the DMB derivatized monosaccharide from its flagella results in a peak that does not co-elute with the only Leg standard (Leg5Ac7Ac) they have; it does elute at the same time as their Pse standard. Although it cannot of course be Pse, as B. subvibrioides does not possess a Pse biosynthesis pathway, this also does not provide enough evidence to conclude that it is a Leg derivative. An MS(-MS) measurement of the eluted signal would not be a big investment in time and resources and would provide additional evidence to at least assign this peak to a microbial sialic acid related to the present Leg biosynthesis pathway. If the identified mass leads to identification of the derivative, it would also add to the proper characterization of the flagella glycosylation in the bacterium.

      We have now added the glycopeptide analyses as requested. They are described in the last experimental section and confirm our results.

      The data and the methods in this study are presented with sufficient detail so that they can be reproduced. However, I would suggest, as is common nowadays in most journals, that the authors include images of the raw unprocessed blots in the supporting info.

      The motility pictures are representative of three independent experiments and the immunoblots are representative of at least two independent experiments. This has now been mentioned in the Experimental procedures. The raw unprocessed blots have now been added as supporting info.

      Minor comments: There are a few textual errors that the authors should fix: -page 2, line 70: change "used" to "use" -page 11, line 407: add the word "are" after Pse. On page 2, line 36, the authors state that "most eubacteria and the archaea typically decorate their cell surface structures with (5-, 7-)diacetamido derivatives, either pseudaminic acid (Pse) and/or its stereoisomer legionaminic acid (Leg,". This should be nuanced, as to my knowledge it is not most eubacteria but rather a subset, as identified by Varki in his seminal PNAS paper. The authors clearly present their data and conclusions in the figures of this manuscript. However, I would recommend they take a critical look at the drawing of their monosaccharide chair conformations and the positioning of the axial and equatorial groups on these chairs in Figures 1 and 5, as these are in most cases drawn a bit crooked, which can easily be corrected.

      We corrected the text as the reviewer suggested. We changed the sentence in the introduction to be more nuanced. The drawing of the monosaccharides has been improved.

      Reviewer #2 (Significance (Required)):

      The family of carbohydrates called sialic acids was long thought to occur exclusively in glycoproteins and glycolipids of vertebrates, but has since also been found in specific microbes. Symbiotic and pathogenic microbes associated with humans in particular express a wide array of unique microbial sialic acids whose functional roles are not well understood and for which the associated glycosylhydrolases and glycosyltransferases have in most cases not been identified yet. The authors present an impressive insight into flagellar glycosylation with pseudaminic and legionaminic acid in two bacterial species, using genomic analysis, rewiring, immunoblots and motility assays as their main tools. They provide compelling evidence on the insulation of the Pse or Leg pathway in these species, and on the flexibility in exchanging biosynthetic enzymes of the same pathway between various species. Crucially, most glycosyltransferases that add the Pse or Leg glycoform onto various acceptor sites in bacteria have up to this point remained elusive. The information that the authors provide here on the involved glycosyltransferases is therefore very valuable, especially on the two domains that govern their sugar nucleotide and acceptor specificity, and on the finding that these can be reengineered as chimeric glycosyltransferases. To me as a chemical glycobiologist, this provides compelling possibilities for glycoengineering in future studies in the field to elucidate the functional roles of Pse and Leg glycosylation.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary of the findings and key conclusions (including methodology and model system(s) where appropriate): Kint et al describe a neat study of bacterial flagellin glycosylation by a recently identified class of protein glycosyltransferases called FlmG. The experiments are well designed, the data presented are convincing and the conclusions drawn are mostly in line with the experimental evidence presented. These are the key findings. Kint et al show that genetic tools and motility can be used as a readout to probe the sugar biosynthesis pathway in bacteria. Using the recently characterized system of Caulobacter crescentus, they have performed a survey of different PseI/LegI/NeuB genes from various bacteria, checking whether they could rescue the motility defect in C. crescentus ΔpseI cells. They found that those genes that did confer motility also had higher sequence similarity to C. jejuni PseI than to C. jejuni LegI or C. jejuni NeuB. They also found that these genes restored flagellin glycosylation as checked by mobility shift on gel electrophoresis with immunoblotting to anti-FljK antibody. This survey brought up an interesting finding that the PseI/LegI/NeuB orthologs of the closely related Brevundimonas species were unable to confer motility to C. crescentus ΔpseI cells, and were more similar to C. jejuni LegI than to C. jejuni PseI or C. jejuni NeuB. They also performed similar glycoprofiling experiments using B. subvibrioides ΔlegIBs cells and various PseI/LegI/NeuB orthologs from different bacteria, which indicated the restoration of motility by putative LegI synthases. Kint et al demonstrate flagellin glycosylation in B. subvibrioides by performing in-frame deletions of the FlmG and LegI genes in B. subvibrioides and checking for motility, presence of flagella, and flagellin glycosylation by mobility shift on gel electrophoresis. Further, they confirm the critical nature of GDP-GlcNAc for Leg biosynthesis by assessing flagellin glycosylation and motility in B. subvibrioides with an in-frame deletion in PtmE/LegX and by performing heterologous complementation with an M. humiferra PtmE ortholog. They also reconstitute the legionaminic acid biosynthesis pathway in C. crescentus cells that lack flagellins, PseI and FlmG, and show that the heterologously expressed B. subvibrioides flagellin is glycosylated by heterologously expressed B. subvibrioides FlmG. Finally, they also show that whereas the CcFlmG cannot substitute for BsFlmG and vice versa, a chimeric FlmG bearing the TPR domain from C. crescentus FlmG (that recognizes C. crescentus FljK) and the GT domain from B. subvibrioides FlmG (that transfers CMP-Leg) modifies CcFljK in C. crescentus cells that lack CcFlmG but express both Pse (endogenously) and Leg (from the reconstituted pathway). This demonstrates the modularity of the FlmG glycosyltransferases. Kint et al provide the chemical nature of C. crescentus flagellin glycosylation. Kint et al have analyzed the glycans released from the flagellin by acid hydrolysis and clearly shown the nature of the glycan in C. crescentus flagellin to be Pse4Ac5Ac7Ac by use of Pse standards. The glycan from B. subvibrioides was distinct from the Leg standard used, and could be a Leg derivative distinct from Leg5Ac7Ac.

      Major comments: 1. Table 1 and Text in Results, lines 116-119, "In support of the notion that derivatization occurs after the PEP-dependent condensation reaction to form Pse or Leg, our glyco profiling analysis revealed that putative PseI proteins (identified by sequence comparisons to C. jejuni 11168, Table S1) conferred motility to C. crescentus ΔpseI cells, whereas putative LegI synthases did not." Not clear how putative PseI and LegI synthases were identified. Table 1 only lists overall percent sequence identity and similarity to Cj PseI, LegI and NeuB, and percent identities and similarities of the various nonulosonic synthases to these proteins are in the similar range, as expected. In the absence of sequence alignments indicating the presence of conserved residues, particularly related to the substrate binding region, that are distinct in these paralogs, calling out the type of synthase based on the highest percent identity (to Cj PseI, LegI or NeuB) is speculative. Also, Shewanella oneidensis does not follow the pattern of highest similarity to NeuB3. Second, in the absence of data showing that the Leg and Pse found in these different organisms actually are different derivatives, this does not support that "derivatization occurs after the PEP-dependent condensation reaction to form Pse or Leg".

      Putative PseI and LegI were proposed based on BlastP analyses in which the protein sequences of interest were aligned to the three experimentally validated synthases from C. jejuni 11168: PseI, LegI, NeuB as well as PseI from C. crescentus, as indicated in Table S1. While the assignment of the donor sugar is based only on the sequence identity and similarity to LegI or PseI, this assignment corresponds well with the restoration of motility of the C. crescentus ΔpseI mutant upon expression of a PseI ortholog, and of the B. subvibrioides ΔlegI mutant upon heterologous LegI expression.

      It is true that for Shewanella oneidensis the assignment as PseI or LegI is ambiguous, exhibiting nearly identical similarity, but it is quite distinct from NeuB. This actually makes the S. oneidensis synthase a very interesting case to explore the enzymology of its Pse/LegI ortholog, knowing that it has been previously shown that this bacterium glycosylates its flagellins with Pse derivatives (PMID: 24039942). The results from our genetic complementation analysis are however very clear (PseI ortholog) and very consistent with the functional analysis in S. oneidensis.

      Concerning the different derivatives of Pse or Leg: McDonald and Boyd (PMID 32950378) recently published a review giving some examples of Bacteria/Archaea experimentally shown to contain Pse/Leg-derivatives: C. jejuni 11168 modifies its flagellin with 5,7-N-acetyl Pse, Sinorhizobium fredii NGR234 (not used in this study but in our previous work, PMID 33113346, where it was shown to restore the motility of C. crescentus ΔpseI cells) modifies its capsule with 5-acetamido-7-3-hydroxybutyramido-Pse, Treponema denticola modifies its flagellin with 7-(2-methoxy-4,5,6-trihydroxy-hexanoyl)-Pse, A. baumannii LAC-4 produces 5,7-N-acetyl-8-epi-Leg to decorate the capsule, Halorubrum sp. PV6 modifies the LPS with N-formylated Leg and L. pneumophila produces 5-acetamidino-Leg.

      The reviewer is right that we do not know the exact version of Pse or Leg produced in C. crescentus and B. subvibrioides. However, complementation works with the majority of the PseI and LegI orthologs, including many from bacteria that are known to produce modified Pse derivatives, for example Shewanella oneidensis and Treponema denticola. The most likely explanation is therefore that derivatization occurs after the PseI or LegI step, but we concede that the results are also compatible with a promiscuous enzyme that can accept different Pse derivatives or different Leg derivatives.

      1. Related to (1), Text in Results, lines 130-131, "We conclude from our survey that (heterologous) PseI synthase activity generally confers motility to C. crescentus ΔpseI cells, whereas LegI-type (or NeuB-type) synthases are unable to do so." There is no a priori evidence provided indicating that these were PseI or LegI type synthases. So the conclusion really is that assuming only PseI type synthases would be able to rescue the motility defect in C. crescentus ΔpseI cells, this glyco-profiling motility assay now provides the first biochemical evidence telling us which synthases are Pse-type, and which are Neu/Leg-type. And in my view, this is the conclusion of greater significance in the field - to be able to now identify which is a PseI and which a LegI based on these complementation assays. However, if the authors still wish to retain their original conclusion, they could cite or provide evidence (either biochemical evidence in this work or reported literature regarding the sugar synthesized or bioinformatics analysis regarding the presence of distinct genes such as the Ptm genes for legionaminic acid biosynthesis pathway or genes that differ in their enzyme activities and overall fold such as PseB/LegB or PseG/LegG in the gene neighborhood) indicating or suggesting the PseI/LegI/NeuB nature of the different synthases. Also, methods for the bioinformatics analysis (e.g. BLASTp settings used, dates of searches, whether regular BLAST or PSI-BLAST was used, etc.) are missing in the manuscript, and need to be included.

      We agree that for many of the PseI or LegI tested, no biochemical evidence is provided. However, this is not the case for some of them, including the PseI, LegI and NeuB from Campylobacter jejuni (PMID 19282391), some A. baumannii strains (α-epi-legionaminic acid for A. baumannii LAC-4 PMID 24690675), Shewanella oneidensis (Pseudaminic acid with methylation PMID 23543712), Legionella pneumophila (Legionaminic acid PMID 18275154) or Halorubrum sp. PV6 (N-formylated legionaminic acid PMID 30245679). Thus, we maintain the two conclusions: the PseI and LegI synthases are generally interchangeable and the complementation assays make it possible to identify and assign PseI and LegI function. BLAST2P was used to compare the protein sequences of the tested NeuB-like synthases with NeuB1, LegI (NeuB2) and PseI (NeuB3) from Campylobacter jejuni but also with PseI from C. crescentus. The BLOSUM62 matrix was used, with a word size of 3 for the comparison. We have now added this procedure to the legend of Table S1.
      2. It is interesting that there is still a significant amount of flagellin secretion/assembly in the B. subvibrioides LegI and FlmG mutants. It will be good to see a discussion about whether this is likely due to a low level of function despite the in-frame deletion of genes; how many flagellin subunits are likely to have managed secretion and assembly in these short flagella; whether there is any redundancy of LegI / FlmG (perhaps with lower levels of expression); and, considering Parker and Shaw's findings of glycosylation being required for flagellin binding to the chaperone and subsequent secretion in A. caviae, whether there is a FlaJ homolog in B. subvibrioides. Also, can the authors rule out the possibility that absence of glycosylation does not affect flagellin assembly but makes the flagellum prone to shear/breaks in B. subvibrioides, resulting in smaller flagella? How many flagellins are there in B. subvibrioides? Is it possible that one is glycosylated but another/others are not, and that is the reason for the small flagellum in these mutants?

      The number of flagellin subunits that are assembled into a full-length flagellar filament is unknown in C. crescentus and in B. subvibrioides. There are 3 different flagellin genes that are now presented schematically in Figure 1C. No redundancy has been found for LegI or FlmG. It is possible that B. subvibrioides is better at exporting non-glycosylated flagellin or that the capping proteins can function better with sugar modification or that the filament of B. subvibrioides mutants is less fragile when it is non-glycosylated or that its flagellins “stick” better. It is also possible that the short filaments do not actually contain flagellins mounted on the hook but another protein that polymerizes aberrantly in the absence of Leg or FlmG. This remains to be investigated and compared to the situation of Pse and FlmG mutants of C. crescentus.

      B. subvibrioides possesses an ortholog of the C. crescentus flagellin secretion chaperone FlaF (PMID 33113346). As observed in C. crescentus, FlaF likely has a role in flagellin translation as its inactivation totally prevents flagellin production (see answer to reviewer #1). For C. crescentus, bacterial two-hybrid experiments revealed that FlaF can interact with non-glycosylated flagellins in E. coli. Thus, it is quite possible that the FlaF/flagellin interaction does not depend on the flagellin glycosylation state. In addition, the short flagellar filament observed in B. subvibrioides ΔlegI or ΔflmG mutants argues that at least some flagellins are secreted while not glycosylated. TEM pictures were taken of cells in liquid medium in exponential growth phase. Under this condition, no flagellar fragments were observed in the culture medium by TEM, only small flagella with a hook structure attached. Also, flagellar breakage would be expected to result in more random flagellum lengths.

      There are three flagellins in B. subvibrioides (Bresu_2403 is 59% identical with FljLCc, Bresu_2638 is 57% identical with FljKCc and Bresu_2636 is 62% identical with FljJCc). We now show the genetic organization of the flagellins in Fig. 1C. The three flagellins are all detected by the anti-FljKCc antiserum (see answer and figure to reviewer #1). We cannot attribute the immunoblot signal to any individual B. subvibrioides flagellin as they could all co-migrate on SDS-PAGE. However, the signal often looks like a doublet (as shown in Figure 4B for example), suggesting that at least two flagellins are detected, and this doublet is always found to migrate faster in the absence of glycosylation, which could indicate that all B. subvibrioides flagellins (or at least 2) are modified.

      Text in Results, lines 170-171, "We then probed the resulting ΔlegIBs and ΔflmGBs single mutants for motility defects in soft agar and analyzed flagellin glycosylation by immunoblotting using antibodies to FljKCc". Was the antibody to FljKCc determined to also specifically bind to FljKBs? Also, how many flagellins are there in B. subvibrioides? Are all detected with this antibody?

      Antibodies to FljKCc were raised against His6-FljK produced in E. coli (previously published in Ardissone et al, 2020). This serum recognizes the 6 flagellins from C. crescentus (PMID: 33108275). It also recognized the three flagellins from B. subvibrioides (see answer to reviewer #1).

      It is interesting that C. crescentus cells expressing Pse (endogenously) and Leg (reconstituted pathway), and BsFlmG and BsFljK (corresponding to Figure 5C) are not motile. Was the motility assay done for the experiment of figure 5B as well? Are the C. crescentus cells lacking Pse and FlmG but with heterologous expression of Leg and BsFljK and BsFlmG also non-motile? Also, it will be good to see the TEM images for these cells.

      C. crescentus cells that produce Pse (endogenously) or Leg (reconstituted pathway) and BsFlmG and BsFljK (formerly Figure 5C and now Figure 7C) are indeed not motile, as shown by the motility tests presented in Figure S5B. Motility assays with cells used in the former Figure 5B (now Figure 7B) have also been done and are now presented in Figure S4B. These cells are non-motile because BsFljK is not efficiently secreted (or is unstable after secretion), as shown on the immunoblot of the supernatant fraction in Figure S4A, lower panel. As a result, the flagellar filament is not properly assembled, as only a short flagellum was observed by TEM in such cells compared to the WT C. crescentus (Figure S4C and S4D).

      Immunoblotting of the supernatants should be shown (in addition to the cell extracts) for Figures 5B and 5C so that the reader can appreciate whether glycosylation has taken place but secretion/assembly has not. Further, HPLC of the acid extracts from flagellin could be done to unambiguously show whether the CcFlmG has transferred Pse and the BsFlmG and Cc-BsFlmG have transferred Leg on to the CcFljK in Figure 5c, and the identity of the sugar, if any, transferred by CcFlmG in the absence of Pse, and BsLeg genes or BsLegX gene in figure 5B.

      Immunoblots of the supernatants for Figure 5B (now Figure 7B) have been performed and added (Figure S4A lower panel). BsFljK is barely detected in the supernatant whatever its glycosylation state (with or without Leg). Note that in the supporting info where the raw unprocessed blot used for this panel is shown, a positive control of blotting (C. crescentus Δfljx6 mutant expressing CcFljK from pMT463) has been used. Immunoblots of the supernatant from Figure 5C (now 7C) have been performed and added in Figure S5A. The CcFljK modified with Leg is poorly secreted (or unstable after secretion). As a result, these cells only harbor a short flagellum compared to those that are able to modify CcFljK with Pse (Figure S5C).

      HPLC of the acid extracts from flagellins has been performed on purified flagella obtained by ultracentrifugation. As C. crescentus cells expressing BsFlmG and Cc-BsFlmG harbor no or only a short flagellar filament, the purification by ultracentrifugation is limited. Thus, to further confirm that CcFlmG has transferred Pse and Cc-BsFlmG (and BsFlmG) has transferred Leg on CcFljK (former Figure 5C and now Figure 7C), we performed immunoblots on the cell extracts of C. crescentus ΔflmG ΔpseI cells that cannot produce Pse but are able to produce Leg (reconstituted pathway). These experiments, now presented in Figure 7C (lower panel), confirmed that no modification of CcFljK was observed in C. crescentus cells expressing CcFlmG whereas CcFljK is modified in C. crescentus expressing Cc-BsFlmG, confirming that Cc-BsFlmG has transferred Leg (the only NulO produced in this condition).

      Text in discussion, lines 334-338, "By extension, having recognized the LegX/PtmE enzyme as a critical element in the Leg-specific enzymatic biosynthesis step (Figure 6) likewise offers another functional, but also a novel bioinformatic, criterion for the correct assignment and discrimination of predicted stereoisomer biosynthesis routes residing in ever-expanding genome databases" It will be nice to see a discussion on the prevalence of PtmE versus GlmU (or equivalent gene), PtmF, PtmA, PgmL in the Leg synthesizing organisms. Is the PtmE but not the other genes found in all cases, which makes it better as a molecular determinant for bioinformatics predictions of the type of pathway? Also, on whether PtmE has any homology to genes in other pathways (not associated with flagellin glycosylation) and how reliable a marker it is to differentiate Leg biosynthesis from Neu5Ac biosynthesis pathways.

      GlmU is a potential bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase that can be part of both the Pse and Leg pathways (PMID 19282391). Accordingly, a GlmU ortholog is found in C. crescentus and B. subvibrioides, which we showed produce Pse and Leg, respectively. Thus, GlmU cannot serve as a signature of the Leg pathway. On the other hand, PtmE is rarely found in the organisms from which PseI orthologs restore the motility of C. crescentus ΔpseI cells.

      PtmF, PtmA, PgmL and GlmS are proposed to act upstream of the production of GlcN-1-P, which is a precursor of both UDP-GlcNAc and GDP-GlcNAc, the precursors of Pse and Leg respectively. In addition, orthologs of these genes are not prevalent in the Leg-synthesizing organisms present in Table S2, based on BlastP analyses with C. jejuni proteins as templates. A PtmE ortholog is found in most of the Leg-synthesizing organisms, as shown in Table S2, and is often genetically linked with other genes coding for proteins involved in Leg production (shown with the asterisk * in Table S2). Of note, PtmE is found not only in organisms that modify flagellin(s) with Leg but also in organisms that add Leg to the capsule, such as A. baumannii LAC-4.

      It is not clear from the methods or the figure legends how many times the immunoblotting, motility experiments were done; how many experiments/trials are the images representative of? The motility pictures are representative of three independent experiments. The immunoblots are representative of at least two independent experiments. This information is now added in the Experimental procedures section.

      Minor comments:

      1. The gene for GlcN-1-P guanylyltransferase in the Leg-specific enzymatic biosynthesis step is already known as PtmE from the work of Schoenhofen's group. For the sake of consistency, it would be better to retain the nomenclature as PtmE throughout the manuscript instead of introducing the name LegX, which makes it sound like it is a previously unknown gene.

      2. Text in abstract, lines 15-17: "Sialic acids commonly serve as glycosyl donors, particularly pseudaminic (Pse) or legionaminic acid (Leg) that prominently decorate eubacterial and archaeal surface layers or appendages" The glycosyl donor is the nucleotide sugar and not the nonulosonic acid or sialic acid... rephrasing required for accuracy. Done

      3. Text in abstract, lines 18: "a new class of FlmG protein glycosyltransferases that modify flagellin" The authors are presumably referring to FlmG as the new class of protein glycosyltransferases... rephrasing required for accuracy Corrected
      4. Text in introduction, lines 41-42 "Pse and Leg derivatives synthesized in vitro can be added exogenously in metabolic labeling experiments" It should be "derivatives of Pse and Leg precursors" and not "Pse and Leg derivatives" corrected
      5. Text in introduction, line 46 "Pse- or Leg-decorated flagella may also be immunogenic." This sentence is not referenced and it is not clear why it is written here.

      6. Text in introduction, lines 63-66 "The synthesis of CMP-Pse or CMP-Leg proceeds enzymatically by series of steps [20-22], ultimately ending with the condensation of an activated 6-carbon monosaccharide (typically N-acetyl glucosamine, GlcNAc) with 3-carbon pyruvate (such as phosphoenolpyruvate, PEP) by Pse or Leg synthase paralogs, PseI or LegI, respectively" The synthesis begins with activated GlcNAc. The substrate for condensation is not activated GlcNAc. It is 2,4-diacetamido-2,4,6-trideoxy-D-mannopyranose in case of LegI and 2,4-diacetamido-2,4,6-trideoxy-b-L-altropyranose in case of PseI. Indeed, we modified the sentence.

      7. Text in introduction, line 70 "for used as glycosyl donors" Typographical error, "for use as glycosyl donors" Corrected
      8. Text in Results, line 102, "C. crescentus only encodes only PseI" Do the authors mean "only one PseI"? Corrected
      9. Text in Results, lines 108 and 109, "Such modifications could occur before the PseI synthase acts or afterwards. In the latter case, most (if not all) synthases would be predicted to produce the same Pse molecule," Do the authors know of any reports of modifications occurring before the PseI synthase? Please cite references, if known. Why "most (if not all)"? If the former case is true, the PseI synthase might not be able to accept the substrate. Correct. Because we cannot test all enzymes, we must keep the statement noncommittal.

      “Most (if not all)” refers to the latter case i.e. the modification occurs after PseI synthase. In this context, PseI should do the same reaction, however, there might be some exceptions.

      There are, to our knowledge, no reports showing that modifications occur before the PseI synthase. The glyco-profiling experiments all suggest that modification occurs after Pse production based on our motility readout. It is possible that PseI enzymes that condense a modified precursor would not be functional in our motility assay.

      Text in Results, lines 141-143, "our bioinformatic searches using C. jejuni 11168 as reference genome identified all six putative enzymes in the B. subvibrioides ATCC15264 genome (CP002102.1) predicted to execute the synthesis of Leg from GDP-GlcNAc" Not clear how this was done. Do the authors mean that they used the genes from C. jejuni 11168 as the query genes to identify homologs in B. subvibrioides ATCC15264 genome (CP002102.1)? Or did they use putative genes from B. subvibrioides ATCC15264 genome (CP002102.1) and pull out homologs from C. jejuni 11168 by using C. jejuni 11168 as the reference genome? We now have modified the sentence to make it clearer.

      At first reading, the flow of the manuscript is difficult to follow due to the figures not appearing in full in order of their occurrence. For instance, Figures 5B and 5C are discussed only in the end of the manuscript after the results of Figures 6 and 7. Other instances also exist. The authors may consider re-ordering the figure parts if possible so that all parts of each figure appear in order of occurrence in the manuscript text. Thanks for raising this issue. We have now tried to address this concern by re-organizing the order of occurrence of the figures. Notably we have now exchanged Figure 5 (on Leg pathway reconstitution and FlmG rewiring) with Figure 7 (on LegB and LegH). We modified the text accordingly. We hope that it makes the manuscript and corresponding figures easier to follow.

      Reviewer #3 (Significance (Required)):

      The nonulosonic acids, Pseudaminic acid and Legionaminic acid, are abundant in bacterial systems in the capsular and lipopolysaccharides as well as in glycoprotein glycans. The Ser/Thr-O-nonulosonic acid glycosylation of flagellins has been studied with respect to the system of Maf glycosyltransferases in Campylobacter jejuni, C. coli, Helicobacter pylori, Aeromonas caviae, Magnetospirillum magneticum, Clostridium botulinum and Geobacillus kaustophilus, and recently with respect to the system of FlmG glycosyltransferases by Viollier's group in Caulobacter crescentus. However, the determinants that govern the glycosyltransferase function are still not well known. Kint et al have performed excellent work using bacterial genetics tools to (1) highlight the "functional insulation" of the Leg and Pse biosynthesis pathways, (2) demonstrate the modularity of the FlmG glycosyltransferase proteins with respect to the flagellin binding and glycosyltransferase domains. This work makes a significant advance in the field with respect to (1) understanding flagellin glycosylation by FlmG, (2) making designer protein Ser/Thr-O-glycosyltransferases, and (3) bioinformatics analysis of genomes with respect to the Pse/Leg/Neu nonulosonic acid biosynthetic potential encoded. The findings will be of great interest to scientific audiences working in the areas of glycobiology and bacteriology. My area of expertise: Maf flagellin glycosyltransferases

    1. Author Response

      Reviewer #1 (Public Review):

      The software presented in this paper is well documented and represents a significant achievement that breaks new ground in terms of what is possible to render and explore in the web browser. This tool is essential for the exploration of SC2 data, but equally useful for the tree of life and other tree-like data sets.

      Thank you for reviewing my work and for this generous assessment.

      Reviewer #2 (Public Review):

      This manuscript describes a web-based tool (Taxonium) for interactively visualizing large trees that can be annotated with metadata. Having worked on similar problems in the analysis and visualization of enormous SARS-CoV-2 data sets, I am quite impressed with the performance and "look and feel" of the Taxonium-powered cov2tree web interface, particularly its speed at rendering trees (or at least a subgraph of the tree).

      Thank you for the kind words.

      The manuscript is written well, although it uses some technical "web 2.0" terminology that may not be accessible to a general scientific readership, e.g., "protobuf" (presumably protocol buffer) and "autoscaling Kubernetes cluster". The latter is like referring to a piece of lab equipment, so the author should provide some sort of reference to the manufacturer, i.e., https://kubernetes.io/.

      Thank you for flagging this. I have now replaced the colloquial "protobuf" with "protocol buffer". I have now provided a URL for Kubernetes. It is always difficult to judge how much to explain technical terms. I certainly agree that many people will be unfamiliar with, for instance, protocol buffers, but an explanation of what they are (which may not be particularly important for understanding Taxonium) can sometimes overshadow more important details. So my preference in that particular case is for an interested reader to research the unfamiliar term.

      In other respects, the manuscript lacks some methodological details, such as exactly how the tree is "sparsified" to reduce the number of branches being displayed for a given range of coordinates.

      This is an important point also raised by Reviewer 3. I have added a new section in the Materials and Methods which discusses this in some detail.

      Some statements are inaccurate or not supported by current knowledge in the field. For instance, it is not true that the phylogeny "closely approximates" the transmission tree for RNA viruses.

      I agree that this was an overly broad claim, and have softened it, now saying:

      "The fundamental representation of a viral epidemic for genomic epidemiology is a phylogenetic tree, which approximates the transmission tree and can allow insights into the direction of migration of viral lineages."

      Mutations are not associated with a "point in the phylogeny", but rather the branch that is associated with that internal node.

      I have changed this as suggested.

      A major limitation of displaying a single phylogenetic tree (albeit an enormous one) is that the uncertainty in reconstructing specific branches is not readily communicated to the user. This problem is exacerbated for large trees where the number of observations far exceeds the amount of data (alignment length). Hence, it would be very helpful to have some means of annotating the tree display with levels of uncertainty, e.g., "we actually have no idea if this is the correct subtree". DensiTree endeavours to do this by drawing a joint representation of a posterior sample of trees, but it would be onerous to map a user interface to this display. I'm raising this point as something for the developers to consider as a feature addition, and not a required revision for this manuscript.

      I entirely agree with this point. I have added a sentence in the discussion:

      "Even where sequences are accurate, phylogenetic topology is often uncertain, and finding ways to communicate this at scale, building on prior work [Densitree citation] would be valuable."

      The authors make multiple claims of novelty - e.g., "[...] existing web-based tools [...] do not scale to the size of data sets now available for SARS-CoV-2" and "Taxonium is the only tool that readily displays the number of independent times a given mutation has occurred [...]" - that are not entirely accurate. For example, RASCL (https://observablehq.com/@aglucaci/rascl) allows users to annotate phylogenies to examine the repeated occurrence of specific mutations. Our own system, CoVizu, also enables users to visualize and explore the evolutionary relationships among millions of SARS-CoV-2 genomes, although it takes a very different approach from Taxonium. Taxonium is an excellent and innovative tool, and it should not be necessary to claim priority.

      I agree that comparisons with existing tools are difficult and often provide a sense of unnecessary competition. I attempted to be quite careful in the specific section focused on comparison, but may have been less careful earlier on. The intent with this first sentence in the abstract was to provide a succinct description of the gap that Taxonium was developed to fill with "however, existing web-based tools for analysing and exploring phylogenies do not scale to the size of datasets now available for SARS-CoV-2". I have now removed the words "analysing and", focusing on the exploration of phylogenies. I think this new sentence is defensible in that valuable tools such as CoVizu intentionally do not explore a phylogeny directly but instead take a multi-level approach, and this new sentence better matches the comparisons in the paper. In the second sentence, I have removed the phrase "is the only tool that", which I agree adds little and may not be accurate, depending on one's interpretation of "readily". Thank you for these points.

      Although the source code (largely JavaScript with some Python) is quite clean and has a consistent style, there is a surprising lack of documentation in the code. This makes me concerned about whether Taxonium can be a maintainable and extensible open-source project since this complex system has been almost entirely written by a single developer. For example, usher_to_taxonium.py has a single inline comment (a command-line example) and no docstring for the main function. JBrowsePanel.jsx has a single inline comment for 293 lines of code. There is some external documentation (e.g., DEVELOPMENT.md) that provides instructions for installing a development build, but it would be very helpful to extend this documentation to describe the relationships among the different files and their specific roles. Again, this is something for the developers to consider for future work and not the current manuscript.

      This is an entirely fair comment. The version of Taxonium presented in the manuscript is "2.0", which is a new version built from scratch with considerably less technical debt than the version that preceded it. Its technical strengths are that (with the exception of the backend) it is relatively well-modularised into functional components. But the limitations that the reviewer notes with respect to commenting are entirely fair. What I would say is that in the time since this manuscript was submitted, several important features have been added by an external collaborator, Alex Kramer, most notably the Treenome Browser (https://www.biorxiv.org/content/10.1101/2022.09.28.509985v1). I hope that the ability of Alex to add these features with little need for support provides some evidence of Taxonium's extensibility. But I acknowledge there is room for improvement.

      Reviewer #3 (Public Review):

      The paper succinctly provides an overview of the current approaches to generating and displaying super-large phylogenies (>10,000 tips). The results presented here provide a comprehensive set of tools to address the display and exploration of such phylogenies. The tools are well-described and comprehensive, and additional online documentation is welcome.

      The technical work to display such large datasets in a responsive fashion is impressive and this is aptly described in the paper. The author rightly decides that displaying large phylogenies is not simply a matter of rendering "more nodes", and so in my eyes, the major advancement is the approach used to downsample trees on-the-fly so that the number of nodes displayed at one time is manageable. This is detailed only briefly (Results section, 1st paragraph, 2 sentences). I would like to see more discussion about the details of this approach.

      Thank you for this point, also raised by Reviewer 2. I have now added a lengthy section on this in the Materials and Methods, which I hope is helpful. The approach is not especially sophisticated, but it does the job and runs quickly.

      Examples that came up while exploring the tool: the (well implemented) search functionality reports results from the entire tree (e.g. in Figure 4, the number of red circles is not a function of zoom level), how does this interact with a tree showing only a subset of nodes?

      Yes, this is an important feature which I perhaps did not do justice to in the write-up. I have included in the new section in the Materials and Methods a paragraph discussing search results:

      "In order to ensure that search results are always comprehensive, but at the same time to avoid overplotting, we take the following approach:

      ● Searches are performed across every single node on the tree to select a set of nodes that match the search. The total number of matches is displayed in the client.

      ● If fewer than 10,000 matches are detected, these are simply displayed in the client as circles.

      ● If more than 10,000 matches are detected, the results are sparsified using the method above, and then displayed.

      ● Upon zooming or panning, the sparsification is repeated for the new bounding box."
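The search behaviour quoted above can be sketched roughly as follows. This is a minimal illustration, not Taxonium's actual implementation: the names `Node`, `sparsify` and `search` are hypothetical, and the grid-based `sparsify` is only an assumed stand-in for "the method above", which is described in the manuscript's Materials and Methods rather than in this response. The 10,000 threshold comes from the quoted text; how exactly 10,000 matches are handled is not stated, so the sketch displays them unsparsified.

```python
from dataclasses import dataclass

MAX_MATCHES = 10_000  # threshold quoted in the response


@dataclass(frozen=True)
class Node:
    # Hypothetical node record: a label plus its layout coordinates.
    name: str
    x: float
    y: float


def sparsify(nodes, bbox, cells=100):
    """Keep at most one node per grid cell inside bbox.

    Assumed stand-in for the manuscript's sparsification method:
    which node survives depends only on position and input order.
    """
    min_x, min_y, max_x, max_y = bbox
    width = (max_x - min_x) or 1.0   # avoid division by zero
    height = (max_y - min_y) or 1.0
    kept = {}
    for n in nodes:
        if not (min_x <= n.x <= max_x and min_y <= n.y <= max_y):
            continue  # outside the current view
        cx = min(int((n.x - min_x) / width * cells), cells - 1)
        cy = min(int((n.y - min_y) / height * cells), cells - 1)
        kept.setdefault((cx, cy), n)  # first node in a cell wins
    return list(kept.values())


def search(nodes, predicate, bbox, threshold=MAX_MATCHES, cells=100):
    """Return (total match count, nodes to draw) for the current view."""
    # The search always runs over every node, so the count is comprehensive.
    matches = [n for n in nodes if predicate(n)]
    total = len(matches)
    if total > threshold:
        shown = sparsify(matches, bbox, cells)  # too many: thin them out
    else:
        shown = matches  # few enough to draw directly
    return total, shown
```

In this sketch, zooming or panning corresponds to calling `search` again with the new `bbox`: the reported total stays the same while the set of drawn nodes is recomputed for the new view.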

      How is the node order chosen with regards to "nodes that would be hidden by other nodes are excluded" and could this affect interpretations depending on the colouring used?

      This perhaps was slightly sloppy language which did not directly describe the implementation. I have now rephrased this to "only nodes that overlap other nodes are excluded", as we don't in fact consider a notion of z-index when doing this. The way the sparsification works (now better described) means that the nodes excluded are determined essentially by position and I don't foresee this introducing particular biases, but this was an insightful point to raise.

      Taxonium takes the approach of displaying all available data (sparsification of nodes notwithstanding). Biases in the generation of sequences, especially geographical, will therefore be present (especially so in the two main datasets discussed here - SARS-CoV-2 and monkeypox). This caveat should be made explicit.

      This is certainly true. I have added this new paragraph in the Discussion:

      "A further challenge is the vastly different densities of sampling in different geographic regions. Because Cov2Tree does not downsample sequences from countries which are able to sequence a greater proportion of their cases, the number of tips on a tree is not indicative of the size of an outbreak and in some cases even inferences of the directionality of migration may be confounded. There would be value in the development of techniques that allow visual normalisation of trees for sampling biases, which might allow for less biased phylogenetic representations without downsampling."

      Has the author considered choosing which nodes to exclude for sparsified trees in such a way as to minimise known sampling biases?

      The last sentence of the new paragraph above alludes to a sort-of-similar approach. I hadn't directly considered the approach the reviewer suggests. It is an interesting idea. The downsampling algorithm has to be very computationally inexpensive but it would be interesting to explore ways to do this. I am tracking this in https://github.com/theosanderson/taxonium/issues/437.

      Interoperability between different software tools is discussed in a technical sense but not as it pertains to discovering the questions to ask of the data. As an example, spotting the specific mutations shown in figure 3 + 4 is not feasible by checking every position iteratively; instead, the ability to have mutations flagged elsewhere and then seamlessly explore them in Taxonium is a much more powerful workflow. This kind of interoperability (which Taxonium supports) enhances the claim of "providing insights into the evolution of the virus".

      Thank you for flagging this point -- I am very excited by the growing ecosystem of interoperable tools. You are absolutely right that most of the insights Taxonium can bring into evolution rely also on this broader ecosystem. I have added a florid sentence in the concluding paragraph: "It forms part of an ecosystem of open-source tools that together turn an avalanche of sequencing data into actionable insights into ongoing evolution."

      The prosaic reason I don't discuss Taxonium's interoperability features in more detail in this manuscript is that it aims to describe the version of Taxonium I initially developed, and these features were developed collaboratively by a broader group later on (and after deposition of this preprint).

      Taxonium has been a fantastic resource for the analysis of SARS-CoV-2 and this paper fluently presents the tool in the context of the wider ecosystem of bioinformatic tools in use today, with the interoperability of the different pieces being a welcome direction.

    1. We have spent too much time on inward-looking debates that pit distant against close reading, and not enough time understanding connections to other disciplines.

      Of course, with innovation comes backlash; it is human nature to resist change.

      E.g., technology such as the newest phones, and older generations not wanting to learn or not knowing how to understand them.

      We tend to pit every new idea against an assortment of ways/methods/things we already know rather than exploring them for what they were meant to do. I think seeing that there was an inward debate on the subject wasn't much of a surprise but rather a given! Knowing of critics such as Stephen Marche and Stanley Fish, it is easy to see why it was the way it was.

      I do, however, wonder which other disciplines we could better connect to, and whether it would be a better use of our time to understand distant reading only at its surface level or to keep comparing it with other approaches too. (I would say comparing it to others may help in the overall scheme of things.)

    1. Winston Churchill's "Blood, Toil, Tears, and Sweat" Speech On Friday evening last I received from His Majesty the mission to form a new administration. It was the evident will of Parliament and the nation that this should be conceived on the broadest possible basis and that it should include all parties. I have already completed the most important part of this task. A war cabinet has been formed of five members, representing, with the Labour, Opposition, and Liberals, the unity of the nation. It was necessary that this should be done in one single day on account of the extreme urgency and rigor of events. Other key positions were filled yesterday. I am submitting a further list to the king tonight. I hope to complete the appointment of principal ministers during tomorrow. The appointment of other ministers usually takes a little longer. I trust when Parliament meets again this part of my task will be completed and that the administration will be complete in all respects. I considered it in the public interest to suggest to the Speaker that the House should be summoned today. At the end of today's proceedings, the adjournment of the House will be proposed until May 21 with provision for earlier meeting if need be. Business for that will be notified to MPs at the earliest opportunity. I now invite the House by a resolution to record its approval of the steps taken and declare its confidence in the new government. The resolution: "That this House welcomes the formation of a government representing the united and inflexible resolve of the nation to prosecute the war with Germany to a victorious conclusion." To form an administration of this scale and complexity is a serious undertaking in itself. But we are in the preliminary phase of one of the greatest battles in history. We are in action at many other points — in Norway and in Holland — and we have to be prepared in the Mediterranean. 
The air battle is continuing, and many preparations have to be made here at home. In this crisis I think I may be pardoned if I do not address the House at any length today, and I hope that any of my friends and colleagues or former colleagues who are affected by the political reconstruction will make all allowances for any lack of ceremony with which it has been necessary to act. I say to the House as I said to ministers who have joined this government, I have nothing to offer but blood, toil, tears, and sweat. We have before us an ordeal of the most grievous kind. We have before us many, many months of struggle and suffering. You ask, what is our policy? I say it is to wage war by land, sea, and air. War with all our might and with all the strength God has given us, and to wage war against a monstrous tyranny never surpassed in the dark and lamentable catalogue of human crime. That is our policy. You ask, what is our aim? I can answer in one word. It is victory. Victory at all costs — Victory in spite of all terrors — Victory, however long and hard the road may be, for without victory there is no survival. Let that be realized. No survival for the British Empire, no survival for all that the British Empire has stood for, no survival for the urge, the impulse of the ages, that mankind shall move forward toward his goal. I take up my task in buoyancy and hope. I feel sure that our cause will not be suffered to fail among men. I feel entitled at this juncture, at this time, to claim the aid of all and to say, "Come then, let us go forward together with our united strength."

      Important speech by Winston Churchill

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The reviews are on balance an accurate, thoughtful, thorough assessment of the manuscript. We appreciate the careful engagement with the B cell differentiation aspect of our work. We identify 2 major critiques from the reviews:

      1. The manuscript should make stronger connections with existing literature on in vitro and in vivo B cell differentiation. We agree the manuscript should be revised to interact more holistically and carefully with relevant B cell differentiation research. In this respect, the reviewers both help by pointing to high-quality and relevant literature that will be discussed and cited.

      2. The cytokine mixture we used on the B cells was not defined / described in the manuscript. This fact hinders the interpretation of the data because B cells will respond to diverse stimuli in quite different ways.

      We agree this hinders interpretation of the data, and the reviewers bring up astute points about different types of stimuli (TD vs. TI vs. TLR vs. BCR). Unfortunately, the manufacturer of the product, StemCell Technologies, will not disclose exactly what is in the product. Given we are in strong agreement with the reviewers on this point, we analyzed the cytokine contents of the cocktail and our cell culture supernatants using a Luminex cytokine panel. We present a discussion of our findings from these data in a supplementary note and figure. We acknowledge this analysis is non-exhaustive, because it does not include possible additions of non-cytokine stimulants. However, we maintain it adds much clarity to the interpretation of the data.

      We note that the contents of the stimulation cocktail are knowable and well-defined. These attributes are in contrast to almost all B cell stimulation protocols of which we are aware. Typical stimulation protocols use various types of feeder cells, cytokines, and FBS (Fetal Bovine Serum). In particular, the feeder cells and FBS are highly variable between labs, lots, and even experiments. FBS has a myriad of issues which are described here (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8349753/). Major variability, from genomic to phenotypic, has been described in laboratory cell lines like the ones used as feeder cells. With respect to B cells specifically, large differences in B cell activation programs are observed between lots of FBS, as described here (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7854248/#r5). Additionally, we have observed the presence of bovine viruses and other contaminants in FBS (unpublished data). Thus, the stimulation protocol we used is reproducible and robust in ways generally unseen by us in B cell stimulation literature. In summary, we view this cocktail as useful in a similar way to how FBS is useful to biologists, a major difference being that this cocktail is better defined and controlled. We provide similar thoughts in our supplementary note.

      A final general point we will make is about the significance of our work, which appears to be lost on Reviewer #1. Similar technical and conceptual advances by our lab have been cited 1000s of times. Thus, we think the impact of our scientific approach speaks for itself. Many of our results confirm and expand on previous literature about B cells. We deliberately chose to make this novel technical and conceptual advance in the well-studied system of B cell differentiation. This allows us to integrate our findings with prior literature and helps validate the general approach. Reviewer #1 has performed a scholarly service by independently verifying our findings are coherent with existing literature, and we thank them for that.

      In response to the reviews, we have edited the manuscript to reference even more of the papers in the field which report similar findings. Thus, our concordance with prior literature should be viewed as a strength of the manuscript. It shows readers of the manuscript that the conceptual framework we use here is valid and can generate similar insights in less well-studied systems. For example, the approach developed here could be used in non-B cells, non-human immune systems, or even non-model organisms. In response to the reviewers' critique, we modified the discussion of our work in multiple places to emphasize these points.

      Description of the planned revisions
      


      1) Which B cell activation protocol was used? No information is provided in the main text or supplementary information. Yet, this information is key to fully understand many of the conclusions of this work (e.g., ... memory B cells are intrinsically two-fold more persistent in vitro (2A)), which largely depend on the nature of the stimuli used in the in vitro B cell culture.

      We used the B cell activation protocol developed by StemCell Technologies as described in our methods section. We agree the reader and scientific community would benefit from additional information about this cocktail. To this end, we added a discussion of the cocktail to the supplementary information. We also used a cytokine analysis panel to analyze the cocktail, which provided detailed although non-exhaustive information about what is in the cocktail.

      2) It would be informative to use more than one B cell activation program, e.g., CD40L with or without a cytokine as well as CD40L vs. CpG-DNA. Authors make broad statements about B cell fates without discussing the impact of a given signal on a given B cell fate. For instance, do memory B cells follow the same differentiation program upon stimulation with CD40L, IL-4 or a combination of CD40L and IL-4? How about differences between a TD signaling program such as that provided by CD40L and IL-10 and TI signaling program such as that provided by CpG-DNA and IL-10?

      This is a good point. We agree stimulation using a panel of different agents would be a worthwhile experiment. It stands as a goalpost for future studies. Currently, performing single cell RNA sequencing on so many samples is both beyond the scope of this manuscript and very resource intensive.

      3) Page 3, first line: After low quality and non-B cells (Fig S1A & B). What does this statement mean? The sentence seems incomplete.

      Thank you for catching this typo. It is now clarified in the manuscript that we removed these cells bioinformatically.

      4) First, we noted that non-B cells present in the input rapidly became undetectable by day 4, which shows the specificity of the cytokines for B cell expansion. Which cytokines are we talking about? No detail is provided.

      We now provide our analysis of the stimulation cocktail, in Supplementary note 1 and Supplementary figure 1A. We still believe it is an interesting observation that this cocktail specifically stimulates B cells because many cytokines are not specifically B cell division signals, there were some impurities in the input population, and many cytokines are produced by the cultured cells themselves.

      5) Plasmablasts were not distinguished from plasma cells.

      We agree this is an interesting and important distinction to make. We have now distinguished between these classes of cells.

      6) Critically, we observed no appreciable evidence of hypermutation in vitro (S2C), consistent with prior literature (Bergthorsdottir et al. 2001). This statement is vague, misleading and likely inaccurate for the following reasons. (a) The B cell culture conditions used by the authors are completely unknown. (b) It was shown that SHM can be achieved under specific in vitro B cell culture conditions that include the presence of activated CD4+ T cells (PMID: 9052835; PMID: 10092799; PMID: 10878357; PMID: 12145648). Did authors try to recapitulate those culture conditions?

      We see how this statement could be misunderstood. We only claim not to observe evidence of hypermutation in our specific culture conditions, which is important for the inferences we make. We added language to make this more clear.

      We did not try to recapitulate the conditions in the references supplied by the reviewer. We note that these references use cell lines and not B cells. While there is immensely valuable work done on cell lines, they behave very differently from actual cells and these findings may not be relevant to our human B cells.

      7) Some of the reported findings are repetitive of previously published results and provide no additional new information. For example:

      a) "Interestingly, we found mutated B cells were far more likely to express genes involved in T cell interaction (2B), suggesting Memory B cells are intrinsically licensed to enter an inflammatory state which activates T cells". This evidence is already published (PMID: 7535180 among many other published studies).

      We will cite this paper, which is a landmark study. We don’t claim we are the first to discover a propensity of mutated B cells to present help to T cells, but note that we were able to observe this fact via lineage tracing in a single experiment, which is a conceptual and technical advance. Additionally, we report an entire transcriptional module of genes which are upregulated in memory B cells vs. Naive B cells exposed to the same stimulus. This adds to the systematic understanding of the Memory B Cell activation program.

      b) "Instead, Naive B cells were biased toward expressing lectins and CCR7, suggesting Naive B cells are intrinsically primed to home into the lymphatic system and germinal centers (2B)". This evidence is already published (PMID: 9585422 among many other published studies).

      While this is an interesting and important paper referenced by the reviewer, we are unable to find anything similar to our claim about naive B cells in the reference provided. The investigators do not discuss intrinsic differences between memory and naive subsets when responding to the same stimulus.

      8) We quantified the in vitro dynamics of CSR through the lens of mutation status, which revealed strongly different fate biases between germline and mutated cells (2D). Most strikingly, B cells which switched to IGHE were almost exclusively derived from germline progenitors: the ratio of germline IGHE cells to mutated IGHE cells was (8-fold - inf, 95 % CI). Also this evidence is not novel (PMID: 34050324 among other published studies) and, again, must reflect the presence of specific culture conditions that remain completely undisclosed. This is incredibly confusing.

      Thank you for providing this reference; we were not aware of this interesting study. These studies are quite different and complementary. Differences between these studies likely reflect the fact that their B cells are isolated from a niche, rather than generated in vitro. Most of the tissue-resident cells in their study are quite mutated, and thus are not the Naive B cells we are making a claim about. In fact, despite their claim of low mutational load, these cells would fall into the “mutated” or even “heavily mutated” categories we defined in our paper. Cells with mutation levels of 5% are not thought to be Naive in any classification scheme. Our study showed that, in vitro, IGHE B cells effectively came exclusively from germline progenitors; their study shows no such result. The novelty of this finding was appreciated by reviewer 2.

      9) Authors should mention that non-switched memory B cells include IgDlowIgM+CD27+ and IgD-IgM+CD27+ memory B cells. Some authors define these distinct memory B cell fractions as marginal zone (MZ) or MZ-like B cells (please, notice that splenic MZ B cells recirculate in humans) and IgM-only B cells, respectively (PMID: 28709802; PMID: 9028952; PMID: 10820234; PMID: 11158612; PMID: 26355154; PMID: 15191950; and PMID: 24733829 among many other published studies).

      We appreciate these points. We attempted to classify our B cells within this taxonomy and found no such separation clearly exists in single-cell RNA-seq profiles. Instead, we opted to re-classify our data with a state-of-the-art algorithm called CellTypist (DOI: 10.1126/science.abl5197), which harmonizes cell annotations across a growing number of single-cell RNA sequencing studies. While this classification system is not currently mutually exclusive / completely exhaustive, we believe using this system provides standardization and data availability that are key for sharing results. As single-cell RNA-seq and flow/mass cytometry harmonize their classification systems, anyone should be able to transfer their preferred classification scheme to the cells profiled here.

      10) Thus, CSR from IGHM cells did not meaningfully contribute to the abundance of IGHA+ cells in the population. Also this conclusion may be misleading and/or inaccurate. Indeed, an efficient class switching to IgA requires the exposure of naïve B cells to the cytokine TGF-beta in addition to a robust TD (CD40L) or TI (CpG DNA or BAFF or APRIL) co-signal. Was TGF-beta present in this culture?

      This is a good point about TGF-beta and switching to IgA. Here is a clear example of the novelty and power of our approach, as well as the benefits of using a well-characterized system such as B cell differentiation. Lineage tracing clarifies between two explanations for why there are IgA cells in the output population. One explanation is that non-IgA B cells in the input switch to IgA, driven by TGF-beta. Another explanation is that IgA cells in the input expand modestly and account for IgA cells in the output. Lineage tracing offers clear evidence that the latter explanation is true. Following from this, our approach allows us to make a strong inference that TGF-beta is not present in the incompletely determined cytokine mixture. We are not sure how this conclusion may be misleading or inaccurate, as it is a clear and simple description of our data, not a claim about what factors are necessary for switching.

      11) In contrast, we noted that many intraclonal class-switching events appeared to be directly from IGHM to IGHE. Explanations involving unobserved cells with intermediate isotypes notwithstanding, these data illustrate the relative ease with which B cells can switch directly to IGHE. It is very difficult to interpret this statement, as no information regarding the B cell-stimulating conditions used is provided. In addition, relevant literature is not quoted (e.g., PMID: 34050324).

      We clarify our discussion here to claim the ease with which peripheral blood IGHM B cells switch to IGHE. Again, lineage tracing has allowed us to distinguish between two very different population-level phenomena. One explanation is that undetected IGHE+ progenitors in the input population expanded rapidly and account for the IGHE+ cells. Another explanation is that cells class-switch to IGHE. Our data are consistent with the latter. We note that this validates the conceptual use of lineage tracing to understand rapid population dynamics in immune responses and cell differentiation protocols. This is a strength of our manuscript. We appreciate that the reviewer has furnished relevant studies, which we will cite.

      12) Our data for IGHE cells contrasts with in vivo data which show IgE B cells to be: (1) very rare, (2) apparently derived from sequential switching (e.g. from IgG1 to IgE) (Horns et al., 2016; Looney et al., 2016), and (3) often heavily hypermutated (Croote et al., 2018).

      While this reviewer agrees with the first comment (switching to IgE is relatively rare in vivo, at least in healthy individuals), the other statements are quite inaccurate. Indeed, unmutated extrafollicular naïve B cells from tonsils and possibly other mucosal districts directly class switch from IgM to IgE in healthy individuals, thereby generating a low-affinity IgE repertoire. In principle, low-affinity IgE antibodies may protect against allergy by competing with high-affinity IgE specificities. In allergic patients, high-affinity IgE clones emerge from class-switched and hypermutated memory B cells that sequentially switch from IgG1 or IgA1 to IgE as a result of specific environmental conditions, including an altered skin barrier (PMID: 22249450; PMID: 30814336; PMID: 32139586).

      Moreover, in contrast to what is stated by the authors, sequential IgG1/IgA1-to-IgE class switching mostly occurs in allergic patients but is less frequent in healthy individuals, where IgE specificities are less mutated (PMID: 30814336). Along the same lines, IgE is heavily mutated only in allergic individuals with significant molecular evidence of sequential IgG1/IgA1-to-IgE class switching (PMID: 30814336; PMID: 32139586). Overall, the data provided by Swift M. et al. are largely confirmatory of previously published evidence.

      We appreciate the clarification of this complex field and will cite the relevant literature. We also agree with the reviewer's assessment that our data are validated by other approaches and groups. We see that our discussion of IgE B cells should have included the caveat that we are discussing IgE B cells detected in the peripheral blood. We have restricted our claim to the suggestion that, if our conditions mimic such niches where B cells switch to IgE, there are clearly efficient mechanisms which limit the amount of circulating IGHE B cells in comparison to other isotypes.

      "Taken together, these data suggest that while direct switching to IGHE from Naive progenitors is trivial in vitro, niche factors or intrinsic death programs efficiently limit their generation or lifetime in vivo." I cannot understand this conclusion, which seems to contradict earlier statements.

      We hope we have clarified via the above comment.

      13) I am not sure I learned much regarding the "cell-intrinsic" fate bias and transcriptional memory of B cells after reading this elegantly presented but confusing and superficially discussed manuscript (please, see also comments 15-23).

      We understand the reviewer is confused about various aspects of our manuscript and appreciate the opportunity to clarify. We show cells with broad identities (such as germline vs. mutated or naive vs. memory) respond differently to the same stimulus. These are cell-intrinsic fate biases. We quantify them and provide statistical bounds on the effect sizes of these differences, which to our knowledge has not been done. We agree with the reviewer that in the case of memory and naive B cells, much is already known about their biases. We recapitulate some of this knowledge, while adding a quantitative and an unbiased transcriptomic lens with which to view the biases. However, our analysis moves beyond cell types broadly defined, and focuses on the concept that each clone is a cell state or identity, where some of the identity may be faithfully propagated over generations and other information may not be. To this end, we tracked the transcriptome of clones during differentiation. We show that B cell clones share highly similar cell fates, implicating cell-intrinsic heterogeneity as a major contribution to diversity in immune responses. We note this reviewer did not critique this aspect of our work. The review also did not critique Figure 3 or 4, in which we present a quantitative analysis of which transcriptional programs are maintained by B cells and contribute to their clonal identity. Finally, via our analysis of human long-lived plasma cells, we report these transcriptional identities are observed in vivo, over long time scales. This type of cell-intrinsic bias has not been studied or described to our knowledge. These findings were of particular interest to reviewer 2 and other readers of the manuscript.

      MINOR COMMENTS

      Thank you for reading the manuscript carefully and providing these comments and observations. We have fixed all clerical errors that were pointed out. We also responded to some of these minor comments here, and made changes to the manuscript to clarify.

      1) Figures 1B, S1C and S1D are not referred to in the text. x

      2) 2B in the Text is 2D in the Figure. x

      3) 2D in the Text is 2E in the Figure. x

      4) Figure 2D seems to show only 10 genes. Please, clarify.

      We clarify in the manuscript that we present the top differentially expressed genes.

      5) 2E in the text is 2F in the Figure. x

      6) Figure S3B is not indicated in the text. x

      7) Figure 3E is not indicated in the text. x

      8) Figure S4A is not indicated in the text. x

      9) In some sections of the text, Figure panels are not sequentially discussed, which makes the text very difficult to follow. x

      Reviewer 2:

      Major comments:

      On p.3 the authors assume that a B cell with an unmutated BCR in the time course arose from a naive B cell progenitor. However, it is also possible that it arose from IgM memory B cells since they also contain a non-negligible proportion of cells with 0 mutations. This was initially seen already in the Klein, J Exp Med, 1998 paper and later confirmed by e.g. Weller et al, J Exp Med, 2008 and Wu et al, Front Immunol, 2011. And since the authors herein and others have demonstrated that IgM memory B cells have a high proliferative capacity it is possible that IgM memory B cells are overrepresented among those unmutated BCRs seen in the cultures.

      The finding that IgM memory B cells are highly proliferative is not novel. It has been demonstrated by other groups before and one good example is Seifert et al, PNAS, 2015 where IgM memory B cells proliferated significantly more to BCR stimulation than naive or IgG memory B cells. However, it is also shown that IgG memory B cells are more responsive to TLR9 stimulation than IgM memory B cells as demonstrated by e.g. Marasco et al, Eur J Immunol, 2017. This is not discussed by the authors and should be added into the discussion for context of their finding by scRNAseq methods.

      These are astute points. We incorporated a more nuanced discussion of the prior literature about highly proliferative IgM memory B cells, which have been reported before. We also added a figure which identifies the genes associated with proliferative clones in Figure 3d, which adds to our understanding of the gene regulatory networks which govern IgM memory B cell behavior. We appreciate the reference to the Seifert et al paper, which is relevant and high quality work. We concur that a discussion of Marasco et al would be helpful, especially because it is unknown if a TLR9 agonist is in the stimulation cocktail, but their data would suggest there is not.

      The notion that a memory transcriptional program can be induced without SHM is not novel and this should be brought up in the discussion. One paper showing a memory transcriptional program in unmutated memory B cells is Kibler et al, Front Immunol, 2022.

      We were not aware of this literature and have now cited it in our discussion of this finding.

      The observation that memory B cells are more likely to enter an inflammatory state and support T cells has been suggested by other groups (Seifert et al, PNAS, 2015; Magri et al, Immunity, 2017, Grimsholm et al, Cell Reports, 2020).

      We have now cited and discussed a number of papers which contain similar findings. We note that we add to the holistic understanding of this phenomenon via our single cell transcriptomic approach.

      Please provide the age distribution of the peripheral blood samples as well.

      We have now provided the age distribution of the peripheral blood samples.

      Please show flow cytometry analysis of the cultures to assist in assessing subset distribution, viability and plasma cell differentiation for each time point. This can be provided as supplementary information.

      We did not use flow cytometry to assess subset distribution or to measure differentiation per se, only to exclude non-viable cells, and we have now made this clearer in the methods section. We also now include representative plots showing our sorting strategy.

      The stimulation cocktail used for this study, what does it contain? This needs to be specified in the manuscript and not only referring to the manufacturer. This has major impact on the results since different stimulatory agents will induce different pathways.

      This is a valid point that we addressed in our response to reviewer 1. See supplementary note 1 and Figure S1A for our analysis of the stimulation cocktail.

      Minor comments:

      Please avoid the term plasma B cells, does it refer to plasmablasts and/or plasma cells?

      Thank you for the suggestion. We have modified our language to refer to plasmablasts and plasma cells separately.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript by Kim et al., the authors use live-cell imaging of transcription in the Drosophila blastoderm to motivate quantitative models of gene regulation. Specifically, they focus on the role of repressors and use a 'thermodynamic' model as the conceptual framework for understanding the addition and placement of the repressor Runt, i.e. synthetic insertion of Runt repressor sites into the Bicoid-activated hunchback P2 enhancer. Coupled with kinetic modeling and live-cell imaging, this study is a sort of mathematical enhancer bashing experiment. The overarching theme is measuring the input/output relationship between an activator (bicoid), repressor (runt), and mRNA synthesis. Transcriptional repression is understudied in my opinion. One finding is that the inclusion of cooperativity between trans-acting factors is necessary for understanding transcriptional regulation. Most, if not all, of the tools used in this paper have been published elsewhere, but the real contribution is a deep, quantitative dissection of transcriptional regulation during development. As such, the only real questions for this referee are whether the modeling was done rigorously to produce some general biological conclusions. By and large, I think the answer is yes.

      We thank the reviewer for this thoughtful evaluation of our work. We agree with the reviewer’s assessment that transcriptional repression, especially the quantitative dissection of transcriptional repression, is understudied compared to transcriptional activation.

      Comments:

      Fig. 6 was the most striking figure for this referee, specifically that different placements of Runt molecules on the enhancer lead to distinct higher order interactions. I am wondering if the middle data column in Fig. 6 represents a real difference from the other two, and if so, it seems that the positioning - as opposed to simply the stoichiometry - is essential in cooperativity. This conclusion implies that transcriptional regulation is more precise than what some claim is just a mushy ball of factors close to a promoter. In other words, orientation may matter. Proximity may matter. Interactions in trans matter.

      We thank the reviewer for pointing out a feature of our data that we did not emphasize enough originally. Indeed, the construct in the middle column, which we termed [101], could be better recapitulated with the simplest model of zero free parameters than the other two constructs. As the reviewer pointed out, this raises an interesting question about the “grammar” of an enhancer: the placement and orientation of binding sites for transcription factors might matter yet we do not have a clear understanding of the logic. We have now incorporated a discussion of this topic in the Discussion section.

      There needs to be at least one prediction which is validated at the level of smFISH / mRNA in the embryo. Without detracting from the effort the authors have expended in looking directly at transcription, if the effects can't be felt by the blastoderm at the level of mRNA/cell, it becomes difficult to argue for the relevance to development. Also, I feel there is little chance that these measurements can be quantitatively replicated unless translated to the level of total protein or mRNA. Such a measurement (orthogonal quantitative confirmation of the repressor cooperativity result) would also assuage my concern about the time averaging as shown in Fig. S3.

      Our study focused on predicting the initial rate of transcription because it is the measurable quantity that most directly relates to the binding and action of the transcriptional activators and repressors used in this study. We argue that the action of transcription factors would be more accurately assessed by monitoring the rate of transcription, rather than the accumulated mRNA, which could be confounded by the dynamics of the whole transcription cycle—initiation, elongation and termination—as well as nuclear export, diffusion and degradation of transcripts. We are, of course, excited to eventually be able to predict a whole pattern of cytoplasmic mRNA over space and time from knowledge of the enhancer sequence. However, if we cannot predict the initial rate of RNA polymerase loading dictated by an enhancer, we argue that there is little hope in predicting such cytoplasmic patterns. We emphasized this point in the Discussion (Line XX-YY). Regardless, to assuage the reviewer’s concern, we have performed additional analyses to assess the effect of repression at the level of accumulated mRNA.

      First, we have quantified the accumulated mRNA during nuclear cycle 14, which is the time window that we have focused on in this study. To make this possible, we have integrated the area under the curve of MS2 time traces, which has already been shown by FISH to report on the total amount of mRNA produced (Garcia et al., Current Biology 23:2140, 2013; Lammers et al., PNAS 117:836, 2020). This integrated quantity, reporting on accumulated mRNA, is now shown for all constructs in the presence and absence of Runt protein in the new Figure S17. This figure clearly shows that the consequences of repression are present in the blastoderm, not just at the level of transcriptional initiation, but also at the level of accumulated mRNA.
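
The integration step described above can be sketched as follows. This is only an illustration under stated assumptions, not the analysis pipeline used in the study: the function name `accumulated_mrna`, the time points, and the fluorescence values are hypothetical, and mRNA degradation is ignored.

```python
def accumulated_mrna(times, fluorescence):
    """Trapezoidal area under an MS2 fluorescence trace.

    times        -- sample times (hypothetical units, e.g. minutes into nc14)
    fluorescence -- MS2 spot fluorescence at each time (arbitrary units)
    Returns the area under the curve, used here as a proxy for the
    total mRNA produced (degradation is ignored in this sketch).
    """
    if len(times) != len(fluorescence):
        raise ValueError("times and fluorescence must have the same length")
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (fluorescence[i] + fluorescence[i - 1]) * dt
    return total


# Hypothetical traces for one position along the anterior-posterior axis.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
trace_without_runt = [0.0, 2.0, 4.0, 4.0, 2.0]
trace_with_runt = [0.0, 1.0, 2.0, 2.0, 1.0]

area_without_runt = accumulated_mrna(times, trace_without_runt)
area_with_runt = accumulated_mrna(times, trace_with_runt)
print(area_without_runt, area_with_runt)
```

In this toy example the repressed trace is half as bright at every time point, so its integrated signal is also half that of the unrepressed trace.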

      We then compared the accumulated mRNA profiles shown in Figure S17 to the initial rate of RNAP loading at each position of the embryo along the anterior-posterior axis for all constructs in the presence and absence of Runt protein. These new results are shown in a new figure, Figure S19. Interestingly, we saw a good correlation (Pearson correlation coefficient of 0.90) between these two metrics. Thus, we argue that our conclusion that higher-order cooperativity is necessary to account for the initial rate of RNA polymerase loading would still hold for predicting the accumulated mRNA.
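
For readers unfamiliar with the metric, a Pearson correlation between two per-position profiles can be computed as in the sketch below. The function name `pearson_r` and the numbers are hypothetical and are not the measured data; the value of 0.90 quoted above comes from the response itself, not from this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values at four positions along the anterior-posterior axis.
initial_rate = [4.0, 3.1, 2.2, 1.0]          # initial rate of RNAP loading
accumulated_mrna = [40.0, 33.0, 20.0, 12.0]  # integrated MS2 signal
print(pearson_r(initial_rate, accumulated_mrna))
```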

      Reviewer #3 (Public Review):

      The authors have presented results from carefully planned and executed experiments that probe enhancer-driven expression patterns in varying cellular conditions (of the early Drosophila embryo) and test whether standard models of cis-regulatory encoding suffice to explain the data. They show that this is not the case, and propose a mechanistic aspect (higher order cooperativity) that ought to be explored more carefully in future studies. The presentation (especially the figures and schematics) is excellent, and the narrative is crisp and well organized. The work is significant because it challenges our current understanding of how enhancers encode the combinatorial action of multiple transcription factors through multiple binding sites. The work will motivate additional modeling of the presented data, and experimental follow-up studies to explore the proposed mechanisms of higher order cooperativity. The work is an excellent example of iterative experimentation and quantitative modeling in the context of cis-regulatory grammar. At the same time, the work as it stands currently raises some doubts regarding the statistical interpretation of results and modeling, as outlined below.

      We thank the reviewer for noting the significance of our work. We tried our best to address the concerns of the reviewer regarding the statistical interpretation of results and theoretical modeling throughout our responses below.

      The results presented in Figure 5 are used to claim that the data support (i) an unchanging K_R regardless of the position of the Runt site in the enhancer and (ii) an \omega_RP that decreases as the site goes further away from the promoter, as might be expected from a direct repression model. This claim is based on only testing the specific model that the authors hypothesize and no alternative model is compared. For instance, are the fits significantly worse if \omega_RP is kept constant and K_R is allowed to vary across the three sites? If different placements of the Runt site can result in puzzling differences in RNAP-promoter interaction, it seems entirely possible that the different site placements might result in different K_R, perhaps due to unmodeled interference from bicoid binding. Due to these considerations, it is not clear if the data indeed argue for a fixed K_R and distance-dependent \omega_RP.

      We apologize for the lack of justification in assuming that Kr remains constant and wrp varies depending on the position of the Runt binding sites. Following the reviewer’s suggestion, we tested the alternative scenarios where we either fix or vary different combinations of wrp and Kr for our one-Runt binding site constructs. The result is now shown in a new figure, Figure S16. In short, as reported by the Akaike Information Criterion (AIC) in Figure S16F, the MCMC fit explains the data best in the scenario of fixed Kr and different wrp values for one-Runt binding site constructs. Furthermore, we also performed the MCMC inference in the case where we varied both Kr and wrp values across constructs. From this analysis, we obtained similar values of Kr while having different values of wrp across constructs as shown in Figure S16G. Overall, we believe that this evidence strongly supports our assumption of having consistent Kr values but different wrp values for the one-Runt binding site constructs.

      Results presented in Figure 6 make the case that higher order cooperativity involving two DNA-bound molecules of Runt and the RNAP is sufficient to explain the data. The trained values of such cooperativity in the three tested enhancers appear orders of magnitude different. As a result, it is hard to assess the evidence (from model fits) in a statistical sense. Indeed, if all of the assumptions of the model are correct, then using the high-order cooperativity is better than not using it. To some extent, this sounds statistically uninteresting (one additional parameter, better fits). It is not the case that the new parameter explains the data perfectly, so some form of statistical assessment is essential.

      The inferred cooperativity values are indeed orders of magnitude different. However, the cooperativity terms can also be written as “w = exp(-E/(kBT))”, where E is the interaction energy, kB is the Boltzmann constant, and T is the temperature. As a result, we should compare the magnitude of the different cooperativities on a log scale. In brief, the interaction energies corresponding to wrr from the three two-Runt binding site constructs range between 0 and 1 kBT, and the higher-order cooperativity wrrp has an energy between -2 and 4 kBT. Interestingly, these energies are of the same order of magnitude as the interaction energies typically reported for bacterial transcription factors (e.g., Dodd et al., Genes and Development 18:344-54, 2004). It is important to note that our inferred interaction energies could be either positive or negative, suggesting that both cooperativity and anti-cooperativity can be at play depending on the architecture of the two Runt binding sites. We now report on these interactions in the language of energies in Table S1 and elaborate on this in the Discussion section (Line XX-YY).
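
The conversion between a cooperativity term and an interaction energy can be sketched as below, assuming the relation w = exp(-E/(kBT)) quoted above and expressing energies in units of kBT. The function names and the example values are hypothetical and are not the inferred parameters.

```python
import math


def interaction_energy_kbt(omega):
    """Interaction energy E in units of kBT for a cooperativity omega.

    Inverts omega = exp(-E / kBT), i.e. E / kBT = -ln(omega).
    """
    if omega <= 0:
        raise ValueError("cooperativity must be positive")
    return -math.log(omega)


def cooperativity_from_energy(energy_kbt):
    """Cooperativity omega for an interaction energy given in units of kBT."""
    return math.exp(-energy_kbt)


# Hypothetical cooperativities spanning several orders of magnitude.
for omega in (0.02, 1.0, 7.4):
    print(omega, round(interaction_energy_kbt(omega), 2))
```

Under this sign convention, a cooperativity above 1 maps to a negative energy and a cooperativity below 1 to a positive energy, so values that differ by orders of magnitude correspond to energies that differ by only a few kBT.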

      Finally, following the reviewer’s suggestion on statistical assessment of whether addition of parameters indeed explains the data better, we adopted the Akaike Information Criterion (AIC) as a metric to compare different models used in Figure 6 and now show the results in a new panel, panel G. Briefly, AIC is calculated by assessing the model’s ability to explain the data while penalizing for having more parameters. The smaller the AIC value is, the better the model explains the data. As we have claimed, the AIC showed a dramatic decrease when adopting the higher-order cooperativity as shown in Figure 6G. Thus we argue that the addition of higher-order cooperativity, while not being able to completely explain the data, is indeed capable of increasing the agreement between experiments and theory across all our two-Runt site constructs.
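
As a reminder of how the criterion trades fit quality against parameter count, the standard definition AIC = 2k - 2 ln(L) can be written as in the sketch below. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the MCMC fits.

```python
def aic(max_log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L_max)."""
    return 2 * n_params - 2 * max_log_likelihood


# Hypothetical fits of two nested models to the same data set.
aic_pairwise = aic(max_log_likelihood=-120.0, n_params=2)      # pairwise cooperativity only
aic_higher_order = aic(max_log_likelihood=-95.0, n_params=3)   # adds one higher-order term

print(aic_pairwise, aic_higher_order, aic_higher_order - aic_pairwise)
```

A negative difference means that the improvement in likelihood outweighs the penalty for the extra parameter, so the larger model is preferred by this criterion.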

      Moreover, it is not the case that the model structure being tested is the only obvious biophysics-driven choice: since this is the first time that such higher order effects are being tested, one has to be careful about testing alternative model structures, e.g., repression models that go beyond direct repression and pairwise cooperativity that goes beyond the traditional approach of a single (pseudo)energy term.

      We agree with the reviewer that alternative models with different mechanisms of repression should be mentioned. We have clarified this point further in the Discussion (Line XX -YY). In summary, we tested both “competition” and “quenching” models of repression as proposed in Gray et al. (Genes and Development 8:1829, 1994). Interestingly, Figure S5 shows that the “competition” model gives a worse fit compared to the “direct repression” and “quenching” models for the one-Runt binding site cases. We further tested these alternative models in the case of the two-Runt binding site constructs. The results are shown in Figures S7 (competition) and S8 (quenching). These figures also reveal that the “competition” model underperformed compared to the “direct repression” or “quenching” models. For the “quenching” model to fit the data, we also had to invoke higher-order cooperativity that is beyond pairwise cooperativity. Thus, we believe that the requirement of higher-order cooperativity holds regardless of the choice of the specific model. Of course, our models of repression are very likely an oversimplification of how repressors actually work. However, given that these simple models have been a prevalent choice of proposed mechanisms for repression in the field of transcriptional repression for the past decades, we believe that the significance of our work lies in the fact that we challenged these models by turning them into precise mathematical statements (in the form of widespread thermodynamic models) and confronting them with quantitative data.

      The general theme seen in Figure 6 is seen again in Figure 7, when a 3-site construct is tested: model complexities inferred from all of the previous analyses are insufficient at explaining the new data, and new parameters have to be trained to explain the results. The authors do not seem to claim that the higher order cooperativity terms (two parameters) explain the data, rather that such terms may be useful.

      We agree that our previous approach was confusing. Figure 7A indeed incorporated all inferred parameters from the previous rounds of inference (Kb, wbp, p, R, as well as Kr, wrp, wrr, and wrrp). However, it is clear that this set of parameters, even including the higher-order cooperativity from the two-Runt binding site cases, was not enough to explain the data from the three-Runt binding site case. Thus, we had to invoke another free parameter, which we termed wrrrp, to explain the data. We have revised Figure 7B so that it now shows the “best” MCMC fit, which explains the data quite well (instead of just showing the “improvement” of fits).

    1. Author Response

      Reviewer #1 (Public Review):

      This paper introduces a new statistical framework to study cellular lineages and traits. Several new measures are introduced to infer selection strength from individual lineages. The key observation is that one can simply relate cumulants of a fitness landscape to population growth, and all of this can be simply computed from one generating function, that can be inferred from data. This formalism is then applied to experimental cell lineage data.

      I think this is a very interesting and clever paper. However, in its current form the paper is very hard to read, with very few explanations beyond the mathematical observations/definitions, which makes it almost unreadable for people outside of the field in my opinion. Some more intuitive explanations should be given for a broader audience, on all aspects : definitions of fitness « landscape », selection strength(s), connections between cumulants and other properties (including skewness) etc... There are many new definitions given with names reminiscent of classical concepts in evolutionary theory, but the connection is not always obvious. It would be great to better explain with very simple, intuitive examples, what they mean, beyond maths, possibly with simple examples. Some of this might be obvious to population geneticists, and in fact some explanations made in discussion are more illuminating, but earlier would be much better. I give more specific comments below.

      We thank the reviewer for calling our attention to the lack of accessible explanations on the significant terms and quantities in this framework. Following the suggestion in the comments below, we added Box 1, providing intuitive and plain explanations on the terms of fitness, fitness landscape, selection, selection strength, and cumulants. In each section, we explain the standard usage of these terms in evolutionary biology and clarify the similarities and differences in this framework. We also added a figure to Box 1 and provided a schematic explanation of the relationships among chronological and retrospective distributions, fitness landscapes, and selection strength. We believe that these explanations and a figure would better clarify the meanings and functions of these quantities.

      Major comments :

      1) The authors give names to several functions, for instance before equation (1) they mention « fitness landscape », then describe « net fitness », which allows the authors to define « fitness cumulants ». Later on, a « selection » is defined. Those terms might mean different things for different authors depending on the context, to the point that they are sometimes almost confusing. For instance, why is h a « landscape »? For me, a landscape is kind of like a potential, and I really do not see how this is connected to h. « fitness cumulants » is particularly jargonic. There are also two kinds of selection strengths, which is very confusing. I would recommend that the authors make a glossary of the terms, explain intuitively what they mean and maybe connect them to standard definitions.

      We appreciate the suggestion of making a glossary of the terms. Following the suggestion, we added Box 1 to provide intuitive and plain explanations of the terms used in this framework.

      In Box 1, we explain why we called h(x) a fitness landscape, referring to its standard usage in evolutionary biology. In evolutionary biology, fitness landscapes (also called adaptive landscapes) are visual representations of relationships between reproductive abilities (fitness) and genotypes. The height of landscapes corresponds to fitness. Since constructing "genotype space" is usually difficult, fitness is often mapped on an allele frequency or phenotype (trait) space to depict a "landscape." Fitness landscapes introduced in our framework are analogous to those in evolutionary biology in that fitness differences are mapped on trait spaces. Although fitness landscapes in evolutionary biology are usually metaphorical or conceptual tools for understanding evolutionary processes, the landscapes in our framework are directly measurable from division count and trait dynamics on cellular lineages.

      We also explain "selection" and "selection strength" in Box 1. As pointed out, we define three kinds of selection strength measures. These three measures share a similar property of reporting the overall correlations between traits and fitness. However, they also have critical differences regarding additional selection effects they represent: $S_{\mathrm{KL}}^{(1)}$ for growth rate gain, $S_{\mathrm{KL}}^{(2)}$ for additional loss of growth rate under perturbations, and their difference $S_{\mathrm{KL}}^{(2)} - S_{\mathrm{KL}}^{(1)}$ for the effect of selection on fitness variance. We restructured the sections in Results and clarified these important meanings of the different selection strength measures.

      We removed the term "fitness cumulants" as this is non-general and might cause confusion to readers. We have now rephrased this more precisely as "cumulants of a fitness landscape (with respect to chronological distribution)." In addition, we added a general explanation of "cumulants" to Box 1 and clarified what first, second, and third-order cumulants represent about distributions.

      2) Along the same line, it would be good to give more intuitive explanations of the different functions introduced. For instance I find (2) more intuitive than (1) to define h. I think some more intuition on what the authors call selection strengths would be super useful. In Table 1 selection strengths are related to Kullback-Leibler divergence (which does not seem to be defined), it would be good to better explain this.

      In addition to Box 1, we included more intuitive explanations on fitness landscapes and selection strength where they first appear in the Theoretical background section. As pointed out, descriptions of the linkage between the selection strength measures and Kullback-Leibler divergence were only in the Supplemental Information in the original manuscript. We now explicitly show this linkage where we first define the selection strength.

      Following this comment, we also changed the definition of a fitness landscape from the original one to $h(x) := \tau\Lambda + \ln\left[Q_{\mathrm{rs}}(x)/Q_{\mathrm{cl}}(x)\right]$ (Eq. 1), using the chronological and retrospective distributions introduced in the preceding paragraph. This definition is mathematically equivalent to the previous one, but we believe it is more intuitive.
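
The definition in Eq. 1 can be evaluated directly once the chronological and retrospective distributions are tabulated. The sketch below does this for a toy lineage tree, using division count as the trait. It relies on assumptions that are not spelled out in this response: each lineage is given a chronological weight of 2^(-D)/N_0 and a retrospective weight of 1/N_tau, where D is its division count, N_0 and N_tau are the initial and final numbers of cells, and tau*Lambda = ln(N_tau/N_0). The function name and the toy tree are hypothetical.

```python
import math


def fitness_landscape(q_cl, q_rs, tau_lambda):
    """h(x) = tau*Lambda + ln(Q_rs(x) / Q_cl(x)) for each trait value x.

    q_cl, q_rs -- dicts mapping trait value -> probability
    tau_lambda -- population growth rate times observation time
    Trait values with zero probability in either distribution are skipped.
    """
    return {
        x: tau_lambda + math.log(q_rs[x] / q_cl[x])
        for x in q_cl
        if q_cl[x] > 0 and q_rs.get(x, 0.0) > 0
    }


# Toy tree (assumption): one ancestor, three final lineages with D = 1, 2, 2.
division_counts = [1, 2, 2]
n0 = 1
n_tau = len(division_counts)
tau_lambda = math.log(n_tau / n0)

q_cl, q_rs = {}, {}
for d in division_counts:
    q_cl[d] = q_cl.get(d, 0.0) + 2.0 ** (-d) / n0   # assumed chronological weight
    q_rs[d] = q_rs.get(d, 0.0) + 1.0 / n_tau        # assumed retrospective weight

h = fitness_landscape(q_cl, q_rs, tau_lambda)
for d in sorted(h):
    print(d, h[d], d * math.log(2))
```

Under these weighting assumptions the landscape for division count reduces to h(D) = D ln 2, which the last loop checks numerically for the toy tree.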

      3) It seems to me the authors implicitly assume that, along a lineage, one would have almost stationary phenotypes (e.g. constant division rate). However, one could imagine very different situations, for instance the division rates could depend on interactions with other cells in the growing population, and thus change with time along a lineage. One could also have some strong random components of division rate over time. I am wondering how those more complex cases would impact the results and the discussion.

      We thank the reviewer for pointing out our insufficient explanation of an essential feature of this framework. As we now explain in the "Examples of biological questions" section (L62-65) and Discussion (L492-493), this framework does not assume stationary phenotypes (traits) on cellular lineages. On the contrary, we developed this framework so that one can quantify fitness and selection strength even for non-stationary phenotypes (traits) due to factors such as non-constant environments and inherent stochasticity.

      In fact, if traits are stationary in cellular lineages, this framework becomes essentially identical to the individual-based evolutionary biology framework (see ref. 26, for example). Our framework assumes a cell lineage as a unit of selection and any measurable quantities along cellular lineages as lineage traits, whether they are stationary or non-stationary. Therefore, our framework can evaluate fitness landscapes and selection strength without explicitly taking the environmental conditions around cells into account. This means that h(x) and S[X] in this framework extract the correlations between the traits of interest and division counts among various factors that could potentially influence division counts. On the other hand, this framework has a limitation due to this design: it cannot say anything about the influence of factors such as non-quantified traits and potential variations in environmental conditions. We now explain these important points explicitly in the revised manuscript (L493-496).

      Likewise, stochasticity in division rate does affect division count distributions, and its influence appears as differences in the selection strength of division count S[D]. As stated in the text, S[D] sets the maximum bound for the selection strength of any lineage trait (L143-145). Therefore, $S_{\mathrm{rel}}[X] := S[X]/S[D]$ reports the relative strength of the correlation between the trait X and lineage fitness in a given level of S[D] in each condition.

      To clarify the influence of stochasticity in division rate, we present a cell population model in which cells divide stochastically according to generation time (interdivision time) distributions in Appendix 2 (we moved this section from the Supplemental Information with modifications). We can confirm from this model that the shapes of generation time distributions influence the selection strength S[D]. Importantly, one can understand from this model that stochasticity in generation times constantly introduces selection to cell populations and modulates the growth rate and selection strength even in the long-term limit. We now clarify this important point in the Discussion (L519-526).

      4) « Therefore, in contrast to a common assumption that selection necessarily decreases fitness variance, here we show that under certain conditions selection can increase fitness variance among cellular ». This is a super interesting statement, but there is such a lack of explanations and intuition here that it is obscure to me what actually happens here.

      When a decrease in fitness variance by selection is mentioned in evolutionary biology, an upper bound and inheritance of fitness across the generations of individuals are usually assumed. In such circumstances, selection drives the fitness distribution toward the maximum value, and the selection eventually causes fitness variance to decrease. However, even in this process, a decrease is not assured for every step; whether selection reduces fitness variance at each step depends on the fitness distribution at that time.

      In our argument, we compared fitness variances between chronological and retrospective distributions. We showed both theoretically and experimentally that there are cases where the variances of the retrospective distributions (distributions after selection) become larger than those of the chronological distributions (distributions before selection). The direction of variance change depends on the shape of chronological distributions, primarily on the skewness of the distributions (positive skew for increasing the variance and negative skew for decreasing the variance). The direction of variance changes can also be probed by the difference between the two selection strength measures $S_{\mathrm{KL}}^{(2)} - S_{\mathrm{KL}}^{(1)}$. Notably, we can demonstrate that there are cases where retrospective fitness variances are larger than chronological fitness variances even in the long-term limit, as shown by a cell population model in Appendix 2.
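
The dependence of the variance change on skewness can be illustrated numerically. The sketch below assumes that the retrospective distribution is the chronological one reweighted by exp(h(x)) and renormalized, which follows from defining h through the log-ratio of the two distributions. The fitness values and probabilities are hypothetical and were chosen only to give one positively skewed and one negatively skewed chronological distribution.

```python
import math


def retrospective(q_cl, h):
    """Reweight chronological probabilities by exp(h) and renormalize."""
    weights = [q * math.exp(hx) for q, hx in zip(q_cl, h)]
    total = sum(weights)
    return [w / total for w in weights]


def variance(probs, values):
    """Variance of `values` under the discrete distribution `probs`."""
    mean = sum(p * v for p, v in zip(probs, values))
    return sum(p * (v - mean) ** 2 for p, v in zip(probs, values))


# Hypothetical fitness values for three lineage classes.
h = [k * math.log(2) for k in (1, 2, 3)]

positive_skew = [0.7, 0.2, 0.1]   # most lineages have low fitness
negative_skew = [0.1, 0.2, 0.7]   # most lineages have high fitness

for name, q_cl in (("positive skew", positive_skew), ("negative skew", negative_skew)):
    q_rs = retrospective(q_cl, h)
    print(name, variance(q_cl, h), variance(q_rs, h))
```

In this toy case the two chronological distributions have the same fitness variance, yet selection increases the variance for the positively skewed one and decreases it for the negatively skewed one.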

      We now explain what kind of situations are usually premised when reduction of fitness variance is mentioned and clarify that, in our framework, we compare the fitness variances between chronological and retrospective distributions (L542-548). We also explain that a selection effect on fitness variance generally depends on fitness distribution and that a larger fitness variance in retrospective distribution is possible even in the long-term limit (L548-557).

      Reviewer #2 (Public Review):

      The paper addresses a fundamental question: how do phenotypic variations among lineages relate to the growth rate of a population. A mathematical framework is presented which focuses on lineage traits, i.e. the value of a quantitative trait averaged over a cell lineage, thus defining a fitness landscape h(x). Several measures of selection strengths are introduced, whose relationships are clarified through the introduction of the cumulant generating function of h(x). These relationships are illustrated in analytical mathematical models and examined in the context of experimental data. It is found that higher than third order cumulants are negligible when cells are in early exponential phase but not when they are regrowing from a stationary phase.

      The framework is elegant and its independence from mechanistic models appealing. The statistical approach is broadly applicable to lineage data, which are becoming increasingly available, and can for instance be used to identify the conditions under which specific traits are subject to selection.

      We thank the reviewer for the positive evaluation. We reply to the specific comments below.

      Reviewer #3 (Public Review):

      In this work the authors have constructed a useful mathematical framework to delineate contributions leading to differences in lineages of populations of cells. In principle, the framework is widely applicable to exponentially growing populations. An attractive feature is that the framework is not tailored to particular growth models or environmental conditions. I expect it will be valuable for systems where contributions from phenotypic heterogeneity overwhelm contributions from intrinsic stochasticity in cellular dynamics.

      I am generally very positive about this work. Nevertheless, a few specific concerns:

      1) Here, lineages are considered fitter if they have more division events. But this consideration neglects inherent stochasticity in division events. Even in a completely homogeneous population, the number of division events for different lineages is different due to intrinsic stochasticity, but applying the methods discussed in this manuscript may lead to falsely assigning different fitness levels to different lineages. The reason why (despite having different numbers of division events) these lineages ought to be assigned the same fitness level is that future generations of these cells will have identical statistics, in contrast with those of cells that are phenotypically different. Extending the idea to heterogeneous populations, the actual difference in fitness levels may be significantly different from what is obtained from the mathematical framework presented here, depending on the level of inherent stochasticity.

      We thank the reviewer for raising a point that was insufficiently explained in the original manuscript. Intrinsic stochasticity in interdivision time (generation time) is, in fact, critical for selection. For example, if a cell divides with a generation time shorter than the average due to stochasticity, this cell is likely to have more descendant cells in the future population on average than the other cells born at the same time, even if the descendants follow identical statistics. Therefore, the properties of intrinsic stochasticity, including the shapes of generation time distributions and transgenerational correlations, significantly affect the overall selection strength $S_{\mathrm{KL}}^{(1)}[D]$ (and also $S_{\mathrm{KL}}^{(2)}[D]$). We now explain this important point in the Results section, referring to the analytical model in Appendix 2 (L327-334), and also in Discussion (L519-524).
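
      The descendant-count argument can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the analysis in the manuscript: generation times are drawn independently from a single gamma distribution, and the function names and parameter values (shape 4, scale 0.25, an observation horizon of 5 mean generation times) are hypothetical choices made only for illustration. The two founders differ only in the timing of their first division; all descendants follow identical statistics.

```python
import random


def count_descendants(first_division_time, horizon, shape, scale, rng):
    """Count cells present at `horizon` that descend from one founder.

    The founder divides at `first_division_time`; every later generation time
    is drawn from the same gamma distribution (identical statistics).
    """
    pending = [first_division_time]  # scheduled division times of existing cells
    present_at_horizon = 0
    while pending:
        division_time = pending.pop()
        if division_time > horizon:
            present_at_horizon += 1  # cell has not divided when observation ends
        else:
            # The cell divides into two daughters with fresh generation times.
            for _ in range(2):
                pending.append(division_time + rng.gammavariate(shape, scale))
    return present_at_horizon


def mean_descendants(first_division_time, trials=500, seed=0):
    """Average descendant count over independent simulated lineage trees."""
    rng = random.Random(seed)
    total = sum(
        count_descendants(first_division_time, 5.0, 4.0, 0.25, rng)
        for _ in range(trials)
    )
    return total / trials


early = mean_descendants(0.5)  # founder divides earlier than the mean (1.0)
late = mean_descendants(1.5)   # founder divides later than the mean
print(f"early founder: {early:.1f} descendants; late founder: {late:.1f}")
```

      In this toy setting the early-dividing founder leaves more descendants on average, even though no cell differs from any other in its statistical rule.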

      Importantly, even when cell division processes seem purely stochastic, different states in some traits might underlie these variations in generation times. In such cases, evaluating $h(x)$ and $S_{\mathrm{rel}}[X]$ can still unravel the correlations between the trait values and fitness. In particular, the relative selection strength $S_{\mathrm{rel}}[X] := S_{\mathrm{KL}}^{(1)}[X] / S_{\mathrm{KL}}^{(1)}[D]$ extracts the correlation of the trait values at a given level of division count heterogeneity in each condition. We now clarify this important aspect of the framework in Discussion (L524-526).

      When a cell population is composed of heterogeneous subpopulations each of which follows a distinct statistical rule, our framework evaluates the combined effects from the heterogeneous rules and the inherent stochasticity of each subpopulation. Untangling these two contributions is generally challenging unless we have appropriate markers for distinguishing the subpopulations. However, when the subpopulations follow significantly distinct statistics, the division count distribution should become skewed or multimodal, and the difference between the two selection strength measures $S_{\mathrm{KL}}^{(2)}[D] - S_{\mathrm{KL}}^{(1)}[D]$ can suggest the existence of such subpopulations. Therefore, detailed analyses using all the selection strength measures and the fitness landscapes can provide insights into cell populations’ internal structures and selection.

      We now explain the effect of inherent stochasticity in generation times (L327-334 and L519-524) and discuss how we can probe the existence of subpopulations based on the selection strength measures (L508-512). Please also refer to our reply to the comment 3 of reviewer #1.

      2) In one of the sections the authors mention having performed analytical calculations for a cellular population in which cells divide with gamma distributed uncorrelated interdivision times. It's unclear if 1) within specific sub-populations, cells within the sub-population divide with the same division time, and the distribution of division times is due to the diverse distribution of sub-populations; or 2) if there are no such sub-populations and all cells stochastically choose division time from the same distribution irrespective of their past lineage. If the latter, then I do not see the need for a lineage-based mathematical formulation when the problem can be dealt with in much simpler traditional ways which do not keep track of lineages.

      We considered case 2) in this model. As noted by the reviewer, we can calculate the chronological and retrospective mean fitness and the population growth rate by a simpler individual-based age-structured population model (see ref. 10, for example). However, applying our framework to this model can clarify the utility of the cumulant generating function, the meaning of the differences between these fitness measures, and the effect of the statistical properties of intrinsic stochasticity on long-term growth rate and selection. Therefore, we kept this model in Appendix 2 (the section is moved from Supplemental Information) with additional clarification of our motivation for the analysis and the implications of the results.
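
      For orientation, the population growth rate in the no-subpopulation case has a closed form under the standard age-structured (Euler-Lotka) condition $2\int_0^\infty e^{-\Lambda\tau} g(\tau)\,d\tau = 1$. The sketch below is an illustration rather than the derivation in Appendix 2. It assumes binary division, no cell death, and independent generation times drawn from a gamma distribution with shape $k$ and scale $\theta$, for which the condition reduces to $2(1+\theta\Lambda)^{-k} = 1$; the function names are hypothetical.

```python
import math


def gamma_growth_rate(shape, scale):
    """Solve 2 * (1 + scale * rate) ** (-shape) = 1 for the growth rate."""
    return (2.0 ** (1.0 / shape) - 1.0) / scale


def euler_lotka_residual(rate, shape, scale):
    """Euler-Lotka condition written as f(rate) = 0 for gamma generation times."""
    return 2.0 * (1.0 + scale * rate) ** (-shape) - 1.0


mean_generation_time = 1.0
for shape in (1.0, 4.0, 100.0):
    scale = mean_generation_time / shape  # keep the mean fixed while varying the spread
    rate = gamma_growth_rate(shape, scale)
    print(f"shape={shape:>5}: rate={rate:.4f}  (ln 2 / mean = {math.log(2):.4f})")
```

      At a fixed mean, broader distributions (smaller shape) give a growth rate above $\ln 2$ divided by the mean generation time, which is one way in which the statistics of intrinsic stochasticity alone affect long-term growth.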

      3) The analytical calculations provided seem to be exact only for trajectories of almost infinite duration (or in practice, duration much greater than typical interdivision time). For example, if the observation time is of the order of division time, this would create significant artifacts / artificial bias in the weights of lineages depending on whether the cell was able to divide within the observation time or not. Thus, the results claiming that contributions of higher order cumulants become significant in the regrowth from a late stationary phase are questionable, especially since authors note that 90% of cells showed no divisions within the observation time.

      We thank the reviewer for an insightful comment. It is true that the duration of observation influences the results. In the regrowth experiments with E. coli, we aimed to compare the two cell populations regrowing from different stages of the stationary phase. Therefore, it is appropriate to fix the time window between the two conditions. Even though a significant fraction of cell lineages remains undivided, the regrowing cells already divide several times within this time window. Therefore, the results are valid if we compare and discuss the selection levels on this time scale. However, clarifying the selection on longer time scales requires a more detailed characterization of lag time distributions under both conditions.

      We now clarify the range of validity of the results and the limitations on prediction for the long-term selection without knowing the details of the lag time distributions in Discussion (L536-539).

    1. Author Response:

      Reviewer #1 (Public Review):

      Here, Servello et al explore the role of temperature and the temperature-sensing neuron AFD in promoting protection against peroxide damage. Unlike many other environmental threats, peroxide toxicity is expected to be temperature-dependent, since its chemical reactivity should be enhanced by higher temperatures. The authors convincingly and rigorously show that transient exposure to 25C, a condition of mild heat stress in C. elegans, activates animals' defenses against peroxides but potentially not other agents. Interestingly, this response requires the temperature-sensing AFD neurons, though whether temperature-dependent AFD activity is itself involved in this regulation is not explored. Further, the authors find that temperature regulates AFD's expression of the insulin ins-39 and provide evidence supporting the idea that repression of ins-39 at 25C contributes to enhanced peroxide defense. The authors use transcriptomic approaches to explore gene expression changes in animals in which AFD neurons are ablated, providing evidence that the FoxO-family transcription factor DAF-16 potentiates AFD signaling. However, because AFD ablation triggers effects broader than transient 25C exposure, the significance of these findings for temperature-dependent peroxide defense is somewhat unclear. Additionally, the possibility that DAF-16 (as well as another protective factor, SKN-1) function in parallel to temperature stress is consistent with many of the results shown but is not as thoroughly considered. Together, these studies identify a fascinating example of pre-emptive threat response triggered by the detection of a potentiator of that threat, a phenomenon they term "enhancer sensing." While some predictions of the specificity of this phenomenon remain untested, the paper provides intriguing insight into the potential mechanisms by which it may occur.

      Major issues:

      The dependence of the enhancer-sensing phenomenon on AFD leads the authors to conclude that the 25C stimulus is sensed by AFD itself, but this needs to be directly tested. To do this, they could ask whether tax-4 function is required in AFD, or use mutants in which AFD's thermosensory function is compromised.

      We thank the reviewer for suggesting these experiments. As requested, we determined whether previously identified mechanisms for temperature perception by the AFD neurons were required for the temperature-dependent regulation of peroxide resistance using gcy-18 gcy-8 gcy-23 triple mutants and the respective single mutants. The findings from the new experiments lead us to conclude that temperature perception by AFD via the GCY-8, GCY-18, and GCY-23 receptor guanylate cyclases, which are exclusively expressed in the AFD neurons, contributes to the temperature-dependent regulation of peroxide resistance in C. elegans. These experiments are detailed in the following new paragraph in the results section:

      “Last, we determined whether previously identified mechanisms for temperature perception by the AFD neurons were required for the temperature-dependent regulation of peroxide resistance. The AFD neurons sense temperature using receptor guanylate cyclases, which catalyze cGMP production, leading to the opening of TAX-4 channels (Goodman and Sengupta, 2019). Three receptor guanylate cyclases are expressed exclusively in AFD neurons: GCY-8, GCY-18, and GCY-23 (Inada et al., 2006; Yu et al., 1997) and are thought to act as temperature sensors (Takeishi et al., 2016). Triple mutants lacking gcy-8, gcy-18, and gcy-23 function are behaviorally atactic on thermal gradients and fail to display changes in intracellular calcium or thermoreceptor current in the AFD neurons in response to temperature changes (Inada et al., 2006; Ramot et al., 2008; Takeishi et al., 2016; Wang et al., 2013; Wasserman et al., 2011). We found that when grown and assayed at 20°C, gcy-23(oy150) gcy-8(oy44) gcy-18(nj38) triple null mutants survived 43% longer in the presence of tBuOOH than wild-type controls (Figure 3J). In contrast, at 25°C, the gcy-23 gcy-8 gcy-18 triple mutants showed a 12% decrease in peroxide resistance relative to wild-type controls (Figure 3K). Therefore, the three AFD-specific receptor guanylate cyclases influenced the temperature dependence of peroxide resistance, lowering peroxide resistance at 20°C and slightly increasing it at 25°C. At 20°C, the gcy-8(oy44), gcy-18(nj38), and gcy-23(oy150) single mutants increased peroxide resistance by 10%, 51%, and 21%, respectively, relative to wild-type controls (Figure 3L). Therefore, each of the three AFD-specific receptor guanylate cyclases regulates peroxide resistance. We conclude that temperature perception by AFD via GCY-8, GCY-18, and GCY-23 enables C. elegans to lower their peroxide resistance at the lower cultivation temperature.”

      The enhancer-sensing model is fascinating, but as it stands it is somewhat oversold. The authors could tone down the writing, indicating that this model is suggested rather than shown. Alternatively, they could more carefully test some of its predictions - for example by exploring the response to other threats (e.g. some of the toxicants described in Fig. S5) at 20C and 25C in WT and AFD-ablated animals.

      We edited the manuscript and expanded the manuscript’s discussion to address these concerns as well as similar concerns from reviewer #3. In the paper we show that the regulation of the induction of H2O2 defenses in C. elegans is coupled to the perception of temperature (an inherent enhancer of the reactivity of H2O2). To understand the significance of this finding in an evolutionary context, and to explain why such a regulatory system would evolve, we introduced in the discussion a new conceptual framework, “enhancer sensing,” and devoted a section of the discussion to demonstrating that the phenomenon that we observed could not be adequately explained by existing frameworks used to understand the evolutionary origins of the regulatory systems for defense responses.

      We now realize that we did not explain sufficiently clearly the criteria for establishing that a phenomenon represents enhancer sensing, leading to incorrect predictions by reviewers 1 and 3 about (a) whether what we observed in C. elegans is an instance of enhancer sensing (or more proof is needed) and (b) what the enhancer sensing model for the coupling of temperature perception to H2O2 defense would predict about how temperature and the AFD neurons would affect resilience to other chemicals. We regret failing to adequately explain the model’s scope and predictions and believe that we have now explicitly addressed the scope of what constitutes enhancer sensing and the predictions of the model. In particular, we previously did not spell out (a) the distinction between the enhancer sensing strategy and the mechanistic implementation of that strategy; and, importantly, (b) we did not discuss what the enhancer sensing strategy coupling temperature perception to H2O2 defense in C. elegans predicted (and did not predict) about whether a similar strategy would be expected to be used by C. elegans to deal with other temperature-dependent threats. We now address these issues in two new paragraphs in the discussion that read:

      “We show here that C. elegans uses an enhancer sensing strategy that couples H2O2 defense to the perception of high temperature. We expect this strategy’s output (the level of H2O2 defense) to provide the nematodes with an evolutionarily optimal strategy across ecologically relevant inputs (cultivation temperatures) (Kussell and Leibler, 2005; Maynard Smith, 1982; Wolf et al., 2005). This strategy is implemented at the organismic level through the division of labor between the AFD neurons, which sense and broadcast temperature information, and the intestine, which responds to that information by providing H2O2 defense (Figure 9D). Ascertaining that C. elegans relies on this enhancer sensing strategy does not depend on the temperature information broadcast by AFD exclusively regulating defense responses to temperature-dependent threats, because the regulation of defenses towards temperature-insensitive threats could affect defenses towards temperature-dependent threats; for example, suppressing defenses towards a temperature-insensitive threat would be beneficial if those defenses interfered with H2O2 defense or depleted energy resources contributing to H2O2 defense.

      As with any sensing strategy, enhancer sensing strategies are more likely to evolve when sensing is informative and responding is beneficial. In their natural habitat, C. elegans encounter many environmental chemicals that, like H2O2, are inherently more reactive at higher temperatures. It will be interesting to determine the extent to which C. elegans uses enhancer sensing strategies coupling temperature perception to the induction of defenses towards those chemicals, and whether those strategies rely on temperature perception and broadcasting by the AFD neurons. We expect that sensing strategies regulating defense towards those chemicals would be more likely to evolve when those chemicals are common, reactive, and cause consequential damage.”

      We note that our ability to predict survival to other toxicants, such as those that trigger specific gene-expression responses that are AFD-dependent but are unaffected between 20C and 25C (as proposed by the reviewer), is limited not only by our lack of knowledge about the specific mechanisms that protect worms from those toxicants, but also by our lack of knowledge about whether defense towards hydrogen peroxide interferes (or synergizes) with defense towards each of those toxicants and whether defense towards those toxicants interferes (or synergizes) with H2O2 defense. We therefore think that those experiments would be better addressed in future studies.

      The role of ins-39 remains somewhat speculative. Fig 4F shows that ins-39 mutants have a reduced induction of peroxide defense, but it seems that this could be the result of a ceiling effect. The authors' model predicts that overexpression of ins-39, particularly at 25C, should sensitize animals to peroxide damage, a prediction that should be tested directly. Further, the authors seem to assume that AFD is the relevant site of ins-39 function, but this needs to be better supported.

      As requested by all three reviewers, we determined whether ins-39 gene expression in AFD was sufficient to lower peroxide resistance by restoring ins-39(+) gene expression only in the AFD neurons using the AFD-specific gcy-8 promoter. As predicted by the reviewer, these worms were more sensitive to peroxide than wild-type worms. The findings from this experiment lead us to conclude that expression of ins-39 in the AFD neurons was sufficient to regulate the nematode’s peroxide resistance. The new section reads:

      “Next, we determined whether the INS-39 signal from AFD regulated the nematode’s peroxide resistance. The tm6467 null mutation in ins-39 deletes 520 bases, removing almost all the ins-39 coding sequence (Figure 5A), and inserts in that location 142-bases identical to an intervening sequence located between ins-39 and its adjacent gene. In nematodes grown and assayed at 20°C, ins-39(tm6467) increased peroxide resistance by 26% relative to wild-type controls (Figure 5F). To determine whether ins-39 gene expression in AFD was sufficient to lower peroxide resistance, we restored ins-39(+) expression only in the AFD neurons using the AFD-specific gcy-8 promoter (Inada et al., 2006; Yu et al., 1997) in ins-39(tm6467) mutants. Expression of ins-39(+) only in AFD eliminated the increase in peroxide resistance of ins-39(tm6467) mutants (Figure 5F). Notably, the peroxide resistance of the two independent transgenic lines was 28% and 30% lower than that of wild-type controls, likely due to overexpression of the gene beyond wild-type levels. We conclude that the gene dose-dependent expression of ins-39 in the AFD neurons regulated the nematode’s peroxide resistance.”

      The temperature-shift experiments in figure 5G (formerly 4F) indicated that the effects on peroxide resistance at 20C of growth at 25C and of the ins-39 mutation were non-additive. We interpreted this epistatic interaction to be due to action in a common pathway. It is possible that while growth at 25C increases the subsequent peroxide resistance at 20C, it could limit the nematodes’ subsequent peroxide resistance at 20C (beyond those peroxide-resistance increasing effects) when in combination with another intervention, even if those interventions acted via parallel mechanisms—a ceiling effect, as proposed by the reviewer. We favor the alternative interpretation, that the mechanisms act sequentially, because of our findings that ins-39 gene expression within AFD was lower at 25C than at 20C, leading us to propose the sequential model in figure 5H (formerly 4G).

      Most of the daf-16 and skn-1 experiments are carried out in AFD-ablated animals, making the relevance of these findings for the 25C-dependent induction of peroxide defense somewhat unclear. As the authors show, AFD ablation causes much more extensive changes than transient 25C exposure, clearly seen in the slope of the line in 3C. Further, unlike 25C exposure, AFD ablation is a chronic and non-physiological state. It would be useful for the authors to be cautious in their interpretation of these findings and to be clearer about how strongly they can connect them to the "enhancer sensing" phenomenon. Along these lines, the potentiation idea could be toned down a bit. Much of the data is consistent with parallel function for daf-16 (and skn-1) - for example, Fig 5C indicates additive effects of daf-16 and 25C exposure; 6C shows that AFD ablation still has a clear effect on peroxide sensitivity in the absence of both daf-16 and skn-1; and Fig S8a shows that much of the transcriptional response to AFD ablation (along PC1) is intact in daf-16 animals.

      We have made several adjustments in the text to address these concerns. As the reviewer noted, the experiments with skn-1 were performed only in AFD ablated worms. We have renamed the section heading to “SKN-1/NRF and DAF-16/FOXO collaborate to increase the nematodes’ peroxide resistance in response to AFD ablation” to make that clear.

      In contrast, the peroxide resistance experiments with daf-16 were also done in worms grown at 25C and then shifted to 20C during the peroxide resistance assay. The connection of daf-16 with the temperature-dependent regulation of peroxide resistance was established in temperature-shift experiments in daf-16 single mutants (Figure 6C, formerly 5C) and in transgenic worms rescuing the daf-16 mutant only in the intestine (Figure 6F). In the revised text we make it clearer that the effect of the daf-16 mutation is bigger when the nematodes are shifted from 25C to 20C: “The daf-16(mu86) null mutation decreased peroxide resistance in nematodes grown at 25°C and assayed at 20°C by 35%, a greater extent than the 21% reduction in peroxide resistance induced by that mutation in nematodes grown and assayed at 20°C (Figure 6C).”

      As the reviewer noted, daf-16 and skn-1 have a role in peroxide resistance when the AFD neurons are not ablated (albeit a smaller one than when those neurons are ablated). We have made several changes and additions to the text to make that explicit. Most notably, the revised last paragraph of the SKN-1 section now reads: “We propose that when nematodes are cultured at 20°C, the AFD neurons promote signaling by the DAF-2/insulin/IGF1 receptor in target tissues, which subsequently lowers the nematode’s peroxide resistance by repressing transcriptional activation by SKN-1/NRF and DAF-16/FOXO. However, this repression is not complete, because both daf-16(mu86) and skn-1(RNAi) lowered peroxide resistance at 20°C when the AFD neurons were present. It is also likely that DAF-16 and SKN-1 are not the only factors that contribute to peroxide resistance in AFD-ablated nematodes at 20°C, because AFD ablation increased peroxide resistance in daf-16(mu86); skn-1(RNAi) nematodes, albeit to a lesser extent than in daf-16(+) or skn-1(+) backgrounds.”

      The potentiation idea was specific to the effects of DAF-16 on gene expression. As the reviewer noted, much of the transcriptional response to AFD ablation is intact (albeit reduced in magnitude) in AFD-ablated daf-16 mutants, leading to a shift in the PC1 score for the mutant. At the level of the expression of individual genes, we quantified those effects in Figure 8G (formerly 7D). When we did the RNAseq experiments we had expected that lack of daf-16 would eliminate either all the changes in gene expression induced by AFD ablation or eliminate those changes for a subset of genes. Instead, what we found was much more subtle, and unexpected: the size of the gene expression change induced by AFD ablation was reduced by the daf-16 mutation, and that reduction was systematic. Specifically, we found that the bigger the change in gene expression induced by AFD ablation, the bigger the effect of daf-16 in the AFD ablated animals (that is, potentiation), leading to a change in the slope in the regression line in Figure 8G. We revised the paper to ensure we only used the word potentiation in this context (gene expression), even though formally DAF-16 also potentiated the effects of AFD ablation (and temperature shift from 25C to 20C) on peroxide resistance.
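
      The slope argument can be made concrete with a toy calculation. This is a minimal sketch, not the analysis behind Figure 8G: the log2 fold changes and the scaling factor of 0.5 are hypothetical values chosen only for illustration, and the function name is our own. If losing daf-16 shrinks every ablation-induced change by the same factor, the regression of the changes in the mutant on the changes in daf-16(+) animals has a slope equal to that factor, and the absolute reduction grows with the size of the original change.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical log2 fold changes induced by AFD ablation in daf-16(+) animals.
change_with_daf16 = [-3.0, -1.0, 0.5, 2.0, 4.0]

# Hypothetical proportional scaling of every change in the daf-16 mutant.
scaling = 0.5
change_without_daf16 = [scaling * c for c in change_with_daf16]

slope = least_squares_slope(change_with_daf16, change_without_daf16)

# Absolute reduction per gene: larger original changes lose more.
reduction = [abs(w) - abs(m) for w, m in zip(change_with_daf16, change_without_daf16)]

print(f"slope = {slope:.2f}")
print("reduction per gene:", reduction)
```

      A slope below 1 with a reduction that scales with the original change is the pattern described above as potentiation; removing the response for only a subset of genes would instead leave the other genes on the identity line.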

      Reviewer #3 (Public Review):

      This paper offers novel mechanistic insights into how pre-exposure to warm temperature increases the resistance of C. elegans to peroxides, which are more toxic at warmer temperature. The temperature range tested in this study lies within the animal's living conditions and is much lower than that of heat shock. Therefore, this study expands our understanding of how past thermosensory experience shapes physiological fitness under chemical stress. The paper is technically sound with most experiments or analyses carried out rigorously, and therefore the conclusions are solid. However, it challenges our current understanding of the role of the C. elegans thermosensory system in coping with stress. The traditional view is that the AFD thermosensory neuron is activated upon sensing temperature rise, and that temperature sensation through AFD positively regulates systemic heat shock response and promotes longevity in C. elegans. Thus, it is quite unexpected that AFD ablation activates DAF-16 and improves peroxide resistance. It also appears counterintuitive that genes upregulated at 25 degrees overlap extensively with those upregulated by AFD ablation at 20 degrees. I feel that it is premature to coin the term "enhancer sensing" for such a phenomenon, as their work does not rule out the possibility that AFD ablation increases resistance to other stresses that are independent of temperature regarding their toxicity or magnitude of hazard. Additional work is necessary to clarify these issues.

      1. Whether the role of AFD in inhibiting peroxide resistance is related to AFD activity needs further clarification. AFD activity depends on the animal's thermosensory experience. As animals in this study are maintained at 20 degrees unless indicated specifically, AFD activity starts around 17 degrees and peaks around 20 degrees. Under such conditions, the AFD displays little or no activity to thermal stimuli around 15 degrees. It will be important to test whether cultivation of animals at 20 degrees improves peroxide resistance at 15 degrees, compared to 15 degrees-cultivation/15 degrees peroxide testing. The authors should also test whether AFD ablation further improves survival under peroxides at 15 degrees for animals grown at 20 degrees, whose AFD should show little or no activity at 15 degrees.

      The reviewer raises an interesting point about the relation between the mechanisms that determine AFD activity in response to temperature and those that enable AFD to regulate peroxide resistance. In the revised manuscript we tested whether known mechanisms enabling AFD to sense changes in temperature acutely (receptor guanylate cyclases GCY-8, GCY-18, and GCY-23) played a role in the temperature dependence of peroxide resistance. We found that they did, as detailed in our response to reviewer #1’s point 1.

      As noted by reviewer #2 in their point 1, and in our reply to that comment (and in a new discussion paragraph in the revised manuscript), the relationship between the known mechanisms that acutely regulate the activity of AFD in response to temperature and the mechanisms by which constant cultivation temperature regulates gene expression in AFD (and therefore the expression of peroxide resistance regulating signals like INS-39) is not well understood. Therefore, it is difficult to predict which temperatures will cause induction of peroxide defenses via AFD-dependent mechanisms, or via other mechanisms. While we agree with the reviewer that it will be interesting to characterize the extent to which other cultivation temperatures besides 25C lead to increased peroxide resistance at lower temperatures (including the proposed shifts from 20C to 15C), we think that those questions will be better addressed in future studies.

      2. The importance of the thermosensory function of AFD should be verified. In the current study, the tax-4 mutation was used to infer AFD activity, but tax-4 is expressed in sensory neurons other than AFD. In addition to AFD, AWC can sense temperature and it also expresses tax-4. Therefore, influence on AFD from other tax-4-expressing neurons cannot be excluded. On the other hand, ablation of AFD removes all AFD functions, including those that are constitutive and temperature-independent. Therefore, the authors should test the gcy-18 gcy-8 gcy-23 triple mutant, in which the AFD neurons are fully differentiated but completely insensitive to thermal stimuli. These three thermosensor genes are exclusively expressed in AFD. Compared to the tax-4 mutant that is broadly defective in multiple sensory modalities, this triple gcy mutant shows defects specifically in thermosensation. They should see whether results obtained from the AFD ablated animals could be reproduced by experiments using the gcy-18 gcy-8 gcy-23 triple mutant. The authors are also recommended to investigate ins-39 expression in AFD and profile gene expression patterns in the gcy-18 gcy-8 gcy-23 triple mutant.

      We thank the reviewer for this suggestion. We have performed the requested experiments, as detailed in our response to reviewer #1’s point 1. Briefly, we found that gcy-18 gcy-8 gcy-23 triple mutants increased peroxide resistance at 20C but not at 25C, and that the respective gcy single mutants affected peroxide resistance at 20C. In light of these findings, we concluded that temperature perception by AFD via GCY-8, GCY-18, and GCY-23 enables C. elegans to lower their peroxide defenses at the lower cultivation temperature.

      3. The literature suggests that AFD promotes longevity likely in part through daf-16 (Chen et al., 2016) or independent of daf-16 (Lee & Kenyon, 2009). Whatever it is, various studies show that activation of AFD and daf-16 promote a normal lifespan at higher temperature, and AFD ablation shortens lifespan at either 20 or 25 degrees. Therefore, the finding that DAF-16-upregulated genes overlap extensively with those upregulated by AFD ablation is quite unexpected (Figure 5B). The authors should perform further gene ontology (GO) analysis to identify subsets of genes co-regulated by DAF-16 and AFD ablation, whether these genes are reported to be involved in longevity regulation, immunity, stress response, etc.

      We thank the reviewer for this interesting comment about the complex mechanisms by which AFD regulates longevity. We note that AFD also has additional temperature-dependent roles in lifespan regulation, as Murphy et al. 2003 found that RNAi of gcy-18 increased lifespan in wild-type worms at 20C but not at 25C. Therefore, AFD-specific interventions can also be lifespan extending at 20C.

      We performed WormCat analysis, which is similar to gene ontology, in Figure 8-figure supplement 2 (formerly Figure S8G), which we described in the results section: “we found that the extent to which AFD ablation affected the average expression of sets of genes with related functions (Higgins et al., 2022; Holdorf et al., 2020) was systematically lower in daf-16(mu86) mutants than in daf-16(+) nematodes (R² = 86%, slope = 0.67, P < 0.0001, Figure 8—figure supplement 2).” Visual inspection of the plot and the very high coefficient of determination of 86% indicate that the size of the effect of AFD ablation on gene expression was systematically smaller when the contribution of DAF-16 to gene expression was removed.

      In the revised manuscript we also moved the three panels quantifying the expression of DAF-16 targets and daf-16-regulated genes from the supplement to the main figure. One of those panels (Figure 8F) shows that genes upregulated by daf-16(+) in daf-2 mutants were disproportionally affected by lack of daf-16 in AFD-ablated worms, as we described in the results section: “In addition, in AFD ablated nematodes, lack of daf-16 lowered the expression of genes upregulated in a daf-16-dependent manner in daf-2(-) mutants (Murphy et al., 2003) to a greater degree than in unablated nematodes (Figure 8F).”

      4. I feel that "enhancer sensing" is an overstatement, or at least a premature term that is not sufficiently supported without further investigations. The authors should explore whether AFD ablation or pre-exposure to warm temperature specifically enhances resistance to a stressor the toxicity of which is increased at higher temperature, but does not affect the resistance to other temperature-insensitive threats.

      We edited the manuscript and expanded the manuscript’s discussion to address these concerns as well as similar concerns from reviewer #1. For clarity, we repeat much of our response to reviewer #1’s point 2 here, with the last paragraph of this response specific to this reviewer’s comment.

      In the paper we show that in C. elegans the regulation of the induction of H2O2 defenses is coupled to the perception of temperature (an inherent enhancer of the reactivity of H2O2). To understand the significance of this finding in an evolutionary context, and to explain why such a regulatory system would evolve, we introduced in the discussion a new conceptual framework, “enhancer sensing,” and devoted a section of the discussion to demonstrating that the phenomenon that we observed could not be adequately explained by existing frameworks used to understand the evolutionary origins of the regulatory systems for defense responses.

      We now realize that we did not explain clearly enough the criterion for establishing that a phenomenon represents enhancer sensing, leading to incorrect predictions by reviewers 1 and 3 about (a) whether what we observed in C. elegans is an instance of enhancer sensing (or more proof is needed) and (b) what the enhancer sensing model for the coupling of temperature perception to H2O2 defense would predict about how temperature and the AFD neurons would affect resilience to other chemicals. We regret failing to adequately explain the model’s scope and predictions and believe that we have now explicitly addressed the scope of what constitutes enhancer sensing and the predictions of the model. In particular, we previously did not spell out (a) the distinction between the enhancer sensing strategy and the mechanistic implementation of that strategy; and, importantly, (b) what the enhancer sensing strategy coupling temperature perception to H2O2 defense in C. elegans predicted (and did not predict) about whether a similar strategy would be expected to be used by C. elegans to deal with other temperature-dependent threats. We now address these issues in two new paragraphs in the discussion that read:

      “We show here that C. elegans uses an enhancer sensing strategy that couples H2O2 defense to the perception of high temperature. We expect this strategy’s output (the level of H2O2 defense) to provide the nematodes with an evolutionarily optimal strategy across ecologically relevant inputs (cultivation temperatures) (Kussell and Leibler, 2005; Maynard Smith, 1982; Wolf et al., 2005). This strategy is implemented at the organismic level through the division of labor between the AFD neurons, which sense and broadcast temperature information, and the intestine, which responds to that information by providing H2O2 defense (Figure 9D). Ascertaining that C. elegans relies on this enhancer sensing strategy does not depend on the temperature information broadcast by AFD exclusively regulating defense responses to temperature-dependent threats, because the regulation of defense towards temperature-insensitive threats could affect defenses towards temperature-dependent threats; for example, suppressing defenses towards a temperature-insensitive threat would be beneficial if those defenses interfered with H2O2 defense or depleted energy resources contributing to H2O2 defense.

      As with any sensing strategy, enhancer sensing strategies are more likely to evolve when sensing is informative and responding is beneficial. In their natural habitat, C. elegans encounter many environmental chemicals that, like H2O2, are inherently more reactive at higher temperatures. It will be interesting to determine the extent to which C. elegans uses enhancer sensing strategies coupling temperature perception to the induction of defenses towards those chemicals, and whether those strategies rely on temperature perception and broadcasting by the AFD neurons. We expect that sensing strategies regulating defense towards those chemicals would be more likely to evolve when those chemicals are common, reactive, and cause consequential damage.”

      We note, in the first of the new discussion paragraphs, that the existence of an enhancer sensing strategy does not depend on whether the AFD neurons (which implement the temperature sensing and temperature-information broadcasting functions regulating peroxide defenses) also regulate defense responses to temperature-insensitive threats. For example, it may be beneficial to an animal facing high concentrations of environmental peroxides to suppress defense against a temperature-insensitive threat when those defenses are detrimental to hydrogen peroxide defense. This could occur, for example, because there is an energetic trade-off when mounting multiple defense responses, or because specific defenses towards temperature-insensitive threats interfere with peroxide defense. As we noted in our response to reviewer #1’s point 2, our ability to predict survival to threats other than H2O2 (including temperature-independent threats) is limited not only by our lack of knowledge about the specific mechanisms that protect worms from those threats, but also by our inability to predict the extent to which defenses towards different threats operate independently, constructively, or destructively with those that provide hydrogen peroxide defense. We therefore think that those experiments would be better addressed in future studies.

    1. Author Response

      Reviewer #1 (Public Review):

      This study examines whether the D2 receptor antagonist amisulpride and the mu-opioid receptor antagonist naltrexone bias model-based vs model-free behavior in a well-established two-step task of behavioral control. The authors find that amisulpride enhances model-based choices, which is further supported by computational modeling of the data, revealing an increase in the relative contribution of model-based control of behavior. Naltrexone, on the other hand, had no reliable effect on model-based behavior.

      Overall, this is a very nice study with many strengths, including the task and data analysis. A particular strength of the design is the combination of a between-subject drug administration protocol with two within-subject (baseline vs. drug) sessions. This reduces between-subject variability in baseline model-based vs model-free behavior and enhances the power to detect drug effects.

      The introduction could do a better job articulating the rationale for testing the effect of these two specific drugs. Currently, the rationale is that both transmitter systems targeted by these drugs are involved in drug addiction, which is characterized by an imbalance in model-based vs. habitual control of behavior. This appears somewhat indirect.

      Blood draws were used to determine serum levels for amisulpride and naltrexone but these data are not included as covariates in the analysis.

      We thank the reviewer for the high acclaim of our study, and for the constructive comments to improve it. We acknowledge that the introduction did not motivate the main research goal of the manuscript clearly enough. We have now extended this section and provided further insight into our reasoning behind the study design. Beyond the involvement of opioid- and dopamine-promoting drugs in addiction, there is abundant evidence from experimental studies showing comparable effects of manipulating receptors of both systems in model-free processes such as reinforcement and habit formation. Based on this overlap one may predict that both neurotransmitter systems contribute to habit formation in a similar fashion, and that blocking their respective receptors will improve the ability to behave in a model-based manner. However, as we now elaborate in the manuscript, an argument against this could be that disrupting model-free processes might not be enough to promote model-based behaviour, as such behaviour relies heavily on cognitive control. It is therefore especially interesting to compare opioid antagonists, which do not enhance cognitive function, with a D2 antagonist at a dosage that has been shown to increase cognitive control as well as increase the desire to exert cognitive effort.

      This is expressed in the following paragraphs of the Introduction (p.2 §3 and p.3 §1):

      “Opiates, psychostimulants, and most other drugs of abuse increase the release of dopamine along the mesolimbic pathway (Chiara, 1999; Koob & Bloom, 1988), a circuit that plays a central role in reinforcement learning (Schultz, Dayan, & Montague, 1997). On top of this, the reinforcing properties of addictive drugs also depend on their ability to activate the μ opioid receptors (Becker, Grecksch, & Kraus, 2002; Benjamin, Grant, & Pohorecky, 1993; Le Merrer, Becker, Befort, & Kieffer, 2009). This suggests that both the dopamine and the opioid systems might be particularly relevant in model-free reinforcement learning processes that drive the formation of habitual behaviour. Studies in rodents show that activating receptors of both systems across the striatum increases cue-triggered wanting of rewards (Peciña & Berridge, 2013; Soares-Cunha et al., 2016). Conversely, inhibition of both D1-type and D2-type of dopamine receptors (referred to as D1 and D2 from here on) as well as opioid receptors reduces motivation to obtain or consume rewards (Laurent, Leung, Maidment, & Balleine, 2012; Peciña, 2008; Soares-Cunha et al., 2016). This data raises the hypothesis that the drift towards habitual control is enabled by dopamine and opioid receptors via a common neural pathway. Recent work in humans provides some evidence in this direction, whereby systemic administration of opioid and D2 dopamine receptor antagonists causes a comparable reduction of cue responsivity and reward impulsivity (Weber et al., 2016) and decreases the effort to obtain immediate primary rewards (Korb et al., 2020). This suggests that when allocating control between the model-based and model-free system, dopamine or opioid receptor antagonists might comparatively disrupt model-free behavioural strategies and increase model-based behaviour. Yet, no study in humans has directly investigated this. 
Furthermore, disrupting habit formation might not in itself lead to increased model-based control, without either increasing the perceived value of applying cognitive control or making it easier to do so.”

      We also mention the implications of this direct comparison of the two compounds in the Discussion (p.8 §1):

      “Our findings provide initial evidence for a divergent involvement of the dopamine and opioid neurotransmitter systems in the shift between habitual and goal-directed behaviour. The lack of effects of naltrexone on the model-based/model-free trade-off also provides some support for the notion that simply disrupting neurobiological systems that subserve habitual behaviour might not be enough to increase goal-directed behaviour in this task. An increase in the model-based/model-free weight following amisulpride administration advocates for dopamine playing a decisive role in flexibly applying cognitive control to facilitate model-based behavior and highlights the specific functional contribution of the D2 receptor subtype.”

      Reviewer #3 (Public Review):

      I think this is an interesting study on an important topic. I agree that there is not enough research to understand how the dopaminergic system interfaces with goal-directed planning, and I like the focus on specific types of dopamine receptors. It is interesting that they seem to find a specific effect on just the dopamine antagonist. I also appreciate the clarity with which the authors describe this field of research and their results. However, I also feel that there are several concerns with this paper, both in terms of framing and in terms of the experimental design and analysis. For completeness, I must note that I am not a dopamine expert.

      I felt that the introduction of the paper did not sufficiently motivate the focus on the comparison between neurotransmitter systems, and (for the dopaminergic system) the distinction between D1/D2 receptors. Why is the mapping between stability/flexibility and D1/D2 receptors important? How does this relate to model-based control? Why do the authors predict that model-based control would increase when D2 receptors are blocked? If the hypothesis is about contrasting the contribution of D1 and D2 receptors to goal-directed control, why did the authors not use antagonists directly targeting these two systems?

      In addition, the predictions that are more explicit, for example, that blocking D2 receptors increases MB control by stabilizing goal-relevant information, are fairly specific. However, the current version of the two-step task is not amenable to testing such a specific hypothesis, because it doesn't allow us to measure the specific components of planning (e.g., maintaining goals, the representation of the structure, prospective reasoning). Moreover, MB control in this version of the two-step task is marked by flexibility, because it requires the agent to be sensitive to switching starting states.

      The predictions for the opioid system are also lacking. Why are the authors targeting this system? Why are they comparing the effects of the D2 antagonist with the opioid antagonist? Why do the authors predict that amisulpride should have a stronger effect than naltrexone? In my opinion, these predictions were not sufficiently laid out, which made it difficult to appreciate the authors' motivation to run the study.

      We thank the reviewer for their critical take on the manuscript and for clearly pointing out the weaknesses in argumentation. In particular, we appreciate the reviewer’s comment on the lack of clarity in describing why the comparison of dopamine and opioid antagonists’ effects on MB/MF behaviour might be particularly interesting and why we focused on D2 and not D1 receptors. We have now extended the introduction section to clarify our rationale for comparing these two compounds (p.2-3). In short, apart from the fact that both systems are implicated in addiction, there is also abundant experimental evidence from human and non-human animal studies that the two systems are involved in processes related to forming habitual responses to primary and secondary rewards. This suggests that blocking receptors of either system might comparably affect the MB/MF trade-off by impairing model-free processes. We therefore proceeded to compare opioid and dopamine antagonists.

      As we note, using D1 antagonists would likely be detrimental to cognitive control related processes, and therefore more likely to decrease model-based performance. We therefore chose to compare opioid antagonists to D2 receptor antagonists. Another important reason for comparing the effects of opioid and D2 dopamine antagonists is that it is not clear whether blocking model-free processes is in itself enough to promote model-based behaviour, without boosting cognitive control related processes. Given the recent evidence for D2 antagonists increasing cognitive effort (Westbrook et al., 2020) and the proposed role of prefrontal D2 receptors in destabilising prefrontal representations (according to the dual state theory of prefrontal dopamine function proposed by Durstewitz & Seamans, 2008), we reasoned that D2 receptor blockade might also boost the ability (or willingness) to keep the mapping between spaceships and planets online while making choices.

      We incorporated these arguments in the revised Introduction (p.2-3):

      “Opiates, psychostimulants, and most other drugs of abuse increase the release of dopamine along the mesolimbic pathway (Chiara, 1999; Koob & Bloom, 1988), a circuit that plays a central role in reinforcement learning (Schultz et al., 1997). On top of this, the reinforcing properties of addictive drugs also depend on their ability to activate the μ opioid receptors (Becker et al., 2002; Benjamin et al., 1993; Le Merrer et al., 2009). This suggests that both the dopamine and the opioid systems might be particularly relevant in model-free reinforcement learning processes that drive the formation of habitual behaviour. Studies in rodents show that activating receptors of both systems across the striatum increases cue-triggered wanting of rewards (Peciña & Berridge, 2013; Soares-Cunha et al., 2016). Conversely, inhibition of both D1-type and D2-type of dopamine receptors (referred to as D1 and D2 from here on) as well as opioid receptors reduces motivation to obtain or consume rewards (Laurent et al., 2012; Peciña, 2008; Soares-Cunha et al., 2016). This data raises the hypothesis that the drift towards habitual control is enabled by dopamine and opioid receptors via a common neural pathway. Recent work in humans provides some evidence in this direction, whereby systemic administration of opioid and D2 dopamine receptor antagonists causes a comparable reduction of cue responsivity and reward impulsivity (Weber et al., 2016) and decreases the effort to obtain immediate primary rewards (Korb et al., 2020). This suggests that when allocating control between the model-based and model-free system, dopamine or opioid receptor antagonists might comparatively disrupt model-free behavioural strategies and increase model-based behaviour. Yet, no study in humans has directly investigated this. 
Furthermore, disrupting habit formation might not in itself lead to increased model-based control, without either increasing the perceived value of applying cognitive control or making it easier to do so. Crucially, there are important differences in how each of the two neurochemical systems relates to the cognitive control that is pivotal for model-based behaviour. Across a wide range of studies using various dosing schemes, opioid receptor antagonists did not have an effect on tasks that require cognitive control, such as working memory (Del Campo, McMurray, Besser, & Grossman, 1992; File & Silverstone, 1981; Volavka, Dornbush, Mallya, & Cho, 1979), sustained attention (Zacny, Coalson, Lichtor, Yajnik, & Thapar, 1994), or mathematical problem-solving (Del Campo et al., 1992) (see van Steenbergen, Eikemo, & Leknes, 2019, for a review). Dopaminergic circuits, on the other hand, play a central role in higher cognitive functions and goal-directed behaviour (Brozoski, Brown, Rosvold, & Goldman, 1979). In particular, D1 dopamine receptors in the prefrontal cortex enable maintenance of goal-relevant information and working memory (Goldman-Rakic, 1997; Sawaguchi & Goldman-Rakic, 1991; van Schouwenburg, Aarts, & Cools, 2010; Williams & Goldman-Rakic, 1995), while D2 dopamine receptor activity disrupts prefrontal representations (Durstewitz & Seamans, 2008). In support of this, decreased working memory performance was observed after blocking prefrontal D1, but not prefrontal D2 receptors (Arnsten, 2011; Sawaguchi & Goldman-Rakic, 1991; Seamans & Yang, 2004). In humans, systemic administration of D2 antagonists increased the ability to maintain and manipulate working memory representations (Dodds et al., 2009; Frank & O’Reilly, 2006) and increased the value of applying cognitive effort (Westbrook et al., 2020). 
This data suggests that blocking D2 receptors, in contrast to blocking opioid receptors, could further facilitate model-based behaviour through enabling or encouraging flexible use of cognitive control.”

      Another important point that the reviewer stresses is that the two-step task we use does not allow us to draw any conclusions about the mechanisms through which amisulpride increases model-based behaviour. Although we base our hypothesis that D2 blockade might promote model-based behaviour (on top of disrupting habit formation) on previous work showing that D2 blockade increases cognitive effort and the ability to manipulate working memory representations, we completely agree that our setup does not give any definite answers about which of these cognitive processes mediated the increase in model-based weights. In the discussion we try to interpret our findings in the context of the dual-state hypothesis framework and within the framework of striatal control of adaptive behaviour (p.8 §3-4), whereby we centre our argumentation around dopaminergic circuits that subserve one or the other mechanism.

      We agree with the reviewer that the task requires a high degree of flexible planning and that the dual-state theory might not be enough to account for our effects. We mention this in the Discussion (p. 8 §3):

      “The effects of D2 antagonism on model-based/model-free behaviour in our study can be interpreted within this [dual-state] framework to result from increased ability to maintain prefrontal representation of the mapping between the spaceships and the planets online. However, this is difficult to reconcile with the fact that model-based behaviour in dynamic learning paradigms, such as the one used here, also requires flexible updating of action values.”

      We also elaborate on the general limitations of drawing inference about the underlying cognitive/computational mechanisms in the Discussion (p. 14 §2):

      “Importantly, it should also be acknowledged that the behavioural setup in our study does not allow us to draw definite conclusions about the mechanisms that mediate amisulpride’s effects on model-based or model-free behaviour. For example, it is not clear whether amisulpride increases the perceived benefit of applying cognitive control, or whether it increases the participant’s ability to do so through various possible complementary processes, such as goal maintenance or planning abilities. Future studies should further elucidate the mechanistic contributions of dopamine receptors to the distinct coding and utilisation of task relevant representations (Langdon, Sharpe, Schoenbaum, & Niv, 2018; Stalnaker et al., 2019).”

      Related to this, I felt that the introduction was a bit too quiet on the genetic markers. Their discussion in the results was a bit surprising, and it wasn't quite clear why the authors decided to investigate these interaction effects.

      We appreciate this comment as we were quite uncertain ourselves on how much weight to give to those data. Previous research had indeed shown profound variability in MB/MF behaviour across genotypes related to baseline dopamine function. The main purpose of the genetic analysis was to control for potential baseline differences and to explore the drug-by-genotype interactions. However, including the serum data as a covariate in analyses, as suggested by the other reviewers, made most results relating to the genetic analysis disappear, even when using less conservative priors that likely understate the variance of posterior distributions of group effects. We have therefore opted to keep coverage of the genetic data to a minimum, but still report the results and make the data available online for future studies.

      I found some of the core results confusing. Most importantly, why does amisulpride make people less likely to stay after a reward when the first-stage state is the same? When first-stage states repeat, both an MB agent and an MF agent will be more likely to stay after a reward. To me, this kind of behavior doesn't seem particularly model-based. Why does this behavior occur under amisulpride? I was surprised that the authors did not really address it.

      We agree that these results have been somewhat difficult to reconcile. However, adding amisulpride serum levels to our analyses now allows us to get a better understanding. It seems that model-based behaviour was increased across both serum groups; however, only in the high serum group did we additionally observe increased exploration. We also note that increased exploration was related to a reduced effect of previous points in trials with the same first state, whereas the interaction term (effect of previous points in different vs. same first state trials) was more strongly associated with the model-based weight. In the manuscript this is described in the results section and in the discussion.

      The following text is included in the Results (p.6):

      “We first observed that the more model-based choices the participants made, the more money they earned (r = 0.65, 95% CI [0.53, 0.76]). This serves as a validity check of the task, which was designed to make cognitive control pay off (literally) [45]. We then looked at how the model parameters relate to the random slopes from the behavioural analysis of staying behaviour and found that the participant-level (random effect) slope for the effect of previous points on staying behaviour in different vs. same first state trials was most strongly related to ω (d = 0.493, P < 10e-3) and negatively related to the inverse temperature parameter η (d = -0.328, P < 10e-3), and the slope for trials with same first states was mostly related to η (d = 0.822, P < 10e-3), and less so to ω (d = 0.235, P < 10e-3).”
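
      To make the roles of ω and η concrete, here is a minimal sketch of one common way such parameters enter a hybrid choice rule. It assumes a weighted mixture of model-based and model-free action values passed through a softmax with an optional stickiness bonus. The function name, the argument names, and the exact functional form are assumptions for illustration and may differ from the model fitted in the manuscript.

```python
import math


def hybrid_choice_probs(q_mb, q_mf, omega, eta, stickiness=0.0, prev_choice=None):
    """Choice probabilities from a hybrid model-based/model-free value mixture.

    q_mb, q_mf  : per-action model-based and model-free values (same length)
    omega       : model-based weight in [0, 1]; 1 = purely model-based
    eta         : softmax inverse temperature; lower values = more stochastic choices
    stickiness  : bonus added to the previously chosen action (assumed form)
    prev_choice : index of the previous choice, or None
    """
    logits = []
    for action, (mb, mf) in enumerate(zip(q_mb, q_mf)):
        q_net = omega * mb + (1.0 - omega) * mf          # weighted value mixture
        bonus = stickiness if action == prev_choice else 0.0
        logits.append(eta * q_net + bonus)
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(value - max_logit) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Toy values in which the two systems disagree about the better action.
probs = hybrid_choice_probs(q_mb=[1.0, 0.0], q_mf=[0.0, 1.0], omega=0.8, eta=3.0)
print(probs)
```

      In this sketch, raising ω pulls choices towards the action favoured by the model-based values, while lowering η flattens the choice probabilities regardless of which system drives the values. This is why the two parameters can load differently on the two behavioural slopes.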

      The following text is included in the Discussion (p.8 §2):

      “Interestingly, amisulpride also increased choice stochasticity parametrised by the softmax inverse temperature parameter. In a paradigm with two choice options, it cannot be definitively determined whether this indicates higher decision-noise or increased exploration of alternative choices. We can however speculate that increased decision noise would lead to overall detrimental effects on learning in both trial types with same and different consecutive first stage states, which we do not observe in our data. The effect on the choice stochasticity parameter was only present in participants with a higher effective dose [75], suggesting that the effect was more likely to be post-synaptic. Similarly, in the same effective dose group, we found some evidence that amisulpride reduces response stickiness, indicating increased switching between actions. This is well in line with a prominent model of the cortico-striatal circuitry implicating post-synaptic D2 receptors in exploration/exploitation [65] and supported by empirical data. In animal studies, activation of D2 receptors was shown to lead to choice perseverance and more deterministic behaviour, whereas D2 receptor inhibition increases the probability of performing competing actions and increases randomness in action selection [76]. In humans, a recent neurochemical imaging study showed that D2 receptor availability in the striatum correlated with choice uncertainty parameters across both reinforcement learning and active inference computational modelling frameworks [77]. Increased choice uncertainty was also observed in social and non-social learning tasks in a study using 800 mg of sulpiride, a dose that is known to exert post-synaptic effects [54,78]. We note, however, that the evidence for the difference in exploration between the low and high serum groups was not robust (p=0.066). 
Furthermore, it has been suggested that increased striatal dopamine is also related to a tendency for stochastic, undirected exploration [79,80], arising due to overall uncertainty across available options [79] or through increasing the opportunity cost of choosing the wrong option [68,71]. This suggests that the same biological signature that leads to increased cognitive effort expenditure also promotes choice exploration. In line with this, both prior studies that investigated the effect of increasing dopamine availability with L-DOPA on model-based/model-free behaviour observed increased choice exploration as well as increased model-based behaviour (although in one it was only present in individuals with a higher working memory capacity) [55,58].”

      With regards to the design, it is unfortunate that the order of drug administration is not counterbalanced. As far as I understand, model-based control is always measured without a drug in the first session, and then with the drug (or placebo) in the second. The change between sessions is then tested for all three conditions. Of course, it is possible that the increase in model-based control in the amisulpride condition is only driven by the drug. However, given the lack of counterbalancing, it's also possible that amisulpride increases model-based control only after the experience with the task. That is, if the authors had counterbalanced the drug effect, they may have found that amisulpride had a different effect if it was administered in the first session. That would have changed their interpretation quite a bit! As it stands, they are unable to verify their (admittedly simpler) hypothesis that there is only a main effect.

      We thank the reviewer for this comment. Indeed, a full within-subject design would have been statistically more powerful and would have enabled us to exclude the possibility that amisulpride’s effect on model-based behaviour is indirect. We have now included the following paragraph in the discussion that aims to highlight the limitation of not counterbalancing the drug administration (p.10):

      “One of the strengths of our design is a baseline measure, and the fact that the participants were all introduced to the task under no administration, thus avoiding potential effects of the treatment on task training. Although this design allowed us to reduce between-subjects variability, we cannot completely exclude order effects. Although unlikely, it is possible that the effects of the treatment that we observe come indirectly from the effects of the two drugs on either skill transfer from the previous session, or simply on the effect of the drugs on the part of the experiment that preceded the task. For instance, participants under amisulpride could be less tired from other tasks and therefore more willing to exert effort in the task presented here. Speaking against this is the observation that we found no differences in mood between amisulpride and placebo regardless of low or high serum levels.”

    1. the essay is like a journey, we may be more mindful of our intended audience, with whom we are bringing along as fellow travelers.

      this is an interesting way to think about it, but it kind of helps!

    1. not as clear and neat as it might seem. To act as though it were is to invite all kinds of trouble. If we pretend that our role behavior is somehow not connected to who we really are, for example, then we avoid taking responsibility not only for the role but also for our portion of the play

      In this passage I found this example of social life as theater very interesting and well said. The way we participate in social life is based on our behavior and how we use it to interact with each other. Although our actions may differ based on the social system that we are in or the people we are interacting with, the way an individual acts still represents who they are as a person because they are choosing to act that way on their own. They are choosing to act that way because that's how they want other people to view them. However, I do think that this concept can also be a little confusing to people. This is because individuals who pretend to act in a certain way in specific environments, because of different circumstances, may argue against that concept and say that they had a reason to act like that, whether it was to get out of that situation, because they were uncomfortable, or simply because they wanted to fit in. Ultimately, I think that it is still associated with who you are as a person.

    1. Author Response

      Reviewer #1 (Public Review):

      This study presents a series of experiments that investigate maternal control over egg size in honey bees (Apis mellifera). Honey bees are social insects in which a single reproductive female (the queen) lays all the eggs in the colony. The first set of experiments presented here explore how queens change their egg size in response to changes in colony size. Specifically, they show that queens have relatively larger eggs in smaller colonies, and that egg size changes when queens are transplanted into colonies of a different size (i.e. confirming that egg size is a plastic trait in honey bee queens). The second set of experiments investigates candidate genes involved in egg size determination. Specifically, it shows that Rho1 plays a role in determining egg size in honey bee queens.

      In principle, we agree with this summary, although we find the experimental demonstration that perceived colony size affects egg size (first set of experiments) and the overall proteomic comparison of ovaries that produce small and large eggs (second set of experiments that indicate the upregulation of metabolism, protein transport, cytoskeleton organization, and a few other processes in large egg-producing ovaries) also important.

      A strength of the study is that it combines both manipulative field (apiary) experiments and molecular studies, and therefore attempts to consider broadly the mechanisms of plasticity in egg size. The link between these two types of dataset in the manuscript, however, is not strong. While the two parts are related, the molecular experiments do not follow from the conclusions of the field experiments but rather run in parallel (both using the same initial treatments of queens from large v small colonies).

      We would welcome suggestions on how to further strengthen the integration between the field experiments and our molecular studies. We sought to explore the molecular basis of the observed plasticity in reproductive behavior and thus focused on samples from the first set of experiments for our proteome comparisons, realizing that every additional field experiment could have entailed a similar molecular follow-up. We attempted to bring molecular studies and field experiments back together with the RNAi-mediated knock-down of Rho1 in queens that produce eggs in differently-sized colonies under realistic apicultural conditions. There may be better, additional opportunities for a closer integration of molecular and field experiments, but we could not conceive of them.

      Another strength of the study is the focus on social cues for egg size control in a social insect. Particularly interesting is data showing that queens suddenly exposed to the cues of a larger colony (even where egg-laying opportunities did not actually increase) will decrease their egg size, in the same way as queens genuinely transplanted to larger colonies. That honey bee queens can control their egg sizes in response to cues in the colony is not unexpected, given that queens are known to vary egg size based on the cell type they are laying into (queen, drone or worker cell). Nevertheless, it is interesting to show that worker egg sizes over time are also mediated by social cues.

      We thank the reviewer for this positive assessment and want to highlight that this experiment not only controls for egg laying opportunities, but also for potentially greater resource availability in larger colonies. These results are therefore important for the key argument that egg size is actively regulated by honey bee queens.

      A weakness of the study is that the consequence of egg size on egg development and survival in honey bees is not made clear. The assumption is that larger egg size compensates for smaller colonies in some way. Do smaller eggs (i.e. those laid in large colonies) fare worse in smaller colonies than they do in large colonies? Showing that the variation in egg size is biologically relevant to fitness is an important piece of the puzzle.

      We agree that the consequences of egg size variation are important to address beyond our previously published data set and the benefits demonstrated in other contexts by other authors. However, to comprehensively resolve the consequences requires considerable additional experiments that exceed the scope of our current study, which is primarily focused on the causes of the queens’ reproductive plasticity.

      Also, the relationship between egg number and egg size in honey bees remains rather murky. Does egg size depend at least in part on daily egg laying rate (which is sure to be greater in larger colonies)? The study makes an effort to explore this by preventing queens from laying for two weeks and then comparing their egg size when they resume to those that did not have a pause in laying. Although egg size did not vary between the groups in this case, it is unclear whether the same effect would be seen if queens had simply been restricted from laying at such high rates (e.g. if available empty brood cells had been reduced rather than removed entirely).

      We agree that the relation between egg number and egg size is complicated. We have added more data that show that egg laying rates can be higher in larger colonies than in smaller colonies. We also report now that the egg size is negatively correlated to egg number, although not in all instances, which partially supports (and partially contradicts) our previous findings (Amiri et al. 2020). We have modified the discussion of our results to account for the additional results and point out the limitation of the experiment with caged queens. It is important to realize though that the queens were caged on comb and not restricted in typical, small queen cages that are used for queen transport. It is not clear whether this treatment resulted in a downregulation of the reproductive efforts and/or the resorption of eggs.

      Overall this study makes new contributions to our understanding of maternal control over egg size in honey bees. It provides stepping stones for further investigation of the molecular basis for egg size plasticity in insects.

      We agree that we could not resolve everything in this study and that more investigations are needed.

      Reviewer #2 (Public Review):

      This paper builds on recent work showing that honeybee queens can change the size of the eggs they lay over the course of their life. Here the authors identified an environmental condition that reversibly causes queens to change their egg sizes: namely, being in a relatively small or large colony context. Recently published work demonstrated the existence of this egg size plasticity, but it was completely unknown what signaled to the queen. In a series of simple and elegant experiments they confirmed the existence of this egg size plasticity, and narrowed down the set of environmental inputs to the queen that could be responsible for signaling the change in the environment. They also began the work of identifying genes and proteins that might be involved in controlling egg size. They did a comparative proteomic analysis between small-egg-laying ovaries and large-egg-laying ovaries, and then selected one candidate gene (Rho1). They showed that it is expressed during oogenesis, and that when it is knocked down, eggs get smaller.

      This is a good summary, although we think that it is fair to add that the expression of Rho1 is specific to the egg growth stage, and that we found an almost perfect correlation of Rho1 mRNA levels and egg size in two separate experiments (in addition to the difference between large and small egg-producing ovaries at the protein level).

      The experiments on honeybee colonies are well-designed, and they provide fairly strong evidence that the queens are reversibly changing egg size and that it is (at least some component of) their perception of colony size that is the signal. One minor but unavoidable weakness is that experiments on honeybees are necessarily done with small sample sizes. The authors were clear about this, however, and it was very effective that they showed all individual data points. Alongside the previous work on which this paper builds, I found their core results to be rather convincing and important.

      We thank the reviewer for this positive evaluation.

      I found the parts of the paper on oogenesis to be useful, but overall less informative in answering the questions that the authors set out for those sections. On balance, I think the best way to interpret the oogenesis results is as "suggestive and exploratory". For instance, the experiment aimed at understanding the relationship between egg-laying rate and egg size does not include a direct measurement of egg-laying rate, but instead puts queens in a place with no suitable oviposition sites. The proteomic analysis was fine, but since they were using whole ovaries, with tissue pooled across all stages of oogenesis including mature oocytes, I would be cautious in interpreting the results to mean that they had identified proteins involved in making larger eggs. These proteins might just as easily be the proteins that are put into larger eggs. In fact, for the one candidate gene that is examined, its transcripts seem as though they are predominantly in the oocyte cell itself rather than in the supporting cells that actually control the egg size (although it is hard to tell from the micrographs without a label for cell interfaces).

      We have added data on the number of eggs produced in the first experiment, which actually show a negative correlation between egg size and egg number. In addition, we have cautioned our wording about the conclusions that can be drawn from the oviposition restriction experiment. Concerning the expression and role of Rho1, we apologize for the lack of a cell membrane marker. However, we share the reviewer’s interpretation that the mRNA is located in the oocyte. While we also agree that egg loading from the nurse cells is important, transport of vitellogenin from the follicle cells may also be quite significant for egg size (Wu et al. 2021 – doi:10.3389/fcell.2020.593613 and Fleig 1995 - doi:10.1016/0020-7322(95)98841-Z), a process that could be controlled by Rho1 in the documented location. We have added to the discussion to clarify this point.

      On that note, with the caveat that the sample sizes are quite small, I agree that there is some evidence that Rho1 is involved in honeybee oogenesis. If this was the only gene they knocked down, and given that it results in a small size change with such a small sample size, it strikes me as a bit of a stretch to say that these results are evidence that Rho1 plays an important role in egg size determination. It is essential to know if this is a generic result of inhibiting cytoskeletal function or a specific function of Rho1. That is beyond the scope of this study, but until those experiments are done, it is hard to know how to interpret these results. For context, in Drosophila, there are lots and lots of genes such that if you knock them down, you get a smaller or differently shaped egg, including genes involved in planar polarity, cytoskeleton, basement membrane, protrusion/motility, septate junctions, intercellular signaling and their signal transduction components, muscle functions, insect hormones, vitellogenesis, etc. This is helpful, perhaps, for thinking about how to interpret the knockdown of just one gene.

      We thank the reviewer for this perspective and have consequently cautioned our wording. The role of Rho1 in regulating the cytoskeletal function has been established in other organisms, but we do not have the tools to study the corresponding pathways and establish causality in honey bees. We have added to the discussion to alert the reader to the point that additional experiments are necessary.

      Overall, I found the results to be technically sound, and there are several clever manipulations on honeybee colonies that will doubtless be repeated and elaborated in the future to great effect. The core result-that queens can change the size of their eggs quickly and reversibly, in response to some perceived signal-was honestly pretty astonishing to me, and it reveals that there are non-nutritive plastic mechanisms in insect oogenesis that we had no idea existed. I look forward to follow-up studies with interest.

      We thank the reviewer for the overall evaluation and encouragement to continue our research.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors compared the various multinucleated cells: osteoclasts, LGC and FBGC. Overall, the manuscript shows rigor in the analyses, and also very interesting approaches for retrieving mononuclear cells, for instance using DC-STAMP siRNA. This work adds very much to understanding the biological differences, as summarized in figure 6h. A lot of work in the osteoclast field, for instance with qPCR, is hampered because, inevitably, a mix of mononuclear and multinucleated cells is always measured. Here, a solid attempt to separate those mixes with cell sorting, and subsequent analysis of the mononuclear and multinucleated isolates, really adds value. The choice of figures is good, and the extra info in the supplementary figures is relevant and makes the manuscript easy to read.

      Major and minor concerns:

      1. For osteoclasts, various markers exist for their biological characterization, for instance the ability to resorb bone. What, apart from the arrangement and number of nuclei, were the biological parameters that confirmed that the cells made by addition of IFN or IL-4 were LGC and FBGC?

      [Authors’ reply]. In order to address this point, we focused on gene sets that characterize LGCs and FBGCs. By doing so, we aimed to identify (i) lineage dependent factors and (ii) markers of LGCs and FBGCs. (See new Supplementary Figure 1B and C, New Supplementary Table 1 and highlighted text in Results). As expected, and in line with the lineage-determining factors, the transcriptomics comparison between mononucleated/multinucleated IFN-γ and IL-4-differentiated macrophages showed predominance of IFN-γ and IL-4-related pathways, respectively (Supplementary Figure 1B and C and Supplementary Table 1). Among known LGC and FBGC markers, we confirmed up-regulation of CCL7 [1] and CD86 [2], respectively. As per the biological parameters, we indeed confirm that FBGCs show enhanced phagocytosis properties (Figure 5C) while LGCs can form granuloma-like clusters in vitro (Figure 4D and E). Altogether, we characterize LGCs and FBGCs with (i) polykaryon-specific nuclear arrangement, (ii) polykaryon-specific gene expression markers, (iii) previously shown and new phenotypic characteristics such as LGCs’ unique ability to form in vitro clusters containing CD3+ cells.

      In fig 2c: did the authors perform stainings with isotype control antibodies? In my experience, quite often, antibodies stain mononuclear cells much more intensely, since the cytoplasm is much more condensed and less spread over a large area.

      [Authors’ reply]. According to the reviewer’s suggestion, we provide isotype control staining for MRC1 in IFN-g-stimulated mononucleated/multinucleated cells by ImageStream (left panel) and immunofluorescence in LGCs, FBGCs and osteoclasts (right panel). There was negligible staining with the isotype control antibody for MRC1 in both settings (Figure provided to the journal).

      We did not observe a potential staining artefact in multinucleated cells when compared to mononuclear cells. In fact, some markers of multinucleation such as B7-H3 are augmented in LGCs (Figure 4E).

      Resorption assay in 6 is not clear. It is weird that osteoclasts apparently display so limited resorption? Also the traces are not typical for osteoclasts. Please explain.

      [Authors’ reply]. Human osteoclasts are cultured for 2 days on hydroxyapatite-coated plates and the amount of resorption is dependent on the healthy donor the peripheral blood is derived from. In addition to genetic variability, the support (hydroxyapatite) is different from dentine, which is also widely used for measuring osteoclast resorptive activity. The human osteoclast resorption is visualized by transparency (areas no longer coated by hydroxyapatite due to its resorption) in ImageJ.

      Provide a better image Supplementary 2A, even at 250% the lettering is vague. What do the colours in 2A mean?

      [Authors’ reply]. According to the reviewer’s suggestion, we now provide the Supplementary Figure 2A with better resolution. In STRING protein-protein interaction analysis, there is no particular meaning of the node color itself.

      CROSS-CONSULTATION COMMENTS

      I have read the comments of the other two reviewers, and I absolutely agree with their additions. Indeed, supplementary tables are lacking, and there could be a bit more emphasis on the fact that it is all in vitro work. Together, I think the three of us are complementary in our comments, with good overlap as well. Any effort to stain, for instance, pathology material with the markers that have been found would be great, especially for the LGC and the FBGC, which are much less studied in the field of MNGs. Having said that, I can also live without this addition, but then it could be highlighted in the discussion that these are the future avenues that should be considered. Collaborate with Pathology!

      [Authors’ reply]. We appreciate that the reviewer provides cross-consultation comments which we address in our revised manuscript. As such, we discuss future avenues regarding the translatability of these results to human pathology involving MGCs.

      Reviewer #1 (Significance (Required)):

      This manuscript is particularly interesting to those who are interested in the BIOLOGY of MNCs. In essence, three types of MNCs were cultured and compared, with each of them a specific function.

      I am an osteoclast expert (76 publications), and have two publications on FBGCs

      [Authors’ reply]. We sincerely thank the reviewer for his/her pertinent comments, enthusiasm for our findings and for providing us an overall summary of our findings in view of all other reviewer comments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, the authors performed a comparative transcriptome analysis of mononuclear and multinuclear human osteoclasts, LGCs and FBGCs. They found that multinucleation triggers a significant downregulation of macrophage identity in all three types of MGCs. Furthermore, RNA-seq data and in-vitro functional analysis of multinucleated cells showed that macrophage cell-cell fusion and multinucleation enhance phagocytosis and contribute to lysosome-dependent intracellular iron homeostasis. Furthermore, multinucleation of osteoclasts promoted mitochondrial activity and oxidative phosphorylation, resulting in maximal respiration. This unique and interesting study addresses the fundamental question of how cell-cell fusion and multinucleation contribute to cellular activity and biological homeostasis.

      Major comments

      1 The authors generated mature multinucleated cells by stimulating human PBMC-derived macrophages with either IFN-g, IL-4, or RANKL. However, no quantitative data have been presented to determine how effectively IL-4, IFN-g, and RANKL can induce multinucleated giant cells from mononuclear macrophages. Quantitative data showing induction efficiency would provide a more detailed picture of the overall experiment.

      [Authors’ reply]. According to the reviewer’s suggestion, quantitative data showing the efficiency of these cytokines to induce multinucleation (i.e. fusion index) is now provided as part of the revised Supplementary Figure 1A (right panel).
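
      As an illustration only, a fusion index of the kind mentioned above could be computed as sketched below. The sketch assumes one common definition (the percentage of all counted nuclei that sit in multinucleated cells) and an assumed cut-off of three or more nuclei per cell; neither the cut-off nor the function name is taken from the manuscript.

```python
def fusion_index(nuclei_per_cell, min_nuclei=3):
    """Percentage of nuclei found in multinucleated cells.

    nuclei_per_cell: number of nuclei counted in each imaged cell.
    min_nuclei: assumed cut-off for calling a cell multinucleated
                (hypothetical default, not from the manuscript).
    """
    total = sum(nuclei_per_cell)
    if total == 0:
        raise ValueError("no nuclei were counted")
    # Sum only the nuclei that belong to cells at or above the cut-off.
    fused = sum(n for n in nuclei_per_cell if n >= min_nuclei)
    return 100.0 * fused / total


# Hypothetical field of view: three mononucleated cells and two giant cells.
print(fusion_index([1, 1, 1, 3, 4]))
```

      Laboratories differ in the cut-off they use, so the threshold is kept as a parameter rather than fixed.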

      2 The authors mentioned, "The distinct morphological appearance of these three types of MGCs (Figure 1B) suggested cell type-specific functional properties and shared mechanisms underlying macrophage multinucleation". However, there is no discussion or data showing how the nuclear arrangement and intracellular location affect the biological function of multinucleated cells.

      [Authors’ reply]. This is a good point and is now discussed in the revised manuscript (see highlighted text in revised manuscript and below).

      Whether MGC-specific nuclear arrangements and/or numbers are indicative of specialized function is currently unclear. Intracellular nuclei arrangement is likely to be important for the sealing zone formation in a polarized bone-resorbing osteoclast. Furthermore, whether distinct transcriptional activities are assigned to different nuclei of the MGC also remains to be tested. Recent elegant work performed in multinucleated skeletal myofibers suggests transcriptional heterogeneity among the different nuclei of the polykaryon [3].

      3 Based on the results of DC-stamp knockdown experiments, the authors concluded that cell-cell fusion and multinucleation suppress the mononuclear phagocytic gene signature. However, to strengthen this hypothesis, it would be necessary to provide at least data showing that DC-stamp knockdown reduces the number of multinucleated cells.

      [Authors’ reply]. According to the reviewer’s suggestion, we provide data showing that DCSTAMP knockdown reduces multinucleation in LGCs and FBGCs (see below and new Supplementary Figure 2F). For human osteoclasts, the data was included in our previously published paper ([4] and figure provided to the journal).

      4 In Figure4, the authors showed that transcripts in LGCs were enriched for antigen presentation and adaptive immune system pathways. In addition, multinucleation of LGCs increased the surface expression of B7-H3 (CD276) and colocalized with CD3+ cells, suggesting that LGC multinucleation potentiates T cell activation. However, the authors did not present enough data to demonstrate the antigen-presenting ability of LGCs or their specific T cell activating capacity.

      [Authors’ reply]. We agree with the reviewer that our data on a potential role of LGCs in T cell activation is based on increased surface expression of B7-H3 and the unique CD3+ cluster forming ability of LGCs. In order to check for further markers of antigen presentation, we have performed MHC-I and MHC-II quantification by ImageStream in 3 types of MGCs (figure provided to the journal).

      Although there was no difference in MHC-I/MHC-II between the mononucleated and multinucleated macrophages, the mean fluorescence intensity (MFI) range was the highest in IFN-g-stimulated macrophages, suggesting that LGCs may be better equipped for antigen presentation than the other 2 types of MGCs. A more comprehensive analysis of antigen presentation requires enzymatic digestion and isolation and phenotyping of LGCs from clusters in vitro and human tissues in vivo. This is a program of research that we have initiated as part of a separate study, which will focus on the in vivo relevance of the current findings such as the unique Ag presentation ability of LGCs in a non-sterile tissue environment.

      5 Figure 6 clearly shows that mature multinucleated osteoclasts exhibit increased ATP production and maximal respiration. However, the glycolytic pathway did not differ between mononuclear and multinuclear osteoclasts. No explanation for this observation has been provided. It is easy to understand that osteoclasts acquire ATP through aerobic respiration during multinucleation. But how is NADPH, which is essential for its redox reaction, produced? Is it by acquiring αKG from the glutamine pathway?

      [Authors’ reply]. This is a point worth expanding on (see also discussion; highlighted text). Osteoclast multinucleation is characterized by increased mitochondrial gene expression, which also translates into increased spare respiratory capacity (SRC or maximal respiration). This metabolic rewiring does not modify glycolysis and basal respiration rate. As the reviewer correctly states, increased SRC may be a way to supply more ATP to the energy-demanding polykaryon.

      As per the production of NAD(P)H as an electron source for ETC, it could indeed be through glutamine rather than glucose usage in multinucleated osteoclasts. Furthermore, as iron is an essential cofactor for ETC activity through activity of iron-sulfur clusters, the mitochondrial concentration of iron is likely to be critical for the mitochondrial activity of multinucleated osteoclasts (see also discussion).

      Minor comments:

      6 Supplementary tables 1-6 were not provided.

      [Authors’ reply]. We apologize for this. The revised versions of supplementary tables are provided as part of the revised manuscript.

      7 Figure 2D right panel, difficult to see DAPI+ nuclei.

      [Authors’ reply]. Thanks for pointing this out. We have now replaced Figure 2D with an image showing more pronounced DAPI+ nuclei.

      Reviewer #2 (Significance (Required)):

      Although it is well known that multinucleation of cells constantly occurs, especially in osteoclasts, skeletal muscle, and trophoblasts of the placenta, the biological significance of multinucleation and the intracellular functions of multinucleation are not well understood. In this unique study, three types of multinucleated cells were generated from human peripheral blood to elucidate the genetic and functional differences between mononucleated and multinucleated cells. Furthermore, by demonstrating the possibility that the morphological peculiarity of multinucleation can regulate cell function, this paper provides clues to understanding the underlying biology of multinucleated cells and how they maintain cell function in homeostatic and pathological settings.

      [Authors’ reply]. We thank the reviewer for finding our study unique and biologically meaningful. We also thank the reviewer for all the suggestions that improved significantly the overall message of the manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript of Ahmadzadeh and Pereira et al is an interesting study of the fusion process key to the formation of multinucleated giant cells (MGCs). Our current ability to discriminate between different types of MGCs is limited, and there are gaps in our understanding of the molecular determinants of cell fusion. In this study, the authors isolated different MGC variants - osteoclasts, Langhans giant cells (LGCs) and foreign body giant cells (FBGCs) and identified common, as well as MGC-specific genes and pathways involved in the process of cell fusion. The approach of isolating and comparing different types of MGCs is novel, and the manuscript is well presented and written. However, due to the in vitro nature of the study, the physiological significance of the findings is unclear. I have further minor and major points for the authors to address, as detailed below.

      Minor comments:

      1. The approach to isolate the different MGCs using FACS and imaging technique is highly novel. However the difference between MGC subtypes isolated isn't immediately apparent beyond the morphological comparisons. In my opinion some of the results of MGC-specific assays from Figures 4, 5 and 6 can be included in Figure 1, e.g. TRAP staining and hydroxyapatite resorption for osteoclasts, to provide evidence of purity and specificity of each MGC subtype early on in the manuscript. Classical or canonical genes associated with each MGC subtype can also be highlighted in the volcano plots in Figure 1C, e.g. ACP5, CTSK, TNFRSF11A for osteoclasts.

      [Authors’ reply]. We thank the reviewer for this point and we agree it is important to highlight markers for each polykaryon early in the manuscript. In accordance with this reviewer’s comment (and also with Reviewer 1’s point), we first verified the existence of lineage-dependent factors and markers of LGCs and FBGCs, as these cells are relatively less well-defined compared to osteoclasts (New Supplementary Figure 1B and C and New Supplementary Table 1). As expected, and in line with the lineage-determining effects, the transcriptomics comparison between mononucleated/multinucleated IFN-γ and IL-4-differentiated macrophages showed predominance of IFN-γ and IL-4-related pathways, respectively (New Supplementary Figure 1B and C and New Supplementary Table 1). Among known LGC and FBGC markers, we confirmed up-regulation of CCL7 [1] and CD86 [2], respectively (New Supplementary Table 1). We have added this information in the revised manuscript (see highlighted text). Osteoclast phenotyping is provided by TRAP staining and resorption assay (Figure 6C) and we also confirm that CTSK is indeed significantly up-regulated upon multinucleation (LogFc=1.69; P=9.2 x 10E-6; highlighted in the revised manuscript).

      The overall decrease in phagocytic identity of all the MGCs, and the specific upregulation of phagocytic pathways in the FBGCs are conflicting. Are there subsets of phagocytic pathways that were down and upregulated during the formation of FBGCs?

      [Authors’ reply]. This is a very good point. As the reviewer indicates, the results suggest that subsets of phagocytic pathways are changed upon multinucleation. All three types of MGCs show a downregulation of transcripts that belong to Fc receptors and complement C1Q family. However, only FBGCs show an up-regulation of S. aureus bioparticle-mediated phagocytosis. Hence the exact surface receptors responsible for this pathogen clearance remain to be identified. FBGC phagocytosis is a complex process including non-canonical phagocytosis pathways and participation of increased membrane area and endoplasmic reticulum [5, 6]. Whether these pathways are specifically induced in human FBGCs remains to be identified. We now discuss this point in the revised manuscript (see highlighted text in Discussion).

      What are the identities of the mononuclear cells in each of the MGC experiment? They appeared to be quite heterogeneous based on the DEGs identified, beyond the common phagocyte signature. Can the authors comment on the difference between the mononuclear cells and whether this will affect the DEG analysis?

      [Authors’ reply]. This is also a very relevant point that we now address in the revised manuscript (New Supplementary Figure 1B and C; New Supplementary Table 1 and highlighted revised text in Results). The reviewer is correct that MGC-specific pathways are in line with the known function of each polykaryon (Figure 4A, 5A and 6A). To what extent lineage-dependent effects (e.g. IFN-g and IL-4) are conserved between the mononucleated and multinucleated state is yet to be determined. In order to address this point, we compared DEG in IFN-g and IL-4-differentiated mononucleated macrophages to the ones obtained in multinucleated macrophages (New Supplementary Figure 1B and C; New Supplementary Table 1). The results showed that the multinucleated cell state preserves the majority of the lineage-dependent pathways which are very significantly represented at the mononucleated cell state (e.g. IFN-g and IL-4-related pathways). Interestingly, although less significant, this analysis also showed pathways that were specific to the mononucleated or multinucleated state in IFN-γ-differentiated macrophages when compared to IL-4-differentiated ones and vice versa. (Supplementary Figure 1B and C). For instance, TRAF3-dependent IRF activation pathway is specific to mononucleated IFN-g-differentiated macrophages (Supplementary Figure 1B).

      The authors should also frame/discuss the findings in the context of diagnostic and therapeutic potentials to highlight the clinical significance of this study.

      [Authors’ reply]. We thank the reviewer for this point and we now discuss our results from a clinical/diagnostic perspective (see highlighted text in the Discussion and below).

      From a clinical perspective, since lysosome-regulated intracellular iron homeostasis appears to be a general condition for macrophage multinucleation across different tissues, its blockade may hold therapeutic potential. However, it is still unclear whether granulomatous disease can benefit from targeting LGC fusion. For non-granulomatous inflammatory diseases, inhibiting MGC formation by targeting lysosomes may be a therapeutic avenue. This approach would avoid FBGC-related adverse effects during foreign body reaction or inhibit the formation of MGCs of white adipose tissue during obesity. v-ATPase inhibitors have been previously proposed to inhibit osteoclast activity and bone resorption [7], so their selective targeting in the lysosomal compartment may be generalized to other MGCs such as FBGCs. In addition to potential clinical translation, the results presented in this study require confirmation in tissues originating from human pathology involving MGCs.

      Major comments:

      • As mentioned before, the physiological significance of the findings is unclear. Some form of in vivo data is needed to support some of the key conclusions of the study, e.g. validating some of the markers of the pathways identified (common and MGC subtype-specific), and the role of lysosome-mediated iron homeostasis in multinucleation. The authors can make use of the FACS and imaging approaches they developed to look at MGCs in relevant tissues.

      [Authors’ reply]. This is an important point that we would like to explore in a comprehensive way. We have initiated a 2-year program of Multiplexed Immunohistochemistry (mIHC) using the MILAN (Multiple Iterative Labeling by Antibody Neodeposition) approach (https://www.lpcm.be/multiplex-ihc-milan) in human biopsies with >100 antibodies. The current study is pivotal in selecting the gene targets (i.e. common and MGC-specific markers) for prioritization. We expect to gain critical pathophysiological information about the tissue characteristics of MGCs. We hope the reviewer will acknowledge that these high-throughput, biopsy-based initiatives are lengthy and fall outside the primary scope of our current findings, which set the foundation of major cellular events governing multinucleation in macrophages.

      Reviewer #3 (Significance (Required)):

      Significance:

      • The approach of isolating and comparing different types of MGCs is novel, and the findings certainly improved our understanding of the fusion processes of MGCs. However, the physiological role of these processes in health and disease that involve MGCs is still unclear due to the lack of in vivo data. The findings were discussed in quite a bit of detail in the context of current literature, though clinical impact was not explored.

      [Authors’ reply]. We are grateful to Reviewer 3 for raising relevant and constructive points regarding the main findings. His/her review significantly improved the clarity of the overall manuscript.

      We recognize our study lacks human clinical association, but we highlight the prospective translatability of our findings and the usage of donor-based human macrophages throughout the manuscript. As also recommended by Reviewer 1 in his/her cross-consultation, we discuss the potential clinical impact of our findings in the Discussion of our revised manuscript.

      • My background is bone biology with a very keen interest in osteoclast biology so arguably my knowledge on other MGCs eg LGCs and FBGCs is limited.

      References

      • Chen Y, Jiang H, Xiong J, Shang J, Chen Z, Wu A, Wang H: Insight into the Molecular Characteristics of Langhans Giant Cell by Combination of Laser Capture Microdissection and RNA Sequencing. J Inflamm Res 2022, 15:621-634.

      • McNally AK, Anderson JM: Foreign body-type multinucleated giant cells induced by interleukin-4 express select lymphocyte co-stimulatory molecules and are phenotypically distinct from osteoclasts and dendritic cells. Exp Mol Pathol 2011, 91(3):673-681.
      • Petrany MJ, Swoboda CO, Sun C, Chetal K, Chen X, Weirauch MT, Salomonis N, Millay DP: Single-nucleus RNA-seq identifies transcriptional heterogeneity in multinucleated skeletal myofibers. Nat Commun 2020, 11(1):6374.
      • Pereira M, Ko JH, Logan J, Protheroe H, Kim KB, Tan ALM, Croucher PI, Park KS, Rotival M, Petretto E et al: A trans-eQTL network regulates osteoclast multinucleation and bone mass. Elife 2020, 9.
      • McNally AK, Anderson JM: Multinucleated giant cell formation exhibits features of phagocytosis with participation of the endoplasmic reticulum. Exp Mol Pathol 2005, 79(2):126-135.
      • Milde R, Ritter J, Tennent GA, Loesch A, Martinez FO, Gordon S, Pepys MB, Verschoor A, Helming L: Multinucleated Giant Cells Are Specialized for Complement-Mediated Phagocytosis and Large Target Destruction. Cell Rep 2015, 13(9):1937-1948.
      • Qin A, Cheng TS, Pavlos NJ, Lin Z, Dai KR, Zheng MH: V-ATPases in osteoclasts: structure, function and potential inhibitors of bone resorption. Int J Biochem Cell Biol 2012, 44(9):1422-1435.
    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript seeks to identify the mechanism underlying priority effects in a plant-microbe-pollinator model system and to explore its evolutionary and functional consequences. The manuscript first documents alternative community states in the wild: flowers tend to be strongly dominated by either bacteria or yeast but not both. Then lab experiments are used to show that bacteria lower the nectar pH, which inhibits yeast - thereby identifying a mechanism for the observed priority effect. The authors then perform an experimental evolution experiment which shows that yeast can evolve tolerance to a lower pH. Finally, the authors show that low-pH nectar reduces pollinator consumption, suggesting a functional impact on the plant-pollinator system. Together, these multiple lines of evidence build a strong case that pH has far-reaching effects on the microbial community and beyond.

      The paper is notable for the diverse approaches taken, including field observations, lab microbial competition and evolution experiments, genome resequencing of evolved strains, and field experiments with artificial flowers and nectar. This breadth can sometimes seem a bit overwhelming. The model system has been well developed by this group and is simple enough to dissect but also relevant and realistic. Whether the mechanism and interactions observed in this system can be extrapolated to other systems remains to be seen. The experimental design is generally sound. In terms of methods, the abundance of bacteria and yeast is measured using colony counts, and given that most microbes are uncultivable, it is important to show that these colony counts reflect true cell abundance in the nectar.

      We have revised the text to address the relationship between cell counts and colony counts with nectar microbes. Specifically, we point out that our previous work (Peay et al. 2012) established a close correlation between CFUs and cell densities (r2 = 0.76) for six species of nectar yeasts isolated from D. aurantiacus nectar at Jasper Ridge, including M. reukaufii.

      As for A. nectaris, we used a flow cytometric sorting technique to examine the relationship between cell density and CFU (figure supplement 1). This result should be viewed as preliminary given the low level of replication, but this relationship also appears to be linear, as shown below, indicating that colony counts likely reflect true cell abundance of this species in nectar.

      It remains uncertain how closely CFU reflects total cell abundance of the entire bacterial and fungal community in nectar. However, a close association is possible and may be even likely given the data above, showing a close correlation between CFU and total cell count for several yeast species and A. nectaris, which are indicated by our data to be dominant species in nectar.

      We have added the above points in the manuscript (lines 263-264, 938-932).

      The genome resequencing to identify pH-driven mutations is, in my mind, the least connected and developed part of the manuscript, and could be removed to sharpen and shorten the manuscript.

      We appreciate this perspective. However, given the disagreement between this perspective and reviewer 2’s, which asks for a more expanded section, we have decided to add a few additional lines (lines 628-637), briefly expanding on the genomic differences between strains evolved in bacteria-conditioned nectar and those evolved in low-pH nectar.

      Overall, I think the authors achieve their aims of identifying a mechanism (pH) for the priority effect of early-colonizing bacteria on later-arriving yeast. The evolution and pollinator experiments show that pH has the potential for broader effects too. It is surprising that the authors do not discuss the inverse priority effect of early-arriving yeast on later-arriving bacteria, beyond a supplemental figure. Understandably this part of the story may warrant a separate manuscript.

      We would like to point out that, in our original manuscript, we did discuss the inverse priority effects, referring to relevant findings that we previously reported (Tucker and Fukami 2014, Dhami et al. 2016 and 2018, Vannette and Fukami 2018). Specifically, we wrote that: “when yeast arrive first to nectar, they deplete nutrients such as amino acids and limit subsequent bacterial growth, thereby avoiding pH-driven suppression that would happen if bacteria were initially more abundant (Tucker and Fukami 2014; Vannette and Fukami 2018)” (lines 385-388). However, we now realize that this brief mention of the inverse priority effects was not sufficiently linked to our motivation for focusing mainly on the priority effects of bacteria on yeast in the present paper. Accordingly, we added the following sentences: “Since our previous papers sought to elucidate priority effects of early-arriving yeast, here we focus primarily on the other side of the priority effects, where initial dominance of bacteria inhibits yeast growth.” (lines 398-401).

      I anticipate this paper will have a significant impact because it is a nice model for how one might identify and validate a mechanism for community-level interactions. I suspect it will be cited as a rare example of the mechanistic basis of priority effects, even across many systems (not just pollinator-microbe systems). It illustrates nicely a more general ecological phenomenon and is presented in a way that is accessible to a broader audience.

      Thank you for this positive assessment.

      Reviewer #2 (Public Review):

      The manuscript "pH as an eco-evolutionary driver of priority effects" by Chappell et al illustrates how a single driver-microbial-induced pH change can affect multiple levels of species interactions including microbial community structure, microbial evolutionary change, and hummingbird nectar consumption (potentially influencing both microbial dispersal and plant reproduction). It is an elegant study with different interacting parts: from laboratory to field experiments addressing mechanism, condition, evolution, and functional consequences. It will likely be of interest to a wide audience and has implications for microbial, plant, and animal ecology and evolution.

      This is a well-written manuscript, with generally clear and informative figures. It represents a large body and variety of work that is novel and relevant (all major strengths).

      We appreciate this positive assessment.

      Overall, the authors' claims and conclusions are justified by the data. There are a few things that could be addressed in more detail in the manuscript. The most important weakness in terms of lack of information/discussion is that it looks like there are just as many or more genomic differences between the bacterial-conditioned evolved strains and the low-pH evolved strains than there are between these and the normal nectar media evolved strains. I don't think this negates the main conclusion that pH is the primary driver of priority effects in this system, but it does open the question of what you are missing when you focus only on pH. I would like to see a discussion of the differences between bacteria-conditioned vs. low-pH evolved strains.

      We agree with the reviewer and have included an expanded discussion in the revised manuscript [lines 628-637]. Specifically, to show overall genomic variation between treatments, we calculated genome-wide Fst comparing the various nectar conditions. We found that Fst was 0.0013, 0.0014, and 0.0015 for the low-pH vs. normal, low pH vs. bacteria-conditioned, and bacteria-conditioned vs. normal comparisons, respectively. The similarity between all treatments suggests that the differences between bacteria-conditioned and low pH are comparable to each treatment compared to normal. This result highlights that, although our phenotypic data suggest alterations to pH as the most important factor for this priority effect, it still may be one of many affecting the coevolutionary dynamics of wild yeast in the microbial communities they are part of. In the full community context in which these microbes grow in the field, multi-species interactions, environmental microclimates, etc. likely also play a role in rapid adaptation of these microbes which was not investigated in the current study.

      Based on this overall picture, we have included additional discussion focusing on the effect of pH on evolution of stronger resistance to priority effects. We compared genomic differences between bacteria-conditioned and low-pH evolved strains, drawing the reader’s attention to specific differences in source data 14-15. Loci that varied between the low pH and bacteria-conditioned treatments occurred in genes associated with protein folding, amino acid biosynthesis, and metabolism.

      Reviewer #3 (Public Review):

      This work seeks to identify a common factor governing priority effects, including mechanism, condition, evolution, and functional consequences. It is suggested that environmental pH is the main factor that explains various aspects of priority effects across levels of biological organization. Building upon this well-studied nectar microbiome system, it is suggested that pH-mediated priority effects give rise to bacterial and yeast dominance as alternative community states. Furthermore, pH determines both the strengths and limits of priority effects through rapid evolution, with functional consequences for the host plant's reproduction. These data contribute to ongoing discussions of deterministic and stochastic drivers of community assembly processes.

      Strengths:

      Provides multiple lines of field and laboratory evidence to show that pH is the main factor shaping priority effects in the nectar microbiome. Field surveys characterize the distribution of microbial communities with flowers frequently dominated by either bacteria or yeast, suggesting that inhibitory priority effects explain these patterns. Microcosm experiments showed that A. nectaris (bacteria) showed negative inhibitory priority effects against M. reukaufii (yeast). Furthermore, high densities of bacteria were correlated with lower pH potentially due to bacteria-induced reduction in nectar pH. Experimental evolution showed that yeast evolved in low-pH and bacteria-conditioned treatments were less affected by priority effects as compared to ancestral yeast populations. This potentially explains the variation of bacteria-dominated flowers observed in the field, as yeast rapidly evolves resistance to bacterial priority effects. Genome sequencing further reveals that phenotypic changes in low-pH and bacteria-conditioned nectar treatments corresponded to genomic variation. Lastly, a field experiment showed that low nectar pH reduced flower visitation by hummingbirds. pH not only affected microbial priority effects but also has functional consequences for host plants.

      We appreciate this positive assessment.

      Weaknesses:

      The conclusions of this paper are generally well-supported by the data, but some aspects of the experiments and analysis need to be clarified and expanded.

      The authors imply that in their field surveys flowers were frequently dominated by bacteria or yeast, but rarely together. The authors argue that the distributional patterns of bacteria and yeast are therefore indicative of alternative states. In each of the 12 sites, 96 flowers were sampled for nectar microbes. However, it's unclear to what degree the spatial proximity of flowers within each of the sampled sites biased the observed distribution patterns. Furthermore, seasonal patterns may also influence microbial distribution patterns, especially in the case of co-dominated flowers. Temperature and moisture might influence the dominance patterns of bacteria and yeast.

      We agree that these factors could potentially explain the presented results. Accordingly, we conducted spatial and seasonal analyses of the data, which we detail below and include in two new paragraphs in the manuscript [lines 290-309].

      First, to determine whether spatial proximity influenced yeast and bacterial CFUs, we regressed the geographic distance between all possible pairs of plants to the difference in bacterial or fungal abundance between the paired plants. If plant location affected microbial abundance, one should see a positive relationship between distance and the difference in microbial abundance between a given pair of plants: a pair of plants that were more distantly located from each other should be, on average, more different in microbial abundance. Contrary to this expectation, we found no significant relationship between distance and the difference in bacterial colonization (A, p=0.07, R2=0.0003) and a small negative association between distance and the difference in fungal colonization (B, p<0.05, R2=0.004). Thus, there was no obvious overall spatial pattern in whether flowers were dominated by yeast or bacteria.
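      The pairwise regression described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name, the use of planar (x, y) plant coordinates, and the plain least-squares fit are all assumptions. Because every plant contributes to many pairs, the pairs are not independent, so a naive p-value from such a fit would need care (a permutation-based test is one common alternative).

```python
from itertools import combinations
import math


def pairwise_distance_regression(coords, abundance):
    """Regress |abundance difference| on geographic distance over all plant pairs.

    coords    : list of (x, y) plant positions (hypothetical planar coordinates)
    abundance : list of microbial abundances, one value per plant
    Returns (slope, intercept, r_squared) of an ordinary least-squares fit.
    """
    distances, differences = [], []
    for i, j in combinations(range(len(coords)), 2):
        distances.append(math.dist(coords[i], coords[j]))
        differences.append(abs(abundance[i] - abundance[j]))

    n = len(distances)
    mean_x = sum(distances) / n
    mean_y = sum(differences) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, differences))
    syy = sum((y - mean_y) ** 2 for y in differences)
    if sxx == 0:
        raise ValueError("all pairwise distances are identical; slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared
```

      A positive slope would indicate that more distant plants differ more in microbial abundance; a slope near zero with a very small R², as reported above, would indicate no obvious spatial structure.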

      Next, to determine whether climatic factors or seasonality affected the colonization of bacteria and yeast per plant, we used a linear mixed model predicting the average bacteria and yeast density per plant from average annual temperature, temperature seasonality, and annual precipitation at each site, the date the site was sampled, and the site location and plant as nested random effects. We found that none of these variables were significantly associated with the density of bacteria and yeast in each plant.

      To look at seasonality, we also re-ordered Fig 2C, which shows the abundance of bacteria- and yeast-dominated flowers at each site, so that the sites are now listed in order of sampling dates. In this re-ordered figure, there is no obvious trend in the number of flowers dominated by yeast throughout the period sampled (6/23 to 7/9), giving additional indication that seasonality was unlikely to affect the results.

      Additionally, sampling date does not seem to strongly predict bacterial or fungal density within each flower when plotted.

      These additional analyses, now included (figure supplements 2-4) and described (lines 290-309) in the manuscript, indicate that the observed microbial distribution patterns are unlikely to have been strongly influenced by spatial proximity, temperature, moisture, or seasonality, reinforcing the possibility that the distribution patterns instead indicate bacterial and yeast dominance as alternative stable states.

      The authors exposed yeast to nectar treatments varying in pH levels. Using experimental evolution approaches, the authors determined that yeast grown in low pH nectar treatments were more resistant to priority effects by bacteria. The metric used to determine the bacteria's priority effect strength on yeast does not seem to take into account factors that limit growth, such as the environmental carrying capacity. In addition, yeast evolves in normal (pH =6) and low pH (3) nectar treatments, but it's unclear how resistance differs across a range of pH levels (ranging from low to high pH) and affects the cost of yeast resistance to bacteria priority effects. The cost of resistance may influence yeast life-history traits.

      The strength of bacterial priority effects on yeast was calculated using the metric we previously published in Vannette and Fukami (2014): PE = log(BY/(-Y)) - log(YB/(Y-)), where BY and YB represent the final yeast density when early arrival (day 0 of the experiment) was by bacteria or yeast, followed by late arrival by yeast or bacteria (day 2), respectively, and -Y and Y- represent the final density of yeast in monoculture when they were introduced late or early, respectively. This metric does not incorporate carrying capacity. However, it does compare how each microbial species grows alone, relative to growth before or after a competitor. In this way, our metric compares environmental differences between treatments while also taking into account growth differences between strains.
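      As a worked illustration of the metric above, the short sketch below computes PE from four final yeast densities. The function and argument names are hypothetical, and the natural logarithm is an assumption, since the base of the logarithm is not stated here.

```python
import math


def priority_effect_strength(by, late_mono, yb, early_mono):
    """PE = log(BY / -Y) - log(YB / Y-).

    by         : final yeast density when bacteria arrive first (BY)
    late_mono  : final yeast density in monoculture, yeast introduced late (-Y)
    yb         : final yeast density when yeast arrive first (YB)
    early_mono : final yeast density in monoculture, yeast introduced early (Y-)
    Natural log is assumed; a different base only rescales the value.
    """
    if min(by, late_mono, yb, early_mono) <= 0:
        raise ValueError("all densities must be positive to take logarithms")
    return math.log(by / late_mono) - math.log(yb / early_mono)


# Hypothetical densities: early-arriving bacteria cut late yeast growth 100-fold,
# while early-arriving yeast grow as well with bacteria as they do alone.
pe = priority_effect_strength(by=1e2, late_mono=1e4, yb=1e4, early_mono=1e4)
```

      A more negative value corresponds to a stronger inhibitory priority effect of bacteria on yeast. Dividing by the monoculture densities is what normalizes for strain-specific growth differences.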

      Here we also present additional growth data to address the reviewer’s point about carrying capacity. Our experiments that compared ancestral and evolved yeast were conducted over the course of two days of growth. In preliminary monoculture growth experiments of each evolved strain, we found that yeast populations did reach carrying capacity over the course of the two-day experiment and population size declined or stayed constant after three and four days of growth.

      However, we found no significant difference in monoculture growth between the ancestral strains and any of the evolved strains, as shown in Figure supplement 12B. This lack of significant difference in monoculture suggests that differences in intrinsic growth rate do not fully explain the priority effects results we present. Instead, differences in growth were specific to yeast’s response to early arrival by bacteria.

      We also appreciate the reviewer’s comment about how yeast evolves resistance across a range of pH levels, as well as the effect of pH on yeast life-history traits. In fact, reviewer #2 pointed out an interesting trade-off in life history traits between growth and resistance to priority effects that we now include in the discussion (lines 535-551) as well as a figure in the manuscript (Figure 8).

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript the authors describe an approach for controlling cellular membrane potential using engineered gene circuits via ion channel expression. Specifically, the authors use microfluidics to track S. cerevisiae gene expression and plasma membrane potential (PMP) in single cells over time. They first establish a small engineered gene circuit capable of producing excitable gene expression dynamics through the combination of positive and negative feedback, tracking expression using GFP (Figure 1). Though not especially novel or complex, the data quality is high in Figure 1 and the results are convincing. Note that the circuit is excitable and not oscillatory; it is being driven periodically by a chemical inducer. I think the authors could have done a better job justifying the use of an excitable engineered gene circuit system, since you could get a similar result by just driving a promoter with the equivalent time course of inducer.

      We restructured the manuscript to present the open-loop version of our synthetic circuit and to demonstrate that the closed-loop system integrating feedback performs significantly better than its open-loop version (revised Figures 1 and 3). The open-loop system is based on Mar proteins, which can synchronize gene expression on extended spatiotemporal scales (Perez-Garcia et al., Nat Comm, 2021). Other driven systems (i.e., TetR, AraC, LacI) can temporally synchronize gene expression in single bacterial cells to successive cycles of inducer. However, over time these bacterial systems build up substantial phase delays between cells, partially due to noise, which ultimately leads to desynchrony between individual cells even though they tend to follow the common inducer. This is clearly not the case in Mar-based systems (Perez-Garcia et al., Nat Comm, 2021), as eukaryotic cells synchronize to each other under the guidance of common environmental stimuli with negligible phase drift. Furthermore, in the revised version we show that the dual-feedback strategy provides a robust solution to control ion channel expression and associated changes in PMP (see Conclusions, lines 231-237).

      The authors then use a similar approach to produce excitable expression of the bacterial ion channel KcsA, tracking membrane voltage using the voltage-sensitive dye ThT rather than GFP fluorescence (Figure 2). The experimental results in this figure are more novel as the authors are now using the expression of a heterologous ion channel to dynamically control plasma membrane potential. While fairly convincing, I think there are a few experimental controls that would make these results even more convincing. It is also unclear why the authors are now using power spectra to display observed frequencies compared to the much more intuitive histograms used in Figure 1.

      Now we use violin plots with period distributions consistently in all figures to ease the comparison between scenarios.

      Finally, the authors move on to use a similar excitable engineered gene circuit approach to produce inducible control of the K1 toxin which influences the native potassium channel TOK1 rather than the heterologous ion channel KcsA (Figure 3). I have a similar reaction to this figure as with Figure 2: the results are novel and interesting but would benefit from more experimental controls. Additionally, the image data shown in Figure 3b is very unclear and could be expanded and improved.

      In the revised version we have decided to remove the K1 toxin data, as we are aware that we cannot modulate the K1 degradation rate due to its extracellular nature. Instead, we performed additional experiments in which we directly coupled our circuit to the native TOK1 potassium channel to demonstrate that our feedback-integrating synthetic circuit is capable of controlling TOK1 dosage and the associated PMP changes (revised Figure 3, and lines 209-220). We believe these new data make a more direct connection between the synthetic circuit and native channel expression than the K1-based scenario presented earlier.

      Overall, in my opinion the claims in the abstract and title are a bit strong. I would deemphasize global coordination and "synchronous electrical signaling" since the authors are driving a global inducer. To make the claim of synchronous signaling I would want to see spatial data for cells near vs. far from K1 toxin producing cells in Figure 3 along with estimates of inducer/flow timescale vs. expression/diffusion of K1 toxin. As I read the manuscript, I see that most of the synchronicity comes from the fact that all cells are experiencing a global inducer concentration.

      We agree with the Reviewer that synchronicity and global coordination come from the phytohormone-sensing feedback circuit, which is guided by cyclic environmental changes. We have revised the definition of synchronous signaling as suggested, focusing on the macroscopic synchronization of ion channel expression achieved by external modulation, which is the key message of this work.

      Reviewer #2 (Public Review):

      The authors present a novel method to induce electrical signaling through an artificial chemical circuit in yeast which is an unconventional approach that could enable extremely interesting, future experiments. I appreciate that the authors created a computer model that mathematically predicts how the relationship between their two chemical stimulants interact with their two chosen receptors, IacR/MarR, could produce such effects. Their experimental validations clearly demonstrated control over phase that is directly related to the chemical stimulation. In addition, in the three scenarios in which they tested their circuit showed clear promise as the phase difference between spatially distant yeast communities was ~10%. Interestingly, indirect TOK1 expression through K1 toxin gives a nice example of inter-strain coupling, although the synchronization was weaker than in the other cases. Overall, the method is sound as a way to chemically stimulate yeast cultures to produce synchronous electrical activity. However, it is important to point out that this synchronicity is not produced by colony-colony communication (i.e., self-organized), but by a global chemical drive of the constructed gene-expression circuit.

      The greatest limitation of the study lies in the presentation (not the science). There are two significant examples of this. First, the authors state this study 'provides a robust synthetic transcriptional toolbox' towards chemo-electrical coupling. In order to be a toolbox, more effort needs to be put into helping others use this approach. However little detail is given about methodological choices, circuit mechanisms in relation to the rest of the cell, nor how this method would be used outside of the demonstrated use case. Second, the authors stress that this method is 'non-invasive', but I fail to see how the presented methodology could be considered non-invasive, in in-vivo applications, as gene circuits are edited and a reliable way to chemically stimulate a large population of cells would be needed. It may be that I misunderstood their claim as the presentation of method and data were not done in a way that led to easy comprehension, but this needs to be addressed specifically and described.

      We apologize to the Reviewer for the potential misunderstanding. By ‘non-invasive’ we meant that such systems would not need, for instance, the surgical installation of light components to control ion channel activity. Nonetheless, we have removed these confusing sentences from the revised manuscript.

      The rationale for using the Mar-based system with a feedback strategy is now presented in a more structured and comprehensive way across the revised manuscript, to demonstrate the benefits of integrating feedback as well as the potential of such systems for excitable dynamics with noise-filtering capability and faster responsiveness. We also show how the system can be coupled to native potassium channels, opening ways to integrate the synthetic circuit into other organisms.

      In terms of classifying the synchronicity, while phase difference among communities was the key indicator of synchronization, there were little data exploring other aspects of coupled waveforms, nor a discussion into potential drawbacks. For example, phase may be aligned while other properties such as amplitude and typical wave-shape measures may differ. As this is presented as a method meant for adoption in other labs, a more rigorous analytical approach was expected.

      In the revised manuscript, we have analyzed synchronicity using several different approaches:

      (1) we calculate cumulative autocorrelations of response between communities.

      (2) to complement the autocorrelation analysis, we developed a quantitative metric, the ‘synchrony index’, defined as 1 - R, where R is the ratio of the differences in subsequent ThT peak positions amongst cell communities (phase) to the expected period. This metric describes how well fungal colonies are synchronized with each other under the guidance of the common environmental signal.

      (3) we analyzed amplitudes and peak widths for all presented scenarios and we conclude that while periods and peak widths are robust across communities there is noticeable variation in amplitudes (i.e. Figure 3E).

      We therefore believe that this multistep quantitative approach is rigorous in identifying oscillatory signal characteristics.
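      One possible reading of the synchrony index from point (2) is sketched below. The response defines the index only as 1 - R, so the details here are assumptions: the phase difference for each peak is taken as the spread (latest minus earliest) of that peak's time across communities, and these spreads are averaged before dividing by the expected period. The function name and input layout are hypothetical.

```python
def synchrony_index(peak_times, expected_period):
    """Synchrony index 1 - R for several cell communities.

    peak_times      : list of lists; peak_times[c][k] is the time of the k-th
                      ThT peak in community c (same time units as the period)
    expected_period : period of the driving signal
    R is taken as the mean across-community spread in peak time divided by the
    expected period (an assumed interpretation of the definition in the text).
    """
    if expected_period <= 0:
        raise ValueError("expected_period must be positive")
    n_peaks = len(peak_times[0])
    if n_peaks == 0 or any(len(c) != n_peaks for c in peak_times):
        raise ValueError("every community needs the same non-zero number of peaks")

    spreads = [
        max(c[k] for c in peak_times) - min(c[k] for c in peak_times)
        for k in range(n_peaks)
    ]
    r = (sum(spreads) / n_peaks) / expected_period
    return 1.0 - r
```

      Under this reading, an index of 1 means peaks coincide exactly across communities, and a phase difference of about 10% of the period gives an index of about 0.9.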

      Reviewer #3 (Public Review):

      We are enthusiastic about this paper. It demonstrates controlled expression of ion channels, which itself is impressive. Going a step further, the authors show that through their control over ion channel expression, they can dynamically manipulate membrane potential in yeast. This chemical to electrophysiological conversion opens up new opportunities for synthetic biology, for example development of synthetic signaling systems or biological electrochemical interfaces. We believe that control of ion channel expression and hence membrane potential through external stimuli can be emphasized more strongly in the report. The experimental time-lapse data were also high quality. We have two major critiques on the paper, which we will discuss below.

      First, we do not believe the analyses used support the authors' claims that chemical or electrical signals are propagating from cell-to-cell. The text makes this claim indirectly and directly. For example, in lines 139-141, the authors describe the observed membrane potential dynamics as "indicative of the effective communication of electrical messages within the populations". There are similar remarks in lines 144 and 154-156. The claim of electrical communication is further established by Figure 2 supplement 3, which is a spatial signal propagation model. As far as we can tell, this model describes a system different from the one implemented in the paper.

      Second, it is not clear why the excitable dynamics of the circuit are so important or if the circuit constructed does in fact exhibit excitable dynamics. Certainly, the mathematical model has excitable dynamics. However, not enough data demonstrates that the biological implementation is in an excitable regime. For example, where in the parameter space of Figure 1 supplement 1 does the biological circuit lie? If the circuit has excitable dynamics, then the authors should observe something like Figure 1 supplement 1B in response to a nonoscillating input. Do they observe that? Do they observe a refractory period? Even if the circuit as constructed is not excitable, we don't think that's a major problem because it is not central to what we believe is the most important part of this work - controlling ion channel expression and hence membrane potential with external chemical stimuli.

      We thank the Reviewer for the encouraging comments and positive evaluation of our work.

    1. So that it is not because God is unmindful of their wickedness,

      This statement can be viewed as a warning to all people because it is telling us that no matter what we do or how well we think we are hiding something, even though no one on earth may know, God knows everything and every sin we have committed.

    1. Posted by u/raphaelmustermann: Separate private information from the outline of academic disciplines? How does Luhmann deal with private Zettels? Does he store them in a separate category like "2000 private", or does he work them in under his topics in the main box? I can't find information about that. Anyway, you're not Luhmann. But any suggestions on how to deal with information that is private, like health or finances? It does not feel right to store it under academic disciplines. But maybe it is right, and that is just a feeling that comes from how we "normally" store information.

      I would echo Bob's sentiment here and would recommend you keep material like this in a separate section or box altogether.

      If it helps to have an example, in 2006, Hawk Sugano showed off a version of a method you may be considering which broadly went under the title of Pile of Index Cards (or PoIC) which combined zettelkasten and productivity systems (in his case getting things done or GTD). I don't think he got much (any?!) useful affordances out of mixing the two. In fact, from what I can see looking at later iterations of his work and how he used it, it almost seems like he spent more time and energy later attempting to separate and rearrange them to get use out of the knowledge portions as distinct from the productivity portions.

      I've generally seen people mixing these ideas in the digital space usually to their detriment as well—a practice I call zettelkasten overreach.

    1. Constraints block our thinking and idea generation. Naturally, we consider constraints as soon as an idea germinates, so eliminating even some of these constraints can encourage creative idea generation; for example, ask participants “What if there is no gravity, how can we improve the flying experience?”

      I would say this is quite useful. What is a constraint or limitation for us is not necessarily one for others. Others may offer solutions, so when we remove the constraint and start to think and research along a path that we had not considered before because of the constraint, we will often gain a different insight.

    1. Underlining Keyterms and Index Bloat

      Hello u/sscheper,

      Let me start by thanking you for introducing me to Zettelkasten. I have been writing notes for a week now and it's great that I'm able to retain more info and relate pieces of knowledge better through this method.

      I recently came to notice that there is redundancy in my index entries.

      I have two entries for Number Line. I have two branches in my Math category that deal with arithmetic, and so far I have "Addition" and "Subtraction". In those two branches I talk about visualizing ways of doing that, and both of those make use of and underline the term Number Line. So now the two entries in my index are "Number Line (Under Addition)" and "Number Line (Under Subtraction)". In those notes I elaborate how exactly each operation is done on a number line and the insights that can be derived from it. If this continues, I will have Number Line entries for "Multiplication" and "Division". I will also have to point to these entries if I want to link a main note for "Number Line".

      Is this alright? Am I underlining appropriately? When do I not underline keyterms? I know that I do these to increase my chances of relating to those notes when I get to reach the concept of Number Lines as I go through the index but I feel like I'm overdoing it, and it's probably bloating it.

      I get "Communication (under Info. Theory): '4212/1'" in the beginning because that is one aspect of Communication itself. But for something like the number line, it's very closely associated with arithmetic operations, and maybe I need to rethink how I populate my index.

      Presuming, since you're here, that you're creating a more Luhmann-esque inspired zettelkasten as opposed to the commonplace book (and usually more heavily indexed) inspired version, here are some things to think about:
      - Aren't your various versions of number line card behind each other or at least very near each other within your system to begin with? (And if not, why not?) If they are, then you can get away with indexing only one and know that the others will automatically be nearby in the tree.
      - Rather than indexing each, why not cross-index the cards themselves (if they happen to be far away from each other) so that the link to Number Line (Subtraction) appears on Number Line (Addition) and vice-versa? As long as you can find one, you'll be able to find them all, if necessary.

      If you look at Luhmann's online example index, you'll see that each index term only has one or two cross references, in part because future/new ideas close to the first one will naturally be installed close to the first instance. You won't find thousands of index entries in his system for things like "sociology" or "systems theory" because there would be so many that the index term would be useless. Instead, over time, he built huge blocks of cards on these topics and was thus able to focus more on the narrow/niche topics, which is usually where you're going to be doing most of your direct (and interesting) work.

      Your case, and I see it with many, sounds like one where your thinking process is going from the bottom up, but you're attempting to wedge it into a top down process and create an artificial hierarchy based on it. Resist this urge. Approaching things after-the-fact, we might place information theory as a sub-category of mathematics with overlaps in physics, engineering, computer science, and even the humanities in areas like sociology, psychology, and anthropology, but where you put your work on it may depend on your approach. If you're a physicist, you'll center it within your physics work and then branch out from there. You'd then have some of the psychology related parts of information theory and communications branching off of your physics work, but who cares if it's there and not in a dramatically separate section with the top level labeled humanities? It's all interdisciplinary anyway, so don't worry and place things closest in your system to where you think they fit for you and your work. If you had five different people studying information theory who were respectively a physicist, a mathematician, a computer scientist, an engineer, and an anthropologist, they could ostensibly have all the same material on their cards, but the branching structures and locations of them all would be dramatically different and unique, if nothing else based on the time ordered way in which they came across all the distinct pieces. This is fine. You're building this for yourself, not for a mass public that will be using the Dewey Decimal System to track it all down; researchers and librarians can do that on behalf of your estate. (Of course, if you're a musician, it bears noting that you'd be totally fine building your information theory section within the area of "bands" as a subsection on "The Bandwagon". 😁)

      If you overthink things and attempt to keep them too separate in their own prefigured categorical bins, you might, for example, have "chocolate" filed historically under the Olmec and might have "peanut butter" filed with Marcellus Gilmore Edson under chemistry or pharmacy. If you're a professional pastry chef this could be devastating as it will be much harder for the true "foodie" in your zettelkasten to creatively and more serendipitously link the two together to make peanut butter cups, something which may have otherwise fallen out much more quickly and easily if you'd taken a multi-disciplinary (bottom up) and certainly more natural approach to begin with. (Apologies for the length and potential overreach on your context here, but my two line response expanded because of other lines of thought I've been working on, and it was just easier for me to continue on writing while I had the "muse". Rather than edit it back down, I'll leave it as it may be of potential use to others coming with no context at all. In other words, consider most of this response a selfish one for me and my own slip box rather than one responsive to the OP.)

    1. Author Response

      Reviewer #1 (Public Review):

      Figures 2 through 6. There is no description of the relationship between the findings and the anatomical location of the electrodes (other than distal versus local). Perhaps the non-uniform distribution of electrodes makes these analyses more complicated and such questions might have minimal if any statistical power. But how should we think about the claims in Figures 2-6 in relationship to the hippocampus, amygdala, entorhinal cortex, and parahippocampal gyrus? As one example question out of many, is Figure 2C revealing results for local pairs in all medial temporal lobe areas or any one area in particular? I won't spell out every single anatomical question. But essentially every figure is associated with an anatomical question that is not described in the results.

      To address the reviewer’s point, we now report the distribution of spike-LFP pairs across anatomical regions for each of Figures 2-6. The results split by anatomical regions are reported in Figure 2 – figure supplement 7, Figure 3 – figure supplement 7, Figure 4 – figure supplement 1, Figure 5 – figure supplement 2, and Figure 6 – figure supplement 3. We also calculated a non-parametric Kruskal-Wallis test to statistically examine the effect of anatomical regions on the results shown in each figure. Generally, these new results show that the effects are similar across regions, apart from two exceptions (i.e. Figure 4 – supplement 1; and Figure 5 – supplement 2). However, we would like to stress that these results should be taken with a huge grain of salt because the electrodes were not evenly distributed across regions (i.e. ~75% of observations pertain to the hippocampus) or across patients, as the reviewer correctly points out. This sometimes leads to very low numbers of observations per region, and it is difficult to disentangle whether any apparent differences are driven by regional differences or by differences between patients. Detailed results are reported below.

      Manuscript lines 207-212: “In the above analysis all MTL regions were pooled together to allow for sufficient statistical power. Results separated by anatomical region are reported in Figure 2 – figure supplement 7 for the interested reader. However, these results should be interpreted with caution because electrodes were not evenly distributed across regions and patients making it difficult to disentangle whether any apparent differences are driven by actual anatomical differences, or idiosyncratic differences between patients.”

      Manuscript lines 255-258: “Finally, we report the distal spike-LFP results separated by anatomical region in Figure 3 – figure supplement 7, which did not reveal any apparent differences in the memory related modulation of theta spike-LFP coupling between regions.”

      Manuscript lines 264-266: “PSI results separated by anatomical regions are reported in Figure 4 – figure supplement 1, which revealed that the PSI results were mostly driven by within regional coupling.”

      Manuscript lines 299-303: “We also analyzed whether the memory-dependent effects of cross-frequency coupling differ between anatomical regions (see Figure 5 – figure supplement 2). This analysis revealed that the results were mostly driven by the hippocampus, however we urge caution in interpreting this effect due to the large sampling imbalance across regions.”

      Manuscript lines 343-346: “As for the above analysis we also investigated any apparent differences in co-firing between anatomical regions. These results are reported in Figure 6 – figure supplement 3 and show that the earlier co-firing for hits compared to misses was approximately equivalent across regions.”
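
      For readers unfamiliar with the Kruskal-Wallis test mentioned in this response, the sketch below computes its H statistic for one effect measure grouped by region. It is a minimal illustration, not the authors' analysis code: the region labels and values are made up, every group is assumed to be non-empty, ties share an average rank, and the tie-correction factor and p-value are omitted for brevity.

```python
def kruskal_wallis_h(groups):
    """H statistic (without tie correction) for a list of non-empty samples."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the average of the ranks it occupies,
    # so that tied observations share one rank.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # Sum over groups of (rank sum squared) / (group size).
    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Made-up effect sizes per region, for illustration only.
by_region = {
    "hippocampus": [1.0, 2.0, 3.0],
    "amygdala": [4.0, 5.0, 6.0],
    "entorhinal": [7.0, 8.0, 9.0],
}
h_stat = kruskal_wallis_h(list(by_region.values()))
```

      Because the statistic is built from ranks, a region with only a handful of observations contributes very little information, which is the sampling imbalance the response cautions about.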

      Figure 1

      1A. I assume that image positions are randomized during a cued recall?

      Yes, that was the case. We now added that information in the methods section.

      Manuscript line 526: “Image positions on the screen were randomized for each trial.”

      What was the correlation between subjects' indication of how many images they thought they remembered and their actual performance?

      We did not log how many images the patients thought they remembered. Specifically, if the patients answered that they remembered at least one image, then they were shown the selection screen where they could select the appropriate images. Therefore, we cannot perform this analysis. We report this now in the methods section. However, albeit interesting, the results of such an analysis would not affect the main conclusions of our manuscript.

      Manuscript lines 523-524: “The experimental script did not log how many images the patient indicated that they thought they remembered.”

      1B. Chance is shown for hits but not misses. I assume that hits are defined as both images correct and misses as either 0 or 1 image correct. Then chance for misses is 1 - chance for hits = 5/6. It would be nice to mark this in the figure.

      Done as suggested (see Figure 1).

      The authors report that both incorrect was 11.9%. By chance, both incorrect should be the same as both correct, hence also 1/6 probability, hence the probability of both incorrect seems quite close to chance levels, right?

      Yes, that is correct, however, across sessions the proportion of full misses (i.e. both incorrect) was significantly below chance (t(39)=-1.9214; p<0.05). Nevertheless, the proportion of fully forgotten trials appears to be higher than expected purely by chance. This is likely driven by a tendency of participants to either fully remember an episode, or completely forget it, as demonstrated previously in behavioural work (Joensen et al., 2020; JEP Gen.). We report this now in the manuscript.

      Manuscript lines 132-136: “Across sessions the proportion of full misses (i.e. both incorrect) was significantly below chance (t39=-1.92; p<0.05). However, the proportion of fully forgotten trials appears to be higher than expected purely by chance. This is likely driven by a tendency of participants to either fully remember an episode, or completely forget it, as demonstrated previously in behavioral work (25).”
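
      The chance levels discussed in this exchange (1/6 for both correct, 1/6 for both incorrect, 5/6 for misses) can be reproduced by enumeration. The sketch assumes a selection screen with four candidate images, two of which are the correct associates, and a patient who picks two at random; that layout is an assumption chosen because it yields the 1/6 figure, and it is not stated in the text above. The function and parameter names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def chance_levels(n_options=4, n_targets=2):
    """Probability of each number of correct picks under random guessing.

    Assumes n_targets of the n_options images are correct and that the
    patient selects exactly n_targets images (an illustrative assumption).
    """
    targets = set(range(n_targets))
    picks = list(combinations(range(n_options), n_targets))
    counts = {}
    for pick in picks:
        n_correct = len(targets & set(pick))
        counts[n_correct] = counts.get(n_correct, 0) + 1
    return {k: Fraction(v, len(picks)) for k, v in counts.items()}


levels = chance_levels()
p_hit = levels[2]                # both images correct
p_miss = levels[0] + levels[1]   # zero or one image correct
p_full_miss = levels[0]          # both images incorrect
```

      Under these assumptions the observed 11.9% of full misses is compared against a chance level of 1/6, or about 16.7%.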

      1C. How does the number of electrodes relate to the number of units recorded in each area?

      The distribution of neurons per region is shown in the new Figure 1D (see above). It approximately matches the distribution of electrodes per region, except for the Amygdala where slightly more neurons were recorded. This is because of one patient (P08) who had two electrodes in the left and right Amygdala and who contributed a lot of sessions (i.e. 9 sessions, compared to an average of 4.44 per patient).

      Line 152. The authors state that neural firing during encoding was not modulated by memory for the time window of interest. This is slightly surprising given that other studies have shown a correlation between firing rates and memory performance (see Zheng et al Nature Neuroscience 2022 for a recent example). The task here is different from those in other studies, but is there any speculation as to potential differences? What makes firing rates during encoding correlate with subsequent memory in one task and not in another? And why is the interval from 2-3 seconds more interesting than the intervals after 3 seconds where the authors do report changes in firing rates associated with subsequent performance? Is there any reason to think that the interval from 2-3 seconds is where memories are encoded as opposed to the interval after 3 seconds?

      Zheng et al. used a movie-based memory paradigm where they manipulated transitions between scenes to identify event cells and boundary cells. They show that boundary cells, which made up 7.24% of all recorded MTL cells, but not event cells (6.2% of all MTL cells) modulate their firing rate around an event depending on later memory. There are quite a few differences between Zheng et al’s study and our study that need to be considered. Most importantly, we did not perform a complex movie-based memory paradigm as in Zheng et al. and therefore cannot identify boundary cells, which would be expected to show the memory dependent firing rate modulation. This alone could contribute to the fact that no significant differences in firing rates in the first second following stimulus onset were observed. Such an absence of a difference of neural firing depending on later memory is not unprecedented. In their seminal paper, Rutishauser et al. (2010; Nature) report no significant differences in firing rates (0-1 seconds after stimulus onset, which is similar to our 2-3 seconds time window) between later remembered or later forgotten images. This finding is also in line with Jutras & Buffalo (2009; J Neurosci) who also show no significant difference in firing rates of hippocampal neurons during encoding of remembered and forgotten images.

      The 2-3 seconds time interval, which corresponds to 0-1 seconds after the onset of the two associate images, is special because it marks the earliest time point where memory formation can start, therefore allowing us to investigate these very early neural processes that set the stage for later memory-forming processes. While speculative, these early processes likely capture the initial sweep of information transfer into the MTL memory system which arguably is reflected in the timing of spikes relative to LFPs. It is conceivable that these initial network dynamics reflect attentional processes, which act as a gate keeper to the hippocampus (Moscovitch, 2008; Can J Exp Psychol) and thereby set the stage for later memory forming processes. This interpretation would be consistent with studies in macaques showing that attention increases spike-LFP coupling, whilst not affecting firing rates (Fries et al., 2004; Science). We modified the discussion section to address this issue.

      Manuscript lines 468-474: “Interestingly, these early modulations of neural synchronization by memory encoding were observed in the absence of modulations of firing rates, which is consistent with previous results in humans (16) and macaques (12), but contrasts with (43). Studies in macaques showed that attention increases spike-LFP coupling whilst not affecting firing rates (44). It is therefore conceivable that these initial network dynamics reflect attentional processes, which act as a gate keeper to the hippocampus and thereby set the stage for later memory forming processes (45).”

      Lines 154-157 and relationship to the subsequent analyses. These lines mention in passing differences in power in low-frequency bands and high-frequency bands. To what extent are subsequent results (especially Figures 3 and 4) related to this observation? That is, are the changes in spike-field coherence, correlated with, or perhaps even dictated by, the changes in power in the corresponding frequency bands?

      To address this question we repeated the analysis that we performed for SFC for Power in those channels whose LFP was locally coupled to spikes in gamma, and distally coupled to spikes in theta. Furthermore, we correlated the difference in peak frequency between hits and misses between Power and SFC. If power dictated the effects seen in SFC, then we would expect similar effects of memory in power as in SFC, that is, an increase of peak frequency for hits compared to misses for gamma and theta. Furthermore, we would expect to find a correlation between the peak frequency differences in power and SFC. None of these scenarios were confirmed by the data. These results are now reported in Figure 2 – figure supplement 5 for gamma, and Figure 3 – figure supplement 5 for theta.

      Manuscript lines 195-199: “We also tested whether a similar shift in peak gamma frequency as observed for spike-LFP coupling is present in LFP power, and whether memory-related differences in peak gamma spike-LFP are correlated with differences in peak gamma power (Figure 2 – figure supplement 5). Both analyses showed no effects, suggesting that the effects in spike-LFP coupling were not coupled to, or driven by similar changes in LFP power.”

      Manuscript lines 248-253: “As for gamma, we also tested whether a similar shift in peak theta frequency is present in LFP power, and whether there is a correlation between the memory-related differences in peak theta spike-LFP and peak theta power (Figure 3 – figure supplement 5). Both analyses showed no effects, suggesting that the effects in spike-LFP coupling were not coupled to, or driven by similar changes in LFP power.”

      Do local interactions include spike-field coherence measurements from the same microwire (i.e., spikes and LFPs from the same microwire)?

      Yes, they do. Out of the 53 local spike-SFC couplings found for the gamma frequency range, 11 (20.75%) were from pairs where the spikes and LFPs were measured on the same microwire. We assume that the reviewer is asking this question because of a concern that spike interpolation may introduce artifacts which may influence the spectrograms and consequently the spike-LFP coupling measures. This was also pointed out by Reviewer #2. To address this concern, we split the data based on whether the spike and LFP providing channels were the same or different. The results show that (i) the spectrogram of SFC is highly similar between the two datasets, with a prominent gamma peak present in both and no significant differences between the two; (ii) restricting the analysis to those data where the LFP and spike providing channels are different replicated the main finding of faster gamma peak frequencies for hits compared to misses; and (iii) limiting the SFC analysis further to only ‘silent’ channels, i.e. channels where no SUA/MUA activity was present at all also replicated the main finding of faster gamma peak frequencies for hits compared to misses.

      These analyses suggest that the SFC results were not driven by spike interpolation artefacts.

      Manuscript lines 199-203: “To rule out concerns about possible artifacts introduced by spike interpolation we repeated the above analysis for spike-LFP pairs where the spike and LFP providing channels are the same or different, and for ‘silent’ LFP channels, i.e. channels where no SUA/MUA activity was detected (see Figure 2 – figure supplement 6).”

      60 Hz. It has always troubled me deeply when results peak at 60 Hz. This is seen in multiple places in the manuscript; e.g., Figures 2B, 2E. What are the odds that engineers choosing the frequency for AC currents would choose the exact same frequency that evolution dictated for interactions of brain signals? This is certainly not the only study that reports interesting observations peaking at 60 Hz. One strong line of argument to suggest that this is not line noise is the difference between conditions. For example, in Figure 2B, there is a difference between local and distal interactions. It is hard for me to imagine why line noise would reveal any such difference. Still ...

      The frequency for AC currents in Europe is 50 Hz, not 60 Hz as in the US. Therefore, our SFC effects are well outside the range of the notch.

      Figure 6. I was very excited about Figure 6, which is one of the most novel aspects of this study. In addition to the anatomical questions about this figure noted above, I would like to know more. What is the width of the Gaussian envelope?

      The width of the Gaussian Window used in the original results was 25ms. We chose this time window because in our view it represents a good balance between integrating over a long-enough time window and thus allowing for some jitter in neural firing between pairs of neurons, whilst still being temporally specific. Finding the right balance here is not trivial because a too short time window underestimates co-firing, and a too long time window may not provide the temporal specificity necessary to detect co-firing lags (Cohen & Kohn, 2011; Nat Neurosci). To test whether this choice critically affected our results, we repeated the analysis for different window sizes, i.e. 15, 35, and 45 ms. The results show that the pattern of results did not change, with hits showing earlier peaks in co-firing compared to misses. Critically, the difference in co-firing peaks was significant for all window sizes, except for the shortest one which presumably is due to the increase in noise because of the smaller time window over which spikes are integrated. These issues are now mentioned in the methods section, and the results for the different window sizes are reported in Figure 6 – figure supplement 4.

      Manuscript lines 346-347: “The co-firing analyses were replicated with different smoothing parameters (see Figure 6 – figure supplement 4).”

      Manuscript lines 894-898: “We chose this time window because it should represent a good balance between integrating over a long-enough time window and thus allowing for some jitter in neural firing between pairs of neurons, whilst still being temporally specific (57). To test whether this choice critically affected our results, we repeated the analysis for different window sizes, i.e. 15, 35, and 45 ms (see Figure 6 – figure supplement 4).”
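
      To illustrate why the smoothing width matters for the co-firing analysis, the sketch below estimates the lag at which a Gaussian-smoothed cross-correlogram between two spike trains peaks. It is a simplified stand-in, not the authors' pipeline: each spike pair contributes a Gaussian bump centred on its time difference, the quoted window width is treated as the standard deviation of that Gaussian (an assumption), and the function name, parameters and spike times are hypothetical.

```python
import math


def cofiring_peak_lag(spikes_a, spikes_b, sigma_ms=25.0, max_lag_ms=100):
    """Lag in ms (positive: unit B fires after unit A) of the co-firing peak.

    spikes_a and spikes_b are spike times in ms. sigma_ms is the assumed
    standard deviation of the Gaussian smoothing kernel.
    """
    lags = range(-max_lag_ms, max_lag_ms + 1)
    smoothed = []
    for lag in lags:
        total = 0.0
        for t_a in spikes_a:
            for t_b in spikes_b:
                # Distance between this pair's time difference and the lag.
                d = (t_b - t_a) - lag
                total += math.exp(-0.5 * (d / sigma_ms) ** 2)
        smoothed.append(total)
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return lags[best]


# Hypothetical spike trains: unit B follows unit A by 20 ms.
upstream = [100, 300, 500]
downstream = [120, 320, 520]
peak_lag = cofiring_peak_lag(upstream, downstream, sigma_ms=25.0)
```

      A narrower kernel integrates fewer spike pairs around each lag and gives a noisier estimate, while a wider kernel blurs nearby lags together, which mirrors the trade-off described in the response.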

      Are these units on the same or different microwires?

      All units used for the analysis shown in Figure 6 come from different microwires. This was naturally the case because the putative up-stream neuron was distally coupled to the theta LFP, and the putative down-stream neuron was locally coupled to gamma at this same theta LFP electrode. This information is listed in Figure 6 – source data 1 which lists the locations and electrode IDs for all neuron pairs shown in figure 6.

      How do the spike latencies reported here depend on the firing rates of the two units?

      To address this question we first tested whether firing rates (averaged across the putative up-stream and down-stream neurons) differ between hits and misses. If they do, this would be suggestive of a dependency of the spike latency differences between hits and misses on firing rates. No such difference was observed (p>0.3). Second, we correlated the differences between hits and misses in Co-firing peak latencies with the differences in firing rates. Again, no significant correlation was observed (R=-0.06; p>0.7), suggesting that firing rates had no influence on the observed differences in co-firing latencies. These control analyses are now reported in the main text.

      Manuscript lines 347-350: “No significant differences in firing rates between hits and misses were found (p>0.3), and no correlations between firing rates and the co-firing latencies were obtained (R=-0.06; p>0.7), suggesting that firing rates had no influence on the observed co-firing differences between hits and misses.”

      What do these results look like for other pairs that are not putative upstream/downstream pairs?

      As we reported in the original manuscript in lines 352-355, we did not find a memory dependent effect on co-firing latencies if we select neuron pairs solely on the basis of distal theta SFC. Within this analysis the distally theta coupled neuron would be the up-stream neuron and the neuron recorded at the site where the theta LFP is coupled would be the down-stream neuron. This null-result suggests that the down-stream neurons need to be coupled to a local gamma rhythm in order for the memory dependent difference in co-firing lags to emerge. However, within this previous analysis there is still a notion of up-stream and down-stream neurons because neuron pairs were selected based on distal theta phase coupling. We therefore repeated this analysis for all pairs of neurons in a completely unconstrained fashion such that all possible pairs of neurons that were recorded from different electrodes were entered into the co-firing analysis. This analysis also revealed no difference in co-firing lags, neither for positive lags nor for negative lags. Instead, what this analysis showed is a tendency for hits to show a higher occurrence of simultaneous or near simultaneous firing, which is in line with Hebbian learning. These results are now reported in Figure 6 – figure supplement 1.

      Manuscript lines 333-335: “In addition, a completely unconstrained co-firing analysis where all possible pairings of units were considered also showed no systematic difference in co-firing lags between hits and misses (Figure 6 – figure supplement 1).”

      Reviewer #2 (Public Review):

      Roux et al. investigated the temporal relationship between spike field coherence (SFC) of locally and distally coupled units in the hippocampus of epilepsy patients to successful and unsuccessful memory encoding and retrieval. They show that SFC to faster theta and gamma oscillations accompany hits (successful memory encoding and retrieval) and that the timing of the SFC between local and distal units for hits comports well with synaptic plasticity rules. The task and data analyses appear to be rigorously done.

      Strengths: The manuscript extends previous work in the human medial temporal lobe which has shown that greater SFC accompanies improved memory strength. The cross-regional analyses are interesting and necessary to invoke plasticity mechanisms. They deploy a number of contemporary analyses to disentangle the question they are addressing. Furthermore, their analyses address limitations or confounds that can arise from various sources like sample size, firing rates, and signal processing issues.

      Weaknesses:

      Methodological:

      The SFC measures are dependent in part on extracting LFPs derived from the same or potentially other electrodes that are contaminated by spikes, as well as multiunit activity. In the methods, they cite a spike removal approach. Firstly, the incomplete removal or substitution of a signal with a signal that has a semblance to what might have been there if no spike was present can introduce broadband signal time-locked to the spike and create spurious SFC. Can the authors confirm that such an artifact is not present in their analyses? Secondly, how did they deal with the removal of the multiunit activity? It would be suspected that the removal of such activity in light of refractory period violation might be more difficult than for well-isolated units, and introduce artifacts and broadband power, again which would spuriously elevate SFC. Conversely, the lack of removal of multiunit activity would almost certainly introduce significant broadband power. One way around this might be that since it is uncommon to have units on all 8 of the BF microwires, to exclude the microwire(s) with the units when extracting the LFP to avoid the need to perform spike removal.

      The reviewer raises a valid concern which we address as follows. Firstly, an artefact introduced into SFC by linear interpolation would be a problem for those local SFCs where the spike providing channel and the LFP providing channel are identical. Out of the 53 local spike-SFC couplings found for the gamma frequency range, only 11 (20.75%) were from pairs where the spikes and LFPs come from the identical microwire. It is unlikely that this minority of data would have driven the results. Furthermore, it is unlikely that the interpolation would introduce a frequency shift of SFC that is memory dependent, because the interpolation is more likely to cause a general increase in broadband SFC (as opposed to having a frequency band specific effect). However, to address this concern, we split the data based on whether the spike and LFP providing channels were the same or different. The results show that (i) the spectrogram of SFC is highly similar between the two datasets, with a prominent gamma peak present in both and no significant differences between the two; (ii) restricting the analysis to those data where the LFP and spike providing channels are different replicated the main finding of faster gamma peak frequencies for hits compared to misses.

      Secondly, we followed the reviewer’s suggestion and repeated the SFC analysis for ‘silent’ microwires, i.e. microwires where no single or multi-units were detected. This analysis replicated the same memory effects as observed in the analysis with all microwires. Specifically, we found an increase in the local gamma peak SFC frequency for hits compared to misses, as well as an increase in distal theta peak SFC frequency for hits compared to misses. These results are reported in the main manuscript and in Figure 2 – figure supplement 6 for gamma, and figure 3 – figure supplement 6 for theta.

      Manuscript lines 199-203: “To rule out concerns about possible artifacts introduced by spike interpolation we repeated the above analysis for spike-LFP pairs where the spike and LFP providing channels are the same or different, and for ‘silent’ LFP channels (i.e. channels where no SUA/MUA activity was detected; see Figure 2 – figure supplement 6).”

      Manuscript lines 253-255: “We also repeated the above analysis for spike-LFP pairs by only using ‘silent’ LFP channels (i.e. channels where no SUA/MUA activity was detected; see Figure 3 – figure supplement 6) to address possible concerns about artefacts introduced by spike interpolation.”

      In a number of analyses the spike train is convolved with a Gaussian in places with a window length of 250ms and in others 25ms. It is suspected that windows of varying lengths would induce "oscillations" of different frequencies, and would thus generate results biased towards the window length used. Can the authors justify their choices where these values are used, and/or provide some sensitivity analyses to show that the results are somewhat independent of the window length of the Gaussian used to convolve with the times series.

      The different choices in window length for the Gaussian convolution reflect the different needs of the two analyses where these convolutions were applied. In one analysis we wanted to get a smooth estimate of spike densities that we can average across trials, similar to a peri-stimulus spike histogram. For this analysis we used a window length of 250 ms which we found appropriate to yield a good balance between retaining smooth time courses whilst still being temporally sensitive. Importantly, for the statistical analysis of the firing rates, spike densities were averaged in much larger time windows than 250 ms (i.e. 1 – 2 seconds) therefore our choice of window length for spike densities would not have any bearing on the averaged firing rate analysis.

      In the other analysis, which is more central for our manuscript, we used a cross-correlation between spike trains to estimate co-firing lags in the range of milliseconds. Therefore, this analysis necessitated a much higher temporal precision. We used a Gaussian Window with a width of 25ms because it represents a good balance between integrating over a long-enough time window and thus allowing for some jitter in neural firing between pairs of neurons, whilst still being temporally specific. Finding the right balance here is not trivial because a too short time window would be prone to noise and underestimates co-firing, whereas a too long time window may not provide the temporal specificity necessary to detect co-firing lags (Cohen and Kohn, 2013; Nat Neurosci). To test whether this choice critically affected our results, we repeated the analysis for different window sizes, i.e. 15, 35, and 45 ms. The results show that the basic pattern of results did not change, with hits showing earlier peaks in co-firing compared to misses. Critically, the difference in co-firing peaks was significant for all window sizes, except for the shortest one which is likely due to the increase in noise because of the smaller time window over which spikes are integrated. These issues are now mentioned in the methods section, and the results for the different window sizes are reported in Figure 6 – figure supplement 4.
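      The window-size sensitivity check described above can be sketched as follows. This is a minimal illustration on synthetic spike trains, not the authors' pipeline: the function names, the 1 ms sampling step, and the mapping from window length to the Gaussian's standard deviation (window = 6 sigma) are all assumptions made for this example.

```python
import math

def gaussian_kernel(width_ms, dt_ms=1.0):
    """Unit-area Gaussian kernel. ASSUMPTION: width_ms is the full window
    length and sigma = width / 6, so the window spans about +/- 3 sigma."""
    sigma = width_ms / 6.0
    half = int(round(width_ms / (2 * dt_ms)))
    k = [math.exp(-0.5 * ((i * dt_ms) / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth(train, kernel):
    """Convolve a binary spike train with the kernel (same length, zero-padded)."""
    half = len(kernel) // 2
    n = len(train)
    out = [0.0] * n
    for t, spike in enumerate(train):
        if spike:
            for j, w in enumerate(kernel):
                idx = t + j - half
                if 0 <= idx < n:
                    out[idx] += w
    return out

def peak_lag(a, b, max_lag):
    """Lag (in samples) at which the cross-correlation of a and b peaks.
    A positive lag means that b follows a."""
    n = len(a)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = sum(a[t] * b[t + lag] for t in range(max(0, -lag), min(n, n - lag)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

# Synthetic example (hypothetical data): unit b fires 20 ms after unit a.
n_samples = 1000
train_a = [0] * n_samples
train_b = [0] * n_samples
for t in range(100, 900, 100):
    train_a[t] = 1
    train_b[t + 20] = 1

# Repeat the lag estimate for several smoothing windows (in ms).
lags = {}
for width in (15, 25, 35, 45):
    kernel = gaussian_kernel(width)
    lags[width] = peak_lag(smooth(train_a, kernel), smooth(train_b, kernel), max_lag=50)
```

      In this noise-free toy case the estimated lag is the same for every window; with real, jittered spike trains the shortest windows would be expected to give noisier estimates, which is the trade-off the response describes.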

      Manuscript lines 346-347: “The co-firing analyses were replicated with different smoothing parameters (see Figure 6 – figure supplement 4).”

      Manuscript lines 894-898: “We chose this time window because it should represent a good balance between integrating over a long-enough time window and thus allowing for some jitter in neural firing between pairs of neurons, whilst still being temporally specific (57). To test whether this choice critically affected our results, we repeated the analysis for different window sizes, i.e. 15, 35, and 45 ms (see Figure 6 – figure supplement 4).”

      Conceptual:

      The co-firing analyses are very interesting and novel. In table S1 are listed locally and distally coupled neurons. There are some pairs for example where the distally coupled neuron is in EC and the downstream one in the hippo, and then there is a pair that is the opposite of this (dist: hippo, local EC). There appear to be a number of such "reversal", despite the delay between these two regions one would assume them to be similar in sign and magnitude given the units are in the same two regions. It seems surprising that in two identical regions of the hippo the flow of information or "causality", could be reversed, when/if one assumes information flows through the system from EC to hippo. This seems unusual and hard to reconcile given what is known about how information flows through the MTL system.

      The reviewer is correct that the spike co-firing analysis suggests a bi-directional flow of information between the hippocampus and surrounding MTL regions (e.g. entorhinal cortex; see Figure 6 – figure supplement 3). However, this bi-directional flow of information is not incompatible with neuroanatomy and the memory literature. The entorhinal cortex serves as an interface between the hippocampus and the neocortex with superficial layers providing input into the hippocampus (via the perforant pathway), and the deeper layers receiving output from the hippocampus (van Strien et al., 2009; Nat Rev Neurosci). Therefore, on a purely anatomical basis we can expect to see a bi-directional flow of information between the hippocampus and the entorhinal cortex, albeit in different layers. Importantly, reversals as shown in our Figure 6 – source data 1 involved different microwires and therefore different neurons (i.e. the entorhinal unit in row 1 was recorded from microwire 3, whereas the entorhinal unit in row 2 was recorded from microwire 8). It is conceivable that these two neurons correspond to different layers of the entorhinal cortex and therefore reflect input vs. output paths. Moreover, studies in humans demonstrated that successful encoding of memories depends not only on the input from the entorhinal cortex into the hippocampus, but also on the output of the hippocampal system into the entorhinal cortex, and indeed on the dynamic recurrent interaction between these input and output paths (Maass et al. 2014; Nat Comms; Koster et al., 2018; Neuron). Our bi-directional couplings between hippocampal and surrounding MTL regions (such as the EC) are in line with these findings. We have added a discussion of this issue in the discussion section.

      Manuscript lines 447-452: “Notably, the neural co-firing analysis indicates a bidirectional flow of information between the hippocampus and surrounding MTL areas, such as the entorhinal cortex (see Figure 6 – figure supplement 3; Figure 6 – source data 1). This result parallels other studies in humans showing that successful encoding of memories depends not only on the input from surrounding MTL areas into the hippocampus, but also on the output of the hippocampal system into those areas, and indeed on the dynamic recurrent interaction between these input and output paths (43, 44).”

    1. Author Response

      Reviewer #3 (Public Review):

      This paper is based on digital reconstruction of a serial EM stack of a larva of the annelid Platynereis and presents a complete 3D map of all desmosomes between somatic muscle cells and their attachment partners, including muscle cells, glia, ciliary band cells, epidermal cells and specialized epidermal cells that anchor cuticular chaetae (chaetal follicle cells) and aciculae (acicular follicle cells). The rationale is that the spatial patterning of desmosomes determines the direction of forces exerted by muscular contraction on the body wall and its appendages will determine movement of these structures, which in turn results in propulsion of the body as part of specific behaviors.

      To go a step further, if connecting this desmosome connectome with the (previously published) synaptic connectome, one may gain insight into how a specific spatio-temporal pattern of motor neuron activity will lead, via a resulting pattern of forces caused by muscles, to a specific behavior. In the authors' words: "By combining desmosomal and synaptic connectomes we can infer the impact of motoneuron activation on tissue movements". This is an interesting idea which has the potential to make progress towards understanding in a "holistic" way how a complex neural circuitry controls an equally complex behavior. The analysis of the EM data appears solid; the authors can show convincingly that desmosomes can be resolved in their EM dataset; and the technology used to plot and analyze the data is clearly up to the task. My main concern is with the way in which the desmosome pattern is entered in the analysis, which I think makes it very difficult to extract enough relevant information from the analysis that would reach the stated goal.

      1) The context of how different structures of the Platynereis larval body, by changing their position, move the body needs much more introduction than the short paragraph given at the end of the Introduction.

      -My understanding is that the larval body is segmented, and contraction of the segments can cause a certain type of crawling or swimming: does it? Do the longitudinal muscles, for example, insert at segment boundaries, and alternating contraction left-right cause some sort of "wiggling" or peristalsis?

      Longitudinal muscles do not insert only at segment boundaries, but have desmosomal connections along the entire length of the cell. Individual longitudinal muscle cells can span up to 3 segments. However, the cells are staggered in such a way that all longitudinal muscle cells with somas in one segment can collectively cover up to 4 segments. Longitudinal muscles are involved in turning when swimming (Randel et al., 2014). The undulatory trunk movements and parapodial walking movements are due to the contraction of oblique and parapodial muscles. The longitudinal muscles provide support during crawling (via desmosomal links) but it is unlikely that these muscles contract segmentally. Disentangling the distinct contributions of 53 types of muscles during crawling will require further studies.

      -In addition, there are segmental processes (parapodia, neuropodia), and embedded in them are long chitinous hairs (Chaetae, Acicula). Do certain types of the muscles described in the study insert at the base of the parapodia/neuropodia (coming from different angles), such that contraction would move the entire process, including the chaetae/acicula embedded in their tips?

      Yes, acicular muscles insert at the proximal base of the acicula, and by moving the acicula they move the entire noto-/neuropodia. We have presented the anatomy of all acicular and chaetal muscle types in the figures and videos.

      -Or is it that only these chaetae/acicula move, by means of muscles inserting at their base (the latter is clearly part of the story)? Or does both happen at the same time: parapodium moves relative to the trunk, and chaeta/acicula moves relative to the parapodium? How would these movements lead to different kind of behaviors?

      -Diagrams should be provided that shed light on these issues.

      We have extended Video2 to show individual muscles and their relation to the aciculae in one of the parapodia. We also clarified this in the text:

      “Several acicular muscles attach on one end to the proximal base of the aciculae and on the other end to the paratrochs and epidermal cells. Oblique muscles attach to the basal lamina, epidermal and midline cells at their proximal end, run along the anterior edge of parapodia and attach to epidermal and chaetal follicle cells at their distal tips. Both of these muscle groups are involved in moving the entire parapodium. Acicular muscles move the proximal tips of the aciculae, while oblique muscles move the parapodium by moving the tissue around the chaetae and the aciculae. All acicular movements also correspond to parapodial movements. Chaetae are embedded in the parapodium and therefore move with it, but the chaetal sac muscles can also independently retract the chaetae into the parapodium or protract them and make them fan out.”

      2) The main problem I have with the analysis is the way a muscle cell is treated, namely as a "one dimensional" node, rather than a vector.

      -In the current state of the analysis, the authors have mapped all desmosomes of a given muscle cell to its attached "target" cell. But how is that helpful? The principal way a muscle cell acts is by contracting, thereby pulling the cells it attaches to at its two ends closer together. As the authors state (p.4) "...desmosomes..are enriched at the ends of muscle cells indicating that these adhesive structures transmit force upon muscle-cell contraction."

      At the level of the current analysis our data reveal which cells may be moved by the contractions of the individual muscle cells. The reviewer is right that treating a muscle as a vector (or set of vectors) would be a more accurate description, which would potentially also open up the possibility of computational modelling. We have provided such a vectorised dataset in the revised version, where each muscle-cell skeleton is subdivided into short linear segments (Figure2–source–data 2). This dataset may be useful to approach the problem with a three dimensional approach, which is beyond the scope of the current analysis. We also included an additional video (Video 7) showing examples of muscles and their partners where the cells and the desmosomes connecting them are highlighted. This reveals that the desmosomes connecting two cells are often at the very end of the muscle cell.
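      To make the "muscle as a vector" idea concrete, the sketch below shows one way a skeleton stored as an ordered list of xyz nodes could be turned into per-segment vectors and a single end-to-end axis. It is only an illustration under assumed conventions: the node format, the function names, and the toy coordinates are hypothetical and are not taken from the authors' source data.

```python
import math

def segment_vectors(skeleton):
    """One (dx, dy, dz) vector per consecutive pair of skeleton nodes."""
    return [tuple(b[i] - a[i] for i in range(3))
            for a, b in zip(skeleton, skeleton[1:])]

def unit(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(c * c for c in vector))
    return tuple(c / norm for c in vector)

def end_to_end_axis(skeleton):
    """Unit vector from the first to the last node: a crude proxy for the
    line along which the two attachment ends are pulled together."""
    first, last = skeleton[0], skeleton[-1]
    return unit(tuple(last[i] - first[i] for i in range(3)))

# Hypothetical three-node skeleton (arbitrary units), not real data.
toy_skeleton = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
toy_segments = segment_vectors(toy_skeleton)
toy_axis = end_to_end_axis(toy_skeleton)
```

      A curved muscle is poorly summarised by its end-to-end axis alone, which is one reason a segment-wise representation may be preferable for any later three-dimensional modelling.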

      -for that reason, the desmosomes at the muscle tips have to be treated as (2) special sets. Aside from these tip desmosomes there are other desmosomes (inbetween muscles, for example), but they (I would presume) have a very different function; maybe to coordinate muscle fiber contraction? Augment the force caused by contraction?

      Desmosomes between muscles only occur between muscles of different types, not for homotypic connections. There are other types of junctions (adhaerens-like junctions) that connect individual cells of a muscle bundle together (not analysed here). We clarified this in the text.

      • As far as I understand for (all of) the desmosome connectome plots, there is no differentiation made between desmosome subsets located at different positions within the muscle fiber. I therefore don't see how the plots are helpful to shed light on how the multiplicity of muscles represented in the graphs cause specific types of neurons.

      We would like to point out that the cells and structures that muscles connect to via desmosomes are very likely the parts of the body that will move during the contraction of the muscle or will provide structural support (e.g. basal lamina) for the muscle cell to contract. This is most evident in the parapodial complex. The majority of muscles in the body connect to the acicular follicle cells and the aciculae are the most actively moving parts in the body during crawling (see Video 4). In any case, since we provide all skeleton reconstructions and the xyz coordinates of all desmosomes, the data could be further analysed following these suggestions by the reviewer.

      • As it stands these plots "merely" help to classify muscles, based on their position and what cell type they target: but that (certainly useful) map could probably also have been achieved by light microscopic analysis.

      This has never been achieved by light microscopy analysis in the hundreds of papers on invertebrate muscle anatomy (e.g. by phalloidin staining). For an LM analysis, it would not be sufficient to label the muscle fibres, but one would also need to label the desmosomes and a multitude of non-muscle cell types including the extent of their cytoplasm. This is technically very challenging (we would nevertheless be happy to hear specific suggestions for markers etc. from the Reviewer). Currently, only EM provides the required depth of structural information and resolution. This is why we believe that our dataset and analysis is unique, despite over a century of research in invertebrate anatomy.

      3) Section "Local connectivity and modular structure of the desmosomal connectome" (p.4-7) undertakes an analysis of the structure of the desmosome network, comparing it with other networks.

      -What is the rationale here? How do the conclusions help to understand how the spatial pattern of muscles and their contraction move the body?

      We hope that our analysis may also be of interest to the community of network scientists and we believe that the reconstruction of a quite large and novel type of biological network warrants a more quantitative network analysis, using the standard methods and measures of network science – as we presented e.g. in Figure 4 – even if these mathematical analyses may not directly reveal how muscles move the body. We hope that some readers with an interest in quantitative analyses will also appreciate the broader picture here.

      -Isn't, on the one hand (given that position of the desmosome was apparently not considered), the finding that desmosome networks stand out (from random networks) by their high level of connectivity ("with all cells only connecting to cells in their immediate neighbourhood forming local cliques") completely expected?

      We disagree that the result was completely expected. Even if this was the case, we think it is quite different to say that a result is expected or to thoroughly quantify certain parameters and mathematically characterise key properties of the desmosomal graph (as we have done). These network analyses help to conceptualise our findings and to think about the muscle system in more global, whole-body terms.

      -On the other hand, does this reflect the reality, given that (many?) muscle cells are quite long, connecting for example the anterior border of a segment with the posterior border.

      Indeed, a quantitative analysis helped us to identify cases where the reality deviated somewhat from what was completely expected, and we thank the reviewer for these comments. As we explain in the revised version, some longitudinal muscles show an unexpected position in the force-field layout of the graph, due to their long-range connections. We have added extra clarifications to the text: “To analyse how closely the force-field-based layout of the desmosomal connectome reflects anatomy, we coloured the nodes in the graph based on body regions (Figure 5). In the force-field layout, nodes are segregated by body side and body segment. Exceptions include the dorsolateral longitudinal muscles (MUSlongD) in segment-0. These cells connect to dorsal epidermal cells that also form desmosomes with segment-1 and segment-2 MUSlongD cells. These connections pull the MUSlongD_sg0 cells down to segment-2 in the force-field layout (Figure 5D).”

      4) In the section "Acicular movements and the unit muscle contractions that drive them" the authors record movement of the acicula and correlate it with activity (Ca imaging) of specific muscle types. This study gives insightful data, and could be extended to all movements of the larva.

      -The fact that a certain muscle is active when the acicula moves in a certain direction can be explained (in part) by the "connectivity": as shown in Fig.7L, the muscle inserts at an acicular follicle cell on the one side, and to an epithelial (epidermal?) cell and the basal lamina on the other side. But how meaningful is a description at this "cell type level" of resolution? The direction of acicula deflection depends on where (relative to the acicula base) the epithelial cell (or point in the basal lamina) is located. This information is not given in the part of the connectome network shown in Fig.7L, or any of the other graphs.

      This information is indeed not shown in the graphs, where each cell is treated as a node. However, we provide this information in the detailed anatomical figures in Figure 6 – figure supplement 1-3 and Video 7, where the individual acicular and oblique muscle types are visualised. In principle, one could subdivide aciculae into e.g. proximal and distal halves and derive a more detailed network. We have not done this but since all the EM, anatomical rendering and connectivity data are available in our public CATMAID server (https://catmaid.jekelylab.ex.ac.uk/), we hope that the interested readers will be able to further analyse the data.

      We renamed ‘epithelial’ cells to ‘epidermal’ cells.

    1. When files are rendered on a computer screen a user witnesses something akin to the performance of a play. The underlying data in a file is interpreted and rendered through software for a user to interact with in much the same way that the script of a play is interpreted and performed by a cast on a stage. In each case, while the underlying script or files remains the same, a given performance of a file or a play is going to look and sound different. For some kinds of research questions those differences do not matter, however, it is necessary in either case to be aware of the differences.

      Because seeing a play is such a fleeting event, writing a play review may be a thrilling, though tough, effort. You must be both a spectator watching and appreciating the performance as well as a critical analyst of the production itself. You must be able to offer a quick overview of the play, as well as a close objective analysis of the performance you attended, as well as an interpretation and review of the full ensemble of staging, acting, directing, and so on. Couldn't the same be said about a file that is rendered on a computer screen? It can disappear at any point, so should we not read it carefully and think about why this specific work was digitized? What does this work really look like in its original form? How was this artifact written?

    1. “Acknowledging that they have that sovereignty over the material, that it is indeed not yours [the institution’s], is one of the key things we’re trying to promote in the work that we’re doing with the archival community in general,”

      I think this approach to the matter is a fantastic step forward on such a sensitive issue. The items being archived are no more the archivist's property than they are the institution's property. These records belong to a culture that we should aim to preserve independently of our own, and we cannot truly attempt such a feat if we try to claim ownership over every piece we host. After all, in essence, these records are knowledge that local indigenous communities are offering to preserve for us as opposed to it being lost to time. Some of it may never be shared with outsiders of that community, yet some of it may be shared, and surely that value alone would be worth the cost of the preservation programs.

      To give an analogy to this idea, if you could prevent the library of Alexandria from burning, even though you may never personally access it, but others might, would you? Or would you let it burn and lose an unknown amount of knowledge and history in the process?

    2. They even refer to deaths. Indeed, for some families, these records may be the only existing documents detailing the fates of their children.

      I think that it is important to keep these records of our nation's past as a reminder of where we came from and what horrid mistakes we made. Without these records, future generations may be doomed to repeat the same atrocities committed at residential schools, as well as other places. I also think that the ability to digitize and spread these documents allows the families of those affected to finally find closure in what has happened to their families.

    1. We utilized a between-subjects design in which we compared two types of feedback: Antisocial feedback and prosocial feedback, with no feedback as a control condition. In the antisocial feedback condition, keeping tokens to the self (i.e., maximizing one's own outcome) received many thumbs up, whereas in the prosocial feedback condition, donations to the group received many thumbs up. The no feedback control condition was similar to the feedback conditions in the sense that participants were informed that a spectator group would evaluate their decisions, so participants anticipated the possibility of feedback. The only difference in the no feedback control condition was that after making their decisions, participants were not shown any feedback.

      I feel like if I was a part of this experiment, this would affect my decisions. If I knew that someone was watching and judging my decisions, I would subconsciously change my original answers to answers that I think the people watching would approve of. It may just be something that is wired into our minds, that we have to accommodate our answers/actions based upon who is watching.

    2. they may also be instrumental in prompting adolescents to adopt other types of behavior, such as prosocial behavior

      I never really thought of it this way, the fact that adolescents may be picking up these bad behaviors because they're told to do the opposite. I believe that our hearts are not as pure as we think they are, when Adam and Eve ate of the forbidden fruit and brought sin into this world, we have all now been born with sin, therefore I believe that there is something within us that desires to go against good... there is evil within us that wants to succeed, and maybe that's where this rebellion comes from that we see being mentioned here.

    1. Author Response

      Reviewer #1 (Public Review):

      Using a large neonatal dataset from the developing Human Connectome Project, Li and colleagues find that cortical morphological measurements including cortical thickness are affected by postnatal experience, whereas cortical myelination and overall functional connectivity of the ventral cortex developed significantly but were not influenced by postnatal time. The authors suggest that early postnatal experience and time spent inside the womb differentially shape the structural and functional development of the visual cortex.

      The use of a large data set is a major strength of this study. Furthermore, the attempt to examine both structural and functional measures, together with the connectivity analysis and the separation of these analyses by pre- and full-term infants, is impressive and strengthens the claims made in the paper. While I find this work theoretically well-motivated and the use of the large dHCP dataset very exciting, there are some concerns that need to be addressed.

      There is a bit of confusion if the authors really compared the structural-functional measures in the final analysis. If the authors wish to make claims about the relationship, then there must be a compelling analysis detailing these findings.

      Thanks for the suggestions. We have added analysis to directly investigate the relationship between the development of homotopic connection and corresponding structural measurements in the area V1 (Page 13 Line 5-16):

      “The above results revealed that structural and functional properties of the ventral visual cortex both developed with PMA, but were differently influenced by the in-utero and external environment (Table 1). We further investigated the relationship between structural and functional development based on area V1, which showed a strong developmental effect in both structural and functional analyses. Mediation analysis was employed to see whether the development (GA or PT) of the homotopic connection between bilateral V1 was mediated by the structural properties (CT or CM). We found that the PT had a significant direct effect on the homotopic function that was not mediated by CT or CM (Fig 6a-b). In contrast, the direct effect of GA on the homotopic connection was not significant but the indirect effect of GA through CM on the connection was significant (Fig 6c-d).”

      There is also a bit of confusion in the terminology used in the study regarding ages: the gestational age, postmenstrual age, and postnatal time. I think clarifying and simplifying it down to GA and postnatal time will help the reader and avoid confusion.

      Thank you for the suggestion. We have made extensive revision regarding the terminology throughout the paper and simplified it down to GA and PT. Please see the response to the 1st major concern in the Essential Revisions (for the authors) section above.

      Reviewer #2 (Public Review):

      The authors utilize the publicly available dHCP dataset to ask an interesting question: how does postnatal experience and prenatal maturation influence the development of the visual system. The authors report that experience and prenatal maturation differentially contribute to different aspects of development. Namely, the authors quantify cortical thickness, myelination, and lateral symmetry of function as three different metrics of development. The homotopy and preterm infant analyses are strengths that, on their own, could have justified reporting. However, I have concerns about the analytic approaches that were used and the conclusions that were drawn. Below I list my major concerns with the manuscript.

      PMA vs. GA vs. PT

      The authors seek to understand the contribution of experience and prenatal development, yet I am unsure why the authors focused on the variables they did. There are three variables of interest used throughout this study: Gestational age at birth (GA), postnatal time (PT), and postmenstrual age at the time of scan (PMA). The last metric, PMA, is straightforwardly related to GA and PT since PMA = GA + PT. In most (but not all) of the manuscript, the authors use PMA and PT, with GA used without justification in some cases but not in others.

      It is unclear why PMA is used at all: PMA is necessarily related to PT and GA, making these variables non-independent. Indeed, the authors show that PMA and PT are highly correlated. The authors even say that "the contribution of postnatal experience to the development was not clarified because PMA reflects both prenatal endogenous effect and postnatal experience." So, why not use GA at birth instead of PMA? Clearly, GA is appropriate in some cases (e.g., Figure S4 or in some of the ANOVA applications), and to me, it seems to isolate the effect the authors care about (i.e., duration of prenatal development). Perhaps there is some theoretical justification for using PMA, but if so, I am unaware.

      That said, I expect that replacing all analyses involving PMA with GA will substantially change the results. I do not see this as a bad thing as I think it will make the conclusions stronger. As is, I am left unsure about what the key takeaways of this paper are.

      We appreciate the suggestions, and we have replaced the related analyses involving PMA with GA in the manuscript. Please see the Response to the 1st major concern in the Essential Revisions (for the authors) section above for more detail.

      Using GA instead of PMA will have several benefits: 1) It will be much simpler to think of these two variables since they contrast the duration of fetal maturation and time postnatally. 2) This will help the partial correlation analyses performed since the variance between the variables is more independent. It will also mean that the negative relationships observed between PT and cortical thickness when controlling for PMA (e.g., Figure 2h) might disappear (reversed signs for partial correlations are common when two covariates are correlated). 3) this will allow the authors to replace Figure 1a with a more informative plot. Namely, they could use a scatter of GA and PT, giving insight into the descriptive statistics of both dimensions.

      We have revised the manuscript thoroughly following the reviewer’s suggestion. However, we thought it necessary to show the overall development of CT and CM across the general age (PMA) in Figure 1. Therefore, we did not replace Figure 1a but added a scatter plot of GA against PT in Figure 2-figure supplement 1 and added their descriptive statistics to the manuscript: “The mean GA of the neonates was 39.93 weeks (SD = 1.26) and the mean PT was 1.21 weeks (SD = 1.25), the correlation between them was not significant (r = -0.08, p > 0.1; Figure 2-figure supplement 1).” Moreover, the negative relationships between PT and CT when controlling for PMA disappeared in the revised results, as the reviewer predicted.
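
      The sign reversal the reviewer anticipated can be illustrated with simulated data. The following is a minimal sketch, not the analysis used in the manuscript: the outcome `y`, its coefficients, the noise level, and the sample size are all hypothetical, and only the GA and PT means and SDs are taken from the descriptive statistics quoted above. Because PMA = GA + PT, an outcome driven mostly by GA looks negatively related to PT once PMA is held fixed, even though the simulated PT effect is positive.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000                # hypothetical sample size
ga = [rng.gauss(39.93, 1.26) for _ in range(n)]  # simulated GA (weeks)
pt = [rng.gauss(1.21, 1.25) for _ in range(n)]   # simulated PT (weeks); toy values may be negative
pma = [g + p for g, p in zip(ga, pt)]            # PMA = GA + PT

# Hypothetical outcome: strong GA effect, weak but POSITIVE PT effect.
y = [1.0 * g + 0.2 * p + rng.gauss(0.0, 0.5) for g, p in zip(ga, pt)]

# Controlling for PMA flips the sign, because y = PMA - 0.8 * PT + noise.
r_pt_given_pma = partial_corr(y, pt, pma)
# Controlling for GA recovers the positive PT effect.
r_pt_given_ga = partial_corr(y, pt, ga)
print(r_pt_given_pma, r_pt_given_ga)
```

      In this toy setting the partial correlation of the outcome with PT is negative when PMA is the covariate and positive when GA is the covariate, which is the pattern that motivated replacing PMA with GA.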

      I suspect that one motivation for the use of PMA over GA is for the analysis in Figure 6. In this analysis, the authors pick a group of term infants with a PMA equal to the preterm infants. Since PMA is the same, the only difference between the groups (according to the authors) is the amount of postnatal experience. However, this is not the only difference between the groups since they also vary in GA (and now PT and GA are negatively correlated almost perfectly). I don't know how to interpret this analysis since both the amount of prenatal maturation and postnatal experience vary between the groups.

      We appreciate the reviewer’s point that both GA and PT differed between preterm and term-born neonates. Any difference between the two groups might therefore come from the combined effect of GA and PT, and unfortunately we are not able to separate them in this analysis. However, the preceding results indicated that CT was significantly influenced by both PT and GA while CM was significantly influenced by GA, so we discuss the preterm and term-born comparison in the context of these findings (Page 19 Line 26-29 and Page 20 Line 1-5): “We found CT in the ventral cortex was generally lower in the term-born than preterm-born infants, while the CM showed the opposite trend in the two groups. Since the preterm babies have longer PT but shorter GA compared to full-term infants at the same PMA, this result supported the above analysis that CT was preferably influenced by PT while CM was largely dependent on GA during the neonatal period”. Furthermore, we added a description in the limitation section to stress the caveat (Page 20 Line 17-19): “Meantime, both GA and PT were different between preterm and term-born neonates. Then any of the differences between the two groups might come from the combined effect of GA and PT, and unfortunately, we were not able to separate them in this study.”

      Justification of conclusions and statistical considerations

      I had concerns about some of the statistical tests and conclusions that the authors made. I refer to some of these in other sections (e.g., the homotopy analyses), but I raise several here.

      I am not sure what evidence the authors are using to make this claim: "we found that the cortical myelination and overall functional connectivity of ventral cortex developed significantly with the PMA but was not directly influenced by postnatal time." Postnatal time is significantly correlated with cortical myelination, as shown in Figures 2g, 2h, 3b, 3c, and postnatal time is significantly correlated with functional connectivity, as shown in Figures 4h, 5c, 5d, and 5e. Hence, this general claim that "the development of CT was considerably modulated by the postnatal experience while the CM was heavily influenced by prenatal duration" doesn't seem to be supported: both myelination and thickness are affected by postnatal experience and prenatal duration (as measured by PMA). A similar sentiment is expressed in the abstract. Perhaps the authors suggest different patterns in the strength of change for PMA vs. PT across these metrics, but if so, then statistical tests need to support that conclusion, and the claims need to reflect that sentiment.

      Interestingly, Figure S4 presents a compelling ANOVA that does support this conclusion. Still, this result is relegated to the supplement, and it also uses GA, rather than PMA, making it hard to reconcile with the other claims made in the main text. Moreover, it uses ANOVAs, which dichotomizes a continuous variable. Here and elsewhere in the manuscript (e.g., Figures 3d, 3e), the authors split the infants into quartiles and compare them with ANOVAs. Their use for visualization is helpful, but it is unclear what the statistical motivation for this is rather than treating these as continuous variables like is possible with linear mixed-effects models. Moreover, it is unclear why the authors excluded half the data from the study (i.e., quartiles 2 and 3) in this ANOVA when all four quartiles could be used as factors.

      We appreciate the reviewer’s comments. We have clarified our results and conclusions in the revised manuscript based on the new analyses that replaced PMA with PT and GA (see the response to the 1st major concern in the Essential Revisions). The previous claims have been changed as follows: “the postnatal time could modulate the cortical thickness in ventral visual cortex and the functional circuit between bilateral primary visual cortices. But the cortical myelination, particularly that of the high-order visual cortex, developed without significant influence of postnatal time in such early period” (Page 2, Lines 8-12). These claims are supported by the results in Figure 2. Moreover, to support the claims about the relative influence of GA and PT on structural development, we replaced the ANOVA with a linear mixed-effect model, as the reviewer suggested.

      1) To compare the influence of GA against PT on the structural development in the whole ventral visual cortex (Page 7 Line 15-19), “We applied a linear mixed-effect model to test whether the CT (or CM) of the whole ventral cortex were differently influenced by the GA vs. PT, and found that the GA had a significantly stronger effect on the CM than PT (interaction between GA and PT, p < 0.05) but no significant difference was found of the effect on the CT between the ages (p > 0.6).”

      2) To compare the influence of GA against PT on the structural development in the area V1 and VOTC, we applied a similar linear mixed-effect model analysis for the two ROIs (Page 8 Line 17-18 and Page 9 Line 1-4): “Moreover, we applied a linear mixed-effect model to test the developmental influence of GA vs. PT on the cortical structure, and the results showed that the CT in two ROIs showed non-significantly different influences from GA against PT (p > 0.3), but CM showed at least marginally significant results in both two ROIs (V1: p < 0.01 and VOTC: p < 0.09).”

      It is unclear what the evidence is to support the following claim: "Both CT and CM show higher correlation with PMA in the posterior than anterior region, and higher correlation in the medial than lateral part within the anatomical mask (Figure 2a and Figure S2b-c [sic])" From Figure 2 or Figure S2, I don't see a gradient. From Figure S3, there might be a trend in some plots, but it is hard to interpret since it is non-monotonic. More generally, is there a statistical test to support this claim?

      We added a correlation analysis between the direction (x: lateral to medial; y: posterior to anterior) and the measurements (CT and CM) in the ventral visual cortex, and the resulting coefficients were all significant (r = 0.7/-0.8 for CT along the x/y axis, and r = 0.91/-0.83 for CM along the x/y axis; p < 0.001). See Figure 1-figure supplement 2. However, the reviewer’s concern still holds: the significance was driven by part of the areas and the gradient was non-monotonic. Therefore, we replaced the original claim with the following sentence (Page 6 Line 3-8): “In addition, we found distinct spatial variation along ventral cortex, e.g. posterior-anterior and medial-lateral directions (Figure 1-figure supplement 2a-b). Generally, both CT and CM showed higher correlation with PMA in the posterior than anterior region (r = -0.8 and -0.83; p < 0.001), and higher correlation in the medial than lateral part within the ventral visual cortex (r = 0.7 and 0.91; p < 0.001; Figure 1-figure supplement 2c-d).”

      "and the interaction [sic] was more prominent in CM (simple effect: t = 10.98, p < 10-9) that in than CT (t = 2.07, p < 0.05)." Does 'more prominent' mean it is 'significantly stronger'? If not, then the authors should adjust this claim

      The claim ‘more prominent’ did express ‘significantly stronger’ since we found that the interaction between CM and CT along PMA or PT was significant in the ANOVA analysis. This analysis has been removed because we thought that the comparison between two structural measurements is not very relevant to the conclusion of the paper. We now applied a linear mixed-effect model to compare the influence of GA against PT on specific structural development. So this result and claim have been removed from the new manuscript.

      Are the authors Fisher Z transforming their correlations? In numerous places, correlation values seem to be added together or used as the input to other correlation analyses. It is unclear from the methods whether the authors are transforming their correlation values to make that use appropriate.

      We are sorry for the confusion. All correlation coefficients were Fisher-Z transformed before entering the statistical analyses. We have added a clear description of the Fisher-Z transformation to the manuscript (Page 25 Line 16-18).
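
      To make the reviewer's point concrete, the following minimal sketch shows why correlations are usually transformed before being averaged or entered into further tests. The function names and the example values are illustrative assumptions, not taken from the manuscript's pipeline.

```python
import math


def fisher_z(r):
    """Fisher Z transform: maps r in (-1, 1) to an unbounded scale."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Back-transform a Fisher Z value to a correlation coefficient."""
    return math.tanh(z)


def mean_correlation(rs):
    """Average correlations on the Z scale, then back-transform."""
    zs = [fisher_z(r) for r in rs]
    return inverse_fisher_z(sum(zs) / len(zs))


# Illustrative values only: the naive mean and the Z-based mean differ
# when the correlations are far apart.
example = [0.1, 0.9]
naive = sum(example) / len(example)
via_z = mean_correlation(example)
print(naive, via_z)
```

      Averaging on the Z scale gives more weight to strong correlations than the arithmetic mean of raw r values, so the two summaries are not interchangeable.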

      Homotopy analyses

      The homotopy section is a strength of the paper, but I have doubts about the approach taken to analyze this data and some of the conclusions drawn. I don't expect any of my suggestions to change the takeaway of this section, but I do think they are essential criticisms to address.

      I do not think that the non-homotopic control condition is appropriate. In Arcaro & Livingstone (2017), the authors had 3 categories for this analysis: homotopic pairs (e.g., left V1 vs. right V1), adjacent pairs (e.g., left V1 vs. right V2), and distal pairs (e.g., left V1 vs. right PHA1). In the homotopy analysis performed by Li and colleagues, they compare homotopic pairs with all other pairs. I don't think that is generous to the test since non-homotopic pairs include adjacent pairs that should be similar and distal pairs that shouldn't be similar. This may explain why some non-homotopic distribution overlaps with the homotopic distribution in Figure 4c.

      Thanks for these suggestions. In the revised manuscript, we reanalyzed the data by dividing the connections into three groups for each subject. See Page 26 Line 24-29: “For each subject, Pearson correlations were carried out on the ROI-averaged time series within and across the left and right ventral cortex. The resulting connections were divided into three groups, namely the homotopic connection (the connection between two paired areas in two hemispheres, e.g. right and left V1), adjacent connection (e.g., right V1 and left V2 since V1 and V2 are adjacent) and distant connections (two areas that were neither paired nor adjacent)”.

      Regardless of this decision, I think the authors should reconsider their statistical test. I think the authors are using a between samples t-test to compare the 34 homotopic pairs with the hundreds of non-homotopic pairs. This is statistically inappropriate since the items are not independent (i.e., left V1 vs. right V1 is not independent of left V1 vs. right V2, which is also not independent of left V3 vs. right V2). This means the actual degrees of freedom are much lower than what is used. Moreover, I am unsure how the authors do this analysis across participants since this test can be done within participants. The authors should clarify what they did for this analysis and justify its appropriateness.

      Thank you for the suggestion. In the previous manuscript, we first averaged the connection matrix across subjects and then calculated the homotopic (or non-homotopic) connections between areas; therefore, statistical analysis across subjects could not be performed. In the revised paper, we calculated the three groups of connections for each subject before averaging. We applied a non-parametric test (Wilcoxon signed-rank) across subjects to address the non-independence among the connections, and found that the homotopic connections were significantly stronger than the adjacent or distant connections.

      See (Page 26 Line 29 and Page 27 Line 1-3): “Independent-sample T-test was used to test whether the homotopic correlation was significantly greater than zero across subjects. To compare the correlation among the three types of connections, we applied a non-parametric statistical analysis (Wilcoxon signed-rank) across subjects”.

      The results showed that (Page 9 Line 17-21) “the homotopic connections in all ROIs of ventral cortex were significant (mean r = 0.13–0.43, t > 12.87, p < 10^-9; Fig 4a-b), and were significantly higher than adjacent connections (0.29 ± 0.12 vs. 0.19 ± 0.10, Wilcoxon signed rank test on the Fisher-Z transformed r value: z = 16.32, p < 10^-9) and distal connections (0.04 ± 0.06, z = 16.32, p < 10^-9; Fig. 4c)”.
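
      The grouping and the paired test described above can be sketched as follows. This is an illustrative simplification and not the code used in the study: the `adjacency` map, the ROI names passed to it, and the per-subject values are hypothetical, and the z statistic uses the plain normal approximation without tie or continuity corrections.

```python
import math


def classify_connection(left_roi, right_roi, adjacency):
    """Label a left-right ROI pair as homotopic, adjacent, or distant."""
    if left_roi == right_roi:
        return "homotopic"
    if (right_roi in adjacency.get(left_roi, ())
            or left_roi in adjacency.get(right_roi, ())):
        return "adjacent"
    return "distant"


def wilcoxon_signed_rank(x, y):
    """Return (W+, z) for paired samples x and y.

    Zero differences are dropped, tied absolute differences get average
    ranks, and z is the normal approximation without corrections.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return w_plus, (w_plus - mean_w) / sd_w


# Hypothetical adjacency map and per-subject mean Fisher-Z values.
adjacency = {"V1": {"V2"}, "V2": {"V3"}}
homotopic = [0.3, 0.4, 0.5, 0.6]   # one value per subject
adjacent = [0.1, 0.1, 0.1, 0.1]    # same subjects, same order
w_plus, z = wilcoxon_signed_rank(homotopic, adjacent)
print(classify_connection("V1", "V2", adjacency), w_plus, z)
```

      Pairing the values within subject, as here, is what keeps the test from treating the many non-independent ROI pairs as separate observations.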

      Could the authors speculate on why the correlations in homotopic regions are so much lower than what Arcaro and Livingstone (2017) found. I can think of a few possibilities: higher motion in infants, less rfMRI data per participant, different sleep/wake states, and different parcellation strategies. Regarding the last explanation, I think this is a real possibility: the bilateral correlation may be reduced if the Glasser atlas combines functionally heterogeneous patches of the cortex. Hence, the authors should consider this and other possible explanations.

      Thank you for the suggestion. The neonates included in this study were all under natural sleep during the scan, so sleep/wake states would not be one of the causes. We added some possible reasons for this difference following the related results (Page 19 Line 9-13): “However, the present homotopic connections in the human neonates were lower than those in neonatal macaques (Macaca mulatta) (Arcaro and Livingstone, 2017). This difference might relate to the higher motion in human infants, less r-fMRI data in the present study, coarser parcellation in the visual cortex used in this work, and the developmental difference between non-human primates and humans in the neonatal period.”

      The authors assume that the homotopic analyses mean that there are lateral connections between hemispheres (e.g., "Furthermore, the connections among the ventral visual cortex have developed during this early stage. Specifically, the homotopic connections between bilateral V1 and between bilateral VOTC both increased with GA, indicating an increased degree of functional distinction"). While this might be true, it doesn't need to be. Functional connectivity can be observed between regions that lack anatomical connectivity. Instead, two regions could both be driven by another region. In this case, the thalamus might drive symmetrical activity in the visual cortex.

      We agree with the reviewer’s view that the development of functional connectivity might be driven by other regions such as the thalamus. So we added this interpretation in the discussion section (Page 19 Line 23-25): “It is worth noting that the increased homotopic connection can be direct or indirect, e.g., the effect might be driven by external regions with enhanced connection to both of the areas (e.g. thalamus)”.

      Miscellaneous

      I am not sure what the motivation of this line is: "Moreover, those studies did not fully control the visual experience in the first few weeks of the subjects, thus cannot give a clear conclusion whether the innate functional connectivity is unrelated to postnatal visual experience." Arcaro, Schade, Vincent, Ponce, & Livingstone (2017) did control the visual experience of subjects. Moreover, the research here doesn't control infant experience in the way this sentence implies: it implies an experimental manipulation (i.e., fully control) rather than the statistical control that is done here. Consider rephrasing.

      We have rephrased this sentence in the introduction section (Page 5 Line 2-5): “Moreover, the human infants participating in a previous study (Kamps et al., 2020) were around one month old (mean age: 27 d; range from 6 to 57 d), who might already acquire some visual experience, and thus this study could not exclude the influence of postnatal visual experience on the innate functional connectivity”.

      I am not sure why this claim is made: "Area V1 was selected because this region is the most basic region for visual processing and probably is the most experience-dependent area during early development". Is there evidence supporting this claim? Plasticity is found throughout the visual cortex, and I think which region is most plastic depends on the definition of plasticity. For instance, most people have the same tuning properties to gabor gratings (e.g., a cardinality bias), but there is enormous variability in face tuning across cultures.

      We have removed this claim in the manuscript.

      The abstract says 783 infants were included in this study, but far fewer are actually used. The authors should report the 407 number in the abstract if any number at all.

      We have revised the number accordingly.

      Any comparisons of preterms and terms ought to be given the caveat that the preterm environment can be very different than the term environment: whereas a term infant goes home and sees friends and family without restriction, the preterm environment can be heavily regulated if they are in a NICU. Authors should either provide details about the environments of the preterms in their study, or they should consider how differences in the richness of visual experience - regardless of quantity - may affect visual development.

      We agree with the reviewer’s concern, and added a paragraph in the limitation section to stress the caveat (Page 20 Line 12-16): “One limitation of this study is that the comparison between preterm and term-born infants did not consider the different visual experience in these infants. The preterm-born neonates may experience a very different environment from that of the term-born, e.g. the preterm environment can be heavily regulated if they were in a NICU, but we didn’t have detailed information about the postnatal environment to control for it.”

      Reviewer #3 (Public Review):

      The authors use a large neonatal dataset to examine how development may occur differently based on whether or not the neonate spent that time in gestation or out of the womb, potentially accruing visual experience. In this manner, the authors hope to tease apart those aspects of development that are biologically programmed versus those that occur in response to experience within the visual cortex. They show structurally that cortical thickness is affected by postnatal experience while cortical myelination is not, and functionally they find regional differentiation present between visual areas at birth and that their connectivity changes with development and postnatal experience. The conclusions seem well supported by the data and analyses and provide some insight into which aspects of brain structure at birth are sculpted more by postnatal experience and which are more determined by endogenous developmental timelines.

      The analyses are based on a large sample of infants, and the authors were careful to statistically separate which aspects of an infant's age, gestational or postnatal, are driving brain development, providing a deeper picture of infant brain development than previous publications. Overall, the findings seem well supported by the data as the analyses are relatively straightforward.

      Visualization of the data and findings could be improved, as a few figures are difficult to interpret without having to read the methods.

      We have extensively revised the figures in the manuscript to improve the readability. See updated Figures 2-7.

      The acronyms regarding gestation, postnatal, and post-menstrual time are a little distracting. Please consider explicitly writing "gestational time" etc when referring to these numbers to improve readability.

      We have replaced the analyses involving PMA with gestational age (GA) or postnatal time (PT) in the revised manuscript to simplify the terminology. Please see the Response to the 1st major concern in the Essential Revisions (for the authors) section above. We believe this change makes the paper easier to follow even with the abbreviations.

      Because the cortical ribbon of infants is so thin at birth, there seems to be a possibility that partial-volume effects could be more prevalent in less-developed infants and impact myelin metrics. If not modeled or estimated, it should at least be discussed.

      In fact, the cortical thickness of the neonatal brain is not thinner than that of the adult. Particularly, the average cortical thickness of infants aged 0-5 months is around 2-2.5 mm (Wang et al., 2019), which is similar to adults (Fjell et al., 2015). Therefore, the partial-volume effect for cortical gray matter is not a special concern for infants.

      Nevertheless, we agree that the partial-volume effects might have different influences on infants of different ages. We added this consideration in the limitation section (Page 20 Line 20-24): “Another concern was about the partial-volume effect on the cortical measurements. The changing thickness of cortical ribbon during development may change the degree of partial-volume effect, and thus may affect the cortical myelination measurement and may contribute to the myelination difference observed between preterm and term-born groups.”

      Structural and functional development could be more formally compared using quantitative models if the authors want those points more strongly related; the two are only qualitatively discussed at present.

      We have added a formal analysis to investigate the relationship between structural and functional development. Please see the Response to the 1st concern of Reviewer 1 (public review).

    1. Author Response

      Reviewer 2 (Public Review):

      1) The hypothesis that the genes responsible for the Mendelian traits are also the causal genes for the cognate complex traits does not seem to hold, given the prior work and the data shown in the study. For example, if this hypothesis is true, it is unexplained why the candidate genes were not even enriched in the GWAS regions for height and breast cancer.

      Following the removal of a data artifact from our breast cancer analysis and the inclusion of Backman et al.’s larger list of genes implicated in height, every phenotype in our analysis displays enrichment in proximity to GWAS peaks. Enrichment is present not only in genes selected based on cognate Mendelian phenotypes, but also in those from Backman et al., which examined the same complex trait phenotypes that were used for GWAS. In that work, the enrichment of GWAS signal near genes selected on coding variants was as high as 59.3-fold.

      Our use of Mendelian-trait-causing genes is not dependent on GWAS. Short of large-scale experimental work, we do not know any better way to confirm the genes’ broad relevance to GWAS phenotypes than their enrichment near peaks. This enrichment has been persuasively demonstrated by previous research. Freund et al. (2019) tested the enrichment of 20 Mendelian disorder gene sets against 62 complex phenotypes. Though there was no statistically significant overlap of phenotypically non-matched Mendelian genes and GWAS peaks (2% matched), the overlap of matched Mendelian genes and GWAS peaks was significant (54% matched).

      We have included additional evidence and references for this relationship in Supp. Note 1.

      2) The only evidence supporting their hypothesis appears to be the enrichment of the candidate genes in the GWAS regions for seven out of the nine traits. However, significant enrichment of the candidate genes in the GWAS regions does not necessarily mean that a large proportion of the candidate genes are the causal genes responsible for the GWAS signals. Analogously, we cannot use the strong enrichment of eQTLs in GWAS regions as evidence to claim that a large proportion of the GWAS signals are driven by eQTLs.

      Our gene sets were selected by considering two criteria: whether they are relevant to each complex trait, and whether they are biologically interpretable.

      The genes identified in Backman et al. have a strong case for relevance. They are evaluated for association, not with cognate Mendelian phenotypes, but with the exact same complex traits used for GWAS.

      Our genes, selected based on cognate Mendelian traits, are less obviously relevant, but have advantages for interpretation. Many have well-understood biological roles and are part of pathways that have been studied in great detail. Because most of these genes can cause dramatic phenotypic changes with one variant, the direction of effect is easier to understand than genes identified through burden testing. In fact, loss-of-function coding variants that cause autosomal dominant traits can be thought of as large-effect, context-independent eQTLs—they cause phenotypic change by decreasing gene expression roughly 50% across cell types, developmental stages, etc.

      Ideal genes for our analysis would combine the advantages of both sets. They would have individual coding variants that could be tied to complex traits using exome sequences. However, natural selection creates tradeoffs between variant frequencies and variant effect sizes. Large-effect variants (such as those responsible for Mendelian traits) are generally too rare to be detected in population sequencing. Coding variants that reach frequencies detectable in databases such as UK Biobank typically have smaller effect sizes, requiring them to be aggregated in order to implicate genes.

      We believe that our original gene set is plausible both because of its collective enrichment in GWAS signal and because each gene is individually known to cause cognate phenotypes. Enrichment is not proof, but can serve as strong evidence when backed up by known biology. Though selection precludes a perfect gene set, the enrichment in both our Mendelian gene set and the set from Backman et al. addresses each criterion—interpretability and relevance—individually, and, taken together, provides an argument for the relevance of genes selected based on coding variants.

      3) Considering the large numbers of GWAS signals, we would expect a substantial number of genes in the GWAS regions by chance. It would be interesting to quantify the number of genes in the GWAS regions if the 143 genes are randomly selected. Correcting the observed number of genes for that expected by chance (e.g., subtracting the observed number by that expected by chance), the proportion of the candidate genes in the GWAS regions would be small.

      The proportion of the candidate genes whose eQTL signals were colocalized with the GWAS signals or in close physical proximity with the fine-mapped GWAS hits was small. However, I would not be surprised if they are significantly enriched, compared with that expected by chance (e.g., quantified by repeated sampling of the 143 genes at random).

      Taking random sets of genes, or the entire set of non-putatively-causative genes shows that, given the size of our gene set, we would expect 43 randomly selected genes to fall within 1 Mb of a peak (95% confidence interval: 31.5-54.5). Instead, we find 147 peak-adjacent genes. When looking closer to genes, the enrichment increases. At a distance of 100 kb, we find 104 putatively causative genes, but the null model predicts only 11 (95% CI 4.5-17.0), a roughly ten-fold difference.
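
      The random-gene-set null described above can be sketched as follows. This is a hedged illustration rather than the analysis code: positions are treated as coordinates on a single hypothetical chromosome, the toy gene and peak positions are invented, and the function names are assumptions. Only the observed and expected counts in the last lines come from the text above.

```python
import random
from bisect import bisect_left


def count_near_peaks(gene_positions, peak_positions, window):
    """Count genes with at least one peak within +/- window bases."""
    peaks = sorted(peak_positions)
    count = 0
    for pos in gene_positions:
        i = bisect_left(peaks, pos - window)  # first peak >= pos - window
        if i < len(peaks) and peaks[i] <= pos + window:
            count += 1
    return count


def null_counts(all_gene_positions, set_size, peak_positions, window,
                n_draws, seed=0):
    """Peak-adjacent counts for randomly drawn gene sets of a fixed size."""
    rng = random.Random(seed)
    return [
        count_near_peaks(rng.sample(all_gene_positions, set_size),
                         peak_positions, window)
        for _ in range(n_draws)
    ]


# Toy data (hypothetical): 1,000 genes and 20 peaks on one 100 Mb chromosome.
toy_rng = random.Random(1)
all_genes = [toy_rng.randrange(100_000_000) for _ in range(1000)]
peaks = [toy_rng.randrange(100_000_000) for _ in range(20)]
draws = null_counts(all_genes, set_size=100, peak_positions=peaks,
                    window=1_000_000, n_draws=200)
expected_by_chance = sum(draws) / len(draws)

# Fold enrichment implied by the counts reported in the response.
fold_1mb = 147 / 43     # observed vs. expected within 1 Mb
fold_100kb = 104 / 11   # observed vs. expected within 100 kb
print(expected_by_chance, round(fold_1mb, 1), round(fold_100kb, 1))
```

      With the reported counts, the ratio of observed to expected genes is about 3.4 at 1 Mb and about 9.5 at 100 kb, consistent with the "roughly ten-fold" figure quoted for the shorter distance.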

      Enrichment remains significant even when using a more conservative null. It may be that genes like ours, with importance to phenotype, are more likely than random genes to fall near GWAS peaks, even if their phenotype does not correspond to the GWAS phenotype. In this case, we might see enrichment even in the absence of a relationship between our Mendelian and complex traits. To account for this, we also tested significance by testing gene sets against different phenotypes (e.g. testing our LDL genes with a UC GWAS, and our height genes with a T2D GWAS). The results of this permutation are visible in Supp. Fig. 1, and further confirm the enrichment.

      Finally, non-expression based analysis found that Mendelian genes had large enrichments in heritability. As in our study, they included Mendelian genes for diabetes and LDL—the Mendelian diabetes genes were enriched 65-fold for common-variant heritability and the Mendelian LDL genes were enriched 212-fold (Weiner et al. 2022).

      Though it is true that the number of colocalizations and TWAS hits likely represents a statistically significant enrichment over all genes, we feel that this does not affect the conclusions of the paper. The model that noncoding variants identified by GWAS act as eQTLs certainly has some truth—colocalization and TWAS studies have found, in total, many associations. But the model’s success has not lived up to its expectations. This has been suggested, albeit inconclusively, by the failure of most GWAS peaks to colocalize. By evaluating, not the portion of loci that can be tied to a gene, but the portion of already-implicated genes that can be tied to a locus, we believe the model’s deficiencies are both more clear and more puzzling.

      4) It is unclear how the authors selected the breast cancer genes. If the genes were selected based on tumor somatic mutations, it is a problem because there is no evidence supporting that somatic mutation target genes are also cancer germline risk genes.

      Genes for breast cancer were selected using the MutPanning method (Dietlein et al. 2020), which takes somatic mutations found in tumors, and evaluates them in the context of known mutation patterns. The relationship between somatic and germline variants in cancer is little studied. We believe it is meaningful that, as explained in our response to overall comment 2ii, we do now find an enrichment of our breast cancer genes near GWAS peaks. Though these genes are very unlikely to be a perfect set, the conclusions of our paper remain true with or without the inclusion of this phenotype.

      5) The authors observed no enrichment of the candidate genes in height and breast cancer GWAS regions. In this case, should these traits and the corresponding genes be removed from the subsequent analyses?

      The reviewers’ notes about enrichment—and its absence in height and BC—prompted us to review our analysis of it. The enrichment for five of our phenotypes remained significant, and the lack of enrichment for breast cancer genes proved artifactual. After accounting for the artifact, the enrichment of breast cancer genes displays the same pattern as most other phenotypes, displaying highly significant enrichment as compared to the genomic background and a permutation analysis. Supplementary figure 1 has been updated to reflect this change, and to add the enrichments found in Backman et al.

      Because our original analysis of height has nominal, but not corrected, significance for enrichment, the problem may be one of power. The set of height genes identified by Backman et al. is larger than our original set and displays a significant enrichment in proximity to GWAS signal. This enrichment is also present when the two gene sets are combined, as shown in the updated Supp. Figure 1.

      Reviewer 3 (Public Review):

      1) The positive results are substantially reduced when restricting the analyses to a set of selected tissues of relevance to the trait. Isn't it implicated that the selection of relevant tissues in this study is not comprehensive, and further, tissue specificity is common in mediating genetic effects by gene expression? First, it seems some apparently relevant tissues are not selected (Table 2), such as bone for height (Finucane et al. 2015 NG). One approach to assess the relevant tissues for the predefined set of putatively causative genes is to see if these genes are enriched in the differentially expressed gene sets for those tissues. Second, among 84 putatively causative genes overlapped with GWAS signals, they identified 39 genes by TWAS, 11 genes by fine mapping with linear distance to chromatin modification features, and 41 genes by fine mapping with ChromHMM enhancer annotations, but these numbers reduced substantially to 9, 5 and 27 when restricting the same analysis to the selected tissues for each trait. If genes function only in the relevant tissues, I think using bulk expression data would lose power but is unlikely to give false positives. Thus, it is possible that for the traits analysed, not all relevant tissues are selected so that only a fraction of genes identified in bulk expression analysis can be replicated in the tissue-specific analysis. This appears to me a notable piece of evidence to support the hypothesis of biological context that the authors tend to have reservations in discussion.

      Testing for colocalizations or TWAS hits in all tissues may increase power for several reasons. First, it is possible that some GTEx tissues have unrealized relevance to our phenotypes. Secondly, in the event that a tissue is not present in GTEx, we may still detect relevant eQTLs in a tissue that is not itself involved in the trait, but which has similar patterns of expression. Finally, some tissues may be correct, but underpowered due to their small sample size. In this case, we may better detect the colocalization in tissues that are “irrelevant,” but are well-powered and have correlated expression.

      However, this creates problems of interpretation. Say we find, for example, a colocalization of an APOE eQTL with an LDL GWAS peak in skin tissue. Does this mean that skin tissue contributes to LDL levels? Is it simply because skin tissue has more samples than liver? Are we uncovering a strange, unexpected pleiotropy?

      We believe we can achieve both objectives—power and interpretability—with our use of MASH (Urbut et al. 2019) as described in response 3 of the first section. Briefly, MASH is a Bayesian tool that we use to update the estimates of eQTLs in GTEx data. Each tissue is adjusted to incorporate signals detected in other tissues with similar expression. This mitigates the danger of ignoring the correct tissue, and increases the power of tissues with small sample sizes. Its benefit is demonstrated by the substantial increase in the number of expression-GWAS colocalizations identified by coloc—however, the number of genes identified that fall within our putatively causative gene sets remains strikingly small.

      2) How much do both LD differences between GWAS and eQTL samples and the presence of allelic heterogeneity contribute to the observed low colocalization rate? One of their main findings is the low colocalization between trait-associated variants and eQTL in non-coding regions, which accounts for only 7% of the putatively causative genes. In discussion, the authors believe that this finding cannot be explained by lack of statistical power and is directly supported by a Bayesian analysis which reported high posterior probabilities of distinct signals for GWAS and eQTL. I agree that power is probably not a big issue. However, my concern is that given the large difference in sample size between GWAS and GTEx datasets, any small differences in LD between the two samples might cause a statistical separation of the signals even when trait phenotype and gene expression truly share a causal variant. Moreover, the presence of more than one causal variant with allelic heterogeneity in the locus may also play a part in the failure of colocalization. Consider two causal variants for the complex trait, one regulating the target gene and the other regulating another gene in co-expression. Potentially, the presence of the second causal variant would diminish the colocalization probability at the target gene.

      The ability of our statistical tools to actually find colocalizations is critical to this project. Small sample size increases the variance of the LD matrix, but it is only one of many factors that influence power, which include LD differences between study populations and eQTL effect sizes.

      Though we restricted both GWAS and GTEx samples to subjects with European ancestry and used PCs as covariates, the reviewers are correct that there are likely to be LD differences between samples, due to both slight variations in populations and the smaller sample sizes of GTEx. Analyses of colocalization tools in cases of mismatched LD have shown that decreases in power are small. Chun et al. (2017) tested JLIM in simulated conditions of modest population mismatch, using CEU haplotypes to create the GWAS, and haplotypes from all non-Finnish Europeans for eQTL associations. They then attempted to distinguish shared vs. distinct causative variants for GWAS and eQTL, finding no decrease in sensitivity or specificity (Supp. Fig. 6 of Chun et al. 2017).

      The case in which two genes are co-regulated by nearby variants, both causative for the GWAS trait, creates a condition of allelic heterogeneity for the GWAS trait (as opposed to the expression trait). Chun et al. evaluated JLIM’s loss of power as a result of AH, and found that the power loss is small, except in cases in which the two variants have equal effects (Supp. Fig. 10). Testing cases in which the AH occurs for the expression trait returned a similar result (Supp. Fig. 9).

      Hukku et al. (2021) performed similar analyses on coloc, eCAVIAR, and fastENLOC. Allelic heterogeneity was found to damage the power of coloc (by about a factor of 2). Testing on different pairs of populations, they conclude that extreme LD mismatches (e.g. Finnish vs. Yoruban samples) can lead to substantial power loss, but moderate LD mismatches (e.g. Finnish vs. British samples) do not. Though a factor of two is substantial, it would not change the qualitative conclusions of this paper. Overall, given the variety of methods we employ (including those, such as JLIM, more robust to AH), we are confident that they have, when taken together, been shown to be robust to the concerns raised.

      Finally, TWAS should, by design, be less vulnerable to LD differences and allelic heterogeneity. This can result in false positives, when genes with correlated expression are identified together, despite only one being causative. It can also result in non-causative genes being prioritized over causative ones; generally, however, both genes will be identified (Wainberg et al. 2019).

      3) Perhaps the authors can perform some simulations to quantify the influence of tissue-specific expression effects, LD differences between eQTL and well-powered GWAS, and allelic heterogeneity, as discussed above, on their analyses. I understand that the authors may not be willing to do as it would involve a lot of work. But I'd like to see at least some discussion on how these questions can be better addressed in the future research.

      These are nuanced technical questions, and to address them by simulation in our paper would, as noted, involve a lot of work. We have summarized previous work that evaluated the effects of LD differences and AH in our response to essential revision 4. We discuss our concerns about the possibility of an overly broad tissue search in essential revisions 3 and 5, and our decision to address this question using MASH in essential revision 3.

      4) It looks quite striking that only 6% of the putatively causative genes are identified by TWAS with the correct effect direction. But I think this number is slightly misleading as one may interpret it as only 6% of the functionally relevant genes are regulated by trait-associated variants. In fact, 46% of the genes are detected by TWAS but only 11% are confirmed in their selected tissues, among which about half (5/9) have correct effect direction. First, the result could be limited by the selection of relevant tissues, as discussed above. Second, the fact that half of the genes do not show correct effect direction may reflect a nonlinear relationship between expression and trait, or the presence of cell-type heterogeneity within a tissue. These may not necessarily overturn the assumption that these genes are regulated by trait-associated variants in the causal tissues or cell types.

      In our initial submission, we had been reluctant to expand the list of tissues for two reasons. First, increasing from the small number of tissues with known biological relevance to all tissues (or all non-brain tissues) increases the multiple-testing correction burden. Second, and, in our eyes, more important, colocalizations in tissues without clear biological relevance are not biologically interpretable. Such hits can be results of complicated genetic architecture (e.g. shared eQTLs), power differences in tissues with correlated expression, or biology not directly related to the trait in question.
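
      The multiple-testing point can be made concrete with a small sketch. It assumes a simple Bonferroni correction over gene-tissue pairs, which is an illustrative choice and not necessarily the correction used in the paper; the gene and tissue counts are likewise hypothetical inputs.

```python
def bonferroni_threshold(alpha, n_genes, n_tissues):
    """Per-test p-value threshold when every gene is tested in every tissue."""
    return alpha / (n_genes * n_tissues)


# Hypothetical numbers: a few trait-relevant tissues versus a broad tissue panel.
alpha = 0.05
n_genes = 84
few_tissues = 5
all_tissues = 49

strict = bonferroni_threshold(alpha, n_genes, all_tissues)
lenient = bonferroni_threshold(alpha, n_genes, few_tissues)

# The threshold tightens in proportion to the number of tissues searched.
ratio = lenient / strict
print(f"selected tissues: {lenient:.2e}, all tissues: {strict:.2e}, ratio: {ratio:.1f}")
```

      Under these assumptions, widening the search from 5 to 49 tissues makes the per-test threshold roughly ten times stricter, which is the burden the response refers to.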

      That said, the tissue data we have access to are incomplete, and we are without question missing some relevant tissues. Additionally, some relevant tissues have lower sample sizes, and thus lower power, than tissues that are not relevant but may still share eQTLs. To overcome these problems, we applied Multivariate Adaptive Shrinkage (MASH), a Bayesian method that detects correlations between different conditions (in this case, tissues) and uses them to produce posterior estimates of summary statistics in each tissue (Urbut et al. 2019). Unlike meta-analysis, which produces one result, the effect size estimates for each tissue are distinct, though informed by one another.
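
      As a rough sketch of the kind of cross-tissue shrinkage involved, consider a single-component simplification. This is not the full MASH model, which uses a mixture of such components with estimated weights. Suppose the observed effect vector across tissues is $\hat{b}$ with known error covariance $S$, and the true effects $b$ have a zero-mean normal prior with covariance $U$ that encodes sharing between tissues. The symbols $\hat{b}$, $S$, $U$, $\tilde{\mu}$, and $\tilde{\Sigma}$ are notation introduced here for illustration only.

```latex
\hat{b} \mid b \sim \mathcal{N}(b, S), \qquad b \sim \mathcal{N}(0, U)
\;\Longrightarrow\;
b \mid \hat{b} \sim \mathcal{N}\!\left(\tilde{\mu}, \tilde{\Sigma}\right),
\quad
\tilde{\Sigma} = \left(U^{-1} + S^{-1}\right)^{-1},
\quad
\tilde{\mu} = \tilde{\Sigma}\, S^{-1}\, \hat{b}.
```

      Off-diagonal entries of $U$ let a well-powered tissue inform the estimate in a correlated, low-powered tissue, while each tissue still receives its own posterior estimate.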

      Using MASH has a pronounced effect on colocalization results. The number of non-putatively causative genes colocalizing increases from 389 to 489, while the number of putatively causative genes in our Mendelian set is unchanged, remaining at 2. The number of genes from the Backman et al. set increases from 2 to 5. Though this is a proportionally large increase, it still represents a small fraction of genes. We have updated our paper to use these results—which should be less dependent on the tissues we selected—but the message has not changed.

      5) While they highlight the roles of alternative regulatory mechanisms, few testable hypotheses are put forward for the field, which is somewhat disappointing but understandable given how little we know about the human genome at the mechanistic level.

      We have added a set of models that may explain the “missing heritability” to Table 4 in the discussion. Though we do not propose experiments, we have included citations for research relevant to confirming or disproving these models.

    2. Reviewer #3 (Public Review):

      Connally et al investigated a central question in complex trait genomics - what's the main mechanism that mediates the effects of trait-associated variants in non-coding regions, which harbour most of the signals identified by genome-wide association studies (GWAS). It is widely perceived that these variants affect trait phenotypes by regulating expression of genes in cis that are functionally relevant to the trait. The authors argue that this is not true because they find limited evidence of linking the trait-associated non-coding variants to a set of putatively causative genes that are known to cause the severe form of the complex trait. The authors discussed four possible explanations to their observations. They argue that incorrect assumptions and lack of statistical power are not likely to be critical, withhold their judgment on the biological context, and claim that the most convincible explanation is the existence of alternative regulatory mechanisms. This conclusion is very important and sobering if it is true because it will inform where to invest the most efforts in the future GWAS.

      It is an interesting idea of using genes of known roles in the "Mendelian forms" of the cognate complex traits as true positives to investigate the biology of non-coding variants. The analyses are done carefully. The discussion of the results is sharp, stands high, and provides lots of food for thought. My major comments lie in the strength of support of their results for the conclusion of "missing regulation" likely attributed to alternative regulatory mechanisms. The results presented seem to also support the biological context hypothesis that non-coding variants regulate gene expression in a tissue or cell type-specific manner.

      Major comments:

      The positive results are substantially reduced when restricting the analyses to a set of selected tissues of relevance to the trait. Isn't it implicated that the selection of relevant tissues in this study is not comprehensive, and further, tissue specificity is common in mediating genetic effects by gene expression?
      First, it seems some apparently relevant tissues are not selected (Table 2), such as bone for height (Finucane et al. 2015 NG). One approach to assess the relevant tissues for the predefined set of putatively causative genes is to see if these genes are enriched in the differentially expressed gene sets for those tissues. Second, among 84 putatively causative genes overlapped with GWAS signals, they identified 39 genes by TWAS, 11 genes by fine mapping with linear distance to chromatin modification features, and 41 genes by fine mapping with ChromHMM enhancer annotations, but these numbers reduced substantially to 9, 5 and 27 when restricting the same analysis to the selected tissues for each trait. If genes function only in the relevant tissues, I think using bulk expression data would lose power but is unlikely to give false positives. Thus, it is possible that for the traits analysed, not all relevant tissues are selected so that only a fraction of genes identified in bulk expression analysis can be replicated in the tissue-specific analysis. This appears to me a notable piece of evidence to support the hypothesis of biological context that the authors tend to have reservations in discussion.

      How much do both LD differences between GWAS and eQTL samples and the presence of allelic heterogeneity contribute to the observed low colocalization rate?
      One of their main findings is the low colocalization between trait-associated variants and eQTL in non-coding regions, which accounts for only 7% of the putatively causative genes. In discussion, the authors believe that this finding cannot be explained by lack of statistical power and is directly supported by a Bayesian analysis which reported high posterior probabilities of distinct signals for GWAS and eQTL. I agree that power is probably not a big issue. However, my concern is that given the large difference in sample size between GWAS and GTEx datasets, any small differences in LD between the two samples might cause a statistical separation of the signals even when trait phenotype and gene expression truly share a causal variant. Moreover, the presence of more than one causal variant with allelic heterogeneity in the locus may also play a part in the failure of colocalization. Consider two causal variants for the complex trait, one regulating the target gene and the other regulating another gene in co-expression. Potentially, the presence of the second causal variant would diminish the colocalization probability at the target gene.

      Perhaps the authors can perform some simulations to quantify the influence of tissue-specific expression effects, LD differences between eQTL and well-powered GWAS, and allelic heterogeneity, as discussed above, on their analyses. I understand that the authors may not be willing to do as it would involve a lot of work. But I'd like to see at least some discussion on how these questions can be better addressed in the future research.

      It looks quite striking that only 6% of the putatively causative genes are identified by TWAS with the correct effect direction. But I think this number is slightly misleading as one may interpret it as only 6% of the functionally relevant genes are regulated by trait-associated variants. In fact, 46% of the genes are detected by TWAS but only 11% are confirmed in their selected tissues, among which about half (5/9) have correct effect direction. First, the result could be limited by the selection of relevant tissues, as discussed above. Second, the fact that half of the genes do not show correct effect direction may reflect a nonlinear relationship between expression and trait, or the presence of cell-type heterogeneity within a tissue. These may not necessarily overturn the assumption that these genes are regulated by trait-associated variants in the causal tissues or cell types.
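
      The percentages in this comment follow from the gene counts quoted in the first major comment (84 putatively causative genes overlapping GWAS signals, 39 identified by TWAS, 9 in the selected tissues, 5 of those with the correct effect direction). A short check of the arithmetic, with variable names chosen here for illustration:

```python
# Counts quoted in the review.
total_genes = 84            # putatively causative genes overlapping GWAS signals
twas_any_tissue = 39        # identified by TWAS in any tissue
twas_selected_tissues = 9   # identified by TWAS in the selected tissues
correct_direction = 5       # of those 9, with the expected effect direction


def pct(count, total):
    """Percentage of `total`, rounded to the nearest whole number."""
    return round(100 * count / total)


pct_any = pct(twas_any_tissue, total_genes)
pct_selected = pct(twas_selected_tissues, total_genes)
pct_correct = pct(correct_direction, total_genes)

print(pct_any, pct_selected, pct_correct)
```

      All three percentages share the same denominator of 84 genes, so the 6% figure is a fraction of all putatively causative genes, not of the genes detected by TWAS.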

      While they highlight the roles of alternative regulatory mechanisms, few testable hypotheses are put forward for the field, which is somewhat disappointing but understandable given how little we know about the human genome at the mechanistic level.

    1. Reviewer #1 (Public Review):

      As an m6A reader, YTHDC1 is known to affect the processing of RNA post-transcriptionally and this article attempted to relate this function in splicing and nuclear export to defects in muscle regeneration after acute injury using LACE-seq. Mechanistically, they provided evidence on m6A-YTHDC1 participation in modulating splicing and target export in myoblast. Additionally, the authors preliminarily confirmed the interaction of YTHDC1 with several key RNA processing factors such as hnRNPG1 to suggest a possible mechanism for m6A-YTHDC1 regulating splicing. Overall it provides new insight into YTHDC1 function in regulating SC activation/proliferation, although some of the data could be improved to fully support the conclusions.

      1. The title "Nuclear m6A Reader YTHDC1 Promotes Muscle Stem Cell Activation/Proliferation by Regulating mRNA Splicing and Nuclear Export" seems a bit overstated. Their data are not sufficient to show YTHDC1 regulating nuclear export. From figure 6 we could see some mRNAs export was inhibited upon YTHDC1 loss but intron retention also occurs on these mRNAs, for example, Dnajc14. Since intron retention could lead to mRNA nuclear retention, the mRNA export inhibition may be caused by splicing deficiency. From the data they provided we could not draw the conclusion that YTHDC1 directly affects mRNA export. I think they should not emphasize this point in the title.

      2. The mechanism of YTHDC1 promoting muscle stem cell activation/proliferation is not solidified. The authors could strengthen their evidence through bioinformatics analysis or give more discussion. Besides, the previous work done by Zhao and colleagues (Zhao et al., Nature 542, 475-478 (2017).) reported another m6A reader Ythdf2 promotes m6A-dependent maternal mRNA clearance to facilitate zebrafish maternal-to-zygotic transition. Does YTHDC1 regulate mRNA clearance during SC activation/proliferation? The authors should explore this possibility by deep-seq data analysis and provide some discussion.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Major #1

      This study primarily uses the genetic mouse model in which LSD1 gene is inactivated after tamoxifen injection in 8 weeks old mice, as shown in supplemental figure 1 B and C. 8 weeks after birth postnatal growth of muscle is not complete and the contribution of satellite cells to muscle growth is still significant. Therefore the timing of tamoxifen injection used cannot discriminate if the observed phenotype involves the function of LSD1 during the post-natal growth of the muscle or in the muscle fibers or both. One way to demonstrate the real contribution of LSD1 in the maintenance of muscle fibers plasticity under environmental stress would be to inject Tamoxifen later (around 10-12 weeks of age), in order to remove a possible bias caused by the contribution of satellite cells during the post-natal growth. At least key findings should be confirmed at this later stage.

      In this study, we used ACTA1-CreERT mice to conditionally knockout LSD1 in the skeletal muscle. The ACTA1 promoter is derived from human muscle actin gene, which is not expressed in the satellite cells, and has been widely used for the transgene expression in myofibers (Stantzou et al. Development 2017). Thus, the inactivation of LSD1 occurs in the existing myofibers, and alterations in satellite cell function, if any, would be indirect effects of the loss of LSD1 in mature myocytes or differentiating myoblasts.

      To test whether postnatal muscle growth was affected in our LSD1-mKO mice, we administered tamoxifen (4OHT) to pre-weaning mice (11 days old). LSD1 depletion did not affect the expression of muscle fiber genes, when muscle tissues were isolated from mice 11 days after the start of 4OHT (Additional Data).

      This evidence argues against a contribution of satellite cells to the phenotypes observed in the LSD1-mKO mice. Additional Data will be added in the revised manuscript.

      Major #2

      LSD1 m-KO muscles seem to have more type I and IIA fibers than WT, even without DEX treatment. Is it possible to quantify the results in supplemental figure 4C?

      As suggested, we quantitatively analyzed the fiber type compositions in Supplemental Fig. 4C using the data from WT (n=4) and LSD1-mKO (n=5) mice (Additional Data). We did not find a significant difference between these mice, confirming our finding that the loss of LSD1 accelerates the Dex-driven phenotypic changes. Additional Data will be added in the revised manuscript.

      Major #3

      The effect on fiber type is convincing, while variations in gene expression are of quite low amplitude. However, the atrophy should be induced by other means to ensure that the effects are specific to GC/nuclear receptors pathways; Denervation? Starvation? Not all the experiments need to be repeated, just key results such as, for example, exacerbation of atrophy in LSD1 m-KO, Foxk1 increase.

      We agree that testing alternative atrophy models is important for generalizing our findings. For this, we employed a model for diabetes-related muscle atrophy. A pro-diabetic agent, streptozotocin (STZ), disturbs the function of pancreatic islets, leading to fast-fiber atrophy (O’Neill et al. Diabetes 2019). LSD1-mKO did not affect the muscle weight in STZ-treated mice (Additional Data). Consistently, there was no major difference in the expression of atrophy genes between STZ-treated WT and LSD1-mKO mice (Additional Data). These results suggest that the LSD1 function depends on the source of atrophy-inducing stress, and that the loss of LSD1 sensitizes the muscle to GC-mediated signaling. Additional Data will be added in the revised manuscript.

      Major #4

      Autophagy data: the effect on the LC3I/LC3II ratio are modest. The autophagy part should be removed or completed with additional data to convincingly show that autophagy is affected. Links between LSD1 and mTOR have been published, so the mTOR pathway could be investigated in the model (S6k, S6 and 4EBP1 phosphorylation). Given AKT levels and phosphorylation are affected by the absence of LSD1 + DEX, it can be predicted that mTOR activity will change.

      We have analyzed the expression of additional autophagy markers p62 and phosphorylated 4EBP1. Consistent with the upregulated expression of atrophy genes and increased LC3I/II ratio, LSD1-mKO mice had elevated levels of p62 and phosphorylated 4EBP1 (Additional Data). Altogether, the data suggest that Dex-induced muscle atrophy was exacerbated by the loss of LSD1. Additional Data will be added in the revised manuscript.

      Major #5

      The ability of LSD1 to retain FOXK1 in the nucleus is important information that should be better supported experimentally. In the absence of such information, no mechanism can be proposed for the effect of LSD1 on FOXK1. The immunofluorescence images provided are not convincing and moreover they could be interpreted by a reduction in the level of FOXK1 protein (degradation?) rather than by a nuclear exclusion in the presence of DEX. This point should be addressed; the authors could include western blots of nuclear and cytoplasmic fractions to better quantify the nuclear level of FOXK1 in the absence of LSD1.

      We agree that performing the suggested experiment would further enhance the quality of our study.

      Major #6

      The absence of centralized nuclei indicates that there is no fiber regeneration but it does not exclude the possibility that satellite cells were recruited to existing fibers and thus participated to hypertrophy. To eliminate this possibility, the average nuclei/cytoplasm volume should decrease if hypertrophy results from increased protein synthesis and not myonuclei accretion. This should be checked.

      We histologically analyzed the sections of Gas muscles after Dex treatment and found that there is no evidence of central nuclei in either WT or KO mice (Supplemental Fig. 4D).

      As mentioned above (Major #1), it is unlikely that the satellite cell function was responsible for the enhanced atrophic phenotype.

      Major #7

      The upregulation of ERRγ in the absence of LSD1 is convincing in the VWR conditions. ERRγ level should be evaluated in the sedentary LSD1 KO mice.

      We have analyzed the expression of ERRγ in sedentary mice, and found no significant difference between WT and KO mice (Additional Data). This suggests that the loss of LSD1 in combination with VWR training led to the increased expression of ERRγ. Additional Data will be added in the revised manuscript.

      Minor #1

      There is a clear difference in the number of mouse replicates between treated (Dex or VWR) and non-treated mice, regardless the genotype. Experiments with non-treated mice lack adequate numbers to make a definitive conclusion. For example, there is a huge spread in the data in Figure 1 B and 4 D. If the number of animals would have been increased, would the absence of difference hold up?

      We increased the number of non-treated animals in Figures 1B and 4B as suggested. Nonetheless, we did not find any significant differences in the muscle weight (Additional Data). These changes will be reflected in the original Figures 1B and 4B.

      Minor #2

      The authors claim that: "Consistent with the results of the augmented endurance capacity, the Sol muscle in the KO mice showed enhanced succinate dehydrogenase (SDH) staining, indicating that the number of oxidative fibers increased (Figure 4F and Supplemental Figure 8F)". However, supplemental figure 8 D indicates that the number of type I fibers does not change compared to WT. Authors should clarify this statement.

      Indeed, we found that the area of type I fibers, but not their number, was increased in the LSD1-KO Sol (Fig. 4D and Supplemental Fig. 8D). Because SDH staining reflects the OXPHOS capacity in all fiber types, it is possible that the OXPHOS capacity in the fibers other than type I had been augmented by LSD1-KO. Thus, for clarification, we will change the statement as follows: OXPHOS capacity of Sol was enhanced by the loss of LSD1.

      Reviewer #2

      Methods

      #1

      The authors used the Cre-lox system with tamoxifen to generate skeletal muscle-specific LSD1 KO mice. It is clear that both the mRNA and protein levels of LSD1 in various muscles were dramatically reduced, but there is still some LSD1 expressed in skeletal muscle, especially in Sol muscle (Supplemental Figure 1C). The author needs to think about whether it is appropriate to use the term "LSD1 knockout" or "LSD1 deficiency".

      We thank the reviewer for this comment. In this study, we crossed LSD1-floxed mice with ACTA1-creERT mice. This enables the deletion of critical exons of LSD1 in mature myocytes and myogenic precursors that have initiated the differentiation program. LSD1 is a ubiquitously expressed gene, and it is known that immature myogenic cells (e.g., satellite cells, Tosic et al. Nat Commun. 2018) and other non-myogenic cells such as hematopoietic and vascular cells abundantly express LSD1 (Kerenyi et al. Elife 2013, Yuan et al. Biochem Pharmacol. 2022). Thus, it is likely that LSD1 expression by these cell types was detected in our whole muscle western blots. We will add these statements in the text for clarification.

      Results

      #2

      To identify the transcriptional regulators that mediate the regulation of atrophy-associated genes by LSD1, the authors performed motif analyses on the promoter regions of upregulated genes in LSD1-mKO Gas. Based on the results and other reports, they focused on Foxk1 and proved LSD1 and Foxk1 cooperatively regulate the atrophy transcriptome in the presence of Dex. However, Figure 3C showed that a transcription factor Nfatc1 is also reduced in Sol muscle similar to Foxk1. Also, other studies demonstrated that the transcription factor NFATc1 controls fiber type composition and is required for fast-to-slow fiber type switching in response to exercise in vivo. More specifically, NFATc1 inhibits MyoD-dependent fast fiber gene promoters by physically interacting with the N-terminal activation domain of MyoD and blocking recruitment of the essential transcriptional coactivator p300 (Cell Rep. 2014 Sep 25; 8(6): 1639-1648). Furthermore, it has been reported that LSD1 Controls Timely MyoD Expression via MyoD Core Enhancer Transcription (Cell Rep. 2017 Feb 21;18(8):1996-2006. doi: 10.1016/j.celrep.2017.01.078). It is unclear how the authors exclude Nfatc1 for the LSD1-mediated effects in different muscle fibers. Further experiments may be necessary to exclude Nfatc1.

      We thank the reviewer for an insightful comment. In addition to Foxk1, we tested the involvement of NFATc1 in gene regulation under the LSD1-depleted state. We treated C2C12 cells with the LSD1 inhibitor S2101 in combination with a calcium ionophore that promotes the transcriptional function of NFATc1 by inducing its nuclear localization (Meissner et al. J Cell Physiol. 2007). While LSD1 inhibition promoted the expression of Pgc1a and Myh7, ionophore treatment had no additive effects (Additional Data). Because we found a physical association of Foxk1 with LSD1, we focused on the functional involvement of Foxk1 in LSD1-mediated repression of atrophy genes. We recently performed an ATAC-seq analysis in Dex-treated muscle, and found that the Foxk1 motif but not the NFATc1 motif was enriched in the LSD1-KO-specific open chromatin regions. These data further suggest a significant contribution of Foxk1 to transcriptional regulation under LSD1 depletion.

      #3

      In figure 3D, only merged images were colored. It would be better to show colored images for Foxk1 and DAPI.

      We will replace the images with the colored ones.

      #4

      Immunofluorescence analysis in C2C12 myotubes showed that Dex exposure reduced the nuclear retention of Foxk1, which was further promoted by the addition of T-3775440, an LSD1 inhibitor (Figure 3D). The author also used Foxk1-KO C2C12 myotubes to prove LSD1 and Foxk1 cooperation to regulate the expression of type I /IIA fiber and atrophy genes in Foxk1-KO cells. Are the effects of LSD1 dependent on Foxk1 or synergistically acting with Foxk1? The treatment of LSD1 inhibitor in Foxk1-KO C2C12 may be helpful to answer this question.

      As suggested, we will examine the combination effect of LSD1 inhibition and Foxk1-KO. In addition, we will analyze chromatin association of LSD1 in Foxk1-KO cells by ChIP experiments, to test whether the function of LSD1 depends on Foxk1.

      #5

      In supplementary figure 2, body weight in the mKO+Dex group was reduced in comparison to that of WT+Dex. How about the body weight of mKO mice without Dex injection compared to that of WT? This data will be helpful to understand the effect of muscle-specific LSD1 deficiency on whole-body energy balance.

      We measured the body weight of untreated mice, and found that there is no genotype effect (Additional Data). Thus, we think that LSD1-mKO alone does not influence the whole-body energy balance. We will include this data in the revised version.

      #6

      The authors analyzed the size distribution of myofibers and mentioned that large type I and type IIA fibers preferentially increased in the LSD1-mKO muscle, whereas large type IIB + IIX fibers decreased (Supplemental Figure 4, B, E, and F). It is better to show the results of statistics. If no significance were found, it should be mentioned in the result section.

      We have performed statistical analyses on Supplemental Fig. 4E and F, and found that a fraction of large type I fibers was significantly larger in KO mice. This result will be added in the next version.

      #7

      Page 11, To reveal the genes regulated by LSD1 under the VWR condition, the authors performed additional RNA-seq analysis using Sol muscle. The non-hierarchical clustering analysis was informative and showed signaling pathways related to ‘mitochondrion’, ‘mitochondrion organization’, and ‘oxidative phosphorylation’ were altered in the Sol muscle deficient in LSD1 under the VWR condition (Figure 5B). However, it is unclear why they focus on Err-gamma to explain LSD1-KO phenotypes in Sol muscle. Is this gene also derived from RNA seq? It is better to show whether Err-gamma expression is also significantly altered based on RNA seq data.

      Indeed, ERRg was upregulated by LSD1-KO+VWR and was included in the Cluster 6 genes together with the OXPHOS and mitochondria-related genes (Additional Data and Fig. 5A). These data prompted us to focus on ERRg as a potential factor that explains the LSD1-KO phenotype. Additional Data will be included in the revised version.

      #8

      The authors claim that LSD1 serves as an "epigenetic barrier" that optimizes fiber type-specific responses and muscle mass under stress conditions. This claim is derived from the loss of function studies. To generalize the functions of LSD1, the gain of function studies will be also necessary. Adding the characteristics of LSD1 overexpression in C2C12 cells will further improve the quality of the manuscript.

      We agree that the gain of function studies will further strengthen the quality of our manuscript. As suggested, we will perform an LSD1 overexpression experiment using C2C12 cells and analyze the expression of atrophy and fast fiber related genes. Because Esrrg is completely silenced in C2C12 cells, it is difficult to monitor ERRg-mediated gene regulation in these cells. To overcome this, we will use a cardiomyocyte cell line, in which ERRg is functionally involved in differentiation (Sakamoto et al. Nat Commun 2022). We will overexpress LSD1 in these cells and examine whether the expression of ERRg and its downstream targets are altered.

      __Discussion__

      #9

      The authors mentioned supplementary figure 10 only at the end of the manuscript of the discussion section (page 15) without a specific explanation of the figures in the result section. The data are important in that LSD1 expression in human muscles declined with age and showed a negative correlation with the expression of the atrophy gene. It should be presented in the result section with a more detailed description.

      We agree that these data are important and need further explanations. We will describe the details in the Results section and move the entire figure to the main figure.

      #10

      There are other studies examining LSD1 and muscle regeneration or function (e.g. Nat Commun 9, 366 (2018). https://doi.org/10.1038/s41467-017-02740-5). More discussion comparing the current study with these other studies will be necessary.

      We thank the reviewer for this comment. We will add the discussion accordingly.

    1. We think of the key, each in his prison Thinking of the key, each confirms a prison

      Humans are social animals for a reason. It is due to others that we have a sense of ourselves. We compare ourselves to others, and by knowing the differences between us and others, we gradually gain a sense of identity. If we are just alone, without any comparison between ourselves and other people, we can never know the defining characteristics of ourselves. There will be no sun without shadows. And by forming a prison around ourselves, we create a sense of “otherness” that forms a wall between a body of our own and our surroundings. However, as the objective world holds unlimited amounts of truths for us to perceive, Bradley argued that there is fundamentally “no difference between the inner and the outer”. All humans have access to the same amount of information, but what makes us distinct is not “any difference of kind, but only of degree”. In other words, there is an extent to which we perceive the surrounding world. There may be overlaps between the perceptions of mine and that of others, but ultimately, it is the chain of every single person that makes up the whole world. Eliot may have mentioned Bradley’s argument about self-identity to further his opinion on the continued existence of the self after death. The world is made up of a chain of identities, where as one goes away, another one spawns and fills up the spot. There may be millions and millions of overlapping areas, but they are not necessarily the same due to tiny nuances. There is that sense of continuity that transcends bodily boundaries as well.

    2. He who was living is now dead We who were living are now dying

      This section of the Waste Land may be read as a commentary on religion that works on multiple levels, created via allusion to the books of the Bible. The two key lines of this section are ‘He who was living is now dead/ We who were living are now dying’. The first line has much Biblical precedent. In the Book of Revelation, Jesus says ‘I am the first and the last. I am he that liveth, and was dead; and behold, I am alive for evermore, Amen; and have the keys of hell and of death’. The reason that Jesus used to be dead but is now living is because of the Reïncarnation. Eliot’s ‘He’ – an obvious nod to God – is a reversal. It can be read as a Jesus that was never reïncarnated, as he went simply from being alive to being dead, without the final stage, or as having been reïncarnated – so ‘living’ again – but then somehow expired following that. Whichever way we read it, however, it presents an anti-Biblical view, one in which Jesus is no longer ‘living’. The second line – ‘we who were living are now dying’ – casts ‘us’ (the question arising: is the reader included amongst the ‘we’?) as being in the long, drawn-out, almost timeless process of ‘dying’, but still somehow alive. Therefore: ‘we’ have outlived Jesus. This may be a commentary on religion itself – that is, Jesus does not exist in the modern world – or on failing attitudes towards religion, with the question on the minds of many at the time being ‘how is it possible for both Christian love to exist in the world and such deep suffering?’. Additionally, John 11.25 states ‘he that believeth in me, though he were dead, yet shall he live.’ According to Eliot, however, ‘we’ are dying, perhaps representing the decline of belief. To take this yet further, Psalm 63 begins ‘O God, thou art my God; early will I seek thee: my soul thirsteth for thee’. Therefore, God is here presented as a life-nourishing water. In Eliot’s Waste Land, ‘there is no water’.
      By contrast, to take the other interpretation: if we, however, have outlived Jesus, then what happens now with the ‘keys to hell and to death’? Are Hell and Death flung open? Is ‘Hell empty, and all the Devils here’ (another potential reference to the Tempest…)? Is that perhaps why such suffering permeates the world?

      An additional note: few of Eliot’s contemporary readers could have read the line ‘we who were living are now dying’ without thinking of John McCrae’s famous ‘In Flanders Fields’, especially those most poignant and emotive of lines: ‘We are the Dead. Short days ago/ We lived, felt dawn, saw sunset glow,/ Loved and were loved, and now we lie,/ In Flanders fields.’ To think of these lines allows us to perhaps transcend the meta-religious interpretation of this passage, to forget God, salvation, belief and faith, and merely to focus on the human: the suffering, the pain, the love lost – the Dead.

    1. The work that we make, McGann tells us, “is not the achievement of one’s desire: it is the shadow of that desire.”

      I strongly agree. Often, especially in the industries where art is concerned, what is actually made is just a "watered down" version of many desires. What people may perceive that piece of work as is not all there is to it. For example, persons may look at a painting and say "It's very pretty" while the artist himself viewed it as an entire storyline while trying to put it into object form.

    2. The distance between our wish and our object is often so great

      This is very true, given that our imaginations can often run wild and conjure almost anything. However, turning these wishes into an object may prove challenging because of many factors, such as the processes that need to be carried out.

    1. “Fake news” was actual false news: stories that were blatantly made up, written and shared by people in the US who were economically or politically motivated. Or, in some cases, by Macedonians seeking a paycheck. While the motives may vary, the product is the same: fictional stories.

      This statement was particularly interesting because it makes one wonder what the true motive for creating false news is. As stated, for some it may be a paycheck, but there must be a deeper reason, and unfortunately we may never know the answer.

    1. General comments:

      This study carefully delineates the role of magnesium in cell division versus cell elongation. The results are really important specifically for rod-shaped bacteria and also an important contribution to the broader field of understanding cell shape. Specifically, I love that they are distinguishing between labile and non-labile intracellular magnesium pools, as well as extracellular magnesium! These three pools are really challenging to separate but I commend them on engaging with this topic and using it to provide alternative explanations for their observations!

      A major contribution to prior findings on the effects of magnesium is the authors’ ability to visualize the number of septa in the elongating cells in the absence of magnesium. This is novel information and I think the field will benefit from the microscopy data shown here.

      I completely agree with the authors that we need to be more careful when using rich media such as LB. It is particularly sad that we may be missing really interesting biology because of that! It’s worth moving away from such media or at least being more careful about batch to batch variability. Batch to batch variability is not as well appreciated in microbiology as it is for growing other cell types (for example, mammalian cells and insect cells).

      For me, the most exciting finding was that a large part of the cell length changes within the first 10min after adding magnesium. The authors do speculate in the discussion that this is likely happening because of biophysical or enzymatic effects, and I hope they explore this further in the future!

      I love how the paper reads like a novel! Congratulations on a very well-written paper!

      Kudos to the authors for providing many alternative explanations for their results. It demonstrates critical thinking and an open-mind to finding the truth.

      Specific comments:

      Figure 2C → please include indication of statistical significance

      Figure 3C → please include indication of statistical significance

      Figure 6A → please include indication of statistical significance

      Figure 8B → please include indication of statistical significance

      Figure S1B → please include indication of statistical significance

      Figure S3B → please include indication of statistical significance

      For your overexpression experiments, do the overexpressed proteins have a tag? It would be helpful to have Western blot data showing that the particular proteins are actually being overexpressed. I think the phenotypes that you observe are very compelling so I don’t doubt the conclusions. Western blot data would just provide some additional confirmation that you are actually achieving overexpression of UppS, MraY, and BcrC.

      Questions:

      Based on your data, there are definitely differences in gene expression when you compare cells grown in media with and without magnesium. Because the majority of the cell length increase occurs in such a short time though (the first 10min), I was wondering if you think that some or most of it is not due to gene expression? Do you have any hypotheses about what is most likely to be affected by magnesium? Do you think the membrane may be affected?

      Why do you think less magnesium activates this program of less division and more elongation? Additionally why is abundant magnesium activating a program of increased cell division and less elongation? Do you think there is some evolutionary advantage, especially considering how important magnesium is for ATP production?

      Related to this previous question, I also wonder if this magnesium-dependent phenotype would extend to other unicellular organisms, maybe protists or algae? That would be a really exciting direction to explore!

      Regarding the zinc and manganese experiments, why do you think they lead to additional phenotypes compared to magnesium? Do you have any hypotheses?

      Regarding your results that Lipid I availability may be a major problem for cell division in the absence of magnesium, do you think that is due to effects magnesium has on the enzymes directly, or do you think magnesium affects the substrate availability/conformation by coordinating the phosphate groups? Or something else, maybe membrane conformation?

    1. Author Response

      Reviewer #1 (Public Review):

      The authors took advantage of an existing protein-trap resource in zebrafish to identify genes important for normal pacemaker function in adults. They generated a collection of lines with mutations in genes that are expressed at reasonably high levels in the heart and assessed their ECG. They identified 3 candidates with increased incidence of sinus arrest and focused on validation of dnajb6b. The dnajb6b mutant fish display other defects including enhanced response to atropine and carbachol and bradycardia. They show that dnajb6b is expressed in a subset of cells in the sinus node in zebrafish. In mouse sinus node, DNAJB6 expressing cells have low expression of TBX3 and its target HCN4. In addition, Dnajb6b+/- mice also display similar phenotypes. Analysis of pacemaker function in ex vivo mouse hearts by high-resolution fluorescent optical mapping of action potentials revealed that the number of leading pacemakers in Dnajb6b+/- hearts is decreased in the sinus node, with a concomitant increase in the auxiliary pacemakers. RNAseq analysis of the right atrial tissues detected expression changes in ion channels and genes involved in Ca2+ handling and Wnt signaling. Overall, the results support the conclusion that DNAJB6 is important for proper sinus node function, thus adding it to the short list of sick sinus syndrome genes. However, the manuscript has several weaknesses.

      Weakness:

      The manuscript does not address the mechanism by which decreased DNAJB6B causes sick sinus syndrome. For example, it is unknown if DNAJB6B functions cell autonomously or non-cell autonomously in the sinus node. The RNAseq analysis identified changes in ion channels in the right atrial tissues of 1-year old mice, but the cellular electrophysiology of the sinus node cells was not assessed.

      The main goal of this research is to prove the feasibility of discovering novel SSS genes in adults via a forward genetic approach in zebrafish. Thus, the major hallmark would be to prove causality and specificity of the candidate genes identified from this screen, such as Dnajb6. Comprehensive mechanistic study would be a focus for future studies.

      Nevertheless, we carried out the following experiments to address the mechanisms. Based on these data, a new section was added to the discussion section (Lines 424-465).

      (1) In mice, we did more antibody immunostaining and confirmed a negative correlation in terms of expression intensity between the Dnajb6 and Tbx3 proteins. We further detected a significantly increased Tbx3 immunostaining signal in the SAN tissues of Dnajb6 heterozygous mice compared to WT controls (new Figure 3D-F).

      (2) In zebrafish, we compared expression patterns of the sqET33-mi59B conduction system reporter line between the GBT411/dnajb6b heterozygous and homozygous mutants. We found the atrio-ventricular canal (AVC) signal became diffused in GBT411/dnajb6b homozygous adult hearts. In addition, the ring-like structure usually seen in the SAN region of WT controls and in the GBT411/dnajb6 heterozygous was largely lost in 3 out of 9 GBT411/dnajb6b homozygous adult hearts examined (new Figure 2).

      Together with the ectopic pacemaker activity detected in the Dnajb6 heterozygous mice (new Figure 5A and 5B), we speculate that Dnajb6 might act as a suppressor of Tbx3 transcription factor in defining cell fate specification into SAN pacemaker myocytes. Since Tbx3 was reported to suppress chamber myocardial differentiation (Mommersteeg et al., Circ Res. 2007;100(3):354-62), upregulation of Tbx3 may thus contribute to enhanced atrial ectopic activity in Dnajb6 heterozygous mice.

      Furthermore, TBX3 has been recently identified as a component of the Wnt/β-catenin-dependent transcriptional complex (Zimmerli et al., eLife. 2020;9:e58123), which is significantly affected in Dnajb6 heterozygous mice (see new Figure 7B-C). This further supports a possible role of TBX3 in both SAN and atrial remodeling.

      (3) Finally, in collaboration with Drs. Grandi, Morotti, and Ni from University of California Davis, we utilized a population-based computational modeling approach to determine the cellular/ionic mechanisms that could underlie the ex vivo observed SSS phenotype in the Dnajb6 heterozygous mice (new Figure 6). We used our previously published model of the mouse SAN myocyte (Morotti et al. Int J Mol Sci. 2021; 22(11):5645) and enhanced it with addition of both sympathetic and parasympathetic stimulations to model the effects of isoproterenol- and carbachol-induced changes in pacemaker activity (i.e., firing rate), respectively. We generated a population of 10,000 mouse SAN myocyte models by random modification of selected model parameters describing maximum ion channel conductances and ion transport rates from the baseline model and assessed isoproterenol- and carbachol-induced effects on each model variant. We then separated this population of models in two subpopulations representing the WT and Dnajb6+/- mice phenotypes: namely, we extracted the model variants that recapitulate changes observed in Dnajb6+/- vs. WT mice, including a reduced firing rate at baseline, an increased response to isoproterenol, and a decreased response to carbachol administration (new Figure 6). This filtering process resulted in n=438 models that correspond to the Dnajb6+/- mice phenotype and n=6,995 models that correspond to the WT phenotype. We analyzed the parameter value differences in these two subgroups to reveal several crucial parameters that are significantly correlated with the observed electrophysiological changes. The analysis revealed a significant decrease in the maximal conductances of the fast (Nav1.5) sodium current, the L-type Ca2+ current (ICa,L), the transient outward, sustained, and acetylcholine-activated K+ currents, the background Na+ and Ca2+ currents, as well as the ryanodine receptor maximal release flux of the Dnajb6+/- vs. WT model variants. 
We also found a significant increase in the Na+/Ca2+ exchanger (NCX) maximal transport rate, and conductances of the T-type Ca2+ current and the slowly-activating delayed rectifier K+ current. These new studies provide some novel mechanistic insights into the observed SSS phenotype in Dnajb6+/- mice. Importantly, these new in silico experiments add another conceptual level to the phenotype-based screening approach introduced in the current study to identify new genetic factors associated with SAN dysfunction. Direct testing of these mechanisms would require a substantial amount of single SAN cell patch clamp and confocal microscopy experiments which are out of scope of the current manuscript and will be pursued in a follow-up study.
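
For readers unfamiliar with the population-of-models workflow described above, the following is a minimal sketch of the generate, simulate, filter, and compare steps. It is not the published SAN myocyte model: `toy_simulate` is a placeholder with arbitrary coefficients, the parameter names `p0` to `p3` are hypothetical, and the log-normal scaling, the 10% tolerance, and the WT-like criterion are assumptions chosen only for illustration.

```python
import random
import statistics

PARAMS = ["p0", "p1", "p2", "p3"]  # hypothetical parameter names (assumption)


def make_population(n, sigma=0.2, seed=0):
    """Build n model variants by scaling each baseline parameter
    with an independent log-normal factor (assumed sampling scheme)."""
    rng = random.Random(seed)
    return [{p: rng.lognormvariate(0.0, sigma) for p in PARAMS} for _ in range(n)]


def toy_simulate(scales):
    """Placeholder for a real myocyte simulation; coefficients are arbitrary."""
    base = 5.0 * (0.5 * scales["p0"] + 0.3 * scales["p1"] + 0.2 * scales["p2"])
    return {
        "base": base,                    # baseline firing rate
        "iso_gain": 0.3 * scales["p3"],  # relative rate increase with isoproterenol
        "cch_drop": 0.3 * scales["p0"],  # relative rate decrease with carbachol
    }


def classify(result, ref, tol=0.10):
    """Label a variant by the phenotype criteria named in the text:
    lower baseline rate, larger isoproterenol response, smaller carbachol response."""
    if (result["base"] < (1 - tol) * ref["base"]
            and result["iso_gain"] > ref["iso_gain"]
            and result["cch_drop"] < ref["cch_drop"]):
        return "mutant_like"
    if abs(result["base"] - ref["base"]) <= tol * ref["base"]:
        return "wt_like"  # assumed WT-like criterion: baseline rate near reference
    return "unassigned"


def median_shift(pop, labels):
    """Median parameter scaling in mutant-like minus WT-like variants."""
    shift = {}
    for p in PARAMS:
        mut = [m[p] for m, lab in zip(pop, labels) if lab == "mutant_like"]
        wt = [m[p] for m, lab in zip(pop, labels) if lab == "wt_like"]
        shift[p] = statistics.median(mut) - statistics.median(wt)
    return shift


ref = toy_simulate({p: 1.0 for p in PARAMS})  # unscaled baseline model
pop = make_population(2000)
labels = [classify(toy_simulate(m), ref) for m in pop]
shift = median_shift(pop, labels)
print(labels.count("mutant_like"), labels.count("wt_like"), shift)
```

In a real analysis, `toy_simulate` would be replaced by the full model run under baseline, isoproterenol, and carbachol conditions, and the parameter differences between subpopulations would be tested statistically rather than summarized by a median difference. Because the toy outputs depend on the parameters by construction, the shifts printed here say nothing about the biology.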

      The manuscript does not address why the zebrafish homozygous mutants are adult viable while the mouse homozygotes are embryonic lethal. The insertion of the GBT411 disrupt dnajb6b(L) but not dnajb6b(S), while the mouse mutation deletes the entire gene. Does this difference partially explain the difference?

      Indeed, the difference between zebrafish and mouse can be partially explained by the fact that only the long isoform of the dnajb6b gene, dnajb6b(L), was disrupted in the GBT411 mutant, while both the long, Dnajb6(L), and short, Dnajb6(S), isoforms of the Dnajb6 gene were largely deleted in the Dnajb6 knockout mice. However, we think the main reason is probably functional redundancy, which exists in zebrafish but not in mouse: zebrafish has two dnajb6 homologues, dnajb6b and dnajb6a, while mouse has only one Dnajb6 homologue. We added these points to the paper (Lines 377-379).

      Reviewer #2 (Public Review):

      In this manuscript, the authors expand upon previous work describing development of a protein trap library made with the gene-break transposon. This library was screened to identify lines displaying gene trap expression in the heart (zebrafish insertional cardiac mutant collection). A pilot screen of these lines using adult ECG phenotypes identifies dnajb6b as a new gene important for cardiac rhythm. Using the GBT/dnajb6b zebrafish line, Ding et al. find that a proportion of aged homozygous mutant fish (1.5-2 years) present sinus arrest episodes and reduced heart rate. Treating GBT411/dnajb6b mutant adults with compounds revealed aberrant responses to autonomic stimuli, and sinus arrest episodes were induced following verapamil exposure, providing evidence that GBT411/dnajb6b is an arrhythmia mutant. This conclusion could be better supported by presenting specific ECG parameters to characterize the conduction defect more thoroughly. The authors then report that Dnajb6+/- adult mice recapitulate some of the phenotypes observed in zebrafish, including sinus arrest and AV blocks, as well as impaired (although different) responses to autonomic stimuli. The authors describe that these are features of sick sinus syndrome in the absence of cardiomyopathy phenotypes in either the zebrafish or mouse lines. However, overall cardiac morphology is not well described for either the GBT411/dnajb6b or Dnajb6+/- models.

      We carried out more experiments to examine left ventricular (LV) structure in Dnajb6 heterozygous mice at 1 year of age, using H&E staining, Masson’s trichrome staining, and transmission electron microscopy (TEM) analysis. We now show clearly that there are no significant myocardium structural changes in the LV as well as atrial and SAN tissues of Dnajb6 heterozygous mice (new Supplemental Figures 3 and 5), when the SSS phenotype was already noticeable. However, in the GBT411/dnajb6b heterozygous mutant at ~2 years of age, we detected severe sarcomere structural abnormality in 1 out of 3 fish hearts examined (see Response-only Figure 1). In addition, in a previous publication (Ding et al., Circ Res. 2013;112(4):606-17), we reported evident cardiac remodeling phenotypes in the GBT411/dnajb6b homozygous fish at 12 months of age.

      Together, we have obtained more experimental evidence to strengthen the claim that arrhythmia is not due to cardiomyopathy/structural remodeling in the Dnajb6+/- mice. However, the evidence from fish remains weak. Therefore, we removed the claim that “when structural remodeling/cardiac dysfunction have not yet occurred” in fish and modified our statement in mice accordingly (Lines 372-377, 385-386).

      To further support a role for Dnajb6 in sinoatrial node dysfunction, the authors performed optical mapping of action potentials from isolated mouse atrial tissue. These data reveal that Dnajb6+/- cultures exhibit ectopic pacemakers outside of the sinoatrial node, including within the atrial wall and inter-atrial septum. These data also show prolongation of SAN recovery time at baseline and following autonomic stimulation, further suggesting SAN dysfunction. RNA-sequencing experiments of DNAjb6+/- adult right atrial tissue showed differentially expressed genes encoding Ca2+ handling related proteins, ion channels, and WNT pathway related proteins. As these genes are involved in the cardiac conduction system, the authors suggest these pathways as molecular mechanisms underlying SSS phenotypes in Dnajb6 models.

      Sick sinus syndrome is a relatively rare arrhythmia most commonly found in older populations. Therefore, it has been challenging to establish clinically relevant models and there is a limited understanding of mechanisms of SSS pathogenesis. One particular strength of this manuscript is the ECG phenotype-based forward screen of the gene-breaking transposon (GBT)-based gene trap library in aged animals. This pilot study provides proof-of-concept that this screening approach is well suited to identify regulators of cardiac function in adults and genes linked to adult diseases like SSS.

      Thank you very much for recognizing the major strength of our manuscript!

    1. Why are Georgians Nostalgic about the USSR? Part 1 Several surveys in recent years suggest that close to half of the Georgian public considers the dissolution of the USSR a bad thing. After nearly 30 years since gaining independence, why do so many Georgians look back with nostalgia towards the Soviet Union? Reasons for Soviet nostalgia in other contexts are usually associated with how people experienced transition from state socialism to capitalism. The economic hypothesis explaining nostalgia argues that a perception of being part either “a winner” or “a loser” of the transition is associated with nostalgic feelings towards the Soviet Union. Other hypotheses introduce politics into the equation. According to this explanation, those who reject democracy on ideological grounds are more likely to be nostalgic as are those who think that democratic institutions are too feeble in delivering state services. Are these explanations true for Georgian Ostalgie? This series of blog posts explores these and other potential explanations to Soviet nostalgia.The 2019 Caucasus Barometer survey asked respondents whether the dissolution of the USSR was a good or a bad thing, as well as the reasons why. Respondents were considered nostalgic if they reported that the dissolution was a bad thing. However, it is worth keeping in mind the exact wording of the question when reading the analysis. Overall, 42% of the public think that the dissolution of the USSR was a bad thing, and a statistically indistinguishable share (41%) report it was good, leaving about 16% who were not sure.When it comes to why it was a bad thing, by far, the most common reason is that respondents believe that people’s economic situation has worsened. And they’re not necessarily wrong.Georgia had a particularly difficult economic transition during independence. 
Overall purchasing power is much higher today than before the transition, however, it only recovered to pre-transition levels in 2006 according to World Bank data.At the same time, average purchasing power hides the high levels of economic inequality in Georgia. Inequality increased from an estimated GINI of 0.313 in 1988 to 41.3 in 1998. In 2018, it stood at 37.9 according to the World Bank data. Concomitantly social services were cut.This likely explains why a majority of respondents that are nostalgic report that the economic situation has worsened to explain why they think the dissolution of the Soviet Union was a bad thing. The fact that some respondents directly cite a lower number of workplaces as a reason for believing that the dissolution was a negative thing, attests to this. The second most common reason is related to the conflicts that followed independence and the lost territories.What sets nostalgic Georgians apart? A logistic regression model looking at attitudes towards democracy, Russia, political party preferences, and a number of demographic measures suggests a number of characteristics. Age is an important predictor, with older people being considerably more nostalgic.Education also appears important, as individuals with more education are less likely to be nostalgic. Wealth has a less clear role, appearing only slightly relevant for overall attitudes, and more relevant when we look at those citing economic reasons for their attitude. This suggests that those who regret the dissolution of the USSR are those who suffered the most during the transition. This also suggests that as the economy improves and newer generations come of age, nostalgia towards the USSR may decline.While age, education, and wealth are relevant, they are not the only factors. Attitudes towards democracy and towards Georgia’s orientation to Russia also seem to separate nostalgics from non-nostalgics. 
Those who believe that Georgia should forego NATO and EU membership in favor of closer ties to Russia, as well as those who think that Georgia is not a democracy and that democracy is not necessarily the best form of government, are more likely to also believe that the dissolution of the USSR was a negative thing. Similar patterns emerge when disaggregating the reasons for nostalgia, with wealth being more relevant for those who mentioned the worse economy as a reason for nostalgia. Interestingly, feeling close to a particular political party does not seem to be relevant for these attitudes, once other factors are held constant. One exception is when looking at identity-related reasons for the attitudes. Respondents who feel close to pro-western opposition parties are less likely to believe that the dissolution of the USSR was a bad thing because ties with other nationalities became less common, travel to other former Soviet Republics became harder, or people now judge each other because of their identity. Ethnic minorities in Georgia are more likely to report these reasons than ethnic Georgians. Nostalgia towards the USSR seems to be primarily related to an individual’s experience of the transition, and their current attitudes towards democracy and Russia. This connection might suggest that skepticism towards democracy and the West is related to individuals’ experiences of the transition. However, more direct analysis of attitudes towards democracy is needed to test this idea.
      This blog post is about why Georgians feel nostalgic towards the Soviet Union. The author offers several reasons to illustrate the Georgian nation's positive attitude towards the USSR. The foremost reason for this goodwill is how the Georgian people experienced the transition from a socialist system to a capitalist one. The blog notes that Georgians were hit hard by the dissolution of the Soviet Union, since their economic situation worsened.
      The second reason is that the dissolution of the Soviet Union negatively affected Georgia's domestic political life. The blog mentions the conflicts that followed independence and the loss of state territories. It is also noteworthy that it is mainly the older generation that misses the USSR. The new generation, by contrast, understands through education why there is no legitimate basis for nostalgia for the Soviet Union.
      My attitude towards this issue is clear. Since I belong to the young generation of Georgians who were born and live in independent Georgia, I have not the slightest desire to miss the Soviet Union, that evil empire which ruled small nations and countries in a totalitarian manner. I am amazed that anyone can mourn that despotic regime, which kept people in constant terror and dealt brutally with any expression of free thought. I think education plays the greatest role in this matter. An erudite person assesses events correctly and is therefore less likely to feel nostalgia for the USSR.
      
    1. the reason is that apperception 00:10:38 is kind of perceptual in structure and the buddhist world encodes this by arguing that the internal um sense the manas-vijñāna is a sense faculty just like external faculties 00:10:52 and so just as our external faculties present us with a world that just seems to us even though we know it's not to be just as it is that we see it just as it is 00:11:03 it's tempting to think that we've got this apparent object distinct from our sensory apprehension of it but is but an object that's presented by a completely veridical process 00:11:15 because as i say perception just feels like it presents the world to us as it is i look at a red apple and i think damn i know exactly what that apple smells like looks like tastes like and 00:11:27 feels like forgetting that all i have is the apple as it's mediated by the peculiar perceptual system that i have and by all of the conceptual resources through which i filtered my perception 00:11:41 so in the same way apperception or introspective awareness just feels like it presents our own cognitive affective and perceptual states to us just as they are 00:11:53 independent of that apperceptive system and those conceptual categories so just as external perception gives us the illusion that we're just detectors of the world as it is inner perception can give us the illusion that we are just 00:12:06 detectors of our inner um our inner world just as it is so even when we remind ourselves as i'm reminding you right now of this 00:12:18 extremely complex mediation of our perceptual encounter with external objects we find ourselves constantly experiencing our own experience as though 00:12:31 we've got the world just as it is and then we sometimes say okay maybe we're not getting the world just as it is but at least i'm getting my sensory experiences just as they are the apple might not be red but the redness i 00:12:42 experience is exactly the redness that i think i experience the 
sweetness that i introspect must be the sweetness just as it is and so forth so even if we give up for a moment and it's hard to give it up 00:12:54 for more than that the notion of immediacy with regard to external perception we often retreat to thinking that that's mediated but my awareness of my own inner episodes is the immediate 00:13:06 awareness that mediates my knowledge of the external world and i think that in the case of apperception that sense of immediacy is even greater it's really hard for us to be convinced that our inner experience 00:13:20 could possibly be deceptive we seem to think that if i think that i believe something i must believe it if i think that i'm feeling something i must be feeling it and that feeling and that believing grab my inner 00:13:33 reality just as it is and so part of the problem that arises is that the mediation of our introspective awareness by our introspective faculty becomes 00:13:46 cognitively invisible to us just as when i'm seeing the world my visual faculty is invisible and it just delivers a visible world to me and i have to really think to understand 00:13:58 what my own visual faculty visual organ and visual consciousness are contributing i think i experience my introspective faculty as just giving me inner objects and i have to think and remind myself 00:14:11 that actually my inner sense faculty is also a fallible instrument and that i may be misusing that instrument or that instrument might be intrinsically deceptive and that's a hard thing to get one's mind around 00:14:25 as a consequence we've become seduced by this idea that even if our knowledge of some things is mediated that mediation can't go all the way down we get seduced by the idea that there's got to be a 00:14:38 basic foundational level of experience to which we can have some kind of immediate access and to which when we know it we know it absolutely veridically in the theory of knowledge that leads us to foundationalism in the 
00:14:51 philosophy of mind it leads us to sense datum theory um and i find that in a lot of buddhist situations a lot of buddhist practitioners take it to be this idea of an infallibility of an immediate kind of 00:15:03 experience if i'm sitting on the cushion just right so with all of that in play um i want to move to exorcising that myth of the given that i've been characterizing 00:15:16 and to show that buddhist philosophy offers us powerful ways of doing that and i'm going to begin by talking about first person knowledge through the lens of the madhyamaka tradition

      Jay emphasizes how compelling this allure of immediacy is. We believe that our perceptual and introspective faculties give us an infallible representation of reality, and we never consider that they could be fallible.

      This is very much aligned with the research on Umwelt by Jakob von Uexküll.

      Apperception, the introspective awareness of our inner space, is just as alluring.

      So in summary: perception gives us the feeling that we are sensing the external world as it actually is, and apperception gives us the feeling that we are aware of the inner world as it is. However, both are relative: the first to our peculiar sense faculties and the second to our linguistic and conceptual modeling of reality. Both are filters that create the specific, situated interpretation of reality of a human being.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2022-01536

      Corresponding author(s): Michael Glotzer


      1. General Statements

      We thank the reviewers for their thoughtful and helpful comments. In general, the reviews were highly positive, although they indicated parts of the manuscript that needed further clarification. We have made extensive changes that improve the clarity and rigor of this submission. We have performed several additional experiments, which extend our analysis in several ways detailed below. None of the conclusions have changed.

      The following is a list of eight major changes implemented during the revisions. Point-by-point responses to the reviewers' comments follow on subsequent pages.

      1. The reviews made clear that we needed to discuss the AIR-1 depletion phenotype more explicitly. This phenotype is complex: it does not result in a complete loss of asymmetry, unlike, for example, depletion of the centrosome component SPD-5. This is because, in AIR-1 depleted embryos, a PAR-2 and cortical flow-dependent pathway induces PAR-2 accumulation at both anterior and posterior poles that induces flows from each pole to the lateral region (Reich 2019, Kapoor 2019, Zhao 2019, Klinkert 2019; PMIDs 31155349, 31636075, 30861375, 30801250). These flows also modulate ECT-2 localization. To clarify this point, which came up in multiple reviews, we now include an explanation of the complexity of the AIR-1 phenotype and we present an analysis of ECT-2 localization in embryos depleted of both AIR-1 and PAR-2.

      2. In addition to the 95% confidence intervals that were present on our graphs, we now indicate the results of statistical significance tests comparing the different treatments.

      3. We have revised the analysis of ECT-2 accumulation in two ways. First, in the previous draft, we assessed the anterior accumulation over the anterior 40% and the posterior accumulation over the posterior 15% of the embryo. We have revised this analysis to compare the anterior and posterior 20% of the cortex, respectively. This is simpler and more logical in contexts where embryos are symmetric. Second, we altered the measurements of the length of the posterior boundary. Previously we used a common threshold value, below which we counted pixels to assess boundary length. During the revisions, we noticed that this value was not appropriate for our mutant transgenes, which accumulated to higher levels. Therefore, we revised our analysis pipeline such that, for each embryo, we measure the average intensity of the cortex in the anterior 60% of the embryo. We set a threshold of 0.85 × this average anterior intensity value. As before, cortical positions below this threshold contribute to the boundary length. This is a more robust and simpler means of evaluating the size of the posterior domain. Neither of these changes affects any of our conclusions, but they are simpler and more rigorous.

      4. Most of our figures include quantification of the degree of ECT-2 asymmetry as well as the average anterior and posterior accumulation of ECT-2 as a function of time. While the images show the intensity profiles across the embryo, we did not previously show an explicit quantification of the average intensity of ECT-2 as a function of position along the embryo. A new graph, Figure 2Bv, shows this for control embryos and embryos in which tubulin is depleted and depolymerized. This shows that MT depolymerization results in lower accumulation at the posterior of the embryo and higher accumulation at the anterior.

      5. We provide documentary and quantitative evidence that ZYG-9 depletion induces potent cortical flows (Figure 3c and Figure 3, supplement 3), further bolstering the central role of cortical flows in inducing ECT-2 asymmetry.

      6. As requested by reviewer 2 (R2b), we have included the analysis of ECT-2 distribution in Gα depleted embryos. As expected due to the lack of spindle elongation, the displacement of ECT-2 from the posterior cortex is greatly attenuated.

      7. As requested by reviewer 2 (R2d), we now show that ECT-2C fragments accumulate on the cortex in embryos depleted of ECT-2.

      8. One other important point raised by several reviewers concerns the behavior of the ECT-2 T634E allele. This allele, due to the substitution of a phosphomimetic residue, accumulates on the cortex at about 50% the level of the wild-type version. To investigate the possibility that this quantitative difference was the cause of the phenotype, we depleted both the wild-type and mutant ECT-2 constructs by RNAi (these are the sole sources of ECT-2 in the animals). First, we find that wild-type ECT-2 can be depleted to 20% of wild-type levels with only a 13% rate of cytokinesis failure (when T634E is depleted to 20%, embryos fail more than 50% of the time). Thus the two-fold reduction in cortical ECT-2 seen in T634E is not likely to be highly significant (ECT-2 is not haploinsufficient). In addition, embryos with ECT-2 T634E initiate ingression in a timely manner, but the furrows ingress more slowly than wild-type. In contrast, depletion of ECT-2 to 20% results in a delay in furrow initiation, but once these furrows form, they ingress at rates similar to wild-type. Thus, the T634E variant exhibits a behavior that is quite distinct from that resulting from a (strong) reduction in the levels of wild-type ECT-2.
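
      The revised ECT-2 quantification described in the list above (anterior versus posterior 20% of the cortex, and a per-embryo threshold of 0.85 times the mean intensity of the anterior 60%) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the representation of the cortex as a one-dimensional intensity profile ordered from anterior to posterior, and the reporting of boundary length as a count of sampled positions are all assumptions made for illustration.

```python
import numpy as np

def ect2_asymmetry_and_boundary(cortex, end_fraction=0.20,
                                anterior_fraction=0.60, threshold_factor=0.85):
    """Hypothetical helper, not the published analysis code.

    cortex: 1D sequence of cortical intensities, assumed to be ordered
    from the anterior pole (index 0) to the posterior pole (last index).
    Returns (anterior/posterior ratio, threshold, boundary length in samples).
    """
    cortex = np.asarray(cortex, dtype=float)
    n = cortex.size

    # Compare the anterior-most and posterior-most 20% of the cortex.
    n_end = max(1, int(round(end_fraction * n)))
    anterior_mean = cortex[:n_end].mean()
    posterior_mean = cortex[-n_end:].mean()
    ratio = anterior_mean / posterior_mean

    # Per-embryo threshold: 0.85 x the mean intensity of the anterior 60%.
    n_ref = max(1, int(round(anterior_fraction * n)))
    threshold = threshold_factor * cortex[:n_ref].mean()

    # Cortical positions below the threshold contribute to boundary length.
    boundary_length = int(np.count_nonzero(cortex < threshold))
    return ratio, threshold, boundary_length

# Synthetic profile: uniform anterior signal, depleted posterior 20%.
profile = [10.0] * 80 + [4.0] * 20
print(ect2_asymmetry_and_boundary(profile))
```

      Because the threshold is computed from each embryo's own anterior intensity, a transgene that accumulates to higher overall levels is judged against a proportionally higher cutoff, which is the motivation given above for abandoning a common threshold value.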

      Point-by-point description of the revisions


      (Reviewer comments: italicized 9 pt font, author response: plain text 10 pt font. Numbers have been added to the reviewer comments e.g. R2c=Reviewer 2, third comment)

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      R1a *In this study the authors addressed how Ect2 localization is controlled during polarization and cytokinesis in the one-cell C. elegans embryo. Ect2 is a central regulator of cortical contractility and its spatial and temporal regulation is of utmost importance. After fertilization, the centrosome induces removal of Ect2 from the posterior plasma membrane. During cytokinesis Ect2 activity is expected to be high at the cell equator and low at the cell poles. Similarly to polarization, the centrosome provides an inhibitory signal during cytokinesis that clears contractile ring components from the cell poles. Whether and how the centrosomes regulate Ect2 localization is not known and is investigated in the study. *

      This is an accurate summary of the goals of this study.

      R1b *The authors start by filming endogenously-tagged Ect2 and find that Ect2 localizes asymmetrically, with high anterior and low posterior membrane levels during polarization and cytokinesis. They reveal that the centrosome together with myosin-dependent flows results in asymmetric Ect2 localization. Previous studies had suggested that Air1 clears Ect2 from the posterior during polarization, and the authors expand those findings by showing that Air1 function is also required to displace Ect2 from the posterior membrane during cytokinesis. *

      *To elucidate if Ect2 displacement is induced by phosphorylation of Ect2 by Air1, the authors investigate the localization of a C-terminal Ect2 fragment containing the membrane binding PH domain. When the predicted Air1 phosphorylation sites are mutated to alanine, the Ect2 fragment still localizes asymmetrically but exhibits increased membrane accumulation. *

      *Finally, they investigate the functional role of Air-1 during furrow ingression. They demonstrate that embryos deficient in Air1 and NOP1 have impaired furrow ingression. Lastly, the authors sought to confirm that there is a direct effect of Air1 on Ect2 function by generating a phosphomimetic point mutation of Ect2 using CRISPR. They find that the membrane localization of phosphomimetic Ect2 is reduced and consequently furrow ingression is impaired. *

      This is an accurate summary of our results.

      Major comments

      R1c *It is not convincing that the six putative phosphorylation sites are targeted by Air1. If Air1 phosphorylation displaces Ect2 from the membrane, a reduction in Ant/Post Ect2 ratio is expected in the phosphodeficient mutants, like after air1 RNAi. However, this is not observed for cytokinesis or polarization (Fig. 5D(i); E). This suggests that phosphorylation of those sites is not essential for the asymmetric Ect2 localization. *

      In otherwise wild-type embryos, phosphorylation of these sites is not required for asymmetric ECT-2 localization. Non-phosphorylatable ECT-2 variants exhibit asymmetric localization because these proteins relocalize due to myosin-directed flows. To test the role of phosphorylation, we examine the distribution of ECT-2 and ECT-2C fragments in myosin-depleted embryos in which the flows are blocked. Under these conditions, transient local depletion is observed with the phosphorylatable variants (Fig. 5E).

      While AIR-1 promotes normal polarity establishment, as shown in several recent papers, cortical changes nevertheless occur in the absence of AIR-1. Specifically, a parallel PAR-2 dependent pathway induces weaker flows from both poles toward the equator. To further substantiate the effect of PAR-2 accumulation on ECT-2 accumulation in AIR-1 depleted embryos, we assayed ECT-2 accumulation in air-1(RNAi); par-2(RNAi) embryos (Figure 4, supplement 2). These results show that ECT-2 is nearly symmetric in these doubly depleted embryos. In addition, we have edited the text to describe the unusual bipolar PAR-2 accumulation that occurs in AIR-1 depleted embryos.

      R1d *The authors aim to demonstrate that phosphorylation of the identified sites is important for cytokinesis. For this they investigate contractile ring ingression in the phosphomimetic point mutation. Since ring ingression is slower and fails in the nop1 mutant, the authors conclude that this demonstrates a functional importance of this site. I am not surprised that embryos ingress slower in this mutant since Ect2 localization to the membrane is reduced. This, however, does not show that this phosphorylation site is the target of the centrosome signal. Importantly, the authors would need to demonstrate that Rho signaling, and thus Ect2 activity, is increased at the poles when phosphodeficient Ect2 is the only Ect2 in the embryo. *

      The fact that a phosphomimetic residue at this site leads to reduced membrane localization is highly relevant, as we suggest that phosphorylation of this site contributes to the mechanism by which AIR-1 generates asymmetric ECT-2. Given the role of AIR-1 in regulating polarity, a version of ECT-2 that cannot be phosphorylated would be predicted to be dominant lethal, necessitating a conditional expression strategy which does not currently exist in the early C. elegans embryo system (indeed, we were unable to recover a T→A allele at this site, despite extensive efforts). To avoid this issue, we used a viable, fertile, hypomorphic allele that is predicted to be less responsive to AIR-1 activity. The goal of this experiment was to evaluate whether the putative AIR-1 sites affect not only the NOP-1 pathway for furrow ingression, but also furrowing that is centralspindlin-dependent.

      To complement this finding, we performed experiments in which ECT-2 was partially depleted. We used RNAi to partially deplete ECT-2 and ECT-2 T634E and measured the total embryo fluorescence of each ECT-2 variant and the kinetics of furrow ingression. Partial depletion of wild-type ECT-2 to ~20% of control levels leads to a delay in furrow formation, and all but 2/18 (11%) of embryos complete cell division. In contrast, a similar depletion of ECT-2 T634E results in a failure of furrow ingression in ~52% of embryos. Furthermore, while ECT-2 T634E embryos initiate furrowing with normal kinetics, they exhibit a slower rate of furrow ingression. In contrast, partial depletion of wild-type ECT-2 results in a delay in furrow initiation, but once initiated, the rate of furrow ingression is not significantly affected. These results demonstrate that the behavior of ECT-2 T634E cannot simply be explained by a modest reduction in membrane binding.

      R1e *The authors use the Aurora A inhibitor MLN8237: It was shown prior (De Groot et al., 2015) that this inhibitor is not highly specific for Aurora A, and that it also inhibits Aurora B. Thus experiments need to be repeated with MK5108 or MK8745. They should also be conducted during polarization. Why does Aurora A inhibition not abolish asymmetry? That would be expected? *

      The role of AIR-1 in symmetry breaking during polarization has been previously published, including with chemical inhibitors (Reich 2019, Kapoor 2019, Zhao 2019, Klinkert 2019; PMIDs 31155349, 31636075, 30861375, 30801250). ECT-2 localization depends on both the spatial regulation of AIR-1 activity and the distribution of cortical factors that contribute to ECT-2 cortical association, as a result of cortical flows. During acute chemical perturbation of AIR-1, it is likely that these factors, which were polarized prior to drug treatment, remain polarized, allowing the residual cortical ECT-2 to remain asymmetric. The reviewer is correct about the specificity of MLN8237, and we do not rely on it alone to demonstrate the role of AIR-1. Rather, this experiment is a complement to our AIR-1 depletion studies, which are sufficient to establish specificity. We present this experiment merely to show that AIR-1 acutely regulates ECT-2 during cytokinesis in embryos that were entirely unperturbed during polarization.

      R1f *There is no statistical analysis of the results in the entire study. For all claims stating a change in Ant/Post Ect2 ratio or Ect2 membrane localization selected time points should be statistically compared: for example the main point of Fig.1 is that Ect2 becomes more asymmetric during anaphase. Thus a statistical analysis of the Ect2 ratio at anaphase onset (t=0s) and eg. t=90 s after anaphase onset should be performed; or Fig. 3A nop-1 mutant Ant/Post Ect2 ratio during polarization: again statistical analysis of control and nop-1 mutant embryos is needed at a particular time point. *

      All of the graphs were presented with the mean of ~10 embryos per condition and included the 95% confidence intervals. In the revised manuscript, we have included tests of statistical significance at each time point. While non-overlapping confidence intervals generally suggest statistical significance, we include these analyses on the graphs because it can be difficult to assess statistical significance when the confidence intervals overlap.

      R1g *The aim of Fig. 2B is to demonstrate that Ect2 localization is independent of microtubules; however, they still observe some microtubules with the Cherry-tubulin marker, and those are even very close to the membrane and therefore could very well influence Ect2 on the membrane. Therefore I am not convinced that this experiment rules out a role for microtubules in regulating Ect2 localization. *

      We do not exclude that microtubules play a contributing role in ECT-2 phosphoregulation, but rather we conclude that the primary cue is the centrosome. Indeed, microtubules can play an important role in controlling spindle positioning which affects the proximity of the centrosome to the cortex.

      The manuscript states, “Despite significant depletion of tubulin and near complete depolymerization of microtubules (Figure 2B, insets), we observed strong displacement of ECT-2 from a broad region of the posterior cortex during anaphase (Figure 2B).” Thus, despite dramatic reductions in microtubules, not only does ECT-2 become polarized, it becomes hyperpolarized. In contrast, were microtubules directly involved in ECT-2 displacement, one would expect a reduction in polarization as a result of microtubule depolymerization. Conversely, though SPD-5 depleted embryos contain far more microtubules than embryos in which microtubule assembly is suppressed, ECT-2 is not polarized in SPD-5 depleted embryos. Thus in the manuscript, we conclude, “Collectively, these studies suggest that ECT-2 asymmetry during anaphase is centrosome-directed.” This conclusion is well supported by the results shown.

      R1h *Throughout the paper the authors should tone down their statement that Air1 breaks symmetry by phosphorylating Ect2, since phosphorylation of Ect2 by Air1 is not shown. *

      We agree with this comment and will make the necessary edits to the text. Indeed, this is the reason why we had included the final section in our original draft, “Limitations of this study”, which makes this point explicitly.

      R1i *I understand that the establishment of Ect2 asymmetry is important for polarization. However, how does asymmetric Ect2 localization result in more active Ect2 at the cell equator, which is required for the formation of the active RhoA zone? Would we not expect an accumulation of Ect2 at the cell equator, or if that is not the case more active Ect2 at the equator versus the poles? *

      The pseudocleavage furrow forms as a result of the anterior enrichment of active RHO-1 and its downstream effectors. There is no evidence for a local accumulation of active RHO-1 specifically at the site of the pseudocleavage furrow. Rather, this furrow forms at the boundary between the portion of the embryo where RHO-1 is active and the posterior of the embryo where RHO-1 is far less active (Figure 1 Supplement 2). We suggest that aster-directed furrowing during cytokinesis likewise results from asymmetric accumulation of the same components, without them necessarily being specifically enriched solely at the furrow.

      While cytokinesis generally involves an equatorial contractile ring, furrow formation can be driven by an asymmetric - i.e. non-equatorial - accumulation of actomyosin. This behavior is exemplified during pseudocleavage during which the entire anterior cortex is enriched for actomyosin and the posterior is depleted of myosin (Figure 1 Supplement 2). Several published studies provide evidence that the asymmetric pattern of myosin accumulation contributes to cytokinesis (PMID 22918944, 17669650).

      Minor comments

      R1j *Can the authors explain why the quantification of Ant/Post Ect2 ratio in control embryos differs in different figures? For example: in Fig. 1D i) a slight increase of Ect2 asymmetry ratio is seen at around 80 s after anaphase onset. In comparison, in Fig. 2C (i) this increase is not obvious. Are those different genetic backgrounds? *

      In Figure 1D, time 0 is anaphase onset, whereas in Figure 2C, time 0 is the time of nuclear envelope breakdown (NEBD). The duration between NEBD and anaphase onset is ~130 s, and an increase in ECT-2 polarization is observed at 220 s post NEBD, i.e., 90 s post anaphase onset, comparable to that seen in Figure 1D.

      R1k *One key point of the paper is that myosin-dependent cortical flows amplify Ect2 asymmetry during polarization and cytokinesis. During polarization the data is convincing, however during cytokinesis Ect2 ratio is only slightly decreased after nmy-2 depletion, again is this decrease even significant? *

      Figure 3 supplement 1 shows a significant difference in ECT-2 asymmetry between control and myosin-depleted embryos.

      R1l *In the introduction: "Centralspindlin both induces relief of ECT-2 auto-inhibition and promotes Ect2 recruitment to the plasma membrane" it should be added 'Equatorial' membrane, since Ect2 membrane binding is, to my knowledge, not compromised in centralspindlin mutants or in Ect2 mutants that cannot bind centralspindlin. *

      Generally speaking, the reviewer is correct that cortical accumulation of ECT-2 globally is centralspindlin independent. However, as seen in e.g. ZYG-9 depleted embryos, ECT-2 is recruited to the posterior cortex in a centralspindlin-dependent manner. Thus centralspindlin can promote ECT-2 accumulation to the cortex and the site of that accumulation will be dictated by the position of the spindle midzone.

      R1m *Labels in the figures are often very small (e.g. Fig. 1 ii-v) and difficult to read. In addition, it is easier for the reader if the proteins shown in the fluorescent images are also labeled in the figure (e.g. Fig. 2B add NG-Ect2). *

      These useful suggestions have been incorporated.

      R1n *In Materials and Methods, it should be mentioned which IPTG concentration was used. *

      The IPTG concentration (1 mM) has been added to the revised text.

      R1o *The authors speculate that the Air1 phosphorylation sites in the Ect2 PH domain prevent binding to phospholipid due to the negative charge. At the same time, the authors propose that the PH domain binds to a more stable protein on the membrane, which is swept along with the cortical flows, and they propose anillin could be that additional binding partner. I might miss something, but do the authors suggest Ect2 has two binding partners: anillin and the phospholipids? It would be necessary to explain this better. *

      *The authors should test if anillin represents the suggested myosin II dependent Ect2 anchor. For this they should check if Ect2 localization to the membrane is altered upon anillin RNAi. *

      This summary of our model is largely correct, though we do not know the identity of the more stable cortical anchor(s). While we suspect the PH domain binds to a phospholipid, ECT-2 cortical localization also requires ~100 residues C-terminal to the PH domain. It is likely that this domain interacts with a cortical component.

      In preliminary experiments, ECT-2 accumulation is not strictly anillin-dependent. However, functional redundancy may obscure a contribution of anillin. Anillin was mentioned simply because of the evidence for a physical interaction between ECT-2 and anillin (Frenette, PMID 22514687). In the revised manuscript we also include the possibility that ECT-2 accumulation involves one or more anterior PAR proteins. The identity of the cortical anchor(s) is an interesting question for future studies. We consider this question beyond the scope of the current manuscript.

      R1p *The title of fig. 3 does not fit the statement the authors want to make, since the key point is how Ect2 polarization is affected and not membrane localization in general. *

      Thank you for this suggestion. The title has been changed to “Cortical flows contribute to asymmetric cortical accumulation of ECT-2”

      R1q *In Fig 4A/C, after air1 depletion the authors observe a reduction in Ect2 asymmetry. Why are the centrosomes not marked in the figures? Because they cannot be detected? The authors would also need to show that the mitotic spindle and centrosomes are not altered by air1 RNAi in the zyg9 mutant. Otherwise the observed effect might be indirect. *

      Centrosomes are perturbed by depletion of AIR-1 (Hannak, PMID 11748251), but they are still detectable and their positions will be added to figure 4. As has been extensively demonstrated, AIR-1 depletion does lead to attenuated spindles and defects in spindle assembly, some of which are also seen in TPXL-1-depleted embryos. These consequences of AIR-1 depletion do complicate the analysis, but this is typical of factors that regulate many processes. This is one of the key reasons why we used ZYG-9 depletion in combination with AIR-1 depletion to overcome these indirect effects.

      R1r *The authors state that tpxl-1 depletion attenuates Ect2 asymmetry, this is not seen in the quantification ((Fig. 4B(i)). The main phenotype they observe is that Ect2 levels on the membrane increase (Fig. 4 (ii) and (iii). They go on testing the function of tpxl1 by depleting tpxl1 in the zyg9 mutant, where the centrosomes are close to the posterior cortex. Here they see no effect on Ect2 asymmetry. Based on that they conclude that tpxl1 has no role in this process. To me this finding is not surprising since the centrosome is close the cortex in zyg9 mutant embryos. Therefore sufficient amounts of active Air1 could reach the membrane and displace Ect2. Thus an amplification of the inhibitory signal by tpxl1 on astral microtubules might not be required. The authors need to mention this possibility and tone down their statment (also in the discussion) that tpxl1 is not required for this process. *

      In the text, we state, “Cortical ECT-2 accumulation is enhanced by TPXL-1 depletion, though the degree of ECT-2 asymmetry is unaffected (Figure 4B).… we observed robust depletion of ECT-2 at the posterior pole in zyg-9 embryos depleted of TPXL-1, but not AIR-1 (Figure 4C). We conclude that while AIR-1 is a major regulator of the asymmetric accumulation of ECT-2, the TPXL-1/AIR-1 complex does not play a central role in this process.” We consider this to be an accurate description of the results. In sum, we have found no evidence that TPXL-1 contributes to generating ECT-2 asymmetry, beyond its well established role in regulating spindle length and position. There are several other processes that are known to be AIR-1 dependent and TPXL-1 independent; these primarily involve the centrosome (Ozlu, PMID 16054030). Given that TPXL-1 associates with astral microtubules, the fact that microtubule depletion can enhance ECT-2 asymmetry also argues against a requirement for TPXL-1.

      R1s *It was shown that the C-terminus of Ect2 is sufficient and the PH domain is required for Ect2 membrane localization in C. elegans (Chan and Nance, 2013; Gomez-Cavazos et al., 2020). Papers should be cited. *

      Thank you for this helpful comment. Chan and Nance 2013 indeed shows that the ECT-2 C-term is sufficient to localize to the cell cortex. In contrast, the Gomez-Cavazos paper (PMID 32619481) shows in figure S2 that the PH domain is required for cortical localization of ECT-2; this paper does not focus extensively on cortical accumulation of ECT-2. We have cited Chan and Nance in the revised manuscript.

      R1t *The authors find that nmy-2 depletion results in loss of asymmetry for the Ect2 C-term and Ect2 3A fragment during polarization. Why is the same experiment not shown for cytokinesis? *

      Strong depletion of NMY-2 prevents polarity establishment, resulting in symmetric spindles, which in turn results in symmetric ECT-2 accumulation. Thus, the requested experiment would not provide significant additional information.

      R1u *Air1 is targeted to GFP-C-term Ect2 fragment via GFP-binding to determine the influence on GFP-C-term Ect2 localization (Fig. 5F). They state that they see a reduction of Ect2 C-term but not of C-term 3A after targeting. The reader has to compare Fig. 5D with F. Since the differences are not big, they need to compare the Ect2 C-term and Ect2 C-term 3A with and without Air1 targeting in the same graph (plus statistics). Otherwise this statement is not convincing. *

      It is not straightforward to directly compare ECT-2C in the presence and absence of GBP-mCherry-AIR-1, because the GBP:AIR-1 fusion protein recruits a large fraction of ECT-2C to the centrosome. For this reason we think it is best to compare the behavior over time of ECT-2C and ECT-2C3A in the presence of GBP-mCherry-AIR-1. At the onset of anaphase, these two fragments localize similarly, but they then diverge over time.

      R1v *In Fig. 6A the authors determine the contribution of air1 to furrowing. For this they deplete air1 in the nop1 mutant. According to previous studies, air1 mutants have a monopolar spindle. How can the authors analyze the function of air1 in cytokinesis when the spindle is monopolar? Did the authors do partial air1 depletion? They authors need to show that there is not major effect on the spindle and centrosome for their conditions. For comparison air1(RNAi) alone has to be included, otherwise the experiment is not conclusive. *

      AIR-1 depletion does not result in a monopolar spindle in C. elegans embryos, though the spindle is attenuated and disorganized (PMID 9778499). TPXL-1 depletion also results in short, well organized spindles (PMID 19889842). These concerns are the reason we performed the ZYG-9 depletion experiments in Figure 4C, to ensure the centrosomes are proximal to the cortex.

      R1w *Upon air1(RNAi) in the nop1 mutant NMY2 intensity seems decreased and not increased. Can the authors comment on that, since that is opposite of what is expected. *

      This is expected as previous studies have shown that NOP-1 contributes to RHO-1 activation during polarization and cytokinesis (Tse, PMID 22918944). (NOP stands for No Pseudocleavage).

      R1x *In Fig 6B they introduce a phosphomimetic point mutation in S634 [sic, T634] in the endogenous Ect2 locus. It not clear why the authors chose this site out of the six putative sites and why they only chose one and not 3 or 6 sites? This needs some explanation. *

      In our early work with ECT-2 transgenes, we found that a T634E mutation strongly affected cortical ECT-2C, so we decided to assess its effect on the function and localization of endogenous ECT-2. While we were able to recover a T634E variant, we were not able to recover a T634A variant, despite considerable effort. Based on these experiences, we anticipated that we would be unable to recover a mutant version of ECT-2 in which all sites were changed to phosphomimetic.

      R1y *In the model (fig. 7) no astral microtubules are shown during pronuclear meeting and metaphase. Astral microtubules are present at this stage and should be added to the schematic. *

      MTs will be added to the figure.

      Reviewer #1 (Significance (Required)):

      R1z *The centrosomes inhibit cortical contractility during polarization and cytokinesis in the one-cell C. elegans embryo. Centrosome localized Air1 was proposed to be part of this inhibitory signal, however the phosphorylation target of Air1 is not known. The identification of Ect2 as a phosphorylation target of Air1 would be a great advancement in the field. However, the presented manuscript lacks convincing data that Ect2 is the phosphorylation target of Air1 during polarization and cytokinesis. *

      We explicitly acknowledge that we have not directly shown that AIR-1 phosphorylates ECT-2. However, we have shown that (i) AIR-1 inhibits cortical ECT-2 localization, (ii) the negative regulator of AIR-1, SAPS-1, promotes AIR-1 cortical accumulation, (iii) the cortical localization domain of ECT-2 has putative AIR-1 sites, which, when mutated to non-phosphorylatable residues, lead to increased cortical accumulation of ECT-2, (iv) phosphomimetic residues at these sites reduce its cortical accumulation, and (v) these AIR-1 sites are required to render GFP-ECT-2C responsive to GBP-AIR-1. For these reasons we feel that our data make a strong, albeit indirect, case that AIR-1 regulates ECT-2, even though we clearly acknowledge that we do not show that AIR-1 directly phosphorylates ECT-2.

      Direct proof would require the demonstration that AIR-1 phosphorylates ECT-2 in vivo. This would be difficult to show as ECT-2 phosphorylation is likely transient, it likely affects only a subset of the total ECT-2 pool, and it likely results in loss of membrane association of ECT-2. As it is not possible to synchronize C. elegans embryos, biochemical analysis would be very difficult. Even a phosphospecific antibody for the putative ECT-2 phosphosites might not be particularly informative, as it would be predicted to give a diffuse cytoplasmic signal.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      R2a* In this work, Longhini and Glotzer investigate the localization of an essential regulator of polarity and cytokinesis, RhoGEF ECT-2, in the one-cell C. elegans embryo. The authors show that centrosome localized Aurora A kinase (AIR-1 in C. elegans) and myosin-dependent cortical flows are critical in asymmetric ECT-2 accumulation at the membrane. Since membrane interaction of ECT-2 is dependent on the Pleckstrin homology domain present at the C-terminus of ECT-2, they further analyzed the importance of putative AIR-1 consensus sites present in this domain. The authors linked the relevance of these sites in controlling ECT-2 localization and its significance on cytokinesis. The manuscript is well written, the work is interesting, and the data quality is high. *

      We thank the reviewer for their critique.

      Major comments:

      R2b - In Fig. 2, the authors claim that the centrosomes and the position of the mitotic spindle are critical in regulating the asymmetric enrichment of ECT-2 at the membrane. To test the relevance of spindle positioning on ECT-2 localization, the authors depleted PAR-3 and PAR-2. The authors observed that the ECT-2 asymmetry is affected in these settings. However, PAR-3 or PAR-2 depletion impacts polarity, which is critical for many cellular processes, including spindle positioning. Can the authors try to specifically misposition the spindle without affecting polarity? For instance, by depleting Galpha/GPR-1/2 and assessing the impact of such depletion on ECT-2 localization.

      We thank the reviewer for this good suggestion. We have performed the suggested experiment (presented in Figure 2, supplement 2). As one might predict, ECT-2 starts out polarized as Gα is not required for polarity establishment. During anaphase, ECT-2 becomes more symmetric in Gα depleted embryos as compared to wild-type.

      R2c *-I wonder why the intensity of ECT-2 at the anterior and posterior membrane decreases in air-1(RNAi) post anaphase onset (Fig. 4A)? Moreover, I fail to observe a significant asymmetric distribution of ECT-2 in embryos depleted for PERM-1. Therefore it appears that the difference between DMSO and MLN8237-treated embryos is not substantial (at least in the images)? *

      We do not have a complete or rigorous explanation for all the changes in cortical ECT-2, but they are highly reproducible. We speculate that there are cell cycle regulated changes in ECT-2 accumulation, in addition to its regulation by AIR-1. For example, in figure 1, a strong reduction in both anterior and posterior cortical ECT-2 is evident beginning at approximately -350 sec, which may reflect the initial stages of Cdk1 activation. This may result from cell cycle regulated modulation of ECT-2, as there is evidence that mammalian ECT-2 is subject to a very potent inhibition of membrane association by Cdk1 (PMID 22172673). Alternatively, there could be cell cycle modulation of the cortical factor that serves as the “co-anchor” of ECT-2. The ability of GBP-AIR-1 to induce GFP-ECT-2C dissociation also appears cell cycle regulated.

      Consistent with a cell cycle regulated component, note that NEBD is delayed in AIR-1 depleted embryos (PMID 17669650, 17419991, 30861375). This delay results in a shorter interval between NEBD and e.g. the peak in Cdk1 activation, explaining the earlier decrease in AIR-1(RNAi) embryos vs. control, relative to NEBD.

      Our quantitative analysis indicates a significant increase of cortical ECT-2 upon treatment with MLN8237. In addition, the quantitation in the previous version did show a significant polarization of ECT-2 in PERM-1-depleted embryos prior to treatment. We have revised this figure to simply show an acute increase in cortical ECT-2 upon drug treatment, as the focus of this experiment was solely to show that ECT-2 cortical accumulation is acutely responsive to chemical inhibition during cytokinesis in otherwise normal embryos.

      *-The data in Fig. 5 and 6 are exciting but raise a few concerns: *

      R2d *a). The authors show that ECT-2C localization mimics the localization of endogenous tagged ECT-2. However, all these analyses with ECT-2C and various mutants are performed in the presence of endogenous ECT-2. Can the author check the localization of these mutant strains in conditions where the endogenous proteins are depleted? I understand that the cortical flow would be perturbed in conditions where endogenous ECT-2 is depleted. However, I suspect that one can analyze the anaphase-specific distribution. *

      We have examined ECT-2C localization in embryos depleted of ECT-2. Cortical localization of ECT-2C is not dependent upon endogenous ECT-2. This result is now shown in figure 5 supplement 1. However, as the reviewer suggested, embryos depleted of ECT-2 do not show a high degree of ECT-2C asymmetry as ECT-2 is required for the cortical flows that amplify the symmetry breaking during polarization. During cytokinesis, ECT-2C does show a modest change in localization at the poles; the extent of the polar reduction is limited and the changes are symmetric as ECT-2 displacement causes spindles to be symmetrically positioned and limits their elongation during anaphase.

      R2e *b). Can the author comment on why ECT-2C does not accumulate at a similar level as ECT-2C(3A or 6A) at the cell membrane when AIR-1 is depleted (compare Fig. 5D with Supplemental Fig. 5)? *

      When ECT-2C(3A or 6A) are expressed in otherwise wild-type embryos, embryo polarization occurs, resulting in anterior-directed flows that concentrate the factor(s) that enable the anterior enrichment of ECT-2 (and ECT-2C 3A/6A). By contrast, when AIR-1 is depleted, most embryos exhibit a “bipolar” phenotype in which PAR-2 is recruited to both anterior and posterior poles, and the actomyosin network becomes somewhat concentrated laterally (PMID 30801250, 30861375, 31636075). The differential positioning of the actomyosin network in AIR-1 depleted embryos is likely responsible for the interesting difference that the reviewer points out. This section of the results states, “Nevertheless, these variants accumulated in an asymmetric manner. ECT-2C asymmetry temporally correlated with anteriorly-directed cortical flows (Figure 5 D,E), raising the possibility that asymmetric accumulation of endogenous ECT-2 drives flows that cause asymmetry of the transgene, irrespective of its phosphorylation status.”

      R2f *c). Does the cortical localization of the ECT-2C(6A) mutant become symmetric upon further depletion of AIR-1? Of course, if the asymmetric distribution of ECT-2C(6A) is dependent on the presence of endogenous protein in the cellular milieu, the point raised earlier will help address this concern. *

      We have not performed this exact experiment with ECT-2C-3A though we have performed it with a longer ECT-2 C-terminal fragment (aa 559-924). As expected, due to the considerations described above, the asymmetry of ECT-2C-3A is reduced when AIR-1 is depleted. Likewise, ECT-2C-6A becomes symmetric when endogenous ECT-2 is depleted due to the dependence of its asymmetry on cortical flows, as discussed above.

      In the revised manuscript, we provide additional explanation of the AIR-1 depletion phenotype which will explain the origin of the asymmetric distribution of ECT-2.

      R2g *d). The authors predict that the AIR-1 mediated phosphorylation delocalizes ECT-2 from the polar region of the cell cortex. Since the posterior spindle pole is much closer to the posterior cortical region, the delocalization is much more robust at the posterior cell membrane. I wonder why targetting AIR-1 at the membrane (GBP-mCherry-AIR-1) does not entirely abolish GFP-ECT-2C membrane localization? Can the author include the localization of GBP-mCherry-AIR-1 in the data? Also, do we know for sure if GBP-mCherry-AIR-1 is kinase active? *

      The GBP-mCherry-AIR-1 transgene was obtained from the Gönczy lab, which demonstrated that it has some activity (PMID 30801250). Given that centrosomal AIR-1 (as compared to astral AIR-1) is the primary pool of AIR-1 responsible for modulating cortical ECT-2 levels, it is not clear that the GBP-fused form of AIR-1 is as active as the centrosomal pool of AIR-1; indeed we suspect it is significantly less active, similar to the manner in which TPXL-1/AIR-1 appears less active towards ECT-2 than centrosomal AIR-1. Indeed as the reviewer suggests, were this pool of AIR-1 highly active, we would expect that its cortical recruitment would preclude embryo polarization, and this transgene would cause lethality when expressed with a GFP-tagged cortical protein. These concerns notwithstanding, we do observe a specific reduction in the anterior accumulation of ECT-2C as compared to ECT-2C3A, suggesting that this form of the kinase has some ability to modulate ECT-2C.

      Co-expression of GFP-ECT-2C with GBP-mCherry-AIR-1 induces the centrosomal/astral accumulation of GFP-ECT-2C, which is highly visible in the figure and not seen in the absence of GBP-mCherry-AIR-1. Not surprisingly, the co-expression also induces a cortical pool of GBP-mCherry-AIR-1 that is not seen in the absence of GFP-ECT-2C. These redistributions indicate formation of the complex between GFP-ECT-2C and GBP-mCherry-AIR-1. The mCherry-AIR-1 images could be added as insets to the figure, but in our opinion, they would not make a substantive contribution, given the dramatic accumulation of centrosomal GFP-ECT-2C.

      R2h *e). The authors show that centrosomal enriched AIR-1 [spd-5(RNAi)], but not the astral microtubules localized AIR-1 [tpxl-1(RNAi)], is vital for ECT-2 membrane localization. Interestingly, the authors showed that AIR-1 acts in the centralspindlin-directed furrowing pathway (Fig. 6A). I wonder if the authors can combine NOP-1 depletion with TPXL-1 depletion? I guess this will further help to exclude the function of TPXL-1 in the centralspindlin-directed furrowing pathway. *

      We would like to clarify that our data indicates that AIR-1 acts on both the centralspindlin-independent furrowing (e.g. the anterior furrow in 4C), as well as centralspindlin-dependent furrowing (Figure 6).

      While the experiment the reviewer proposes appears simple in theory, the interpretation is potentially a bit more complex, due to the role of TPXL-1 in spindle elongation, which can affect centralspindlin-directed furrowing. That said, there are two published experiments and one experiment in the manuscript that indicate that centralspindlin dependent furrowing can occur in TPXL-1 depleted embryos. First, Lewellyn et al. showed that while tpxl-1(RNAi) embryos furrow, tpxl-1(RNAi); zen-4(RNAi) embryos do not, suggesting centralspindlin can function in the absence of TPXL-1. Second, the same paper shows that embryos doubly depleted of TPXL-1 and GPR-1/2 exhibit multiple furrows. Our previous work has shown that furrowing in Galpha-depleted embryos is centralspindlin dependent (Dechant and Glotzer). Furthermore, in the current manuscript we found that embryos depleted of both TPXL-1 and ZYG-9 form posterior furrows (8/8 embryos, 6/8 furrows were strong furrows) although the appearance of these furrows is delayed, presumably due to the reduction in spindle elongation due to TPXL-1-depletion. As described in the manuscript, these posterior furrows have been previously shown to be centralspindlin dependent and NOP-1 independent.

      In accordance with these results, and in direct response to the reviewer’s specific suggestion, we do observe furrowing in nop-1(it142); TPXL-1(RNAi) embryos (10/10 embryos furrow, 9/10 complete cytokinesis). Thus, all of the available results indicate that TPXL-1 is largely dispensable for centralspindlin dependent furrowing. However, the role of TPXL-1 in centralspindlin-dependent furrowing is not a focus of the manuscript, thus we do not favor including this result, as it distracts from the primary focus of the study.

      R2i *f). Why do NMY-2-GFP cortical levels appear lower in 30% of the embryos that show various degrees of cytokinesis defects (Fig. 6A)? *

      There are a number of possible origins of the variability. As shown in (Reich 2019, Kapoor 2019, Zhao 2019, Klinkert 2019, PMID 31155349, 31636075, 30861375, 30801250), AIR-1 depletion results in variable polarization (unpolarized PAR-2, bipolarized PAR-2, anterior PAR-2, posterior PAR-2). Furthermore, spindles in AIR-1 depleted embryos exhibit somewhat variable positioning. While we were unable to correlate these sources of variability with furrow formation, these results demonstrate that AIR-1 depletion impairs furrowing directed by centralspindlin, which was not entirely expected, given that (i) AIR-1 depletion potently suppresses NOP-1 dependent flows of cortical myosin, as evidenced by the loss of an anterior furrow in AIR-1(RNAi); nop-1(it142) embryos and (ii) centralspindlin directed furrowing can occur in the posterior in ZYG-9 depleted embryos both in the presence or absence of AIR-1 (Figure 4C).

      R2j *g). The authors report that phosphomimetic mutation at the phospho-acceptor residue in ECT-2 impacts its cortical accumulation. This strain, together with NOP-1 depletion, affects furrow ingression. One explanation for this phenotype is that phosphomimetic mutant weakly accumulates at the membrane. However, one interesting observation is that ECT-2T634E enriches at the central spindle (Fig. 6B, panel 120 sec), which somehow I could not find in the text. Could this additional localization of ECT2 at the central spindle contribute to the cytokinesis defects that the authors have observed? The microscopy images the authors have included show that ECT-2T634E significantly localizes at the equator at the time of furrow initiation. Can the authors add the localization of ECT2 wild-type and ECT-2T634E in NOP-1 depleted conditions where they see an apparent impact on the cytokinesis? Similarly, if the authors include the localization of NMY-2 in these conditions-it will further add more weightage to the data. *

      We regularly detect trace amounts of ECT-2 on the central spindle and this is slightly enhanced in the ECT-2T634E mutant. However, given the large cytoplasmic pool of ECT-2, it seems unlikely that the slight enrichment of ECT-2 on the central spindle significantly affects the cortical pool of ECT-2, though the reduction in cortical ECT-2 may facilitate its enrichment on the central spindle.

      As shown in figure 3B, depletion of NOP-1 does not dramatically affect cortical ECT-2 levels in wild-type embryos. Likewise, we did not observe a significant effect of NOP-1 depletion in ECT-2 T634E, thus we decided not to include this negative result.

      As discussed in general point 8, we suggest the modest reduction in the membrane pool of ECT-2 is unlikely to be the primary cause of the T634E phenotype, but rather the ability of AIR-1 to induce its relocalization. Consistent with this interpretation, the embryos that failed ingression tended to have more symmetric spindles, which could limit the residual cortical flows that facilitate furrow ingression.

      Minor comments:

      R2k -An explanation of how the timing of NEBD was analyzed in multiple settings would be helpful.

      Depending on the experiment, we used either ECT-2:mNG fluorescence (it is excluded from the nucleus until NEBD) and/or the Nomarski images to score NEBD.

      R2l *-The authors mentioned on p. 6-'Despite significant depletion of tubulin.....during anaphase'. These experiments are performed in the near complete depolymerization of microtubules; thus, regular anaphase will not establish. I understand that the authors are monitoring localization wrt the timing similar to anaphase in the non-perturbed condition, and thus a bit of change in the sentence is required. *

      Thank you for highlighting this point. We have substituted “following mitotic exit” for “anaphase”. In these images, mitotic exit can be scored by the emergence of contractility.

      R2m*-After testing the relevance of SPD-5 (that primarily acts on PCM and not on centrioles)-the authors write on p. 6 that 'two classes of explanation...early embryo'. I did not understand the importance of this sentence here. *

      To clarify, we deleted the words “classes of” from the sentence in question and, following that sentence, we added the word “first”, indicating that we were explaining the first of the two possible explanations.

      R2n*-The observed impact of spd-5 (RNAi) on ECT-2 localization could be because of the effects of SPD-5 depletion on centrosomal AIR-1? The authors can link the impact of SPD-5 depletion not only with the centrosome but also with AIR-1 in the discussion. *

      Indeed, it is well established that SPD-5 is required for centrosomal AIR-1 (Hamill DR, et al., Dev Cell 2002). The revised discussion now states, “Specifically, during both processes, ECT-2 displacement requires the core centrosomal component SPD-5, which is required to recruit AIR-1 to centrosomes (Hamill et al., 2002), but ECT-2 displacement is not inhibited by depolymerization of microtubules and it does not require the AIR-1 activator TPXL-1 (see below).”

      R2o-In the various Figure legends, sometimes the authors mention time '0' as anaphase, and other time as anaphase onset.

      In all cases, anaphase onset was intended and the legends will be corrected.

      Reviewer #2 (Significance (Required)):

      R2p *The manuscript is well written, the work is interesting, and the data quality is of good quality. *

      We thank the reviewer for their encouragement as well as for their thoughtful critique!

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      R3a* Symmetry breaking is the process by which uniformity of the system is broken. Many biological systems, such as the body axes establishment and cell divisions in embryos, undergo symmetry breaking to pattern cellular interior design. C. elegans zygote has been a classic model system to study the molecular mechanism of symmetry breaking. Previous studies demonstrated critical roles of centrosomes and microtubules in breaking symmetry in the actin cytoskeleton during anterior-posterior polarization and cytokinesis. It, however, remains elusive how centrosomes and/or microtubules regulate the assembly and contractility of the actin cytoskeleton. Recent reports identified Aurora-A AIR-1 as the key centrosomal kinase that suppresses the function of the actin cytoskeleton, but little is known about a substrate of the kinase during symmetry breaking events. *

      Longhini and Glotzer proposed in this manuscript that RhoGEF ECT-2 plays a critical role in symmetry breaking of the actin cytoskeleton under the control of AIR-1 kinase. Kapoor and Kotak (2019) previously proposed the same GEF as a downstream effector of centrosomes, but this work did not provide direct evidence for ECT-2 as the AIR-1 effector. This manuscript identified three putative phospho-acceptor sites in the PH domain of ECT-2 that render ECT-2 responsive to inhibition by AIR-1. Although this manuscript lacks direct in vivo and in vitro evidence for phosphorylation of ECT-2 by AIR-1 kinase, the above findings reasonably support a model where in AIR-1 promotes the local inhibition of ECT-2 on the cortex. Design of the experiments, the quality of images, and data analysis are reasonable, and the main text was written very well. The main conclusion of this work will attract many readers in cell and developmental biology fields. I basically support its publication in the journals supported by Review Commons with minor revisions (see below).

      We thank the reviewer for their encouraging remarks and helpful comments.

      Minor comments

      R3b 1) In Figures 2A and 2B, the authors claimed apparent correlation between spindle rocking and ECT-2 displacement. However, because both MTs and ECT-2 in Fig2AB images are blur, I cannot convince myself whether ECT-2 intensities on the cortex showed negative correlation with the distance between the posterior centrosome and the cortex. The authors may want to provide quantitative data set and use a statistical test to support this conclusion.

      Only figure 2A focuses on the rocking. The important structure to assess is the position of the centrosome, as the astral arrays of microtubules are largely radially symmetric (except towards the spindle midzone). At this point in the manuscript we were not discriminating between the astral microtubules and the centrosomes, rather focusing on the overall position of the aster as a whole. Figures 2B, 2D, Fig 2 Supplements 1 and 2, Fig 3C, and Fig 4B, summarized in figure 7A, provide quantitative evidence that the centrosome-cortex distance is an important determinant of ECT-2 cortical accumulation.

      R3c *2) Figure 2D would [sic; presumably should] show a ratio between the anterior/posterior pole and the lateral cortex. *

      The reviewer is presumably noticing that the lateral cortex is brighter than the poles when PAR-3 is depleted. While we agree with this assessment, the point of this experiment was to evaluate whether both centrosomes are equally capable of regulating cortical ECT-2 at the respective poles. It appears to us that comparing the anterior and posterior poles is the appropriate measurement to make to address this point and comparison of the poles to the lateral cortex in par-3(RNAi) vs control would be confusing to readers.

      R3d *3) In Figure 3D, the authors need to clarify why they measured ECT-2 dynamics only within the "anterior pole". It would be reasonable to measure ECT-2 dynamics by FRAP and cortical high-speed live imaging on the posterior and the lateral cortex during symmetry breaking. *

      We measured ECT-2 recovery at a variety of sites with similar recovery kinetics. The comparison of ECT-2 dynamics on anterior and posterior furrows were shown in order to compare ECT-2 dynamics on centralspindlin-dependent and -independent furrows.

      We now provide additional supplemental data on ECT-2 dynamics during symmetry breaking. When ECT-2 is polarized, the residual signal is too low to obtain a measure of its recovery.

      R3e 4) In Figure 4 supplement, a difference between with or without ML8237 seems marginal. The authors need to show a statistical test to claim "rapid enhancement of cortical ECT-2 after ML8237 treatment".

      We will provide a statistical analysis. As the inhibitor affects ECT-2 globally, the anterior/posterior ratio doesn’t change significantly. To avoid confusion, we now present total cortical ECT-2 levels upon anaphase onset in this experiment as this is the most relevant parameter.

      R3f *5) I would strongly suggest the authors to clearly state in the first paragraph of discussion that "this working hypothesis is not supported by direct evidence for phosphorylation of ECT-2 by AIR-1 kinase in vitro and in vivo." It should be reasonable to weaken the statement "by Aurora A-dependent phosphorylation of the ECT-2 PH domain" in p13. *

      We agree with the underlying sentiment (as indicated by the “limitations” section that was present in the original version) and we have revised these sentences accordingly: “Our studies suggest that asymmetric, posteriorly-shifted, spindle triggers an initial focal displacement of ECT-2 from the posterior cortex by Aurora A-dependent phosphorylation of the ECT-2 PH domain, though the evidence for this phosphorylation event is indirect.”

      Reviewer #3 (Significance (Required)):

      *See the second paragraph of the Evidence, Reproducibility, and Clarity section. *

    1. Author Response

      Reviewer #1 (Public Review):

      In one of the most creative eDNA studies I have had the pleasure to review, the authors have taken advantage of an existing program several decades old to address whether insect declines are indeed occurring - an active area of discussion and debate within ecology. Here, they extracted arthropod environmental DNA (eDNA) from pulverized leaf samples collected from different tree species across different habitats. Their aim was to assess the arthropod community composition within the canopies of these trees during the time of collection to assess whether arthropod richness, diversity, and biomass were declining. By utilizing these leaf samples, the greatest shortcoming of assessing arthropod declines - the lack of historical data to compare to - was overcome, and strong timeseries evidence can now be used to inform the discussion. Through their use of eDNA metabarcoding, they were able to determine that richness was not declining, but there was evidence of beta diversity loss due to biotic homogenization occurring across different habitats. Furthermore, their application of qPCR to assess changes in eDNA copy number temporally and associate those changes with changes to arthropod biomass provided support to the argument that arthropod biomass is indeed declining. Taken together, these data add substantial weight to the current discussion regarding how arthropods are being affected in the Anthropocene.

      Thank you very much for the positive assessment of our work.

      I find the conclusions of the paper to be sound and mostly defensible, though there are some issues to take note of that may undermine these findings.

      Firstly, I saw no explanation of the requisite controls for such an experiment. An experiment of this scale should have detailed explanations of the field/equipment controls, extraction controls, and PCR controls to ensure there are no contamination issues that would otherwise undermine the entirety of the study. At one point in the manuscript the presence of controls is mentioned just once, so I surmise they must exist. Trusting such results needs to be taken with caution until such evidence is clearly outlined. Furthermore, the plate layout which includes these controls would help assess the extent of tag-jumping, should the plate plan proposed in Taberlet et al., 2018 be adopted.

      Second, without the presence of adequate controls, filtering schemes would be unable to determine whether there were contaminants and also be unable to remove them. This would also prevent samples from being filtered out should there be excessive levels of contamination present. Without such information, it makes it difficult to fully trust the data as presented.

      Finally, there is insufficient detail regarding the decontamination procedures of equipment used to prepare the samples (e.g., the cryomill). Without clear explanations of the steps the authors took to ensure samples were handled and prepared correctly, there is yet more concern that there may be unseen problems with the dataset.

      We are well aware of the potential issues and consequences of contamination in our work. However, we are also confident that our field and laboratory procedures adequately rule out these issues. We agree with the reviewer that we should expand more on our reasoning. Hence, we have now significantly expanded the Methods section outlining controls and sample purity, particularly under “Tree samples of the German Environmental Specimen Bank – Standardized time series samples stored at ultra-low temperatures” (lines 303-304), “Test for DNA carryover in the cryomill” (lines 448-464) and “Statistical analysis” (lines 570-575).

      We ran negative control extractions as well as negative control PCRs with all samples. These controls were sequenced along with all samples and used to explore the effect of experimental contamination. With the exception of a few reads of abundant taxa, these controls were mostly clean. We report this in more detail now in the Methods under “Sequence analysis” (lines 570-575). This suggests that our data are free of experimental contamination or tag jumping issues.

      We have also expanded on the avoidance of contamination in our field sampling protocols. The ESB has been set up for monitoring even the tiniest trace amounts of chemicals. Carryover between samples would render the samples useless. Hence, highly clean and standardized protocols are implemented. All samples are only collected with sterilized equipment under sterile conditions. Each piece of equipment is thoroughly decontaminated before sampling.

      The cryomill is another potential source of cross-contamination. The mill is disassembled after each sample and thoroughly cleaned. Milled samples have already been tested for chemical carryover, and none was found. We have now added an additional analysis to rule out DNA carryover. We received the milling schedule of samples for the past years. Assuming samples get contaminated by carryover between milling runs, two consecutive samples should show signatures of this carryover. We tested this for single-taxon carryover as well as community-wide beta diversity, but did not find any signal of contamination. This gives us confidence that our samples are very pure. The results of this test are now reported in the manuscript (Suppl. Fig 12 & Suppl. Table 3).
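
      The logic of such a carryover test can be illustrated with a minimal sketch. This is not the authors' analysis: the function names, the use of Jaccard dissimilarity, the permutation scheme, and the toy data are all assumptions made for illustration. If carryover occurred, samples milled consecutively should be more similar to each other than samples paired at random.

```python
import random

def jaccard_dissimilarity(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of detected taxa."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def carryover_permutation_test(samples, n_perm=2000, seed=0):
    """samples: list of taxon sets, ordered by (hypothetical) milling schedule.

    Returns the mean dissimilarity of consecutive pairs and a one-sided
    permutation p-value for "consecutive samples are unusually similar".
    """
    rng = random.Random(seed)

    def mean_consecutive(order):
        d = [jaccard_dissimilarity(x, y) for x, y in zip(order, order[1:])]
        return sum(d) / len(d)

    observed = mean_consecutive(samples)
    shuffled = list(samples)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the milling order
        if mean_consecutive(shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (invented): each sample shares two taxa with the next one,
# mimicking strong carryover between consecutive milling runs.
contaminated = [{i, i + 1, i + 2} for i in range(12)]
observed, p_value = carryover_permutation_test(contaminated)
print(observed, p_value)
```

      With real data, a large p-value from a test of this kind would be consistent with the absence of carryover, which is the outcome the authors describe.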

      Reviewer #2 (Public Review):

      Krehenwinkel et al. investigated the long-term temporal dynamics of arthropod communities using environmental DNA (eDNA) remaining in archived leaf samples. The authors first developed a method to recover arthropod eDNA from archived leaf samples and carefully tested whether the developed method could reasonably reveal the dynamics of arthropod communities where the leaf samples originated. Then, using the eDNA method, the authors analyzed 30-year-long well-archived tree leaf samples in Germany and reconstructed the long-term temporal dynamics of arthropod communities associated with the tree species. The reconstructed time series includes several thousand arthropod species belonging to 23 orders, and the authors found interesting patterns in the time series. Contrary to some previous studies, the authors did not find widespread temporal α-diversity (OTU richness and haplotype diversity) declines. Instead, β-diversity among study sites gradually decreased, suggesting that the arthropod communities are more spatially homogenized in recent years. Overall, the authors suggested that the temporal dynamics of arthropod communities may be complex and involve changes in α- and β-diversity and demonstrated the usefulness of their unique eDNA-based approach.

      Strengths:

      The authors' idea of using eDNA remaining in archived leaf samples is unique and potentially applicable to other systems. For example, different types of specimens archived in museums may be utilized for reconstructing long-term community dynamics of other organisms, which would be beneficial for understanding and predicting ecosystem dynamics.

      A great strength of this work is that the authors very carefully tested their method. For example, the authors tested the effects of powdered leaves input weights, sampling methods, storing methods, PCR primers, and days from last precipitation to sampling on the eDNA metabarcoding results. The results showed that the tested variables did not significantly impact the eDNA metabarcoding results, which convinced me that the proposed method reasonably recovers arthropod eDNA from the archived leaf samples. Furthermore, the authors developed a method that can separately quantify 18S DNA copy numbers of arthropods and plants, which enables the estimations of relative arthropod eDNA copy numbers. While most eDNA studies provide relative abundance only, the DNA copy numbers measured in this study provide valuable information on arthropod community dynamics.

      Overall, the authors' idea is excellent, and I believe that the developed eDNA methodology reasonably reconstructed the long-term temporal dynamics of the target organisms, which are major strengths of this study.

      Thank you very much for the positive assessment of our work.

      Weaknesses:

      Although this work has major strengths in the eDNA experimental part, there are concerns in DNA sequence processing and statistical analyses.

      Statistical methods to analyze the temporal trend are too simplistic. The methods used in the study did not consider possible autocorrelation and other structures that the eDNA time series might have. It is well known that the applications of simple linear models to time series with autocorrelation structure incorrectly detect a "significant" temporal trend. For example, a linear model can often detect a significant trend even in a random walk time series.
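
      The reviewer's point about random walks can be made concrete with a small simulation. The sketch below is illustrative only: the series length of 30, the number of simulations, and all function names are assumptions, and the fixed critical value is the two-sided 5% point of Student's t with 28 degrees of freedom. It fits an ordinary least-squares trend to trendless series and counts how often the naive t-test calls the slope "significant".

```python
import math
import random

T_CRIT_DF28 = 2.048  # two-sided 5% critical value, Student's t, 28 d.f. (n = 30)

def slope_t_statistic(y):
    """t-statistic of the OLS slope of y regressed on time 0..n-1."""
    n = len(y)
    mx = (n - 1) / 2
    my = sum(y) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (yi - my) for x, yi in enumerate(y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * x) ** 2 for x, yi in enumerate(y))
    return slope / math.sqrt(rss / (n - 2) / sxx)

def white_noise(rng, n=30):
    """Independent observations with no trend."""
    return [rng.gauss(0, 1) for _ in range(n)]

def random_walk(rng, n=30):
    """Cumulative sum of independent steps: autocorrelated, but no true trend."""
    level, series = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        series.append(level)
    return series

def false_positive_rate(make_series, n_sim=2000, seed=1):
    """Fraction of trendless series in which the naive test rejects at 5%."""
    rng = random.Random(seed)
    hits = sum(abs(slope_t_statistic(make_series(rng))) > T_CRIT_DF28
               for _ in range(n_sim))
    return hits / n_sim

noise_rate = false_positive_rate(white_noise)
walk_rate = false_positive_rate(random_walk)
print(f"white noise: {noise_rate:.2f}, random walk: {walk_rate:.2f}")
```

      For independent noise the rejection rate stays near the nominal 5%, whereas for random walks it is many times higher, which is why models that account for autocorrelation are preferable for time series of this kind.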

      We have now reanalyzed our data controlling for autocorrelation and for non-linear changes of abundance and recover no change to our results. We have added this information to the manuscript under “Statistical analysis” (lines 629-644).

      Also, there are some issues regarding the DNA sequence analysis and the subsequent use of the results. For example, read abundance was used in the statistical model, but the read abundance cannot be a proxy for species abundance/biomass. Because the total 18S DNA copy numbers of arthropods were quantified in the study, multiplying the sequence-based relative abundance by the total 18S DNA copy numbers may produce a better proxy of the abundance of arthropods, and the use of such a better proxy would be more appropriate here. In addition, a coverage-based rarefaction enables a more rigorous comparison of diversity (OTU diversity or haplotype diversity) than the read-based rarefaction does.
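
      The two quantities mentioned here can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline used in the study: the function names and the example numbers are invented, and sample coverage is estimated with Good's simple estimator (one minus the proportion of reads that are singletons), which is only one of several coverage estimators.

```python
def absolute_abundance_proxy(read_counts, total_copies):
    """Scale each OTU's relative read abundance by the sample's total copy number."""
    total_reads = sum(read_counts.values())
    return {otu: total_copies * n / total_reads for otu, n in read_counts.items()}

def goods_coverage(read_counts):
    """Good's estimate of sample coverage: 1 - (singleton OTUs / total reads)."""
    total_reads = sum(read_counts.values())
    singletons = sum(1 for n in read_counts.values() if n == 1)
    return 1.0 - singletons / total_reads

# Invented example: 100 reads in a sample with a qPCR estimate of 2000 copies.
reads = {"otu_a": 60, "otu_b": 30, "otu_c": 9, "otu_d": 1}
proxy = absolute_abundance_proxy(reads, total_copies=2000)
coverage = goods_coverage(reads)
print(proxy)
print(coverage)
```

      In coverage-based rarefaction, samples are compared at equal estimated coverage rather than at equal read depth, so a statistic like this replaces the read count as the standardisation target.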

      We did not use read abundance as a proxy for abundance, but used our qPCR approach to measure relative copy number of arthropods. While there are biases to this (see our explanations above), the assay proved very reliable and robust. We thus believe it should indeed provide a rough estimate of biomass. As biomass is very commonly discussed in insect decline (in fact the first study on insect decline entirely relies on biomass; Hallmann et al. 2017), we feel it is important to include a proxy for this as well. However, we also discuss the alternative option that a turnover of diversity is affecting the measured biomass. A pattern of abundance loss for common species has been described in other works on insect decline.

      We liked the reviewer’s suggestion to use copy number information to perform abundance-informed rarefaction. We have done this now and added an additional analysis rarefying by copy number/biomass. A parallel analysis using this newly rarefied table was done for the total diversity as well as single species abundance change. Details can be found in the Methods and Results section of the manuscript. However, the result essentially remains the same. Even abundance-informed rarefaction does not lead to a pattern of loss of species richness over time (see “Statistical analysis”).

      The overall results are supporting a scenario of no overall loss of species richness over time, but a loss of abundance for common species. And we indeed see the pattern of declining abundance for once-common species in our data, for example the loss of the Green Silver-Line moth, once a very common species in beech canopy (Suppl. Fig. 10). We have added details on this to the Discussion (lines 254-260).

      These points may significantly impact the conclusions of this work.

      Reviewer #3 (Public Review):

      The aim of Weber and colleagues' study was to generate arthropod environmental DNA extracted from a unique 30-year time series of deep-frozen leaf material sampled at 24 German sites that represent four different land use types. Using this dataset, they explore how the arthropod community has changed through time in these sites, using both conventional metabarcoding to reconstruct the OTUs present, and a new qPCR assay developed to estimate the overall arthropod diversity on the collected material. Overall their results show that while no clear changes in alpha diversity are found, the β-diversity dropped significantly over time in many sites, most notably in the beech forests. Overall I believe their data supports these findings, and thus their conclusion that diversity is becoming homogenized through time is valid.

      Thank you for the positive assessment.

      While overall I do not doubt the general findings, I have a number of comments. Firstly while I agree this is a very nice study on a unique dataset - other temporal datasets of insects that were used for eDNA studies do exist, and perhaps it would be relevant to put the findings into context (or even the study design) of other work that has been done on such datasets. One example that jumps to my mind is Thomsen et al. 2015 https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2656.12452 but I am sure there are others.

      We have expanded the introduction and discussion on this citing this among other studies now (lines 71-72, 276-278).

      From a technical point of view, the conclusions of course rely on several assumptions, including (1) that the biomass assay is effective and (2) that the reconstructed levels of OTU diversity are accurate,

      With regards to biomass, although it is stated in the manuscript that "Relative eDNA copy number should be a predictor for relative biomass", this is in fact only true if one assumes a number of things, e.g. there is a similar copy number of 18S rDNA per species, similar numbers of mtDNA per cell, a similar number of cells per individual species etc. In this regard, on the positive side, it is gratifying to see that the authors perform a validation assay on 7 mock controls, and these seem to indicate the assay works well. Given how critical this is, I recommend discussing the details of this a bit more, and why the authors are convinced the assay is effective, in the main text so that the reader is able to fully decide if they are in agreement. However, perhaps on the negative side, I am concerned that the strategy taken to perform the qPCR may not have been ideal. Specifically, the assay is based on nested PCR, where the authors first perform a 15-cycle amplification, this product is purified, then put into a subsequent qPCR. Given how PCR is notorious for introducing amplification biases in general (especially when performed on low levels of DNA), and the fact that nested PCRs are notoriously contamination-prone, this approach seems to be asking for trouble. This raises the question - why not just do the qPCR directly on the extracts (one can still dilute the plant DNA 100x prior to qPCR if needed)? Further, given the qPCRs were run in triplicate I think the full data (Ct values) for this should be released (as opposed to just stating in the paper that the average values were used). In this way, the readers will be able to judge how replicable the assay was - something I think is critical given how noisy the patterns in Fig S10 seem to be.

      We agree with this point, and this is why we do not want to overstate the decline in copy number. This is an additional source of data next to genetic and species diversity. We have added to our discussion of turnover as another potential driver of copy number change (lines 257-260). We have also added text addressing the robustness of the mock community assay (lines 138-141).

      However, we are confident of the reliability and robustness of our qPCR assay for the detection of relative arthropod copy number. We performed several validations and optimizations before using the assay. We have added additional details to the manuscript on this (see “Detection of relative arthropod DNA copy number using quantitative PCR”, lines 548-556). We got the idea for the nested qPCR from a study (Tran et al.) showing its high accuracy and reproducibility. We show that our assay has a very high replicability using triplicates of each qPCR, which we will now include in the supplementary data on Dryad. The SD of Ct values is very low (~ 0.1 on average). NTC were run with all qPCRs to rule out contamination as an issue in the experiments. We also find a very high efficiency of the assay. At dilutions far outside the observed copy number in our actual leaf data, we still find the assay to be accurate. We found very comparable abundance changes across our highly taxonomically diverse mock communities. This also suggests that abundance changes are a more likely explanation than simple turnover for the observed drop in copy number. A biomass loss for common species is well in line with recent reports on insect decline. We can also rely on several other mock community studies (Krehenwinkel et al. 2017 & 2019) where we used read abundance of 18S and found it to be a relatively good predictor of relative biomass.
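
      For readers who want to check replicability once the Ct values are released, the two summary statistics referred to here can be computed as in the following sketch. It is illustrative only: the sample names and Ct values are invented, the function names are hypothetical, and the efficiency formula is the standard one based on the slope of a dilution-series standard curve, E = 10^(-1/slope) - 1, not necessarily the exact calculation used by the authors.

```python
import math
import statistics

def ct_replicate_summary(ct_by_sample):
    """Mean and sample standard deviation of Ct across replicate wells."""
    return {name: (statistics.mean(cts), statistics.stdev(cts))
            for name, cts in ct_by_sample.items()}

def amplification_efficiency(log10_quantity, ct):
    """Efficiency from the slope of Ct regressed on log10(template quantity)."""
    mx = statistics.mean(log10_quantity)
    my = statistics.mean(ct)
    sxx = sum((x - mx) ** 2 for x in log10_quantity)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_quantity, ct))
    slope = sxy / sxx
    return 10 ** (-1 / slope) - 1

# Invented triplicate Ct values for two hypothetical samples.
triplicates = {"sample_A": [24.0, 24.1, 24.2], "sample_B": [27.5, 27.4, 27.6]}
summary = ct_replicate_summary(triplicates)

# Idealised 10-fold dilution series in which template doubles every cycle.
log10_quantity = [0, -1, -2, -3]
ideal_ct = [20 - math.log2(10) * q for q in log10_quantity]
efficiency = amplification_efficiency(log10_quantity, ideal_ct)
print(summary, efficiency)
```

      An efficiency close to 1 (that is, 100%) corresponds to perfect doubling per cycle, and a replicate SD near 0.1 cycles, as reported above, would indicate tight technical replication.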

      The pattern in Fig. S10 is not really noisy. It just reflects typical population fluctuations for arthropods. Most arthropod taxa undergo very pronounced temporal abundance fluctuations between years.

      Next, with regards to the observation that the results reveal an overall decrease in arthropod biomass over time: The authors suggest one alternate to their theory, that the dropping DNA copy number may reflect taxonomic turnover of species with different eDNA shedding rates. Could there be another potential explanation - simply be that leaves are getting denser/larger? Can this be ruled out in some way, e.g. via data on leaf mass through time for these trees? (From this dataset or indeed any other place).

      This is a very good point. However, we can rule out this hypothesis, as the ESB performs intensive biometric data analysis. The average leaf weight and water content have not significantly changed in our sites. We have addressed this in the Methods section (see ”Tree samples of the German Environmental Specimen Bank – Standardized time series samples stored at ultra-low temperatures”, lines 308-311).

      With regards to estimates of OTU/zOTU diversity. The authors state in the manuscript that zOTUs represent individual haplotypes, thus genetic variation within species. This is only true if they do not represent PCR and/or sequencing errors. Perhaps therefore they would be able to elaborate (for the non-computational/eDNA specialist reader) on why their sequence processing methods rule out this possibility? One very good bit of evidence would be that identical haplotypes for the individual species are found in the replicate PCRs. Or even between different extractions at single locations/timepoints.

      We have repeated the analysis of genetic variation with much more stringent filtering criteria (see “Statistical analysis”, lines 611-615). Among other filtering steps, this also includes the use of only those zOTUs that occur in both technical replicates, as suggested by the reviewer. Another reason that makes us believe we are dealing with true haplotypic variation here is that haplotypes show geographic variation, e.g., some haplotypes are more abundant in some sites than in others. By contrast, NUMTs would consistently show a simple correlation in their abundance with the most abundant true haplotype.
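
      The replicate-consistency filter described above amounts to a set intersection. The sketch below is a hypothetical illustration (the function name, the `min_reads` threshold, and the zOTU labels are assumptions, and the authors' actual filtering involves further steps): a zOTU is retained only if it is detected in both technical PCR replicates of a sample.

```python
def zotus_in_both_replicates(rep1, rep2, min_reads=1):
    """Keep zOTUs with at least `min_reads` reads in each of two replicates.

    rep1, rep2: dicts mapping zOTU identifier -> read count.
    Returns a dict of retained zOTUs with their summed read counts.
    """
    present1 = {z for z, n in rep1.items() if n >= min_reads}
    present2 = {z for z, n in rep2.items() if n >= min_reads}
    return {z: rep1[z] + rep2[z] for z in sorted(present1 & present2)}

# Invented example: zotu_3 and zotu_4 each appear in only one replicate,
# as would be expected for sporadic PCR or sequencing errors.
replicate_1 = {"zotu_1": 120, "zotu_2": 15, "zotu_3": 2}
replicate_2 = {"zotu_1": 98, "zotu_2": 22, "zotu_4": 1}
retained = zotus_in_both_replicates(replicate_1, replicate_2)
print(retained)  # {'zotu_1': 218, 'zotu_2': 37}
```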

      With regards to the bigger picture, one thing I found very interesting from a technical point of view is that the authors explored how modifying the mass of plant material used in the extraction affects the overall results, and basically find that using more than 200mg provides no real advantage. In this regard, I draw the authors' and readers' attention to an excellent paper by Mata et al. (https://onlinelibrary.wiley.com/doi/full/10.1111/mec.14779) - where these authors compare the effect of increasing the amount of bat faeces used in a bat diet metabarcoding study, on the OTUs generated. Essentially Mata and colleagues report that as the amount of faeces increases, the rare taxa (e.g. those found at a low level in a single faeces) get lost - they are simply diluted out by the common taxa (e.g. those in all faeces). In contrast, increasing biological replicates (in their case more individual faecal samples) increased diversity. I think these results are relevant in the context of the experiment described in this new manuscript, as they seem to show similar results - there is no benefit of considerably increasing the amount of leaf tissue used. And if so, this seems to point to a general principle of relevance to the design of metabarcoding studies, thus of likely wide interest.

      Thank you for this interesting study, which we were not aware of before. The cryomilling is an extremely efficient approach to equally disperse even traces of chemicals in a sample. This has been established for trace chemicals early during the operation of the ESB, but also seems to hold true for eDNA in the samples. We have recently done more replication experiments from different ESB samples (different terrestrial and marine samples for different taxonomic groups) and find that replication of extraction does not provide much more benefit than replication of PCR. Even after 2 replicates, diversity approaches saturation. This can be seen in the plot below, which shows recovered eDNA diversity for different ESB samples and different taxonomic groups from 1-4 replicates. A single extract of a small volume contains DNA from nearly all taxa in the community. Rare taxa can be enriched with more PCR replicates.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This paper presents an investigation of the mechanisms of how chitin is synthesized in Drosophila by investigating the chitin synthetase Kkv and two related/redundant proteins that are required for chitin production, Exp and Reb.

      The authors show that synthesis of nascent chitin polymers is separable from the secretion of chitin and that Exp/Reb is specifically required for chitin translocation/secretion. To understand the functions of Exp/Reb, the authors perform structure/function analyses and examine the localization of the proteins. They find that the Na-MH2 domain in Exp/Reb is required for chitin translocation, and that a motif the authors name CM2 is required for Exp localization. For Kkv, they show the WGTRE domain is required for ER exit and that a coiled-coil domain is required for Kkv localization and full Kkv activity. By using live imaging and mutations that disrupt membrane trafficking, the authors show that Kkv, which is a transmembrane protein, cycles to the membrane, and like most membrane proteins, is endocytosed and transits through the endocytic system and is returned to the apical surface. Interestingly, despite Kkv being dynamically moved around the cell, chitin synthesis produces highly organized extracellular matrixes. Considering that constitutive production of chitin by Kkv everywhere in the cell would create a mess, these results underscore that regulated organized secretion/translocation of chitin is central to generating patterned extracellular matrixes (as the saying goes, "location, location, location"). Consistent with Exp/Reb being important regulators in extracellular matrix patterning, Exp/Reb are not only required for export of chitin; in the absence of Exp/Reb, the pattern of Kkv localization at the apical surface is also altered. Unexpectedly however, by using super resolution microscopy the authors show that Kkv and Exp/Reb have complementary rather than matching localizations. Thus, while it is not clear exactly how Exp/Reb are regulating Kkv, they are doing something very interesting.

      Overall, this paper will be of broad interest to the cell biology and developmental biology communities, and to the translational community working to develop chitin as a commercial biopolymer. It is also generally clearly written, although I think there are some inaccuracies in how some points are phrased. The experiments are well done, subject to the revisions outlined below.

      Major concerns:

      • A major conclusion of the paper is that Exp/Reb are not required for chitin synthesis. On the most basic level this statement is well supported, because chitin grains are made in the cytoplasm in the Exp/Reb mutants. However, I think the field would be better served with a more nuanced consideration of the role of Reb/Exp. From the data presented, it seems that in the absence of Reb/Exp, the total amount of chitin produced is greatly reduced. I think it would be worth considering Exp/Reb, or the synthesis process in general, as having processivity or duty cycle or quality control such that in the absence of Exp/Reb, while Kkv may make short chitin polymers, or occasional long polymers, the major production of chitin doesn't get going without Exp/Reb. Thinking of Reb/Exp as processivity factors in addition to export factors dramatically changes how one thinks of the proteins and the process of chitin synthesis. While these considerations can be handled with some discussion, it would be very interesting to look at the length of the chitin polymers in the Reb/Exp mutants and see if the average chain length is much reduced. This would help distinguish between Exp/Reb revving up the total number of Kkv molecules that produce chitin and Exp/Reb allowing the same number of Kkv molecules to stay active and produce much longer chitin chains. A caveat here is that I have no idea how hard this is to do, so I won't put this at the level of a required revision, but this result would significantly deepen the analysis in the paper.
      • In looking at the subcellular localization of Kkv and Reb at regular and super resolution, I think the authors missed an important but straightforward way to gain insight into the apparent complementary distribution of Kkv and Exp/Reb. In stage 16 WT embryos, Kkv has a distinct ringed pattern that corresponds to the taenidial ridges (e.g. clearly visible in Fig. 6A and 6G). How those ridges are set up is unclear, although there are some interesting Turing-pattern models out there. One prediction might be that Exp/Reb should be in between the Kkv rings. If so, maybe Exp/Reb are key components of patterning chitin secretion to make this 3D patterned matrix? Alternatively, maybe Exp/Reb act on a smaller length scale and will match the Kkv ring pattern, just not overlapping with Kkv at the very fine scale. These are straightforward experiments and again could provide key insights into the function of Exp/Reb.
      • In general, most of the figures do not include WT or a control for comparison. This makes it hard for non-experts to assess what the effect of a mutation or condition is. For example, there are no examples of WT or Df(exp reb) in Figures 1-4. I realize this would increase the number of panels, but the paper would be more accessible if comparisons were within figures instead of comparing between main and supplementary figures and other papers.
      • To bolster the case that Exp/Reb directly regulate Kkv distribution, the authors should examine the distribution of Kkv in a catalytically null Kkv mutant, or with drugs that block Kkv, or in mutants for other genes required for Kkv activity, to show that the altered distribution of Kkv in Exp/Reb mutants is a direct consequence of the lack of Exp/Reb rather than an indirect consequence of the lack of extracellular chitin, which causes gross perturbations in the trachea. Also, are there differences in the distributions of Kkv in salivary glands with or without the presence of Exp/Reb? If Exp/Reb change the distribution of Kkv in the salivary glands, which normally do not express Kkv and presumably many other components of the chitin ECM system, this would be a powerful argument that there is a direct effect.

      Minor concerns.

      • Page 5 "These intracellular chitin punctae disappeared from stage 14, when chitin is then deposited extracellularly (Fig 1B')." Fig. 1B' is stage 15 embryos.
      • Page 5 "lead to tracheal morphogenetic defects". It would be helpful to the reader if the text or legend told the reader what they were looking for? Broken tubes? Inflated tubes? Variable tubes?
      • Fig. 1H. Main text says "co-expression of Kkv and expMH2/rebMH2 did not lead to tracheal morphogenetic defects (Fig 1H, ...". The tracheal dorsal trunk in Fig. 1H does not look WT. The legend does not state the stage, but the DT looks to have an enlarged diameter and it might be too long. Please present measurements on stage 16 trachea to confirm that there is no effect on tracheal morphology.
      • Fig. 3E: there is a lot of GFP-Kkv that is not co-localized with the KDEL marker. Can the authors clarify what compartment all the other staining is? ER?
      • Section 3.1. The authors imply that the WGTRE domain is specifically required for ER exit. However, an alternative is that absent the WGTRE domain, the protein just does not fold correctly, which would also preclude ER exit, but would be a different problem for the protein to make chitin if it isn't folded.
      • Page 15. I disagree with the statement "At stage 16, control embryos showed a highly homogeneous apical distribution of Kkv in stripes, corresponding to the taenidial folds, and Kkv vesicles were largely absent (Fig 6G)." In Fig. 6G, the taenidial ring pattern is clearly visible, as are the fusion cells. If Kkv distribution were "highly homogeneous" these structures/patterns would not be visible.
      • Page 15. I also disagree with the characterization of the apical Kkv distribution in st 15 embryos. "In control embryos we detected a very uniform and homogenous pattern of apical Kkv (Fig 6I).". To my eye, the pattern is punctate and random for the clumps of stain, with the underlying beginnings of the taenidial pattern starting to be visible. The pattern appears neither uniform nor homogenous.
      • P16. The degree of order in the distribution of Kkv is overstated. The authors state that "The results of this analysis, showed that Kkv on the apical membrane, is evenly distributed following a regular pattern (Fig. 6L,L',L',M)." However, given that there is barely a visibly perceptible difference between the actual distribution of Kkv in 6L' and a calculated random distribution in 6L", and that the pattern is neither visibly even nor regular, it would be more representative to say something to the effect that the analysis shows there is "underlying order" or "some degree of order" or a "non-random pattern". Visually, the key difference between 6L' and 6L" is that there are fewer closely clustered Kkv dots. You could still have an uneven distribution of Kkv that maintains minimum spacing, which is a kind of ordered organization, but not one that would be assumed from the description. It would be helpful if the authors, instead of just saying a "regular pattern", also stated the nature of the pattern they observe, i.e. Grid? Stripes? Minimum spacing?
      • Discussion. Another model for the role of Exp/Reb could be to bind and neutralize an inhibitor of Kkv activity. This would account for the complementary distribution of Kkv and Exp/Reb.
      • Fig. 6L. what tissue is being analyzed? Presumably trachea, but this should be specified as salivary glands are also mentioned in the legend.
      • Fig. 7 C models. I believe that the super resolution data is not accurately accounted for in the models. In both model 1 and model 2, Kkv and Exp/Reb are shown to be in close proximity, but the super resolution data suggests that most Kkv and Exp/Reb are separated by hundreds of nanometers. Further, showing Kkv and Exp/Reb as touching was not supported by the coIP experiments, which failed to detect an interaction. It is possible that only a small fraction of Exp/Reb that is in close proximity to Kkv is active, but if so, this should be explicitly mentioned in the models to reconcile the data showing that Kkv and Exp/Reb are mostly not anywhere near each other.
      • Image analysis. Please detail the criteria for the "apical" and "basal" regions that were the basis for freehand segmentation. What was counted as apical and what was basal?
      • Abstract and Introduction: The authors state that "We find that Kkv activity in chitin translocation, but not in polymerization, requires the activity of Exp/Reb, and in particular of its conserved Na-MH2 domain.", but then follow that with the statement that "Furthermore, we find that Kkv and Exp/Reb display a largely complementary pattern at the apical domain, and that Exp/Reb activity regulates the topological distribution of Kkv at the apical membrane." Many readers will find the use of "furthermore" confusing because they will take furthermore as the about to be described data logically following the previous data, but then run headlong into the fact that Kkv and Exp/Reb show a complementary distribution, which does not obviously follow from Kkv activity requiring Exp/Reb. The authors could clarify this and highlight the interesting, unexpected and exciting nature of their results by replacing "Furthermore" with "Unexpectedly" or "Surprisingly", and emphasizing the important role of Exp/Reb in Kkv organization. Maybe something like: Unexpectedly, we find that although Kkv and Exp/Reb display largely complementary patterns at the apical domain, Exp/Reb activity nonetheless regulates the topological distribution of Kkv at the apical membrane.

      Significance

      The topic is interesting from the aspect of cell biology in terms of how a long polymer is created intracellularly, secreted and spatially organized to create a sophisticated extracellular matrix. The topic is also of general interest because chitin is central to the body plan of all insects, crustaceans and many other species, and chitin is of increasing interest as a biopolymer that could have extensive commercial uses.

      In addition to an informative structure/function analysis of Kkv and Exp/Reb, the results identify what is, to my knowledge, the first regulator of the spatial organization of chitin synthase in insects, and it unexpectedly shows a complementary pattern to the synthase. This highlights just how little we understand about how complex extracellular matrices are synthesized.

    1. The question the world’s scientists are tackling is to what extent human-caused global heating is to blame for a particular extreme weather event as opposed to natural variability in weather patterns.

      I appreciate the author bringing up this point to acknowledge that there may be other sources which lead to the extreme weather that occurs today. It is effective in avoiding biases and also raising the question to the readers of whether there may be more than one factor which goes into the extreme weather we see today (I'm not saying I think global warming is not the cause).

  5. docdrop.org
    1. It is obvious that the backgrounds of students contribute to the unevenness of opportunities for academic success

      As discussed in Duncan and Murnane's article, there is a significant difference in academic success based on the children's socioeconomic status. Another factor, such as the structure of the school, may have an impact on the student's academic success as well. In high school, I remember it was pretty diverse, but the students still separated among themselves into racial groups. I think it is inevitable that such groups form based on specific traits, since we are naturally attracted to others that have similar traits.

    1. Author Response

      Reviewer #1 (Public Review):

      This report describes evidence that the main driving force for stimulation of glycolysis in cultured DGC neurons by electrical activity comes from influx of Na+ including Na+ exchanging into the cell for Ca2+. The findings are presented very clearly and the authors' interpretations seem reasonable. This is important and impactful because it identifies the major energy demand in excited neurons that stimulates glycolysis to supply more ATP.

      Strengths are the highly rigorous use of fluorescent probes to directly monitor the concentrations of NADH/NAD+, Ca2+ and Na+. The strategies directly test the roles of Na+ and Ca2+.

      A weakness is an ambiguity about the effects of ouabain to inhibit the Na+/K+ ATPase directly and the absence of biochemical controls to validate the interpretation of the ouabain experiment.

      We appreciate the reviewer's comments about the work. While we cannot rule out non-specific effects of ouabain at the concentrations needed to block Na+/K+ ATPase in these experiments, we do think that we can rely on the prior biochemical work characterizing the multiple components of ouabain binding in fresh mouse brain tissue, which is a close match to the acute mouse brain slice tissue used here.

      Reviewer #2 (Public Review):

      This study seeks to determine how neuronal glycolysis is coupled to electrical activity. Previous studies had found that glycolytic enzymes cluster within nerve terminals (in C. elegans) during activity. Furthermore, the glucose transporter GLUT4 is recruited to the synaptic surface during activity. The authors previously showed that Ca2+ does not stimulate glycolysis in active neurons. Here, the authors show that cytosolic Na+, not Ca2+, and the activity of the Na+/K+ pump drive glycolysis. However, it is important to note that in this study, glycolysis was examined in the soma, not nerve terminals, where some of the previous studies were conducted. A few other caveats in the interpretation of the findings are listed below:

      1) The NADH/NAD+ ratio is used throughout as the only measurement reflecting glycolytic flux.

      In this and previous work, we have validated that increased cytosolic NADH production (whose major sources are related to glycolysis), rather than altered NADH reoxidation, produces the changes in NADH/NAD+ ratio.

      2) It has been hypothesized that the close association of glycolytic enzymes with ion transporters (such as the Na+/K+ pump) is meant to provide localized ATP to power these pumps. How does bulk glycolysis (monitored with NADH/NAD+ ratio) relate to localized/compartmentalized glycolysis?

      Even if glycolysis is indeed localized to the plasma membrane (an interesting and difficult-to-address hypothesis), we believe that because the mitochondrial shuttles are the main pathway for NADH re-oxidation, and most mitochondria are not localized to the plasma membrane, changes in glycolytic NADH production are likely to be reflected in changes of the bulk cytosolic NADH/NAD+.

      3) Related to point 2, most of the Peredox measurements in the paper have been made at baseline, in the absence of electrical activity. Therefore, it is not clear how the findings relate to activity-driven glycolysis.

      The ion exchange experiments and even the faster Ca2+ puff experiments can mimic but indeed cannot match the speed of activity-driven changes in ion concentrations. Unfortunately, it is impossible to induce normal electrical activity in neurons in the absence of extracellular Na+. We believe that the complete inability of Ca2+ elevation alone (without Na+-Ca2+ exchange) to stimulate glycolysis, combined with the substantial Ca2+ contribution to activity-driven glycolysis, makes a good argument that Ca2+ entering during activity is likely to stimulate glycolysis via Na+ entry and the Na+/K+ ATPase.

      4) The finding that inhibition of SERCA during stimulation actually elevates cytosolic NADH level argues against Na+ being the only ion that regulates glycolysis.

      The ability of SERCA inhibition to produce a small increase in activity-driven glycolysis is consistent with the simple argument that reduced SERCA-driven uptake of Ca2+ into ER results in additional Ca2+ removal via Na+/Ca2+ exchange (which can then affect glycolysis via Na+ levels).

      5) The finding that "SBFI ΔF/F transients were longer in duration than the RCaMP LT transient" does not necessarily mean that Na+ elevation lasts longer than Ca2+ in the cell. This could be an artefact of the SBFI on/off rate relative to RCaMP. In fact, prolonged elevation of cytosolic Na+ would make neurons refractory to depolarization in AP trains.

      The rates of Na+ binding and unbinding to SBFI are likely to occur on the microsecond timescale (based on the known properties of crown ether molecules), much faster than the observed transient duration of approximately one minute. Prolonged elevation of cytosolic Na+ alone (to the levels seen here) should not cause neurons to be refractory to firing; refractoriness typically occurs in the setting of prolonged depolarization and consequent inactivation of NaV channels.

      Reviewer #3 (Public Review):

      Meyer et al have studied the mechanisms of glycolysis activation in the hippocampus during neuronal activity. The study is logically laid out, uses sophisticated fluorescence lifetime imaging technology and smart experimental designs. The support for intracellular [Na+] vs [Ca2+] rise driving glycolysis is strong. The evidence for the direct involvement of the Na+/K+ pump is based only on pharmacology using ouabain but the Na+/K+ pump is admittedly not an easy subject for specific perturbations. I still think that the Authors should strengthen the support for the pathway.

      We are happy that the reviewer feels that the evidence for Na+ rather than Ca2+ as the effector of glycolysis is strong. The tools for investigating the role of the Na+/K+ pump (NKA) are indeed limited to pharmacology, because (as the reviewer says) there are not many other options. The requirement for Na+ elevation (which stimulates NKA activity) to trigger glycolysis and the ability of ouabain, a specific NKA inhibitor, to prevent this seem like a strong implication of NKA in the mechanism of glycolysis activation. Genetic manipulation of the NKA may be unable to change the level of pump activity, because of compensation by altered expression of other subunits (PMID 17234593); it also is unclear how any chronic manipulation would shed light on the role of NKA in triggering glycolysis. But perhaps future studies of knock-in mice in which the α1 isoform of NKA has been made more sensitive to ouabain (PMIDs 15485817; 34129092) might allow the identification of the NKA as the target of ouabain in this situation to be made even more secure.

      Also, there is a long list of publications on the connection between the Na+/K+ pump and glycolysis. It might be useful to highlight the role of the NCX- Na+/K+ pump coupling in the activation of glycolysis in the title.

    1. Author Response

      Reviewer #1 (Public Review):

      Dotov et al. took joint drumming as a model of human collective dynamics. They tested interpersonal synchronization across progressively larger groups composed of 1, 2, 4 and 8 individuals. They conducted several analyses, generally showing that the stability of group coordination increases with group numerosity. They also propose a model that nicely mirrors some of the results.

      The manuscript is very clear and very well written. The introduction covers a lot of relevant literature, including animal models that are very relevant in this field but often ignored by human studies. The methods cover a wide range of distinct analyses, including modelling, giving a comprehensive overview of the data. There are a few small technical differences across the experiments conducted with small vs. large groups, but I think this is to some extent unavoidable (yet, future studies might attempt to improve this). Furthermore, the currently adopted model accounts well for behaviors where all individuals produce a similar output and therefore are "equally important". However, it might be interesting to test to what extent this can be generalized to situations where each individual produces a distinct sound (as in a small orchestra) and therefore might selectively adapt to (more clearly) distinguishable individuals.

      We agree that this is important. We discuss this in a new section (4.1) at the end of the discussion. We suggest that heterogeneity makes it possible for other modes of organization to compete with the attractive tendency towards the global average. We also point out that factors such as individual skill, task difficulty, delays, and selective attention enable such heterogeneity in the ensemble.

      Similarly, it would be interesting to test to what extent the current results (and model) can be generalized to interactions that more strongly rely on predictive behavior (as there is not much to predict here given that all participants have to drum at a stable, non-changing tempo).

      We can only speculate that the present results are less relevant to interactions that rely strongly on predictive behavior, as behaviour in our simple task could be modeled well by our hybrid single oscillator Kuramoto model. We inserted the idea that the presence of a group rhythm can diminish the demands for individuals to predict each other’s notes (end of paragraph 1, page 27).

      An important implication of this study is that some well-known behaviors typically studied in dyadic interaction might be less prominent when group numerosity increases. I am specifically referring to "speeding up" (also termed "joint rushing") and "tap-by-tap error correction" (Wolf et al., 2019 and Konvalinka et al., 2010, also cited in the manuscript, are two recent examples). I am not sure whether this depends on how the data is analyzed (e.g. averaging the behavior of multiple drummers), yet this might be an important take-home message.

      Thank you for the suggestion. We edited to emphasize that the relevant part of the analysis of the drumming data was performed at the individual level and using the same methods as typically done in dyadic tapping (first sentences of Section 2.7.2). Speeding up was the only variable where we used group-averages. For consistency, and to avoid confusion, in the present version we re-did the stats (the changed statistical parameters are highlighted) and figures using the individual data points and we did not observe major changes.

      I am confident that this study will have a significant impact on the field, bringing more researchers close to the study of large groups, and generally bridging the gap between human and animal studies of collective behavior.

      Reviewer #2 (Public Review):

      In this manuscript Dotov et al. study how individuals in a group adjust their rhythms and maintain synchrony while drumming. The authors recognize correctly that most investigation of rhythm interaction examines pairs (dyads) rather than larger groups despite the ubiquity of group situations and interactions in human as well as non-human animals. Their study combines empirical work, using human drummers, with modeling, evaluating how well variations of the Kuramoto coupled-oscillator model describe the timing of grouped drummers. Based on temporal analyses of drumming in groups of different sizes, it is concluded that this coupled oscillator model provides a 'good fit' to the data and that each individual in a group responds to the collective stimulus generated by all neighbors, the 'mean field'.

      I have concerns about 1) the overall analysis and testing in the study and about 2) specific aspects of the model and how it relates to human cognition. Because the study is largely empirical, it would be most critical for the authors to propose two - or more - alternative hypotheses for achieving and maintaining synchrony in a group. Ideally, these alternatives would have different predictions, which could be tested by appropriate analyses of drummer timing. For example, in non-human animals, where the problem of rhythm interaction in groups has been examined more thoroughly than in humans, many acoustic species organize their timing by attending largely to a few nearby neighbors and ignoring the rest. Such 'selective attention' is known to occur in species where dyads (and triads) keep time with a Kuramoto oscillator, but the overall timing of the group does not arise from individual responses to the mean field. Can this alternative be evaluated in the drumming data? Would this alternative fit the drumming data as well as, or better than, the mean field, 'wisdom of the crowd' model?

      These are very important points. The present paper is restricted to a simple task where participants are instructed to synchronize with each other. However, we now more explicitly acknowledge the limitations of our study and include a new section, “Beyond the group average”, at the end of the Discussion that is dedicated to this issue and discusses other organizing tendencies that are particularly relevant in larger and more diverse ensembles. In the context of the present task, the relative difference between local and global interactions was likely negligible because of the small differences in timing, from 4 to 16 ms, between the closest and most distant pairs.

      It will be interesting in future studies to introduce acoustic heterogeneity by varying the timbre of the instruments, for example. In the present study, the instruments had the same timbre with narrowly varying fundamental frequencies (117-129 Hz in the duets/quartets and 249-284 Hz in the octets), a situation that encourages integration of all the acoustic information. We do point out that the present approach needs to be expanded to be able to account for competitive pressure and selective attention.

      The well-known Vicsek model (discussed briefly in paragraph 2, page 15), related to the Kuramoto model under certain assumptions, can account for a variety of dynamic behaviors in flocking animals. The ability for selective attention in the form of a heterogeneous coupling matrix, combined with the existence of competitive pressure in the form of negative coupling terms, can result in spontaneous formation of clusters and spatiotemporal patterns of movement. This is consistent with prior research in chorusing animals (insects and anurans). Large musical ensembles also involve groupings of instruments such as separate sections that change their relative loudness across time. Typically these are not spontaneous but composed and conducted, yet they may satisfy the same constraints.

      We also pointed out that we see these as complementary organizing principles. Even in the Vicsek model, there is a notion of a ‘local order parameter’ whereby individuals are coupled to a group average within a narrow interaction radius. The relative importance of other organization tendencies depends on the layout of the acoustic environment and the competitive and collaborative aspects of the task. Hence, parameters such as delay and individual heterogeneity could act as symmetry breaking terms that enable different stabilities from the basic global group synchrony.

      A second concern arises from relying on a hybrid, continuous - pulsed version of the Kuramoto coupled oscillator. If the human drummers in the test could only hear but not see their neighbors, this hybrid model would seem appropriate: Each drummer only receives sensory input at the exact moment when a neighbor's drumstick strikes the drum. But the drummers see as well as hear their neighbors, and they may be receiving a considerable amount of information on their neighbors' rhythms throughout the drum cycle. Can this potential problem be addressed? In general, more attention should be paid to the cognitive aspects of the experiment: What exactly do the individual drummers perceive, and how might they perceive the 'mean field'?

      This is all very relevant. We instructed participants to focus on X’s in the centers of their drums and not look at their peers (edited to mention that at the end of Section 2.4, page 9). Additionally, the pattern of results for tempo change, cross-correlations, and variability in the dyadic condition was consistent with previous studies that involved purely auditory tapping tasks (emphasized in the beginning of paragraph 2, page 26). The best way to address this limitation would be to repeat the study and block the visual contact among participants, as well as include a condition emphasizing visual contact.

      It is beyond the scope of the present paper to make model-based predictions of effects of coupling and information availability, but this should be done in future work. For the present paper, we now include a simulation involving continuous coupling (end of section 2.9.2, page 16, and Supplementary Figure 8A), which fails to reproduce the results for variability; those results are well captured by the hybrid continuous-pulsed model we developed (see the Supplementary Materials).
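
      For readers unfamiliar with the term, "continuous coupling" to the mean field can be illustrated with the textbook Kuramoto model, in which each oscillator is pulled toward the group's mean phase at every instant rather than only at drum strikes. The sketch below is not the authors' hybrid continuous-pulsed model and does not reproduce their simulation; the function names, the 2 Hz mean tempo, the frequency spread, and the coupling values are all illustrative assumptions.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return (R, psi): group coherence R in [0, 1] and mean phase psi."""
    z = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(z), cmath.phase(z)


def simulate_kuramoto(n, coupling, freq_sd=0.5, dt=0.01, steps=5000, seed=0):
    """Euler-integrate the continuous mean-field Kuramoto model.

    d(theta_i)/dt = omega_i + K * R * sin(psi - theta_i)

    All parameter values are hypothetical and chosen only for illustration.
    Returns the coherence R at the end of the run.
    """
    rng = random.Random(seed)
    mean_freq = 2 * math.pi * 2.0  # assumed 2 Hz tempo, in rad/s
    freqs = [rng.gauss(mean_freq, freq_sd) for _ in range(n)]
    phases = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    for _ in range(steps):
        r, psi = order_parameter(phases)
        # Every oscillator is attracted to the mean phase at every time step.
        phases = [
            p + dt * (w + coupling * r * math.sin(psi - p))
            for p, w in zip(phases, freqs)
        ]
    return order_parameter(phases)[0]


if __name__ == "__main__":
    for k in (0.0, 5.0):
        print(f"K={k}: final coherence R = {simulate_kuramoto(8, k):.3f}")
```

      With strong coupling the eight simulated oscillators should settle into a coherent group rhythm, whereas without coupling they drift at their own rates. Comparing group sizes or adding noise would require extending this sketch.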

      Reviewer #3 (Public Review):

      The contribution provides approaches to understanding group behaviour using drumming as a case of collective dynamics. The experimental design is interestingly complemented with the novel application of several methods established in different disciplines. The key strengths of the contribution seem to be concentrated in 1) the combination of theoretical and methodological elements brought from the application of methods from neurosciences and psychology and 2) the methodological diversity and creative debate brought to the study of musical performance, including here the object of study, which looks at group drumming as a cultural trait in many societies.

      Even though the experimental design and object of study do not represent an original approach, the proposed procedures and the analytical approaches shed light on elements poorly addressed in music studies. The performers' relationships, feedbacks, differences between solo and ensemble performance and interpersonal organization convey novel ideas to the field and most probably new insights to the methodological part.

      It must be mentioned that the authors accepted the challenge of leaving the nauseating no-frills dyadic tests and tapping experiments in the direction of more culturally comprehensive (and complex) setups. This represents a very important strength of the paper and greatly improves the communication with performers and music studies, which have been affected by the poor impact of predictable non-musical experimental tasks (that can easily generate statistically significant measurements). More specifically, the originality of the experiment-analysis approach provided a novel framework to observe how the axis from individual to collective unfolds in interaction patterns. In particular, the emergence of mutual prediction in large groups is quite interesting, although similar results might be found elsewhere.

      Thank you for these comments.

      On another side, important issues regarding the literature review, experimental design and assumptions should be addressed.

      I miss an important part of the literature that reports similar experiments under the thematic framework of musical expressivity/expression, groove, microtiming and timing studies. From the participatory discrepancies proposed in the 1980s by Keil (1987) to the work of Benadon et al (2018), Guy Madison, colleagues and others, this literature presents formidable studies that could help understand how timing and interactions are structured and conceptualized in music studies and by musicians and experts. (I declare that I have no recent collaborations with the authors I mentioned throughout the text and that I don't feel comfortable suggesting my own contributions to the field). This is important because there are important ontological concerns in applying methods from sciences to cultural performances.

      Thank you for the suggestions. We included a brief discussion in the newly added “Beyond the group average” section at the end of the Discussion, specifically the first paragraph, pages 27-8. We think that expressive timing naturally fits in continuation with the other reviewers’ concerns about how much the idea of the group average generalizes to real musical situations. By design and instruction, we stripped individual expression from the present task. Specific cultural contexts and performance styles may want to escape or at least expressively tackle this constraint of our task, and we believe that now that we have established the mean field as one factor affecting group behaviour, further studies can take on the challenge of developing models that make predictions in more complex situations closer to real musical interactions – and testing those models empirically.

      One ontological issue is that cultural phenomena differ from, for example, animal behaviour. For example, the authors consider timing and synchrony in a way that does not comply with cultural concepts: p.4 "Here we consider a musical task in which timing consistency and synchrony is crucial". A large part of the literature mentioned above and evidence found in ethnographic literature indicate that the ability to modulate timing and synchrony-asynchrony elements is part of explicit cultural processes of meaning formation (see, for example, Lucas, Glaura and Clayton, Martin and Leante, Laura (2011) 'Inter-group entrainment in Afro-Brazilian Congado ritual.', Empirical musicology review., 6 (2). pp. 75-102.). Without these idiosyncrasies, what you listen to can't be considered a musical task in context and lacks basic expressivity elements that represent musical meaning on different levels (see, for example, Swanwick's work about layers/levels of musical discourse formation).

      Indeed, this is an important issue. We often use cultural phenomena merely as a motivation but do not dive into the relevant details. Here, in addition to the previous discussion, we now reiterate that the tendency towards the group average is one organizing tendency but there are additional ones, enabled by individual heterogeneity and context. For example, marching bands and chanting crowds probably impose different constraints than individual artistic expression by skillful musicians.

      Such plain ideas about the ontology of musical activities (e.g. that musical practice is oriented by precision or synchrony) generate superficial constructs such as precision priority, dance synchrony, imaginary internal oscillators, and strict predictive motor planning that are not present in cultural reports, excepting some cultures of classical European music based on notation and shaped by industrial models. The lack of proper cultural framing of the drumming task might also have induced the authors to instruct the participants to minimize "temporal variability" (musical timing) and maintain the rate of the stimulus (musical tempo), even though these limiting tasks are mostly part of musical training in some societies (examples of social drumming in non-western societies barely represent isochronous tempo or timing in any linguistic or conceptual way). The authors should examine how this instruction impacts the validity of results that describe the variability since it was affected by imposed conditions and might have limited the observed behaviour. The reporting of the results in the graphs must also allow the diagnosis of the effect of timing in such small time frame windows of action.

      We agree totally. We made changes and tried to be more specific about the cultural framing, delineating contexts where the present ideas are more relevant and where they are less relevant, or at least incomplete (the bottom of page 3, and pages 27-8).

    1. Author Response

      Reviewer #1 (Public Review):

      This paper primarily assessed the host/phage interactions for bacteria in the order Corynebacteriales to identify novel host factors necessary for phage infection, with regard to genes responsible for bacterial envelope assembly. Bacteria in this order, such as Mycobacterium tuberculosis and Corynebacterium diphtheriae, have unique, complex envelopes composed of peptidoglycan, arabinogalactan, and mycolic acids. This barrier is a potent protector against the therapeutic effects of antibiotics. Phages can be used to discover novel aspects of this bacterial envelope assembly because they engage with cell surface receptors. To uncover new factors, the researchers challenged a high-density transposon library of Corynebacterium glutamicum (called Cglu in the paper) with the phages Cog and CL31. Results by transposon sequencing identified loci that were interrupted, leading to phage resistance. This study implicated the importance of Cglu genes, ppgS, cgp_0658, cgp_0391, and cgp_0393. They also identified a new gene called cgp_0396 necessary for arabinogalactan modification and recognized a conserved host factor called AhfA (Cpg_0475) that plays a crucial role in Cglu mycolic acid synthesis. Ultimately, this work implicated the importance of mycomembrane porins, arabinogalactan, and mycolic acid synthesis pathways in the assembly of the Corynebacteriales envelope.

      Strengths of the research:

      • Language choice: A major strength of the paper is that this could easily be given to an undergraduate student with introductory knowledge of biology and they would still be able to get the gist of this paper. The language is written in a clear, concise fashion with explanations of terms not everyone would immediately know unless they worked in the field specifically.

      • These figures are generally explained in a direct manner, clearly stating the major conclusions the reader should get after carefully analyzing the presented data

      We thank the reviewer for the enthusiasm for our work and our description of it.

      How the research could be strengthened:

      • It could be worthwhile to describe some of your results mathematically. For example, the differences you see in your phage infections relating to the differences in logs, etc. Bar graphs also should be described in mathematical terms, when "something is lower compared to the WT," how much is lower, etc?

      To keep the text streamlined, we refrained from adding descriptions of the results mathematically in the text. The reader can refer to the figures to get the magnitudes of any changes observed.

      • There were no p values relating to the statistical significance of any of the data presented, which should be changed for the final manuscript implicating the importance of this work.

      We added the p-values as requested.

      • Figure 8 was not entirely supported by the data, especially Figure 8A which either could be improved with better images that support the author's claims, etc.

      We do not understand why the reviewer believes that Figure 8A does not support our conclusions. The mutant cells do not label with the 6-TMR-Tre dye whereas the WT control does. The dye labels mycolic acid such that our conclusion that AhfA is involved in mycolic acid synthesis is valid. In any case, we have included an additional supplementary source data file of the uncropped image of the 6-TMR-Tre treated cells to show a larger number of mutant cells that fail to stain, further supporting our conclusion.

      Reviewer #2 (Public Review):

      In this manuscript, McKitterick and Bernhardt use genetic approaches to investigate genes in Corynebacterium glutamicum that are required for efficient phage infection. They make use of a high-density transposon library that was generated in the Bernhardt lab recently. They challenged the library with two phages, CL31 and Cog. Importantly, they elegantly adapted the phages to the laboratory strain MB001 beforehand. The MB001 strain is ideal for genetic experiments since all prophage elements were removed in this strain. The evolved phages are likely a very useful tool for further investigations aiming to understand host/virus interactions in this model. The phage-infected libraries were plated and the collected colonies were sequenced. Genes involved in efficient phage infection had multiple transposon insertions. Using this method the authors identified specific genes required for infection with Cog and CL31. The Cog phage apparently needs the porin proteins in the mycolic acid membrane for efficient infection and the authors speculate that the porins may act as auxiliary receptors for phage adsorption. Furthermore, genes involved in putative arabinogalactan modification were found to be important. Mutations in these genes did not abolish phage adsorption, and the genes thus play a role in viral genome injection. For phage CL31 the authors show that in particular genes involved in mycolic acid synthesis are essential. The genes identified include one coding for a protein involved in protein mycoloylation. A candidate for such a lipidation is the porin protein complex PorAH. The trehalose-6-phosphate synthase OtsA was also identified as important for phage infection. Although OtsA is strictly required for the establishment of the mycomembrane, otsA deletions are viable in C. glutamicum. As part of their analysis, they also identified an unknown factor in mycolic acid synthesis in C. glutamicum. Analysis of a spontaneous resistant mutant to CL31 revealed a mutation in cg_0475 (renamed ahfA). Deletion of ahfA drastically reduced mycolic acid production. This was proven by thin layer chromatography and fluorescent staining. Interestingly, deletion of ahfA also results in a cell morphology defect, indicating the importance of a correct mycolic acid layer for cell shape.

      In summary, the authors provide an excellent paper that is clearly written and experiments are conducted nicely.

      We thank the reviewer for their kind words and enthusiasm for the work.

      Reviewer #3 (Public Review):

      In their manuscript, McKitterick and Bernhardt perform a screen to determine host factors, such as receptors, which are important for bacterial viruses (phages) to infect Corynebacterium glutamicum, an organism that shares the unique membrane of mycobacteria (mycomembrane) with M. tuberculosis. To do so, they challenge a previously described Tn-seq library with a high MOI of 2 phages, CL31 and Cog. The surviving strains are those in which genes important for phage infection (such as receptors) are disrupted. The authors' screen is successful, and the authors identify and validate several factors important for the infection of each phage, providing the first such screen in Corynebacterium. Moreover, the authors perform a suppressor screen to identify additional factors and experimentally follow up several genes of interest. Finally, the authors use the newly determined host specificity of the phages to implicate new genes in mycolic acid synthesis. As a whole, this is a strong work that paves the way to a deeper understanding of Corynebacterial and (by extension) Mycobacterial phages and should be of broad interest.

      Below, we suggest additional analyses, context, and elaboration that will help the ms. to fully realize its impact.

      Major points:

      1. Although the authors' experimental design is fundamentally sound, I am worried about the possibility of "jackpotting" in shaping their results, particularly in the uninfected control experiment. If the authors' Tn-seq library is ~200,000 strains, and they don't plate at least 10-100 times that many colonies, then any given strain (regardless of its phenotype) may or may not be represented in the output of the experiment, causing false phenotypes to be ascribed to genes based on chance. This is particularly a problem for the uninfected control, where the authors choose to dilute the culture 1000-fold to mimic the number of colonies that survive infection. They may be better served by plating the whole culture on the plates, to ensure adequate representation of the library. Part of the reason for this concern is that an overwhelming majority of statistically significant hits (something like 80-90%) appear to confer susceptibility rather than resistance (source data Fig 2), something the authors' experimental design should not be able to measure. The lack of accurate representation of distributions of strains in the starting culture also calls into question the quantitative differences they present in the results.

      We thank the reviewer for their thorough analysis of our experimental design. The Tn-Seq experiments were repeated with the uninfected controls plated at a density that maintains the representation of the original library. The overall results are largely unchanged because we maintain our focus on hits that become greatly enriched following phage infection not those that become depleted. The vast majority of these hits were validated for their involvement by constructing mutant strains, indicating the robustness of the current and previous analyses. With respect to the depletion of insertion mutants, we mentioned in the original submission that they are unlikely to be biologically meaningful.
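
      The representation concern raised in this point can be made concrete with a small calculation. The sketch below is illustrative only and is not part of the authors' analysis: it assumes every strain in the library is equally abundant and that each colony is an independent draw, and the fold-coverage values are hypothetical rather than the plating densities actually used. Only the ~200,000 library size comes from the reviewer's comment.

```python
import math

def prob_strain_missing(library_size: int, colonies_plated: int) -> float:
    """Probability that one particular strain yields no colony at all.

    Assumes equal strain abundance and independent sampling of colonies
    (both are simplifying assumptions, not measured properties).
    """
    return (1.0 - 1.0 / library_size) ** colonies_plated

def expected_fraction_represented(library_size: int, colonies_plated: int) -> float:
    """Expected fraction of the library that appears at least once on the plates."""
    return 1.0 - prob_strain_missing(library_size, colonies_plated)

LIBRARY_SIZE = 200_000  # approximate size quoted in the reviewer's comment

# Hypothetical coverage levels, expressed as colonies per library strain.
for fold in (0.001, 1, 10, 100):
    colonies = int(LIBRARY_SIZE * fold)
    frac = expected_fraction_represented(LIBRARY_SIZE, colonies)
    print(f"{fold:>7}x coverage ({colonies:>10,} colonies): "
          f"{frac:.4%} of strains expected on the plate")
```

      Under these assumptions, plating as many colonies as there are strains still leaves roughly a third of the library unsampled (about exp(-1)), which is why coverage of 10-fold or more is usually the target.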

      a. L138. Where the authors describe their initial experimental design it would be helpful to add more details. What is the size of the Tn library? What is the coverage in their experiment? Approximately how many colonies are recovered on the plates after phage infection and in the uninfected control?

      This information has been added (Fig. 2 table supplement 1).

      b. It is important to know how the number of colonies on the plates compares to the number of reads in the experiment. In the analysis of most HT screens, one implicitly assumes that each read corresponds to 1 cell, hence each read can be treated as statistically independent. This assumption is critical to the statistical methods used to analyze this data. By scraping a plate of colonies (which may be required for efficient phage infection), the authors potentially violate this assumption (since the number of cells >> number of colonies, which are the actual statistically independent entities in the experiment). Does this assumption hold (or approximately hold) for the screen? If not, a different statistical method should be used to determine p-values.

      We respectfully disagree with the reviewer on this point. In our view, a slurry of colonies from a plate is no different than a culture. Both contain a mixture of cells containing an array of different transposon mutants each represented multiple times in the population due to replication of the original mutant. We do not think there is any meaningful difference to the analysis whether this replication occurs in liquid or on a plate. In both cases, a read corresponds to a single cell/molecule of purified genomic DNA from the population.

      1. The authors' Tn-seq methodology is different from previously published HT-phage screens (e.g. Mutalik et al., 2020 and Rousset et al., 2018). Based on my knowledge of classical phage biology, I agree that plating the infected cells has advantages. However, the rationale will not be clear for most people performing such experiments. Please explain the rationale for the experimental protocol.

      Although the authors of the Mutalik et al. paper did do competition experiments in liquid over several infection cycles, they also made use of a solid plate-based assay in which they adsorbed their phages to the library cells for 15 minutes before plating. These plates were incubated overnight and resistant colonies were scraped, pelleted, and DNA prepped in a similar manner to the approach we took.

      We prefer plating over liquid growth because colony formation is an easy way to ensure that the mutant population has undergone numerous rounds of doubling under a given condition before the analysis is performed.

      a. Why did the authors plate the cultures after initial phage adsorption instead of remaining in liquid?

      We were concerned that some potential receptor-related mutants would be less fit and would therefore be lost in a competition experiment. As such, plating after phage adsorption would decrease the competition between phage survivors. Furthermore, we thought that plating would additionally ensure that the bacteria that are sequenced are true survivors and not just reflect remnant DNA in the culture.

      b. How reproducible are the authors' Tn-seq results? The SRA accession shows multiple replicates but this is not described in the manuscript nor reflected in the supplementary data. Given the potential for bottleneck and jackpotting effects in this assay, some measure of reproducibility is important for interpreting the results (see point 1).

      We performed completely new Tn-seq experiments for each phage in duplicate. The hit lists remained largely unchanged from our initial analysis and those that were investigated further were enriched for insertions in both new data sets. Thus, the results are highly reproducible.

      c. L587 "Significant hits with fewer than 10 insertions on each strand were manually removed." Why did the authors choose this criterion? Almost all of the genes they removed have very asymmetric distributions (e.g. in the Cog experiment, cgp3051 has 47853 fwd reads and 6 rev reads). Asymmetric distribution of insertions suggests that overexpression of downstream genes has an important (positive or negative) effect. This is a worthwhile pursuit, and many automated analysis pipelines can disambiguate these effects, including those developed in the Walker Lab (e.g. doi: 10.1038/s41589-018-0041-4). These genes shouldn't be thrown away when they are arguably some of the most informative hits!

      We have updated the criteria we used for selecting the most impactful insertion enrichments. Our concern in this report was to investigate mutants that affect phage infection when inactivated. We will pursue genes that affect phage infection when overexpressed (as indicated by asymmetric insertion orientation distributions) in a follow-on study. We think such a study would best be carried out with a different transposon containing a strong outward facing promoter.
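
      As a rough illustration of the filter discussed in this point, the sketch below flags hits by per-strand counts and reports how skewed they are. It is not the authors' pipeline: the `GeneHit` class, the function names, and the reading of the criterion as "keep a hit only if both strands reach the threshold" are assumptions made for this example. The threshold of 10 and the cgp3051 counts are taken from the reviewer's comment; the second gene is invented.

```python
from dataclasses import dataclass

MIN_PER_STRAND = 10  # threshold quoted from the methods text (L587)

@dataclass
class GeneHit:
    gene: str
    fwd: int  # count on the forward strand
    rev: int  # count on the reverse strand

def passes_strand_filter(hit: GeneHit, min_per_strand: int = MIN_PER_STRAND) -> bool:
    """Keep a hit only if BOTH strands reach the threshold (assumed reading)."""
    return hit.fwd >= min_per_strand and hit.rev >= min_per_strand

def strand_bias(hit: GeneHit) -> float:
    """Fraction of counts on the forward strand; 0.5 means perfectly balanced."""
    total = hit.fwd + hit.rev
    return hit.fwd / total if total else 0.5

hits = [
    GeneHit("cgp3051", 47853, 6),          # counts cited by the reviewer
    GeneHit("hypothetical_gene", 120, 95),  # invented, balanced example
]

for hit in hits:
    status = "kept" if passes_strand_filter(hit) else "removed (asymmetric)"
    print(f"{hit.gene}: forward fraction {strand_bias(hit):.4f}, {status}")
```

      Reporting the forward fraction alongside the keep/remove decision makes it easy to set aside strongly one-sided genes for the follow-on overexpression study instead of discarding them silently.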

      1. There is a somewhat extensive phylogeny of M. smegmatis phages (phagesdb.org). Are the phages that the authors work on related to any of these phages? If so, what cluster do they map to? What is the host range of other phages in that cluster? If not, may be worthwhile to mention that these are quite distinct from other studied phages.

      We agree that the phylogenetic history of corynephages is quite interesting. Very few phages that infect Cglu have been isolated and sequenced, let alone studied. Neither Cog nor CL31 share significant nucleotide identity with other sequenced phages, thus they do not have assigned clusters at the moment.

      1. Given that cgp_0475 was a strong hit in the Tn-seq, why was it not identified in the previous chemical genomics experiments from the lab (https://doi.org/10.7554/eLife.54761) ?

      We appreciate the reviewer’s interest in previous work from the lab. In the prior phenotypic analysis, cgp_0475 was identified as having severe fitness defects across many conditions. However, it was not possible to correlate its phenotype with other genes involved in mycolic acid synthesis like pks and fadD2 because they were found to be so sick in the phenotypic outgrowth that they were classified as essential.

      1. Is there any relationship between the growth-rate of the mutants and their phage susceptibility? This can be analyzed using the authors' previous studies of this library.

      While some of the phage-resistant mutants are associated with poor fitness (namely those involved in mycolic acid synthesis), not all were associated with decreased growth. For example, there were minimal fitness defects associated with deletions of either porAH or the genes involved in GalN decoration. However, loss of these genes greatly inhibited the ability of Cog to infect.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In the manuscript entitled "Long-term mitotic DNA damage promotes chromokinesin-mediated missegregation of polar chromosomes in cancer cells," the authors propose that DNA damage on mitotic chromosomes causes chromokinesin-mediated polar chromosomes, which eventually results in missegregation and micronuclei formation. They first performed screening of compounds that cause DNA damage on mitotic chromosomes and found that DNA damage delayed mitosis in the nocodazole wash-out experiment. The authors found that several DNA damage-inducing compounds all caused an increase of asymmetric Mad1 localization on polar chromosomes. Using photoactivatable GFP-α-tubulin, the authors showed that α-tubulin stabilizes after Etoposide treatment. They finally showed that chromokinesin Kid and Kif4a knockdown rescues the asymmetric Mad1 localization.

      Major comments:

      1. Page 6, line 155: the authors claim that "In contrast, among other defects, treatment with any of the DNA-damaging compounds caused a significant mitotic delay due to the presence of misaligned chromosomes near the spindle poles." Although Figure 2A shows a representative image of polar chromosomes, I do not find quantitative data that analyze %polar chromosomes in mitosis treated with DNA-damaging compounds. I also do not find the data supporting the claim that polar chromosomes caused a mitotic delay. Because most subsequent analyses were performed based on this result, the quantitative data should be provided here. For the latter, I suggest showing "time in mitosis (Fig 2B)" separately with or without polar chromosomes.
      2. According to Figure 2C, the ratio of "Exit with micronuclei (from misaligned chromosome(s))" is relatively low compared to other phenotypes such as "Mitotic arrest" or "Cell death." I wonder if polar chromosome phenotype is also correlated with these other cell fates. Please clarify which fate is correlated with polar chromosome formation after DNA damage.
      3. In Figure 3, the authors used Nocodazole-treated background to assess the involvement of SAC in DNA-damaging compound-induced mitotic delay. However, as shown in Figure 2B, DNA-damaging compounds cause a minor delay in mitosis, which might be challenging to analyze in the presence of Nocodazole. There is also a possibility that DNA damage response (DDR) works independently and adjunctly to delay mitosis. Because one of the major claims of the authors is that "the SAC is the only mechanism that is required to delay mitosis in the presence of long-term mitotic DNA damage (page 10, line 278)", I recommend Nocodazole wash-out (as in Figure 2B) to examine the effect of MPS1-IN-1 (and ideally an inhibitor of the DDR pathway, such as ATMi) on mitotic delay induced by DNA-damaging compounds.
      4. Line 226, (our unpublished observations): because the authors claim that "the formation of polar chromosomes due to the stabilization of kinetochore-microtubule attachments upon long-term mitotic DNA damage is likely exclusive to cancer cells," the authors should present data on RPE-1 cells at least for %polar chromosome formation (as suggested in comment 1) and Mad1 localization. Plus, even if the data are provided, the statement "exclusive to cancer cells (page 8, line 230)" is speculative and should be toned down. Mad1 localization data is also important because the authors claim that "long-term mitotic DNA damage specifically stabilized kinetochore-microtubule attachments in cancer cells (page 10, line 288)" in the discussion.
      5. For the Mad1 assay, such as in Fig. 4A, the authors analyzed the CENP-C pair with two or one Mad1 foci formation. However, in some representative pictures, for example, Fig S4A-Etoposide, I found pairs of CENP-C signals on the polar chromosome without any Mad1 foci (the one next to the pairs shown in the square). As the authors argue, these kinetochores may represent polar chromosomes that eventually satisfy SAC and may be important. I, therefore, wonder why those kinetochores are omitted from the assay. Please explain this point in the manuscript if there is any reason.

      Minor comments:

      1. Page 7, line 168: the authors claim that "regardless of the type of DNA lesion, long-term mitotic DNA damage persists throughout mitosis and promotes micronuclei formation from polar chromosomes." However, the former claim is not fully supported by Figure S3, which addressed the effect of Etoposide only; the latter claim is not fully supported by Figure 2C, which lacks clarity (as pointed out in comment 2) and statistical analysis. Please revise this sentence.
      2. Line 182: it would be helpful for readers to explain why MG132 was used.
      3. Line 210: it would be helpful for readers to explain briefly what PA-GFP means and how the assay works.
      4. Figure 6E-G: I wonder whether siKid+siKif4a affected %polar chromosomes or not.
      5. Page 10, line 287: the authors claim that "we show that long-term mitotic DNA damage..., causing the missegregation of polar chromosomes due to the action of arm-ejection forces by chromokinesins,...." However, only Mad1 localization data is provided in Figure 6E-G, and whether siKid + siKif4a rescues the missegregation of polar chromosomes is not clear. The authors should either provide supporting evidence or revise this sentence for clarity.
      6. Figure 1E: some color codes for each compound are difficult to distinguish. I also found it challenging to locate some lines on the graph. I recommend separating this graph, for example, by types of DNA lesions caused by compounds, and color codes that are easy to distinguish should be used.

      Referees cross-commenting

      I generally agree with other reviewers' comments and confirmed that they raised similar concerns.

      Significance

      It has been described previously that mitotic arrest induces DNA damage and that the DDR pathway during mitosis is attenuated. The data presented in this manuscript provide a potentially novel cellular response against DNA damage during mitosis. The manuscript will be of interest to those in the field of the cell cycle (especially mitosis), the DDR, and tumor chemotherapies. While the finding that DNA damage during mitosis causes polar chromosomes is potentially interesting, the manuscript is still rather descriptive, and data that address the molecular mechanism is insufficient for the level that the authors conclude. Although the data quality is high, I think some essential data supporting their conclusion and clarity of the description are missing from the manuscript, which can be addressed before publication.

    1. this is supposed to be science fiction. escapism! exploring the boundaries that we can’t explore in polite shitty society. and not one character in your entire novel is trans, gay, ace, or queer? you may be excused, old dead white dude, for being born in the early twentieth century when your very exposure to such ideas would have been oppressively policed. like ianthe says, i can respect that but i can’t admire it fade into obsolesence pls kthx

      j’adore.

      I wonder when the last time was that I read a book written by a man. Maybe the Adventure Zone comics? Ah, and Yeats. But I feel a bit as though… I am willing to humble myself before some authors to see what I must expand my view to understand. But for white dudes I do not extend very much Benefit Of Doubt unless I have a lot of social context telling me they are worth it. This has been working out pretty okay. I still read about cis dudes and they aren’t always boring, so I think it’s possible to keep tuning an approach until you’re finding stuff that works along multiple dimensions; it isn’t entirely zero-sum.

    1. Further, with the hierarchical power dynamic neutralized clients have the freedom to provide input and tailor treatment goals. They may even feel safe to correct or interject if practitioners are off-course to ensure the accuracy of information. Finally, practitioner humility, flexibility, and openness allow for adjustments during treatment to optimize outcomes. The end result is motivated clients who are heard, validated, and empowered

      Some clients will actually fight you to put you in a power position over them. There is something which they may find relieving or comforting in surrendering that empowerment, or perhaps they just believe they don't have the right to expect a non-hierarchical relationship with their counselor. Often when we are asked for advice as counselors it is actually a bid for permission for the client to surrender their will, intentions and decisiveness to you. That may have characterised familiar abusive relationships for them in the past. Or it may be a way for them to avoid looking inside or taking up their power. But if you give advice too readily, you might end up responsible if things don't turn out right. What if things don't turn out right for them because it wasn't what they wanted and they get hurt? Even if they don't blame you for this, you might actually be to blame. Cultural humility is partly having the ability to recognize that we can accompany and help people quite a lot, but we cannot rescue them or think for them, or want for them or know better for them. We are actually fighting for a capacity within the therapeutic relationship: to build connections of equality and empowerment.

    1. All this is true, but all this is not all the truth.  What the older scientific men did not see -- what Newton did not see, as he looked to the perfect order of the heavens -- what Cuvier did not see, when he dwelt so fondly on the teleology seen in every part of the animal structure -- what Paley did not see, when he pointed out the design in every bone, in every joint and muscle -- what Chalmers did not see, when in his astronomical discourses he sought to reconcile the perfection of the heavens with the need of God's providing a Saviour for men -- has been forced on our notice, as naturalists have been searching into animal life, with its struggles and its sufferings

      Again, I think McCosh is highlighting that all of these scientific men in varying fields have found answers to certain questions we have about the universe and ourselves. They may have uncovered some truths through the practice of research, experimentation, and other scientific tools, but they still cannot account for the larger aspect of their existence, who created them and how were they made from the start? Only a supreme being could be responsible. So, he's saying, yes some of what you're saying/ found, identified, uncovered is true, but it's not the entire truth. You still can't account for the innate, and there is where we find God.

  6. Sep 2022
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      See cover letter for more details.

      Summary of response to reviewers:

      We were immensely pleased that the reviewers considered our conclusions “well supported” and our study “beautifully executed”. Reviewers also recognized the significance of our work. Reviewer 1 stated that “building a model that describes one of these pathways will allow us to begin to test therapies to treat or prevent scoliosis” then noted that we “help to build a larger model of normal spine morphogenesis” and that this is “important”. Reviewer 2 called our work an “exciting advance in our understanding of one of the essential signaling pathways that help regulate body axis straightening and spine morphogenesis in zebrafish” and mentioned that our work “may also help to further our understanding of the etiology and pathophysiology of multiple forms of neuromuscular scoliosis in humans”. Reviewer 3 agreed, stating that our work “adds important information on the role of urotensin signaling in spine formation” and noted that it is timely: “findings are of special significance in the light of recent reports that mutations in UTS2R3 show association with spinal curvature in patients with adolescent idiopathic scoliosis”.

      We thank the three reviewers for reading our research and providing feedback. In all cases, we have incorporated (or plan to incorporate) their suggestions, and we believe this has made (or will make) our manuscript much stronger. Indeed, reviewers had only a small number of “major points”, and all are easily addressed as summarized below. We have already addressed some of those “major points”, as well as the majority of “minor points” raised by reviewers, in our current draft. We expect that all comments can be fully addressed within around 1 month.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.


      We have divided our responses by whether the reviewers considered their points major or minor. All points have already been, or will soon be, fully addressed.


      Major points


      Reviewer 1


      The key conclusions are well supported, see below for my two major issues.

      Please don't call this lordosis. Lordosis or hyperlordosis affects lumbar vertebrae. The curve in the lumbar region shifts body weight so that human gait is more efficient than that in the great apes, or so the story goes. Zebrafish do not have lumbar vertebra equivalents or a natural curve in the caudal region. Similarly, fish do not have the equivalent vertebrae to generate kyphosis, which is again a hyperflexion of a normal human spinal curve. Instead zebrafish have Weberian, precaudal and caudal vertebrae. It would be so much more useful for the field if the authors used these terms and specified ranges, i.e. numbered vertebrae, that are affected so we can directly and accurately compare regions of defects between zebrafish mutants. It would help to make the point that the uts2r3 mutant has more caudally located curves than urp1/2 double mutants.

      We appreciate this point and agree with the reviewer. Lordosis (or hyperlordosis) is indeed the accentuation of a curve which naturally exists in humans but not zebrafish. We called the phenotype of Urotensin pathway mutants ‘lordosis’ or ‘lordosis-like’ because of the position of the curves: they lie in caudal vertebrae, which are evolutionarily and positionally equivalent to lumbar vertebrae, though they are structurally different to human lumbar vertebrae. To address this comment, we will no longer refer to the phenotype as lordosis in our Introduction or Results sections and we will expand our Discussion to include this point raised by the reviewer.

      1. The observation that urp1/2 double mutants have curves only in the D/V plane and almost completely lack side-to-side curves is noteworthy. Does the urp1-/-urp2-/- mutant uncouple two systems for posture? If this separates a D/V from a side-to-side postural control system, that would be very interesting. It is particularly important to describe how penetrant the phenotype is and how many times it was observed. See 9 minor comments. It would help the reader if the authors explicitly described the features that they see in the cfap298 mutant that constitute lateral curves and that are lacking in urp1/2 (e.g. in figure 4E).

      We plan to expand the figure and analysis describing D/V curves and M/L curves. While our first draft included only cfap298 and urp1-∆P;urp2-∆P mutants, our next draft will also incorporate uts2r3 and pkd2l1 mutants. We have already scanned cohorts of all mutant fish, and so the remaining work to render and quantify the degree of lateral curvature will not take long. This will allow us to conclusively determine whether these different mutations indeed uncouple two systems controlling posture in different directions. As the reviewer requests, we will include all fish analyzed in either main or supplementary figures, include numbers in figure legends, and quantify the penetrance of M/L and D/V curves.

      We have also generated cfap298;urp1-∆P;urp2-∆P triple mutants and are currently scanning them to reveal skeletal form. Preliminary data suggests triple mutants have three-dimensional curves but D/V curves are more severe in triple mutants than in cfap298 mutants alone. This makes sense if Urp1/Urp2 are important for controlling D/V spinal shape and, as our qPCR shows, Urp1/Urp2 are downregulated but not lost completely in cfap298 mutants. It also furthers the notion that cilia motility controls D/V and M/L curves by separable mechanisms.


      Reviewer 2

      Need to show that the CRISPANT targeting was effective for mutagenesis at each locus screened in the work presented in Figure 1E.

      In Figure 1E, we presented the phenotypes of crispant embryos (i.e. embryos injected with four gRNAs targeting a specific gene alongside relatively high doses of Cas9 protein; see schematic in Figure 1G). In positive controls (cfap298 and sspo), crispants showed the expected phenotype in all cases (Figure 1E and see Figure 1H for quantitation). As for germline mutants, urp1 and urp2 crispants showed no early axial phenotypes (Figure 1E and 1H). As such, the reviewer requests that we perform molecular assays to determine whether mutagenesis was successful in these embryos. To do so, we will perform either T7 assays or next-generation/Sanger sequencing of mutated loci. This will allow us to determine and quantify the effectiveness of our mutagenesis. Results will be shared in a new supplementary figure. These assays are straightforward and we expect they will not take very long to complete. Indeed, we have performed these assays previously for other genes (e.g. Grimes et al., 2019 and several unpublished genes). We have achieved high levels of mutagenesis in all cases, making us very confident that we will achieve similarly high levels of mutagenesis in this case.

      Reviewer 3


      The addition of the F0 crispant experiment to show that the pro-peptide of urp1/2 does not have a function and is responsible for the difference between the observed morpholino and the CRISPR phenotype was important. However, since no phenotype was observed in crispants it is important to add evidence of induced cuts for all guide RNAs used in the crispant experiment. These control experiments might have been done already. If not, they can easily be done in a short period of time by performance of T7 assays on injected fish and would not require additional reagents.

      This is the same point raised by reviewer 2 and so we refer to the response above. In summary, we agree with the reviewer and we are currently performing these suggested experiments which are straightforward and working well.

      The authors claim that there were no structural defects observed in urp1/2 double mutants. However, the hemal arch in figure 3 E seems to be deformed. This could be normal variance or a phenotype. This can be addressed by simple reinspection of the scans.

      We believe there are no major vertebral structural defects that could be attributed to causing the spinal curves because vertebrae are well-formed in mutants and we see no defects in the initial patterning of vertebrae in our calcein experiments. However, since urp1-∆P;urp2-∆P and uts2r3 mutant spines are curved, the vertebrae are a little misshapen. We plan two revisions, one textual and one analytical.

      First, we will make clear in our textual edits that some vertebrae are slightly misshapen, as occurs in non-congenital forms of human spinal curve disease (in congenital forms, the shape defects are more striking and likely causative in the curvature). We agree with the reviewer that stating that there is a lack of vertebral structural defects lacked nuance, so we will expand on this in our next draft.

      Second, we will quantify vertebral shapes in spinal curve mutants and report these data in our next draft. After reinspection of the scans, as the reviewer suggested, we believe it would be informative for our readers to see quantitation of vertebral shape. We expect these data to more rigorously back up our statements about ‘minor structural differences’ of vertebrae between uncurved and curved individuals. We have already begun this work, and completing it should only take a few more weeks. As an example, we have measured the shape of centra by calculating aspect ratios in wild-type and urp1-∆P;urp2-∆P double mutants in curved regions of the spine:

      These preliminary data already make clear that there are indeed subtle morphological differences between vertebrae in mutants and wild-type, as occurs in human spinal curve deformities. We will present completed versions of these data (several parameters that describe vertebral shape) in our next draft and provide comments about whether such changes could be causative in spinal curve etiology as occurs in congenital-type scoliosis.

      Minor points


      Reviewer 1

      Supplementary FigS3B How to measure the Cobb Angle is unclear. Why is the first curve not counted? I count 3 curves. First a ventral displacement, then a dorsal to ventral return, then a sharp flex before the tail. How to measure Cobb angle might be easier to explain if the figure is expanded into steps. Identify the apical vertebra, then showing how the lines are drawn parallel to those vertebrae, then where the measured angle forms between the lines perpendicular to the drawn parallel lines.

      We will more thoroughly explain how Cobb angle is measured in our next draft.
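
      As a minimal sketch of the geometry the reviewer describes (not necessarily the procedure used in the manuscript), the Cobb angle can be computed as the angle between two lines drawn along the end vertebrae of a curve. The angle between the two lines equals the angle between their perpendiculars. The function names and the point-pair representation below are illustrative assumptions.

```python
import math


def line_angle(p, q):
    """Angle in radians of the line through points p and q, relative to the x-axis."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def cobb_angle(upper_line, lower_line):
    """Angle in degrees (0 to 90) between two lines drawn along the end vertebrae.

    Each line is a pair of (x, y) points. The result does not depend on the
    direction in which either line was drawn.
    """
    diff = abs(line_angle(*upper_line) - line_angle(*lower_line)) % math.pi
    return math.degrees(min(diff, math.pi - diff))


# Hypothetical landmark coordinates, for illustration only.
upper = ((0.0, 0.0), (1.0, 0.0))
lower = ((0.0, 0.0), (1.0, 1.0))
print(round(cobb_angle(upper, lower), 1))
```

      Reducing the difference modulo 180 degrees means the result is the same whichever way the landmarks are ordered along each vertebra.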

      5a. I think we (zebrafish biologists) need to be explicit about what we mean by "without vertebral defects." What do we count as defects? Vertebrae can be fused, bent, shortened or the growing edges can be slanted. In Figure 3E, and movie7, it is clear that the highlighted mutant vertebrae are shorter than WT. The growing ends of normal vertebrae are perpendicular to the long axis of the vertebra. In the mutants the ends are slanted. Please define in the text what you consider a relevant vertebral defect, because these vertebrae have defects. Or are you only considering the calcein stained centra at 10dpf?

      We strongly agree with the reviewer. As described more thoroughly above in response to Major Comment – Reviewer 3, we plan both textual edits and new quantitation of vertebral shape to address this comment. Our quantitation indeed shows some vertebrae are shorter in mutants as the reviewer noticed. We also plan a new paragraph in the Discussion section which will speak about the issue of what zebrafish biologists might mean by “without vertebral defects”.

      5b. Do you want to base your patterning conclusion primarily on the calcein data, as these are closer to the notochord patterning time window? Please anchor this conclusion to a specific time or standard length, e.g. 10dpf/5.6mm.

      When we edit our descriptions of vertebral defects, and include new quantitative data on the shape of vertebrae, we will be clear that the vertebrae are slightly structurally malformed. In addition, when we speak of the calcein data, we will anchor those conclusions to the specific timepoint best studied by this method, as the reviewer suggests.

      "At 30 dpf... several mutants exhibited a significant curve in the pre-caudal vertebrae, in addition to a caudal curve (Fig. 3D and S3C). Since pre-caudal curves were rare in mutants at 3-months, this suggested that curve location is dynamic". The frequency of this observation is important. Does it affect all or a fraction of mutants? Can you provide some numbers to anchor these observations? Maybe fractions, e.g. 3 of 4 fish had precaudal curves at 30 dpf, and 0 of 10 fish had precaudal curves by 3 mpf?

      In our next draft, we will provide numbers of fish examined at 30 dpf and also show graphical summaries of curve position (as we did for younger fish). Last, all scans will be included in a new supplementary figure.

      In the description of the pkd2l1 mutant, instead of terming it kyphosis, can you tell the reader the vertebra number at the peak of the curve? The authors say the pkd2l1 mutant is highly distinct from urp1/urp2-/-, but the reader needs to hear exactly what is distinct. For example, does this mutant have both lateral and D/V curves?

      We have now scanned several pkd2l1 mutant fish and we will include images of pkd2l1 mutants at two different timepoints together with quantitation of curve position. Our results agreed with those previously published for this mutant line (Sternberg et al., 2018) but we believe it is important for our readers to see side-by-side images and quantitation so they can see the distinctions.

      At 3-months of age, pkd2l1 mutants essentially appear wild-type but by around 12-months they have developed a D/V curve in the pre-caudal vertebrae. They do not exhibit M/L curves; we will quantify this and include these data in our Figure about M/L deviation.

      We called the phenotype displayed by pkd2l1 mutants “kyphosis” to be in line with a previous publication describing these mutants (Sternberg et al., 2018). We will add new wording in the Discussion about whether or not zebrafish can truly model kyphosis and lordosis (see response to Reviewer 1 major comment above), and we make clear in our Results that the phenotype has “been argued to model kyphosis (Sternberg et al., 2018)” rather than “is kyphosis”.

      It is intriguing that pkd2l1 mutants do not exhibit any curves until much later in life than urp1-∆P;urp2-∆P and uts2r3 mutants. Inspired by this finding, we aged urp1-∆P and urp2-∆P single mutants and found that they go on to develop D/V curves by 12-months, i.e.

      | Mutant | 3-months | 12-months | Position of curve |
      | --- | --- | --- | --- |
      | urp1-∆P | no curves | mild D/V curves | Mostly caudal |
      | urp2-∆P | mild D/V curves | intermediate D/V curves | Mostly caudal |
      | urp1-∆P;urp2-∆P | severe D/V curves | severe D/V curves | Mostly caudal |
      | uts2r3 | severe D/V curves | severe D/V curves | Mostly caudal |
      | cfap298 | severe 3D curves | severe 3D curves | Caudal and pre-caudal |
      | pkd2l1 | no curves | mild D/V curves | Mostly pre-caudal |

      The phenotypes of urp1-∆P and urp2-∆P single mutants upon aging show that: 1) Urp1 and Urp2 are not entirely redundant in long-term spine maintenance and 2) proper Urp1/Urp2 dose is essential. We will include these new data in our next draft.

      Does uts2r3-/- have no /minimal side-to-side curves like urp1/urp2-/-?

      This is an interesting question. To address it, we will add images of uts2r3 mutant spines from the dorsal aspect and include them with our new quantitation of lateral curvature. In sum, the reviewer’s suggestion is correct: there are minimal side-to-side curves in uts2r3 mutants.

      One finding that deserves more discussion is the observation that urp1/urp2 double mutants have almost no side-to-side defects and all the obvious bends are in the D/V plane. Does this uncouple two systems for posture? Please consider the following paper. It shows a proprioception system that maintains normal side-to-side posture. A spinal organ of proprioception for integrated motor action feedback. Picton LD, Bertuzzi M, Pallucchi I, Fontanel P, Dahlberg E, Björnfors ER, Iacoviello F, Shearing PR, El Manira A. Neuron. 2021 Apr 7;109(7):1188-1201.e7. doi: 10.1016/j.neuron.2021.01.018. Epub 2021 Feb 11. PMID: 33577748

      Thank you for pointing out this manuscript. We will include it in our expanded Discussion.

      Reviewer 2

      Fig 3F: might be improved by making the images black and white and possibly inverted. It is not easy to clearly see the vertebrae as is.

      Thanks for the suggestion, we will make this change.


      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Minor points


      Reviewer 1


      Figure 1D legend says urp1 is expressed in dorsal while urp2 is expressed in all CSF-cNeurons, but the image for urp1 shows only ventral cells in WT, while the image for urp2 shows the same cells ...and more dorsal cells. Please replace image with one that matches the text.

      Apologies for this, we have now corrected it. The image was correct but we accidentally wrote “dorsal” instead of “ventral” when describing the CSF-cN sub-population harboring urp1 transcripts.

      In Figure 2H, the position of curve apex graphic, how many fish were examined? In 2f it looks like n=8 and n=9. Can this info be added to the figure?

      We have now included the number examined in the legend.

      I did not find legends for the movies. The first call to the movies calls movies 1-3 without explaining what each shows. The labels on the downloaded files are not informative.

      Apologies for forgetting to submit these. We have now added informative Movie legends.

      Reviewer 3


      It would be helpful to the reader to add a little more information on urp1 and urp2 proteins and their processing to make it clear why only the 3' region of the protein was targeted to induce mutations.

      We have incorporated textual edits to make this more clear. We now state in the second sentence of the Results section:

      Urp1 and Urp2 are encoded by 5-exon genes with the final exon coding for the 10-amino acid peptides that are released by cleavage from the pro-domain (Fig. 1A).

      Together with Fig. 1A and Supplementary Fig. 1, we hope it is now clear to readers how Urp1 and Urp2 are generated from a 5-exon gene encoding the pro-domain and the peptide, which are separated by cleavage.

      It would also be helpful to the reader to have a schematic indicating the guide target sites (they could be added to figure S1 C + D) in the protein to be able to interpret the result more easily.

      Done!

      Figure 5: Addition of a square to H would help understand where the pictures in D-F were taken.

      Done!

      4. Description of analyses that authors prefer not to carry out

      N/A. We are performing all experiments/analyses requested by reviewers.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01574

      Corresponding author(s): Casey Greene

      1. General Statements [optional]

      We thank the reviewers for their thorough feedback. We have addressed all the points raised, revised the manuscript accordingly, and explained our changes below. To aid readability, the reviewers’ comments have been converted to italics, and our responses have been bolded.

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors systematically evaluate the performance of linear and non-linear ML methods for making predictions from gene expression data. The results are interesting and timely, and the experiments are well designed.

      I have a few minor comments:

      - It was hard for me to understand Figure 1B. I think a figure like this would be very helpful however. What do the numbers represent? If sample ID, then I am not sure why x-axis label is also "samples"

      - For analysis of GTEx data, not sure what "studywise splitting" would mean, since the GTEx dataset is one study? Do you leave out the same individuals from all tissues for evaluation?

      We thank the reviewer for their input on these two points. To make Figure 1B clearer and to elaborate on our stratified splitting methods, we have amended its description to “We stratify the samples into cross-validation folds based on their study (in Recount3) or donor (in GTEx). We also evaluate the effects of sample-wise splitting and pretraining (B).”
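
      The idea behind this stratification can be sketched as follows. This is a minimal illustration under our own assumptions (a greedy assignment and hypothetical donor IDs), not the code from the manuscript's repository: every sample that shares a study or donor is placed in the same cross-validation fold, so no study or donor appears in both training and validation data.

```python
from collections import defaultdict


def group_folds(groups, n_folds):
    """Return a fold index per sample, keeping each group within a single fold.

    `groups` holds one group label (study or donor ID) per sample.
    """
    sizes = defaultdict(int)
    for g in groups:
        sizes[g] += 1

    fold_of_group = {}
    fold_sizes = [0] * n_folds
    # Place the largest groups first, each into the currently smallest fold.
    for g, size in sorted(sizes.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        fold = fold_sizes.index(min(fold_sizes))
        fold_of_group[g] = fold
        fold_sizes[fold] += size

    return [fold_of_group[g] for g in groups]


# Hypothetical donor labels: six samples from three donors, two folds.
donors = ["d1", "d1", "d2", "d3", "d3", "d3"]
print(group_folds(donors, 2))
```

      Sample-wise splitting, by contrast, would assign folds per sample and could place samples from one donor on both sides of the split. scikit-learn's GroupKFold implements the same group-aware idea.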

      - I found the sample size on x-axis of Fig 2a confusing. If I understand correctly, GTEx has a total of ~1000 subjects. So in some sense, effective sample size can not be bigger than 1000. If you are counting subjects x tissue as sample, then it can be misleading in terms of the effective sample size.

      We thank the reviewer for this point. To incorporate it into the manuscript, we’ve added the following text to the description of Fig. 2: “It is worth noting that "Sample Count" in these figures refers to the total number of RNA-seq samples, some of which share donors. As a result, the effective sample size may be lower than the sample count.”

      - Would be interesting to assess out-of-sample generalizability of linear and non-linear models. Have you tried training on GTEx and predicting on Recount3 or vice versa?

      This question intrigued us. We reran the tissue prediction experiments from the manuscript on a subset of the GTEx and Recount3 datasets in which we performed an intersection over tissues and genes. We found that in the out-of-sample domain the logistic regression model and the three layer neural network performed similarly, while the five layer net generally had a lower accuracy despite having similar accuracy in the training domain. We also found (consistent with our results in the paper) that GTEx predictions are an easier task than their Recount counterparts. Below are plots demonstrating these findings:

      [These plots appear in the PDF but do not appear to work in the ReviewCommons Form].

      Reviewer #1 (Significance (Required)):

      Important and timely study, evaluating linear vs non-linear methods for predicting phenotype from gene expression datasets.

      We appreciate the reviewer’s positive comments on the timeliness of our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      The authors want to assess the presence of non-linear signal in gene expression values in the task of tissue and sex classification. They use logistic regression classifiers and two types of neural networks, with 3 and 5 layers, and assess classification performance on two large expression datasets from Recount3 and GTEX and three simulated datasets.

      The authors carefully construct their learning setup in such a way that one can reason about the removal of linear signal from the expression features. The interesting conclusion is that the linear approach works well on both datasets, sometimes even better than the more complex models, although the authors convincingly show that there is significant non-linearity in the gene expression data. However, just because it is "there" does not imply that any non-linear method performs better.

      Major comments:

      - Are the key conclusions convincing?

      The authors did a good job in showing, that there is non-linear signal in gene expression features for the classification problems studied.

      We thank the reviewer for their positive feedback.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The overall claims of the authors are justified, the discussion may be improved.

      We appreciate the reviewer’s support for our overall claims and we have adjusted the manuscript as noted point by point below.

      - Would additional experiments be essential to support the claims of the paper?

      No, additional experiments are not essential. But the authors did not compare to other non-linear methods such as SVM or knn-classifiers in the results or conclusion section. It is unlikely that the main conclusion would change if those methods were tried. But it is possible that other "simpler" non-linear methods, such as knn for example, are able to outperform the logistic regression classifier on the GTEX and Recount3 data set. Thus, the authors should at least mention this as part of the conclusion and could extend their discussion on the implications of their study concerning other tasks or models.

      We agree that there should be more discussion of other models in the conclusion section. We have updated the fifth paragraph of the conclusion accordingly:

      “We are also unable to make claims about all problem domains or model classes. There are many potential transcriptomic prediction tasks and many datasets to perform them on. While we show that non-linear signal is not always helpful in tissue or sex prediction, and others have shown the same for various disease prediction tasks, there may be problems where non-linear signal is more important. It is also possible that other classes of models, be they simpler nonlinear models or different neural network topologies are more capable of taking advantage of the nonlinear signal present in the data.”

      - Are the suggested experiments realistic in terms of time and resources?

      Not applicable.

      - Are the data and the methods presented in such a way that they can be reproduced?

      There is a separate github repo which has the code to reproduce the analyses. This is good. However, would be nice to explain in more detail in the manuscript how the limma function was used for removing the linear signal, as they mention the "removeBatchEffect" function was used, but it would be good to tell the reader how that works, as this is their way for assessing the effect of linear-signal removal. Are there any limitations for the assessment of signal removal in this way?

      We thank the reviewer for their input, and have updated the model training section on signal removal to read: “We also used Limma[24] to remove linear signal associated with tissues in the data. We ran the ‘removeBatchEffect’ function on the training and validation sets separately, using the tissue labels as batch labels. This function fits a linear model that learns to predict the training data from the batch labels, and uses that model to regress out the linear signal within the training data that is predictive of the batch labels.”

      We have also elaborated on the limitations of signal removal by updating the sentence “This experiment supported our decision to perform signal removal on the training and validation sets separately, as removing the linear signal in the full dataset induced predictive signal (supp. fig. 6)” to read “This experiment supported our decision to perform signal removal on the training and validation sets separately. One potential failure state when using the signal removal method would be if it induced new signal as it removed the old. This state can be seen when removing the linear signal in the full dataset (supp. fig. 6).”
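
      To illustrate the principle only, the sketch below is a simplified stand-in for the signal removal step and is not limma's implementation. It assumes the labels are the only covariate, in which case regressing each gene on one-hot label indicators amounts to subtracting that gene's per-label mean; the overall mean is added back so the expression scale is preserved. The function name and toy values are hypothetical.

```python
def remove_label_means(X, labels):
    """Remove the per-label mean of every feature while keeping its overall mean.

    X is a list of samples (each a list of floats); labels has one entry per sample.
    """
    n_features = len(X[0])
    overall = [sum(row[j] for row in X) / len(X) for j in range(n_features)]

    # Accumulate per-label sums and counts for each feature.
    sums, counts = {}, {}
    for row, lab in zip(X, labels):
        acc = sums.setdefault(lab, [0.0] * n_features)
        for j, v in enumerate(row):
            acc[j] += v
        counts[lab] = counts.get(lab, 0) + 1
    means = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    # Subtract the label-specific mean and restore the overall mean.
    return [[v - means[lab][j] + overall[j] for j, v in enumerate(row)]
            for row, lab in zip(X, labels)]


# Toy data: two "genes", two tissue labels.
X_train = [[1.0, 10.0], [3.0, 14.0], [7.0, 20.0], [9.0, 24.0]]
y_train = ["a", "a", "b", "b"]
print(remove_label_means(X_train, y_train))
```

      In line with the text above, such a function would be called once on the training set and once on the validation set rather than on the full dataset.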

      - Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      Minor comments:

      - Specific experimental issues that are easily addressable.

      no

      - Are prior studies referenced appropriately?

      Yes

      - Are the text and figures clear and accurate?

      Also, they conducted 3 different experiments in Figure 3. It would be useful to separate the figure into 3) A, 3) B, and 3) C and link that specifically in the text. Figure 4 is an extended version of Figure 2, just with the additional results of the signal removed performances.

      We appreciate the feedback. To make the figure and the text more clear, we have added A, B, and C subheadings to figure 3, and updated the subfigure’s references within the text accordingly.

      First, the pairwise results in 4B are hard to read as the differences in colors and line type are difficult to see as some lines are short. Second, we did not find it helpful to reproduce the full signal approach in Figure 4. We would suggest to make Figure 4 as Figure 2, and simply only talk about the Full signal mode in the beginning, how it is in the text.

      We agree. We have made Figure 4 our new Figure 2 and updated the references in the text.

      Further, it would be nice to give better names in the legends of these plots. Pytorch_lr is not a nice name.

      We thank the reviewer for pointing this out. We have updated the names in the legends to be “Five Layer Network”, “Three Layer Network”, and “Logistic Regression”

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      As the Recount3 dataset is different in quality and complexity it would be reasonable to show the results of the binary classification also in the main paper. In particular, as this behaves differently from the GTEX binary classification.

      We have now moved the Recount binary classification figure from the supplement to join the GTEx binary classification data as the new figure 4.

      -The title is somewhat imprecise. It may give the impression that the paper is about expression prediction, although that is not the case. Further, in the abstract they don't mention what prediction problem they solve and that these are classification problems. After reading the paper it is clear why the authors chose that, but we are suggesting an alternative title that the authors may consider:

      The effect of nonlinear signal in classification problems using gene expression values

      We agree with the reviewer’s comment and have updated our title to “The effect of non-linear signal in classification problems using gene expression”

      Further, they should give more details on the problem learned in the abstract.

      We thank the reviewer for their feedback, and have added details to the abstract about the problem domains. The relevant sentence now reads “We verified the presence of non-linear signal when predicting tissue and metadata sex labels from expression data by removing the predictive linear signal with Limma, and showed the removal ablated the performance of linear methods but not non-linear ones.”

      -In addition, the conclusion section, which may be titled Discussion and Conclusion, could contain additional points concerning the topology and training of the neural networks.

      We have updated the heading of the final section to Discussion and Conclusion. To expand on the potential drawbacks of our neural network topologies, we have also updated the limitation portion of Discussion and Conclusion to read “We are also unable to make claims about all problem domains or model classes. There are many potential transcriptomic prediction tasks and many datasets to perform them on. While we show that non-linear signal is not always helpful in tissue or sex prediction, and others have shown the same for various disease prediction tasks, there may be problems where non-linear signal is more important. It is also possible that other classes of models, be they simpler nonlinear models or different neural network topologies are more capable of taking advantage of the nonlinear signal present in the data.”

      Obviously, it is possible that other simpler or more complex neural networks have a better performance on the GTEX and Recount3 data sets compared to logistic regression. In fact, the results from Figure 4 suggest that, as there is clearly useful non-linear signal in those datasets for the classification problems studied. However, optimizing a non-linear model is inherently more complex and time-consuming, and thus may not be done thoroughly in previously published papers. Compared to a linear model that is easier and faster to optimize, this may be one reason why studies find that, despite non-linear signal, the linear model performs better. Other factors such as the sample size, which the authors already mention, of course also play a big role, and if hundreds of thousands of datasets were available, e.g. from single cell measurements, non-linear methods may have a better chance of outcompeting linear models.

      We agree, which is why we consider the signal removal experiment to be so important. By demonstrating that the non-linear methods we used were in fact learning non-linear signal we were able to show that there was something that non-linear models were able to learn that logistic regression was unable to. That is to say that while the presence of non-linearity in the decision boundary is necessary for non-linear models to outperform linear ones, it is not by itself sufficient. Perhaps with more data or a different model non-linear methods would perform better, but there is certainly a class of models and problems where logistic regression is preferable.

      Reviewer #2 (Significance (Required)):

      The submitted manuscript adds to the discussion of the necessity of non-linear models when solving classification problems using gene expression data. The significance is mostly technical, as logistic regression and two neural network topologies are compared on two large expression datasets. However, there is also a conceptual part of the contribution, which concerns the implications of their experiments.

      Interested audience would be computer scientists and bioinformaticians or others, that are involved in creating or interpreting these or similar prediction models.

      Our field of expertise is in the creation of machine learning models using different types of OMICs data. All aspects of the work could be assessed.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors discuss an interesting problem regarding the comparative performance of linear and non-linear machine learning models. The main conclusion is that logistic regression (linear model) and neural networks (non-linear model) have comparable performance if the data contain both linear and non-linear relations between the features (X) and the prediction target (Y); however, if the linear component in the X-Y relation is removed (e.g. regressed out), the neural networks will outperform logistic regression. This conclusion implies that linear models such as logistic regression mainly rely on the linearity in the X-Y relation.

      However, whether X-Y relation has a linear component and whether the data (e.g. for different Y classes) are linearly separable are two different questions. For example, consider a data generating mechanism, y=x^2+x and label the data points using two classes (y1). Clearly, the data is linearly separable, and any machine learning algorithm should perform very well on this problem. Now remove the linear component from the X-Y relation and use y=x^2 to generate the data. The data is still linearly separable, and the performance of logistic regression should not be affected.

      We agree that there is a difference between optimal linear decision boundaries and linear relationships between elements in the training data. Our use of the term “relationship” in place of “decision boundary” was imprecise. To make this more clear, we have made the following changes:

      Introduction:

      “Unlike purely linear models such as logistic regression, non-linear models should learn more sophisticated representations of the relationships between expression and phenotype.” -> “Unlike purely linear models such as logistic regression, non-linear models can learn non-linear decision boundaries to differentiate phenotypes.”

      “However, upon removing the linear signals relating the phenotype to gene expression we find non-linear signal in the data even when the linear models outperform the non-linear ones.” -> “However, when we remove any linear separability from the data, we find non-linear models are still able to make useful predictions even when the linear models previously outperformed the nonlinear ones.”

      Discussion and conclusion:

      We removed the following paragraph: “Given that non-linear signal is present in our problem domains, why doesn’t that signal allow non-linear models to make better predictions? Perhaps the signal is simply drowned out. Recent work has shown that only a fraction of a percent of gene-gene relationships have strong non-linear correlation despite a weak linear one [23].”

      The point is that the performance of linear models is mainly dependent on whether the data are linearly separable instead of the linearity in X-Y relation as the manuscript suggests.

      We agree that this is the key point and appreciate the reviewer for helping us to more carefully hone the language to convey this point.

      Reviewer #3 (Significance (Required)):

      The performance comparison between linear and non-linear machine learning models is important.

      We appreciate the reviewer’s recognition of the significance of the work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      We thank the reviewers for their excellent suggestions and constructive comments. We have now added new data on PE15/PPE20 binding to Ca2+, the PDIM status of mutant strains, and additional controls, expanded the Discussion, added detail to the Methods, and provided all RNA-seq data. Please see our detailed replies to the comments below:

      Reviewer 1:

      Major points

      1. Cellular localization:
      2. “The authors do not describe the cellular fractionation method…”, “The authors show some Western blot data in Fig. S3, though the legend is superficial (abbreviations not explained) and the controls with markers for cellular localization appear to be lacking”. “Further, the authors do not prove that FLAG-tagged PE20 is functional.”

      We included a description of the fractionation method in Materials and Methods (lines 475-485). We also added detail to the legend of Fig. 4A to explain the abbreviations and controls used. The same cell fractions were used in Fig. 4A and Fig. S3A, as mentioned in the Figure S3 legend (“The same cell fractions as in Fig. 4A were used, see controls therein”). We know that the FLAG-tagged PPE20 is functional because the strain used in this experiment is the same we used for genetic complementation experiments in which FLAG-tagged PPE20 functionally complements ppe20 deletion in all three assays (ATP consumption, biofilm, Ca2+ influx, Fig.4 B,C,D,G).

      • “The authors should extend discussion part of the manuscript. Several proteomic studies.” “Did authors analyze culture filtrate fraction by Western?”

      We thank the reviewer for the references and extended the Discussion to include results from existing proteomic studies on PE15/PPE20 (lines 229-234). We did not test for PE15/PPE20 in culture filtrate, and previous proteomic results are contradictory. Several PE/PPE proteins, including PE15/PPE20 have been detected in the cell wall and in the CFP, but not consistently. The functional significance of this dual localization is unclear.

      1. Is PE15/PPE20 a channel?

      2. “PPE20 purified alone from the cytosol of E. coli?”

      We did not purify either protein by itself. As the reviewer correctly notes, PE/PPE proteins are refractory to individual purification. We clarified that we purified and used the complex for experiments even if only PPE20 is shown, as in Figure 3C,D, and E (Lines 124-127). See also Methods line 382 ff.

      • “…a positive control of a mutant that is indeed deficient in Mg2+ import (and thus showing a phenotype) is lacking.”

      Lacking a specific Mg2+ import mutant, and because it is a relatively minor point, we removed the statements about selectivity.

      1. Thermal melting assay

      2. It is surprising to see that the thermal melting assays was done for PPE20 and PE15 as separately purified proteins.

      We co-purified PE15 and PPE20 for all biochemical experiments. We clarified that point (see also point 2 above).

      • “the thermal melting assay only seemed to give some results for PPE20 alone, and not for PE15”

      PE15 did not produce interpretable results in this assay, as mentioned in line 144. We clarified in the Fig. 3 legend that the complex was used although only PPE20 is detected by Western blot and shown in Figure 3C.

      • “…the results are counter-intuitive… How can the authors be sure that the presence of Ca2+ does not simply lead to more protein precipitation (via rather unspecific interactions) at elevated temperatures? Some positive controls with bona fide calcium binding protein in the same thermal melting setup would have helped to clarify this.”

      The effect of Ca2+ on PPE20 is somewhat counterintuitive, although not unprecedented. Proteins can be stabilized or destabilized by ligand binding, and a recent proteome-wide study on the basis of thermal shift analysis showed that ~17% of proteins were destabilized by ligand (ATP). For a channel in particular, ligand binding might be expected to be coupled to protein relaxation in the process of channel opening, which could well translate to lower thermal stability. We added the positive control showing the behavior of a known Ca2+ binding protein (new Fig. S2A). In addition, we included a negative control showing that Ca2+ does not generally increase protein denaturation (Fig. S2B). We think that this control addresses the reviewer’s concern more directly.

      • If the authors want to stick to their claims regarding Ca2+ binding to PE15/PPE20, they have to perform additional assays (e.g. equilibrium dialysis or ITC) with the entire PE15/PPE20 complex. Further, they have to show that PE15/PPE20 forms a proper oligomeric protein that is membrane bound and reasonably behaved on size exclusion chromatography, when expressed in and purified from E. coli.

      Detecting Ca2+ binding to proteins is not trivial, and we thank the reviewer for suggesting equilibrium dialysis as another, orthogonal assay. We now show an equilibrium dialysis experiment that confirms Ca2+ binding by the PE15/PPE20 complex. Please see the new Fig. 3F and G, and lines 146-152 (Results) and 429-443 (Methods).
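      As a rough illustration of how an equilibrium dialysis readout is commonly interpreted (a sketch under stated assumptions, not the analysis used in the manuscript): once the ligand has equilibrated across the membrane, the free Ca2+ concentration is taken to be the same in both chambers, so the excess in the protein-containing chamber is attributed to bound Ca2+. The function names and all concentrations below are hypothetical.

```python
def bound_ligand_uM(protein_chamber_total_uM, buffer_chamber_free_uM):
    """Bound ligand = total ligand in the protein chamber minus free ligand
    measured in the protein-free chamber (assumes full equilibration)."""
    bound = protein_chamber_total_uM - buffer_chamber_free_uM
    # Negative differences are treated as "no detectable binding".
    return max(bound, 0.0)


def ligand_per_protein(bound_uM, protein_uM):
    """Average number of ligand molecules bound per protein complex."""
    return bound_uM / protein_uM


# Hypothetical readings, for illustration only.
bound = bound_ligand_uM(protein_chamber_total_uM=14.0, buffer_chamber_free_uM=10.0)
print(bound)                                       # excess Ca2+ in the protein chamber
print(ligand_per_protein(bound, protein_uM=5.0))   # Ca2+ bound per complex
```

      An excess of ligand in the protein chamber relative to the buffer chamber is what indicates binding in this type of assay; equal concentrations in both chambers would indicate none.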

      The PE/PPE proteins are generally difficult to express and purify recombinantly, likely due to the typically large unstructured regions. Also, the yield of PE15/PPE20 when expressed in E. coli was very low so that we were not able to detect the complex by SEC. However, data in Fig. 3 conclusively show that PE15 and PPE20 bind.

      1. RNA-seq data

      2. The authors should include a table with all other identified genes that are potentially involved in calcium homeostasis

      We provided all other significant differentially expressed genes in the new Table S1.

      Minor points:

      1. “what is the binding affinity of the Ca sensor?”

      We added the Ca2+ binding affinity of Twitch-2B (KD: 200nM) in line 176.

      1. Figure 4D: “one would expect a drop in FRET signal after EGTA addition… Can the authors explain?”

      We do see a clear drop in FRET signal after EGTA addition, in particular in 7H9 medium (black versus red line, Fig. 5B). Given the high affinity of Twitch-2B for Ca2+ (200nM), however, it is not surprising that the drop is not more pronounced, as intracellular Ca2+ is expected to be tightly bound to Twitch.
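      The occupancy argument can be made concrete with a minimal sketch. It assumes simple one-site, non-cooperative binding (the real sensor response may be cooperative) and uses the KD of 200 nM quoted above; the free Ca2+ concentrations are hypothetical and chosen only for illustration.

```python
def fraction_bound(free_ca_nM, kd_nM=200.0):
    """Fraction of sensor occupied by Ca2+ under a one-site binding model:
    occupancy = [Ca2+] / ([Ca2+] + KD)."""
    return free_ca_nM / (free_ca_nM + kd_nM)


# Hypothetical free Ca2+ concentrations (nM), not measured values.
for ca in (50.0, 200.0, 2000.0):
    print(f"{ca:7.0f} nM free Ca2+ -> {fraction_bound(ca):.2f} of sensor bound")
```

      Under this simplified model, occupancy falls only from about 0.91 to 0.50 when free Ca2+ drops tenfold from 2000 nM to 200 nM, which is one way to picture why a high-affinity sensor can show a modest change in FRET signal.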

      1. The experiments showing outer membrane localization of PE15/PPE20 are very important, but results of these experiments (western-blot and FRET) are shown in supplementary figures. They should be transferred/integrated into the main Figures.

      We agree and moved Figure S3A to the main Figures as Figure 4A.

      1. Line 166: the authors claim that the assay did not work in 7H9 due to low Ca2+ concentration in this medium. Why did the authors not just add a bit more calcium to show whether this claim holds true?

      7H9 is not a suitable medium for these experiments because the baseline Ca2+ concentration is too high, not too low (see Fig. 5B, grey versus black line). Adding more Ca2+ to 7H9 medium resulted in precipitation, probably due to its interaction with phosphates. Our use of “low” in this context was confusing, so we changed the wording of this sentence (lines 180-181).

      1. Line 183: more detailed description on cellular fractionation and subsequent anti-FLAG Western needed here.

      We added more detail in the Materials section (lines 475 ff).

      Reviewer 2:

      • A major concern regarding the importance of the data: there are considerable technical challenges in generating Ca2+ depleted media. This is clear in that M. tuberculosis seems to be unaffected by Ca2+ in the medium - growth seems similar in Ca2+-free media and in media with up to 10mM Ca2+ (Fig. S1). This raises a concern about the physiological relevance of the data (mammalian cells have intracellular Ca2+ of 0.01-0.1mM, extracellular free Ca2+ is around 1mM).

      If we correctly understand this comment, the reviewer is unconvinced that we fully and reproducibly depleted Ca2+ from medium based on a lack of an effect of Ca2+ on in vitro growth. We tested for baseline Ca2+ levels and depletion in media by inductively coupled plasma optical emission spectrometry and added these data showing precise quantitation of Ca2+ in medium (see new Fig. S1B).

      • The role of PE15/PPE20 in Ca2+ acquisition may be clearer if the authors ensure that the PDIM layer is intact. Specifically, there is a technical issue: The authors use Tween80 as a detergent. Tween-80 partially strips the outer cell wall of M. tuberculosis resulting in shedding of PDIM and PE/PPE proteins. Tyloxapol is a somewhat milder detergent. Some of the experiments would possibly show clearer phenotypes by use of Tyloxapol.

      We share the concern about PDIM, as PDIM loss is common in in vitro culture. We analyzed the total lipids by thin layer chromatography and confirmed the presence of PDIM in all three strains (Fig S3C, lines 198-201). We repeated experiments with Tyloxapol and did not see differences from Tween-80. We nonetheless now show the Tyloxapol data (Fig 5D).

      • The authors could increase the impact of their work by exploring the role of PE15/PPE20 during pathogenesis of resting versus activated bone marrow macrophages where Ca2+ fluxes of the host cell play a role in host responses.

      We agree. In vivo or macrophage experiments are a logical next step to fully characterize the function of PE15/PPE20, but we think it is beyond the scope of this manuscript. The main contribution of this paper is the identification of channel function of a PE/PPE protein pair that extends the novel channel paradigm for these proteins. These data support that transport might indeed be a shared function of the entire PE/PPE family with 169 members.

      Minor:

      • The authors should consider citing Sharma et al (2021)

      We cited the paper.

      • Are there Ca2+ binding motifs in PPE20?

      We did not detect canonical Ca2+ binding motifs in PPE20.

      • RNAseq data may need to be deposited in a public database.

      RNA-seq data have been deposited to NCBI - GEO accession GSE214266

      Link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE214266

      Token: cvmhakcgbpmbfuz

      • In its current state, the work is somewhat incremental

      The function of the large PE/PPE protein family of Mtb has been one of the most longstanding and perplexing puzzles in Mtb biology. For more than 20 years, speculation about their potential role, for example in antigenic variation, abounded but no conclusive evidence for this or another shared function emerged. A recent landmark paper then conclusively showed that a subset of the PE/PPE proteins function as nutrient channels (Wang et al., Science 2020). However, whether transporter function is a general function of the family of 169 PE/PPE proteins remains untested. Our PE/PPE pair is associated with a different type VII secretion system (Esx-3) and belongs to a different subfamily than the previous examples, suggesting a shared function across families and perhaps even all of these proteins. Given the intense interest and many false leads that have plagued the identification of PE/PPE function in the last 20 years, the difficulty of working with them biochemically, as well as the almost complete absence of understanding of Ca2+ homeostasis in Mtb, we do not consider our work incremental.

      Reviewer 3

      • My only slight concern is the meaning attached to the "biofilm" assays. It is never very clear to me that this is anything more than formation of a surface pellicle and general hydrophobicity of the mycobacterial cells.

      We fully agree that Mtb biofilms remain poorly defined. However, the term biofilm as used in our study has already found its way into the literature and we would rather not cause confusion by calling the same phenomenon by a different name. Whatever the term used, we do not suggest any relevance other than it being a Ca2+-dependent phenotype that serves as one of several tests to parse PE15/PPE20’s role in Ca2+ homeostasis.

      Cross-consultation comments:

      • We agree with the concerns of reviewer#2 that the role of PDIM and use of detergent should be looked at more closely.

      We tested the roles of PDIM and detergent, see reviewer 2.

      • Likewise, the paper would strongly benefit from some further insights into the potential physiological role of PPE20/PE15 in calcium homeostasis.

      We show PE15/PPE20 function in the transport of Ca2+ and the first Ca2+-related cellular phenotypes in Mtb. Testing the role of the complex in an infection model is outside of the scope of this manuscript and mouse infection experiments would take many months and would likely be intractable because of the expected extensive redundancy among the 169 PE/PPE proteins.

    1. Or, take the case of unemployment as described by sociologist C. Wright Mills: When, in a city of 100,000, only one man is unemployed, that is his personal trouble, and for its relief we properly look to the character of the man, his skills, and his immediate opportunities. But when in a nation of 50 million employees, 15 million men are unemployed, that is an issue, and

      we may not hope to find its solution within the range of opportunities open to any one individual. The very structure of opportunities has collapsed. Both the correct statement of the problem and the range of possible solutions require us to consider the economic and political institutions of the society, and not merely the personal situation and character of a scatter of individuals.16

      1. C. Wright Mills, The Sociological Imagination (New York: Oxford University Press, 1959), p. 9.

      I love this quote and it's interesting food for thought.

      Framing problems from the perspectives of a single individual versus a majority of people can be a powerful tool.

      The idea of the "welfare queen" was possibly too powerful because it singled out an imaginary individual rather than focusing on millions of people with a variety of backgrounds and diversity. Compare this with the fundraisers for impoverished children in Sally Struthers's Christian Children's Fund (aka ChildFund) which, while they show thousands of people in trouble, quite often focus on one individual child. This helps to personalize the plea, and the charity actually assigned each donor a particular child they were helping out.

      How might this setup be used in reverse to change the perspective and opinions of those who think the "welfare queen" is a real thing instead of a problematic trope?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This paper demonstrates a link between oxidative stress, lipid biosynthesis, and targeted histone acetylation in fission yeast. In mutant cells with defects in lipid synthesis (cbf11, mga2 lacking transcription factors, and cut6 lacking acetyl-CoA carboxylase), transcripts of a number of genes implicated in resistance to oxidative stress are increased. This is associated with higher levels of H3K9 acetylation and increased tolerance to oxidative stress. These effects are mediated through Sty1, a stress-activated MAP kinase and the transcription factor Atf1.

      It is also shown that H3K9 acetylation levels in the promoter region and just downstream of the transcriptional start site are increased in cbf11 mutants (Fig. 5A).

      By mutational analysis, the authors implicate the acetyl transferases Mst1 and Gcn5 in this transcriptional effect. Other related acetyl transferases, Hat1, Elp3, Mst2, Rtt109 have been ruled out as main contributors to the dysregulation in unstressed cbf11 mutants. That specific acetyl transferases have been shown to be required is a strength of the investigation.

      Major comments:

      The hypothesis is put forward in the manuscript that altered acetyl-CoA levels in cbf11 mutants would underlie the dysregulation of genes induced by oxidative stress. Histone acetyl transferases compete for acetyl-CoA with lipid biosynthesis, and so with increased demand for acetyl-CoA underacetylation in the concerned promoters would result - specifically at H3K9. These results do not directly support the hypothesis, on the other hand they are not sufficient to rule it out.

      Actually, we view this phenomenon the other way round: We primarily focus on exponentially growing cells, which have substantial demand for fatty acid (FA) production (= high acetyl-CoA consumption). So the level of promoter histone acetylation under these conditions is our baseline, or “normal” state. When FA production is decreased (cbf11 or cut6 mutants; inhibition of FA synthase by cerulenin…), stress gene promoters get *hyper*acetylated. We do not have any data on (or claims about) histone underacetylation compared to the baseline. Nevertheless, we now show that overexpression of Cut6/ACC results in decreased resistance to oxidative stress (Fig. 5C), which is compatible with the notion that increased acetyl-CoA consumption would result in insufficient histone acetylation at stress gene promoters during stress.

      Acetyl-CoA levels were measured only in undisturbed cells, and the possibility remains that under oxidative stress there would be changes in acetyl-CoA pools that could explain this apparent contradiction - why did not the authors examine that?

      Under oxidative stress, the Sty1 stress MAPK is activated, leading to a massive Atf1-dependent transcription wave, which is also associated with increased SAGA-dependent H3K9 acetylation (PMID: 21515633). This well-studied cellular response, however, is not the main focus of our study. Rather, we found a novel connection between perturbed lipid metabolism and increased expression of stress genes in cells *not challenged* by oxidative stress (i.e. Sty1-Atf1 are not hyperactivated). This is why we only measured acetyl-CoA concentrations in untreated cells.

      The authors argue that although the global acetyl-CoA levels are not increased, local concentrations might be altered in a way to permit higher H3K9 acetylation levels at selected promoters. Although a formal possibility, this is rather far-fetched as a small and freely diffusible molecule like acetyl-CoA should quickly equilibrate within one cellular compartment. I think that although the overall relationships that the authors have established between oxidative stress, H3K9 acetylation levels with increased expression, and lipid biosynthesis, are compelling, the role of acetyl-CoA concentrations is not clear and should be de-emphasized.

      Interestingly, acetyl-CoA production in the nucleus has been reported by several studies (reviewed in PMID: 29174173), suggesting that local acetyl-CoA concentrations (microgradients) within the cell are functionally relevant. We agree that acetyl-CoA is a small molecule which, in theory, should diffuse quickly throughout the nucleocytoplasmic space. However, empirical evidence shows that lipid synthesis in the cytosol and histone acetylation in the nucleus may not access a uniform nuclear-cytosolic pool of acetyl-CoA (PMID: 28099844, PMID: 28552616). This is related to the fact that the acetyl-CoA sink is large and acetyl-CoA may react with many proteins (i.e. any extra amounts will be consumed rapidly).

      Even though we provide strong evidence that HAT activity is critical for the crosstalk between FA synthesis and stress gene expression, we do agree that we have not conclusively established the role of acetyl-CoA in the process. However, we still feel that it is justified to point out that acetyl-CoA is a “possible” mediator molecule for the crosstalk in the Results and Discussion sections.

      Minor comments:

      In many of the bar diagrams, only a borderline statistical significance is indicated (p ~ 0.05) despite seemingly large numerical differences between the means. In the legends it is stated that one-sided Mann-Whitney U tests were used. This is a non-parametric test with low power - would it not have been better to use a t test?

      We agree that the non-parametric Mann-Whitney U test is rather conservative and, therefore, less sensitive for small sample sizes, such as n = 3. Our reason for using this particular test instead of the parametric t-test is that qPCR fold-change values come from a log-normal distribution, which is incompatible with the t-test (it requires normally distributed data). Importantly, using conservative statistical testing does not invalidate our conclusions.
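      To illustrate why p-values cluster at about 0.05 with this test, the following minimal sketch computes an exact one-sided Mann-Whitney U p-value by enumerating all relabelings of the pooled data. It assumes three replicates per group and no tied values; the function name and the fold-change numbers are hypothetical and are not taken from the manuscript.

```python
from itertools import combinations


def exact_one_sided_mwu_p(treated, control):
    """Exact one-sided p-value that `treated` values tend to exceed `control`.

    Enumerates every way of assigning the pooled values to the two groups
    (assumes there are no ties among the values).
    """
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    def u_statistic(treated_idx):
        chosen = [pooled[i] for i in treated_idx]
        rest = [pooled[i] for i in range(len(pooled)) if i not in treated_idx]
        # U counts the (treated, control) pairs in which treated is larger.
        return sum(1 for a in chosen for b in rest if a > b)

    observed = u_statistic(tuple(range(n_treated)))
    all_u = [u_statistic(idx)
             for idx in combinations(range(len(pooled)), n_treated)]
    return sum(1 for u in all_u if u >= observed) / len(all_u)


# Hypothetical fold changes, n = 3 per group, complete separation.
print(exact_one_sided_mwu_p([4.1, 5.3, 6.0], [0.9, 1.0, 1.2]))
# The same ranks with a 100-fold larger effect give the same p-value.
print(exact_one_sided_mwu_p([410.0, 530.0, 600.0], [0.9, 1.0, 1.2]))
```

      With three values per group there are only 20 possible labelings, so the smallest attainable one-sided p-value is 1/20 = 0.05 regardless of how far apart the group means are. This is consistent with the borderline significance noted by the reviewer.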

      What do the error bars in the diagram show, SEM? If a non-parametric test is used, a parametric measure of variability is irrelevant.

      The error bars represent standard deviation (SD). We do not see an issue here as, in our opinion, the visual style of numeric data presentation is independent from any chosen statistical testing methods.

      It would be helpful to the reader to indicate directly in the diagram panels what is actually shown, not just "fold change vs ..." In Fig. 1, 2, 4 D and 5 we see mRNA levels, in Fig. 3 chromatin IP.

      Done

      Reviewer #1 (Significance (Required)):

      The paper represents conceptual advances for our understanding of how stress responses, metabolism and transcriptional regulation are linked, although one of the links (acetyl-CoA levels in this case) is tenuous.

      This manuscript belongs in a rich literature on stress responses on the gene expression level, mostly from studies in yeast. Potentially, it adds entirely new information on how cellular stress may be mechanistically linked to stress responses.

      These results are potentially general and of broad interest to the biological community.

      This reviewer is familiar with yeast genetics, stress responses, and quantification of gene expression.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      As more and more metabolic intermediates are found to also serve as co-factors for epigenetic modifications, it has been widely accepted that regulating the levels of these key metabolites can be an effective way to control nutrient related gene expression. Acetyl-CoA is one of those early examples. Increased acetyl-CoA was shown to promote local acetylation at growth genes (Mol Cell 2011 PMID: 21596309), and ACC deletion funnels more Acetyl-CoA towards histone acetylation reactions and causes global hyperacetylation (Ref 17). However, whether such increased metabolite/co-factor levels can exert signal-specific effects remains elusive. For instance, although increased acetyl-CoA stimulates the SAGA complex enzymatic activity, it is not clear whether it also causes SAGA to be targeted to new sites without external cues to induce new transcription factor binding. Does increased acetyl-CoA cause broad hyperacetylation at all inducible genes which are the primary targets for those HAT complexes?

      In this manuscript, Princová et al. found that deletion of the fatty acid synthesis transcription factors Cbf11 and Mga2 increases cell survival under H2O2-induced oxidative stress in S. pombe. They further showed that expression of several stress-related genes increased upon Cbf11 deletion, and that H3K9 acetylation at their promoter regions was elevated. They argued that FA-TF deletion may indirectly regulate stress-related genes, potentially through influencing Acetyl-CoA level, although they failed to detect significant changes of global Acetyl-CoA levels. While it's interesting to see yet another example of metabolite-mediated gene expression regulation, the current manuscript makes only an incremental advance towards mechanistic principles of how these co-factors fine-tune specific gene expression programs.

      Specific comments:

      1. This work showed convincingly that deletion of CBF11 or MGA2 leads to resistance to oxidative stress. However, it provides little mechanistic insight into how deletion of Cbf11 increased the expression of stress response genes and why some HATs are involved but others not (Figure EV5).

      We respectfully disagree with the notion that we only provide “little mechanistic insight” into the process whereby FA metabolism affects stress gene expression.

      • First, we show that not only deletion of cbf11, but also a very specific manipulation of the rate-limiting FA-producing enzyme (Cut6/ACC; Fig. 4D), or chemical inhibition of FA synthase by cerulenin (new Fig. 4F) all lead to increased stress gene expression. On the other hand, overproduction of Cut6/ACC results in decreased stress gene expression and lower resistance to oxidative stress (new Fig. 5B-C). These findings clearly show the specific and tight mutual relationship between FA synthesis and expression of stress genes.

      • Second, we show that the DNA-binding activity of Cbf11 is critical for affecting stress gene expression levels, yet Cbf11 does not act as a stress gene repressor.

      • Third, we show that, compared to e.g. peroxide treatment, stress gene mRNA levels are only moderately increased upon downregulation of FA synthesis. So the situation can be called stress gene “derepression”. At the same time the major stress-response regulators (Sty1-Atf1, Fig. 2A-C; Pap1, new Fig. 2D-E) are required for the derepression, but, importantly, neither of them shows increased activation compared to unstressed WT cells (Fig. 3A-C). These data suggest a qualitative difference between the two phenomena (canonical stress response vs dysregulation of FA synthesis). Furthermore, they hint at an important role of the chromatin environment.

      • Fourth, we show that Gcn5/SAGA and Mst1, but not 4 other HATs, mediate the connection between FA metabolism and stress gene expression (Fig. 5D-E), and we show clear and specific H3K9 hyperacetylation of stress gene promoters in FA metabolism mutants (Fig. 5A), arguing that this is not a general acetylome issue.

      • Fifth, we show that the stress genes affected by changes in FA metabolism show unusually high nucleosome (H3) occupancy in their transcribed regions (even in unperturbed WT cells; Fig. 5A bottom panels), which could dictate the observed specificity in regulation.

      While we agree that our understanding is not yet complete, we have already described many mechanistic aspects of the link between FA metabolism and stress gene expression.

      1. Although in Cbf11 deletion cells, increased resistance to H2O2 relies upon the Sty1/Atf1 pathway, the authors did not establish a link between lipid synthesis and Atf1 activity because Cbf11 deletion does not affect the phosphorylation of Atf1.

      Sty1 and/or Atf1 show non-zero activity even in normal, healthy, unstressed cells. Importantly, Atf1 is bound to many target promoters even in the absence of stress (Fig. 3B; PMID: 20661279, PMID: 28652406). Moreover, Sty1 is actually needed for orderly cell cycle progression (sty1KO cells are elongated, a result of postponed mitotic entry; e.g. PMID:7501024), which we now mention in the Introduction and Discussion. Our point is that Sty1-Atf1 are not hyperactivated under normal conditions - this only happens during major stress insults. Thus, in unstressed cbf11KO cells, stress gene promoters are hyperacetylated, which may facilitate their (Sty1-Atf1 and Pap1-dependent) transcription, without the need for hyperactivation of the stress response regulators. Such increased transcriptional competence of stress promoters is consistent with our findings that upon peroxide treatment stress gene mRNA levels in cbf11KO exceed those in WT (Fig. 1B). We have amended the corresponding section of the Discussion to more clearly explain our conclusions and hypotheses.

      1. Cbf11 deletion causes elevated H3K9 acetylation at the promoter regions of a number of stress response genes, but the authors did not demonstrate how the lipid synthesis defect causes the hyperacetylation at these promoters.

      As discussed in our manuscript, we suggest that following downregulation of FA synthesis, the surplus acetyl-CoA is used by Gcn5 and Mst1 HATs to hyperacetylate stress gene promoters.

      1. As all lipid-metabolism mutants show increased stress response, it would be helpful to examine whether H2O2 induction of WT cells influences lipid synthesis, thus establishing physiological links between FA synthesis and stress response.

      We now mention in the Discussion section that, curiously, cut6/ACC mRNA levels are downregulated upon peroxide treatment. However, the significance of this finding is unclear as FA metabolism is strongly regulated at the post-translational level (PMID: 12529438). Unfortunately, we are not in a position to measure changes in metabolic fluxes upon stress. In any case, we believe that such experiments would be outside the scope of the current study.

      Besides, fatty acids may be beneficial to fight oxidative stress because they maintain the integrity of the cell membrane. What is the potential effect of CBF11 deletion in this respect? The authors may want to discuss it.

      The reviewer suggests that higher production of FA would result in higher resistance to oxidative stress. However, our data do not indicate this - we show that under low FA synthesis the stress resistance is actually higher. Nevertheless, we acknowledge in the Discussion that the scenario suggested by the reviewer can occur, for example, in cancer cells which become more resistant to oxidative stress following increased lipid biosynthesis/storage.

      1. Since H2O2 treatment also causes change in glucose metabolism including upregulation of glucose transporter Ght5 (PMID: 30782292), it would be enlightening to see if there is a crosstalk between the lipid and glucose metabolisms. Does Ght5 expression increase upon H2O2 treatment in CBF11 deletion strain?

      While the topic is interesting, we strongly believe that the relationship between glucose metabolism and stress gene expression is outside the scope of this study.

      According to our data used in Fig. 4A, ght5 expression in cbf11KO at 60 min after 0.74 mM H2O2 treatment is downregulated 3-fold.

      5. Different H2O2 concentrations cause different stress responses in S. pombe: Pap1 and Sty1 mediate responses to low and high H2O2, respectively. For a fully activated Sty1 response, the concentration of H2O2 needs to reach 1mM (PMID: 17043891). In this study, the H2O2 concentration ranges from 0.5-1.5mM and Pap1-regulated Ctt1 does show an increase upon H2O2 treatment. To test if suppressed lipid synthesis facilitates Sty1-dependent activation, it would be helpful to examine the activation of Pap1 (its nuclear translocation) to eliminate other influences.

      We agree with the reviewer. We have now included data on the role of Pap1 in the crosstalk between lipid metabolism and stress gene expression. We show that Pap1 is required for increased expression of gst2 and ctt1 in untreated cbf11KO cells (Fig. 2D). We note that ctt1 is coregulated by both Pap1 and Atf1 (Fig. 2B, D). Also, Pap1 is partially required for H2O2 resistance of cbf11KO cells (Fig. 2E). Importantly, similar to Sty1-Atf1, Pap1 is not hyperactivated (no nuclear accumulation) by 10 or 60 min of cerulenin treatment (Fig. 3C), while stress gene expression is upregulated at 60 min in cerulenin (Fig. 4F) and keeps increasing after 120 min (data not shown). These data collectively support our hypothesis that upon decreased FA synthesis, stress gene promoters become more transcription-competent without the requirement for hyperactivation of the corresponding stress gene regulators.

      Reviewer #2 (Significance (Required)):

      see above

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study examines the intriguing phenomenon that perturbation of fatty acid biosynthesis induces expression of stress-response genes by increased intracellular levels of acetyl-CoA and hyperacetylation of histones at the promoters of these genes. Loss of the CSL transcription factor Cbf11 results in induced expression of a subset of stress-response genes in unperturbed conditions and resistance to H2O2. These stress-response genes are not direct targets of Cbf11, but their upregulation is dependent on the Sty1-Atf1 pathway. Similar effects in upregulation of stress-response genes were observed in the cut6 hypomorph and mga2 deletion strain, however no change in global levels of acetyl-CoA in the former as well as in the cbf11 deletion was detected. The upregulated stress-response genes appear to be linked to increased H3K9 acetylation in their promoters and dependent on the Gcn5 and Mst1 HATs.

      The authors present good supportive evidence linking fatty acid biosynthesis to epigenetic regulation of stress response genes potentially mediated by intracellular levels of acetyl-CoA. This is an exciting area and the fission yeast model system is ideal to elucidate the molecular mechanisms behind this process. This is a substantial body of work with state-of-the-art functional genomics approaches and LC-MS analysis. The data is of high quality and the manuscript is well written and relatively easy to read. Below are my comments for the manuscript.

      It was determined that increased expression of stress-response genes in the cbf11 deletion is dependent on the presence of Sty1, and partially dependent on Atf1. How about Pap1 (or Prr1) - would this transcription factor that is also regulated by Sty1 be involved in the upregulation of the stress-response genes in the cbf11 deletion? Activation of Sty1 and Atf1 by phosphorylation was not observed in unperturbed cbf11 deletion cells which would be expected in the proposed model. This discrepancy was not well explained. Could activation of Sty1/Atf1/Pap1 in unperturbed cbf11 cells be assayed in a different way such as nuclear localization?

      As these concerns were also raised by Reviewer 2, to avoid duplication, we kindly ask you to read our detailed responses above. Briefly, we have now included new data clarifying the role of Pap1 in the increased expression of selected stress genes in cbf11KO cells (or when FA synthesis is chemically inhibited) - comment #5 of Reviewer 2 above. Also, we explain why Sty1-Atf1 and/or Pap1 hyperactivation (i.e. above their activity level in untreated WT) is actually not needed in order for decreased FA synthesis to trigger a mild/moderate increase in stress gene expression - comment #2 of Reviewer 2 above. We have now also clarified this issue in the Discussion section.

      As for the use of alternative methods for measuring the activation status of Sty1-Atf1, we have already provided data from multiple independent and very sensitive methods (western blot, ChIP-qPCR; Fig. 3A-B). Also, it is questionable whether microscopy would be more sensitive than our current methods. Moreover, our H2O2-sensitive reporter does not indicate an increasingly oxidative environment inside cbf11KO cells, quite the contrary (Fig. 1D).

      It would strengthen the model that perturbation of fatty acid biosynthesis induces expression of stress-response genes and H2O2 resistance if more mutant strains other than cut6 and two of its known regulators were tested. Does the proposed model apply to any deficiency in fatty acid synthesis in general or only those that result in increased levels of acetyl-CoA? For example, would deletion strains of fas1, fas2, lsd90, lcf1, lcf2 or the4 show the same stress response as cut6, mga2, and cbf11 mutants?

      The roles of lsd90, lcf1, lcf2 and the4 have been only poorly characterized so far, making it potentially difficult to interpret any stress-related phenotypes of these mutants. However, the role of the fatty acid synthase Fas1/Fas2 complex in FA production is well established. We have therefore inhibited FAS using cerulenin and found that this treatment also leads to increased stress gene expression (Fig. 4F), without causing Pap1 hyperactivation (Fig. 3C). Importantly, fas1/fas2 are not Cbf11 target genes, and FAS inhibition by cerulenin represents an acute intervention, very different from the long-term effects in cbf11/mga2/cut6 mutants.

      Also, does overexpression of cut6+ confer sensitivity to H2O2?

      Yes, our new data show that ~2-fold overexpression of cut6 both partially abolished the derepression of stress genes in cbf11KO cells (Fig. 5B), and increased sensitivity to H2O2 of WT cells (new Fig. 5C).

      The authors hypothesize that induced expression of stress-response genes in the cbf11 deletion and cut6 hypomorph is due to H3K9 hyperacetylation because of increased acetyl-CoA abundance in the cell. However, LC-MS analysis showed no change in global abundance of acetyl-CoA in the cbf11 deletion and cut6 hypomorph although differential levels of acetyl-CoA in the nucleus relative to the rest of the cell cannot be ruled out. The authors mentioned that ppc1-537 and ssp2 null are known to have lower abundance of acetyl-CoA and the latter could suppress the cbf11 deletion-induced gene expression for two of three genes tested by qPCR. Can ppc1-537 also suppress the cbf11 deletion-induced gene expression? Are ppc1-537 and the ssp2 null sensitive to H2O2?

      The ppc1-537 mutant is sick and has a growth defect, making it difficult to interpret any findings regarding its survival/resistance phenotype (see a similar issue with the cut6-621 mutant in Fig. 4E). Ssp2/AMPK has a pleiotropic role in the cell and its activity is actually controlled by Sty1-Atf1 under some stress conditions (PMID: 28515144) and the ssp2KO is resistant to osmotic stress (PMID: 28600551). All this makes it potentially difficult to derive reliable conclusions about ppc1 and ssp2. However, our current data on cut6 (ts hypomorph, Pcut6MUT, overexpression) and FAS/cerulenin are derived from precisely targeted and specific interventions, and support the proposed connection between FA synthesis and stress gene expression, and are consistent with the suggested role of acetyl-CoA (and its microgradients) in mediating the connection.

      I think Rtt109 is H3K56 specific.

      Indeed, H3K56 is the characterized specificity of Rtt109, and we indicate this explicitly in the manuscript. We wanted to make our HAT screen comprehensive since we could not presume which histone or even non-histone acetylation target(s) is involved in lipid metabolism-mediated stress gene expression. Even though we have observed increased H3K9ac (Gcn5/SAGA target), other modifications are likely involved since Mst1 affects stress gene expression in lipid mutants, but Mst1 is not known to target H3K9.

      Reviewer #3 (Significance (Required)):

      The authors present good supportive evidence linking fatty acid biosynthesis to epigenetic regulation of stress response genes potentially mediated by intracellular levels of acetyl-CoA. This is an exciting area and not all the molecular details have been elucidated in this process. S. pombe is ideal to study this fundamental process and discoveries would be applicable to other eukaryotic study organisms.

      My expertise is in eukaryotic gene regulation, molecular genetics and functional genomics, so I am quite qualified to critically review this paper.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This paper demonstrates a link between oxidative stress, lipid biosynthesis, and targeted histone acetylation in fission yeast. In mutant cells with defects in lipid synthesis (cbf11, mga2 lacking transcription factors, and cut6 lacking acetyl-CoA carboxylase), transcripts of a number of genes implicated in resistance to oxidative stress are increased. This is associated with higher levels of H3K9 acetylation and increased tolerance to oxidative stress. These effects are mediated through Sty1, a stress-activated MAP kinase and the transcription factor Atf1.

      It is also shown that H3K9 acetylation levels in the promoter region and just downstream of the transcriptional start site are increased in cbf11 mutants (Fig. 5A).

      By mutational analysis, the authors implicate the acetyl transferases Mst1 and Gcn5 in this transcriptional effect. Other related acetyl transferases, Hat1, Elp3, Mst2, Rtt109 have been ruled out as main contributors to the dysregulation in unstressed cbf11 mutants. That specific acetyl transferases have been shown to be required is a strength of the investigation.

      Major comments:

      The hypothesis is put forward in the manuscript that altered acetyl-CoA levels in cbf11 mutants would underlie the dysregulation of genes induced by oxidative stress. Histone acetyl transferases compete for acetyl-CoA with lipid biosynthesis, and so with increased demand for acetyl-CoA underacetylation in the concerned promoters would result - specifically at H3K9.

      These results do not directly support the hypothesis, on the other hand they are not sufficient to rule it out. Acetyl-CoA levels were measured only in undisturbed cells, and the possibility remains that under oxidative stress there would be changes in acetyl-CoA pools that could explain this apparent contradiction - why did not the authors examine that?

      The authors argue that although the global acetyl-CoA levels are not increased, local concentrations might be altered in a way to permit higher H3K9 acetylation levels at selected promoters. Although a formal possibility, this is rather far-fetched as a small and freely diffusible molecule like acetyl-CoA should quickly equilibrate within one cellular compartment. I think that although the overall relationships that the authors have established between oxidative stress, H3K9 acetylation levels with increased expression, and lipid biosynthesis, are compelling, the role of acetyl-CoA concentrations is not clear and should be de-emphasized.

      Minor comments:

      In many of the bar diagrams, only a borderline statistical significance is indicated (p ~ 0.05) despite seemingly large numerical differences between the means. In the legends it is stated that one-sided Mann-Whitney U tests were used. This is a non-parametric test with low power - would it not have been better to use a t test? What do the error bars in the diagram show, SEM? If a non-parametric test is used, a parametric measure of variability is irrelevant.
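
      The borderline p values noted above are what an exact rank test gives with very small groups. As a minimal sketch, assuming three replicates per group (the sample size is not stated in this comment) and no tied values, the smallest one-sided p value an exact Mann-Whitney U test can return is one over the number of ways to assign the ranks. The function name is hypothetical.

```python
from math import comb

def min_one_sided_p(n1: int, n2: int) -> float:
    """Smallest exact one-sided Mann-Whitney U p value, assuming no ties.

    Under the null hypothesis every assignment of ranks to the two groups
    is equally likely, and only one of the C(n1 + n2, n1) assignments is
    the most extreme (complete separation of the groups).
    """
    return 1 / comb(n1 + n2, n1)

# Assumed group sizes, for illustration only.
for n in (3, 4, 5):
    print(f"n = {n} per group: minimum one-sided p = {min_one_sided_p(n, n):.4f}")
```

      Under these assumptions, three replicates per group cannot give a one-sided p value below 0.05, however large the difference between the means.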

      It would be helpful to the reader to indicate directly in the diagram panels what is actually shown, not just "fold change vs ..." In Fig. 1, 2, 4 D and 5 we see mRNA levels, in Fig. 3 chromatin IP.

      Significance

      The paper represents conceptual advances for our understanding of how stress responses, metabolism and transcriptional regulation are linked, although one of the links (acetyl-CoA levels in this case) is tenuous.

      This manuscript belongs in a rich literature on stress responses on the gene expression level, mostly from studies in yeast. Potentially, it adds entirely new information on how cellular stress may be mechanistically linked to stress responses.

      These results are potentially general and of broad interest to the biological community.

      This reviewer is familiar with yeast genetics, stress responses, and quantification of gene expression.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study presents a first structural insight on formin mDia bound to actin filaments in physiological conditions. Based mainly on negative stain EM, the authors use 2D and 3D class averaging to describe two main configurations of the formin at the filament barbed end. The two configurations support the previously proposed stair-stepping model, which was based on crystal structures, with an open state where the formin binds two actin monomers and a closed state where three monomers are bound. Because the majority of the structures fall in the first, open state, this supports the existence of this intermediate. The authors also show that the orientation of the free FH2 in this open state is somewhat flexible, as several sub-classes with different angles can be distinguished. Finally, they identify, for the first time, formin densities bound along the length of the filament.

      The data is well presented and I don't have any major issue. The only point is that the information that all the initial structural data comes from negative stain EM should be put upfront. One gets the feeling that cryoEM is used throughout until one reads the section on cryoEM. Given that the methodology is now also established for cryoEM, it is regrettable that data was not collected with a 300kV microscope, which may have revealed more details of the conformations, but I understand microscope time is hard to come by, and the authors did a remarkable job from negative-stain EM.

      The finding of formin densities binding along the length of the actin filament is very interesting. Besides the previously cited finding, it is also reminiscent of the observations made in yeast, where Bni1 (in S. cerevisiae; PMID 17344480) and For3 (in S. pombe; PMID 16782006) were shown to exhibit retrograde movement with polymerizing actin cables in vivo. This would be interesting to consider in the discussion.

      Reviewer #1 (Significance (Required)):

      This study extends our understanding of the mechanism of formin-mediated actin assembly, by providing a first structural observation in physiological conditions. While confirmatory of a previously proposed model, it also excludes an alternative model, and offers novel observations of flexibility and binding along the actin filament length. It will be of great interest to researchers on the actin cytoskeleton.

      My expertise is in the actin cytoskeleton and formins, but I am no expert in EM structural analysis.

      We thank reviewer 1 for the very positive comments and for pointing out the relevance of our study for the actin cytoskeleton field. As advised, we now specify upfront in the abstract and in the introduction that most of the presented results were obtained from negative stain electron microscopy. Following the reviewer’s advice, we have enriched the discussion to highlight the retrograde movements of formins in actin cables observed in vivo.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Maufront et al. have used EM to study the conformation of mDia1 at the barbed end and the core of actin filaments to explain the molecular mechanism of the FH2 dimer processivity at these sites. Based on modelled structural data they tried to describe how the conformational changes in FH2 dimer lead to its partial dissociation, and then association with filaments during the process of translocation coupled to subunit addition at actin filaments barbed ends. This supports a previous study (Otomo et al. 2005, Nature), in which X-ray crystallography structural data were used to propose a stair-stepping model for Bni1p translocation at the barbed ends during actin polymerization. The model for mDia1 binding to core filaments is also given. Moreover, using EM structure and the previously reported structures of actin (PDB: 5OOE), and actin with formin FH2 dimer (PDB: 1Y64), authors explained the dynamic nature of FH2 dimer at barbed ends of the filaments using the flapping model. But due to the low resolution of their structures (~26-29 Å), the finer details of actin and the FH2 dimer structure at barbed ends could not be resolved, leaving open questions about the orientation of actin helical twist at this end during elongation. The authors tried several conditions to get high density barbed-end filaments, but these did not yield an adequate number of particles, resulting in a low number of particles selected for structure modelling purposes. However, to attain more physiologically relevant structure they used cryo-EM, but were successful in capturing only the open conformation structure of FH2 dimer (at low resolution). Thus, due to low resolution of structures the key findings have not added much to what we already know about the mechanism of FH2 dimer translocation during actin polymerization, except that their studies support the stair-stepping model (Otomo et al. 2005, Nature) and not that of "stepping second" model (Paul and Pollard. 2008, Curr. Bio.). Thus, this manuscript does not merit publication in this journal.

      We thank reviewer 2 for taking the time to read and review our study. However, we respectfully disagree with the statement that our findings “have not added much to what we already know about the mechanism of FH2 dimer translocation during actin polymerization”. As mentioned in our report, collecting EM data for formins in physiological conditions (at the barbed ends of growing filaments), as we do here for the first time, entails limitations on the number of particles one can observe and on the resulting resolution. Despite this rather low resolution, our data allow us to discriminate between two proposed models accounting for the processivity of formin FH2 domains at filament barbed ends. Being able to determine which of two competing models is valid (as the reviewer says we do) does add a lot to what we already know.

      Major comments:

      1. Present study does not provide any new insight about the conformation of the actin dimer at the barbed ends of actin filaments when FH2 domains of formin are bound. This study appears to be more like an extension of previous research (Otomo et al. 2005, Nature), in which the authors used X-ray crystallography data to propose a model for actin filaments elongation by formin bound at the barbed ends.

      As mentioned above, we respectfully disagree with this remark. First, in Otomo et al. 2005, formins are arranged in a crystal into a non-physiological “daisy chain” arrangement around a non-canonical tetramethyl rhodamine-actin filament. Our observations were made in physiological conditions displaying a single formin dimer at the barbed end of a polymerizing filament. Second, the stair-stepping model originating from Otomo et al. was only inferred and extrapolated from the crystal structure and not directly observed. Both the open and the closed conformations were speculations that had never been observed up to now. In our current report we directly visualize these two conformations. Third, the observations of Otomo et al. were obtained using formin Bni1p from yeast, not the mammalian formin mDia1, for which there is little (PDB 1V9D) structural data available describing the structure of a truncated mDia1 in the absence of actin. Finally, in addition to validating the stair-stepping model experimentally, we make unexpected observations that are totally absent from the model derived from Otomo et al. and subsequent studies.

      The low resolution of structures is a major concern.

      As mentioned above, the limited resolution is the price we had to pay for being in physiological conditions, with formins interacting with the barbed ends of growing actin filaments. Nonetheless, this resolution is sufficient to discriminate between the two previously existing models, and to make new observations, beyond these models.

      Given the low resolution of data, how can the authors decide on the number (4) of classes of FH2 domain (in open state) and present them as "continuum of conformations". They stated "details featured in class 4 do not appear as sharp as in class 2". What was the basis of deciding on the sharpness level?

      We agree that this point was unclear, and we thank the reviewer for pointing it out. The choice of the number of sub-classes for the open state is a trade-off between the sharpness (i.e. signal-to-noise ratio) of the resulting image, which is a direct consequence of the number of particles within each sub-class, and the internal variability within each sub-class. Class 4 might appear more “blurry” because it gathers particles displaying a range of angles. When increasing the number of generated classes in the 2D processing, we observe angular variations of the FH2 domains intermediate to the ones displayed in Figure 3. However, because increasing the number of classes results in averaging fewer particles per class, the generated classes appeared more noisy or “blurry” and not as “sharp”, as mentioned in the manuscript. Hence, we chose the number of displayed classes so that the signal-to-noise ratio would remain sufficient to determine the relative angle between the two FH2 domains. To make things clearer, “do not appear as sharp” was replaced by “displayed a lower signal-to-noise ratio and thus looked noisier”. The expression “sharp” was replaced by “enough contrast”.

      The authors showed 30Å structure of FH2 domain encircling actin filaments towards their pointed ends, but said nothing about the kind of decoration it could be, a "daisy-chain" or "concentric circle"? Also, they did not mention anything about the orientation of actin helical twist and specific sites of binding. This information would provide new in-depth understanding of how formins bind while diffusing along the filaments.

      The quality is sufficient to distinguish isolated FH2 dimers along the core of actin.

      Accordingly, the FH2 dimers we observed along the core of our actin filaments adopt a conformation similar to that observed at the barbed end, as mentioned in the text (‘concentric circle’). This observation differs from that reported for INF2, which accumulated along filaments and may interact in a ‘daisy-chain’ manner (Gurel et al, 2014; Sharma et al, 2014). From our data, we can thus assume that formins interact with F-actin along the core of filaments similarly to the way they do at the barbed ends, and might translocate in a two-step manner alongside the actin filament. As stated in the manuscript, the actin helical twist could not be deciphered. For docking the crystal structures within our EM envelope, we used the formin-actin contacts described previously in Otomo et al.

      The author stated - "The leading FH2 domain likely provides a first docking intermediate for actin monomers that would help their orientation relative to the barbed end, resulting in a higher actin monomer on-rate". This statement was made on the basis of observing 79% times FH2 in the open state in their data set. This seems like an overstatement because they don't have any direct structural data to support such claim.

      We agree with the reviewer that our statement, taken from the discussion section, is speculative, and we apologize if this was unclear. Our purpose was to propose a plausible mechanism, based on our structural data, since the FH2 domain stands in front of the barbed end in the “open conformation” and since it likely interacts with actin monomers. We have now rephrased our sentence to state more clearly that this is a hypothetical mechanism: “We propose that… could provide…”.

      In the Discussion they mentioned "the FH2 dimer would then be "lagging" behind the elongating barbed end if actin twisting back to 180° occurs before the addition of actin monomer and this explains the diffusing along the actin filaments". Did authors encounter filaments with two formins bound to them in their negative stain images? What is their view on this? In current data, they showed structure in which only one FH2 dimer is bound to the pointed ends of actin filaments. Have they tried increasing the concentration of formins to obtain structures with more than one formin bound towards the pointed ends of actin filaments?

      Following the recommendations from reviewer 2, we have performed an additional analysis and we now show typical examples of filaments observed with a formin along their core, including cases where two formins are observed on the same filament (Supplementary Figure 12). As we now explain in the discussion section, five different mechanisms (including lagging) can be invoked to explain how a formin can be located along the core of the filament. These five mechanisms can all account for the possibility to have more than one formin on the same filament.

      The lagging mechanism, however, is the only one where we would expect that the filaments with a formin along their core are less likely to also have a formin at their barbed end (because the formin at the core spontaneously departed from the barbed end, which was left bare and had less time to load another formin before fixation of the sample). A simple statistical analysis of our data leads to the estimation that 48 ± 7% (n=50) of actin filaments with a formin within their core also display a formin at their barbed ends. This is significantly less than for the global filament population, where 77 ± 0.4% (n=10,461) of barbed ends are decorated with formins. This supports the lagging scenario as a likely mechanism putting formins along the core of the filament.
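
      The comparison above can be checked with a short calculation. This is a minimal sketch, not the analysis used in the manuscript: it assumes the quoted ± values are binomial standard errors, and it swaps in a two-proportion z-test with a normal approximation, since the test actually used is not stated here. Function names are hypothetical.

```python
from math import erfc, sqrt

def binomial_se(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n observations."""
    return sqrt(p * (1 - p) / n)

def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> tuple[float, float]:
    """Unpooled two-proportion z statistic and two-sided p value
    (normal approximation; an assumption, not the authors' stated test)."""
    se_diff = sqrt(binomial_se(p1, n1) ** 2 + binomial_se(p2, n2) ** 2)
    z = (p1 - p2) / se_diff
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided

# Values quoted in the response above.
se_core = binomial_se(0.48, 50)       # filaments with a formin along their core
se_all = binomial_se(0.77, 10461)     # global filament population
z, p = two_proportion_z(0.48, 50, 0.77, 10461)
print(f"SE (core subset) = {se_core:.3f}, SE (all filaments) = {se_all:.4f}")
print(f"z = {z:.2f}, two-sided p = {p:.1e}")
```

      Under these assumptions the standard errors round to the quoted 7% and 0.4%, and the uncertainty of the comparison is dominated by the small subset of 50 filaments.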

      Regarding the specific suggestion to increase the formin concentration: We did screen different formin concentrations, but with higher concentrations the level of noise due to unbound formins was significantly increased in the image background and impeded a proper analysis. This is why we consistently used 100 nM formins.

      To increase the density of short filaments for sample preparation, the authors used additional actin binding proteins "shown in supplementary Figure 2.C". There is no supplementary Figure 2.C. Moreover, it would be nice if the concentrations of these proteins are mentioned in the text.

      We apologize for this mistake. Supplementary Figure 2.C has now been added and the protein concentrations have been added in the main text.

      Minor comments:

      1. Figure 1 legend needs editing. E is missing in the legend.

      Thanks for noticing this. We have added the missing legend for 1.E.

      2. There is no supplementary Figure 2.C.

      We apologize for this mistake. We have now added supplementary Figure 2.C.

      It is recommended that the authors report the number of particles used during 2D and RELION 3D classifications in the figures. This would help in better understanding of the probability of the conformations mentioned in the text.

      It was mentioned in the text. We have now made this information clearer to the reader.

      Reviewer #2 (Significance (Required)):

      This is the first direct study showing the two (open and closed) conformations of mDia1 FH2 domain at the barbed ends of actin filaments using EM and cryoEM. The study supports the proposed molecular mechanism of FH2 processivity at the barbed ends during filaments elongation using stair-stepping model reported earlier (Otomo et al. 2005, Nature). For the first time, FH2 has been shown to fluctuate between various angles with respect to static actin filaments, and on this basis they propose a flapping model (Fig 5). They explained the whole mechanism using structural proof, but the low resolution of data raises a question about their quality sufficiency to propose this mechanism. The overall novelty of this manuscripts is insufficient for the publication in this journal. Audience having understanding of the actin and actin binding proteins will be interested in this study. Additionally, researcher from the field of structural biology (EM and CryoEM) will be interested. I have been working in the field of actin and actin binding proteins for past 4 years. Over 10 years' experience in protein biochemistry, structural biology and molecular biology.

      We do not fully understand why, on the one hand, reviewer 2 indicates that “for the first time, FH2 has been shown to fluctuate between various angles…” and that “Audience having understanding of the actin and actin binding proteins will be interested in this study. Additionally, researcher from the field of structural biology (EM and CryoEM) will be interested.”, while on the other hand, reviewer 2 states that “The overall novelty of this manuscripts is insufficient for the publication in this journal.”, which seems to contradict the above statements and comments.

      Regarding novelty, we insist on the fact that we have achieved for the first time the direct observation of FH2 formin domains at a resolution sufficient to discriminate between two distinct models at the barbed ends, as well as to observe the presence of formin mDia1 along the core of actin filaments in conditions where nobody has proposed that this could happen.

      In addition, we have not specified any specific journal within the possible ones from “review commons”, up to now.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, Julien et al. use negative stain electron microscopy and cryo-EM to show two conformations of the FH2 domain for the formin mDia1 bound to the barbed end of an actin filament. These conformations support the "stair-stepping" model of FH2 domain movement with an elongating actin filament, as previously postulated by Otomo et al. (reference 1). The two states observed correspond to the "open" (~79%) and "closed" (~21%) conformations. The authors also show the conformational variability of the open state, suggesting flexibility in this state. Finally, the authors observe FH2 domains encircling the actin filament at a distance from the barbed end, and suggest that the FH2 can diffuse from the barbed end down the filament.

      Major comments:

      1) Novel insights into formin function derived from this structure would raise impact. Issues that could be addressed include the following. Simply adding some lines to the discussion would not really add impact, but additional experimental/modeling work would.

      We agree that comparing the binding mode of different formins on actin filaments, testing the impact of profilin, and assaying FH2 domains in the absence of FH1, as proposed below, would provide a broad set of interesting additional data. However, without claiming that our results can be generalized to all formins in all conditions, we believe that our findings are novel and should be of interest to a large community. The proposed additional experiments/modeling represent an impressive amount of work, and will be carried out in future investigations. We answer these comments in more detail below.

      1. Whether this model really holds true for all FH2 domains.

      Formin FH2 dimerization and processive filament barbed end elongation are widespread features of formins, which have been evidenced for many organisms from metazoans to plants. Since we could dock the FH2 from yeast formin Bni1p to account for mammalian mDia1, we think the FH2 domain conformations may be conserved enough among species to display similar translocation mechanisms at the barbed ends of actin filaments, using a two-state mechanism. We chose to use the crystal structure from Bni1 formin (PDB 1Y64) because this structure was obtained in the presence of actin filaments and brings some insights about the formin-actin contacts.

      In order to convince reviewer 3, we superimposed the existing crystal structure of the FH2 mDia1 domain (PDB: 1V9D) with our model and reconstruction and show (Supplementary Figure 12) that the differences are minor. The mDia1 FH2 domains (atomic structures in red, PDB: 1V9D) are aligned with Bni1p FH2 domains (atomic structures in green and blue, PDB: 1Y64) previously fitted into the electron microscopy envelope of a barbed end capped by a formin in the “open state”. The FH2 domains are well aligned with a slight discrepancy in the knob/actin contact regions (blue arrows). This discrepancy most likely results from the absence of actin partners in the crystals obtained with mDia1 FH2 domains. The Bni1p structure thereby most accurately represents the knob/actin contact region. In addition, the folding of the lasso domain around the post domain is resolved in the Bni1p structure. Note here that the Bni1p lasso domains wrap equally well around the Bni1p post domain and the mDia1 post domain (green arrows).

      1. Whether the % time spent in the open and closed states might dictate the vastly different elongation rates mediated by different formins. For example, mDia1 is considered one of the 'faster' elongators (equivalent to actin alone in the absence of profilin), while fission yeast Cdc12 essentially caps filaments in the absence of profilin.

      We have discussed this aspect thoroughly in the discussion section to conclude that: “Our direct assessment of the open state occupancy rate thus provides important information on the molecular nature of the formin-barbed end conformations which could not be directly inferred from kinetic measurements, with or without mechanical tension, so far. Considering a gating factor of 0.9 and considering that formin mDia1 spends 79% of the time in the open state, we can compute that the on-rate for monomers would be slightly higher (14% higher) for an mDia1-bearing barbed end in the open state, than for a bare barbed end.”
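
      The 14% figure quoted above follows from a one-line calculation, sketched here under two assumptions that the quoted passage does not spell out: the gating factor is taken as the ratio of the average elongation rate of a formin-bound barbed end to that of a bare barbed end, and monomers are taken to add only while the formin is in the open state. The function name is hypothetical.

```python
def open_state_on_rate_ratio(gating_factor: float, open_fraction: float) -> float:
    """On-rate in the open state relative to a bare barbed end.

    Assumes: average rate = open_fraction * open-state rate (no addition in
    the closed state), and gating_factor = average rate / bare-end rate.
    """
    return gating_factor / open_fraction

# Values quoted in the response above.
ratio = open_state_on_rate_ratio(gating_factor=0.9, open_fraction=0.79)
print(f"Open-state on-rate is {100 * (ratio - 1):.0f}% higher than for a bare barbed end")
```

      If some monomer addition also occurred in the closed state, the open-state on-rate inferred this way would be lower, so the figure is best read as an upper estimate under these assumptions.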

      We agree that repeating our set of EM experiments and analysis with other formins, like fission yeast Cdc12, would be interesting. However, this would take a long time, and falls outside the scope of our paper.

      1. Whether the % time spent in the open and closed states varies if filaments are actively elongating in the presence or absence of profilin.

      We have chosen not to include profilin in our experiments, and to limit the concentration of G-actin, in order to reduce the background in our EM micrographs. Also, a rapid filament elongation would increase the amount of F-actin per barbed end, while a dense population of short filaments is key to obtain accurate data (as we explain in the discussion, paragraph 1, p9).

      We speculate that, by providing a link between the FH1 domains and the filament barbed end, profilin might very well alter the percentage of time spent in the open state, and mitigate lagging as mentioned in the discussion section. Properly addressing the impact of profilin with our EM experiments is very challenging, for the reasons we have explained. It would require further investigations, beyond the scope of this study.

      1. How this model impacts the interactions of formins with other proteins at the barbed end. For example, capping proteins.

      We did not include capping proteins (or other additional proteins) because we wanted to avoid increasing the number of particles of diverse nature per field of view, as they constitute a background that is detrimental to the analysis of EM micrographs. We would have had to sort out additional populations in the course of image analysis. We thus only mixed actin and formin in our assays.

      2. Do these results relate to formin function in disease?

      Because formins regulate actin polymerization, their malfunction is linked to a variety of diseases. We therefore expect our findings to be useful to researchers in the medical field. However, our study remains in the scope of basic research and primarily aims at understanding the mechanisms of formin-assisted actin polymerization.

      2) The observation that formin FH2 domains can bind filament sides has been made several times. In particular, a structural model of the FH2 domain of the INF2 formin along the side of an actin filament has been published (Gurel et al 2014, PMID 24915113). This publication also references other papers showing other formins binding to filament sides. There are two points to this comment:

      1. The model in Gurel et al is that the FH2 domain does not slide down the filament from the barbed end. Rather, the FH2 dimer has an appreciable dissociation rate, enabling it to encircle the filament without having to slide. This FH2 dissociation has been observed for another formin that has been shown to bind filament sides, FMNL1 (called FRL1 in the listed publication), in Harris et al 2006 (PMID 16556604). The authors must explain their reasoning for thinking that mDia1's FH2 can slide down the filament from the barbed end. One possibility is to make observations of this FH2 population in filaments that were not sonicated. What is the average distance of FH2s from the barbed end?

      We thank the reviewer for drawing our attention to this report from Gurel et al., which we now cite. Following this comment, as well as point 6 of reviewer 2, we now discuss the different mechanisms that could lead to our observation of mDia1 along the core of the filament. We provide a new analysis of our data (discussion section), arguing in favor of the lagging mechanism (i.e. ‘sliding down’ from the barbed end), without excluding the competing scenarios. Briefly, we compute that 48 ± 7% (n=50) of actin filaments with a formin within their core also display a formin at their barbed ends. This is significantly less than for the global filament population, where 77 ± 0.4% (n=10,461) of barbed ends are decorated with formins. This supports the lagging scenario, which is the only one where a filament with a formin along its core should be less likely to also have a formin at its barbed end.

      The distance of FH2s from the barbed end would provide additional information. However, it is difficult to estimate, since we often do not see the entire filament, and since we do not know which end is the barbed end.

      1. Interestingly, in some of the works studying formin binding to filament sides, mDia1 was shown to be rather poor in this property. It would be useful to get an idea of what % of the observed FH2s are in the filament core, as opposed to at the barbed end.

      Along with the additional analysis mentioned in the previous point, we have now estimated that about 8% of actin filaments display a formin within their core. We have added this number in the manuscript (end of the Results section). As a comparison, in our assays, 77% of filament barbed ends bear a formin.

      2. The authors must reference the past works showing FH2 binding to filament sides, particularly the structural work. At present, no prior work on FH2 side binding is mentioned.

      As advised, we have now added additional references, in particular Gurel et al, 2014.

      3) My major technical concern in this manuscript is that the authors use the FH1-FH2-DAD domain of mDia1 for the imaging, but use FH2 structure of Bni1p for 3D characterization (Otomo et al.). Even though Bni1p has been used for functional and structural analysis, mDia1 and Bni1p FH2 domains share low sequence homology. In addition, mDia1 only partially complements loss of Bni1 function in vivo (Moseley et al., 2004 PMID 14657240). Can the authors use the partial structural information of the mDia1 FH2 from Shimada et al 2004 (PDB 1V9D, PMID 14992721)? Alternately, the authors could have used FH2 domain of Bni1p for imaging. At the very least, the authors should explain clearly why they used different proteins for imaging and modeling.

      As mentioned above (please see our response to point 1.a), we chose to use the crystal structure from formin Bni1 (PDB 1Y64) because this structure was obtained in the presence of actin monomers, and it thus gives some insight into the formin-actin contacts. The existing structures obtained from formin mDia1 do not include actin (full length by EM: Maiti et al, 2012; crystal structure of subdomains (without FH1): Otomo et al., 2010 PLoS one). It thus seems relevant, in the context of our investigations, to use a structure where formin-actin contacts could be at least partially inferred.

      Further, we superimposed the existing crystal structure of the FH2 mDia1 domain (PDB: 1V9D) with our model and reconstruction and show that the differences are minor (please see the figure in our response to point 1.a, above).

      4) The open and closed states are observed from negative staining data. However, the authors can only find one of the states (open) by cryo-EM, which decreases the confidence level of the paper's conclusions. It would be useful for the authors to do a little more to try to find the closed conformation by cryo-EM.

      Using cryo-EM, we can already recover the most abundant, open conformation.

      Unfortunately, as pointed out here, the number of particles obtained was too low to reach high resolution and reveal the two observed conformations. Indeed, considering a density of ~5 barbed ends per micrograph, the collection of tens of thousands of images would have been necessary, which was not realistic given our access to latest-generation microscopes.

      5) It is unclear whether there are additional effects of using FH1-FH2-DAD protein (not FH2 only) for the imaging, as it shows long protrusion at the tip of actin barbed end. To avoid those concerns the authors could use only FH2 domain of mDia1. Also the authors have to note that they used Bni1p structure because there are no published structures of mDia1 so far.

      We had indeed tried to use a construct lacking the flexible FH1 domain, but the lower purity of this construct and the presence of aggregates led to lower-quality EM micrographs. As profilin was not included in our assay, FH1 domains were not involved in actin polymerization at the barbed end and thus remained very flexible and unstructured. Consistently, we did not detect any additional electron density that could result from the FH1 domains.

      We indeed point out (p5) that “We used the crystal structure from yeast Bni1p FH2 domains in interactions with an actin filament, rather than the existing one from mammalian mDia1 formin FH2 dimer in isolation (PDB 1V9D), because actin-formin contacts are described in the Bni1p structure.”

      Minor comments:

      1) Figure 1: It would be interesting if imaging is provided for mDia1 bound to filaments which it has nucleated. Would it be possible that binding to pre-formed filaments is different to that for mDia1-nucleated filaments?

      This is a good suggestion for further investigations but it extends beyond the scope of this study: as we explain, our attempts to nucleate filaments from mDia1 led to lower-quality micrographs, and the sonication of preformed filaments was our best option. However, we do not expect the translocation mechanism of FH2 to depend on the nucleation history of the filament, since the formin interacts with a filament whose elongation it has assisted over several subunits.

      2) Supplementary figure 2: A number of things in figure S2 are unclear and poorly described in both the results and the methods. In particular, in figure S2A, the definitions of the black and gray lines (steady state actin) are not clear. Do they contain 5% pyrene actin? Is that actin in polymerization buffer or in monomer-actin buffer? Is that actin incubated with actin polymerization buffer for a certain time before measurement of fluorescence intensity? In figure S2B, how do the authors calculate the monomeric actin concentration? The authors should provide this information in either the results or the methods.

      We apologize for the lack of information. Since this is a standard assay, we have now added more details in the Methods section (rather than in the Results section).

      All curves shown in figure S2 were obtained with 5% pyrene actin. The gray curve shows the pyrene fluorescence intensity baseline from 1 µM G-actin monomers, obtained in G-buffer. The black curve is the fluorescence intensity at steady state of 1 µM actin in polymerizing conditions (after 1 hour of incubation at room temperature at 5 µM, the sample was diluted without sonication and left for another hour before measuring the fluorescence intensity).

      The monomeric actin concentrations shown in figure S2B are derived from the intensity level of pyrene at any time point during the experiment, using the simple equations we now present in the Methods section.
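
      The equations themselves are given in the Methods and are not reproduced in this response. As an illustration only, the sketch below shows one common way to convert a pyrene reading into a monomer concentration: linear interpolation between the G-actin baseline (gray curve) and the steady-state level (black curve). It is an assumption that the authors' equations take this form. The function name and the example intensities are hypothetical, and the sketch ignores the small monomer pool that remains at steady state (the critical concentration).

```python
def monomer_concentration(intensity, i_monomer, i_polymer, total_actin_um):
    """Estimate the monomeric actin concentration (µM) from a pyrene reading.

    intensity      -- fluorescence measured at the time point of interest
    i_monomer      -- baseline fluorescence of the same amount of G-actin
    i_polymer      -- steady-state fluorescence in polymerizing conditions
    total_actin_um -- total actin concentration in the sample (µM)

    Assumes fluorescence is linear in the polymerized fraction (illustrative model).
    """
    span = i_polymer - i_monomer
    if span <= 0:
        raise ValueError("steady-state intensity must exceed the monomer baseline")
    polymer_fraction = (intensity - i_monomer) / span
    # Clamp to [0, 1] so that noise cannot yield negative concentrations.
    polymer_fraction = min(1.0, max(0.0, polymer_fraction))
    return total_actin_um * (1.0 - polymer_fraction)

# Hypothetical readings in arbitrary units for a 1 µM sample.
example = monomer_concentration(350.0, 100.0, 600.0, 1.0)
print(f"estimated monomeric actin: {example:.2f} µM")
```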

      3) Supplementary figure 2 C: The figure and legend are missing in the manuscript. Furthermore, the authors describe that they used Gc-globulin to sequester monomeric actin in solution. Is gc-globulin widely used for actin monomer sequestration?

      Thank you for noticing the missing panel which is now back in place. Indeed, Gc globulin is known to sequester G-actin (Van Baelen, H., R. Bouillon, and P. DeMoor. 1980. “Vitamin D binding protein (Gc-globulin) binds actin”. J. Biol. Chem. 255:2270-2272). This is why we have attempted to use it. We could see a slight effect but we did not want to increase the noise within our images with additional proteins that would have made the analysis more complicated.

      CROSS-CONSULTATION COMMENTS Reviewer #1 mentions that the authors identify formin densities bound along the actin filament for the first time. I agree that the imaging of the mDia1 along the actin filament using electron microscopy is novel, but the concept of formin binding has already been found and studied well with other formins (PMID 16556604, PMID 24915113) and even mDia1 has poor binding activity compared to other formins. It was really nice of the authors to show the mDia1 side filament binding, but I don't think it is a striking finding.

      I have no comment for Reviewer #2.

      Reviewer #3 (Significance (Required)):

      If the EM refinements and 3D rendering techniques are conducted rigorously (which this reviewer is unable to judge), the data support an existing theory of how FH2 domains interact with the actin barbed end. Overall, the data will be of interest in the formin field. However, as written the paper confirms an existing model, and does not represent new insight. Impact would be raised by providing insights from these findings that impact formin function or disease.

      We have answered this concern above. The existing models were speculative and not based on direct observations. They relied on data obtained in non-physiological conditions.

      Here, we directly observe two distinct conformations in our structural data, and clearly validate one model over the other. This provides a major advancement in our understanding of formin interaction with actin filaments. In addition, we uncovered an unexpected behavior of formin mDia1, which can readily be found along the core of the filament without the aid of additional proteins, and we propose a mechanism based on our data to account for this observation.

      Another main point is that the observation of FH2 domains bound along an actin filament, while interesting, is not novel. Others have found this for other formins, but those papers are not referenced here.

      The direct binding of formins to the sides of actin filaments is thought to be specific to some particular formins (we now cite additional references in our manuscript, to discuss this point). Formin mDia1, which is a ubiquitous and widely studied mammalian formin (perhaps the most studied), has only been described to diffuse along actin filaments when a capping protein dislodges it from the barbed end (Bombardier et al. Nat Com 2015). Here, we show that formin mDia1 can be found encircling the core of actin filaments, in the absence of any capping protein. This behavior is novel and unexpected. It should open new avenues for research on formin mDia1, as well as on other formins.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *A significant criticism of the paper is an assumption that readers will be familiar with all of the findings in the author's previous 2016 paper and the PGL-1 papers by Aoki et al. Minimal context is given for each approach.*

      To address this concern, we have added a paragraph in the Introduction section of the revised manuscript.

      *Some conclusions are not well supported and require further analysis, proper controls, and more extensive descriptions of the experiments performed. *

      We have addressed the reviewer’s concerns as detailed below.

      Most importantly, the central conclusion and title of the paper is that composition can buffer the dynamics of individual proteins within liquid-like condensates. In other words, in vitro condensation assays often do not recapitulate LLPS behavior in vivo. That said, the findings in this study would be significantly strengthened and complemented by observing endogenously tagged PGL-3 and PGL-3 mutants in living worms, considering the efficiency of using CRISPR in C. elegans to insert tags and make precise mutations.

      The original manuscript already contained data where we microinjected wild-type PGL-3 and mutant PGL-3 proteins (recombinantly purified) into adult C. elegans gonads to assay how the P granule phase supports diffusion of these proteins.

      In the revised version, we now include additional data which shows “dynamics buffering” in transgenic worms generated using CRISPR/Cas9 technology. Briefly, we used CRISPR/Cas9 to generate transgenic C. elegans which expresses PGL-3-mEGFP or PGL-3(Δ425-452)-mEGFP from the native pgl-3 locus. In vitro, wild-type PGL-3-mEGFP protein generates liquid-like condensates. On the other hand, the recombinantly purified PGL-3(Δ425-452)-mEGFP protein generates condensates that are non-dynamic. In contrast to these observations in vitro, both wild-type PGL-3-mEGFP and PGL-3(Δ425-452)-mEGFP show similar dynamics (half-time of FRAP recovery) within P granules in vivo.
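
      For readers less familiar with the metric, the half-time of FRAP recovery is the time taken for the bleached region to regain half of the fluorescence it eventually recovers. The sketch below reads this value off a recovery curve by linear interpolation. It is a generic illustration and not the authors' analysis pipeline. The function name, the synthetic single-exponential curve and the choice to estimate the plateau from the last three frames are all assumptions.

```python
import math

def frap_half_time(times, intensities, plateau=None):
    """Return the time at which recovery reaches half of its final extent.

    times, intensities -- post-bleach samples; intensities[0] is the first
                          frame after bleaching
    plateau            -- final recovered intensity; if omitted, the mean of
                          the last three samples is used (an assumption)
    """
    start = intensities[0]
    if plateau is None:
        plateau = sum(intensities[-3:]) / len(intensities[-3:])
    half_level = start + 0.5 * (plateau - start)
    for k in range(1, len(times)):
        lo, hi = intensities[k - 1], intensities[k]
        if lo < half_level <= hi:
            # Linear interpolation between the two bracketing frames.
            frac = (half_level - lo) / (hi - lo)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    return None  # the curve never reaches half recovery

# Synthetic curve: single-exponential recovery with rate 0.1 per second.
rate = 0.1
times = [float(t) for t in range(0, 101)]
curve = [1.0 - math.exp(-rate * t) for t in times]
t_half = frap_half_time(times, curve)
print(f"half-time: {t_half:.2f} s (ln 2 / rate = {math.log(2) / rate:.2f} s)")
```

      Comparing half-times between constructs is only meaningful when the bleached regions and condensate sizes are similar.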

      *To improve readability, the introduction to P granules should be expanded, and include the reasons for looking at the nematode-specific PGL-3 protein among all the other known P granule proteins. A recap of previous findings on PGL-3 phase separation, in vivo and in vitro, is warranted, starting with the significant results of Saha et al 2016. Setting up the investigative questions in the context of recent work on PGL-1 (Aoki, et al) is also necessary. *

      To address this concern, we have added a paragraph in the Introduction section of the revised manuscript.

      The physiological concentration of PGL-3 should be more transparent, including why some experiments in this study are done at physiological concentrations while others are not. Describing why salt concentrations, crowding agents, and protein abundance are similar or different for each experiment is necessary and relevant. For example, after showing in Figure 1 that PGL-3 protein phase separates, the paragraph starting on line 161 says that it was previously shown that PGL-3 doesn't phase separate at physiological concentrations without RNA. One has to go back to Figure 1 to realize it was done differently than Figure 2 and Saha 2016.

      The concentrations of PGL-3 protein and use of crowding agents (if any) have already been specified within figures or figure legends. Salt concentrations used are specified within figure legends or materials and methods section.

      We have added the following paragraph to the materials and methods section of the revised manuscript.

      “Saha et al. 2016 showed that at physiological concentrations (approx. 1 µM), the PGL-3 protein is unable to phase separate into condensates. At these concentrations, mRNA promotes phase separation of PGL-3. To assay for mRNA-dependence of condensate assembly, it is therefore essential to use physiological concentrations of the PGL-3 protein or mutants (e.g. Figure 2). However, these condensates are generally too small to assay rate of internal rearrangement of PGL-3 molecules within condensates using fluorescence recovery after photobleaching experiments. Therefore, to generate large condensates for measuring internal rearrangement of PGL-3 or mutant molecules, we primarily used higher concentrations of these proteins where binding to RNA is not essential for phase separation. However, to mimic the in vivo P granule phase as closely as possible, we generally added constituent proteins in proportion to their in vivo abundance estimated in Saha et al. 2016.”

      The added paragraph in the Introduction section of the revised manuscript may be helpful to the readers.

      *Statements in the same paragraph like "in contrast to full-length PGL-3, mRNA does not support phase separation..." should be qualified by stating the concentration observed, with or without salts or other crowding agents. Similarly, line 230 "suggests that interactions involving the disordered C-terminal region of PGL-3 are not essential for the fast dynamics" and should be qualified with "at non-physiological concentrations and with XX crowding agents or salt concentration." It would be more consistent if physiological concentrations were consistent from figure to figure, as extra variables weaken some of the stated conclusions. *

      We thank the reviewer for this suggestion. However, we feel the statements (without full experimental details within main text) help convey the conceptual essence of the findings better. Of course, all these statements contain reference to figures or prior publications which provide relevant details about experimental conditions.

      *The 2010 review reference stating that there are 40 P granule enriched proteins is outdated. More recent reviews put the number much higher. This is relevant because the approach to put PGL-3 in a more physiological environment by including just PGL-1, GLH-1 and mRNA with the condensate assays, out of ~100 P granule enriched proteins, may not be sufficient to conclude "that the influence of complex composition on dynamics is modest" (line 223), or imply that the multicomponent nature of the P granule is reconstituted by adding these components (line 355). *

      We revised the text to indicate that P granules contain approx. 70 proteins and added appropriate references.

      Based on current information of constitutive P granule components (PGL-1, PGL-3, GLH-1, GLH-2, GLH-3, GLH-4, DEPS-1, MIP-1 and mRNA), (Kawasaki et al, 1998, 2004; Spike et al, 2008a, 2008b; Price et al, 2021; Cipriani et al, 2021; Phillips & Updike, 2022) we reconstituted P granule-like phase in vitro with mRNA, PGL- and GLH- proteins that likely constitute the most abundant components within P granules in vivo (based on concentration estimates in Saha et al. 2016).

      We do appreciate the reviewer’s comment that more components can be added to our in vitro reconstitution in addition to the limited set of components used in our study. However, we feel it is interesting to observe that a limited set of components can support dynamics buffering (the main message of the paper). Further, the complementary in vivo experiments show that the P granule phase can also support dynamics buffering.

      *Figure 1C needs to include PGL-3(370-693) in the analysis. Figure 1E is also incomplete without a comparison of FRAP recovery between PGL-3(1-452) and full PGL-3 as the control.*

      Fig. 1c already includes data with PGL-3 (370-693) [top row, central panel]. FRAP recovery data with full-length PGL-3 is already available in Supplementary Fig. 2c, g.

      *Figure 4C is missing an essential control where PGL-3 and S1 FRAP is performed without PGL-1, GLH-1, and mRNA. *

      In the revised version, we have added Supplementary Fig. 5f, where FRAP recovery of the following condensates are plotted together: 1) PGL-3 alone, 2) S1 alone, 3) PGL-3 + PGL-1, GLH-1 and mRNA, 4) S1 + PGL-1, GLH-1 and mRNA.

      *It would also help show sup Fig4A in the main figure to show concentration dependence. *

      We revised Fig. 4 to address the reviewer’s suggestion.

      Consider adding subtitles to supplementary figures.

      We considered the suggestion but felt it may not be essential.

      *M&M should include an explanation for statistical analysis *

      We added a paragraph describing statistical analysis within the Materials and Methods section.

      *CROSS-CONSULTATION COMMENTS I am also in agreement with the comments and critiques of reviewers 2 and 3.*

      Reviewer #1 (Significance (Required)):

      The paper by Saha and colleagues investigates the in vitro liquid-liquid phase separation propensity of a P granule protein PGL-3 and its structural domains. The findings largely replicate and support the phase-separation properties of a paralogous protein called PGL-1, as recently described by Aoki et al. 2021. Furthermore, they show that the dynamics demonstrated by recombinant PGL-3 may be maintained or buffered by the complex composition of P granules.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *Jelenic et al. describe the effect of partner proteins on the FRAP dynamics of recombinant PGL-3 protein and variants in in vitro condensates and C. elegans P granules. The study shows that the N-terminal α-helical dimerization domains are required for condensate formation and that modulating them alters aggregation and the FRAP dynamics of the condensates. Interestingly, a construct including the entire IDR region (370-693) does not phase separate on its own under these conditions. The K126E K129E mutant (known previously to disrupt dimerization) and the deletion mutant abrogate LLPS. A mutant construct that shuffles the sequence in the region 423-453, called S1 here, reduces the helicity and the condensate FRAP dynamics, but the dynamics are recovered in the presence of a few P granule components. The reduced dynamics of partially unfolded PGL-3 condensates are also rescued by the P granule components, up to a certain concentration of unfolded PGL-3. This threshold concentration for recovering the condensate dynamics is further reduced in the helix-reducing S1 mutant, and it also depends on the number and the nature of P granule components.

      Overall, the study aims to probe how "composition can buffer protein dynamics within liquid-like condensates" - yet several underlying aspects of the study do not fully support that conclusion. The introduction does not sufficiently introduce the known structural information of the two dimerization domains in C elegans PGL proteins for which structures are known. The region is discussed as "alpha helical" but really there are two evolutionarily conserved independently folding dimerization domains (referring to the mutants as "reduced alpha helicity" is not helpful - these are mutations that destabilize a folded domain).*

      To address this concern, we have added a paragraph in the Introduction section of the revised manuscript.

      *Additionally, the abstract and introduction ignore the aspects of aggregation (touched on in discussion) - this is likely what the disruption to the helical region in residue 450 region is doing (the helix is not on the dimer interface based on homology / sequence identity to the crystal structure of PGL-1 central dimerization domain).*

      We think elucidating the molecular mechanism of apparent aggregation of PGL-3 (Δ425-452) could be an interesting direction for future investigation. Here, we focused our analysis predominantly on the mutant S1 since it generates liquid-like condensates with ~20-fold slower dynamics (compared to wild-type) in contrast to non-dynamic condensates/aggregates. Therefore, influence of other P granule components on the dynamics of PGL-3 in liquid-like condensates is easier to address using the mutant S1 rather than PGL-3 (Δ425-452). We didn’t find evidence that S1 aggregates as we did not detect aggregates of S1 molecules using fluorescence confocal microscopy and the slow dynamics in condensates of S1 does not change significantly over 24 h (Supplementary Fig. 3f).

      However, in the revised version, we now include additional in vivo data with C. elegans expressing the aggregation-prone PGL-3 (Δ425-452)-mEGFP. Briefly, we used CRISPR/Cas9 to generate transgenic C. elegans which expresses PGL-3-mEGFP or PGL-3(Δ425-452)-mEGFP from the native pgl-3 locus. In vitro, wild-type PGL-3-mEGFP protein generates liquid-like condensates. On the other hand, the recombinantly purified PGL-3(Δ425-452)-mEGFP protein generates condensates that are non-dynamic. In contrast to these observations in vitro, both wild-type PGL-3-mEGFP and PGL-3(Δ425-452)-mEGFP show similar dynamics (half-time of FRAP recovery) within P granules in vivo.

      Finally, the "dynamics buffering" is not really clearly established and could also be explained as small concentrations of aggregated proteins act like clients while increasing the concentration results in aggregation and "cross linking" in the entire droplet - and this concentration is never achieved in the in worm experiments so it is not clear. In other words, the change in FRAP dynamics not observed in worms is perhaps not surprising if small amount of recombinant proteins are incorporated into the granules.

      Data with the S1 mutant establishes that dynamics buffering can be observed in condensates with different sets of additives both in vitro (Fig. 5a, b) and in vivo (Fig. 4a, b). Further, data with condensates of S1 containing the additives PGL-3 (K126E K129E) or S1 (K126E K129E) demonstrate that dynamics (half-time of FRAP recovery) within S1 condensates, and in turn “dynamics buffering” depend on inter-molecular interactions. With respect to the hypothesis proposed by the reviewer, we did not detect aggregates within S1 condensates using confocal fluorescence microscopy.

      In contrast to S1 condensates, condensates containing partially unfolded PGL-3-mEGFP together with PGL-1, GLH-1 and mRNA showed spatial inhomogeneities in fluorescence signal throughout the condensate (Fig. 4g). We have not tested if areas with higher fluorescence signal represent aggregates. It is a possibility that the partially unfolded PGL-3-mEGFP fluorescence signal becomes more homogeneous if higher concentrations of additives (PGL-1, GLH-1 and mRNA) are used. However, the presented data demonstrate the significant effect of the P granule components (PGL-1, GLH-1 and mRNA) on the FRAP recovery rate of partially unfolded PGL-3-mEGFP in condensates (compare figures Fig. 3e and Fig. 4g).

      However, consistent with dynamics buffering, the P granule phase in vivo supports wild-type dynamics of different PGL-3 constructs over a range of concentrations - PGL-3(Δ425-452)-mEGFP at physiological concentration (CRISPR transgenic strain, Fig. 4e) or at higher concentrations (microinjected S1 and partially unfolded PGL-3-mEGFP, Fig. 4b).

      *It is also not clear what the mechanism of the changes is - is the protein driven to fold more properly (despite S1 disruption of its conserved sequence) inside the condensate? Does it still self interact and act as a dimerization domain? Does this change disrupt interactions? *

      We agree with the reviewer that identifying the precise structural changes of the S1 protein within the condensate vs. dilute phase could be an interesting direction for future investigation. However, we have already discussed the issues raised by the reviewer in the original manuscript.

      “Our data is consistent with the model that other regions of S1 molecules cooperate with residues 425-452 (shuffled) to generate stronger inter-molecular interactions. For instance, addition of the mutant S1 (K126E K129E) enhances dynamics of S1 within condensates in contrast to maintaining the slower dynamics observed within condensates of S1 alone. This suggests that the interactions disrupted by the mutations K126E and K129E also contribute to slow S1 dynamics. One possibility is that interactions involving the residues K126 and K129 favor S1 conformations that enhance 425-452 (shuffled)-dependent interactions. Indeed, the mutations K126E K129E have been reported to interfere with interactions among N-termini of PGL-3 molecules (Aoki et al, 2021). While two self-association domains within the α-helical N-terminus of PGL-3 have been mapped (Aoki et al, 2021, 2016), structural insights into those associations are limited. However, PGL-3 shares significant sequence similarity with another protein PGL-1. Crystal structures are available for fragments of the PGL-1 protein that show the two self-association domains at the N-terminus are predominantly α-helical and globular in nature (Aoki et al, 2016, 2021). Therefore, one possibility is that shuffling the sequence 425-452 of PGL-3 or heat-induced unfolding of PGL-3 exposes hydrophobic residues that become available to participate in inter-molecular interactions.”

      What is the real mechanism by which PGL-3 phase separates if not via the disordered domains?

      We agree with the reviewer that elucidating the detailed mechanism of phase separation of PGL-3 is an interesting direction for future investigation. However, we feel this is not required to support the main message of this manuscript.

      Throughout the manuscript, the term "dynamics" is used to indicate FRAP, but it would be better to define what is meant (diffusion of PGL-3 in condensates) instead of using dynamics, a term that could mean many things. Secondly, FRAP cannot directly measure liquidity etc (see recent critiques by McSwiggen eLife 2019, etc) so it is better to be cautious in the claims. Finally, discussing "dynamics buffering" adds more terminology where it is not needed - perhaps say "changes to diffusion of PGL-3 in condensates".

      We feel it is useful to introduce a term that describes our observation. To our knowledge, our observation is novel and therefore requires a new term to describe it.

      However, we do appreciate the concern raised by the reviewer. We used a more generic term “dynamics buffering” in contrast to the more specific “diffusion buffering” since we did not directly estimate diffusion behavior at the ‘single-molecule’ level. However, we already described what we mean by “dynamics buffering” in the text as follows.

      “We used condensates of similar size for our analysis (average ± 1 SD of diameter of condensates are 6.4 ± 1.7 µm (Fig. 5a) and 5.9 ± 0.4 µm (Fig. 5b)). Therefore, dynamics buffering here is likely to represent similar diffusion rates of S1 within condensates.”

      *The "N-terminus" is not 65% of the protein. One could define this as the N-terminal domain, but again there are two clear folded domains in the first 65% of the protein and this needs to be described better. *

      We revised the text to replace the terms “N-terminus” and “N-terminal domain” with “N-terminal fragment”.

      *The description of "stickers" and the references to tau and hnRNPA1 are confusing as this is a predominantly ordered domain while those are IDRs. *

      We feel this is important as it aids discussing our work in the context of current literature describing the mechanisms of macromolecular phase separation.

      The suggestion in the discussion that "P granule components support dynamics by participating in intermolecular interactions with PGL-3-mEGFP molecules" is not well supported because no interaction assays are performed and no mutations are made that disrupt these interactions to test this.

      Indeed, we have not conducted interaction assays or mutational analysis to directly test this. However, our detailed analysis with the S1 mutant supports this suggestion.

      While partially unfolded PGL-3-mEGFP molecules lose 30% of α-helicity, the α-helicity of the S1 mutant is reduced by 15% compared to wild-type PGL-3. Data with S1 and partially unfolded PGL-3-mEGFP molecules show that loss of α-helicity correlates with slower diffusion of protein molecules within condensates. Using the mutants PGL-3 (K126E K129E) and S1 (K126E K129E), we show that the diffusion rate of S1 molecules within condensates depends on inter-molecular interactions, and that the presence of other P granule components supports faster diffusion of S1 molecules within condensates. Therefore, we feel it is safe to speculate that intermolecular interactions with P granule components can support dynamics of a “more unfolded” (compared to S1) version of the PGL-3 molecule.

      *More detailed analysis of some of the claims: Claim 1: An a-helical region mediates the phase separation of PGL-3, and the C-terminal disordered region by itself does not phase separate. The N-terminal dimerization is essential for LLPS. The C-terminal IDR interactions with mRNA facilitate the LLPS. Comments: The authors show sufficient experimental data using microscopy and FRAP on truncated constructs with the N-terminal and C-terminal regions - but see above regarding how these are described - a proper domain structure with the folded domains shown and the RGG motifs highlighted should be added and integrated throughout the discussion. *

      In the revised version of the manuscript, we described the predicted PGL-3 domains within a paragraph in the introduction: “The interactions that support phase separation of the PGL-3 protein remains unclear. Structural studies on the orthologous PGL-1 protein revealed two dimerization domains. This raises the possibility that PGL-3 also contains similar dimerization domains, and phase separation depends on interactions involving these domains.”

      Our Fig. 1a already includes the schematic representation of PGL-3 with predicted N-terminal and Central Dimerization domains and RGG repeats.

      *They show that the N-terminus is necessary and sufficient for LLPS, and the C-terminus by itself does not phase separate. But how do the N-terminal domains phase separate? This is not explained: what are the interactions?*

      Also, the authors use a known double mutant (K126E K129E) and SEC-MALS to show that their N-terminal construct is consistent with the published results. Disrupting the N-terminal dimerization prevents phase separation, suggesting the importance of these residues in the N-terminus for self-assembly and LLPS. The microscopy data back the claim that the mRNA-mediated LLPS is facilitated by binding to the C-terminus. However, mRNA binding to the IDR is not sufficient for LLPS. Yet the authors do not explain how higher salt prevents phase separation, so again the mechanism of phase separation is unclear. Is it multivalent interaction of the two dimerization domains? A basic model (that is tested) would be important.

      We agree with the reviewer that elucidating the detailed mechanism of phase separation of PGL-3 is an interesting direction for future investigation. However, we feel this is not required to support the main message of this manuscript.

      Nevertheless, our manuscript already provides some relevant insights, as follows.

      “To investigate the underlying mechanism further, we began by testing if the N-terminal α-helical region of PGL-3 can self-associate. Our analysis using size exclusion chromatography followed by multi-angle light scattering (SEC-MALS) showed that this PGL-3 fragment 1-452 forms a dimer (Supplementary Fig. 2f). Mutation of two residues (K126E K129E) have been shown to interfere with interactions among the N-termini of PGL-3 molecules (Aoki et al, 2021). We mutated these two residues within the full-length PGL-3 protein (K126E K129E) (Fig. 1a) and found that this mutant PGL-3 (K126E K129E) protein cannot phase separate even at high protein concentrations up to ~130 µM (Fig. 1b, c). Addition of mRNA does not trigger phase separation of this protein at physiological concentrations either (Fig. 2a, b). Taken together, our data is consistent with a model where association among folded N-termini of PGL-3 molecules is essential for phase separation.”

      A likely possibility is that phase separation of PGL-3 depends on electrostatic inter-molecular interactions among the folded N-terminal fragments of PGL-3 molecules. High salt screens such electrostatic interactions and would therefore prevent phase separation.

      Are the tags removed to ensure that phase separation is not caused by tags or remaining linker regions? Is the protein purified to be without nucleic acid contamination or other purity metrics?

      Most of the experiments were done with only 5% of total protein tagged with 6xHis-mEGFP. No additional tags were present on the constructs. For recombinant expression and purification, constructs were cloned so that the 6xHis-mEGFP tag can be removed by treatment with TEV protease. Following removal of the 6xHis-mEGFP tag, the residual linker is just two amino acid residues long. We used 100% tagged protein for our experiments only in very few cases (indicated in the figure legends).

      To demonstrate purity of recombinant proteins, SDS-PAGE gels with all protein constructs used in this study are shown in Supplementary Fig. 1.

      To minimize contamination of nucleic acids, we treated samples with Benzonase during the course of purification.

      To assess the extent of nucleic acid contamination, the ratio of absorbance at 260 nm and 280 nm (A260/A280) was monitored. In exceptional cases with high A260/A280 values, we analyzed samples further by purifying RNA from the sample using an RNA purification kit (Qiagen) and found that RNA represented 1% or less of the sample mass.

      Claim 2: The N-terminal α-helical region modulates the dynamics within condensates. The IDR region has minimal effect on the fast dynamics of PGL-3. Comments: The authors show that components have a modest influence on full-length PGL-3 condensates by comparing the FRAP half-times with or without the P granule components, including mRNA. However, have the authors tried this in the presence of mRNAs for the constructs lacking the IDRs, as the IDRs have several RGG domains, bind mRNA, and are likely to change the dynamics?

      We thank the reviewer for this suggestion. However, this experiment is not essential to support the claim made in the context of homotypic condensates of PGL-3: “The N-terminal α-helical region modulates the dynamics within condensates. The IDR region has minimal effect on the fast dynamics of PGL-3.”

      *The authors report the importance of the N-terminal α-helical region by showing that a construct that lacks or disrupts part of the helices has lower thermal stability and significantly slower condensate dynamics. Unfolding of the helices is also shown to reduce the dynamics. One primary concern is whether these "rescued" protein dynamics imply protein functionality.*

      An assay of “functionality” (e.g., an enzymatic activity) of the PGL-3 protein is not available.

      However, to assay the functionality of P granules, we compared the fecundity of C. elegans worms expressing PGL-3-mEGFP or the mutant protein PGL-3(D425-452)-mEGFP from the native pgl-3 locus. We found that worms of both genotypes produced similar numbers of offspring (Fig. 4d). This suggests that deletion of residues 425-452 of PGL-3 does not result in significant loss of function of P granules.

      Are these semi-denatured proteins refolded in the presence of P-granule components?

      We feel that identifying the precise structural changes of the semi-denatured PGL-3 proteins within the condensate vs. dilute phase could be an interesting direction for future investigation.

      Finally, it is not clear why the authors chose to disrupt folding of the central dimerization domain?

      The manuscript included a paragraph to describe the rationale.

      “This suggests that interactions involving the disordered C-terminal region of PGL-3 are not essential for the fast dynamics within condensates. Therefore, we addressed the role of the N-terminal α-helical region (1-452) in driving dynamics. In order to avoid engineering mutations that result in significant misfolding of PGL-3 and concomitant loss of its ability to phase separate, we focused our mutational analysis close to the junction of the folded N-terminus and the disordered C-terminus of PGL-3. Surprisingly, we found that a full-length PGL-3 construct (D425-452) that lacks only 27 residues phase separates into condensates that are non-dynamic (Fig. 3a, c). Sequence analysis of the PGL-3 protein predicts that this region 425-452 spans two α-helices (one complete helix and fraction of a second helix) (Supplementary Fig. 3d). We generated a PGL-3 construct (hereafter called ‘S1’) (Fig. 3a) in which the sequence in the region, 425-452, is shuffled while keeping the overall amino acid composition unchanged. We found that S1 phase separates into condensates that are 20- fold less dynamic than with wild-type PGL-3 (Fig. 3d, Supplementary Fig. 3c).”

      Saying that "reduced alpha-helicity of PGL-3 correlates with slower dynamics in condensates" may be factual in these assays but "correlation" should be expanded upon to include mechanism and to me it seems that the statement should read "aggregation of PGL-3 causes slower dynamics in condensates" (both the partially destabilized mutant and the fully unfolded WT show similar effects perhaps to different degrees).

      We feel that identifying the precise structural changes of the semi-denatured PGL-3 proteins within the condensate vs. dilute phase could be an interesting direction for future investigation.

      We did not use the term "aggregation" since we did not detect aggregates of S1 molecules using fluorescence confocal microscopy.

      *CROSS-CONSULTATION COMMENTS I agree with the other reviewers' comments and critiques; I have concerns about the biological relevance and also the biophysical mechanisms. Reflecting on the other reviewers' comments, the paper could provide more depth in one or both of these areas to come to firm conclusions that are either revealing about PGL biology or elucidate a (possible) general biophysical mechanism.*

      In the revised version, we now include additional data showing “dynamics buffering” in transgenic worms generated using CRISPR/Cas9 technology. Briefly, we used CRISPR/Cas9 to generate transgenic C. elegans that express PGL-3-mEGFP or PGL-3(D425-452)-mEGFP from the native pgl-3 locus. In vitro, wild-type PGL-3-mEGFP protein generates liquid-like condensates. On the other hand, the recombinantly purified PGL-3(D425-452)-mEGFP protein generates condensates that are non-dynamic. In contrast to these observations in vitro, both wild-type PGL-3-mEGFP and PGL-3(D425-452)-mEGFP show similar dynamics (half-time of FRAP recovery) within P granules in vivo.

      Reviewer #2 (Significance (Required)): *Hence, although the authors show how inclusion of other components can alter the phase separation of a single protein component, this is done by entirely artificial means of destabilizing the fold of one of the domains, which likely leads to aggregation. So the true impact of the work is hard to understand, because the mutations' impact on the basic biophysical properties of the domain (stability, interaction) is not completely characterized and the reason for disrupting this folding is not clear.*

      A major impact of our work is elucidation of a novel “dynamics buffering” property within biomolecular condensates in vitro. Our in vivo data is consistent with this finding.

      We have chosen two orthogonal ways of perturbing the PGL-3 protein (i.e. mutations and temperature-dependent unfolding) to assay the effect on diffusion rate against different levels of perturbation (e.g. 30% loss of α-helicity in heat-denatured PGL-3-mEGFP vs. 15% loss of α-helicity in the S1 mutant, compared to wild-type PGL-3). Studying the phase separation behavior of these “artificially-generated” constructs provided the understanding that dynamics of PGL-3 in condensates depends on inter-molecular interactions, and slower dynamics generally correlate with stronger inter-molecular interactions. Further, interactions among two or more P granule components can buffer against large changes in dynamics / aggregation within the P granule phase. These insights may lay the groundwork for addressing how more “natural” modifications (e.g., post-translational modifications, high local concentration of “sticky” molecules) may influence dynamics within biomolecular condensates in vivo.

      Based on current knowledge of P granule composition, chaperone proteins (e.g. heat-shock family proteins) do not show abundant concentration within P granules. However, it is unclear if chaperone proteins are completely excluded from the P granule phase. Therefore, we speculate that weak interactions among two or more non-chaperone proteins contribute significantly to “dynamics buffering” within the P granule phase in vivo.

      In the discussion section of the manuscript, we had speculated that “dynamics buffering” may potentially explain observations reported in the nucleolus: “Similarly, interactions among components could be a potential mechanism of storage of misfolding-prone proteins in non-aggregated state within the liquid-like nucleolus under stress in vivo (Frottin et al, 2019).”

      Our finding is also relevant in the context of synthetic biology with applications that require steady diffusion rate of macromolecules during biochemical reactions within biomolecular condensates.

      *My field of expertise is protein phase separation and protein structure.*

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *Summary: P granules are liquid condensates found in the developing germlines and embryos of C. elegans. Prior work by the authors and others has established P granules as a tractable model to investigate the basic biophysical properties of liquid condensates. Much of the prior published work focused on specific P granule scaffold proteins, PGL-1 and PGL-3. How attributes of these PGL proteins and the effect of other P granule components affect condensate properties is not fully understood. Here, Jelenic, et al. probe the biophysical properties of PGL-3. Using recombinant protein, they show that an N-terminal, alpha-helical region of PGL-3 is sufficient for liquid condensate formation and that N-terminal assembly is required for this formation. Creation of a scrambled alpha-helical region in PGL-3 and heat treatment both affect PGL-3 fluidity. This fluidity can be "rescued" in vivo and in vitro with the inclusion of other P granule factors, including wildtype PGL-3, PGL-1, GLH-1 and mRNA. The authors note an inverse correlation between fluidity and mutant PGL-3 fluorescent intensity. They propose a model that heterotypic compositions of condensates can buffer their fluidity against components with stronger multivalent interactions.*

      MAJOR: 1. PGL-3 is a fantastic model to study the biophysical properties of a liquid condensate. But as the authors address in their discussion, the S1 mutant will likely affect the central domain folding, at its minimum causing exposure of a hydrophobic surface not typically exposed in biology. These helices are found at the terminal portion of the domain determined in the crystal structure and as depicted in the authors' Figure 1A. While the cause of S1's enhanced molecular interactions does not affect the in vitro work presented in this manuscript, it does affect how the conclusions connect to the biological nature of P granules and liquid condensates more generally. *

      We have chosen two orthogonal ways of perturbing the PGL-3 protein (i.e. mutations and temperature-dependent unfolding) to assay the effect on diffusion rate against different levels of perturbation (e.g. 30% loss of α-helicity in heat-denatured PGL-3-mEGFP vs. 15% loss of α-helicity in the S1 mutant, compared to wild-type PGL-3). Studying the phase separation behavior of these “artificial” constructs provided the understanding that dynamics of PGL-3 in condensates depends on inter-molecular interactions, and slower dynamics generally correlate with stronger inter-molecular interactions. Further, interactions among two or more P granule components can buffer against large changes in dynamics / aggregation within the P granule phase. These insights may lay the groundwork for addressing how more “natural” modifications (e.g., post-translational modifications, high local concentration of “sticky” molecules) may influence dynamics within biomolecular condensates in vivo.

      Based on current knowledge of P granule composition, chaperone proteins (e.g. heat-shock family proteins) do not show abundant concentration within P granules. However, it is unclear if chaperone proteins are completely excluded from the P granule phase. Therefore, we speculate that weak interactions among two or more non-chaperone proteins contribute significantly to “dynamics buffering” within the P granule phase in vivo.

      In the discussion section of the manuscript, we had speculated that “dynamics buffering” may potentially explain observations reported in the nucleolus: “Similarly, interactions among components could be a potential mechanism of storage of misfolding-prone proteins in non-aggregated state within the liquid-like nucleolus under stress in vivo (Frottin et al, 2019).”

      Our finding is also relevant in the context of synthetic biology with applications that require steady diffusion rate of macromolecules during biochemical reactions within biomolecular condensates.

      • Recombinant PGL-3 experiments added PGL-1, GLH-1 and mRNA simultaneously and measured fluidity. It will be interesting to know which components contribute to fluidity and whether fluidity enhancement of each component is dependent on one another. Addition experiments with each component should be included and/or at least discussed in the main text. *

      Our data with S1-mEGFP or PGL-3-mEGFP (pre-heated at 50°C) proteins microinjected into C. elegans gonads, and the transgenic strain expressing PGL-3(D425-452)-mEGFP from the pgl-3 locus showed that the P granule phase can support fast dynamics of these mutant PGL-3 constructs. Since P granules have a complex composition, one possibility is that fast dynamics of these constructs is supported by interactions involving many P granule components. We found that using only a limited set of P granule components (PGL-1, GLH-1 and mRNA) can buffer dynamics of S1 in condensates in vitro.

      In the absence of a systematic analysis of the individual roles of the approx. 70 P granule proteins in buffering S1 dynamics in condensates in vitro, we have claimed in the text that dynamics-buffering of S1 in condensates is supported by interactions among two or more components. However, we do appreciate the reviewer’s comment and feel it would be interesting to investigate the contribution of individual P granule components towards fluidity in future studies. We have discussed this in the ‘Discussion’ section of the manuscript.

      • The biological relevance of PGL-1, GLH-1, and mRNA was not discussed in the main text. How these factors contribute to P granule assembly and function should be mentioned in the Introduction or Results. *

      To address this concern, we have added a paragraph in the Introduction section of the revised manuscript.

      *MINOR: 1. Line 20, "most non-membrane-bound compartments...have complex composition": Are there examples of condensates that do not have complex composition? *

      Not all non-membrane-bound compartments may have been characterized. To accommodate this possibility, we refrained from making a more general statement, but stated “most non-membrane-bound compartments…”.

      • Lines 40-43, RNA interactions driving LLPS: Please include citations from the Parker Lab (e.g. Van Treeck and Parker, Cell. 2018 doi: 10.1016/j.cell.2018.07.023) *

      We added the reference suggested by the reviewer.

      • Line 60, condensates contain hundreds of different proteins and RNA: Please cite at least a few examples of condensates with their components identified. *

      We added references as suggested by the reviewer.

      • Lines 82-84, PGL-3 drives assembly: Please cite Kawasaki, et al. Genetics 2004 for the discovery of PGL-3. *

      We added the reference suggested by the reviewer.

      • Lines 88-89, PGL-3 N-terminal fragment predominantly alpha-helical: The PGL domain structures should be cited here as supporting evidence that these regions are composed primarily of alpha helices (Aoki, et al 2016, 2021) *

      To address this concern, we have added a paragraph in the Introduction section of the revised manuscript.

      • Lines 158-159, driving forces for phase separation: This statement should be removed or expanded. The authors' point regarding the protein concentrations is not clear here but is clarified in the Discussion (Lines 691-693). Recommend removing due to its speculative nature. *

      We retained the speculative comment in the results section. We feel that this prepares the readers for the discussion later in the manuscript.

      • Lines 210: Add commas before and after "PGL-1 and GLH-1"*

      We addressed the reviewer’s suggestion.

      • Lines 218-219: add "and" instead of comma between PGL-1 and GLH-1 *

      We addressed the reviewer’s suggestion.

      • Lines 238-239, alpha-helices: The PGL CDD structure should also be referenced here (Aoki, et al 2016). *

      To address this concern, we have added a paragraph in the Introduction section of the revised manuscript.

      • Lines 680-682, MEG proteins: Please cite accordingly. *

      We added the reference suggested by the reviewer.

      • Lines 694-695, heterotypic interactions: Please cite Saha, et al. 2016. *

      We added the reference suggested by the reviewer.

      • Figure 1: Add space between 1 and mM DTT *

      We addressed the reviewer’s suggestion.

      • Figure 2b: Please provide statistics between condensate numbers. *

      We provide statistics between condensate numbers in Fig. 2b.

      • Figure 4A: The region of the germline imaged and analyzed should be mentioned in the caption or the main text. *

      We revised the Figure legend of Fig. 4a to address this issue.

      • Figure 4B,C: Please include statistics between the FRAP curves. *

      We have included statistics comparing FRAP curves in Supplementary Fig. 4a-c.

      • Figure 4D: It will be helpful to compare this curve to Figure S4A in the same graph. Please also include graph statistics. *

      We have revised Fig. 4 to address the reviewer’s suggestion.

      • Figure 5: The data points are difficult to resolve. Recommend use of color.*

      We considered the suggestion, but felt it works better in the original form.

      • Figure 6: This is a very general model that does not highlight the extensive experimental work performed by the authors. Recommend incorporating PGL-3, mutants and P granule factors into this model. *

      We thank the reviewer for appreciating our extensive work. However, we retained the original Fig. 6 for the sake of simplicity.

      • Methods, Line 939, C. elegans section: What worms were used? TH623? Please describe the genotype. *

      We have included a table listing the strains used in the study and their genotypes.

      *CROSS-CONSULTATION COMMENTS While my review was arguably the most favorable of the three, I agree with the other reviewers' comments and evaluation, particularly with Reviewer #1. As written in my review, my primary concern was the biological relevance of the work.*

      Reviewer #3 (Significance (Required)):

      Overall, the in vitro work presented investigating the biophysical properties of this minimal P granule system was thorough and well-analyzed, and the manuscript was clearly written. Additional citations and statistics will improve the manuscript and the strength of the conclusions, respectively. The biological relevance of this study to P granule form and function in vivo, and to condensates in vivo, is debatable. This work will interest those who study condensate biology, the biophysics of protein-protein and protein-RNA interactions, and RNA biochemists more generally.

      A major impact of our work is elucidation of a novel “dynamics buffering” property within biomolecular condensates in vitro. Our in vivo data is consistent with this finding.

      We have chosen two orthogonal ways of perturbing the PGL-3 protein (i.e. mutations and temperature-dependent unfolding) to assay the effect on diffusion rate against different levels of perturbation (e.g. 30% loss of α-helicity in heat-denatured PGL-3-mEGFP vs. 15% loss of α-helicity in the S1 mutant, compared to wild-type PGL-3). Studying the phase separation behavior of these “artificially-generated” constructs provided the understanding that dynamics of PGL-3 in condensates depends on inter-molecular interactions, and slower dynamics generally correlate with stronger inter-molecular interactions. Further, interactions among two or more P granule components can buffer against large changes in dynamics / aggregation within the P granule phase. These insights may lay the groundwork for addressing how more “natural” modifications (e.g., post-translational modifications, high local concentration of “sticky” molecules) may influence dynamics within biomolecular condensates in vivo.

      Based on current knowledge of P granule composition, chaperone proteins (e.g. heat-shock family proteins) do not show abundant concentration within P granules. However, it is unclear if chaperone proteins are completely excluded from the P granule phase. Therefore, we speculate that weak interactions among two or more non-chaperone proteins contribute significantly to “dynamics buffering” within the P granule phase in vivo.

      In the discussion section of the manuscript, we had speculated that “dynamics buffering” may potentially explain observations reported in the nucleolus: “Similarly, interactions among components could be a potential mechanism of storage of misfolding-prone proteins in non-aggregated state within the liquid-like nucleolus under stress in vivo (Frottin et al, 2019).”

      Our finding is also relevant in the context of synthetic biology with applications that require steady diffusion rate of macromolecules during biochemical reactions within biomolecular condensates.

      *I have expertise in P granules, protein/RNA biochemistry, condensate assembly, and C. elegans. *

      References

      Aoki ST, Kershner AM, Bingman CA, Wickens M & Kimble J (2016) PGL germ granule assembly protein is a base-specific, single-stranded RNase. Proceedings of the National Academy of Sciences of the United States of America

      Aoki ST, Lynch TR, Crittenden SL, Bingman CA, Wickens M & Kimble J (2021) C. elegans germ granules require both assembly and localized regulators for mRNA repression. Nat Commun 12: 996

      Cipriani PG, Bay O, Zinno J, Gutwein M, Gan HH, Mayya VK, Chung G, Chen J-X, Fahs H, Guan Y, et al (2021) Novel LOTUS-domain proteins are organizational hubs that recruit C. elegans Vasa to germ granules. Elife 10: e60833

      Frottin F, Schueder F, Tiwary S, Gupta R, Körner R, Schlichthaerle T, Cox J, Jungmann R, Hartl FU & Hipp MS (2019) The nucleolus functions as a phase-separated protein quality control compartment. Science 365: 342–347

      Kawasaki I, Amiri A, Fan Y, Meyer N, Dunkelbarger S, Motohashi T, Karashima T, Bossinger O & Strome S (2004) The PGL family proteins associate with germ granules and function redundantly in Caenorhabditis elegans germline development. Genetics 167: 645–661

      Kawasaki I, Shim YH, Kirchner J, Kaminker J, Wood WB & Strome S (1998) PGL-1, a predicted RNA-binding component of germ granules, is essential for fertility in C. elegans. Cell 94: 635–645

      Phillips CM & Updike DL (2022) Germ granules and gene regulation in the Caenorhabditis elegans germline. Genetics 220: iyab195

      Price IF, Hertz HL, Pastore B, Wagner J & Tang W (2021) Proximity labeling identifies LOTUS domain proteins that promote the formation of perinuclear germ granules in C. elegans. Elife 10: e72276

      Saha S, Weber CA, Nousch M, Adame-Arana O, Hoege C, Hein MY, Osborne Nishimura E, Mahamid J, Jahnel M, Jawerth L, et al (2016) Polar Positioning of Phase-Separated Liquid Compartments in Cells Regulated by an mRNA Competition Mechanism. Cell 166: 1572-1584.e16

      Spike C, Meyer N, Racen E, Orsborn A, Kirchner J, Kuznicki K, Yee C, Bennett K & Strome S (2008a) Genetic analysis of the Caenorhabditis elegans GLH family of P-granule proteins. Genetics 178: 1973–1987

      Spike CA, Bader J, Reinke V & Strome S (2008b) DEPS-1 promotes P-granule assembly and RNA interference in C. elegans germ cells. Development (Cambridge, England) 135: 983–993

    1. Author Response

      Reviewer #1 (Public Review):

      Several questions have remained regarding the characteristics of these cells:

      1) Based on the transcriptome data in Figure 2, the authors inferred that thymic macrophages are "specialized in lysosome degradation of phagocytosed material and antigen presentation" yet did not show functional data to support these claims. Functional assays such as phagocytosis and antigen presentation are desirable, especially in comparison to other well characterized macrophage populations.

      We agree with the reviewer that additional functional characterization of thymic macrophages will strengthen the conclusions of our manuscript. We have performed an antigen presentation assay and an in vitro phagocytosis assay to functionally characterize the thymic macrophages. Indeed, thymic macrophages seem to be quite good antigen-presenting cells: not as good as thymic DCs, but much better than peritoneal macrophages. This is documented in Fig. 3A and B. They were also good phagocytes both in vitro and in vivo, as demonstrated in Fig. 3C-G. Surprisingly, peritoneal macrophages were better in the in vitro phagocytosis assay. We attribute this result to thymic macrophages’ poor survival during the sorting and in vitro culture.

      2) Do transcriptomes of CX3CR1+ thymic macrophages in old mice significantly differ from those of young mice?

      This is a very interesting question that we plan to explore in the future, but we feel it is beyond the scope of the current manuscript.

      3) It would be helpful to better graphically show the compositions (both cell number and cell ratio) of thymic macrophage subsets (TIM4+, CX3CR1+, and others) in mice at different ages (1 week, 6 weeks, and 4 months old). It is not straightforward to deduce all the information based on the current data presentation.

      We thank the reviewer for the suggestion! Plotting the cell numbers did reveal a peak at a young age followed by a significant decline in the number of Tim4+ cells, and a trend toward accumulation of Tim4- cells with age. Unfortunately, older mice show great variability in thymus size, which prevented the Tim4- result from being statistically significant. We have added these data to Fig. 8F.

      4) The description of the gating strategy of thymic macrophages for Figure 1 is quite verbose. Adding a step-wise gating strategy of thymic macrophages as a figure panel would be helpful for readers to follow the experimental details.

      We thank the reviewer for the suggestion. The description of the gating strategy has been stripped to 2 panels that capture its essence (Fig. 1B).

      Reviewer #2 (Public Review):

      This work provides by far the most thorough characterization of thymic macrophages. The authors used bulk RNA-seq, single-cell seq and fate mapping animal models to demonstrate the phenotype, origin and diversity of thymic macrophages. Overall the manuscript is well written and the conclusions of the paper are mostly well supported by data.

      Some aspects of data acquisition and data analysis need to be clarified.

      1) The authors should state what "row min" and "row max" in Figure 2b,d refer to. Is this an expression value on a log scale? In Figure 2d, the authors compared their own RNAseq data with ImmGen seq data; what kind of normalization did the authors apply?

      We apologize for not making this clear. The values in Fig. 2b and d (current Fig. 2A and C) are expression values on a log scale. We have included this information in the figure.

      Our data is part of the IMMGEN dataset. We sorted the cells and sent them to the US for RNA sequencing. That is why we referred to it as “our” data. However, to avoid confusion we changed the wording to clearly reflect that the data are from IMMGEN.

      2) The authors used immunofluorescence to identify the localization of two populations of macrophages, where they used MerTK staining to indicate all macrophages. However, MerTK expression may not be restricted to immune cells. The authors are encouraged to confirm that MerTK only labels macrophages in the thymus by co-staining with F4/80 or CD45. Tim4 can also be used in immunofluorescence.

      We agree that staining with additional macrophage markers will strengthen our conclusions about ThyMacs localization. We have performed staining with CD64 together with MerTK or Tim4. CD64 and MerTK almost completely overlapped and so did CD64 and Tim4 in the cortex. We could not stain MerTK and Tim4 together because the antibodies are raised in the same species (rat). Additional evidence for the specificity of these markers for thymic macrophages comes from Fig. 3E and F showing the high degree of co-localization of apoptotic cells (TUNEL+) with MerTK or Tim4. Finally, Fig. 4 figure supplement 1 also clearly shows the distribution of TIM4 and CD64 in the whole thymus.

      3) The data of Cx3cr1+ cells accumulation with age in thymus is very interesting, and as the author has discussed, might indicate their contribution to thymus involution. However, the authors only showed change of percentage. As the total macrophages numbers decreased with age, it is not clear whether these cells actually "accumulate" with age. It will help us to assess if this increased percentage of Cx3Cr1+ cells is an actual increase of "influx" or due to the decrease of the self-maintain Tim4+ macrophage subsets.

      The reviewer is raising a very important point. As the changes in the proportions of Tim4+ and Tim4- thymic macrophages with age occur against the background of thymic involution, it is difficult to judge whether Tim4+ cells self-maintain and whether Tim4- cells accumulate. Plotting the cell numbers revealed a peak at a young age followed by a significant decline in the number of Tim4+ cells, and a trend for accumulation of Tim4- cells with age. Unfortunately, older mice show great variability in thymus size, which prevented the Tim4- result from being statistically significant. We have added these data to Fig. 8F.

      Reviewer #3 (Public Review):

      This study by Zhou et al. focuses on thymic macrophages and shows that two populations can be distinguished with different identities, localization and origin. Authors use several murine reporter and fate-mapping models, coupled with flow cytometry and transcriptomics approach to support their claims.

      Overall, the question tackled by this study is interesting: thymic macrophages have been somewhat forgotten in the last decade, which has seen many studies similar to the one presented here in other organs. So the stated aim of closing this gap is relevant. But the current version of the study suffers from many defects, more or less severe, which affect its clarity and persuasiveness.

      • About the plan, authors study the origin of the thymic population and provide data in fig 2, 3 & 4 assuming that thymic macs form a homogeneous population. But from fig 5, they distinguish 2 populations and study them separately. So the end of the paper renders obsolete the beginning, that asks for a revision of the whole plan.

      We agree with the reviewer that there is more than one way to tell this story and we have been agonizing over our plan. However, we respectfully disagree that the beginning of the paper is made obsolete by the ending for several reasons:

      1) The initial figures in our manuscript contain very fundamental characterization of ThyMacs. Just as the revelation of heterogeneity in liver macrophages or lung macrophages (ref) does not render all prior research on these cells obsolete, the initial figures in our manuscript are an essential part of the story. Such data are available for all other studied tissue-resident macrophage populations. Removing them would be a disservice to the community.

      2) Another reviewer asked for deeper characterization of ThyMacs based on the data in Fig. 2. Accommodating this request will be very difficult if we remove this part.

      Nevertheless, we agree that ThyMacs heterogeneity is the central claim of the manuscript and should be introduced earlier. Now, the original figure 5 (current Fig. 4) that described the heterogeneity has been moved before the original figures 3 and 4 (current Fig. 5 and 6). Additional analyses distinguishing Tim4+ and Tim4- ThyMacs have been incorporated in current Fig. 5 and 6.

      • The figure 1 is not very clear. The backgating should be added in 1a. Or why not using the color map axis mode from FlowJo to show 3 parameters at a glance? The gating strategy should be more clearly displayed on the figure. On fig 1S3, there are clearly 2 pops in the CX3CR1-GFP mice. Why not starting from this to introduce the two populations?

      We thank the reviewer for the suggestion. We have included a color map axis to show MerTK, CD64, and F4/80 in one plot. The description of the gating strategy has been stripped to 2 panels that capture its essence. We agree that there are several indications for heterogeneity among thymic macrophages, starting with Fig. 1E – the expression of Tim4, and Fig S4c – the expression of CX3CR1-GFP. We have added extra text at the beginning of the paragraph describing current Fig. 4 to point out these facts.

      • The figure 2 could be revised also. First, the panel 2a is useless and should be removed. A PC analysis of all the macs would be more useful here. Also, the color code used for the genes is confusing. Why genes up in ThyMacs are red in 2b but only half of them in 2d? Info can be found in the legend but it should be more clear on a graphical point of view.

      We have revised Fig. 2 according to the reviewer’s suggestions. The PCA analysis is consistent with the hierarchical clustering and shows that splenic and liver macrophages are most closely related to ThyMacs. We agree that the presence of red in both heatmaps is confusing and we have changed the color code – color was removed from current Fig. 2A but retained in Fig. 2C.

      • For figure 3, what is the timepoint of the panel 3b? Here, authors should show microglia and ThyMacs for both timepoints and conclude based on the comparison. If ThyMacs are as stable as the microglia, no replacement. If not, replacement. For the panel 3f, n=3 is too low to be convinced notably with the standard variation here. And displaying the dot plot with 11% of blood mono from donor while the median being around 20 is not fair, authors should present the most representative plot. For the panel 3h, there are more GFP (in term of MFI) for TEC and ThyMacs than for total cells. How is it possible? TECs and ThyMacs should be in the total cells? Or the gating is not clear enough?

      We thank the reviewer for pointing out our omissions. Fig. 3b (current Fig. 5B) is from E19.5 and we have added this information to the figure. We also agree that in Fig. 3f (current Fig. 5F) the sample number is too small and the variation too large to make solid conclusions. That is why we have repeated the partial chimeras experiment, trying to irradiate as much of the mice as possible without affecting the thymus. We have substituted the data in Fig. 3e and 3f with the new data. For Fig. 3h, we apologize for not labeling the data clearly. The panels labeled “single, live cells” should be labeled as “thymocytes” as they were obtained without the enzymatic digestion that is essential for both TECs and ThyMacs. However, we found an important caveat in the thymus transplant experiment. It appeared that some of the thymus macrophages were GFP positive not because they express GFP but because they have engulfed GFP+ cells. As a result, our experiments with embryonic GFP+ thymus transplants overestimate the percentage of donor-derived ThyMacs (all of them were GFP+). We have repeated the thymus transplantation experiments with congenically marked thymuses (CD45.2 donor and CD45.1 host). While this setup did not allow us to use the thymic epithelial cells as a positive control because they are CD45-, we did identify host-derived ThyMacs, consistent with Tim4- cells originating from adult HSCs. Thus, we have replaced the previous data in Fig. 3H and 3I with current figures 5H and 5I.

      • For figure 4, the EdU staining (4e) is not convincing at all. The signal is very low (as compared to 4c, for example).

      We agree that signal after 21d chase is a lot weaker than after 2 h (Fig. 4c) or 21d (Fig. 4e) of EdU pulse. The reason we decided to keep this data is that: 1) the thymocytes also have much lower EdU staining after 21d chase compared to 2h and 21d of EdU pulse; 2) The results from EdU staining are very consistent with the data from Ki67 staining, cell cycle analysis, and scRNA-Seq revealing a small population (~5%) of cycling ThyMacs.

      • For figure 7, the interpretation of the data and the way they are presented are not clear. The authors use an inducible fate-mapping model. The fact that Tim4- cells lose their signal with time argues for a replacement by non-labelled cells (blood monocytes), whereas Tim4+ ones are stable, meaning they self-maintain. This is what the authors claim. But how does it fit with previous data where they say that Tim4+ cells are derived from CX3CR1+ cells? The explanation that is somewhat implied here but not shown clearly enough is that CX3CR1+ cells give rise to Tim4+ cells during embryonic development but this stops afterward, Tim4+ cells self-renew independently, and CX3CR1+ cells are slowly replaced by monocytes. As this is the central claim of the paper, it should be reported more clearly, and for this a substantial change of the whole plan is required.

      We thank the reviewer for pointing out the need for better explanation. The maintenance of the different populations of ThyMacs is indeed complex and proceeds in different ways in the different periods of life. We have added some extra data to Fig. 7 (current Fig. 8) that we hope will add some clarity to the maintenance of thymic macrophages with age. The new Fig. 8F shows the dynamics of the cell numbers of Tim4+ and Tim4- macrophages with age. Tim4+ cells reach a peak in young mice and decline significantly as mice age. So, we do not think that they are self-maintaining but instead, undergo slow attrition with very limited replacement. These results are consistent with Fig. 6I showing low levels of Mki67 in Tim4+ cells. Tim4- are a different story: they progressively accumulate with age. Although the variability in thymus size and Tim4- macrophages in very old mice is too great for the data to reach significance, the trend is clear.

      As for the dynamics of the populations in the embryonic period, we added data formally demonstrating by fate mapping that TIM4+CX3CR1- cells are derived from CX3CR1+ cells (Fig. 7E-G). We induced recombination in ROSA26LSL-GFP mice pregnant from Cx3cr1CreER males at E15.5, when almost all ThyMacs are Cx3cr1+ (Fig. 7A). Just before birth, at E19.5, we could find a substantial proportion of TIM4+CX3CR1- cells among the fate-mapped GFP+ macrophages, indicating that Cx3cr1+ cells, indeed, give rise to TIM4+CX3CR1- cells. As pointed out before, this pathway gets exhausted by the first week after birth – at d7 all ThyMacs are TIM4+.

    1. we must acknowledge that our styles of teaching may need to change. Let's face it: most of us were taught in classrooms where styles of teaching reflected the notion of a single norm of thought and experience, which we were encouraged to believe was universal.

      I totally agree with this statement, because different people think differently, and people's brains work and learn in different ways at different stages of growth. It is really surprising that we as students all learn through the same methods and experiences. Clearly, the style of teaching should change.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors have assembled an enormous amount of statistical data on the genomes and phylogeny of Arctic algae, including the genomes of four new species that they sequenced for this study. Their main finding is that horizontal gene transfer has led to convergent evolution in distantly related microalgae.

      **Major comments**

      Reviewer #1: The purpose of the study is not clearly stated in the abstract or the introduction. The authors say (line 93) "Defining the genetic adaptations underpinning these small algal species is crucial as a baseline to understand their response to anthropogenic global change (Notz & Stroeve, 2016)." Is this their goal? Or are they just quoting another study? The authors state (line 103) "We extend by sequencing the genomes of four distantly related microalgae...". This is not really a question or a hypothesis. I am sure the authors can provide a more compelling reason to embark on such a labor-intensive study.

      Reply: We agree that the aim was lost in the details and the Introduction is now focused towards the original goal of the study, which was to investigate convergent evolution in a biogeographically isolated ocean. Additional references on the formation and history of the Arctic Basin have been added to the Introduction to provide context. “An ocean has been present at the pole since the beginning of the Cretaceous. Shaped by tectonic processes (Nikishin et al., 2021) the Arctic Ocean has been a relatively closed basin since the Maastrichtian at the end of the late Cretaceous epoch (ca. 70 million years before present), with episodic sea-ice cover since that time (Niezgodzki et al., 2019). This long history suggests limited gene flow from the global ocean over vast time scales and Arctic marine species including microalgae could well have unique adaptations to cold arctic conditions.” Line 78-83.

      And following this we provide a clear hypothesis: “The potential for lineages of ancient Arctic origin and the episodic input of outside species led us to our hypothesis that Arctic microalgae convergently evolved traits or adaptations aiding survival in an ice-influenced ocean.” Line 112-117.

      We also discuss both the adaptive and distinct physical environment of the Arctic, and its topographical separation from other ocean regions as dispersal limitation would enhance the Arctic-specific genomic signatures. We now cite the recent paper by Sommeria-Kline et al. (2020), which puts eukaryotic plankton biogeography into a global context (Line 72)

      Reviewer #1: The most prominent shared trait that the authors found are genes for ice-binding proteins. However, in view of their importance, little information is given about their different types and possible functions.

      Reply: We appreciate the comment and have added information on relevant ice-binding proteins found in the Arctic algae. In addition, we discuss how the functional and secretory diversity of IBP would enhance the survivability of pelagic taxa. Lines 534 to 564.

      Although ice-binding proteins from multicellular animals and plants are outside the scope of this study, there is a recent review: Bar Dolev, Braslavsky and Davies 2016, Annual Review of Biochemistry 85: 515-542.


      Reviewer #1: The HGT of ice-binding proteins is a major focus of this study, but little is said about what previous studies have said about this. What are the previous studies, what are their findings and how do the present findings contribute to this?

      Reply: We agree that this aspect should have been more visible. We incorporated new data to characterize IBPs drawn from MMETSP transcriptomes, and environmental Tara Ocean metagenomes, as well as our Arctic strains. We note that as we take a PFAM-based approach, the IBPs treated are DUF3494/PF11999 domain, which are type 1 IBPs / algal IBPs (Raymond and Remia 2019). As an example of novelty, we identify the position of IBPs from dinoflagellates, within a larger Arctic Clade that included CCMP2293, CCMP2436 and CCMP2097 and Arctic TARA IBP, rendering this a pan-algal IBD clade.

      In addition, we were able to resolve the position of anomalous F. cylindrus IBP that fell between two Arctic associated clades (A and B, in our Fig 4). This finding is consistent with F. cylindrus originating in the Arctic as previously suggested and subsequently invading the Southern Ocean.

      The recurrent acquisition of multiple diverse IBP isoforms in individual species through HGT events has not been previously reported, and the extent of isoforms in the Arctic was surprising. See for example multiple different IBP forms with separate origins in Pavlovales CCMP2436 (Fig 4). The previous studies are referred to in the context of the phylogeny of the IBD within the results section: Lines 322- 413, and Lines 534-585.

      Reviewer #1: Figure 5 on HGT of ice-binding proteins is difficult to follow. It would be clearer if each panel could be described separately, clearly stating its main finding. I doubt that a reader could look at this figure and explain to a colleague what it shows.

      Reply: We have revised and rearranged the figure (now Fig 4) with Arctic A, B, C and D clearly indicated as well as the two Antarctic-dominated clades. The upper schematic includes the deepest phylogeny of algal IBDs to date, incorporating all of UniRef, MMETSP and TARA Oceans. The fasta files underlying the tree and the nexus file used are provided in the S1 Data Folder, which is an excel folder with information on the analysis of the data. The callout and order of the clades have been revised to facilitate interpretation of the phylogenies. The entire section has been completely rewritten.

      Reviewer #1: This is also a problem with many of the other figures. For each figure, what is the question being asked and what is its take-home message?

      Reply: We agree that the message was lost and have now focused on our original question in our accepted proposal to JGI: “Is there a convergence among arctic microalgae at the genomic level?”. We found that some genome properties were common among the Arctic isolates (more unknown PFAMs and several expanded PFAMs). We also highlight the importance of ice-binding proteins in Arctic isolates and the widespread inter-algal HGT of this important protein among the Arctic strains. The IBP biogeography and phylogeny strongly indicate that the Arctic microalgae have acquired IBP locally and that the Antarctic strains have acquired additional isoforms independently from Antarctic bacteria and fungi (Lines 565-585).

      Reviewer #1: The paper has more data than a reader can absorb. It could be strengthened by reducing the number of figures, simplifying them if possible, and more clearly stating the value of the remaining figures.

      Reply. As suggested, we have refocused the paper, removing the more speculative statistics-based analyses and associated figures. The main conclusions are supported by the 5 main figures. We now present 5 main figures and 11 supplementary figures (previously 23 downloadable supplementary figures and 40 online-only figures supporting the supplementary figures). We agree with the reviewer, and we feel the revised version is a more transparent synthesis. Briefly, the figures illustrate the following points. Fig. 1. The multigene tree of available algal genomes and transcriptomes provides a clear framework for judging the divergence of subsequent individual gene and PFAM phylogenies. Fig. 2 (originally Fig. 3) indicates the convergence of PFAM domains in the Arctic strains, in contrast to strains from elsewhere. Fig. 3 (originally Figure 4) shows Arctic-specific expansions and contractions of PFAM domains, again demonstrating convergent evolution in the Arctic. The figure identifies specific PFAMs that contribute to the within-Arctic convergence. This figure is based on statistical methods independent of Fig 2. Figure 4 is the most extensive IBP phylogeny to date and has been discussed above. Figure 5, which was supplementary in our non-peer-reviewed version, shows the biogeographic distribution of IBP, and can be compared to the distributions of the 18S rRNA genes from the four Arctic algae provided as supplementary (S6 Fig.).

      **Minor comments** Reviewer #1

      1. The figure citations are confusing. E.g., what does "Fig.1- Figure supplement 1" refer to? Does this refer to 1 or 2 figures? Apparently, it refers only to Fig. S1, so many readers will be confused when they look at Fig. 1.

      Reply: We apologize for the confusing format; the manuscript had been formatted for the online journal eLife. Our revision follows the more traditional style of PLoS Biology and other Review Commons journals.


      Multiple citations should be in order of publication date, not alphabetical order.

      Reply: We agree that ordering by date of publication is quite standard and recognizes priority of publication. Several online journals no longer follow this rule, and citation order will follow the specific style used by our accepting journal.

      Reviewer #1 (Significance (Required)): It is well known that useful genes tend to be shared among microorganisms. The present study strengthens previous studies in showing that gene transfer is an important process in polar regions.

      Reply: We thank the reviewer for recognizing the importance of our study.


      Reviewer #2 (Evidence, reproducibility, and clarity (Required)):

      This manuscript is the result of a large international collaborative effort, including the US Department of Energy Joint Genome Institute. Its focus is comparative genomics of eukaryotic Arctic algae. The primary data described in the ms are four new genome and transcriptome sequences from diverse Arctic algae, represented by a cryptomonad, a haptophyte, a chrysophyte, and a pelagophyte.

      The authors compare these new data to previously published genomic/transcriptomic data from eukaryotic algae with the goal of understanding genome evolution in the Arctic. The results of the paper are a series of large-scale comparative genomic bioinformatics analyses, including the associated statistical analyses. The key findings center on statistically significant features of Arctic genomes, features that stand out as compared to the genomes of algae that are not primarily found in the Arctic. Together, these findings allow the authors to make various hypotheses and suggestions about genetic adaptations to polar environments.

      By far the most significant finding is that the genomes of Arctic algae are enriched in genes encoding proteins with an ice-binding domain, paralleling findings from Antarctic algae. These genes appear to have spread among Arctic algal genomes via horizontal gene transfer, which raises a series of interesting questions. In my opinion, the major conclusions of this paper are supported by the data. Listed below are a few comments that may improve the ms:

      Reviewer #2.

      1) In today's post-genomics era, everyone seems to be sequencing nuclear genomes. Often what distinguishes high-impact and low-impact genome papers is the number of genomes presented and the quality of the genome assembly. I may have missed it, but reading the main text, the figures/tables, and the supplementary data I was not able to get a sense of the quality of the four genome assemblies from which the main findings are based. I was eventually able to find this information from PhycoCosm (note: some of the links to this site are not working in the ms). My quick scan of the PhycoCosm summary info for the four genomes indicates that the assemblies are highly fragmented, likely because they are based on short-read Illumina sequencing rather than a combination of short and long reads. I think it is important to briefly discuss (and or present) the quality of the assemblies in the ms and to highlight the potential limitations/drawbacks of employing highly fragmented assemblies when carrying out large-scale comparative genomics.

      Reply: We agree, and the data concerning the quality of the genome assemblies have been moved to Table 1 in the main text. The comparison with other paired related strains is provided in an excel folder designated S2 Data Folder.

      Reviewer #2.

      2) Horizontal gene transfer is undeniably a major driving force in evolution, and one that has shaped genomic architecture across the Tree of Life. I believe the data presented here support a role for HGT in the genome evolution of Arctic algae, particularly with respect to genes encoding proteins with an ice-binding domain. However, we can all think of numerous instances when authors of genome papers were too quick to point to HGT. Thus, I would urge more caution and balance when presenting the HGT data, including some discussion about factors that could incorrectly lead researchers to conclude a significant role for HGT, such as contamination, gene duplication, mis-assemblies, etc. I'm not suggesting that you change the main conclusions, but just tone down the language in places (e.g., "we reveal remarkable convergence in the coding content ... ").

      Reply: We understand the reviewer's concerns and now more clearly outline the pipeline we have used to identify HGTs. This included: filtering each genome to remove all possible contaminant sequences first, considering contig co-presence of both vertically and horizontally derived genes, and reciprocal and independent annotations of gene sequences in both genome sequences and MMETSP transcriptomes. Retained genes were subjected to simultaneous BLAST analysis and manually curated phylogenies using decontaminated reference datasets. The most parsimonious explanation for our final IBP domain microbial algal clusters (Fig 4) is HGT. On the side of caution, we removed the entire section that identified potential Arctic HGT based primarily on a less targeted broad statistical analysis. The focus is now on 3 genes that have clearly identifiable utility in the Arctic, were found to be enriched in Arctic genomes via a separate analysis and had homologs in the Tara Ocean Polar circle data. In addition, we describe more clearly the role of expansion and enrichment of PFAMs and the high proportion of genes without identifiable PFAMs in the Arctic strains as evidence for Arctic convergence separate from potential HGT.

      Reviewer #2.

      3) The downside of studying protists (as compared to multicellular animals, for instance) is that most are not widely known by the scientific community and even fewer scientists can picture what they actually look like (e.g., Pavlovales sp. CCMP2436). A few more details about the four Arctic algae that make up the focus of this paper might be helpful for the casual reader. My sense is that if at the next departmental meeting I asked my colleagues what a pelagophyte was, most would look at me with a blank stare. Moreover, am I right to assume that all four algae are psychrotolerant rather than psychrophilic (Supplement Fig. 1 makes me think otherwise)? It might be good to point out the difference in the text.

      Reply: High-resolution images of each strain are available on the JGI home page for each alga; given the multiple figures, we feel photos would not add information.

      Reviewer #2

      4) I don't think Supp. Table 1 (the Pan-algal dataset) got uploaded correctly during the manuscript submission stage. The first link I click on gives me Supp. Table 2.

      Reply: We apologize for this; the format was incorrect for the file designation and there were lost links. We now more accurately refer to these as Data Folders as they are excel folders containing multiple sheets. All supplementary links will be verified again on final submission.


      Reviewer #2 (Significance (Required)):

      By far the most significant finding from this paper is that the genomes of Arctic algae are enriched in genes encoding proteins with an ice-binding domain, paralleling findings from Antarctic algae. These genes appear to have spread among Arctic algal genomes via horizontal gene transfer, which raises a series of interesting questions. This is not the first paper to present these types of ideas, but it is arguably the broadest analysis yet, at least with respect to eukaryotic algae. This work will be of great interest to polar scientists, phycologists, protistologists, and the genomics community. I am a genome scientist studying protists, including algae.

      Reply. We thank the reviewer for their insightful comments.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This manuscript is focused on Arctic microalgae, an important yet understudied community in permanently cold ecosystems. By sequencing the genomes of four phylogenetically diverse and uncharacterized polar algae, the authors seek to elucidate genomic features and protein families that are similar in polar species (and differ from their relatives from temperate environments). This work used high-throughput genomic sequencing and computational analysis to demonstrate significant horizontal gene transfer (HGT) in several gene families, including ice-binding proteins. The authors suggest that this HGT is an effector of environmental adaptation to Arctic environments.

      **Major comments and experiment suggestions:**

      The authors conclude that HGT between Arctic species is a driver of polar adaptation. The authors strongly support the claim that HGT is present more frequently in the polar algae examined here. Whether this is adaptive should be further explored, though. For instance, ice-binding domains were one PFAM group found at significantly higher frequencies in the polar species, but are all of these species associated with ice? What would be the benefit of IBDs in an alga that is found in the open ocean? Similarly, with the other domains (Lns 333-335), it's not clear whether these are truly adaptive features. This is more speculative.

      Reply: We agree that detail was lacking and have considerably expanded our introduction on the character of the Arctic Ocean and have stated the goals and underlying hypothesis. Briefly, all surface water organisms that live in the Arctic encounter ice during the year as the ocean freezes in winter, and surface waters remain around negative 1.7 °C for much of the year. This information has been added to the introduction. We have also expanded the discussion on the multiple effects of different IBPs that would be ecologically beneficial for plankton as well as ice-algae and cite relevant experimental studies and reviews.

      Reviewer #3) HGT was a major conclusion of this study; putting this in a wider perspective would strengthen the conclusion, especially in the context of HGT from prokaryotes. Are there insights on whether IBDs are present in Arctic prokaryotes?

      Reply: This is a good question, and we now point out that there were 91 Arctic bacterial and archaeal IBP sequences in our comparative dataset. In contrast to the Antarctic clades, none were closely related to the Arctic strain IBPs (Fig 4). Line 336.

      Reviewer #3) ____The data obtained from the genomic works supports the conclusions stronger that ones from transcriptomes, where what genes/domains are present would depend largely on the sampling conditions. This should be emphasized.

      Reply: The main rational for using transcriptomes was that more of these are available and enabled us to detect convergences and HGT across a broader taxonomic range than would be possible with genome-only data, where we had access to a total of only 21 microalgal genomes. In general transcriptome studies are aimed at identifying responses under different conditions and rely on comparative expression data, usually 2-fold differences in up or down expression under different growth conditions, see for example Freyria et al. 2022 (Communications Biology). Unlike a transcriptome expression study, our data mining detected any (constitutive or regulated) expression in these unicellular haploid cells, we would have detected genes used under any condition that an algal happened to be growing. IBD was not detected in any of the temperate genomes, and only detected in transcriptomes of Arctic and Arctic-Boreal groups. However, we agree that there may be some limitation of transcriptomes only studies and mention this. Lines 522-528.

      Reviewer #3) ____An experiment to determine whether the species are cold extremophiles (psychrophiles) would be useful here to strongly support the data in Figure 1. The authors state that their species can not survive >6C but this is based on experiments done on older studies. Considering the cultures have been maintained as a continuous culture for decades, confirming that they still have psychrophilic characteristic would be useful. This is a straightforward and low cost experiment that requires simply measuring growth rates at several temperatures to define the optimal and confirm that the cells are not viable above 6C.

      Reply: These are interesting points. The broad “background” statements in the original manuscript would require a separate study and have been deleted. Temperature tolerance experiments are not so simple for cold-adapted algae with slow growth rates; such experiments require specialized incubators to maintain low temperatures. Temperature experiments have been carried out on the cultures in the context of other studies (see, for example, Daugbjerg et al. 2018, J. Phycol.), but this is not within the scope of the present study.

      We now restrict our conclusions to the specific question of convergence among Arctic strains. We apologize for the misunderstanding on the history of the cultures. They have not been in “continuous culture” but are cryopreserved. We now simply indicate that they grow below 6 °C, which is sufficient to assume that they are likely cryophiles. Our experience is that they do not grow well or at all at higher temperatures, and our efforts have been directed at maintaining cultures that are otherwise easily lost. We now make no claims about optimality or limits. Here we simply examined genomes and available transcriptomes that were generated from algae growing at 4-6 °C.

      Reviewer #3) Minor comments:

      Defining the species used here as psychrophiles would put the study in context better. The authors relate their finding to Antarctic species (HGT, ice-binding domains, large genomes) all of which are confirmed psychrophiles.

      Reply: The temperature definition of psychrophiles is surprisingly high (optimal growth below 15 °C), and this definition is now given in the introduction. The point is really that few isolates from cold surface waters have been well studied. We now add: “A handful of polar algal genomes have been extensively studied, with 4 of these from around Antarctica and classified as psychrophiles (not being able to grow above 15 °C (Feller & Gerday, 2003))”. Lines 103-107.

      Reviewer #3) A short rationale on why these species at all would be useful - are they representative of their classes? Do they have psychrophilic characteristics that might make them useful models in the future? Are they widely used now?

      Reply: We appreciate the point as the definition of utility in discovery-based science is an open dialog.

      We agree that the study requires context and have added our rationale for selecting the species for genome sequencing to the introduction. “To address questions on genetic adaptations to this ice-influenced environment, we sequenced 4 phylogenetically divergent microalgae, from 4 algal classes belonging to 3 algal phyla: Cryptophyceae (Cryptophyta), Pavlovophyceae (Haptophyta), Chrysophyceae and Pelagophyceae (both in the Ochrophyta) isolated from the ca. 77 °N, where surface ice flow persists through June (Mei et al., 2002). The four isolates were selected as representatives of different water and ice conditions and phylogeny from available strains collected in April and June 1998 during the North Water Polynya study”.

      Reviewer #3) Starting algal cultures were maintained in a continuous culture since 1998 and under continuous light since at least 2015; have the authors confirmed that these algae retain their physiological features even after this long time? The accumulation of mutations is a possibility here.

      Reply: We apologize for the misunderstanding of the timeline; the history of the cultures was not given in the manuscript and the inferred history is not quite correct. The 2015 date was the year of publication for the MMETSP data. Our continuous light statement is a record of our standard culture conditions. We now elaborate on the material used in the current study. The cultures were deposited in the Bigelow culture collection (now NCMA) in 2002 and cryopreserved once they had been verified and given a culture designation. We obtained fresh cultures in 2005 and these were used for the MMETSP project. We obtained fresh cultures again in 2011, specifically for the JGI genome project. These algae do not grow fast and most of the DNA was sent to JGI in 2012 for most of the isolates. This history is rather long and not relevant, since one would speculate that over the years the algae would tend to lose the ice associated functionality, e.g. they were not frozen in seawater every year for 4 to 6 months or subject to sudden freshwater exposure, when ice melts. We would encourage other researchers to order the cultures and run experiments. We note that many of the 40 or so algae isolated from the same campaign have been used by others for specific studies and at least 8 are in the MMETSP data set. The presence of 18S rRNA and phylogenetic position of the IBP sequences compared to Tara Arctic circle data confirms long-term Arctic presence of each species and the IBP domains in the Arctic without marked changes over the last 20 years.

      Reviewer #3) Ln381 - The culture collection IDs for each sequenced species should be included here.

      Reply: We have added the culture IDs throughout.

      Reviewer #3) Ln. 389 - Algal cells are harvested and used for nucleic acid extraction; the nucleic acids themselves are not harvested.

      Reply: We agree and have corrected the wording.

      Reviewer #3 (Significance (Required)):

      This study is well placed in the current state of research on polar algae and represents a significant and very valuable addition to the current knowledge pool. Algae in general are lagging behind other groups of photosynthetic organisms in the number of sequenced and analyzed genomes, despite algae being one of the main primary producers globally. This is even more strongly felt in polar research, where only 4 species have been sequenced, most of which are restricted to Antarctica. There is a true gap in our knowledge when it comes to Arctic species, and this study fills this gap. As the authors correctly state, we need more knowledge on polar environments and the primary producers that support these important ecosystems in light of current climate change trends.

      Reply: We appreciate the succinct summary of our study and thank the reviewer for insights and suggestions that have improved the manuscript.

      Reviewer field of expertise: Polar algae, stress responses, plant and algal energetics, cell signalling

      Reply: We appreciate the insights and perspective stemming from the reviewer's expertise.

      Relevant key references cited in the reply:

      Daugbjerg N, Norlin A, Lovejoy C. Baffinella frigidus gen. et sp. nov. (Baffinellaceae fam. nov., Cryptophyceae) from Baffin Bay: Morphology, pigment profile, phylogeny, and growth rate response to three abiotic factors. Journal of Phycology. 2018;54(5):665-80

      Feller, G. and Gerday, C. (2003) Psychrophilic enzymes: Hot topics in cold adaptation. Nat Rev Microbiol, 1, 200-208.

      Freyria NJ, Kuo A, Chovatia M, Johnson J, Lipzen A, Barry KW, et al. Salinity tolerance mechanisms of an Arctic Pelagophyte using comparative transcriptomic and gene expression analysis. Communications Biology. 2022;5(1). doi: 10.1038/s42003-022-03461-2

      Mei, Z. P., Legendre, L., Gratton, Y., Tremblay, J. E., Leblanc, B., Mundy, C. J., Klein, B., Gosselin, M., Larouche, P., Papakyriakou, T. N., Lovejoy, C. and Von Quillfeldt, C. H. (2002) Physical control of spring-summer phytoplankton dynamics in the North Water, April-July 1998. Deep-Sea Research Part II: Topical Studies in Oceanography, 49, 4959-4982.

      Niezgodzki, I., Tyszka, J., Knorr, G. and Lohmann, G. (2019) Was the Arctic Ocean ice free during the latest Cretaceous? The role of CO2 and gateway configurations. Global and Planetary Change, 177, 201-212.

      Nikishin, A. M., Petrov, E. I., Cloetingh, S., Freiman, S. I., Malyshev, N. A., Morozov, A. F., Posamentier, H. W., Verzhbitsky, V. E., Zhukov, N. N. and Startseva, K. (2021) Arctic Ocean Mega Project: Paper 3-Mesozoic to Cenozoic geological evolution. Earth-Science Reviews, 217.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript is the result of a large international collaborative effort, including the US Department of Energy Joint Genome Institute. Its focus is comparative genomics of eukaryotic Arctic algae. The primary data described in the ms are four new genome and transcriptome sequences from diverse Arctic algae, represented by a cryptomonad, a haptophyte, a chrysophyte, and a pelagophyte.

      The authors compare these new data to previously published genomic/transcriptomic data from eukaryotic algae with the goal of understanding genome evolution in the Arctic. The results of the paper are a series of large-scale comparative genomic bioinformatics analyses, including the associated statistical analyses. The key findings center on statistically significant features of Arctic genomes, features that stand out as compared to the genomes of algae that are not primarily found in the Arctic. Together, these findings allow the authors to make various hypotheses and suggestions about genetic adaptations to polar environments.

      By far the most significant finding is that the genomes of Arctic algae are enriched in genes encoding proteins with an ice-binding domain, paralleling findings from Antarctic algae. These genes appear to have spread among Arctic algal genomes via horizontal gene transfer, which raises a series of interesting questions. In my opinion, the major conclusions of this paper are supported by the data. Listed below are a few comments that may improve the ms:

      1) In today's post-genomics era, everyone seems to be sequencing nuclear genomes. Often what distinguishes high-impact and low-impact genome papers is the number of genomes presented and the quality of the genome assembly. I may have missed it, but reading the main text, the figures/tables, and the supplementary data I was not able to get a sense of the quality of the four genome assemblies on which the main findings are based. I was eventually able to find this information from PhycoCosm (note: some of the links to this site are not working in the ms). My quick scan of the PhycoCosm summary info for the four genomes indicates that the assemblies are highly fragmented, likely because they are based on short-read Illumina sequencing rather than a combination of short and long reads. I think it is important to briefly discuss (and/or present) the quality of the assemblies in the ms and to highlight the potential limitations/drawbacks of employing highly fragmented assemblies when carrying out large-scale comparative genomics.

      2) Horizontal gene transfer is undeniably a major driving force in evolution, and one that has shaped genomic architecture across the Tree of Life. I believe the data presented here support a role for HGT in the genome evolution of Arctic algae, particularly with respect to genes encoding proteins with an ice-binding domain. However, we can all think of numerous instances when authors of genome papers were too quick to point to HGT. Thus, I would urge more caution and balance when presenting the HGT data, including some discussion about factors that could incorrectly lead researchers to conclude a significant role for HGT, such as contamination, gene duplication, mis-assemblies, etc. I'm not suggesting that you change the main conclusions, but just tone down the language in places (e.g., "we reveal remarkable convergence in the coding content ... ").

      3) The downside of studying protists (as compared to multicellular animals, for instance) is that most are not widely known by the scientific community and even fewer scientists can picture what they actually look like (e.g., Pavlovales sp. CCMP2436). A few more details about the four Arctic algae that make up the focus of this paper might be helpful for the casual reader. My sense is that if at the next departmental meeting I asked my colleagues what a pelagophyte was, most would look at me with a blank stare. Moreover, am I right to assume that all four algae are psychrotolerant rather than psychrophilic? (Supplement Fig. 1 makes me think otherwise.) It might be good to point out the difference in the text.

      4) I don't think Supp. Table 1 (the Pan-algal dataset) got uploaded correctly during the manuscript submission stage. The first link I click on gives me Supp. Table 2.

      Significance

      By far the most significant finding from this paper is that the genomes of Arctic algae are enriched in genes encoding proteins with an ice-binding domain, paralleling findings from Antarctic algae. These genes appear to have spread among Arctic algal genomes via horizontal gene transfer, which raises a series of interesting questions. This is not the first paper to present these types of ideas, but it is arguably the broadest analysis yet, at least with respect to eukaryotic algae. This work will be of great interest to polar scientists, phycologists, protistologists, and the genomics community. I am a genome scientist studying protists, including algae.

    1. Consolidated peer review report (23 September 2022)

      GENERAL ASSESSMENT

      In this manuscript, Tiemann, J., et al. take on a large-scale exploration of how mutations associated with disease impact calculated stability and conservation scores across the entire membrane proteome. The aim was to gain mechanistic insight into the causes of pathogenicity of missense mutations of human membrane proteins and verify whether, as is the case for soluble proteins, mutational destabilisation of membrane proteins can explain disease. To do so, the authors use a framework they previously developed, using measures of stability change (ΔΔG) and sequence conservation (ΔΔE, the GEMME score) to predict fitness effects of mutations with large-scale mutational data (Høie et al., 2022).

      By conducting a proteome-wide analysis of missense variants in human membrane proteins, the authors find decisively that pathogenic mutations are heavily enriched within the transmembrane region of membrane proteins. In addition, they report that they can sometimes use their calculated properties to classify residues based on their potential roles in stability or function, and that stability appears to be a major determinant of conservation and likely pathogenicity for GPCRs. 

      The authors thus make meaningful strides towards explaining the clinical impact of variants within membrane proteins, a currently under-characterized yet important category of proteins. The analyses have been conducted in a rigorous way, and the data and protocols are openly available. This work will be of interest to researchers working on membrane proteins as well as those applying computational methods to biophysical systems.

      On the other hand, the choices made by the authors in terms of presentation make the identification of the main conclusions of the paper challenging. In part, this is likely due to fundamental technical challenges associated with calculating biophysical properties for membrane proteins. In addition, although the analysis was performed at the scale of the proteome, due to the decision to only consider X-ray crystallography structures, the number of proteins analyzed is rather small (15). It thus remains unclear how the findings are transferable to other membrane proteins and how robust the comparison between the different functional classes is. 

      RECOMMENDATIONS

      Revisions essential for endorsement:

      1.     The authors are careful with what they claim, to the point where it becomes difficult to interpret the major messages. It appears there are many contributing factors to noise within these assays, resulting in complex figures that make it hard to interpret the data. The goal of presenting the data without overinterpreting it is noble, and the difficulty of digesting and presenting the comparisons in this work should be emphasized, but the complexity of the results made it difficult for reviewers to interpret without more robust processing. Further, we were not always certain how each result fits into the overall argument, which from our reading is whether the performance of predictors for classifying pathogenic mutations based on conservation and stability calculations provides insight into the mechanisms underlying membrane protein disease. Overall, we feel that clarifying the unifying argument of the manuscript and simplifying the figures would greatly improve the comprehensibility of this work. This could be achieved with one of the following approaches, although we leave the final choice to the authors: 

      • The manuscript could attempt to answer the following question: “Can existing methods be used to computationally determine whether pathogenic mutations are due to stability?” It would then explore why this question can or cannot be answered with the current analysis pipeline and existing tools. The answer is likely that the current tools are insufficient and the manuscript would thus point towards a future area of growth to be able to address the question.

      • The manuscript could focus on presenting the dataset. The results would be presented as preliminary examples of the kind of information that can be extracted and the type of analysis that may be done. In this case, claims such as “stability causes x% of pathogenic mutations” should be avoided, and the most important aspect of the manuscript would be that it accompanies a well-curated and openly available dataset, and provides links to it. In that context, the authors should mention whether there are existing curated and/or established databases of (human) membrane proteins, and how the dataset of putative membrane proteins compares with these resources.

      • The manuscript could focus on presenting the “computational approach”, which consists of mapping ddG-ddE, combined with an analysis of the localization of pathogenic (and non-pathogenic) mutations and the types of mutation (conservative, non-conservative etc.). Revisions would be needed to present results as examples of the kind of information this approach may provide.

      • The manuscript could possibly make a clear and compelling case for the idea that mutations of membrane proteins cause disease either because they destabilize the protein or because they occur at sites that are directly involved in function. This would require major revisions of the results and a systematic, clear and robust combined analysis of quadrant-location, protein-region-location, and amino-acid-type substitution.

      Related to the above, it would be useful to clarify in the introduction what is expected from the study upfront: did the authors expect that the picture that would emerge would indeed be the same for membrane proteins as for soluble proteins? Are there different degradation pathways for these two classes of proteins and is a loss of stability expected to have different consequences or not? In the end, the role of destabilization is rationalized in terms of buriedness and amount of physico-chemical change upon mutations. Hence, are the results of the study saying something about the mechanisms of disease variants or simply about the physico-chemical composition and topology of membrane proteins? To answer this point, we suggest contextualizing the study more by expanding on the published literature. This would also clarify that the membrane protein folding field is very far behind the soluble protein folding field and, as a result, that we cannot expect the methods that work for soluble proteins to work for membrane proteins, or even assume that methods will mature to the point that they do yield predictive results for membrane proteins.

      2.     In general, uncertainties need to be better quantified and discussed and statistical tests included. For example:

      • The correlation between Rosetta estimates of ΔΔG and experimental ΔΔG is low (0.47), which means that less than 25% of the variance in experimental ΔΔG is accounted for by Rosetta. This uncertainty needs to be considered more carefully: it will likely affect the AUC (i.e. is AUC(ΔΔG) < AUC(ΔΔE) because not all mutations are pathogenic due to stability, or is this a mere consequence of the uncertainty of ΔΔG estimates?) and the number of points in the different quadrants (how many of the points in a quadrant are false-positives or false-negatives, etc., and can we guess which they are by using other information such as the protein region, aa-type change, ΔΔE value, etc?).

      • A variant may fall in the “wrong” ΔΔG-ΔΔE quadrant because of the mentioned (large) ΔΔG error, but also because of ΔΔE errors. This needs to be considered. Some estimate of the ΔΔE error needs to be made (e.g. by bootstrapping the alignment). Even in an ideal case in which ΔΔE is dependent only on ΔΔG, i.e. that both ΔΔG_Rosetta and ΔΔE are estimates of a “true” ΔΔG, not all points would fall in a y = x line in the ddE-ddG plane. How many points would there be in each of the quadrants because of mere estimation errors?

      • As the authors state, quadrant IV has few points. But it also seems that there are more blue points than red points in regions further away from the axes. Could the authors comment on this observation? Is there a tendency for the ΔΔG measure to “over predict” pathogenicity?

      • Within the manuscript the authors widely compare different groupings to drive their narrative. For example, on line 115 the authors discuss the enrichment of pathogenic mutations within the transmembrane domains, which then leads to many subsequent explorations of why TMs may be involved in disease. For this comparison, there is a large and visible significant difference, thus there may not be a need for a statistical test for significance. However, there are many other comparisons that are harder to interpret due to multiple different groupings, complex data representation, and at its core a fundamentally complex study. In these cases, we would like to see more robust statistical tests. For example, on line 184, after breaking up data in 2B based on ΔΔG and ΔΔE cutoffs, the authors write “...only a few variants (14.2%) falling in the quadrant of low ΔΔE and ΔΔG…” – it is unclear what a few means or if this is a significant reduction in variants compared to other quadrants. 
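
      The points above about estimation error can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the manuscript: it assumes a hypothetical standardized "true" stability effect, two noisy estimates of it (standing in for Rosetta ΔΔG and GEMME ΔΔE), Gaussian noise, a classification threshold at zero, and an arbitrary correlation of 0.70 for the second estimate. It shows that r = 0.47 corresponds to r² of about 0.22, and it counts how often the two estimates land on opposite sides of the threshold purely because of estimation error.

```python
import math
import random


def explained_variance(r):
    """Fraction of variance accounted for by a linear relationship with correlation r."""
    return r * r


def noisy_estimate(true_value, r, rng):
    """Return an estimate whose correlation with a standard-normal true value is r."""
    return r * true_value + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)


def discordant_fraction(r_ddg, r_dde, n=20000, seed=0):
    """Fraction of simulated variants whose two estimates fall on opposite sides of zero.

    Both estimates are noisy readings of the same hypothetical true effect, so
    every discordant point here is caused by estimation error alone.
    """
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n):
        truth = rng.gauss(0.0, 1.0)
        ddg = noisy_estimate(truth, r_ddg, rng)  # stand-in for Rosetta ddG
        dde = noisy_estimate(truth, r_dde, rng)  # stand-in for GEMME ddE
        if (ddg > 0) != (dde > 0):
            discordant += 1
    return discordant / n


if __name__ == "__main__":
    print(f"r = 0.47 -> r^2 = {explained_variance(0.47):.3f}")
    # 0.70 is an assumed value for the ddE estimate, chosen only for illustration.
    print(f"discordant fraction: {discordant_fraction(0.47, 0.70):.3f}")
```

      Under these assumptions a substantial share of points ends up in off-diagonal quadrants even though a single underlying quantity drives both scores, which is the kind of baseline that the observed quadrant counts could be compared against.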

      3.     Regarding the performance of Rosetta to measure ΔΔGs:

      • The authors state that pathogenic mutations causing loss of stability are more often located in the interior of the protein (buried), implying bigger physico-chemical property changes. Isn’t that expected from Rosetta design? Indeed, while the analysis of the distribution of variants among protein regions (buried, etc.) and mutation-type (hydrophobic-to-hydrophobic, etc.) does add additional information to support the hypothesis that in some cases stability loss causes disease, it is important to recognize that this is not completely independent evidence because any ΔΔG predictor should somehow capture the observed patterns. 

      • ROC curves are used to determine how well ΔΔG guides pathogenicity, as a follow-up to the observations that pathogenic mutations are enriched in TM regions of membrane proteins. The intuition here is that deleterious mutations within TMs are likely disrupting folding and therefore a ΔΔG-based predictor should do relatively well. However, the authors find that Rosetta-based ΔΔG calculations do not do well in all membrane proteins with benign-like and pathogenic mutations (Figure 2A) and solved crystal structures. In contrast, ΔΔG works quite well when trained solely on GPCRs (Figure 3A). The interpretation of this could be that stability is not a major driver of membrane protein disease – however, in many cases it is, such as Rhodopsin and CFTR. An alternative explanation is that Rosetta doesn’t predict stability well for mammalian membrane proteins, and in fact the authors discuss this at length in the limitations of the study section, explaining this is because Rosetta is trained on many bacterial beta barrel membrane proteins. We appreciated this section but would have preferred more of this discussion earlier on as it could aid in understanding why the ΔΔG predictors don’t perform accurately, as presented in Figure 2A.

      • Could the authors clarify what they mean by “where the Rosetta energy function suggested a potential incompatibility between the experimental structure and the Rosetta energy function”? 
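
      For readers less familiar with the ROC comparison discussed above, the sketch below shows how an AUC can be computed directly as the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one (the Mann-Whitney interpretation of the AUC). The scores and group assignments are made-up placeholders, not data from the manuscript, and the function name `auc_from_scores` is hypothetical.

```python
def auc_from_scores(pathogenic_scores, benign_scores):
    """AUC as P(pathogenic score > benign score), counting ties as one half."""
    if not pathogenic_scores or not benign_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(benign_scores))


if __name__ == "__main__":
    # Placeholder ddG-like scores; higher is assumed to mean more destabilising.
    group_a = [3.1, 2.4, 0.8, 4.0]  # hypothetical pathogenic variants
    group_b = [0.2, 1.0, 0.5, 2.4]  # hypothetical benign variants
    print(auc_from_scores(group_a, group_b))
```

      An AUC of 0.5 means the score does not separate the two groups at all, and 1.0 means perfect separation. Because the value depends only on the ranking of scores, noise in the ΔΔG estimates can lower it even when stability loss is the true mechanism, which is why the two explanations raised above are hard to tell apart from the AUC alone.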

      4.     Regarding ΔΔE, in the present work, there is an implicit assumption that the constraints that operate during evolution of the aligned sequences, across species, as captured by GEMME, are the same constraints that affect the variants within a population, and therefore determine whether a variant will be pathological/non-pathological. This is a major assumption that needs to be spelled out and discussed. Mentioning this will help interpret “misplaced” points of the ΔΔE-ΔΔG map.

      Additional suggestions for the authors to consider:

      1.     The comparison of pathogenic/non-pathogenic mutations should consistently be made across the various sections of the paper. In too many cases in the present version of the paper this comparison is not emphasized. In some cases, the distribution of variants is described, without clearly differentiating pathogenic from non-pathogenic. In other cases, only pathogenic variants are considered, without comparing with the non-pathogenic cases.

      2.     Moving the section on the two specific proteins to the end of results would likely improve the flow of the paper. The A/B x ΔΔE-ΔΔG plane analysis would be presented first, then the A/B x ΔΔE-ΔΔG x “protein regions” analysis, and finally the A/B x ΔΔE-ΔΔG x regions x “aa-type” analysis before ending with examples.

      3.     The choice to restrict the analysis to X-ray crystallography structures from the PDB is not obviously well suited. Indeed, the coverage of membrane proteins by the PDB is rather low, and the authors found that less than 30% of all annotated human membrane proteins have at least some part resolved. One of the potential advantages of the AlphaFold database is to improve this coverage, and the analyses presented by the authors would thus benefit from considering predicted models displaying high confidence values.

      4.     In Figure 2, the authors define two classes of variants in their dataset, group A (pathogenic variants) and group B (benign or non-pathogenic with an allele frequency > 9.9 × 10^-5). They then test their models’ ability to distinguish between groups A and B by constructing ROC curves for Rosetta ΔΔG and GEMME ΔΔE. To visualize variant effects and further classify variants, they plot individual variants on a ΔΔG vs. ΔΔE plot and use this plot to classify variants based on their combined ΔΔG and ΔΔE values. The allele frequency cutoff is so important for generating group B that all downstream analysis depends on it. But because these residues come from a much more limited set of proteins, we think it would be useful to include a comparison showing that the gnomAD allele frequency > 9.9 × 10^-5 cutoff remains informative for differentiating between benign and pathogenic residues.

      5.     In Figure 3, the authors apply their analysis to variants across all GPCRs, as well as just GPCR transmembrane regions. The AUC curves in panel A are much more accurate when applied to just this protein family, as also seen in panel B where variants fall into very clear subpopulations within each quadrant. The illustration and category definitions on the left of panel C are a helpful guide for the discussion of different variant types and their relevance to stability of the protein versus function in a unique way; however, the plot on the right of panels C and D is confusing and not immediately intuitive, making it difficult to consider comparisons that are discussed within the text. Indeed, the authors state that “Pathogenic variants in GPCRs, especially in the transmembrane region, lose function mostly by loss of stability”. Comparing these two panels, it is concluded that the pathogenic variants that do not lose stability are more often found in the TM regions of GPCRs compared to all datasets. This is somewhat confusing and the numbers supporting this affirmation in Fig 3C seem quite low.

      6.     The authors do not extensively discuss their results in the context of the membrane protein field nor the specific membrane proteins they highlight such as Rhodopsin and GTR1 (Figure 4). For Rhodopsin, at least, there has been extensive work done on its folding by Jonathan Schlebach’s lab and others, including a mutational scan. It could be useful to at least contextualize and contrast results here with previously published work.

      7.     In Figure 5, the authors consider whether the identities of the starting and mutant residues correlate with their overall quadrants. Panel A is extremely difficult to interpret. We are also unsure how robust any differences are likely to be, given the uneven sampling and the small number of samples in some of the boxes. Narrowing the comparisons (changed vs. unchanged property, A vs B) would likely improve comprehension and may be more meaningful. Panel B is, on the other hand, a wonderful example of how to clearly display complex, multidimensional data in a comprehensible way. The well-demonstrated association of hydrophobicity and transmembrane stability is beautifully demonstrated directly from the data, and the potential discordance with evolutionary conservation as well. We find this correlation even more striking given that the hydrophobicity scale used here was explicitly determined in the context of transmembrane regions, but the variants are drawn from all regions of the targets. We were curious to know what percentage of these are drawn from the transmembrane vs. soluble regions of the targets.

      REVIEWING TEAM

      Reviewed by:

      Willow Coyote-Maestas Paper Discussion Group, UCSF, USA: membrane proteins; high throughput experimental variant screening; developing assays for measuring how mutations break membrane proteins in order to explore how mutations alter folding, trafficking, and function of membrane proteins (see Appendix for group members).

      Julian Echave, Professor, Universidad Nacional de San Martín, Argentina: theoretical and computational study of biophysical aspects of protein evolution.

      Elodie Laine, Associate Professor, Sorbonne Université, France: development of methods for predicting the effects of missense mutations using evolutionary information extracted from protein sequences and/or structural information coming from molecular dynamics simulations.

      Curated by:

      Lucie Delemotte, KTH Royal Institute of Technology, Sweden

      APPENDIX

      Willow Coyote-Maestas Paper Discussion Group:

      Feedback was generated in a meeting of the journal club involving:

      Willow Coyote-Maestas

      Christian Macdonald

      Donovan Trinidad

      Patrick Rockefeller Grimes

      Matthew Howard

      Arthur Melo

      (This consolidated report is a result of peer review conducted by Biophysics Colab on version 1 of this preprint. Minor corrections and presentational issues have been omitted for brevity.)

    1. Author Response

      Reviewer #1 (Public Review):

      The authors ask an interesting question as to whether working memory contains more than one conjunctive representation of multiple task features required for a future response with one of these representations being more likely to become relevant at the time of the response. With RSA the authors use a multivariate approach that seems to become the standard in modern EEG research.

      We appreciate the reviewer’s helpful comments on the manuscript and their encouraging comments regarding its potential impact.

      I have three major concerns that are currently limiting the meaningfulness of the manuscript: For one, the paradigm uses stimuli with properties that could potentially influence involuntary attention and interfere in a Stroop-like manner with the required responses (i.e., 2 out of 3 cues involve the terms "horizontal" or "vertical" while the stimuli contain horizontal and vertical bars). It is not clear to me whether these potential interactions might bring about what is identified as conjunctive representations or whether they cause these representations to be quite weak.

      We agree it is important to rule out any effects of involuntary attention that might have been elicited by our stimulus choices. To address the Reviewer’s concern, we conducted control analyses to test if there was any influence of Stroop-like interference on our measures of behavior or the conjunctive representation. To summarize these analyses (detailed in our responses below and in the supplemental materials), we found no evidence of the effect of compatibility on behavior or on the decoding of conjunctions during either the maintenance or test periods. Furthermore, we found that the decoding of the bar orientation was at chance level during the interval when we observe evidence of the conjunctive representations. Thus, we conclude that the compatibility of the stimuli and the rule did not contribute to the decoding of conjunctive representations or to behavior.

      Second, the relatively weak conjunctive representations are making it difficult to interpret null effects such as the absence of certain correlations.

      The reviewer is correct that we cannot draw strong conclusions from null findings. We have revised the main text accordingly. In certain cases, we have also included additional analyses. These revisions are described in detail in response to the reviewer’s comments below.

      Third, if the conjunctive representations truly are reflections of working memory activity, then it would help to include a control condition where memory load is reduced so as to demonstrate that representational strength varies as a function of load. Depending on whether these concerns or some of them can be addressed or ruled out this manuscript has the potential of becoming influential in the field.

      This is a clever suggestion for further experimentation. We agree that observing an adverse effect of memory load is a robust way to assess the contribution of the working memory system, and that it would be valuable in future studies. However, given that decoding is noisy during the maintenance period (particularly for the low-priority conjunctive representation) even with a relatively low set-size, we expect that in order to further manipulate load, we would need to alter the research design substantially. Thus, as the main goal of the current study is to study prioritization and post-encoding selection of action-related information, we focused on the minimum set-size required for this question (i.e., load 2). However, we now note this load manipulation as a direction for future research in the discussion (pg. 18).

      Reviewer #2 (Public Review):

      Kikumoto and colleagues investigate the way visual-motor representations are stored in working memory and selected for action based on a retro-cue. They make use of a combination of decoding and RSA to assess at which stages of processing sensory, motor, and conjunctive information (consisting of sensory and motor representations linked via an S-R mapping) are represented in working memory and how these mental representations are related to behavioral performance.

      Strengths

      This is an elaborate and carefully designed experiment. The authors are able to shed further light on the type of mental representations in working memory that serve as the basis for the selection of relevant information in support of goal-directed actions. This is highly relevant for a better understanding of the role of selective attention and prospective motor representations in working memory. The methods used could provide a good basis for further research in this regard.

      We appreciate these helpful comments and the Reviewer’s positive comments on the impact of the work.

      Weaknesses

      There are important points requiring further clarification, especially regarding the statistical approach and interpretation of results.

      • Why is there a conjunction RSA model vector (b4) required, when all information for a response can be achieved by combining the individual stimulus, response, and rule vectors? In Figure 3 it becomes obvious that the conjunction RSA scores do not simply reflect the overlap of the other three vectors. I think it would help the interpretation of results to clearly state why this is not the case.

      Thank you for the suggestion. We have now added the theoretical background that motivates us to include the RSA model of conjunctive representation (pg. 4 and 5). In particular, several theories of cognitive control have proposed that over the course of action planning, the system assembles an event (task) file which binds all task features at all levels – including the rule (i.e., context), stimulus, and response – into an integrated, conjunctive representation that is essential for an action to be executed (Hommel 2019; Frings et al. 2020). Similarly, neural evidence from non-human primates suggests that cognitive tasks that require context-dependency (e.g., flexible remapping of inputs to different outputs based on the context) recruit nonlinear conjunctive representations (Rigotti et al. 2013; Parthasarathy et al. 2019; Bernardi et al. 2020; Panichello and Buschman, 2021). Supporting these views, we previously observed that conjunctive representations emerge in the human brain during action selection, which uniquely explained behavior such as the costs in transition of actions (Kikumoto & Mayr, 2020; see also Rangel & Hazeltine & Wessel, 2022) or the successful cancelation of actions (Kikumoto & Mayr, 2022). In the current study, by using the same set of RSA models, we attempted to extend the role of conjunctive representations to the planning and prioritization of future actions. As in the previous studies (and as noted by the reviewer), the conjunction model makes a unique prediction of the similarity (or dissimilarity) pattern of the decoder outputs: a specific instance of action that is distinct from other actions. This contrasts with other RSA models of low-level features that predict similar patterns of activity for instances that share the same feature (e.g., S-R mappings 1 to 4 share the diagonal rule context).
Here, we generally replicate the previous studies showing the unique trajectories of conjunctive representations (Figure 3) and their unique contribution on behavior (Figure 5).

      • One of the key findings of this study is the reliable representation of the conjunction information during the preparation phase while there is no comparable effect evident for response representations. This might suggest that two potentially independent conjunctive representations can be activated in working memory and thereby function as the basis for later response selection during the test phase. However, the assumption of the independence of the high and low priority conjunction representations relies only on the observation that there was no statistically reliable correlation between the high and low priority conjunctions in the preparation and test phases. This assumption is not valid because non-significant correlations do not allow any conclusion about the independence of the two processes. A comparable problem appeared regarding the non-significant difference between high and low-priority representations. These results show that it was not possible to prove a difference between these representations prior to the test phase based on the current approach, but they do not unequivocally "suggest that neither action plan was selectively prioritized".

      We appreciate this important point. We have taken care in the revision to state that we find evidence of an interference effect for the high-priority action and do not find evidence for such an effect from the low-priority action. Thus, we do not intend to conclude that no such effect could exist. Further, although it is not our intention to draw a strong conclusion from the null effect (i.e., no correlations), we performed an exploratory analysis where we tested the correlation in trials where we observed strong evidence of both conjunctions. Specifically, we split trials into halves within each time point and individual subject and performed the multi-level model analysis using trials where both high and low priority conjunctions were above their medians. Thus, we selected trials in such a way that they are independent of the effect we are testing. The figure below shows the coefficient associated with the low-priority conjunction predicting the high-priority conjunction (uncorrected). Even when we focus on trials where both conjunctions are detected (i.e., a high signal-to-noise ratio), we observed no tradeoff. Again, we cannot draw strong conclusions based on the null result of this exploratory analysis. Yet, we can rule out some causes of the absence of correlation between high and low priority conjunctions, such as a poor signal-to-noise ratio of the low priority conjunctions. We have further clarified this point in the results (pg. 14).

      Fig. 1. Trial-to-trial variability between high and low priority conjunctions, using above median trials. The coefficients of the multilevel regression model predicting the variability in trial-to-trial high-priority conjunction by low-priority conjunction.
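
      For illustration only, the trial selection described above (a median split within each subject and time point, keeping trials where both conjunction scores exceed their medians) could be sketched as follows. This is a minimal sketch and not the analysis code used in the study; the function name and the dictionary keys `subject`, `time`, `high` and `low` are hypothetical, and the subsequent multilevel regression is not shown.

```python
from collections import defaultdict
from statistics import median


def select_above_median_trials(trials):
    """Keep trials whose high- and low-priority conjunction scores both
    exceed the median of their own (subject, time point) cell.

    `trials` is a list of dicts with the hypothetical keys
    'subject', 'time', 'high' and 'low'.
    """
    # Group trials by subject and time point, so each median is
    # computed within a cell rather than across the whole data set.
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial["subject"], trial["time"])].append(trial)

    selected = []
    for cell in cells.values():
        high_median = median(t["high"] for t in cell)
        low_median = median(t["low"] for t in cell)
        # Retain only trials that are above the median on both scores.
        selected.extend(
            t for t in cell
            if t["high"] > high_median and t["low"] > low_median
        )
    return selected
```

      Because the selection depends only on the marginal strength of each score within a cell, it does not by itself impose a positive or negative relationship between the two scores in the retained trials.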

      • The experimental design used does not allow for a clear statement about whether pure motor representations in working memory only emerge with the definition of the response to be executed (test phase). It is not evident from Figure 3 that the increase in the RSA scores strictly follows the onset of the Go stimulus. It is also conceivable that the emergence of a pure motor representation requires a longer processing time. This could only be investigated through temporally varying preparation phases.

      We agree with the reviewer. Although we detected no evidence of response representations of both high and low priority action plans during the preparation phase, t(1,23) = -.514, beta = .002, 95% CI [-.010 .006] for high priority; t(1,23) = -1.57, beta = -.008, 95% CI [-.017 .002] for low priority, this may be limited by the relatively short duration of the delay period (750 ms) in this study. However, in our previous studies using a similar paradigm without a delay period (Kikumoto & Mayr, 2020; Kikumoto & Mayr, 2022), response representations were detected less than 300ms after the response was specified, which corresponds to the onset of delay period in this study. Further, participants in the current study were encouraged to prepare responses as early as possible, using adaptive response deadlines and performance-based incentives. Thus, we know of no reason why responses would take longer to prepare in the present study. But we agree that we can’t rule this out. We have added the caveat noted above, as well as this additional context in the discussion (pg. 16-17).

      • Inconsistency of statistical approaches: In the methods section, the authors state that they used a cluster-forming threshold and a cluster-significance threshold of p < 0.05. In the results section (Figure 4) a cluster p-value of 0.01 is introduced. Although this concerns different analyses, varying threshold values appear as if they were chosen in favor of significant results. The authors should either proceed consistently here or give very good reasons for varying thresholds.

      We thank the reviewer for noting this oversight. All reported significant clusters with cluster P-value were identified using a cluster-forming threshold, p < .05. We fixed the description accordingly.

      • Interpretation of results: The significant time window for the high vs. low priority by test-type interaction appeared quite late for the conjunction representation. First, it does not seem reasonable that such an effect appears in a time window overlapping with the motor responses. But more importantly, why should it appear after the respective interaction for the response representation? When keeping in mind that these results are based on a combination of time-frequency analysis, decoding, and RSA (quite many processing steps), I find it hard to really see a consistent pattern in these results that allows for a conclusion about how higher-level conjunctive and motor representations are selected in working memory.

      Thank you for raising this important point. First, we fixed the reported methodological inconsistencies (such as the cluster P-value and cluster-forming threshold). Further, we fully agree that the difference in the time course for the response and conjunctive representations in the low priority, tested condition is unexpected and would complicate the perspective that the conjunctive representation contributes to efficient response selection. However, additional analysis indicates that this apparent pattern in the stimulus locked result is misleading and there is a more parsimonious explanation. First, we wish to caution that the data are relatively noisy and likely are influenced by different frequency bands for different features. Thus, fine-grained temporal differences should be interpreted with caution in the absence of positive statistical evidence of an interaction over time. Indeed, though Figure 4 in the original submission shows a quantitative difference in timing of the interaction effect (priority by test type) across conjunctive representation and response representation, the direct test of this four way interaction [priority x test type x representation type (conjunction vs. response) x time interval (1500 ms to 1850 ms vs. 1850 to 2100 ms)] is not significant (t(1,23) = 1.65, beta = .058, 95% CI [-.012 .015]). The same analysis using response-aligned data is also not significant (t(1,23) = -1.24, beta = -.046, 95% CI [-.128 .028]). These observations were not dependent on the choice of time interval, as other time intervals were also not significant. Therefore, we do not have strong evidence that this is a true timing difference between these conditions and believe this is likely driven by noise.

      Further, we believe the apparent late emergence of difference in two conjunctions when the low priority action is tested is more likely due to a slow decline in the strength of the untested high priority conjunction rather than a late emergence of the low priority conjunction. This pattern is clearer when the traces are aligned to the response. The tested low priority conjunction emerges early and is sustained when it is the tested action and declines when it is untested (-226 ms to 86 ms relative to the response onset, cluster-forming threshold, p < .05). These changes eventually resulted in a significant difference in strength between the tested versus untested low priority conjunctions just prior to the commission of the response (Figure 4 - figure supplement 1, the panel on right column of the middle row, the black bars at the top of panel). Importantly, the high priority conjunction also remains active in its untested condition and declines later than the untested low priority conjunction does. Indeed, the untested high priority conjunction does not decline significantly relative to trials when it is tested until after the response is emitted (Figure 4 - figure supplement 1, the panel on right column of the middle row, the red bars at the top of panel). This results in a late emerging interaction effect of the priority and test type, but this is not due to a late emerging low priority conjunctive representation.

      In summary, we do not have statistical evidence of a time by effect interaction that allows us to draw strong inferences about timing. Nonetheless, even the patterns we observe are inconsistent with a late emerging low priority conjunctive representation. And if anything, they support a late decline in the untested high priority conjunctive representation. This pattern of the high priority conjunction being sustained until late, even when it is untested, is also notable in light of our observation that the strength of the high priority conjunctive representation interferes with behavior when the low priority item is tested, but not vice versa. We now address this point about the timing directly in the results (pg. 15-16) and the discussion (pg. 21), and we include the response locked results in the main text along with the stimulus locked result including exploratory analyses reported here.

      Reviewer #3 (Public Review):

      This study aims to address the important question of whether working memory can hold multiple conjunctive task representations. The authors combined a retro-cue working memory paradigm with their previous task design that cleverly constructed multiple conjunctive tasks with the same set of stimuli, rules, and responses. They used advanced EEG analytical skills to provide the temporal dynamics of concurrent working memory representation of multiple task representations and task features (e.g., stimulus and responses) and how their representation strength changes as a function of priority and task relevance. The results generally support the authors' conclusion that multiple task representations can be simultaneously manipulated in working memory.

      We appreciate these helpful comments, and were pleased that the reviewer shares our view that these results may be broadly impactful.

    1. The article presents the background to the development of methods for document identification and information retrieval up to 1939, that is, up to the point at which Vannevar Bush wrote the article “As We May Think”, which was later published in 1945.

      The article argues that Bush's idea was neither as original nor as revolutionary as it is usually portrayed. The author also presents the positions of other researchers and inventors who raised objections to the Memex project.

      The author focuses primarily on Emanuel Goldberg and his invention of a microfilm search engine. He also sets out the reasons why Goldberg's invention was overlooked and forgotten.

    1. However, we can also stimulate growth by capitalizing on existing strengths.

      I think that this is very important to understand. If you know where or with what your students succeed then you can use those strengths to help them in other areas where they may not feel as confident.

    1. His team already has more findings that support the cerebellum’s contribution to addictive behavior, and in particular to the solidifying of a neurally stimulating behavior such as drug use. Such memory-making may render some individuals more susceptible to addictions. “One of the biggest problems is that those who are addicts [or former addicts] can be weaned from their addiction but if there is a new stress, the person is very susceptible to relapse,” Khodakhah says. “We think the reason is that there is a signature of the memory within the cerebellum. . . . If we understand that better we might be able to provide pharmacological or other therapeutic interventions to help these individuals.”

      orienting statements - I feel like this gives the article a clear end and conclusion which ultimately reveals the direction of the essay

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We are very grateful for the thorough reading and deep understanding of the work that these 4 reviewers have provided. Although it is evident that they still request an improved profiling of some aspects, it is very encouraging that all four think the work is very interesting, original, insightful and adds a new layer of knowledge to the regulation of DNA damage sensing and repair. It is also very rewarding that the four reviewers estimate that this work will sew connections between different fields and interest a broad readership. This is why we have designed here a very deep revision, tailored to satisfy all the raised concerns except one, and this just for technical reasons.

      Please find below the original reviewers’ comments and our answers to them preceded by the symbol “>”:

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Ovejero et al. report an increase in lipid droplet (LD) abundance after long (from 120' on) exposure of budding yeast cells to DNA damaging agents zeocin and camptothecin (CPT). Next, they analyze DNA damage signaling in yeast mutants that impair triacylglycerol (TAGs) or sterol (STEs) esterification. They observe a slight anticipation in Rad53/CHK2 phosphorylation (indicative of DDR signaling) in yeast stem mutants, as well as in yeast cells or human cells lines pre-treated with oleate upon zeocin treatment. Yeast stem mutants are sensitive to zeocin and camptothecin, but only confer sensitivity to hydroxyurea upon combination with tagD mutations. Authors relate these phenotypes to a somewhat decreased DSB resection in yeh2D mutants (expected to have reduced steryl esters pools) and RPA-foci in steD yeast cells. Next, a reduction in single strand annealing recombination repair events upon zeocin treatment is reported using a genetic reporter in steD mutants and oleate-treated cells. From these data they conclude that inability to process sterols in response to DSBs leads to an exacerbated DDR and prevents DNA repair. Next, it is shown that Flag-tagged Tel1 distinctly interacts with mono-phosphate phosphoinositides, including PI(4)P. An interaction in vivo is also inferred through Proximity Ligation Assays (PLA) using anti-PI(4)P and anti-ATM antibodies in human cell lines, which was moderately downregulated upon treatment with MMS or zeocin. Over-expression of the Osh4/OSBP1 transporter, which consumes PI(4)P, increased the number of Tel1 (nuclear) foci upon zeocin treatment. Conversely Sac1 ablation, in which accumulation of PI(4)P is expected, abrogated nuclear Tel1 foci formation and reduced telomere length (a phenotype related to lack of Tel1 function). From these results authors conclude that Tel1 availability in the nucleus is influenced by PI(4)P availability.
Lastly, treatment with an OSBP1 inhibitor led to a cell line- and damaging agent-variable reduction of ATM phosphorylation and a mostly non-significant reduction of DNA resection, measured by native BrdU detection, in response to CPT treatment. Overall, authors conclude that i) binding of Tel1/ATM to PI(4)P modulates its functional availability in the nucleus, and that ii) DNA damage elicits the esterification and storage of sterols toward LDs, which contributes to titrate Tel1/ATM away from the nucleus dampening the DDR and affecting long-range resection.

      Major comments: While the conclusion that Tel1/ATM binds PI(4)P and this interaction modulates Tel1/ATM functional availability at the nucleus is convincing, the conclusion that DSBs elicit a change in the metabolism of this lipid to "control" Tel1/ATM function is not demonstrated. The notion that sterol processing occurs in response to DSBs is not sufficiently supported by the data presented, as the increase in LD numbers is observed much after activation of the DDR (Rad53 phosphorylation) in zeocin-treated yeast cells.

      We are afraid that we have not been clear enough in explaining the kinetics giving rise to our model. As indicated by the reviewer, our work shows, through kinetic studies, that the storage of sterols within LD occurs at later stages than the activation of the DDR by Tel1 and Rad53 phosphorylation. Tel1 foci decline is necessary for subsequent engagement of downstream DNA long-range resection. Since we propose that sterol storage within LD is a means to attenuate Tel1 engagement at DSBs, it is thus logical (and thus compatible with the data we show) that LD number increase occurs simultaneously with Tel1 foci decrease, at late stages of the reaction. We will include this explanation and graph in the revised version of the work.

      In addition, evidence is not provided on the mechanisms by which PI(4)P metabolism would be controlled, which would be expected to be DDR-independent as they are placed upstream of this signaling pathway in the author's model.

      The key mechanism through which, in the end, PI(4)P metabolism will be controlled, is the esterification of sterols within LD. Given that, as clarified above, LD formation in response to DSBs occurs “late” (i.e., after 120 min), it is not excluded that the DDR itself can instruct, through phosphorylation of some effector(s), LD formation. In other words, by ordering LD formation, the DDR would be launching a self-limiting mechanism. In support, we now know, although we do not show in this work, that eliminating key DDR proteins prevents the formation of LD in response to DNA damage. Because of this, we have undertaken an educated-guess approach and chosen critical or rate-limiting enzymes in LD biology either possessing an S/T-Q cluster domain (predicted to be a phosphorylation substrate for the DNA Damage Response kinases (1)) and/or retrieved in phospho-proteomic screens as specific DDR targets (2,3). This adds up to 28 proteins in S. cerevisiae and 45 proteins in Homo sapiens. Importantly, the emergent candidates fall into two identical categories in both organisms. To provide initial support for their pertinence, we have generated a point mutant in the putative S/T-Q cluster of one of the yeast candidates. Of high relevance, we find that the concerned mutant is impaired in correctly triggering LD formation in response to DNA damage, and we have now obtained specific funding to pursue this characterization that, as such, constitutes a different work from the one presented in this manuscript. We hope that the reviewer is now convinced, and that she/he agrees with keeping this information for subsequent manuscript(s).

      The damaging agents used have been suggested to alter the redox metabolism and even lipid peroxidation (Kitanovic 2009, Mizumoto 1993, Krol 2015, Todorova 2015, Ren 2019, Singh 2014). Hence it is possible that PI(4)P changes are not due to DSBs, but an indirect though relevant effect. In absence of direct evidence supporting an active regulation of PI(4)P dynamics in response to DNA breaks, this conclusion remains speculative and this should be noted in the manuscript.

      We fully agree with the reviewer that the used genotoxins are triggering a myriad of effects which could elicit the same phenomenon by indirect means. Yet, we want to stress that the use of camptothecin, which elicits a very robust LD formation phenotype (Figure 1C), is very likely specific, as it is proven as a potent and direct trapper of Top1 onto DNA after having cleaved it. Nevertheless, we propose in the next paragraph two specific experiments to dismiss this problem, please see immediately below.

      Authors conclude that LD is specific to DSB induction. This seems an overstatement as they just reported LD increases in response to two agents that also induce other kinds of DNA damage. To also strengthen the link between DSBs and PI(4)P modulation of Tel1 function, authors should analyze LD numbers, Rad53 phosphorylation and Tel1 nuclear re-localization in response to HO-induced DNA breaks (e.g., using the system employed in Figure 3C).

      We humbly think that enzymatically-induced DNA breaks will both activate Rad53 phosphorylation and Tel1 nuclear concentration, as this has already been established, thus requiring no further exploration. Yet, it is very important to assess the reviewer’s suggestion concerning whether enzymatically-induced DNA breaks also trigger the formation of LD. To this end, we will perform two complementary studies in which, instead of using HO, which cuts only a few times in the genome, we will:

      a) exploit the naturally DSB-accumulating mutant rad3-102, which we previously characterized (4), and which we already exploit in this work for recombination analyses (Figure S4A), to evaluate whether it endogenously harbors more LD in comparison with the WT.
      b) use a tool we have recently created in which gRNAs targeted to different subsets of transposons in the genome can drive Cas9 to create DSB in a dose-dependent manner ((9), under revision in Genetics). We will use this system to monitor the LD formation in response to Cas9-triggered cuts.

      In addition, on figure 5A, significant differences in GFP-Tel1 foci abundance between WT and steD or yeh2D cells are only observed after 210', way after the slight effect on Rad53 phosphorylation is observed. This is at odds with the conclusion that Tel1 association to STEs modulates DDR signaling.

      We are afraid that we have not been clear enough in explaining the kinetics giving rise to our model. As indicated by the reviewer, our work shows, through kinetic studies, that the storage of sterols within LD occurs at later stages than the activation of the DDR by Tel1 and Rad53 phosphorylation. Tel1 foci decline is necessary for subsequent engagement of downstream DNA long-range resection. Since we propose that sterol storage within LD is a means to attenuate Tel1 engagement at DSBs, it is thus logical (and thus compatible with the data we show) that LD number increase occurs simultaneously with Tel1 foci decrease, at late stages of the reaction. We will include this explanation and graph in the revised version of the work.

      Minor comments:

      Figure S1D and E, experiments should be carried out to include time points in which LD accumulation and cell cycle arrest are observed upon zeocin treatment (i.e., up to 210' as in Figure 1A)

      We will provide cytometry profiles of cells at 210 min. These data exist already in our laboratory.

      How do authors explain increased single strand annealing recombination frequencies in steD and oleate-treated wild type cells (Figure 4A). Should it not be expected that increased STEs also impair recombination induced by endogenous damage?

      Only ste∆ (and not +oleate) indeed manifests an increase in basal recombination frequencies, likely arising from endogenous damage. Although the increase is observed, it is not significant. We agree anyway with the reviewer that, were the experiment to be repeated more times, the increase might prove significant. We do not have any honest proposal to explain this.

      Data presented in figure 4B and 4C are not fully convincing. Performing time course experiments might help concluding if the differences observed represent a relevant defect in DSB processing.

      We will perform a Pulsed Field Gel Electrophoresis (PFGE) kinetics in response to zeocin with or without oleate pre-loading to reinforce the conclusion.

      Is Figure 5B referring to Flag-tagged Tel1 or GFP-tagged Tel1 as stated in the figure legend?

      There is a misunderstanding here, as the mentioned Figure 5B corresponds to P-ATM immunofluorescences in human cells, not to any tagged Tel1 experiment.

      Treatment with the ATM inhibitor AZD0156 increased PI(4)P-ATM PLA signals. From this, the authors conclude that "association of ATM and PI(4)P inversely correlated with the need for ATM within the nucleus". Do they imply that treatment with ATM-inhibitors reduces the requirement for ATM function in the nucleus? The interpretation of this result should be further elaborated to sustain this conclusion.

      We may have conveyed a wrong notion at this point. We do not imply at all that ATM inhibitors reduce the need for ATM in the nucleus. Instead, we imply that, by reinforcing ATM attachment to Golgi-resident PI(4)P, ATM inhibitors end up titrating ATM away from the nucleus. We will clarify our explanation to avoid misunderstandings.

      An increase of GFP-Tel1 foci upon OSH4 overexpression is described on Figure 7B. These are described as nuclear in the results, but no reference is made in the figure or legend as to how nucleus positions are addressed in these experiments. This should be clarified.

      We systematically combine the tagging of a nucleoplasmic protein (mCherry-Pus1) with the detection of GFP-Tel1 foci, as to unambiguously assess the nuclear position of Tel1 foci. We will include this explanation and the corresponding mCherry-Pus1 channel to clarify this.

      Also, WT controls and quantifications should be included in the experiments shown on Figure 7C.

      These experiments are quantified from the moment we did them, although we did not include such quantifications in the present version for the sake of space. We will do so in the revised version.

      Reviewer #1 (Significance (Required)):

      While the conclusion of lipid metabolism responding to DSBs is not convincing, the observation that Tel1/ATM function is modulated by PI(4)P binding is significant and advances the understanding of the function and regulation of this key kinase in promoting genome integrity maintenance. This is an unanticipated result which is highly novel and has implications for the modulation of Tel1/ATM function through pharmacological manipulation of lipid metabolism. This finding would be of broad interest for scientists working on the response to DNA damage and the maintenance of genome integrity. This reviewer belongs to that group and has limited expertise to evaluate the lipid metabolism genetic manipulation in the manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors show that cytoplasmic PI4P has a regulatory role in the ATM response to DNA double strand breaks. The process involves a balance between exchange of PI4P between Golgi and ER in exchange of esterified sterols. The study is of interest; however, it provides indirect evidence to support its conclusions.

      Major comments: 1). Since the major conclusion relates to PI4P association with ATM in basal conditions to keep ATM outside the nucleus, and given the known presence of PI4P and ATM in other organelles of a cell, further experiments, such as cell fractionation experiments that show a Golgi-specific interaction, would support the main conclusion.

      In continuation of the 1st comment, since PI4P is a substrate of PI4 phosphoinositol kinases, whether there is a competition between PI4 kinases and ATM for PI4P binding should be addressed through immunoprecipitation studies.

      First of all, we need to specify here that PI4 kinases phosphorylate phosphatidylinositol to create PI(4)P. Thus, PI(4)P is the product, and not the substrate, of PI4 kinases. We therefore do not expect any competition between such kinases and ATM.

      Second, we take good note of the reviewer’s concern that the pool of PI(4)P at the Golgi may be shared, and also that it would be important to reinforce the notion of the relative subcellular localization of ATM under different treatments. To this end, we will perform the following integrative experiment:

      Immunoprecipitation of PI(4)P could theoretically be done using our specific antibody, yet the IP efficiency of a lipid cannot be verified by western blot. Further, there are PI(4)P pools elsewhere in the cell that would confound interpretation. We therefore dismiss the use of anti-PI(4)P as a tool to perform immunoprecipitations.

      Instead, to explore the impact of PI(4)P levels on ATM both at the Golgi and within the nucleus, we will split our cultures in two to either immunoprecipitate specific cytoplasmic Trans-Golgi Network-associated proteins (we will test separately TGN46 and GOLPH3); or the nuclear ATM-interacting factor MRE11 from nuclei, then blot for co-immunoprecipitated ATM. The relative co-immunoprecipitated ATM is expected to vary under different treatments to which the cells will be exposed, namely:

      • untreated
      • zeocin, to trigger ATM need in the nucleus
      • OSBP inhibition (+/- zeocin), to stabilize PI(4)P at the Golgi
      • PIK93, an inhibitor of PI4 kinases that prevents PI(4)P synthesis

      2). The authors claim that the ATM retention is the main function of PI4P in Golgi. The authors should rule out the possibility that the phenotype observed on DNA damage response is not due to non availability of PI4P substrate for PI4P kinases, that have recently been shown to participate in genome integrity maintenance.

      We want to explain that we do not intend to say that the main function of PI(4)P at the Golgi is ATM retention, as PI(4)P is a molecule that binds and modulates multiple proteins, for example the aforementioned GOLPH3. We will first revise our text to correct it in case we have conveyed this incorrect notion, as the reviewer's comment suggests.

      Second, the reviewer evokes the notion that PI(4)P can be the substrate of a second phosphorylation, which could give rise to PI(3,4)P or to PI(4,5)P, which could still undergo remodeling into PI(3)P, for example. Recent work by Dr Michael Sheetz’s lab demonstrated that this set of phosphoinositides serves to drive the nucleation and activation of the ATR-Chk1 branch of the DNA Damage Response upon genotoxic stress, yet was completely inert with respect to the ATM-Chk2 branch (5). To rule out the possibility, as evoked by the reviewer, that the oleate-induced DDR phenomena we describe relate to these other events, we have now explored the response of the ATR-Chk1 branch when comparing the response of zeocin-treated cells that have been pre-loaded or not with oleate. We observe that the ATR-Chk1 branch is unaltered by oleate loading. Thus, we can now propose that the PI(4)P branch exclusively modulates the ATM-Chk2 axis.

      3). Does Oleate treatment influences Rad53 protein levels in addition to its phosphorylation that affect DNA damage response may be addressed.

      Exponential cultures from three different WT, three different ste∆ and three different yeh2∆ strains have now been taken and pre-loaded for 2 hours with 0.05% oleate, then total levels of Rad53 (without induction of DNA damage) assessed. We can now formally say that basal levels of Rad53 protein are not altered by this incubation. We will include this control in the revised manuscript.

      4). Does Yeh2 deletion reduces LDS should be checked.

      We frequently use yeh2∆ cells in our studies. In particular, we have recently published work characterizing the phenotype of this strain with respect to the formation of lipid droplets in the nucleus (6). We are currently exploiting those same sets of data to quantify the total number of LD in order to satisfy the reviewer’s concern.

      5). Figure 4D representation should show % of phospho reduction of initial activation and a better western blot image should be shown that show equal loading of samples.

      We are currently repeating these gels and blots for the sake of clarity, as requested.

      6). In immunoprecipitation experiments, kindly include isotype IgG controls as well to rule out non-specificity.

      Of course, this important control will be included every time.

      Minor points: 1). Figure S1F do not show oleate treatment as presented in results section.

      We will correct the naming.

      2). A better gel for S4B should be presented with ponceau of the same gel.

      We are currently repeating this gel and associated blot for the sake of clarity, as requested.

      3). Nuclear PI4Ps has also been previously reported, an explanation to the specific interaction of ATM and PI4P in the Golgi should be addressed/discussed.

      We take it that the reviewer is referring here to the recent work by Fáberová et al (7) in which PI(4)P and PI(4,5)P were described as very dynamic in the nucleus, and mostly related then to mRNA transcription, splicing and export. We will reinforce the connection of our phenomenon to the Golgi-associated pool of PI(4)P thanks to the co-immunoprecipitation experiments proposed above, and will timely contextualize these in light of the paper by Fáberová and co-workers in the revised version. Thank you for reminding us of this work.

      Reviewer #2 (Significance (Required)):

      The current work definitely adds a layer in our understanding to ATM regulation and cross-talk between different PIKK family of kinases. ATM localisation in extra nuclear regions of a cell has been described earlier with significant impact on cell physiology such as mitochondria etc., ATM retention at golgi and limiting nuclear ATM levels is significant advance at ATM activity regulation, while signifying non canonical function of PI4P.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, the authors propose that ATM/Tel1 signaling is regulated in a spatiotemporal manner during genotoxic stress both in yeast and mammalian cells. They show that Lipid droplets accumulate in response to genotoxic stress. As a consequence, there is a decrease of exchange of PI4P from the Golgi to ER, thus dampening ATM/Tel1 signaling by sequestering this kinase into the Golgi. The authors combined findings in yeast and mammals showing that this mechanism is conserved throughout eukaryotes. For this purpose, they use a vast number of techniques that support their proposed model.

      Major comments:

      The conclusions were made based on evidence combining yeast genetics, immunofluorescence, DNA end resection analysis and pharmacological interventions. The hypothesis that ATM is kept away from the nucleus by physically interacting with PI4P at the Golgi, thus allowing processive repair is bold and contributes for a better understanding of the choreography of the DDR kinases during DSB repair. However, many of the experiments in yeast and mammals show only mild phenotypes and there is no evidence that this mode of ATM dampening impact cell viability in mammals.

      We agree with the reviewer that the effects associated with the reported phenomenon are indeed mild. This is a fact. We would like to point out that the metabolism of sterols is finely controlled, at many different levels, in a very complex manner. For example, sterol increases in the cell will immediately be compensated by reduced synthesis, while synthesis inhibition will immediately promote uptake from the medium, and/or release from stores (for example, see (8)). As a natural consequence, the window of manipulation and, more importantly, the strength of the phenotypes we can uncover are small.

      Therefore, I have some comments and suggestions of experiments that I think could improve the quality of the manuscript. I believe that most of these new experiments do not require much time and resources.

      • Does oleate treatment in RPE-1/Huh-7 cells induce loss of viability? An experiment showing loss of viability like MT-assay or decreased cell proliferation would reinforce the importance of the mechanism proposed.

      This experiment was already included in the previous version, yet it may have escaped the attention of the reviewer. We show in Figure S2E that oleate treatment restricts viability in Huh-7 cells alone, and also worsens their tolerance to zeocin. Perhaps we should consider moving this result to the main figures so that it does not go unnoticed.

      • In yeast there is evidence that a ste delta strain show sensitivity to zeocin/CPT, but there is no experiment showing the same effect on cells lacking Yeh2. Since both strains share similar phenotypes, it would be interesting to show that increased kinetics of Rad53 signaling leads to sensitivity to genotoxins.

      We have now performed this experiment and will include the matching information for yeh2∆ cells, which agrees with the predictions.

      • The conclusion that ste delta cells exposed to zeocin leads to unproductive events due to defects in DNA-end resection could be reinforced by a decrease in Rad52 foci. It has been previously shown by the group of Dr. Marcus Smolka, that inhibition of DNA-end resection decreases Rad52 foci (https://doi.org/10.1083/jcb.201607031). Since the authors were able to monitor Rad52-YFP (Figure S1A), it shouldn't consume time and resources.

      The reviewer is right that this experiment should not be time- or resources-consuming. We will evaluate the accumulation of Rad52 foci in response to the concerned genotoxin in ste∆ cells.

      • Since the authors propose that there is a DNA repair defect due to inhibition of long-range DNA-end resection, it would be important to monitor gamma-H2A(X) signal either in yeast or mammals.

      Taking into consideration the reviewer’s suggestion, we have now performed anti-γH2AX immunofluorescence for all the conditions concerned (genotoxins +/- oleate pre-load) and will quantify them to answer the concern.

      • How do the authors exclude the possibility that yeast mutants or oleate treatment in yeast/mammalian cells change membrane permeability allowing an increase in genotoxin concentration?

      Although this is a very reasonable criticism, we would point to the data presented in Figure S4A, in which we use the naturally DSB-bearing rad3-102 cells for recombination analyses, showing that the same phenotype also applies in the absence of any genotoxin. Yet, we want to reinforce the notion that LD formation in response to DSB can also occur when the breaks are not chemically, but physically, induced. To this end, and also to match a related request by Reviewer 1, we will:

      a) exploit the naturally DSB-accumulating mutant rad3-102 (4) to evaluate whether it endogenously harbors more LD in comparison with the WT.
      b) use a tool we have recently created, in which gRNAs targeted to different subsets of transposons in the genome can drive Cas9 to create DSB in a dose-dependent manner ((9), under revision in Genetics), to monitor LD formation in response to Cas9-triggered cuts.

      In addition, on figure 5A, significant differences in GFP-Tel1 foci abundance between WT and steD or yeh2D cells are only observed after 210', way after the slight effect on Rad53 phosphorylation is observed. This is at odds with the conclusion that Tel1 association to STEs modulates DDR signaling.

      We are afraid that we have not been clear enough in explaining the kinetics giving rise to our model. As indicated by the reviewer, our work shows, through kinetic studies, that the storage of sterols within LD occurs at later stages than the activation of the DDR by Tel1 and Rad53 phosphorylation. Tel1 foci decline is necessary for subsequent engagement of downstream DNA long-range resection. Since we propose that sterol storage within LD is a means to attenuate Tel1 engagement at DSBs, it is logical (and compatible with the data we show) that the increase in LD number occurs simultaneously with the decrease in Tel1 foci, at late stages of the reaction. We will include this explanation and graph in the revised version of the work.

      • It would be interesting to investigate genetic interactions between ste delta (or yeh2delta) and yeast mutants with DNA-end resection problems (exo1delta; sae2delta). For instance, it has been shown that Sae2 antagonizes checkpoint signaling by competing with Rad9 to DSB sites (https://doi.org/10.1073/pnas.1816539115). Also, cells lacking Sae2 show an increase in Rad53 signaling due to increased Tel1 Signaling. Therefore, an epistatic effect between these two pathways would reinforce the hypothesis of the manuscript.

      We will build the double mutant sae2∆ yeh2∆ and assess the potential epistatic behavior it may display with respect to some key phenotypes (Tel1 foci formation, Rad53 phosphorylation…).

      • The authors showed that Tel1-GFP does not accumulate in the nucleus in cells lacking Sac1 (Figure 7C). Tel1 is important to cope with increased DSBs in the absence of Mec1, thus avoiding genomic instability. Cells lacking both Mec1 and Tel1 show a sick phenotype with an exponential increase in gross chromosomal rearrangements and sensitivity to genotoxins. Therefore, does deletion of Mec1 (and Sml1) in sac1 delta phenocopies a mec1tel1 delta? Alternatively, does pharmacological inhibition of ATR in the presence of the OSBP1 inhibitor causes loss of viability or chromosomal aberrations?

      We will delete SAC1 in mec1∆ sml1∆ and compare the fitness, through growth drop assays, with respect to the mutant tel1∆ mec1∆ sml1∆.

      We will expose cells either to OSBP1 inhibitor, ATR inhibitor, or both, and assess the phosphorylation of their downstream common effector H2AX. Additionally, we will assess the effect on cell growth of the combination of ATRi and OSBP1i using synergy matrices. We will determine if the combination of both drugs synergizes or not to impair cell proliferation and reduce cell viability.

      • Finally, it seems strange to me that ATR/Mec1 signaling is not mentioned throughout the entire manuscript. Does PI4P pathway affect only ATM/Tel1? In Figure 2D, an antibody against phospho-CHK1 could be used to monitor ATR signaling. In line with that, I would like to see in the discussion how these new findings are in line with evidence from a 2019 paper showing that phosphoinositides PIP2 and PIP3, but not PI4P are important for ATR signaling (DOI: 10.1038/s41467-017-01805-9). They showed that a nuclear pool of PIP2 increases upon DNA damage induction and rapidly accumulates at DNA lesions. This event is important for the recruitment of ATR. Since PI4P is substrate for PIP2 synthesis and there is a nuclear pool of PI4P and PIP2, I think it is important to discuss if the results presented here are in line with these previous findings.

      The reviewer evokes recent work by Dr Michael Sheetz’s lab demonstrating that a different set of phosphoinositides serves to drive the nucleation and activation of the ATR-Chk1 branch of the DNA Damage Response upon genotoxic stress, yet was completely inert with respect to the ATM-Chk2 branch (5). We have now explored, also to satisfy a similar concern raised by Reviewer 2, the response of the ATR-Chk1 branch when comparing the response of zeocin-treated cells that have been pre-loaded or not with oleate. We observe that the ATR-Chk1 branch is unaltered by oleate loading. Thus, we can now propose that the PI(4)P branch exclusively modulates the ATM-Chk2 axis.

      We will of course give the needed credit to this work and contextualize our findings accordingly.

      Minor comments:

      • Line 124: The correct reference is Figure S1E, lower panel, and not Figure S1F.
      • Lines 127-128: Figure S2A does not show zeocin treatment.

      Both minor mistakes will be corrected.

      Reviewer #3 (Significance (Required)):

      Together, these new findings, if corroborated by others, might be important to open new lines of investigation in basic and translational research regarding human diseases as explored in the discussion section. I believe this paper will attract attention not only from the DDR field but also from other areas of research such as nutrient and lipid signaling both in yeast and mammals. I hope I was able to collaborate in this review, since my main expertise is in the area of DNA damage signaling using budding yeast as an organism model.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      This is a very interesting study where Sara et al. demonstrated a link between lipid metabolism with DNA repair response (DDR). In this study, they have proposed ATM as a novel PI4P-effector. The sterol deposition into lipid droplets impacts the Golgi PI4P level due to lipid exchange machinery facilitated by OSBP1, therefore regulating the cytosolic retention of ATM due to PI4P binding. Although how lipid droplets in the cytosol sense the DNA damage and control the initiation of DDR by regulating ATM is still unclear, this study linked lipid biology/PI signaling to DNA damage repair and showed the evolutionary conservation of PI signaling and DNA repair machinery from yeast to humans. The experiments are well designed, nicely controlled, with a high quality of data presentation. With some improvements, this work could be a very interesting study attracting a broad readership.

      In their model, ATM is PI4P-bound and sequestered inside the cytosol under basal conditions. Upon genotoxic stress, activation of OSBP1 removes PI4P and free PI4P-bound ATM for nuclear translocation of DNA repair. This could also be interpreted as genotoxic stress-induced PIP-kinase activity, where PI4P is processed into PIP2 or PIP3, somehow redirecting ATM into the nucleus to initiate its activation for DDR. Those aspects should be discussed and improved.

      Both Reviewers 2 and 3 have somehow evoked a similar concern. More precisely, the work by Dr Michael Sheetz’s lab demonstrating that a different set of phosphoinositides serves to drive the nucleation and activation of the ATR-Chk1 branch of the DNA Damage Response upon genotoxic stress, yet was completely inert with respect to the ATM-Chk2 branch (5). We have now explored, to satisfy all reviewers’ concerns, the response of the ATR-Chk1 branch when comparing the response of zeocin-treated cells that have been pre-loaded or not with oleate. We observe that the ATR-Chk1 branch is unaltered by oleate loading. Thus, we can now propose that the PI(4)P branch exclusively modulates the ATM-Chk2 axis.

      Additionally, we will of course give the needed credit to this work and contextualize our findings accordingly.

      Upon stress, there is nuclear activation of p53-phosphoinositide (PI) signalosomes and PIP-kinases. Also, there is a significant PIP2 pool inside the nucleus with an involvement in DNA damage repair. Those papers and their relevance to the current study need to be discussed. If ATM is a novel PI4P-effector, there is also nuclear PI4P formation or nuclear PI4P accumulation upon stresses based on recent studies; how does ATM interact with PIPn in the nucleus upon translocation? A known ATM substrate, p53, is PIP2/PIP3 bound in the nucleus based on recent studies. Whether ATM will prefer to interact with other PIPn-bound proteins in the nucleus, or whether PIPn regulates their interaction, needs to be discussed.

      These additional notions are in line with the previous paragraph presented by the reviewer, and our answers too. We will provide a constructive overview of all these ideas in the revised version of the manuscript.

      Major points: 1. The PI4P-ATM complex is supported only by PLA and PIP strips. Need more robust biochemical characterization of the interaction: co-IP, lipid binding, and/or in vitro constitution.

      We agree with the need to perform assays in which PI(4)P is embedded in a bilayer, so as to confidently assess whether Tel1 can bind it in that context. We have now performed a pilot experiment in which we incubated purified FLAG-Tel1 with liposomes harboring PI(4)P. Western blot analysis using anti-FLAG antibody shows the encouraging result that FLAG-Tel1 can be found there. As a control, we have performed the same process but in the absence of any liposomes. We observe that a residual fraction of FLAG-Tel1 can nevertheless be found in this control, most probably because the buffer used during the liposome assay makes part of FLAG-Tel1 precipitate. To avoid this type of background and to increase our trust in the results, we propose to perform the liposome assay on a discontinuous density gradient, so that liposomes will be retrieved in the top layer (and bound FLAG-Tel1 with them, if that is the case), while any precipitated FLAG-Tel1 will be in the bottom fraction (liposome floatation assay). As a further control, we will include the same liposomes but lacking PI(4)P. We expect to be successful in the floatation assays. If we are not, we will repeat the experiment presented above to be confident that the observed increase is reproducible.

      2. The use of drug inhibitors only in the final figure is problematic. KD or KO experiments should be performed to confirm that ATM and the exchanger are the relevant targets.

      We have now used siRNAs against the exchanger protein, OSBP1, with very high silencing efficiency. We have next monitored the activation status of the chromatin-associated ATM target KAP1, in order to monitor the predicted decrease of ATM activity specifically inside the nucleus. Our results from the KD experiments requested by the reviewer confirm the role of OSBP1 in attenuating ATM nuclear participation.

      3. Poor quality of some WBs (e.g. Fig. S1F).

      We have now repeated the Western Blot to detect Rad53-P in response to 20 mM HU in WT versus ste∆ cells.

      4. Lack of statistical analyses for some data (e.g. Fig. 1B-E)

      We had already included, in the previous version, the complete statistical analyses corresponding to Figures 1B to E and evoked here by the reviewer. They were indeed included in Figure S1C, and our brief reference to them in the text may have escaped her/his attention. We will make a clear reference to this in the revised version.

      Additional clarification points:

      Figure 1: No representative images were shown for quantifications in Figure 1C, D, E.

      If the reviewer or editor considers it pertinent, we can of course include them. Yet, they will be very redundant with the images displayed in Figure 1A.

      Line 121: Should be Figure S1E, upper panel. Line 124: Should be Figure S1E, lower panel. Figure 2D-E, please show the quantification of the ratio of pCHK2/CHK2 with an N=3

      We will correct / include the requested changes.

      Figure S2B: needs quantification of NileRed staining to conclude induction in LD formation

      We will quantify the LD as requested.

      Figure 3C, to show the selectivity of ATM-binding toward PI4P, PLA of ATM with other PIPn species should be assessed, such as PI3P, PI4,5P2, and PI3,4,5P3.

      We have provided an overview of the binding preferences of ATM with respect to the full battery of phosphoinositides in the strip-binding assay shown in Figures S5C and 6B. Other than that, we are afraid that PLA studies as the ones we develop in the current manuscript for PI(4)P are not feasible, since no reliable antibodies exist for most of the phosphoinositide species evoked by the reviewer.

      Figure S6A, PI4P level could be assessed by IF staining using PI4P antibody besides using PI4P sensor.

      We will use our PI(4)P antibody to monitor by immunofluorescence the behavior of this molecule in response to either MMS or zeocin, as suggested.

      References

      1. Cheung HC, San Lucas FA, Hicks S, Chang K, Bertuch AA, Ribes-Zamora A. An S/T-Q cluster domain census unveils new putative targets under Tel1/Mec1 control. BMC Genomics. 2012;
      2. Bensimon A, Schmidt A, Ziv Y, Elkon R, Wang SY, Chen DJ, et al. ATM-dependent and -independent dynamics of the nuclear phosphoproteome after DNA damage. Sci Signal. 2010;
      3. BastosdeOliveira FM, Kim D, Cussiol JR, Das J, Jeong MC, Doerfler L, et al. Phosphoproteomics Reveals Distinct Modes of Mec1/ATR Signaling during DNA Replication. Mol Cell. 2015;
      4. Moriel-Carretero M, Aguilera A. A Postincision-Deficient TFIIH Causes Replication Fork Breakage and Uncovers Alternative Rad51- or Pol32-Mediated Restart Mechanisms. Mol Cell. 2010;37(5):690–701.
      5. Wang YH, Hariharan A, Bastianello G, Toyama Y, Shivashankar GV, Foiani M, et al. DNA damage causes rapid accumulation of phosphoinositides for ATR signaling. Nat Commun. 2017;
      6. Kumanski S, Forey R, Cazevieille C, Moriel-Carretero M. Nuclear Lipid Droplet Birth during Replicative Stress. Cells. 2022;11(1390).
      7. Fáberová V, Kalasová I, Krausová A, Hozák P. Super-Resolution Localisation of Nuclear PI(4)P and Identification of Its Interacting Proteome. Cells. 2020;9(5):1–17.
      8. Luo J, Yang H, Song BL. Mechanisms and regulation of cholesterol homeostasis. Nat Rev Mol Cell Biol [Internet]. 2020;21(4):225–45. Available from: http://dx.doi.org/10.1038/s41580-019-0190-7
      9. Coiffard J, Santt O, Kumanski S, Pardo B, Moriel-Carretero M. A CRISPR-Cas9-based system for the dose-dependent study of DNA double strand breaks sensing and repair. bioRxiv [Internet]. 2021;1–37. Available from: https://doi.org/10.1101/2021.10.21.465387.
    1. gammon

      I want to focus this annotation on one word in particular that I at first overlooked: gammon. Taken at face value, gammon is a British term for a smoked or cured ham. Thus, it would be easy to not think much of the line: “Well, that Sunday Albert was home, they had a hot gammon, / And they asked me in to dinner, to get the beauty of it hot.” However, gammon also has two other definitions: (1) to defeat an opponent in backgammon, another board game that shares similarities with chess, and (2) to hoax or deceive. First, alluding to backgammon within “A Game of Chess” provides interesting parallels and reflections on what it means to be within a game. From Middleton’s play, we know that chess is strongly affiliated with seduction and lust. While this may be a stretch, I believe that backgammon acts as a contrast to chess as a representation of what society was before the War and the deterioration of creativity and individualism that Eliot constantly references within The Waste Land. Chess is the younger of the two, and has a belligerent connotation (possibly in reference to The Great War), in comparison to the meditative nature of backgammon. To be engaged in chess is a cerebral battle, and in Eliot’s mind, England is losing. Moreover, in Sukhbir Singh’s journal article, “Gloss on "Gammon" in "The Waste Land", II, Line 166”, he mentions the importance of the characterization of the gammon as “hot.” Singh deems the gammon aphrodisiac, and believes that “hot” refers to the Duke’s “flaming appetite” and “hot lust.” Singh’s opinion fits nicely into the second alternate definition of gammon, which is to hoax or deceive. In this case, the unnamed woman in the poem has most likely fallen into her “flaming appetite” and participated in an affair. Singh believes that this lack of love is Eliot’s reflection of societal deterioration.

    1. Yet when all this is admitted I still feel that the considerations which I have urged should have a wide influence upon the type of psychology which is to be developed in the future. What we need to do is to start work upon psychology, making behavior, not consciousness, the objective point of our attack. Certainly there are enough problems in the control of behavior to keep us all working many lifetimes without ever allowing us time to think of consciousness an sich. Once launched in the undertaking, we will find ourselves in a short time as far divorced from an introspective psychology as the psychology of the present time is divorced from faculty psychology

      Watson acknowledges some objections that his views may provoke, but still believes psychology should stop focusing on consciousness and focus instead on behavior.

    2. But on the other hand, since it does respond to thermal, tactual and organic stimuli, its conscious content must be made up largely of these sensations; and we usually add, to protect ourselves against the reproach of being anthropomorphic, 'if it has any consciousness'. Surely this doctrine which calls for an analogical interpretation of all behavior data may be shown to be false: the position that the standing of an observation upon behavior is determined by its fruitfulness in yielding results which are interpretable only in the narrow realm of (really human) consciousness

      Anthropomorphism is when you think about animals or objects as if they were human. For instance, pet owners might observe human-like qualities in their pets, believing that their pet is experiencing an emotional state similar to what a human feels. https://psychcentral.com/health/why-do-we-anthropomorphize#anthropomorphism

    1. Learning (defined as actionable knowledge) can reside outside of ourselves (within an organization or a database), is focused on connecting specialized information sets

      I haven't really explored this idea before, but it makes complete sense! When we take time to reflect on what we already know, what we have just learned, and ask questions about what else these ideas may relate to, we get the big picture. I think this idea could apply to connecting ideas and it could also be about connecting people (like in our Twitter chats) so that we have more resources or better support to continue the learning process.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      In our work, we quantified the abundance and positions of major kinetochore proteins within the metaphase kinetochore in budding yeast using single-molecule localization microscopy. Based on these measures, we revised the current model of the kinetochore and provided a nanoscale view of the complex.

      We now revised our manuscript according to reviewers’ points. We performed new analyses to quantify the measurement errors and to justify our data analysis workflows. We further exploited the correlation-based analysis and found a correlation between the spreads of kinetochore proteins perpendicular to the spindle axis and their positions along the axis. We also discussed the potential non-centromeric pools and revised our model of the kinetochore. Further information on our analyses was now provided to improve the clarity. Changes to the text were implemented to better reflect our data. Information from relevant works was incorporated to better connect this work to the field.

      We thank the reviewers for their points, which help us show the rigorousness of our analyses, further demonstrate the potential of our work, and improve clarity.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors have developed a rigorous methodology for using single-molecule imaging of exogenously labeled kinetochore proteins to count and estimate their copy numbers and the average distance from the kinetochore protein Spc105. Although the method is technically sound, its application to the kinetochore raises some crucial questions below. My biggest concern is the effect of non-centromeric pools of the centromeric proteins Cse4, Cep3, and Ctf19 on the estimated copy number per kinetochore. The authors should be able to address most, if not all, questions by presenting a more in-depth data analysis.

      Major points

      1. Accounting for tilt of the yeast spindle relative to the image plane: It is not clear to me how the authors ascertain whether the spindle being imaged is nearly parallel to the image plane. In the companion fission yeast study, spindle poles are used for this purpose, but this study seems to rely only on the labeled kinetochore proteins. The criteria used to select the in-plane spindles should be clearly defined.

      We thank the reviewer for pointing this out. We selected the in-plane spindles based on their average PSF size, which informs the z positions of the center of the kinetochore cluster (for simplicity, now all ’half-spindle’ was changed to ‘kinetochore cluster’). To calibrate the z position of kinetochore clusters, we first measured the width of the kinetochore cluster by fitting a cylindrical distribution. Overall, the kinetochores are likely symmetrically distributed around the spindle axes. Therefore, the height and the width of a kinetochore cluster should be the same. We then calibrated the z positions of the PSF size based on fluorescent bead data. Next, we plugged in the cylindrical distribution to the calibration curve to correlate the mean PSF size and position of the kinetochore cluster. We only took the kinetochore clusters with a mean PSF size

      1. The effects of PSF depth on counting kinetochore proteins: The authors use a well-characterized nuclear pore protein as the reference to estimate kinetochore protein counts per half-spindle. Although this method appears rigorous in principle, I am unsure about the effect of the spatial distribution of kinetochores on the accuracy of the estimated number. Nuclear pore proteins are all localized within an 100 nm away from the focal plane even when the spindle is perfectly parallel to the focal plane. A discussion of this possibility, its effect on the protein count/distance estimates, and any mitigating factors is essential to highlight the caveats associated with the conclusions.

      Based on the cylindrical distribution (please see the reply to point 1) of kinetochore clusters and their positions in z, we calculated the upper and lower boundaries of the distribution of kinetochore proteins in z, given a specific mean PSF size cutoff of a kinetochore cluster. Regardless of how stringent the cutoff is (130 and 135 nm), we made sure the boundaries do not exceed the imaging depth defined by our choice of the PSF size filtering (

      1. Presentation of the cross-correlation analysis: The authors use cross-correlation for an unbiased calculation of the axial separation between a protein of interest and Cse4, but I am curious about the structure of the underlying data, and the intensity image in Figure 1 is not easy to examine. It will be helpful to include more analysis of the underlying data for at least a subset of the proteins (e.g., proteins at short, intermediate, and long distances from Cse4) as supplementary data.

      2. The authors should include X and Y projections of the cross-correlation function.

      3. Do the widths of cross-correlation functions (i.e., their spread perpendicular to the spindle axis) match across all proteins and experiments? This should be an almost invariant characteristic of the measurements, assuming that proteins within each kinetochore tightly cluster around the 25 nm microtubule. This line of thinking makes the large width of the cross-correlation shown in Figure 1 somewhat surprising.

      4. It will also be interesting to test if the correlation between the positions of Spc105 molecules, especially perpendicular to the spindle axis, is comparable to the known separations between adjacent microtubules in the yeast spindle (the authors could use Winey et al. 1995 for serial-section EM of yeast spindles for comparison).

      The reviewer is interested in the spread, or the size of the distribution, of a protein in a kinetochore along and perpendicular to the spindle axis. This is an interesting idea and can be done practically. However, the information can be more easily obtained from auto-correlation instead of cross-correlation, due to its better signal-to-noise ratio along the dimension perpendicular to the spindle axis. Cross-correlations in that dimension are convolved with background localizations and the different localization precisions of the two channels. These factors are hard to interpret and disentangle. In auto-correlations, although the background is still present, it can be modeled and then removed easily, as now mentioned on page 15 lines 500-516.

      Accordingly, we performed auto-correlation analysis on all the proteins and compared them to simulations representing different sizes. We find that the size of the distribution correlates to the position of the protein along the spindle axis. The results are now included as the new Fig. S5 and discussed on page 6 lines 169-176.
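
      As a rough illustration of the idea only (not the analysis pipeline used in the manuscript), the auto-correlation of localizations along one dimension can be approximated by a histogram of all pairwise separations: the width of the peak near zero lag reflects the size of individual clusters. The function name, bin width, and coordinates below are hypothetical, and background modeling is omitted.

```python
from itertools import combinations


def pairwise_autocorrelation(positions_nm, bin_width_nm, max_lag_nm):
    """Histogram of pairwise separations along one dimension.

    This is an illustrative stand-in for a 1D auto-correlation of
    localization positions; it does not model or remove background.
    """
    n_bins = int(max_lag_nm // bin_width_nm)
    counts = [0] * n_bins
    for a, b in combinations(positions_nm, 2):
        index = int(abs(a - b) // bin_width_nm)
        if index < n_bins:  # ignore separations beyond the maximum lag
            counts[index] += 1
    return counts


# Hypothetical coordinates (nm): two tight clusters roughly 100 nm apart.
positions = [0, 5, 10, 100, 105]
counts = pairwise_autocorrelation(positions, bin_width_nm=10, max_lag_nm=120)
```

      In this toy input, the counts at short lags come from pairs within a cluster, while the counts near 100 nm come from pairs spanning the two clusters.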

      The cross-correlation analysis was based on only the position of the maximum value, not the projections. To keep the figure concise, we decided not to include the projections. However, the auto-correlation analysis was indeed based on projections, which we now included in Fig. S5.

      Regarding the correlation between the positions of Spc105 molecules, we believe the reviewer actually refers to the correlation between the positions of kinetochores. Auto-/cross-correlations contain the information of the cluster sizes, based on the first peak (as shown in Fig. S5), and the relative distance (if the pattern is periodic). Unfortunately, the positions of kinetochores perpendicular to the spindle axis are not periodically distributed. Therefore, we cannot comment on the separations between adjacent microtubules.

      1. Cse4 count (4 per kinetochore) and the model presented: One of the surprising conclusions of the study is that there are two nucleosomes associated with each microtubule attachment, with Mif2/CENP-C potentially interacting with both nucleosomes. There are two critical issues that the authors must consider.

      (1) Fluorescent protein chimeras of Cse4 and CBF3 and COMA complex members do not exclusively localize to kinetochores. Biochemical studies show that both Cse4 and CBF3 proteins interact with non-centromeric DNA, e.g., see work from the Biggins lab regarding Cse4 over-expression and also from the Henikoff group that used ChIP-seq. I can't think of a similar reference for the CBF3 complex, but the DNA-binding proteins are also likely to interact with other parts of the genome. The non-centromeric protein is visible as a significant background fluorescence in wide-field microscopy, e.g., see Cep3 localization here: https://images.yeastrc.org/imagerepo/viewExperiment.do?id=202308&experimentGroupOffset=3&experimentOffset=0&experimentGroupSize=3

      Similar background fluorescence can be detected for Cse4 and Ctf19. This extra-centromeric localization of Cse4, Cep3, and Ctf19 makes it possible that the protein counts included by the authors are "contaminated" to some extent by the extra-centromeric protein. The authors should discuss this possibility and how it might affect their counts.

      After consideration, we agree with the reviewer that, specifically, a fraction of counted Cse4 molecules should be considered non-centromeric. We agree that the previous data are certainly sufficient to support this conclusion. The reviewer made a similar suggestion about the COMA and CBF3 subcomplexes. In recent years a substantial portion of inner kinetochore components has been reconstituted. In Harrison et al. 2019, the Ctf19 complex structure was solved, and two copies of the complex were observed. Therefore, a non-centromeric pool of COMA is certainly possible, and we have now made adjustments to the text (page 8, lines 219-225) and Fig. 4. Accordingly, we have also modified the abstract (page 1, lines 26-27) and restructured the sections (page 10) to accommodate the different possibilities for Cse4 copy numbers. While fluorescence imaging of CBF3 presents a signal throughout the nuclear region, we observed only four copies of Cep3 (part of CBF3). A CBF3 structure has also been resolved by Yan et al. 2018, in which the complex was proposed to exist as a dimer. This translates into four copies of Cep3. Therefore, we find it more suitable to leave all observed Cep3 (CBF3) molecules within the kinetochore model.

      (2) The model drawn in Figure 4 makes explicit assumptions about the positioning of the four Cse4 molecules (or two nucleosomes) in each kinetochore relative to the rest of the kinetochore components. Yet, the data shown do not justify this specific arrangement. Lawrimore et al. 2011 claim that the non-centromeric Cse4 nucleosomes must be randomly distributed in the pericentromeric chromatin to evade detection in biochemical tests. Therefore, the nearest-neighbor analysis suggested above will be valuable for gaining new insights into the relative positioning of the centromeric- and non-centromeric Cse4 nucleosomes. A similar analysis for Cep3 and Ctf19 will also be helpful. If stereotypical positioning of these molecules cannot be detected, then the model should be revised accordingly (alternative models that are also consistent with the data can be included).

      The reviewer has pointed out that Lawrimore et al. 2011 proposed and justified the existence of a non-centromeric Cse4 pool. This arrangement, potentially also involving other inner kinetochore components, makes sense, and our data did not indicate otherwise. Therefore, we have now revised our model accordingly by applying changes in the main text on page 10 lines 302-305 as well as in Fig. 4.

      (3) I suggest one experiment that can help the authors better understand protein organization in one kinetochore. Joglekar et al. 2006 used a dicentric chromosome to isolate single kinetochores on the spindle axis to test the assumption that each kinetochore consists of approximately the same number of molecules of kinetochore proteins. The strains are easy to construct (transform existing strains with a linearized plasmid). Single kinetochores can be seen with a low but reasonable frequency. I leave the decision to perform the experiment to the authors' discretion depending on whether the experiment will be worth the effort in strengthening or enhancing their conclusions.

      We performed the suggested experiment using the strain published in Joglekar et al. 2006 (kindly provided by Prof. Kerry Bloom) with Cse4 additionally tagged with mMaple. However, we always observed several super-resolved Cse4 clusters (likely of several kinetochores) overlapping with the diffraction-limited Nuf2-GFP signal, and were therefore unable to assign a single isolated kinetochore to the lagging centromere.

      1. Information regarding the degree of correction applied to calculate protein count per half-spindle: It will be helpful to include data regarding the degree of correction applied to the expected and measured numbers of NPC protein as supplementary data so that the readers can see the magnitude of this correction relative to the measured counts.

      We would like to clarify that we did not correct the data. Instead, we calibrate the copy number, given that the copy number of Nup188 per NPC is known. We assume the same ratio between localization and copy number applies to both Nup188 and the kinetochore proteins. We now include a new Table S4 listing calibration factors of all experiments shown in Fig. 3.
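
      The calibration described above amounts to a simple ratio, which can be sketched as follows. This is a minimal illustration under the stated assumption that the localizations-per-copy ratio is the same for the reference and the target protein; the function names and all numbers are hypothetical placeholders, not values from the manuscript.

```python
def calibration_factor(reference_localizations, reference_copy_number):
    """Mean number of localizations detected per protein copy,
    measured on a reference structure of known copy number."""
    return reference_localizations / reference_copy_number


def estimate_copy_number(target_localizations, factor):
    """Convert a localization count into a copy number, assuming the
    target protein shares the reference's localizations-per-copy ratio."""
    return target_localizations / factor


# Placeholder values for illustration only.
factor = calibration_factor(reference_localizations=48.0, reference_copy_number=16)
copies = estimate_copy_number(target_localizations=12.0, factor=factor)
```

      Because the factor is measured per experiment, reporting it alongside the counts lets readers judge how strongly the final copy numbers depend on the reference measurement.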

      Minor points:

      1. McIntosh et al. JCB 2013 used microtubule plus-ends in serial section electron micrographs of yeast spindles to align the centromeric region and found a disk-shaped structure that roughly corresponds to the size of a single nucleosome ~ 80 nm away from the tip of the microtubule and centered the microtubule axis. The authors should refer to this finding in their discussion of the model that they present with two nucleosomes. In my opinion, this is compelling evidence for a nucleosome-like structure serving as the kinetochore foundation.

      We agree with the reviewer's comment. The study, among others, presents compelling evidence for a point centromere. We have now included the finding in the discussion on page 10, lines 293-294.

      1. As discussed by the authors, the number of Cse4 molecules per kinetochore has been the subject of some controversy. Biochemical data from the Biggins group and ChIPseq data from the Westermann group (Altunkaya et al. 2016 Current Biology) strongly suggest that Cse4 molecules can only be found centered on the centromeric sequence. The latter reference should be included in the discussion.

      Thank you for pointing this out. Indeed, this is important. We have now added the relevant reference in the discussion on page 10 lines 291-292.

      1. Although microscopy-based methods have estimated anywhere from 1, 2, to 6 Cse4 molecules per kinetochore, these studies generally agree on the stoichiometry between Cse4 and the rest of the kinetochore proteins, e.g., Ndc80 complex proteins are ~ 4-fold more abundant than Cse4, etc. The present study seems to disagree with this protein stoichiometry. The authors may find it worthwhile to note this feature of their data.

      We now discuss the stoichiometry difference between our results and others on page 11 lines 322-324.

      1. Omission of the Dam1 complex from this study is disappointing to me personally, but I am sure that the authors have good reasons for this. They should briefly comment on the absence of the Dam1 complex in this study.

      To provide information on the Dam1 complex, we imaged Ask1, a component of the complex. The measured positioning and copy number of the protein are now included in Fig. 2 and Fig. 3 respectively, and described and discussed in respective parts of the manuscript.

      Reviewer #1 (Significance (Required)):

      Cieslinski and colleagues present a single-molecule localization-based study to define the copy numbers and relative organization of kinetochore proteins in budding yeast. These numbers confirm and significantly refine prior measurements of the same aspects of the kinetochore. They also raise new questions and point to new research directions. The measurements also reveal a model of the protein organization of the budding yeast kinetochore in metaphase. For these reasons, the manuscript is of significant interest to the cell division field.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this study, Cieslinski and colleagues have applied single molecule localization microscopy to map the positions of proteins in the yeast kinetochore. This has not been reported previously; the study is well conducted and the data appear solid. They also use a modification of this technique to assess the stoichiometry of kinetochore proteins. The results that they obtain are broadly in line with several previous studies that use other methodology. There may be an improvement in accuracy using this new approach that has not been obtained previously and there are some important novel conclusions from this work. I would like the authors to address the following concerns prior to publication:

      Major points

      1. One interesting finding is that there is a discrepancy in the length of both the MIND and NDC80 complexes (from crystallographic data) with their relative positions. The authors suggest that the outer complexes could be twisted or rotated in respect of the spindle axis. It would be great if the authors could illustrate this in their model (or discuss it in the text), to demonstrate the required angle of twist/rotation of both complexes to account for the discrepancy. A twisted filament structure to the outer kinetochore does have some implications for its response to tension - a key determinant of kinetochore-microtubule attachment. It also may provide some flexibility to the structure under tension.

      The discussion about this discrepancy has now been incorporated in the main text, page 9 lines 263-267. For clarity, we only partially reflect this in our schematic model (Fig. 4A; the MIND complex) but we already reflected this in the illustrative structural model in Fig. 4B.

      1. For the experiment with cycloheximide, the authors state "Although we observed minor changes in copy numbers, the overall effect of CHX was small." For some proteins, Cse4i for example, there appears to be a significant decrease in intensity (30-40%) after cycloheximide treatment, see Figure S3. While the conclusion that tag maturation does not affect copy number measurements is sound, I suggest modifying this section to reflect the data.

      We have now modified the section accordingly by pointing out that CHX treatment led to a reduction of the Cse4i signal. The modification can be found on page 8 lines 207-211.

      1. Page 5. The statement "These data agree reasonably well with previous diffraction-limited dual-color microscopy studies ..." provides readers with little ability to compare the data. I would like to see a supplementary figure comparing these new data with previous studies, especially those of Joglekar et al 2009, see Figure 3 in this paper.

      We thank the reviewer for suggesting such a comparison. This will allow readers a direct comparison of the data between our study and Joglekar et al. 2009. The comparison can be found in the new Table S1 and Fig. S4, which are now mentioned on page 5.

      1. In terms of the distances quoted, are they in one dimension (as per Joglekar et al 2009) or in three? The results section is entitled "...positions of kinetochore proteins along the metaphase spindle axis", which suggests a single dimension. Please make this very clear in the results section. In the discussion, there is the statement "we mapped the relative positions of 15 kinetochore proteins along the kinetochore axis", which is not entirely clear. It seems from the methods that this is one dimension: "...we determined the average distance between the two proteins along the spindle axis." I suggest clarifying the results section briefly and clearly to indicate that this is a single dimension being measured and also using consistent wording of the axis measured throughout the text.

      We agree that the previous description may not have been clear to readers. We have now changed the text accordingly in the results section, page 5 lines 129-130.

      Minor points:

      Abstract: I would drop "all" from "For all major kinetochore proteins...", since full characterisation was performed on 14 proteins (9 in terms of copy number).

      We have now deleted “all” from the abstract as the reviewer suggested.

      Page 2: "trough" to through.

      Corrected.

      Page 2 "S. cerevisiae" to italics

      Corrected.

      Methods p11. How do the MKY strains relate to common yeast genetic backgrounds? (e.g. are they S288C?).

      MKY strains are derivatives of S288C. This information has now been added to the Methods section and Table S2.

      Reviewer #2 (Significance (Required)):

      This manuscript, together with an accompanying one from Virat et al., are nice complementary studies that provide the first single molecule localization studies of the yeast kinetochore. Although other labs have used super-resolution methods to study individual kinetochore proteins, both of these new studies map distances between many proteins at the kinetochore and thus are able to produce maps of the overall kinetochore structure. Like the previous study using standard resolution methods (Joglekar et al, 2009. Current Biology 19, 694-699), these studies will likely provide a benchmark for future studies on eukaryotic kinetochore architecture, including those in mammalian systems. Additionally, this work will appeal to super-resolution microscopists.

      My expertise is as a yeast kinetochore cell biologist.

    1. Author Response

      Reviewer #1 (Public Review):

      I'm curious about whether the microscopy provided any information about when secretory vesicles leave the TGN. Do they leave throughout the lifetime of a TGN structure, or do they leave in a burst when a TGN structure disperses as marked by loss of Sec7? This information might take us a step closer to understanding how secretory vesicles are made.

      Given the limitations of our current imaging set-up with regard to high-speed 3D two-color microscopy, we were unable to capture a large number of these events and therefore cannot make concrete statements about this. However, the quantified events did not appear to be preceded or followed by additional events, suggesting some temporal separation.

      Reviewer #2 (Public Review):

      The authors are encouraged to integrate their data together better with published biochemistry and structural work into more complete mechanisms for vesicle trafficking, tethering and fusion. The manuscript would be improved by a clearer model(s) of how these factors come together to carry out exocytosis.

      This suggestion has been addressed by the addition of a new model figure (Figure 9).

      Moreover, many conclusions (especially as they appear in the Results and Figures) are written as if they are well supported by the data (or others' data), when they are often speculative, or reasonable alternative explanations exist. The authors should be clear about which conclusions are well supported, and which are hypotheses. (e.g. Fig 6I, which is a terrific figure, but some of the "conclusions/statements" are speculations).

      We have made textual changes to distinguish more clearly between conclusions that are supported by the data and those that are more speculative.

      The mechanistic and experimental definitions for the start/end of "tethering" and "fusion" are not clearly stated in the main text, which leads to confusion when examining the arrival of different factors (and seems to lead to circular arguments about what is defining what). Are these definitions well supported by the previously published and current data? E.g. is the disappearance of GFP-Sec4 really equal to the fusion event? Without data showing membrane-merger or content delivery, this needs to be described as an assumption that is being made.

      Early in the results, we now define precisely what we interpret as the start of tethering and the time of fusion. Unfortunately, thus far, all attempts at designing a cargo marker suitable for defining membrane fusion have not succeeded. However, we believe the observations in Figure 4 strongly support the assumption that loss of GFP-Sec4 signal coincides with fusion.

      The Sro7 results and conclusions are complicated, and not always carefully supported, for several reasons: there is a functionally redundant paralog Sro77, and data shows Sro7 can bind to Sec4, Sec9 and Exo84 in exocyst (Brennwald, Novick and Guo labs). The authors should be clearer, as they seem to pick and choose which interactions they think are relevant for different observations.

      We did not intend to “pick-and-choose” relevant interactions and now more clearly state what our Sro7 results mean.

      The assumption that yeast Sec1 behaves similarly to other Sec1/Munc18 proteins for "templating" SNARE complex assembly, e.g. Vps33 in Baker et al, is unlikely, given the binding studies from a number of labs (Carr, McNew, Jantti). Furthermore, the evidence for Sec1 interaction with exocyst suggests that they may work together (Novick, Munson labs). Previous data from the Guo lab (Yue et al 2017) and new BioRxiv data from the Munson/Yoon labs suggest that exocyst may play key roles in SNARE complex assembly and fusion.

      We did not mean to imply that the exocyst does not play a meaningful and critical role in SNARE complex assembly and fusion. This was an unintentional omission, which we have now addressed in the text. Our interpretation of the published meaning of SM-protein “templating” is that SM’s facilitate the alignment of the critical zero-layer ionic residues in the SNARE motifs, which may be possible regardless of affinity to single SNARE motifs. Indeed, for Sec1 specifically, it may be possible that this exact function is of lower importance relative to, perhaps, the stabilization and protection of trans-SNARE complexes prior to membrane fusion. Future studies may clarify this.

      There is concern about whether the number of molecules measured for each of the factors is accurate, and how the authors really know that they are visualizing single vesicle events (especially with data showing that "hot-spots" may exist). For example, why is the number of molecules of exocyst ~double or more than that previously observed (Picco et al; Ahmed et al with mammalian exocyst)?

      Estimating the numbers of molecules is subject to some variation due to fluorescent tags used and to some extent where the protein is tagged. Since different tags were used in the earlier studies, being within a factor of two is not that surprising.

      For puncta of exocyst subunits in the mother or moving towards the plasma membrane, what is the evidence that they are actually on vesicles? The clearest argument seems to be the velocity at which they move, but this could be due to the direct interaction of exocyst with the myosin (which is a tighter interaction in vitro than exocyst-Sec4 binding), rather than being on vesicles. Furthermore, do all the exocyst complexes in the cell show this behavior, or could these be newly synthesized/assembled complexes?

      Transport of the exocyst by myosin alone without a vesicle seems very unlikely, as this myosin V needs to be activated by binding vesicle-associated Sec4 (Donovan et al., 2012, 2015). Moreover, transport of just two exocyst complexes by a myosin dimer would be very hard to detect. Nonetheless, we have added an additional supplementary figure (Figure 1 Supplement 5C) illustrating a clear example of exocyst complex colocalization with a secretory vesicle in the mother cell, which we hope will allay doubts about whether the exocyst complex is indeed on secretory vesicles, albeit in small numbers, during this stage of transport.

      With regard to the exocyst octamer leaving at the time of "fusion," the authors should discuss Ahmed et al.'s finding of Sec3 leaving prematurely in mammalian cells, as well as data from the Toomre lab.

      We did reference this earlier work in mammalian cells and indicate that it differs from the situation in yeast. We do not have further insight to draw from these differences.

      Reviewer #3 (Public Review):

      In this context, it is notable that dual-channel imaging appears to be made by sequential, not simultaneous, acquisition, which deserves a currently missing comment. Moreover, given the weight that image acquisition plays in this project, it might be described and justified better.

      As noted above, we have expanded our description of the microscopy. We took two-color images sequentially as our microscope is not configured with a beam-splitter for simultaneous imaging.

      This referee could not fully understand the routine of image acquisition, specifically, the continuous movement of the stage in the Z-axis as images are streamed (to the RAM or to the disk? the latter takes time, line 177); does it mean that Z-stepping is solely governed by the exposure time? The CCD camera penalizes pixel size (16 µm) at the expense of achieving outstanding quantum efficiency. The optical path includes a 100x objective and a 2x magnification lens to compensate for the large camera pixel size, thereby achieving 0.085 µm/pixel, but these lenses 'waste' part of the fluorescent signal. One wonders if the CMOS camera (6.5 µm pixel size) coupled with a 63x objective wouldn't be appropriate? A brief discussion on this choice would be helpful for readers.

      We now discuss the microscopy in more detail and explain why we use an EMCCD rather than a CMOS camera.

      It is remarkable that Sec2 and Sec4 are recruited to membranes even before a vesicle is formed (Fig 6I). I find somewhat weak the evidence that RAB11s 'mark' the TGN, and disturbing the fact that RAB11 reaches the PM (does GFP tagging prevent GAP accession?). I should like to recommend strongly that the authors integrate into the introduction/discussion information on the late steps of exocytosis available for Aspergillus nidulans, another ascomycete that is particularly well suited for studying this process. Here RAB11 is not a late Golgi resident but is transiently (20 s) recruited to TGN cisternae in the late stages of their 120 s maturation cycle to drive the transition between Golgi and post-Golgi (Pantazopoulou MBoC, 2014). Recruitment of RAB11 to the TGN is preceded by the arrival of its TRAPPII GEF (Pinar, PNAS 2015; Pinar PLOS Gen 2019), a huge complex that is incorporated en bloc to the TGN (Pinar JoCS, 2020). Upon RAB11 acquisition RAB11 membranes engage molecular motors (Penalva, MBoC 2017) to undertake a several-micron journey that transports them to a vesicle supply center located underneath the apex (review, Pinar & Penalva, 2021). Here is where Sec4 is located, strongly indicating that there is a division of work between two Rabs each mediating one of the two stages between the TGN and the membrane (Pantazopoulou, 2014, MBoC).

      In the general comments above, we discuss the possible artifact of tagged Ypt31 on the PM. In the Discussion, we now compare our results in S. cerevisiae with the findings in A. nidulans.

    1. Reviewer #1 (Public Review):

      Grande et al report the results of a series of functional connectivity experiments that build upon and extend results reported in Maass et al. (2015). The authors conducted three separate but interrelated analyses with a primary aim of characterising entorhinal-hippocampal processing pathways in the human brain.

      The first analysis served to identify subregions within the entorhinal cortex (EC) that preferentially connect with the retrosplenial cortex (RSC), posterior parahippocampal cortex (PHC) and perirhinal areas 35 (A35) and 36 (A36). The results of this analysis revealed that the RSC and PHC preferentially connect with the anterior medial EC and posterior medial EC respectively while A35 and A36 preferentially connect with the anterior lateral EC and posterior lateral EC respectively. In a second analysis, the authors evaluated patterns of functional connectivity between the four entorhinal subregions identified in Analysis 1 and specific subfields of the hippocampus, namely the subiculum and CA1. The authors provide evidence that each EC subregion preferentially connects with specific regions along the transverse (medial-lateral) axis of the subiculum and CA1.

      In a third analysis, the authors investigated whether 'object' and 'scene' information is differentially processed within EC subregions and along the transverse axis of the subiculum and CA1. Results revealed that the posterior medial EC and distal (medial) subiculum were preferentially engaged by 'scene' stimuli. In contrast, anterior regions of the EC and the CA1/subiculum border were equally engaged by 'object' and 'scene' stimuli. The authors propose that the posterior medial EC and distal subiculum may represent a unique route for scene/contextual information flow while anterior regions of the EC and the CA1/subiculum border may be involved in integrating both 'scene' and 'object' information.

      Overall, the study was well-motivated, well-designed and appropriately analysed to address the research questions. The conclusions of the paper are well supported by the data.

      The primary novelty of these results relates to the characterisation of how the RSC, PHC, A35 and A36 functionally connect with different portions of the EC and how, in turn, these EC subregions preferentially connect along the medial-lateral axis of the subiculum and CA1. These new and detailed insights will have an impact on and advance current theoretical models of entorhinal-hippocampal functional organisation in the human brain with implications for our understanding of human memory processing and its dysfunction.

      The study also provides new evidence regarding the functional organisation of EC-hippocampal circuitry as it relates to 'object' and 'scene' processing. Results of this component of the analysis support accumulating evidence that medial portions of the hippocampus and EC are preferentially engaged during scene-based cognition.

      Taken together, the results of this study inform and extend current theoretical models of entorhinal-hippocampal information processing pathways in the human brain.

      A major strength of the study is the detailed approach used to investigate each cortical region of interest (ROI), to characterise their functional connectivity with subregions of the EC and, in turn, how these EC subregions functionally relate to hippocampal subfields. The authors take advantage of the rich dataset acquired at 7T to gain new insights into entorhinal-hippocampal functional interactions.

      While the detailed approach noted above is a major strength of the study, it is also the source of some weaknesses. For example, when manually segmenting small ROIs (such as hippocampal subfields), quality assurance measures are important to give the reader confidence that the ROI masks are, as accurately as possible, measuring what we think they are measuring. A weakness of this study in its current form is that no quality assurance measures have been presented for the ROIs. The authors provide no metrics relating to intra- or inter-rater reliability (e.g., DICE metrics) for the manually segmented ROIs. Also, it can be difficult to warp small ROIs such as hippocampal subfields to EPI images with sufficient accuracy. No data is presented to assure readers that the ROIs (manually segmented on structural images and then warped to EPI space) were well aligned with the EPI images.

      It is also important to note that the subiculum mask used in this study appears to encompass the entire 'subicular complex' inclusive of the subiculum, presubiculum and parasubiculum. Importantly, the pre- and parasubiculum are located on the medial most aspect of the 'subicular complex' but this region is referred to throughout the current study as the 'distal subiculum'. Therefore, results attributed to the distal subiculum likely also reflect functional activation of the pre- and parasubiculum. Indeed, this makes sense considering accumulating evidence that the pre- and parasubiculum are preferentially engaged during scene-based cognition. Results relating to the 'distal subiculum' should, therefore, be interpreted with this in mind.

    1. Author Response

      Reviewer #1 (Public Review):

      This well-written paper combines a novel method for assaying ubiquitin-proteasome system (UPS) activity with a yeast genetic cross to study genetic variation in this system. Many loci are mapped, and a few genes and causal polymorphisms are identified. A connection between UPS variation and protein abundance is made for one gene, demonstrating that variation in this system may affect phenotypic variation.

      The major strength of the study is the power of yeast genetics, which makes it possible to dissect quantitative traits down to the nucleotide level. The weakness is that it is not clear whether the observed UPS variation matters on any level; however, the claims are suitably moderate, and generally supported.

      We agree with the reviewer that understanding how causal variants for ubiquitin-proteasome system (UPS) activity affect other molecular, cellular, and organismal phenotypes is an important area of future research.

      The paper provides a nice example of how it is possible to genetically dissect an "endo-phenotype", and learn some new biology. It also represents a welcome attempt to put the function of a mechanism that is heavily studied in molecular cell biology in a broader context.

      We thank the reviewer for these kind words.

      Reviewer #2 (Public Review):

      In this manuscript, the authors developed an elegant quantitative reporter assay to identify quantitative trait loci that regulate the N-end rule pathway, a major quality control mechanism in eukaryotes. By crossing two yeast species with divergent proteostasis activity, they generated a population that showed broad variation in proteostasis activity. By sequencing and mapping the underlying loci, they have identified several genes that regulate N-end rule activity. They then verified them using precise genetic tools, validating the power of their approach.

      Overall, it is a very solid manuscript that would be highly interesting for the quality control field.

      In general, I really liked this manuscript for these reasons:

      • Uses fluorescent timers elegantly to quantitatively measure protein degradation.

      • Validates the approach in depth, showing the readers how the tool works.

      • Uses the power of yeast genetics and bulk segregant analysis to map loci that may have small effects.

      • Validates the mapped loci using precise genetic tools.

      In a field that is dominated by biochemistry, this manuscript will be a fresh breath of air…

      We thank the reviewer for their thoughtful evaluation of our work and these kind words.

      Reviewer #3 (Public Review):

      This manuscript, "Variation in Ubiquitin System Genes Creates Substrate-Specific Effects on Proteasomal Protein Degradation" studies the genetic basis of differences in protein degradation. The authors do so by screening natural genetic variation in two yeast strains, finding several genes and often several variants within each gene that can affect protein degradation efficiency by the Ubiquitin-Proteasome system (UPS). Many of these variants have "substrate-specific effects" meaning they only affect the degradation of specific proteins (those with specific degrons). Also, many variants located within the same genes have conflicting effects, some of which are larger than others and can mask others. Overall, this study reveals a complex genetic basis for protein degradation.

      Strengths: Revealing the genetic basis for any complex trait, such as protein degradation, is a major goal of biology. The results of this paper make a significant step towards the goal of mapping the genes and variants involved in this specific trait. Fine mapping methods are used to home in on the specific variants involved and to measure their effects. This is very nicely done and provides a detailed view of the genetic basis of protein degradation. Further, the GFP/RFP system used to quantify the efficiency of the protein degradation system is a very elegant system. Also, the completeness of the analysis, meaning that all 20 N-degrons were studied, is impressive and leads to very detailed findings. It is interesting that some genetic variants have larger and opposite effects on the degradation of different N-degrons.

      We thank the reviewer for these positive comments.

      Weaknesses: Some of the results discussed in this paper are not surprising. For example, the finding that both large effect and small effect genetic variants contribute to this complex trait is not at all surprising. This is true of many complex traits.

      We agree with the reviewer that the number and patterns of QTLs we observe are perhaps not unexpected given that most traits are genetically complex. However, we also note that our results stand in stark contrast to previous efforts to understand how natural genetic variation affects the UPS, which have focused almost exclusively on large-effect mutations in UPS genes that cause rare Mendelian disorders. We have therefore chosen to retain our discussion of the complex genetic architecture of the UPS.

      The discussion of human disease is also a bit extensive given this study was performed on yeast. It might be more productive to use these findings to understand the UPS better on a mechanistic level. Why does the same genetic variant have opposite effects on the degradation of different degrons, even in cases where those degrons are of the same type?

      Following the reviewer’s suggestion we have removed multiple references to human disease from the introduction. We retained paragraph 3 of the introduction (previously, lines 43-55, pg. 2, para. 2 in the revised manuscript), which discusses disease-causing mutations in UPS genes, because the examples presented highlight two important motivations for our work: (1) individual genetic differences create variation in UPS activity and (2) much of our knowledge of how natural genetic variation affects the UPS comes from these rare, limited examples. However, we have re-written the paragraph to focus on these points and removed descriptions of the clinical manifestations of the disorders mentioned.

      We agree with the reviewer that understanding the mechanistic basis of substrate-specific variant effects on distinct N-degrons is important. However, doing so would require additional experiments that we argue are outside the scope of the current study.

      Overall, this manuscript excels at mapping the genetic basis of variation in the UPS system. It demonstrates a very complex mapping from genotype to phenotype that begs for further mechanistic explanation. These results are important to the UPS field because they may help researchers interrogate this highly conserved essential system. The manuscript is weaker when it comes to the broader conclusions drawn about the relative importance of large vs. small effects variants on complex traits, the amount of heritability explained, and the effects of genetic variation on protein abundance vs transcript abundance. Though in the case of protein vs transcript, I feel the cursory examination of the trends is perhaps at an appropriate level for the study, as it is mainly meant to show these things differ rather than to show exactly how and why they differ.

      We state that the distribution of QTL effect sizes for UPS activity consists of many QTLs with small effects and few QTLs of large effects. While this result is similar to patterns observed for other complex traits, it differs dramatically from the results of previous studies of genetic influences on the UPS, which have been largely confined to large-effect variants. Given these differences, we think it is appropriate and worthwhile to emphasize the complex genetic architecture of UPS activity.

      We agree that estimating the fraction of heritability explained by our QTLs and variants would be valuable. However, as noted in our response to Reviewer 1, the QTL mapping method we used does not permit ready calculation of heritability estimates due to its pooled nature.

      The reviewer is correct in noting that the primary goal of our RNA-seq and proteomics experiments was to provide an initial exploration of the effects of causal variants for UPS activity on global gene expression at the protein and mRNA levels. While a comprehensive dissection of the effects of this and other causal variants is an important area of future work, our results here show broad changes in global gene expression and establish that the causal UBR1 variant affects gene expression at the protein and mRNA levels.

      Reviewer #4 (Public Review):

      Overall the paper is clear and well-written. The experimental design is elegant and powerful, and it's a stimulating read. Most QTL mapping has focused on directly measurable phenotypes such as expression or drug response; I really like this paper's distinctive approach of placing bespoke functional assays for a specific molecular mechanism into the classical QTL framework.

      We thank the reviewer for their thoughtful evaluation of the work and positive comments.

    1. I thought I should have sunk down at last, and never got out; but I may say, as in Psalm 94.18, “When my foot slipped, thy mercy, O Lord, held me up.” Going along, having indeed my life, but little spirit, Philip, who was in the company, came up and took me by the hand, and said, two weeks more and you shall be mistress again. I asked him, if he spake true? He answered, “Yes, and quickly you shall come to your master again; who had been gone from us three weeks.” After many weary steps we came to Wachusett, where he was: and glad I was to see him. He asked me, when I washed me? I told him not this month. Then he fetched me some water himself, and bid me wash, and gave me the glass to see how I looked; and bid his squaw give me something to eat. So she gave me a mess of beans and meat, and a little ground nut cake. I was wonderfully revived with this favor showed me: “He made them also to be pitied of all those that carried them captives” (Psalm 106.46).

      I think this example of her reference to religion demonstrates its significance amidst her hardships.

    1. Oftentimes they even referred to one another.

      An explicit reference in 1931 in a section on note taking to cross links between entries in accounting ledgers. This linking process is a precursor to larger database processes seen in digital computing.

      Were there other earlier references that are this explicit within either note making or accounting contexts? Surely... (See also: Beatrice Webb's scientific note taking)


      Just the word "digital" computing defines that there must have been an "analog" computing which preceded it. However, we think of digital computing in much broader terms than we may have of the analog process.

      Human thinking is heavily influenced by associative links, so it's only natural that we should want to link our notes together on paper as we've done for tens of thousands of years (at least.)

    1. Rain, shine, and seasons aside, passengers scheduling rides are instructed by call center operators to be outside our pick-up location at our scheduled pick-up time, even though our ride may be nowhere near at that time. We are also instructed to be prepared to wait up to 30 minutes for our drivers in case of traffic or delays. Drivers who arrive within that “30-minute window” are still considered to be on time, even though the passenger may have been outside for up to half an hour at that point. Those 30-minute delays may actually turn into hours-long waits for many customers, as drivers must follow predetermined routes that lengthen trips and exacerbate travel conditions. Drivers, on the other hand, are instructed to give late passengers only a five-minute grace period. Drivers are also encouraged to call passengers if they do not see us when they arrive, but such calls are considered a courtesy, not a requirement.

      one of the issues. I think this is especially jarring to the reader because most of us have used an Uber before, or other forms of public transportation and these "terms" are very different.

    1. Author Response

      Reviewer #2 (Public Review):

      This study evaluates the causal relationship between childhood obesity on the one hand, and childhood emotional and behavioral problems on the other. It applies Mendelian Randomization (MR), a family of methods in statistical genetics that uses genetic markers to break the symmetry between correlated traits, allowing inference of causation rather than mere correlation. The authors argue convincingly that previous studies of these traits, both those using non-genetic observational epidemiology methods and those using standard MR methods, may be confounded by demographic effects and familial effects. One possible example of this kind of confounding is the idea that obesity in parents may contribute to emotional and behavioral problems in children; another is the idea that adults with emotional and behavioral issues may be more likely to have children with partners who are obese, and vice-versa. They then make use of a recently proposed "within-family" MR method, which should effectively control for these confounders, at the cost of higher uncertainty in the estimated effect size, and therefore lower power to detect small effects. They report that none of the previously reported associations of childhood BMI with anxiety, depression, or ADHD are replicated using the within-family MR method, and that in the case of depression the primary association appears to be with maternal BMI rather than the child's own BMI.

      This argument that these confounders may affect these phenotypes is fairly sound, and within-family MR should indeed do a good job of controlling for them. I do not see any major issues with the cohort itself or the choice of genetic instruments. I also do not see any major issues with the definitions or ascertainment of the phenotypes studied, though I am not an expert on any of these phenotypes in particular. I am especially satisfied with the series of analyses demonstrating that the results are robust to many variations of MR methodology. Overall, I think the positive result this study reports is very credible: that the known association between childhood BMI and depression is likely primarily due to an effect of maternal BMI rather than the child's own BMI (though given that paternal BMI has a similar effect size with only a slightly wider confidence interval, I would instead say that the effect is from parental BMI generally, not specifically maternal.)

      In the updated results based on the larger genetic data release, the estimates for the association of maternal BMI and paternal BMI with the child’s depressive symptoms are more clearly different than they were in the smaller dataset (for maternal BMI, beta= 0.11, CI:0.02,0.19, p=0.01; for paternal BMI, beta=0.02, CI:-0.09,0.12, p=0.71). Therefore, in this version, it makes sense to note an association with maternal BMI specifically.

      The main weakness of the study comes from its negative results, which the authors emphasize as their primary conclusion: that previously reported associations of childhood BMI with anxiety, depression, and ADHD are not replicated using within-family MR methods. These claims do not seem justified by the evidence presented in this study. In fact, in every panel of figures 2 and 3, the error bars for the within-family MR analysis encompass the estimates for both the regression analysis and the traditional MR analysis, suggesting that the within-family analysis provides no evidence one way or another about which of these analyses is more accurate. More generally, in order to convincingly claim that there is no causal relationship between two traits, an MR study must argue that the study would be powered to detect a relationship if one existed. Within-family MR methods are known to have less power to detect associations and less precision to estimate effect sizes than traditional MR methods or traditional observational epidemiology methods, so it is not sufficient to show that these other methods have power to detect the association. To make this kind of claim, it is necessary to include some kind of power analysis, such as a simulation study or analytic power calculations, and likely also a positive control to show that this method does have power to detect known effects in this cohort.

      We agree that it is imperative that negative (i.e. “non-significant”) results are correctly interpreted - it is just as important to discover what is unlikely to affect emotional and behavioural outcomes as what does affect them. Negative results (non-significant estimates) are neither a weakness nor strength of the study, but simply reflect the estimation error in our analysis of the data. The key question is whether our within-family MR estimates are sufficiently powered to detect effect sizes of interest or rule out clinically meaningful effect sizes – or are they simply too imprecise to draw any conclusions? As the reviewer suggests, one way to address this is via a post-hoc power calculation. We consider post-hoc power calculations redundant, since all the information about the power of our analysis is reflected in the standard errors and reported confidence intervals. Moreover, any post-hoc power calculation will be necessarily approximate compared to using the standard errors and confidence intervals which we report.

      Despite these methodological reservations, we have conducted simulations to estimate the power of our within-family models (the R code is included at the end of this document). These simulations indicate that we do have sufficient power to detect the size of effects seen for depressive symptoms and ADHD in models using the adult BMI PGS. They also indicate that we cannot rule out smaller effects for non-significant associations (e.g., for the impact of the child’s BMI on anxiety). Naturally, this is entirely consistent with the width of the confidence intervals reported in results tables and in Figures 1 and 2. However, although power calculations are important when planning a study, they make little contribution to interpretation once a study has been conducted and confidence intervals are available (e.g., https://psyarxiv.com/tcqrn/). For this reason, we comment on these simulations in this response to reviewers but do not include them in the manuscript or supplementary materials. At the same time, we have changed the language used in the manuscript to be clearer that the results were imprecise and that values contained within the confidence limits cannot be ruled out.

      For example, the discussion now includes the following:

      ‘However, within-family MR estimates using the childhood body size PGS are still consistent with small effects of the child’s BMI on all outcomes, with upper confidence limits around a 0.2 standard-deviation increase in the outcome per 5kg/m2 increase in BMI.’

      And the conclusion of the paper now reads:

      ‘Our results suggest that genetic variation associated with BMI in adulthood affects a child’s depressive and ADHD symptoms, but genetic variation associated with recalled childhood body size does not substantially affect these outcomes. There was little evidence that BMI affects anxiety. However, our estimates were imprecise, and these differences may be due to estimation error. There was little evidence that parental BMI affects a child’s ADHD or anxiety symptoms, but factors associated with maternal BMI may independently influence a child’s depressive symptoms. Genetic studies using unrelated individuals, or polygenic scores for adult BMI, may have overestimated the causal effects of a child’s own BMI.’

      Regarding a positive control: for analyses of BMI in adults, suitable positive controls would include directly measured biomarkers such as fat mass or blood pressure or reported medical outcomes like type 2 diabetes. In adolescents and younger adults, age at menarche or other measures of puberty can be used, as these are reliably influenced by BMI. However, the age of the participants for whom within-family effects are being estimated (8 years), together with the lack of any biomarkers such as fat mass (due to the questionnaire-based survey design) mean no suitable measures are available.
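      The point made earlier in this response, that a reported confidence interval already carries the power information, can be made concrete with a short calculation. The sketch below is not the authors' R simulation code; it assumes a normally distributed estimator and a two-sided test, and the function names are hypothetical. It recovers a standard error from a confidence interval and converts it into approximate power and a minimum detectable effect.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for all quantiles below


def se_from_ci(lower, upper, level=0.95):
    """Back out the standard error from a symmetric normal-theory CI."""
    z = _STD_NORMAL.inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def power_two_sided(effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test for a true effect of this size."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability that the estimate lands in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def minimum_detectable_effect(se, alpha=0.05, power=0.80):
    """Smallest effect detectable with the requested power.

    Uses the usual approximation that ignores the far rejection tail.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_crit + z_power) * se


if __name__ == "__main__":
    # Interval quoted in the response for maternal BMI and depressive symptoms.
    se = se_from_ci(0.02, 0.19)
    print("standard error:", round(se, 4))
    print("80% power effect size:", round(minimum_detectable_effect(se), 4))
    print("power at beta = 0.11:", round(power_two_sided(0.11, se), 3))
```

      Under these assumptions, the interval of 0.02 to 0.19 implies that effects of roughly 0.12 or larger would be detected with 80% power. Because the result is a deterministic function of the interval width, it adds nothing beyond the interval itself, which is the argument the response makes.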

      Reviewer #3 (Public Review):

      Higher BMI in childhood is correlated with behavioral problems (e.g. depression and ADHD) and some studies have shown that this relationship may be causal using Mendelian Randomization (MR). However, traditional MR is susceptible to bias due to population stratification, assortative mating, and indirect effects (dynastic effects). To address this issue, Hughes et al. use within-family MR, which should be immune to the above-listed problems. They were unable to find a causal relationship between children's BMI and depression, anxiety, or ADHD. They do, however, report a causal effect of mother's BMI on depression in their children. They conclude that the causal effect of children's BMI on behavioral phenotypes such as depression and anxiety, if present, is very small, and may have been overestimated in previous studies. The analyses have been carried out carefully in a large sample and the paper is presented clearly. Overall, their assertions are justified but given that the conclusions mostly rest on an absence of an effect, I would like to see more discussion on statistical power.

      1) The authors show that the estimates of within-family MR are imprecise. It would be helpful to know how much power they have for estimating effect sizes reported previously given their sample size.

      As discussed in response to a comment from reviewer 2, the power of our results is already indicated by our standard errors and confidence intervals. Nevertheless, we conducted simulations to estimate the size of effects which we had 80% power to detect. Results, presented below, are consistent with our main results. Because we consider post-hoc power calculations redundant when standard errors and confidence intervals are reported, we include this information in the response to reviewers but not the manuscript itself.

      2) They used the correlation between PGS and BMI to support the assertion that the former is a strong instrument. Were the reported correlations calculated across all individuals? Since we know that stratification, assortative mating, and indirect effects can inflate these correlations, perhaps a more unbiased estimate would be the proportion of children's BMI variance explained by their PGS conditioned on the parents' PGS. This should also be the estimate used in power calculations.

      The manuscript has been updated to quote Sanderson-Windmeijer conditional R2 values: the proportion of BMI variance explained by the BMI PGS for each member of a trio, conditional on the PGS of the other members of the trio, and all genetic covariates included in within-family models. Similarly, we now show Sanderson-Windmeijer conditional F-statistics for a model including the child, mother, and father’s BMI instrumented by the child, mother, and father’s PGS.

      3) In testing the association of mothers' and fathers' BMI with children's symptoms, the authors used a multivariable linear regression conditioning on the child's own BMI. Was the other parent's BMI (either by itself or using the polygenic score) included as a covariate in the multivariable and MR models? This was not entirely clear from the text or from Fig. 2. I suspect that if there were assortative mating on BMI in the parent's generation, the effect of any one parent's BMI on the child's symptoms might be inflated unless the other parent's BMI was included as a covariate (assuming both mother's and father's BMI affect the child's symptoms).

      Non-genetic models include both the mother and father’s phenotypic BMI as well as the child’s, allowing estimation of conditional effects of all three. This controls for assortative mating as noted by the reviewer. This was not previously clear - all relevant text and figure captions have been updated to clarify this.

      4) They report no evidence of cross-trait assortative mating in the parents' generation. The power to detect cross-trait assortative mating in the parents' generation using PGS would depend on the actual strength of assortative mating and the respective proportions of trait variance explained by PGS. Could the authors provide an estimate of the power for this test in their sample?

      We have updated the discussion of assortative mating (in both the results and the discussion section) to note possible limitations of power and clarify that this approach to examining assortment may not capture its full extent.

      The relevant part of the results section now reads:

      “In the parents’ generation, phenotypes were associated within parental pairs, consistent with assortative mating on these traits (Appendix 1 – Table 5). Adjusted for ancestry and other genetic covariates, maternal and paternal BMI were positively associated (beta: 0.23, 95%CI: 0.22,0.25, p<0.001), as were maternal and paternal depressive symptoms (beta: 0.18, 95%CI: 0.16,0.20, p<0.001), and maternal and paternal ADHD symptoms (beta: 0.11, 95%CI: 0.09,0.13, p<0.001). Consistent with cross-trait assortative mating, there was an association of mother’s BMI with father’s ADHD symptoms (beta: 0.03, 95%CI: 0.02,0.05, p<0.001) and mother’s ADHD symptoms with father’s depressive symptoms (beta: 0.05, 95%CI: 0.05,0.06, p<0.001). Phenotypic associations can reflect the influence of one partner on another as well as selection into partnerships, but regression models of paternal polygenic scores on maternal polygenic scores also pointed to a degree of assortative mating. Adjusted for ancestry and genotyping covariates, there were small associations between parents’ BMI polygenic scores (beta: 0.01, 95%CI: 0.00,0.02, p=0.02 for the adult BMI PGS, and beta: 0.01, 95%CI: 0.00,0.02, p=0.008 for the childhood body size PGS), and of the mother’s childhood body size PGS with the father’s ADHD PGS (beta: 0.01, 95%CI: 0.00,0.02, p=0.03). We did not detect associations with pairs of other polygenic scores, which may be due to insufficient statistical power.”

      And the relevant part of the discussion section now reads:

      “We found some genomic evidence of assortative mating for BMI, and cross-trait assortative mating between BMI and ADHD, but not between other traits. However, associations between polygenic scores, which only capture some of the genetic variation associated with these phenotypes, may not capture the full extent of genetic assortment on these traits.”

      5) Are the actual phenotypes (BMI, depression or ADHD) correlated between the parents? If so, would this not suffice as evidence of cross-trait assortative mating? It is known that the genetic correlation between parents as a result of assortative mating is a function of the correlation in their phenotypes and the heritabilities underlying the two traits (e.g., see Yengo and Visscher 2018). An alternative way to estimate the genetic correlation between parents without using PGS (which is noisy and therefore underpowered) would be to use the phenotypic correlation and heritability estimated using GREML or LDSC. Perhaps this is outside the scope of the paper but I would like to hear the author's thoughts on this.

      Associations between maternal and paternal phenotypes are consistent with a degree of assortative mating (shown below). These results have been added to Appendix 1 - Table 5, which also shows associations between maternal and paternal polygenic scores, and the methods and results have been updated accordingly (see quoted text in response to the comment above). For comparability, both sets of results are based on regression models adjusting for the mother’s and father’s ancestry PCs and genotyping covariates. We agree that analysis of assortative mating using GREML or LDSC is out of scope for this paper. As noted above, we have updated the discussion to acknowledge the limitations of the approach taken:

      ‘We found some genomic evidence of assortative mating for BMI, and cross-trait assortative mating between BMI and ADHD, but not between other traits. However, associations between polygenic scores, which only capture some of the genetic variation associated with these phenotypes, may not capture the full extent of genetic assortment on these traits.’
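      The reviewer's suggestion in comment 5 can be illustrated with a minimal sketch. It assumes primary phenotypic assortment, i.e. that partners resemble each other only through the two observed phenotypes, so that path tracing gives the expected correlation between partners' genetic values as the phenotypic correlation multiplied by the square roots of the two heritabilities. The same logic, with the variance explained by each polygenic score in place of heritability, gives the expected correlation between partners' scores. The function names and all numeric inputs below are hypothetical and are not estimates from the study.

```python
import math


def spousal_genetic_correlation(r_pheno: float, h2_x: float, h2_y: float) -> float:
    """Expected correlation between one partner's genetic value for trait X
    and the other partner's genetic value for trait Y.

    Assumption (not from the source): assortment acts only through the two
    phenotypes, so the path is genetic value -> phenotype -> partner's
    phenotype -> partner's genetic value.
    """
    if not -1.0 <= r_pheno <= 1.0:
        raise ValueError("phenotypic correlation must lie in [-1, 1]")
    if not (0.0 <= h2_x <= 1.0 and 0.0 <= h2_y <= 1.0):
        raise ValueError("heritabilities must lie in [0, 1]")
    return r_pheno * math.sqrt(h2_x * h2_y)


def expected_pgs_correlation(r_pheno: float, r2_pgs_x: float, r2_pgs_y: float) -> float:
    """Expected correlation between partners' polygenic scores under the same
    assumption, where r2_pgs_* is the phenotypic variance explained by each score."""
    return spousal_genetic_correlation(r_pheno, r2_pgs_x, r2_pgs_y)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the arithmetic.
    r = 0.20          # partner phenotypic correlation
    h2 = 0.40         # heritability of the trait
    r2_pgs = 0.05     # variance explained by the polygenic score
    print(spousal_genetic_correlation(r, h2, h2))
    print(expected_pgs_correlation(r, r2_pgs, r2_pgs))
```

      Because a polygenic score explains less variance than the full heritability, the expected score correlation is smaller than the expected genetic correlation, which is the attenuation both the reviewer and the authors refer to.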

      6) It would be helpful to include power calculations for the MR-Egger intercept estimates.

      As with our response to the comments above, post-hoc power calculations are redundant, as all the information about the power of our analysis, including for MR-Egger, is indicated by the standard errors and confidence intervals. MR-Egger is less precise than other estimators, as is made clear from the wide confidence intervals reported in the relevant tables (Appendix 1 - Tables 8 and 9). However, we have now updated the discussion to give more weight to this as a limitation. The discussion of pleiotropy in the final paragraph of the discussion now reads:

      ‘While robustness checks found little evidence of pleiotropy, these methods rely on assumptions. Moreover, MR-Egger is known to give imprecise estimates (Burgess and Thompson 2017), and confidence intervals from MR-Egger models were wide. Thus, pleiotropy cannot be ruled out.’

      Similarly, we have updated the relevant line of the results section, which now reads:

      ‘MR-Egger models found little evidence of horizontal pleiotropy, although MR-Egger estimates were imprecise (Appendix 1 - Tables 8 and 9).’

      7) Finally, what is the correlation between PGS and genetic PCs/geography in their sample? A correlation might provide evidence to support the point that classic MR effects are inflated due to stratification.

      Figures presenting the association of the child’s BMI polygenic scores and their PCs have been added to the supplementary information as Appendix 1 - Figure 2 and Appendix 1 - Figure 3. Consistent with an influence of residual stratification, a regression of the child’s BMI polygenic scores against their ancestry PCs (adjusting for genotyping centre and chip) found that 7 of the 20 PCs were associated at p<0.05 with the adult BMI PGS, and 8 of 20 with the childhood body size PGS (under the null hypothesis, we would expect one association in each case). When parental polygenic scores were added to the models, these associations attenuated towards the null.
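      The expectation of one association under the null follows from 20 tests at a 0.05 threshold. A minimal sketch of that arithmetic, together with the binomial probability of seeing at least 7 nominally significant PCs by chance, is given below. It assumes the 20 tests are independent, which is an assumption of this illustration rather than something stated in the response.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_pcs = 20      # number of ancestry PCs tested (from the response)
alpha = 0.05    # nominal significance threshold (from the response)

# Expected number of nominally significant PCs if none is truly associated.
expected_false_positives = n_pcs * alpha

# Probability of at least 7 (adult BMI PGS) or 8 (childhood body size PGS)
# nominal associations arising by chance, assuming independent tests.
p_at_least_7 = binomial_tail(7, n_pcs, alpha)
p_at_least_8 = binomial_tail(8, n_pcs, alpha)

if __name__ == "__main__":
    print(expected_false_positives, p_at_least_7, p_at_least_8)
```

      Under these assumptions both tail probabilities are well below 0.001, so 7 or 8 associations out of 20 would be very unlikely if the scores were unrelated to ancestry.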

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript shows that bone is resorbed during the early steps of limb regeneration in urodeles, and osteoclasts are required for this process. In case of impaired resorption, integration of newly-formed tissue with the original bone shaft is compromised. The manuscript further shows that wound epithelium is required for bone resorption and suggests that it induces osteoclastogenesis or migration of osteoclasts. Furthermore, the authors showed that the formation of novel skeletal elements is initiated while the resorption of the old one is still actively ongoing.

      The study is well designed, conclusions are relatively well supported, and data are presented in a clear way. Two new models of transgenic axolotls have been created. The strongest and most important finding is that partial bone resorption is required for tissue reintegration. My main concern is the novelty of this study, which is quite limited in my opinion.

      Specifically, resorption of bone stump during limb regeneration has been shown before in various model organisms.

      The role of osteoclasts in this process has not been well characterized in urodeles but has been shown during the regeneration of a mouse digit.

      It is reasonable to anticipate that similarly, osteoclasts are resorbing bone in salamanders, especially since this is the only cell type known for bone resorption.

      Thus, this observation, despite being nicely and thoroughly done, is of limited interest.

      The role of wound epithelium in bone histolysis is well demonstrated via skin flap experiments in this manuscript. However, upon skin flap surgery no limb regeneration occurs, implying wound epithelium is a key tissue triggering all the processes of limb regeneration. Accordingly, the absence of bone histolysis in such conditions can be secondary to the absence of any other part of the regenerative process, e.g., blastema formation, macrophage M1 to M2 transition, reinnervation, etc. The proposed link between wound epithelium and osteoclastogenesis (i.e., Sphk1, Ccl4, Mdka) is very superficial and very suggestive.

      No functional evidence was provided to confirm these connections. Finally, the authors showed that new bone formation occurs while resorption of the bone stump is still ongoing. This is a nice observation, but again, rather indirect as it is based on the dynamics of bone resorption and bone formation in different animals. Due to high variability among animals, direct evidence, like double staining for osteoclasts and blastema markers would address this point more precisely.

      We consider that our work provides evidence, for the first time, that skeletal resorption in early stages of regeneration has a durable impact by affecting tissue integration. We show that this process occurs in a short and conserved time, which provides a window of interest for comparative research with other models, and interventional therapies. To our knowledge, limb regeneration is studied mainly in amphibians, as they are the only established lab model with this ability. Some lizards, geckos and possibly iguanas, have been reported to regrow an appendage albeit lacking the regenerative fidelity amphibians have. In an established regeneration lab model, such as the axolotl, the study of regeneration-induced resorption has been scarce.

      During murine digit tip regeneration, osteoclasts are recruited to the amputation site and resorb the bone in a similar time frame as we show here in the axolotl. Ablating osteoclasts delays regeneration; however, no study has examined the impact on tissue integration. Additionally, a key difference between mouse digit and adult axolotl limb regeneration is that the new skeletal elements are built in fundamentally different ways: direct ossification (bone on top of bone) in mouse, versus endochondral ossification (cartilage on top of osteo-cartilage elements) in the axolotl limb. The tissue integration of the latter may present different challenges worth exploring to understand its regulation. What this work adds is a characterization of the temporal and cellular dynamics of regeneration-induced resorption, the interaction of osteoclasts with skeletal cells and, lastly, the impact on tissue integration.

      Based on previous studies in mammals, it is reasonable to anticipate the presence and role of osteoclasts in salamanders. However, the growing body of work in the field, as well as our own work in the axolotl, have shown that extrapolations of mammalian skeletal biology to other species come with their risks.

      We agree that the role of the wound epithelium (WE) in skeletal histolysis will require further and extensive work. The evidence shown here provides a glimpse of the complex response and crosstalk of the WE with the tissue underneath, and we hypothesize this response is tailored to the tissue composition exposed during the injury.

      Finally, following the reviewer’s advice, we have conducted new experiments to prove the temporal connection between skeletal resorption and regeneration, showing that these processes occur simultaneously.

      Reviewer #3 (Public Review):

      This study outlines the role of osteoclast-mediated resorption in integrating the skeletal elements during limb regeneration, using axolotls that can regenerate the entire limb upon amputation. Using calcium-binding vital dyes (calcein and alizarin red), the authors first demonstrated that a large portion of amputated skeletal elements is resorbed prior to blastema formation. They further show that 1) inhibiting bone resorption by zoledronic acid impairs proper integration of the pre-existing and regenerating skeletal elements, 2) removing the wound epithelium using the full skin flap surgery inhibits bone resorption, and 3) bone resorption and blastema formation are correlated. The authors reached the major conclusion that bone resorption is essential for successful skeletal regeneration. Notably, this study applies a well-established and elegant axolotl limb regeneration model and transgenic reporter strains to reveal the potential roles of resorption in limb regeneration.

      Strengths:

      1. The authors utilized a well-established axolotl limb regeneration model and applied elegant vital mineral dyes and transgenic reporter lines for sequential in vivo imaging. The authors also provided quantitative assessment by examining multiple animals, particularly in the early sections, ensuring the rigor and the reproducibility of the study.

      2. The authors further performed important interventions that can impinge upon successful limb regeneration, including inhibition of bone resorption by zoledronic acid and impairment of the wound epithelium by full skin flap surgery. These procedures gave rise to useful insights into the relationship between bone resorption and successful limb regeneration.

      3. The imaging presented in this manuscript is of exceptionally high quality.

      Weaknesses:

      1. Despite the high quality of the work, many analyses in this study are incomplete, making it insufficient to support the major conclusion. For example, in Figure 4, the authors did not provide any quantitative assessment to show how zol affects the integration of the skeletal elements (angulation?), which seems to be essential for supporting the conclusion. Likewise in Figure 7, the analyses of EdU+ cells and Sox9 reporter expression were not included in zol-treated animals. Similarly in Figure 5, quantification of osteoclasts was not performed with the full skin flap surgery group. Analyses of only normally regenerated animals are not sufficient to support many of the conclusions.

      2. The phenotype of zol-treated animals in limb regeneration is somewhat disappointing. Although zol-treated animals show decreased blastema formation and unresorbed pre-existing skeletal elements, limb regeneration still occurs and the only phenotype is a relatively minor defect in skeletal integration. It is possible that zol-induced defect in blastema formation is not directly linked to the failure of integration at a later stage. I find this “weakness” a bit subjective.

      3. As an integration failure of the newly formed skeleton still occurs in untreated animals, it is not entirely clear how the authors can attribute this defect to a lack of bone resorption. More quantitative analyses would be necessary to demonstrate the correlation between zol treatment and lack of integration.

      Taking into consideration the reviewer’s concerns, we have improved our analysis of the integration phenotype. Integration success was assessed using a score matrix, which allowed us to correlate the extent of resorption with integration efficiency more accurately. We believe our results provide sufficient evidence to support this correlation.

      When we first saw the phenotype of zol-treated animals, we were far from disappointed; we were actually intrigued that we could observe a significant failure in tissue integration after removing the function of osteoclasts in an early phase of regeneration. All-or-nothing results are exciting, but subtle results can prove more informative, and we think this is the case here. Our treatment does not inhibit regeneration but disrupts tissue integration, opening another fascinating aspect of regeneration: how is old tissue capable of functionally integrating newly formed tissue?

      The integration phenotypes observed in the un-resorbed limbs do not resemble anything reported in the field so far. Moreover, the range of phenotypes observed led us to better determine their correlation with resorption. Importantly, the presence of integration failures in untreated animals allowed us to look into ECM organization at this old-new tissue interface, while highlighting the normal occurrence of imperfect regeneration in the axolotl limb.

      Finally, we have included new results to complement the conclusions presented at the end of our work. Although we observed differences in blastema size in zol-treated animals, we did not observe a difference in the number of EdU+ cells, which reveals that the skeleton cannot be used as a reference for assessing blastema location. This conclusion is complemented by our in vivo assays, in which we observed condensation of cartilage while resorption was still occurring. We consider our conclusions to be justified and supported by the assays presented in our work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01594

      Corresponding authors: Hidehiko Kawai and Hiroyuki Kamiya

      1. General Statements [optional]

      We would like to extend our gratitude to the Editor and both Reviewers for their constructive and insightful comments on our manuscript. We deeply appreciate the Reviewers’ careful consideration of our work, as a result of which we think the paper has greatly improved. Below, we have responded to all points raised by the Reviewers.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The analysis of mutations in mammalian, including human, genomes has been of interest for many decades. Early DNA sequencing technologies enabled direct identification of mutations in target genes provided that the mutant genes could be readily isolated. This requirement stimulated the development of shuttle vector plasmids that carried a mutation marker gene and could replicate in both mammalian and bacterial cells. These were used in experiments in which the plasmids, treated with a mutagen, would be passaged through mammalian host cells after which the progeny plasmids were introduced into an indicator bacterial strain. Colonies with mutant marker genes could be distinguished by color or survival, the plasmids recovered, and the sequence of the mutant gene determined. The shuttle vector plasmid that became the most widely used contained as the marker the supF amber suppressor tyrosyl tRNA gene positioned in the plasmid such that deletion mutations associated with mammalian cell transfection were selected against. Although various improvements have been introduced since its introduction in the mid-1980s, including bar codes to distinguish independent from sibling mutations (in the early 1990s), the basics of the system have been maintained, and it and variations are still in use. The Kamiya group has made several adjustments to the supF shuttle vectors, including the construction of indicator bacterial strains based on survival of bacteria containing mutant supF genes (the initial system relied on colony color). They have published many studies of mutagenesis by various agents, error prone polymerases, etc. In the current submission they describe a comprehensive approach to identifying mutations in the supF gene that exploits Next Generation Sequencing technology that can identify the full spectrum of mutations including those that escape detection in phenotypic screens. 
The study is exhaustive and presents a methodical validation of each component of their approach. They report UV induced mutations, the mechanism of which has been well characterized in previous literature. They also describe a category of multiple mutations, which had been observed in the early work with the supF plasmids, and whose relationship to UV photoproducts is most likely indirect.

      We thank the Reviewer for their very insightful feedback on our manuscript and their positive assessment. We have added some discussion points based on the essential references mentioned in the Reviewer’s comments, which we believe make the explanation of our study more complete.

      Major comments: This manuscript presents a technical advance on the use of the supF mutation reporter system. The extent of the validation of each component of the system, including the bar code is rigorous. Their data on the nature and location of UV induced mutations are in very good agreement with previous studies with supF and other reporter genes, a further validation of their approach. Their discussion of the mechanism of the UV induced mutations is in accord with prior work from other laboratories. However, their interpretation of the multiple mutations, although reasonable in invoking a role for APOBEC deamination of cytosines (see eLife. 2014; 3: e02001 for another discussion of this issue), overlooks a much earlier study on the same topic that showed that nicks in the vicinity of the marker gene are mutagenic and can induce multiple mutations (Proc Natl Acad Sci 1987 84:4944-8). It would be useful for the authors to consider their data on the multiple mutations in the light of the earlier analysis. Furthermore, a check to verify the covalently closed circular integrity of the plasmid preparations would be an important quality control and could reduce the mutagenesis observed in 0 UV controls.

      We thank the Reviewer for the valuable comments that made our manuscript clearer and more emphatic. We are hereby addressing all of the Reviewer’s concerns. The available data accumulated from previous studies have proved the high sensitivity of the supF assay as a mutagenesis assay, which now has been clearly supported by the results in the current study. We believe that this NGS assay will be able to fulfil the data requirements to tackle many questions related to mutagenesis, thanks to the simplicity and cost-effectiveness of the procedure. However, to meet the experimental objectives, the preparation and analysis of the library are crucially important procedures in the stages of initial setting up of the assay. The covalently closed circular integrity of the vector library is definitely one of the important points we should pay attention to when performing this assay. After the construction of the BC12-library, we have to check the quality of the library by agarose gel electrophoresis. The background mutation frequency and the sequence of the library itself (uploaded as described in the DATA AVAILABILITY section of this manuscript) also needs to be analyzed by NGS before the experiment. We are also routinely constructing the double-stranded shuttle vector from a single-stranded circular DNA with a variety of site-specific damaged oligonucleotides. The treatment with T5 exonuclease followed by purification is absolutely essential to decrease the background mutation frequency. Without the treatment with the exonuclease, cluster mutations may be increased under specific experimental conditions. For this study, we carried out the conventional supF assay using the BC12-library purified after T5 exonuclease treatment. 
However, in this case the process of purification slightly increased the mutant frequency of the BC12-library to about 2 × 10⁻⁴ (corresponding to 1 × 10⁻⁶/bp). Therefore, when setting up the assay, we have to consider the background control that we will need for the data analysis. In response to the Reviewer’s comments, we have now added the following paragraph in the DISCUSSION section:

      Page 16, line 25:

      ”5) For the supF assay, spontaneous cluster mutations at TC:GA sites were often observed, and it was well illustrated in an earlier study that a nick in the shuttle vector was a trigger for these asymmetric cluster mutations (54). Therefore, we need to be aware of the quality of each library and how it affects the outcome of each analysis, especially for detection of very low levels of mutations. Depending on the purpose of the experiments, in the preparation of covalently closed circular vector libraries it is essential to eliminate the background level of mutations. In fact, the in vitro construction of the library of double-stranded shuttle vectors from single-stranded circular DNA requires the process of treatment with T5 Exonuclease, which drastically decreases background mutations.”

      Minor points: The authors state that only 30% of the base sequence of the supF gene can be "used for dual-antibiotic selection on the indicator E. coli". An earlier review (Mutation Res 220: 61, 1989) indicated that within the mature tRNA region single or tandem mutations had been reported at 87% of sites, using the colony color assay. The direct NGS analyses would be indifferent to phenotype, and one would expect the maximum number of mutable sites would be recovered from this approach. It would be helpful for an explicit statement regarding the number of mutant sites to be in the Discussion, as this should strengthen the case for the NGS strategy.

      We thank the Reviewer for the helpful comment. These are important points we should indeed mention. This method will complement previous data, and especially the data from titer plates will provide us with non-biased mutation spectra for the whole analyzed region. We have now explained in detail the coverage of mutation spectra in the DISCUSSION section.

      Page 14, line 14:

      The mutation spectra of single or tandem base-substitutions for inactive supF genes identified by using the blue-white colony color assays were comprehensively summarized in an earlier review article, and it was noted that the mutations were detected at 86 sites within a 158-bp region covering the supF gene (54%) and at 74 sites within the 85-bp mature tRNA region (87%), thus demonstrating the great sensitivity of the supF assay system for analysis of mutation spectra (19). However, obtaining reliable datasets by the conventional supF assay requires skill and experience, especially for studies where the mutations of interest are induced with low frequency. The method has been advanced by the construction of indicator bacterial strains with different supF reporter genes which allow selection based on survival of bacteria containing mutant supF genes. However, the fact that the supF phenotypic selection process relies on the structure and function of transfer RNAs that may be differently affected by different mutations means that the improvement of the efficiency of the selection process may cause loss of coverage of the mutation spectra, as it is under our experimental conditions, where the coverage is about 30% (19,20).”

      Page 15, line 4:

      From this point of view, we believe that we can secure a sufficient number of experiments to improve the accuracy of the analysis and to confirm the reproducibility of the experiments. Furthermore, the data from colonies grown on titer plates provides us, at least in principle, and with the exception of large deletions and insertions, with non-biased mutation spectra for the whole analyzed region.

      Supplementary Figure 1 shows the organization of 8 supF reporter plasmids. Were these discussed in the text and employed in the experiments? It was not clear in the text.

      We thank the Reviewer for the helpful comment. It was indeed not clear which vectors we used and why we constructed a series of vectors. Now, we have added the vectors we used for the constructions of the library and each experiment in the RESULTS and MATERIALS AND METHODS sections. Since this is quite important for us and, we believe, the readers, we also added the explanations in the DISCUSSION section, detailing why we have constructed a series of shuttle vectors, as follows:

      Page 19, line 36:

      Mutational signatures identified in cancer cells are emerging as valuable markers for cancer diagnosis and therapeutics. Innumerable physical, chemical and biological mutagens, including anticancer drugs, induce characteristic mutations in genomic DNA via specific mutagenic processes. The mutation spectra obtained here by using the presented advanced method were in good agreement with accumulated data from previous papers where the conventional method had been used, with the advantage that our method provided less-biased mutation spectra data. As described above, the datasets presented here highlighted novel mutational signatures and also cluster mutations with a strand-bias, which could be associated with the processes of replication, transcription, or repair of DNA-damage, including a single strand break (a nick). In this study, eight series of supF shuttle vector plasmids were constructed, as presented in Supplementary Figure S1; however, the analysis was carried out using N12-BC libraries prepared from either pNGS2-K1 (Figures 1-4) or pNGS2-K3 (Figures 5-10). The pNGS2-K1/-A1/-K4/-A4 and pNGS2-K2/-A2/-K3/-A3 vector series contain an M13 intergenic region with opposite orientations relative to the supF gene, which allows us to incorporate specific types of DNA-damage at specific sites in the opposite strand of the vector library. Also, the pNGS2-K1/-A1/-K3/-A3 and pNGS2-K2/-A2/-K4/-A4 vector series contain the SV40 replication origin, which enables bidirectional replication and transcription, at opposite sides of the supF gene. Although this is still preliminary data, it is notable that the spontaneously induced mutations for the different vectors in U2OS cells were not significantly different. Therefore, the here presented mutagenesis assay with NGS, by using these series of libraries, can be applied in many different types of experiments to address both quantitative and qualitative features of mutagenesis. 
It is possible to design series of libraries containing DNA lesions or sequences suitable for the investigation of specific molecular mechanisms, such as TLS, template switching, and asymmetric cluster mutations.”

      CROSS-CONSULTATION COMMENTS: Comment on the issue raised by Reviewer #2 regarding plasmids with unrepaired DNA damage introduced into E. coli after passage through U2OS cells: treatment of the plasmid harvest with Dpn1 eliminates un-replicated plasmid DNA. Also, SV40 T antigen drives runaway replication of the plasmids, which contain the SV40 origin of replication. This greatly dilutes plasmids with remaining UV photoproducts.

      Reviewer #1 (Significance (Required)):

      Significance This is a comprehensive description of a technical advance for the analysis of mutations based on the most widely used system for reporting mutations in mammalian, including human, cells. As costs for NGS decline it is likely to become the approach of choice.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors developed a novel mutagenesis assay by combining the conventional supF forward mutagenesis assay with NGS technology. The manuscript is well written, providing the design, methods, and results of the experimental system in great detail, which this reviewer values highly. However, the manuscript may be too long and could be more concise. In addition, this reviewer is concerned that the main figures may be difficult to fit on printed pages (especially large multi-paneled figures, such as Fig. 5 through 8). The authors should re-organize the figures by reducing their size and/or moving parts of them to the supplementary information.

      We thank the Reviewer for the helpful comments on our manuscript. It is true that the multi-paneled figures were too large, and we have now re-analyzed and optimized most of the figures by reducing their size, transferring panels to Supplementary Figures, and separating one figure into two. Although the number of Figures and Supplementary Figures has now increased, we believe that the manuscript has become easier for readers to follow and fits printed pages. We considered carefully the Reviewer’s remark about the length of the manuscript, but we feel that the text was already as concise as we could make it, and we have already left out some more detailed explanations.

      1. Some UV-induced DNA damage (typically CPD) is repaired only slowly in human cells, so that the replicated plasmid DNAs recovered from U2OS cells may still contain damage and possibly induce mutations in E. coli after transfection. Given the high sensitivity of NGS analysis, there is a concern that such mutations could also be included in the results. To obtain even more accurate mutational characteristics in mammalian cells, the authors could consider treating the DNA samples with photolyases before transformation of E. coli. The authors could consider discussing this point.

      We thank the Reviewer for the helpful comment. Indeed, Dpn I treatment is one of the most important procedures for avoiding analysis bias. We have now expanded the explanation of why the libraries have to be treated with Dpn I, as follows:

      Page 11, line 4:

      the libraries were extracted from the cells, and treated with dam-GmATC-methylated DNA specific restriction enzyme Dpn I to digest un-replicated DNAs that contain UV-photoproducts.”

      1. It is quite intriguing that multiple mutations in a single BC clone tend to occur in the same DNA strand. Is there any trend in a distance between the mutated sites? Considering participation of TLS polymerases in the first round of replication, it may be interesting if multiple DNA lesions occur in relatively close positions so that TLS polymerases elongate the DNA strand without switching back to replicative polymerases.

      We thank the Reviewer for the valuable and insightful suggestions for this assay. We have analyzed the positions of SNSs in multiple mutations, as shown in Supplementary Figures S11 and S12. As the Reviewer mentioned, we may be able to address the mechanisms of TLS switching in mammalian cells by using this assay. In this study, the number of non-biased mutation spectra obtained for multiple mutations may not be sufficient for statistical analysis, but our results indicate that multiple mutations were induced at relatively close positions. It would be interesting if we could address the mechanisms of TLS polymerase switching. We believe that the accumulation of large numbers of non-biased mutation spectra will provide us with growing opportunities to address more questions in mutagenesis. We have now added Supplementary Figures S11 and S12, as well as the following discussion points:

      Page 14, line 6:

      “5) The distance between two SNSs in multiple mutations induced by UV irradiation was shorter than theoretically expected based on the sequence (Supplementary Figures S11 and S12).”

      Page 18, line 27:

      “In addition, the positions of SNSs in the multiple mutations were closer to each other than the theoretically expected positions (Supplementary Figures S11 and S12), which may reflect switching events involving TLS polymerases. It should be noted that the presented data for the distance between two SNSs in the multiple mutations were derived from selection plates in order to secure a sufficient number of mutations, and therefore, there may be a bias due to hot spots associated with the selection process. However, the results from the limited number of mutations from the titer plates are similar to those from the selection plates. It can be proposed that this assay may also be applied for analysis of TLS polymerases in mammalian cells.”

      1. This reviewer is wondering whether the results of mammalian cells are influenced by transcription-coupled repair in this experimental system. Because the SV40 replication origin functions as bidirectional promoters, the supF region may be transcribed in U2OS cells so that DNA damage on transcribed strands may be removed more efficiently than non-transcribed strands. Please comment on this, if relevant.

      We thank the Reviewer for the insightful comments. This issue is also very important and interesting, and should be addressed in mutagenesis research. That is exactly the reason why we presented a series of vectors for the assay in this paper. The SV40 replication origin has an effect on the background mutations, which is also dependent on the experimental conditions. However, this needs to be confirmed by further studies. We hope the idea behind these constructs will be helpful for many laboratories. We have now added the following parts in the DISCUSSION section.

      Page 18, line 36:

      “Mutational signatures identified in cancer cells are emerging as valuable markers for cancer diagnosis and therapeutics. Innumerable physical, chemical and biological mutagens, including anticancer drugs, induce characteristic mutations in genomic DNA via specific mutagenic processes. The mutation spectra obtained here by using the presented advanced method were in good agreement with accumulated data from previous papers where the conventional method had been used, with the advantage that our method provided less-biased mutation spectra data. As described above, the datasets presented here highlighted novel mutational signatures and also cluster mutations with a strand-bias, which could be associated with the processes of replication, transcription, or repair of DNA-damage, including a single strand break (a nick). In this study, eight series of supF shuttle vector plasmids were constructed, as presented in Supplementary Figure S1; however, the analysis was carried out using N12-BC libraries prepared from either pNGS2-K1 (Figures 1-4) or pNGS2-K3 (Figures 5-10). The pNGS2-K1/-A1/-K4/-A4 and pNGS2-K2/-A2/-K3/-A3 vector series contain an M13 intergenic region with opposite orientations relative to the supF gene, which allows us to incorporate specific types of DNA-damage at specific sites in the opposite strand of the vector library. Also, the pNGS2-K1/-A1/-K3/-A3 and pNGS2-K2/-A2/-K4/-A4 vector series contain the SV40 replication origin, which enables bidirectional replication and transcription, at opposite sides of the supF gene. Although this is still preliminary data, it is notable that the spontaneously induced mutations for the different vectors in U2OS cells were not significantly different. Therefore, the mutagenesis assay with NGS presented here, by using these series of libraries, can be applied in many different types of experiments to address both quantitative and qualitative features of mutagenesis. It is possible to design series of libraries containing DNA lesions or sequences suitable for the investigation of specific molecular mechanisms, such as TLS, template switching, and asymmetric cluster mutations.”

      1. page 13: Please check whether the description of Fig. 9C is correct (6th line, graph on top; 9th line, bottom graph).

      We thank the Reviewer for carefully checking our manuscript; it was mislabeled in the text. Following the Reviewer’s comments, most figures have now been changed from those in the previous submission. We appreciate the careful review.

      CROSS-CONSULTATION COMMENTS

      Reviewer #1 gives quite relevant comments as an expert in the mutagenesis field. The manuscript would be greatly improved if the authors made appropriate modifications according to his/her suggestions.

      Reviewer #2 (Significance (Required)):

      It is quite convincing that this method has a great potential to give much more extensive information on mutational characteristics, most importantly, by eliminating the bias caused by phenotypic selection. Therefore, this work certainly must be worth being published in an appropriate journal.

    1. Aur ore ara, ~ spu8 dur ‘parapamur skoq Aj paysitres 10 peap uarpyi Ay “vanowdH © SRPOTL “SPDaq9 SAPPARLL [SILL ‘suodurey, -ouasdy, ‘asedmpooy

      This really illustrates the disconnect between what provokes stress or despair between the two worlds. Privilege check. SO THE BEGINNING PARTS ARE CREATING DISTANCE. I think the later parts will make us see that while we live two different lives, we live right next to each other. And nothing is as far as it may seem.

    1. Just as some friendly debunkers were able to build connections with believers, addressing the issues of false conspiracy theories may require us to see our similarities and common concerns instead of focusing solely on our differences in belief. This challenge also requires us to see the problem from a socio-technical perspective by treating the technologies and the social relationships on the internet together as an organic whole. We hope our research contribute to the understanding of conspiracy believers and their belief changing process, and shed light on how we may better facilitate people in making sense of online information

      Made me think of flat-earth documentary on Netflix.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1:

      Review of "Identifying novel regulators of placental development using time series transcriptomic data and network analyses."

      The authors present a detailed bioinformatic assessment of mouse developmental time series of the placenta. They apply current data mining and analysis methods to identify protein-centred networks that are likely enriched to specific cell types of the placenta. They then translate these findings to humans using statistical comparisons of human single-cell sequencing data of the placenta. Lastly, they use knock-down experiments to validate the conserved functional importance of the hub genes in the mouse protein networks in human cells.

      The strengths of this paper are the rigorous data mining methods and the functional translation to humans from mice. There are no critical weaknesses to the article. There is a blend of statistical analysis with anecdotal or hand curation from databases and the literature, but it is unclear if these curated findings are circumstantial or statistically meaningful. In the end, the hypothesis seems to hold in that 4/4 genes knocked down in the human cells gave a migration phenotype.

      Comments, questions, critique:

      1. Given the translational aims of the paper, more introduction/discussion material on the comparative aspects of mice and humans is needed. Are giant cells and EVT the same? What are the cell equivalents that you are discovering? The Soncin et al. paper is cited, but I think underused. This publication contains time series data on mice and humans and could be used as external validation of clusters, networks, and other analyses. Other publications to consider for context are:

      2. Cox B, et al. Mol Syst Biol 5: 279.

      3. Silva JF, Serakides R. 2016. Cell Adhes Migr 10: 88-110. (specifically discusses migration difference between the species placentae)

      We thank the reviewer for the comment and valuable resources. We agree that more information about the similarities and differences between the migratory cells needs to be provided. We have added the following details in the introduction of the manuscript:

      “Although there are certain differences between the mouse and human placenta (Hemberger, Hanna, and Dean 2020; Soncin, Natale, and Parast 2015), they do express common genes during gestation, including common regulators and signaling pathways involved in placental development (Cox et al. 2009; Soncin et al. 2018; Soncin, Natale, and Parast 2015; Watson and Cross 2005). For example, Ascl2/ASCL2 and Tfap2c/TFAP2C are required for the trophoblast (TB) cell lineage in both mouse and human models (Guillemot et al. 1994; Kuckenberg, Kubaczka, and Schorle 2012; Varberg et al. 2021). Another example is the HIF signaling pathway, which regulates TB differentiation in both mouse and human placenta (Soncin, Natale, and Parast 2015).”

      “Although the structure of the placenta is not identical between mouse and human, certain mouse placental cell types are thought to be equivalent to human placental cell types (Soncin, Natale, and Parast 2015). For example, parietal TGCs and glycogen TBs have been described as equivalent to human extravillous trophoblasts (EVTs) (Soncin, Natale, and Parast 2015). Mouse TGCs are not as invasive as human EVTs (Soncin, Natale, and Parast 2015), and they have different levels of polyploidy and copy number variation (Morey et al. 2021); however, both EVTs and TGCs are able to degrade extracellular matrix to enable TB migration into the decidua (Silva and Serakides 2016).”

      Added to discussion:

      “These genes were selected primarily based on the network analyses, but also based on expression data from human cells to account for possible differences between mouse and human placental gene expression.”

      As the reviewer suggested, we used the Soncin et al., 2015 data for validation. Only 6,317 of the 11,713 protein-coding genes used for hierarchical clustering were detected in the mouse dataset in Soncin et al., 2015. This issue could be because the Soncin data was generated using microarrays.

      Nevertheless, we still compared our e7.5 and e9.5 hierarchical groups with: (1) Soncin et al. gene clusters in mouse that were downregulated over time, had highest expression from e9.5-12.5, or were upregulated over time; and (2) Soncin et al. gene clusters in human that were best correlated with mouse clusters and were either downregulated over time or upregulated over time. We observed a general consensus that our e7.5-hierarchical group had the highest percent of agreement with Soncin et al. gene groups that are downregulated over time, and our e9.5-hierarchical group had the highest percent of agreement with Soncin et al. gene groups that either have highest expression at e9.5-e12.5 or genes that are upregulated over time. This data is added below, described in the results section 1, and included in Supplementary Table S1.

      Comparison with Soncin et al. mouse data:

      | Soncin et al. clusters | Having expression > 0 (in Soncin et al.) and being in any hierarchical clusters | E7.5-hierarchical genes (down-regulation trend) | E9.5-hierarchical genes (up-regulation trend) |
      | --- | --- | --- | --- |
      | Cluster 2, 3 and 7 (Soncin et al., downregulation trend) | 1009 | 800 (79.3%) | 279 (27.7%) |
      | Cluster 6 (Soncin et al., highest at e9.5 – e12.5) | 120 | 51 (42.5%) | 110 (91.7%) |
      | Cluster 1, 4 and 5 (Soncin et al., upregulation trend) | 1019 | 415 (40.7%) | 881 (86.5%) |

      Comparison with Soncin et al. human data:

      | Soncin et al. clusters | Having expression > 0 (in Soncin et al.) and being in any hierarchical clusters | E7.5-hierarchical genes (down-regulation trend) | E9.5-hierarchical genes (up-regulation trend) |
      | --- | --- | --- | --- |
      | HS Cluster 5 (Soncin et al., downregulation trend) | 164 | 92 (56.1%) | 52 (31.7%) |
      | HS Cluster 2 and 4 (Soncin et al., upregulation trend) | 111 | 44 (39.6%) | 72 (64.9%) |

      The following statement was added to the result section:

      “Second, we compared our hierarchical groups with previously published mouse and human placental microarray time course data from Soncin et al., 2015 (Soncin, Natale, and Parast 2015). Despite the technical differences between the datasets, we observed a consensus that our e7.5 hierarchical cluster had the highest percent of overlap with Soncin et al. gene groups that are downregulated over time, and our e9.5 hierarchical cluster had the highest percent of overlap with Soncin et al. gene groups that either have highest expression at e9.5 - e12.5 or genes that are upregulated over time (Supplementary Table S1).”
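
      The percentages reported in the comparison with the Soncin et al. mouse data are simply the overlap counts divided by the number of genes in each Soncin et al. cluster group that were also present in any hierarchical cluster. The following Python sketch reproduces that arithmetic from the counts given in this response. The function name `percent_overlap` and the dictionary layout are illustrative assumptions and are not taken from the authors' pipeline.

```python
def percent_overlap(n_overlap: int, n_total: int) -> float:
    """Percent of a reference gene group that overlaps a hierarchical group."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return round(100 * n_overlap / n_total, 1)


# Counts reported in this response for the mouse comparison:
# (genes in group, overlap with e7.5 group, overlap with e9.5 group)
mouse_counts = {
    "Clusters 2, 3 and 7 (downregulated)": (1009, 800, 279),
    "Cluster 6 (highest at e9.5-e12.5)": (120, 51, 110),
    "Clusters 1, 4 and 5 (upregulated)": (1019, 415, 881),
}

mouse_percentages = {
    label: (percent_overlap(e75, total), percent_overlap(e95, total))
    for label, (total, e75, e95) in mouse_counts.items()
}

for label, (pct_e75, pct_e95) in mouse_percentages.items():
    print(f"{label}: e7.5 {pct_e75}%, e9.5 {pct_e95}%")
```

      The two percentages in a row need not sum to 100%, because the two hierarchical groups are compared with each Soncin et al. group independently.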

      Was the clustering represented in Figure 1B a supervised model? Why only three clusters? Did you specify that there would be three models and force each gene profile into one of the categories? How robust are the fits? A fitted model might be a better approach as you can specify the ideal models (early high, late high and mid-high), then determine each gene profile that fits each model and only assess those genes with a significant fit to the model. Forcing clustering to the three-model fit likely gives many poorly fitting profiles. While in the end, this works out, it may be due to applying other post hoc methods for gene enrichment, where noise distributes randomly.

      We carried out unsupervised transcript clustering using hierarchical clustering (agglomerative approach using Euclidean distance and complete linkage). The resulting dendrogram was cut at the second highest level to obtain three clusters. We have added additional validation with different numbers of clusters (k = 3, 4 and 5) and quantification of agreement between different clustering methods to show the robustness of the hierarchical clusters. We acknowledge that hierarchical clustering could be sensitive to noise and could result in poorly fitted transcripts in each group; however, it was a necessary first step for us to identify genes relevant to the distinct placental processes at the three timepoints. Acknowledging this disadvantage, we only focused the analyses on genes that are differentially expressed over time and were present in the timepoint hierarchical groups.
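
      As an illustration of the clustering described here (agglomerative, Euclidean distance, complete linkage, three clusters), the following is a minimal pure-Python sketch. It is not the authors' code: the function name `complete_linkage_clusters` and the toy expression profiles are hypothetical, and a real analysis of thousands of transcripts would use an optimized library implementation instead of this naive version.

```python
import math
from itertools import combinations


def complete_linkage_clusters(profiles, n_clusters=3):
    """Agglomerative clustering with Euclidean distance and complete linkage.

    Repeatedly merges the two closest clusters until n_clusters remain,
    which corresponds to cutting the dendrogram where it has n_clusters
    branches. Returns a sorted list of clusters, each a sorted list of
    row indices into `profiles`.
    """
    if not 1 <= n_clusters <= len(profiles):
        raise ValueError("n_clusters must be between 1 and len(profiles)")
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(profiles[i], profiles[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical expression profiles across e7.5, e8.5 and e9.5.
toy_profiles = [
    (9.0, 5.0, 1.0),  # high early
    (8.5, 4.5, 1.5),  # high early
    (1.0, 9.0, 1.0),  # high mid
    (1.5, 8.0, 2.0),  # high mid
    (1.0, 5.0, 9.0),  # high late
    (2.0, 4.0, 8.5),  # high late
]
toy_clusters = complete_linkage_clusters(toy_profiles, n_clusters=3)
print(toy_clusters)
```

      On these toy profiles the early-high, mid-high and late-high rows fall into separate clusters; noisier data would not separate as cleanly, which is why the additional validation with other algorithms matters.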

      We added the additional analysis as Supplementary Figure S1, and the following statements were added in the results section:

      "First, we used three different algorithms, K-means clustering, self-organizing maps, and spectral clustering, to validate the trends of the expression levels in hierarchical groups, as well as the number of transcript groups (k = 3, 4 and 5). Only with k = 3 did we obtain groups with median expression level trends consistent in all four algorithms (Supplementary Figure S1). Moreover, with k = 3, the maximum percent of agreement (see Materials and Methods) between hierarchical clusters and clusters obtained using each of the different algorithms was 70.34-87.26% (Supplementary Figure S1), while the maximum percent of agreement between hierarchical clusters and clusters obtained from other algorithms decreases to between 55.67-65.72% with k = 4 and 54.81-59.19% with k = 5.”

      We agree model-based clustering could be an alternative approach and have added it to the discussion section:

      “Combining hierarchical clustering with differential expression analysis, we were able to identify gene groups using an unsupervised approach. It has also been shown that for time-series analyses with fewer than eight timepoints, pairwise differential expression analysis combined with additional methods identifies a more robust set of genes (Spies et al. 2019). Alternatively, model-based clustering using RNA-seq profiles (Si et al. 2014) could also be useful for gene group identification. However, it is still important to evaluate the robustness and functional relevance of the fitted models by carrying out additional downstream analyses.”

      Several statements are made about the conservation of importance between mouse and human hub genes. For example, "We predict these highly expressed genes to be generally important for TB function and processes such as cell migration, a term associated with multiple timepoint specific networks (Figure 2A)." While your knock-down assay of migration results shows these hub genes to be necessary to humans, what do they mean to the mouse? You did not use mouse TSC to assess functional importance concurrently. You note a small number of genes as of known importance, "127 hub genes of which 16 have been annotated as having a role in placental development". Were the others knocked out but lack a developmental phenotype or not assessed? Are these functionally redundant in the mouse or not involved in the same processes between the species?

      To assess the possible role of hub genes in mouse development more comprehensively, we extended our search for gene functions on the Mouse Genome Informatics (MGI) database to include not only placenta related GO and MGI phenotype terms (defined as “genes with known roles”), but also embryo related GO and MGI phenotype terms (defined as “genes with possible roles”). We included embryo related terms as “genes with possible roles” because embryonic lethal mouse knockout lines frequently have placentation defects, and because defects in placental development can be associated with the development of other embryonic tissues (Brown and Hay 2016; Perez-Garcia et al. 2018; Woods, Perez-garcia, and Hemberger 2018). This change resulted in an increase in the number of genes with relevant functions in mouse, including several annotated as embryonic lethal or with abnormal embryonic growth (see Supplementary Table S6). With the additional annotations:

      • 6 out of 17 hub genes of e7.5 networks have known/possible roles.
      • 17 out of 28 hub genes of e8.5 networks have known/possible roles.
      • 48 out of 127 hub genes of e9.5 networks have known/possible roles.

      We also carried out randomization tests to determine whether the number of known/possible genes we identified was significant. The procedure was as follows: for each timepoint, we sampled 10,000 random gene sets from the respective timepoint-specific group, each the same size as the number of hub genes. We then counted the number of known/possible genes in each random set. The p-value was calculated as the number of random gene sets containing at least as many known/possible genes as the observed number, divided by 10,000. We found that the number of genes with known/possible roles at each timepoint is statistically significant (Supplementary Figure S3). This result indicates that the gene sets we identified are significantly associated with relevant phenotypes in mouse.
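
      The randomization test described in this response can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `randomization_p_value`, the fixed random seed, and the toy background of 1,000 placeholder gene IDs (100 of them annotated) are hypothetical. Only the hub-gene count of 17 and the observed count of 6 come from the e7.5 numbers in the response.

```python
import random


def randomization_p_value(background_genes, annotated_genes, n_hubs,
                          observed_count, n_iter=10_000, seed=0):
    """Empirical p-value for enrichment of annotated genes among hub genes.

    Draws n_iter random gene sets of size n_hubs from the background and
    returns the fraction containing at least observed_count annotated genes.
    """
    background = sorted(set(background_genes))
    annotated = set(annotated_genes)
    if n_hubs > len(background):
        raise ValueError("n_hubs cannot exceed the background size")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(background, n_hubs)  # sampling without replacement
        if sum(gene in annotated for gene in sample) >= observed_count:
            hits += 1
    return hits / n_iter


# Hypothetical background: 1,000 placeholder gene IDs, 100 of them annotated
# as known/possible. Hub-gene count (17) and observed count (6) follow the
# e7.5 numbers reported in the response.
toy_background = [f"gene{i}" for i in range(1000)]
toy_annotated = toy_background[:100]
p_toy = randomization_p_value(toy_background, toy_annotated,
                              n_hubs=17, observed_count=6)
print(f"empirical p-value: {p_toy}")
```

      With 10,000 iterations the smallest non-zero p-value this procedure can report is 1/10,000, so a result of 0 is best read as p < 0.0001.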

      The remaining hub genes are unannotated as related to placental or embryonic functions in the MGI database. Based on that, it is difficult to determine if they lack a relevant phenotype, or if there has not been a detailed assessment of the placenta.

      Added to section 2 of the result section:

      “Briefly, genes annotated under any GO or MGI phenotype terms related to placenta, TB cells, TE and the chorion layer are considered as having a “known” role in the placenta. Genes annotated under terms related to embryo are considered as having a “possible” role in the placenta, because embryonic lethal mouse knockout lines frequently have placentation defects, and because defects in placental development can be associated with the development of other embryonic tissues (Brown and Hay 2016; Perez-Garcia et al. 2018; Woods, Perez-garcia, and Hemberger 2018). Hereafter, such genes are referred to as “known/possible genes”. In the e7.5 networks, there were 17 hub genes in which six genes were known/possible. The number of hub genes that are labelled as known/possible is statistically significant when comparing to random gene sets selected from the e7.5 timepoint-specific group (Supplementary Figure S3). In the e8.5 and e9.5 networks, 17 out of 28 and 48 out of 127 hub genes were known/possible, respectively. Similar to e7.5, the number of hub genes labelled as known/possible in e8.5 networks and e9.5 networks were both statistically significant when comparing to random gene sets selected from the corresponding timepoint-specific groups (Supplementary Figure S3). These results indicate that the gene sets we identified are significantly associated with relevant phenotypes in the mouse.”

      For the four genes that we tested in HTR-8/SVneo cells, we also added more information about the current known role of the gene in mouse.

      Added to the discussion section:

      “We identified hub genes and their immediate neighboring genes which could regulate placental development and confirmed the roles of four novel genes (Mtdh, Siah2, Hnrnpk and Ncor2) in regulating cell migration in the HTR-8/SVneo cell line. These genes were selected primarily based on the network analyses, but also based on expression data from human cells to account for possible differences between mouse and human placental gene expression. Previous studies suggested these four candidates are functionally important in mouse. Mtdh has been suggested to regulate cell proliferation in mouse fetal development (Jeon et al. 2010). The Siah gene family is important for several functions (Qi et al. 2013). Of relevance to the placenta, Siah2 is an important regulator of HIF1α during hypoxia both in vitro and in vivo (Qi et al. 2008). Moreover, while Siah2 null mice exhibited normal phenotypes, combined knockouts of Siah2 and Siah1a showed enhanced lethality rates, suggesting the two genes have overlapping modulating roles (Frew et al. 2003). Hnrnpk-/- mice were embryonic lethal, and Hnrnpk+/- mice had dysfunctions in neonatal survival and development (Gallardo et al. 2015) . Ncor2-/- mice were embryonic lethal before e16.5 due to heart defects (Jepsen et al. 2007). According to the International Mouse Phenotyping Consortium database (Dickinson et al. 2016), Ncor2 null mice also showed abnormal placental morphology at e15.5. However, none of these genes have been studied in TB migration function.”

      In determining conservation between mouse and human networks, were only 1:1 orthologs examined or did you consider more complex 1:many mapping conditions between the two species?

      In this work, we used only one-to-one orthology between mouse and human to avoid duplication while sampling in the enrichment tests. We added this detail in the Methods section. However, as found in Cox et al., 2009, genes with one-to-many orthologs could be highly intriguing and should be investigated in future studies.
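
      Restricting an ortholog table to one-to-one pairs can be sketched as below. This is an assumed approach for illustration rather than the authors' implementation: the function name `one_to_one_orthologs` is hypothetical, and the `GeneX1`/`GeneX2`/`GENEX` entries are made-up placeholders standing in for a many-to-one mapping.

```python
from collections import Counter


def one_to_one_orthologs(pairs):
    """Keep mouse-human pairs in which each gene appears in exactly one pair."""
    unique_pairs = set(pairs)  # drop exact duplicate rows first
    mouse_counts = Counter(mouse for mouse, _ in unique_pairs)
    human_counts = Counter(human for _, human in unique_pairs)
    return {
        mouse: human
        for mouse, human in unique_pairs
        if mouse_counts[mouse] == 1 and human_counts[human] == 1
    }


# (mouse gene, human gene) pairs; the GeneX entries are hypothetical
# placeholders for two mouse genes mapping to one human gene.
toy_pairs = [
    ("Vegfa", "VEGFA"),
    ("Ncor2", "NCOR2"),
    ("GeneX1", "GENEX"),
    ("GeneX2", "GENEX"),
]
toy_orthologs = one_to_one_orthologs(toy_pairs)
print(toy_orthologs)
```

      Filtering in both directions ensures that each mouse gene and each human gene is counted at most once when sampling for the enrichment tests.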

      Should the migration assay be normalized to survival/adhesion? If 70,000 cells were seeded but had 50% cell death (or reduced adhesion), then it may appear to be poor migration. Should the migration be evaluated as a ratio of top to bottom cell densities to control for poor adhesion or survival?

      We thank the reviewer for bringing up this important point. Unfortunately, with the method we used, we cannot quantify the densities on top, because the cells on top need to be scraped off prior to measuring the cells at the bottom (the two densities cannot be measured separately). To address this concern, in a separate experiment we instead counted cell numbers 48 hours post-transfection for cells treated with target gene siRNA and cells treated with negative control siRNA, to determine whether apoptosis or changes in proliferation rate could be leading to changes in the observed migration. From these data, we determined that none of the siRNA knockdowns resulted in a significant change in cell counts (p-value > 0.05). We do note that Siah2 siRNA #1 has some decrease in counts (p-value = 0.081) and Ncor2 siRNA #1 and #2 have some increase in cell counts (p-value = 0.081 and p-value = 0.077) (Supplementary Figure S7). Additional follow-up experiments we have performed with our targets of interest, which are out of the scope of this paper, demonstrate that different pathways and processes could be involved in the decrease in migration we observed (we are following up experimentally in more detail for each gene). Proliferation and other assays could also be used to further examine the increase in Ncor2 cell counts that was observed. We have added the cell count results and additional text to the discussion.

      Added to results, section 4:

      “When comparing the number of cells 48 hours post-transfection for cells treated with target gene siRNA to cells treated with negative control siRNA, we determined that none of the target gene siRNA treatments resulted in significant changes in cell counts. We do note that Siah2 siRNA #1 has some decrease in cell counts (p-value = 0.081), and Ncor2 siRNA #1 and Ncor2 siRNA #2 have some increase in cell counts (p-value = 0.081 and p-value = 0.077) compared to negative control treated samples (Supplementary Figure S7). This provides evidence that, in general, the reduction in cell migration capacity was likely not due to the target gene impacting the rate of cell death.”

      To the discussion:

      “Moreover, we observed that cell counts generally were not decreased upon target gene knockdown compared to negative control knockdown. However, more detailed analysis and process specific assays are needed. For example, future studies assessing each gene’s role in cell adhesion, cell-cell fusion, cell proliferation and cell apoptosis can be done to better understand their roles in placental development.”

      Reviewer #1 (Significance (Required)):

      This significantly advances previous publications on this topic by functionally testing the discovered genes.

      This highlights an excellent data mining strategy for a developmental disease using mice and translating to humans.

      The audience is likely developmental biologists and reproductive specialists.

      My expertise is bioinformatics and developmental biology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors used RNA-seq data from mouse fetal placenta at e7.5, e8.5, and e9.5 to create timepoint-specific gene expression interaction networks to find genes that they predicted would regulate placental development. They confirmed four novel candidate genes and showed that in the transfected human trophoblast HTR-8/SVneo cell line, these four candidates reduced cell migration capacity. Additionally, the authors show that bulk RNA-seq data can be used to infer cell-type composition and when used with single-cell RNA-seq, can be a powerful tool to study the biological processes that involve multiple cell-types.

      Overall, the authors are rigorous in their analyses, their conclusions appear sound, and the work could be an asset to the broader placental biology field. However, although the authors present an approach that future studies might find useful to replicate and their work has produced numerous novel transcripts/genes that warrant further investigation, the approach is not entirely novel, and could be expanded/improved (as suggested by the authors in the discussion), particularly with regard to validation of the genes/networks identified. Major and minor comments are listed below.

      Major comments:

      1) The authors used clustering and differential expression analysis to define sets of timepoint-specific genes. However, it was not clear to me the benefits of this approach. Why would using this approach be better than differential expression analysis alone such as in a typical ANOVA?

      We have added more discussion on this matter to explain our approach. We believe using hierarchical clustering and pairwise differential expression analysis can help identify gene lists with higher confidence. These are the new details we added to the discussion section:

      “Combining hierarchical clustering with differential expression analysis, we were able to identify gene groups using an unsupervised approach. It has also been shown that for time-series analyses with fewer than eight timepoints, pairwise differential expression analysis combined with additional methods identifies a more robust set of genes (Spies et al. 2019). Alternatively, model-based clustering using RNA-seq profiles (Si et al. 2014) could also be useful for gene group identification. However, it is still important to evaluate the robustness and functional relevance of the fitted models by carrying out additional downstream analyses.”

      2) Related to number 1 above, although the authors are interested in timepoint-specific transcripts, the author's methods would filter out possibly interesting transcripts that turn on and off during development. The authors might want to check to see if there are transcripts that are up in e7.5 and then down in e8.5 but then up again in e9.5. Also, the author's methods seem to include transcripts that are not exclusive to one timepoint (i.e. are up in e7.5 and e8.5 but not e9.5). It might be interesting to differentiate transcripts that are exclusive to one timepoint from those that are in more than one timepoint.

      We thank the reviewer for their valuable comment. We agree genes that turn on and off during the time course could be very interesting. In performing this analysis, we found that the number of such genes is rather small (38 genes that are up-regulated at e7.5 compared to e8.5 and up-regulated at e9.5 compared to e8.5). These genes were not enriched for processes that we observed with timepoint-specific gene groups, such as “trophoblast giant cell differentiation” (e7.5-specific genes), “labyrinthine layer development” (e8.5- and e9.5-specific genes), "blood vessel development” (e7.5- and e9.5-specific genes) and “response to nutrient” (e9.5-specific genes) (Supplementary Table S3). They are generally enriched for processes related to cytokine production and regulation of secretion.

      We also agree that it is interesting to differentiate transcripts that are exclusive to one time point from those that are in more than one time point. In the revised manuscript, we added additional analysis for genes that belong to multiple timepoint groups due to different transcripts of the same gene being annotated as timepoint-specific, and genes unique to each timepoint (Added to results section 1):

      “It is possible that timepoint-specific groups share genes that have timepoint-specific transcripts. Indeed, we identified 37 genes shared between e7.5 and e8.5, 5 genes shared between e7.5 and e9.5, and 109 genes shared between e8.5 and e9.5 (Supplementary Table S3). We found that genes only present at one timepoint (timepoint-unique genes) were generally enriched for similar terms as the full group of timepoint-specific genes (Supplementary Table S3). However, terms related to the development of labyrinth layer like “labyrinthine layer morphogenesis” and “labyrinthine layer blood vessel development” were only enriched when using all e8.5-specific genes but not when using e8.5 timepoint-unique genes. Moreover, we found that, unlike genes shared between e9.5 and e7.5, genes shared between e9.5 and e8.5 were enriched for processes such as “blood vessel development” and “insulin receptor signaling pathway”. This observation may indicate that different transcripts of the same genes could be expressed at different timepoints for the continuation of certain biological processes.”

      3) In the network analysis it would be interesting and helpful to the reader to highlight, if any, nodes or terms that were found to be significant (i.e. hubs or genes that have a high centrality metric etc.) in both the STRING and GENIE3 networks or overlap the networks created by the two different algorithms to compare them. This might help readers better rank genes when using these data to decide what genes are most important at each timepoint.

      We observed only one hub gene shared among networks inferred by the two methods (Vegfa in the e9.5 networks). However, hub genes of networks inferred by one method could be nodes in networks inferred by the other method. Hence, we have added lists of such genes in section 2. Interestingly, many of these genes have known roles in placental development. In terms of biological functions shared between the networks at the same timepoints, there were multiple interesting processes such as “positive regulation of cell migration”, “epithelium migration” and “vasculature development”, which we highlighted in Figure 2A.

      In the revised manuscript, we have added the following details in different paragraphs of section 2 of the results:

      “Although the networks inferred by the two methods did not share any hub genes, hub genes identified with one method could be members of the other method’s networks. These hub genes are Mmp9 (e7.5_1_STRING), Frk, Hmox1, and Nr2f2 (e7.5_2_GENIE3) (Table 1). This observation strengthens the potential roles of Frk gene in placental development.”

      “Hub genes identified with one method and present in the other method’s networks are Hsp90aa1, Akt1, and Mapk14 (e8.5_1_STRING), Dvl3 and Msx2 (e8.5_2_GENIE3) (Table 1).”

      “Hub genes identified with one method and present in the other method’s networks include important genes such as Rb1 (Sun et al. 2006), Yap1 (Meinhardt et al. 2020) (e9.5_1_GENIE3) and Vegfa (e9.5_2_STRING) (Table 1). Notably, Vegfa is the only hub gene identified with both of the network inference methods.”

      4) The authors' conclusion that network analysis can be used to identify genes more likely associated with specific placental cell types is very likely true, but I think that the conclusion would be more impactful if the authors reported how the method compares to simply taking a list of differentially expressed genes and looking for cell type enrichments using their favorite enrichment software. For example, if a gene is highly connected in a particular network that has been identified as SCT-specific, but that gene isn't considered an SCT "marker" by the placental biology research community, it would be interesting to highlight that it is prevalent in a previously published scRNA-seq dataset or a dataset that has isolated that particular cell type to show the advantages of using networks to find placental cell type specific genes.

      We completely agree with the reviewer’s point and have now added a randomization analysis to compare the enrichment using PlacentaCellEnrich (PCE) with genes in networks and random genes (Supplementary Figure S6). We randomly sampled 10,000 gene sets with the same sizes as the subnetworks from their corresponding hierarchical groups and carried out PCE analysis. These tests showed that the enrichments of cell type-specific genes were only significant with the subnetwork genes but not the random genes. The randomization tests added a valuable highlight that the network genes are highly relevant to cell type-specific genes in the human placenta, and therefore provided more confidence in the gene lists obtained from the network analyses.

      We also further checked the expression of the hub genes in other independent data in order to identify hub genes that are potentially cell type specific markers. For example, we observed that Dvl3 (e8.5_2_GENIE3) and Olr1 (e9.5_3_STRING) have been shown to be differentially expressed in SCT compared to other TB subtypes (human trophoblast stem cells, EVT (Sheridan et al. 2021) or endovascular TB (Gormley et al. 2021)).

      We added the following detail in the results, section 3:

      “Importantly, randomization tests showed that the enrichment of cell type-specific genes was only significant in these subnetworks but not in random gene sets selected from corresponding timepoint hierarchical groups (Supplementary Figure S6), which highlights the biological relevance of the gene network modules.”

      Added to the discussion section:

      “Moreover, hub genes could be used to identify potential novel markers for the cell types corresponding to their subnetworks. For example, hub genes of subnetworks enriched for SCT-specific genes such as Dvl3 (e8.5_2_GENIE3) and Olr1 (e9.5_3_STRING) are not established SCT marker genes, but are in fact differentially expressed in SCT compared to human trophoblast stem cells, EVT (Sheridan et al. 2021) or endovascular TB (Gormley et al. 2021). In general, combining network analysis with existing gene expression data from single cell or pure cell populations will allow identification of novel cell-specific marker genes to help future studies focused on different TB populations.”

      5) While the selection of genes for validation was limited by the model system available for testing, the authors should recognize that the genes/networks identified here should first and foremost be validated in a mouse model (by knockdown/overexpression studies using mouse trophoblast stem cells or by evaluation of placenta/embryo in a KO/transgenic mouse model). Whether or not the data are relevant to human placentation is (at least initially) irrelevant. While we recognize that these are difficult studies requiring significant time and resources, as is, the data and results will have significantly less impact than if even a limited amount of such validation could be performed.

      We thank the reviewer for this valuable comment. Based on this comment and the suggestions from reviewer #1, we have added the following points to the manuscript to discuss the relevance of the genes in the mouse models, and further explain our gene choices:

      To assess the possible role of hub genes in mouse development more comprehensively, we extended our search for gene functions on the Mouse Genome Informatics (MGI) database to include not only placenta related GO and MGI phenotype terms (defined as “genes with known roles”), but also embryo related GO and MGI phenotype terms (defined as “genes with possible roles”). We included embryo related terms as “genes with possible roles” because embryonic lethal mouse knockout lines frequently have placentation defects, and because defects in placental development can be associated with the development of other embryonic tissues (Brown and Hay 2016; Perez-Garcia et al. 2018; Woods, Perez-garcia, and Hemberger 2018). This change resulted in an increase in the number of genes with relevant functions in mouse, including several annotated as embryonic lethal or with abnormal embryonic growth (see Supplementary Table S6). With the additional annotations:

      • 6 out of 17 hub genes of e7.5 networks have known/possible roles.
      • 17 out of 28 hub genes of e8.5 networks have known/possible roles.
      • 48 out of 127 hub genes of e9.5 networks have known/possible roles.

      We also carried out randomization tests to determine whether the number of known/possible genes we identified was significant. The procedure was as follows: for each timepoint, we sampled 10,000 random gene sets from the respective timepoint-specific group, each containing the same number of genes as there were hub genes, and counted the number of known/possible genes in each random set. The p-value was calculated as the number of random gene sets containing at least as many known/possible genes as observed among the hub genes, divided by 10,000. We found that the number of genes with known/possible roles at each timepoint is statistically significant (Supplementary Figure S3). This result indicates that the gene sets we identified are significantly associated with relevant phenotypes in mouse.
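
      The randomization procedure described above amounts to an empirical p-value. A minimal sketch is given here, assuming simple random sampling without replacement; the function name `empirical_p_value`, the fixed seed, and the toy background of 200 placeholder genes are assumptions made for illustration and are not taken from the manuscript.

```python
import random


def empirical_p_value(background, annotated, n_hubs, observed,
                      n_iter=10_000, seed=0):
    """Fraction of random gene sets with >= `observed` annotated genes.

    background: genes in the timepoint-specific group (the sampling pool).
    annotated:  set of genes with known/possible roles.
    n_hubs:     size of each random set (the number of hub genes).
    observed:   number of known/possible genes among the real hub genes.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    pool = sorted(background)          # stable order before sampling
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(pool, n_hubs)   # sampling without replacement
        if sum(gene in annotated for gene in sample) >= observed:
            hits += 1
    return hits / n_iter


# Toy data (hypothetical): 200 background genes, 20 of them annotated.
background = [f"gene{i}" for i in range(200)]
annotated = set(background[:20])
p = empirical_p_value(background, annotated, n_hubs=17, observed=6, n_iter=2000)
print(p)
```

      With this definition the smallest non-zero p-value that can be reported is 1 divided by the number of iterations, so a result of 0 is better read as "below 1/10,000" than as exactly zero.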

      The remaining hub genes are unannotated as related to placental or embryonic functions in the MGI database. Based on that, it is difficult to determine if they lack a relevant phenotype, or if there has not been a detailed assessment of the placenta.

      Added to section 2 of the result section:

      “Briefly, genes annotated under any GO or MGI phenotype terms related to placenta, TB cells, TE and the chorion layer are considered as having a “known” role in the placenta. Genes annotated under terms related to embryo are considered as having a “possible” role in the placenta, because embryonic lethal mouse knockout lines frequently have placentation defects, and because defects in placental development can be associated with the development of other embryonic tissues (Brown and Hay 2016; Perez-Garcia et al. 2018; Woods, Perez-garcia, and Hemberger 2018). Hereafter, such genes are referred to as “known/possible genes”. In the e7.5 networks, there were 17 hub genes, of which six were known/possible. The number of hub genes that are labelled as known/possible is statistically significant when compared to random gene sets selected from the e7.5 timepoint-specific group (Supplementary Figure S3). In the e8.5 and e9.5 networks, 17 out of 28 and 48 out of 127 hub genes were known/possible, respectively. Similar to e7.5, the numbers of hub genes labelled as known/possible in e8.5 networks and e9.5 networks were both statistically significant when compared to random gene sets selected from the corresponding timepoint-specific groups (Supplementary Figure S3). These results indicate that the gene sets we identified are significantly associated with relevant phenotypes in the mouse.”

      For the four genes that we tested in HTR-8/SVneo cells, we also added more information about the current known role of the gene in mouse.

      Added to the discussion section:

      “We identified hub genes and their immediate neighboring genes which could regulate placental development and confirmed the roles of four novel genes (Mtdh, Siah2, Hnrnpk and Ncor2) in regulating cell migration in the HTR-8/SVneo cell line. These genes were selected primarily based on the network analyses, but also based on expression data from human cells to account for possible differences between mouse and human placental gene expression. Previous studies suggested these four candidates are functionally important in mouse. Mtdh has been suggested to regulate cell proliferation in mouse fetal development (Jeon et al. 2010). The Siah gene family is important for several functions (Qi et al. 2013). Of relevance to the placenta, Siah2 is an important regulator of HIF1α during hypoxia both in vitro and in vivo (Qi et al. 2008). Moreover, while Siah2 null mice exhibited normal phenotypes, combined knockouts of Siah2 and Siah1a showed enhanced lethality rates, suggesting the two genes have overlapping modulating roles (Frew et al. 2003). Hnrnpk-/- mice were embryonic lethal, and Hnrnpk+/- mice had dysfunctions in neonatal survival and development (Gallardo et al. 2015). Ncor2-/- mice were embryonic lethal before e16.5 due to heart defects (Jepsen et al. 2007). According to the International Mouse Phenotyping Consortium database (Dickinson et al. 2016), Ncor2 null mice also showed abnormal placental morphology at e15.5. However, none of these genes have been studied in the context of TB migration.”

      Minor comments:

      1) In the GO analysis, why not use a combination of hypergeometric and binomial distribution for enrichment decisions?

      We used the hypergeometric test, which is the default setting of ClusterProfiler. Using the hypergeometric test for GO enrichment of differentially expressed genes was also suggested by Rivals et al. (2007). A combination of hypergeometric and binomial tests would be of great use when carrying out enrichment for cis-regulatory domains, where there is a higher chance of sampling a gene randomly (McLean et al. 2010).

      We have added this detail in the method section to make the analysis clearer.
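
      For readers unfamiliar with the test, a one-sided hypergeometric over-representation p-value can be sketched as below. This is a generic textbook formulation written with the Python standard library, not the ClusterProfiler implementation; the function name `hypergeom_enrichment_p` and the toy numbers are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) for X ~ Hypergeometric.

    n_background: number of genes in the background (universe).
    n_annotated:  background genes annotated with the GO term.
    n_selected:   size of the gene list being tested.
    n_overlap:    genes in the list that carry the annotation.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    # Sum the probability of every overlap at least as large as observed.
    favourable = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / total


# Toy numbers (hypothetical): 10 background genes, 3 annotated,
# 4 selected, 2 of the selected genes annotated.
p_toy = hypergeom_enrichment_p(10, 3, 4, 2)
print(p_toy)
```

      In practice the raw p-values from many GO terms would still need multiple-testing correction before being reported as q-values.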

      2) In Figure 2B, are there any genes that are both hub nodes (diamonds) and annotated as having placental functions (squares)? If so, it might be good to show that in some way.

      We agree this is necessary and have altered the presentation in Figure 2. In the revised manuscript, we have added an additional list of hub genes as genes with possible roles. The figure now shows hub genes with known placental functions (diamonds), hub genes with possible functions (hexagons) and hub genes without related annotation (rounded squares). Non-hub genes are now not shown to avoid crowdedness.

      3) It might improve the deconvolution analysis to employ more than one method and recent reports have shown that the cell-type signature data is the most important parameter with the main factors influencing performance being biological (such as where the sample was taken) rather than technical (https://doi.org/10.1038/s41467-022-28655-4).

      We agree that the conclusion would have been strengthened had we been able to employ another deconvolution method. In a literature search, we found another tool, CAM (N. Wang et al. 2016), which takes a similar approach to LinSeed in aiming to infer cell proportions without a reference. However, the tool has been taken down from Bioconductor and is not currently maintained. As a result, to the best of our knowledge, LinSeed is the only deconvolution tool that is completely reference-free.

      We also tried carrying out the deconvolution analysis with another method, DSA (Zhong et al. 2013), with a limited number of marker genes obtained through literature review. However, when the marker genes are highly correlated in multiple cell types, the models failed to infer meaningful proportions.

      We acknowledge that we need additional single cell RNA-seq data or marker genes obtained from pure cell populations to make more concrete conclusions for the deconvolution analysis. We hope with future studies, there will be more evidence supporting our observations.

      We have added this acknowledgement in the results section:

      “The identification of these cell groups could have resulted from noise introduced by both biological and technical variation, which is challenging to overcome when using a small sample size or analyzing without prior knowledge in the deconvolution analysis.”

      Added to the discussion section:

      “Nevertheless, we acknowledge that our deconvolution analysis and cell type annotations were limited due to the absence of matching scRNA-seq data, data from pure cell populations, or extensive cell marker lists. As these types of information become more available, deconvolution analysis can be used to identify species-specific cell types or to correct for confounding effects prior to DEA (Sutton et al. 2022).”

      4) The above report also shows that there are ways to correct for cell-type composition differences in DEA which might be interesting to look when using bulk data from different timepoints in future studies when focusing on different biological processes and not timepoint-specific transcripts.

      We agree correcting for cell proportion prior to differential expression analysis will be interesting for future studies. When single cell RNA-seq data or more extensive marker gene lists are available, deconvolution analysis will be of great use for this purpose.

      We have added this in the discussion section (also mentioned in point #3):

      “Nevertheless, we acknowledge that our deconvolution analysis and cell type annotations were limited due to the absence of matching scRNA-seq data, data from pure cells, or extensive cell marker lists. As these types of information become more available, deconvolution analysis can be used to identify species-specific cell types or to correct for confounding effects prior to DEA (Sutton et al. 2022).”

      5) Could the authors speculate as to possible reason(s) that an siRNA knockdown would give variable results functionally, while the actual gene expression appears to be consistently and sufficiently downregulated? Did the authors evaluate protein levels following siRNA knockdown?

      Following the reviewer’s comment, we have evaluated protein levels for each target gene and each siRNA. For the genes that gave variable results between siRNAs (MTDH and NCOR2), we did not observe a change in their ability to reduce protein levels (Supplementary Figure S7). It is therefore possible that there are off-target effects for one of the siRNAs. We considered this possibility in designing the project, which is why we tested two siRNAs per target gene. Although siRNA off-target effects may be present, visual inspection of the migration experiments indicates that transfection with each of the siRNAs reduces migration capacity. We have added the possibility of off-target effects in the discussion section:

      “We observed that while all siRNAs were able to decrease cell migration capacity, there was variability in the amount of decrease, even when comparing two siRNAs targeting the same gene. This observation did not seem to be associated with differences in transcript or protein knockdown levels and could be due to different off-target effects for different siRNAs.”

      6) As mentioned in the discussion, finding genes that have timepoint dependent isoforms would be an interesting and novel addition to the manuscript.

      Protein isoforms would be interesting to study. Here we focused on different mRNA transcripts. We carried out additional GO analysis on the genes unique to each timepoint and genes shared among timepoints. This was also done in response to major comment 2:

      In the revised manuscript, we added additional analysis for genes that belong to multiple timepoint groups due to different transcripts of the same gene being annotated as timepoint-specific, and genes unique to each timepoint (Added to results section 1):

      “It is possible that timepoint-specific groups share genes that have timepoint-specific transcripts. Indeed, we identified 37 genes shared between e7.5 and e8.5, 5 genes shared between e7.5 and e9.5, and 109 genes shared between e8.5 and e9.5 (Supplementary Table S3). We found that genes only present at one timepoint (timepoint-unique genes) were generally enriched for similar terms as the full group of timepoint-specific genes (Supplementary Table S3). However, terms related to the development of labyrinth layer like “labyrinthine layer morphogenesis” and “labyrinthine layer blood vessel development” were only enriched when using all e8.5-specific genes but not when using e8.5 timepoint-unique genes. Moreover, we found that, unlike genes shared between e9.5 and e7.5, genes shared between e9.5 and e8.5 were enriched for processes such as “blood vessel development” and “insulin receptor signaling pathway”. This observation may indicate that different transcripts of the same genes could be expressed at different timepoints for the continuation of certain biological processes.”

      7) Although outside the scope of this manuscript, it might be interesting to look at the effects of knocking down network genes on the networks themselves and in combination with a phenotypic readout such as a migration assay. With numerous knockouts and migration assay readouts, one could possibly find a better method to rank the genes within the networks.

      We agree with this comment. Upon literature search, we realized this approach has been used in previous studies on other biological contexts such as virus entry (A. Wang et al. 2010; A. Wang, Ren, and Li 2011) and cancer cell growth (Paul et al. 2021). Although these studies used different network inference strategies from ours, their in silico gene knockouts proved to be effective for the candidate selection. However, the knockout process (both computationally and experimentally) may not be trivial; therefore, we agree the approach will be useful for future studies.

      CROSS-CONSULTATION COMMENTS

      I mostly agree with the other two reviewers.

      It is not clear to me that additional KD experiments (i.e. ones that might affect fusion, proliferation, apoptosis), as proposed by Reviewer #3, would be that much more informative. There are many differences between mouse and human placentation, and these model systems (HTR8 and BeWo) are not truly representative of either. The additional data mining/computational work would be more useful and enhance data interpretation.

      Reviewer #2 (Significance (Required)):

      The authors use RNA-seq of mouse placenta at e7.5, e8.5, and e9.5 to show that timepoint-specific expression patterns are highly correlated with certain biological processes and point to the existence of certain cell types in the sample. While focused on early post-implantation mouse placental development, the authors' methods could be transferable to other timepoints, species, and organs. Furthermore, with their method they uncover what appear to be several novel, early placental, developmentally important genes, and their results might be of interest to those in the field studying placental development.

      Reviewer #3:

      Summary:

      This paper is an analysis of RNA-seq data from the mouse placenta from embryonic day 7.5 to 9.5. Bioinformatics was used to pinpoint gene networks and tentatively connect them with human cell populations. Wet-lab experiments were performed on the HTR-8/SVneo trophoblast cell model.

      The introduction clearly posits the reasons why mouse models were chosen, and presents some examples of genes that are conserved between human and mouse placentas, before presenting the major steps of mouse placental development at the crucial periods analyzed.

      The results are divided into four parts:

      1. Identification of genes that are specific of fetal tissues at the three days studied
      2. A network analysis of the genes using classical bioinformatics tools (String, Genie3) to identify gene modules
      3. A connection with the human placenta at the level of cell-specific expression profile is then analyzed
      4. An in vitro validation on a trophoblast cell model using siRNA to knock down genes identified in the in silico part of the paper.

      Three clustering methods were used to classify the genes according to their profile (the timepoint at which they have the highest level). The associated functions map onto three logical physiological events (7.5: proliferation and ectoplacental cone development; 8.5: attachment of the placenta, which is chorioallantoic at this stage; and 9.5: syncytiotrophoblast constitution and labyrinth development, structures essential for growth and exchange).

      Mostly minor comments:

      Quality of the transcriptomics data: 6 replicates per condition (some being pools at E7.5 and 8.5) is a lot, and I congratulate the authors for having made such an effort. This says a lot about the technical quality of their results. Nevertheless, there is no comment on the exclusion of two samples in the further analysis based upon the PCA. Could the authors comment upon the reasons why these two samples behave so differently from the others?

      We thank the reviewer for the comment. We reviewed the RNA concentration and quality prior to sequencing, and did not observe that the outliers were of lower quality. After sequencing, quality control metrics (obtained with FastQC), also did not indicate that the two outliers were of poor quality. Based on the PCA, it is also unlikely that two samples were swapped. One possibility is that the tissues obtained for these samples were diseased in some way. However, this is difficult to confirm, so we did not want to speculate about this in the manuscript. We did exclude the two samples to ensure the accuracy of our downstream analyses.

      Rq: at this stage some statistics of the degree of enrichment in keywords should be provided (such as Enrichment Scores, normalized or not, and False Discovery Rates), to be able to evaluate the actual robustness of the gene networks identified. In addition, it seems that the authors steered the 'keywords' and 'ontologies' toward placental function. A more agnostic approach could be very relevant, such as identifying the ontologies associated with, for instance, the set of genes that are highest at 8.5 days, by comparing them with preliminary datasets accessible via the GSEA platform of the BROAD institute or similar sites such as Webgestalt. This does not mean that the placental-targeted approach is not useful, but to have a more global overview is in my opinion indispensable.

      We agree and this is a good point. We have now added a stringent approach to determine if the placenta-targeted terms are truly relevant to the gene networks. We performed randomization tests using random gene sets sampled from hierarchical groups of the same time point. These tests showed that the selected terms are significant in the networks when compared to gene groups of the same size from the timepoint specific hierarchical groups (Supplementary Figure S3). Moreover, we have added the specific -log10(q-value) of some highlighted enriched terms in the main text, so together with Figure 2A, the degree of enrichment of these terms can be shown in a clearer way.

      We have added this detail in the result section:

      “Compared to e8.5 and e9.5 networks, e7.5 networks had a higher rank or fold change and were significantly enriched for the GO terms “inflammatory response” (e7.5_1_STRING: -log10(q-value) = 22.82 and e7.5_2_GENIE3: -log10(q-value) = 3.95) and “female pregnancy” (e7.5_2_GENIE3: -log10(q-value) = 4.1) (Figure 2A, Supplementary Table S5). The term “morphogenesis of a branching structure”, which can be expected following chorioallantoic attachment around e8.5, was not enriched at e7.5, but was enriched in multiple e8.5 and e9.5 networks (e8.5_1_STRING: -log10(q-value) = 1.73, e8.5_2_GENIE3: -log10(q-value) = 1.72, e9.5_1_STRING: -log10(q-value) = 4.01, e9.5_1_GENIE3: -log10(q-value) = 1.54, e9.5_2_STRING: -log10(q-value) = 14.33, and e9.5_2_GENIE3: -log10(q-value) = 2.2). After chorioallantoic attachment finishes, nutrient transport is being established. Accordingly, we observed the following enrichments: “endothelial cell proliferation” (highest ranked in e9.5_2_STRING: -log10(q-value) = 15.91), “lipid biosynthetic process” (only significant after e7.5, highest ranked in e9.5_3_STRING: -log10(q-value) = 17.63), “cholesterol metabolic process” (only significant after e7.5, highest ranked in e9.5_2_GENIE3: -log10(q-value) = 2.76 and e9.5_3_STRING: -log10(q-value) = 7.79), and “response to insulin” (only significant after e7.5, highest ranked in e9.5_1_GENIE3: -log10(q-value) = 1.67).”

      “Using randomization tests, we observed the majority of these GO terms (10 out of 11 terms) were significantly enriched when using the network genes but not random gene sets (significance level of 0.05; the term “vasculature development” having p-value = 0.0549 and 0.0575 with subnetworks e9.5_1_GENIE3 and e9.5_3_GENIE3, respectively) (see Materials and Methods, Supplementary Figure S3). This analysis demonstrates that the network genes were highly relevant to the biological functions of interest. Moreover, the observed GO terms strongly aligned with the processes enriched when using the full lists of timepoint-specific genes (Supplementary Table S3), indicating the representative characteristics of the network genes. While the current analysis focuses on the biological processes related to placental development, there are other terms significantly enriched, which can be found in Supplementary Table S5.”
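
      Because the enrichment values quoted above are reported on a -log10 scale, it can help to convert them back to q-values when judging them against a threshold. The short sketch below does only that arithmetic; the helper names are hypothetical, and the 0.05 cutoff is used purely as a familiar reference point.

```python
import math


def q_from_neg_log10(value):
    """Convert a reported -log10(q-value) back to the q-value itself."""
    return 10 ** (-value)


def neg_log10(q_value):
    """Convert a q-value to the -log10 scale used in the quoted text."""
    return -math.log10(q_value)


# The weakest enrichment quoted above, -log10(q-value) = 1.54.
weakest_q = q_from_neg_log10(1.54)
# The -log10 value that corresponds to a q-value cutoff of 0.05.
cutoff = neg_log10(0.05)
print(round(weakest_q, 4), round(cutoff, 3))
```

      On this scale any reported value above roughly 1.3 corresponds to a q-value below 0.05.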

      This is partially done in part 2 of the results, but it would be relevant to do it on the group of highly expressed genes and not only on the clusters found by the STRING and GENIE3 algorithms.

      We have added GO analysis for timepoint-specific genes and also observed highly relevant processes being enriched (Supplementary Table S3). This additional analysis has also helped strengthen the relevance of the network genes, as the observed terms with network genes aligned well with the terms enriched with the full lists of genes.

      Rq: in the second part of the results, everything is descriptive but no hierarchy is given to facilitate the understanding and to try to generate a few 'take-home messages' for the reader.

      We agree with the comment and have adjusted the writing accordingly. We have added the following statements in section 2 of the result section:

      “In summary, we identified 18 subnetworks across three timepoints for downstream analyses, some of which were enriched, according to GO analysis and randomization tests, for specific terms relating to placental development (Figure 2A).”

      “These results indicate that the gene sets we identified are functionally relevant in the mouse models.”

      “In summary, we have identified hub genes in networks at each timepoint. Analyzing the annotations of hub genes using the MGI database demonstrated that the hub genes are biologically relevant to mouse development and will be strong candidates for future investigation.”

      The network analysis is well presented in Figure 2. I wonder whether the authors could systematically add, besides the three examples that are given, the network analysis for the other enriched networks that are described (the four at e7.5, the six at e8.5 and the eight at e9.5).

      We have added the additional figures in Supplementary Figure S3.

      The deconvolution of the 3rd part of the results to try to connect the mouse results to the human cell situation is interesting. I suspect that given the terms of the mouse placentas used, it would be relevant to focus on 1st trimester human placental cells.

      The reference dataset we used in the PlacentaCellEnrich analysis was from human 1st trimester placenta samples. For the Placenta Ontology analysis, we were limited to the provided database from (Naismith and Cox 2021); however, it will be interesting to revisit the analysis when the database is extended.

      We have specified that the reference data in PlacentaCellEnrich analysis was from human 1st trimester placenta in the methods section:

      “For PlacentaCellEnrich, cell-type specific groups were based on the single-cell transcriptome data of first trimester human maternal-fetal interface from Vento-Tormo et al.”

      As previously mentioned, this is a highly descriptive paragraph, and two or three sentences at the end of each paragraph of the results would be in my opinion indispensable to present the most important observations of the results in an intelligible way. Overall, the data presented by the authors, are not obviously 'raw data', but an effort of interpretation should be done by the authors to underline the importance of their results, and to stress among these results which are the most important, and which are the most relevant for placental development and human health.

      We agree with the comment and have adjusted the writing accordingly. We have added this summary paragraph at the end of section 3 of the result section:

      “In summary, we have demonstrated that the identification of timepoint-specific gene groups and densely connected network modules can be used to infer the cellular composition of bulk RNA-seq samples. We used independent human datasets from different sources to annotate the cell types in each timepoint’s samples. As a result, from the bulk RNA-seq data we were able to observe that at e7.5 and e8.5, there was a high proportion of different TB populations, whereas at e9.5, the placental tissues consisted of multiple cell types such as TB, endothelial and fibroblast cells.”

      In the last part, which is very important in this type of paper, four genes were selected. A choice of highly expressed genes was made (which can in fact be discussed, some transcriptional factors may have a crucial importance with relatively low levels of expression). The efficiency of the siRNA was overall excellent. The authors showed that each of these siRNA is efficient to inhibit cell migration in the HTR8/SVneo model.

      The migration assays are quantified, but there is an inherent limit of the cell model: the authors analyzed only cell migration, but not other very important parameters. One of them is trophoblast fusion, an issue that can be studied in another trophoblast cell model, the BeWo cells, which are induced to fuse under forskolin. It would be highly relevant to test the siRNA identified in this respect, since fusion is a very conspicuous feature of trophoblast cells in mice as well as in humans. Other relevant endpoints such as proliferation markers, apoptosis markers, oxidative stress markers could be studied in the KD cell models. Alternatively, it would have been interesting to evaluate the overall effect of the siRNA by transcriptomics and check whether the modified gene expression leads to specific profiles characteristic of a certain moment of placental development in mice, or proportion of various cells in the human placentas. Without asking for further experiments, the authors should mention these limits in their discussion.

      We completely agree with this comment and are investigating each of our candidate genes in more detail in ongoing studies. As we have already learned that each gene is involved in different processes and pathways, we feel that these studies are out of the scope of the current paper. However, we have added this point to our discussion section:

      “However, more detailed analysis and process specific assays are needed. For example, future studies assessing each gene’s role in cell adhesion, cell-cell fusion, cell proliferation and cell apoptosis can be done to better understand their roles in placental development.”

      In sum, I feel that this paper provides an excellent dataset, but that the authors should make an additional effort of redaction to extract the most important conclusions of their paper. This would increase its impact for a wider public.

      Thank you. We have attempted to do so in the revised version.

      Reviewer #3 (Significance (Required)):

      The context is well introduced, but explanatory and synthesis sentences are missing at the end of each paragraph. I am relatively competent in bioinformatics methods, including deconvolution, and rather expert in cell biology. Therefore I feel comfortable to evaluate this paper.

      References:

      Brown, Laura D., and William W. Hay. 2016. “Impact of Placental Insufficiency on Fetal Skeletal Muscle Growth.” Molecular and cellular endocrinology 435: 69. /pmc/articles/PMC5014698/ (August 24, 2022).

      Cox, Brian et al. 2009. “Comparative Systems Biology of Human and Mouse as a Tool to Guide the Modeling of Human Placental Pathology.” Molecular Systems Biology 5: 279. /pmc/articles/PMC2710868/ (July 20, 2022).

      Dickinson, Mary E. et al. 2016. “High-Throughput Discovery of Novel Developmental Phenotypes.” Nature 537(7621): 508–14. https://www.nature.com/articles/nature19356 (July 20, 2022).

      Frew, Ian J. et al. 2003. “Generation and Analysis of Siah2 Mutant Mice.” Molecular and Cellular Biology 23(24): 9150. /pmc/articles/PMC309644/ (July 27, 2022).

      Gallardo, Miguel et al. 2015. “HnRNP K Is a Haploinsufficient Tumor Suppressor That Regulates Proliferation and Differentiation Programs in Hematologic Malignancies.” Cancer Cell 28(4): 486–99. http://www.cell.com/article/S1535610815003050/fulltext (August 24, 2022).

      Gormley, Matthew et al. 2021. “RNA Profiling of Laser Microdissected Human Trophoblast Subtypes at Mid-Gestation Reveals a Role for Cannabinoid Signaling in Invasion.” Development (Cambridge, England) 148(20). https://pubmed.ncbi.nlm.nih.gov/34557907/ (August 15, 2022).

      Guillemot, François et al. 1994. “Essential Role of Mash-2 in Extraembryonic Development.” Nature 371(6495): 333–36. https://www.nature.com/articles/371333a0 (December 21, 2021).

      Hemberger, Myriam, Courtney W. Hanna, and Wendy Dean. 2020. “Mechanisms of Early Placental Development in Mouse and Humans.” Nature Reviews Genetics 21(1): 27–43. http://dx.doi.org/10.1038/s41576-019-0169-4.

      Jeon, Hyun Yong et al. 2010. “Expression Patterns of Astrocyte Elevated Gene-1 (AEG-1) during Development of the Mouse Embryo.” Gene expression patterns : GEP 10(7–8): 361. /pmc/articles/PMC3165053/ (July 27, 2022).

      Jepsen, Kristen et al. 2007. “SMRT-Mediated Repression of an H3K27 Demethylase in Progression from Neural Stem Cell to Neuron.” Nature 450(7168): 415–19. https://www.nature.com/articles/nature06270 (July 27, 2022).

      Kuckenberg, Peter, Caroline Kubaczka, and Hubert Schorle. 2012. “The Role of Transcription Factor Tcfap2c/TFAP2C in Trophectoderm Development.” Reproductive BioMedicine Online 25(1): 12–20. http://www.rbmojournal.com/article/S1472648312001010/fulltext (December 21, 2021).

      McLean, Cory Y. et al. 2010. “GREAT Improves Functional Interpretation of Cis-Regulatory Regions.” Nature Biotechnology 28(5): 495–501. http://dx.doi.org/10.1038/nbt.1630.

      Meinhardt, Gudrun et al. 2020. “Pivotal Role of the Transcriptional Co-Activator YAP in Trophoblast Stemness of the Developing Human Placenta.” Proceedings of the National Academy of Sciences of the United States of America 117(24): 13562–70. https://www.ncbi.nlm.nih.gov/geo/ (April 8, 2022).

      Morey, Robert et al. 2021. “Transcriptomic Drivers of Differentiation, Maturation, and Polyploidy in Human Extravillous Trophoblast.” Frontiers in Cell and Developmental Biology 9: 2269.

      Naismith, Kendra, and Brian Cox. 2021. “Human Placental Gene Sets Improve Analysis of Placental Pathologies and Link Trophoblast and Cancer Invasion Genes.” Placenta 112: 9–15.

      Paul, Abhijit et al. 2021. “Exploring Gene Knockout Strategies to Identify Potential Drug Targets Using Genome-Scale Metabolic Models.” Scientific Reports 11(1): 1–13. https://www.nature.com/articles/s41598-020-80561-1 (July 27, 2022).

      Perez-Garcia, Vicente et al. 2018. “Placentation Defects Are Highly Prevalent in Embryonic Lethal Mouse Mutants.” Nature 555(7697): 463. /pmc/articles/PMC5866719/ (August 11, 2022).

      Qi, Jianfei et al. 2008. “The Ubiquitin Ligase Siah2 Regulates Tumorigenesis and Metastasis by HIF-Dependent and -Independent Pathways.” Proceedings of the National Academy of Sciences of the United States of America 105(43): 16713. /pmc/articles/PMC2575485/ (September 20, 2021).

      Qi, Jianfei, Hyungsoo Kim, Marzia Scortegagna, and Ze’ev A. Ronai. 2013. “Regulators and Effectors of Siah Ubiquitin Ligases.” Cell biochemistry and biophysics 67(1): 15. /pmc/articles/PMC3758783/ (July 27, 2022).

      Rivals, Isabelle, Léon Personnaz, Lieng Taing, and Marie-Claude Potier. 2007. “Databases and Ontologies Enrichment or Depletion of a GO Category within a Class of Genes: Which Test?” Bioinformatics 23(4): 401–7.

      Sheridan, Megan A. et al. 2021. “Characterization of Primary Models of Human Trophoblast.” Development (Cambridge, England) 148(21). /pmc/articles/PMC8602945/ (August 15, 2022).

      Si, Yaqing, Peng Liu, Pinghua Li, and Thomas P. Brutnell. 2014. “Model-Based Clustering for RNA-Seq Data.” Bioinformatics 30(2): 197–205. https://academic.oup.com/bioinformatics/article/30/2/197/217752 (July 18, 2022).

      Silva, Juneo F., and Rogéria Serakides. 2016. “Intrauterine Trophoblast Migration: A Comparative View of Humans and Rodents.” Cell Adhesion and Migration 10(1–2): 88–110. http://dx.doi.org/10.1080/19336918.2015.1120397.

      Soncin, Francesca et al. 2018. “Comparative Analysis of Mouse and Human Placentae across Gestation Reveals Species-Specific Regulators of Placental Development.” Development (Cambridge) 145(2).

      Soncin, Francesca, David Natale, and Mana M. Parast. 2015. “Signaling Pathways in Mouse and Human Trophoblast Differentiation: A Comparative Review.” Cellular and Molecular Life Sciences 72(7): 1291–1302.

      Spies, Daniel, Peter F. Renz, Tobias A. Beyer, and Constance Ciaudo. 2019. “Comparative Analysis of Differential Gene Expression Tools for RNA Sequencing Time Course Data.” Briefings in Bioinformatics 20(1): 288. /pmc/articles/PMC6357553/ (July 18, 2022).

      Sun, Huifang et al. 2006. “An E2F Binding-Deficient Rb1 Protein Partially Rescues Developmental Defects Associated with Rb1 Nullizygosity.” Molecular and Cellular Biology 26(4): 1527. /pmc/articles/PMC1367194/ (February 6, 2022).

      Sutton, Gavin J. et al. 2022. “Comprehensive Evaluation of Deconvolution Methods for Human Brain Gene Expression.” Nature Communications 13(1): 1–18. https://www.nature.com/articles/s41467-022-28655-4 (July 27, 2022).

      Varberg, Kaela M. et al. 2021. “ASCL2 Reciprocally Controls Key Trophoblast Lineage Decisions during Hemochorial Placenta Development.” Proceedings of the National Academy of Sciences of the United States of America 118(10). https://www.pnas.org/content/118/10/e2016517118 (December 21, 2021).

      Wang, Anyou, S. Claiborne Johnston, Joyce Chou, and Deborah Dean. 2010. “A Systemic Network for Chlamydia Pneumoniae Entry into Human Cells.” Journal of Bacteriology 192(11): 2809–15. https://journals.asm.org/doi/10.1128/JB.01462-09 (July 27, 2022).

      Wang, Anyou, Li Ren, and Hong Li. 2011. “A Systemic Network Triggered by Human Cytomegalovirus Entry.” Advances in Virology 2011.

      Wang, Niya et al. 2016. “Mathematical Modelling of Transcriptional Heterogeneity Identifies Novel Markers and Subpopulations in Complex Tissues.” Scientific Reports 6(1): 1–12. https://www.nature.com/articles/srep18909 (July 27, 2022).

      Watson, Erica D., and James C. Cross. 2005. “Development of Structures and Transport Functions in the Mouse Placenta.” Physiology 20(3): 180–93.

      Woods, Laura, Vicente Perez-Garcia, and Myriam Hemberger. 2018. “Regulation of Placental Development and Its Impact on Fetal Growth — New Insights From Mouse Models.” Frontiers in Endocrinology 9(September): 1–18.

      Zhong, Yi et al. 2013. “Digital Sorting of Complex Tissues for Cell Type-Specific Gene Expression Profiles.” BMC Bioinformatics 14(1): 1–10. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-89 (July 27, 2022).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Review of "Identifying novel regulators of placental development using time series transcriptomic data and network analyses."

      The authors present a detailed bioinformatic assessment of mouse developmental time series of the placenta. They apply current data mining and analysis methods to identify protein-centred networks that are likely enriched to specific cell types of the placenta. They then translate these findings to humans using statistical comparisons of human single-cell sequencing data of the placenta. Lastly, they use knock-down experiments to validate the conserved functional importance of the hub genes in the mouse protein networks in human cells. The strengths of this paper are the rigorous data mining methods and the functional translation to humans from mice. There are no critical weaknesses to the article. There is a blend of statistical analysis with anecdotal or hand curation from databases and the literature, but it is unclear if these curated findings are circumstantial or statistically meaningful. In the end, the hypothesis seems to hold in that 4/4 genes knocked down in the human cells gave a migration phenotype.

      Comments, questions, critique

      1. Given the translational aims of the paper, more introduction/discussion material on the comparative aspects of mice and humans is needed. Are giant cells and EVT the same? What are the cell equivalents that you are discovering? The Soncin et al. paper is cited, but I think underused. This publication contains time series data on mice and humans and could be used as external validation of clusters, networks, and other analyses. Other publications to consider for context are

      a) Cox B, et al. Mol Syst Biol 5: 279.

      b) Silva JF, Serakides R. 2016. Cell Adhes Migr 10: 88-110. (specifically discusses migration differences between the placentas of the two species)

      1. Clustering represented in Figure 1B, was this a supervised model? Why only three clusters? Did you specify that there would be three models and force each gene profile into one of the categories? How robust are the fits? A fitted model might be a better approach as you can specify the ideal models (early high, late high and mid-high), then determine each gene profile that fits each model and only assess those genes with a significant fit to the model. Forcing clustering to the three-model fit likely gives many poorly fitting profiles. While in the end, this works out, it may be due to applying other post hoc methods for gene enrichment, where noise distributes randomly.

      2. Several statements are made about the conservation of importance between mouse and human hub genes. For example, "We predict these highly expressed genes to be generally important for TB function and processes such as cell migration, a term associated with multiple timepoint specific networks (Figure 2A)." While your knock-down assay of migration results shows these hub genes to be necessary to humans, what do they mean to the mouse? You did not use mouse TSC to assess functional importance concurrently. You note a small number of genes as of known importance, "127 hub genes of which 16 have been annotated as having a role in placental development". Were the others knocked out but lack a developmental phenotype or not assessed? Are these functionally redundant in the mouse or not involved in the same processes between the species?

      3. In determining conservation between mouse and human networks, were only 1:1 orthologs examined or did you consider more complex 1:many mapping conditions between the two species?

      4. Should the migration assay be normalized to survival/adhesion? If 70,000 cells were seeded but had 50% cell death (or reduced adhesion), then it may appear to be poor migration. Should the migration be evaluated as a ratio of top to bottom cell densities to control for poor adhesion or survival?
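
      The normalization raised in the last point can be sketched as follows. This is only an illustration of the arithmetic: the function names and the cell counts are hypothetical, and the measurement used for the denominator (attached, viable cells) is an assumption rather than anything described in the manuscript.

```python
def migration_index(migrated_cells, attached_viable_cells):
    """Fraction of attached, viable cells that migrated (hypothetical metric)."""
    if attached_viable_cells <= 0:
        raise ValueError("attached_viable_cells must be positive")
    return migrated_cells / attached_viable_cells


def relative_migration(kd_migrated, kd_viable, ctrl_migrated, ctrl_viable):
    """Knock-down migration index expressed relative to the control index."""
    return migration_index(kd_migrated, kd_viable) / migration_index(
        ctrl_migrated, ctrl_viable
    )


# Illustrative numbers only: 70,000 cells seeded in both conditions, but the
# knock-down well retains half as many attached, viable cells as the control.
raw_ratio = 7_000 / 12_600  # looks like a migration defect
normalized = relative_migration(7_000, 35_000, 12_600, 63_000)  # no defect
print(f"raw ratio: {raw_ratio:.2f}, normalized ratio: {normalized:.2f}")
```

      With counts like these, an apparent drop in migration disappears after normalization, which is the scenario the question is meant to rule out.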

      Significance

      This significantly advances previous publications on this topic by functionally testing the discovered genes.

      This highlights an excellent data mining strategy for a developmental disease using mice and translating to humans.

      The audience is likely developmental biologists and reproductive specialists.

      My expertise is bioinformatics and developmental biology.

    1. But it’s with these weather worries that these manipulative scientists really give the game away. Urging us to use more wind power but complaining about all the hurricanes we keep having? They got us all to convert to solar power decades ago but keep whining about prolonged sunny spells? MAKE YOUR MINDS UP! Some of them even go so far as to say it’s climate change that’s causing forced migration of millions of people. But that’s clearly because everyone has solar cars and jetpacks and matter transporters now, so why would they stay in one place, with or without devastating environmental damage spurring them on. It’s all a bit convenient, isn’t it, all this palaver over climate change? Weird how 99.9999% of all scientists purportedly agree that it’s definitely happening and our most powerful quantum computers are certain to over a million decimal places that it’s our fault? Weird how they’re saying this now, at exactly the same time when they need all the volunteers they can get for the moon and Mars colonies. What’s more likely; that human industrial activity actually does lead to climate change, or that it’s all a massive meticulous centuries-long ruse to convince people that leaving Earth is a good idea? Obviously, it’s the latter. These scientists have no shame or respect. I can’t say I’m not tempted to go myself, though. I’d rather live on another planet, than on one where every aspect of your life is subject to rigorous scientific control. Nobody should have to put up with that crap.

      Overall it seems that climate change has affected the author or they are worried for others but others may say different. I truly don't think humans take full accountability for this but may play some part in it.

    1. An epistemic bubble is a social structure where insiders aren’t exposed to views on the outside. Despite the superficial similarity, epistemic bubbles and echo chambers work through entirely different mechanisms. In an echo chamber, inside members may have plenty of exposure to outside views, but outside voices have been undermined. Epistemic bubbles are structures of bad connectivity; echo chambers are structures of manipulated credence. In an epistemic bubble, outside voices aren’t heard; in an echo chamber, outside voices have been systematically discredited. Importantly, I’ve argued, many communities with problematic belief systems have been misdiagnosed as epistemic bubbles. But actually, they are mostly the result of echo chambers. It isn’t that climate change deniers, for example, are simply unaware of what climate change scientists think, or the standard publicly available arguments for climate change. They are, for the most part, quite well acquainted with those arguments and conclusions. It is that they think that the institutions of climate change science have been systematically corrupted and are untrustworthy. This helps to explain the intractability of climate change denialists. Since an epistemic bubble works through simply omitting outside voices, we should be able to shatter one simply by exposing an insider to more voices and more viewpoints. We should expect epistemic bubbles to go down with the first contact with the missing evidence. But echo chamber members are pre-prepared for encounters with external viewpoints and armed with explanatory mechanisms to dismiss those other voices. Echo chambers are far more robust.

      Epistemic bubble vs. echo chamber

    1. Author Response

      Reviewer #2 (Public Review):

      Silberberg et al. present a series of cryo-EM structures of the ATP dependent bacterial potassium importer KdpFABC, a protein that is inhibited by phosphorylation under high environmental K+ conditions. The aim of the study was to sample the protein's conformational landscape under active, non-phosphorylated and inhibited, phosphorylated (Ser162) conditions.

      Overall, the study presents 5 structures of phosphorylated wildtype protein (S162-P), 3 structures of phosphorylated 'dead' mutant (D307N, S162-P), and 2 structures of constitutively active, non-phosphorylatable protein (S162A).

      The true novelty and strength of this work is that 8 of the presented structures were obtained either under "turnover" or at least 'native' conditions without ATP, ie in the absence of any non-physiological substrate analogues or stabilising inhibitors. The remaining 2 were obtained in the presence of orthovanadate.

      Comparing the presented structures with previously published KdpFABC structures, there are 5 structural states that have not been reported before, namely an E1-P·ADP state, an E1-P tight state captured in the autoinhibited WT protein (with and without vanadate), and two different nucleotide-free 'apo' states and an E1·ATP early state.

      Of these new states, the 'tight' states are of particular interest, because they appear to be 'off-cycle', dead end states. A novelty lies in the finding that this tight conformation can exist both in nucleotide-free E1 (as seen in the published first KdpFABC crystal structure), and also in the phosphorylated E1-P intermediate.

      By EPR spectroscopy, the authors show that the nucleotide free 'tight' state readily converts into an active E1·ATP conformation when provided with nucleotide, leading to the conclusion that the E1-P·ADP state must be the true inhibitory species. This claim is supported by structural analysis supporting the hypothesis that the phosphorylation at Ser162 could stall the KdpB subunit in an E1P state unable to convert into E2P. This is further supported by the fact that the phosphorylated sample does not readily convert into an E2P state when exposed to vanadate, as would otherwise be expected.

      The structures are of medium resolution (3.1 - 7.4 Å), but the key sites of nucleotide binding and/or phosphorylation are reasonably well supported by the EM maps, with one exception: in the 'E1·ATP early' state determined under turnover conditions, I find the map for the gamma phosphate of ATP not overly convincing, leaving the question whether this could instead be a product-inhibited, Mg-ADP bound E1 state resulting from an accumulation of MgADP under the turnover conditions used. Overall, the manuscript is well written and carefully phrased, and it presents interesting novel findings, which expand our knowledge about the conformational landscape and regulatory mechanisms of the P-type ATPase family.

      We thank the reviewer for their comments and helpful insights. We have addressed the points as follows:

      However in my opinion there are the following weaknesses in the current version of the manuscript:

      1) A lack of quantification. The heart of this study is the comparison of the newly determined KdpFABC structures with previously published ones (of which there are already 10). Yet, there are no RMSD calculations to illustrate the magnitude of any structural deviations. Instead, the authors use phrases like 'similar but not identical to', 'has some similarities', 'virtually identical', 'significant differences'. This makes it very hard to appreciate the true level of novelty/deviation from known structures.

      This is a very valid point and we thank the reviewers for bringing it up. To provide a better overview and appreciation of conformational similarities and significant differences we have calculated RMSDs between all available structures of KdpFABC. They are summarised in the new Table 1 – Table Supplement 2. We have included individual RMSD values, whenever applicable and relevant, in the respective sections in the text and figures. We note that the RMSDs were calculated only between the cytosolic domains (KdpB N,A,P domains) after superimposition of the full-length protein on KdpA, which is rigid across all conformations of KdpFABC (see description in materials and methods lines 1184-1191 or the caption to Table 1 – Table Supplement 2). We opted to not indicate the RMSD calculated between the full-length proteins, as the largest part of the complex does not undergo large structural changes (see Figure 1 – Figure Supplement 1, the transmembrane region of KdpB as well as KdpA, KdpC and KdpF show relatively small to no rearrangements compared to the cytosolic domains), and would otherwise obscure the relevant RMSD differences discussed here.
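
      The comparison scheme described above (superimpose on a rigid anchor, then measure deviation in a different region) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes NumPy, matched atom order between the two structures, and uses the standard Kabsch least-squares fit; the function names are hypothetical.

```python
import numpy as np


def kabsch_transform(mobile, target):
    """Rotation R and translation t that best map `mobile` onto `target`.

    Both arrays are (N, 3) coordinates with matching atom order.
    """
    mob_center = mobile.mean(axis=0)
    tgt_center = target.mean(axis=0)
    # 3x3 covariance between the centered coordinate sets
    cov = (mobile - mob_center).T @ (target - tgt_center)
    u, _, vt = np.linalg.svd(cov)
    # Guard against an improper rotation (reflection)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = tgt_center - rot @ mob_center
    return rot, trans


def rmsd_after_anchor_fit(anchor_a, anchor_b, region_a, region_b):
    """Fit structure B onto A using anchor atoms only, then return the RMSD
    of the region atoms (for example, the anchor could be KdpA and the
    region the cytosolic domains of KdpB)."""
    rot, trans = kabsch_transform(anchor_b, anchor_a)
    region_b_fit = region_b @ rot.T + trans
    diff = region_a - region_b_fit
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

      Because the fit uses only the anchor atoms, any rigid-body movement of the region relative to the anchor shows up in the reported RMSD instead of being averaged out by a global superposition.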

      Also the decrease in EPR peak height of the E1 apo tight state between phosphorylated and non-phosphorylated sample - a key piece of supporting data - is not quantified.

      EPR distance distributions have been quantified by fitting and integrating a Gaussian distribution curve, and have been added to the corresponding results section (lines 523-542) and the methods section (lines 1230-1232).
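
      As a rough illustration of this kind of quantification, the sketch below estimates a single Gaussian peak within a chosen distance window and reports its integrated share of the whole distribution. It is not the authors' procedure: the moment-based estimate stands in for a least-squares fit, and the function name, the window bounds and the evenly spaced distance grid are all assumptions.

```python
import math


def quantify_peak(r, p, lo, hi):
    """Describe one peak of a distance distribution p(r) inside [lo, hi].

    r is an evenly spaced distance grid and p the distribution sampled on it.
    The Gaussian parameters come from the moments of the windowed data.
    """
    step = r[1] - r[0]
    window = [(ri, pi) for ri, pi in zip(r, p) if lo <= ri <= hi]
    area = sum(pi for _, pi in window) * step  # rectangle-rule integral
    mean = sum(ri * pi for ri, pi in window) * step / area
    var = sum((ri - mean) ** 2 * pi for ri, pi in window) * step / area
    sigma = math.sqrt(var)
    # Amplitude of a Gaussian with this sigma that encloses the same area
    amplitude = area / (sigma * math.sqrt(2.0 * math.pi))
    total = sum(p) * step
    return {
        "mean": mean,
        "sigma": sigma,
        "amplitude": amplitude,
        "fraction": area / total,
    }
```

      Comparing the `fraction` value for the same window in two samples gives a number for how much a given population shrinks or grows, which is the comparison the reviewer asked to see quantified.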

      2) Perhaps as a consequence of the above, there seems to be a slight tendency towards overstatements regarding the novelty of the findings in the context of previous structural studies. The E1-P·ATP tight structure is extremely similar to the previously published crystal structure (5MRW), but it took me three reads through the paper and a structural superposition (overall RMSD less than 2Å), to realise that. While I do see that the existing differences, the two helix shifts in the P- and A- domains - are important and do probably permit the usage of the term 'novel conformation' (I don't think there is a clear consensus on what level of change defines a novel conformation), it could have been made more clear that the 'tight' arrangement of domains has actually been reported before, only it was not termed 'tight'.

      As indicated above, we have now included an extensive RMSD table covering all available KdpFABC structures. To ensure a meaningful comparison, the RMSDs are calculated only between the cytosolic domains after superimposition of the full-length protein on KdpA, as the transmembrane region of KdpFABC is largely rigid (see figure below panel B). However, we have to note that in the X-ray structure the transmembrane region of KdpB is displaced relative to the rest of the complex when compared to the arrangement found in any of the other 18 cryo-EM structures, which all align well in the TMD (see figure below panel C). These deviations make the crystal structure somewhat of an outlier and might be a consequence of the crystal packing (see figure below panel A). For completeness in our comparison with the X-ray structure, we have included an RMSD calculated after superimposition on KdpA and an additional RMSD calculated after alignment on the TMD of KdpB (see figure below panel D,E). The RMSD of less than 2 Å that the reviewer mentions was probably obtained by superimposing the entire complexes on each other (see figure below panel F). However, we do not believe that this is a reasonable comparison, as the TMD of the complex is significantly displaced, which stands in strong contrast to all other RMSDs calculated between the rest of the structures, where the TMD aligns well (see figure below panel B).

      From the resulting comparisons, we conclude that the E1P-tight and the X-ray structure do have a certain similarity but are not identical, in particular in the relative orientation of the cytosolic domains to the rest of the complex. We hope that including the RMSD in the text and separately highlighting the important features of the E1P tight state in the section “E1P tight is the consequence of an impaired E1P/E2P transition” makes the story more conclusive.

      Likewise, the authors claim that they have covered the entire conformational cycle with their 10 structures, but this is actually not correct, as there is no representative of an E2 state or functional E1P state after ADP release.

      This is correct, and we have adjusted the phrasing to “close to the entire conformational cycle” or “the entire KdpFABC conformational cycle except the highly transient E1P state after ADP release and E2 state after dephosphorylation.”

      3) A key hypothesis this paper suggests is that KdpFABC cannot undergo the transition from E1P tight to E2P and hence gets stuck in this dead end 'off cycle' state. To test this, the authors analysed an S162-P sample supplied with the E2P inducing inhibitor orthovanadate and found about 11% of particles in an E2P conformation. This is rationalised as a residual fraction of unphosphorylated, non-inhibited, protein in the sample, but the sample is not actually tested for residual unphosphorylated fraction or residual activity. Instead, there is a reference to Sweet et al, 2020. So the claim that the 11% E2P particles in the vanadate sample are irrelevant, whereas the 14% E1P tight from the turnover dataset are of key importance, would strongly benefit from some additional validation.

      We have added an ATPase assay that shows the residual ATPase activity of WT KdpFABC compared to KdpFABS162AC, both purified from E. coli LB2003 cells, which is identical to the protein production and purification for the cryo-EM samples (see Figure 2-Suppl. Figure 5). The residual ATPase activity is ca. 14% of the uninhibited sample, which correlates with the E2-P fraction in the orthovanadate sample.

      Reviewer #3 (Public Review):

      The authors have determined a range of conformations of the high-affinity prokaryotic K+ uptake system KdpFABC, and demonstrate at least two novel states that shed further light on the structure and function of these elusive protein complexes.

      The manuscript is well-written and easy to follow. The introduction puts the work in a proper context and highlights gaps in the field. I am however missing an overview of the currently available structures/states of KdpFABC. This could also be implemented in Fig. 6 (highlighting new vs available data). This is also connected to one of my main remarks - the lack of comparisons and RMSD estimates to available structures. Similarity/resemblance to available structures is indicated several times throughout the manuscript, but this is not quantified or shown in detail, and hence it is difficult for the reader to grasp how unique or alike the structures are. Linked to this, I am somewhat surprised by the lack of considerable changes within the TM domain and the overlapping connectivity of the K indicated in Table 1 - Figure Supplement 1. According to Fig. 6 the uptake pathway should be open in early E1 states, but not in E2 states, in contrast to Table 1 - Figure Supplement 1, which shows connectivity in all structures? Furthermore, the release pathway (to the inside) should be open in the E2-P conformation, but no release pathway is shown as K ions in any of the structures in Table 1 - Figure Supplement 1. Overall, there seem to be rather small shifts between the shown structures (are the structures changing from closed to inward-open?). Or is it only KdpA that is shown?

      We thank the reviewer for their positive response and constructive criticisms. We have addressed these comments as follows:

      1. The overview of the available structures has been implemented in Fig. 6, with the new structures from this study highlighted in bold.

      2. RMSD values have been added to all comparisons, with a focus on the deviations of the cytosolic domains, which are most relevant to our conformational assignments and discussions.

      3. To highlight the (comparatively small) changes in the TMD, we have expanded Table 1 - Figure Supplement 1 to include panels showing the outward-open half-channel in the E1 states with a constriction at the KdpA/KdpB interface and the inward-open half-channel in the E2 states. The largest observable rearrangements do, however, take place in the cytosolic domains. This is in full agreement with previous studies, which focused more on the transition occurring within the transmembrane region during the transport cycle (Stock et al., Nature Communications 2018; Silberberg et al., Nature Communications 2021; Sweet et al., PNAS 2021).

      4. The ions observed in the intersubunit tunnel are all before the point at which the tunnel closes, explaining why there is no difference in this region between E1 and E2 structures. Moreover, as we discussed in our last publication (Silberberg, Corey, Hielkema et al., 2021, Nat. Comms.), the assignment of non-protein densities along the entire length of the tunnel is contentious and can only be certain in the selectivity filter of KdpA and the CBS of KdpB.

      5. The release pathway from the CBS does not feature any defined K+ coordination sites, so ions are not expected to stay bound along this inward-open half-channel.

      My second key remark concerns the "E1-P tight is the consequence of an impaired E1-P/E2-P transition" section, and the associated discussion, which is very interesting. I am not convinced though that the nucleotide and phosphate mimic-stabilized states (such as E1-P:ADP) represent the high-energy E1P state, as I believe is indicated in the text. Supportive of this, in SERCA, the shifts from the E1:ATP to the E1P:ADP structures are modest, while the following high-energy Ca-bound E1P and E2P states remain elusive (see Fig. 1 in PMID: 32219166, from 3N8G to 3BA6). Or maybe this is not what the authors claim, or the situation is different for KdpFABC? Associated, while I agree with the statement in rows 234-237 (that the authors likely have caught an off-cycle state), I wonder if the tight E1-P configuration could relate to the elusive high-energy states (although initially counter-intuitive as it has been caught in the structure)? The claims on rows 358-360 and 420-422 are not in conflict with such an idea, and the authors touch on this subject on rows 436-450. Can it be excluded that it is the proper elusive E1P state? If the state is related to the E1P conformation it may well have bearing also on other P-type ATPases and this could be expanded upon.

      This is a good point, particularly since the E1P·ADP state is the most populated state in our sample, which is also counterintuitive for a “high-energy unstable state”. One possible explanation is that this state already has some of the E1-P strains (which we can see in the clash of D307-P with D518/D522), but the ADP and its associated Mg2+ in particular help to stabilize this. Once ADP dissociates and takes the Mg2+ with it, the full destabilization takes effect in the actual high-energy E1P state. Nonetheless, we consider it fair to compare the E1P tight with the E1P·ADP to look for electrostatic relaxation. We have clarified the sequence of events and the role we hypothesize for ADP/Mg2+ in stabilizing the E1P·ADP state that we can see (lines 609-619): “Moreover, a comparison of the E1P tight structure with the E1P·ADP structure, its most immediate precursor in the conformational cycle obtained, reveals a number of significant rearrangements within the P domain (Figure 5B,C). First, Helix 6 (KdpB538-545) is partially unwound and has moved away from helix 5 towards the A domain, alongside the tilting of helix 4 of the A domain (Figure 5B,C – arrow 2). Second, and of particular interest, are the additional local changes that occur in the immediate vicinity of the phosphorylated KdpBD307. In the E1P·ADP structure, the catalytic aspartyl phosphate, located in the D307KTG signature motif, points towards the negatively charged KdpBD518/D522. This strain is likely to become even more unfavorable once ADP dissociates in the E1P state, as the Mg2+ associated with the ADP partially shields these clashes. The ensuing repulsion might serve as a driving force for the system to relax into the E2 state in the catalytic cycle.”

      We believe it is highly unlikely that the reported E1-P tight state represents an on-cycle high-energy E1P intermediate. For one, we observe a relaxation of electrostatic strains in this structure, in particular when compared to the E1P·ADP state we obtained. By contrast, the E1P should be the most energetically unfavourable state possible to ensure the rapid transition to the E2P state. As such, this state should be a transient state, making it less likely to be obtainable structurally as an accumulated state. Additionally, the association of the N domain with the A domain in the tight conformation, which would have to be reverted, would be a surprising intermediary step in the transition from E1P to E2P. Altogether, the E1P tight state reported here most likely represents an off-cycle state.

    1. His excursions may be more enjoyable if he can reacquire the privilege of forgetting the manifold things he does not need to have immediately at hand, with some assurance that he can find them again if they prove important.

      hey that's what I said at the beginning of this piece

    2. if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so he can be profligate and enter material freely.

      It's interesting that this machine focuses on retrieval of a person's personal memories, whereas we're more concerned with retrieving other people's ideas from the internet and from archives

    3. With machines for advanced analysis no such situation existed; for there was and is no extensive market

      machines for advanced analysis forced their way into the extensive market by becoming more familiar and user-friendly, but they are still essentially the same machine

    4. relegated to the machine.

      this is interesting in thinking about current ideas about what should be relegated to machines-- thinking about the debates around whether or not AI can really create art