6,971 Matching Annotations
  1. Sep 2022
    1. McConnell said it’s up to the Republican candidates in various Senate battleground races to explain how they view the hot-button issue. “I think every Republican senator running this year in these contested races has an answer as to how they feel about the issue and it may be different in different states. So I leave it up to our candidates who are quite capable of handling this issue to determine for them what their response is,” he said.

      Context: Lindsey Graham had just proposed a bill for a nationwide abortion ban after 15 weeks of pregnancy.

      McConnell's position seems to be that choice about abortion is an option, but one reserved for white men with power over others. This is painful because that choice is being left to people who lack the information and nuance about specific circumstances, rather than to the pregnant women themselves, potentially in consultation with their doctors, who have broad and specific training and experience in the issues at hand. Why are these leaders attempting to make decisions based on possibilities rather than realities, particularly when they have not properly studied, and are generally not aware of, any of those realities?

      If this is McConnell's true position, then why not punt the decision and choices down to the people directly impacted? And isn't this a long running tenet of the Republican Party to allow greater individual freedoms? Isn't their broad philosophy: individual > state government > national government? (At least with respect to internal, domestic matters; in international matters the opposite relationships seem to dominate.)

      tl;dr: Mitch McConnell believes in choice, just not in your choice.

      Here's the actual audio from a similar NPR story: https://ondemand.npr.org/anon.npr-mp3/npr/me/2022/09/20220914_me_gop_sen_lindsey_graham_introduces_15-week_abortion_ban_in_the_senate.mp3#t=206


      McConnell is also practicing the Republican party game of "do as I say and not as I do" on Graham directly. He's practicing this sort of hypocrisy because as leadership, he's desperately worried that this move will decimate the Republican Party in the midterm elections.

      There's also another reading of McConnell's statement. Viewed as a statement from leadership, there's a form of omerta or silent threat being communicated here to the general Republican Party membership: you better fall in line on the party line here because otherwise we run the risk of losing power. He's saying he's leaving it up to them individually, but in reality, as the owner of the purse strings, he's not.


      Thesis: The broadest distinction between American political parties right now seems to be that the Republican Party wants to practice fascistic forms of "power over" while the Democratic Party wants to practice more democratic forms of "power with".

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      This is a revision plan; the manuscript has not been modified yet, as it is being transferred to a journal.

      * Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study proposes (and uses) an elegant model of bacterial evolution to study how division of labor can emerge through the interaction between non-random mutations (occurring at some specific "fragile" genomic sites) and genome architecture. The study is very interesting and the results are convincing. My main concerns are about the presentation of the model and results. Although I am confident about the results, some elements should be clarified for a better understanding and for a correct interpretation of the results. Two points in particular (detailed below as major comments) require clarification.

      Major comments:

      • the notion of telomere/centromere is used all throughout the paper but I think it is used in a misleading way. First, it seems that here there is only one telomere (but this is actually a detail of the model). More importantly, as far as I know, it is well known that in S. coelicolor the sequence degenerates more rapidly when getting closer to the telomeres (but telomeres are defined independently from this property). But here, the notion of telomere is directly determined by its mutational instability (respectively, the centromere is defined by its stability). Although this is reasonable given the objective of the model, it forbids the use of sentences like "we observed that the genome of the evolved colony founded had two distinct regions: a telomeric [...] and a centromeric [...]" (line 234) or "When bacteria divide, mutations induced at fragile sites lead to the deletion of the part of the genome distal to them, causing large telomeric deletions" (line 239; this is not a result but a hidden description of the model), as this distinction between the two regions is not an outcome of the simulation but rather given a priori as a coded property of the fragile sites, which all lead to deletions on the same side, called telomeric (of course, formally, if the genome contains no fragile site there is no distinction, but still). Please clarify this in the main text and in the methods. *

      Authors' response (AR, in the following): we agree with the reviewer that the directionality of the deletions determines centromere and telomere in our model (and the reviewer is correct that we only consider one arm of the chromosome). We will explicitly state both in the main text and in the methods that the model does not include any explicit centromeric and telomeric structure, and that the polarity of the genetic information (and thus centromere and telomere) depends on the choice of directionality of the deletions.

      - In most parts of the paper (methods, results, figures, sup mat...) antibiotics are considered to have a concentration (or a high/low production) but at least twice in the text (lines 165 and 488) it is said that only the presence/absence of antibiotics is modelled. I was not able to understand how the continuous values are transformed into presence/absence (is there a threshold?) but more importantly, I strongly suspect that this choice has a strong influence on the outcome. For instance, with a diffusion radius equal to 10, it means that an antibiotic-producing cell is able to protect 2*\pi*10=~60 replicating cells. Hence, one could conjecture that the fraction of antibiotic-producing mutants should be a little more than 2%... which is what is observed by the authors. So (1) please clarify this point and (2) discuss (or test with experiments) the consequences of this choice for the conclusions.

      AR: the reviewer is correct that antibiotics are modelled as presence/absence – this was done for computational efficiency. However, the probability that a bacterium deposits an antibiotic at a site within the deposition radius is a continuous number, as it depends on the number of antibiotic genes and growth genes. We will make this clear in the main text and in the methods.

      Secondly, we show the effect of varying the deposition radius on the evolutionary dynamics in Supplementary Section S17. We will make this clear in the main text. For the area covered by different radii of antibiotic deposition, please see below.

      Minor comments:

      - line 262: "We conclude that genome architecture is a key prerequisite for the maintenance of mutation-driven division of labor". Given the model hypotheses you cannot be so affirmative (it is a key prerequisite... in this model!)

      AR: we will modify the statement as suggested.

      - line 286: "cannot" is probably too strong. It has not been observed...

      AR: we will modify the statement as suggested.

      - line 288 and following: you seem to consider that there is "selection for diversity". Given the large number of possible antibiotics and given that cells are "automatically" resistant to the antibiotics they produce, could it be simply drift? There is a clear selection pressure to limit the number of growth-promoting genes but no such pressure exists for antibiotics. Hence their number could simply drift (note that figs 2 and SF1 both use a log scale; random variations due to drift could be hidden by the log. Fig. SF2 does use a log scale and shows dynamics that, to my eyes, argue for drift rather than for selection for diversity).

      AR: we agree with the reviewer that drift might contribute to the overall antibiotic diversity. This might be especially true for the antibiotic genes residing downstream of the fragile sites, which have a low probability of expression in the wild-type (because of the many growth genes) and are deleted in the mutants. Duplications, deletions and modifications of these genes are effectively neutral, and are therefore likely subject to drift. We will include this discussion in the main text. However, bacteria are highly susceptible to the diverse antibiotics produced by other colonies (i.e. those produced, largely, by the mutants). These antibiotics and their diversity drive colony invasion and are thus under selection. The overall number and diversity of antibiotics is therefore, at least in part, under selection.

      - line 340: "ends" should be "end" when discussing the model

      - line 345: "a telomeric region" should be "telomeric regions" when discussing the bacteria

      - line 359: "S. ambofaciens" should be italic

      - line 365: same for "Streptomyces"

      AR: we will modify the statement as suggested (and thank the reviewer for carefully reading the text).

      - line 245 states that colonies begin clonally but methods (lines 434-438) don't support this. Colonies don't begin clonally but they begin without antibiotic-producing spores (see also line 618)

      AR: we agree with the reviewer that colonies are not specifically initialised as clonal. We will modify the sentence as: By this process colonies eventually evolve to become functionally differentiated throughout the growth cycle.

      - line 442: "their" should be "its"

      - line 446: "hotspot for recombination" no, for "deletion"

      - line 449: please remove brackets around the reference.

      AR: we will modify the statement as suggested.

      - line 458: if I understood it correctly, there is no explicit competition in the model. Competition simply comes from the asynchronous replication. Am I right? Could you clarify that point?

      AR: The reviewer is correct that, through asynchronous updating, only one focal lattice site is updated at a time. However, if a site is empty, the bacteria surrounding it compete for it based on their replication rate k_replication. Dividing by the neighbourhood size (eta) simply ensures that a bacterium surrounded by a completely empty neighborhood replicates on average alpha_g times (alpha_g being the max growth rate). We will mention this in the methods.
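
The role of the division by eta can be checked with a one-line expectation. This is only a sketch under an assumption that the reply does not state explicitly: each of the eta empty neighbouring sites is updated once per time step, and on each such update the focal bacterium claims the site with probability alpha_g / eta.

```latex
\mathbb{E}[\text{replications per time step}]
  = \sum_{i=1}^{\eta} \frac{\alpha_g}{\eta}
  = \eta \cdot \frac{\alpha_g}{\eta}
  = \alpha_g
```

Under that assumption the expected number of replications does not depend on the neighbourhood size, which is the property the reply describes.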

      - line 490: "the antibiotic deposited is chosen randomly and uniformly among them". This is not fully clear. I suppose the bacterium is still resistant to all the antibiotics it can produce?

      AR: Yes. This is mentioned in the methods section “Replication”.

      - figure SF1: please use the same scales as in figure 2 such that the two plots can be easily compared

      AR: we will modify the x-axis to include the number of growth cycles.

      - section S3 and figure SF4: What is to be understood from the figure is not clear to me. Seems that WTs win only if generalists produce less AB or replicate slower (?) Is it true?

      AR: The reviewer is correct. In other words: when the artificial generalist has the same replication rate and the same antibiotic production rate as the WT, then the competition experiment ends with a near draw (the generalist still wins, but slowly). This means that the fitness cost associated with division of labor (i.e. with having two cell types doing the same work as one generalist) is small.

      We will include this description in the section.

      The figure is unfortunately complicated by the fact that we do not know a-priori how high the effective antibiotic production rate is (because antibiotics are spatially distributed by the stochastically generated mutants) – and so we had to make a large parameter screen to figure out the parameter values for which the competition experiment made most sense.

      - I found it very difficult to draw conclusions from sections S4, S5 and S6. These experiments should be analyzed with the help of mathematical analyses of the equations. Moreover, understanding these results is made difficult by the lack of clarity regarding the discrete (or not) nature of the antibiotic production/action/diffusion

      AR: We hope that we have clarified the distinction between antibiotic production rate and antibiotic presence/absence in the lattice.

      The model is not analytically tractable, which makes it difficult to make exact statements based on the equations that govern it. However, we can check that the model is robust, and identify regions of parameter space where the model behaves in a qualitatively similar way to the main text results.

      Sections S4, S5 and S6 are essentially parameter screens to verify that the model reproduces the results reported in the main text for a broad range of parameters. The primary conclusion that can be drawn is that the model is robust to parameter changes.

      Section S4 explores the model's robustness to changes in two key parameters: beta_g, the antibiotic inhibition due to growth genes, and h_g, the number of growth genes that produces the half-maximum growth rate. Section S5 further analyses the relation between these parameters, and how they together determine the strength of the trade-off. Section S6, finally, shows that a strong trade-off is not a necessary requirement for the evolution of division of labor, as the division also depends (in a counterintuitive way) on the parameter alpha_a, the maximum antibiotic production rate.

      We will include and expand these summarizing statements in each section, to make clear what each section achieves.

      - S7 and fig SF9. It is unclear to me why the fraction of mutants decreases with the time elapsed in the cycle. Please explain.

      AR: The reason is that not all mutants are born with the same number of antibiotic genes (Fig. 3A). A mutant with fewer antibiotic genes might be susceptible to some of the antibiotics produced by another mutant, and could be killed by these antibiotics. Once a mutant is killed in the inner colony, a WT will replicate to fill the spot, and likely a WT offspring will take that site rather than another mutant. Thus there is a decline in the overall mutant population.

      We will include this discussion in Section S7.

      - Figure SF14: what are the thin lines? If they correspond to the five repeats, how can the bold line be the median?

      AR: we realise that the caption should be clearer. Each of the five lines (both bold and thin) in each panel represents the median number of genetic elements over time. The bold line just highlights one randomly chosen simulation (the same for each genetic element), to better guide the eye.

      We will clarify the caption of the figure.

      - S13 and figure SF15: given that AB concentration is ON/OFF, is this result really surprising? This also raises questions about the accumulation of AB genes in the original model. Although the authors regularly claim that this is due to selection for diversity, drift could also be at play (see above)

      AR: As mentioned above, we agree with the reviewer and we will mention that drift may co-determine antibiotic gene accumulation.

      - S17: for radius 1, 2 and 3, the aliasing is likely to be strong. Hence, the results cannot be interpreted with this sole information. Please give e.g. how many cells are "protected" for each radius (e.g. for r_{alpha}=1, this value can vary between 1 and 9!)

      AR: for radius = 1, 2, 3, 5, 8, 10 the area covered by antibiotic production is respectively 5, 13, 29, 81, 197, 317. We will include this information in the figure.
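
The areas quoted above can be reproduced by counting square-lattice sites inside a disc. The sketch below assumes that a site is covered when its Euclidean distance from the producer is at most the radius, and that the producer's own site is included; the function name `sites_within_radius` is ours, not taken from the authors' code.

```python
def sites_within_radius(radius: int) -> int:
    """Count square-lattice sites at Euclidean distance <= radius from the origin."""
    r2 = radius * radius
    return sum(
        1
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if dx * dx + dy * dy <= r2
    )


radii = [1, 2, 3, 5, 8, 10]
areas = [sites_within_radius(r) for r in radii]
print(areas)  # [5, 13, 29, 81, 197, 317]
```

For radius 10 this gives 317 sites, close to the continuous estimate pi * 10^2 (about 314) and well above the perimeter-based figure of about 60.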

      - L742: "matching the antibiotic bitstring with the bitstring of the antibiotic". True and actually elegant but simpler formulation could ease the reading...

      AR: We will change the sentence as follows: “Both antibiotics and antibiotic genes are characterised by a bitstring, which determines their type. Antibiotic resistance in the model is determined by matching these two strings.”
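
A minimal sketch of what bitstring matching could look like follows. The reply does not spell out the matching rule, so the rule used here (a cell is resistant when one of its antibiotic genes differs from the antibiotic in at most `max_mismatch` bits) is purely an illustrative assumption, as are the names `hamming`, `is_resistant` and `max_mismatch`. The 16-bit length is the one the authors give elsewhere in this reply.

```python
from typing import Iterable

N_BITS = 16  # bitstring length given by the authors; 2**16 possible types


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two bitstrings (stored as ints) differ."""
    return bin(a ^ b).count("1")


def is_resistant(antibiotic: int, genes: Iterable[int], max_mismatch: int = 0) -> bool:
    """Hypothetical rule: resistant if any gene is within max_mismatch bits."""
    return any(hamming(antibiotic, gene) <= max_mismatch for gene in genes)


# Hypothetical genome carrying two antibiotic genes.
genome = [0b1010101010101010, 0b0000111100001111]

exact = is_resistant(0b1010101010101010, genome)
one_bit_off = is_resistant(0b1010101010101011, genome)
tolerant = is_resistant(0b1010101010101011, genome, max_mismatch=1)
print(2 ** N_BITS, exact, one_bit_off, tolerant)  # 65536 True False True
```

With `max_mismatch=0` the sketch reduces to exact matching, which also covers the stated rule that a bacterium resists every antibiotic it can produce.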

      - lines 746-751 and figure SF21: There again, could it be a consequence of the AB ON/OFF diffusion model?

      AR: we agree with the reviewer that a continuous diffusion model could affect resistance to antibiotics. We expect that the main effect will come from some antibiotics having different concentrations. For instance, we could have a situation in which many deleterious antibiotics are produced in small amounts, but have a compounding effect on the susceptible bacterium. This finer model of antibiotic production, diffusion and killing was not included in the model to limit the computational load.

      - S18-S19-S20: what should the reader understand from these results? Please better comment the figures.

      AR: we agree that the figures in Sections S18, S19 and S20 could have more descriptive captions. Sections S18, S19 and S20 are parameter screens to check that the model is robust to changes in the mutation rates affecting fragile site activation and de-novo formation. The primary result of Section S18 is that division of labor evolves over a broad range of fragile site activation rates and de-novo fragile site formation rates (and even when these parameters are decreased by one order of magnitude).

      Section S19 shows how these combinations of parameters result in quantitative changes in genome composition.

      Section S20 shows that the de-novo fragile site formation rate can be zero: as long as the system is initialised with genomes that can divide labor, the fragile sites will persist even though no new ones are generated.

      • CROSS-CONSULTATION COMMENTS Sorry about the confusion about the computation of the number of cells protected by a single AB-producing cell. Of course it is of the order of \pi*10^2! The global argument still holds, but the number of cells protected is of course larger than 60 (note that, due to aliasing at the periphery, the exact number of cells in the protected area is difficult to determine). *

      Author response: We hope the clarifications mentioned above answer the reviewer’s comment.

      * Reviewer #1 (Significance (Required)):

      First, and very importantly, I must say that I am not familiar with the biological model (Streptomyces coelicolor). So I am not fully able to judge the biological significance of this research (i.e. whether the way division of labor is achieved here sheds light, or not, on the biology of this bacterium). However, on the computational side, the model and the results (as they are summarized in the conclusion) are very interesting on their own and deserve publication.

      Remark: a lot of supplementary results are added to the paper that are not fully explained or analysed. Please better discuss all these results and their significance. *

      AR: we will extensively check and add detail to the supplementary material, ensuring that results are fully explained (see also response to reviewer 1).*

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript "Evolution of genome fragility enables microbial division of labor" presents a model of genetically-based division of labour in bacterial colonies. It is postulated that two essential processes, growth and the production of antibiotics (important for the elimination of competitors), are poorly compatible in a single cell. The cell specialization that benefits the colony is assumed to be determined only by genetic differences that appear via deletions of growth-promoting loci. These deletions and the production of various antibiotics are mediated by a rather elaborate genetic architecture, which includes position-sensitive "fragile" sites and mutable antibiotic and growth-promoting genes. The model produces the rather predictable result that, under sufficiently strong incompatibility between growth and antibiotic production, long-term evolution results in the formation of a mosaic of colonies, each specialized in the production of its specific set of antibiotics. Such production is facilitated by evolving rapidly mutable genomes that constantly generate non-reproducing antibiotic-pumping cells.

      The model appears very thoroughly developed and analyzed, and all major conclusions are intuitively appealing. Overall, the manuscript reads as a well-written quantitative proof of principle of genetically-based division of labour between bacterial cells. The only part of the model that I'm a bit sceptical about is the unwarranted complexity of the genetic architecture. Unless the introduction of "fragile" sites and the directional ordering of genes is strongly justified by empirical data, a simpler and clearer assumption about mutational incapacitation of growth genes would suffice to reproduce the predicted phenomenology. So adding such empirical evidence would boost the relevance of the genetic part of the model. In the present form, all observed adaptations are inevitable simply because the expected division of labour will not evolve without each of them due to the design of the model. *

      AR: We agree with the reviewer that a simpler model with a predetermined effect of mutations, such as to incapacitate the growth genes, would suffice to reproduce the phenomenology of the mutation-driven division of labor observed in Streptomyces. Adding the complexity of a genome architecture introduces one more hypothesis: that genome fragility can evolve to organize the division of labor. This hypothesis, supported by the results presented here, can be tested experimentally.

      However, there is already some empirical support for our modelling choices: 1) mutation rates along the genome of Streptomyces are highly heterogeneous, 2) the genetic content is partitioned along the chromosome so that some genes are preferentially located in the mutationally quiet centromere, and others are in the mutationally active (sub)telomeric regions, 3) some cis genetic elements in Streptomyces genomes readily recombine to produce large-scale duplications and deletions (which we heavily simplified in the model as deletion-inducing fragile sites).

      We will extend the introduction to include the references for the empirical support for our model.

      * A couple of minor comments...

      Lines 217-218: "This is achieved when fewer growth-promoting genes are required to inhibit antibiotic production (i.e. lower βg)." Shouldn't it be "larger \beta_g"? *

      AR: yes. Thanks for catching this!

      Whether in the main text or Supplementary materials, it would help to add a complete population dynamics equation with all gain and loss terms.

      AR: we agree with the reviewer that it would be interesting to obtain a comprehensive population dynamics equation that captures the spatial dynamics of replication, mutation, and antibiotic production, causing colony formation and between-colony competition. However, deriving such an equation would be a very big effort in itself, and we suspect that it would not be analytically tractable. Because of this, we prefer the “procedural” model description we gave, which also mirrors the model implementation (see github repository at github.com/escolizzi/strepto2).

      * "Strikingly, we find the opposite: division of labor evolves when bacteria produce fewer overall antibiotics (lower αa), under shallow trade-off conditions (h_g β_g = 5; see Suppl. Section S6)." (text running through lines 224-225)

      I don't see why it is "striking". It seems perfectly explicable that a smaller \alpha requires more dedication to antibiotic production, thus favouring specialization. *

      AR: we agree that we have not conveyed why we found this result surprising. We have set the trade-off shallow enough (h_g beta_g = 5) that the generalist wins when alpha_g = 1. In addition, lowering alpha_a makes the benefit of creating a mutant smaller, because a highly specialised mutant with zero growth genes makes fewer antibiotics. A generalist is proportionally less affected. Intuitively, we have compounded two benefits for the generalist.

      But division of labor evolves, outcompeting the generalist – which surprised us.

      We will modify the paragraph to better explain what we expected, and we will tone down the wording, removing the word “strikingly”.

      *Reviewer #2 (Significance (Required)):

      Due to my relative lack of familiarity with the literature on evolution of genetically-based division of labour, I would rather not comment on the degree of innovation of the manuscript.

      The text is well written and is accessible to a wide readership, so it could be recommended to a general biological or evolutionary journal.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this manuscript the authors explore the co-evolution of genomic architecture and division of labour in antibiotic production, in a model inspired by the bacterium Streptomyces coelicolor. In the model a genetic trade-off is implemented where having a large number of growth-promoting genes (and thus fast growth) leads to a low production of antibiotics. On the other hand, having fewer growth-promoting genes allows for a higher production of antibiotics. This trade-off selects for a division of labour, where one subpopulation specializes in antibiotic production and another subpopulation specializes in reproduction. This division of labour is achieved by evolving the genome structure, so that growth-promoting genes are clustered together, separated from the rest of the genome by several fragile sites (sites that allow for large deletions). This allows a single mutational event to delete a large number of growth-promoting genes, which creates a cell that lacks growth genes and thus has a high antibiotic production (a cell specializing in antibiotic production). In other words, the genome structure evolves to shape evolvability, so as to allow cells with a high growth rate to rapidly and repeatably evolve/mutate into cells with a high antibiotic production. This creates a division of labour where a part of the population specializes in growth/reproduction and another part specializes in antibiotic production. This model provides a tangible mechanism to explain a similar division of labour observed in S. coelicolor. This mechanism also fits well with the large deletions observed in antibiotic-hyperproducing S. coelicolor cells, which are also repeatably generated during colony growth.

      Major comments: -Line 69, It would be good to give a bit more information here on the (number of) different types of antibiotics produced by S. coelicolor, to help the reader understand some of the modelling choices later on, such as allowing for the evolution of a large number (16 or higher if I understand correctly) of different antibiotics and a cell automatically being resistant to all antibiotics it produces (instead of having separate resistance genes). *

      AR: we agree with the reviewer that adding this information would put the model more in focus. The total number of antibiotics that can be produced by the genus Streptomyces has been estimated to be of the order of 100000 (ten to the fifth, [Watve et al., 2001]). Although we use S. coelicolor as the reference model organism for our computational model, we simulate long-term evolutionary dynamics that diversify the antibiotic repertoire. Each antibiotic is represented by a 16-bit string, meaning that there are 2^16 (= 65536) possible antibiotics in the system, consistent with the number of possible antibiotics in the genus.

      This being said, our model genomes evolve to have many more antibiotic genes than typical Streptomyces. Each species in the genus has up to 30 biosynthetic gene clusters [Genilloud, O. (2014)], a fraction of which make antibiotics. We discuss this discrepancy and propose solutions for this in the Discussion (also see below).

      Regarding the possibility of separating antibiotic resistance from antibiotic synthesis: we (and most literature on the eco-evolutionary dynamics of antibiotic-producing bacteria) simplified antibiotic production as depending on individual “antibiotic biosynthetic genes”. In reality several genes in a cluster must be expressed to synthesize an antibiotic. A typical biosynthetic gene cluster also encodes resistance genes for the cognate antibiotic, to prevent cell suicide [Mak et al., 2014] – hence antibiotic genes providing resistance in the model. This being said, Streptomyces genomes also host resistance genes to antibiotics for which they have no biosynthetic pathway themselves, including efflux pumps that give some nonspecific resistance [Nag et al 2021].

      Modelling antibiotic synthesis in more detail would allow a better model of antibiotic evolution, and would enrich the social dynamics of the model, because “cheaters” could evolve that are resistant but do not contribute to the antibiotics in the colony. These questions are certainly interesting, but would further complicate the model. They are exciting avenues for future model expansions.

      We will include the literature mentioned above in the introduction, and use these references to better motivate the model.

      * -Lines 127-129 It is mentioned here fragile sites in the genome might represent transposable elements or long inverted repeats. Would both of these types of fragile sites behave the same? Has it been shown that both transposable elements and long inverted repeats can lead to large deletions from a linear chromosome? It would be nice to have a bit more background on how fragile sites might work or what they might look like in an empirical context. I am a bit unsure on this, but depending on their exact empirical nature, should fragile sites not also lead to increased rates of gene duplication near themselves? *

      AR: we see that we have not made a clear connection between the introduction, where we introduce the mutational dynamics of Streptomyces, and the methods, where we introduce fragile sites.

      Briefly, both duplications and deletions occur in Streptomyces, as well as circularization of the linear chromosome, conjugation, etc. [Hoff et al 2018, Tidjani et al 2019]. However, the outcome of all these mutations is biased towards deletion [Hoff et al 2018, Zhang et al, 2020, Zhang et al, 2022]. There are many mechanisms involved in producing these mutations, forming the mutational hotspots, handling DNA breaks, and in the horizontal transfer of genetic material [Tidjani et al 2019; Lorentzi et al, 2021]. As the reviewer suggests, they do not all behave in the same way. To construct the model, we simplified all these mutational mechanisms into one genetic element, the “fragile site”, and assumed that fragile sites are solely responsible for the chromosomal-scale mutations that produce deletions.

      We will add this information to the introduction (see also response to reviewer 2), and refer to it in the methods.

      * -Line 160 As alluded to before, given the introduction provided, two assumptions come about here (lines 160-166) that lack a bit of justification/background/context. First, why does one allow the evolution of such a relatively large number of antibiotics? A bit more empirical background in the introduction would go a long way to making this assumption seem more justified. As far as I can see the genomic architecture leading to division of labour is only demonstrated for values of v that are 6 (i.e. 64 antibiotics) or above. Perhaps it is because I lack empirical background here, but this still seems to be a relatively large antibiotic space. Does the model also work with v=2? Perhaps it would be good to show a simulation with v=2 in supplementary material S16 as well. *

      AR: Hopefully the previous comment on the number of possible antibiotics also clarifies this point.

      We will carry out a simulation with v=2.

      * -Line 166 The assumption is made that if a bacterium produces a certain antibiotic, it is automatically resistant to this antibiotic. Now it could be that this assumption is empirically rooted, in which case it would be good to allude to this empirical justification. I wonder how would the results be impacted if the resistance genes were separated from the antibiotic production genes? (I do not think additional simulations are in any way necessary on this point, but some more context/thoughts on this matter would be helpful, perhaps near lines 306-309) *

      AR: Please see the response to the major comment on the possibility of separating antibiotic resistance from antibiotic synthesis. We will add the discussion given there to the Discussion section.

      * -Figure 1 In the subscript it becomes evident that the probability of large deletions due to fragile sites is much higher (10 fold) than single gene duplications, it seems to me this should be the other way around, single gene duplications and deletions could be much more probable than fragile site induced large deletions. Would the model still produce the same results if the values for mu-d and mu-f were switched around? (Again, I do not think additional simulations are per se required, some justification for this assumption would already be plenty). *

      AR: We chose these parameter values because, empirically, large scale chromosomal rearrangements (deletions) occur more frequently than single gene duplication/deletion in Streptomyces – as they are the primary mechanism for Streptomyces development and division of labor. We now mention this in the caption of Fig. 1.

      Still, would we expect results to be affected if mu_d > mu_f? We do not think so, for the following reason: mu_d and mu_f are per-gene probabilities, so the genomic probability of duplication/deletion and of fragile site activation will depend on the evolved number of genes.

      In Fig. 5 we show that mu_f can be decreased by more than one order of magnitude and results do not change qualitatively. To compensate for a smaller per-gene deletion rate (mu_f), the evolved number of fragile sites per genome becomes larger (Suppl. Section S19, Fig. SF23). A similar compensatory increase of fragile sites could happen if the per-gene duplication and deletion rates were larger.
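The compensation argument above can be illustrated with a small calculation. This is a minimal sketch, not the authors' simulation code: it assumes (an assumption not stated in the response) that each element acts independently with the same per-element probability, so that the chance of at least one event per genome is 1 - (1 - mu)^n. The function name and the numerical values are hypothetical and chosen only for illustration.

```python
def per_genome_probability(mu: float, n: int) -> float:
    """Probability of at least one event among n independent elements,
    each acting with per-element probability mu (illustrative assumption)."""
    return 1.0 - (1.0 - mu) ** n


# Hypothetical values, not the parameters used in the paper.
high_rate, low_rate = 1e-2, 1e-3

for n_sites in (1, 10, 100):
    print(
        n_sites,
        round(per_genome_probability(high_rate, n_sites), 4),
        round(per_genome_probability(low_rate, n_sites), 4),
    )
```

Under this assumption, a tenfold smaller per-element probability is almost exactly offset by a tenfold larger number of elements, which is the kind of compensation described in the response.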

      * Minor comments: -Line 36, perhaps replace "must" with " can" as there are other ways to achieve a division of labour that do not hinge on genomic architecture such as those listed in the next sentence. This sentence seems at odds with the next one, which lists ways to achieve cell differentiation that do not per se completely rely on genomic architecture such as gene regulation. Maybe consider moving this sentence to be on line 40 (after "...organized at the genome level remains unclear") *

      AR: we will modify the text as suggested by the reviewer.

      * -Line 48, perhaps remove "disposable" as there is no particular reason the somatic tissue is disposable, furthermore it invokes the disposable soma theory of aging which is not relevant here *

      AR: we will remove “disposable”.

      * -Line 147-148 Why these particular relationships? As a reader, I do not understand how these functions were constructed and how they might influence the results; a bit more justification might be helpful. Perhaps later on (results/discussion) also address what might happen if you were to use different functions? *

      AR: we agree that these functions could use a little more explanation. The probability of replication is a function that increases with the number of growth genes. We assume that the function saturates, as growth cannot be arbitrarily large even if the genome hosts many growth genes. So we need at least two parameters: one for the maximum growth rate (alpha_g), and another that controls the curvature of the function (h_g). A simple choice is a Hill function, but other saturating functions would likely work just as well (e.g. an exponential function of the form alpha_g*(1-exp(-g/h_g))). Similarly, antibiotic synthesis inhibition from growth genes should tend to zero for larger numbers of growth genes, hence the exponential (but we expect that a hyperbolic form, e.g. 1/(1+g/beta_g), would work just the same).

      As this discussion is rather technical, we will include it in the methods section.
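The functional forms discussed above can be compared side by side. The following is a minimal sketch under stated assumptions: the response names a Hill function but does not give its exact form, so a Hill coefficient of 1 (alpha_g*g/(g+h_g)) is assumed here, and the decaying exponential exp(-g/beta_g) is likewise an assumed form for the inhibition term. All parameter values are hypothetical.

```python
import math


def hill(g: float, alpha_g: float, h_g: float) -> float:
    """Saturating Hill-type function (Hill coefficient 1 is an assumption)."""
    return alpha_g * g / (g + h_g)


def saturating_exp(g: float, alpha_g: float, h_g: float) -> float:
    """Alternative saturating form mentioned in the response."""
    return alpha_g * (1.0 - math.exp(-g / h_g))


def inhibition_exp(g: float, beta_g: float) -> float:
    """Exponentially decaying inhibition term (assumed form)."""
    return math.exp(-g / beta_g)


def inhibition_hyperbolic(g: float, beta_g: float) -> float:
    """Hyperbolic alternative mentioned in the response."""
    return 1.0 / (1.0 + g / beta_g)


# Hypothetical parameter values, chosen only for illustration.
alpha_g, h_g, beta_g = 0.1, 10.0, 3.0
for g in (0, 5, 10, 20, 40):
    print(
        g,
        round(hill(g, alpha_g, h_g), 4),
        round(saturating_exp(g, alpha_g, h_g), 4),
        round(inhibition_exp(g, beta_g), 4),
        round(inhibition_hyperbolic(g, beta_g), 4),
    )
```

Both growth functions start at zero and approach alpha_g from below, and both inhibition terms start at one and decay towards zero; they differ only in how quickly they approach their limits.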

      * -I am clearly biased on this matter, since I work on evolvability. So, the authors should feel free to ignore this comment. Regardless, I think the authors have shown a wonderful example of the evolution of evolvability. Perhaps it would be nice to add a little bit of an evolvability angle in the discussion. In particular thinking about how fragile sites shape evolvability. *

      AR: we agree with the reviewer that the work is a clear form of evolution of evolvability. We now explicitly mention this in the discussion.

      * -Lines 404-411 It is great to see that the authors consider the wider applicability of their findings. It would be nice to add something here about the broader applicability in bacteria. As a large number of bacteria have circular chromosomes, how would these findings be impacted if circular chromosomes were at play? (I suspect they would largely still work in the same way, but keen to hear what the authors think). Referring to the work of Yona et al. 2012 on transient chromosomal duplications in yeast due to heat stress might also be good here, to show the more general applicability of the authors findings, this is another example where genomic architecture shapes evolvability. Yona AH, Manor YS, Herbst RH, Romano GH, Mitchell A, Kupiec M, Pilpel Y, Dahan O. Chromosomal duplication is a transient evolutionary solution to stress. Proc Natl Acad Sci U S A. 2012 Dec 18;109(51):21010-5. doi: 10.1073/pnas.1211150109. Epub 2012 Nov 29. PMID: 23197825; PMCID: PMC3529009. *

      AR: Bacteria show many forms of targeted mutational dynamics (we do already mention CRISPR and HGT). It recently came to our attention that many bacterial and archaeal genomes host so-called Diversity-Generating Retroelements (DGRs) [Macadangdang et al, 2022]. DGRs accelerate microbial evolution at specific sites and generate functional diversity. We will include this reference in the discussion.

      We thank the reviewer for pointing us to the work on chromosomal duplication in yeast – we will also incorporate this “dramatic” form of duplication in the discussion.

      * -Lines 412 -419 I agree with the authors that in practice the cells specializing in antibiotic production look somewhat like soma, however I would consider not using this term here as strictly speaking the antibiotics producing cells can still reproduce (be it at an extremely low rate, which leads to their loss). *

      AR: We tone down both mentions of soma, as follows: “This gives rise to a division of labor driven by mutation, reminiscent of the division between germ and soma in multicellular eukaryotes.”

      And, in the last sentence, we write: “...mutant cells *effectively* function as soma by enhancing...”

      - Lines 434-438 If I understand correctly authors did not explicitly model the sporulation process (instead selecting random cells from the end of a cycle). I think this is a very good modelling choice that should not be changed; however, I do wonder how the results would be affected if sporulation was more explicitly modelled (for example by adding genes for sporulation, creating a 3 way trade-off between growth, sporulation and antibiotic production). Perhaps something that could be mentioned in the discussion.

      AR: we agree with the reviewer that more complex evolutionary problems could be implemented in the system, e.g. through a gene type required for sporulation. These would likely have interesting outcomes. For instance, some bacteria may evolve never to sporulate, while others could enhance their antibiotic resistance by turning into spores. Moreover, including additional functions together with evolvable gene regulation could better capture the developmental dynamics observed through the life cycle of Streptomyces.

      * I hope this review is of some use and helps the improvement of this manuscript. *

      * Yours sincerely,

      Timo van Eldijk

      Reviewer #3 (Significance (Required)):

      Significance: This study provides a clear conceptual advance by showing and studying how genome structure can evolve to create a division of labor, thereby mechanistically explaining the division of labor in antibiotic production observed in S. coelicolor. It seems evident to me that whilst this study mainly focuses on S. coelicolor, the mechanism likely plays an important role in microbial evolution in general. Though others have previously theoretically explored such mechanisms, this study provides the first exploration modelled closely after an empirical system and hence provides a significant advance. In a more general sense, the evolution of genome architecture likely governs evolvability not just in microbes but in all life on earth. Therefore, I believe that this paper would be interesting for a general audience interested in evolution. It would be of particular interest to those studying microbial evolution. My expertise lies in evolutionary biology, theoretical biology, microbial evolution and palaeontology. *

    1. As a metaphorical mode of representation, whether it may be oral, iconic, or written, the fairy tale effectively draws our attention to relevant information that will enable us to know more about our real life situations, and through its symbolical code and flexible structure, it allows for personal and public, individual and collective interpretations

      I think this is an interesting light in which to see fairy tales. It almost reminds me of the notion of "taking what you need": our brains allow us to focus on what resonates with us and to take away the morals that we can particularly relate to.

    1. Social workers treat each person in a caring and respectful fashion, mindful of individual differences and cultural and ethnic diversity. Social workers promote clients’ socially responsible self-determination. Social workers seek to enhance clients’ capacity and opportunity to change and to address their own needs.

      This specific area of the Code of Ethics is relevant to a situation that I have experienced in my field work for various reasons. First off, I encountered a situation when working with a client, with my supervisor present, that required me to keep this value and ethical principle in mind. In this instance, the client was expressing to me the way her family dynamic is run based on her cultural views. Although my family dynamic may not be run the same way as my client's, in keeping with the ethical principle of respecting her inherent dignity and worth, I used this information to be mindful of her individual differences and understand how this might play a role in her situation. This is an example of treating my client not only with respect but also with attention to how her family operates their household culturally. Furthermore, I could identify the emotional impact that expressing this situation had on my client, so I handled her feelings with empathy and compassion, as is expected of us as social workers. As the session progressed, the client expressed difficulty in fully finding her drive when completing tasks in her daily routine. Due to this, I prepared self-motivating activities and strategies for the client to visualize ways to push herself to complete these tasks. Specifically, we worked together on a time management chart where the client mapped out her schedule. In this, I was attempting to promote her socially responsible self-determination. By helping the client find her inner drive, I was attempting to instill in her that she was contributing not only to her personal self-determination but also to the social community around her. I hoped that helping her visualize this would make her want to act in positive ways and see how her inner self is a wonderful piece of the puzzle of the community around her. Moreover, this falls into enhancing her capacity and opportunity to change.
My client expressed to my supervisor and me that it was difficult for her to pinpoint exactly what areas she wanted to work on and felt she needed assistance with. Through our discussion of her strengths, her interests, her hobbies, and especially her goals, I took note that the client began to open up more as the discussion went on and found her way of speaking about her struggles. This is a relevant example of respecting the dignity and worth of my client because, in acknowledging her struggles, I am showing that I am not only a listener to her but also a guide who seeks to help her find her inner desires. I constantly used phrases like, "I can absolutely see how you feel that way.." as a means of reassuring the client she is heard and respected. In my discussion with my client, the piece of the Code of Ethics that states, "seek to enhance clients’ capacity and opportunity to change" is, I think, especially relevant. This is because in my session with the client I aimed to have her discover just how much potential she truly possesses through the conversation and self-reflection activities I provided. By giving my client strategies towards positive change, I felt that my client became able to pinpoint her needs while also understanding the positive strategies she can use to meet those needs. In terms of the opportunity to change, in our session we talked a lot about this change towards a positive mindset and the ways she can pursue it on her own time as well, which I think also falls into this ethical principle.

    1. Author Response

      Reviewer #1 (Public Review):

      1-1. I do have some concerns that the differences in network clustering reported in Fig 6 may be due to noise and I think the comparisons against the HCP parcellation could be more robust. Specifically, with regard to the network clustering in Fig 6. The authors use a clustering algorithm (which is not explained) to cluster the parcels into different functional networks. They achieve this by estimating the mean time series for each parcel in each individual, which they then correlate between the n regions, to generate an nxn connectivity matrix. This they then binarise, before averaging across individuals within an age group. It strikes me that binarising before averaging will artificially reduce connections for which only a subset of individuals are set to zero. Therefore averaging should really occur before binarising. Then I think the stability of these clusters should be explored by creating random repeat and generation groups (as done for the original parcels) or just by bootstrapping the process. I would be interested to see whether after all this the observation that the posterior frontoparietal expands to include the parahippocampal gyrus from 3-6 months and then disappears at 9 months - remains.

      We thank the reviewer for this insightful comment on our clustering process. For the step of “binarizing before averaging”, we followed the method proposed by Yeo et al (1). In this method, all correlation matrices are binarized according to the individual-specific thresholds. Specifically, each individual-specific threshold is determined according to the percentile, and only 10% of connections are kept and set to 1, while all other connections are set to 0. Yeo et al. (1) explained their motivation for doing so as “the binarization of the correlation matrix leads to significantly better clustering results, although the algorithm appears robust to the particular choice of the threshold”. We consider that the possible reason is that the binarization of connectivity in each individual offers a certain level of normalization so that each subject can contribute the same number of connections. If averaging occurs before binarizing, the actual connectivity contributed by different subjects would be different, which leads to bias. Meanwhile, we tested the stability of ‘binarizing first’ and ‘averaging first’, and the result is shown in Fig. R1 below. This figure suggests a similar conclusion as (1), where binarizing first before averaging leads to better clustering stability. We added the motivation of binarizing before averaging in the revised manuscript between line 577 and line 581.

      Fig. R1. The comparison of clustering stability of different methods. The red line refers to the clustering stability when binarizing the correlation matrices first and then averaging the matrices across individuals, while the blue line refers to the clustering stability when averaging the correlation matrices across individuals first and then binarizing the average matrix.
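The difference between the two orders of operations discussed above can be made concrete with a toy example. This is a minimal sketch, not the authors' pipeline: the function names, the tie handling (ties at the threshold are all kept), and the toy matrices are assumptions made for illustration. The response states that 10% of connections are kept per individual; the toy example uses a larger fraction only because the matrices are tiny.

```python
def binarize_top_fraction(matrix, fraction=0.1):
    """Keep the strongest `fraction` of off-diagonal connections (set to 1),
    using a threshold specific to this matrix; everything else becomes 0."""
    n = len(matrix)
    values = sorted(
        (matrix[i][j] for i in range(n) for j in range(n) if i != j),
        reverse=True,
    )
    keep = max(1, round(fraction * len(values)))
    threshold = values[keep - 1]
    return [
        [1 if i != j and matrix[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def mean_matrix(matrices):
    """Element-wise mean across a list of equally sized square matrices."""
    n, k = len(matrices[0]), len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)] for i in range(n)]


def binarize_then_average(matrices, fraction=0.1):
    return mean_matrix([binarize_top_fraction(m, fraction) for m in matrices])


def average_then_binarize(matrices, fraction=0.1):
    return binarize_top_fraction(mean_matrix(matrices), fraction)


# Toy data: subject A has strong correlations overall, subject B weak ones.
subject_a = [[1.0, 0.9, 0.8], [0.9, 1.0, 0.2], [0.8, 0.2, 1.0]]
subject_b = [[1.0, 0.1, 0.05], [0.1, 1.0, 0.3], [0.05, 0.3, 1.0]]

print(binarize_then_average([subject_a, subject_b], fraction=1 / 3))
print(average_then_binarize([subject_a, subject_b], fraction=1 / 3))
```

In this toy case, binarizing first lets each subject contribute its own strongest connection with equal weight, whereas averaging first lets the subject with larger raw correlations dominate, so the weaker subject's strongest connection is lost.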

      For the final clustering results, we performed our clustering method using bootstrapping 100 times, and the final result is a majority voting of each parcel. The comparison of these two results is shown in Fig. R2. Overall, we do observe good repeatability between these two results. However, we also observed that some parcels show different patterns between the two results, especially parcels that are spatially located around the boundaries of networks or the medial wall. The observation that “the posterior frontoparietal expands to include the parahippocampal gyrus from 3-6 months and then disappears at 9 months” was not repeated in the bootstrapped results. These results suggest that the clustering method is quite robust and the discovered patterns are relatively stable, and that the differences between our original results and the bootstrapping results might be caused by noise or inter-subject variability.

      Fig. R2. Top panel: the network clustering results using all data in the original manuscript. Bottom panel: the network clustering results using majority voting through 100 times of bootstrapping. Black circles and red arrows point to the parahippocampal gyrus, which was included in the posterior frontoparietal network, and is not well repeated in the bootstrapped results. (M: months)
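The bootstrapping and majority-voting procedure described above could be sketched as follows. This is an illustrative outline only: the clustering algorithm itself is not specified in the response, so `cluster_fn` is a hypothetical placeholder, and the sketch assumes that it returns one label per parcel with labels that are already comparable across bootstrap runs (in practice, cluster labels would first need to be matched between runs).

```python
import random
from collections import Counter


def bootstrap_majority_vote(subjects, cluster_fn, n_boot=100, seed=0):
    """Resample subjects with replacement n_boot times, cluster each resample,
    and return the most frequent label for each parcel."""
    rng = random.Random(seed)
    votes = None
    for _ in range(n_boot):
        sample = [rng.choice(subjects) for _ in subjects]
        labels = cluster_fn(sample)
        if votes is None:
            votes = [Counter() for _ in labels]
        for counter, label in zip(votes, labels):
            counter[label] += 1
    return [counter.most_common(1)[0][0] for counter in votes]


def toy_cluster_fn(sample):
    """Hypothetical stand-in for a real clustering step: label a parcel 1 if
    its mean value across the resampled subjects exceeds 0.5, else 0."""
    n_parcels = len(sample[0])
    means = [sum(s[p] for s in sample) / len(sample) for p in range(n_parcels)]
    return [1 if m > 0.5 else 0 for m in means]


# Toy data: three subjects, three parcels; the third parcel is borderline.
toy_subjects = [[0.9, 0.1, 0.6], [0.8, 0.2, 0.4], [0.7, 0.3, 0.55]]
print(bootstrap_majority_vote(toy_subjects, toy_cluster_fn))
```

Parcels whose label flips between resamples, like the borderline third parcel here, are the ones most likely to differ between a single full-data clustering and the bootstrapped consensus.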

      1-2. Then with regard to the comparison against the HCP parcellation, this is only qualitative. The authors should see whether the comparison is quantitatively better relative to the null clusterings that they produce.

      Thank you for this great suggestion! As suggested, we added this quantitative comparison using the Hausdorff distance. Similar to the comparison in parcel variance and homogeneity, the 1,000 null parcellations were created by randomly rotating our parcellation with small angles on the spherical surface 1,000 times. We compared our parcellation and the null parcellations by evaluating their Hausdorff distances to some specific areas of the HCP parcellation on the spherical space, including Brodmann's areas 2, 3b, 4+3a, and 44+45, as well as V1 and MT+MST. The results are listed in Figure 4. From the results, we can observe that our parcellation generally shows statistically much lower Hausdorff distances to the HCP parcellation, suggesting that our parcellation generates parcel borders that are closer to those of the HCP parcellation than the null parcellations do.

      However, we noticed that a small number of null parcellations show smaller Hausdorff distances than our parcellation. A possible reason is that our surface registration to the HCP template is based purely on cortical folding, without using functional gradient density maps, which are not available in the HCP template. As a result, high-quality functional alignment between our infant data and the HCP space is not ensured, which inevitably increases the Hausdorff distance between our parcellation and the HCP parcellation.
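The comparison against null parcellations described above could be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses plain Euclidean distance between boundary points (the response works on a spherical surface, where a geodesic distance might be used instead), and the empirical p-value with an add-one correction is an assumed way of summarising the comparison. All names and numbers are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def empirical_p_value(observed, null_values):
    """Fraction of null distances at least as small as the observed one,
    with an add-one correction (an assumed convention)."""
    count = sum(1 for v in null_values if v <= observed)
    return (count + 1) / (len(null_values) + 1)


# Hypothetical boundary points of one area in two parcellations.
border_ours = [(0.0, 0.0), (1.0, 0.0)]
border_reference = [(0.0, 0.0), (3.0, 0.0)]
observed = hausdorff(border_ours, border_reference)

# Hypothetical distances obtained from rotated (null) parcellations.
null_distances = [2.5, 3.0, 4.0]
print(observed, empirical_p_value(observed, null_distances))
```

A small empirical p-value would indicate that few null parcellations match the reference borders as closely as the evaluated parcellation does.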

      1-3. … not all individuals appear (from Fig 8) to be acquired exactly at the desired timepoints, so maybe the authors might comment on why they decided not to apply any kernel weighted or smoothing to their averaging? Pg. 8 'and parcel numbers show slight changes that follow a multi-peak fluctuation, with inflection ages of 9 and 18 months' explain - the parcels per age group vary - with age with peaks at 9 and 18 - could this be due to differences in the subject numbers, or the subjects that were scanned at that point?

      We do agree with the reviewer that subjects are not scanned at similar time points. This is designed in the data acquisition protocol to seamlessly cover the early postnatal stage so that we will have a quasi-continuous observation of the dynamic early brain development.

      We didn’t apply kernel weighted average or smoothing when generating the parcellation, as we would like each scan to contribute equally, so that each parcellation map is representative of the whole cohort of the covered age, instead of only part of it. Meanwhile, our final ‘age-common parcellation’ could be representative of all subjects from birth to 2 years of age. However, we do agree that for a parcellation map designed only for use at a specific age, e.g., 1-year-olds, a kernel weighted average, or even a more restricted age range, could be a more appropriate solution.

      For the parcel number that likely shows fluctuations with subject numbers, we added an experiment, where we randomly selected 100 scans by considering the minimum scan number in each age group using bootstrapping and repeated this process 100 times. The average parcel number of each age is reported in the following Table R1. We didn’t observe strong changes in parcel numbers when reducing scan numbers, which further demonstrates that our parcel numbers do not show a strong relation to subject numbers. However, the parcel number does not increase greatly from 18M to 24M in the bootstrapping results, so we modified the statement in the manuscript about the parcel number to ‘… all parcel numbers fall between 461 to 493 per hemisphere, where the parcel number attains a maximum at around 9 months and then reduces slightly and remains relatively stable afterward. …’, which can be found between line 121 and line 122.

      1-4. I also have some residual concerns over the number of parcels reported, specifically as to whether all of this represents fine-grained functional organisation, or whether some of it represents noise. The number of parcels reported is very high. While Glasser et al 2016 reports 360 as a lower bound, it seems unlikely that the number of parcels estimated by that method would greatly exceed 400. This would align with the previous work of Van Essen et al (which the authors cite as 53) which suggests a high bound of 400 regions. While accepting Eickhoff's argument that a more modular view of parcellation might be appropriate, these are infants with underdeveloped brain function.

      We thank the reviewer for this insightful comment. We agree that some of the parcels might be affected by noise, as noise exists in each step, such as data acquisition, image processing, surface reconstruction, and registration, especially considering that functional MRI is noisier than structural MRI. Though our experiments show that our parcellation is fine-grained and is suitable for the study of infant brain functional development, it is hard to validate directly and quantitatively, as there is no ground truth available.

      Despite this, we are still motivated to create fine-grained parcellations: with the growing availability of larger and higher-resolution imaging data and advanced computational methods, parcellations with more fine-grained regions are desired for downstream analyses, especially considering the hierarchical nature of the brain organization (2). The main reason that our method generates much finer parcellation maps is that both our registration and parcellation processes are based on the functional gradient density, which characterizes a fine-grained feature map based on fMRI. This leads to both better inter-subject alignment in functional boundaries and finer region partitions. This strategy is different from that of Glasser et al (3), which jointly considers multimodal information for defining parcel boundaries; thus, parcels revealed purely by functional MRI might be ignored in the HCP parcellation. We hope our parcellation framework can be a useful reference for this research direction. We added this discussion in the revised manuscript between line 268 and line 271.

      For the parcel number, even without performing surface registration based on fine-grained functional features, recent adult fMRI-based parcellations greatly increased parcel numbers, such as up to 1,000 parcels in Schaefer et al. (4), 518 parcels in Peng et al. (5), and 1,600 parcels in Zhao et al. (6). For infants, we do agree that the infant functional connectivity might not be as strong as in adults. However, there are opinions (7-9) that the basic units of functional organization are likely to be present in infant brains, and that brain functional development gradually shapes the brain networks. Therefore, the functional parcel units in infants could possibly be on a scale comparable to those in adults. Even so, we do agree that more research needs to be performed on larger datasets for better evaluations. We added this discussion in the revised manuscript between line 275 and line 280.

      1-5. Further comparisons across different subjects based on small parcels increases the chances of downstream analyses incorporating image registration noise, since as Glasser et al 2016 noted, there are many examples of topographic variation, which diffeomorphic registration cannot match. Therefore averaging across individuals would likely lose this granularity. I'm not sure how to test this beyond showing that the networks work well for downstream analyses but I think these issues should be discussed.

      We agree with the reviewer that averaging across individuals inevitably brings some registration errors to the parcellation, especially for regions with high topographic variation across subjects, which would lead to loss of granularity in these regions. We believe this is an important issue that exists in most methods on group-level parcellations, and the eventual solution might be individualized parcellation, which will be our future work. We added this discussion in the revised manuscript between line 288 and line 292.

      We also agree with the reviewer that downstream analyses are important evaluations for parcellations. We provided a beta version of our parcellation with 602 parcels (10) to our colleagues, and they tested our parcellation in the task of infant individual recognition across ages using functional connectivity, to explore infant functional connectome fingerprinting (10). We compared the performance of different parcellations with 602 ROIs (our beta version), 360 ROIs (HCP MMP parcellation (3)), and 68 ROIs (FreeSurfer parcellation (11)). The results (Fig. R3) show that our parcellation with a higher parcellation number yields better accuracy compared to other parcellations. We added a description of this downstream application in the discussion between line 284 and line 287.

      Fig. R3. The comparison of different parcellations for infant individual recognition across age based on functional connectivity (figure source: Hu et al. (10)). The parcellation with 602 ROIs is the beta version of our parcellation, 360 ROIs stands for HCP MMP parcellation (3) and 68 ROIs stands for the FreeSurfer parcellation (11). This downstream task shows that a higher parcellation number does lead to better accuracy in the application.

      1-6. Finally, I feel the methods lack clarity in some areas and that many key references are missing. In general I don't think that key methods should be described only through references to other papers. And there are many references, particular to FSL papers, that are missing.

      We thank the reviewer for this great suggestion. We added related references for FLIRT, FSL, MCFLIRT, and TOPUP. For the alignment to the HCP 32k_LR space, we first aligned all subjects to the fsaverage space using spherical demons, and then used part of the HCP pipeline (12) to map the surface from the fsaverage space to the HCP 164k_LR space, and downsampled to the 32k_LR space. We modified this citation by referencing the HCP pipeline by Glasser et al. (12) instead and detailed this registration process in the revised manuscript between line 434 and line 440, as below:

      “… The population-mean surface maps were mapped to the HCP 164k ‘fs_LR’ space using the deformation field that deforms the ‘fsaverage’ space to the ‘fs_LR’ space released by Van Essen et al. (13), which was obtained by landmark-based registration. By concatenating the three deformation fields of steps 1, 3, and 4, we directly warped all cortical surfaces from individual scan spaces to the HCP 164k_LR space and then resampled them to 32k_LR using the HCP pipeline (12), thus establishing vertex-to-vertex correspondences across individuals and ages …”

      Reviewer #2 (Public Review):

      2-1. Diminishing enthusiasm is the lack of focus in the result section, the frequent use of jargon, and figures that are often difficult to interpret. If those issues are addressed, the proposed atlas could have a high impact in the field especially as it is aligned with the template of the Human Connectome Project.

      We’d like to thank Reviewer #2 for the appreciation of our atlas. According to the reviewer’s suggestion, we went through the manuscript again by focusing on correcting the use of jargon, clarity in the result section, as well as figures and figure captions. We hope our corrections can help explain our work to a broader community. Our revisions are accordingly detailed in the following. Meanwhile, our parcellation maps have been aligned with the templates in HCP and FreeSurfer and made available via NITRC at: https://www.nitrc.org/projects/infantsurfatlas/.

      References

      1. B. Thomas Yeo, F. M. Krienen, J. Sepulcre, M. R. Sabuncu, D. Lashkari, M. Hollinshead, J. L. Roffman, J. W. Smoller, L. Zöllei, J. R. Polimeni, The organization of the human cerebral cortex estimated by intrinsic functional connectivity. Journal of neurophysiology 106, 1125-1165 (2011).

      2. S. B. Eickhoff, R. T. Constable, B. T. Yeo, Topographic organization of the cerebral cortex and brain cartography. NeuroImage 170, 332-347 (2018).

      3. M. F. Glasser, T. S. Coalson, E. C. Robinson, C. D. Hacker, J. Harwell, E. Yacoub, K. Ugurbil, J. Andersson, C. F. Beckmann, M. Jenkinson, S. M. Smith, D. C. Van Essen, A multi-modal parcellation of human cerebral cortex. Nature 536, 171-178 (2016).

      4. A. Schaefer, R. Kong, E. M. Gordon, T. O. Laumann, X.-N. Zuo, A. J. Holmes, S. B. Eickhoff, B. T. Yeo, Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cerebral Cortex 28, 3095-3114 (2018).

      5. L. Peng, Z. Luo, L.-L. Zeng, C. Hou, H. Shen, Z. Zhou, D. Hu, Parcellating the human brain using resting-state dynamic functional connectivity. Cerebral Cortex, (2022).

      6. J. Zhao, C. Tang, J. Nie, Functional parcellation of individual cerebral cortex based on functional mri. Neuroinformatics 18, 295-306 (2020).

      7. W. Gao, S. Alcauter, J. K. Smith, J. H. Gilmore, W. Lin, Development of human brain cortical network architecture during infancy. Brain Structure and Function 220, 1173-1186 (2015).

      8. W. Gao, H. Zhu, K. S. Giovanello, J. K. Smith, D. Shen, J. H. Gilmore, W. Lin, Evidence on the emergence of the brain's default network from 2-week-old to 2-year-old healthy pediatric subjects. Proceedings of the National Academy of Sciences 106, 6790-6795 (2009).

      9. K. Keunen, S. J. Counsell, M. J. J. N. Benders, The emergence of functional architecture during early brain development. 160, 2-14 (2017).

      10. D. Hu, F. Wang, H. Zhang, Z. Wu, Z. Zhou, G. Li, L. Wang, W. Lin, G. Li, UNC/UMN Baby Connectome Project Consortium, Existence of Functional Connectome Fingerprint during Infancy and Its Stability over Months. Journal of Neuroscience 42, 377-389 (2022).

      11. R. S. Desikan, F. Ségonne, B. Fischl, B. T. Quinn, B. C. Dickerson, D. Blacker, R. L. Buckner, A. M. Dale, R. P. Maguire, B. T. Hyman, An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31, 968-980 (2006).

      12. M. F. Glasser, S. N. Sotiropoulos, J. A. Wilson, T. S. Coalson, B. Fischl, J. L. Andersson, J. Xu, S. Jbabdi, M. Webster, J. R. Polimeni, The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage 80, 105-124 (2013).

    1. Signals of respect and disrespect Conflict resolution Restorative practices Ways of working Ways of handling emotion Response to trauma

      I like the way the article connects a tree with ourselves. Our outer selves are usually what people can see physically, the trunk is more the internal thoughts about whatever you may have going on or how certain things in life have affected you, and the roots are the deeper, more mental type of thinking. This makes me think about how angry people are angry for their own respective reasons, why others can be more sensitive, and why a different group might be more outgoing and laid back. It really makes you think about how everyone lives a different life, and how we all have different things going on that eventually make us unique in our own ways but also the same.

    1. The telephone and the phonograph, which already have done what seems to be almost miraculous work, may in time be made the means of conveying a message directly from the telegraph instrument to the person to whom it is addressed. But, until this is accomplished, we must acknowledge our dependence on the messenger-boys and fairly recognize them as person of business. 

      It is crazy to think all of this had to be done to get to where we are today when it comes to technology.

    1. Author Response

      Reviewer #1 (Public Review):

      Anopheles is an important disease vector and the efforts to characterize the extent of genetic variation in the system are welcome. In this piece, the authors propose a Variational Autoencoders method to assign species boundaries in a large sample of Anopheles mosquitoes using a panel of 62 nuclear amplicons. Overall, the method performs well as it can assign samples to an acceptable granularity. The main advantage of the method is that it takes reduced representation genome sampling which should cut costs in genotyping. The authors do not compare the effectiveness of their amplicon panel with other approaches to do reduced representation sequencing, or the computational method with other previously published methods. Additionally, the manuscript does not clearly state what is the importance of species assignments and the findings/method are -by definition- limited to a single biological system.

      It is important to draw the reviewer’s attention to the fact that this is a two part approach – the reviewer seems to have overlooked the Nearest Neighbour component of the work. The approach is not solely a VAE – the VAE only comes into play at the species complex level. The higher level assignments are done using NN approaches.

      The manuscript has three main limitations. First, there is no explicit test of the performance of ANOSPP compared to other methods of low-dimensional sampling. While the authors state that the ANOSPP panel will lead to genotyping for low cost (justifiably so), there is no direct comparison to other low-representation methods (e.g., RAD-Seq, MSG).

      The key advantage of ANOSPP is that it works on the entire Anopheles genus; while the other suggested sequencing methods are more applicable to a group of specimens of the same or closely related species. The purpose of the panel is to do species identification for the whole genus; so it really is an alternative to the current methods of species identification, which commonly consists of morphological identification of the species complex, followed by complex-specific PCR amplification of a single species-diagnostic locus. The only other species identification method for Anopheles that is not limited to a single species complex, that we are aware of, is a mass spectrometry approach (Nabet et al. Malar J, 2021); however, they only investigate three different species and reach a classification accuracy of at most 67.5%.

      The main advantage of ANOSPP over other reduced representation sequencing methods, like MSG and RAD-Seq, is that it is specifically designed to work for the entire Anopheles genus to support genus-wide species identification. In a genus comprising an estimated 100 million years of divergence, a sequencing approach that relies on restriction enzymes is likely to introduce a lot of variability in which parts of the genome are sequenced for different species. Moreover, both MSG and RAD-Seq typically map the reads to a reference genome; any choice of reference genome will likely introduce considerable bias when dealing with such diverged species. In general, the sequence data generated by those sequencing methods require more complicated and labour intensive processing. And lastly, the costs per sample for library preparation and sequencing are substantially lower with ANOSPP than with MSG and RAD-Seq: for library prep <1 USD (ANOSPP) versus 5 USD (RAD-Seq) (Meek and Larson, Mol Ecol Resour, 2019) and with 768 samples (ANOSPP), 384 samples (MSG; Andolfatto et al, Genome Res., 2011) and 96 samples (RAD-Seq; Meek and Larson, Mol Ecol Resour, 2019) per run.

      Second, and on a related vein, the authors present NNoVAE as a novel solution to determine species boundaries in Anopheles. Perusing the very references the authors cite, it is clear that VAEs have been used before to delimit species boundaries which diminishes the novelty of the approach on its own.

      The VAE is only a part of the method presented in this manuscript. We believe a substantial amount of the value of NNoVAE lies in its ability to perform assignments for the entire Anopheles genus comprising over 100 MY of divergence - the closest analogous approach would be COI or ITS2 DNA barcoding, neither of which is robust for species complexes. Using NNoVAE, samples are first assigned to their relevant groups, and in many cases to their species, by the Nearest Neighbour method. Only those samples that are identified by the Nearest Neighbour method as members of the An. gambiae complex and cannot be unambiguously assigned to a single species, are passed through the VAE assignment method.
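
      The routing described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the group label string, and the `vae_assign` callback are hypothetical and are not taken from the NNoVAE code base.

```python
from typing import Callable, Optional

# Hypothetical label for the species complex that triggers the VAE step.
GAMBIAE_COMPLEX = "Anopheles_gambiae_complex"


def assign_sample(
    nn_group: str,
    nn_species: Optional[str],
    vae_assign: Callable[[], str],
) -> str:
    """Route one sample through the two-step assignment.

    nn_group:   group chosen by the Nearest Neighbour step.
    nn_species: species chosen by the Nearest Neighbour step, or None when
                the sample could not be unambiguously assigned to one species.
    vae_assign: callback standing in for the VAE assignment (assumed interface).
    """
    if nn_species is not None:
        # Nearest Neighbour already resolved the species; the VAE is not used.
        return nn_species
    if nn_group == GAMBIAE_COMPLEX:
        # Ambiguous sample inside the An. gambiae complex: hand over to the VAE.
        return vae_assign()
    # Ambiguous sample elsewhere: report the group-level assignment only.
    return nn_group
```

      Passing the VAE as a callback keeps the sketch independent of any particular model; only ambiguous An. gambiae complex samples ever invoke it.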

      Indeed, in (Derkarabetian et al, Mol Phylogenet Evol, 2019) VAEs are used to delimit species boundaries in an arachnid genus. However, this study works with ultra conserved elements, incorporating a total of 76kB of sequence, which is much more data than the approximately 10kB we get for all amplicons combined. Moreover, a crucial difference is that the referenced work uses SNP calls, based on alignment to one of their sequenced samples, as input for the VAE, where our VAE takes k-mer based inputs. This is also an important consideration in working with a large number of highly diverged species.
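
      To illustrate what a k-mer based input means here, the minimal sketch below counts overlapping k-mers in a single sequence without aligning it to any reference. The function name and the default value of `k` are assumptions made for illustration; the manuscript's actual k and encoding may differ.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 8) -> Counter:
    """Count overlapping k-mers in one sequence (k=8 is an assumed default)."""
    seq = sequence.upper()
    # Slide a window of length k along the sequence, one base at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy example with a short sequence and k=4.
counts = kmer_counts("ACGTACGTAC", k=4)
```

      Because the counts are computed per sequence, two highly diverged species can be compared through their k-mer tables without choosing a reference genome.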

      Perhaps more importantly, the manuscript does not present a comparison with other methods of species delimitation (SPEDEStem, UML -this approach is cited in the paper though-), or even of assessment of population differentiation, such as STRUCTURE, ADMIXTURE, or ASTRAL concordance factors (to mention a few among many). The absence of this comparative framework makes it unclear how this method compares to other tools already available.

      NNoVAE is primarily a method for species assignment rather than for species delimitation. SPEDEStem addresses the question whether different groups of samples are separate species or not; different groups can be defined by e.g. described races, described subspecies, different morphotypes or different collection locations. The aim of ANOSPP and NNoVAE is to remove the necessity of any prior sorting of samples into groups – all that needs to be known is that the sample is an Anopheline. This avoids the issues associated with morphological identification and single marker molecular barcodes. So to perform species assignment with SPEDEStem, we’d have to run many replicates, each time asking whether a single sample is of the same species as one of the species represented in our reference database. For example, for the 2218 samples presented in the case studies, we would have to run SPEDEStem more than 130,000 times, to check for each of these samples whether they are any of the 62 species represented in the reference dataset NNv1.
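
      The run count quoted above follows from one pairwise test per sample and reference species, assuming a single run per pair:

```python
n_samples = 2218  # samples presented in the case studies
n_species = 62    # species represented in the reference dataset NNv1

# One same-species-or-not test per (sample, reference species) pair.
n_runs = n_samples * n_species
print(n_runs)  # 137516
```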

      However, we agree that it would be good to check that the species-groups in the reference database, NNv1, are indeed supported as separate species. We attempted to run SPEDEStem, but the web browser no longer exists, and we were not able to install the command line application, which runs on Python 2. Moreover, the example files provided in the tutorial are not complete. Therefore, we were unable to even carry out this basic comparison.

      UML (unsupervised machine learning) approaches comprise quite a wide range of methods, including VAE. We have conducted a comparison between the VAE assignments and assignments based on UMAP, for the discussion see below and page 20 in the manuscript and newly added supplementary information section 4.

      As requested by the reviewer, we have compared our assignment approach to ADMIXTURE on the Anopheles gambiae complex training set (see Supplementary information section 5). It is a good sanity check to compare the structure revealed by ADMIXTURE to the structure revealed by the VAE. We found that ADMIXTURE does not satisfactorily differentiate between the species in the complex that are only represented by a handful of samples, while the VAE suffers much less from the differences in group sizes in the training set. Moreover, we want to point out that ADMIXTURE is a tool for assessing population differentiation, not for species assignment. To use it as an assignment method, there are two options: either infer the allele frequencies in the ancestral populations from the training set and use those to compute the maximum likelihood of ancestry frequencies for the test set; or run ADMIXTURE on the training and test sets combined and use the labels from the training set to label ancestral populations. A major drawback of the former approach is that it is tricky to discover cryptic taxa or outliers in the test set; while with the second approach we create a dependency of the training set results on the test set it is combined with during the run. But more importantly, ADMIXTURE performs worse than the VAE on the An. gambiae complex training set by itself, and identifies only two to three different groups among the five diverged species (An. melas, An. merus, An. quadriannulatus, An. bwambae and An. fontenillei). For more information, see page 20 in the manuscript and newly added supplementary information section 5.

      One important use case of our method is to identify interesting samples, e.g. potential hybrids or cryptic taxa, for subsequent whole genome sequencing. After selection and whole genome sequencing of interesting samples detected by ANOSPP+NNoVAE, ADMIXTURE may be useful as one of the tools to investigate such samples.

      A final concern is less methodological and more related to the biology of the system. I am curious about the possibility of ascertainment bias induced by the amplicon panel. In particular, the authors conclusively demonstrate they can do species assignment with species that are already known. Nonetheless, there is the possibility of unsampled species and/or cryptic species. This later issue is brought up in passing the 'Gambiae complex classifier datasets' section but I think the possibility deserves a formal treatment. This is particularly important because the system shows such high levels of hybridization that the possibility of speciation by admixture is not trivial.

      We appreciate the reviewer’s concern regarding ascertainment bias in the amplicon panel. The targets have been selected based on multiple sequence alignments of all Anopheles reference genomes at the time (Makunin et al. Mol Ecol Resour, 2022). Using sequenced species from four different subgenera, the species span a considerable amount of evolutionary time in the Anopheles genus. For all species we have since tested the panel on, we find that at least half of the targets get amplified.

      We share the reviewer’s concern regarding species which are not (yet) represented in the reference database. This is one of the main advantages of the Nearest Neighbour method: it works on three levels of increasing granularity. So for samples that cannot be assigned at species level, we are often able to identify the group of species from the reference database it is closest to. In particular, the situation of a test sample whose species is not represented in the reference database, is mimicked in the drop-out experiment by the species-groups which contain only one sample. On page 16 in the manuscript, we explain how NNoVAE deals with such samples and we show that in the majority of cases NNoVAE assigns the sample to a group of closely related species rather than misclassifying it more specifically to the wrong species.
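
      The coarse-to-fine fallback described above can be sketched as follows. This is a hypothetical illustration: the level names, the support scores, and the 0.8 threshold are assumptions and do not come from the manuscript.

```python
from typing import Dict, List, Optional, Tuple


def finest_confident_level(
    level_scores: List[Tuple[str, Dict[str, float]]],
    threshold: float = 0.8,  # assumed cut-off, not the manuscript's value
) -> Optional[Tuple[str, str]]:
    """Return (level, label) for the finest level whose top label is well supported.

    level_scores is ordered from coarse to fine; each entry maps candidate
    labels at that level to the fraction of nearest-neighbour support.
    """
    best: Optional[Tuple[str, str]] = None
    for level, scores in level_scores:
        if not scores:
            break
        label, support = max(scores.items(), key=lambda kv: kv[1])
        if support < threshold:
            # Stop refining: keep the last level that was still confident.
            break
        best = (level, label)
    return best


# Toy sample whose species is absent from the reference database.
example = [
    ("coarse", {"group_A": 1.0}),
    ("intermediate", {"subgroup_A1": 0.9, "subgroup_A2": 0.1}),
    ("species", {"species_1": 0.5, "species_2": 0.5}),
]
result = finest_confident_level(example)
```

      In this toy case the sample is reported at the intermediate level rather than being forced onto one of two equally supported species, which mirrors the behaviour described for the drop-out experiment.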

      In summary, the main limitation of the manuscript is that the authors do not really elaborate on the need for this method. The manuscript does show that the method is feasible but it is not forthcoming on why this is of importance, especially when there is the possibility of generating full genome sequences.

      ANOSPP and NNoVAE are specifically designed for high throughput accurate species identification across the entire Anopheles genus – WGS is important to address many questions, but is complete overkill for doing species identification. ANOSPP costs only a small fraction of whole genome sequencing, which makes it possible to monitor mosquito populations at much larger scale (e.g., in partnership with our vector biologist collaborators in Africa, we have already generated ANOSPP data for approximately 10,000 mosquitoes and will be running 500,000 over the next few years). Moreover, for most analyses using whole genome sequencing, a reference genome of a sufficiently similar species is required. While we are in a position of privilege having reference genomes for more than 20 species in Anopheles, we have a long way to go before we have 100s of reference genomes covering the true diversity of the genus.

      NNoVAE can also be used to select interesting samples (e.g. species that have not been through the panel before, divergent populations, potential hybrids), which can be submitted for whole genome sequencing subsequently.

      Since Anopheles is arguably one of the most important insects to characterize genetically, the ANOSPP panel is certainly important but I am not completely sure the method of species assignment is novel or groundbreaking.

      Reviewer #2 (Public Review):

      The medically important mosquito genus Anopheles contains many species that are difficult or impossible to distinguish morphologically, even for trained entomologists. Building on prior work on amplicon sequencing, Boddé et al. present a novel set of tools for in silico identification of anopheline mosquitoes. Briefly, they decompose haplotypes generated with amplicon sequencing into kmers to facilitate the process of finding similar sequences; then, using the closest sequence or sequences ("nearest neighbors") to a target, they predict taxonomic identity by the frequency of the neighbor sequences in all groups present in a reference database. In the An. gambiae species complex, which is well-known for its historical and ongoing introgression between closely-related species, this approach cannot distinguish species. Therefore, they also apply a deep learning method, variational autoencoders, to predict species identity. The nearest neighbor method achieves high accuracy for species outside the gambiae complex, and the variational autoencoder method achieves high accuracy for species within the complex.

      The main strength of this method (along with the associated methods in the paper on which this work builds) is its ability to speed up the identification of anopheline mosquitoes, therefore facilitating larger sample sizes for a wide breadth of questions in vector biology and beyond. This technique has the added advantage over many existing molecular identification protocols of being non-destructive. This high-throughput identification protocol that relies on a relatively straightforward amplicon sequencing procedure may be especially useful for the understudied species outside the well-resourced gambiae complex.

      An additional and intriguing strength of this method is that, when a species label cannot be predicted, some basic taxonomic predictions may still be made in some cases. Indeed, even in the case of known species, the authors find possible cryptic variation within An. hyrcanus and An. nili, demonstrating how useful this new tool can be.

      The main weakness of this method is that, as the authors note, accuracy is dependent on the quality and breadth of the reference database (which in turn relies on the expertise of entomologists). A substantial portion of the current reference database, NNv1, comes from one species complex, An. gambiae. This is reasonable given the complex's medical importance and long history of study; however, for that same reason, robust molecular and computational tools for identifying species in this complex already exist. The deep learning portion of this manuscript is a valuable development that can eventually be applied to other species complexes, but building up a sufficient database of specimens is non-trivial. For that reason, the nearest neighbor method may be the more immediately impactful portion of this paper; however, its usefulness will depend on good sampling and coverage outside the gambiae complex.

      Another potential caveat of this method is its portability. It is not clear from either the manuscript or the code repository how easy it would be for other researchers to use this method, and whether they would need to regenerate the reference database themselves. The authors clearly have expansive and immediate plans for this workflow; however, as many researchers will read this manuscript with an eye towards using these methods themselves, clarifying this point would be valuable.

      This is an important point; currently the amplicon panel is only run on specialised robots, but we are working to adapt the protocol so that it can be run in any basic molecular lab. We have now clarified this in the conclusion. Furthermore, there is never a need to regenerate the reference databases – this is fully publicly available at github.com/mariloubodde/NNoVAE and version controlled. As we obtain ANOSPP data from additional samples, representing new species or new within-species diversity, we will add these to the reference database and create an updated openly available version.

      The authors present data suggesting that their method is highly accurate in most of the species or groups tested. While the usefulness of this method will depend on the reference database, two points ameliorate this potential concern: it is already accurate on a wide breadth of species, including the understudied ones outside the An. gambiae complex; additionally, even when a specific species identification cannot be made, the specimen may be able to be placed in a higher taxonomic group.

      Overall, these new methods offer an additional avenue for identifying anopheline species; given their high-throughput nature, they will be most useful to researchers doing bulk collections or surveillance, especially where multiple morphologically similar species are common. These methods have the potential to speed up vector surveillance and the generation of many new insights into anopheline biology, genetics, and phylogeny.

    1. #_I_have_ "other ideas" that are related to our concentration here; and I really think someone in a position like yours would benefit greatly from working on the branch of crypto related to "free communication." I want to build an open social network ["protocol"] that combines "what email, facebook, reddit and ... wikipedia to enable "commenting on anything" the light of ... # "hey ma, where did all the online newspaper comments disappear to?" I know what has to go into it, I'm looking at things like hypothes.is, tableland.xyz and ... https://lnkd.in/gNbBAewt ... and I think it would be simple to put something together that will intrigue people; i know the software and infrastructure can offer us a bulletproof check on censorship that we need now more than ever before in history; and I'm having trouble figuring out why more people aren't interested in helping me ensure that we have a safe happy future free from "un america n th i ngz" like "no newspaper" and no recourse against insurance and credit fraud/problems; which is what I'm staring at in full blown disbelief.

      #I_have "other ideas" that are related to our concentration here; and I really think someone in a position like yours would benefit greatly from working on the branch of crypto related to "free communication." I want to build an open social network ["protocol"] that combines "what email, facebook, reddit and ... wikipedia to enable "commenting on anything" the light of ...

      "hey ma, where did all the online newspaper comments disappear to?"

      I know what has to go into it, I'm looking at things like hypothes.is, tableland.xyz and ... https://lnkd.in/gNbBAewt ... and I think it would be simple to put something together that will intrigue people; i know the software and infrastructure can offer us a bulletproof check on censorship that we need now more than ever before in history; and I'm having trouble figuring out why more people aren't interested in helping me ensure that we have a safe happy future free from "un america n th i ngz" like "no newspaper" and no recourse against insurance and credit fraud/problems; which is what I'm staring at in full blown disbelief.

      The importance of seeing that it's an open "DeFi-inspir[ing/edu]" protocol that will work with existing service and interfaces like LinkedIn and Facebook and Mastodon and diasp.org is ... without question a necessary part of understanding the vision. I think we will see great leaps and bounds in interface design that make the "second small step" Dissenter/Unity and #hypothes eze ... have almost brought to the forefront of the "right venue."

      https://web.hypothes.is/sponsors/

      Seeing #hypothesisontableland and having it work is the first "glaringly bright flash" that will ensure that we never again watch commenting and sites like Disqus and reddit and facebook turn from the light of social "what's on fire?" to the darkness of shadow ... "throttling" of the presentation of the world changing that somehow has missed our tongues and hopefully not our eyes.

      Hopefully once we start talking and getting more involved it will be clear how easy it was and is to make the world a better place just by ... "dropping in your two cents" or BTC as the case may be.

      We've got to get serious about caring "of things like ourselves" for the truth and health and happiness that some of us probably take for granted as I do;

      still you can see me smiling when i know full well it's a little early for that--maybe you can help shift the timeline.

    1. On the top are slanting translucent screens, on which material can be projected for convenient reading. There is a keyboard, and sets of buttons and levers. Otherwise it looks like an ordinary desk.

      I think it's interesting that Bush associates the memex with something the size of a desk. This may be due to the fact that he naturally assumed that something this powerful and containing this much information would need to be housed in a large apparatus. This is obviously not the case in modern day as we see smart devices hundreds of times smaller than a desk with all of the capabilities that Bush lists.

    2. Now, says Dr. Bush, instruments are at hand which, if properly developed, will give man access to and command over the inherited knowledge of the ages. The perfection of these pacific instruments should be the first objective of our scientists as they emerge from their war work.

      During World War II many scientific innovations were made. One that many may not think about every day while interacting with information is the storage and accessibility of knowledge, yet it is one of the most important. It opened the doors to the current digital age and gives perspective on how far we have come and how much knowledge is accessible to us today.

  2. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. Children may never come to see fractions as being fundamentally different from whole numbers and thus may fail to understand fraction operations.

      This is why we use methods such as partitioning and iterating in order to see the fractions as fractions that can be divided or multiplied, instead of seeing the numerator and denominator of a fraction as separate whole numbers. This can be very confusing for students to think about this way.

    2. Improper fractions may be nonsensical to children because they may think that a quantity that is more than the original amount is impossible. For example, 3/2 thought of as 3 out of 2 things is problematic, prompting the child to ask how she can take three things when she has only two things total.

      Iterating is a process that makes improper fractions a lot more understandable. Using iterating, we can see that 3/2 is 3 equal parts of 1/2 each.

    1. Why is this important in this history of psychology?

      "The present work will, I venture to think, prove that I both saw at the time the value and scope of the law which I had discovered, and have since been able to apply it to some purpose in a few original lines of investigation. But here my claims cease. I have felt all my life, and I still feel, the most sincere satisfaction that Mr. Darwin had been at work long before me, and that it was not left for me to attempt to write 'The Origin of Species.' I have long since measured my own strength, and know well that it would be quite unequal to that task. Far abler men than myself may confess that they have not that untiring patience in accumulating and that wonderful skill in using large masses of facts of the most varied kinds, -- that wide and accurate physiological knowledge, -- that acuteness in devising, and skill in carrying out, experiments, and that admirable style of composition, at once clear, persuasive, and judicial, -- qualities which, in their harmonious combination, mark out Mr. Darwin as the man, perhaps of all men now living, best fitted for the great work he has undertaken and accomplished." This comes from "Limits of Natural Selection" by Chauncey Wright (1870), in Classics in the History of Psychology. It shows us the importance of the limits included in theories like this one. Natural selection indicates that the strongest will be the ones that survive and therefore will be the ones able to have offspring and make their generation endure. But this has a limit due to sexual selection, because it shows that natural selection cannot be imposed on people in any way or form. I see this working in psychology in a very big way, because now that we are in a generation that is so ruled by social media, this concept wants to persist and endure no matter what. I can see natural selection slowly decreasing and another type of selection evolving with future generations.

      Angela Cruz Cubero (Christian Cruz Cubero)

    1. #_I_have_ "other ideas" that are related to our concentration here; and I really think someone in a position like yours would benefit greatly from working on the branch of crypto related to "free communication." I want to build an open social network ["protocol"] that combines "what email, facebook, reddit and ... wikipedia to enable "commenting on anything" the light of ... # "hey ma, where did all the online newspaper comments disappear to?" I know what has to go into it, I'm looking at things like hypothes.is, tableland.xyz and ... https://lnkd.in/gNbBAewt ... and I think it would be simple to put something together that will intrigue people; i know the software and infrastructure can offer us a bulletproof check on censorship that we need now more than ever before in history; and I'm having trouble figuring out why more people aren't interested in helping me ensure that we have a safe happy future free from "un america n th i ngz" like "no newspaper" and no recourse against insurance and credit fraud/problems; which is what I'm staring at in full blown disbelief.

      The importance of seeing that it's an open "DeFi-inspir[ing/edu]" protocol that will work with existing service and interfaces like LinkedIn and Facebook and Mastodon and diasp.org is ... without question a necessary part of understanding the vision. I think we will see great leaps and bounds in interface design that make the "second small step" Dissenter/Unity and #hypothes eze ... have almost brought to the forefront of the "right venue."

      https://web.hypothes.is/sponsors/

      Seeing #hypothesisontableland and having it work is the first "glaringly bright flash" that will ensure that we never again watch commenting and sites like Disqus and reddit and facebook turn from the light of social "what's on fire?" to the darkness of shadow ... "throttling" of the presentation of the world changing that somehow has missed our tongues and hopefully not our eyes.

      Hopefully once we start talking and getting more involved it will be clear how easy it was and is to make the world a better place just by ... "dropping in your two cents" or BTC as the case may be.

    1. #stylez--3fKJu styles--3sKVw "> #_I_have_ "other ideas" that are related to our concentration here; and I really thinksomeone in a position like yours would benefit greatly from working on the branch of crypto related to "free communioation."I want to build an open social network ["protocol"] that combines "what email, facebook, reddit and ... wikipedia to enable "commenting on anything" the light of ...# "hey ma, where did all the online newspaper comments disappear to?"I know what has to go into it, I'm looking at things like hypothes.is, tableland.xyz and ... https://lnkd.in/gNbBAewt ... and I think it would be simple to put something together that will intrigue people; i know the software and infrastructure can offer us a bulletproof check on censorship that we need now more than ever before in history; and I'm having trouble figuring out why more people aren't interested in helping me ensure that we have a safe happy future free from "un america n th i ngz" like "no newspaper" and no recourse against insurance and credit fraud/problems; which is what I'm staring at in full blown disbelief.</textarea></div></div></label></div><div class="styles--2mJeY"><div class="styles--YBb-N styles--3IYUq"><div

    1. The Andrew W. Mellon Foundation $752,000 1 January, 2014 The Andrew W. Mellon Foundation awarded Hypothesis a multi-year grant to support the development of annotation services for digital scholarly materials, including support for the I Annotate annual conference, I Annotate 2014: Annotato Ergo Sum.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: GlmS, the glucosamine-6-phosphate synthetase in E. coli and related bacteria, is essential, required for synthesis of both peptidoglycan and LPS. It is regulated at various levels, including positive regulation of GlmS translation by the Hfq-binding sRNA GlmZ. GlmZ activation of translation is regulated, indirectly, by the levels of GlcN6P, the product of GlmS. The components of the sensing and regulatory cascade have previously been defined, via genetics, biochemical and molecular biology studies. GlmZ is cleaved by RNase E, becoming inactive, when GlcN6P levels are high, dependent upon the binding of GlmZ to RapZ. RapZ has been found to directly sense GlcN6P levels; another regulatory RNA, GlmY, also binds RapZ in the absence of GlcN6P, protecting GlmZ from RapZ-mediated processing. The authors of this manuscript performed cryoEM to study the structure of two important complexes in this sensing cascade, RapZ/GlmZ and RapZ/GlmZ/RNase E-NTD, with the aim of clarifying how the RNA binding protein RapZ causes the cleavage of sRNA GlmZ by RNase E. Some of the predictions for critical residues in the RapZ/GlmZ binary complex structure were investigated by mutagenesis of RapZ to define essential residues for GlmZ cleavage; the results are consistent with the structure.

      Major comments:

      • Are the key conclusions convincing? 1) Given that this is basically a structural paper, the major questions would be whether the cryoEM reconstructions are accurate (appear to be consistent with general expectations) and whether there is clear evidence to support the physiological relevance of the structure. The tests of function are of two sorts: a) Effect of RapZ mutants in Fig. 3b-d. These tests show loss of RapZ function with various alleles, likely consistent with model (but as noted below, very difficult for the reader to identify on the structures in 3a). The implication is that these will interfere with GlmZ binding. Possibly direct tests of a couple of these mutants for GlmZ binding (or pull down of GlmZ from in vivo expressed protein) would further support the model. I note that the text says T248A was unaffected in cleavage, but seems to be much reduced in Fig. 3b, even if fusion activity is good.

      Our reply. We have made further tests of the mutations for GlmZ binding. Using electrophoretic mobility shift assays, we observe reduced GlmZ binding affinities for RapZ mutants K170A, H190A, C247A, T248A (figure below). We also tested the activity of RapZ variant with 4 substitutions at the proposed RapZ/NTD interface (right lanes in figure below).

      We followed the recommendation of the reviewer and performed co-purification experiments (“pull-down”) using StrepTactin affinity chromatography and Strep-tagged RapZ variants as baits. Eluates were assessed for RapZ protein content and co-eluting GlmZ and processed GlmZ* sRNAs using Northern blotting. These new results, which have been incorporated in Fig. S7c, show that none of the tested RapZ variants, unlike the wild-type protein, is able to pull down GlmZ or GlmZ* in cell extracts. This includes the RapZ-T248A variant, which, as noted by the referee, is nonetheless still able to decrease full-length GlmZ to some extent, although processed GlmZ* is hardly detectable (Fig. 3b, lanes 23, 24). To address this issue further, we purified the RapZ-T248A variant and some additional variants for comparison and performed EMSAs. Overall, the EMSAs confirm the co-purification experiments, i.e. they demonstrate strongly reduced GlmZ binding activity for most tested RapZ variants, but also show that the RapZ-T248A variant retained some residual binding activity. This may explain the weak signal for processed GlmZ in the Northern blot (Fig. 3b), as processed GlmZ* likely binds to RapZ for stabilization. Similar effects were previously seen for the RapZquad and the RapZ 1-279 variants in Durica-Mitic et al. 2020 (Fig. 5). Accordingly, we also changed our wording concerning the RapZ-T248A variant in the text. We have not incorporated the EMSA figure into the updated manuscript.

      b) The ternary complex was tested primarily by the BACTH assay of some RapZ mutants (Fig. S11), that show a reduced interaction. This is not a particularly convincing assay for a number of reasons: 1) the effects are relatively modest (2x down, in an assay that is probably not very linear with interaction); 2) some with reduced interaction (S239A, T248A) had good activity (at least all those with full interaction seem to be functional); 3) Ternary complex suggests that RapZ mediates this interaction; this could be tested by deleting glmZ (and maybe glmY as well) from this BACTH strain. 4) the authors suggest that there are also important protein-protein interactions, based on some observed interactions, and support this with similarly difficult to interpret BACTH data from a previous paper for RNase E-RapZ interaction. Here, too, that is not the most compelling data (is this interaction RNA-independent?).

      Our reply: Previous work already indicated that formation of the ternary complex involves multiple interactions – direct protein-protein contacts but also indirect interactions mediated by sRNA GlmZ. For instance, in vitro pull-down signals (RapZ = prey; RNase E = bait) become reduced but not abolished when RNA-free protein preparations are used (Durica-Mitic et al., 2020; Fig. 2E). BACTH signals are reduced 2-fold when using RNase E and RapZ variants that are strongly impaired in their RNA-binding capabilities, respectively (Durica-Mitic et al., 2020; Fig. 2C). As the BACTH assays and in vitro pull-down approaches yield similar trends, we suggest that BACTH experiments represent a useful approach to clarify the questions under study.

      Point b1: To demonstrate that removal of multiple interactions is required to disrupt the ternary complex, we combined substitutions of residues making contact to the sRNA as well as residues directly contacting RNase E. According to the structure of the ternary complex presented here, residues T161, Y240, N271 and Q273 in RapZ are proposed to contact RNase E directly. Upon substitution of these four latter residues, resulting in the RapZ variant named RapZ-4 subst., the BACTH signal decreases two-fold, similar to what is observed for the RapZ variants that carry Ala substitutions of residues involved in sRNA-binding, such as H190 or R253. Importantly, when the latter two substitutions are introduced into the RapZ-4 subst. variant, either alone or in combination, the BACTH signal is reduced to almost background levels. These results are in agreement with the features of the ternary complex proposed here and also with data obtained previously: They show that protein-protein and protein-RNA contacts must be concomitantly removed to disrupt the complex completely. We integrated the latter data as Fig. S7a in the revised manuscript and discuss the data at the appropriate positions in the text.

      Point b2: In our opinion, the data reporting regulatory activity of the individual RapZ variants (Fig. 3 b-d) correlate well with the BACTH data (Fig. S7a): RapZ variants carrying substitutions of residues I175 and N236 retain regulatory activity and concomitantly a high RNase E interaction potential indistinguishable from the wild-type is observed. In contrast, RapZ variants carrying substitutions affecting sRNA-binding, i.e. H190A, C247A, C247S, T248D, G249W, R253A, lose activity completely and concomitantly show a 2-fold decrease in the BACTH signal. The remaining BACTH signal is explained by the remaining (protein-protein) contacts as discussed above (point b1). Therefore, these variants are likely incapable of presenting GlmZ correctly to RNase E even though interaction is retained to some degree.

      Only the RapZ mutants with exchanges H171A, S239A and T248A do not follow either of these two scenarios: although they exhibit reduced interaction with RNase E according to BACTH, they retain the ability to regulate the chromosomal glmS’-lacZ fusion, at least when produced from a plasmid (Fig. 3d). However, inspection of the GlmZ Northern blot signals (Fig. 3b) reveals that full-length GlmZ is decreased as expected, but that processed GlmZ* becomes either not visible or is much reduced when compared to wild-type RapZ. This is explained by a reduced sRNA binding affinity, as pointed out above (point 1a), which also provides a rationale for the decreased BACTH signal.

      Point b3: We agree that deletion of glmZ in the BACTH strain would be an ideal approach to dissect the contributions of protein-protein and sRNA-protein mediated interactions for formation of the ternary complex in vivo. Unfortunately, construction of the strain is not straightforward. In our hands, the BACTH reporter strain BTH101 is not amenable to chromosomal manipulations by using engineered recombination tools such as the phage lambda-derived Red system. This may be explained by regulatory elements used by the λ Red system that depend on cAMP, which cannot be synthesized in this strain.

      Point b4: We have addressed this query in the response to point b1.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Possibly the importance of RNase E-RapZ direct interaction, without further proof that this actually is needed for function.

      Our reply: We partially addressed this issue already in our response to point b1. Additionally, we also tested activity of the RapZ-4 subst variant that lacks the residues making direct contact to RNase E in our structure (Fig. 3b-d, last two lanes/columns). The results that are now described in the last paragraph of the results section show that this variant retains regulatory activity. Interestingly, the level of processed GlmZ* is strongly reduced in this case, similar to what is observed with the RapZ-S239A and RapZ-T248A variants discussed above. Therefore, these direct protein-protein contacts might have a role for GlmZ* decay in a manner that remains to be addressed.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. As noted above, further tests of RapZ mutants for RNA binding would be useful; if this has been done previously, needs to be presented.

      Our reply.

      This has been addressed in the response above.

      Are there RNase E residues that would be predicted by the model to be critical for the RapZ or GlmZ interaction but are not otherwise needed for activity? Would these disrupt either the BACTH results or activity in vivo?

      Our reply.

      Please see response to this point above.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes, they are. They are generally extrapolations from what is already in the paper or in previous studies by these groups.
      • Are the data and the methods presented in such a way that they can be reproduced? Yes.
      • Are the experiments adequately replicated and statistical analysis adequate? Yes.

      Minor comments: - Specific experimental issues that are easily addressable. None noted. - Are prior studies referenced appropriately? Yes, they are. However, the paper could more clearly outline what is already known at the level of interactions of the molecules under study here.

      Our reply. We have changed the text to better introduce information from previous studies: interprotomer contacts, properties of the isolated RapZ domains, conclusions from the truncation analyses, requirements for interaction with RNase E and for sRNA-binding, stabilization of processed GlmZ through RapZ binding (Göpel et al., 2013; Gonzalez et al., 2017; Durica-Mitic and Görke, 2019; Durica-Mitic et al., 2020).

      • Are the text and figures clear and accurate?
      • In a number of places, the text and figure order/numbers are not correct. See Fig. S1 (p. 4), S2 (legends vs. figure panels).

      Our reply. We have corrected these in the revised text.

      Better labeling in many figures is needed. Clarify what is shown in Fig. S2d, and make the labels readable. Label the particle types in S3. Use schematics more (as in Fig. 4 and S8) to make it easier for the reader to follow the structure (for Fig. 2, for instance). It is very difficult to discern the RapZ tetramer here. In Fig. 3a, it is very difficult to see the residue numbers on the structures. Clearly identify the fructokinase-like domains. Label lanes in Fig. 3b, c, d. Indicate the active site for RNase E in Fig. 4, in the schematic at least.

      Our reply. We have also corrected these in the revised text.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Overall, clarify and highlight better how the structures here fit with what is already known about important sequences/regions of RapZ, GlmZ, and RNase E, maybe color-coding parts of GlmZ shown to be important for RapZ recognition, etc.

      Our reply. We have added a sequence alignment for RapZ in the supplementary materials section, indicating important residues (Fig. S12).

      Page 12, the second last row. Text after 'In this model...' can be simplified or removed because it is just a hypothesis.

      Our reply. We have simplified the text.

      Our reply:

      We believe that the discussion section should also give room for novel ideas and hypotheses. Therefore, we wish to keep the paragraph.

      Reviewer #1 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. RNase E is a major essential endonuclease in bacteria such as E. coli. How accessory proteins lead to its recognition and cleavage of regulatory RNAs such as GlmZ is not well understood at the structural level, and these structures provide important insight into that process. In addition, the GlmZ/RapZ regulatory circuit plays an important role in bacterial growth and pathogenesis, and understanding it at this level of detail will certainly open up possibilities for targeting this process in the future.

      • Place the work in the context of the existing literature (provide references, where appropriate). The components that go into the current structures have been studied previously, with publications on RapZ structure, analysis of critical regions within the GlmZ RNA, and demonstration of the domain of RNase E involved in interactions with RapZ (Durica-Mitic et al, 2020; Khan et al, 2020, Gonzalez et al, 2017, among others), but exactly how these fit together has not been known. Other RNA binding proteins that affect degradation have been reported, but are not fully understood, and the ways in which the ribonuclease binds complex RNAs are not fully understood either.

      • State what audience might be interested in and influenced by the reported findings. This work should be of broad interest to the field of RNA-based regulation and RNA degradation, with particular interest for those working on these processes in bacteria.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Our expertise is in RNA-based regulation and microbial genetics; we are not able to critically evaluate the cryoEM analysis itself.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Islam et al present their characterization of the E. coli RapZ-GlmZ-RNase E ternary complex in this manuscript under review. In E. coli, the RNA binding protein RapZ facilitates cleavage of GlmZ sRNA by RNase E when intracellular concentrations of GlcN-6P are high; when GlcN-6P levels are low RapZ is titrated by GlmY sRNA and GlmZ sRNA promotes an increase in the translation and stability of the mRNA encoding GlcN-6P synthase, GlmS. Via Cryo-EM, the authors of this manuscript solve the structure of the binary RapZ:GlmZ (Fig. 2) and ternary RapZ:GlmZ:RNase E (Fig. 4) complexes. Based on the apparent RapZ-sRNA binding sites in the solved structure of the binary complex, the authors make substitutions in residues suspected to be involved in RNA binding and measure the impact of these substitutions on cleavage of GlmZ and GlmZ-mediated activation of GlmS expression (Fig. 3). The authors find that some of the residues predicted to be involved in RNA binding based on their structural studies are also important for the cleavage of GlmZ, presumably by RNase E. Finally, the authors show via bacterial two-hybrid assays that some residues of RapZ necessary for GlmZ cleavage are also important for its interaction with RNase E (Fig. S11). I would suggest that the authors measure co-immunoprecipitation of GlmZ with tagged-RapZ with or without substitutions in the proposed RNA binding residues to resolve this issue. Alternatively, EMSAs could be performed.

      Our reply. Please see the response above to reviewer 1. We have included results from EMSAs with selected RapZ mutants and for multiple mutations in the BACTH analysis.

      Major comments:

      Overall, the structural studies are impressive and provide considerable insight into the recognition of substrates by RapZ and RNase E. Given the dearth of solved structures of RNAs with their cognate RNA binding proteins, these results are very significant.

      A limitation in this work is the lack of experiments directly testing whether or not the residues of RapZ that appear to be important for its interaction with the GlmZ sRNA in the authors' Cryo-EM structures actually have a significant role in RNA binding. In lieu of measuring GlmZ binding by RapZ, the authors measure GlmZ cleavage in strains expressing RapZ or particular variants harboring substitutions in residues that appear to play a role in sRNA binding (Fig. 3b); however, it is impossible for the authors to determine whether impairment of GlmZ cleavage by RNase E in their assays is due to lack of GlmZ binding to RapZ, extraordinarily tight binding of GlmZ to RapZ, changes in the orientation of GlmZ bound to RapZ, or conformational changes in RapZ that lead to disruption of direct RapZ-RNase E contacts. The lack of this empirical data supporting their structural studies becomes more salient as the authors attempt to test whether RapZ binding of GlmZ is important for its interaction with RNase E via a bacterial two-hybrid assay. Since the authors have not directly examined the importance of particular RapZ residues on GlmZ binding, the authors' interpretation of their results from these assays is very speculative.

      Our reply: Reviewer 1 raised a similar point to which we replied above. The role of candidate residues in RapZ for binding GlmZ has been addressed by more direct assays (Pull-down/EMSA).

      The authors state on page 7 that "the interaction of RapZ:GlmZ with RNase E does not involve conformational rearrangement of either RapZ or GlmZ". However, the arrangement of SLII relative to SLI appears different between the RapZ:GlmZ and RapZ:GlmZ:RNase E structures presented. Additionally, SLII appears entirely bound by RapZ in the binary complex (Fig. 2b), whereas in the structure of the ternary complex, SLII appears less associated with RapZ (Fig. S4b). A supplementary figure showing side-by-side the structure of GlmZ bound to RapZ solved in the presence or absence of RNase E may make clear whether any differences exist in the conformation of RapZ and GlmZ between the binary and ternary complex structures.

      Our reply: In the revised manuscript, we have included a supplementary figure showing side-by-side comparisons of the structures.

      Minor comments: Figure S1 legend. Change "inactivate" to "inactive" or "inactivated"

      Figure S2 legend. The description for "(d)" is for S2c and the text for "(c)" refers to the image in S2d.

      Figure legend S5a and S9a. If resolution in the key is in angstroms, then it should be indicated.

      Our reply: We have now corrected the above points in the revised text.

      Figure 1. The model appears to indicate that the apo-form of RapZ binds GlmZ and GlmY, whereas the GlcN-6P bound form does not. Moreover, in the discussion, the authors indicate that GlcN-6P interferes with GlmZ binding to RapZ. How does RapZ bind and cleave GlmZ when GlcN-6P levels are high, if GlcN-6P interferes with GlmZ binding? It would be useful for the authors to address this conundrum in their discussion.

      Our reply. We thank the reviewer for pointing out this paradox. Our unpublished work indicates that RapZ may have phosphatase activity for GlcN6P, and we added a comment to this in the discussion section.

      Fig. S3B and C. While panels in Fig. S3B and C seemed well aligned, numbering of lanes would provide additional clarity.

      We will provide lane numbers, accordingly.

      Many bacterial species including Bacillus subtilis, Streptococcus pyogenes, and Clostridium botulinum have RapZ homologs that bear a tyrosine instead of a histidine residue at the position corresponding to H190 in E. coli RapZ. Would you expect this change to reduce GlmZ binding by RapZ or lead to change in RNA specificity based on your structural data? This may be useful to discuss in the manuscript.

      We believe that there is more behind this question. Likely, the referee (by inspecting a RapZ sequence alignment) realized that almost all residues proposed to be involved in binding GlmZ are also conserved in RapZ homologs in Gram-positive bacteria, except for His190 and His171, which are replaced by tyrosines in some of these species. However, no RNA-binding activity has been reported for the Gram-positive RapZ homologs. If true, the question arises what is making the difference here? In principle, this could be due to the missing histidine residues, which are replaced by tyrosines in Gram-positive RapZs. Alternatively, we consider that the positively charged residues at the far C-terminus (K270, K281, R282, K283), which were identified previously to be required for sRNA binding (Göpel et al., 2013; Durica-Mitic et al., 2020), and which could not be resolved in the current structures, are additionally required to obtain RNA-binding activity.

      Fig. S10. It is confusing to me that the yellow chain in the structure of RNase E is labeled as the DNase I-domain in the apo structure, whereas in the structure with RprA or GlmZ bound, this colored region is labeled as the 5' sensing domain.

      We have changed the figure to make it clearer.

      On page 12, the authors appear to indicate that their structural studies of the RapZ-GlmZ-RNase E ternary complex could be informative with regards to how KH domain proteins in Gram-positive bacteria could present their substrates to RNase E. First of all, these bacteria lack RNase E and instead encode an evolutionarily distinct endoribonuclease (RNase Y). Secondly, I think that it is overreaching to state that these structural studies will inform us on how KH domain proteins such as KhpA/KhpB, which may or may not have a chaperoning function akin to Hfq in Gram-positive bacteria, present substrates to RNase Y. Regardless, if this statement is to remain, the authors should make clear that it is RNase Y and not RNase E that they are referring to.

      We have changed the text to make clear that a different RNase is employed in this case.

      Reviewer #2 (Significance (Required)):

      In my opinion, the significance of this work is in the achievement of high-resolution structures of the complexes of the RNA binding protein RapZ and the endoribonuclease RNase E with RNA substrate bound. There are very few structures solved of RNA binding proteins or RNases with their cognate substrates. This is likely due to the difficulty in obtaining high resolution data for the bound RNA that may have a large degree of flexibility or many alternative conformations. More structures like this are needed to advance our understanding of RNA-protein interactions.

      I believe that these findings would not only be of great interest to those that study small regulatory RNAs, such as myself, but also others more generally interested in RNA binding proteins, RNases, or protein-RNA interactions.

      Field of expertise: small regulatory RNAs, RNA chaperones, RNases

      **Referees cross-commenting**

      1. I agree with Reviewer #1 that the results of the bacterial two-hybrid assay would be more informative, if the authors tested the impact of deletion of glmZ on the ability of the wild type and mutant RapZ proteins to interact with RNase E by this assay.

      As both reviewer #1 and I indicated, I think that it would be useful for the authors to directly assess the effect of key substitutions in RapZ on GlmZ binding by a more direct measure of interaction, e.g., CoIP or EMSA.

      I do think that it would be nice at some point for the authors to actually provide evidence that GlcN6P binds to the site that they predict, as reviewer 3 suggested, but this may be beyond the scope of this manuscript and may be better addressed in another manuscript in which the authors solve the structure of RapZ with GlcN6P bound. In the meantime, the authors could limit their speculation.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The biogenesis of the bacterial cell envelope relies on glucosamine-6-phosphate (GlcN6P), whose synthesis is regulated by the sRNA GlmZ and the sRNA-binding protein RapZ. GlmZ stimulates translation of the GlcN6P synthetase. When the levels of GlcN6P are sufficiently high, RapZ presents GlmZ to the endoribonuclease RNase E for cleavage, thereby silencing synthesis of the GlcN6P synthetase. However, how RapZ recruits RNase E to GlmZ for degradation is still unsolved. This paper reports the cryoEM structure of the binary complex of RapZ:GlmZ and the ternary complex of the RNase E catalytic domain (RNase E-NTD), RapZ and GlmZ. RapZ interacts with SLI and SLII of GlmZ through complementarity in shape and electrostatic charge to the phosphodiester backbone of the sRNA and presents the sRNA by aligning its SSR comprising the cleavage site into the RNase E active center. This paper suggests a general RNase E recognition pathway for complex substrates, which will help to understand the mechanisms by which other RNA chaperones such as Hfq might work in an analogous assembly to present base-paired sRNA/mRNA pairs for cleavage. In total, this is an excellent work. I will support its publication once the following points are addressed.

      Major comments: 1. It was mentioned on Page 5 that "Sulphate and malonate ions were previously seen at these positions in crystal structures of apo RapZ" and on Page 11 that "Interestingly, the phosphate groups of the RNA backbone occupy positions in RapZ that were previously observed to bind sulphate or malonate ions in the crystal structure of apo-RapZ, suggesting that this pocket could be the binding site for a charged metabolite such as GlcN6P". Are there any follow-up experiments to investigate this further? If possible, I suggest that the authors confirm whether or not RapZ has GlcN6P-binding activity.

      Binding of GlcN6P by the RapZ-CTD was demonstrated previously by SPR as well as by metabolomics of metabolites copurifying with RapZ (Khan et al., 2020), although evidence that the “sulphate/malonate binding sites” in RapZ also bind GlcN6P is still lacking. Crystallization of RapZ+GlcN6P is not straightforward as bound GlcN6P is apparently hydrolyzed over time.

      "The kinase-like N-terminal domain of RapZ (NTD) makes only a few interactions with the RNA, and the path of the RNA does not encounter the Walker A or B motifs (Figure 2b). It is possible that this domain could act as an allosteric switch, whereby the binding of an as yet unknown ligand triggers quaternary structural changes that affect RapZ functions." Is there any more structural information supporting it? If the domain acts as an allosteric switch, is it possible to make some deletion or substitution to test it?

      The properties of the separated NTD and CTD of RapZ were assessed in previous work.

      Are there any results comparing the binding affinities of GlmY and GlmZ for RapZ?

      Affinities were determined previously using complementary techniques:

      Göpel et al., 2013/EMSA: KD GlmY ~ 30 nM; KD GlmZ ~ 75 nM

      Gonzalez et al., 2017/biolayer interferometry: ~ 50 nM for both GlmY/GlmZ (full-length)
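
      To give a feel for what the quoted dissociation constants imply, the following is a minimal sketch that converts a KD into the expected fraction of RNA bound at a given RapZ concentration. It assumes a simple 1:1 equilibrium binding model with the RNA present in trace amounts (so that free protein approximately equals total protein); this model, the function name `fraction_bound` and the example concentration of 50 nM are illustrative assumptions, not something taken from the cited studies, and RapZ is a tetramer, so the real binding behaviour may well differ.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of RNA bound at equilibrium for a hypothetical 1:1 binding model.

    Assumes trace RNA, so free protein is taken to equal total protein.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("protein concentration must be >= 0 and KD must be > 0")
    return protein_nM / (kd_nM + protein_nM)


# Approximate EMSA values quoted above (Göpel et al., 2013), in nM.
KD_NM = {"GlmY": 30.0, "GlmZ": 75.0}

# Illustrative protein concentration; an assumption chosen for this sketch.
RAPZ_NM = 50.0

for rna, kd in KD_NM.items():
    theta = fraction_bound(RAPZ_NM, kd)
    print(f"{rna}: KD = {kd:.0f} nM, fraction bound at {RAPZ_NM:.0f} nM RapZ = {theta:.1%}")
```

      Under these assumptions, 50 nM RapZ would bind 62.5% of GlmY but only 40% of GlmZ, and at a protein concentration equal to the KD exactly half of the RNA is bound.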

      Minor comments: 1. Page 8, is it "stabilised" or "stabilized", please check it.

      We have changed the spelling to “stabilized”.

      The legends for Figure S2 c and d are reversed.

      This has now been corrected.

      It was suggested to show the RNA molecules in Figure S1a.

      We have changed the figure to include single-stranded RNA substrate.

      Reviewer #3 (Significance (Required)):

      This paper suggests a general RNase E recognition pathway for complex substrates, which will help to understand the mechanisms that other RNA chaperones such as Hfq might work in an analogous assembly to present base-paired sRNA/mRNA pairs for cleavage. In total, this is an excellent work.

    1. Despite such claims against CAI, many of the researchers of the decade produced empirical evidence showing the significant benefits of CAI. Kinzie et al. found “a strong positive effect of computers on continuing motivation” (1989 p. 12), while Tennyson et al. (1980) showed how computers can aid and empower learners in taking control of meeting their own learning needs. This was similar to Dalton et al. (1987), who claimed that computers aid instructors and practitioners in providing personalized learning experiences to students. Yet the research of the decade continued to be rife with conflicting opinions as researchers sought to understand and define the role of technology, specifically computers, in education.

      This paragraph, and the reading in general, makes me think about the positives and negatives of learning almost solely with computers during the pandemic we are in. I can’t imagine not having been able to use computers in my high school and college careers. Reflecting on the Zoom classes I have taken, I think it is valid to point out that technology only benefits students if the teaching is implemented in a well-thought-out and beneficial way. When we made the shift to online school, it was an adjustment not only for students but for teachers as well. This class, and the material within it, really makes me think about how the sudden switch to learning through technology has impacted my college experience. I can definitely pick out certain classes that gave me more meaningful experiences because of the way the teacher was able to use the technology at hand. Lastly, it makes me think about what educational technology may look like in school settings when I become a teacher myself in the next 5+ years.

    2. Access to mobile devices or computers is essential for students to participate in “flipped classrooms,” a model which grew in popularity during the 2010s. With flipped classrooms, what was “previously class content (teacher led instruction)” is replaced with “what was previously homework (assigned activities to complete) now taking place within the class” (O’Flaherty & Phillips, 2015, p. 85). This method of instruction emerged in the 2010s in response to increased access to technology and understanding of its benefits.

      I found this section interesting, as I believe it's a tactic used more with high school and college students. Growing up, I think children need more step-by-step, how-to instruction in their education, because they are learning things they have never encountered before. As we get older and move on to more advanced studies, I believe we are more often relating and understanding concepts that build on base knowledge we already have. So although I find this tactic valuable at my current age, I don't know how beneficial it would be for younger kids, as they may not even be able to accomplish their task if they lack the instruction they need when doing an assignment online without an educator directly present.

    1. Author Response

      Reviewer #1 (Public Review):

      Huang et al. sought to study the cellular origin of Tuft cells and the molecular mechanisms that govern their specification in severe lung injury. First the authors show ectopic emergence of Tuft cells in airways and distal parenchyma following different injuries. The authors also used lineage tracing models and uncovered that p63-expressing cells and to some extent Scgb1a1 lineage-labeled cells contribute to tuft cells after injury. Further, the authors modulated multiple pathways and claim that Notch inhibition blocks tuft cells whereas Wnt inhibition enhances Tuft cell development in basal cell cultures. Finally, the authors used Trpm5 and Pou2f3 knock-out models to claim that tuft cells are indispensable for alveolar regeneration.

      In summary, the findings described in this manuscript are somewhat preliminary. The claim that the cellular origin of Tuft cells in influenza infection was not determined is incorrect. Current data from pathway modulation is preliminary and this requires genetic modulation to support their claims.

      We thank the reviewer for the comments and we have performed extensive experiments to address the reviewer’s comments. In the revised manuscript we provide additional data including genetic modulation findings to support our model.

      Major comments:

      1) The abstract sounds incomplete and does not cover all key aspects of this manuscript. Currently, it is mainly focusing on the cellular origin of Tuft cells and the role of Wnt and notch signaling. However, it completely omits the findings from Trpm5 and Pou2f3 knock-out mice. In fact, the title of the manuscript highlights the indispensable nature of tuft cells in alveolar regeneration.

      We have modified the abstract and title accordingly.

      2) In lines 93-94, the authors state that "It is also unknown what cells generate these tuft cells.....". This statement is incorrect. Rane et al., 2019 used the same p63-creER mouse line and demonstrated that all tuft cells that ectopically emerge following H1N1 infection originate from p63+ lineage labeled basal cells. Therefore, this claim is not new.

      We thank the reviewer for the comment. Although Rane et al. reported that the p63-expressing lineage-negative epithelial stem/progenitor cells (LNEPs) could contribute to the ectopic tuft cells after PR8 virus infection, it is still not clear whether the p63+ cells give rise to tuft cells directly or through EBCs. Thus, we performed TMX injection after PR8 infection, in contrast to Rane et al. (2019), who performed TMX injection before viral infection, to indicate that the ectopic tuft cells are derived from EBCs, as shown in revised Figure 2.

      3) Lines 152-153 state that "21.0% +/- 2.0 % tuft cells within EBCs are labeled with tdT when examined at 30 dpi...". It is not clear what the authors meant here ("within EBC's")? And also, the same sentence states that "......suggesting that club cell-derived EBCs generate a portion of tuft cells....". In this experiment, the authors used club cell lineage tracing mouse lines. So, how do the authors know that the club cell lineage-derived tuft cells came through intermediate EBC population? Current data do not show evidence for this claim. Is it possible that club cells can directly generate tuft cells?

      We apologize for the confusion and have revised the text accordingly. Here, “within EBCs” means within the “pods” area where p63+ basal cells are ectopically present. The sentence has been revised to read “21.0% +/- 2.0 % tuft cells that are ectopically present in the parenchyma are labeled by tdT. Notably, these lineage labeled tuft cells were co-localized with EBCs.” We don’t know whether the club cell lineage-derived tuft cells transit through intermediate EBCs, which is why we used “suggest”. It is also possible that club cells can directly generate tuft cells. To avoid confusion, we have deleted the sentence.

      4) Based on the data from Fig-3A, the authors claim that treatment with C59 significantly enhances tuft cell development in ALI cultures. Porcupine is known to facilitate Wnt secretion. So, which cells are producing Wnt in these cultures? It is important to determine which cells are producing Wnt and also which Wnt. Further, based on DBZ treatments, it appears that active Notch signaling is necessary to induce Tuft cell fate in basal cells. Where are Notch ligands expressed in these tissues? Is Notch active only in a small subset of basal cells (and hence generate rare tuft cells)? This is one of the key findings in this manuscript. Therefore, it is important to determine the expression pattern of Wnt and Notch pathway components.

      We thank the reviewer for the interesting questions and agree on the importance of identifying the specific ligands and receptors for relevant Wnt and Notch signaling during tuft cell derivation. That being said, we think the topic is beyond the scope of this study, which is focused on the role of tuft cells in alveolar regeneration. The point is well taken and we will investigate the topic in a future study.

      5) How do the authors explain the different phenotypes observed in Trpm5 knockout and Pou2f3 mutants? Is it possible that Trpm5 knockout mice retain a subset of tuft cells and that this might have something to do with the phenotypic discrepancy between the two mutant models?

      Again we thank the reviewer for the interesting question. As discussed in the discussion section, Trpm5 is also reported to be expressed in B lymphocytes (Sakaguchi et al., 2020). It is possible that loss of Trpm5 modulates the inflammatory responses following viral infection, which may contribute to improved alveolar regeneration. However, it is also possible that Trpm5-/- mice keep a subset of tuft cells that facilitate lung regeneration as suggested by the reviewer.

      6) One of the key findings in this manuscript is that Wnt and Notch signaling play a role in Tuft cell specification. All current experiments are based on pharmacological modulation. These need to be substantiated using genetic gain- and loss-of-function models.

      We have performed the genetic studies.

      Reviewer #2 (Public Review):

      In this manuscript, the authors describe the ectopic differentiation of tuft cells that were derived from lineage-tagged p63+ cells post influenza virus infection. These tuft cells do not appear to proliferate or give rise to other lineages. They then claim that Wnt inhibitors increase the number of tuft cells while inhibiting Notch signaling decreases the number of tuft cells within Krt5+ pods after infection in vitro and in vivo. The authors further show that genetic deletion of Trpm5 in p63+ cells post-infection results in an increase in AT2 and AT1 cells in p63 lineage-tagged cells compared to control. Lastly, they demonstrate that depletion of tuft cells caused by genetic deletion of Pou2f3 in p63+ cells has no effect on the expansion or resolution of Krt5+ pods after infection, implying that tuft cells play no functional role in this process.

      Overall, in vivo and in vitro phenotypes of tuft cells and alveolar cells are clear, but the lack of detailed cellular characterization and molecular mechanisms underlying the cellular events limits the value of this study.

      We thank the reviewer for the comments and for acknowledging that our findings are clear. In the revised manuscript we provide more detailed characterization and genetic evidence to elucidate the role of tuft cells in lung regeneration.

      1) Origin of tuft cells: Although the authors showed the emergence of ectopic tuft cells derived from labelled p63+ cells after infection, it cannot be ruled out that pre-existing p63+Krt5- intrapulmonary progenitors, as previously reported, can also contribute to tuft cell expansion (Rane et al. 2019; by labelling p63+ cells prior to infection, they showed that the majority of ectopic tuft cells are derived from p63+ cells after viral infection). It would be more informative if the authors show the differentiation of tuft cells derived from p63+Krt5+ cells by tracing Krt5+ cells after infection, which will tell us whether ectopic tuft cells are differentiated from ectopic basal cells within Krt5+ pods induced by virus infection.

      We thank the reviewer for the helpful suggestion. We have performed the experiment accordingly.

      2) Mechanisms of tuft cell differentiation: The authors tried to determine which signaling pathways regulate the differentiation of tuft cells from p63+ cells following infection. Although Wnt/Notch inhibitors affected the number of tuft cells derived from p63+ labelled cells, it remains unclear whether these signals directly modulate differentiation fate. The authors claimed that Wnt inhibition promotes tuft cell differentiation from ectopic basal cells. However, in Fig 3B, Wnt inhibition appears to trigger the expansion of p63+Krt5+ pod cells, resulting in increased tuft cell differentiation rather than directly enhancing tuft cell differentiation. Further, in Fig 3D, Notch inhibition appears to reduce p63+Krt5+ pod cells, resulting in decreased tuft cell differentiation. Importantly, a previous study has reported that Notch signalling is critical for Krt5+ pod expansion following influenza infection (Vaughan et al. 2015; Xi et al. 2017). Notch inhibition reduced Krt5+ pod expansion and induced their differentiation into Sftpc+ AT2 cells. In order to address the direct effect of Wnt/Notch signaling in the differentiation process of tuft cells from EBCs, the authors should provide a more detailed characterization of cellular composition (Krt5+ basal cells, club cells, ciliated cells, AT2 and AT1 cells, etc.) and activity (proliferation) within the pods with/without inhibitors/activators.

      Again we thank the reviewer for the insightful suggestions. We agree that it will be interesting to further address the direct effect of Wnt/Notch signaling in the differentiation process of tuft cells from EBCs. In this revised manuscript we added new findings of EBC differentiation into tuft cells in mice with genetic deletion of Rbpjk.

      3) Impact of Trpm5 deletion in p63+ cells: It is interesting that Trpm5 deletion promotes the expansion of AT2 and AT1 cells derived from labelled p63+ cells following infection. It would be informative to check whether Trpm5 regulates Hif1a and/or Notch activity which has been reported to induce AT2 differentiation from ectopic basal cells (Xi et al. 2017). Although the authors stated that there was no discernible reduction in the size of Krt5+ pods in mutant mice, it would be interesting to investigate the relationship between AT2/AT1 cell retaining pods and the severity of injury (e.g. large Krt5+ pods retain more/less AT2/AT1 cells compared to small pods). What about other cell types, such as club and goblet cells, in Trpm5 mutant pods? Again, it cannot be ruled out that pre-existing p63+Krt5- intrapulmonary progenitor cells can directly convert into AT2/AT1 cells upon Trpm5 deletion rather than p63+Krt5+ cells induced by infection.

      We thank the reviewer for the comments and suggestions. Our new data using the KRT5-CreER mouse line confirmed that pod cells (Krt5+) do not contribute to AT2/AT1 cells, consistent with previous studies (Kanegai et al., 2016; Vaughan et al., 2015). Our data also show that p63-CreER lineage labeled AT2/AT1 cells are separated from the pod cell area, suggesting that pod cells and these AT2/AT1 cells are generated from different cells of origin. We also checked the Notch activity in pod cells in Trpm5-/- mice, and some pod cell-derived cells are Hes1 positive, whereas some are Hes1 negative (RLFigure 1). As indicated in the discussion, we think that AT2/AT1 cells are possibly derived from pre-existing AT2 cells that transiently express p63 after PR8 infection. It will be interesting to test whether Trpm5 regulates Hif1a in this population (p63+,Krt5-), and this will be our next plan.

      RLFigure 1. Representative area staining in Trpm5-/- mice at 30 dpi. Area 1: Notch signaling is active (Hes1+, arrows) in pod cells following viral infection. Area 2: pod cells exhibit reduced Notch activities. Note few Hes1+ cells in pods (arrows). Scale bar: 50 µm.

      4) Ectopic tuft cells in COVID-19 lungs: The previous study by the authors' group revealed the presence of ectopic tuft cells in COVID-19 patient samples (Melms et al. 2021). There appears to be no additional information in this manuscript.

      In Melms et al. (Nature, 2021), we showed tuft cell expansion in COVID-19 lungs but not the potential origin of tuft cells. In this manuscript we show some cells co-expressing POU2F3 and KRT5, suggesting a pod-to-tuft cell differentiation.

      5) Quantification information and method: Overall, the quantification method should be clarified throughout the manuscript. Further, in the method section, the authors stated that the production of various airway epithelial cell types was counted and quantified on at least 5 "random" fields of view. However, virus infection causes spatially heterogeneous injury, resulting in a difficult to measure "blind test". The authors should address how they dealt with this issue.

      We have clarified the quantification method as suggested. For the in vitro cell culture assays on the signaling pathways, we took pictures from at least five random fields of view for quantification. For lung sections, we tile-scanned the lung sections including at least three lung lobes and performed quantification.

      Reviewer #3 (Public Review):

      In this manuscript Huang et al. study how the lung regenerates after severe injury due to viral infection. They focus on how tuft cells may affect regeneration of the lung by ectopic basal cells and come to the conclusion that they are not required. The manuscript is intriguing but also very puzzling. The authors claim they are specifically targeting ectopic basal progenitor cells and show that they can regenerate the alveolar epithelium in the lung following severe injury. However, it is not clear that the p63-CreERT2 line the authors are using only labels ectopic basal cells. The question is what is a basal cell? Is an ectopic basal progenitor cell only defined by Trp63 expression?

      The accompanying manuscript by Barr et al. uses a Krt5-CreERT2 line to target ectopic basal cells and using that tool the authors do not see a significant contribution of ectopic basal cells towards alveolar epithelial regeneration. As such the claim that ectopic basal cell progenitors drive alveolar epithelial regeneration is not well-founded.

      We thank the reviewer for the positive comments and for agreeing that our findings are interesting.

      The title itself is also not very informative and is a bit misleading. That being said I think the manuscript is still very interesting and can likely easily be improved through a better validation of which cells the p63-CreERT2 tool is targeting.

      We have revised the title accordingly and performed extensive experiments to address the reviewer’s concerns.

      I, therefore, suggest the following experiments.

      1) Please analyze which cells p63-CreERT2 labels immediately after PR8 and tamoxifen treatment. Are all the tdTomato labeled cells also Krt5 and p63 positive or are some alveolar epithelial cells or other airway cell types also labeled?

      We thank the reviewer for the question. To answer it, we performed PR8 infection (250 pfu) on three Trp63-CreERT2;R26tdT mice and TMX treatment at days 5 and 7 post viral infection. We didn't perform TMX injection immediately, as the mice were sick for the first few days post infection. The lung samples were collected at 14 dpi. We observed that tdT+ cells are present in the airways (rebuttal letter RLFigure 2A, B), and it appears that the lineage labeled cells (tdT+) include club cells (CC10+) that are underlain by tdT+Krt5+ basal cells (RLFigure 2C). We think that these labeled basal cells give rise to club cells. However, we also noticed that rare club cells and ciliated cells (FoxJ1+) are labeled by tdT in areas lacking surrounding tdT+ basal cells (RLFigure 2D). Moreover, a minor population of tdT+ SPC+ cells is present in the terminal airways that were disrupted by viral infection (RLFigure 2E and D). We did not see any pods formed in this experiment and we did not observe any tdT+ cells in the intact alveoli (uninjured area).

      RLFigure 2. Trp63-CreERT2 lineage labeled cells are present in the airways but not the alveoli when tamoxifen was administered at days 5 and 7 after PR8 H1N1 viral infection. Trp63-CreERT2;R26-tdT mice were infected with PR8 at 250 pfu and Tmx was delivered at a dose of 0.25 mg/g bodyweight by oral gavage. Lung samples were collected and analyzed at 14 dpi. Antibodies used for staining are as indicated. Scale bar: 100 µm.

      2) Please also show if p63-CreERT2 labels any cells in the adult lung parenchyma in the absence of injury after tamoxifen treatment.

      Dr. Wellington Cardoso’s group demonstrated that Trp63-CreERT2 only labels very few cells in the airways but not the lung parenchyma in the absence of injury after tamoxifen treatment (Yang et al., 2018). Dr. Ying Yang has revisited the data and she did not observe any labeling in the lung parenchyma (n = 2).

      3) Please analyze if p63-CreERT2 labels any cells with tdTomato in the absence of injury or after PR8 infection but without tamoxifen treatment.

      We performed the experiment and didn't observe any labeled cells in the lung parenchyma without Tamoxifen treatment (n = 4).

      4) Please analyze when after PR8 infection do the first p63-CreERT2 labeled tdTomato positive alveolar epithelial cells appear.

      We administered tamoxifen at days 5 and 7 after PR8 infection and harvested lung tissues at day 14. As shown in Figure 1, we observed a few tdT+ SPC+ cells in the terminal airways that are disrupted by viral infection. Notably, we did not observe any lineage labeled cells in the intact alveoli (uninjured) in this experiment.

      5) A clonal analysis of p63-CreERT2 labeled cells using a confetti reporter might also help interpret the origin of p63-CreERT2 labeled cells.

      We thank the reviewer for the suggestion. Our new data demonstrate that a rare population of SPC+tdT+ cells is present in the disrupted terminal airways of Trp63-CreERT2;R26tdT mice. Our data in the original manuscript and the new data suggest that the initial SPC+;tdT+ cells are rare, because we have to administer multiple doses of tamoxifen to label them. Given the lower labeling efficiency of confetti compared with R26tdT mice, it is possible that we would not be able to label these SPC+ cells. Moreover, our original manuscript clearly shows individual clones of SPC+tdT+ cells in the regenerated lung, and they do not seem to be composed of multiple clones. Therefore, we think that the use of confetti mice may not add new information.

      6) Lastly, could the authors compare the single-cell RNAseq transcription profile of p63-CreERT2 labeled cells immediately after PR8 and tamoxifen treatment and also at 60dpi? A pseudotime analysis and trajectory inference analysis could help elucidate the identity of p63-CreERT2 labeled cells that are actually not ectopic basal progenitor cells.

      We appreciate the reviewer’s suggestion and agree that single cell RNA sequencing with pseudotime analysis can provide further information regarding the origin of the lineage labeled alveolar cells of Trp63-CreERT2;R26tdT mice. That said, our new data clearly show that KRT5-CreER lineage labeled cells do not give rise to AT1/2 cells, as previously described (Kanegai et al., 2016; Vaughan et al., 2015), suggesting that the ectopic basal progenitor cells do not generate alveolar cells. By contrast, Trp63-CreERT2 lineage labeled cells do give rise to AECs, suggesting that the p63+ cell population capable of generating AECs is different from Krt5+ ectopic basal progenitor cells. Our single cell core has an extremely long waiting list due to the pandemic, and we hope that our new findings are enough to address the reviewer’s concern without the need for single cell analysis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Spinal cord injury (SCI) is damage to the spinal cord that causes temporary or permanent changes in its function. While in mammals the regenerative processes are very limited, zebrafish are able to repair the spinal cord. Based on the hypothesis that the vascular response might affect the regeneration capacity, the paper by Ribeiro et al addresses the structure and injury response of the spinal cord vasculature. As the growth of zebrafish larvae and juveniles depends a lot on the individual response to the environment, the authors first established comparable body measurement parameters (other than age) and observed the natural spinal cord vascularization process, starting from 6mm body length of the animals. Using transgenic lines the authors describe the formation and patterning of endothelial cells and pericytes up to 9mm length, when a more developed vascular network was present. They observe the processes of vascular regeneration after a contusion-based SCI model at different time points (days post injury (dpi)) and in correlation with glial and axonal regrowth, also observing BSCB barrier integrity, angiogenesis, pericyte recruitment and the dependence on Vegf signaling.

      The study is interesting and novel; vascular structures in the zebrafish adult spinal cord have not been reported yet and neither has the vascular response to SCI. Currently the study remains very descriptive, although the authors tried to add functional data by inhibiting Vegf signaling.

      Major points for revision: The authors fail to establish whether there is any relationship between spinal cord regeneration and vessel regeneration. While I do very well understand the challenges and limitations, the authors should put more effort into functional analyses.

      For example: the authors address EC proliferation as a marker for angiogenesis, but do not analyse whether or how much EC proliferation is required for revascularization and regeneration. Pharmacological inhibition of proliferation should be possible and used. From a vascular point of view it would also be interesting whether there is a differential influence of tip or stalk cell proliferation.

      Although we agree that it would be interesting to inhibit EC proliferation to assess its role in spinal cord regeneration, the use of proliferation inhibiting drugs would likely have a widespread effect on the lesioned spinal cord, since many cell types proliferate in response to injury. Therefore, a pharmacological approach would not allow us to dissect the specific role of endothelial proliferation.

      The same is true for pericyte recruitment: the role of pericytes for the vascular repair or the spinal cord regeneration is not clear. The authors could use mutants with impaired pericyte development or e.g. nitroreductase mediated ablation of pericytes.

      These experiments have been performed in larvae by Tsata et al. (2021). Although it would be interesting to repeat in adults, we believe that these experiments are beyond the focus of our study.

      The statements regarding the role of Vegf are too bold. The problem lies in the limitations of assessing the efficiency of Vegf inhibition. The heat-shock promoter has been shown to induce transcription for up to 4 hours, depending on the efficiency of heat shock. There are no data on the stability of dnVegfaa protein. Likewise the pharmacological inhibition could be far from complete. A full inhibition of Vegf signaling is expected to stop vessel growth or angiogenesis. While it is a sign of good practice that the authors combined a genetic model with a pharmacological one, both leave the same unresolved issue. However, if we accept a very limited requirement for Vegf signaling, it would be interesting to look for other signaling pathways, like cxcl, IL, or FGF, that regulate regenerative angiogenesis.

      We agree that our data do not allow us to assess the level of inhibition of the Vegf pathway. Since we are unable to confirm this at the moment, we will exclude the Vegf inhibition data and make this a descriptive study.

      Minor issues

      The correlation with spinal cord repair could be stated more clearly throughout the manuscript. For the uninformed reader it is less clear when exactly the spinal cord is functional again.

      We will include in Fig. 3 a plot of the swimming capacity in contusion-injured fish until 90 dpi and will explain in the text how the vascular response correlates with the functional recovery.

      While I find the model in figure 8 very helpful, it gives 5 to 30 days, for the neuronal regeneration. Maybe a more detailed timeline of EC regeneration and remodeling correlating with neuronal repair would help.

      We will update the model in Fig. 8 with a more detailed timeline and a better description of structures important for regeneration (glial bridge, axonal regrowth).

      In line with that in figure 4 it is not clear whether the images of different time points are indeed one individual animal at the different time points or representative animals for the stage (also figure 4 lacks panel labels, in my copy I can see A, K and L, but no other letters).

      We will detail in the figure legend that the images are of different animals that are representative for each stage.

      For understanding the (re)vascularization, the direction of blood flow might be helpful.

      We will perform an additional experiment to characterise the direction of blood flow in uninjured fish. For this we will use juvenile fish with a body size of 7-9 mm, in which we expect to be able to perform live imaging. We will use a light-sheet microscope to image circulating cells in the spinal cord blood vessels in fish with labelled thrombocytes (Tg(-6.0itga2b:EGFP); Lin et al., 2005) and endothelial cells (Tg(kdrl:ras-mCherry)). These transgenic lines are already available in our fish facility. Even though the vascular network has not yet reached its mature stage at these body sizes, we expect to have enough intraspinal vessels to describe the blood flow circuit.

      Especially for the connection between spinal cord regeneration and vessel regeneration. Does blood flow regulate vessel pruning after 14 dpi?

      Although we agree with the reviewer that it would be interesting to understand how blood flow direction is reestablished in repaired vessels and how blood flow levels correlate with vessel remodelling and pruning, this would be difficult to assess in this system. This could be examined using live imaging, but this technique is challenging in adult zebrafish and has only been carried out in more superficial organs than the spinal cord, such as skin (Castranova et al., 2022) and superficial brain structures (Barbosa et al., 2015; Castranova et al., 2021). In addition, SC-injured fish are more sensitive to external conditions and would probably not survive the long-term/repeated anaesthesia required for imaging.

      This analysis could be performed in fixed samples, for example using the Golgi complex position in relation to the endothelial nuclei as a proxy for blood flow direction (Kwon et al., 2016), however: (1) this would require a new transgenic line (Tg(fli1a: B4GALT1-mCherry)) that would take time to import and establish in the lab; (2) the identification of regressing vessels is not straightforward in fixed samples and is usually studied in very well established vascular models, such as the mouse retina and zebrafish ISVs (Franco et al., 2015).

      For these reasons, we will not address this question by reviewer 1.

      The combined Vegfaa DN and PTK treatment data looks like it could be inhibiting endothelial cell proliferation (Figure 7I). However, Supplementary Figure 8B shows endothelial proliferation does not change. Does it mean the number of endothelial cells is the same but the volume of endothelial cells decreases?

      We will not be addressing the changes in endothelial density in the presence of dn-vegfaa and PTK787, since we will be removing the figures related to Vegf inhibition.

      There are also some remaining grammatical errors, for example (but NOT limited to) line 133 to 135.

      We will review grammatical errors in the text.

      As a personal interest I think evaluating the role of Notch in the SCI model would also be very interesting, especially with regard to the vasculature, however that might be out of the scope of the manuscript.

      We agree that Notch signalling may be a player during spinal cord revascularisation. However, mutants for dll4 (the Notch ligand involved in angiogenesis) die between 7-14 dpf and cannot be used for this study. In addition, the use of Notch-inhibiting drugs would likely have pleiotropic effects, since the Notch pathway is also involved in other aspects of spinal cord regeneration, namely in the regulation of regenerative neurogenesis (Dias et al., 2012). To our knowledge, tools that allow the endothelial-specific inhibition of the Notch pathway have not been developed, and therefore we will not be able to address this question.

      Reviewer #1 (Significance (Required)):

      The study is partially descriptive, but very novel as the aspects of vascularisation in a spinal cord injury model have not been described before. If the major revisions regarding functionality are addressed fully, I would wholeheartedly recommend publication and expect an interest for a broad audience. The presented images and their analyses are of very high quality, and therefore also enhance the impact of the study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The study by Ribeiro et al. investigates the formation of new blood vessels after spinal cord injury in adult zebrafish. The authors initially characterize the extent of spinal cord vascularization during the development of juvenile zebrafish and investigate the association of pericytes with the newly forming vasculature. They then injure the spinal cord and describe the subsequent regeneration of blood vessels. They perform assays to analyze the functionality of the newly forming blood vessels and show that initially blood vessels are leaky. Through EdU labelling the authors show that endothelial cells proliferate. Pericytes similarly increased in numbers. Lastly, the authors inhibited VEGF signaling, which only mildly affected vascular regeneration.

      Together, this manuscript describes the re-vascularization of the regenerating spinal cord in adult zebrafish and addresses how blood vessels mature during this process through pericyte recruitment and decrease in leakiness. The manuscript provides some interesting initial insights into spinal cord vascularization, but is mainly descriptive and unfortunately remains superficial in this regard, as specified below:

      1. The authors only refer to "blood vessels" without specifying the type of blood vessels they observe (are these veins, arteries, capillaries)? A wealth of markers and transgenic zebrafish lines are available to better characterize spinal cord vessels. This is not necessary in case the authors solely refer to "blood vessels" as they do, but it greatly limits the insights into spinal cord vascularization. For instance, Wild et al. (2017) showed that new vessels apparently sprout from veins in the spinal cord. Is this also true during regeneration?

      We will perform RNA in situ hybridisation using probes for arterial and venous markers. We will assay the expression of arterial markers (dll4, dlc, flt1 and efnb2a) and venous markers (flt4 and ephb4a) in uninjured spinal cord (to characterise vessel identity in homeostasis) and in 3 and 7 dpi spinal cords (to investigate the identity of angiogenic vessels during regeneration).

      1. The authors state that their characterization revealed a "stereotypic organization of blood vessels". However, the organization does not appear to be stereotypic (as I understand this term as looking the same in each fish) at all. Can the authors compare e.g. 3 or 5 wildtype fish and extract features that all fish share and those that differ between fish? This would greatly enhance our understanding of the vascular variability within the wildtype population.

      We will provide an additional figure comparing the spinal cord vasculature in different fish.

      1. The authors show an interesting metameric organization of the vasculature with regions of high vascularization interspersed with sparsely vascularized areas. Are there any morphological landmarks that would precipitate these differences?

      We will acquire light sheet images of adult spinal cords without removing the vertebrae. This will allow us to determine if the metameric organisation is correlated with the vertebral distribution.

      Can the authors check whether they induce a lesion in a highly or poorly vascularized area? This might greatly influence the degree of re-vascularization.

      We always perform the spinal cord injury in the region between neural arches (Dietrich et al., 2021). Once we determine how the vasculature is organised in relation to the vertebrae, we will be able to determine if the lesions are performed in a region of high or low vascularisation.

      1. The same superficial characterization unfortunately also applies to the cell population the authors refer to as "pericytes". Traditionally, pericytes are characterized as being associated with capillaries and sharing a basement membrane with the endothelium. Is this the case here?

      We will further characterise the association between Tg(pdgfrβ:citrine)-positive cells and blood vessels using an anti-laminin antibody (#L9393, Sigma) to label the basement membrane. Recently acquired preliminary results indicate that Tg(pdgfrβ:citrine)-positive perivascular cells and endothelial cells are both enveloped by the basement membrane, supporting the identity of Tg(pdgfrβ:citrine)-positive cells as pericytes. Moreover, pericytes are generally described as solitary mural cells associated with small diameter blood vessels (the type of distribution we observe for Tg(pdgfrβ:citrine)-positive cells), whereas vascular smooth muscle cells (vSMCs) form concentric layers around larger blood vessels (a distribution we do not detect with this transgene) (Hellström et al., 1999). For these reasons we believe that this transgene is labelling pericytes. We will explain more clearly in the text how the morphology, localisation and density of Tg(pdgfrβ:citrine)-positive cells suggest these cells are pericytes.

      In addition, pdgfrb is hardly specific for pericytes, as it also labels a multitude of other cell types (refer to e.g. Tsata et al. (2021)).

      The different cell types labelled by the pdgfrb reporter line used in the Tsata et al., 2021 paper were identified not by the use of different cell markers, but by their localisation: perivascular cells (the same cell type that we also detect), myoseptal cells (which we would not expect to detect, since we are only analysing the spinal cord tissue and not the adjacent muscle) and floor plate cells (a reporter distribution that the authors show is lost after 3 dpf and is not present in the adult spinal cord). Moreover, the Tsata et al., 2021 paper also includes a supplementary figure (S1, panel N) showing a restricted perivascular pdgfrb:GFP distribution in the wholemount adult spinal cord, in agreement with our characterisation. By their morphology and density, these perivascular cells are likely pericytes, as argued above.

      It is also not clear why the transgenic pdgfrb line the authors use only labels cells next to blood vessels. Tsata et al. show a much broader labelling. The authors need to validate their transgenic line using in situ hybridization showing where pdgfrb is being expressed endogenously and how this overlaps with the fluorescent protein expression of the pdgfrb transgenic line.

      We will perform ISH for pdgfrb to confirm whether the Tg(pdgfrβ:citrine) reporter reproduces the endogenous expression in the uninjured spinal cord and at 3 and 7 dpi. The 3-7 dpi period is approximately equivalent to 1-2 days post-lesion in larvae and, if the non-perivascular pdgfrb:GFP cells observed in the larval spinal cord are present in the adult, we expect to detect them by ISH during this phase of regeneration.

      There are also several transgenic lines available that allow for the distinction between smooth muscle cells and pericytes (e.g. Shih,..., Lawson, Development 2021 and Whitesell,..., Childs, Plos ONE 2014). As for the vasculature, this more detailed characterization is not necessary in case the authors refer to the cells as "cells labelled by the pdgfrb transgene and reside next to endothelial cells". However, this would not be reflective of the level of detail currently present in the field.

      As we explain above, the morphology and density of the pdgfrb:Citrine-positive cells suggest that these cells are pericytes and not smooth muscle cells (SMCs). To confirm this we will compare the expression of pdgfrb with markers of SMCs (i.e., α-smooth muscle actin and desmin) using immunohistochemistry and/or ISH.

      The reviewer also suggests the characterisation of pericyte subtypes using the lines described by Shih et al., 2021. Although this would be interesting, we do not consider it essential for our study. It would be very demanding to import the reporter lines and it is not certain that these subtypes are present in the spinal cord.

      1. The authors state that "New blood vessels rapidly attracted pericytes, formed through proliferation and possibly migration of existing pericytes". This statement is not supported by the data, as the authors do not perform lineage tracing of pre-existing pericytes. The authors need to specifically label existing pericytes and then follow whether these pre-labelled cells can be found on newly forming blood vessels. Tsata et al. provide some evidence for this in zebrafish larvae, but they also conclude that pdgfrb expressing tenocytes contribute to new mural cells.

      We will reformulate the sentence to clarify that we detect pericyte proliferation, but that pdgfrb-lineage tracing would be needed to provide evidence that existing pericytes contribute to the generation of mural cells associated with new blood vessels. However, we will not perform the lineage tracing experiment for the revision, as we are currently unable to import this line.

      1. The findings that new blood vessel growth only marginally depended on VEGFA signaling is striking. However, it might also point towards an inefficient inhibition of VEGFA signaling. In particular, other publications, for instance Cattin et al. 2015 have shown that inhibiting VEGFA signaling prevents new blood vessel growth during peripheral nerve regeneration in mouse. It will therefore be important that the authors demonstrate that their approach leads to successful inhibition of VEGFA signaling. VEGFAB mutants appear to be homozygous viable and important for spinal cord vascularization (Matsuoka et al., 2017). In addition, heterozygous VEGFAA mutants already have some vascular phenotypes, but are also viable. Can the authors combine these mutants with their inhibitor treatments to achieve a greater reduction in VEGFA signaling?

      Since we are unable to confirm the level of inhibition of the Vegf pathway and we are unable to import the suggested lines at the moment, we will be excluding the Vegf inhibition data.

      Reviewer #2 (Significance (Required)):

      Together, this publication is the first to describe to some extent the regenerating vasculature after spinal cord injury in adult zebrafish. However, both the vascular and regeneration fields are much more advanced than what the authors cover. Both blood vessels and perivascular cells can be characterized in much more detail, as outlined above. Also, studies on nerve regeneration and its dependence on the vasculature, e.g. during peripheral nerve regeneration in mouse have been carried out with a wealth of functional data available. Therefore, the impact of the present study in its current form will be limited. I am an expert on zebrafish blood vessel development.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Ribeiro et al. described vascular development in the spinal cord from larval to adult stages in zebrafish, and found the dependence of vessel length on body-size. Then, the authors depicted the vascular regeneration process after spinal cord injury (SCI), which includes initial vascularization, angiogenesis, pericyte recruitment, and blood-spinal cord barrier establishment. Although the molecules or signaling pathways that drive the re-vascularization remain unidentified, this study illustrates the cellular processes of spinal cord vascular development and regeneration from the descriptive level, which may facilitate further understandings of mechanisms underlying vascular regeneration in the spinal cord.

      Major comments: - Are the key conclusions convincing? The descriptions of spinal cord vascularization during development and vascular regeneration after SCI are convincing. However, inhibition of Vegfaa and Vegfr2 is nearly ineffective. The author might not conclude that the Vegfr2 signaling plays any role.

      Since we are unable to confirm the level of inhibition of the Vegf pathway, we will be excluding the Vegf inhibition data.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Major comments: 1) In Figure 3, the exact injured site on the spinal cord is not clear. Please include a schematic illustration of full spinal cord to show where is the injured site. Are all the injury experiments in this study done at the same site? If not, is there any site difference regarding the regenerative capability.

      We will include a scheme of the injury site in the spinal cord in Fig. 3. All injuries were performed in the same position and this will be clarified in the methods.

      2) Figure 2E showed a segmented pattern of spinal cord vasculature. Is this pattern correlated with the position of vertebra?

      We will acquire light sheet images of adult spinal cords without removing the vertebrae. This will allow us to determine if the metameric organisation is correlated with the vertebral distribution.

      3) In Figure 3, during vascular regeneration after SCI, the author only showed partial regeneration at 30 dpi. Why not show the stage of complete regeneration? At that stage, how about the behaviors of the regenerated animals?

      We will add an additional timepoint (90 dpi) to the characterisation of the revascularisation. Moreover, we will include in Fig. 3 a plot of the swimming capacity in contusion-injured fish until 90 dpi and will explain in the text how the vascular response correlates with the functional recovery.

      4) Only EdU data is not sufficient to conclude that new vessels come from proliferation of remaining endothelial cells. For example, these new vessels might come from transdifferentiation of lymphatic vessels, or immune cells, or glial cells, in the meantime proliferate. This could also explain why the inhibition of Vegfr2 signaling is ineffective on new vessel formation. Cre/loxP-mediated lineage tracings need to be performed to exactly identify where these new vessels originate.

      We will clarify in the text that while the detection of endothelial proliferation suggests existing endothelial cells contribute to new vessels, we cannot exclude that other cell types also give rise to endothelial cells. However, regarding the transdifferentiation of immune and glial cells into endothelial cells, to our knowledge few examples have been described in the literature and generally associated with cancers or in in vitro conditions (Fernandez Pujol et al., 2000; Li et al., 2011; Soda et al., 2011). For this reason we do not expect this rare process to occur during spinal cord repair.

      A cell type that has been associated with transdifferentiation into ECs are lymphatic cells (Das et al., 2022). However, we have analysed the expression of a lymphatic marker (Tg(lyve1b:DsRed)) and were only able to detect very few lyve1b:DsRed-positive cells before or after injury, suggesting that any possible lymphatic contribution would likely be very limited. We plan to include these data in the revised submission.

      5) To confirm the Tg(hsp70l:dn-vegfaa) did work in this study, the authors need a positive control. For example, the effects on vasculogenesis or angiogenesis during embryonic development after heat shock. If the transgene works, the vascular development at early stages should be blocked (Marín-Juez et al., 2016).

      We will be removing the vegf inhibition data, therefore we will not address this question.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. The suggested experiments are realistic in terms of time and resources.

      • Are the data and the methods presented in such a way that they can be reproduced? In the methods, the authors should describe how to identify the Tg(hsp70l:dn-vegfaa) in more detail, because there is no fluorescence before and after heat shock.

      We will be removing the vegf inhibition data, therefore we will not address this question.

      Are the experiments adequately replicated and statistical analysis adequate? Yes.

      Minor comments: - Specific experimental issues that are easily addressable. In Figure 6, from 30 dpi to 90 dpi, the number of pericytes decreased. Did these pericytes undergo apoptosis from 30 dpi on?

      We have not investigated pericyte apoptosis during vessel remodelling. This experiment would require the acquisition of long-term samples (between 60 and 90 dpi), and we would prefer not to address this question.

      Are prior studies referenced appropriately? Yes.

      • Are the text and figures clear and accurate? Please clearly label the injured region in Figure 6.

      We will identify more clearly the site of the injury in Fig. 6.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? The number of proliferating ECs at 3 dpi is more than those at 5 dpi (Figure 5G). But the number of total EdU+ cells at 3 dpi is less than those at 5 dpi (Figure 5A-D). These data are consistent with Figure S3, which showed ECs were the leading cell type to enter the lesioned site, then were the axons and glial cells at later stages. Please explain and discuss whether the regeneration of other cell types is dependent on the accomplishment of vascular regeneration.

      As the reviewer points out, our data suggest that endothelial cells display an earlier peak of proliferation than spinal cord cells in general and colonise the lesioned tissue before new axons and glial cells. Although these observations could point to a role for ECs in the regeneration of other cell types, we would need to inhibit vascular repair to assess this possibility, which we were unable to do using Vegf inhibition. In our discussion we already mention some possible roles for ECs in stem cell proliferation, neurogenesis and axonal regrowth, but can expand this discussion if necessary.

      Reviewer #3 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Although this study has characterized the development and regeneration of spinal cord vasculature in detail, the significance of the advance needs to be improved due to lack of mechanisms. Obviously Vegfa is not essential for the vascular regeneration after SCI. It is better for the authors to identify one or two factors required for this process, in addition to identifying cell origins of new vessels. With those, the significance of this study will be improved because the cell origins and required factors will provide potential therapeutic targets after SCI.

      • Place the work in the context of the existing literature (provide references, where appropriate).
      • State what audience might be interested in and influenced by the reported findings. The audience includes people who are interested in vascular development and regeneration, and spinal cord clinicians.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. My field of expertise includes brain vascular regeneration, digestive organ development and regeneration. This study reported spinal cord vascular development and regeneration, which fit my expertise.

    1. Micah Redding · Christian Transhumanist Association · Shared with Public group. A few years ago, a friend called his young daughter over, and said to her, “Did you know that Micah thinks we’re all going to become immortal cyborgs, and that’s how we’ll usher in the kingdom of heaven?” I laughed. That wasn’t something I had said, but I think it was his attempt at “reading between the lines”. Is that what I think? The question raises other questions. Could the relationship between God and humanity in the person of Christ be described as “cyborg”? What about the spiritual body of 1 Corinthians 15? Some people have indeed described things this way. After all, it is the “bizarre and unnatural” juxtaposition of God’s spirit and human flesh that saves us and transforms us into constituents of the kingdom of heaven. This combination remains just as repulsive to many people as a robotic eye or an exoskeleton. But I think most people would probably be wondering about something else: Is it our technology that ultimately saves us? I think this question reflects confusion around cause and effect, faith and works, salvation and renewal. In the book of Revelation, we see a glorious city, descending from heaven to earth. This city is organic, human, and technological—a “cyborg city” just like our cities today. All the glory of the nations is brought into it—every good thing created or discovered has a place there. And this city is actively producing new and unprecedented means to heal and renew the outside world. I think this is a picture of what Paul calls “the body of Christ”, the community and ecosystem that Paul says will one day fill the universe. Salvation means being included in this community, both now and into the indefinite future. As part of this community, we are bound into a network of relationships that continually renews and sustains our life. As part of this community, we have gifts and works to do (Eph 2:10), which are the works of creation and healing that renew the world. Salvation consists of relationships, and relationships are always expressed in gifts and acts and works of creation. For several years now, I’ve been arguing that we can’t make sense of Genesis 1-2 (let alone Hebrews 2, 1 Cor 15, or Romans 8) unless we understand science and technology to be part of these God-given works of creation and renewal. So is it our technology that saves us? No, rather our technology is a gift, a work, a byproduct of salvation overflowing into renewal and creation. Becoming immortal cyborgs won’t save us—but being saved may turn us into immortal cyborgs.


    1. Author Response

      Reviewer #2 (Public Review):

      Suggestions to improve the paper:

      Major Issues

      1) I do not think that the introduction accurately reflects the state of the field with respect to single cell omics and nerve injury. The CCI model is different than the SNI model, which has been used in most previous studies, in terms of the nature of the injury, and the resolution of pain after the injury. I do not think it is accurate to claim that the CCI model is somehow more relevant clinically, because both models are just that. It is also not really true that co-mingling, un-injured neurons have not been profiled before. The Renthal paper did this, but using a different model. There is value in what the authors have done here, but they can state it more clearly in the introduction. In particular, most published studies have only used male mice, so the sex differences aspect of this work is important. In that regard, the authors did not cite any of the growing literature on sex differences in neuropathic pain mechanisms.

      We revised the introduction and discussion to address the comments. Specifically, we revised the related information about animal models (Page 4-5). Although Renthal et al. examined co-mingling, “uninjured” neurons using a sciatic crush injury model, they did not find cell-type specific changes in uninjured neurons. The reason for this is unclear, but we speculate that it may be partially due to differences in the techniques (e.g., tissue processing, cell sorting, sequencing depth) and animal models (CCI versus crush injury). Compared to sciatic CCI induced by loose ligation of the sciatic nerve, crush injury would injure most nerve fibers (~50% of L3-5 DRG neurons are axotomized in this model). Therefore, the remaining “uninjured” neurons available for sequencing may be far fewer than in the CCI model. In addition, we used Pirt-EGFPf mice to establish a highly efficient purification approach to enrich neurons for scRNA-seq and therefore substantially increased the number of genes detected in DRG neurons. Comparatively, the neuronal selectivity and number of genes detected were lower in the previous study, which may have resulted in fewer DEGs and decreased ability to detect the aforementioned changes. We include a brief discussion (Page 24).

      We appreciate the reviewer’s good suggestion, and cited sex differences studies in neuropathic pain mechanisms (Pages 5, 25). Although our findings suggest that peripheral neuronal mechanisms may also underlie sexual dimorphisms in neuropathic pain, Renthal et al. reported no differences in subtype distributions or injury-induced transcriptional changes between males and females after sciatic nerve crush injury (Renthal et al., 2020). We also discussed the differences between current findings and previous work and also emphasized the sex differences aspect of this work in the discussion (Page 25).

      2) I am curious about the choice to only use samples from 7 days after CCI. One of the advantages of the CCI model is that pain resolves at about 35-60 days, depending on how the ligations are done, and this allows one to look at how transcriptional programs change in DRG neurons after pain resolves. This would give some new insight, at least in comparison to the very comprehensive profiling done in the sciatic nerve crush model by Renthal and colleagues.

      We thank the reviewer for this comment. We provided the rationale for day 7 post-CCI (Page 22). It is the time point when neuropathic pain-like behavior is fully developed in most animals, and the post-injury time point examined in many previous studies. The reviewer is correct that an advantage of the CCI model is that pain resolves at about 35-60 days. Although meaningful, a time course study to fully characterize time-dependent transcriptional changes using scRNA-seq is costly, requires a great effort for data analysis, and is beyond the scope of the current study. We will address this in a future study, and have provided a brief discussion (Page 22).

      3) An alternative interpretation of the ATF3 expression is that the dissociation protocol causes this upregulation. ATF3 induction may be rapid and could occur due to the technique the authors chose to use. This could be acknowledged.

      We agree and acknowledged this in our original discussion (Page 22).

      4) I think the authors are a bit over-confident in their call of "injured" and "un-injured" neurons based on Sprr1a expression. This is really the only grounds for calling these neurons injured or uninjured. The fact is that the CCI model does not provide a clear way to determine injured and uninjured neurons contributing to neuropathic pain. This is an advantage of the SNL model, as shown in many classic papers from the Chung lab.

      We included a brief discussion about Sprr1a (Page 22). Although Atf3 is a classic marker of injured neurons in some previous studies, a recent study suggested that Sprr1a may be a better standard to define “injured” neurons (Nguyen et al., 2017). Although injured and uninjured neurons can be readily separated in the SNL model, they are mostly from different DRGs, and not intermingled in the same DRG. Since glia-neuron interaction and neuron-neuron interaction may occur between cells within the same DRG after injury, these interactions may profoundly affect neuronal excitability and gene expression. Accordingly, we chose the CCI model for the current study to determine whether injured and uninjured neurons contribute to neuropathic pain. We included a brief discussion (Page 5, 23, 24).

      5) There are now two papers on human DRG neurons that are available. One was recently published in eLife, and the other is available on Biorxiv, and has been there since Feb 2021. I expected the authors to make some comparisons of cell types that are changing in CCI with populations that are found in humans. Would similar effects be expected? Are these cell types represented in the human DRG?

      The study of human DRG is important, and recent studies elegantly characterized its neurochemical and physiological properties. Previous findings have suggested some notable differences between human and rodent DRGs. Importantly, many markers and methods used for classifying subpopulations of rodent DRG neurons do not apply well to human DRG neurons. In addition, the human DRG data came from patients with different etiologies, not from peripheral nerve injury as in the animal study. Due to these differences, we feel that it is difficult to make a direct comparison of the cell types that are changing in CCI with corresponding human DRG neurons.

      Minor Issues

      1) Does the 40 um cell strainer eliminate some larger diameter cells from the analysis?

      We think this is unlikely, as large-diameter cells such as the NF1 and NF2 clusters were also observed in our dataset. Importantly, we examined the cell strainer by washing it out in the reverse direction and did not find single cells. In addition, all subtypes identified in other studies were also found in our study. Nevertheless, NF neurons may be underrepresented because not all NF neurons are GFP-positive in Pirt-EGFPf mice. In Pirt-EGFPf mice, expression of the knock-in EGFPf is under the control of the endogenous Pirt promoter. Anti-GFP antibody staining revealed that GFP is expressed in 83.9% of all neurons. However, Pirt-negative neurons are mainly NF200+ and have large-diameter cell bodies. In addition, compared to small neurons, large neurons are more easily lost during FACS sorting. We included a brief discussion of this potential limitation, as the NF population may be underrepresented in our sample set (Page 21).

    1. Author Response

      Reviewer #2 (Public Review):

      Zhong et al conducted a scRNA-seq analysis to uncover the features in multiple myeloma (MM) based on the Revised International Staging System (R-ISS) stage. They contributed 11 scRNA-seq datasets, including 9 MM samples and 2 healthy BMMC. And validated their findings using the deconvolution method in large cohorts.

      In addition, they newly identified and validated a subset of GZMA+ cytotoxic multiple myeloma cells. The experiments were nicely conducted and the datasets generated in this study might benefit many other studies. Major comments:

      1) Several studies on scRNA-seq in MM have been reported, but different from that reported in this study. The authors might discuss the insight gained from their study.

      Thanks for your comments. Several scRNA-seq studies in MM have disclosed some heterogeneity of MM. For example, Jang JS et al identified the molecular pathways during MM progression (MGUS, SMM, NDMM, and RRMM) [Blood Cancer J. 2019 Jan 3;9(1):2.]. Jean Fan et al devised a computational approach called HoneyBADGER to identify copy number variation and loss of heterozygosity in individual cells from single-cell RNA-sequencing data [Genome Res. 2018 Aug; 28(8):1217-1227.]. These studies verified that high heterogeneity exists in MM, but the specific mechanism was not clear. Furthermore, these studies didn’t specify the heterogeneity among different stages in the R-ISS staging system, which is an internationally widely used prognostic stratification system. Therefore, we focused on the specific clusters, markers, and cross-talk patterns among the three stages of MM to reveal the potential mechanism of heterogeneity.

      2) The author claimed Proliferating plasma cells were increased in EBV-positive MM patients. It would be interesting to examine the abundance of EBV RNA levels in the scRNA-seq datasets. Several tools, such as viral-track or PathogenTrack, might be used to conduct such analysis.

      Thanks for the reviewer’s great suggestions and comments. Following your suggestion, we used PathogenTrack to identify pathogens in MM patients and added the results of this analysis in the file ‘Data for reviewers-1(PathogenTrack).xlsx’. However, the algorithm did not identify EBV reads in the scRNA-seq datasets. To verify our conclusion, we collected more MM patient samples and examined EBV, MKI67, and PCNA. Our results showed that EBV-positive samples had significantly higher MKI67 and PCNA expression than EBV-negative samples (Lines 193 to 195, Page 6; Figure 5B and 5C).

      3) Methods used for deconvolution are missing.

      We thank the reviewer for the comments and suggestions. In our study, we didn’t use the analytical tool CIBERSORT, and thus we didn’t use deconvolution in the manuscript either. Our unclear description may have caused this misunderstanding.

      Reviewer #3 (Public Review):

      The authors constructed a single-cell transcriptome atlas of bone marrow in normal and R-ISS-staged MM patients. A group of malignant PC populations with high proliferation capability (proliferating PCs) was identified. Some intercellular ligand receptors and potential immunotargets such as SIRPA-CD47 and TIGIT-NECTIN3 were discovered by cell-cell communication. A small set of GZMA+ cytotoxic PCs was reported and validated using public data.

      For scRNA-seq data analysis, the authors did QC and filtering and removed low quality cells, including some doublets and followed by batch effect correction. Malignant PC populations were identified using the copy number analysis tool "inferCNV".

      The authors have done lots of analysis. But I think the results can be improved if they can do more analyses. I would recommend to 1) analyze doublets; 2) remove cell cycle effect; 3) GO and pathway analysis for genes with copy number change; 4) do cell-cell communication with more cell type/clusters.

      Thanks for your suggestion and comment.

      1) We applied Scrublet to computationally infer and remove doublets in each sample individually, with an expected doublet rate of 0.06 and default parameters used otherwise. The doublet score threshold was set by visual inspection of the histogram in combination with automatic detection. This description was added to the Materials and Methods section as ‘We applied Scrublet [74] to computationally infer and remove doublets in each sample individually, with an expected doublet rate of 0.06 and default parameters used otherwise. The doublet score threshold was set by visual inspection of the histogram in combination with automatic detection.’ in Lines 731-734, Page 27.

      2) As we focused on the differences in proliferative capacity of myeloma cells, the cell cycle could reflect the difference well. Therefore, the cell cycle data was provided accordingly. Information about this description was added into main text as ‘Next, we analysed the cell cycle of six PC clusters, and distinguished them from other clusters, PCs in cluster 6 (PCC6) were presumably enriched in G2/M stage (Figure. 3B)’ in Lines 142-144, Page 5.

      3) We have performed GO and pathway analysis for genes with copy number changes, and provided the file ‘Data for reviewers-2 and 3 (InferCNV for PCC4 and PCC6)’. Based on this, we found that oxidative phosphorylation was the most significantly enriched pathway for both PCC4 and PCC6. 4) Cell-cell communication with more cell types/clusters was provided with the supplementary data in the file ‘Data for reviewers-3 (Overall T cells interaction ligand-receptor pairs dotplot, Overall T cells interaction ligand-receptor, Overall T cells interaction map)’.

      Data analysis of public data was sufficient to prove the small set of GZMA+ cytotoxic PCs. More data analysis or wet experiment proof is required.

      Thanks for your suggestion. A subset of cytotoxic PCs was identified in this study. These PCs expressed NKG7 and GZMA, and NKG7 showed a higher expression level than GZMA. Therefore, we validated NKG7 using multi-parameter flow cytometry (MFC) and immunofluorescence in MM samples. We identified a new subset of NKG7+ cytotoxic PCs and found that the percentage of NKG7+ PCs differed markedly among the stage I, II and III groups. This description was added to the main text as ‘In another MM single-cell dataset focusing on PC heterogeneity of symptomatic and asymptomatic myeloma (dataset GSE117156) [19], one cluster, C21, exclusively expressing NKG7 corresponded to PC18 in our dataset (Fig 2C-2D). In GSE117156 of all 42 samples, the cell proportion varied from 0% to 30.95% of all PCs, with an average percentage of 4.28% (Figure. 2E). Next, immunofluorescence confirmed the expression of NKG7 in cytoplasm of PCs (CD138 positive) from patients with MM (Figure. 2F). Finally, twenty MM patients (stage I: three patients, stage II: 10 patients and stage III: seven patients) were enrolled for multi-parameter flow cytometric (MFC) analysis. The results showed that the percentage of NKG7+ PCs displayed obvious diversities among stage I, II and III groups (Figure. 2G and Figure. S2). The average percentage of NKG7+ population was 2.73% in stage I, 8.89% in stage II and 0.58% in stage III (Figure. 2G and Figure. S3). In summary, we characterized a NKG7+ PC population (PC18), which may provide a novel perspective for the cytotherapy of MM.’ in Figures 2 and S3 and Lines 118-130, Pages 4-5.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1.

      Reviewer #1 summary:

      In this manuscript by Lu et al., the authors describe some CRISPR screens and protein-protein interaction screens to identify novel regulators of wild-type p53 and mutant p53 function and stability. Besides generating a wealth of data, they discover FBXO42-CCDC6 as positive regulators of some p53 hot-spot mutants, including R273H mutant p53, but not of all p53 mutants tested and also not of wild-type, indicating selectivity. Furthermore, they found C16orf72 (TAPR1) as a negative regulator of p53 stability.

      Mechanistically, the authors claim a direct interaction between FBXO42 and CCDC6 and p53, but the importance of these interactions has not been shown. On the other hand the authors suggest that the FBXO42/CCDC6 regulate p53 via destabilization of USP28, but also the mechanism has not been worked out. For C16orf72, they show that it interacts with USP7, but no relevance of this interaction is shown either.

      Response: We sincerely thank the reviewer for the constructive and thorough review. We have incorporated most of the suggestions into our planned revision, with our major focus on the molecular mechanistic follow-up.

      Reviewer #1, major points.

      1. One very important point for me is that the authors do not show the levels of expression of p53 in the p53-mClover stable cell lines. It is known that overexpressed p53 is usually more stable than endogenous levels of wt-p53. Therefore, I think it is necessary that the authors show the levels of the p53-mClover fusion proteins in the stably transduced cell lines compared to endogenous p53 levels in the parental RPE1 cells and also compared to the endogenous levels of R273H mutant in the PANC-1 cells.

      Response: We fully agree that the levels of overexpressed p53s are often more than the endogenous ones, due in part to increased expression and stability. In designing the reporter, we first tried to avoid the stabilisation of p53-GFP due to GFP aggregation by using the monomeric mClover-variant. Further, we titrated the WT and R273H clones (similar to our recent work in PMID: 35439056), to select clones with p53 levels closer to endogenous protein, and exhibiting high dynamic response to Nutlin-3a treatment.

      In the revised submission, we will include Western blotting comparing the levels of p53-mClover (WT and R273H) expression to the endogenous p53s in RPE1 (WT) and PANC1 (R273H) cell lines, in the presence or absence of Nutlin-3a.

      Also the functionality of the wild-type p53-mClover fusion is questionable, at least not shown. One would expect that the overexpression of a functional wt-p53 in p53-KO cells will affect the survival of the RPE1 cells. In Figure 5A the authors show that depletion of MDM2 or C16ORF72 is toxic for the RPE1 cells in a p53-dependent manner, indicating that elevated levels of p53 cannot be handled by these cells. So, experiment(s) showing that the wt-p53/mClover fusion is functional is needed.

      Response: We agree that it is important to benchmark the reporter design. Ectopically expressed WTp53 is often observed to have reduced functionality compared to endogenous WTp53. The WTp53-reporter line behaves similarly to the RPE1 line (p53-proficient), in that both chemical (e.g. Nutlin) and genetic perturbation (e.g. depletion of MDM2/C16orf72) are toxic in a p53-dependent manner. In line with these data, we have observed that the WTp53-reporter line is able to induce a p53 response, as demonstrated by induction of p53-target genes such as p21, which is not observed in p53-null RPE cells, although the p21 induction is not as dramatic as in RPE1 cells with endogenous WTp53. Together, these data indicate that our WTp53-reporter is functional, albeit with somewhat reduced activity.

      In the revised submission, we will better demonstrate the functionality of the WTp53-mClover fusion by probing WTp53 target (e.g. p21), in the presence and absence of Nutlin. This is also performed as a part of the experiment addressing Point #1 above.

      A second important point is that the 'verification' of the hits from the screens is only done in one cancer cell line, PANC-1, with mutant p53. I would have like to see at least one other cell line with another p53 mutant endogenously expressed that is also regulated by FBXO42/CCDC6.

      Response: We will include validation of the hits (FBXO42, CCDC6) in one or two other tumour lines with a confirmed endogenous R273H mutation (e.g. MDA-MB-468, etc).

      For many of the p53-mutants, a bimodal expression is observed. In the FBXO42- and CCDC6-depleted cells, the equilibrium shifts towards more negative cells but the levels in the two populations itself don’t change (while for example for USP28 depletion also the right peak shifts further up, Fig S4E). Is there any correlation with the cell cycle and p53 expression? And can the authors exclude that FBXO42 and CCDC6 are involved in cell cycle progression and hereby influence p53 indirectly (by combining PI staining with Clover-p53 for example).

      Response: We have indeed observed “bimodal” levels in the reporters of several mutants, which are also observed in other studies probing the endogenous p53 level (PMID: 29653964); while the population equilibrium shifts, the location of each peak (as a proxy of the p53 level) is more stable.

      Regarding the relation between p53-level and cell cycle stage, indeed, both the authors in the paper above and we have probed this possibility, but were unable to establish a direct connection.

      In the revised submission, we will add flow cytometry analysis of the p53-mClover level, and the cell cycle position using Hoechst 33342 (live-cell permeable DNA staining).

      The authors claim that the FBXO42-CCDC6 axis specifically regulates the stability of some p53 mutants, including the R273H mutant, in a manner involving USP28. But USP28 regulates all forms of p53, not just some mutant versions. How can the authors reconcile this apparent contradiction?

      Response: We thank the reviewer for this critical observation. From our screen (Supplemental Table 1A), we have indeed noticed pronounced effects (|Z score| >= 3) of FBXO42 on R273H and R248Q stability, and a marginal effect on wild-type p53. Similarly, USP28 had pronounced effects on R273H, R248Q and WTp53.

      In the discussion of the paper, we noted that USP28 was shown to regulate p53 levels through distinct mechanisms:

      ‘USP28 was originally implicated as a protective deubiquitinating enzyme counteracting the proteasomal degradation of p53, TP53BP1, CHEK2, and additional proteins [68-71]. USP28 regulates wild-type p53 via TP53BP1-dependent and -independent mechanisms. Concordantly, our data shows that USP28 and TP53BP1 are strong positive regulators of wild-type p53. However, while USP28 was also a strong hit in the mutant R273H p53 screen, TP53BP1 was not, indicating that the effects we see upon loss of USP28 on R273H p53 are independent of TP53BP1.’

      Together, this indicates that the R273H-mutant is regulated by a FBXO42-CCDC6-USP28 axis while wild-type p53 is regulated mainly via a USP28-TP53BP1 axis. We will attempt to address and discuss it in the revision.

      On a similar note, the authors show that FBXO42 and CCDC6 interact with p53, but not USP28. Do FBXO42 and CCDC6 interact with each other and with USP28? And is the interaction with p53 specific for the R273H version? This part of the mechanism is very poorly defined and the Co-IPs are not very convincing or relevant for the proposed model.

      Response: This comment will be more extensively addressed in the revision. We have indeed observed the interaction between FBXO42 and CCDC6 (via BioID and APMS); however, we failed to recover USP28 as an interactor of either FBXO42 or CCDC6. The interaction of CCDC6/FBXO42 with p53 is not specific to R273H; although we were able to IP endogenous R273H with CCDC6 in the PANC-1 line, WTp53 (as in the HEK 293 TRex BioID line) was also picked up among the BioID preys of CCDC6/FBXO42. In addition, we have new data to show that FBXO42 directly interacts with WTp53.

      In the revised submission, we will improve the molecular underpinning of the FBXO42-CCDC6-USP28-p53 axis we propose. We will specifically address the following.

      (1.1.) Biochemically, we will provide further support that CCDC6 and FBXO42 regulate p53 by regulating USP28 stability. We will address this with established biochemical assays, e.g. cycloheximide-chase/MG132 experiments. While USP28 is an established WTp53 regulator, little is known about the mechanism or the “upstream” regulation of USP28; we will attempt to fill this gap.

      (1.2.) Using an unbiased systematic approach, we will examine how the R273H interactome changes upon the loss of CCDC6 or FBXO42.

      We will perform R273H-BioID upon loss of CCDC6 and FBXO42 and USP28.

      (1.3.) Furthermore, we will specifically examine the USP28-p53R273H interaction with or without genetic perturbation of FBXO42/CCDC6.

      Through these efforts, we hope to gain further mechanistic insights into this regulatory axis, but hope that the editors and reviewers will agree that a fully annotated mechanistic understanding is probably beyond the scope of this paper.

      Reviewer #1, minor points.

      The mechanisms of p53 regulation may vary greatly in different cell lines. Can the authors discuss why they choose to do the screen with different mutants, rather than with different cell lines expressing these same mutant endogenously?

      Response: While it is certainly very interesting to assess how WT and mutant p53 are regulated in different cell lines, such an approach is confounded by the ‘genetic make-up’ of the respective tested cell lines. For example, TP53BP1 might be a regulator in one cell line but not in another for the simple reason that the latter cell line harbors a TP53BP1 deletion, mutation or altered expression level. In addition, while working with endogenous p53 mutations certainly has many advantages, comparing different mutants in different cell lines is again very much confounded by the ‘genetic make-up’ of the respective tested cell lines.

      Our focus was slightly different: we set out to ask specifically what the differences between p53 hotspot mutations are. Are they all the same, or are there differences? And, importantly, are there differences between the mutants and WT p53? These questions can only be answered when working in the same cellular background. In designing the screen, we have thus tried to optimise the inclusion of different hotspot mutants in an isogenic screening system. As such, we first depleted the endogenous WTp53 to minimise its interference and built the current isogenic system in the non-transformed RPE1 (“normal”) line.

      However, as discussed above, we agree that the screen results should be validated in more cell lines carrying the respective endogenous mutants.

      Figure 1: Typo in the legends: "Nultin" instead of "Nutlin"

      Response: We apologise for the typos. This is addressed in the current submission, along with improved figure legends to improve readability.

      Figure 1b,1c: Show basal and Nutlin-3-induced MDM2 levels in the overexpression cell lines; if WT-p53 is functional, MDM2 levels should be higher in WT-transduced cells compared to control or mt-p53 expressing cells.

      Response: In the revised submission, we will include Western blotting probing MDM2 levels (antibody permitting); this is a part of the experiment proposed for Points 1 and 2.

      Authors should explain why they name USP7 a negative regulator of p53, since it is supposed to de-ubiquitinate p53?!

      Response: The effects of USP7 on WTp53 have indeed been difficult to elucidate (by Prof. Vogelstein, PMID: 15118411 and PMID: 15058298, and seemingly opposite by Prof. Gu Wei, PMID: 15053880 and PMID: 11923872). However, consistent with Prof. Vogelstein's group, inhibition of USP7 (either by inhibitor or genetically via CRISPR in our studies) resulted in elevated p53 levels.

      Figure 2E: the effect of MG132 on p53 seems to be very minimal on this Western blot; it would need quantification to be convincing... Quality of the blot is also not great. The fact that in control cells the levels of p53 R273H are not affected by MG132 treatment fits with Suppl Figure 2E, indicating that the proteasome has no effect on p53 R273H.

      Response: We indeed noticed that while the proteasome pathway is largely implicated in the WTp53 screen, it has much reduced effects on R273H. Interestingly, MG132 treatment also has limited effects in the PANC-1 line (with endogenous R273H). We will repeat this experiment, provide quantification and modify the text accordingly.

      Suppl figure 3b, 3c, 3d:

      Somehow, I have the feeling that the results from the western blots and the FACS do not match fully, although not all the time-points are shown in the various experiments.

      For example, the FACS analysis (3b) suggests that in control-transduced cells after 16hr p53 is still increased. However, that is not clear at all in the Western blot (3c)

      Is Suppl Figure 3d the quantification of 3c experiment? If so, in the blot also the 24 hrs should be shown.

      The blot shown in Suppl Figure 3c suggests that CCDC6 expression increased upon irradiation. Do the authors agree with that? Would that explain why depletion of CCDC6 has more effect upon irradiation?

      Suppl Figure S3E: if I am right, this is essentially the same type of experiment as shown in figure 2e, but analysis of p53-expression by Western blot. In that blot no real effect of MG132 on p53 levels could be seen. But here, in the FACS analysis, MG132 clearly increases the p53-Clover fusion levels; for me again that Western blot and FACS data do not necessarily match.

      Response: We apologise for the confusion. In the revised submission, we will improve the figure legends for better readability. Furthermore, in anticipation of the multiple cell lines involved in the revision, we will also clarify the cell lines in the figure.

      With regard to the difference between the flow cytometry and WB data, we have generally observed the bimodal shift in flow cytometry to be more sensitive than WB, e.g. a 50% shift in population (FACS) is reflected by a 15% reduction in WB. This may be partially explained by the fact that WB measures the average across the cell population, whereas FACS determines the p53-GFP level of every cell and thus captures the shift of cells between peaks. Similarly, we noticed that flow-cytometry-based quantification of endogenous p53 by antibody staining yielded similar sensitivity (PMID: 29653964). As such, we will ensure that the validation of hits is performed in both modes. For the WB experiments, we will do so in two cell lines carrying the endogenous mutants, as suggested by Reviewers #1 and 2.
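
      The averaging argument above can be illustrated with a small two-state mixture calculation. This is only a sketch under stated assumptions: the peak levels (1.0 and 1.5, arbitrary units) and the occupancy fractions (0.8 and 0.4) are hypothetical values chosen for illustration, not measurements from the study, and the function names are invented here.

```python
def bulk_signal(frac_high, low_level, high_level):
    """Population-average signal for a two-state (bimodal) population.

    frac_high  : fraction of cells sitting in the upper peak (0..1)
    low_level  : per-cell signal of the lower peak (arbitrary units)
    high_level : per-cell signal of the upper peak (arbitrary units)
    """
    if not 0.0 <= frac_high <= 1.0:
        raise ValueError("frac_high must be between 0 and 1")
    return frac_high * high_level + (1.0 - frac_high) * low_level


def relative_bulk_change(frac_before, frac_after, low_level, high_level):
    """Fractional change of the bulk signal when only peak occupancy changes."""
    before = bulk_signal(frac_before, low_level, high_level)
    after = bulk_signal(frac_after, low_level, high_level)
    return (after - before) / before


# Hypothetical example: the upper-peak fraction is halved (0.8 -> 0.4)
# while the position of each peak stays fixed.
change = relative_bulk_change(0.8, 0.4, low_level=1.0, high_level=1.5)
print(f"bulk signal change: {change:.1%}")
```

      With these assumed numbers, halving the occupancy of the upper peak lowers the population average by only about 14%, which is the kind of gap between a per-cell readout and a bulk readout described above. The real ratio depends on how far apart the two peaks are, which this sketch does not attempt to estimate.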

      Figure 3B: In the CCDC6 IP a very small amount of p53 can be found. I don't know how much input lysate compared to amount of lysate for IP is used, but the percentage of p53 found interacting with CCDC6 seems so marginal that it is difficult to explain the effect of KO of CCDC6 in PANC1 cells.

      And, the authors called it a 'reciprocal IP' (Suppl Figure 4a) after transfection of V5-tagged CCDC6 into PANC1 cells, but it actually is the same type of IP. Did the authors try to IP p53 and blot for CCDC6? That would be a reciprocal IP.

      Response: We apologise for the confusion. In the revised submission, we will specify the portion of the lysates used for pre-IP (5% lysate) and IP (1 mg). As for the IP, we will also include the true reciprocal IP (IP p53, and blot for CCDC6).

      Figure 3H: how can authors explain that basal levels of USP28 in control and CCDC6-KO cells transfected with control plasmid are more or less the same and not reduced in the CCDC6-KO cells?

      Response: We will provide a better blot and quantification for this observation. In the current Fig 3H, the CCDC6-KO lane is slightly overloaded, as seen by the H3 loading control.

      Figure 3I: Essentially the whole blot here is of low quality; especially the FBXO42 blot; is deletion of USP28 increasing FBXO42 protein levels, or is it just the quality of the blot? All in all it seems that FBXO42 is very low expressed in the used cell lines.

      Response: We apologise for the confusion. In the revised submission, we will repeat the experiment and try to include a higher-quality WB, with better optimised conditions for the FBXO42 antibody.

      FBXO42 messenger level is readily detected using qRT.

      Figure 4B: I find it a bit surprising that USP7 is also found in the synthetic viability screen, since it has been shown that USP7 has many more essential targets and KO of p53 only partially rescues the development of USP7-KO mouse embryos.

      Response: We thank the reviewer for this critical observation. While the double p53-USP7 knockout line is viable, we acknowledge that it is amongst the top scored hits due to the large differential viabilities between WT and p53-null lines. In the revised submission, we will further clarify the screen analysis and the associated interpretation.

      Figure 5: the authors nowhere show the efficacy of the guides targeting c16orf72. A Western blot showing the expression and the reduction upon expressing the guide-RNAs is essential.

      Response: We thank the Reviewer for this suggestion. The efficacy of each guide has been verified using ICE (at the genomic level), and in the revised submission, we will include this critical information as part of the Figure S2F.

      Figure 5E: First, here probably parental RPE1 cells have been used, but that is not stated. Second, the authors state 'only a slight increase in p53 levels upon siHUWE1'; I would say none compared to scrambled.

      I know HUWE1 is a very huge protein, but the blot of HUWE1 is not convincing. I seem to be able to conclude that siMDM2 and siUSP7 reduces HUWE1 levels?

      Response: We apologise for the confusion. In the revised submission, we will state the cell line used in each figure, to improve readability.

      We agree with the reviewers that assessment of large proteins by WB is often difficult, but the fact that this band almost completely disappears upon HUWE1 knock-down strongly argues that we are indeed assessing endogenous HUWE1. We also agree that it is an interesting observation that the levels of HUWE1 seem to be slightly reduced upon knock-down of MDM2 and USP7. We will repeat these experiments and provide quantitative data for HUWE1 and p53. Of note, in the screen, HUWE1 also scored as a negative regulator of wt-p53 and did not quite reach statistical significance for the p53 mutants.

      Regarding the relationship between C16orf72 and HUWE1, a newly published work (PMID: 35776542) seems to suggest that siHUWE1 resulted in an increased C16orf72 level (termed HAPSTR1 in the paper), while siC16orf72 seemed to have no effect on the HUWE1 level, although it is often difficult to draw conclusions about the stability of such a large protein from WB.

      Figure 5F, in relation to figure 5D. Here the authors overexpress both c16orf72 and USP7, and find an interaction. The implication of that is not clear. If they want to make a point of this interaction, they should have looked at endogenous proteins.

      Response: We acknowledge the many concerns associated with coIP of ectopically expressed, and especially heavily overexpressed, proteins. In the revised submission, we will attempt to perform IP experiments on endogenous proteins (antibody permitting).

      It is worrying that USP7 apparently was not one of the hits in the Mass-spec experiment of which results are shown in Figure 5D. Also in that experiment c16orf72 was overexpressed, and USP7 is very highly expressed in essentially all cell lines, so do the authors have an explanation?

      Response: We indeed acknowledge this discrepancy. In the revised submission, we will attempt the coIP/IP using endogenous proteins (antibody permitting, or at least using the endogenous target for one of the two partners). We also acknowledge the limitations of APMS for the detection of interactors.

      Suppl. figure 5D is missing

      Response: We apologise for the confusion. The Figure S5D was inconveniently placed at the top of the figure panel due to space limitation. In the revised submission, we will address this as a part of the overall readability improvement.

      Reviewer #1, Significance.

      The topic of the paper is of high interest given the relevance of p53 and its gain-of-function mutants in oncology, and the screens are well executed and clearly presented. In terms of novelty, FBXO42 has been linked to p53-degradation before, and c16orf72 was recently shown to be able to destabilize p53. However, the link between CCDC6 and p53 is novel and of interest, since they are both substrates of USP7 and are both regulators of the cell cycle.

      We think the manuscript has potential to add something to the field, but would benefit greatly from a better understanding of the molecular underpinnings of their newly described mechanisms, as well as the conditions in which the mechanism is active.

      Therefore, it might be advisable to shorten the manuscript, and go more in-depth in finding the mechanisms of regulation.

      Response: We sincerely thank the reviewer for all the constructive critiques. We will incorporate them into our revision.

      Reviewer #2.

      Reviewer #2 summary:

      The paper describes several genome-wide CRISPR screens designed to identify regulators of p53 stability. The authors use a system in which p53 levels are marked by mClover expression, using RFP expression to normalise for gene expression changes.

      Reviewer #2, major points.

      1. The bimodal distribution of p53 expression levels in some reporter cell lines (G245S, R248Q, R248W and R273H) hampers the implementation of a robust readout and makes correct interpretation of the results challenging. While it is possible that the bimodal distribution indicates dynamic changes in p53 levels within one population, it also seems possible that a subclone of these cells have acquired additional alterations affecting p53 stability, and that the authors are screening a mixed population of two intrinsically different cell populations. This would make it difficult to interpret the results of the screen in these cell lines and may be a challenge when trying to identify something that has not already been highlighted on depmap.

      Response: We thank the reviewer for this critical observation. We strongly believe that this bimodal distribution is an inherent property of the p53 mutants in these cells, for the following reasons: (1) The observation of a similar bimodal appearance in cell lines harbouring the corresponding endogenous mutant p53s (PMID: 29653964) suggests that these two populations are of biological significance. (2) We have established 5-10 clonal lines each from the G245S, R248Q, R248W and R273H p53 reporter lines, and all of them exhibit a bimodal distribution, making it very unlikely that these populations all arise through stochastic outgrowth of sub-populations with spontaneous mutations/alterations. (3) The bimodal distribution is stable over several months to years in culture. If a spontaneous mutation had given rise to a clone with higher mutant p53 levels, we would expect this clone to take over the population over time. (4) We observed that such a pool of bimodal cells could be “synchronised” (e.g. by Nutlin, or MDM2 knockout) to one population, and later return to and repopulate the other (e.g. Nutlin wash-off, Figure 1B). (5) When we sort single cells from the upper or the lower peak and expand them, we again obtain populations of cells with the same bimodal distribution, indicating that this is a dynamic process. Thus, we believe that these two populations are intrinsic, such that a cell in the population may assume either state.

      We also acknowledge the difficulties of screening using a bimodal population; however, we took advantage of these “bimodal” mutants and used FACS to assess the state of a single cell in relation to a genetic perturbation. Each guide has an equal chance of entering a cell that belongs to either of the two populations. If a gene knock-out really affects p53 levels, the cells with the respective guides are enriched in one population and depleted in the other, and the analysis comparing the guide abundances from these two peaks ensures that the experiment is internally controlled.
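      As a purely illustrative aside (the authors' actual analysis pipeline is not described in this letter), the comparison of guide abundances between the two sorted peaks can be sketched as a per-guide log2 ratio of read fractions. The function name, the pseudocount and the toy read counts below are all hypothetical and are not taken from the manuscript.

```python
import math


def log2_enrichment(high_counts, low_counts, pseudocount=1.0):
    """Per-guide log2 ratio of read fractions: high-p53 peak vs low-p53 peak.

    Hypothetical helper for illustration only. A positive score means the
    guide is relatively enriched in the high peak; a negative score means
    it is relatively enriched in the low peak.
    """
    guides = set(high_counts) | set(low_counts)
    # Normalise by total reads per sorted population so that sequencing
    # depth differences between the two peaks cancel out.
    high_total = sum(high_counts.values()) + pseudocount * len(guides)
    low_total = sum(low_counts.values()) + pseudocount * len(guides)
    scores = {}
    for guide in guides:
        high_frac = (high_counts.get(guide, 0) + pseudocount) / high_total
        low_frac = (low_counts.get(guide, 0) + pseudocount) / low_total
        scores[guide] = math.log2(high_frac / low_frac)
    return scores


# Toy read counts (invented): sgA is over-represented in the high peak,
# sgB in the low peak.
toy_scores = log2_enrichment({"sgA": 300, "sgB": 100}, {"sgA": 100, "sgB": 300})
```

      Because both peaks come from the same transduced pool, each guide serves as its own control in such a ratio; a real analysis would additionally aggregate guides per gene and estimate an FDR.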

      While many of the top-scoring hits from the resulting screens are known regulators, it is critical that we validate our hits in an independent system, such as cell lines harbouring endogenous p53 mutations, a point raised by both Reviewers #1 and #2.

      The coverage of the sgRNA library (200x) is rather low for a negative selection screen, where a coverage of 500x would be more desirable. The FDR threshold is also rather lenient; a more stringent FDR threshold would seem more appropriate and would shorten the list of potential hits.

      Response: We thank the reviewer for this constructive suggestion. A higher coverage, along with a more stringent FDR, will ensure even stronger confidence in the remaining individual hits. The present reporter-based enrichment screen and the synthetic viability drop-out screen used four guides per gene, with 200x coverage for each guide.

      In determining the coverage, we referred to recent successful screens and applied our earlier titration results to arrive at the 200x coverage (e.g. PMID: 26627737, PMID: 33465779, and reviewed in Nat Rev Methods Primers 2, 8 (2022). https://doi.org/10.1038/s43586-021-00093-4). While the FDR threshold is often arbitrary, we fully agree that a more stringent FDR, which results in a shorter hit list, may further boost the confidence in the hits, though at the cost of losing potential hits due to collateral effects (e.g. guide efficiency).

      We agree with this reviewer that a more stringent FDR threshold, especially for the hits that result in p53 stabilization, would make sense, as any gene whose loss causes cellular or genotoxic stress would likely lead, at least in part, to p53 stabilization. In the revised submission, we will adjust the FDR accordingly.

      Although the study is focused on the regulation of p53 stability, there are no experiments to show that any of the manipulations alter the ubiquitination or degradation (half-life) of p53. The rescue of expression by proteasome inhibition is very modest (Figure 2E), suggesting the loss of expression may not be a reflection of degradation. A role for endogenous FBXO42 and C16orf72 in regulating the ubiquitination and half-life of endogenous p53 should be confirmed.

      Response: We thank the reviewer for this suggestion. In the revised submission, we will monitor the ubiquitination status and perform degradation (cycloheximide-chase) experiments in R273H cells, with or without genetic alteration of CCDC6/FBXO42/C16orf72.

      Many p53 mutants are used for the initial screens, but very little validation is carried out to show that the apparent differences in factors regulating their stability persist in cells naturally expressing these mutants. For example, FBXO42 is identified as a protein required to maintain the stability of R273H, R248W and R248Q, but not R175H, G245S and R337H. While the authors show an association of CCDC6 and p53 in PANC1 cells (expressing R273H), it would be important to show a panel of R273H, R248W and R248Q expressing tumor cells and the response of p53 to FBXO42 and CCDC6 depletion, compared to similar experiments in a panel of R175H, G245S and R337H expressing tumor cells. Again, it would be important to show that any changes in protein levels are due to changes in protein stability.

      Response: We thank the reviewer for this suggestion. In the revised submission, we will include validations in more cell lines carrying endogenous mutant p53s, with a focus on the R273H mutant. We will also try to include a line with an endogenous p53 mutation that does not respond to FBXO42/CCDC6 alteration.

      The potential hits should also be tested in wild type p53 expressing cells to confirm the specificity to mutant p53s.

      Response: In the revised submission, we will include WB for WT lines (e.g. RPE1) upon genetic alteration of CCDC6 and FBXO42. This was already performed for C16orf72 (Figure 6D).

      (6A) The role of C16orf72 in restraining p53 activity has been reported previously, as has the interaction with HUWE1 (including a new publication PMID: 35776542). The authors suggest an interaction between C16orf72 and USP7, although this should be shown with endogenous proteins. The relative importance of USP7 and HUWE1 binding is not explored. (6B) The effect of C16orf72 overexpression in promoting mammary tumors is impressive, although maybe the more interesting question is whether inhibition of C16orf72 expression can limit tumor development in this system.

      Response to 6A: We are excited about the independent observations by other group(s) confirming similar results! As part of our improved mechanistic work-up, in the revised submission, we will attempt to address whether C16orf72's regulation of p53 is dependent on USP7 and/or HUWE1, or other known E3s, such as MDM2.

      (1) Whether the interaction of C16orf72 and HUWE1 or USP7 is required for the C16orf72 regulation of p53. Specifically, for example, we will perform epistasis experiments to test USP7's or HUWE1's ability to rescue the p53 levels in reporters upon ∆C16orf72. Due to the toxicity/lethality in WTp53 lines induced by the loss of C16orf72, we intend to test this using the R273H-reporter, or an RPE1-line with ∆CDKN1A (p21), which is a synthetic viable rescue for ∆C16orf72.

      (2) In the revised submission, we will attempt to perform a C16orf72-USP7 IP experiment with endogenous proteins (antibody permitting).

      6B. The effect of C16orf72 overexpression in promoting mammary tumors is impressive, although maybe the more interesting question is whether inhibition of C16orf72 expression can limit tumor development in this system.

      Response: We are equally excited about the in vivo result supporting the idea that C16orf72 overexpression in tumour-prone mice (Pik3caH1047R) harbouring WTp53 may accelerate tumour formation. In the revised submission, we will further support that this effect is specific to WTp53/C16orf72, by including data from the control cohort with a p53-null background (LSL-Pi3kH1047R; p53Flox/Flox).

      In regard to the effects of C16orf72-depletion in controlling tumour growth - we agree that this would be a very exciting avenue. Conditional C16orf72 mice are being made at the moment and these mice will allow us to comprehensively address this question. However, it will take several more months to generate and validate this line, and then another 2 breeding rounds to generate homozygous C16orf72fl/fl; Pik3caH1047R mice. In addition, given the long time required to form tumours in the control mice with WTp53 (~250 days), it is not feasible for us to test whether the inhibition of C16orf72 could limit tumour development within the revision timeline. As such, we respectfully believe that this would be beyond the scope of this manuscript.

      Reviewer #2, Minor comments.

      Figure 1b: The nutlin concentration stated in the methods section is wrong. Should be 10 µM instead of 10 nM (correct in figure legend).

      Figure 6b: y-axis label is missing.

      Figure 1e/f Legend: Should be FDR 0.5.

      Response: We apologise for the typos. The current submission has incorporated the corrections.

      Figure 1c: Include results for a mutant that is not regulated by MDM2, such as R175H. Otherwise, as a standalone experiment, this figure doesn't add much.

      Response: We thank the reviewer for this suggestion. In the revised submission, we will include R175H/R337H.

      Figure 1h: While an UpSet plot is an elegant way to present unique and overlapping hits between different screens, Venn diagrams might be more 'accessible' to many readers and easier to understand.

      Response: We thank the reviewer for this feedback. The choice of an UpSet plot was largely motivated by the number of different categories involved, which made the area representation and the intersections of a conventional Venn diagram no longer feasible.

      In the revised submission, we will improve our figure legend for the UpSet plot, to improve the readability.

      Might be worth stating that mClover is an eGFP variant and can therefore be targeted by eGFP sgRNAs so that it is easier to understand the following:

      o Page 5, paragraph 1: "We used the TKOv3 sgRNA library, which contains [...] 142 control sgRNAs targeting EGFP, LacZ and luciferase"

      o Page 5, paragraph 2: "As expected, sgRNAs targeting p53 and mClover were the most depleted sgRNAs, [...]"

      Response: We thank the reviewer for this suggestion. We believe this will also improve the readability and have incorporated this into our current submission.

      Reviewer #2, Significance.

      Reviewer #2 (Significance (Required)):

      This is an interesting concept and the results could provide a useful resource for groups interested in the regulation of p53. The authors chose to focus on candidate genes that could have been identified by looking for the top 30 p53 co-dependent genes on depmap (C16orf72 is #24 in this list and FBXO42 is #28, most of the other genes ranking above are already known as p53 regulators). While this validates the screen, it would have been interesting if the authors had identified and validated new regulators of p53 that were not apparent from previously published work.

      Response: We thank the reviewer for all the thorough and constructive comments! In relation to the DepMap dataset, we are excited that many of the top hits from our screens are indeed top WTp53-correlators/anti-correlators (e.g. MDM2, USP28)!

      While the DepMap dataset used cell fitness/viability to construct the genetic relation score, this assay may not effectively rule out the many regulators that could otherwise elicit their regulation of p53 via regulating the general cell response to cell cycle, stress, etc. In our screen systems (i.e. protein stability and synthetic viability screens), we attempted to focus on the regulators of p53-stability (post-translational), and further coupled it with the synthetic viability screens to concentrate on hits that have a more direct role in p53 regulation (e.g. MDM2, C16orf72).

      Another difficulty in fully coupling our screens to the DepMap dataset is the limited number of cell lines harbouring endogenous mutant p53s, e.g. R337H. This may also contribute to the uniqueness of the identified R337H-reporter specific hits (cell lines harbouring R337H have not yet been included in the DepMap dataset), e.g. several aminoacyl-tRNA synthetases (SARS, YARS, etc.) were identified as R337H-unique regulators and subsequently verified using different guides in the reporter line, but could not be obtained via DepMap.

      We largely see this paper as a resource for the p53 field and would like to publish it as soon as possible. In fact, when we started working on C16orf72 or CCDC6/FBXO42, these hits were not known for their ability to regulate p53. We will work up several other hits, but this would be beyond the scope of this paper and of the first author’s Ph.D. thesis, which needs to be completed within a fixed timeline.

      Reviewer #3.

      Reviewer #3 summary:

      The manuscript by Lu and coworkers describes genome-wide CRISPR screens to search for genes that, when knocked out, lead to p53 accumulation or degradation. Wt p53 and a panel of p53 hotspot mutants were chosen as reporters for the screen. The approach reassuringly identified many previously described regulators of p53 degradation, and also found a large set of new hits, many of which appear to be indirectly affecting p53 level.

      A key step of this approach is the follow up functional and mechanistic study of the hits. To this end, the authors chose FBXO42 as a top hit that blocks mutant p53 degradation, and C16orf72 as a top hit that promotes wt/mutant p53 degradation.

      Overall the functional data for FBXO42 is disappointing. FBXO42 knockout has quite a modest effect on mutant p53 level (~50% reduction). The knockout also showed some effect on p53 mRNA level (~25% reduction), making the determination of mechanism difficult. It does not appear to be a promising target for reducing mutant p53 level and gain of function activity in tumor cells.

      Response: We thank the reviewer for this constructive comment! We will address this in the revision, as proposed in Point #3.

      The C16orf72 finding unfortunately lost some novelty because it was independently identified as a p53 regulator in a recent study using CRISPR screening (PMID: 33660365). However, the repeated identification is reassuring and the current work provides more convincing functional data, showing that C16orf72 knockout increases wt p53 level and inhibits cell proliferation specifically in p53+/+ cells, and that overexpression of C16orf72 reduces wt p53 level and accelerates progression of a breast tumor mouse model. Their results suggest C16orf72 is a biologically relevant regulator of p53 in cancer development. In order to provide a reasonable amount of new information and set it further apart from the published study, some biochemical analysis looking into the mechanism of C16orf72 will be helpful.

      Reviewer #3 Major and Minor comments:

      Specific comments:

      1. There appears to be a mix up in the figure legend for Fig.1A describing line 1 and 2.

      Response: We sincerely apologise for the mix up in the figure legend! In the current submission, this has been fixed.

      Fig.2. Data for some p53 mutants mentioned in the text cannot be found in the main figure 2D and supplemental figure S3A.

      Response: We apologise for having not included the R175H and R337H mutants in Supplemental Figure S3A. In the revised version, we will include these two mutants.

      Fig.2 E-F. The effects of FBXO42 and CCDC6 KO on endogenous mutant p53 level are small (~50% decrease). Given that mutant p53 accumulates at high levels, whether a 50% decrease has a meaningful effect on its gain of function activities is questionable. The knockouts also caused a ~25% decrease in p53 mRNA (FigS3F), which makes the mechanism quite difficult to investigate further.

      Response: We agree with the reviewer that the current data make it difficult to draw conclusions about the mechanism. Given the design of our reporter, we still believe that the regulation could largely be at the post-translational level. In our revised version, we plan to examine the ubiquitination status of p53 upon loss of CCDC6/FBXO42, and also monitor p53 degradation via cycloheximide chase.

      To further address whether this reduced level of mutp53 has biological impact, we plan to test it in the tumour cell context. Given the difference in migration capability observed between the PANC-1 and PANC-1-∆p53 lines (e.g. PMID: 35439056), we plan to also evaluate the migration pattern of PANC-1 in the presence and absence of FBXO42/CCDC6 (controlled by similar FBXO42/CCDC6 loss in the PANC-1-∆p53 background). Furthermore, although in tissue culture there is only marginal to no difference in cell growth rate between many mutant p53 lines (e.g. PANC-1) and their ∆p53 lines, we plan to test whether a reduced serum or nutrient level could exacerbate the difference, and hence be used to monitor the difference resulting from the loss of FBXO42/CCDC6.

      Fig.3B. The IP experiment using p53 shRNA and control shRNA should be done by IP of p53 followed by CCDC6 western blot. If CCDC6 IP is used as in the figure, then a CCDC6 shRNA knockdown sample should be compared to control shRNA. The current data does not rule out the possibility that CCDC6 antibody can nonspecifically pull down some p53.

      Response: We apologise for the confusion. In the revised version, we will include the proper reciprocal IP, with IP of endogenous p53 (R273H) followed by blotting of CCDC6.

      Fig.3D. The in vitro pull down experiment needs specificity controls such as non affected R175H p53 core domain. The data presented would suggest that MBP-FBXO42c captured more than 1:1 molar ratio of R273H core domain, which is unusual for specific binding unless there is aggregation of p53.

      Response: We thank the reviewer for this constructive comment! In the revised version, we will incorporate this, by repeating the in vitro pull-down assay including a non-p53 control protein.

      To increase the impact of the current study, the authors could provide more mechanistic insight on how C16orf72 regulates p53 level, which was also missing in the other published study. For example, addressing whether the C16orf72 effect is dependent on MDM2. Does it cooperate with MDM2 to ubiquitinate p53? Does it promote p53 ubiquitination in the absence of MDM2, since it interacts with HUWE1? Does it act by recruiting USP7 to stabilize MDM2?

      Response: We thank the reviewer for this very constructive and thorough comment! In our revised version, we will attempt these assays and incorporate them into the submission.

      Together with our response to Reviewer #2, Point #6, in the revised submission, we will attempt to address if C16orf72 regulation of p53 is dependent on MDM2 or HUWE1.

      (1) Whether the interaction of C16orf72 and HUWE1, or C16orf72 and USP7, is required for the C16orf72 regulation of p53. Specifically, for example, we will perform epistasis experiments to test HUWE1's or USP7's ability to rescue the p53 levels in reporters upon the loss of C16orf72 (∆C16orf72). Due to the toxicity/lethality in WTp53 lines induced by the loss of C16orf72, we intend to test this using the R273H-reporter, or an RPE1-line with ∆CDKN1A (p21), which is a synthetic viable rescue for ∆C16orf72.

      (2) Whether C16orf72 depends upon or cooperates with MDM2 in regulating p53.

      We will first probe whether C16orf72 overexpression increases p53 ubiquitination, and then determine whether overexpression of C16orf72 has additive effects with MDM2 overexpression in regulating p53 levels.

      We previously observed that overexpressing C16orf72 could not rescue the R273H level resulting from the loss of MDM2 (using flow-cytometry in R273H-reporter-∆MDM2), and as such, we plan to test the C16orf72-MDM2 relation in the MDM2-proficient context.

      The manuscript is in a form extremely unfriendly to review: text, figures and legends are all split up at multiple locations, and the PDF figures are very sluggish to scroll.

      Response: We sincerely apologise for the inconvenience. In the current submission, we have split the submission into three separate files, (1) main text, (2) main figures, and (3) supplemental figures, along with (4) supplemental tables as individual Excel files. We will also reduce the resolution of a few images, so that the overall high resolution is retained while still fitting within the file size limit.

      Reviewer #3 (Significance (Required)):

      The work is significant in identifying a functionally relevant regulator of p53 stability.

      Response: We thank the reviewer again for the very constructive feedback!

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript by Lu et al., the authors describe some CRISPR screens and protein-protein interaction screens to identify novel regulators of wild-type p53 and mutant p53 function and stability. Besides generating a wealth of data, they discover FBXO42-CCDC6 as positive regulators of some p53 hot-spot mutants, including R273H mutant p53, but not of all p53 mutants tested and also not of wild-type, indicating selectivity. Furthermore, they found C16orf72 (TAPR1) as a negative regulator of p53 stability. Mechanistically, the authors claim a direct interaction between FBXO42 and CCDC6 and p53, but the importance of these interactions has not been shown. On the other hand, the authors suggest that FBXO42/CCDC6 regulate p53 via destabilization of USP28, but this mechanism has not been worked out either. For c16orf72, they show that it interacts with USP7, but no relevance of this interaction is shown either.

      Major points

      One very important point for me is that the authors do not show the levels of expression of p53 in the p53-mClover stable cell lines. It is known that overexpressed p53 is usually more stable than endogenous levels of wt-p53. Therefore, I think it is necessary that the authors show the levels of the p53-mClover fusion proteins in the stably transduced cell lines compared to endogenous p53 levels in the parental RPE1 cells and also compared to the endogenous levels of R273H mutant in the PANC-1 cells.

      Also the functionality of the wild-type p53-mClover fusion is questionable, or at least not shown. One would expect that the overexpression of a functional wt-p53 in p53-KO cells will affect the survival of the RPE1 cells. In Figure 5A the authors show that depletion of MDM2 or C16ORF72 is toxic for the RPE1 cells in a p53-dependent manner, indicating that elevated levels of p53 cannot be handled by these cells. So, experiment(s) showing that the wt-p53/mClover fusion is functional are needed.

      A second important point is that the 'verification' of the hits from the screens is only done in one cancer cell line, PANC-1, with mutant p53. I would have liked to see at least one other cell line with another p53 mutant endogenously expressed that is also regulated by FBXO42/CCDC6.

      For many of the p53-mutants, a bimodal expression is observed. In the FBXO42- and CCDC6-depleted cells, the equilibrium shifts towards more negative cells but the levels in the two populations themselves don't change (while for example for USP28 depletion also the right peak shifts further up, Fig S4E). Is there any correlation with the cell cycle and p53 expression? And can the authors exclude that FBXO42 and CCDC6 are involved in cell cycle progression and thereby influence p53 indirectly (by combining PI staining with Clover-p53 for example)?

      The authors claim that the FBXO42-CCDC6 axis specifically regulates the stability of some p53 mutants, including the R273H mutant, in a manner involving USP28. But USP28 regulates all forms of p53, not just some mutant versions. How can the authors reconcile this apparent contradiction?

      On a similar note, the authors show that FBXO42 and CCDC6 interact with p53, but not USP28. Do FBXO42 and CCDC6 interact with each other and with USP28? And is the interaction with p53 specific for the R273H version? This part of the mechanism is very poorly defined and the Co-IPs are not very convincing or relevant for the proposed model.

      Minor points

      The mechanisms of p53 regulation may vary greatly in different cell lines. Can the authors discuss why they chose to do the screen with different mutants, rather than with different cell lines expressing these same mutants endogenously?

      Figure 1: Typo in the legends: Nultin instead of Nutlin

      Figure 1b,1c: Show basal and Nutlin-3 induced MDM2 levels in the overexpression cell lines; if WT-p53 is functional, MDM2 levels should be higher in WT-transduced cells compared to control or mt-p53 expressing cells. Authors should explain why they name USP7 a negative regulator of p53, since it is supposed to de-ubiquitinate p53?!

      Figure 2E: the effect of MG132 on p53 seems to be very minimal on this Western blot; it would need quantification to be convincing. The quality of the blot is also not great. The fact that in control cells the levels of p53 R273H are not affected by MG132 treatment fits with Suppl Figure 2E, indicating that the proteasome has no effect on p53 R273H.

      Suppl figure 3b, 3c, 3d:

      Somehow, I have the feeling that the results from the western blots and the FACS do not match fully, although not all the time-points are shown in the various experiments. For example, the FACS analysis (3b) suggests that in control-transduced cells after 16 hr p53 is still increased. However, that is not clear at all in the Western blot (3c). Is Suppl Figure 3d the quantification of the 3c experiment? If so, the 24 hrs time-point should also be shown in the blot. The blot shown in Suppl Figure 3c suggests that CCDC6 expression increased upon irradiation. Do the authors agree with that? Would that explain why depletion of CCDC6 has more effect upon irradiation? Suppl Figure S3E: if I am right, this is essentially the same type of experiment as shown in figure 2e, where p53 expression was analysed by Western blot. In that blot no real effect of MG132 on p53 levels could be seen. But here, in the FACS analysis, MG132 clearly increases the p53-Clover fusion levels; for me, again, the Western blot and FACS data do not necessarily match.

      Figure 3B: In the CCDC6 IP a very small amount of p53 can be found. I don't know how much input lysate compared to amount of lysate for IP is used, but the percentage of p53 found interacting with CCDC6 seems so marginal that it is difficult to explain the effect of KO of CCDC6 in PANC1 cells. And, the authors called it a 'reciprocal IP' (Suppl Figure 4a) after transfection of V5-tagged CCDC6 into PANC1 cells, but it actually is the same type of IP. Did the authors try to IP p53 and blot for CCDC6? That would be a reciprocal IP.

      Figure 3H: how can authors explain that basal levels of USP28 in control and CCDC6-KO cells transfected with control plasmid are more or less the same and not reduced in the CCDC6-KO cells?

      Figure 3I: Essentially the whole blot here is of low quality, especially the FBXO42 blot; is deletion of USP28 increasing FBXO42 protein levels, or is it just the quality of the blot? All in all it seems that FBXO42 is expressed at very low levels in the cell lines used.

      Figure 4B: I find it a bit surprising that USP7 is also found in the synthetic viability screen, since it has been shown that USP7 has many more essential targets and KO of p53 only partially rescues the development of USP7-KO mouse embryos.

      Figure 5: the authors nowhere show the efficacy of the guides targeting c16orf72. A Western blot showing the expression and the reduction upon expressing the guide-RNAs is essential. Figure 5E: First, here probably parental RPE1 cells have been used, but that is not stated. Second, the authors state 'only a slight increase in p53 levels upon siHUWE1'; I would say none compared to scrambled. I know HUWE1 is a very large protein, but the blot of HUWE1 is not convincing. Am I right to conclude that siMDM2 and siUSP7 reduce HUWE1 levels? Figure 5F, in relation to figure 5D. Here the authors overexpress both c16orf72 and USP7, and find an interaction. The implication of that is not clear. If they want to make a point of this interaction, they should have looked at endogenous proteins. It is worrying that USP7 apparently was not one of the hits in the mass-spec experiment whose results are shown in Figure 5D. Also in that experiment c16orf72 was overexpressed, and USP7 is very highly expressed in essentially all cell lines, so do the authors have an explanation?

      Suppl. figure 5D is missing

      Referees cross-commenting

      I agree essentially with all comments of Reviewer #2. Especially the major points 3 and 4. The use of more cell lines expressing endogenous mutant p53 is very important. In addition, I can agree with almost all comments of Reviewer #3. The effects especially of FBXO42 ablation are rather minimal, so relevance is questionable.

      Significance

      Nature and Significance

      Compare to existing literature

      The topic of the paper is of high interest given the relevance of p53 and its gain-of-function mutants in oncology, and the screens are well executed and clearly presented. In terms of novelty, FBXO42 has been linked to p53-degradation before, and c16orf72 was recently shown to be able to destabilize p53. However, the link between CCDC6 and p53 is novel and of interest, since they are both substrates of USP7 and are both regulators of the cell cycle.

      We think the manuscript has potential to add something to the field, but would benefit greatly from a better understanding of the molecular underpinnings of their newly described mechanisms, as well as the conditions in which the mechanism is active.

      Therefore, it might be advisable to shorten the manuscript, and go more in-depth in finding the mechanisms of regulation.

    1. If we think carefully of the way we find out what our human companions are thinking, we can not fail to be struck by the fact that our only method for obtaining such information is to be had by observing their conduct.

      Watson (1907) is pointing out one of the most important ways to learn about human/animal behavior: observation. In 1907 this may not have been as clear and simple, because psychology was not yet widely regarded as relevant at that time.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01204R

      Corresponding author(s): Alexander, Aulehla

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)): *

      *The paper by Miyazawa and colleagues addresses a key question: How is changed metabolic activity sensed, and how does it induce changes in developmental programs? In recent years, there is more and more indication that metabolism is not only a dull workhorse synthesizing the building blocks for new cells and providing chemical energy, but that metabolic activity itself also has a regulatory role. How this precisely works is largely unknown and also largely unexplored in higher cells. From early insights obtained in microbes, it seems that certain metabolites - possibly reflecting metabolic activity (i.e. flux) - could be metabolic signals that feed back into cellular regulation. *

      *The current paper takes this idea now to developmental processes, where the authors found that the glycolytic metabolite fructose-1,6-bisphosphate is a flux-dependent signal that interferes with developmental processes. This is a very exciting finding, as it indicates that this metabolite not only has a regulatory function in microbes but also in mouse during mesoderm development. *

      *Answering the question of how such a flux-dependent metabolite mechanistically interferes with the developmental processes is enormously difficult. Compared to other mechanistic studies, where deleting genes, modifying genes, and changing protein expressions will usually do the trick, here, perturbing metabolite levels is extremely challenging, particularly if such perturbations need to be carried out in a way that nothing else is perturbed. Researchers, who are not overly familiar with metabolism, usually underestimate the difficulty with targeted and insightful perturbation of metabolism. *

      *To this end, the authors of this paper need to be congratulated for a very well carried out study with very solid data, and excellent control experiments. The authors open up a new path towards understanding how embryo mesoderm development is regulated by metabolic activity. In particular, they show that glycolytic flux, FBP and important developmental phenotypes as well as protein localization changes are linked. As normal with a complex metabolism-based story as this one, there is always more that could be done. Yet, the results are highly important to be reported now such that the field as a whole can build on these interesting results and explore further the exciting path that has been opened by the authors. Thus, I strongly recommend publishing these findings: The data generated by the authors are accompanied by the required control experiments. The conclusions drawn are very solid. I do not have any major concerns but just a number of minor suggestions that the authors could consider in a revised version of the manuscript. *

      *Minor: *

        • At the end of the introduction, the authors stated their original goal. As it is phrased, it is unclear whether this goal has been obtained or not. They might want to consider replacing the last introductory sentence by a sentence stating what the reader can find in this paper.*

      1. We agree with the reviewer and have rephrased accordingly (line 112–117):

      “In this study, our goal was therefore to first determine in vivo sentinel metabolites during mouse embryo PSM development. We then combined genetic, metabolomic and proteomic approaches to investigate how altered glycolytic flux and metabolite levels impact developmental signaling and patterning processes.”

      • Data from Fig 3: If you plot the lactate secretion vs the FBP levels of the controls and the overexpression experiment, would the control and the overexpression data lie on one line (maybe if combined with the data shown in Fig 1A)?*

      2. As the reviewer suggested, it is of great interest to check whether lactate secretion and FBP levels show a similar correlation in control and cytoPfkfb3 embryos, considering that cytoPfkfb3 overexpression lifts the upper limit of glycolytic capacity and FBP levels (revised Figure 3B, 3E). We therefore plotted FBP levels against lactate secretion and fitted a linear regression line to the control samples (please see Figure R1 below). The new plot shows that lactate secretion and FBP levels in cytoPfkfb3 embryos lie on the linear regression line derived from wild-type samples, highlighting that the correlation between lactate secretion and FBP levels is maintained even in cytoPfkfb3 embryos. We have now included this new plot in the revised Figure S4C and modified the text accordingly (line 474-477):

      “In addition, FBP levels showed a linear correlation with lactate secretion in control explants, and such a correlation was maintained even in cytoPfkfb3 explants (Figure S4C).”

      Figure R1. Correlation between lactate secretion and FBP levels in PSM explants. Linear regression line (a grey line) was derived from the data of control samples cultured in 0.5–25 mM glucose (black circles; from Figure 1A and 3E). The data from cytoPfkfb3 embryos cultured in 2.0–10 mM glucose (from Figure 3B and 3E) are shown as red rectangles.
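
      The check described in this response (fit a line to the control samples only, then ask whether the cytoPfkfb3 samples fall on it) can be sketched as follows. This is a minimal illustration, not the authors' analysis: every number below is an invented placeholder in arbitrary units, and the function and variable names are hypothetical.

```python
# Hypothetical illustration only: all values are invented placeholders,
# NOT measurements from the manuscript.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Vertical distance of each point from the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Placeholder control samples: (lactate secretion, FBP level).
control_lactate = [1.0, 2.0, 3.0, 4.0]
control_fbp = [2.1, 3.9, 6.1, 7.9]

# Placeholder transgenic samples, lying beyond the control range.
tg_lactate = [5.0, 6.0]
tg_fbp = [10.2, 11.8]

# Fit on controls only, then evaluate the transgenic points against that line.
slope, intercept = fit_line(control_lactate, control_fbp)
tg_residuals = residuals(tg_lactate, tg_fbp, slope, intercept)

for x, r in zip(tg_lactate, tg_residuals):
    print(f"lactate={x:.1f}  residual from control line={r:+.2f}")
```

      Fitting on the controls alone and treating the transgenic samples as held-out points keeps the two groups independent: small residuals then indicate that the same lactate-FBP relationship extends beyond the control range.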

      • Maybe the authors could attempt an experiment like the following one: Choose the strongest phenotype observed and test a combination of overexpressing cytoPfkfb3 and reducing the extracellular glucose level at the same time? *

      3. We agree this suggested experiment is important to show that the phenotype in cytoPfkfb3 embryos is indeed dependent on glycolytic flux and have already addressed this specific point in our manuscript, see results in Figure 4B and 5A in our original manuscript. The results show that the phenotypes in cytoPfkfb3 explants, i.e. reduction in somite formation and downregulation of Msgn mRNA expression occur in a glucose dose-dependent manner. Since in this embryonic context, we show that glucose concentration impacts glycolytic flux (see increased lactate production upon glucose titration in Figure 3B), our findings support the conclusion that the effect of cytoPfkfb3 overexpression is flux-dependent and not due to the overexpression per se. Based on the reviewer's feedback, we have modified the text to clarify and highlight this critical point (line 339–345):

      “Combined, these results show that cytoPfkfb3 overexpression results in reduced segment formation, arrest of the segmentation clock oscillations and downregulation of Wnt signaling, in a glucose-dose dependent manner. As glucose concentration impacts, in turn, glycolytic flux (Figure 1A, 3B), these findings suggest that these phenotypes are flux-dependent and are not a mere result of cytoPfkfb3 overexpression.”

      • Can the proteomics experiments shown in Fig. 6 be repeated with high and low extracellular glucose? High glucose should yield high FBP levels and one would then expect to see the same as with the experiment where at 2 mM glucose 20 mM extracellular FBP were added. Is this the case? *

      4. We agree with the reviewer that based on the findings, one would expect the phenotype, i.e. in this case translocation of proteins, to correlate with FBP levels. Two of our results are of note in this regard.

      First, our data indicates that in order to see the effect on protein localization, high levels of FBP have to be reached. Accordingly, we find that Pfkl becomes depleted from the nuclear-cytoskeletal fraction in cytoPfkfb3 explants when cultured in 10 mM glucose but not (visibly) in 2.0 mM glucose (Figure 7D). Corresponding to this, FBP levels in cytoPfkfb3 explants show a significant increase (about 3-fold) from 2.0 to 10 mM glucose conditions (revised Figure 3E).

      Second, in control samples, FBP levels saturate in high glucose conditions. FBP levels in control samples do not further increase when glucose concentration is increased from 10mM to 25mM, and thus it does not become as high as in cytoPfkfb3 embryos cultured in 10 mM glucose (revised Figure 3E).

      Therefore, in order to reveal the translocation, it requires an experimental strategy that leads to significantly increased FBP levels, such as in cytoPfkfb3 explants with high glucose condition, or alternatively, direct supplementation of FBP.

      As also pointed out by the other reviewers, we are experimentally generating controlled conditions that exceed the physiological range to which the embryo is exposed. Accordingly, our data do not constitute evidence that, under physiological conditions, protein localization changes in response to changes in glycolytic flux and FBP levels, even at a smaller scale.

      We regard our approach as a first step to reveal potential mechanisms and so far hidden possible responses to changes in metabolic flux. In order to see minor changes in translocation upon small changes in glycolytic-flux/FBP levels, more quantitative approaches, such as live-imaging of tagged proteins, will need to be developed. We hence decided to include this discussion in our revised manuscript (line 657-666):

      “Of note, the translocation of proteins was observed only when high levels of FBP were reached upon direct FBP supplementation or cytoPfkfb3 overexpression with high glucose (Figure 6, 7). Future studies hence need to investigate whether flux-dependent change in protein localization occurs upon moderate and more physiological changes in glycolytic-flux/FBP levels. To this end, the development of more quantitative approaches, such as live-imaging of tagged enzymes and the development of metabolite biosensors, are needed.”

      • While the authors quantified proteins in different compartments, I was wondering whether they also looked for whole-embryo protein expression changes? *

      5. We have not done protein expression analysis using whole embryos, or other isolated tissues in this study. This is indeed a potentially interesting future experimental comparison.

      • Throughout the manuscript, the authors state that the glucose levels or cytoPfkfb3 change the glycolytic flux. While I tend to agree with this, it is important to note that the authors have not directly measured glycolytic flux, but use the amount of accumulated lactate as a proxy. I think it is important to add this disclaimer at important points in the manuscript, such that readers are aware of this point. *

      6. We fully agree with the reviewer and now have added the following sentence in the first result section to make this point clearer to the reader (line 126-128):

      "Throughout this study, we used quantification of secreted lactate as a proxy for glycolytic flux due to the inability to directly measure flux in embryonic tissues."

      Another aspect of changing FBP levels could be connected to what was found in yeast, where the FBP levels were found to oscillate with the cell cycle (https://pubmed.ncbi.nlm.nih.gov/31885198/). Could this be connected with the pattern formation here?

      7. This is indeed an interesting aspect to discuss; in the absence of experimental evidence connecting the observed pattern formation and cell cycle (though some classic work had suggested its existence) we have decided to omit the discussion of this potential link.

      • Line 606: The mentioned review article also covers yeast. As such, maybe the authors should replace the term "bacteria" with "microbes"? *

      8. We modified our manuscript accordingly.

      Reviewer #1 (Significance (Required)):

      **Referees cross-commenting**

      As I mentioned in my comment, targeted metabolic perturbations are extremely difficult. Perturbing a metabolite level without at the same time perturbing the flux through this pathway is difficult (if not impossible). The converse is also true.

      I am not sure whether experiments such as the one suggested by reviewer 2 (comment 1) will really lead to results from which further conclusions can be drawn. Furthermore, there does not need to be a linear correlation between the extracellular glucose concentration and metabolic flux/FBP levels (as my reviewer colleague implies). Thus, I am not sure whether doing this experiment makes sense, or would lead to strengthened conclusions.

      Reviewer 2 also states "The lack of proven mechanism for the activity of FBP might restrict the real general impact of this work." I agree that we do not know the downstream targets of FBP, but finding them would likely require many years of additional work. Such work will not be initiated if this paper is not published, and it would be a pity if it would be further delayed. I feel that the evidence is strong enough that FBP has an important role and with this paper published, it will motivate others to look for the downstream targets.

      Reviewer 3 makes the point: "Given that FBP levels are highly correlated with extracellular glucose levels (which impact glycolytic flux) (TeSlaa and Teitell, 2014) the authors should elaborate on why progressive increase in extracellular glucose does not affect PSM patterning, in the same way that increasing FBP levels does." Here, I feel my reviewer colleague might be overlooking that in biochemistry molecular interactions typically reach saturation at some point. The correlation between extracellular glucose and glycolytic flux is likely linear only within a certain range. Similarly, the correlation between glycolytic flux and FBP likely also exists only within a certain range, and finally FBP levels and the downstream targets likely also interact linearly only within bounds. Thus, the absence of a correlation at the "extremes" by no means implies that what the authors propose is incorrect. In fact, it just shows what you expect from biomolecular interactions: that there are limits to linear correlations.
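
      The saturation argument above can be made concrete with a toy calculation. The sketch below assumes a simple hyperbolic (Michaelis-Menten-like) response purely for illustration. Neither the reviewers nor the authors specify a functional form, and the constants `VMAX` and `K` are hypothetical. Only the glucose concentrations (0.5, 2.0, 10 and 25 mM) are taken from the text.

```python
# Toy model only: the hyperbolic form and both constants are assumptions
# chosen for illustration, not values reported in the manuscript.

VMAX = 100.0  # hypothetical plateau of the response (arbitrary units)
K = 5.0       # hypothetical half-saturation concentration (mM)


def saturating(x, vmax=VMAX, k=K):
    """Hyperbolic saturation: response = vmax * x / (k + x)."""
    return vmax * x / (k + x)


def fold_change(low, high):
    """Fold change in the response when the input rises from low to high."""
    return saturating(high) / saturating(low)


# Glucose concentrations (mM) mentioned in the rebuttal.
glucose_mM = [0.5, 2.0, 10.0, 25.0]
response = [saturating(g) for g in glucose_mM]

# A 4-fold input increase in the low range versus a 2.5-fold increase
# in the high range.
low_range = fold_change(0.5, 2.0)
high_range = fold_change(10.0, 25.0)

for g, r in zip(glucose_mM, response):
    print(f"glucose={g:5.1f} mM  response={r:6.2f}")
print(f"fold change 0.5 -> 2.0 mM: {low_range:.2f}")
print(f"fold change 10 -> 25 mM:   {high_range:.2f}")
```

      Under these assumed parameters the response tracks the input closely at low concentrations and flattens near the plateau, which is the qualitative behavior the cross-comment describes.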

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *Summary. *

      *The work described in this paper first searches for potential sentinel metabolites of glycolytic flux, focusing on the process of somitogenesis during mouse embryonic development. By measuring the levels of different metabolites in the presomitic mesoderm (PSM) of E10.5 mouse embryos cultured in the presence of three different glucose concentrations, the authors identify 14 metabolites whose concentration rises with increasing glucose concentration in the culture medium. Among them, they selected fructose 1,6-bisphosphate (FBP) for further analyses, as it showed the highest linear correlation with extracellular glucose concentrations. They then show that addition of FBP to the incubation medium of cultured embryo tails interferes with somitogenesis and tail extension in a concentration-dependent fashion. In addition, they show that this effect is exacerbated when extracellular glucose levels are increased. By analyzing specific targets of Wnt and Fgf signaling, the authors also show that addition of FBP down-regulates both signaling pathways in the PSM. They then use a genetic trick (ubiquitous overexpression of cytoPfkfb3) to increase FBP levels by allosteric activation of Pfk (the enzyme that produces FBP) in developing embryos. When tails from these transgenic embryos were cultured in vitro and exposed to various glucose concentrations somitogenesis was affected in a way resembling the effects of FBP on cultured tails from wild type embryos. The authors then go on to determine the subcellular localization of different proteins in tails incubated in the presence of various FBP concentrations to identify that some enzymes involved in the glycolytic pathway (and they specifically focus on Pfkl and Aldoa) are excluded from nuclear fractions at high FBP concentrations. The authors conclude that FBP functions as a flux-signaling metabolite connecting glycolysis and PSM patterning, potentially through modulating subcellular protein localization. *

      *Major comments *

      *I think that in general the work described in this manuscript has been performed to the highest technical standards. However, I do not think that I can agree with the authors' conclusions (that FBP connects glycolysis with PSM patterning and that subcellular localization of glycolytic enzymes play a role in this process), which in my opinion go way beyond what can be proven by the data provided. *

      *1- Explants incubated with external glucose concentrations up to 25 mM have no obvious defects on somitogenesis or on the segmentation clock as determined by LuVeLu cycling activity. Under these conditions, explants are expected to contain very high FBP levels if this metabolite keeps its linear relationship with external glucose (in this work it was not measured beyond 10 mM glucose in the medium, where FBP concentration was already very high). This contrasts with the phenotypes observed upon exogenous supplementation of FBP, which affects somitogenesis already at 2 mM glucose. These latter results are at odds not only with the lack of phenotypic alterations under high glucose conditions, but also with the observation that exogenous addition of fructose 6-phosphate (F6P), the substrate of Pfk enzymes to generate FBP, does not alter somitogenesis. The authors take the absence of effects by incubation with F6P as a control of the specificity of FBP. However, as F6P is the natural substrate of Pfk, it is possible that supplementation of F6P also leads to an increase of FBP but in a way closer to a physiological condition. Therefore, I find it essential to determine FBP levels in tails incubated in the presence of increasing amounts of F6P, as if it increases FBP levels, similarly to what the authors described for the tails incubated with increasing glucose concentrations, it will have important implications to the interpretation of the work presented in this manuscript. *

      9. We agree with the reviewer and to directly address this central point, we have performed an extended, additional experiment, collecting 375 embryos to quantify FBP levels under five conditions with three biological replicates.

      There are two major results that we highlight here: First, we found that addition of F6P did not lead to increased FBP levels compared to control samples cultured in 10 mM glucose, which is in stark contrast to cytoPfkfb3 embryos cultured in 10 mM glucose (revised Figure 3E). Second, while increasing glucose concentration is mirrored by elevated FBP levels as we reported, we find clear evidence of saturation above a concentration of 10mM glucose: increasing glucose to 25mM does not increase FBP levels further (revised Figure 3E).

      This saturation effect seen in glucose titration, and also the absence of elevated FBP upon F6P addition, might be expected outcomes because, as reviewer 1 also pointed out in the response, Pfk is commonly considered to be a rate-limiting enzyme in the glycolytic pathway. We now have direct experimental data supporting this hypothesis and thank the reviewers for prompting this additional (very involved) experiment.

      This new data allows us to conclude more firmly on the correlation between FBP levels and phenotype: at high FBP levels, which are seen in cytoPfkfb3 samples, we observe PSM patterning defects. These high levels are not reached even at 25 mM glucose or upon F6P addition, due to saturation at the Pfk enzymatic step. Hence, while glucose titration does elevate FBP significantly until this saturation, FBP levels are not as high as in cytoPfkfb3 samples. As a correlative finding, we see that only those conditions with very high FBP levels, or the direct addition of high levels of FBP, cause the arrest of segmentation clock activity. At moderately elevated FBP levels, observed in control explants with high glucose or in cytoPfkfb3 explants with low glucose, clock activity continues and we find a quantitative effect at the level of gene expression, i.e. Wnt signaling target downregulation (Figure S3, 5A).

      The new data has been included in the revised manuscript and the text has been adjusted accordingly:

      • (Result Part, line 245–254) "Consistently, we found that cytoPfkfb3 overexpression lifted the upper limit of FBP levels in PSM cells (Figure 3E, S4B, S4C). In control explants, FBP levels did not increase further when glucose concentration was increased from 10 mM to 25 mM. It was also the case when control explants were cultured in 20 mM of F6P (Figure 3E). These results indicate that the Pfk reaction carries a (rate-)limiting role for glycolytic flux and FBP levels, and that cytoPfkfb3 overexpression hinders the flux-regulation function of Pfk."

      • (Discussion Part, line 551–573) “Our findings suggest that flux-regulation at the level of Pfk is critical to keep FBP steady state levels within a range compatible with proper PSM patterning and segmentation. In agreement with such a rate-limiting function for Pfk, we found in glucose titration experiments that FBP levels saturated and did not further increase at glucose levels above 10 mM (Figure 3E). Along similar lines, the supplementation of high concentrations of the Pfk substrate F6P did not result in a significant increase of FBP levels, again compatible with a rate-limiting function at the level of Pfk (Figure 3E). The upper limit of glycolytic flux and FBP levels can be experimentally increased by cytoPfkfb3 overexpression (Figure 3B, 3E). We interpret the data as evidence that cytoPfkfb3 overexpression compromises the flux-control function of Pfk and hence much higher FBP (and secreted lactate) levels are reached. Such a drastic increase in glycolytic flux and FBP levels correlates with a severe PSM patterning phenotype (Figure 4), which resembles the phenotype induced by supplementation of high dose of FBP (Figure 2). Our results in mouse embryos hence provides evidence that flux regulation by Pfk, an evolutionary conserved role present from bacteria to humans, serves to maintain FBP levels below a critical threshold.”

      *The main difference between the experiments involving FBP supplementation and those involving high glucose concentrations or exogenous F6P addition is that in the latter two cases increase in FBP would be restricted to the tissue(s) expressing Pfk, whereas upon FBP supplementation this metabolite would hit any tissue, regardless of whether or not it would ever be physiologically exposed to this molecule. In the case of the PSM, this might be relevant because it has been shown that there is a gradient of glycolysis, being high at the caudal tip and becoming lower at more anterior regions of the PSM, most likely mirroring the distribution of Pfk activity. Exogenous administration of FBP would flatten the gradient, which could lead to alterations in PSM patterning, whereas glucose (and eventually F6P) would not as they would increase FBP locally in the area where it is normally activated, keeping the natural gradient. *

      *On the basis of these arguments, to which extent does FBP connect glycolysis and somitogenesis under physiological conditions? *

      10. First, we would like to clarify that while glycolytic activity is indeed graded along the PSM, as others and we reported previously (Bulusu et al., 2017; Oginuma et al., 2017), the baseline expression of the entire glycolytic machinery (from glucose transport to lactate production) is very high in all PSM cells. Hence, we see that cells along the entire PSM have very active glycolysis, the posterior PSM being even more active.

      For this and related reasons, our interpretation of the difference seen between glucose titration/F6P addition on one side, and FBP addition/cytoPfkfb3 overexpression on the other side, is based on the role of Pfk in controlling either flux levels or dynamics in all PSM cells.

      Hence, while we agree that we generate experimental conditions that allow FBP levels to surpass those found in control embryos, we would like to highlight the fact that even moderate changes in flux do result in very robust functional consequences on gene expression (Figure S3, 5), as we show in this work.

      We currently cannot fully address the first point raised, i.e. the role of graded flux/graded metabolite levels, due to experimental limitations. Such a study requires, for instance, the generation of metabolite biosensor reporter lines in order to be able to monitor these changes dynamically, in space and time.

      *ESSENTIAL ADDITIONAL EXPERIMENT related to point #1: Measure FBP from PSM explants incubated under various exogenous concentrations of F6P. *

      11. We have performed this suggested experiment, which required the collection of n=375 embryos cultured under the various conditions and analysis by LC-MS to quantify metabolites. The outcome was indeed very informative (please refer to our response #9).

      *ANOTHER EXPERIMENT THAT COULD BE INFORMATIVE: measure FBP levels in PSM incubated under different glucose concentrations but instead of using the whole PSM together, dividing the PSM in posterior, medium and anterior parts (similarly to what was done in Oginuma et al, 2017, reference in the manuscript) to see if there is a gradient in FBP activation. *

      12. While in principle we agree that this experiment could be informative, we consider the proposed experiment beyond the scope of this work and technically very challenging (although possible). With a similar motivation, the development of metabolite biosensors is an alternative route that we are pursuing for future studies (for details, please refer to our response #10).

      *2- A similar argument could be presented for the results with the cytoPfkfb3 transgenics, as they are based on global artificial overactivation of Pfk, in addition to other possible effects of the ectopic activity of cytoPfkfb3, which were not controlled. Also, while the phenotypic alterations in the PSM in vitro, most particularly in the experiments involving incubation of the tails, are rather strong, the reported effects on somitogenesis in vivo are minor, also questioning the contribution of the in vitro conditions to the final phenotypic effects observed throughout the manuscript. *

      13. First of all, we would like to emphasize that the phenotype seen in cytoPfkfb3 embryos, i.e. the reduction of segmentation and downregulation of Wnt-target gene expression, occurs in a glucose dose dependent manner (Figure 4B and 5A). Hence, it is not the overexpression of cytoPfkfb3 per se that can account for the effects seen; rather, increased glycolytic flux caused by the combination of transgene expression with high glucose results in functional consequences.

      In addition, the ‘other possible effects’ that the reviewer is referring to should be evident in all transgenic embryos, irrespective of glucose dose. To the contrary, transgenic embryos cultured in low glucose conditions appear unaltered compared to control embryos.

      Second, we agree that we need to distinguish between strong phenotypes, visible at the level of clock arrest, and milder phenotypes, visible at the level of quantitative gene expression changes. It is important to note that the moderate phenotype, i.e. the quantitative gene expression changes in the posterior PSM, is seen upon the addition of FBP at moderate levels and upon glucose titration within the physiological concentration range, as well as in cytoPfkfb3 embryos. We take this as evidence that the effects seen in cytoPfkfb3 transgenic embryos reflect a common response also seen under physiological conditions.

      To extend this argument to the in vivo setting, we have performed additional experiments using a genetic mouse model for diabetes. As shown in our previous submission, cytoPfkfb3 transgenic animals do not exhibit a drastic in vivo phenotype when dissected at embryonic day 10.5. One interpretation of this finding is that since the cytoPfkfb3 phenotype is glucose and flux-dependent, the in vivo flux is low, reflecting low glucose concentrations described in vivo. To test the effect of increased flux in cytoPfkfb3 embryos in vivo, we therefore crossed the transgenic mice into a diabetic model called Akita, in which a point mutation in the Insulin2 gene causes high maternal glucose levels (Yoshioka et al., 1997; Wang et al., 1999). Using this experimental setup, we tested whether transgenic embryos in Akita diabetic females would manifest in vivo phenotypes.

      Indeed, we found that cytoPfkfb3 transgenic embryos developing in Akita diabetic females showed significantly increased cases of neural tube closure defects (50% of cytoPfkfb3 embryos) and developmental delay (control: 38 somites vs. cytoPfkfb3: 34 somites at E10.5), defects not seen in transgenic cytoPfkfb3 embryos from control females (please refer to Figure R2 below). This dependency of the in vivo phenotype on maternal glucose conditions again highlights that the defects observed in cytoPfkfb3 embryos are not due to the expression of cytoPfkfb3 per se, but are rather directly linked to increased/unregulated glycolytic flux.

      We included the new in vivo data in the revised Figure S5D-E and modified the text accordingly.

      Figure R2. In vivo phenotype of cytoPfkfb3 embryos grown in diabetic Akita females. (A) The number of somites in control (Ctrl) and cytoPfkfb3 (Tg) E10.5 embryos grown in diabetic Akita females. (B) In situ hybridization of Msgn, Uncx4.1, and Shh mRNAs in Ctrl and Tg E10.5 embryos grown in diabetic Akita females (ss, somite stage; scale bar, 500 µm).

      In conclusion, combining the arguments in the two previous comments, to what extent are the results from the addition of FBP or from the transgenic activation of Pfk not artefactual phenotypes without real physiological relevance?

      14. In our view, two main conclusions, both in vivo and in vitro, can be drawn based on the result we obtained:

      First, we find that a moderate increase in glycolytic flux, within the physiological range, leads to a quantitative and consistent change in gene expression, such as downregulation of Wnt target genes (Figure S3, 5). Such a phenotype was the result of either glucose titration or culturing cytoPfkfb3-transgenic embryos in low glucose concentration.

      In these conditions, while overall PSM patterning is qualitatively normal, we do find consistent changes at quantitative level, i.e. gene expression changes, which are also mirrored by a reduced rate of segmentation (Figure 4B). A detailed analysis of the quantitative changes at the level of segmentation clock dynamics is being carried out and will be presented in a dedicated follow up study.

      Second, we find that upon a very significant increase in FBP levels, i.e. when cytoPfkfb3 transgenic animals are cultured in high glucose conditions or when samples are cultured in high levels of FBP, PSM patterning is qualitatively altered and the segmentation clock ceases to oscillate. In this case, we agree that it is not a physiological condition, as such high levels of flux and FBP are not reached in control samples which have intact flux regulation by Pfk. Nevertheless, such an experimental condition can be insightful, as it very clearly reveals the potential link between glycolysis, clock activity, PSM patterning and the Wnt signaling pathway.

      It is the combination of the moderate and the more severe effects, observed both in vitro, and now also in vivo using the Akita model (see above), that we take as evidence for an intrinsic, physiological link between glycolytic activity, PSM patterning and signaling.

      *3- The authors seem to give a strong functional meaning to the absence of Pfkl and Aldoa from the nuclear fraction in tails incubated with exogenous FBP, suggesting a "moonlighting" function of these enzymes under FBP regulation. In addition to the purely speculative nature of this interpretation (there is no proof for such activity or even an attempt to test it), the data provided is also difficult to interpret for various reasons. *

      15. We fully agree that we do not show a functional role for either the nuclear localization of enzymes or their dynamic change in sub-cellular localization and have tried to express this clearly in the original manuscript:

      • (Result Part, line 382-388) “While we have not been able to address the functional consequence of specific changes in subcellular localization, such as the nuclear depletion of Pfkl or Aldoa when glycolytic flux is increased, these results pave the way for future investigations on the mechanistic underpinning of how metabolic state is linked to cellular signaling and functions.”

      • (Discussion Part, line 575-577): “While future studies will need to reveal if nuclear localization of glycolytic enzymes is linked to their moonlighting functions or metabolic compartmentalization…”

      Based on this comment by the reviewer, we have further emphasised this point in the revised manuscript (line 635-639):

      “While we do not have any direct functional evidence so far for a functional role of nuclear localized glycolytic enzymes, our findings do raise the question whether their subcellular compartmentalization is linked to a non-metabolic, moonlighting function.”

      The protein levels in nuclear fractions are clearly much lower than those in the cytoplasm (this is best seen in the blots of Figure 6D). Does this represent a similar subcellular distribution of these enzymes throughout the tissue, or do the different levels result from the presence of the enzymes in the nucleus of only a subset of the cells? This might be of importance to understand the possible relevance of the subcellular distribution of those enzymes. All the analyses were done on bulk tissue and, therefore, it is not possible to distinguish between these possibilities. As the authors have antibodies for these enzymes, they could try to perform immunofluorescence analyses, which would provide spatial data.

      16: We agree that a spatially resolved analysis of the subcellular localization of these various enzymes is needed. Unfortunately, the immunofluorescence experiments that we performed did not yield clear, reliable results and hence we can’t provide the answer at this time.

      *In addition to this, it would be important to determine Pfkl and Aldoa subcellular localization in explants incubated with different external concentrations of glucose, which in a way reproduces better possible physiological effects (see point 1), to see if under those conditions high FBP also affects subcellular distribution of those enzymes. *

      17: Please find our response under #4 (attached below), as this important point was also raised by reviewer 1.

      *(Our response #4) *

      *#4. We agree with the reviewer that based on the findings, one would expect the phenotype, i.e. in this case translocation of proteins, to correlate with FBP levels. Two of our results are of note in this regard. *

      *First, our data indicates that in order to see the effect on protein localization, high levels of FBP have to be reached. Accordingly, we find that Pfkl becomes depleted from the nuclear-cytoskeletal fraction in cytoPfkfb3 explants when cultured in 10 mM glucose but not (visibly) in 2.0 mM glucose (Figure 7D). Corresponding to this, FBP levels in cytoPfkfb3 explants show a significant increase (about 3-fold) from 2.0 to 10 mM glucose conditions (revised Figure 3E). *

      *Second, in control samples, FBP levels saturate in high glucose conditions. FBP levels in control samples do not further increase when glucose concentration is increased from 10mM to 25mM, and thus it does not become as high as in cytoPfkfb3 embryos cultured in 10 mM glucose (revised Figure 3E). *

      *Therefore, revealing the translocation requires an experimental strategy that leads to significantly increased FBP levels, such as cytoPfkfb3 explants under high glucose conditions or, alternatively, direct supplementation of FBP. *

      As also pointed out by the other reviewers, we are experimentally generating controlled conditions that exceed the physiological range to which the embryo is exposed. Accordingly, our data do not constitute evidence that, under physiological conditions, protein localization is altered (at a smaller scale) in response to changes in glycolytic flux and FBP levels.

      We regard our approach as a first step to reveal potential mechanisms and so-far hidden possible responses to changes in metabolic flux. In order to see minor changes in translocation upon small changes in glycolytic-flux/FBP levels, more quantitative approaches, such as live-imaging of tagged proteins, will need to be developed. We hence decided to include this discussion in our revised manuscript (line 657-666):

      “Of note, the translocation of proteins was observed only when high levels of FBP were reached upon direct FBP supplementation or cytoPfkfb3 overexpression with high glucose (Figure 6, 7). Future studies hence need to investigate whether flux-dependent change in protein localization occurs upon moderate and more physiological changes in glycolytic-flux/FBP levels. To this end, the development of more quantitative approaches, such as live-imaging of tagged enzymes and the development of metabolite biosensors, are needed.”

      SUGGESTED ADDITIONAL EXPERIMENTS related to point #3:

      *3a- Analysis of subcellular localization of Pfkl and Aldoa by Immunofluorescence. This analysis is not limited by the amount of biological material available, so it could be applied to different experimental conditions. *

      18. We addressed this point in our response #16.

      *3b- Subcellular distribution of Pfkl and Aldoa in explants exposed to different exogenous glucose concentrations. As this involves wild type embryos, it can be done following similar protocols as in figures 6 and 7 of the manuscript. *

      19. We addressed this point in our response #17.

      4- The results from the work presented in this manuscript would indirectly indicate a negative relationship between glycolysis and somitogenesis. This contrasts with previous reports indicating the essential role of aerobic glycolysis for the same process. There is no explanation for this apparent (and important) contradiction (the authors only comment on the discrepancy between the data provided in this paper and previous reports concerning the relationship between glycolysis and Wnt signalling, although they also do not provide an explanation).

      19. We cannot resolve this discrepancy, but now offer a more detailed discussion, also based on the additional data we obtained.

      First, it is important to point out that we have performed additional experiments to substantiate this part of the work, i.e. a transcriptome analysis with control and cytoPfkfb3 explants cultured in 10 mM glucose. We decided to focus on an early time point, i.e. three hours after the start of incubation, in order to increase the chance to score the primary response of PSM cells upon changes in glycolytic flux. In addition, our nanostring data in Figure S3 show that glucose titration can change the expression levels of some Wnt targets in both directions, i.e. decreasing glucose upregulates their expression while increasing glucose downregulates their expression. Again, this analysis was done at short time-scales to score the immediate effect.

      One possible explanation regarding the difference to Oginuma et al. could indeed be the late time point of analysis in their study, i.e. 16 hours after the start of culture. This difference in sampling time, i.e. 3 hours vs. 16 hours of culture, is of particular importance given the dynamic nature of metabolic and signaling responses.

      We have added a sentence to explain this point in more detail (line 608-617):

      “This discrepancy could relate to the time point of analysis: while Oginuma et al. mainly focused on analyzing samples 16-hour after metabolic changes, we chose to score the effects of altered glycolytic flux/FBP levels already after a three-hour incubation, with the goal to capture the primary response of PSM cells. Whether the difference in sampling time underlies the observed difference is yet unknown, but both studies highlight that Wnt signaling is responsive to glycolytic flux, supporting a tight link between metabolism and PSM development.”

      Minor comments.

      *The tissue used for the Western blot analyses was not specified (was it the PSM alone, the whole tails including somites, etc.). This is of relevance to comment #3. *

      20. PSM explants without somites were cultured for one or three hours and were subjected to subcellular protein fractionation. This information is now included in the revised Methods section.

      Reviewer #2 (Significance (Required)):

      -The work described in this manuscript identifies FBP as a sentinel metabolite for the glycolytic flux. This in itself has the potential to be important for different processes in which differences in glycolysis make a difference, although I do not think that this will be relevant for the developmental process on which the authors focused their study (see major comments #1 and 2). Indeed, the lethality of global transgenic cytoPfkfb3 expression (although it was not analyzed whether it occurred during development or in postnatal stages, or what the cause of this lethality was) but with very minor effects on somitogenesis in vivo supports this conclusion.

      21. Please see our detailed comments also based on the newly added in vivo experiments done with the Akita diabetic mouse model in our responses #9–14.

      *- The potential moonlighting activity of Pfk (connected with specific subcellular localization), is an interesting idea but so far does not go beyond pure speculation. This is prone to the typical double edged effect of stimulating research in that direction but also the potential negative effect of being taken for granted without rigorous proof. *

      22: We have added a statement to highlight the nature of this finding and the requirement for follow up studies both in this and other contexts. Please refer to our response #15 for the details.

      • The importance of metabolism in general and glycolysis in particular for somitogenesis and axial extension has been recently reported (the relevant papers are cited in the manuscript) and therefore the work described in this manuscript extends those studies. Also, the recent observations that metabolic processes can influence cell activity beyond their participation in the classical pathways in which they are involved, including processes apparently as distant as epigenetic regulation of gene activity (see for instance Tarazona and Pourquie, 2020, Dev Cell 54, 282-292), are opening new perspectives to the study of the influence of metabolism on physiological and pathological processes (championed by cancer and immunological response). It also provides a link between control mechanisms across large scale phylogeny, from prokaryotes to eukaryotes.

      -In principle, the potential audience for this work could be wide, as the interest in understanding the involvement of metabolism in the regulation of physiological and pathological processes has been growing over the last years. However, the lack of proven mechanism for the activity of FBP might restrict the real general impact of this work. In this regard, the suggestion that it might control some type of still unknown moonlighting activity of Pfk is so far totally speculative.

      • I am a developmental biologist with strong focus on mechanisms of somitogenesis and axial extension in vertebrate embryos. There is no part of this work for which I do not feel competent to evaluate.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *Summary - *

      *In the present manuscript, Miyazawa and colleagues explore the role of glycolytic flux on embryonic development by using presomitic mesoderm (PSM) patterning as a model. *

      *First, the authors examined the steady-state levels of central carbon metabolism metabolites in PSM explants. Explants were cultured in various concentrations of glucose and subjected to gas chromatography mass spectrometry (GC-MS). These experiments allowed the identification of metabolites (such as lactate, 3PG, and FBP) that exhibit a linear correlation with glucose levels and can therefore serve as sentinel metabolites for glycolytic flux in PSM cells. Among the metabolites identified, fructose 1,6-bisphosphate (FBP) showed the strongest linear correlation with glucose levels and was used to inform the design of subsequent experiments. *

      *Second, to elucidate the functional role of FBP on PSM patterning, the authors supplement the media used to culture PSM explants with various concentrations of FBP and: *

      *- analyze the dynamics of Notch signaling (a critical player in mesoderm segmentation during embryogenesis) using real-time imaging of the LuVeLu reporter; *

      *- assess gene expression patterns using in situ hybridization of candidate genes. *

      *The authors find that supplementation with FBP, but not F6P or 3PG, impairs mesoderm segmentation and disrupts the activity of the segmentation clock in the posterior PSM. Furthermore, FBP supplementation led to the reduced expression of FGF- and WNT-target genes Dusp4 and Msgn, respectively. *

      *Third, the authors generate a conditional cytoPfkfb3 transgenic mouse line in which a cytoplasmic form of the Pfkfb3 enzyme is overexpressed. Pfkfb3 can promote glycolysis, and more importantly, leads to increased levels of FBP in a glucose-dependent manner. The authors find that cytoPfkfb3 transgenic PSM explants contain higher levels of FBP and secrete lactate at higher levels when compared to control explants. Importantly, cytoPfkfb3 transgenic PSM explants exhibit impaired somite formation and reduced expression of Msgn (but not Dusp4) in a glucose-dependent manner when compared to control explants. *

      Finally, the authors investigate changes in protein subcellular localization in their pharmacological and genetic models of FBP-driven glycolytic flux activation. This was prompted by previous reports on the changes in subcellular localization of glycolytic enzymes (Hu et al., 2016). To this end, the authors perform proteome-wide cell-fractionation analyses in drug-treated and cytoPfkfb3 transgenic PSM explants and find that certain glycolytic proteins exhibit altered subcellular localization in both cases (albeit in different fractions).

      *Major concerns: *

      *- (Re: Results from Fig. 2 and Fig. S1.) *

      *o Given that FBP levels are highly correlated with extracellular glucose levels (which impact glycolytic flux )(TeSlaa and Teitell, 2014) the authors should elaborate on why progressive increase in extracellular glucose does not affect PSM patterning, in the same way that increasing FBP levels does. This is especially important given the claim that FBP is a sentinel metabolite of glycolytic flux. *

      23. This important point was also raised by reviewer 2, so please see our responses that are also listed under #9, #10, #14 (attached below).

      *(Our response #9) *

      *We agree with the reviewer and to directly address this central point, we have performed an extended, additional experiment, collecting 375 embryos to quantify FBP levels under five conditions with three biological replicates. *

      *There are two major results that we highlight here: First, we found that addition of F6P did not lead to increased FBP levels compared to control samples cultured in 10 mM glucose, which is in stark contrast to cytoPfkfb3 embryos cultured in 10 mM glucose (revised Figure 3E). Second, while increasing glucose concentration is mirrored by elevated FBP levels as we reported, we find clear evidence of saturation above a concentration of 10mM glucose: increasing glucose to 25mM does not increase FBP levels further (revised Figure 3E). *

      This saturation effect seen in glucose titration, and also the absence of elevated FBP upon F6P addition, might be expected outcomes because, as reviewer 1 also pointed out, Pfk is commonly considered to be a rate-limiting enzyme in the glycolytic pathway. We now have direct experimental data supporting this hypothesis and thank the reviewers for prompting this additional (very involved) experiment.

      *This new data allows us to conclude more firmly on the correlation between FBP levels and phenotype: at high FBP levels, which are seen in cytoPfkfb3 samples, we observe PSM patterning defects. These high levels are not reached even at 25mM glucose or upon F6P addition, due to the saturation at the level of PFK enzymatic step. Hence, while glucose titration does elevate FBP significantly until this saturation, FBP levels are not as high as in cytoPfkfb3 samples. As a correlative finding, we see that only those conditions with very high FBP levels, or the direct addition of high levels of FBP, cause the arrest of segmentation clock activity. At moderately elevated FBP levels, observed in control explants with high glucose or in cytoPfkfb3 explants with low glucose, clock activity continues and we find a quantitative effect at the level of gene expression, i.e. Wnt signaling target downregulation (Figure 5A, S3). *

      The new data has been included in the revised manuscript and the text has been adjusted accordingly:

      - (Result Part, line 245–254) "Consistently, we found that cytoPfkfb3 overexpression lifted the upper limit of FBP levels in PSM cells (Figure 3E, S4B, S4C). In control explants, FBP levels did not increase further when glucose concentration was increased from 10 mM to 25 mM. It was also the case when control explants were cultured in 20 mM of F6P (Figure 3E). These results indicate that the Pfk reaction carries a (rate-)limiting role for glycolytic flux and FBP levels, and that cytoPfkfb3 overexpression hinders the flux-regulation function of Pfk."

      - (Discussion Part, line 551–573) “Our findings suggest that flux-regulation at the level of Pfk is critical to keep FBP steady state levels within a range compatible with proper PSM patterning and segmentation. In agreement with such a rate-limiting function for Pfk, we found in glucose titration experiments that FBP levels saturated and did not further increase at glucose levels above 10 mM (Figure 3E). Along similar lines, the supplementation of high concentrations of the Pfk substrate F6P did not result in a significant increase of FBP levels, again compatible with a rate-limiting function at the level of Pfk (Figure 3E). The upper limit of glycolytic flux and FBP levels can be experimentally increased by cytoPfkfb3 overexpression (Figure 3B, 3E). We interpret the data as evidence that cytoPfkfb3 overexpression compromises the flux-control function of Pfk and hence much higher FBP (and secreted lactate) levels are reached. Such a drastic increase in glycolytic flux and FBP levels correlates with a severe PSM patterning phenotype (Figure 4), which resembles the phenotype induced by supplementation of high dose of FBP (Figure 2). Our results in mouse embryos hence provides evidence that flux regulation by Pfk, an evolutionary conserved role present from bacteria to humans, serves to maintain FBP levels below a critical threshold.”

      (Our response #10)

      *#10. First, we would like to clarify that while glycolytic activity is indeed graded along the PSM, as others and we reported previously (Bulusu et al., 2017; Oginuma et al., 2017), the baseline expression of the entire glycolytic machinery (from glucose transport to lactate production) is very high in all PSM cells. Hence, we see that cells all along the entire PSM have very active glycolysis, the posterior PSM being even more active. *

      *For this and related reasons, our interpretation of the difference seen between glucose titration/F6P addition on the one hand, and FBP addition/cytoPfkfb3 overexpression on the other hand, is based on the role of Pfk in controlling either flux levels or dynamics in all PSM cells. *

      Hence, while we agree that we generate experimental conditions that allow FBP levels to surpass those found in control embryos, we would like to highlight the fact that even moderate changes in flux do result in very robust functional consequences on gene expression (Figure S3, 5), as we show in this work.

      *We can currently not fully address the first point raised, i.e. the role of graded flux/graded metabolite levels, due to the experimental limitations. Such a study requires, for instance, the generation of metabolite biosensor reporter lines in order to be able to monitor these changes dynamically, in space and time. *

      (Our response #14)

      *In our view, two main conclusions, both in vivo and in vitro, can be drawn based on the result we obtained: *

      *First, we find that a moderate increase in glycolytic flux, within the physiological range, leads to a quantitative and consistent change in gene expression, such as downregulation of Wnt target genes (Figure S3, 5). Such a phenotype was the result of either glucose titration or culturing cytoPfkfb3-transgenic embryos in low glucose concentration. *

      In these conditions, while overall PSM patterning is qualitatively normal, we do find consistent changes at the quantitative level, i.e. gene expression changes, which are also mirrored by a reduced rate of segmentation (Figure 4B). A detailed analysis of the quantitative changes at the level of segmentation clock dynamics is being carried out and will be presented in a dedicated follow-up study.

      *Second, we find that upon a very significant increase in FBP levels, i.e. when cytoPfkfb3 transgenic animals are cultured in high glucose conditions or when samples are cultured in high levels of FBP, PSM patterning is qualitatively altered and the segmentation clock ceases to oscillate. In this case, we agree that it is not a physiological condition, as such high levels of flux and FBP are not reached in control samples, which have intact flux regulation by Pfk. Nevertheless, such an experimental condition can be insightful, as it very clearly reveals the potential link between glycolysis, clock activity, PSM patterning and the Wnt signaling pathway. *

      *It is the combination of the moderate and the more severe effects, observed both in vitro and now also in vivo using the Akita model (see above), that we take as evidence for an intrinsic, physiological link between glycolytic activity, PSM patterning and signaling. *

      - (Re: Fig. 2A and Fig. 2B)

      *o The authors should be consistent with the glucose concentrations for the experiments where they assess the dynamics of Notch signaling (Figure 2A) and gene expression (Figure 2B) or otherwise elaborate on why different concentrations are used for these assays. *

      24: We agree that ideally the experimental parameters should be as consistent as possible. With regard to the control glucose concentration used in this study, both 0.5 mM and 2.0 mM glucose were used. This reflects minor adjustments made to the experimental protocol over the years: we now use 2.0 mM glucose as the standard setting for all experiments, while previously 0.5 mM glucose was used (see Bulusu et al., 2017). This change is based on the observation of a slightly improved culture outcome in terms of reporter gene expression. We have confirmed that the developmental outcome and also the effects seen upon addition of FBP are consistent at 0.5 mM and at 2.0 mM glucose. We made a note in the methods section to explain this point (line 1082-1084):

      “Basal culture condition was 0.5 mM glucose at the beginning of this study but was later switched to 2.0 mM glucose which yields a slightly improved reporter gene expression. No major difference was observed in the effects of FBP between these glucose conditions.”

      *- (Re: Results from pharmacological and genetic models of increased FBP levels) *

      *o The authors state that FBP-driven impairment of mesoderm segmentation is most pronounced in the undifferentiated PSM cells (in the posterior-most end of the explants) and is, therefore, unlikely to be due to a toxic effect that might otherwise affect the whole explant. While this is a reasonable assumption, it does not discount the possibility that the spatial specificity of the effect of FBP could be driven primarily by increased cell death in the posterior end of the explant. Thus, the authors should test whether cell death underlies the mesoderm patterning defects seen in PSM explants subjected to increased FBP levels. *

      25. We have performed immunostaining of active caspase-3 in explants cultured for three hours in medium containing 0.5 mM glucose and 20 mM FBP and found no difference between control and FBP-treated explants (please refer to Figure R3 below). This qualitative result does not indicate a major effect via cell death in the tail bud region (i.e. posterior PSM) as the underlying reason for the observed phenotype. We included the new data in the revised Figure S2C and adjusted the text accordingly.

      Figure R3. Immunostaining of active caspase-3 in PSM explants. Explants were cultured for three hours in the presence or absence of 20 mM of FBP. Neural tubes were outlined by white dotted lines.

      *- (Re: Gene expression experiments/analyses) *

      *o This study would benefit greatly from transcriptomic analysis of wt and cytoPfkfb3 transgenic PSM explants (and/or transcriptomic characterization of FBP-treated vs. control PSM explants). The candidate approach used to assess gene expression (through in situ hybridization) may not be sufficient to conclude that cytoPfkfb3 over-expression leads to the downregulation of Wnt signaling (a claim the authors make at the beginning of the manuscript). *

      26. We fully agree with the reviewer’s comment. We have now performed RNA-sequencing (RNAseq) analysis using control and cytoPfkfb3 explants cultured in 10 mM glucose, importantly after three hours of incubation in order to score early effects at transcriptome level (please refer to Figure R4).

      We found clear evidence that many Wnt-target genes (i.e. Axin2, Cdx4, Dact1, Dkk1, Mixl1, Msgn1, Sp5, Sp8, T) were significantly downregulated in cytoPfkfb3 explants, supporting the conclusion that Wnt signaling activity is downregulated in cytoPfkfb3 explants under high glucose condition.

      Furthermore, in order to examine similarities between the effects of cytoPfkfb3 overexpression and FBP supplementation, we also performed RNAseq analysis with explants treated with high dose of FBP or F6P. FBP supplementation resulted in downregulation of Wnt target gene expression (i.e. Dact1, Dkk1, Mixl1, Lef1, Sp5, T, Tbx6), mirroring the effects seen in cytoPfkfb3 samples. Such a response was not detected in F6P-treated explants.

      Combined, these new data significantly strengthen our conclusion that an increase in glycolytic flux and FBP levels leads to downregulation of Wnt signaling activity. The new data are now included in the revised Figure 5C–E, and we adjusted the text accordingly.

      Figure R4. Transcriptome analysis of control (Ctrl) and cytoPfkfb3 (TG) PSM explants. PSM explants were cultured for three hours under different culture conditions. (A) Effects of cytoPfkfb3 overexpression on gene expression under 10 mM glucose condition. (B, C) Effects of 20 mM FBP (B) or F6P (C) on gene expression under 2.0 mM glucose condition. Wnt-target genes that were significantly downregulated in cytoPfkfb3 or FBP/F6P-treated explants are highlighted in blue.

      *- (Re: Results related to the neural tube closure defects in cytoPfkfb3 transgenic embryos) *

      *o The section of the manuscript describing the neural tube closure defects in cytoPfkfb3 transgenic embryos is superficial, lacks detail, and distracts from the focus of the study. Perhaps the data and text on neural tube closure defects should be included as supplemental information. *

      27: We agree with the reviewer that in the previous version, this data appeared isolated. It also connects with the point raised by reviewer 2 about the in vivo significance of our findings. To address both of these points, we have now performed additional in vivo experiments using a diabetic mouse model (Akita) to directly test the in vivo consequence of cytoPfkfb3, which interestingly links to the previous findings of neural tube defects. Please see our response #13 for the details (attached below):

      (Our response #13)

      *First of all, we would like to emphasize that the phenotype seen in cytoPfkfb3 embryos, i.e. the reduction of segmentation and downregulation of Wnt-target gene expression, occurs in a glucose dose dependent manner (Figure 4B and 5A). Hence, it is not the overexpression of cytoPfkfb3 per se that can account for the effects seen. But rather, increased glycolytic flux caused by the combination of transgene expression with high glucose results in functional consequences. *

      In addition, ‘other possible effects’ that the reviewer is referring to should be evident in all transgenic embryos, irrespective of glucose dose. To the contrary, transgenic embryos cultured in low glucose conditions appear unaltered compared to control embryos.

      *Second, we agree that we need to distinguish between strong phenotypes, visible at the level of clock arrest, and milder phenotypes, visible at the level of quantitative gene expression changes. It is important to note that the moderate phenotype, i.e. the quantitative gene expression changes in the posterior PSM, is seen upon the addition of FBP at moderate levels and upon glucose titration within the physiological concentration range, as well as in cytoPfkfb3 embryos. We take this as evidence that the effects seen in cytoPfkfb3 transgenic embryos reflect a common response also seen under physiological conditions. *

      *To extend this argument to the in vivo setting, we have performed additional experiments using a genetic mouse model for diabetes. As shown in our previous submission, cytoPfkfb3 transgenic animals do not exhibit a drastic in vivo phenotype when dissected at embryonic day 10.5. One interpretation of this finding is that since the cytoPfkfb3 phenotype is glucose and flux-dependent, the in vivo flux is low, reflecting low glucose concentrations described in vivo. To test the effect of increased flux in cytoPfkfb3 embryos in vivo, we therefore crossed the transgenic mice into a diabetic model called Akita, in which a point mutation in the Insulin2 gene causes high maternal glucose levels (Yoshioka et al., 1997; Wang et al., 1999). Using this experimental setup, we tested whether transgenic embryos in Akita diabetic females would manifest in vivo phenotypes. *

      Indeed, we found that cytoPfkfb3 transgenic embryos developing in Akita diabetic females showed significantly increased cases of neural tube closure defects (50% of cytoPfkfb3 embryos) and developmental delay (control: 38 somites vs. cytoPfkfb3: 34 somites at E10.5), defects not seen in transgenic cytoPfkfb3 embryos from control females (please refer to Figure R2 below). This dependency of the in vivo phenotype on maternal glucose conditions again highlights that the defects observed in cytoPfkfb3 embryos are not due to the expression of cytoPfkfb3 per se, but are rather directly linked to increased/unregulated glycolytic flux.

      We included the new in vivo data in the revised Figure S5D-E and modified the text accordingly.

      *Figure R2. In vivo phenotype of cytoPfkfb3 embryos grown in diabetic Akita females. (A) The number of somites in control (Ctrl) and cytoPfkfb3 (Tg) E10.5 embryos grown in diabetic Akita females. (B) In situ hybridization of Msgn, Uncx4.1, and Shh mRNAs in Ctrl and Tg E10.5 embryos grown in diabetic Akita females (ss, somite stage; scale bar, 500 µm). *

      • (Re: Conclusions of the study)

      o A previous study by Oginuma et al., 2020 provided strong evidence for a mechanism underlying the positive regulation of Wnt signaling by glycolysis (initiated by the elevation of intracellular pH) in the chick embryo tailbud. As mentioned in the discussion, the results of the present study are not consistent with this model, and this contradiction is not sufficiently resolved. This is a concern, given that the evidence that cytoPfkfb3 inhibits Wnt signaling is sparse (see above).

      28: This important point was also raised by reviewer 2; please see our response as listed under #19 (attached below).

      (Our response #19)

      *We cannot resolve this discrepancy, but now offer a more detailed discussion, also based on the additional data we obtained. *

      *First, it is important to point out that we have performed additional experiments to substantiate this part of the work, i.e. a transcriptome analysis with control and cytoPfkfb3 explants cultured in 10 mM glucose. We decided to focus on an early time point, i.e. after three hours of incubation, in order to increase the chance of scoring the primary response of PSM cells to changes in glycolytic flux. In addition, our nanostring data in Figure S3 show that glucose titration can change the expression levels of some Wnt targets in both directions, i.e. decreasing glucose upregulates their expression while increasing glucose downregulates it. Again, this analysis was done at short time scales to score the immediate effect. *

      *One possible explanation for the difference from Oginuma et al. could indeed be the late time point of analysis in their study, i.e. after 16 hours of culture. This difference in sampling time, i.e. 3 hours vs. 16 hours of culture, is of particular importance given the dynamic nature of metabolic and signaling responses. *

      We have added a sentence to explain this point in more detail (lines 608-617):

      “This discrepancy could relate to the time point of analysis: while Oginuma et al. mainly focused on analyzing samples 16-hour after metabolic changes, we chose to score the effects of altered glycolytic flux/FBP levels already after a three-hour incubation, with the goal to capture the primary response of PSM cells. Whether the difference in sampling time underlies the observed difference is yet unknown, but both studies highlight that Wnt signaling is responsive to glycolytic flux, supporting a tight link between metabolism and PSM development.”

      *o Another discrepancy lies in the lack of an observable phenotype when culturing mouse PSM explants at very low glucose concentrations (e.g., 0.5 mM in Fig. 2A). Oginuma et al. observed clear disruptions to embryonic elongation and somite formation at a glucose concentration equal to 0.83 mM. Would this be due to species-specific mechanisms? Furthermore, while the authors focus on sentinel metabolites (such as FBP), experiments involving direct manipulation in glycolysis could resolve some of these inconsistencies. *

      29: Indeed, species-specific differences in the requirement for glucose are to be expected. Our extensive analysis shows that at 0.5 mM glucose, segmentation and elongation proceed (Bulusu et al., 2017).

      Regarding the second point, we have outlined several strategies to directly perturb glycolysis, i.e. glucose titration (mirrored by an increase in lactate secretion) and genetic targeting of the rate-limiting enzyme, Pfk. Glucose titration in wild-type embryos corresponds to the experiment the reviewer suggested, and we again found that higher glucose (i.e. higher flux) leads to downregulation of several Wnt target genes (Figure S3). Of note, in cytoPfkfb3 explants the effects are also glucose-dose dependent (again mirrored by an increase in lactate secretion), clearly indicating that we successfully and directly controlled glycolysis.

      *References*

        • Hu, Hai, et al. "Phosphoinositide 3-kinase regulates glycolysis through mobilization of aldolase from the actin cytoskeleton." Cell 164.3 (2016): 433-446.
        • TeSlaa, Tara, and Michael A. Teitell. "Techniques to monitor glycolysis." Methods in Enzymology 542 (2014): 91-114.
        • Oginuma, Masayuki, et al. "Intracellular pH controls WNT downstream of glycolysis in amniote embryos." Nature 584.7819 (2020): 98-101.

      *Reviewer #3 (Significance (Required)):*

      The experimental results reported in this study enhance our understanding of how cellular metabolic states regulate cellular behaviors during embryonic development. The study provides insight into how PSM elongation is controlled by morphogenetic mechanisms that are modulated by glycolytic flux. One of the strengths of the study is the use of an interdisciplinary approach that includes GC-MS, in vivo imaging and mouse transgenic lines. It should be noted that some of the conclusions of the study diverge from previous papers that examine the role of metabolism in developmental patterning (e.g., Oginuma et al., 2020).

    1. The geography theory postulates that a country's prosperity or poverty is caused by its geography, especially tropical versus temperate climate, which may influence people's attitudes, the diseases that affect health, the suitability of the soil for agriculture (tropical soil is not very conducive to it), and the local flora and fauna. I think another element of the geography theory that we can evaluate is the availability of broader natural resources. Logically, a country with more natural resources should grow faster than one with fewer. However, in their paper “Natural resource abundance and economic growth” (National Bureau of Economic Research working paper, 1995, https://www.nber.org/system/files/working_papers/w5398/w5398.pdf), Jeffrey Sachs and Andrew Warner of the Harvard Institute for International Development conclude, from a study of 97 developing countries over two decades (1970-1989), that natural-resource-rich countries tend to grow more slowly than those with scarce resources. The key hypothesis they validated was that resource-rich countries tend to focus their labor on extracting natural resources, leaving few resources and little investment for manufacturing and value-added industries. In addition, they practice protectionist, state-led development policies, which lead to lower investment and hence lower growth. So in many ways, riches by themselves become a curse rather than a boon.

    1. Author Response

      Reviewer #2 (Public Review):

      The manuscript presents an interesting study on a timely topic (hyperacusis). The study was carried out in awake animals using modern approaches in neurosciences (calcium imaging, optogenetic). The amount of data is impressive, the study is very ambitious, and overall its quality is indisputable. However, I have some general comments and questions on some concepts that are critical for the study, and also on the interpretation of the data, in particular the behavioral data.

      We appreciate Reviewer 2’s overall positive evaluation as well as their more specific critiques, which we address below.

      The first point I want to mention is the concept of 'homeostatic plasticity'. I am not sure we agree on its definition. My understanding of it is that the AVERAGE of central activity will remain constant around a set point value. In case of a reduction of sensory inputs (hearing loss), the neurons' sensitivity will be enhanced in such a way that the averaged activity will be preserved. So, neural hyperactivity after partial or sensory deprivation is not 'maladaptive': it is a collateral effect, 'the price to pay' for maintaining neural activity stable around a given value. In my opinion, this point is crucial. The authors should also mention and cite the model's paper from Schaette et al.

      “Homeostasis” is a term used widely in physiology to describe a negative feedback process in which an internal adjustment compensates for an external perturbation to return a given system (temperature, pH, etc.) to a set point. To the reviewer’s point, homeostatic processes – broadly defined – can work at many different biological scales including perhaps large, distributed systems like the example s/he gave of neurons throughout the central auditory pathway. By contrast, “homeostatic plasticity” is a mechanism studied by dozens of laboratories in hundreds of papers by which neurons (typically studied in cortical neurons) adjust their synaptic and intrinsic excitability to maintain their activity around a set point range. A key feature of homeostatic plasticity is that neurons “sense” deviations from their set point and initiate a compensatory process to offset this deviation. Up to this point, it seems that we are on the same page as the reviewer.

      The first point of possible disagreement lies in the interpretation of how excess neural activity relates to homeostatic plasticity. The reviewer mentioned modeling papers by Schaette and Kempter (2006, 2007, 2012) on the cochlear nucleus, which are also based on homeostatic plasticity and their work is now cited in the revised text (see line 71). The reviewer is correct that there is a difference in how the term is used and interpreted, but the difference is fairly subtle. Their work and our work propose that homeostatic plasticity processes are applied within a single neuron to offset the reduced afferent input that accompanies cochlear damage. As the reviewer recalled, they describe hyperactivity as a consequence of this compensation, as we do as well. The only difference is that they and the reviewer describe hyperactivity as the byproduct of the normal, successful implementation of homeostatic plasticity, which it unequivocally is not because – by definition – homeostatic plasticity is a stabilizing process that maintains activity at a predetermined set point range.

      The second point of disagreement lies in the reviewer’s statement that “neural hyperactivity after partial or sensory deprivation is not 'maladaptive': it is a collateral effect, 'the price to pay' for maintaining neural activity stable around a given value.” We disagree. Hyperactivity can be both a collateral and maladaptive effect. Hyperactivity and hypersynchrony are understood to be the basis of tinnitus, which is a maladaptive, disordered state. The reviewer’s comment implies that there is no alternative for compensating for sensory deprivation but to make cortical neurons hyperactive. We see no reason why this must be so. In fact, stabilization of activity rates after sensory deprivation has been demonstrated in hundreds of studies in the developing visual system. In the adult auditory system, activity in cortical neurons is initially depressed after injury before rebounding to exceed baseline levels (see Resnik Polley 2017 eLife, Asokan 2018 Nat Comm., Resnik Polley 2021 Neuron). It is not obligatory for cortical activity rates to pass through the set point range and continue into hyperactivity, nor is it obligatory for cortical activity rates to remain elevated above baseline many days after the injury. Additional evidence for this point comes from Figures 4, 6, and 8, which show that some cortical neurons actually do homeostatically regulate their activity back to baseline (i.e., show stable gain). This raises the intriguing question of why some neurons recover to their homeostatic activity set point while others do not. Figure 8 provides new insight into this question by showing that their baseline response properties can account for 40% of the variability in gain stabilization after peripheral insult.

      A third point of disagreement relates to the reviewer’s statement that “My understanding of it is that the AVERAGE of central activity will remain constant around a set point value. In case of a reduction of sensory inputs (hearing loss), the neurons' sensitivity will be enhanced in such a way that the averaged activity will be preserved”. We agree that homeostatic plasticity processes are influenced by activity propagating through distributed neural networks. However, the biological implementation of the process is programmed into individual neurons. The activity set point is neuron-specific, the error signal that encodes a deviation from the set point is neuron-specific, and the transcriptional/translational changes deployed to stabilize the activity rate are neuron-specific. As an analogy, home climate control systems work autonomously for each house, because the sensors (thermostat) and actuators (heating/cooling) are sensitive to fluctuations in that home, not across other houses in the town. The heating and cooling systems for each house in town may be driven by a distributed, common source (e.g., a hot day) but the mechanisms that bring the ambient temperature back to the set point for each house are autonomous and reflect the particular thermostat programming for each house. The widely studied homeostatic plasticity mechanisms mentioned in our manuscript (e.g., excitatory synaptic scaling) are not sensitive to and do not target the averaged neural activity among millions of neurons distributed throughout the sensory neuroaxis.
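
The thermostat analogy can be made concrete with a toy simulation. This is a minimal sketch and not the authors' model: the multiplicative gain, the update rule, the learning rate `eta`, and every numerical value are illustrative assumptions. The only point it encodes is that each simulated neuron senses its own firing rate and corrects toward its own set point, with no population average entering the update.

```python
def regulate(input_drive, set_point, gain, eta=0.1, steps=200):
    """Negative feedback on one neuron's gain (hypothetical toy rule).

    The neuron's rate is gain * input_drive. At each step the gain is nudged
    in proportion to the neuron's own error (set_point - rate).
    """
    for _ in range(steps):
        rate = gain * input_drive
        gain += eta * (set_point - rate)
    return gain, gain * input_drive


# Two neurons with different (assumed) set points experience the same drop in
# afferent drive, from 1.0 to 0.5, standing in for reduced peripheral input.
neurons = {
    "A": {"set_point": 10.0, "gain": 10.0},
    "B": {"set_point": 4.0, "gain": 4.0},
}
results = {
    name: regulate(0.5, p["set_point"], p["gain"])
    for name, p in neurons.items()
}
```

Under these assumptions each neuron raises its gain until its rate settles back at its own set point, which is the stabilizing behavior the response attributes to homeostatic plasticity.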

      As a final point on this statement, there is no demonstration that we are aware of that average central activity remains constant after a reduction of sensory inputs. This would require recording from many neurons across multiple stages of the sensory pathway in a single animal to show that the increased gain at later stages in the system exactly offsets the reduced responsiveness at earlier stages of the system. So, the reviewer’s definition of homeostatic plasticity is based on a general supposition about a distributed process that has never been empirically demonstrated whereas the definition we use is consistent with the mechanisms and terminology used throughout the neuroscience literature (albeit often incorrectly in the hearing loss literature).

      The second point is that a lot is built on the behavioral procedure and d'. I am not convinced that the behavioral procedure (and the d') is a convincing measurement of loudness (and therefore loudness hyperacusis). So, in my opinion, the title may be changed and more importantly the entire spirit of the paper should be modified.

      The reviewer’s critique as well as comments from other reviewers helped us realize that we had used the terms “hyperacusis” and “loudness” imprecisely. We think that is part of the confusion. What we have studied here is auditory hypersensitivity after sensorineural hearing loss, which may or may not be a model of why persons with hyperacusis can exhibit loudness hypersensitivity.

      Once “hyperacusis” and “loudness” have been stripped away from the behavior, we contend that we have a behavioral assay for auditory hypersensitivity, which is the main point of our study. To be clear, the behavioral readout most commonly employed in the animal literature to model hyperacusis is reaction time, which has a less direct relationship to hypersensitivity than does d’. D-prime is widely used as the sensitivity index in detection behaviors. The main advantage of d’ is that it controls for differences in response bias either between subjects or after noise exposure. We used the d’ metric to show that mice can more reliably detect tone levels near their sensation threshold and can more reliably detect direct stimulation of thalamocortical projection neurons after acoustic trauma. These observations provide the framework for all of the neural measurements that follow.
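
For readers unfamiliar with the sensitivity index, d' is conventionally computed as the difference between the z-transformed hit rate and false-alarm rate. The sketch below illustrates that standard definition; the log-linear correction (adding 0.5 to counts) is an assumption made here to avoid infinite z-scores, and nothing in the response says which correction, if any, the authors applied. The trial counts are invented.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (an assumption of this sketch) keeps both rates
    strictly between 0 and 1 so the inverse normal CDF stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical Go-NoGo session: 80 hits on 100 Go trials,
# 20 false alarms on 100 NoGo trials.
sensitivity = d_prime(hits=80, misses=20, false_alarms=20, correct_rejections=80)

# A subject that responds on half of all trials regardless of the stimulus
# has equal hit and false-alarm rates, so d' is zero.
chance = d_prime(hits=50, misses=50, false_alarms=50, correct_rejections=50)
```

Because the false-alarm rate is subtracted out, a general shift in willingness to respond moves both terms together, which is the bias control the response refers to.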

      On balance, the reviewer was correct that our imprecise use of hyperacusis and loudness was confusing and contradictory. The terms “hyperacusis” and “loudness” now only appear in the manuscript to describe other published findings or to describe what our study does not address. This resulted in several small text changes throughout the manuscript as well as a direct statement about the relationship between our work, loudness, and hyperacusis on Pg. 14, Lns 448-466.

      “While the findings presented here support an association between sensorineural peripheral injury, excess cortical gain, and behavioral hypersensitivity, they should not be interpreted as providing strong evidence for these factors in clinical conditions such as tinnitus or hyperacusis. Our data have nothing to say about tinnitus one way or the other, simply because we never studied a behavior that would indicate phantom sound perception. If anything, one might expect that mice experiencing a chronic phantom sound corresponding in frequency to the region of steeply sloping hearing loss would instead exhibit an increase in false alarms on high-frequency detection blocks after acoustic trauma, but this was not something we observed. Hyperacusis describes a spectrum of aversive auditory qualities including increased perceived loudness of moderate intensity sounds, a decrease in loudness tolerance, discomfort, pain, and even fear of sounds (Pienkowski et al., 2014a). The affective components of hyperacusis are more challenging to index in animals, particularly using head-fixed behaviors, though progress is being made with active avoidance paradigms in freely moving animals (Manohar et al., 2017). Our noise-induced high-frequency sensorineural hearing loss and Go-NoGo operant detection behavior were not designed to model hyperacusis. Hearing loss is not strongly associated with hyperacusis, where many individuals have normal hearing or have a pattern of mild hearing loss that does not correspond to the frequency dependence of their auditory sensitivity (Sheldrake et al., 2015). While the excess central gain and behavioral hypersensitivity we describe here may be related to the sensory component of hyperacusis, this connection is tentative because it was elicited by acoustic trauma and because the detection behavior provides a measure of stimulus salience, but not the perceptual quality of loudness, per se.”

      A lot is derived/interpreted from the results, but I believe there is a lot of over-interpretation. I would suggest the authors be more cautious and moderate in their speculations and conclusions. I would reconfigure the manuscript, and simplify it.

      We believe that the changes mentioned above and in the response to their specific comments below reduce over-interpretation and simplify the manuscript.

      As an example of a change made to moderate the conclusions from our work, we added the following to Pg. 14, Lns 442-447

      “Further, while the perceptual salience (Figure 2) and neural decoding of spared, 8kHz tones (Figure 5) were both enhanced after high-frequency sensorineural hearing loss, these measurements were not performed in the same animals (and therefore not at the same time). Definitive proof that increased cortical gain is the neural substrate for auditory hypersensitivity after hearing loss would require concurrent monitoring and manipulations of cortical activity, which would be an important goal for future experiments.”

      Reviewer #3 (Public Review):

      The study uses a mouse model of sensorineural hearing loss after sound overexposure at high frequencies that mimics age-related sensorineural hearing loss in humans. Those mice present behavioural hypersensitivity to mid-frequency tone stimuli that can be recreated with optogenetic stimulation of thalamocortical terminals in the auditory cortex. Chronic calcium imaging in pyramidal neurons in layers 2-3 of the auditory cortex shows reorganization of the tonotopic maps and changes in sound intensity coding in line with the loudness hypersensitivity shown behaviourally. After an initial state of diffuse neural hyperactivity and high correlation between cells in the auditory cortex, changes concentrate in the deafferented high-frequency edge by day 3, especially when using mid-frequency tones as sound stimuli. Those neurons can show homeostatic gain control or non-homeostatic excess gain depending on their previous baseline spontaneous activity, suggesting a specific set of cortical neurons prone to develop hyperactivity following acoustic trauma.

      This study is excellent in the combination of techniques, especially behaviour and calcium chronic imaging. Neural hyperactivity, increase in synchrony, and reorganization of the tonotopic maps in the auditory cortex following peripheral insult in the cochlea has been shown in seminal papers by Jos Eggermont or Dexter Irvine among others, although intensity level changes are a new addition. More importantly, the authors show data that suggest a close association between loudness hypersensitivity perception and an excess of cortical gain after cochlear sensorineural damage, which is the main message of the study.

      The problem is that not all the high-frequency sensorineural hearing loss in humans present hyperacusis and/or tinnitus as co-morbidities, in the same manner that not all animal models of sensorineural hearing loss present combined tinnitus and/or hyperacusis. In fact, among different studies on the topic, there is a consensus that about 2/3rds or 70% of animals with hearing loss develop tinnitus too, but not all of them. A similar scenario may happen with hearing loss and hyperacusis. Therefore, we need to ask whether all the animals in this study develop hyperacusis and tinnitus with the hearing loss or not, and if not, what are the differences in the neural activity between the cases that presented only hearing loss and the cases that presented hearing loss and hyperacusis and/or tinnitus. It could be possible that the proportion of cells showing non-homeostatic excess gain were higher in those cases where tinnitus and hyperacusis were combined with hearing loss.

      We thank the reviewer for her/his careful reading of the original manuscript and many helpful suggestions and critiques that have been addressed in the revision. Both Reviewer 2 and Reviewer 3 understood that we were presenting our high-frequency sensorineural hearing loss manipulation as a way to model the clinical phenomenon of hyperacusis. This was not our intent, and we regret that the wording of the original manuscript communicated this. In fact, the clinical literature shows that hyperacusis does not have a strong association with hearing loss and moreover our behavioral and neural outcome measures were not designed to index the core phenotype of hyperacusis (a spectrum of sound-evoked distress, disproportionate scaling of loudness with sound level, and sound-evoked pain). Our study addresses the neural and behavioral signatures of auditory hypersensitivity, which is an “upstream” condition that may (or may not) be related to the presentation of clinical phenomena like hyperacusis and tinnitus.

      The reviewer mentions a litmus test for animal models of tinnitus, in which the utility of an animal model for tinnitus would be evaluated in part based on whether a controlled insult only produced a behavioral change suggestive of a chronic phantom percept in a fraction of animals. That may be so, but our study is clearly not modeling tinnitus and we make no claims to this effect in the original or revised manuscript. The Reviewer then goes on to say that “a similar scenario may happen with hearing loss and hyperacusis”. “May” is the operative word here because the association between sensorineural hearing loss and the clinical presentation of hyperacusis is quite weak overall in human subjects, and no study (that we are aware of) has attempted to document the probabilistic appearance of hyperacusis before and after acoustic trauma. So, we really don’t know whether hyperacusis has a probabilistic appearance like tinnitus or is more deterministic like cochlear threshold shift. But, again, the main point is that our experiments make no direct claim about hyperacusis one way or the other, which we now clarify and discuss throughout the revised text, as detailed below.

      We do contend that our experiments allow us to study auditory hypersensitivity, though again there is no precedent or consensus in the literature for expecting auditory hypersensitivity to present probabilistically or deterministically across mice after a controlled insult. Regardless, we agree with the reviewer that it is a very good idea to provide the individual animal data to the reader. We added new panels to Figure 2C to show that an increase in the 8kHz d’ slope after noise exposure (i.e., a change > 1) was observed in 7/7 mice that underwent acoustic trauma but 1/6 mice in the sham exposure group, suggesting a deterministic, binary behavioral effect found in every mouse with noise-induced high-frequency sensorineural damage. On the other hand, within the acoustic trauma cohort, 3 mice showed marked increases in the d’ growth slope (> 2) while 4 showed more subtle changes, suggesting a more graded or probabilistic effect. By providing the individual animal data as per the Reviewer’s request, the reader can now make a more informed determination about the reliability of auditory hypersensitivity within the acoustic trauma cohort.

      Regarding the relationship between the peripheral/cortical/perceptual auditory hypersensitivity we report here and the clinical conditions of tinnitus and hyperacusis, we revised the text such that the word “hyperacusis” only appears in the context of other publications and have added the following text (Pg. 14, Lns 448-466).

      “While the findings presented here support an association between sensorineural peripheral injury, excess cortical gain, and behavioral hypersensitivity, they should not be interpreted as providing strong evidence for these factors in clinical conditions such as tinnitus or hyperacusis. Our data have nothing to say about tinnitus one way or the other, simply because we never studied a behavior that would indicate phantom sound perception. If anything, one might expect that mice experiencing a chronic phantom sound corresponding in frequency to the region of steeply sloping hearing loss would instead exhibit an increase in false alarms on high-frequency detection blocks after acoustic trauma, but this was not something we observed. Hyperacusis describes a spectrum of aversive auditory qualities including increased perceived loudness of moderate intensity sounds, a decrease in loudness tolerance, discomfort, pain, and even fear of sounds (Pienkowski et al., 2014a). The affective components of hyperacusis are more challenging to index in animals, particularly using head-fixed behaviors, though progress is being made with active avoidance paradigms in freely moving animals (Manohar et al., 2017). Our noise-induced high-frequency sensorineural hearing loss and Go-NoGo operant detection behavior were not designed to model hyperacusis. Hearing loss is not strongly associated with hyperacusis, where many individuals have normal hearing or have a pattern of mild hearing loss that does not correspond to the frequency dependence of their auditory sensitivity (Sheldrake et al., 2015). While the excess central gain and behavioral hypersensitivity we describe here may be related to the sensory component of hyperacusis, this connection is tentative because it was elicited by acoustic trauma and because the detection behavior provides a measure of stimulus salience, but not the perceptual quality of loudness, per se.”

    1. Author Response

      Reviewer 1 (Public Review):

      To me, the strengths of the paper are predominantly in the experimental work, there's a huge amount of data generated through mutagenesis, screening, and DMS. This is likely to constitute a valuable dataset for future work.

      We are grateful to the reviewer for their generous comment.

      Scientifically, I think what is perhaps missing, and I don't want this to be misconstrued as a request for additional work, is a deeper analysis of the structural and dynamic molecular basis for the observations. In some ways, the ML is used to replace this and I think it doesn't do as good a job. It is clear for example that there are common mechanisms underpinning the allostery between these proteins, but they are left hanging to some degree. It should be possible to work out what these are with further biophysical analysis…. Actually testing that hypothesis experimentally/computationally would be nice (rather than relying on inference from ML).

      We agree with the reviewer that this study should motivate a deeper biophysical analysis of molecular mechanisms. However, in our view, the ML portion of our work was not intended as a replacement for mechanistic analysis, nor could it serve as one. We treated ML as a hypothesis-generating tool. We hypothesized that distant homologs are likely to have similar allosteric mechanisms which may not be evident from visual analysis of DMS maps. We used ML to (a) extract underlying similarities between homologs and (b) make cross-predictions across homologs. In fact, the chief conclusion of our work is that while common patterns exist across homologs, the molecular details differ. ML provides tantalizing evidence to this effect. The conclusive evidence will require, as the reviewer rightly suggests, detailed experimental or molecular dynamics characterization. Along this line, we note that we have recently reported our atomistic MD analysis of allostery hotspots in TetR (JACS, 2022, 144, 10870). See ref. 41.

      Changes to manuscript: “Detailed biophysical or molecular dynamics characterization will be required to further validate our conclusions(38).”

      Reviewer 3 (Public Review):

      However - at least in the manuscript's present form - the paper suffers from key conceptual difficulties and a lack of rigor in data analysis that substantially limits one's confidence in the authors' interpretations.

      We hope the responses below address and allay the reviewer’s concerns.

      A key conceptual challenge shaping the interpretation of this work lies in the definition of allostery, and allosteric hotspot. The authors define allosteric mutations as those that abrogate the response of a given aTF to a small molecule effector (inducer). Thus, the results focus on mutations that are "allosterically dead". However, this assay would seem to miss other types of allosteric mutations: for example, mutations that enhance the allosteric response to ligand would not be captured, and neither would mutations that more subtly tune the dynamic range between uninduced ("off") and induced ("on") states (without wholesale breaking the observed allostery). Prior work has even indicated the presence of TetR mutations that reverse the activity of the effector, causing it to act as a co-repressor rather than an inducer (Scholz et al (2004) PMID: 15255892). Because the work focuses only on allosterically dead mutations, it is unclear how the outcome of the experiments would change if a broader (and in our view more complete) definition of allostery were considered.

      We agree with the reviewer that mutations that impact allostery manifest in many different ways. Furthermore, the effect size of these mutations runs the full gamut from subtle changes in dynamic range to drastic reversal of function. To unpack allostery further, allostery of aTF can be described, not just by the dynamic range, but by the actual basal and induced expression levels of the reporter, EC50 and Hill coefficient. Given the systemic nature of allostery, a substantial fraction of aTF mutations may have some subtle impact on one or more of these metrics. To take the reviewer’s argument one step further, one would have to accurately quantify the effect size of every single amino acid mutation on all the above properties to have a comprehensive sequence-function landscape of allostery. Needless to say, this is extremely hard! Resolution of small effect sizes is very difficult, even at high sequencing depth. To the best of our knowledge, a heroic effort approaching such comprehensive analysis has been accomplished so far only once (PMID: 3491352).
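
The four descriptors named above (basal expression, induced expression, EC50, and Hill coefficient) are the parameters of a standard Hill dose-response curve. The sketch below shows how they jointly define a response and how fold induction (dynamic range) follows from two of them. It is a generic textbook form, not the authors' fitting procedure, and all parameter values are hypothetical.

```python
def hill_response(ligand, basal, induced, ec50, n):
    """Reporter output at a given ligand concentration (Hill equation).

    basal   : expression with no ligand
    induced : expression at saturating ligand
    ec50    : ligand concentration giving the half-maximal response
    n       : Hill coefficient (steepness of the transition)
    """
    occupancy = ligand**n / (ec50**n + ligand**n)
    return basal + (induced - basal) * occupancy


def fold_induction(basal, induced):
    """Dynamic range expressed as the induced-to-basal ratio."""
    return induced / basal


# Hypothetical aTF variant: basal 10 a.u., induced 150 a.u.,
# EC50 of 2.0 (arbitrary concentration units), Hill coefficient 2.
off_state = hill_response(0.0, basal=10.0, induced=150.0, ec50=2.0, n=2)
half_max = hill_response(2.0, basal=10.0, induced=150.0, ec50=2.0, n=2)
dynamic_range = fold_induction(10.0, 150.0)
```

A mutation could shift any one of these parameters while leaving the others unchanged, which is why a loss-of-function screen captures only the most extreme region of this parameter space.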

      Our focus, therefore, was to screen for the strongest phenotypic impact on allostery i.e., loss of function. Mutations leading to loss of function can be relatively easily identified by cell-sorting. Because our goal was to compare hotspots across homologs, we surmised that loss of function mutations, given their strong phenotypic impact, are likely to provide the clearest evidence of whether allosteric hotspots are conserved across remote homologs.

      The reviewer raised the point of activity-reversing mutations. Yes, there are activity-reversing mutations in TetR. However, they represent an insignificant fraction. In the paper cited by the reviewer, there are 15 activity-reversing mutations among 4000 screened. Furthermore, the paper shows that activity reversal in TetR requires two to four mutations, while our library consists exclusively of single amino acid substitutions. For these reasons, we did not screen for activity-reversing mutations. Nonetheless, we agree with the reviewer that screening for activity-reversing mutations across homologs would be very interesting.

      The separation in fluorescence between the uninduced and induced states (the assay dynamic range, or fold induction) varies substantially amongst the four aTF homologs. Most concerningly, the fluorescence distributions for the uninduced and induced populations of the RolR single mutant library overlap almost completely (Figure 1, supplement 1), making it unclear if the authors can truly detect meaningful variation in regulation for this homolog.

      Yes, the reviewer is correct that the fold induction ratio varies among the four aTF homologs. However, we note that such differences are common among natural aTFs. Depending on the native downstream gene regulated by the aTF, some aTFs show higher ligand-induced activation and others lower. While this is not a hard and fast rule, aTFs that regulate efflux pumps tend to have higher fold induction than those that regulate metabolic enzymes. In summary, the variation in fold induction among the four aTFs is neither a flaw in experimental design nor a sign of experimental inconsistency; it is an inherent property of the protein-DNA interaction strength and the allosteric response of each aTF.

      Among the four aTFs, wildtype RolR has the weakest fold induction (15-fold), which makes sorting the RolR library particularly challenging. To minimize false positives as much as possible, we require that a dead mutant be present in (a) non-fluorescent cells after ligand induction, (b) non-fluorescent cells before ligand induction, and (c) at least two out of the three replicates for both sorts. Additionally, for RolR specifically, we adjusted the non-fluorescent gate to be far more stringent than for the other three aTFs (Fig. 1 – figure supplement 1). Furthermore, we assign residues as allosteric hotspots, not individual dead mutations. This buffers against false strong signals from stray individual dead mutations. Finally, the top interquartile range winnows them to residues showing a strong, consistent dead phenotype. As a result of these “safeguards” we have built in, the number of allosteric hotspots of RolR (57) is comparable to the other three aTFs (51, 53 and 48). This suggests that we are not overestimating the number of hotspots despite the weaker fold induction of RolR. We highlight in a new supplementary figure (Figure 1 – figure supplement 4) that changing the read count threshold from 5X to 10X produces near-identical patterns of mutations, suggesting that our results are also robust to changes in read depth stringency.

      Changes to manuscript: In response to the reviewer's comment, we have added the following sentence.

      “We note that the lower fold induction (dynamic range) of RolR makes it particularly challenging to separate the dead variants from the rest.”

      The methods state that "variants with at least 5 reads in both the presence and absence of ligand in at least two replicates were identified as dead". However, the use of a single threshold (5 reads) to define allosterically dead mutations across all mutations in all four homologs overlooks several important factors:

      Depending on the starting number of reads for a given mutation in the population (which may differ in orders of magnitude), the observation of 5 reads in the gated nonfluorescent region might be highly significant, or not significant at all. Often this is handled by considering a relative enrichment (say in the induced vs uninduced population) rather than a flat threshold across all variants.

      We regret the lack of clarity in our presentation. We wish to better explain the rationale behind our approach. First, we understand the reviewer’s point on considering relative enrichment to define a threshold. This approach works well in DMS experiments involving genetic selections, which is commonly the case, because activity scales well with selection stringency. One can then pick enrichment/depletion relative to the middle of the read count distribution as a measure of gain or loss of function.

      Second, this strategy does not, in practice, work well for cell-sorting screens. While it may be tempting to think of cell sorting as comparably activity-scaled as genetic selections, in reality, the fidelity of fluorescence-activated cell sorters is much lower. Making quantitative claims of activity based on cell sorting enrichment can be risky. It is wiser to treat cell sorting results as a yes/no binary, i.e., does the mutation disrupt allostery or not. More importantly, the yes/no binary classification suffices for our need to identify whether a certain mutation adversely impacts allosteric activity.

      Third, the above argument does not imply that all mutations have the same effect size on allostery. They don’t. We capture the effect size on individual residues, not individual mutations, by counting the number of dead mutations at a residue position. This is an important consideration because it safeguards us from minor inconsistencies that inevitably arise from cell sorting.

      Fourth, for a variant to be classified as allosterically dead, it must be present in both the uninduced and induced DNA-bound populations in at least two out of three replicates (four conditions total). This is a stringent criterion for selecting dead variants, resulting in highly consistent regions of importance in the protein even upon varying read count thresholds. To the extent possible, we have minimized the possibility of false positive bleed-through.
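
The binary criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the read counts are hypothetical, and the sketch assumes that the two-of-three replicate requirement is checked separately for the uninduced and induced sorts, which the text does not spell out.

```python
from typing import Sequence


def replicates_present(reads: Sequence[int], threshold: int) -> int:
    """Count the replicates in which a variant reaches the read threshold."""
    return sum(1 for r in reads if r >= threshold)


def is_dead(uninduced_reads: Sequence[int],
            induced_reads: Sequence[int],
            threshold: int = 5,
            min_replicates: int = 2) -> bool:
    """Binary call: the variant is found in the non-fluorescent gate
    both without and with ligand, in enough replicates of each sort."""
    # Assumption: each sort is checked on its own (not replicate-matched).
    return (replicates_present(uninduced_reads, threshold) >= min_replicates
            and replicates_present(induced_reads, threshold) >= min_replicates)


# Hypothetical normalized read counts for three replicates (not real data).
call_a = is_dead([12, 7, 0], [9, 3, 6])   # 2/3 replicates pass in both sorts
call_b = is_dead([12, 7, 0], [9, 3, 1])   # only 1/3 passes in the induced sort
print(call_a, call_b)
```

Raising `threshold` from 5 to 10 in this sketch mirrors the robustness check the authors describe; the call stays binary either way and no activity is inferred from the read counts themselves.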

      Finally, two separate normalizations were performed on the total sequence reads to be able to draw a common read count threshold 1) between experimental conditions and replicates and 2) across proteins. First, total sequencing reads were normalized to 200k total across all sample conditions (presorted, -inducer, and +inducer) and replicates for each homolog, allowing comparisons within a single protein. Next, reads were normalized again to account for differences in the theoretical size of each protein’s single-mutant library, allowing for comparisons across proteins by drawing a common read count cutoff. For example, total sequencing reads of RolR (4,332 possible mutants) increased by 1.18x relative to MphR (3,667 possible mutants) for a total of 236k reads.
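
The arithmetic of this two-step normalization can be checked with a short sketch. The function names are hypothetical, and using MphR (3,667 possible mutants) as the reference library is an assumption taken from the worked example in the text, not a statement of how the authors' pipeline is written.

```python
def target_depth(library_size: int,
                 reference_size: int = 3667,
                 base_total: int = 200_000) -> float:
    """Total read depth after scaling the 200k baseline by library size.

    Assumption: the reference library is MphR (3,667 single mutants).
    """
    return base_total * library_size / reference_size


def scale_to_total(counts: dict, total: float) -> dict:
    """Rescale raw per-variant read counts so that they sum to `total`."""
    raw_total = sum(counts.values())
    return {variant: n * total / raw_total for variant, n in counts.items()}


# RolR has 4,332 possible single mutants: 4332 / 3667 is about 1.18,
# so its normalized depth is about 236k reads, as stated in the text.
rolr_total = target_depth(4332)

# Hypothetical raw counts for three variants in one sample (not real data).
raw = {"A10G": 50, "L25P": 150, "T40K": 300}
normalized = scale_to_total(raw, rolr_total)
print(round(rolr_total), normalized)
```

With depth scaled this way, a flat cutoff of 5 reads corresponds to roughly the same per-variant coverage in each library, which is the stated purpose of the second normalization.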

      Changes to manuscript: We have provided substantial additional details in the Fluorescence-activated cell sorting and NGS preparation and analysis sections.

      We also added the following in the main text.

      “In other words, we use cell sorting as a binary classifier i.e., does the mutation disrupt allostery or not. We capture the effect size on individual residues, not individual mutations, by counting the number of dead mutations at a residue position. This is an important consideration because it safeguards us from minor inconsistencies that inevitably arise from cell sorting.”

      Depending on the noise in the data (as captured in the nucleotide-specific q-scores) and the number of nucleotides changed relative to the WT (anywhere between 1-3 for a given amino acid mutation) one might have more or less chance of observing five reads for a given mutation simply due to sequencing noise.

      All the reads considered in our analyses pass the Illumina quality threshold of Q-score ≥ 30, which as per Illumina represents “perfect reads with no errors or ambiguities”. This translates into a 1 in 1000 probability of an incorrect base call, or 99.9% base call accuracy.
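
For reference, the conversion from a Phred quality score Q to a base-call error probability is P = 10^(-Q/10). A minimal sketch (the function name is ours):

```python
def phred_to_error_probability(q: float) -> float:
    """Standard Phred relation: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


p_error = phred_to_error_probability(30)   # 1 in 1000
accuracy = 1 - p_error                     # 99.9% base-call accuracy
print(p_error, accuracy)
```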

      We use chip-based oligonucleotides to build our DMS library, which allows us to pre-specify the exact codon that encodes a point mutation. This means the nucleotide count and protein count are the same. The scenario referred to by the reviewer, i.e., “anywhere between 1-3 for a given amino acid mutation”, only applies to codon-randomized or error-prone PCR library generation. We regret if the chip-based library assembly part was unclear.

      Depending on the shape and separation of the induced (fluorescent) and uninduced (non-fluorescent) population distributions, one might have more or less chance of observing five reads by chance in the gated non-fluorescent region. The current single threshold does not account for variation in the dynamic range of the assay across homologs.

      We have addressed the concern raised by the reviewer on fluorescent population distributions in answers to questions 10 and 11.

      The reviewer makes an important point about the choice of sequencing threshold. We use the sequencing threshold simply to make a binary choice as to whether a certain variant exists in the sorted population or not. We do not use the sequencing reads to scale the activity of the variant. To address the reviewer's comment, we have included a new supplementary figure (Fig 1 – figure supplement 4) where we compare the data at two threshold levels, 5 and 10 reads. As is evident in the new figure, the fundamental pattern of allosteric hotspots and the overall data interpretation do not change.

      TetR: 5x – 53 hotspots, 10x – 51 hotspots

      TtgR: 5x – 51 hotspots, 10x – 51 hotspots

      MphR: 5x – 48 hotspots, 10x – 48 hotspots

      RolR: 5x – 57 hotspots, 10x – 60 hotspots

      In other words, changing the threshold to be more or less strict may have a modest impact on the overall number of hotspots in the dataset. Still, the regions of functional importance are consistent across different thresholds. We have expanded the discussion in the manuscript to address this point.

      Changes to manuscript: We have now included a new supplementary figure comparing hotspot data at two thresholds: Figure 1 – figure supplement 4.

      We also added the following in the main text.

      “To assess the robustness of our classification of hotspots, we determined the number of hotspots at two different sequencing thresholds – 5x and 10x. At 5x and 10x, the numbers of hotspots are – TetR: 53, 51; TtgR: 51, 51; MphR: 48, 48 and RolR: 57, 60, respectively. Changing the threshold has a modest impact on the overall number of hotspots and the regions of functional importance are consistent at both thresholds.”

      The authors provide a brief written description of the "weighted score" used to define allosteric hotspots (see y-axis for figure 1B), but without an equation, it is not clear what was calculated. Nonetheless, understanding this weighted score seems central to their definition of allosteric hotspots.

      We regret the lack of clarity in our presentation. The weighted score was used to quantify the “deadness” of every residue position in the protein. At each position in the protein, the number of mutations that inhibited activity was summed up, and the ‘deadness’ of each mutation was weighted based on the number of replicates in which it appeared to inactivate the protein. Weighted score at each residue position is given by

      Where at position x in the protein, D1 is the number of mutations dead in one replicate only, D2 is the number of mutations dead in 2 replicates, D3 is the number of mutations dead in 3 replicates, and Total is the total number of variants present in the data set (based on sequencing data). Any dead mutation that is seen in only one replicate is discarded and does not contribute to the “deadness” of the residue. Mutations seen in two and three replicates contribute to the score. We have included a new supplementary figure (Fig. 1 – figure supplement 2) to give the reader a detailed heatmap of all mutations and their impact for each protein.
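
To make the description concrete, here is a minimal sketch of a score of this shape. It is not the authors' equation: the weights are an assumption (a mutation dead in two replicates counts 2, one dead in three replicates counts 3, and one dead in a single replicate counts 0, matching the statement that such mutations are discarded), and the function name and example counts are hypothetical.

```python
def weighted_score(d1: int, d2: int, d3: int, total: int,
                   w2: float = 2.0, w3: float = 3.0) -> float:
    """Illustrative per-residue 'deadness' score.

    d1, d2, d3: mutations at this position that are dead in exactly
                1, 2 or 3 replicates.
    total:      variants observed at this position in the sequencing data.
    w2, w3:     ASSUMED weights; the source does not state them here.
    """
    if total == 0:
        return 0.0
    # d1 is accepted for completeness but deliberately ignored:
    # single-replicate hits do not contribute to the score.
    return (w2 * d2 + w3 * d3) / total


# Hypothetical position: 19 variants observed, 3 dead in two replicates
# and 5 dead in all three replicates.
score = weighted_score(d1=4, d2=3, d3=5, total=19)
print(score)
```

Whatever the exact weights, the design point carries over: the score is attached to the residue rather than to any single mutation, so one stray dead call moves it only slightly.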

      Changes to manuscript: The weighted scoring scheme is now described in greater detail under Materials and Methods in the “NGS preparation and analysis” section.

      The authors do not provide some of the standard "controls" often used to assess deep mutational scanning data. For example, one might expect that synonymous mutations are not categorized as allosterically dead using their methods (because they should still respond to ligand) and that most nonsense mutations are also not allosterically dead (because they should no longer repress GFP under either condition). In general, it is not clear how the authors validated the assay/confirmed that it is giving the expected results.

      As we state in response to question 12, we use chip-based oligonucleotides to build our DMS library, which allows us to pre-specify the exact codon that encodes a point mutation. We have no synonymous or nonsense mutations in our DMS library. Each protein mutation is encoded by a single unique codon. The only stop codon is at the 3’ end of the gene.

      The authors performed three replicates of the experiment, but reproducibility across replicates and noise in the assay is not presented/discussed.

      Changes to manuscript: A new supplementary table (Table 1) is now provided with the pairwise correlation coefficients between all replicates for each protein.

      In the analysis of long-range interactions, the authors assert that "hotspot interactions are more likely to be long-range than those of non-hotspots", but this was not accompanied by a statistical test (Figure 2 - figure supplement 1).

      In response to the reviewer's comment, we now include a paired t-test comparing non-hotspots and hotspots with long-range interactions in the main text.

      Changes to manuscript: In all four aTFs, hotspots constituted a higher fraction of LRIs than non-hotspots (Figure 2 – figure supplement 1; P = 0.07).
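
A paired t-test over four proteins has only three degrees of freedom, so it needs a large and consistent difference to reach conventional significance. The sketch below shows the calculation using only the standard library. The fractions are made-up placeholders, not the values behind the reported P = 0.07, and the closed-form p-value is valid only for exactly four pairs (df = 3).

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def two_sided_p_df3(t: float) -> float:
    """Two-sided p-value from the closed-form Student t CDF for df = 3."""
    x = abs(t) / math.sqrt(3)
    cdf = 0.5 + (x / (1 + x * x) + math.atan(x)) / math.pi
    return 2 * (1 - cdf)


# Placeholder LRI fractions for four proteins (NOT the published data).
hotspot_fraction = [0.40, 0.35, 0.45, 0.38]
non_hotspot_fraction = [0.30, 0.33, 0.32, 0.31]

t_stat, df = paired_t(hotspot_fraction, non_hotspot_fraction)
p_value = two_sided_p_df3(t_stat)
print(t_stat, df, p_value)
```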

    1. Author Response

      Reviewer #1 (Public Review):

      The authors trained rats to self-initiate a trial by poking into a nose poke, and to make a sequence of 8 licks in the nose poke after a visual cue. Trials were considered valid (called "timely") only if rats waited for more than 2.5 sec after the end of the previous trial. An attempt to initiate a trial (nose poking) before the 2.5 sec criterion was regarded as "premature". The authors recorded from the dorsal striatum while rats performed this task. The authors first show that some neurons exhibited a phasic activation around the time of port entry detected using an infrared detector ("Entry cell"), as well as port exit ("Exit cell"). Some neurons showed activation at both entry and exit ("Entry and Exit cell") or between these two events ("Inside-port cell"). Fractions of neurons that fall into these four categories are roughly the same (Fig. 3C). The main conclusions drawn from this study are that (1) the activity preceding a port entry was positively correlated with the latency to initiate a trial (or "waiting time"; Fig. 4E), which appears to reflect the value of the upcoming reward, and that (2) in adolescent rats, the activity rose more steeply with the latency to trial initiation (Fig. 7J).

      These observations are potentially interesting, in particular, the possible difference between adult and adolescent rats is intriguing. However, this study does not examine whether this brain region actually plays a role in the task. Some of the conclusions appear to be premature.

      1) Previous studies have found correlations between the activity of neurons in the striatum and the latency to trial initiation (e.g. Wang et al., Nat. Neurosci., 2013) or action initiation more generally (e.g. Kunimatsu et al., eLife, 2018). In the former study, the trial initiation was self-generated, similar to the present study, and was modulated by the overall reward value (state value). In the latter study, the latency was instructed by a cue. Furthermore, there are many studies that showed correlations between striatal activity and future rewards (e.g. Samejima et al., Science, 2005; Lau and Glimcher, 2008). Many of these studies varied the value of upcoming reward (e.g. amount or probability). Although some details are different, the basic concepts have been demonstrated in previous studies.

      Although there are other studies linking striatal activity to trial/action initiation and reward probability, here the striatal activity preceding the execution of a learned sequence depends on the internal representation of the time waited. Elapsed time is the only cue the animal has regarding the possible outcome until it is too late and the trial has already been initiated. Although a light cue then tells the rat whether the timing was correct, providing an opportunity to stop the behavior, the behavior released during premature trials very closely resembles that observed during unrewarded timely trials. This remarkable similarity between premature trials and timely unrewarded trials made it possible to isolate the effect of wait time-based modulation of anticipatory striatal activity. Moreover, we have compared striatal activity between adult and adolescent rats, finding a steeper wait time-based modulation of striatal activity in adolescent animals that correlates with more impulsive behavior in these animals.

      2) The authors conclude that "in this task, the firing rate modulation preceding trial initiation discriminates between premature and timely trials and does not predict the speed, regularity, structure, value or vigor of the subsequently released action sequence". This conclusion is based on the observation that premature and timely trials did not differ in terms of kinematic parameters as measured using accelerometer. Although the result supports that the difference in activity between premature and timely cannot be explained by the kinematic variables, it does not exclude the possibility that the activity is modulated by some kinematic variables in a way orthogonal to these trial types.

      While our accelerometer data do not support the idea that differences in movement initiation time or velocity could explain the differences in striatal activity between adolescent and adult rats, we cannot rule out that kinematic variables not captured by the head accelerometer recordings could explain some of the results. This is acknowledged in the main text, results section, page 8, line 180.

      3) The firing rate plot shown in Figure 4D should be replotted by aligning trials by movement initiation (presumably available from accelerometer or video recording). Is it possible that the activity rises similarly between trial types but the activity is cut off depending on when the animal enters the port at different latency from the movement initiation? In any case, the port entry is a somewhat indirect measure of "trial initiation".

      Unfortunately, we have not systematically obtained video recordings of the sessions and only have accelerometer recordings of a few of the animals that provided the neuronal data, which precludes replotting the data as suggested. Accelerometer recordings are available from two adult and two adolescent rats. Latency from movement initiation to port entry does not differ between premature and timely trials at either age. This is now reported on page 8 line 175 for adult rats, and page 15 line 341 for adolescent rats. These results appear to be at odds with the idea that decreased neuronal activity in premature trials is the result of a cut-off of the response.

      4) The differences between adult and adolescent rats are not particularly big, with the data from the adolescent rats showing a noisy trace.

      New data from two adolescent rats reduced the variability and confirmed the behavioral and physiological differences with adult rats. All panels from figure 7 now include the data from 5 adolescent animals instead of 3. The number of neurons analyzed in the adolescent group increased from 552 to 876. The inclusion of these new data allowed us to perform new statistical comparisons. We fitted a logistic function to the cumulative trial initiation timing data (Fig. 7N) and found that the rate of accumulation is higher in adolescent rats. Importantly, this is observed not only in the part of the curve corresponding to premature responding but also during timely responding, indicating that adolescent rats' premature responding is a manifestation of a more general behavioral trait that makes them self-initiate trials faster than adults (Fig. 7N). The noisy trace of the curves showing the amplitude modulation of anticipatory activity as a function of waiting time was partly due to the relatively low number of premature trials, which required relatively long time bins. With more data available, we have been able to replot these curves using a smaller bin size for the short waiting times (Fig. 7M). We fitted a logistic function to these data and observed a higher rate of increase of this activity modulation in adolescent rats, paralleling the behavioral data. Moreover, we report a significant correlation between the behavioral and neurophysiological data (a steeper trial initiation time curve correlates with a steeper wait modulation of anticipatory activity, Fig. 7O). These new findings are reported in the results section, from page 17 line 405 to page 18 line 417.
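
The kind of comparison described above, fitting a logistic curve to cumulative trial initiation times and comparing its rate parameter between groups, can be sketched as follows. This is a stand-in, not the authors' fitting procedure: it uses a simple linearized least-squares fit on the logit scale, and the group parameters are hypothetical values used only to generate synthetic curves.

```python
import math


def logistic(t: float, k: float, t0: float) -> float:
    """Logistic curve with rate k and midpoint t0."""
    return 1.0 / (1.0 + math.exp(-k * (t - t0)))


def fit_logistic(times, fractions):
    """Estimate (k, t0) by a least-squares line through logit(fraction) vs time.

    Since logit(f) = k * (t - t0), the slope is k and t0 = -intercept / k.
    Points with a fraction of exactly 0 or 1 are skipped (logit undefined).
    """
    pts = [(t, math.log(f / (1.0 - f)))
           for t, f in zip(times, fractions) if 0.0 < f < 1.0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    k = sxy / sxx
    intercept = mean_y - k * mean_t
    return k, -intercept / k


# Synthetic cumulative curves from hypothetical parameters (not real data).
waits = [0.5 * i for i in range(1, 11)]            # 0.5 s to 5.0 s
adult = [logistic(t, 1.0, 3.0) for t in waits]
adolescent = [logistic(t, 1.8, 2.5) for t in waits]

k_adult, t0_adult = fit_logistic(waits, adult)
k_adol, t0_adol = fit_logistic(waits, adolescent)
print(k_adult, t0_adult, k_adol, t0_adol)
```

A larger fitted `k` corresponds to the "higher rate of accumulation" reported for adolescent rats. With noisy real data, a nonlinear least-squares fit would usually be preferable to this linearization.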

      Reviewer #2 (Public Review):

      The authors conduct an ambitious set of experiments to study how neural activity in the dorsal striatum relates to how animals can wait to perform an action sequence for reward. There are a lot of interesting studies on striatal encoding of actions/skills, and additionally evidence that striatal activity can help control response timing and time-related response selection. The authors bridge these issues here in an impressive effort. Recordings were made in the dorsal striatum on several tasks, and activity was assessed with respect to action initiation, completion, and outcome processing with respect to whether animals could wait appropriately or could not wait and responded prematurely. Conducting recordings of this sort in this task, particularly in some adolescent animals, is technically advanced. I think there is a very timely and potentially very interesting set of results here. However, I have some concerns that I hope can be addressed:

      It seems like the recordings were made throughout the dorsal striatum (histology map), including some recordings near/in the DLS. Is this accurate? The manuscript is written as though only the DMS was recorded.

      We acknowledge that our recordings are spread along the medial and central regions of the dorsal striatum. Although we are not sure that there is a consensus regarding the limits of the DMS and DLS, we believe that none of our recordings are clearly located within the DLS. Following your suggestion, we have modified the text and refer to the location of our recordings as “dorsal striatum”. We believe that, as there is a lot of work on the roles of the DLS and DMS in reward learning, it is still important to refer to this work in the Introduction section and to discuss our findings in its context, particularly since we find that most task-related activity is concentrated at the beginning and end of the task, as shown in several studies focused on the DLS.

      If I understand correctly, the rats must lick 8 times to get the water. If this is true, one strategy is to just keep licking until the water comes. Therefore, the rats may not have learned an 8-lick action sequence. The authors should clarify this possibility, and if it is, to consider avoiding using phrases like "automatized action sequence" since no real action sequence might have been learned. In short, I am not convinced the animals have learned an action pattern rather than to just keep licking once a waiting period has elapsed.

      We acknowledge that the experiments do not allow us to establish if the rats know what the exact number of licks needed is; when the skill is acquired, licking becomes highly stereotyped and the rats might as well be learning a time after which continuous licking leads to reward. We still believe that the stereotyped performance, the inability to stop the behavior when the absence of the light cue unequivocally indicates that no reward will be obtained in premature trials, and the rapid decrease of lick rate after the eighth lick was emitted and no reward was obtained, support that the behavior is automatic until the time of expected reward delivery. A representative raster plot showing lick sequences during a whole session in a trained adult rat is presented in Fig. 1I and Figure 7 – supplement 1H shows an example of the licks of an adolescent rat.

      The number of subjects per group is very low. This is fine for analysis of within-animal neural activity. However, comparing the behavior between these groups of animals does not seem appropriate unless the Ns are substantially increased.

      The revised version of the manuscript includes a higher number of adolescent rats from which striatal activity and behavior were recorded, which allowed us to perform a more detailed statistical analysis of the correlations between these measures. In addition, we now include new behavioral data from an independent sample of 6 adult and 6 adolescent non-implanted rats that confirm the results obtained with the implanted animals (presented in Figure 7 – supplement 4).

      I found the manuscript difficult to decipher. There are many groups. If I understand correctly, there are the following:

      -ITI 2.5s experiment

      -ITI 5 s experiment

      -ITI2.5-5s experiment

      -ITI 2.5 s experiment (adolescent)

      -Two accelerometer animals (unclear which experiment)

      -Two animals in ITI 2.5 sec without recordings (unclear how incorporated into analyses)

      Within each group, there are multiple categories of behavioral performance. This produces a large list of variables. In some parts of the results, these groups are separated and compared, but not all groups are compared in those sections. In other sections the different groups (all or just some?) appear to be combined for analysis, but this is not clearly described. Another consequence of mixing the groups and conditions together in analysis as they do is that some of the statements in the results are very hard to follow (e.g., line 305 "...similar behavior observed in 8-lick prematurely released and timely unrewarded trials...").

      To clarify the experimental groups, we now include a table (Table 1) summarizing which tasks were used and how many animals were trained in each task.

      Generally, it is difficult to understand the results without first understanding the details of the different tasks, the different groups of animals, and the different epochs of comparison for neural analysis. It took me a long time to work through the methods and I am still not sure I completely understand it. On this point, some sentences are very long and should be broken up into smaller, clearer sentences. There are a lot of phrases that only someone familiar with the cited articles might understand what they mean (e.g., even one paragraph starting with line 39 includes all of the following terms: automaticity in behavior; behavioral unit or chunk; reward expectancy; reward prediction errors and trial outcomes; explore-exploit; cost-benefit; speed-accuracy tradeoffs; tolerance to delayed rewards; internal urgency states). It is very hard to follow how each of these processes are to be understood in terms of behavioral measures used to study them and how they do or do not relate to the hypothesis of the present study. The discussion similarly uses a lot of different phrases to discuss the task and neural responses in a way that makes it hard to understand exactly what the author's interpretation of the data are. Is there maybe a 'most likely' interpretation that can be stated for some of the responses?

      Our main aim is to disclose the mechanisms underlying differences between adult and adolescent rats relating to impulsivity. We hope that this will become clearer in this version of the manuscript after deepening the analysis of the differences between them. We believe that our data do not allow us to unequivocally determine the ultimate cognitive process producing the striatal activity differences between adult and adolescent rats, i.e., differences in internal urgency states, time perception, or tolerance to delayed rewards, and we have tried to reflect that fairly in the Discussion.

      The data set is extremely rich; there are a lot of data here. As a result it can be hard to understand how all of the data relate to the main hypothesis of the article. It often reads as an exploratory results section rather than a series of hypothesis tests.

      We have tried to improve the overall clarity of the text.

      Reviewer #3 (Public Review):

      Cecilia-Martinez et al. implement a task that allows the study of premature versus timely actions in rats. First, they show that rats can learn this task. Next, they record activity in the DMS, showing start/stop signals in the recorded cells. They then propose that the activity detected before the release of action sequences discriminates premature from timely initiations, showing a relationship between the waiting time and the activity of the recorded cells. Furthermore, they show that the expectancy of reward could be what is encoded in the activity before entering the port. Last, they show that adolescent rats make more premature starts than adult rats, documenting a difference in activity modulation of DMS cells in the relation between waiting time and firing rate (although above the premature threshold, see comments below).

      Overall the paper is well presented describing a well-developed set of experiments and deserves publication attending only minor comments.

      1) I understand rats learn to execute sequences of <8 licks or 8 licks. Although diagrams are presented, neither examples of individual trials with 8 licks nor distributions of bouts of these licks are presented.

      Rats learn to execute a lick sequence to obtain the reward. The experiments do not allow us to establish if they know what the exact number of licks needed is; when the skill is acquired, licking becomes highly stereotyped and the rats might as well be learning a time after which continuous licking leads to reward. A representative raster plot showing lick sequences in a session in a trained adult rat is presented in Figure 1I and Figure 7 - supplement 1H shows an example of the licks of an adolescent rat.

      2) Relevant to the statement: "in this task, the firing rate modulation preceding trial initiation discriminates between premature and timely trials and does not predict the speed, regularity, structure, value or vigor of the subsequently released action sequence"... It is not clear whether the latency to first lick (plot 2D) and the inter-lick interval (2E) come only from the 8-lick sequences. If that is not the case, it is important to compare only the ones with 8 licks.

      The data are from 8 lick sequences, this is now indicated in the figure legend.

      3) Related to the implications of the previous statement, there seems to be a tendency for longer latency to first lick in timely vs premature trials in Figure 2D (timely-trials-late vs premature-trials-late)? Again, here it is important to compare the 8-lick sequences only.

      Only 8-lick sequences are compared, and the two-way ANOVA showed a significant effect of the training stage without significant effects of trial timing (premature versus timely) and a non-significant interaction. The average ± SEM latencies to the first lick (of the 8-lick sequence) were 0.717 s ± 0.063 for timely trials late and 0.805 s ± 0.086 for premature trials late.

      4) I could not find in the main text whether the individual points in Fig.2 (e.g. 2B-E) are individual animals. Please specify that.

      In these figure panels, every individual point corresponds to the mean of a session; the data correspond to 5 adult animals (2-5 sessions per animal and timing condition). Whether the data correspond to animals or sessions is now clarified in all figure legends.

      5) Although the argument presented in Figures 4C and 6C is very elegant, I wonder if head acceleration may miss differences in movements outside the head in the two kinds of trials. If that is the case, please acknowledge it.

      We acknowledge in the main text, results section, page 8, line 180, that the accelerometer does not allow us to determine if the movements of other body parts differ between trial types.

      6) Also in 4C, small separations between timely vs premature signals are seen before 0. Is there a way to know if animals in timely vs premature trials approached the entry port in the same way? This request is pertinent in order to rule out motor contribution to the differences in Figure 4A-B.

      Although it is not possible to completely rule out small movement differences between premature and timely trials, no evident behavioral differences can be detected by trained observers or by analyzing video recordings taken during some sessions. The available accelerometer recordings also suggest that a similar motor pattern is displayed in premature and timely trials (Figure 4C).

      7) When saying: "Similar results were obtained in rats trained with a longer waiting interval (Supplementary Figure 5)", it is hard to see the similarity in the premature range: while in the 2.5 seconds task there is a positive relationship, in the 5 seconds task there is not.

      Please note that a positive relationship is observed for the two bins preceding trial initiation, which are about 2.75 s and 1 s before port entry. The bin that does not seem to fit is centered 4 s before port entry (1 s after exiting the port in the previous trial). Because of the longer waiting time, behavior in the 5 s task becomes less organized during the first seconds after port exit; however, the modulation of activity is still observed in the bins that are close to port entry.

      8) The data showing that the waiting modulation of reward anticipation grows at a faster rate in adolescent rats is clear, however, it is not clear how it could be related to the data showing that the adolescent rats were more impulsive.

      We acknowledge that the data do not provide a causal link with behavior. After adding two new adolescent rats, we have been able to study in more detail the relationship between the waiting modulation of neuronal activity and the accumulation of trial initiations (depicted in Figures 7M and 7N, respectively) by fitting logistic functions to the data. The new results are explained on page 17, line 384. There is a striking parallel between the growth rate of both curves, and the curves of adolescent rats are significantly steeper than those of adult rats. Moreover, there is a significant correlation between the coefficients that mark the rate of growth of the behavioral and neurophysiological data (Fig. 7O).

      9) Related to the sentence: "the strength of anticipatory activity increased with the time waited before response release and was higher in the more impulsive adolescent rats"....One may expect to see a difference in the range of the premature time however the differences were observed in the range >2.5 seconds. Please explain how to reconcile this finding with the fact that the adolescent rats were more impulsive.

      Please note that the more impulsive behavior of adolescent rats (and the faster growth of the wait modulation of anticipatory activity) is observed along waiting times that exceed the 2.5 s criterion wait time; we added a phrase in the Results section (page 18, line 413) and in the Discussion section (page 19, line 443) to emphasize this point. Regarding the premature trials, a related issue was raised by reviewer #1, concern 4. The addition of new data from adolescent animals allowed us to use smaller bins to better discriminate what happens at short waiting times, and we included an inset in Figure 7M that makes it easier to see what happens at these intervals.

    1. Nothing gets people’s attention like something startling. Surprise, a simple emotion, hijacks a person’s mind and body and focuses them on a source of possible danger (Simons, 1996). When there’s a loud, unexpected crash, people stop, freeze, and orient to the source of the noise. Their minds are wiped clean—after something startling, people usually can’t remember what they had been talking about—and attention is focused on what just happened. By focusing all the body’s resources on the unexpected event, surprise helps people respond quickly

      It's interesting to see that, no matter how composed, calm or worried you are, the emotion of surprise affects everyone the same way, because you lose all feeling of readiness when it hits you. On the other hand, surprises can sometimes bring out one's best moments: as your whole body reacts and focuses on the surprise, your reactions and thinking can also be temporarily enhanced for that moment. I said in the last lecture that because we are different there are different results, but I think that for the emotion of surprise it is the background event, and how unusual it is, that determines what kind of surprise the person feels.

    1. We may justly expect American men to be as willing to grant to the women of the United States as generous consideration as those of Great Britain have done

      This shows examples that women need change in society and their rights. I believe what this tells us is that this is what people think about women in general.

    1. Author Response

      Reviewer #2 (Public Review):

      McCoy et al. have developed a new urban tree species database from existing city tree inventories. They designed procedures to collect and clean a large amount of data, i.e., more than five million trees from 63 US cities. They found that urban trees were significantly clustered by species in 93% of cities using the compiled data. They also showed that climate significantly shaped both nativity and tree diversity. Also, they identified the homogenization effect of the non-native species. The interest in patterns of urban biodiversity and its driving mechanism has been rising recently. This paper provides an important data source for addressing research questions on this topic. The finding presented by the authors exemplified its potential.

      Strengths

      Compared to the existing urban tree database, such as the one developed by Ossola et al. (Global Ecology and Biogeography 2020), the new database added information on spatial location, nativity statuses, and tree health conditions besides occurrences. The new information expands data usability and saves valuable time for researchers. The authors also make the tools available so others can use them to process their own data sets. Because of the added information, various analyses of the diversity pattern of urban trees and the potential driving mechanism could be conducted. The authors found that urban trees were nonrandomly clustered by species. This finding corroborates the existing knowledge that some common species dominate urban trees. Nevertheless, the authors showed that the dominance was apparent in the spatial dimension. The preliminary finding that the native status of a tree had no apparent impact on tree health is interesting. It can potentially contribute to the debate on native vs. exotic in urban tree species selection, which the author mentioned in the paper.

      Thank you for the feedback!

      Weakness

      While the new database and the analysis based on it have strengths, some aspects of the concepts and data analysis need to be clarified and extended.

      We appreciate these helpful comments and have made many changes in response, detailed below.

      First, the authors need to define several critical concepts used in the paper, including city trees, urban forests, biodiversity, and species diversity. The authors used city trees and urban forests interchangeably throughout the paper. Nevertheless, a widely accepted definition of the urban forest is:"All woody and associated vegetation in and around dense human settlements." Konijnendijk et al. had a good discussion on the terminology used in urban forestry (Urban Forestry & Urban Greening, 2006). Similarly, biodiversity is different from species diversity. Effective species number is a diversity indicator. Therefore, it is challenging to accept conclusions being drawn on biodiversity in urban forests without clear definitions.

      We appreciate these clarifications; we have clarified our terminology throughout and added these important definitions.

      • “...urban forests, which are the woody and associated vegetation in and around dense human settlements (Konijnendijk et al., 2006).”

      • “City tree communities, an essential component of urban forests, provide many services.”

      We replaced the term “biodiversity” throughout the text where really we meant to say “tree species diversity” or just “diversity.”

      Second, the tree inventories varied significantly regarding the number of records (214~720,140). The variation can be due to the actual variation of tree abundance in studied cities or incomplete inventories. Biases can be introduced into the findings when comparing these inventories without adjusting the unequal sample sizes. The authors did not detail how they dealt with this issue when conducting the analysis.

      We redid all of our relevant analyses and applied Chao’s rarefaction and extrapolation techniques throughout the manuscript. The (substantial) changes are fully described above in the “Essential Revisions” section. We also copy them here.

      First, we redid all of our diversity calculations applying Chao’s rarefaction and extrapolation techniques through the R package iNext. Therefore, our summary datasheet now has many new columns to include the following values for each city:

      ○ Effective species number:

      ■ Raw effective species number

      ■ Asymptotic estimate of effective species number with confidence interval

      ■ Estimate of effective species number for a given population size (37,000 trees, the median population size rounded to the nearest 1,000) with confidence interval

      ○ Species richness:

      ■ Raw species richness (number of species)

      ■ Asymptotic estimate of number of species with confidence interval

      ■ Estimate of number of species for a given population size (37,000 trees, the median population size rounded to the nearest 1,000) with confidence interval

      ○ The same for the native-only population of trees in each city (e.g., not just the raw effective number of native species but also the iNext estimates and confidence intervals)

      ○ Whether or not each of the values above was calculated using extrapolation or interpolation

      ○ Sample coverage estimates

      Second, we re-ran our models testing for significant correlations between species diversity in a city and other factors (including climate), where we used the extrapolated / interpolated effective species numbers from iNext. Specifically, we found the best-fit model, which included the following predictors: environmental PCA1, environmental PCA1:environmental PCA2, and whether or not a city was designated as a Tree City USA. Then, we ran this model under six sensitivity conditions, varying the dependent variable and/or which cities we included based on completeness of their sample. Climate was still a significant correlate of diversity.

      ○ first, with dependent variable = effective species as calculated for a given population of 37,000 trees ("effective species for a standardized population size");

      ○ second, dependent variable = the asymptotic estimate of the effective species number for that city as calculated using iNext;

      ○ third, the raw effective species number;

      ○ fourth, excluding cities with fewer than 10,000 trees;

      ○ fifth, excluding cities with <50% spatial coverage;

      ○ sixth, excluding cities with <0.995 sample coverage as calculated by iNext.

      ○ For the fourth, fifth, and sixth models, the dependent variable was effective species for a standardized population size of 37,000 trees.

      Third, we redid our comparisons of tree populations in parks versus those in urban areas. Parks were still more diverse than urban areas.

      ○ Specifically, we used iNext to calculate diversity metrics based on the smaller of the two population sizes (park vs urban) to enable fair comparison for each city.

      ○ We reported comparison results for (i) raw effective species number, (ii) asymptotic estimate, and (iii) estimate for a given population.

      ○ In doing so, we eliminated Milwaukee from the comparison (it had only 28 trees recorded as being in an urban setting).

      Fourth, we redid our pairwise comparisons of tree community composition between cities in order to account for different population sizes and sampling efforts. To do so, we randomly subsampled the larger city to make its population equal to the smaller city, calculated comparison metrics, and repeated this process 50 times. We report the average comparison metrics.

      Our new Methods text is copied here for your convenience:

      ○ “Throughout our analyses, it was necessary to control for different sample sizes (and different, but unknown, sampling efforts across cities). To do so, we relied on the rarefaction / extrapolation methods developed by Chao and colleagues (Chao et al., 2015, 2014; Chao & Jost, 2012) and implemented through the R software package iNext (Hsieh et al., 2016). In short, these methods use statistical rarefaction and/or extrapolation to generate comparable estimates of diversity across populations with different sampling efforts or population sizes, alongside confidence intervals for these diversity estimates. iNext performs these tasks for Hill numbers of orders q = 0, 1, and 2. We used two techniques in iNext to allow for comparisons across cities (and between parks and urban areas within cities). First, we generated asymptotic diversity estimates for each; second, we generated diversity estimates for a given standardized population size. For our diversity analyses, the standardized population size we used was 37,000 trees (the rounded median of all cities). For analyses of the diversity of native trees, we used a standardized population size of 10,000 trees. For comparisons of the diversity between park and urban areas in a city, we used the smaller of the two population sizes (park or urban). In all cases we also recorded confidence estimates, and plotted rarefaction/extrapolation curves.

      ○ To control for variation in how uniformly trees were sampled across a city’s geographic range, we developed a procedure to score each city’s spatial coverage (see section Spatial Structure below).

      ○ We identified the best-fitting model, and then repeated our analysis under six sensitivity conditions to control for differences in population size, sampling effort, spatial coverage, and sample coverage. Our sensitivity analyses were as follows: first, with dependent variable = effective species as calculated for a given population of 37,000 trees ("effective species for a standardized population size"); second, dependent variable = the asymptotic estimate of the effective species number for that city as calculated using iNext; third, the raw effective species number; fourth, excluding cities with fewer than 10,000 trees; fifth, excluding cities with <50% spatial coverage; sixth, excluding cities with <0.995 sample coverage as calculated by iNext. For the fourth, fifth, and sixth models, the dependent variable was effective species for a standardized population size of 37,000 trees.”
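The idea of comparing cities at a standardized population size can be illustrated with a minimal sketch. This is not the analytical rarefaction/extrapolation estimator implemented in iNext (distributed on CRAN as iNEXT): it assumes a plain Monte Carlo rarefaction that draws trees without replacement, and it only interpolates to sizes at or below the observed count. The function names and the toy inventories are hypothetical.

```python
import math
import random
from collections import Counter


def effective_species(counts):
    """Hill number of order q = 1: exponential of the Shannon entropy."""
    total = sum(counts)
    shannon = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return math.exp(shannon)


def rarefied_effective_species(trees, size, draws=50, seed=0):
    """Mean effective species number over random subsamples of `size` trees.

    Assumption: simple Monte Carlo rarefaction, not the analytical
    estimators of Chao and colleagues; extrapolation is not supported.
    """
    if size > len(trees):
        raise ValueError("this sketch interpolates only; size exceeds sample")
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        subsample = rng.sample(trees, size)  # draw without replacement
        values.append(effective_species(Counter(subsample).values()))
    return sum(values) / draws


# Hypothetical toy inventories with the same species mix but different sizes.
small_city = ["oak"] * 50 + ["maple"] * 30 + ["elm"] * 20
large_city = ["oak"] * 500 + ["maple"] * 300 + ["elm"] * 200

even = effective_species([25, 25, 25, 25])  # four equally common species
small_std = rarefied_effective_species(small_city, 100)
large_std = rarefied_effective_species(large_city, 100)
```

Standardizing both inventories to 100 trees puts the two estimates on the same footing, which is the purpose of the 37,000-tree standard size described above; the analytical estimators additionally provide confidence intervals and extrapolation, which this sketch does not attempt.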

      Reviewer #3 (Public Review):

      This paper's strength is in the utility of the assembled datasets and some interesting and creative proof of concept analyses. This is an amazing resource for comparative analysis. However the paper felt a little sparse in the conceptual and methodological underpinnings of the questions asked to demonstrate the utility of the analysis. Specifically, I suggest:

      A) More substance in the introduction (currently only two short paragraphs) and a clear statement of research questions.

      We have added text to frame our goals and hypotheses:

      ○ “In particular, we wanted to know whether local climatic conditions are associated with the species diversity of city tree communities, how species diversity was distributed in space within cities, and whether introduced tree species contribute to biotic homogenization among urban ecosystems.”

      B) Add data on the extent to which each dataset represents a complete sample of each city's trees. I know some are complete inventories, but some consist of 720 trees and cannot be a complete sample. A column in the metadata indicating effort, and whether there was any bias in where sampling occurred if the dataset is not complete, is needed for others to use this data appropriately. For example, we know tree cover/diversity increases with wealth (which the author rightly cites). Let's say in City X, trees were only inventoried in one wealthy neighborhood. They would not be a representative sample of the city, and dataset users need to be aware of this before they draw incorrect conclusions about City X, where the sample was biased, compared to City Y, where the inventory was complete, including a sampling of all affluent and poor areas. This is also needed to support the research questions throughout the paper.

      We completely agree, and have made two major changes in response.

      First, we redid all of our diversity analyses after applying Chao’s rarefaction and extrapolation methods to permit comparison between populations of different sizes and sampling efforts. We added new columns to our datasheet with sample coverage estimates, asymptotic estimates of diversity, and diversity estimates for a given population size.

      Second, we also examined spatial coverage in a city because of the valid concern you raised that trees may only be sampled from particular neighborhoods or areas. In short, we divided each city into grid cells, counted trees per grid cell, and calculated metrics of coverage (adjusted number of trees per grid cell, and proportion of grid cells that were empty) and bias (skew and kurtosis of the number of trees in occupied grid cells). These factors are presented in Spatial_Coverage_Supplement.zip. As you can see even just from a glance at the spatial coverage plots, some cities are indeed extremely biased! Therefore, we ran a sensitivity analysis where we excluded cities with <50% spatial coverage.
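The grid-based procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes a rectangular city extent, projected coordinates (e.g. meters), at least one tree, and population (not sample) moments for skew and kurtosis. The function name, the cell size, and the toy points are hypothetical, and the "adjusted" trees-per-cell metric is not reproduced because the response does not define it.

```python
import math
from collections import Counter


def spatial_coverage(points, bounds, cell_size):
    """Grid-based coverage and bias metrics for tree locations.

    points: iterable of (x, y) in projected units
    bounds: (xmin, ymin, xmax, ymax), assumed rectangular city extent
    """
    xmin, ymin, xmax, ymax = bounds
    n_cols = math.ceil((xmax - xmin) / cell_size)
    n_rows = math.ceil((ymax - ymin) / cell_size)

    # Count trees per grid cell; points on the far edge go in the last cell.
    counts = Counter()
    for x, y in points:
        col = min(int((x - xmin) // cell_size), n_cols - 1)
        row = min(int((y - ymin) // cell_size), n_rows - 1)
        counts[(row, col)] += 1

    occupied = list(counts.values())
    n = len(occupied)
    mean = sum(occupied) / n
    var = sum((c - mean) ** 2 for c in occupied) / n

    # Standardized third and fourth moments over occupied cells only.
    skew = kurt = 0.0
    if var > 0:
        skew = sum((c - mean) ** 3 for c in occupied) / n / var ** 1.5
        kurt = sum((c - mean) ** 4 for c in occupied) / n / var ** 2

    return {
        "proportion_empty": 1 - n / (n_rows * n_cols),
        "trees_per_occupied_cell": mean,
        "skew": skew,
        "kurtosis": kurt,  # Pearson kurtosis, not excess kurtosis
    }


# Hypothetical 200 m x 200 m extent with 100 m cells: three trees share one
# cell, one tree sits in another, and two cells are empty.
toy_points = [(10, 10), (20, 30), (90, 90), (150, 150)]
metrics = spatial_coverage(toy_points, (0, 0, 200, 200), 100)
```

A threshold such as the <50% spatial coverage cutoff mentioned above would then be applied to one of these metrics; which metric the authors thresholded is not stated in the response.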

      C) The authors chose to use effective species counts as their alpha diversity metric of choice. They explain why: "effective species counts (a measure that allows comparison between cities of different sizes)" (Ln 109). While effective species number is an excellent metric with much better behavior and attributes in linear modeling, I believe it is still strongly dependent on both city area and the number of individual trees sampled, and so the above statement and all of the comparisons that flow out of it in the manuscript are currently unsupported. Just as species richness needs to be rarefied or extrapolated to be compared at an equivalent # of individuals or area to be accurate, so too does EFN (effective species count). Fortunately there is an R package (iNext) based on Chao's method (citation below) that makes it very easy to create effective species accumulation curves for each city by tree individuals sampled.

      a. Chao, Anne, Nicholas J. Gotelli, T. C. Hsieh, Elizabeth L. Sander, K. H. Ma, Robert K. Colwell, and Aaron M. Ellison. 2014. "Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies." Ecological Monographs 84 (1): 45-67. https://doi.org/10.1890/13-0133.1.

      b. The standardization (rarefaction/extrapolation) of EFN or richness for # individual trees sampled needs to be made for all analyses that make claims to compare diversity metrics across cities or between groups like urban and park areas (i.e. Fig 2a,b,c; Fig 3b; Fig 5a,b, S1a, S2a, S5, Table S2)

      c. If the authors have an argument for why diversity/area or diversity/sampling effort relationships do not apply for a particular question, then they should make that case instead.

      We very much appreciate this suggestion. Indeed, as described above, we applied Chao’s method to all of our analyses.

      D) The question posed by the Beta diversity analysis is fascinating (i.e., is it non-native species that are driving biotic homogenization across species?). However, while frequency (which I assume is relative abundance, but maybe it is incidence data; please define) is used to deal with different sample sizes, consider whether it makes sense to include incomplete or very small city datasets in the analysis even with frequency data. For example, one city only has ~720 trees listed. If this is an incomplete dataset, which seems likely, it will probably be much more differentiated (overlap less) from another city with small numbers simply due to incomplete sampling. Diversity analysis in cities always requires tradeoffs and cannot be identical to methods used in "natural" forested ecosystems, but I encourage the authors to explore this a bit. Perhaps a sensitivity analysis could help where incomplete or small sample sizes are dropped or datasets are resampled via random draw to equalize sizes? The latter would handle incomplete samples but would not deal with bias in which neighborhoods were sampled (see point B above).

      Great suggestion. We redid this analysis using a random-draw approach, as you suggested, to equalize sizes. The new analysis found the same results as our old analysis, with slightly different values. The new method is described here:

      ○ “How similar are species compositions across cities? For N = 1953 city-city comparisons of street tree communities, we could calculate weighted measures of similarity because we had frequency data. We calculated similarity scores for the entire tree population, the naturally-occurring trees only, and the introduced trees only. We used chi-square distance metrics on species frequency data, and we controlled for different population sizes (and potentially, sampling efforts) between cities by sub-sampling the larger city 50 times to match the smaller city’s tree population size and calculating average metrics. In this manner we controlled for differences in sample size.”
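The sub-sampling comparison described in this passage can be sketched in a few lines. The sketch rests on assumptions: the response does not say which chi-square distance variant was used, so a symmetric form, the sum over species of (p - q)^2 / (p + q) on relative frequencies, is assumed here; the function names and the toy inputs are hypothetical.

```python
import random
from collections import Counter


def relative_frequencies(trees):
    """Map each species to its share of the tree list."""
    counts = Counter(trees)
    n = len(trees)
    return {species: c / n for species, c in counts.items()}


def chi_square_distance(freq_a, freq_b):
    """Assumed symmetric chi-square distance between two frequency dicts."""
    total = 0.0
    for species in set(freq_a) | set(freq_b):
        p = freq_a.get(species, 0.0)
        q = freq_b.get(species, 0.0)
        if p + q > 0:
            total += (p - q) ** 2 / (p + q)
    return total


def subsampled_distance(city_a, city_b, repeats=50, seed=0):
    """Average distance after repeatedly sub-sampling the larger city
    down to the smaller city's number of trees."""
    small, large = sorted((city_a, city_b), key=len)
    rng = random.Random(seed)
    freq_small = relative_frequencies(small)
    distances = []
    for _ in range(repeats):
        sub = rng.sample(large, len(small))  # draw without replacement
        distances.append(chi_square_distance(freq_small, relative_frequencies(sub)))
    return sum(distances) / repeats


# Hypothetical checks: identical composition versus no shared species.
same = subsampled_distance(["oak"] * 10, ["oak"] * 100)
disjoint = chi_square_distance({"oak": 1.0}, {"elm": 1.0})
```

Under this assumed form the distance runs from 0 (identical frequencies) to 2 (no shared species), so averaged values from different city pairs are directly comparable once both cities are evaluated at the same number of trees.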

      E) Additional context/conceptual underpinning the clustering analysis would be great.

      a. The authors state in Line 390-395:"For city trees, which are often organized along grids or the underlying street layout of a city, this method can more meaningfully cluster trees than merely calculating the meters between trees and identifying nearest neighbors (which may be close as the crow flies but separated from each other by tall buildings)."- I very much agree with this sentiment and it is biologically meaningful for animal and plant dispersal, but as written it is unclear to me how the method described in the text "knows" that a tall building or elevation or some sort of feature exists to separate clusters rather than empty space or a ball field. Please clarify.

      We appreciate these comments, and we have added text and references for the interested reader. Here is the new description in full:

      ○ “We wanted to quantify the degree to which trees were spatially clustered by species within a city (rather than randomly arranged). To do so, we first clustered all trees within each city using hierarchical density based spatial clustering through the hdbscan library in Python (McInnes et al., 2017). HDBSCAN, unlike typical methods such as “k nearest neighbors”, takes into account the underlying spatial structure of the dataset and allows the user to modify parameters in order to find biologically meaningful clusters. For city trees, which are often organized along grids or the underlying street layout of a city, this method can more meaningfully cluster trees than merely calculating the meters between trees and identifying nearest neighbors (which may be close as the crow flies but separated from each other by tall buildings). In particular, using the Manhattan metric rather than Euclidean metrics improves clustering analysis in cities (which tend to be organized along city blocks). For further discussion of why hdbscan is preferable to other clustering metrics, see (Berba, 2020; Leland McInnes et al., 2016; McInnes et al., 2017).”
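The difference between the two metrics named above can be shown with a small worked example. The coordinates are hypothetical and assume an axis-aligned street grid measured in meters. Note that the Manhattan metric does not detect buildings or other obstacles; it only measures distance as travel along the two grid axes, which approximates walking along city blocks.

```python
import math


def euclidean(a, b):
    """Straight-line ("as the crow flies") distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan(a, b):
    """Distance travelled along axis-aligned streets between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Two hypothetical trees on opposite corners of a 30 m x 40 m city block.
tree_a = (0.0, 0.0)
tree_b = (30.0, 40.0)

straight_line = euclidean(tree_a, tree_b)  # cuts diagonally through the block
along_blocks = manhattan(tree_a, tree_b)   # follows the two street segments
```

In the hdbscan library the metric is chosen through the `metric` argument of `hdbscan.HDBSCAN` (for example `metric="manhattan"`); the remaining parameter values the authors used are not given in the response, so none are assumed here.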

      b. Would you ever expect composition to be truly random either in a city or a natural forest given environmental conditions etc.? In some sense, the ones closest to random are the most surprising. Can you dive into one to give an example of what is going on in that city?

      c. It seems like there are two metrics here- the size of the cluster and then the observed/expected EFN per cluster. The latter is analyzed in this paper but is there any important information in the former? It seems like an interesting structural measurement of the city and possibly useful in its own right.

      d. Are there any target levels of randomness? Could the authors suggest how this might be determined moving forward with their datasets to illustrate this for foresters?

      Great points. We have given a lot of thought to your comments; these are large and interesting questions! In the end, I think these questions fall mostly beyond the scope of this study, but we added a substantial amount of text to address your comments:

      ○ “Clustering by species is not necessarily a negative, nor indeed should we necessarily expect trees to be randomly arranged (see suggestions for further research in “Future Analyses” section below). Here, we take a first step toward making spatial clustering a metric of interest in city tree planning.”

      ○ “Researchers could also use this dataset to perform more refined analysis of clustering. For example, what is the biological significance of variation in cluster size (as determined by the hdbscan clustering algorithms)? The size and arrangement of the clusters themselves may be useful metrics. How clustered should we expect trees to be in both wild and urban settings? That is, what are our null expectations? Further, researchers could apply network theory to predict how pest species would proliferate through each of these cities (depending on the spatial arrangement of pest-sensitive trees).”

      F) The statement that this dataset enables "the design of rich heterogenous ecosystems built around urban forests" (Ln 72) seems strange. To my mind this tool will enable a more nuanced evaluation of the urban forests that already exist and suggest ways to target future plantings for increased resilience to climate, pest resistance, biodiversity support etc. I don't understand what ecosystem you would build around and not in the urban forest. If this is what is meant please elaborate. For example, do you mean non-tree installations?

      We agree with you and have changed the text as follows:

      ○ “With these tools, we may evaluate existing city tree communities with more nuance and design future plantings to maximize resistance to pests and climate change. We depend on city trees.”

    1. Put simply, conservatives hope that Twitter will now become a more willing vehicle for right-wing propaganda. Even if the platform tilts further in their direction, they will be motivated to continue to insist they are being censored—their criticisms likely exempting Musk himself in favor of attacking Twitter’s white-collar workers, whom conservatives paradoxically perceive as the “elite” while praising their billionaire bosses as populist heroes.

      This claim takes into account that those on the right wing are viewing the purchase of Twitter as a new way to push their narrative on this social media platform. However, they will still push the narrative that they are being censored. This is all a big scheme to make their followers view them as an oppressed group who are being silenced. I think it is important for platforms like this to implement free speech but we have to fact check sources and this is something I feel conservatives may not be taking into consideration.

  3. Aug 2022
    1. For instance, the emissions saved from living car-free may be lower than we calculated if public transit replaces car travel instead of biking or walking (living car free represents all the emissions associated with the life cycle of owning a car in our methodology).

      This contributes to the idea of imperatives that we talked about on the first day of classes. The imperatives we have to keep in mind concern not only the planet and what is best in the long run, but also the needs of humans and the lives we are currently accustomed to. Substituting certain routine actions rather than cutting them out completely would still result in emissions, but significantly fewer than before the substitution.

    1. Wouldn’t we all love to agree on a comprehensive worldview, consistent with science, that tells us how to behave individually and collectively?

      Personally I had this view about language before, wouldn't it be nice if we all spoke the same language. As I've grown, I realized that even though it may be convenient, I think that's the beautiful part about humans and society, that we coexist despite our differences. I think it would make us more of a dystopia than a utopia if we were all the same.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this paper, Staneva et al. describe a novel complex found at RNA PolII promoters that they term the SPARC. The manuscript focuses on defining the core components of the complex and the pivotal role of SET27 in defining its function and role in PolII transcription. This manuscript is a logical follow-on from an initial paper (Staneva et al., 2021) by the same authors where they systematically analyzed chromatin factors and their role in both transcription start and termination. What is also very clear is that this complex is one made of histone readers and writers, which suggests its function is to change the chromatin structure around PolII promoters. The authors show that this complex is necessary for the correct positioning of PolII and directionality of transcription.

      This was a well-designed study and a well-written, clear manuscript that provides fascinating insight into transcription control in bloodstream form parasites.

      I have no major comments only a few minor ones.

      1) Localisation of the different SPARC components appears to be either nuclear or nuclear and cytoplasmic. - Both SET27 and CRD1 show a nuclear and cytoplasmic localisation in the bloodstream form IFA (Supplementary Fig 1B), but only a nuclear localisation in the procyclic form.

      Did the authors attempt C terminally tagging SET27, CRD1 to see if this resulted in a change in the pattern?

      We have not tagged either protein at the C terminus, however SET27 (Tb927.9.13470) has been tagged both N- and C-terminally in procyclic form (PF) cells as part of the TrypTag project (http://tryptag.org). In both cases, SET27 localized to the nucleus, suggesting that the differences in localization we observe for SET27 depend on the life cycle stage, and not on the position of the tag. One caveat is that in the TrypTag project proteins are tagged with mNeonGreen whereas in our study proteins were tagged with YFP. Based on our images, CRD1 appears to be predominantly nuclear in both bloodstream form (BF) and PF parasites. CRD1 (Tb927.7.4540) has been tagged only N-terminally in PF cells as part of the TrypTag project where it has also been classified as mostly nuclear with only 10% of cells showing cytoplasmic localization for CRD1.

      We are well aware that tags can alter the behaviour of a protein. Absolute confirmation of location will require the generation of antibodies that detect untagged proteins. However, this is a longer-term undertaking. We have added the following statement to the Results section to address the point raised:

      “We tagged the proteins on their N termini to preserve 3′ UTR sequences involved in regulating mRNA stability (Clayton, 2019). We note, however, that the presence of the YFP tag and/or its position (N- or C-terminal) might affect protein expression and localization patterns”.

      • The point is made that JBP2 shows a 'distinct cytoplasmic localisation' in PF cells. By this logic, the SET27 localisation in BF is also distinctly cytoplasmic and a nuclear enrichment is not clear.

      Indeed the reviewer is correct - we have inadvertently overemphasized the significance of this difference in the text. We had emphasized the predominantly cytoplasmic localization of JBP2 in PF trypanosomes as potentially related to its weaker association with other (predominantly nuclear) SPARC components in the mass spectrometry experiments. The presence of SET27 in the nuclei of both BF and PF cells is confirmed by a positive ChIP signal. We have revised the manuscript text by changing “distinct cytoplasmic” to “predominantly cytoplasmic” to describe JBP2 localization in PF cells. We hope that this resolves the issue.

      • Why would the localisation pattern change between life cycle stages? Surely PolII transcription should remain the same?

      Although our analysis suggests that there may be some shift in SET27 and JBP2 localization between BF and PF stages, sufficient amounts of these proteins may be present in the nucleus for proper SPARC assembly and RNAPII transcription regulation in both life cycle forms. The proportion of SET27 and JBP2 proteins that localizes to the cytoplasm may have functions unrelated to transcription.

      2) Several of the images in Supplementary Fig 1B seem to show foci in the nucleus (CSD1, PWWP1, CRD1). Do you see foci throughout the cell cycle or just in G1/S phase cells as shown here?

      We have not systematically investigated protein localization at different cell cycle stages, so we do not have microscopy images for all proteins at all stages of the cell cycle. However, the images we did collect suggest the punctate pattern is preserved for CRD1 in the G2 phase in both BF and PF cells (see below) as we showed in Supplemental Figure S1B for cells with 1 kinetoplast and 1 nucleus (G1/S phase cells). The significance of these puncta remains to be determined.

      3) In Figure 6, what does 'TE' stand for?

      TE denotes transposable elements. We have added this to the figure legend.

      4) The authors show this interesting link between SPARC complex and subtelomeric VSG gene silencing. -In the CRD1 ChIP or RBP1 ChIP, are there any other peaks in telomere adjacent regions in the WT cells similar to that seen on chromosome 9A? And does the sequence at this point resemble a PolII promoter?

      Apart from peaks located on Chromosome 9_3A, there are other CRD1 and RPB1 ChIP peaks in chromosomal regions adjacent to telomeres in WT cells. We observed broadening of RPB1 distribution in these regions upon SET27 deletion, similar to what we show for Chromosome 9_3A. In particular, wider RPB1 distribution on Chromosome 8_5A coincides with upregulation of 10 VSG transcripts. These two loci explain most of the differentially expressed genes (DEGs) detected, but other subtelomeric regions show a similar pattern. We have added the following statement to the Results section to highlight that the phenotype shown for Chromosome 9_3A is not unique:

      “We also observed a similar phenotype at other subtelomeric regions, such as Chromosome 8_5A where 10 VSGs and a gene encoding a hypothetical protein were upregulated upon SET27 deletion (Supplemental Table S3)”.

      Cordon-Obras et al. (2022) have recently defined key sequence elements present at one RNAPII promoter. We searched for similar sequence motifs but failed to identify them as underlying CRD1 and RPB1 ChIP peaks, highlighting the likely sequence heterogeneity amongst trypanosome RNAPII promoters. To address this point, we have added the following sentence to the Discussion:

      “Sequence-specific elements have recently been found to drive RNAPII transcription from a T. brucei promoter (Cordon-Obras et al., 2022), however, we were unable to identify similar motifs underlying CRD1 or RPB1 ChIP-seq peaks, suggesting that T. brucei promoters are perhaps heterogeneous in composition”.

      -In the FLAG-CRD1 IP (Figure 3B), the VSGs seen here are not represented (as far as I can tell) in Figure 6B and C. If my reading is correct, could this be a difference in the FC cutoff for what is significant in these experiments?

      The VSGs detected in the FLAG-CRD1 IP from set27Δ/Δ cells are indeed different from the ones shown in Figure 6 (even after setting the same fold change cutoffs). We have highlighted this by adding the following statement to the Results section: “Gene ontology analysis of the upregulated mRNA set revealed strong enrichment for normally silent VSG genes (Figure 6B-D) which were distinct from the VSG proteins detected in the FLAG-CRD1 immunoprecipitations from set27Δ/Δ cells (Figure 3B)”.

      The VSGs in the mass spectrometry experiments likely represent unspecific interactors of FLAG-CRD1. To clarify this, we have added the following statement to the Results section: “Instead, several VSG proteins were detected as being associated with FLAG-CRD1 in set27Δ/Δ cells, though it is likely that these represent unspecific interactions”.

      Reviewer #1 (Significance (Required)):

      Trypanosomes are unusual in the way that they transcribe protein coding genes. Recent advances have defined the chromatin composition at the TSS and TTS, and the recent publication of a PolII promoter sequence(s) further adds to our understanding of how transcription here is regulated. Defining the SPARC complex now adds to this understanding and highlights the role of potential histone readers and writers. I think that this will be of interest to the kinetoplastid community, especially those working on control of gene expression.

      Our lab studies gene expression and antigenic variation in T. brucei.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors identify a six-membered chromatin-associated protein complex termed SPARC that localizes to Transcription Start Regions (TSRs) and co-localizes with and (directly or indirectly) interacts with RNA polymerase II subunits. Careful deletion studies of one of its components, SET27, convincingly show the functional importance of this complex for the genomic localization, accuracy, and directionality of transcription initiation. Overall, the experiments are well and logically designed and executed, the results are well presented, and the manuscript is easy to read.

      There are a few minor points that would benefit from clarification and/or from a more detailed discussion:

      1) The concomitant expression of many VSGs (37) in a SET27 deletion strain is remarkable and has important implications for their normally monoallelic expression. It is well established that VSG expression in wild-type T. brucei can only occur from one of ~15 subtelomeric bloodstream expression sites, which include the ESAGs. This result implies that VSG genes are also transcribed from "archival VSG sites" in the genome, not only from expression sites. Are there VSGs from the silent BESs among the upregulated VSGs? Is there precedence in the literature for the expression of VSGs from chromosomal regions besides the subtelomeric expression sites?

      Our analysis of differentially expressed genes (DEGs) revealed that 43 VSG genes (37 of which are subtelomeric) and 2 ESAG genes are upregulated in the absence of SET27. Both ESAGs but none of the upregulated VSGs in set27Δ/Δ cells are annotated as located in BES regions. While it is possible that recombination events have resulted in gene rearrangements between the reference strain and our laboratory’s strain, at least some of the upregulated VSGs are likely to be transcribed from non-BES archival sites. VSG transcript upregulation from non-BES regions was also recently described by López-Escobar et al (2022).

      We note that the upregulated mRNAs in set27Δ/Δ are still relatively lowly expressed (Figure 6C). This is presumably insufficient to coat the surface of T. brucei, and expression from BES sites instead may be required to achieve this. We have revised the manuscript Discussion section to make these points clearer:

      “Bloodstream form trypanosomes normally express only a single VSG gene from 1 of ~15 telomere-adjacent bloodstream expression sites (BESs). In contrast, in set27Δ/Δ cells we detected upregulation of 43 VSG transcripts, none of which were annotated as located in BES regions. Recently, López-Escobar et al (2022) have also observed VSG mRNA upregulation from non-BES locations, suggesting that VSGs might sometimes be transcribed from other regions of the genome. However, the VSG transcripts we detect as upregulated in set27Δ/Δ were relatively lowly expressed (Figure 6C) and may not be translated to protein or be translated at low levels compared to a VSG transcribed from a BES site”.

      2) The role of SPARC in defining transcription initiation is compelling. It's less clear to the reviewer if the observed transcriptional silencing within subtelomeric regions can also be ascribed to SPARC. Have the authors considered the possibility that some components of the SPARC may be shared by other chromatin complexes, which could be responsible for the transcriptional activation of silent genes in SET27 deletion mutants?

      We cannot rule out indirect effects through the participation of some SPARC components in other complexes operating independently of SPARC. Indeed, the transcriptional defect within the main body of chromosomes appears to be somewhat different from that observed at subtelomeric regions, particularly with respect to distance from SPARC. We have added a statement in the Discussion section to highlight the possibility raised by the reviewer:

      “However, an alternative possibility is that transcriptional repression in subtelomeric regions is mediated by different protein complexes which share some of their subunits with SPARC, or whose activity is influenced by it”.

      3) The authors mention that the observed interaction of FLAG-CRD1 with VSGs in the immunoprecipitations (Fig. 3B) is evidence for the actual expression of normally silent VSGs on the protein level. This is true, but it should be spelled out that this interaction is nevertheless likely an artifact, at least the physiological relevance of these interactions is questionable.

      We agree that these are likely background associations and have added the following statement to the Results section to clarify this point:

      “Instead, several VSG proteins were detected as associated with FLAG-CRD1 in set27Δ/Δ cells, though it is likely that these represent unspecific interactions”.

      To avoid unnecessary confusion we have also removed the following sentence from the revised Discussion:

      “The interactions of FLAG-CRD1 with VSGs in the affinity selections from set27Δ/Δ cells indicate that some of the normally silent VSG genes are also translated into proteins in the absence of SET27”.

      4) "ophistokont" is misspelled in the introduction

      Thanks for noticing. We have corrected it to “Opisthokonta”.

      Reviewer #2 (Significance (Required)):

      The manuscript by Staneva et al. addresses the fundamental regulatory mechanism of gene transcription in the protozoan parasite Trypanosoma brucei, a highly divergent eukaryotic organism that is renowned for unusual features and mechanisms in gene regulation, metabolism, and other cellular processes. While post-transcriptional regulation is prevalent and relatively well established in T. brucei, much less is known about the mechanism of transcription initiation and transcriptional control, in part due to the general paucity of well-defined conventional promoter regions in this organism (only very few have been identified thus far). In this context, the work by Staneva et al. is highly significant and represents an important contribution to the field of gene regulation and chromatin biology in T. brucei and other related kinetoplastid parasites.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewers 1 and 2 are very positive about our manuscript, while reviewer 3 is surprisingly critical.

      However, except for the first observation, most of reviewer 3’s comments are based on incorrect interpretations of our results.

      We have integrated the useful comment into our revised version, and we will discuss in the following sections why reviewer 3’s remaining criticisms should be disregarded.

      Reviewer 1:

      Reviewer 1 has only minor suggestions and is satisfied that we prove convincingly our claims. The reviewer also finds our results reinforce our previously proposed hypothesis that the glands and the trachea evolved from common metamerically repeated ancient primordia.

      We have introduced the following changes to the text to accommodate Reviewer 1’s minor suggestions.

      Main suggestion: Insert a paragraph in discussion explaining the relevance of new insights to more basal insects that do not form a ring gland.

      RESPONSE: We have introduced the following paragraph at the end of the Discussion:

      “Our analysis of snail activation in the CA and PG shows that these glands and the trachea share similar upstream regulators, reinforcing the hypothesis that both diverged from an ancient segmentally repeated organ. In Drosophila melanogaster the CA and the PG primordia experiment a very active migration after which they fuse to the corpora cardiaca forming the ring gland (Sanchez-Higueras and Hombria, 2016). This differs from more basal insects where the CA fuses to the corpora cardiaca but not to the PG, and from the Crustacea where the three equivalent glands are independent of each other (Chang and O'Connor, 1977; Laufer et al., 1987; Nijhout, 1994; Wigglesworth, 1954). As the mechanisms we here describe relate to the early specification of the glandular primordia in Drosophila, it will be interesting to investigate if the equivalent genes are also involved in the endocrine gland specification of more distant arthropods”.

      Additional comment 1: Introduction, pg 3, a paragraph starting with "In comparison to the extensive knowledge we have of ..." - consider omitting or greatly shortening, this text breaks the flow as it is focused on tracheal development. I understand the authors' logic, but this information distracts from the main focus on CA and PG. RESPONSE: We agree that the trachea description paragraph breaks the flow of the introduction to gland development. As suggested by the reviewer, we have deleted most of the descriptive text on trachea development but left all the references so that interested readers can find the information.

      Additional comment 2: Beginning of discussion, pg 11: - change 2nd sentence to: " Our results indicate that the HH and the Wnt pathways act indirectly to negatively regulate the spatial activation ..." - the following sentence, starting with "Engrailed activation off hh transcription ...." is way too long and hard to follow, consider breaking into two sentences. RESPONSE: We have changed both sentences as suggested.

      Additional comment 3: In Fig 4B, mx and lb segments should be labeled so this panel is consistent with labeling in 4A. RESPONSE: We have changed the Fig. 4B labels to be consistent with 4A.

      Additional comment 4: In Fig 6, reduce the font size for labels on right-hand side (A1, A2, A1+A2 proximal, etc), so that they are visually distinct from panel labels on left-hand side (A, B, C,..).

      RESPONSE: We have changed the Fig. 6 font size as suggested.

      Reviewer 2

      The reviewer is positive and agrees that the results we present in “this paper add to our understanding of how the CA and PG primordia are specified and highlights important similarities with the specification of the tracheal primordia”. The reviewer’s comments focus especially on the activation vs. maintenance of sna.

      Specific Comment a): Referring to Fig 1G-J, the reviewer says: It is not clear to me from either this figure or from the text whether the initial pattern of expression of the sna-rg reporter in stage 11 embryos is WT and then disappears at stage 12, or whether it is always defective. In trying to understand the activation process, I think it would be important to know for sure whether rg enhancer activity during the initiation phase in stage 11 is WT or not.

      RESPONSE: As suggested by the reviewer, we have included st11 embryos in Fig. 1 as panels G,J which illustrate that early sna-rg activation occurs normally in snaΔrgR2 embryos prior to apoptosis kicking in. To make space for these images, we have taken out the st12 embryos that we had in our previously submitted version. This does not affect the manuscript’s message, as st12 phenotypes are similar to those at st13 which are presented in Fig. 1H,J.

      Moreover, in this revised version, the embryos in Fig. 1G-J have also been double stained with the apoptosis marker DCP1 to highlight the cell death observed in the gland primordia of snaΔrgR2 embryos (Fig. 1G’-J’).

      Specific Comment b) The authors argue that the rg deletion removes the only region driving sna expression in CA/PG. I'm not convinced that necessity necessarily implies sufficiency with respect to the requirements for rescue. While the sna-rg reporter is expressed in a pattern that seems to mimic the endogenous gene, do we know that a rg-sna transgene would fully rescue the rg deletion mutant?

      RESPONSE: In our previous paper (Sanchez-Higueras 2014) we presented evidence that in sna null embryos, a Snail BAC gene lacking the sna-rg CRM can fully rescue the mesoderm phenotypes but not the ring gland ones. This proved that in the BAC transgene there was no shadow CRM capable of rescuing the gland formation in the absence of sna-rg. In the current paper we show that deleting the endogenous sna-rg CRM in the sna locus results in the absence of sna transcription from the gland primordia.

      Making a sna-rg- construct expressing sna to test if this rescues the snaΔrgR2 homozygous mutants could be done, but it will delay this publication without adding much to the paper: we already know that sna-rg is sufficient to drive activation in all the CA and the PG cells (Sanchez-Higueras 2014 Fig 2J-M) and it would be expected to rescue the gland formation in snaΔrgR2 homozygous mutants.

      Having said that, we have changed the wording in the manuscript to one that may be acceptable to the reviewer.

      Instead of:

      “These results prove that snaΔrgR2 deletes the only regulatory region driving sna expression in the CA and PG gland primordia…”

      We now say:

      “These results prove that the snaΔrgR2 mutation inactivates the only regulatory region driving sna expression in the CA and PG gland primordia…”

      Specific Comment c) is Sna required for maintaining sna expression?

      RESPONSE: This experiment is relevant to the maintenance mechanism of sna expression in the ring gland, and not to its activation, which is the main focus of this paper.

      The search for the maintenance mechanisms is currently being pursued in the laboratory and we prefer not to deal with it in this paper. Providing a negative answer to this question would not be satisfactory, as we would need to search for the factors controlling sna’s maintenance.

      Specific comment d) The authors show that there is an expansion in the number of sna-rg reporter expressing cells along the AP axis when upd is ectopically expressed using a sal-Gal4 driver. Though not mentioned in the text at this juncture, sal is expressed in the PG primordia, while seven-up (svp) is expressed in the CA primordia. I assume that the upd induced expansion is only observed for the PG primordia (LB) and not the CA primordia (Mx) - at least this is what the figure looks like. (…) How about svp driven upd - assuming there is a svp-Gal4 driver - does it cause an expansion of CA but not PG?

      RESPONSE: As the reviewer has noticed, there is a stronger expansion of sna-rg-GFP expression in the labial segment than in the maxillary segment. This is not due to the use of the sal-Gal4 line. We see the same effect with arm-Gal4 which drives similar expression on the maxilla and the labium. To illustrate this point, we have included two new panels (Fig.5D-E) where the ectopic expression of Upd has been induced with arm-Gal4. These embryos have been stained with anti-Sal to label the PG. This experiment shows clearly that the PG has expanded much more than the CA.

      There are several reasons why expansion of the glands could be more efficient in the labium than in the maxilla. One possible reason is the temporal response to Upd activation. Upd induction by the arm-Gal4 and sal-Gal4 lines may occur after the cells in the maxilla are no longer capable of activating sna-rg but still capable of activating it in the labium. This temporal hypothesis is based on our results showing that the CA expresses the upd gene more transiently and that STAT activation lasts for longer in the labium than in the maxilla (Fig. 4A-D).

      A second possibility, which we favour, is the existence of dorso-ventral repressor genes modulating sna-rg expression intrasegmentally. Some of our results point towards the sna-rg CRM receiving repressor inputs that modulate intrasegmental spatial expression in the dorso-ventral axis. When we delete the A2 distal region of the sna-rg enhancer, its expression in the labium expands ventrally (Fig. 6E,G and Sup.Fig. 4D). If a similar repressor was also modulating sna-rg in the maxilla it could be blocking its expansion. However, at this stage we have no solid data to support any of these hypotheses. As explained before for the maintenance mechanisms of sna-rg expression, our ongoing work aims to isolate and characterize further elements controlling the ring gland gene network, including these negative regulators.

      In the revised manuscript we now describe the different effects of Upd ectopic activation on the expression of sna-rg in the maxilla and the labium (underlined text is new to this revised version):

      “To test if generalised Upd expression in the maxilla and labium can activate sna-rg expression independently of other upstream positive or negative inputs, we induced UAS-upd with either the sal-Gal4 or the arm-Gal4 lines. We observe that these embryos have expanded sna-rg expression along the antero-posterior axis in the maxillary and labial segments (Fig. 5C). Analysis of Sal expression, which labels the PG primordium (Sanchez-Higueras et al., 2014), shows that Upd ectopic expression induces a moderate expansion of the CA primordia while resulting in a much larger increase of the PG primordium (Fig. 5D-E). This expansion occurs mostly in the anterior and posterior axis from cells where the Hh and the Wnt pathways are normally blocking sna-rg expression, while expansion is less noticeable in the dorso-ventral axis. This indicates that most of the antero-posterior intrasegmental inputs provided by the segment polarity genes converge on Upd transcription but that the dorso-ventral information is registered downstream of Upd.”

      The differential response of sna-rg to Upd activation in the maxillary and labial segments is also mentioned in the Fig. 5 legend (see Continuation comment d).

      Continuation comment d) “It looks to me also like the vvl domain is expanding as well. This information should be clarified.”

      RESPONSE: Yes, ectopic upd expression also expands vvl1+2 expression. We have previously published that vvl1+2 is a direct target of JAK/STAT signalling in the trachea (Sanchez-Higueras 2019 and Sotillos et al. 2010 Dev.Biol). Although vvl1+2 expands dorsally in the Mx, those cells do not activate sna-rg dorsally. The ventral restriction of sna-rg in the maxilla is controlled by Dfd while in the labium its dorsal expression depends on Scr. We explain this in the Fig. 5 legend, where we now say (underlined text is new to this revised version):

      (C) Ectopic Upd expression driven with sal-Gal4 induces ectopic sna-rg and vvl1+2 expression in the gnathal segments, which for sna-rg is more pronounced in the labium than in the maxilla. Note that in the maxillary segment Upd can induce ectopic dorsal vvl1+2 but not sna-rg expression; this is expected as Dfd only induces sna-rg ventrally in the maxilla. (D-E) sna-rg-GFP embryos stained with anti-GFP (green) and anti-Sal (red). In control embryos (D) Sal labels the PG primordium but not the CA. In arm-Gal4 embryos ectopically expressing Upd, the PG is more expanded than the CA as shown by the number of cells co-expressing Sal and GFP.

      Specific Comment e) The authors note a difference between CA and PG in the requirement for STAT binding sites in the enhancers. Is that related to the fact that svp is expressed in CA and sal is expressed in PG? Would driving svp expression using the sal-Gal4 driver maintain sna-rg expression?

      RESPONSE: During our preliminary ongoing experiments on sna maintenance mechanisms we looked in svp mutants and did not notice a change in sna-rg expression, thus it is unlikely that Svp is responsible for the difference. As said above, we continue looking for genes involved in gland formation. Sal could be involved in the maintenance of sna in the PG, but as Sal is expressed in the maxillary and labial segments before gland formation, it is difficult to disentangle if Sal is required for sna activation or maintenance (or both).

      Specific Comment f) Do svp or sal have a role in initiating sna expression when upd is present or maintaining sna expression after upd disappears? Presumably there is already published data that would answer these questions.

      RESPONSE: As explained above, we did not find any effect of svp on activation of sna-rg; however, we find that in sal mutants the labium does not express sna-rg. This shows that sal is likely to be another positive input. As in sal mutants both trh and Ubx become ectopically expressed in the Lb (Casanova 1989, Roux's Archives of Developmental Biology 198:137-140; Castelli-Gair 1998, IJDB 42:437-444), we have done the experiment in sal trh double mutants and in sal Ubx,abdA,Abd-B mutants. In both cases we still see a failure of sna activation in the Lb, reinforcing the idea that Sal is an additional positive input. However, we prefer not to add the sal experiments as they would complicate the paper, which currently focuses on the similar requirement of the Wnt, Hh and JAK/STAT signalling pathways.

      Reviewer 3

      Reviewer 3 is very critical. We accept some of the points raised and have modified the manuscript accordingly. However, as we detail below, the most serious criticisms are incorrect and do not affect the conclusions reached by our work.

      We agree with the following comment:

      “In the Dfd Scr double mutant, both the CA and PG expression of the snail-rg-GFP reporter is still there - admittedly, the gland cells look abnormal at late stages, but this reporter that is supposed to function as a proxy for gland induction is still expressed. That either means that expression of sna-rg-GFP is not a proxy or that the glands are still being specified in the absence of the Hox genes that are proposed to specify these organs. The reporter should not be expressed if these Hox genes are what specify these endocrine organs.”

      RESPONSE: The reviewer has made a good observation. The expression of sna-rg-GFP is not completely absent in Dfd Scr mutant embryos (Fig. 5F in this revised version), which indicates that although the Hox genes are required to activate upd in the maxilla and labium and in their absence the gland primordia become apoptotic, there must be other positive inputs to the enhancer. However, this does not mean the Hox gene input is irrelevant for gland specification. Not only are the Hox genes required to maintain normal levels of upd expression in the Mx and Lb primordia and gland viability, but we also previously showed that cephalic Hox genes influence the dorso-ventral position inside the vvl1+2 expressing cells where the sna-rg enhancer is activated: in the maxilla Dfd induces the ventral vvl1+2 expressing cells to activate sna-rg, while in the labium Scr induces the dorsal vvl1+2 cells to activate sna-rg (Sanchez Higueras 2014). The data presented in this paper indicate that the inputs of both Dfd and Scr on sna-rg CRM activation are indirect.

      As a result of the reviewer’s criticism, we have tested if the additional positive input could be provided by Ci. In our previously submitted version, we showed that the repressor form of Ci blocks sna-rg activation. In this revised version, we have tested the effect of expressing the activator form of Ci. In embryos overexpressing the activator CiPKA isoform, we have observed that the expression of sna-rg and upd is expanded, indicating that Ci can provide the additional Hox-independent positive input. In the revised version we present these new results as Fig. 3G and Fig. 4I. We have accordingly modified the scheme that appears in panel 3I to include this. In the main text we describe the result in the Hh regulation section where we have added:

      “Although the above results indicate Ci is not absolutely required for sna-rg expression, we observed that overexpression of CiPKA, the active form of Ci, causes a non-fully penetrant expansion of sna-rg expression (Fig. 3G) suggesting the possibility that sna-rg may be responsive to Ci and to a second activator.”

      … and in the “Regulation of Upd ligand expression by the Wg and Hh pathways” section

      where we say:

      “We also found that ectopic expression of the activator Ci protein results in a non-fully penetrant expansion of upd expression in stage 10 embryos (Fig. 4H-I).”

      We have also modified the final scheme in Fig. 7 to mention that Dfd and Scr prevent the apoptosis of the gland primordia, and that there must be an additional positive input controlling upd activation besides the Hox input. However, in the figure we do not define Ci as the activating input as we would like to have additional evidence before making such a claim.

      To clarify that the Hox input is not absolutely required we have modified the text in several places. Where we said:

      “Expression of the sna-rg reporter in the maxilla and the labium requires Dfd and Scr function …”

      We now say:

      “Development of the CA and PG and normal expression of the sna-rg reporter in the maxilla and the labium require Dfd and Scr function …”

      We also mention this in the Fig. 5 legend, where we have added:

      “In Dfd Scr mutant embryos (F), although the gland primordia become apoptotic, residual GFP expression indicates that there must exist Hox independent inputs activating the sna-rg enhancer.”

      As a result of reviewer 3’s comment, we have noticed a further example of similarity between the gland and the trachea specification, which we have commented on in the revised discussion, where we added the following paragraph:

      “Another interesting similarity between glands and trachea is that, although ectopic Hox gene expression can ectopically induce sna-rg and trh outside their normal domain, the lack of Hox expression does not completely abolish their endogenous expression, indicating that in both cases a second positive input can compensate for the absence of Hox mediated activation. Our results suggest that, in the glands, this redundant input could be provided by the activating Ci form (Figs. 3G and 4I), but further analysis to confirm this possibility and discard alternative sna-rg activators should be performed.”

      We disagree with the following comments:

      The finding that the CA and PGs form in slightly different DV positions from each other and slightly different DV positions from the trachea (based on the vvl1+2 mCherry reporter staining combined with that of the sna-rg-GFP reporter staining in Figure 5A, where staining does not overlap except where the CA cells have started to migrate over the vvl1+2 mCherry expressing cells) argues pretty strongly against the CA and PG being homologous to each other or absolutely homologous to the trachea primordia

      RESPONSE: This erroneous claim was based on Fig. 5A, which showed a double-stained embryo in which co-expression is difficult to appreciate without separating the channels. Co-expression of these two reporter lines in the ring gland has been previously documented beyond doubt in our 2014 publication, cited throughout the manuscript, where we presented eight different panels of glands clearly co-expressing both markers at various developmental stages (Current Biology 2014 Fig.2B-I). To prevent any readers reaching the same conclusion as the reviewer, we have modified Fig. 5A to show a double-stained sna-rg-GFP vvl1+2-mCherry embryo alongside the two separate channels (panels 5A’ and A’’) to make the co-expression evident.

      Although we are not including it in this manuscript, the reviewer will also be able to find images in the same 2014 Current Biology publication (Fig.3), where the ectopic activation of Dfd in the trunk leads to the activation of the sna-rg-GFP reporter in the vvl1+2 tracheal cells, proving that the glands and the trachea are formed at homologous positions.

      Having made clear that sna-rg activation in both the CA and the PG occurs in vvl1+2 expressing cells, we now refute a second criticism: The reviewer is puzzled that despite the glands being formed at different dorso-ventral positions in the vvl1+2 expressing patch of cells, we claim both groups of cells are homologous to the trachea.

      We are not saying that the CA are formed at homologous positions to those giving rise to the PG. What we say is that both the CA and the PG are formed at positions homologous to those giving rise to the trachea in the trunk segments.

      To make this clear in the revised version, we have changed the wording of a sentence in the Introduction section that might have originated the confusion.

      Instead of saying:

      “First, the CA, the PG and the traqueal primordia are specified in the lateral ectoderm at homologous positions”.

      Now, it reads:

      “First, the CA and the PG are specified in the cephalic lateral ectoderm at homologous positions to those forming the tracheal primordia in more posterior trunk segments.”

      It has been shown that each tracheal primordium (which is labelled by vvl1+2-mCherry) gives rise to different tracheal branches depending on the positions where they are specified: the dorsal cells give rise to the dorsal tracheal branches, the ventral cells to the ganglionic branches, the medial cells to the dorsal trunk etc. (for illustration see Fig.12 in Manning and Krasnow 1993). Each of these tracheal branches has a different shape and migrates to a different position. We believe that a similar positional specification occurs in the vvl1+2 cells in the maxilla and the labium. In the maxilla only the vvl1+2 ventral cells activate sna and svp (among other genes) to give rise to the CA. In the labium vvl1+2 dorsal cells activate sna, sal, phm (among other genes) to give rise to the PG. This regionalization is similar to what happens during tracheal branch specification, with the only difference that the interaction with Dfd and with Scr is what makes the positional outcome in the maxilla and the labium different (see in our Current Biology 2014 publication Fig. 3E-F and H-J). Thus, when the reviewer considers the equivalence between the CA/PG/trachea homology with that of the wing/haltere or that of the thoracic leg1/2/3 saying: “Indeed, the situation with these endocrine glands and the trachea is completely unlike the situation with the wing and haltere, wherein both structures arise from the same DV position in adjacent segments, or with legs 1, 2 and 3, which arise from the same DV position in adjacent segments”

      …the reviewer should think about the coxa and the tarsi in the legs. The coxa in T1 is not homologous to the tarsi in T2 or T3, but when considering the leg structure as a whole, the coxa and the tarsi form part of the same homologous structure in T1, T2 and T3 despite being formed at different positions inside the leg primordia.

      The reviewer also doubts that the activation of upd occurs in the sna-rg primordium when saying: “Likewise, the STAT10X-GFP staining does not overlap with the sna-rg-mCherry staining (I see red cells and I see green cells - there are no yellow cells). If activation of snail is through Upd activation of STAT signaling, we should see that the snail reporter expression is within the domain of STAT10X-GFP expression.”

      RESPONSE: This is due to the fact that upd activation in the CA is extremely transient, leading to the loss of the x10STAT-GFP expression before the sna-rg-mCherry levels are robust enough in the maxilla. This criticism does not apply to the PG where due to upd expression lasting longer, co-expression of sna-rg-mCherry and x10STAT-GFP in panel 4B should be evident to the reviewer.

      To try to sort out the CA co-expression problem, we are currently repeating the experiment, but instead of analysing sna-rg-mCherry activation with the RFP antibody, we will do an mcherry RNA in situ. We hope that the mcherry transcript will be detectable earlier than the protein and that the co-expression will be evident.

      We strongly disagree when the reviewer says: “This paper provides a strong basis for arguing that the CA and PG are induced independently of Jak/Stat signaling, whereas trachea require this signaling pathway.”

      RESPONSE: When making this claim, the reviewer is ignoring a large number of experiments presented in the manuscript. If the CA and the PG are induced independently of JAK/STAT signalling:

      (1) Why does sna-rg expression disappear from the glands in mutants lacking the Upd ligands (Fig. 5B and 6K)?

      (2) Why does deleting the region containing the putative STAT binding sites in the sna-rg enhancer cause the loss of enhancer expression (Fig. 6C)?

      (3) Why does the smaller enhancer mentioned in point (2) recover gland expression when a STAT binding site from an unrelated gene is added (Fig. 6G)?

      (4) Why is the regained expression of the construct mentioned in (3) lost upon mutation of two bases affecting this single STAT site (Fig. 6H)?

      The reviewer’s conclusion rests on giving excessive importance to his reservations about CA co-expression in panel 4A while, surprisingly, disregarding the co-expression in the PG shown in panel 4B and all the experiments presented in Fig. 5 and Fig. 6.

      Reviewer 3 Minor comments: RESPONSE: Both comments have been taken into account in the revised version.

      In summary, in this revised version we have answered most queries raised by reviewers 1 and 2. Moreover, reviewers 1 and 2 agree that the results presented in this manuscript reinforce the hypothesis that the CA and the PG glands and the trachea derive from the divergent evolution of a metamerically repeated homologous organ.

      Reviewer 3 has made a good point that we have taken into account and has improved the revised submission.

      However, reviewer 3 is wrong when concluding:

      “This paper provides a strong basis for arguing that the CA, PG and trachea are not homologous structures”, and when saying: “the CA and PG are induced independently of Jak/Stat signaling, whereas trachea require this signaling pathway”.

      As we argue above, these conclusions are erroneous because:

      (1) They are based on an incorrect interpretation of Fig. 5A and ignore previously published evidence cited throughout the manuscript.

      (2) They do not take into account key experiments presented in this work, while giving too much weight to a result that can be easily explained.

      (3) They misinterpret the arguments justifying the positional homology between the CA/PG glands and the trachea primordia.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This paper focuses on the specification of two endocrine glands that form from head ectoderm, the corpora allata (CA), which forms in the maxillary segment and secretes Juvenile hormone, and the prothoracic glands (PG), which form in the labial segment and secrete Ecdysone. Secretion of both hormones results in a larval molt. Secretion of only Ecdysone induces metamorphosis, the transition of the larvae into the adult forms. Both the CA and PGs form in positions homologous to the tracheal primordia (approximately) and previous reports indicate that ectopic expression of the appropriate Hox genes can result in homeotic transformations of the glands into tracheal primordia and of tracheal primordia into glands. Using a GFP reporter construct for the snail gene as a proxy for gland specification, the authors show that CA and PG formation is regulated by two segment polarity genes: Hh and Wnt, with Hh signaling activating reporter gene expression and Wnt signaling inhibiting reporter gene expression. They also suggest that their endocrine gland GFP reporter is regulated by the two Hox proteins expressed in those segments: Dfd (maxillary) and Scr (labial) (although figure 5D,E argue against this conclusion). They presumably show that reporter gene regulation by Wnt signaling and Hh signaling is indirect and through localized transcriptional activation of the JAK/STAT signaling pathway ligand gene upd (however, the STAT reporter and the snail reporter are expressed in different cells (fig 4B) - so I'm not so convinced of this conclusion). The authors also find that the CA and PG primordia form at slightly different dorsal ventral positions and that DV positional information is controlled downstream of upd JAK/STAT signaling.

      Major comments:

      The paper is well written and makes for a nice story, but the corresponding data are not supportive of most of the conclusions drawn by the authors.

      First, in the Dfd Scr double mutant, both the CA and PG expression of the snail-rg-GFP reporter is still there - admittedly, the gland cells look abnormal at late stages, but this reporter that is supposed to function as a proxy for gland induction is still expressed. That either means that expression of sna-rg-GFP is not a proxy or that the glands are still being specified in the absence of the Hox genes that are proposed to specify these organs. The reporter should not be expressed if these Hox genes are what specify these endocrine organs. This finding might explain why mutating the Hox consensus binding sites had no effect on expression of the smaller snail reporters.

      The finding that the CA and PGs form in slightly different DV positions from each other and slightly different DV positions from the trachea (based on the vvl1+2 mCherry reporter staining combined with that of the sna-rg-GFP reporter staining in Figure 5A, where staining does not overlap except where the CA cells have started to migrate over the vvl1+2 mCherry expressing cells) argues pretty strongly against the CA and PG being homologous to each other or absolutely homologous to the trachea primordia. Likewise, the STAT10X-GFP staining does not overlap with the sna-rg-mCherry staining (I see red cells and I see green cells - there are no yellow cells). If activation of snail is through Upd activation of STAT signaling, we should see that the snail reporter expression is within the domain of STAT10X-GFP expression. This would be consistent with observing a loss of upd mRNA in the maxillary and labial segments with loss of Dfd and Scr, but not seeing a loss of the sna-rg-GFP reporter. This would also argue against the proposed homology between the glands and the trachea. Indeed, the situation with these endocrine glands and the trachea is completely unlike the situation with the wing and haltere, wherein both structures arise from the same DV position in adjacent segments, or with legs 1, 2 and 3, which arise from the same DV position in adjacent segments. This paper provides a strong basis for arguing that the CA, PG and trachea are not homologous structures and that the CA and PG are induced independently of Jak/Stat signaling, whereas trachea require this signaling pathway.

      Minor comments:

      Page 3: tracheal is misspelled in the first paragraph, line 3.

      Page 5, end of first sentence in first full paragraph: "lethal" should be changed to "non-viable". I think the authors mean that homozygous embryos die, not that they cause the death of other life forms.

      Significance

      Nature of significance of advance:

      I think the significant finding is that the CA, PG, and trachea are not homologous structures. But that is not what the authors are concluding. The only findings consistent with the data provided are that Wg signaling represses expression of the snail reporter and Hh signaling activates its expression (Figures 1 - 3). Most of the other conclusions do not seem to be sufficiently supported by the data.

      Context of the work:

      These authors have published that the CA and PG are structures specified in homologous positions to the trachea. It has already been published that CA, PG and trachea primordia express the Vvl transcription factor - although I did not go back to see how that was determined. It has already been published that ectopic expression of specific Hox genes can transform the gland primordia into trachea and vice versa (these experiments may also warrant a closer look). So, the idea that CA, PG and TR arose from divergent evolution of a segmentally repeated ancient structure has been proposed.

      Best target audience:

      With the findings that are consistent with the story line (figures 1 - 3), Drosophila embryologists working on the formation of these glands would be interested.

      My field of expertise:

      Drosophila development.

    1. Indie sites can’t compete with that. And what good is hosting and controlling your own content if no one else looks at it? I’m driven by self-satisfaction and a lifelong archivist mindset, but others may not be similarly inclined. The payoffs here aren’t obvious in the short-term, and that’s part of the problem. It will only be when Big Social makes some extremely unpopular decision or some other mass exodus occurs that people lament about having nowhere else to go, no other place to exist. IndieWeb is an interesting movement, but it’s hard to find mentions of it outside of hippie tech circles. I think even just the way their “Getting Started” page is presented is an enormous barrier. A layperson’s eyes will 100% glaze over before they need to scroll. There is a lot of weird jargon and in-joking. I don’t know how to fix that either. Even as someone with a reasonably technical background, there are a lot of components of IndieWeb that intimidate me. No matter the barriers we tear down, it will always be easier to just install some app made by a centralised platform.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Klein and colleagues generate an ES cell model system with inducible FACT depletion to understand how loss of FACT affects gene regulation in ES cells. They find that FACT is critical for ES cell maintenance through multiple mechanisms including direct regulation of key pluripotency transcription factors (Sox2, Oct4, and Nanog), maintaining open chromatin at enhancers, and regulated enhancer RNA transcription. The paper is well-written, the experiments are generally well-controlled and appropriately interpreted and placed within the context of the field.

      We appreciate the Reviewer’s support of this manuscript.

      Major comments: 1. In general, the ChIP-seq and CUT&RUN data are not that similar. Although correlation seems reasonable (S2A), looking at the heatmaps in S2B/C these seem pretty different. It's not very clear if this is a case where CUT&RUN has higher specificity (and signal-to-noise, which is very clear from example tracks) or if these two methods are picking up biologically different sites. Could the authors include some overlap analysis of peaks and comment on these discrepancies? Looking at the example tracks in Figure 2B, it seems likely that prior SPT16 and SSRP1 ChIP-seq were relatively high-noise.

      We have identified overlapping peaks between the two techniques, and while CUT&RUN identified substantially more peaks overall, the percentage of peaks shared between datasets was relatively consistent (1-6% of total) between the individual ChIP-seq datasets and the CUT&RUN dataset (Response Figure 1). We note that the biological classes identified through all datasets were remarkably consistent (Fig. 2D), and therefore attribute the discrepancies to the greater number of reproducible peaks called from CUT&RUN data. As discussed in the paper, peak calling algorithms designed for the specific data types were used, and therefore peak calling could also contribute to differences.

      Response Figure 1. ChIP-seq and CUT&RUN peak overlap. Pie chart depicting the unique and overlapping peaks called from V5-SPT16 CUT&RUN data and FACT ChIP-seq data. These data are included in the revised manuscript (as a new Figure panel 2E). Peaks must have been identified in at least two technical or biological replicates.
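
      The kind of peak-overlap comparison described in this response can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the peak coordinates are invented, the function names `overlaps` and `shared_fraction` are hypothetical, and a peak is assumed to count as shared when it overlaps a peak in the other set by at least one base.

```python
from collections import defaultdict

# Peaks are (chrom, start, end) tuples with half-open coordinates, as in BED files.
def overlaps(a, b):
    """Return True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def shared_fraction(query, reference):
    """Fraction of query peaks that overlap at least one reference peak."""
    by_chrom = defaultdict(list)
    for peak in reference:
        by_chrom[peak[0]].append(peak)
    shared = sum(
        1 for q in query if any(overlaps(q, r) for r in by_chrom[q[0]])
    )
    return shared / len(query) if query else 0.0

# Toy coordinates, invented for illustration only.
cut_and_run = [("chr1", 100, 200), ("chr1", 500, 650),
               ("chr2", 50, 120), ("chr2", 900, 990)]
chip_seq = [("chr1", 180, 260), ("chr3", 10, 90)]

print(shared_fraction(cut_and_run, chip_seq))  # share of CUT&RUN peaks also seen by ChIP-seq
print(shared_fraction(chip_seq, cut_and_run))  # share of ChIP-seq peaks also seen by CUT&RUN
```

      Because the two fractions use different denominators, a technique that calls many more peaks will show a lower shared percentage even when most peaks from the smaller set are recovered.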

      Are motifs described in Figure 2E CUT&RUN only, and do prior ChIP-seq experiments also identify these motifs?

      The motifs shown in Figure 2E (now 2F) are indeed CUT&RUN peaks only. We were unable to confidently assign enriched motifs to the ChIP-seq datasets (the most enriched motifs were approximately p = 10^-18). By analyzing all SPT16 ChIP-seq peaks, rather than only intersected SPT16 ChIP-seq peaks, we were able to identify motifs recognized by two of the top three CUT&RUN motif hits (SOX2 and OCT4/SOX2/TCF/NANOG); however, enrichment was quite poor (p = 10^-3). By limiting the analysis to intergenic regions, we were able to identify strong enrichment for motifs recognized by CTCF and BORIS (p = 10^-58 and 10^-51, respectively). As validation, we also called motifs from peak files published as supplementary material to the original Tessarz lab manuscript but were still unable to confidently call motifs (all p > 10^-7 for SPT16 peaks, p > 10^-15 for SSRP1 peaks). Related to major comment 1, we suspect that the weak motif enrichment is due to high background in ChIP-seq datasets compared to CUT&RUN datasets.

      The authors state that FACT depletion affects eRNA transcription and measured this using TT-seq. The analysis in Figure 3B seems to be all the different types of sites looked at together (genes, PROMPTs, etc). Is there evidence that eRNAs specifically are regulated by FACT loss?

      We apologize for the confusion and have clarified that Figure 3B (now 3A) is referring to mRNAs only in the text and figure. Our analysis of eRNA regulation by FACT is predominantly contained within Fig. 4B (TT-seq from DHSs, but no histone mark overlap assessment), Supp Fig. S4 (as in Fig 4B, but at DHSs overlapping H3K27ac or H3K4me1), Fig. 5E (FACT localization to putative enhancers, defined as in S4), and Fig. 6D (ATAC-seq demonstrating loss of accessibility at putative enhancers upon FACT depletion). Based on these results, we believe there are many eRNAs specifically misregulated by FACT loss and that potential direct targets (based on change in depletion and containing FACT binding) are in Fig 5E.

      Could these be compared to DHS sites that lack FACT binding to support a direct role for FACT at these sites?

      We appreciate the suggestion and have performed this analysis (see Response Figure 2). Relatedly, we analyzed putative silencers, defined as DHSs marked by H3K27me3, for FACT binding and expression changes (measured by TT-seq) following FACT depletion (Supp Fig. S7). As expected, FACT does not bind these putative silencer DHSs and transcription does not markedly increase or decrease from these regions after FACT depletion. Complicating the matter, FACT binds at many DHSs, even those that did not meet our stringent peak-calling criteria (see Response Figure 2, middle cluster).

      Response Figure 2. Overlap between FACT binding sites and gene-distal DHSs. Individual clusters are sorted by V5-SPT16 binding. Clusters were assigned based on direct overlap between called V5-SPT16 peaks and assigned gene-distal DHSs. Overall, 17.6% of DHSs overlapped a FACT peak identified in at least one CUT&RUN replicate (8.5% of DHSs overlapped a peak present in multiple replicates).

      One mechanism proposed for how FACT regulates enhancers is that it is required for maintaining a nucleosome-free area, and when FACT is depleted nucleosomes invade the site (Figure 7). It wasn't clear if they compared distal DHS sites where FACT is normally bound to those without FACT binding in the MNase experiments, which could help support the direct role or specificity of FACT in regulating those enhancers (or a subset of them).

      We have subset the V5-SPT16 CUT&RUN peaks and distal DHSs into groups and have identified increased nucleosome occupancy after depletion at both FACT-bound and FACT-unbound DHSs suggesting both direct and indirect regulation (Fig. 6A, D). There is disruption to nucleosome arrays at non-FACT-bound DHSs (although more modest relative to the FACT bound locations), and therefore we speculate that a nucleosome remodeler is involved downstream of FACT (possibly CHD1, per recent work out of Patrick Cramer and François Robert’s labs, among others).

      1. Data quality for nucleosome occupancy was a little strange (Figure 7F), where the two clones had very different MNase patterns at TSS sites. Could the authors comment on why there is such a strong difference between clones here.

      We agree that the trends identified by visualizing differential MNase-seq signal near TSSs do not fully replicate; however, in examining the nondifferential MNase-seq heatmaps, we see a more expected distribution (see new Figure 7A). Per our newly-added Supp Fig. S9B, all MNase-seq replicates had a pairwise Pearson correlation value of at least 0.73 (SPT16-depleted clone 1/rep 1 vs untagged rep 3), and the vast majority of samples had pairwise correlations of above 0.85, suggesting that these discrepancies are not due to strong differences in sequencing depth or MNase-protected regions. We therefore suspect that the clonal distinctions are a result of different background occupancy of nucleosomes near the TSS, resulting in an array with increased occupancy in one clone and more generalized increased occupancy in the other clone. We also added the MNase-seq data over TSSs in a non-differential form in Fig 7A, and believe the difference between the clones is due to the differential analysis, and have commented accordingly in the revised manuscript.
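
      The replicate comparison reported here (pairwise Pearson correlation across MNase-seq samples) can be sketched as below. This is an illustration only: the sample names and binned coverage values are invented, and the helper names `pearson` and `pairwise_correlations` are hypothetical rather than taken from the authors' code.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def pairwise_correlations(samples):
    """Map each unordered pair of sample names to its Pearson correlation."""
    return {
        (a, b): pearson(samples[a], samples[b])
        for a, b in combinations(sorted(samples), 2)
    }

# Toy binned coverage values, invented for illustration only.
samples = {
    "untagged_rep1": [10, 12, 30, 8, 5, 22],
    "depleted_clone1_rep1": [11, 15, 26, 9, 7, 20],
    "depleted_clone2_rep1": [9, 13, 33, 7, 4, 25],
}

corr = pairwise_correlations(samples)
lowest_pair = min(corr, key=corr.get)
print(lowest_pair, round(corr[lowest_pair], 3))  # the least similar pair of samples
```

      Reporting the minimum over all pairs, as the response does, gives a conservative summary of how well the replicates agree genome-wide.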

      More details on some of the analysis steps would be really helpful in evaluating the experiments. Specifically, was any normalization done other than depth normalization? I ask this because the baseline levels for many samples in metaplots look quite different. For example, see Figure 7B where either clone 1 has a globally elevated (at least out 2kb) ratio of nucleosome in the IAA samples relative to the EtOH, or there is some technical difference in MNase. One suggestion is to look at methods in the CSAW R package to allow TMM based normalization strategies which may help.

      We appreciate the suggestion – we have expanded our explanation of normalization methodology in the paper. We initially used quartile and RPGC normalizations to attempt to mitigate technical differences in MNase-seq data. Size distribution plots did not suggest differences in MNase digestion between samples, and neither quartile/RPGC nor TMM-based normalization fully resolved this issue. Because our ATAC-seq datasets agree with the general trends identified by MNase-seq (which are consistent, despite technical differences between clones), we do not believe that the differences constitute true biological difference, but rather experimental noise.
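
      For readers unfamiliar with the depth-based normalizations mentioned in this response, the arithmetic behind a 1x-coverage (RPGC-style) scale factor can be sketched as below. This is a sketch under stated assumptions, not the authors' code: the function name `rpgc_scale_factor` is hypothetical, the read counts and fragment length are invented, and the effective genome size of 2.65 Gb is only an approximate figure for mouse.

```python
def rpgc_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Factor that rescales raw coverage so the genome-wide mean is about 1x.

    Mean raw coverage is (mapped_reads * fragment_length) / effective_genome_size,
    so multiplying coverage by the inverse puts every sample on a common 1x scale.
    """
    if mapped_reads <= 0 or fragment_length <= 0 or effective_genome_size <= 0:
        raise ValueError("all inputs must be positive")
    return effective_genome_size / (mapped_reads * fragment_length)

# Assumed, approximate effective genome size for mouse (illustrative value).
GENOME = 2_650_000_000

# Two hypothetical libraries that differ only in sequencing depth.
shallow = rpgc_scale_factor(20_000_000, 150, GENOME)
deep = rpgc_scale_factor(40_000_000, 150, GENOME)
print(round(shallow, 4), round(deep, 4))
```

      A single global factor of this kind corrects for sequencing depth only. It cannot remove a baseline shift that arises from differences in signal-to-noise between samples, which may be one reason such normalizations did not fully resolve the difference between clones.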

      1. I appreciated the speculation section, and the possible relationship between FACT and paused RNAPII is interesting. While further experiments may be outside the scope of this work and I am not suggesting they do them, I am wondering if others have information on locations of paused RNAPII in ESC that would allow them to test if genes with paused RNAPII have a special requirement for FACT that they could use their current data to assess.

      We agree that experiments to test the relationship between paused RNAPII and FACT are an intriguing next step, and plan to dissect those in the near future.

      Minor comments: 1. When describing the peaks found in the text related to Figure 2 they refer to 'nonunique' peaks. Does this mean the intersection of the independent peak calls? Could they clarify this.

      We apologize for the confusion and have clarified in the text that nonunique peaks does indeed refer to the intersection of independent peak calls (now specified on manuscript page 8, line 15).

      In the text they refer to H3K56ac data in S2D and I don't see that panel. The color scheme for the 1D heatmaps (Figure 5A) is tough to appreciate the differences. I'd suggest something more linear rather than this spectral one might be easier to see.

      We apologize for the confusion and have removed the remaining H3K56ac-related data and references in the text. We appreciate the suggestion regarding the 1D heatmap color scheme and have adjusted the colors to a linear (white → red) scheme.

      For the 2D heatmaps of binding, could they include the number of elements they are looking at for each group?

      We appreciate the suggestion and have included numbers of elements visualized wherever applicable in the figure panels and legends.

      1. Also for 2D heatmaps, I think the scale is Log2 (IAA/EtoH), but could they confirm that and include it in the figure?

      We apologize for the confusion; the only heatmaps displaying log2(IAA:EtOH) are those in Fig. 6; for those panels, we have clarified the scale in the figure and legend.
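
      The log2(IAA:EtOH) scale referred to here can be illustrated with a short sketch. The pseudocount of 1 and the signal values are assumptions made for the example, and `log2_ratio` is a hypothetical helper name, not a function from the authors' analysis.

```python
from math import log2

def log2_ratio(treated, control, pseudocount=1.0):
    """Per-bin log2(treated / control), with a pseudocount to avoid division by zero."""
    if len(treated) != len(control):
        raise ValueError("treated and control must have the same number of bins")
    return [
        log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(treated, control)
    ]

# Toy normalized signal per bin, invented for illustration only.
iaa = [3.0, 7.0, 0.0, 15.0]   # FACT-depleted (3-IAA treated)
etoh = [7.0, 7.0, 1.0, 3.0]   # vehicle control (EtOH treated)

print(log2_ratio(iaa, etoh))  # [-1.0, 0.0, -1.0, 2.0]
```

      On this scale, negative values mean signal was lost after depletion, zero means no change, and positive values mean signal was gained, which is why stating the scale in the figure matters for reading the heatmaps.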

      Reviewer #1 (Significance (Required)):

      • The use of degrader-based approaches to depleting a protein allows refined kinetic and temporal assays, which I think are important. Several papers showed a rapid invasion of nucleosomes after SWI/SNF loss using these kinds of approaches and revealed surprisingly fast replacement of SWI/SNF. This paper is consistent with those models, showing that another remodeler behaves the same, suggesting there may be general requirements for active chromatin remodeling to maintain the expression of these genes. It also highlights a key gap: how specificity works to target these enzymes remains somewhat unknown.

      • This work will be of interest to those studying detailed mechanisms of gene regulation. Compared to some other chromatin regulators, FACT is understudied and so this work will allow comparison between different chromatin remodeling complexes.

      • My experience: chromatin, gene regulation, cancer, genomics

      We appreciate the thorough review and hope that we have sufficiently addressed your concerns.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors propose that the FACT complex can regulate pluripotency factors along with their regulatory targets through non-genic locations. They find that acute depletion of FACT leads to a "reduction" in pluripotency in mouse embryonic stem cell by disrupting transcription of master regulators of pluripotency. They also show FACT depletion affected the transcription of gene distal regulatory sites, but not silencers. They also stated that SPT16 depletion resulted in both, a reduction of chromatin accessibility and increase of nucleosome occupancy over FACT bound sites.

      Overall the study appears technically well executed. The use of an Auxin induced depletion system is a good model to study the acute effects of FACT depletion. However, I have a number of concerns relating to specificity and interpretation of the results that need to be addressed.

      We appreciate the careful review and have addressed your comments below:

      Major points o Authors claimed that depletion of the FACT complex "triggers a reduction in pluripotency". As evidence supporting this statement they present images of alkaline phosphatase assays of a time course performed upon depletion of FACT. These experiments indeed show that ESCs are destabilized in the absence of SPT16. However, some key questions regarding the phenotype remain unresolved: o What are the kinetics of expression of selected naïve pluripotency and early differentiation markers? Are differentiation markers upregulated, consistent with normal differentiation upon FACT depletion?

      We appreciate the suggestion and have emphasized the decrease in pluripotency factor expression, accompanied by an increase in differentiation marker expression across all three germ layers. We graphed 7 pluripotency factors and 7 differentiation markers for each germ layer; generally speaking, pluripotency factors are decreased while differentiation markers are increased (Response Figure 4; pluripotency factors are included in the new Fig. 3B, while differentiation markers are included in the new Supp Fig. S3 F-H).

      We have also performed an immunocytochemistry (ICC) timecourse, per Reviewer 3’s suggestion. This ICC timecourse allows us to orthogonally assess decreased pluripotency factor expression, to pair with the OCT4 Western blot shown in Supp Fig. S1B. These new ICC data are shown in the new Fig. 1D and included here for convenience (Response Figure 5). In addition, we have added alkaline phosphatase staining at 12 hours of depletion to Fig. 1C.

      Response Figure 4. Plots of DESeq2 analysis across experimental timecourse. Shown are lineage markers denoting: A. Pluripotency B. Endoderm C. Mesoderm and D. Ectoderm. Generally, expression of pluripotency factors decreases over time, while expression of differentiation markers of each lineage increases over time. These data are shown in Figure 3B and Supplemental Figure S3F-H.

      Response Figure 5. Immunocytochemistry timecourse depicting DAPI staining (left panels, blue) and OCT4 immunofluorescence (right panels, green). Images are representative of plate-wide immunofluorescence changes.

      o Is only ESC identity affected or does loss of FACT impair viability also of cells that have exited pluripotency? To address this, growth curves and/or cell cycle analysis upon FACT depletion could be performed. Alternatively, the authors could utilize surface markers to distinguish naïve pluripotent from differentiated cells in the cell cycle analysis experiments to identify a potential differential response of pluripotent and differentiated cells to FACT depletion.

      We have performed a growth curve with FACT depletion as suggested; as the two points are related, we will explain further below:

      o Another key question is whether it is only the metastable pluripotent state of ESCs in heterogeneous FCS/LIF conditions which is affected by FACT loss, and whether cells cultured in the more homogeneous and more robust 2i-LIF conditions can tolerate FACT removal. If that is indeed the case it would enable the authors to address one main concern I have with this manuscript, which is that it is nearly impossible to distinguish the direct effect of FACT loss from differences induced by differentiation (and maybe cell death, see comment above). This is a critical concern that needs to be addressed and discussed appropriately.

      We apologize for the confusion; all original experiments for this project were performed in the presence of LIF as well as the GSK3 and MEK inhibitors CHIR99021 and PD0325901, respectively (2i+LIF conditions). To address the reviewer's question, we have now performed a timecourse growth assay under both LIF-only and 2i+LIF conditions (Response Figure 6 and new Supp Fig S1F) and, as suggested by the reviewer, observe a stronger effect of FACT depletion on cell viability in LIF alone (FACT depletion results in ~90% death within ~24 hours, with differences in growth observed by 12 hours) than in 2i+LIF (FACT depletion results in ~80% death within 48 hours, with differences in growth observed starting around 18 hours). Overall, ES cells in LIF alone are indeed more sensitive to FACT loss, supporting our decision to perform the experiments throughout the manuscript in 2i+LIF conditions.

      __Response Figure 6. Growth assays in LIF alone (left) and 2i+LIF (right) conditions.__ Cells were treated with either EtOH or 3-IAA and counted at the indicated times. Viability was assessed using trypan blue exclusion. Error bars indicate standard deviation for biological triplicate experiments.

      o A further major concern is about the specificity of the effect of FACT depletion. The authors claim that FACT is required to maintain pluripotency. From the data presented this is unclear. FACT appears to be part of the general transcription machinery in ESCs. It appears generally associated with active promoters and active genes, according to the data in this manuscript. Whether there is any specific link to pluripotency remains to be shown. It is unclear how enrichment analyses have been performed. If they haven't been performed using a background list of genes actively transcribed in ES cells, they will obviously show enrichment of ESC specific GO categories, because ESCs express ESC specific genes robustly expressed in ESCs?

      We apologize for the confusion and have updated our methods section to include more comprehensive details on our pathway enrichment analyses. We have confirmed that pluripotency-related categories are still highly enriched in FACT-regulated DEGs, even when using a background dataset of all transcribed genes, per our TT-seq datasets (baseMean ≥ 1 in DESeq2 output).
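      The effect of restricting the background can be illustrated with a one-sided hypergeometric test. What follows is a minimal sketch, not the pipeline used in the manuscript: the function names and gene identifiers are hypothetical, and only the baseMean ≥ 1 cutoff is taken from the response above.

```python
from math import comb


def expressed_background(base_mean, cutoff=1.0):
    """Return the genes whose DESeq2 baseMean meets the expression cutoff."""
    return {gene for gene, value in base_mean.items() if value >= cutoff}


def hypergeom_enrichment_p(background, category, degs):
    """One-sided P(X >= k) for the overlap between `degs` and `category`.

    Both gene sets are first restricted to `background`, so genes that are
    not transcribed cannot inflate the enrichment.
    """
    category = category & background
    degs = degs & background
    n_background, n_category, n_degs = len(background), len(category), len(degs)
    k = len(category & degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_category, i) * comb(n_background - n_category, n_degs - i)
        for i in range(k, min(n_category, n_degs) + 1)
    )
    return tail / total


# Toy input (hypothetical genes): g10 is below the cutoff and is excluded.
base_mean = {f"g{i}": 5.0 for i in range(10)}
base_mean["g10"] = 0.2
background = expressed_background(base_mean)
p = hypergeom_enrichment_p(background, {"g0", "g1", "g2", "g10"}, {"g0", "g1", "g2", "g3"})
print(len(background), p)
```

      Using all annotated genes as the universe would instead count every non-transcribed gene as a "miss", which is the bias the reviewer describes.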

      In line with this: the authors show that FACT-bound loci overlap well with Oct4-bound regions. But what proportion of FACT target loci are actually Oct4-bound too? Is FACT binding exclusive to Oct4-regulated enhancers and promoters? In other words, will FACT be recruited to all actively transcribed genes in ES cells? In that case, a specific effect on pluripotency network regulation cannot be claimed.

      We appreciate the suggestion, and have added the number of OCT4/SOX2/NANOG-bound FACT peaks and vice versa in the text and legend of Fig 3E-F. We have also summarized this information in Response Table 1, below (and included these data as Table 2 in the revised manuscript).

      | | OCT4 peaks | SOX2 peaks | NANOG peaks | Any of OSN |
      | --- | --- | --- | --- | --- |
      | V5 peaks | 8,544 | 5,948 | 5,307 | 9,682 |
      | OSN peaks | 45,476 | 19,211 | 16,817 | 52,899 |
      | % of OSN peaks bound by FACT | 18.33% | 30.72% | 31.40% | 17.91% |
      | % of V5 peaks bound by pluripotency factor(s) | 52.41% | 36.85% | 32.94% | 59.63% |
      | V5-bound promoters | 4,261 | 2,719 | 2,327 | 4,452 |
      | OSN-bound promoters | 6,550 | 1,542 | 666 | 6,948 |
      | V5- and OSN-bound promoters | 2,040 | 801 | 343 | 2,202 |
      | OSN-bound gene-distal peaks | 38,926 | 17,669 | 16,151 | 45,938 |
      | V5-bound gene-distal OSN peaks | 6,504 | 5,147 | 4,964 | 7,480 |

      __Response Table 1. Overlapping CUT&RUN and ChIP-seq peaks shared between OCT4, SOX2, NANOG, and V5-SPT16 under various stratifications.__ Shown are numbers or percentages of peaks overlapping between V5 and OSN. The last column shows peaks containing any of OCT4, SOX2, and/or NANOG. The first four rows include all peaks, regardless of location, and the last five rows are broken down by promoter (as defined by an annotated mRNA) or gene-distal location (defined by a minimum of +/- 1kb from a gene).

      Of the 45,865 OCT4 peaks, 3,688 are located at promoters, and 1,209 of these peaks are bound by V5-SPT16 (32.8%). Similarly, 13,228 of 42,177 gene-distal OCT4 peaks are called as SPT16-V5 peaks in at least one CUT&RUN replicate (31.36%), suggesting a relationship between OCT4 binding and binding of FACT, which has long been associated with genic transcription but has roles extending beyond gene-proximal regulation. We observe similar trends with NANOG and SOX2.
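      For readers who want to see how such overlap percentages are derived, the following is a minimal sketch. It assumes that a peak counts as bound when it overlaps a peak from the other dataset by at least 1 bp; the helper names and toy coordinates are hypothetical, and the actual overlap criteria are those given in the manuscript methods.

```python
from collections import defaultdict


def count_overlapping(query, reference):
    """Count query peaks that overlap at least one reference peak by >= 1 bp.

    Peaks are (chrom, start, end) tuples with half-open [start, end) coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits += 1
    return hits


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Toy peaks (hypothetical coordinates, not real data).
oct4 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
v5 = [("chr1", 150, 250), ("chr2", 400, 500)]
shared = count_overlapping(oct4, v5)
print(shared, round(percent(shared, len(oct4)), 2))

# Recomputing the percentages quoted in the response from its own counts.
print(round(percent(1209, 3688), 1))    # promoter OCT4 peaks bound by V5-SPT16
print(round(percent(13228, 42177), 2))  # gene-distal OCT4 peaks bound by V5-SPT16
```

      The nested scan is quadratic and only suitable for illustration; on genome-scale peak sets an interval tool such as `bedtools intersect` would normally be used instead.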

      o It is disappointing that neither raw data (GEO submission set to private) nor any Supplemental Tables containing differentially expressed transcripts and ChIP or Cut and Run peaks and associated genes were made available. This strongly reduces the depth of review that can be performed.

      We apologize if the reviewer token in the cover letter was not accessible. The GEO datasets (including differentially expressed transcripts, raw fastq files, and analyzed datasets) will be made public upon publication; in the meantime, the GEO entry (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE181624) can still be accessed using the previously provided reviewer token: wvkvwmwynjeffux.

      o To what extent do FACT-bound loci overlap with genes differentially expressed 24h after FACT depletion? This analysis would help determine the direct targets of FACT regulation.

      We appreciate the suggestion. This analysis can be found in the original Figure S6, broken down by FACT-repressed (expression increased upon FACT depletion), unchanged, and FACT-stimulated (expression decreased upon FACT depletion) DESeq2 results (ordered left-to-right, respectively). Figure S6A-C shows that V5-SPT16 binding is enriched at, but not exclusive to, genes with FACT-regulated expression, while Fig. S6D-F shows TT-seq data for each group, sorted by log2-fold change assigned by DESeq2.

      o The paper mainly relies on NGS analysis. Therefore, it is crucial that authors show as Supplemental Material some basic QC of these data. PCA analyses to show congruency of replicates are the minimum requirement.

      We appreciate the suggestion and have included a new Supp. Fig S9, with pairwise comparative Pearson correlation scatterplots and heatmaps for replicates in each dataset, in addition to the scatterplots shown for CUT&RUN and ChIP-seq data in the original Supp Fig. S2A.
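      As an illustration of the replicate comparison, a pairwise Pearson correlation matrix can be computed as sketched below. This is a hedged example only: the sample names and signal values are hypothetical, and the binning and normalization of the real datasets are not specified here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signal vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(samples):
    """Return {(sample_a, sample_b): r} for every ordered pair of samples."""
    names = list(samples)
    return {(a, b): pearson(samples[a], samples[b]) for a in names for b in names}


# Hypothetical per-gene signal for two replicates and one treated sample.
samples = {
    "EtOH_rep1": [10.0, 52.0, 3.0, 80.0, 21.0],
    "EtOH_rep2": [12.0, 49.0, 4.0, 77.0, 25.0],
    "IAA_24h":   [40.0, 11.0, 30.0, 9.0, 60.0],
}
matrix = correlation_matrix(samples)
for (a, b), r in matrix.items():
    print(f"{a}\t{b}\t{r:.3f}")
```

      The resulting matrix is what a correlation heatmap displays; `numpy.corrcoef` gives the same values for array input.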

      o Did the authors perform any filtering for gene expression levels before analysis? Are genes in the analysis robustly expressed in at least one of the conditions?

      We apologize for the confusion. Due to the sensitive nature of TT-seq and the germ layer-inconsistent pattern of cell differentiation following FACT depletion, we did not perform filtering for gene expression prior to any analyses. For the vast majority of genes analyzed, however, we are able to identify transcription via TT-seq, even in those that do not significantly change expression upon FACT depletion (see Supp Fig S6E). As discussed above, we did include a cutoff for expressed genes in our revised pathway analysis.

      o Wherever p values were reported for enrichment analyses, adjusted p values should be used

      We apologize for the oversight; the p values were in fact adjusted p values, and we have updated the text and figures to make it explicit that adjusted p values were used wherever applicable.
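      For context, one common way to adjust p values is the Benjamini-Hochberg procedure. The sketch below assumes that method purely for illustration; the response does not state which correction the enrichment tool applied, and the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p values
print(benjamini_hochberg(raw))
```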

      o I cannot follow the logic used by the authors to explain discrepant results from Chen et al about the role of FACT in ESCs. Chen et al showed that FACT disruption by SSRP1 depletion is compatible with ESC survival and leads to ERV deregulation. The authors of the present study attribute these differences to potential FACT independent roles of SSRP1. However, I would assume that if there are indeed FACT independent roles of SSRP1, then the phenotype of SSRP1 KOs in which FACT and other processes should be dysfunctional should be even stronger than a plain FACT KO. This needs a proper and careful explanation.

      We apologize that our discussion of FACT-independent roles of SSRP1 was not clear and have clarified our wording in the text (page 4, line 49 – page 5, line 4 in the revised manuscript); we intended to reconcile the results of Chen et al. 2020 with Goswami et al. 2022 and Cao et al. 2003; despite SSRP1 knockout viability in embryonic stem cells, SSRP1 knockout is lethal in mice between 5-40 weeks and general SSRP1 knockout is lethal 3.5 days post-conception (per Goswami et al. 2022). We therefore posit that the general requirement for SSRP1 may be due to distinct roles from those carried out by the FACT complex in ES cells, as discussed by Spencer et al. 1999, Zeng et al. 2002, Li et al. 2007, and Marciano et al. 2018.

      We note that our findings are in agreement with papers from the Gurova lab and others in that depletion of mRNA or protein of SPT16 leads to concomitant loss of SSRP1; we therefore do not expect total SSRP1 loss to have a stronger effect than SPT16 depletion. We therefore expect, and confirmed via Western blotting (Figure 1B, Supplemental Figure 1), that depletion of SPT16 leads to loss of both FACT subunits, and therefore all FACT subunit activity, complex-dependent or -independent.

      Also, did the authors observe any evidence for ERV deregulation upon acute SPT16 depletion?

      We did indeed observe ERV deregulation upon SPT16 depletion. When reviewing our TT-seq datasets, 7.1% of ERVs were derepressed, while 2.4% decreased in expression upon 24h FACT depletion (mm10 ERVs sourced from gEVE, Nakagawa and Takahashi, 2016). Further, we identified increased chromatin accessibility after FACT depletion at annotated LTR elements, as shown in the table below (Response Table 2). Here we are displaying the calculated enrichment score for accessibility detected at these locations. A negative value indicates lower accessibility than expected by region size, while a positive score indicates that reads are more enriched than expected at the indicated region.

      ATAC-seq enrichment score for locations losing accessibility with FACT depletion

      | | 3h | 6h | 12h | 24h |
      | --- | --- | --- | --- | --- |
      | LTR Enrichment | -1.445 | -1.299 | -0.917 | -0.559 |
      | Intergenic Enrichment | -6.046 | -4.765 | -3.926 | -2.972 |
      | Promoter Enrichment | 3.335 | 2.789 | 2.726 | 2.233 |

      ATAC-seq enrichment score for locations gaining accessibility with FACT depletion

      | | 3h | 6h | 12h | 24h |
      | --- | --- | --- | --- | --- |
      | LTR Enrichment | -1 | -0.436 | 1.103 | 1.13 |
      | Intergenic Enrichment | -1 | 0.134 | 0.435 | 0.236 |
      | Promoter Enrichment | -1 | -3.585 | 1.171 | 1.39 |

      __Response Table 2. Changes in ATAC-seq peak enrichment for selected regions, annotated via HOMER.__ At regions differentially accessible between SPT16-depleted and SPT16-undepleted samples, regions were assigned to an annotated genomic feature using HOMER annotatePeaks.pl and assigned an enrichment score based on the ratio of ATAC-seq signal to region size. Over time, LTR elements become more enriched among the ATAC-seq peaks both gaining and losing accessibility, indicating a role for FACT in maintaining LTR accessibility.

      We do wish to note, however, that Lopez et al. 2016 identified SPT16-independent regulation of LEDGF/HIV-1 replication by SSRP1, and we therefore cannot rule out effects on ERV dysregulation due to the SSRP1 loss that accompanies SPT16 depletion.

      Minor points o Figure S2A is very small and resolution is low. Page 10: "...while all four Yamanaka factors (Pou5f1, Sox2, Klf4, and Myc) and Nanog were significantly reduced after 24 hours (Fig. 3A, S3A-B)". No data for Myc is being shown.

      We apologize for the figure resolution and have included a larger image. Because pairwise comparative scatterplots are not space-efficient, we opted to display the Pearson correlations for the datasets including more samples (TT-seq and ATAC-seq timecourses) as heatmaps in the new Supp Fig S9. We have added Myc labeling to the volcano plot (now in Fig. 3A) and included a trace of Myc expression over time to the new pluripotency factor graph in Fig. 3B.

      o Are the two bands in the middle of Figure 1B supposed to be a ladder? This should be clarified.

      We thank the reviewer for noticing this and apologize for the oversight.

      o Figure 3C- This Figure is complicated to read. Also, information appears redundant with the Table 1, I recommend to remove this panel.

      We have moved the panel to the supplement (now Supp Fig. S3A). While the information is somewhat redundant with Table 1, we chose to include the former panel 3C as a visual representation of the consistent deregulation over depletion time across transcript categories.

      o Figure 6 and figure 7 could be presented in one single figure since both aspects are complementary and target related aspects.

      While we thank the reviewer for this suggestion, we do not believe that the information contained in Figs. 6 and 7 can effectively be conveyed in a single figure. While both figures focus on chromatin accessibility and nucleosome occupancy, Fig. 6 is designed to address the changes in chromatin accessibility over time, while Fig. 7 is more relevant to the biological mechanism through which FACT co-regulates targets of the core pluripotency network (OCT4/SOX2/NANOG) after 24 hours of depletion.

      o Are the authors certain that the effects observed are directly linked to the FACT complex in contrast to FACT independent roles of SPT16, if any exist? The experiment to address this would be to deplete SSRP1 and investigate whether the effects are identical, which would be the hypothesis to be tested.

      We thank the reviewer for this suggestion. We did attempt to create additional SSRP1-AID-tagged lines; however, generating these lines proved to be technically challenging, and comparison of the FACT-dependent and -independent roles of the individual subunits is beyond the scope of this work. Further complicating the matter, SSRP1 is effectively depleted within 6 hours of 3-IAA addition in SPT16-AID lines due to the interdependence of FACT subunits. We again thank the reviewer for their suggestion and will consider this work for a future study.

      Reviewer #2 (Significance (Required)):

      My expertise is pluripotency and GRNs.

      I would judge the significance of the study as presented as low, mainly because at this moment it remains unclear what FACT indeed does concerning regulation of pluripotency.

      We respect the reviewer’s opinion and hope that our revisions have made more clear how the FACT complex prevents nonspecific differentiation from occurring, thereby maintaining pluripotency and self-renewal in embryonic stem cells. Importantly, neither untagged cells treated with 3-IAA nor tagged cells treated with vehicle display the growth defects, loss of pluripotency factor expression, increased differentiation marker expression, phenotypic evidence of differentiation, and reduced alkaline phosphatase staining that the FACT-depleted cells do, highlighting a key requirement for FACT in pluripotent cells. Beyond this, we believe the novel gene distal regulatory role we have identified for FACT presents an exciting new role for this complex in gene regulation.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Klein, et al. addressed the function of the FACT complex in mouse ESCs, using CUT&RUN, TT-seq, ATAC-seq, and MNase-seq, together with an Auxin-mediated FACT degradation system. The authors first reported that efficient and acute depletion of SPT16 with the Auxin-mediated degradation system resulted in over 5,000 up- and 5,000 down-regulated genes within 24 hours, including down-regulation of pluripotency genes. Then, they demonstrated that many FACT binding sites overlap with Oct4, Sox2, and Nanog binding sites by Cut&Run, and that those loci increase nucleosome occupancy 24 hours after removal of FACT.

      The Auxin-mediated degradation system seems to be working very well (while I would like to see an overexposed version of Western blotting), and efficient degradation might explain the different phenotypes from the previously reported phenotypes by shRNA and the chemical inhibitor, which might not deplete FACT function completely and/or might have off-target effects. The Cut&Run data also have much sharper peaks than previously reported SSRP1, SPT16 ChIP-seq data. Doing ATAC-seq, MNase-seq upon removal of FACT is excellent. With the excellent degradation system, depletion of FACT resulted in loss and gain of gene expression and differentiation. However, unfortunately it was not very clear to me what the direct consequences of FACT removal and its mechanisms were, and what was a consequence of differentiation.

      We appreciate the kind words regarding our choice and execution of techniques and the reviewer’s time spent on this manuscript. We have made a number of changes to the manuscript in order to clarify the direct role of FACT and the consequences of FACT loss on embryonic stem cells.

      Although we did not develop the blots for a longer period when we performed the Westerns, we have artificially overexposed our V5-SPT16 Western blot from Figure S1 (in Adobe Illustrator) to highlight the more subtle bands at later depletion timepoints; we hope that this helps to clarify the effectiveness of the degron system.

      Response Figure 7. V5-SPT16 Western blot with adjusted exposure. We manually adjusted the entire blots’ exposures using Adobe Illustrator. L indicates ladders, and the timecourse depletion is shown above the blot.

      In my opinion, doing many of the analyses 24 hours after FACT depletion, where differentially expressed (coding) genes (DEGs) are >10,000 (Table 1), is too late to understand what the direct consequences are. Seeing 214 up- and 174 down-regulated DEGs 6 hours after FACT depletion, I do agree that FACT seems to do both suppression and activation of target genes. It could have been really interesting to investigate what % of FACT binding sites change chromatin accessibility and nucleosome occupancy at that time point, and whether those loci are close to any of the up- or down-regulated DEGs.

      We appreciate the suggestion and have included more information regarding the percentage of FACT binding sites with altered chromatin accessibility, as well as some analyses to address the directness of FACT's contribution to DEGs at all timepoints (see Supp Figs S4, S6). We would like to note that we performed the TT-seq and ATAC-seq experiments at 0, 3, 6, 12, and 24 hours post 3-IAA treatment in order to explore the progressive change in both the transcriptome and chromatin accessibility, with only the MNase-seq limited to 24 hours. As originally shown in our Sankey plot in Supp Fig 4, we see a progressive change in expression for a small subset of genes over our timecourse running from 0-24 hours, with the largest effect observed at 24 hours, once the FACT protein levels are almost entirely depleted. Similarly, we see a progressive change in ATAC-seq signal over the same regions, with the strongest effects over the same regions visible at 24 hours post-depletion. Due to our observation that SPT16 is not depleted at 3 or 6 hours, with significant depletion seen at 24 hours (see Response Figure 7), and because we intended to study the FACT complex's role in preventing differentiation, we were most interested in the effects at 24 hours of depletion, which allow us to analyze both the disruption of pluripotency factor expression and the facilitation of differentiation marker expression across all three germ layers (see Response Figure 4).

      The following are the reasons for my judgement above and suggestions to improve the manuscript.

      Major points 1. Figure 1. ALP staining is not very sensitive way to evaluate ESC differentiation. I recommend Immunofluorescence for pluripotency genes (NANOG and/or SOX2) and quantification. Or present changes of pluripotency genes in graphs over time course from RNA-seq data.

      We appreciate the suggestions and have taken both into account. We have included a new panel in Figure 3 (new 3B) to display the changes of pluripotency factor expression over our timecourse. We have also included some data showing differentiation factors as part of a response to Reviewer 1, which can be found above (Response Figure 4). In addition, we performed immunocytochemistry to examine OCT4 abundance over a depletion timecourse and have added a 12-hour timepoint to our alkaline phosphatase assay to address the sensitivity of differentiation over time (Figure 1C, D and Response Figure 5).

      1. Fig 2A, 3E, 3F. How many transcription start sites are shown here? (Throughout the manuscript, it is hard to know how many loci are shown in the heatmaps. It should be described within the figures)

      We apologize for the omission and have added numbers of loci shown to relevant Figure panels throughout the paper.

      It is nice to see nascent transcription high sites have high FACT binding, but can you also show actual nascent transcription of these loci as a heatmap, before and after FACT depletion? These heatmaps should be shown in a descending order of FACT Cut&Run signalling, as FACT binding is important in this manuscript.

      We appreciate the suggestion and have plotted those data below (see Response Figure 8).

      Response Figure 8. Nascent transcription from sites with high FACT binding. Top: TPM-normalized TT-seq signal after 12-hour treatment, oriented to mRNA strand and plotted as entire mRNA length, ± 500 bp. Data are sorted by SPT16 CUT&RUN signal over 1kb upstream of annotated TSSs. n = 1 over 22,597 rows (RefSeq Select mRNAs). Bottom: TPM-normalized TT-seq signal after 24-hour treatment, oriented to mRNA strand and plotted as entire mRNA length, ± 500 bp. Data are sorted by SPT16 CUT&RUN signal over 1kb upstream of annotated TSSs. n = 3 (mean) over 22,597 rows (RefSeq Select mRNAs).

      Strong FACT binding sites have strong transcription. Is FACT really suppressing transcription?

      We agree that it is very difficult to disentangle FACT function due to its binding correlation with transcription; however, we see a clear trend of FACT binding at promoters that are sensitive to FACT depletion (Supp Fig. S6A/D and C/F). Intriguingly, the genes that see the greatest derepression by DESeq2 analysis are those that are directly bound by FACT (per ChIP-seq and CUT&RUN; Supplemental Figure S6A/D), while the greatest decrease in expression occurs at genes that are less bound by FACT (Supp Fig S6C/F). In our opinion, this trend lends credence to both direct repression by FACT and distal gene regulation. We note that others (e.g., Kolundzic et al. 2018) have shown direct repression of gene expression by FACT, in line with that aspect of our data.

      1. Fig 3ABD. It is more important to show the 3h, 6h, and 12h time points. The same applies to Fig 4. What %, how many of DEGs (coding and non-coding) at each time point had FACT binding nearby in ESCs?

      We agree that the early timepoints are important and have added volcano plots to the supplemental material for earlier timepoints, with genes of interest specifically annotated. We have also examined pluripotency and differentiation markers at earlier timepoints, per other reviewers’ suggestions, and have included the percentage of DEGs with nearby FACT binding in the manuscript. Specifically, 2013 replicated V5 peaks (out of 16,054; 12.54%) occurred within 1000 bp of a RefSeq Select TSS.

      | Timepoint | Total DEGs (up) | V5-bound DEGs (up) | Total DEGs (down) | V5-bound DEGs (down) |
      | --- | --- | --- | --- | --- |
      | 3h | 58 | 16 (27.59%) | 5 | 1 (20%) |
      | 6h | 214 | 38 (17.76%) | 174 | 31 (17.82%) |
      | 12h | 1366 | 123 (9.00%) | 1932 | 281 (14.54%) |
      | 24h | 5398 | 431 (7.98%) | 5000 | 663 (13.26%) |

      __Response Table 3. Table of DESeq2-assigned DEGs that are bound by SPT16-V5.__ To be defined as V5-SPT16-bound, a DEG must have SPT16-V5 binding within 1000 bp upstream of its RefSeq Select annotated TSS.

      We believe that these earliest depletion timepoints are in line with FACT-mediated gene regulation occurring distal to the regulated genes’ promoters.
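      The binding criterion used for Response Table 3 (a peak within 1000 bp upstream of the annotated TSS) can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code: coordinates are treated as 0-based and half-open, the upstream window is strand-aware, and the gene and peak records are hypothetical.

```python
def upstream_window(tss, strand, size=1000):
    """Half-open window covering `size` bp upstream of a TSS, strand-aware."""
    if strand == "+":
        return max(0, tss - size), tss
    # On the minus strand, upstream lies at higher genomic coordinates.
    return tss + 1, tss + 1 + size


def is_bound(gene, peaks, size=1000):
    """True if any peak overlaps the upstream window of gene = (chrom, tss, strand)."""
    chrom, tss, strand = gene
    w_start, w_end = upstream_window(tss, strand, size)
    return any(
        p_chrom == chrom and p_start < w_end and w_start < p_end
        for p_chrom, p_start, p_end in peaks
    )


def bound_fraction(degs, peaks, size=1000):
    """Return (number of bound DEGs, percentage of DEGs that are bound)."""
    bound = sum(is_bound(gene, peaks, size) for gene in degs)
    return bound, 100.0 * bound / len(degs)


# Hypothetical DEG TSSs and one hypothetical V5-SPT16 peak.
degs = [("chr1", 5000, "+"), ("chr1", 5000, "-"), ("chr2", 100, "+")]
peaks = [("chr1", 4500, 4600)]
print(bound_fraction(degs, peaks))
```

      Only the plus-strand gene on chr1 is counted as bound in this toy input, because the same peak lies downstream of the minus-strand gene's TSS.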

      Fig 3EF. Interesting data and the overlap between SPT16 binding sites and pluripotency binding sites look very strong. But it is difficult to know what % is overlapping from these figures.

      We appreciate the difficulty in quantifying the overlap between pluripotency factor binding sites and FACT binding sites; we have added those data to the manuscript below Figure 3E for OCT4; for other pluripotency factors, these data can be found in Response Figure 9 and Response Table 1. Briefly, 18.33% of OCT4 ChIP-seq peaks are bound by V5-SPT16 and 52.41% of V5-SPT16 peaks are bound by OCT4. Interestingly, 34.6% of gene-distal OCT4 ChIP-seq peaks are bound by V5-SPT16, implying greater convergence between FACT and pluripotency factors at gene-distal sites, in line with known trends for OCT4 binding. Overall, 59.63% of V5-SPT16 peaks are co-bound by at least one of OCT4, SOX2, or NANOG.

      Can you show 1 heatmap split into 3 groups, a. SPT16-V5 unique, common between SPT16-V5 and Oct4 ChIP-seq, Oct4 ChIP-seq unique, with indication of numbers each group has? Also make the same figures for Sox2 and Nanog. (E is less important. If the authors want, they can use the published FACT ChIP-seq data in the same loci.)

      We appreciate the suggestion and have plotted V5-SPT16 CUT&RUN data and pluripotency factor ChIP-seq over unique and shared regions for OCT4 (top), SOX2 (middle), and NANOG (bottom). Interestingly, although some peaks in the non-overlapping cluster were not called as peaks by the algorithms' threshold, one can observe that a subset do seem to have overlapping binding. We think that this was an excellent way to display the data; we have included these data as a new panel (Fig. 3E) and also show them in Response Figure 9.

      Fig. 5. Basic information what % (how many) of SPT16-V5 CUT&RUN peaks belong to this 'enhancer' category is missing.

      We apologize for the oversight and have added numbers to the figure and legend.

      I am not sure the meaning of separating enhancers and TSS of coding genes in the analyses, though. If majority of SPT16-V5 CUT&RUN peaks overlap with Oct4 binding sites, it is not surprising that SPT16-V5 CUT&RUN peaks overlaps with ATAC-seq signal and enhancer marks.

      We agree that it is unsurprising that V5-SPT16 overlaps with accessible chromatin and enhancers, given the extensive overlap with OCT4 ChIP-seq peaks. We wanted to emphasize our novel finding of gene-distal FACT binding, given the more established trend of binding at promoters.

      1. Fig 6A. I could not figure out what % of DHSs overlaps with FACT binding sites.

      We have added this percentage to Fig 5C and included an analysis of altered chromatin accessibility in a new Table 3 (page 20). Briefly, 11,234 replicated V5-SPT16 peaks (out of 16,043; 70%) directly overlap a gene distal DHS. Orthogonally, 11,234 DHSs (out of 132,555; 8.5%) directly overlap a V5-SPT16 peak.

      I do not see the point of showing DHSs which do not overlap with FACT binding sites.

      In agreement with Reviewer 1, we believe that it is important to include FACT-unbound DHSs for a clearer understanding of the direct vs indirect effects of FACT depletion. We have condensed some of these data into a single heatmap, clustered between FACT-bound DHSs, non-FACT-bound DHSs, and FACT-bound non-DHS sites to streamline the information (now shown in Fig 3E).

      Response Figure 9. Heatmaps of clustered SPT16 and OSN binding data. Shown are clustered heatmaps depicting V5-SPT16 CUT&RUN binding overlapping ChIP-seq peaks for OCT4 (top) SOX2 (middle) and NANOG (bottom). In each set of heatmaps the top cluster is pluripotency factor-unique, the middle cluster is shared, and the V5-unique cluster is on the bottom. Each cluster is sorted by descending strength of V5-SPT16 binding (CUT&RUN). Clusters were assigned by directly overlapping peaks.

      How ATAC-seq signal changes upon depletion of FACT at FACT binding sites (Fig 6B) is important. Can you explain why ATAC-seq signals increase at the FACT binding site flanking regions (across +/- 2kb) where FACT binding is strong (without changing the chromatin accessibility at the FACT binding sites)? Perhaps authors need to show actual ATAC-seq track with EtOH or 3-IAA treatment over ~10kb regions flanking FACT binding sites. It is difficult to understand what is happening seeing only the changes (ratio) of ATAC-seq read counts, how big the differences are.

      We agree that the local window and ratio of ATAC-seq signal somewhat muddles the true biological trends. We have plotted non-differential ATAC-seq signal for each SPT16-AID clone over V5 binding sites, ±10 kb, to more accurately depict the local chromatin status (shown below in Response Figure 10). There is an apparent trend at V5-SPT16 CUT&RUN peaks of accessible chromatin, and this high local accessibility very likely contributes to the high ATAC-seq signal immediately flanking V5 binding sites; over the binding sites themselves, however, FACT depletion consistently triggers decreased accessibility (see Fig. 6).

      Can you identify differentially open loci based on 3-IAA- and Et-OH treated ATAC-seq data at each time point, and then how many of them overlap with FACT binding sites? There are a few tools to identify differential open regions with ATAC-seq data. That could help to understand the direct roles of FACT binding.

      We appreciate the suggestion and have performed this analysis using a combination of PEPATAC and HOMER (see Response Tables 4-6 below). FACT depletion leads to the following accessibility changes:

      | | 3-hour | 6-hour | 12-hour | 24-hour |
      | --- | --- | --- | --- | --- |
      | Decreased accessibility | 220 (0.35%) | 3,713 (5.99%) | 6,885 (11.11%) | 8,441 (13.62%) |
      | Increased accessibility | 2 (0.00%) | 12 (0.02%) | 276 (0.45%) | 6,031 (9.73%) |

      Response Table 4. Accessibility changes over consensus ATAC-seq peaks. Consensus ATAC-seq peaks were defined per PEPATAC standards (peaks called by MACS2 in (n/2)+1 samples, irrespective of condition).
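      The (n/2)+1 support rule in the legend above can be illustrated with a short sketch. This is not PEPATAC's implementation: it assumes peaks have already been merged into shared candidate regions with stable identifiers, and it assumes n/2 is integer division; the region names are hypothetical.

```python
from collections import Counter


def consensus_peaks(calls_per_sample):
    """Keep candidate regions called in at least (n // 2) + 1 of n samples.

    `calls_per_sample` is a list with one set of region IDs per sample,
    pooled irrespective of treatment condition.
    """
    n = len(calls_per_sample)
    required = n // 2 + 1  # assumption: (n/2)+1 with integer division
    support = Counter(region for calls in calls_per_sample for region in calls)
    return {region for region, count in support.items() if count >= required}


# Four hypothetical samples: a region needs support from three of them.
calls = [{"peak_a", "peak_b"}, {"peak_a", "peak_b"}, {"peak_a", "peak_c"}, {"peak_b"}]
print(sorted(consensus_peaks(calls)))
```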

      | | 3-hour | 6-hour | 12-hour | 24-hour |
      | --- | --- | --- | --- | --- |
      | Decreased accessibility | 848 (1.64%) | 1870 (3.51%) | 2525 (4.83%) | 4,092 (7.90%) |
      | Increased accessibility | 107 (0.21%) | 283 (0.55%) | 534 (1.03%) | 2,449 (4.73%) |

      Response Table 5. Accessibility changes over regions bound by V5-SPT16.

      Response Figure 10. ATAC-seq data shown over a 20kb window. Heatmaps depicting non-differential ATAC-seq data over FACT binding sites for SPT16-AID clones 1 (top) and 2 (bottom). Data are sorted by V5-SPT16 binding strength.

      | All | 3-hour | 6-hour | 12-hour | 24-hour |
      | --- | --- | --- | --- | --- |
      | Decreased accessibility | 3,294 (2.46%) | 3,175 (2.37%) | 3,636 (2.71%) | 7,018 (5.23%) |
      | Increased accessibility | 102 (0.08%) | 313 (0.23%) | 1,797 (1.34%) | 5,975 (4.45%) |

      | V5-bound DHSs (11,234 total) | 3-hour | 6-hour | 12-hour | 24-hour |
      | --- | --- | --- | --- | --- |
      | Decreased accessibility | 1 (0.01%) | 9 (0.08%) | 96 (0.85%) | 2006 (17.86%) |
      | Increased accessibility | 5 (0.04%) | 28 (0.25%) | 71 (0.63%) | 87 (0.77%) |

      Response Table 6. Accessibility changes over gene-distal DHSs and over only FACT-bound gene-distal DHSs.

      Together with Fig 1A and Fig 6C, do they mean that the more FACT binding, the more transcription (Fig 1A)? And that the higher the transcription rate, the greater the increase in chromatin accessibility upon depletion of FACT (Fig 6C)?

      While we do see that FACT binding correlates with transcription and with FACT-dependent chromatin accessibility, we do not wish to make the argument that FACT binding alone is indicative of high transcription, nor that transcription is necessarily the deciding factor in FACT-depleted chromatin accessibility changes. We do want to note that transcriptional disruption is a likely contributor to increased chromatin accessibility in the absence of FACT as it pertains to paused RNAPII, as speculated in our discussion, but that experiments to truly test this hypothesis are beyond the scope of this work. That being said, in response to Reviewer 1, we did assess the potential correlation of FACT binding to locations with greater paused RNAPII (Response Figure 3) and see a connection. We are excited to explore this more in future work.

      Perhaps plotting nascent transcripts at 12 hr and 24 hr of FACT depletion next to these heatmaps might show if it correlates with transcription changes as well?

      We appreciate the suggestion, and have included this plot in Response Figure 8, sorted by FACT binding to gene promoters; however, we find it difficult to visualize differences in transcription with non-differential heatmaps.

      Sites losing chromatin accessibility (bottom half of Fig 6C) seem not to have FACT binding (bottom half of Fig 1A), thus it is likely to be indirect effects. It is better to make figures focussing on 'direct effects'.

      We agree that there are sites with reduced chromatin accessibility upon FACT depletion that are not bound by FACT; however, given the extensive binding of FACT at gene-distal regulatory regions (F2D, F4A, F5, F6A/D), we would suggest that these “indirect” effects are possibly the result of FACT-dependent gene-distal regulation.

      Fig 1A and Fig 6C indicated that FACT binding sites (i.e. TSS) decrease chromatin accessibility. I thought it does not fit with the idea of increasing nucleosome occupancy. But actually the data (Fig 7F) shows that TSS does not show increased nucleosome occupancy unlike Fig 7A-E. In fact, Fig 6B showed that about bottom 50% of weaker V5 binding sites decreased chromatin accessibility at 24 hr, which fits with increased nucleosome occupancy in Fig 7A. But then if you looked at only top 50% of stronger V5 binding sites, which did not decrease chromatin accessibility, nucleosome occupancy did not change as well? Why don't you make heatmap of MNase-seq next to Fig 6B?

      We have added heatmaps of non-differential MNase-seq data to Fig. 7A to address both concerns. Regarding Figure 6B, we note that the V5-SPT16 peaks themselves invariantly show decreased chromatin accessibility, and that it is the surrounding chromatin, not the V5-SPT16 peak itself, that shifts from increased to decreased chromatin accessibility at 12-24 hours of depletion. We would also like to clarify that the original heatmaps in Fig 6B were sorted by change in chromatin accessibility at 24h, rather than V5 binding.

      We disagree that the TSSs do not show increased nucleosome occupancy in Fig. 7F, as there is an increase in signal above background directly over the TSS in both replicates, per the differential metaplot shown in Fig. 7B, that is specific to the AID-tagged lines. However, the two clones did show variable results. To address this, we have plotted the non-differential MNase-seq plots (Fig. 7A), which show more consistent trends; it appears that the transformation of the data into differential at this location was the cause of the slightly variable plots over TSSs.

      1. I could not follow based on which data the model in Fig 8 is made. Again it is better to focus in the direct effects.

      Thank you for the suggestion; we have updated our model to focus more on the direct effects.

      Minor points. 10. Line 1 page 5, Kolundzic paper did not have MEF reprogramming data. They reported human fibroblast reprogramming was enhanced by FACT KD.

      We appreciate the correction and have clarified the language to specify that the work of Kolundzic et al. included human fibroblast reprogramming and Shen et al. performed MEF reprogramming.

      1. Line 3, I disagree with "these data establish FACT as essential in pluripotent cells". One paper said FACT KD increased proliferation of mESCs, the other paper said chemical inhibition of FACT was necessary for passaging ESCs, but not proliferation. Importance of FACT in pluripotent cells was very unclear to me.

      We have clarified our language to specify that pluripotent cells have a FACT dependency that differentiated cells do not. We note that we were unable to recapitulate a relationship between FACT and trypsinization/passaging of ES cells, suggesting a more nuanced role for FACT in pluripotent cells, in line with work from the Tessarz and Gurova labs.

      Line 7 Page 7, reference the paper with the ChIP-seq data.

      We apologize for the oversight and have added the reference.

      Line 16, Page 7. It doesn't seem that the CUT&RUN and previously published ChIP-seq data agree well. >50% look different. It is nothing the authors can do, but can you show a Venn diagram of peak overlap?

      In response to Reviewer 1, we generated Response Figure 1, which displays the overlap as a pie chart. We show that chart again in Response Figure 11 (right) and have added a further analysis to Response Figure 11 (left) to address this comment. Specifically, we have plotted peak overlaps as a Venn diagram to compare peaks identified in at least two experimental replicates from either the CUT&RUN or ChIP-seq data (left). We have also overlapped replicated peaks between the individual targets and displayed them as a pie chart (right; same as Response Figure 1). While the CUT&RUN data do display a greater signal:noise ratio and call far more peaks, we note that peak conservation between experiments is relatively consistent (1-6%) between all datasets, including the ChIP-seq experiments profiling opposite factors.

      Overall, we see strongly reproducible trends (albeit with less sharp definition in the ChIP-seq), complemented by highly similar biological feature assignment in Fig. 2D and Pearson correlation values of between 0.76 and 0.78 between SPT16 ChIP-seq and V5-SPT16 CUT&RUN (Supp Fig. S2A).

      __Response Figure 11. Overlaps between SPT16-V5 CUT&RUN, SPT16 ChIP-seq, and SSRP1 ChIP-seq.__ Called peaks were compared between V5-SPT16 CUT&RUN, SPT16 ChIP-seq, and SSRP1 ChIP-seq, using both our own analysis pipeline (left) and the peaks published with the original manuscript by Tessarz et al. (2018; right). While our ChIP-seq peak-calling appears to have applied more stringent thresholds, the trends generally agree.

      Line 12, 22 page 10. Fig.3AB is 24 hrs. Do not match with the text.

      We apologize for the error and have changed the references in the text to the new panel 3C.

      1. Line 23, 24, page 10, Highlight Klf4 and Myc in the volcano plot.

      We have added KLF4 and MYC annotation to the volcano plot in Fig. 3A, as well as plotted their log2FC over time in the new panel 3B.

      1. Line 18, 19, page 16. This is not accurate statement. Sample 2 increased the accessibility at 6 hours. Sample 1 decreased, but even the control did so.

      We apologize for the unclear wording; we intended to suggest that all timepoints after 6 hours (i.e., 12 and 24 hours) display decreased accessibility directly over the DHS. We have corrected the text.

      1. Line 48-50, page 16. Two replicates show very different patterns. Difficult to agree with the statement based on the figure.

      We agree that the differential replicate patterns are not ideal; however, both replicates display an increase in nucleosome-sized reads over the promoter region, consistent with our ATAC-seq results presented in Fig 6C. Size distribution plots did not suggest differences in MNase digestion between samples, and neither quartile/RPGC nor TMM-based normalization fully solved this issue. Because our ATAC-seq datasets agree with the general trends identified by MNase-seq (which are consistent, despite technical differences between clones), we do not believe that the differences constitute biological difference, but rather experimental noise. We have included a heatmap of non-differential MNase-seq signal around TSSs in Fig 7A to highlight the experimental reproducibility between replicates. Based on this analysis it appears that the transformation of the data into differential at this location was the cause of the slightly variable plots over TSSs.

      1. Line 15, page 19. Where does "1.5 times" come from? which is 1.5 times more, and is that different from the proportion of those?

      We apologize for the unclear reference to the altered transcripts in Table 1 and have changed our wording to be more precise.

      1. Line 32, page 19. Is Fig S2B correct figure?

      We appreciate the correction; the text should have referred to Fig. 4 and has been fixed.

      Line 35-39, page 21. I understand FACT does not bind to silenced loci. If FACT does not bind, it is not surprising that expression from those loci does not change upon FACT deletion. I do not understand what the authors said.

      We agree that a lack of binding and unchanged expression after FACT depletion at putative silencers are unsurprising; given FACT’s extensive genic and gene-distal binding, we wished to show a class of transcribed regions unbound by FACT as a control, to show that non-FACT-regulated transcription was not affected by FACT depletion. We have clarified our wording in the text to emphasize that a lack of change was expected at silencers.

      Reviewer #3 (Significance (Required)):

      Previously it has been shown that Oct4 physically interacts with the FAcilitates Chromatin Transactions (FACT) complex. Seemingly contradicting phenotypes have been reported upon suppression of FACT function in the maintenance and induction of pluripotent cells. Mylonas has reported that knockdown of SSRP1, a component of FACT complex, increased ESC proliferation (2018). Shen has described that chemical inhibition of FACT complex affected passaging of ESCs, but proliferation was not affected without passaging. Kolundzic has found that both SSRP1 and SUPT16H, another component of FACT complex, enhance human fibroblast reprogramming into iPSCs (2018), while Shen has reported that chemical inhibition of FACT blocks mouse iPSC generation from MEFs.

      My expertise lies in pluripotent stem cells and transcriptional regulation. I did like the Auxin-mediated FACT degradation system these authors used and acute depletion of FACT is an excellent way of evaluating FACT function in ESC, compared to previously published shRNA based knockdown or use of a chemical inhibitor. However, as I described above, it was not very clear what could be the direct effects and I feel looking at 24 hours after depletion might be too late to address this question.

      We appreciate the review and agree that acute depletion of FACT has great potential to understand the complex’s function in ES cells. We understand that the nature of gene-distal regulation does make it difficult to cleanly elucidate direct regulation, and hope that our revisions have clarified that our goal was to examine direct, gene-distal regulation, rather than indirect effects. We would like to note that we examined transcription and chromatin accessibility after 3, 6, 12, and 24 hours of 3-IAA treatment, with all these data included in the original manuscript, and saw minimal change (likely because FACT was not fully depleted until later timepoints); to capture the true biological effects of FACT depletion, we explored most thoroughly the 24 hour 3-IAA treatment to understand the downstream effects between FACT loss and cellular differentiation. However, we have expanded discussion and analyses of the earlier timepoints in this revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      Overall, the science is sound and interesting, and the results are clearly presented. However, the paper falls in-between describing a novel method and studying biology. As a consequence, it is a bit difficult to grasp the general flow, central story and focus point. The study does uncover several interesting phenomena, but none are really studied in much detail and the novel biological insight is therefore a bit limited and lost in the abundance of observations. Several interesting novel interactions are uncovered, in particular for the SPS sensor and GAPDH paralogs, but these are not followed up on in much detail. The same can be said for the more general observations, eg the fact that different types of mutations (missense vs nonsense) in different types of genes (essential vs non-essential, housekeeping vs. stress-regulated...) cause different effects.

      This is not to say that the paper has no merit - far from it even. But, in its current form, it is a bit chaotic. Maybe there is simply too much in the paper? To me, it would already help if the authors would explicitly state that the paper is a "methods" paper that describes a novel technique for studying the effects of mutations on protein abundance, and then goes on to demonstrate the possibilities of the technology by giving a few examples of the phenomena that can be studied. The discussion section ends in this way, but it may be helpful if this was moved to the end of the introduction.

      We modified the manuscript as suggested.

      Reviewer #2 (Public Review):

      Schubert et al. describe a new pooled screening strategy that combines protein abundance measurements of 11 proteins determined via FACS with genome-wide mutagenesis of stop codons and missense mutations (achieved via a base editor) in yeast. The method allows to identify genetic perturbations that affect steady state protein levels (vs transcript abundance), and in this way define regulators of protein abundance. The authors find that perturbation of essential genes more often alters protein abundance than of nonessential genes and proteins with core cellular functions more often decrease in abundance in response to genetic perturbations than stress proteins. Genes whose knockouts affected the level of several of the 11 proteins were enriched in protein biosynthetic processes while genes whose knockouts affected specific proteins were enriched for functions in transcriptional regulation. The authors also leverage the dataset to confirm known and identify new regulatory relationships, such as a link between the SPS amino acid sensor and the stress response gene Yhb1 or between Ras/PKA signalling and GAPDH isoenzymes Tdh1, 2, and 3. In addition, the paper contains a section on benchmarking of the base editor in yeast, where it has not been used before.

      Strengths and weaknesses of the paper

      The authors establish the BE3 base editor as a screening tool in S. cerevisiae and very thoroughly benchmark its functionality for single edits and in different screening formats (fitness and FACS screening). This will be very beneficial for the yeast community.

      The strategy established here allows measuring the effect of genetic perturbations on protein abundances in highly complex libraries. This complements capabilities for measuring effects of genetic perturbations on transcript levels, which is important as for some proteins mRNA and protein levels do not correlate well. The ability to measure proteins directly therefore promises to close an important gap in determining all their regulatory inputs. The strategy is furthermore broadly applicable beyond the current study. All experimental procedures are very well described and plasmids and scripts are openly shared, maximizing utility for the community.

      There is a good balance between global analyses aimed at characterizing properties of the regulatory network and more detailed analyses of interesting new regulatory relationships. Some of the key conclusions are further supported by additional experimental evidence, which includes re-making specific mutations and confirming their effects on protein levels by mass spectrometry.

      The conclusions of the paper are mostly well supported, but I am missing some analyses on reproducibility and potential confounders and some of the data analysis steps should be clarified.

      The paper starts on the premise that measuring protein levels will identify regulators and regulatory principles that would not be found by measuring transcripts, but since the findings are not discussed in light of studies looking at mRNA levels it is unclear how the current study extends knowledge regarding the regulatory inputs of each protein.

      See response to Comment #10.

      Specific comments regarding data analysis, reproducibility, confounders

      1) The authors use the number of unique barcodes per guide RNA rather than barcode counts to determine fold-changes. For reliable fold changes the number of unique barcodes per gRNA should then ideally be in the 100s for each guide, is that the case? It would also be important to show the distribution of the number of barcodes per gRNA and their abundances determined from read counts. I could imagine that if the distribution of barcodes per gRNA or the abundance of these barcodes is highly skewed (particularly if there are many barcodes with only few reads) that could lead to spurious differences in unique barcode number between the high and low fluorescence pool. I imagine some skew is present as is normal in pooled library experiments. The fold-changes in the control pools could show whether spurious differences are a problem, but it is not clear to me if and how these controls are used in the protein screen.

      Because of the large number of screens performed in this study (11 proteins, with 8 replicates for each) we had to trade off sequencing depth and power against cell sorting time and sequencing cost, resulting in lower read and barcode numbers than what might be ideally aimed for. As described further in the response to Comment #5, we added a new figure to the manuscript that shows that the correlation of fold-changes between replicates is high (Figure 3–S1A). The second figure below shows that the correlation between the number of unique barcodes and the number of reads per gRNA is highly significant (p < 2.2e-16).
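      The two quantities compared in this response, unique barcodes per gRNA and reads per gRNA, can be distinguished with a minimal sketch. The input format (one `(gRNA, barcode)` pair per sequencing read) and all names are assumptions made for illustration, not the authors' pipeline.

```python
from collections import defaultdict

def barcode_and_read_counts(reads):
    """Tally unique barcodes and total reads for each gRNA.

    reads: iterable of (grna, barcode) pairs, one per sequencing read
    (hypothetical input format).
    Returns {grna: (n_unique_barcodes, n_reads)}.
    """
    barcodes = defaultdict(set)
    read_counts = defaultdict(int)
    for grna, barcode in reads:
        barcodes[grna].add(barcode)   # a barcode counts once, however many reads it has
        read_counts[grna] += 1        # every read counts
    return {g: (len(barcodes[g]), read_counts[g]) for g in read_counts}

reads = [
    ("gRNA_A", "bc1"), ("gRNA_A", "bc1"), ("gRNA_A", "bc2"),
    ("gRNA_B", "bc3"),
]
counts = barcode_and_read_counts(reads)
```

      A gRNA dominated by one heavily sequenced barcode has a high read count but a low unique-barcode count, which is the kind of skew the reviewer asks about.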

      2) I like the idea of using an additional barcode (plasmid barcode) to distinguish between different cells with the same gRNA - this would directly allow to assess variability and serve as a sort of replicate within replicate. However, this information is not leveraged in the analysis. It would be nice to see an analysis of how well the different plasmid barcodes tagging the same gRNA agree (for fitness and protein abundance), to show how reproducible and reliable the findings are.

      We agree with the reviewer that this would be nice to do in principle, but our sequencing depth for the sorted cell populations was not high enough to compare the same barcode across the low/unsorted/high samples. See also our response to Comment #5 for the replicate analyses.

      3) From Fig 1 and previous research on base editors it is clear that mutation outcomes are often heterogeneous for the same gRNA and comprise a substantial fraction of wild-type alleles, alleles where only part of the Cs in the target window or where Cs outside the target window are edited, and non C-to-T edits. How does this reflect on the variability of phenotypic measurements, given that any barcode represents a genetically heterogeneous population of cells rather than a specific genotype? This would be important information for anyone planning to use the base editor in future.

      We agree with the reviewer that the heterogeneity of editing outcomes is an important point to keep in mind when working with base editors. In genetic screens, like the ones described here, often the individual edit is less important, and the overall effects of the base editor are specific/localized enough to obtain insights into the effects of mutations in the area where the gRNA targets the genome. For example, in our test screens for Canavanine resistance and fitness effects, in which we used gRNAs predicted to introduce stop codons into the CAN1 gene and into essential genes, respectively, we see the expected loss-of-function effect for a majority of the gRNAs (canavanine screen: expected effect for 67% of all gRNAs introducing stop codons into CAN1; fitness screen: expected effect for 59% of all gRNAs introducing stop codons into essential genes) (Figure 2). In the canavanine screen, we also see that gRNAs predicted to introduce missense mutations at highly conserved residues are more likely to lead to a loss-of-function effect than gRNAs predicted to introduce missense mutations at less conserved residues, further highlighting the differentiated results that can be obtained with the base editor despite the heterogeneity in editing outcomes overall. We would certainly advise anyone to confirm by sequencing the base edits in individual mutants whenever a precise mutation is desired, as we did in this study when following up on selected findings with individual mutants.

      4) How common are additional mutations in the genome of these cells and could they confound the measured effects? I can think of several sources of additional mutations, such as off-target editing, edits outside the target window, or when 2 gRNA plasmids are present in the same cell (both target windows obtain edits). Could some of these events explain the discrepancy in phenotype for two gRNAs that should make the same mutation (Fig S4)? Even though BE3 has been described in mammalian cells, an off-target analysis would be desirable as there can be substantial differences in off-target behavior between cell types and organisms.

      Generally, we are not very concerned about random off-target activity of the base editor because we would not expect this to cause a consistent signal that would be picked up in our screen as a significant effect of a particular gRNA. Reproducible off-target editing with a specific gRNA at a site other than the intended target site would be problematic, though. We limited the chance of this happening by not using gRNAs that may target similar sequences to the intended target site in the genome. Specifically, we excluded gRNAs that have more than one target in the genome when the 12 nucleotides in the seed region (directly upstream of the PAM site) are considered (DiCarlo et al., Nucleic Acids Research, 2013).
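      The seed-based exclusion described above can be sketched as follows. This is an illustrative reconstruction, not the authors' script: it assumes an NGG PAM, takes the seed to be the last 12 nucleotides of the guide, scans both strands of a genome held in memory as a single string, and uses hypothetical function names.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def count_seed_sites(genome, seed):
    """Count sites on both strands where `seed` is directly followed by NGG."""
    # Lookahead so that overlapping sites are each counted.
    pattern = re.compile("(?=" + re.escape(seed) + "[ACGT]GG)")
    forward = len(pattern.findall(genome))
    reverse = len(pattern.findall(reverse_complement(genome)))
    return forward + reverse

def has_unique_seed(genome, guide, seed_len=12):
    """True if the PAM-proximal seed of `guide` has exactly one genomic target."""
    seed = guide[-seed_len:]  # assumption: seed = nucleotides adjacent to the PAM
    return count_seed_sites(genome, seed) == 1

guide = "GGGGGGGG" + "ACGTTGCAATCG"          # 20-nt guide, 12-nt seed at the 3' end
site = "TTTTT" + "ACGTTGCAATCG" + "TGG" + "TTTTT"
unique = has_unique_seed(site, guide)         # one seed+PAM site
duplicated = has_unique_seed(site + site, guide)  # two sites, so the guide is excluded
```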

      We do observe some off-target editing right outside the target window, but generally at much lower frequency than the on-target editing in the target window (Figure 1B and Figure 1–S2). Since for most of our analyses we grouped perturbations per gene, such off-target edits should not affect our findings. In addition, we validated key findings with independent experiments. For our study, we used the Base Editor v3 (Komor et al., Nature, 2016); more recently, additional base editors have been developed that show improved accuracy and efficiency, and we would recommend these base editors when starting a new study (see, e.g., Anzalone et al., Nature Biotechnology, 2020).

      We are not concerned about cases in which one cell gets two gRNAs, since the chance that the same two gRNAs end up in one cell repeatedly is low, and such events would therefore not result in a significant signal in our screens.

      We don’t think that off-target mutations can explain the discrepancy between pairs of gRNAs that should introduce the same mutation (Figure 3–S1). The effect of the two gRNAs is actually well-correlated, but, often, one of the two gRNAs doesn’t pass our significance cut-off or simply doesn’t edit efficiently (i.e., most discrepancies arise from false negatives rather than false positives). We may therefore miss the effects of some mutations, but we are unlikely to draw erroneous conclusions from significant signals.

      5) In the protein screen normalization uses the total unique barcode counts. Does this efficiently correct for differences from sequencing (rather than total read counts or other methods)? It would be nice to see some replicate plots for the analysis of the fitness as well as the protein screen to be able to judge that.

      We made a new figure that shows a replicate comparison for the protein screen (see below; in the manuscript it is Figure 3–S1A) and commented on it in the manuscript. For this analysis, the eight replicates for each protein were split into two groups of four replicates each and analyzed the same way as the eight replicates. The correlation between the two groups of replicates is highly significant (p < 2.2e-16). The second figure shows that the total number of reads and the total number of unique barcodes are well correlated.

      For the fitness screen, we used read counts rather than barcode counts for the analysis since read counts better reflect the dropout of cells due to reduced fitness. The figure below shows a replicate comparison for the fitness screen. For this analysis, the four replicates were split into two groups of two replicates each and analyzed the same way as the four replicates. The correlation between the two groups of replicates is highly significant (p < 2.2e-16).
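      The split-replicate comparisons described in the two paragraphs above can be sketched as below. This is a simplification under stated assumptions: each half is summarized by the mean log2 fold-change per gRNA, whereas the authors analyzed each half with their full pipeline; the input layout and function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def split_half_correlation(log2fc_by_replicate):
    """Split replicates into two halves and correlate their per-gRNA means.

    log2fc_by_replicate: list of per-replicate lists, each holding one
    log2 fold-change per gRNA in the same gRNA order (hypothetical layout).
    """
    half = len(log2fc_by_replicate) // 2
    first, second = log2fc_by_replicate[:half], log2fc_by_replicate[half:]
    mean_first = [sum(values) / len(values) for values in zip(*first)]
    mean_second = [sum(values) / len(values) for values in zip(*second)]
    return pearson_r(mean_first, mean_second)

# Eight replicates (protein screen) split 4 + 4; four replicates would split 2 + 2.
replicates = [[0.0, 1.0, -2.0, 0.5] for _ in range(8)]
r = split_half_correlation(replicates)
```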

      6) In the main text the authors mention very high agreement between gRNAs introducing the same mutation but this is only based on 20 or so gRNA pairs; for many more pairs that introduce the same mutation only one reaches significance, and the correlation in their effects is lower (Fig S4). It would be better to reflect this in the text directly rather than exclusively in the supplementary information.

      We clarified this in the manuscript main text: “For 78 of these gRNA pairs, at least one gRNA had a significant effect (FDR < 0.05) on at least one of the eleven proteins; their effects were highly correlated (Pearson’s R² = 0.43, p < 2.2E-16) (Figure 3–S1B). For the 20 gRNA pairs for which both gRNAs had a significant effect, the correlation was even higher (Pearson’s R² = 0.819, p = 8.8e-13) (Figure 3–S1C). These findings show that the significant gRNA effects that we identify have a low false positive rate, but they also suggest that many real gRNA effects are not detected in the screen due to limitations in statistical power.”

      7) When the different gRNAs for a targeted gene are combined, instead of using an averaged measure of their effects the authors use the largest fold-change. This seems not ideal to me as it is sensitive to outliers (experimental error or background mutations present in that strain).

      We agree that the method we used is more sensitive to outliers than averaging per gene. However, because many gRNAs have no effect either because they are not editing efficiently or because the edit doesn’t have a phenotypic consequence, an averaging method across all gRNAs targeting the same gene would be too conservative and not properly capture the effect of a perturbation of that gene.
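      The trade-off described here can be made concrete with a small sketch. It assumes "largest fold-change" means the log2 fold-change with the largest absolute value; the numbers are invented for illustration and are not data from the study.

```python
def summarize_gene(log2fcs):
    """Return (largest-magnitude log2FC, mean log2FC) across a gene's gRNAs."""
    largest = max(log2fcs, key=abs)       # assumption: "largest" = largest absolute value
    mean = sum(log2fcs) / len(log2fcs)
    return largest, mean

# Hypothetical gene: three gRNAs that did not edit or had no effect, one strong hit.
largest, mean = summarize_gene([0.0, 0.1, -0.05, -2.0])
```

      Here the maximum reports the single effective gRNA (-2.0), while the mean is pulled toward zero (about -0.49) by the three ineffective guides. The same property makes the maximum more sensitive to a single outlier, which is the reviewer's concern.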

      8) Phenotyping is performed directly after editing, when the base editor is still present in the cells and could still interact with target sites. I could imagine this could lead to reduced levels of the proteins targeted for mutagenesis as it could act like a CRISPRi transcriptional roadblock. Could this enhance some of the effects or alter them in case of some missense mutations?

      To reduce potential “CRISPRi-like” effects of the base editor on gene expression, we placed the base editor under a galactose-inducible promoter. For both the fitness and protein screens we grew the cultures in media without galactose for another 24 hours (fitness screen) or 8-9 hours (protein screens) before sampling. In the latter case, this recovery time corresponded to more than three cell divisions, after which we assume base editor levels to have strongly decreased, and therefore to no longer interfere with transcription. This is also supported by our ability to detect discordant effects of gRNAs targeting the same gene (e.g., the two mutations leading to loss-of-function and gain-of-function of RAS2), which would otherwise be overshadowed by a CRISPRi effect.

      9) I feel that the main text does not reflect the actual editing efficiency very well (the main numbers I noticed were 95% C to T conversion and 89% of these occurring in a specific window). More informative for interpreting the results would be to know what fraction of the alleles show an edit (vs wild-type) and how many show the 'complete' edit (as the authors assume 100% of the genotypes generated by a gRNA to be conversion of all Cs to Ts in the target window). It would be important to state in the main text how variable this is for different gRNAs and what the typical purity of editing outcomes is.

      We now show the editing efficiency and purity in a new figure (Figure 1B), and discuss it in the main text as follows: “We found that the target window and mutagenesis pattern are very similar to those described in human cells: 95% of edits are C-to-T transitions, and 89% of these occurred in a five-nucleotide window 13 to 17 base pairs upstream of the PAM sequence (Figure 1A; Figure 1–S2) (Komor et al., 2016). Editing efficiency was variable across the eight gRNAs and ranged from 4% to 64% if considering only cases where all Cs in the window are edited; percentages are higher if incomplete edits are considered, too (Figure 1B).”
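      The "complete edit" outcome referred to above (all Cs in the target window converted to T) can be sketched as follows. The sketch assumes a 20-nt protospacer written 5' to 3' with the PAM immediately after its last base, and counts "bp upstream of the PAM" so that 1 is the base adjacent to the PAM; the function name is hypothetical.

```python
def complete_edit(protospacer, window=(13, 17)):
    """Convert every C to T inside the window, given in bp upstream of the PAM."""
    near, far = window
    length = len(protospacer)
    edited = list(protospacer)
    for index, base in enumerate(protospacer):
        distance = length - index  # 1 = base directly adjacent to the PAM
        if near <= distance <= far and base == "C":
            edited[index] = "T"
    return "".join(edited)

# Cs at protospacer positions 2-9; only those 13-17 bp upstream of the PAM change.
edited = complete_edit("ACCCCCCCCAGGGGGGGGGG")
```

      Incomplete edits, which the response notes are also observed, would correspond to converting only a subset of the Cs in the window.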

      Comments regarding findings

      10) It would be nice to see a comparison of the results to the effects of ~1500 yeast gene knockouts on cellular transcriptomes (https://doi.org/10.1016/j.cell.2014.02.054). This would show where the current study extends established knowledge regarding the regulatory inputs of each protein and highlight the importance of directly measuring protein levels. This would be particularly interesting for proteins whose abundance cannot be predicted well from mRNA abundance.

      We agree with the reviewer that it would be very interesting to compare the effect of perturbations on mRNA vs protein levels. We have compared our protein-level data to mRNA-level data from Kemmeren and colleagues (Kemmeren et al., Cell 2014), and we find very good agreement between the effects of gene perturbations on mRNA and protein levels when considering only genes with q < 0.05 and Log2FC > 0.5 in both studies (Pearson’s R = 0.79, p < 5.3e-15).

      Gene perturbations with effects detected only on mRNA but not protein levels are enriched in genes with a role in “chromatin organization” (FDR = 0.01; as a background for the analysis, only the 1098 genes covered in both studies were considered). This suggests that perturbations of genes involved in chromatin organization tend to affect mRNA levels but are then buffered and do not lead to altered protein levels. There was no enrichment of functional annotations among gene perturbations with effects on protein levels but not mRNA levels.

      We did not include these results in the manuscript because there are some limitations to the conclusions that can be drawn from these comparisons, including that our study has a relatively high number of false negatives, and that the genes perturbed in the Kemmeren et al. study were selected to play a role in gene regulation, meaning that differences in mRNA-vs-protein effects of perturbations are limited to this function, and other gene functions cannot be assessed.

      11) The finding that genes that affect only one or two proteins are enriched for roles in transcriptional regulation could be a consequence of 'only' looking at 10 proteins rather than a globally valid conclusion. Particularly as the 10 proteins were selected for diverse functions that are subject to distinct regulatory cascades. ('only' because I appreciate this was a lot of work.)

      We agree with this, and we think it is clear in the abstract and the main text of the manuscript that we studied 11 proteins. We also made this point more explicit in the discussion, so that it is clear to readers that the findings are based on these 11 proteins and may not extrapolate to the entire yeast proteome.

      Reviewer #3 (Public Review):

      This manuscript presents two main contributions. First, the authors modified a CRISPR base editing system for use in an important model organism: budding yeast. Second, they demonstrate the utility of this system by using it to conduct an extremely high throughput study of the effects of mutation on protein abundance. This study confirms known protein regulatory relationships and detects several important new ones. It also reveals trends in the type of mutations that influence protein abundances. Overall, the findings are of high significance and the method appears to be extremely useful. I found the conclusions to be justified by the data.

      One potential weakness is that some of the methods are not described in the main body of the paper, so the reader has to really dive into the methods section to understand particular aspects of the study, for example, how the fitness competition was conducted.

      We expanded the first section for better readability.

      Another potential weakness is that the comparison of this study (of protein abundances) to previous studies (of transcript abundances) was a little cursory and left some open questions. For example, is it remarkable that the mutations affecting protein abundance are predominantly in genes involved in translation rather than transcription, or is this an expected result of a study focusing on protein levels?

      We thank the reviewer for pointing out that this paragraph requires more explanation. We expanded it as follows: “Of these 29 genes, 21 (72%) have roles in protein translation—more specifically, in ribosome biogenesis and tRNA metabolism (FDR < 8.0e-4, Figure 5C). In contrast, perturbations that affect the abundance of only one or two of the eleven proteins mostly occur in genes with roles in transcription (e.g., GO:0006351, FDR < 1.3e-5). Protein biosynthesis entails both transcription and translation, and these results suggest that perturbations of translational machinery alter protein abundance broadly, while perturbations of transcriptional machinery can tune the abundance of individual proteins. Thus, genes with post-transcriptional functions are more likely to appear as hubs in protein regulatory networks, whereas genes with transcriptional functions are likely to show fewer connections.”

      Overall, the strengths of this study far outweigh these weaknesses. This manuscript represents a very large amount of work and demonstrates important new insights into protein regulatory networks.

    1. Let us, therefore, turn to the experience itself. Upon a black cloth two squares of gray cardboard lie side by side. I am to judge whether or not they are of equal grayness. What is my experience? I can think of four different possibilities. (1) I see on a black surface one homogeneous gray oblong with a thin division line which organizes this oblong into two squares. For simplicity's sake we shall neglect this line, although it has varying aspects. (2) I see a pair of "brightness steps" ascending from left to right. This is a very definite experience with well-definable properties. Just as in a real staircase the steps may have different heights, so my experience may be that of a steep or a moderate ascent. It may be well-balanced or ill-balanced, the latter e.g. when there is a middle gray on the left and a radiant white on the right. And it has two steps. This must be rightly understood. If I say a real stair has two steps, I do not say there is one plank below and another plank above. I may find out later that the steps are planks, but originally I saw no planks, but only steps. Just so in my brightness steps: I see the darker left and the brighter right not as separate and independent pieces of color, but as steps, and as steps ascending from left to right. What does this mean? A plank is a plank anywhere and in any position; a step is a step only in its proper position in a scale. Again, a sensation of gray, for traditional psychology, may be a sensation of gray anywhere, but a gray step is a gray step only in a series of brightnesses. Scientific thought, concerned as it is with real things, has centered around concepts like "plank" and has neglected concepts like "step."[7] Consequently the assertion has become true without qualification that a "step" is a "plank". Psychology, although it is [p. 541] concerned with experiences, has invariably taken over this mode of procedure. But since the inadequacy occasioned by the neglect of the step-concept is much more conspicuous in psychology than it is in physics, it is our science that first supplied the impulse to reconsider the case. And when we do reconsider, we see at once that the assertion "a sensation of gray is a sensation of gray anywhere" loses all meaning,[8] and that the assertion that a real step is a plank is true only with certain qualifications.
    1. this article may be the first time you’re reading about, and considering, the complexities of Asian Americans as racialized subjects

      For me, this really is the first time I am learning about the Asian American experience. It is very interesting to learn about more cultures and to think about how we all face some sort of inequality.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors proposed that the stable and opened membrane neck that connects the bud to the cytoplasm may persist for a long time in the infected cell during active RNA production. The viral ring-shaped nsPs is supposed to have an important role of maintaining this stable high-curvature membrane neck. It is suggested that the nsP1 dodecamer may pull together the membrane inner surface in the neck region via electrostatic interactions. Namely the authors observed that in the absence of negatively charged membrane lipids nsP1 did not bind appreciably to the membrane. The presented experimental data and theoretical consideration suggest that the CHIKV spherule consists of a membrane bud filled with viral RNA, and has a macromolecular complex gating the opening of this bud to the cytoplasm.

      The presented results are interesting, the manuscript is well written and can be published after revision. The following comments are offered to the authors' consideration.

      We thank the reviewer for this positive overall assessment.

      1. Since there is no protein coating over the curved surface of the membrane bud, the authors concluded that the membrane neck must be stabilised by a specific mechanism involving nsP1. It was further assumed that the viral protein nsP1 serves as a base for the assembly of a larger protein complex at the neck of the membrane bud. In addition to the suggested mechanism of neck stabilization, a thin, highly curved membrane neck can also be stabilised by accumulation of membrane components having the appropriate membrane curvature (i.e. negative intrinsic curvature or anisotropic intrinsic curvature), see Kralj-Iglic et al., Eur. Phys. J. B., 10: 5-8 (1999), https://doi.org/10.1007/s100510050822.

      Please discuss this issue in the manuscript.

      This is a good point, thank you for making it. In the revised manuscript we discuss both the possibility of lipid sorting into the neck region by nsP1 (lines 217-222), and the mentioned paper regarding anisotropic inclusions (lines 268-271).

      2. In Eq. (1) the Gaussian curvature term (appearing in the Helfrich bending energy) is not included. Usually this term is omitted in the case of closed membrane shapes (i.e. so-called spherical topology) due to the validity of the Gauss-Bonnet theorem. In the present manuscript/work the shape equation was solved for a membrane patch. Can you therefore please briefly explain to the reader why you can omit the Gaussian curvature term from Eq. (1)? For example, due to the fixed inclination angle and fixed curvature at the boundary, .....

      Thanks for finding this omission. We have now revised the manuscript to describe why we can omit the Gaussian curvature term (lines 241-245).
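
For readers without access to the revised manuscript, the textbook form of this argument can be sketched as follows. It is not necessarily the authors' wording, Eq. 1 is not reproduced in this reply, and the symbols $\bar{\kappa}$ (Gaussian bending modulus), $K$ (Gaussian curvature), $k_g$ (geodesic curvature of the boundary) and $\psi_b$ (membrane tangent angle at the boundary) are notation assumed here. By the Gauss-Bonnet theorem the Gaussian contribution of a patch $S$ depends only on its topology and its boundary:

```latex
E_G \;=\; \bar{\kappa} \int_S K \,\mathrm{d}A ,
\qquad
\int_S K \,\mathrm{d}A \;=\; 2\pi\,\chi(S) \;-\; \oint_{\partial S} k_g \,\mathrm{d}s .
```

For an axisymmetric patch with disc topology ($\chi = 1$) bounded by a circle at which the tangent angle is $\psi_b$, the boundary integral is $2\pi\cos\psi_b$ under the usual orientation convention, so $E_G = 2\pi\bar{\kappa}\,(1-\cos\psi_b)$. With the boundary angle held fixed this is a constant, and it drops out when the energy is minimised over shapes.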

      3. "Sigma" and "P" can also be considered as global Lagrange multipliers for the constraint of fixed total membrane area of the bud (including the neck membrane) and the constraint of fixed volume of the bud. If you then also take into account separately the equation for the fixed membrane area, you could predict different shapes of the bud (by solving the shape equation) at fixed area of the bud, calculated for different values of the model parameters (and different boundary conditions). In this case Sigma is the result of the variational procedure (as is P if you also consider the constraint of fixed bud volume). See for example Medical & Biological Engineering & Computing, vol. 37, pp. 125-129, 1999 and J. Phys. Condens. Matter, vol. 4, pp. 1647-1657, 1992. Can you please also briefly discuss this issue in the manuscript?

      This is an interesting point. We now discuss this and cite the mentioned papers at the end of the theory section in the supplementary information (lines 203-205) as well as briefly mentioning it when discussing Eq. 1 (lines 240-242).

      **Referees cross-commenting**

      I agree as well.

      Reviewer #1 (Significance (Required)):

      The presented experimental and theoretical results are interesting, the manuscript is well written and can be published after revision.

      We thank the reviewer for this appreciative comment.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In their manuscript "Architecture of the chikungunya virus replication organelle" Laurent and colleagues show:

      - the 3D structure of the "neck complex" that forms the gateway between the Chikungunya virus replication/transcription organelle (termed "spherule") and the cytoplasm of infected cells. The structure was obtained by native electron cryo-tomography and sub-tomogram averaging of BHK cells infected with a single-cycle replicon system encoding all components of the viral replication machinery. The nominal resolution of the structure is 28 Å. The viral nsP1 protein, for which two high-resolution structures have previously been published, could unambiguously be located within the density of the neck complex.

      - nsP1 interaction with membranes relies on lipids with a single negative net charge, such as POPS, POPG and PI, whereas two different PIPs with a negative net charge greater than one support nsP1 binding less efficiently. These membrane determinants for nsP1 binding were elucidated using two complementary methods: multilamellar vesicle pulldown assays and confocal imaging of fluorescently labeled giant unilamellar vesicles in the presence of fluorescently labeled nsP1. Purified nsP1 was produced in E. coli.

      - nsP1 recruits nsP2 (another component of the neck complex) to membranes with suitable lipid composition. This observation was made using the same multilamellar vesicle pulldown assay.

      - the 3D organization of the viral genome within the spherule, demonstrating that each spherule contains one copy of the genome as a double-stranded RNA molecule. This analysis was carried out by segmentation of the same tomograms that were used to visualize the neck complex.

      - the force exerted by RNA polymerization within the spherules is sufficient to drive membrane remodeling. This is a theoretical argument based on mathematical modelling.

      Major comments:

      The article is written clearly and all major claims seem justified. The biochemical assays are presented in duplicates or triplicates, which is sufficient to derive the provided conclusions. The workflow for electron cryo-tomography analysis seems sound, even though the low number of individual particles (=64) for sub-tomogram averaging of the neck complex limits the resolution of its final structure. Given the strong competition in the field, and considering the high experimental workload that would be required for further improvement of the resolution, I do not recommend any additional benchwork for this paper.

      We thank the reviewer for this assessment, especially for recognising the challenge in obtaining a larger number of spherule subtomograms under the complex replicon particle protocol we had to use in order to study the BSL3 CHIKV under BSL2 conditions.

      My only concern is the accuracy of the experimental genome length measurements, which has important implications for their mechanistic interpretation. The type of tomograms that have been recorded here inherently suffers from anisotropy with respect to both resolution and contrast. This makes accurate tracing of tangled filaments very challenging, and in this light, I congratulate the authors for the impressively good agreement of their average experimentally determined genome length with the theoretical genome length (Figure 4C). As to be expected, however, the second supplementary video clearly shows multiple gaps in the traced genome, implying that there must necessarily be errors in the length measurements. Unless there is a possibility to confidently estimate the magnitude of these errors, my preferred interpretation would be that the vast majority of imaged spherules - regardless of their temporary volume in the moment of sample freezing - likely contains precisely one copy of the double-stranded RNA genome, and not fractions thereof as is suggested in the text (for example, line 305: "Analysis of the cryo-electron tomograms gave a clear answer to the question of the membrane bud contents: the lumen of full-size spherules consistently contains 0.8-0.9 copies."). I feel that this subject deserves more discussion in the manuscript. If the authors prefer to keep their original interpretation that the majority of spherules contains only fractions of full genomes, I invite them to provide an explanation for why their experimental genome length measurements are sufficiently accurate to favor this rather surprising conclusion over my more trivial interpretation. If I understand correctly, my preferred interpretation has implications for the mathematical model for membrane remodeling (Equation 2).

      This is a good point. In fact, we agree that our original manuscript and wording was unclear and we agree with the reviewer’s interpretation (“my preferred interpretation would be that the vast majority of imaged spherules - regardless of their temporary volume in the moment of sample freezing - likely contains precisely one copy of the double-stranded RNA genome”). We have now changed the text to reflect that we believe we have a 10-20% false negative rate in the filament tracing and that the most likely interpretation is indeed that each spherule has exactly one genome copy (lines 207-210). In addition, we looked at the possible consequences of the slight underestimation of the filament length for the mathematical model, and describe on lines 257-264 why this in fact would have no impact on the conclusions of the modeling.
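
The arithmetic behind this reading can be sketched as below, under the simplifying assumption (ours, not stated in the reply) that missed segments shorten the traced length proportionally. The function name is hypothetical; the 0.8-0.9 copy range and the 10-20% false negative rate are the figures quoted above.

```python
def corrected_copies(traced_copies, false_negative_rate):
    """Estimate the true genome copy number from a traced estimate.

    Assumes the tracing misses a fixed fraction of the filament, so that
    traced = true * (1 - false_negative_rate).
    """
    if not 0.0 <= false_negative_rate < 1.0:
        raise ValueError("false_negative_rate must be in [0, 1)")
    return traced_copies / (1.0 - false_negative_rate)


# Traced copy numbers from the quoted text, paired with the matching
# ends of the stated 10-20% false negative range.
for traced, fn_rate in [(0.8, 0.20), (0.9, 0.10)]:
    estimate = corrected_copies(traced, fn_rate)
    print(f"traced {traced:.1f} copies, {fn_rate:.0%} missed -> ~{estimate:.2f} copies")
```

Under this model both ends of the traced range map back to roughly one full genome copy, which is consistent with the interpretation the reviewer proposed.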

      Minor comments:

      Virus taxa should be capitalized and written in italics wherever applicable. I recommend adhering to the following rules:

      https://talk.ictvonline.org/information/w/faq/386/how-to-write-virus-species-and-other-taxa-names

      Thank you for helping us clarify this. In response to this we have now italicized and capitalized all virus taxa.

      Figure 2I looks as if the pink cross-section of nsP1 has not been scaled correctly. Comparison to Figure 2H gives me the impression that the diameter of the pink nsP1 ring in Figure 2I should be scaled down relative to the greyscale neck complex.

      We would like to thank the reviewer for their keen eye. There was indeed a scaling problem, which we have now solved in the updated Fig. 2.

      The caption of Figure 2 calls more panels than are provided in the figure. The caption "panel E" seems to be obsolete.

      Thanks for finding this mistake. We have now revised Fig. 2 and its legend.

      In the methods, centrifugation speed should be given in units of relative centrifugal force (rcf) instead of revolutions per minute (rpm), especially for the MLV pulldown assay where no rotor is indicated.

      We agree and have changed this on lines 482, 490, 524, 531, 543 and 597 of the manuscript.
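
The conversion the reviewer asks for depends on the rotor radius, which is why rpm alone is ambiguous. A minimal sketch of the standard relation RCF = ω²r/g follows; the function names and the example radius are hypothetical and are not values from the manuscript.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2, conventional standard value


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (in multiples of g) at a given radius."""
    omega = 2.0 * math.pi * rpm / 60.0           # angular speed in rad/s
    return omega ** 2 * (radius_cm / 100.0) / STANDARD_GRAVITY


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed in rpm needed to reach a given RCF at a given radius."""
    omega = math.sqrt(rcf * STANDARD_GRAVITY / (radius_cm / 100.0))
    return omega * 60.0 / (2.0 * math.pi)


# Hypothetical example: a rotor with a 10 cm maximum radius.
example_rcf = rpm_to_rcf(10_000, radius_cm=10.0)
print(f"10,000 rpm at r = 10 cm is about {example_rcf:,.0f} x g")
```

The same rpm setting gives a different force on a rotor with a different radius, so reporting rcf lets a protocol be reproduced on any centrifuge.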

      In the methods for the MLV assay, the lipid:protein ratio is given with 500:1. It should be specified whether this is a mass ratio or a molar ratio.

      It was a molar ratio, which we have now specified on line 595.

      In the methods, the buffer composition for the mass photometry measurement should be indicated.

      Good point. We added this on lines 632-633.

      **Referees cross-commenting**

      I agree with the other reviewers' remarks.

      Reviewer #2 (Significance (Required)):

      Chikungunya virus is a very important human pathogen, and research on the architecture of its replication/transcription organelle holds great promise for the development of future therapies. Laurent and colleagues advanced this field by providing pioneering low-resolution 3D structures of the membrane-bound viral protein complex and the viral RNA content of this organelle in situ. In addition, they also assessed the lipid requirements for membrane interaction of the primary viral membrane anchor of this complex, nsP1, in vitro. Underlining the importance of these results, a competing group submitted a partially overlapping study to bioRxiv three months ahead (https://doi.org/10.1101/2022.04.08.487651). Whereas the competing group describes the structure of the neck complex at a much higher resolution, it neither analyzes the RNA content of the spherules nor addresses the lipid preferences of nsP1. The present study by Laurent and colleagues should therefore be of great interest to many virologists and cellular biologists.

      I am a structural virologist with a focus on envelope glycoproteins. Of relevance to this review, I have experience with cellular electron cryo-tomography and sub-tomogram averaging, as well as in-vitro protein/liposome interaction assays. I do not feel qualified to evaluate the details of the mathematical model for membrane remodeling that is used in the last results section of this manuscript.

      We thank reviewer 2 for this positive overall assessment of our work.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This is an interesting and well written paper describing the replication spherules generated by Chikungunya virus. Cryo-electron tomography was used to determine a low-resolution structure of the spherule, suggesting that nsP1 is located at the neck of the spherule. Segmentation of the tomograms combined with mathematical modeling was used to produce a structural model for RNA organization in the spherule, suggesting that each spherule contained approximately one copy of a full double-stranded RNA genome. I have a few minor comments:

      We are thankful for this positive overall assessment of our work.

      The structural studies were complemented with lipid binding assays, showing that nsP1 has an affinity for anionic lipids. While interesting, the connection of these experiments to the rest of the study seems tenuous. There is no further mention of them in the discussion or how they relate to the tomography and their replication model.

      We agree that those data were not as well integrated into the paper as they could have been, and are thankful that the reviewer pointed this out. To improve the integration of these data into the manuscript, we have expanded on two ways in which the reconstitution data relate to the rest of the paper: (i) the tomography led us to hypothesise that nsP1 recruits other nsPs to the membrane, which we could confirm with the reconstitution (lines 151-152, and throughout that paragraph), and (ii) the lipid preferences of nsP1 that we could measure using the titrating pulldown experiments inform the possible models for how the spherule membrane is remodeled, since nsP1 binds lipids that cannot on their own stabilize a neck shape (lines 217-222). We have also slightly expanded the discussion of the biochemistry and its relation to other data in the paper (lines 307-311).

      It is a nice match between the calculated length of the RNA (assumed to be ds) and the length of the vector, but the segmentation of the RNA is not completely convincing based on the provided images. It is difficult to distinguish the RNA strands from the noise and other components in the spherule and, at least by eye, the segments do not seem very connected. Please provide some more details on the tracing algorithm. Has it been validated on a known system?

      We appreciate this comment and recognise that we did not sufficiently explain the tracing algorithm. This software was in fact custom written (by others, ca. 10 years ago) for cryo-electron tomography and has since been used by others in several studies of cellular cryo-electron tomograms, e.g. to study the actin cytoskeleton. We now mention this in the results (lines 195-196) and methods (lines 462-463).

      The tomogram video is nice, but it would be good to see a raw image as well, preferably covering a wider view that includes the whole cell, as well as a tomogram that represents the entire field of the reconstruction.

      This is a good suggestion. We unfortunately cannot provide images covering the entire cells since this is beyond the field of view of the electron microscope (and an image montage was not acquired at the time of data collection). However, we are now providing an additional supplementary movie that shows the entire field of view of the tomogram. In addition, we have uploaded two of the tomograms (including the uncropped tomogram from Figure 1) to EMDB where they will be downloadable by everyone after publication. We hope the reviewer appreciates that this is all that is technically possible at the moment.

      In figure 2, the panels are mislabeled relative to the legend, which refers to the color guide as its own panel.

      Thanks for pointing this out, we have rectified this in the revised Fig. 2 and its legend.

      Line 405: C36 symmetry? Why? Shouldn't it be C12 symmetry?

      36-fold symmetry was applied to the lipid membrane part to smoothen it further. The membrane part of the structure is simply outlining the neck shape and this is better visualised in this smoothened representation as also done e.g. in the study of the coronavirus neck complex (Wolff et al, Science 2020). We changed the methods text to make this more clear (line 449).

      Line 409: "fit" should be "fitted"

      Thanks, corrected in the revised manuscript (line 454).

      **Referees cross-commenting**

      I think we are all in good agreement, and I believe that the concerns raised can be addressed through a better explanation of the methods and improved discussion of their results.

      We also agree and believe we have addressed all of the remaining concerns in the revised manuscript.

      Reviewer #3 (Significance (Required)):

      This is a rather focused study, showing tomography data on the alphavirus replication complex. The main significance of the study is the description of the spherule's dimension and its relationship to the nature of the RNA, which provided a model for the replication process. While somewhat narrow in scope, the study should be of interest to people working in the virus replication and virus structure field. The lipid data are interesting, but do not seem well integrated with the rest of the study.

    1. LDST 200: Introduction to Leadership Studies and Applications Fall 2022 (Aug. 22-Oct. 14) Instructor Information: Jacob H. Stutzman, Ph.D Email: jhstutz@ku.edu Office Hours: by appointment (Links to an external site.) Required Materials Heifetz, R., Grashow, A., & Linsky, M. (2009). The practice of adaptive leadership: Tools and tactics for changing your organization and the world. Boston, MA: Harvard Business Press Institute for Leadership Studies. (2020). LDST 201: Introduction to Leadership Coursepack (5e) [provided on Canvas] Course Outcomes Upon completion of this course, students will examine and recall various theoretical approaches to leadership and leadership development; recall the four core leadership competencies and integrate each competency into their personal leadership development; explain and differentiate the role of ethics, diversity, and community development in leadership; theorize the ethical implications and applications of Adaptive Leadership and the four core leadership competencies; identify acts of Adaptive Leadership and distinguish between technical problems and adaptive challenges; distinguish between a learning/experimenting paradigm versus a problem/solution paradigm, along with contrasting the strengths and limitations of both; evaluate his/her own personal leadership strengths and challenges based on deliberate reflection; effectively communicate knowledge about and applications of leadership to others. How will we get from where we are to where we are hoping to go? Each week, you will work through a module that includes video lectures and readings (both from the assigned texts and some provided by the instructor) that will help you build a base of knowledge about leadership studies generally and Adaptive Leadership specifically. Each module will also include a quiz, a journal, and other writing assignments designed to help you put your knowledge to use by testing it and applying it to relevant scenarios. 
Most of the assignments will be completed individually, but there will be a limited amount of collaborative work. By thoughtfully and carefully completing each assignment, you will develop your knowledge of Adaptive Leader and explore the ways in which the principles of Adaptive Leadership can be useful in your own contexts. Assessments/Assignments Journals (7 @25 pts ea.) In each module (except for Module 8) there will be a prompt based on the material in the module. Responses should integrate the material from that week. The assignment expectations and sample journal entries are available on pp. 16-23 of the coursepack. Quizzes (6, 80 pts total) In each module (except for Module 8), there is a short quiz based on the material in the module. You may take each quiz twice, but the most recent score will always be the score that is recorded. Additional information is on p. 24 of the coursepack. Exams (2 @80 pts ea.) There will be two multiple choice exams in the course. These exams will cover material presented in the readings, webinars, supporting documents, and videos comprising the Modules. Exams will open and close with the Modules, so each exam must be taken before the Module deadline for that week. Exam dates are listed in the course schedule. Once you login to take the exam, you must complete it within 60 minutes. You will only have one chance to complete each exam. Study guides for each exam are available on Canvas throughout the semester. Reflection Paper (150 pts) You will complete a final reflection paper that draws on the material covered throughout the course. A full description, rubric, and sample paper are in the coursepack on pp. 38-50 of the coursepack. Application Project (3 parts @80 pts ea.) You will work through a three-part project to do the work of Adaptive Leadership in a community to which you belong. Each phase of the project will build on the work done before, with the first phase due in Module 3. 
TruTalent Assessment/Letter (10 pts for the results, 40 pts for the letter) Each of you will complete the TruTalent assessment through the University Career Center and upload your results when they are ready. In Module 5, there is also an assignment that calls on you to reflect on your results in the form of a letter. The specific assignment will be available in that module. Ethics Discussions (30 pts) In Modules 4, 5, & 6, you will respond to a prompt and then reply to your classmates' responses in an annotation assignment. Details will be available in Module 4. Ethics Paper (75 pts) Using the framework provided in Module 4, each of you will prepare an ethics case study on a situation of your choosing. Details are in Module 4. Self-Care Plan (40 pts) Each of you will also complete a self-care plan, because doing leadership is hard work, and you can't pour from an empty cup. Details for the assignment are in Module 7. Total points available: 1000 Grade Distribution           🦄 B+: 875-899 C+: 775-799 D+: 675-699 💩 A: 925-1000 B: 825-874 C: 725-774 D: 625-674 💩 A-: 900-924 B-: 800-824 C-: 700-724 D-: 600-624 F: 0-599 Schedule Module Date Open Date Due Items Due Course Information Module; Module 1--Introduction Aug. 22 Aug. 27 Pre-Course Survey; Values Worksheet; Journal Module 2--History of Leadership Theories Aug. 22 Sept. 3 Quiz; Journal Module 3--Introduction to Adaptive Leadership Aug. 28 Sept. 10 Quiz; Journal; TruTalent results; Application Phase 1 Module 4--Diversity and Ethics Sept. 4 Sept. 17 Quiz; Journal; Ethics Paper; Ethics Annotation: Exam 1 Module 5--Leadership and Personality Sept. 11 Sept. 24 Quiz; Journal; TruTalent Letter; Ethics Annotation Module 6--Manage Self and Energize Others Sept. 18 Oct. 1 Quiz; Journal; Application Phase 2, Ethics Annotation Module 7--Diagnose the Situation and Intervene Skillfully Sept. 25 Oct. 8 Quiz; Journal; Self-Care Plan Module 8--Celebrations of Knowledge Oct. 2 Oct. 
15 Application Phase 3; Final Reflection Paper; Exam 2 Policies, Procedures, and the Like Canvas and Email This course will use Canvas for the dissemination of all lecture materials and reading assignments (other than the textbook), as well as the collection of all assessments. It is the student's responsibility to regularly check Canvas for updates and information. Emails sent through Canvas will go to your KU email address, so you must also check that email address regularly for information and communication. If you send an email from a non-university email address, I will reply to that address, but any emails I initiate will go to your university address. Assignments should not be submitted via email unless explicit, case-by-case arrangements are made. Incompletes In accordance with KU's policy on incompletesLinks to an external site., an I should only be assigned when some portion of the work for a course has not been done, for reasons beyond a student's control. Incompletes should be rare and will be assigned only in rare circumstances. If you believe such circumstances apply to your situation, please contact me as soon possible. Civility Each of us is an adult that has made the choice to be in this course. Recognizing that choice, each of us is expected to respect all points of view expressed in the classroom. Each person in this classroom should feel free to express her/his opinion and should feel an obligation to ensure that everyone else in the room feels the same freedom. Intolerance and incivility will not be tolerated, though disagreement and reasoned argument are strongly encouraged. Title IX of the Education Amendments of 1972 prohibits sex discrimination against any participant in an educational program or activity that receives federal funds, including federal loans and grants. Title IX also prohibits student‐to‐student sexual harassment. 
If you encounter unlawful sexual harassment, gender‐based discrimination, or other forms of prohibited harassment/discrimination, please talk with your professor or with the Office of Institutional Opportunity & Access at 785‐864-6414, or go to the Institutional Opportunity & Access page for more information and reporting tools. Academic Integrity and Intellectual Property Academic misconduct of any kind is not tolerated in this class. Both the definition of academic misconduct and potential sanctions for it are defined by KU policy. Plagiarizing another's work, knowingly misrepresenting the source of any academic work, giving or receiving unauthorized aid on assignments, and acting dishonestly in research are all subject to penalties. Similarly, submitting all or portions of an assignment completed in another class for a grade in this class is an act of academic misconduct. If you have outside work that you believe is appropriate and valuable to include in an assignment for this course, please speak with your instructor to establish appropriate guidelines. Additionally, all work produced for and in this course remains the intellectual property of the creator, including but not limited to: the textbooks, the lectures, and student assignments. No work may be reused, reproduced, or distributed without the express permission of the work's creator. This includes sharing notes or course materials with commercial or nonprofit services/databases. This policy does NOT include taking notes for personal use or a student volunteer taking notes for someone with a reasonable accommodation identified by the Student Access Center. Accessibility If you believe you need or would benefit from the accommodation of a disability, please contact the Student Access Center to discuss accommodations. 
Since accommodations may require early planning and generally are not provided retroactively, please contact the Center as soon as possible.

      My name is Eden. I'm from the most haunted town in Kansas, Atchison! My major is Philosophy and my minor is in History. My goal for this class is to learn more about leadership in a conceptual way, so that I can apply it in more real-life situations. My walk-up song would be Wake up by Young the Giant!

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Neville et al addresses the link between the localization and the activity of the so-called "Pins complex" or "LGN complex", that has been shown to regulate mitotic spindle orientation in most animal cell types and tissues. In most cell types, the polarized localization of the complex in the mitotic cell (which can vary between apical and basolateral, depending on the context) localizes pulling forces to dictate the orientation. The authors reexplore the notion that this polarized localization of the complex is sufficient to dictate spindle orientation, and propose that an additional step of "activation" of the complex is necessary to refine positioning of the spindle.

      The experiments are performed in the follicular epithelium (FE), an epithelial sheet of cell that surrounds the drosophila developing oocyte and nurse cells in the ovarium. Like in many other epithelia, cell divisions in the FE are planar (the cell divides in the plane of the epithelium). The authors first confirm that planar divisions in this epithelium depends on the function of Pins and its partner mud, and that the interaction between the two partners is necessary, like in many other epithelial structures. Planar divisions are often associated with a lateral/basolateral "ring" of the Pins complex during mitosis. The authors show that in the FE, Pins is essentially apical in interphase and becomes enriched at the lateral cortex during mitosis, however a significant apical component remains, whereas mud is almost entirely absent from the apical cortex. Pins being "upstream" of mud in the complex, this is a first hint that the localization of Pins is not sufficient to dictate the localization of mud and of the pulling forces.

      The authors then replace wt Pins, whose cortical anchoring strongly relies on its interaction with Gai subunits, with a constitutively membrane anchored version (via a N-terminal myristylation). They show that the localization of myr-Pins mimics that of wt-Pins, with a lateral enrichment in mitosis, and a significant apical component. Since a Myr-RFP alone shows a similar distribution, they conclude that the restricted localization of Pins in mitosis is a consequence of general membrane characteristics in mitosis, rather than the result of a dedicated mechanism of Pins subcellular restriction. Remarkably, Myr-Pins also rescues Pins loss-of-function spindle orientation defects.

      They further show that the cortical localization of Pins does not require its interaction with Dlg (unlike what has been suggested in other epithelia). However, spindle orientation requires Dlg, and in particular it requires the direct Dlg/Pins interaction. The activity of Dlg in the FE appears to be independent from khc73 and Gukholder, two of its partners involved in its activity in microtubule capture and spindle orientation in other cell types. Based on all these observations, the authors propose that Dlg serves as an activator that controls Pins activity in a subregion of its localization domain (in this case, the lateral cortex of the mitotic FE cell). They propose to test this idea by relocalizing Pins at the apical cortex, using Inscuteable ectopic expression. With the tools that they use to drive Inscuteable expression, they obtain two populations of cells. One population has a stronger apical than basolateral Insc distribution, and the spindle is reoriented along the apical-basal axis; the other population has higher basolateral than apical levels of Insc distribution, and the spindle remains planar. The authors write that Pins localization is unchanged between the two subsets of cells (although I do not entirely agree with them on that point, see below), and that although Mud is modestly recruited to the apical cortex in the first population, it remains essentially basolateral in both. In this situation, the localization of Insc in the cell is therefore a better predictor of spindle orientation than that of Pins or Mud. Remarkably, removing Dlg in an Insc overexpression context leads to a dramatic shift towards apical-basal reorientation of the spindle, suggesting that loss of Dlg-dependent activation of the lateral Pins complex reveals an Insc-dependent apical activation of the complex. Overall, I find the demonstration convincing and the conclusion appropriate. 
One of the limitations of the study is the use of different drivers and reporters for the localization of Pins, which makes it hard to compare different situations, but not to the point that it would jeopardize the main conclusions. I do not have major remarks on the paper, only a few minor observations and suggestion of simple experiments that would complete the study.

      Minor:

      What happens to Pins and Mud in Dlg mutant cells that overexpress Insc and behave as InscA? Are they still essentially lateral, or are they more efficiently recruited to the apical cortex?

      This is a terrific question. Of course we would love to know and intend to find out.

      One way to do this (consistent with the manuscript) would be to generate flies that are Dlg[1P20], FRT19A/RFP-nls, hsflp, FRT19A; TJ-GAL4/+; Pins-Tom, GFP-Mud/UAS-Insc. (Note that these flies would only allow us to image Mud; we would have to repeat the experiment using GFP FRT19A; hsflp 38 to see Pins. This isn’t ideal given that we’d like to image both together). Generating these flies is a major technical challenge because of the number of transgenes and chromosomes involved.

      Our preferred way to do this would be to generate flies that are Dlg[1P20]/Dlg[2]; TJ-GAL4/+; Pins-Tom, GFP-Mud/UAS-Insc. So far, we’ve been unsuccessful. We are now undertaking a modified crossing scheme that we hope will solve the problem, though we aren’t overly optimistic about the outcome. We find that the temperature-sensitive mutation Dlg[2] presents an activation barrier; while we are able to generate flies that are Dlg[2] / FM7 in combination with transgenes and/or mutations on other chromosomes, we do not always recover the Dlg[2] / Y males (which must develop at 18 °C) from these complex genotypes.

      In the longer term (outside the scope of revision), we are working to develop more tools for imaging Mud and Pins that we hope will help answer this question.

      Regarding the competition between Pins and Insc for dictating the apical versus basolateral localization of Insc, the Insc-expression threshold model could be easily tested in Pins62/62 mutants, where it is expected that only InscA localization should be observed, even at 25 °C (unless Pins is required for the cortical recruitment of Insc, as it is the case in NBs - see Yu et al 2000 for example).

      This is another great experiment and one we’d love to carry out. Again, the genetics are currently challenging, only because both UAS-Inscuteable and FRT82B pinsp62 are on the third chromosome. (Right now we’re trying to hop UAS-Inscuteable to the second).

      However, we do have another idea for testing the threshold model, which is to repeat the experiment in which we express UAS-Insc in cells that are Dlg1P20/1P20 at 25 °C. Because the relevant cells (UAS-Insc OX in Dlg mitotic clones) are relatively rare, we have not yet been able to collect enough examples to draw a firm conclusion. However, our preliminary results (only six cells so far!) suggest that more InscB cells are observed at the lower temperature, consistent with the threshold model.

      I do not agree with the authors on P.10 and Figure 6A-D, when they claim that the apical enrichment of Pins is equivalent in both InscA and InscB cells. The number of measured cells is very low, and the ratio of apical/lateral Pins differs between the two sets of cells. The number of cells should be increased and the ratios compared with a relevant statistic method.

      Totally fair. We are working to add more data to these panels (6B and 6D). The trend observed in 6D may be softening in agreement with the reviewer’s prediction, although we do not yet have enough new data points to be confident in that conclusion. Therefore, we have not yet updated the manuscript, though we expect to do so during the revision period. We will also add a statistical comparison. Importantly, as the reviewer suggested, this does not alter our conclusions.

      A lot of the claims on Pins localization rely on overexpression (generally in a Pins null background) of tagged Pins expressed from different promoters or drivers, and fused to different fluorescent tags. Therefore, it is difficult to evaluate to which extent the localization reflects an endogenous expression level, and to compare the different situations. As the cortical localization of Pins relies on interaction with cortical partners (mostly GDP-bound Gai) which are themselves in limiting quantity in the cell, and in the case of Gai-GDP, regulated by Pins GDI activity, this poses a problem when comparing their distribution, because the expression level of Pins may contribute to its cortical/cytoplasmic ratio, but also to its lateral/apical distribution. Although I understand that the authors have been using tools that were already available for this study, I think it would be more convincing if all the Pins localization studies were performed with endogenously tagged Pins, even those with Myr localization sequences. In an age of CRISPR-Cas-dependent homologous recombination, I think the generation of such alleles should have been possible. Although this would probably not change the main claims of the paper, it would have made a more convincing case for the localization studies.

      We don’t disagree at all with this point. We did indeed try to stick with the published UAS-Pins-myr-GFP, not only for convenience but because it allows us to make comparisons to other studies using the same tool (Chanet et al Current Biology 2017 and Camuglia et al eLife 2022). Another consideration is that we used only one driver across our experiments (Traffic jam-GAL4). It is quite weak at the developmental stages that we examine, meaning that overexpression is not a major concern. (Indeed we have struggled with the opposite problem).

      We certainly take the reviewer’s comment seriously and we therefore described it in the manuscript. We are currently working to develop endogenous tools using CRISPR.

      Paragraph added to Discussion – Limitations of our Study:

      “Another technical consideration is that our work makes use of transgenes under the control of Traffic jam-GAL4. While this strategy allows us to compare our results with previous work employing the same or similar tools, a drawback is that we cannot guarantee that Traffic jam-GAL4 drives equivalent expression to the endogenous Pins promoter (Chanet et al., 2017, Camuglia et al., 2022). However, given that Traffic jam-GAL4 is fairly weak at the developmental stages examined, we are not especially concerned about overexpression effects.”

      The authors should indicate in the figure legends or in the methods that the spindle orientation measurements for controls and for Pins62/62 are reused between Figures 1, 3, 4, 5, and 6, and between Figures 3, 4, and 5, respectively.

      Absolutely. Added to the Methods section.

      Reviewer #1 (Significance (Required)):

      Altogether, this study makes a convincing case that the localization of the core members of the pulling force complex, Pins and Mud, is not entirely sufficient to localize active force generation, and that the complex must be activated locally, at least in the FE. The notion of activation of the Pins/LGN complex has probably been on many people's mind for years: Pins/LGN works as a closed/open switch depending on the number of Gai subunits it interacts with, it must be phosphorylated, etc... suggesting that not all cortical Pins/LGN was active and involved in force generation. However the study presented here shows an interesting case where localization and activation are clearly disconnected. The authors show how Dlg plays this role in physiological conditions in the FE, and use ectopic expression of Insc to show that, at least in an artificial context, Insc can have the same "activating activity" (or at least an activating activity that is stronger than its apical recruitment capability and stronger than Dlg's activating activity). It is to my knowledge the first case of such a clear dissociation. In their discussion, the authors are careful not to generalize the observation to other tissues. Although I did not reexplore all that has been published on the Pins/LGN-NuMA/Mud complex over the last 20 years, my understanding is that despite interesting cases of distribution of the complex like that of Mud in the tricellular junction in the notum, the localization model can still explain most of the phenotypes that have been described without invoking an activation step. If it is the case, then the activation model is another variation (an interesting one!) 
on the regulation of the core machinery, which are plentiful as the authors indicate in their introduction, and is maybe specific to the FE; if not, then it would be interesting to push the discussion further by reexamining previous results in other systems, and pinpointing those phenotypes that could be better explained with an activation step.

      Overall, I find this is an elegant piece of work, which should be of interest to many cell and developmental biologists beyond the community of spindle orientation aficionados.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: The manuscript by Neville et al. addressed the mechanism how conserved spindle regulators (Pins/Mud/Gai/Dynein) control spindle orientation in the proliferating epithelia by revising "the canonical model", using the Drosophila follicular epithelium (FE). The authors examined the epistatic relationship among Pins, Mud and Dlg in FE and found that Pins controls the cortical localization of Mud by utilizing mutant analyses, and suggested their localization does not fully overlap using the newly generated knock-in allele. They also showed that Pins relocalization during mitosis depends on cortical remodeling, or passive model, where Pins localization changes with other membrane-anchored proteins. Their data further suggest that Pins cortical localization is not influenced by Dlg, but Pins-interacting domain of Dlg does affect spindle orientation. Based on these results, the authors propose that Dlg controls spindle orientation not by redistributing Pins, but by promoting (or "activating" from their definition) Pins-dependent spindle orientation. Interestingly, ectopic expression of Inscuteable (Insc) suggested that Insc localization, either apical or lateral, correlates with spindle orientation, and their localization is a dominant indicator of spindle orientation, compared to the localization of Pins and Mud, implicating potentially distinct roles of activation and localization of the spindle complex. Overall their genetic experiments are well-designed and provide stimulation for future research. However, their evidence is suggestive, but not conclusive for their proposal. I have several concerns about their conclusion and would like to request more detailed information as well as to propose additional experiments.

      Major concerns: 1. This report lacks technical and experimental details. As the typical fly paper, the authors need to show the exact genotypes of flies they used for experiments. This needs to be addressed for Figures 1-6, and Supplemental Figures. Especially, which Gal4 drivers were used for UAS-Pins wt or mutant constructs in Figure 4 with pins mutant background, Khc73, GUKH mutant backgrounds. Which exact flies were used for mutant clone experiments for Supplemental Figure 3? (A for typical mosaic, and B for MARCM). Without these details, it is impossible to evaluate results and reproduce by others.

      We take this concern very seriously!

      • We listed the GAL4 driver (Traffic jam-GAL4) in the first section of the Materials and Methods: Expression was driven by Traffic Jam-GAL4 (Olivieri et al., 2010). The transgene and relevant citation have been added to Table 1.
      • We explained the background stock for the MARCM experiment in the Materials and Methods: Mosaic Analysis with a Repressible Cell Marker (after the method of Lee and Luo) was carried out using GFP-mCD8 (under control of an actin promoter) as the marker. The transgene and relevant citation have been added to Table 1.
      • In line with other fly studies (eg. Nakajima et al., Nature 2013) and our own Drosophila work (Bergstralh et al Current Biology 2013, Bergstralh*, Lovegrove*, St Johnston NCB 2015, Bergstralh et al Development 2016, Finegan et al EMBO J 2019, Cammarota*, Finegan* et al Current Biology 2020) we were careful to show the relevant genotype components in each figure.
      • We included a fully referenced Supplementary Table (Table 1 – Drosophila genetics) listing every mutant allele or transgene with a citation and a note about availability. We have expanded this table in response to the reviewer’s concern (see above).

      2. Related to comment 1, how did the authors perform "clonal expression of Ubi-Pins-YFP" on page 5? As far as I understand, Ubi-Pins-YFP is expressed ubiquitously by the ubiquitin promoter.

      The reviewer makes a good point. We regret that we did not make this experiment more clear. Ubi-Pins-YFP was recombined onto an FRT chromosome (FRT82B). We made mitotic clones.

      We have clarified this in the Methods section as follows:

      “Mitotic clones of Ubi-Pins-YFP were made by recombining the Ubi-Pins-YFP transgene onto the FRT82B chromosome”

      3. On page 6, if Pins relocalization is passive and is associated with membrane-anchored protein remodeling during mitosis, its relocalization should be suppressed by disrupting the process of mitotic remodeling (mitotic rounding). The authors should test this: either genetic disruption or pharmacological treatment targeting actomyosin should cause defects in Pins relocalization, which would bolster their conclusion.

      We agree that this is a cool experiment and are happy to give it another shot. However, we do note that interpretation could be difficult. We don’t know that mitotic rounding and membrane-anchored protein remodeling during mitosis are inextricably linked. Notably, the remodeling we describe reflects cell polarity; apical components are evidently moved to the lateral cortex. This is contrary to the current understanding of rounding, which reflects isotropic actomyosin activity (Chanet et al., (2017) Curr.Biol. & Rosa et al., (2015) Dev. Cell.). Therefore we don’t understand what a “negative” result would mean, or, for that matter, whether a “positive” result would be safe to interpret.

      We have attempted many strategies to prevent cell rounding in the follicular epithelium, none of which have succeeded. 1) We attempted to genetically knock down Moesin in the FE and did not see an effect on cell rounding. However, we couldn’t confirm knockdown and therefore are not confident in this manipulation. 2) It is difficult to interpret the result of genetically disrupting Myosin, because it causes pleiotropic effects, such as inhibition of the cell cycle and disruption of monolayer architecture. 3) We treated egg chambers with Y-27632 (a Rok inhibitor) and examined its effect on mitotic cell rounding and on cytokinesis, which are Rok-dependent processes. Our experiments were performed using manually dissociated ovarioles treated for 45 minutes in Schneider Cell Medium supplemented with insulin. Even at our maximum concentration of 1 mM Y-27632, several orders of magnitude above the Ki, we were unable to see any effect on mitotic cell shape or actin accumulation at the mitotic cortex and did not observe any evidence of defective cytokinesis. We also did not observe defects in spindle organization or orientation, as would be expected from failed rounding. We therefore do not believe that the inhibitor works in this tissue. One possible explanation is that the follicle cells are secretory, and likely to pass molecules taken up from the media quickly into the germline. Therefore, we do not anticipate that we can perform this experiment to our satisfaction.

      4. The critical message in this manuscript is that the core spindle complex mediated by Pins-Mud controls spindle orientation by "activation", but not localization. The findings that Pins and Mud localization is not influenced by Insc and that ectopic Insc expression and genetic Mud depletion (Figure 6) might support their proposal, but these results just suggest their localization does not matter. I wonder how the authors could conclude and define "activation". What does this activation mean in the context of spindle orientation? Can the authors test activation by enzymatic activity or assess dynamics of spindle alignment?

      We intend for the critical message of the manuscript to be that “The spindle orienting machinery requires activation, not just localization.” We absolutely do not make the claim that localization is not important, only that it is not sufficient. The reviewer recognizes this point here and in a subsequent comment: “The authors showed that Pins and Mud localization themselves are not sufficient for the control of spindle orientation with genetic analyses.”

      We also do not claim that Pins and/or Mud localization are not impacted by Inscuteable. On the contrary, we plainly see and report that they are; the intensity profiles in Figure 6 are distinct from those in Figure 2, as discussed in the text.

      We appreciate the reviewer’s point about activation. Since we do not understand these proteins to be enzymes, we aren’t sure what enzymatic activity would be tested. The dynamics of spindle alignment in this slowly developing system are prohibitively difficult to measure: the mitotic index is very low (~2%) and only a very small fraction of those cells will be in a focal plane that permits accurate live imaging in the apical-basal axis. Alternative modes of activation include conformational change and/or a connection with other important molecules. The simplest possibility would be that Dlg allows Pins to bind Mud, but so far our data do not support it. We have added the following paragraph to our discussion:

      “The mechanism of activation remains unclear. While the most straightforward possibility is that Dlg promotes interaction between Pins and Mud, our results show that Mud is recruited to the cortex even when Dlg is disrupted (Figure 4D). Alternatively, Discs large may promote a conformational change in the spindle-orientation complex and/or a change in complex composition. Furthermore, the Inscuteable mechanism is not likely to work in the same way. Dlg binds to a conserved phosphosite in the central linker domain of Pins and should therefore allow for Pins to simultaneously interact with Mud (Johnston et al., 2009). Contrastingly, binding between Pins and Inscuteable is mediated by the TPR domains of Pins, meaning that Mud is excluded (Culurgioni et al., 2011; 2018). While a stable Pins-Inscuteable complex has been suggested to promote localization of a separate Pins-Mud-dynein complex, our work raises the possibility that it might also or instead promote activation.”

      5. On pages 7-8, although Pins-S436D rescues spindle orientation and Pins-S436A does not in a pins null clone background, Pins localization is not influenced by Dlg. This raises the question of how exactly Pins and Dlg interact, and how Dlg affects Pins function. Relatedly, the embryonic Pins:Tom localization in the dlg mutant does not provide strong evidence to support their conclusion, given that the experimental context differs from the previous study (Chanet et al., 2017).

      We agree with the reviewer. Our data (this paper and previous papers) and the work of others indicate that this interaction is important for spindle orientation (Bergstralh et al., 2013a; Saadaoui et al., 2014; Chanet et al., 2017). However, we show here that Dlg doesn’t obviously impact Pins localization (as proposed in our earlier paper), but does impact the ability of the spindle orientation machinery to work (hence activity).

      The reviewer makes a very good point. Our experimental context is different from the previous study concerning Pins and Dlg in embryos: Chanet et al (2017) performed their work in the embryonic head, whereas we look at divisions in the ventral embryonic ectoderm. These are distinct mitotic zones (Foe et al. (1989) Development) and exhibit distinct epithelial morphologies. We show that Pins:Tom localizes at the mitotic cell cortex in Dlg[2]/Dlg[1P20] in cells in the ventral embryonic ectoderm. Our only conclusion from this experiment is that Pins:Tom can localize without the Dlg GUK domain in another cell type (outside the follicular epithelium). In the current preliminary revision we have softened our claim as follows:

      “We also examined the relationship between Pins and Dlg in the Drosophila embryo. A previous study showed that cortical localization of Pins in embryonic head epithelial cells is lost when Dlg mRNA is knocked down (Chanet et al., 2017). We find that Pins:Tom localizes to the cortex in the ventral ectoderm of early embryos from Dlg1P20/Dlg2 mothers, indicating that Pins localization in the ventral embryonic ectoderm epithelium does not require direct interaction with Dlg. We therefore speculate that Dlg plays an additional role in that tissue, upstream of Pins (Figure 4G).”

      Our intention is to elaborate on our findings with additional data from embryos. To this end we have already acquired preliminary control data investigating the spindle angle with respect to the plane of the epithelium, and are in the process of examining spindle angles in dlg mutant embryonic tissue.

      6. On page 11, the authors state "... that activation of pulling in the FE requires Dlg". I was not convinced by anything related to "pulling". There is no evidence to support "pulling" or such dynamics in this paper, just showing Mud localization, correct?

      We appreciate the reviewer’s concern. The original sentence read that “We interpret [our data] to mean that interaction between Pins and Dlg, which is required for pulling, stabilizes the lateral pulling machinery even if Dlg is not a direct anchor.” This statement is based on work across multiple systems, including the C. elegans embryo (Grill et al Nature 2001), the Drosophila pupal notum (Bosveld et al, Nature 2016), and HeLa cells (Okumura et al eLife 2018), which shows that Mud/dynein-mediated pulling (on astral microtubules) orients/positions spindles. This is described in the introduction.

      To address the reviewer’s particular concern, we have replaced “pulling” with “spindle-orientation machinery,” so that this sentence now reads …“activation of the spindle-orientation machinery in the FE requires Dlg.”

      7. Ectopic expression of Insc (Figure 6) provided a new idea and hypothesis, but the conclusion is more complicated given that Insc is not expressed in normal FE. For example, the statement that "Inscuteable and Dlg mediate distinct and competitive mechanism for activation of the spindle-orienting machinery in follicle cells" is probably right, but it does not show anything meaningful since Insc does not exist in normal FE. Is Dlg in a competitive situation during mitosis of FE? If so, which molecules are competitive against Dlg? The important issue is to provide a new interpretation of how spindle orientation is controlled epithelial cells. I strongly recommend adding models to this manuscript for clarity.

      We considered the addition of model cartoons very carefully in preparing the original manuscript, and again after review. While we are certainly not going to “dig in” on this point, our concern is that model figures would obscure rather than clarify the message. As the reviewer points out, we do not understand how activation works, and as discussed in the manuscript we don’t think it’s likely to work the same way in follicle cells (Dlg) as it does in neuroblasts (Insc). Therefore model figure(s) are premature.

      We do not agree with the statement that "Inscuteable and Dlg mediate distinct and competitive mechanism for activation of the spindle-orienting machinery in follicle cells… does not show anything meaningful.” This is a remarkable finding because it suggests that there is more than one way to activate Pins. Given the critical importance of spindle orientation in the developing nervous system, and the evolutionary history of the Dlg-Pins interaction, we think that this finding supports a model in which the Dlg-Pins interaction evolved in basal organisms, and a second Inscuteable-Pins interaction evolved subsequently to support neural complexity. These ideas are raised in the Discussion.

      The reviewer also writes that “The important issue is to provide a new interpretation of how spindle orientation is controlled epithelial cells.” We find this concern perplexing, since the reviewer clearly recognizes that we have provided a new interpretation: Dlg is not a localization factor but rather a licensing factor for Pins-dependent spindle orientation.

      Minor comments: 8. Some sections were not written well in the manuscript. "It does not" in page 6. "These predictions are not met". I just couldn't understand what they stand for. Their writing has to be improved.

      Again, we are not going to dig in here, but we would prefer to retain the original language, which in our opinion is fairly clear. Our study is hypothesis-driven and based on assumptions made by the current model. We used direct language to help the reviewer understand what happened when we tested those assumptions.

      1. In page 9, Supplementary Figure 4 should be cited in the paragraph (A potential strategy for..), not Supplemental Figure 1A, and 1B.

      Good catch, thank you! We have corrected this.

      1. In page 10, the authors examine aPKC localization in Insc expressing context of FE. Does aPKC localization correlate with Insc localization (Insc dictates aPKC?)? aPKC is not involved in spindle orientation from the author's report (Bergstralh et al., 2013), so it does not likely provide any supportive evidence.

      I’m afraid we don’t entirely understand this comment. The interdependent relationship between aPKC and Inscuteable localization is long-established in the literature and was previously addressed in the follicle epithelium (Bergstralh et al. 2016). We do not make the claim here that aPKC governs spindle orientation. We are emphasizing that the difference between InscA and InscB cells extends to the relocalization of polarity components involved in Insc localization. As described in the manuscript, these data are provided to support our threshold model:

      “In agreement with interdependence between Inscuteable and the Par complex, we find that aPKC is stabilized at the apical cortex in InscA cells but enriched at the lateral cortex in InscB cells (Figure 6E). This finding is consistent with an Inscuteable-expression threshold model; below the threshold, Pins dictates lateral localization of Inscuteable and aPKC. Above the threshold, Inscuteable dictates apical localization of Pins and aPKC.”

      1. In Dicussion page 12, "In addition, we find that while the LGN S408D (Drosophila S436D) variant is reported to act as a phosphomimetic, expression of this variant has no obvious effect on division orientation (Johnston et al., 2012)". Where is the evidence for this? I interpret that this phosphomimetic form can rescue like wt-Pins not like unphospholatable mutant S436A, so it means that have an effect on spindle orientation, correct?

      The reviewer makes a good point. We regret the confusion. We mean to point out that the S436D variant is no different from the wild type. We have amended the text to clarify:

      “In addition, we find that while the LGN S408D (Drosophila S436D) variant is reported to act as a phosphomimetic, this variant does not cause an obvious mutant phenotype in the follicular epithelium (Johnston et al., 2012). What then is the purpose of this modification? Since the phosphosite is highly conserved through metazoans, one possibility to consider is that the phosphorylation regulates the spindle orientation role of Pins, whereas unphosphorylated Pins plays a different role (Schiller and Bergstralh, 2021).”

      Reviewer #2 (Significance (Required)):

      The authors showed that Pins and Mud localization themselves are not sufficient for the control of spindle orientation with genetic analyses. While the authors tried to challenge the concept of "canonical model", there is no clear demonstration of "activation" of the spindle complex. I appreciate their genetic evidence and new results, and understand the message that Pins and Mud effects are beyond localization, but there is no alternative mechanism to support their model. At the current stage, their evidence provides more hypothesis, not conclusion. Based on my expertise in Developmental and Cell biology, I suggest that the work has an interest in audience who studies spindle machinery, but for general audience.

      We think that the reviewer fundamentally shares our perspective on the study. Our work tests assumptions made by the canonical model and shows that they aren’t always met (meaning that the question of how spindle orientation works in epithelia at least is still unsolved), and that in the FE at least one component (Dlg) has been misunderstood. We reach a major conclusion, which is that localization of Pins is not enough to predict spindle orientation in the FE.

      It’s absolutely true that the precise molecular role of Dlg has not been solved by our study. This is a major question for the lab, and we are currently undertaking biochemical work to address it. It’s probably more work than we can (or should) do on our own, which is just one reason to share our current results with colleagues.

      One fundamental reason for undertaking this study is that 25 years of spindle orientation studies released into an environment in which “positive” conclusions are the bar for publication success may have burdened the field with claims that are overly speculative. We appear to have contributed to this problem ourselves in 2013. With that in mind we contend that providing an alternative molecular mechanism at this point is premature and would impair rather than improve the literature.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Neville et al re-examine the role and regulation of Pins/LGN in Drosophila follicular epithelial cells. They argue that polar or bipolar enrichment of Pins localisation at the plasma membrane is not crucial for spindle orientation, and therefore propose that Pins must be somehow activated to function. These interpretations are not supported by the data. However, the data strongly suggest an alternative interpretation which is of major biological significance.

      As an initial point, we disagree with the summary above. We do not argue that enrichment of Pins is not crucial for spindle orientation. We argue that enrichment of Pins is not sufficient. This is why we titled the paper “The spindle orienting machinery requires activation, not just localization” instead of “The spindle orienting machinery requires activation, not localization.”

      Although we disagree with the reviewer, we appreciate their criticism of our manuscript and are glad for the opportunity to clarify our findings. In our responses to the specific comments (below) we explain why our data contradict the reviewer’s model and what we will do to improve the manuscript.

      Comments:

      1. In the experiments on Dlg mutants (Fig 4D, S3) visualising Pins:Tom, the wild-type needs to be shown next to the Dlg mutant image, otherwise a comparison cannot be made. For example, Pins:Tom looks strongly enriched at the lateral membranes in the wild-type shown in Fig 2B&C, but much more weakly localised at the lateral membranes in Dlg1P20/2 mutants in Fig 4D. Thus, it looks like the Dlg GUK domain is required for full enrichment of Pins:Tom at lateral membranes, even if some low level of Pins can still bind to the plasma membrane in the absence of the Dlg GUK domain. Quantification would likely show a reduction in Pins:Tom lateral enrichment in the Dlg1P20/2 mutants - consistent with the spindle misorientation phenotype in these mutants.

      The reviewer raises a reasonable concern about Figure 4D. We noted the difficulty of imaging Pins:Tom, which is exceedingly faint, in our original manuscript. For technical reasons, only one copy of the transgene was imaged in the experiment presented in 4G (two copies were used in Figure 2B), and the lack of signal presented an even greater challenge. In the manuscript we went with the clearest image. To address the reviewer’s concern, we have added signal intensity plots to this figure showing that Pins:Tom and Pins-myr are both laterally enriched at mitosis in Dlg[1P20]/Dlg[2] mutants. These data have been added as a new panel (E) in Figure 4. We were also able to replace the pictures in 4D with new ones generated after review.

      1. In Fig 4E, the phosphomutant PinsS436A-GFP looks more strongly apical and less strongly lateral than the wildtype Pins-GFP, consistent with the spindle misorientation phenotype in S436A rescued pins mutants.

      The reviewer has an eagle eye! We did not detect a difference in localization across the three transgenes, though we were certainly looking for it (that’s why we generated these flies in the first place). Again, the strength of signal was a major challenge in these experiments, and we therefore went with the cleanest image. In response to the reviewer’s concern, we note that the S436A and S436D examples shown have equivalent apical signal, but only the S436A fails to rescue spindle orientation.

      Together, Reviewer Comments 1 and 2 suggest a model in which Dlg is required for lateral enrichment of Pins at mitosis. As described in the manuscript, this is the very model proposed in our own previous study (Bergstralh, Lovegrove, and St Johnston; 2013), and reiterated in a subsequent review article (Bergstralh, Dawney, and St Johnston; 2017). We point these publications out because the senior author of the current manuscript is not especially enthusiastic about showing himself to be wrong (twice!) in the literature. He therefore insisted on seeing multiple lines of evidence before making the counterargument:

      • The reviewer’s model (the 2013 model) is firstly challenged by work shown in Figure 3. We find that membrane-anchored proteins (even just myristoylated RFP!) demonstrate lateral enrichment at mitosis, regardless of whether or not they interact with the Dlg-GUK domain.
      • Even stronger evidence is shown in Figure 4F. Pins-myr-GFP is very plainly enriched at the lateral cortex in Dlg[1P20]/Dlg[2] mutant cells (now demonstrated with signal intensity plots in Figure 4E). However, the spindle doesn’t orient correctly (quantified in 4C). Since Dlg is impacting spindle orientation independently of Pins localization, these data support our claim in the final sentence of the abstract: “Local enrichment of Pins is not sufficient to determine spindle orientation; an activation step is also necessary.”

        In the InscA examples, Pins:Tom looks apical. In the InscB examples, Pins:Tom looks more laterally localised, consistent with the spindle orientations in these experiments.

      These figures (6A-D) don’t only show/quantify Pins:Tom localization. They also show localization of GFP-Mud. Whereas Pins:Tom is certainly apically enriched in the InscA examples, the interesting finding is that GFP-Mud is not. In strong contrast, it instead shows a weak apical localization and a strong lateral enrichment. As described in the manuscript, this pattern of Mud localization predicts normal spindle orientation, which is not observed in these cells.

      Thus, these data appear to support the existing model that Pins enrichment at the plasma membrane is a key factor directing mitotic spindle orientation in these cells. The author's claim in the final sentence of the abstract "Local enrichment of Pins is not sufficient to determine spindle orientation; an activation step is also necessary" is not supported by the data.

      We are pleased that the reviewer shared this quote; our claim is that Pins localization is not sufficient, not that it is unnecessary (see above). We absolutely do not dispute that “Pins enrichment at the plasma membrane is a key factor directing mitotic spindle orientation.”

      The open question posed by the data is why GFP-Mud is excluded apically & basally during mitosis, while Pins:Tom is not. The simple alternative model is that Mud only localises to the plasma membrane where Pins is most strongly concentrated, such that Mud strongly amplifies any Pins asymmetry. Thus, even myr-Pins can still rescue a pins n mutant, because myr-Pins is still enriched laterally compared to apically (or basally).

      Thus, I would strongly suggest re-titling the manuscript to: "Mud/NuMA amplifies minor asymmetries in Pins localisation to orient the mitotic spindle".

      Well, that is a good-looking title, and we’re therefore sorry to decline the suggestion. However, as described above, Figure 4D shows that Pins enrichment does not always predict spindle orientation. More importantly, Figure 6A (cited by the reviewer in Comment 3) very plainly shows that Mud does not “only locali[ze] to the plasma membrane where Pins is most strongly concentrated.” In this picture – and across multiple InscA cells (Figure 6B) - Pins is strongly concentrated at the apical surface, whereas Mud is not.

      Mud/NuMA presumably achieves this amplification by binding to the plasma membrane only where Pins is concentrated above a critical threshold level. This would mean a non-linear model based on cooperativity among Pins monomers that increases the binding avidity to Mud above the threshold concentration of Pins monomers.

      This is essentially a minor revision of the standard model, which we expected would hold true in the FE. As described above, it is not supported by our data.

      Reviewer #3 (Significance (Required)):

      The manuscript is focused on the question of mitotic spindle orientation in epithelial cells, which is a fundamental unsolved problem in biology. The data reported are impressive and important, providing new insights into how the key spindle orientation factors Mud/NuMA and Pins/LGN localise during mitosis in epithelia. I recommend publication after major revisions.

      We are delighted that the reviewer finds our data impressive and important, and our experiments insightful. We understand that the “major revisions” requested are meant to bring the paper in line with their model (our own earlier model). Since the data in our original manuscript contradict that model, the revisions are instead focused on clarifying and strengthening our message.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We thank the reviewers for their feedback and were pleased they think that the manuscript is “of great interest to the scientific community”. The reviewers agree that the manuscript addresses an important question and that the identification of ASNS as a potential vulnerability of late-stage colorectal cancer is significant. The reviewers agree that our findings would be substantially strengthened by validation in state-of-the-art organoid model systems. We agree with this and are currently liaising with collaborators (Owen Sansom, Beatson Institute and Laura Thomas, Swansea University) to replicate our findings in both mouse and human colorectal organoid models. We will determine the sensitivity of colorectal organoid models to ASNS inhibition across a range of tumorigenicities and mutational profiles representing different stages of the adenoma-carcinoma progression. We believe these experiments will substantially strengthen the manuscript and lend weight to our finding that late-stage adenocarcinoma cells are vulnerable to ASNS inhibition.

      This is the predominant concern across reviewers; we are confident we can address this and all other, relatively minor, concerns as detailed below.

      Please find below a point-by-point reply to the reviewers’ comments. Reviewer comments are in italicized text and our responses follow.

      Reviewer #1

      • All of the findings in this manuscript are limited to in vitro observations, we know that most of the in vitro findings can not be translated in vivo. The manuscript would significantly benefit from in vivo experiments using the cells described in Fig.1 A and confirming the in vitro findings.

      We agree that validation of our results in a more physiological context would significantly elevate our manuscript. In order to address this, we intend to use both human and mouse colorectal organoid models (please see detailed description of this in response to reviewer 2). We have decided to take this approach rather than conduct in vivo experiments using the AA series (C1, SB, 10C and M) for two main reasons. Firstly, the C1 and SB cell lines do not form tumours in mice, consistent with them representing early colorectal adenoma cells. As such, we are not able to use the entire series in in vivo experiments. Secondly, we are keen to demonstrate replication of our findings in an alternative model. An organoid model would offer increased functional relevance, whilst allowing us to retain the ability to validate our observed metabolic dependencies across the adenoma to carcinoma sequence. We hope the reviewer agrees that these experiments would address their concerns.

      • The authors should provide proliferation data for the cell lines they used in this manuscript (C1, SB, 10C and M). In Fig. 1 B they show clear differences in ECAR, can the authors provide data on glucose uptake differences in these analyzed cell lines.

      We agree that proliferation and glucose uptake data would be a useful addition to the manuscript. We will provide doubling times for the cell lines used in this study and will measure glucose uptake by analysing extracellular glucose levels in the cell culture media from each of the cell lines.
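
      A minimal sketch of how a doubling time can be derived from two cell counts, assuming exponential growth between the two measurements. The function name `doubling_time` and the example counts are hypothetical and are not taken from the manuscript; the formula is the standard T_d = t · ln 2 / ln(N_t / N_0).

```python
import math


def doubling_time(n_start, n_end, elapsed_hours):
    """Return the doubling time in hours, assuming exponential growth.

    n_start and n_end are cell counts at the beginning and end of the
    interval; elapsed_hours is the length of that interval.
    """
    if n_start <= 0 or elapsed_hours <= 0:
        raise ValueError("counts and elapsed time must be positive")
    if n_end <= n_start:
        raise ValueError("population did not grow; doubling time is undefined")
    return elapsed_hours * math.log(2) / math.log(n_end / n_start)


if __name__ == "__main__":
    # Placeholder counts for illustration only (not data from the study):
    # a population that quadruples in 48 hours doubles roughly every 24 hours.
    print(round(doubling_time(1e5, 4e5, 48), 2))
```

      The same function can be applied to each cell line in the series; it is only meaningful for counts taken during the exponential growth phase.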

      • In Figure 2 C the authors should provide isotope tracing data for the upper glycolysis (e.g. glucose and glucose-6-P) and alanine. In Figure 2 D the authors should provide the isotope tracing data for glutamine and glutamate.

      We have data for glycolytic intermediates; glycerol-3-phosphate and dihydroxyacetone phosphate (DHAP) and alanine and will add them to the figures as requested.

      • Do the authors see any sign of reductive carboxylation in their U-13C glutamine experiments?

      We observe only a low level of reductive carboxylation across the AA series cell lines (

      • Can the authors speculate how the C1, SB, 10C and M cell lines would react when glucose would be replaced with galactose in the culture environment and forcing the cells to increase oxidative phosphorylation (OXPHOS).

      We would speculate that the cells would react similarly to our experiments in low glucose conditions displayed in Fig 3A-K. Given that M cells were shown to be the most flexible with regards to fuel source, we would expect them to be able to survive and proliferate more efficiently than the other cell lines in challenging culture conditions. Additionally, we would expect the C1s to survive well in galactose conditions, given that they rely less on glycolysis for ATP production and have significantly higher spare respiratory capacity compared to the more progressed cell lines.

      • Can the authors comment whether C1, SB, 10C and M cell lines show differences in coping with oxidative stress?

      Again, we would speculate that the M cells would cope with exposure to oxidative stress best, given their metabolic flexibility. However, we would aim to test this by measuring the cellular response to hydrogen peroxide (which would induce oxidative stress) across all cell lines.

      • In the ASNS knockdown experiments do the authors detect an increase in glucose uptake in ASNS deficient cells.

      This is an interesting question; we will address it by comparing extracellular glucose levels in C1 and M cells transfected with control and siRNA targeting ASNS.

      • Can the authors provide gene expression data that would explain the metabolic switch between early and late-stage adenocarcinoma? Do the authors detect any differences in mTORC1 activation among the C1, SB, 10C and M cell lines? ASNS is an ATF4 target, can the authors provide any expression data on ATF4 in their cell lines.

      To address the first question, using our proteomics data, we have generated heatmaps showing protein abundance data from key metabolic pathways including glycolysis, the TCA cycle and the electron transport chain in the C1, SB and M cell lines. These data show an array of variation in protein expression of these pathways between the C1, SB and M cells, with no clear up or downregulation of these pathways as a whole, but rather more intricate regulation of clusters of proteins within the pathways. These data align well with the metabolomic data presented in Figure 2 and will allow us to investigate the mechanisms underlying the metabolic switch. These heatmaps will be incorporated into the manuscript. Using the heatmaps we will identify and discuss key nodes we predict to explain the metabolic switch between early and late-stage adenocarcinoma. We will then determine experimentally whether manipulation of these nodes impacts the metabolic phenotype of the cells. For example, the heatmaps have highlighted that ENO3 (enolase 3) is strongly upregulated in the SB and M cells in comparison to the C1 cells and may be involved in driving the metabolic switch. Indeed, ENO3 has previously been found to promote colorectal cancer progression by enhancing glycolysis (Chen et al, Med Oncol, 2022), consistent with what we see here. To test this, we will knock down ENO3 across the cell line series and determine the impact on cellular phenotype and metabolism (using Seahorse extracellular flux analysis).

      With regards to mTORC1 activation, we have further analysed our proteomics data from C1, SB and M cells and have found that the M cells show significantly higher serine 235/236 phosphorylation of ribosomal S6 protein, a common readout for mTORC1 activation, compared to C1 and SB cells. Further, we aim to carry out immunoblotting across the four cell lines to analyse phospho-S6 (relative to total S6), 4E-BP1 and phospho-ULK-1 (relative to total ULK-1) levels.

      With regards to ATF4, using our proteomics data we have generated a heatmap of expression changes of ATF4 target genes in C1, SB and M cells that we will provide in supplementary material. These data suggest that there does not appear to be any clear pattern of enhanced or reduced ATF4 transcriptional activity across the cell lines, with different clusters of genes within this signature up or downregulated across the series. Moreover, Ingenuity Pathway Analysis (IPA) revealed that the ATF4 pathway showed an activation z-score of -0.41 (p=0.0134) in SB versus C1 cells, and 0.35 (p=0.00051) in M versus C1 cells (where a threshold of +/- 2 indicates activation/suppression of the pathway, respectively), confirming there is no clear regulation of this pathway between the cell lines. In addition, we will carry out immunoblotting for ATF4 expression levels across the cell line series.
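
      The threshold logic described above can be sketched as follows. This is an illustration only: the function name `classify_pathway` is hypothetical, and treating the +/- 2 cut-off as inclusive is an assumption. The two z-scores are the values quoted in this response.

```python
# Cut-off quoted in the response: a z-score of +2 / -2 marks activation /
# suppression. Treating the boundary as inclusive is an assumption.
Z_THRESHOLD = 2.0


def classify_pathway(z_score, threshold=Z_THRESHOLD):
    """Label a pathway from its activation z-score."""
    if z_score >= threshold:
        return "activated"
    if z_score <= -threshold:
        return "suppressed"
    return "no clear regulation"


# ATF4 pathway z-scores as reported in the text above.
atf4_z_scores = {"SB vs C1": -0.41, "M vs C1": 0.35}

if __name__ == "__main__":
    for comparison, z in atf4_z_scores.items():
        print(f"{comparison}: z = {z:+.2f} -> {classify_pathway(z)}")
```

      Both reported values fall well inside the +/- 2 band, which is why neither comparison is called as activated or suppressed.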

      Reviewer #2

      *Major comments: *

      *Early CRC *

      *Molecular understanding of CRC is obviously of great interest and importance for the clinics. However, tumors of early stages are almost exclusively resected and not treated with systemic agents. Hence, the argument by the authors that the metabolic understanding of early CRC is of clinical relevance is somewhat misleading. Overall, it would have been much more clinically relevant to investigate the multiple steps of later stages during CRC progression. How about metabolic changes during metastasis. Deep mechanistic understanding of process during metastasis has striking clinical relevance. *

      We agree with the reviewer that understanding metastatic progression is of clinical relevance and should indeed be investigated in more detail. Using our model, we do shed light on a vulnerability of late-stage adenocarcinoma cells (sensitivity to asparagine synthetase (ASNS) inhibition). Indeed, we show that ASNS expression is elevated in both colorectal tumour and metastatic tissue in comparison to normal tissue, suggesting that our study may have revealed a vulnerability with utility for treating late-stage (and potentially metastatic) tumours. The reviewer raises an important issue with the way we frame the utility of the model in the manuscript text. We will rewrite this to emphasise its utility in identifying late-stage vulnerabilities and the clinical value of this approach. We maintain that the molecular understanding of colorectal cancer across all stages of its progression will provide a valuable contribution to the field but agree that we should be more specific with regards to the clinical utility of our findings.

      *Model system *

      The cell lines used in this study are not state-of-the-art to investigate the complex process during CRC progression. The original paper is from 1993 in which the cell lines were generated does not allow understanding of the characteristics of these cell lines. Recently, multiple models have been established, for example in organoids, to investigate the progression of CRC much more reliably. There are systems that use CRISPR/CAS9 edited human organoids that follow the genetic alterations of CRC progression with accompanied phenotypes. Further, extensive biobanks of organoids from patients are available (also commercially) which better represent the stages of CRC. Similarly, the question raised above of how representative this progression cell line set is needs to addressed. The mutagen-induced progression could generate various alterations that are not detected in patients, hence create an artificial system. Overall, biological replicates are missing.

      We thank the reviewer for their critique and agree that our manuscript would be significantly strengthened if we were able to replicate our key findings in another model. We agree that the cell line series we have used here has limitations and we will make sure these are discussed by adding a ‘Limitations’ section to the ‘Discussion’. We maintain that the cell line series is a valuable tool in which to effectively identify metabolic vulnerabilities for further research. A key advantage of this system is that it is a human cell line series of the same lineage. In addition, we can easily conduct metabolomics and stable isotope tracer analysis allowing us to investigate cellular metabolic activity and manipulate any identified pathways easily. As such, the cell line series is an effective tool in which to identify potential vulnerabilities, but we agree that these vulnerabilities need to be validated in state-of-the-art organoid systems for the impact of the work to be clearer.

      To address this, in collaboration with Owen Sansom (Beatson Institute) and Laura Thomas (Swansea University), we aim to validate our identified metabolic dependency in mouse and human colorectal organoids respectively. We will determine the sensitivity of colorectal organoid models across a range of tumorigenicities and mutational profiles representing different stages of the adenoma-carcinoma progression to asparagine synthetase (ASNS) inhibition. We believe these experiments will substantially strengthen the manuscript and lend weight to our finding that late-stage adenocarcinoma cells are vulnerable to ASNS inhibition.

      *Gene Expression analysis *

      In Figure 5 C and D is the expression of ASNS to stages and overall survival from online available datasets correlated. Its unclear what the difference between tumor and metastatic in C means. The labelling in D is too small. Is the difference between the two groups significant? Are these patients only at a specific stage? It seems not that ASNS is a strong prognosticator; further stratification is needed to clarify the role of ASNS in CRC.

      The data displayed in Fig 5C and 5D are from separate datasets so are not correlated. In Fig 5C ‘Tumour’ refers to gene expression from the primary tumour site (in this case the colorectum), whereas ‘Metastatic’ refers to gene expression from a metastatic tumour (from which the primary tumour was of colorectal origin). We will make this clearer in the text and figure legend. We will also make the labelling on the survival plot in D clearer, indicating that the difference between the two groups is significant and displaying the p value clearly.

      The data included in the survival plots in 5D encompass all tumour stages. We have further analysed these data, adjusting for tumour stage. We found that high ASNS expression in later stage tumours (stage 3 and 4) is associated with poorer overall survival, whereas there is no significant difference in overall survival in earlier stage tumours (stage 1 and 2) in relation to ASNS expression. We plan to add this to the supplementary materials and discuss in the main text as it is consistent with our findings from the AA cell line series.

      *Western Blot controls *

      For the Western Blots in Figure 6 A and C the total S6 and ULK1 controls are missing what is needed to assess the effect on pS6 and pULK1 correctly.

      We will add total S6 and ULK1 controls to these figures.

      In the same panels, the KO efficacy is not very high in A (-ASN). However, this is crucial to make the conclusion that this cell line (C1) is not dependent on ASNS.

      The average knockdown efficiency in the C1 cells is 72% across n=3 experiments. Therefore, levels of ASNS are significantly reduced. However, to further validate this finding, we will use L-Albizziine, a competitive inhibitor of ASNS, at the same concentration in both C1 and M cells to eliminate any issues surrounding variation in knockdown efficiency and to replicate the results obtained using ASNS siRNA. These data will be included in supplementary material.
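
      A minimal sketch of how a mean knockdown efficiency across replicate experiments can be computed from loading-normalised band intensities. The function names and the intensity values are hypothetical placeholders, not data from the study, and the exact quantification method used for the 72% figure is not stated in this response.

```python
def knockdown_efficiency(control_signal, knockdown_signal):
    """Percent reduction of the target signal relative to the control.

    Both inputs are assumed to be already normalised to a loading control.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - knockdown_signal / control_signal)


def mean_knockdown_efficiency(replicates):
    """Average the per-experiment efficiencies.

    replicates is a list of (control_signal, knockdown_signal) pairs,
    one pair per independent experiment.
    """
    if not replicates:
        raise ValueError("at least one replicate is required")
    efficiencies = [knockdown_efficiency(c, k) for c, k in replicates]
    return sum(efficiencies) / len(efficiencies)


if __name__ == "__main__":
    # Placeholder normalised intensities for three experiments
    # (illustrative values only).
    example = [(1.0, 0.25), (1.0, 0.30), (1.0, 0.35)]
    print(f"{mean_knockdown_efficiency(example):.1f}% mean knockdown")
```

      Averaging per-experiment efficiencies, rather than pooling raw intensities, keeps each independent experiment equally weighted.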

      *Minor comments: *

      *Statistical analysis of proliferation assays *

      The statistical significance for proliferation assays are missing.

      The statistical significance at the final timepoints of the proliferation assays are displayed on bar graphs in Supplementary Figure 5 (Figure S5B and C). We will add these to the proliferation curves in the main figure.

      Reviewer #3

      *A major concern is the model used in this study: *

      Sodium butyrate and the carcinogen N-methyl-N-nitro-nitrosoguanidine (MNNG) were used for the transformation. I believe this model was developed by one of the co-authors of the study, A.C. Williams in the 1990s. The relevance of the model for in vivo colon carcinogenesis is not entirely clear to me and I miss information why in particular sodium butyrate and MNNG were used. I am not an expert on colon carcinogenesis but I did not have the impression that this model has been widely adopted and I miss detailed information on the model itself as well as a critical discussion of its limitations.

      We thank the reviewer for raising these concerns and will include a ‘Limitations’ section in the manuscript ‘Discussion’ to elaborate on both the utility and the limitations of this model system. As described in response to concerns raised by reviewer #1 and reviewer #2, we plan to validate our findings in organoid models of colorectal tumourigenesis to strengthen the discoveries made using the AA cell line series.

      With regards to the use of sodium butyrate and MNNG for transformation of the C1 cells, justification was provided in the original paper describing generation of the cell line model series (Williams et al, Cancer Research. 1990). Sodium butyrate is naturally occurring in the gut and was used for the transformation of the C1 cells as it had been proposed to play a role in promoting colorectal tumorigenesis through upregulating carcinoembryonic antigen (CEA) expression and enhancing proliferation in adenoma cells able to resist growth arrest following treatment (Berry et al, Carcinogenesis. 1988). At the time of generating the cell line series, few reagents were known to induce transformation in human epithelial cells. However, MNNG was one of those and had been previously used to transform keratinocytes (Rhim et al, Science. 1986). Crucially, tumours formed in mice from xenografted 10C cells were found to be heterogeneous, displaying areas of differentiation with glandular organisation, the presence of functional goblet cells enabling mucin production, as well as areas of poorly or undifferentiated cells. Furthermore, cytogenetic analyses revealed that genetic changes in the cell line progression model such as chromosome 18q loss and KRAS activation replicate those seen in CRC patients (Williams et al, Oncogene. 1993). Together, these characteristics recapitulate human tumours in vivo, validating the use of sodium butyrate and MNNG in generating an in vitro CRC cell line model that represents human colorectal tumorigenesis.

      Figure 6: total levels of ribosomal S6 protein and ULK1 should be detected, quantified and used for normalization.

      We agree with the reviewer and will add total S6 and ULK1 controls to these figures.

      Can you measure ASN upon inhibition of autophagy? Does it go down further?

      This is an interesting question, and we will address this experimentally by measuring ASN levels following treatment with chloroquine in the C1 and M cell lines. We will do this using stable isotope labelling and mass spectrometry and include the results in supplementary material.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      General Comments

      We thank all reviewers for providing very detailed, knowledgeable, and informative reviews.

      All reviewers were complimentary about the data and the rigor of the study. Reviewers 2 and 3 commented on the significance of the work, and their assessments were complimentary, specifically about the fact that it bridges several previous studies and links these to kinase-phosphatase regulation on the BUB complex. We agree that this is a major strength of the work. That is why we also believe the comment by reviewer 1 that “most of the phenotypes/observations are consistent with the literature and not surprising” actually points to a strength and not a weakness. Sometimes manuscripts that bring together various different findings into one conceptual model can be very powerful, even if each finding in isolation is not so surprising. In this case, the concept that a dual kinase-phosphatase module integrates two major mitotic processes will, we believe, prove to be a significant breakthrough that helps to explain how these processes are properly integrated at kinetochores.

      The main criticism of all reviewers related to the interpretations and writing style, which in general, we felt were valid. We will take on board all these comments, reword the manuscript during revision, and provide a detailed response to each of these points at resubmission.

      In terms of points requiring new experiments, there were 3 in total:

      1) Reviewers 1 and 3 raised an important issue about the feedback loop which will be addressed with new experiments to uncouple the feedback.

      2) Reviewer 2 made an important point about KNL1 levels, including a good suggestion to perform FRAP analysis to examine BUB complex dynamics when MELT numbers are increased. We will carry out this experiment prior to revision.

      3) Reviewer 1 had a second major comment regarding the modulation of MELT number and how this cannot be directly linked to PLK1/PP2A levels. We have 3 new experiments to add regarding this, performed already, which are discussed in the section below.

      All other comments were textual points that in most cases we felt were valid. They showed that all reviewers had a very good grasp of the paper, the concepts, and the field in general. So, we finish by thanking all reviewers again for their thorough and detailed assessments of our manuscript. The comments they raised will help us to improve the manuscript after revision.

      Description of the planned revisions

      Three main points:

      1) The role of the feedback loop [reviewers 1 and 3]:

      The general issue is explained succinctly by reviewer 3’s comment:

      “The argument linking the negative feedback loop to biological functions is not straightforward. The authors provide evidence in Figure 1 for regulatory pathways between docked PLK1 and bound PP2A. However, their assays in Figure 2 bypass the feedback loop by directly modulating PP2A activity. These experiment supports an argument that the kinase/phosphatase activity balance is important, but do not address the feedback loop specifically (which could potentially be done using mutations that disrupt the feedback regulation). The claim that "a homeostatic feedback loop maintains an optimal balance of PLK1 and PP2A on the BUB complex" is too strong because there is no direct evidence connecting the feedback loop to optimal function.”

      This is a good point that we will address at revision. We demonstrate that the enzymes regulate each other on the BUB complex in figure 1 (PLK1 recruits PP2A, and PP2A removes PLK1), which balances their levels on the BUB complex. To determine consequences of upsetting this balance we locked either the kinase-bound or phosphatase-bound states (Figure 2). Importantly, this is required to assess direct phenotypes associated with each, but it does not directly demonstrate the role of the feedback loop. To do this we will generate mutants, as suggested, and analyse their phenotypes.

      We will mutate the PLK1 binding site (T620A) and recover the PLK1-regulated sites in the KARD motif to phospho-mimicking aspartates (S676D/T680D), analyzing effects on PLK1/PP2A recruitment, chromosome alignment and SAC strength. We predict that this will remove PLK1 and recover some PP2A, but to lower levels overall than the BUBR1-B56 fusion. In that case the phenotypes will probably be milder, but that would not change the overall conclusions.

      We maintain that locking PLK1 on its phospho-binding site (in BUBR1-ΔPP2A) is the ideal scenario to test direct PLK1 roles, but we will also now create alanine mutants of the PLK1 site (S676A/T680A) and the CDK1 site (S670A) to address the feedback loops controlled by CDK1 and PLK1. Our prediction is that these will skew the balance towards PLK1, without fully removing PP2A, again likely producing milder intermediate phenotypes.

      It is definitely worth testing these predictions, because it would directly address the role of the feedback loop and it would avoid relying solely on “artificially high levels” as mentioned by reviewer 1. One final point on this, however: the PLK1 recruitment in ΔPP2A cells is not artificial – it is PLK1 bound to its native phospho-motif when PP2A is unbound (without any feedback from PP2A this phospho-site and PLK1 binding increase to the observed maximal levels). The fusion of B56 is admittedly less optimal, but this does still lock the phosphatase-bound state in a set stoichiometry, crucially in the absence of kinase. This is required to assess direct phosphatase effects. These PLK1/PP2A levels may well be higher than observed physiologically on the BUB complex when considering the behavior of all BUBR1 molecules, since we doubt they ever reach 1:1 stoichiometry with either PLK1 or PP2A. However, the feedback loop is operating within individual molecules (figure 1), which may well individually flip between PLK1- or PP2A-bound states. This may occur on certain molecules at specific times. Therefore, locking the PLK1- or PP2A-bound state on all molecules is, in our opinion, still a valid and useful perturbation to assess the function of these two states.

      2) The increased recruitment of BUB1-PLK1/PP2A when MELT numbers are increased [reviewer 2]

      "While in the 12x and 19x mutant conditions there are more molecules of BUBs per Knl1, the overall BUB levels are the same as in wild-type controls. Since the MELT repeat used throughout the paper is a consensus sequence that is likely optimal for BUB binding, it is possible that the phenotypes of the 12x and 19x mutants are explained because of an increase in the affinity of BUBs for Knl1 rather than overall levels. This would also help explain why Knl1 and BUBs are observed at the spindle midzone in the 19x mutant (Fig. S4)"

      The reviewer raises an important issue here, when stating that increasing MELT numbers decreases KNL1 kinetochore recruitment. This has the net effect of normalizing overall BUB1-PLK1/PP2A kinetochore levels, even though BUB1-PLK1/PP2A recruitment per KNL1 molecule is increased. That is why we were careful to state BUB1-PLK1/PP2A were increased “on each KNL1 molecule” and not “on kinetochores” when referring to the effect in the 12x/19x MELT mutant. However, this could easily be misinterpreted so this point will be clarified at revision.

      The issue of why the phenotype is so dependent on kinase/phosphatase level per KNL1 molecule is an important one, however, which has puzzled us until now. We think the suggestion to look at turnover by FRAP is a good one, because enhanced binding strength could underlie the phenotype here, and potentially explain the lack of disassembly at anaphase. We will perform these experiments at revision to see if they can clarify the issue.

      3) The link between MELT number and PLK1/PP2A levels [Reviewer 1]

      “My second comment relates to the fact that the two parts of the paper are not directly linked although the authors try to do this. They nicely manipulate the MELT repeats on KNL1 to change the number of Bub complexes. However, they cannot directly link the data to changes in Plk1 and PP2A-B56 levels only as many other things are changing. By increasing MELT numbers Bub complex and Mad1/Mad2 levels increase as well as an example and this makes interpretations complicated. To me these experiments are not addressing the main conclusions of the paper.”

      We do not agree with this overall assessment, but there are two elements to this comment: the effect of modulating MELT number on SAC strength (and its link to PLK1) or on KT-MT stability (and its link to PP2A). We will therefore discuss each separately:

      For SAC regulation, we feel that the data is clear and the interpretations are justified, although we will add new data to support this point after revision. Increasing MELT number causes defects in MELT-BUB dissociation and SAC silencing (4a-c). Importantly, these phenotypes can be completely rescued by inhibiting PLK1 (4d-e). So, we do link the effects of high MELT number to PLK1 activity. Our interpretation is that when MELT numbers are increased the ability of PLK1 to phosphorylate these motifs and maintain the SAC platform is enhanced (when MPS1 is inhibited pharmacologically or upon KT-MT attachment). So, whilst it is true that many factors, such as the kinetochore levels of BUB/MAD1/MAD2, are crucial for the SAC, the ability of PLK1 to maintain these levels (via pMELT-BUB1) is crucial and that changes as MELT number increases. This contributes directly to the observed SAC silencing phenotype, as confirmed by the complete rescue of this phenotype after PLK1 inhibition.

      We also explored the possibility that increased BUB1 activity could contribute to SAC strengthening, for example, by enhancing Aurora B recruitment to centromeres. However, BUB1 inhibition did not alter SAC strength or MELT dephosphorylation kinetics. We will add this data after revision.

      We also evaluated the levels of phosphorylated MAD1-pT716, which is important for MCC assembly (Ji et al. 2017, Ji et al. 2018, Faesen, 2017). Our data show that WT and 19xMELT exhibit similar MAD1-pT716 levels during a nocodazole arrest and following MPS1 inhibition. In summary, the main changes we observe are elevated BUB1 levels due to MELT phosphorylation, and increased BUB1 phosphorylation on pT461 (as shown in Figure 4h). All this points towards a localized effect of PLK1 on/around the BUB complex. We will add this data and make this point clearly at revision.

      For KT-MT attachment regulation, we agree that we do not have a similar way to inhibit PP2A-B56 activity to rescue hyperstable microtubule attachment when MELT numbers are high. For this, we would require a way to rapidly inhibit PP2A-B56 activity after attachments have formed, something that is not technically feasible at present. We also cannot say for certain that reduced MELT numbers destabilize microtubule attachments due to lack of PP2A; however, we feel this is the most likely interpretation for the following reasons. The phenotype of removing PP2A from BUBR1 or removing the MELTs from KNL1 (along with all associated factors) is identical: mutant cells have comparable chromosome misalignment due to unattached kinetochores (compare 2F-I with 5A-D). Therefore, the additional factors lost by removing the MELTs cannot be having such a strong impact on KT-MT attachment. The obvious factor that could affect attachment strength is again BUB1, via Aurora B recruitment to centromeres. However, loss of BUB1 (after MELT removal) is predicted to enhance attachment stability (reduced Aurora B) and not decrease it, as we observe. So, whilst we cannot definitively conclude that modulating MELT number affects attachment stability via PP2A, we feel that this is certainly the most likely explanation. We will state this clearly in the revised text.

      Description of analyses that authors prefer not to carry out

      “SAC strength of BubR1 WT, ΔC and B56γ was analysed in the presence of nocodazole + MPS1i. It would be interesting to see what the phenotypes are without MPS1i [Reviewer 1]”

      In the absence of MPS1i, basal MELT phosphorylation increases (ΔC) or decreases (B56γ) as predicted (Figure 2d; compare timepoint 0 in all conditions). This does not cause any change to SAC strength when all kinetochores are unattached in nocodazole (not shown). The sensitized SAC assay (nocodazole + MPS1i) has been used by many groups (originally Santaguida et al, 2011; Saurin et al, 2011), because it reduces SAC signals from all unattached kinetochores, which would otherwise produce a saturated response. In this case, we specifically chose a dose of MPS1 inhibitor that gave a partial SAC response from which we could observe either strengthening or weakening – a key point of the assay. Indeed, this showed that the SAC was strengthened (ΔC) or weakened (B56γ), as predicted (Figure 2E). The only other way to do this, which has been used by some in the literature, is to use a low dose of nocodazole which prevents all kinetochores from signaling to the SAC. We specifically wanted to avoid this situation because then you cannot untangle the effects on SAC and KT-MT attachment stability – this was crucial in our case.

    1. Who Can Name the Bigger Number? by Scott Aaronson

In an old joke, two noblemen vie to name the bigger number. The first, after ruminating for hours, triumphantly announces "Eighty-three!" The second, mightily impressed, replies "You win." A biggest number contest is clearly pointless when the contestants take turns. But what if the contestants write down their numbers simultaneously, neither aware of the other’s? To introduce a talk on "Big Numbers," I invite two audience volunteers to try exactly this. I tell them the rules: You have fifteen seconds. Using standard math notation, English words, or both, name a single whole number—not an infinity—on a blank index card. Be precise enough for any reasonable modern mathematician to determine exactly what number you’ve named, by consulting only your card and, if necessary, the published literature. So contestants can’t say "the number of sand grains in the Sahara," because sand drifts in and out of the Sahara regularly. Nor can they say "my opponent’s number plus one," or "the biggest number anyone’s ever thought of plus one"—again, these are ill-defined, given what our reasonable mathematician has available. Within the rules, the contestant who names the bigger number wins. Are you ready? Get set. Go. The contest’s results are never quite what I’d hope. Once, a seventh-grade boy filled his card with a string of successive 9’s. Like many other big-number tyros, he sought to maximize his number by stuffing a 9 into every place value. Had he chosen easy-to-write 1’s rather than curvaceous 9’s, his number could have been millions of times bigger. He still would have been decimated, though, by the girl he was up against, who wrote a string of 9’s followed by the superscript 999. Aha! An exponential: a number multiplied by itself 999 times. 
Noticing this innovation, I declared the girl’s victory without bothering to count the 9’s on the cards. And yet the girl’s number could have been much bigger still, had she stacked the mighty exponential more than once. Take $9^{9^9}$, for example. This behemoth, equal to $9^{387,420,489}$, has 369,693,100 digits. By comparison, the number of elementary particles in the observable universe has a meager 85 digits, give or take. Three 9’s, when stacked exponentially, already lift us incomprehensibly beyond all the matter we can observe—by a factor of about $10^{369,693,015}$. And we’ve said nothing of $9^{9^{9^9}}$ or $9^{9^{9^{9^9}}}$. Place value, exponentials, stacked exponentials: each can express boundlessly big numbers, and in this sense they’re all equivalent. But the notational systems differ dramatically in the numbers they can express concisely. That’s what the fifteen-second time limit illustrates. It takes the same amount of time to write 9999, $99^{99}$, and $9^{9^{9^9}}$—yet the first number is quotidian, the second astronomical, and the third hyper-mega astronomical. The key to the biggest number contest is not swift penmanship, but rather a potent paradigm for concisely capturing the gargantuan. Such paradigms are historical rarities. We find a flurry in antiquity, another flurry in the twentieth century, and nothing much in between. But when a new way to express big numbers concisely does emerge, it’s often a byproduct of a major scientific revolution: systematized mathematics, formal logic, computer science. Revolutions this momentous, as any Kuhnian could tell you, only happen under the right social conditions. Thus is the story of big numbers a story of human progress. And herein lies a parallel with another mathematical story. In his remarkable and underappreciated book A History of π, Petr Beckmann argues that the ratio of circumference to diameter is "a quaint little mirror of the history of man." 
In the rare societies where science and reason found refuge—the early Athens of Anaxagoras and Hippias, the Alexandria of Eratosthenes and Euclid, the seventeenth-century England of Newton and Wallis—mathematicians made tremendous strides in calculating π. In Rome and medieval Europe, by contrast, knowledge of π stagnated. Crude approximations such as the Babylonians’ 25/8 held sway. This same pattern holds, I think, for big numbers. Curiosity and openness lead to fascination with big numbers, and to the buoyant view that no quantity, whether of the number of stars in the galaxy or the number of possible bridge hands, is too immense for the mind to enumerate. Conversely, ignorance and irrationality lead to fatalism concerning big numbers. Historian Ilan Vardi cites the ancient Greek term sand-hundred, colloquially meaning zillion; as well as a passage from Pindar’s Olympic Ode II asserting that "sand escapes counting."

But sand doesn’t escape counting, as Archimedes recognized in the third century B.C. Here’s how he began The Sand-Reckoner, a sort of pop-science article addressed to the King of Syracuse: There are some ... who think that the number of the sand is infinite in multitude ... again there are some who, without regarding it as infinite, yet think that no number has been named which is great enough to exceed its multitude ... But I will try to show you [numbers that] exceed not only the number of the mass of sand equal in magnitude to the earth ... but also that of a mass equal in magnitude to the universe. This Archimedes proceeded to do, essentially by using the ancient Greek term myriad, meaning ten thousand, as a base for exponentials. Adopting a prescient cosmological model of Aristarchus, in which the "sphere of the fixed stars" is vastly greater than the sphere in which the Earth revolves around the sun, Archimedes obtained an upper bound of $10^{63}$ on the number of sand grains needed to fill the universe. 
(Supposedly $10^{63}$ is the biggest number with a lexicographically standard American name: vigintillion. But the staid vigintillion had better keep vigil lest it be encroached upon by the more whimsically-named googol, or $10^{100}$, and googolplex, or $10^{10^{100}}$.) Vast though it was, of course, $10^{63}$ wasn’t to be enshrined as the all-time biggest number. Six centuries later, Diophantus developed a simpler notation for exponentials, allowing him to surpass . Then, in the Middle Ages, the rise of Arabic numerals and place value made it easy to stack exponentials higher still. But Archimedes’ paradigm for expressing big numbers wasn’t fundamentally surpassed until the twentieth century. And even today, exponentials dominate popular discussion of the immense. Consider, for example, the oft-repeated legend of the Grand Vizier in Persia who invented chess. The King, so the legend goes, was delighted with the new game, and invited the Vizier to name his own reward. The Vizier replied that, being a modest man, he desired only one grain of wheat on the first square of a chessboard, two grains on the second, four on the third, and so on, with twice as many grains on each square as on the last. The innumerate King agreed, not realizing that the total number of grains on all 64 squares would be $2^{64}-1$, or about 18.4 quintillion—equivalent to the world’s present wheat production for 150 years. Fittingly, this same exponential growth is what makes chess itself so difficult. There are only about 35 legal choices for each chess move, but the choices multiply exponentially to yield something like $10^{50}$ possible board positions—too many for even a computer to search exhaustively. That’s why it took until 1997 for a computer, Deep Blue, to defeat the human world chess champion. And in Go, which has a 19-by-19 board and over $10^{150}$ possible positions, even an amateur human can still rout the world’s top-ranked computer programs. Exponential growth plagues computers in other guises as well. 
The traveling salesman problem asks for the shortest route connecting a set of cities, given the distances between each pair of cities. The rub is that the number of possible routes grows exponentially with the number of cities. When there are, say, a hundred cities, there are about $10^{158}$ possible routes, and, although various shortcuts are possible, no known computer algorithm is fundamentally better than checking each route one by one. The traveling salesman problem belongs to a class called NP-complete, which includes hundreds of other problems of practical interest. (NP stands for the technical term ‘Nondeterministic Polynomial-Time.’) It’s known that if there’s an efficient algorithm for any NP-complete problem, then there are efficient algorithms for all of them. Here ‘efficient’ means using an amount of time proportional to at most the problem size raised to some fixed power—for example, the number of cities cubed. It’s conjectured, however, that no efficient algorithm for NP-complete problems exists. Proving this conjecture, called P ≠ NP, has been a great unsolved problem of computer science for thirty years. Although computers will probably never solve NP-complete problems efficiently, there’s more hope for another grail of computer science: replicating human intelligence. The human brain has roughly a hundred billion neurons linked by a hundred trillion synapses. And though the function of an individual neuron is only partially understood, it’s thought that each neuron fires electrical impulses according to relatively simple rules up to a thousand times each second. So what we have is a highly interconnected computer capable of maybe $10^{14}$ operations per second; by comparison, the world’s fastest parallel supercomputer, the 9200-Pentium Pro teraflops machine at Sandia National Labs, can perform $10^{12}$ operations per second. Contrary to popular belief, gray mush is not only hard-wired for intelligence: it surpasses silicon even in raw computational power. 
But this is unlikely to remain true for long. The reason is Moore’s Law, which, in its 1990’s formulation, states that the amount of information storable on a silicon chip grows exponentially, doubling roughly once every two years. Moore’s Law will eventually play out, as microchip components reach the atomic scale and conventional lithography falters. But radical new technologies, such as optical computers, DNA computers, or even quantum computers, could conceivably usurp silicon’s place. Exponential growth in computing power can’t continue forever, but it may continue long enough for computers—at least in processing power—to surpass human brains. To prognosticators of artificial intelligence, Moore’s Law is a glorious herald of exponential growth. But exponentials have a drearier side as well. The human population recently passed six billion and is doubling about once every forty years. At this exponential rate, if an average person weighs seventy kilograms, then by the year 3750 the entire Earth will be composed of human flesh. But before you invest in deodorant, realize that the population will stop increasing long before this—either because of famine, epidemic disease, global warming, mass species extinctions, unbreathable air, or, entering the speculative realm, birth control. It’s not hard to fathom why physicist Albert Bartlett asserted "the greatest shortcoming of the human race" to be "our inability to understand the exponential function." Or why Carl Sagan advised us to "never underestimate an exponential." In his book Billions & Billions, Sagan gave some other depressing consequences of exponential growth. At an inflation rate of five percent a year, a dollar is worth only thirty-seven cents after twenty years. If a uranium nucleus emits two neutrons, both of which collide with other uranium nuclei, causing them to emit two neutrons, and so forth—well, did I mention nuclear holocaust as a possible end to population growth? 
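Several of the exponential figures quoted in this part of the essay can be checked with a few lines of Python. This is only a sketch for the reader, not something from the essay: the variable names are made up, and the count of traveling-salesman routes is assumed here to be the number of orderings of 100 cities (100!), which is one common way to arrive at a figure near 10^158.

```python
import math

# Digits of 9^(9^9): the number itself is too large to print comfortably,
# so count its digits with logarithms instead of building it.
exponent = 9 ** 9                                   # 387,420,489
tower_digits = math.floor(exponent * math.log10(9)) + 1

# Grains of wheat on the chessboard: 1 + 2 + 4 + ... + 2^63 = 2^64 - 1.
grains = sum(2 ** square for square in range(64))

# Assumed route count for 100 cities: every ordering of the cities, 100!.
route_digits = len(str(math.factorial(100)))

print(f"{tower_digits:,}")   # 369,693,100
print(f"{grains:,}")         # 18,446,744,073,709,551,615
print(route_digits)          # 158
```

The chessboard total comes to roughly 18.4 quintillion, and 100! has 158 digits, in line with the order of magnitude quoted for a hundred cities.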
Exponentials are familiar, relevant, intimately connected to the physical world and to human hopes and fears. Using the notational systems I’ll discuss next, we can concisely name numbers that make exponentials picayune by comparison, that subjectively speaking exceed as much as the latter exceeds 9. But these new systems may seem more abstruse than exponentials. In his essay "On Number Numbness," Douglas Hofstadter leads his readers to the precipice of these systems, but then avers: If we were to continue our discussion just one zillisecond longer, we would find ourselves smack-dab in the middle of the theory of recursive functions and algorithmic complexity, and that would be too abstract. So let’s drop the topic right here. But to drop the topic is to forfeit, not only the biggest number contest, but any hope of understanding how stronger paradigms lead to vaster numbers. And so we arrive in the early twentieth century, when a school of mathematicians called the formalists sought to place all of mathematics on a rigorous axiomatic basis. A key question for the formalists was what the word ‘computable’ means. That is, how do we tell whether a sequence of numbers can be listed by a definite, mechanical procedure? Some mathematicians thought that ‘computable’ coincided with a technical notion called ‘primitive recursive.’ But in 1928 Wilhelm Ackermann disproved them by constructing a sequence of numbers that’s clearly computable, yet grows too quickly to be primitive recursive. Ackermann’s idea was to create an endless procession of arithmetic operations, each more powerful than the last. First comes addition. Second comes multiplication, which we can think of as repeated addition: for example, 5×3 means 5 added to itself 3 times, or 5+5+5 = 15. Third comes exponentiation, which we can think of as repeated multiplication. Fourth comes ... what? Well, we have to invent a weird new operation, for repeated exponentiation. 
The mathematician Rudy Rucker calls it ‘tetration.’ For example, ‘5 tetrated to the 3’ means 5 raised to its own power 3 times, or $5^{5^5}$, a number with 2,185 digits. We can go on. Fifth comes repeated tetration: shall we call it ‘pentation’? Sixth comes repeated pentation: ‘hexation’? The operations continue infinitely, with each one standing on its predecessor to peer even higher into the firmament of big numbers. If each operation were a candy flavor, then the Ackermann sequence would be the sampler pack, mixing one number of each flavor. First in the sequence is 1+1, or (don’t hold your breath) 2. Second is 2×2, or 4. Third is 3 raised to the 3rd power, or 27. Hey, these numbers aren’t so big! Fee. Fi. Fo. Fum. Fourth is 4 tetrated to the 4, or $4^{4^{4^4}}$, which has $10^{154}$ digits. If you’re planning to write this number out, better start now. Fifth is 5 pentated to the 5, or $5^{5^{\cdot^{\cdot^{\cdot^{5}}}}}$ with ‘5 pentated to the 4’ numerals in the stack. This number is too colossal to describe in any ordinary terms. And the numbers just get bigger from there. Wielding the Ackermann sequence, we can clobber unschooled opponents in the biggest-number contest. But we need to be careful, since there are several definitions of the Ackermann sequence, not all identical. Under the fifteen-second time limit, here’s what I might write to avoid ambiguity:

A(111)—Ackermann seq—A(1)=1+1, A(2)=2×2, A(3)=$3^3$, etc

Recondite as it seems, the Ackermann sequence does have some applications. A problem in an area called Ramsey theory asks for the minimum dimension of a hypercube satisfying a certain property. The true dimension is thought to be 6, but the lowest dimension anyone’s been able to prove is so huge that it can only be expressed using the same ‘weird arithmetic’ that underlies the Ackermann sequence. Indeed, the Guinness Book of World Records once listed this dimension as the biggest number ever used in a mathematical proof. 
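The ladder of operations described above (addition, multiplication, exponentiation, tetration, and so on) can be sketched in a few lines of Python. This is an illustration under the essay's own convention, where 'a tetrated to the b' is a tower of b copies of a; it is not Ackermann's original 1928 construction, and the names `hyper` and `ackermann_seq` are made up for this sketch.

```python
def hyper(level, a, b):
    """Apply the level-th operation of the ladder to a and b.

    level 1 = addition, 2 = multiplication, 3 = exponentiation,
    4 = tetration, 5 = pentation, and so on.
    """
    if level == 1:
        return a + b
    if level == 2:
        return a * b
    if level == 3:
        return a ** b
    # From tetration upward, each operation repeats the previous one:
    # a (op) b = a (previous op) [a (op) (b - 1)], with a (op) 1 = a.
    if b == 1:
        return a
    return hyper(level - 1, a, hyper(level, a, b - 1))


def ackermann_seq(n):
    """n-th term of the sequence as the essay describes it: n (n-th op) n."""
    return hyper(n, n, n)


print([ackermann_seq(n) for n in (1, 2, 3)])   # [2, 4, 27]
print(len(str(hyper(4, 5, 3))))                # 2185
```

Only the first three terms are evaluated here. Calling `ackermann_seq(4)` would try to build a number with about 10^154 digits, which no machine can store, so the sketch stops where the essay's "Fee. Fi. Fo. Fum." begins.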
(Another contender for the title once was Skewes’ number, about $10^{10^{10^{34}}}$, which arises in the study of how prime numbers are distributed. The famous mathematician G. H. Hardy quipped that Skewes’ was "the largest number which has ever served any definite purpose in mathematics.") What’s more, Ackermann’s briskly-rising cavalcade performs an occasional cameo in computer science. For example, in the analysis of a data structure called ‘Union-Find,’ a term gets multiplied by the inverse of the Ackermann sequence—meaning, for each whole number X, the first number N such that the Nth Ackermann number is bigger than X. The inverse grows as slowly as Ackermann’s original sequence grows quickly; for all practical purposes, the inverse is at most 4.

Ackermann numbers are pretty big, but they’re not yet big enough. The quest for still bigger numbers takes us back to the formalists. After Ackermann demonstrated that ‘primitive recursive’ isn’t what we mean by ‘computable,’ the question still stood: what do we mean by ‘computable’? In 1936, Alonzo Church and Alan Turing independently answered this question. While Church answered using a logical formalism called the lambda calculus, Turing answered using an idealized computing machine—the Turing machine—that, in essence, is equivalent to every Compaq, Dell, Macintosh, and Cray in the modern world. Turing’s paper describing his machine, "On Computable Numbers," is rightly celebrated as the founding document of computer science. "Computing," said Turing, is normally done by writing certain symbols on paper. We may suppose this paper to be divided into squares like a child’s arithmetic book. In elementary arithmetic the 2-dimensional character of the paper is sometimes used. But such use is always avoidable, and I think it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, on a tape divided into squares. 
Turing continued to explicate his machine using ingenious reasoning from first principles. The tape, said Turing, extends infinitely in both directions, since a theoretical machine ought not be constrained by physical limits on resources. Furthermore, there’s a symbol written on each square of the tape, like the ‘1’s and ‘0’s in a modern computer’s memory. But how are the symbols manipulated? Well, there’s a ‘tape head’ moving back and forth along the tape, examining one square at a time, writing and erasing symbols according to definite rules. The rules are the tape head’s program: change them, and you change what the tape head does. Turing’s august insight was that we can program the tape head to carry out any computation. Turing machines can add, multiply, extract cube roots, sort, search, spell-check, parse, play Tic-Tac-Toe, list the Ackermann sequence. If we represented keyboard input, monitor output, and so forth as symbols on the tape, we could even run Windows on a Turing machine. But there’s a problem. Set a tape head loose on a sequence of symbols, and it might stop eventually, or it might run forever—like the fabled programmer who gets stuck in the shower because the instructions on the shampoo bottle read "lather, rinse, repeat." If the machine’s going to run forever, it’d be nice to know this in advance, so that we don’t spend an eternity waiting for it to finish. But how can we determine, in a finite amount of time, whether something will go on endlessly? If you bet a friend that your watch will never stop ticking, when could you declare victory? But maybe there’s some ingenious program that can examine other programs and tell us, infallibly, whether they’ll ever stop running. We just haven’t thought of it yet. Nope. Turing proved that this problem, called the Halting Problem, is unsolvable by Turing machines. The proof is a beautiful example of self-reference. 
It formalizes an old argument about why you can never have perfect introspection: because if you could, then you could determine what you were going to do ten seconds from now, and then do something else. Turing imagined that there was a special machine that could solve the Halting Problem. Then he showed how we could have this machine analyze itself, in such a way that it has to halt if it runs forever, and run forever if it halts. Like a hound that finally catches its tail and devours itself, the mythical machine vanishes in a fury of contradiction. (That’s the sort of thing you don’t say in a research paper.) ¨ "Very nice," you say (or perhaps you say, "not nice at all"). "But what does all this have to do with big numbers?" Aha! The connection wasn’t published until May of 1962. Then, in the Bell System Technical Journal, nestled between pragmatically-minded papers on "Multiport Structures" and "Waveguide Pressure Seals," appeared the modestly titled "On Non-Computable Functions" by Tibor Rado. In this paper, Rado introduced the biggest numbers anyone had ever imagined. His idea was simple. Just as we can classify words by how many letters they contain, we can classify Turing machines by how many rules they have in the tape head. Some machines have only one rule, others have two rules, still others have three rules, and so on. But for each fixed whole number N, just as there are only finitely many distinct words with N letters, so too are there only finitely many distinct machines with N rules. Among these machines, some halt and others run forever when started on a blank tape. Of the ones that halt, asked Rado, what’s the maximum number of steps that any machine takes before it halts? (Actually, Rado asked mainly about the maximum number of symbols any machine can write on the tape before halting. But the maximum number of steps, which Rado called S(n), has the same basic properties and is easier to reason about.) 
Rado called this maximum the Nth "Busy Beaver" number. (Ah yes, the early 1960’s were a more innocent age.) He visualized each Turing machine as a beaver bustling busily along the tape, writing and erasing symbols. The challenge, then, is to find the busiest beaver with exactly N rules, albeit not an infinitely busy one. We can interpret this challenge as one of finding the "most complicated" computer program N bits long: the one that does the most amount of stuff, but not an infinite amount. Now, suppose we knew the Nth Busy Beaver number, which we’ll call BB(N). Then we could decide whether any Turing machine with N rules halts on a blank tape. We’d just have to run the machine: if it halts, fine; but if it doesn’t halt within BB(N) steps, then we know it never will halt, since BB(N) is the maximum number of steps it could make before halting. Similarly, if you knew that all mortals died before age 200, then if Sally lived to be 200, you could conclude that Sally was immortal. So no Turing machine can list the Busy Beaver numbers—for if it could, it could solve the Halting Problem, which we already know is impossible. But here’s a curious fact. Suppose we could name a number greater than the Nth Busy Beaver number BB(N). Call this number D for dam, since like a beaver dam, it’s a roof for the Busy Beaver below. With D in hand, computing BB(N) itself becomes easy: we just need to simulate all the Turing machines with N rules. The ones that haven’t halted within D steps—the ones that bash through the dam’s roof—never will halt. So we can list exactly which machines halt, and among these, the maximum number of steps that any machine takes before it halts is BB(N). Conclusion? The sequence of Busy Beaver numbers, BB(1), BB(2), and so on, grows faster than any computable sequence. Faster than exponentials, stacked exponentials, the Ackermann sequence, you name it. 
Because if a Turing machine could compute a sequence that grows faster than Busy Beaver, then it could use that sequence to obtain the D‘s—the beaver dams. And with those D’s, it could list the Busy Beaver numbers, which (sound familiar?) we already know is impossible. The Busy Beaver sequence is non-computable, solely because it grows stupendously fast—too fast for any computer to keep up with it, even in principle. This means that no computer program could list all the Busy Beavers one by one. It doesn’t mean that specific Busy Beavers need remain eternally unknowable. And in fact, pinning them down has been a computer science pastime ever since Rado published his article. It’s easy to verify that BB(1), the first Busy Beaver number, is 1. That’s because if a one-rule Turing machine doesn’t halt after the very first step, it’ll just keep moving along the tape endlessly. There’s no room for any more complex behavior. With two rules we can do more, and a little grunt work will ascertain that BB(2) is 6. Six steps. What about the third Busy Beaver? In 1965 Rado, together with Shen Lin, proved that BB(3) is 21. The task was an arduous one, requiring human analysis of many machines to prove that they don’t halt—since, remember, there’s no algorithm for listing the Busy Beaver numbers. Next, in 1983, Allan Brady proved that BB(4) is 107. Unimpressed so far? Well, as with the Ackermann sequence, don’t be fooled by the first few numbers. In 1984, A.K. Dewdney devoted a Scientific American column to Busy Beavers, which inspired amateur mathematician George Uhing to build a special-purpose device for simulating Turing machines. The device, which cost Uhing less than $100, found a five-rule machine that runs for 2,133,492 steps before halting—establishing that BB(5) must be at least as high. Then, in 1989, Heiner Marxen and Jürgen Buntrock discovered that BB(5) is at least 47,176,870. 
To this day, BB(5) hasn’t been pinned down precisely, and it could turn out to be much higher still. As for BB(6), Marxen and Buntrock set another record in 1997 by proving that it’s at least 8,690,333,381,690,951. A formidable accomplishment, yet Marxen, Buntrock, and the other Busy Beaver hunters are merely wading along the shores of the unknowable. Humanity may never know the value of BB(6) for certain, let alone that of BB(7) or any higher number in the sequence. Indeed, already the top five and six-rule contenders elude us: we can’t explain how they ‘work’ in human terms. If creativity imbues their design, it’s not because humans put it there. One way to understand this is that even small Turing machines can encode profound mathematical problems. Take Goldbach’s conjecture, that every even number 4 or higher is a sum of two prime numbers: 10=7+3, 18=13+5. The conjecture has resisted proof since 1742. Yet we could design a Turing machine with, oh, let’s say 100 rules, that tests each even number to see whether it’s a sum of two primes, and halts when and if it finds a counterexample to the conjecture. Then knowing BB(100), we could in principle run this machine for BB(100) steps, decide whether it halts, and thereby resolve Goldbach’s conjecture. We need not venture far in the sequence to enter the lair of basilisks. But as Rado stressed, even if we can’t list the Busy Beaver numbers, they’re perfectly well-defined mathematically. If you ever challenge a friend to the biggest number contest, I suggest you write something like this: BB(11111)—Busy Beaver shift #—1, 6, 21, etc If your friend doesn’t know about Turing machines or anything similar, but only about, say, Ackermann numbers, then you’ll win the contest. You’ll still win even if you grant your friend a handicap, and allow him the entire lifetime of the universe to write his number. The key to the biggest number contest is a potent paradigm, and Turing’s theory of computation is potent indeed. 
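The small cases above, and the "dam" argument before them, can be checked by brute force. The following is a minimal Python sketch, not Rado's own procedure. The names `run`, `busy_beaver`, and `CHAMPION_2` and the rule encoding are illustrative assumptions. It also assumes the usual convention that the transition into the halt state counts as a step. The step cap plays the role of the dam D: treating machines that exceed it as non-halting is only justified when the cap is at least BB(N), which here rests on the already-known values BB(1) = 1 and BB(2) = 6.

```python
from itertools import product

HALT = "H"  # label for the halt state (an assumption of this sketch)


def run(rules, max_steps):
    """Run a machine on a blank tape.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state).
    Returns the number of steps taken if the machine halts within
    max_steps, otherwise None.
    """
    tape = {}    # sparse tape: unwritten squares read as 0
    pos = 0      # head position
    state = "A"  # start state
    for step in range(1, max_steps + 1):
        write, move, nxt = rules[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += 1 if move == "R" else -1
        if nxt == HALT:
            return step  # the halting transition counts as a step
        state = nxt
    return None


def busy_beaver(n_states, dam):
    """Maximum halting time over all n-state, 2-symbol machines.

    The result is only trustworthy if dam >= BB(n_states): machines
    still running after `dam` steps are assumed never to halt.
    """
    states = [chr(ord("A") + i) for i in range(n_states)]
    keys = [(s, b) for s in states for b in (0, 1)]
    actions = list(product((0, 1), "LR", states + [HALT]))
    best = 0
    for choice in product(actions, repeat=len(keys)):
        steps = run(dict(zip(keys, choice)), dam)
        if steps is not None:
            best = max(best, steps)
    return best


# A well-known two-state champion machine.
CHAMPION_2 = {
    ("A", 0): (1, "R", "B"),
    ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"),
    ("B", 1): (1, "R", HALT),
}

if __name__ == "__main__":
    print(run(CHAMPION_2, 100))
    print(busy_beaver(1, 50), busy_beaver(2, 50))
```

With a cap of 50, the search covers all 20,736 two-state machines and reports 6 as the longest halting run, matching the value quoted above. The same search is hopeless as a general method: without a valid dam in hand, there is no way to know when it is safe to stop waiting on a machine that is still running.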
¨ But what if your friend knows about Turing machines as well? Is there a notational system for big numbers more powerful than even Busy Beavers? Suppose we could endow a Turing machine with a magical ability to solve the Halting Problem. What would we get? We’d get a ‘super Turing machine’: one with abilities beyond those of any ordinary machine. But now, how hard is it to decide whether a super machine halts? Hmm. It turns out that not even super machines can solve this ‘super Halting Problem’, for the same reason that ordinary machines can’t solve the ordinary Halting Problem. To solve the Halting Problem for super machines, we’d need an even more powerful machine: a ‘super duper machine.’ And to solve the Halting Problem for super duper machines, we’d need a ‘super duper pooper machine.’ And so on endlessly. This infinite hierarchy of ever more powerful machines was formalized by the logician Stephen Kleene in 1943 (although he didn’t use the term ‘super duper pooper’). Imagine a novel, which is imbedded in a longer novel, which itself is imbedded in an even longer novel, and so on ad infinitum. Within each novel, the characters can debate the literary merits of any of the sub-novels. But, by analogy with classes of machines that can’t analyze themselves, the characters can never critique the novel that they themselves are in. (This, I think, jibes with our ordinary experience of novels.) To fully understand some reality, we need to go outside of that reality. This is the essence of Kleene’s hierarchy: that to solve the Halting Problem for some class of machines, we need a yet more powerful class of machines. And there’s no escape. Suppose a Turing machine had a magical ability to solve the Halting Problem, and the super Halting Problem, and the super duper Halting Problem, and the super duper pooper Halting Problem, and so on endlessly. Surely this would be the Queen of Turing machines? Not quite. 
As soon as we want to decide whether a ‘Queen of Turing machines’ halts, we need a still more powerful machine: an ‘Empress of Turing machines.’ And Kleene’s hierarchy continues. But how’s this relevant to big numbers? Well, each level of Kleene’s hierarchy generates a faster-growing Busy Beaver sequence than do all the previous levels. Indeed, each level’s sequence grows so rapidly that it can only be computed by a higher level. For example, define BB2(N) to be the maximum number of steps a super machine with N rules can make before halting. If this super Busy Beaver sequence were computable by super machines, then those machines could solve the super Halting Problem, which we know is impossible. So the super Busy Beaver numbers grow too rapidly to be computed, even if we could compute the ordinary Busy Beaver numbers. You might think that now, in the biggest-number contest, you could obliterate even an opponent who uses the Busy Beaver sequence by writing something like this: BB2(11111). But not quite. The problem is that I’ve never seen these "higher-level Busy Beavers" defined anywhere, probably because, to people who know computability theory, they’re a fairly obvious extension of the ordinary Busy Beaver numbers. So our reasonable modern mathematician wouldn’t know what number you were naming. If you want to use higher-level Busy Beavers in the biggest number contest, here’s what I suggest. First, publish a paper formalizing the concept in some obscure, low-prestige journal. Then, during the contest, cite the paper on your index card. To exceed higher-level Busy Beavers, we’d presumably need some new computational model surpassing even Turing machines. I can’t imagine what such a model would look like. Yet somehow I doubt that the story of notational systems for big numbers is over. Perhaps someday humans will be able concisely to name numbers that make Busy Beaver 100 seem as puerile and amusingly small as our nobleman’s eighty-three. 
Or if we’ll never name such numbers, perhaps other civilizations will. Is a biggest number contest afoot throughout the galaxy? ¨ You might wonder why we can’t transcend the whole parade of paradigms, and name numbers by a system that encompasses and surpasses them all. Suppose you wrote the following in the biggest number contest: The biggest whole number nameable with 1,000 characters of English text Surely this number exists. Using 1,000 characters, we can name only finitely many numbers, and among these numbers there has to be a biggest. And yet we’ve made no reference to how the number’s named. The English text could invoke Ackermann numbers, or Busy Beavers, or higher-level Busy Beavers, or even some yet more sweeping concept that nobody’s thought of yet. So unless our opponent uses the same ploy, we’ve got him licked. What a brilliant idea! Why didn’t we think of this earlier? Unfortunately it doesn’t work. We might as well have written One plus the biggest whole number nameable with 1,000 characters of English text This number takes at least 1,001 characters to name. Yet we’ve just named it with only 80 characters! Like a snake that swallows itself whole, our colossal number dissolves in a tumult of contradiction. What gives? The paradox I’ve just described was first published by Bertrand Russell, who attributed it to a librarian named G. G. Berry. The Berry Paradox arises not from mathematics, but from the ambiguity inherent in the English language. There’s no surefire way to convert an English phrase into the number it names (or to decide whether it names a number at all), which is why I invoked a "reasonable modern mathematician" in the rules for the biggest number contest. To circumvent the Berry Paradox, we need to name numbers using a precise, mathematical notational system, such as Turing machines—which is exactly the idea behind the Busy Beaver sequence. 
So in short, there’s no wily language trick by which to surpass Archimedes, Ackermann, Turing, and Rado, no royal road to big numbers. You might also wonder why we can’t use infinity in the contest. The answer is, for the same reason why we can’t use a rocket car in a bike race. Infinity is fascinating and elegant, but it’s not a whole number. Nor can we ‘subtract from infinity’ to yield a whole number. Infinity minus 17 is still infinity, whereas infinity minus infinity is undefined: it could be 0, 38, or even infinity again. Actually I should speak of infinities, plural. For in the late nineteenth century, Georg Cantor proved that there are different levels of infinity: for example, the infinity of points on a line is greater than the infinity of whole numbers. What’s more, just as there’s no biggest number, so too is there no biggest infinity. But the quest for big infinities is more abstruse than the quest for big numbers. And it involves, not a succession of paradigms, but essentially one: Cantor’s. ¨ So here we are, at the frontier of big number knowledge. As Euclid’s disciple supposedly asked, "what is the use of all this?" We’ve seen that progress in notational systems for big numbers mirrors progress in broader realms: mathematics, logic, computer science. And yet, though a mirror reflects reality, it doesn’t necessarily influence it. Even within mathematics, big numbers are often considered trivialities, their study an idle amusement with no broader implications. I want to argue a contrary view: that understanding big numbers is a key to understanding the world. Imagine trying to explain the Turing machine to Archimedes. The genius of Syracuse listens patiently as you discuss the papyrus tape extending infinitely in both directions, the time steps, states, input and output sequences. At last he explodes. "Foolishness!" he declares (or the ancient Greek equivalent). "All you’ve given me is an elaborate definition, with no value outside of itself." 
How do you respond? Archimedes has never heard of computers, those cantankerous devices that, twenty-three centuries from his time, will transact the world’s affairs. So you can’t claim practical application. Nor can you appeal to Hilbert and the formalist program, since Archimedes hasn’t heard of those either. But then it hits you: the Busy Beaver sequence. You define the sequence for Archimedes, convince him that BB(1000) is more than his 10^63 grains of sand filling the universe, more even than 10^63 raised to its own power 10^63 times. You defy him to name a bigger number without invoking Turing machines or some equivalent. And as he ponders this challenge, the power of the Turing machine concept dawns on him. Though his intuition may never apprehend the Busy Beaver numbers, his reason compels him to acknowledge their immensity. Big numbers have a way of imbuing abstract notions with reality. Indeed, one could define science as reason’s attempt to compensate for our inability to perceive big numbers. If we could run at 280,000,000 meters per second, there’d be no need for a special theory of relativity: it’d be obvious to everyone that the faster we go, the heavier and squatter we get, and the faster time elapses in the rest of the world. If we could live for 70,000,000 years, there’d be no theory of evolution, and certainly no creationism: we could watch speciation and adaptation with our eyes, instead of painstakingly reconstructing events from fossils and DNA. If we could bake bread at 20,000,000 degrees Kelvin, nuclear fusion would be not the esoteric domain of physicists but ordinary household knowledge. But we can’t do any of these things, and so we have science, to deduce about the gargantuan what we, with our infinitesimal faculties, will never sense. If people fear big numbers, is it any wonder that they fear science as well and turn for solace to the comforting smallness of mysticism? But do people fear big numbers? Certainly they do. 
I’ve met people who don’t know the difference between a million and a billion, and don’t care. We play a lottery with ‘six ways to win!,’ overlooking the twenty million ways to lose. We yawn at six billion tons of carbon dioxide released into the atmosphere each year, and speak of ‘sustainable development’ in the jaws of exponential growth. Such cases, it seems to me, transcend arithmetical ignorance and represent a basic unwillingness to grapple with the immense. Whence the cowering before big numbers, then? Does it have a biological origin? In 1999, a group led by neuropsychologist Stanislas Dehaene reported evidence in Science that two separate brain systems contribute to mathematical thinking. The group trained Russian-English bilinguals to solve a set of problems, including two-digit addition, base-eight addition, cube roots, and logarithms. Some subjects were trained in Russian, others in English. When the subjects were then asked to solve problems approximately—to choose the closer of two estimates—they performed equally well in both languages. But when asked to solve problems exactly, they performed better in the language of their training. What’s more, brain-imaging evidence showed that the subjects’ parietal lobes, involved in spatial reasoning, were more active during approximation problems; while the left inferior frontal lobes, involved in verbal reasoning, were more active during exact calculation problems. Studies of patients with brain lesions paint the same picture: those with parietal lesions sometimes can’t decide whether 9 is closer to 10 or to 5, but remember the multiplication table; whereas those with left-hemispheric lesions sometimes can’t decide whether 2+2 is 3 or 4, but know that the answer is closer to 3 than to 9. Dehaene et al. conjecture that humans represent numbers in two ways. For approximate reckoning we use a ‘mental number line,’ which evolved long ago and which we likely share with other animals. 
But for exact computation we use numerical symbols, which evolved recently and which, being language-dependent, are unique to humans. This hypothesis neatly explains the experiment’s findings: the reason subjects performed better in the language of their training for exact computation but not for approximation problems is that the former call upon the verbally-oriented left inferior frontal lobes, and the latter upon the spatially-oriented parietal lobes. If Dehaene et al.’s hypothesis is correct, then which representation do we use for big numbers? Surely the symbolic one—for nobody’s mental number line could be long enough to contain , 5 pentated to the 5, or BB(1000). And here, I suspect, is the problem. When thinking about 3, 4, or 7, we’re guided by our spatial intuition, honed over millions of years of perceiving 3 gazelles, 4 mates, 7 members of a hostile clan. But when thinking about BB(1000), we have only language, that evolutionary neophyte, to rely upon. The usual neural pathways for representing numbers lead to dead ends. And this, perhaps, is why people are afraid of big numbers. Could early intervention mitigate our big number phobia? What if second-grade math teachers took an hour-long hiatus from stultifying busywork to ask their students, "How do you name really, really big numbers?" And then told them about exponentials and stacked exponentials, tetration and the Ackermann sequence, maybe even Busy Beavers: a cornucopia of numbers vaster than any they’d ever conceived, and ideas stretching the bounds of their imaginations. Who can name the bigger number? Whoever has the deeper paradigm. Are you ready? Get set. Go. References Petr Beckmann, A History of Pi, Golem Press, 1971. Allan H. Brady, "The Determination of the Value of Rado’s Noncomputable Function Sigma(k) for Four-State Turing Machines," Mathematics of Computation, vol. 40, no. 162, April 1983, pp 647- 665. Gregory J. Chaitin, "The Berry Paradox," Complexity, vol. 1, no. 1, 1995, pp. 
26- 30. At http://www.umcs.maine.edu/~chaitin/unm2.html. A.K. Dewdney, The New Turing Omnibus: 66 Excursions in Computer Science, W.H. Freeman, 1993. S. Dehaene and E. Spelke and P. Pinel and R. Stanescu and S. Tsivkin, "Sources of Mathematical Thinking: Behavioral and Brain-Imaging Evidence," Science, vol. 284, no. 5416, May 7, 1999, pp. 970- 974. Douglas Hofstadter, Metamagical Themas: Questing for the Essence of Mind and Pattern, Basic Books, 1985. Chapter 6, "On Number Numbness," pp. 115- 135. Robert Kanigel, The Man Who Knew Infinity: A Life of the Genius Ramanujan, Washington Square Press, 1991. Stephen C. Kleene, "Recursive predicates and quantifiers," Transactions of the American Mathematical Society, vol. 53, 1943, pp. 41- 74. Donald E. Knuth, Selected Papers on Computer Science, CSLI Publications, 1996. Chapter 2, "Mathematics and Computer Science: Coping with Finiteness," pp. 31- 57. Dexter C. Kozen, Automata and Computability, Springer-Verlag, 1997. ———, The Design and Analysis of Algorithms, Springer-Verlag, 1991. Shen Lin and Tibor Rado, "Computer studies of Turing machine problems," Journal of the Association for Computing Machinery, vol. 12, no. 2, April 1965, pp. 196- 212. Heiner Marxen, Busy Beaver, at http://www.drb.insel.de/~heiner/BB/. ——— and Jürgen Buntrock, "Attacking the Busy Beaver 5," Bulletin of the European Association for Theoretical Computer Science, no. 40, February 1990, pp. 247- 251. Tibor Rado, "On Non-Computable Functions," Bell System Technical Journal, vol. XLI, no. 2, May 1962, pp. 877- 884. Rudy Rucker, Infinity and the Mind, Princeton University Press, 1995. Carl Sagan, Billions & Billions, Random House, 1997. Michael Somos, "Busy Beaver Turing Machine." At http://grail.cba.csuohio.edu/~somos/bb.html. Alan Turing, "On computable numbers, with an application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Series 2, vol. 42, pp. 230- 265, 1936. 
Reprinted in Martin Davis (ed.), The Undecidable, Raven, 1965. Ilan Vardi, "Archimedes, the Sand Reckoner," at http://www.ihes.fr/~ilan/sand_reckoner.ps. Eric W. Weisstein, CRC Concise Encyclopedia of Mathematics, CRC Press, 1999. Entry on "Large Number" at http://www.treasure-troves.com/math/LargeNumber.html.

      Why do we even care about big numbers? Is there any use for them?

    1. Applied Ecology textbook.

      I really appreciate this project overall, as it will mean a lot to scientists who may never appear in a textbook even though they should. Even if they never see it, I think it's awesome that we can help more people get appreciation for their work that they might not otherwise receive.

    1. Author Response

      Reviewer #1 (Public Review):

      “The synthesis and metabolism of sphingolipid (SL) are involved in wide range of biological processes. In the present study, the authors investigate the role of SPTLC1, one of the essential subunits of serine palmitoyl transferase complex, in both physiological and pathophysiological angiogenesis, via using inducible endothelial-specific SPTLC1 knockout mice. They found SPTLC1 deficiency in ECs inhibited retinal angiogenesis along with reducing several SL metabolites in plasma, red blood cells, and peripheral organs. In addition, the authors found SPTLC1 EC-KO mice are resistant to APAP-induced liver injury. Overall, the in vivo findings in the present study are of potential interest and the authors have given clear evidence that endothelial SPTLC1 is critical to retinal angiogenesis. However, the underlying mechanisms are completely lacking in the present study. Most of the evidence provided is circumstantial, associative, and indirect.”

      We appreciate the positive comments of the reviewer. We have addressed the reviewer’s concern regarding underlying mechanisms as detailed below.

      “To be specific,

      1. The authors found endothelial SPTLC1 is important to both angiogenesis and the plasma lipid profile. However, the authors did not present the data to demonstrate the relationship between them. The in vivo findings about the phenotype and the plasma lipid profile might be true and unrelated. It would be important to know whether supplementing the reduced lipid induced by SPTLC1 KO could rescue the angiogenesis related phenotype in mice, or, whether the alternative way to inhibit the SL synthesis could mimic the phenotype of KO mice.”

      In the manuscript, we discussed whether S1P is involved, since it is one of the most down-regulated SLs in the plasma and a major regulator of angiogenesis. We think it is unlikely that reduced plasma S1P is responsible for the phenotype. First, the retinal angiogenesis defect in Sptlc1 ECKO mice is the opposite of that in S1pr1 ECKO mice, as we have published previously (PMID: 22975328, PMID: 32059774). Moreover, deletion of sphingosine kinase, the enzyme that produces S1P, in the endothelium does not influence retinal angiogenesis at P6 (Figure 3 Supplement 2 A and B). Loss of the S1P chaperone ApoM (i.e., Apom KO), which causes a 50% reduction of plasma S1P, does not change retinal vascular development (Figure 3 Supplement 2 C and D). Taken together, our results strongly suggest that the reduction in plasma S1P is not the cause of the vascular defect in Sptlc1 ECKO retinas.

      Based on our results in the manuscript, loss of SPT enzyme activity in endothelial cells reduced SL species in the endothelial cells and the plasma. Our in vitro and VEGF intraocular injection experiments (new data) suggest that the angiogenic defects seen in Sptlc1 ECKO mice are due to cell-intrinsic defects in VEGF signaling and not to changes in plasma SL levels. We have edited the discussion section to address this issue.

      “2. A major issue is that the present study did not reveal is a real downstream target. It is possible that VEGF signaling might be impaired by SPTLC1 knockout as discussed by the authors. However, the authors did not demonstrate this point with data. Including both in vivo and in vitro data to evaluate the effects of SPTLC1 deficiency on VEGF signaling might further strengthen the hypothesis. Besides, with in vitro experiments, the authors might further find the critical metabolite(s) involved in VEGF signaling and angiogenesis.”

      As discussed above, we agree with the reviewer’s critique and have addressed this essential point with new experiments (both in vitro and in vivo) in Figure 5. Our new data show that the SPT pathway supplies the glycosphingolipid GM1, which is needed for efficient VEGF-induced ERK phosphorylation and tip cell formation.

      Reviewer #2 (Public Review):

      “Andrew Kuo et al. investigated the role of endothelial de novo sphingolipids (SL) synthesis using endothelial cell specific SPTLC1 knockout (ECKO) mice. They showed that these mice exhibited low concentration of various SL species in not only ECs but also RBC, circulation, and other non-EC tissues. They also showed that ECKO mice exhibited impaired angiogenesis in normal and oxygen-induced retinopathy models, consistent with the decrease of endothelial proliferation and tip cell formation. They finally revealed that these mice were resistant to acetaminophen-induced acute liver injury in early phase. The experiments were well-designed, and the results were clear and convincing. The authors concluded that endothelial cells were the major source of SL in circulation and various organs (liver and lung) other than retina (and probably brain). The weakness of the current version of the manuscript is that the authors did not elucidate the mechanisms underlying the observed phenomena.

      1) The authors showed impaired angiogenesis in ECKO mice using neonatal retina model. Based on the fact that this phenotype was similar to that in endothelial VEGFR2 deficient mice, they suggested that VEGF responsiveness is altered in ECKO mice. Although this hypothesis is plausible, the authors would need to prove it by evaluating VEGFR signaling (VEGFR phosphorylation, Akt activation etc.) in ECKO mice.”

      We thank the reviewer for the positive comments. As for the weakness identified, we have addressed this point by conducting new in vitro and in vivo experiments (detailed above). The new Figure 5 addresses this issue directly.

      “2) The acetaminophen-induced liver injury was reduced in ECKO mice in early phase. However, it is still unclear whether SL production itself affects liver injury. The authors discussed the possibility that gene deficiency increases unconsumed serine resulting in GSH increase, but it is essentially independent to SL. If possible, it would be good if the authors could investigate the effect of SL administration on the liver injury progression.”

      We appreciate the reviewer’s concern about the liver injury model in the Sptlc1 ECKO mice. Our data suggest that SL species supplied from ECs impact the hepatocyte response to stress. Since acetaminophen-induced liver injury is highly dependent on reactive oxygen species, the increased glutathione levels we found in the Sptlc1 ECKO mice may be involved in the phenotype. However, we are simply considering them as biochemical markers of liver injury. This has been addressed in the discussion.

      “3) This paper showed the impaired cell proliferation in Sptlc1 KO EC mice, and discussed it. Authors described that this phenotype was similar to that of Nos3 KO mice, but its inconsistency with Sptlc2 ECKO adult mice was only justified by a word "isoform-selective function". Authors could quantify eNOS expressions in Sptlc1 KO mice, compared results and then discuss this matter. “

      In Figure 1C, we used eNOS as an EC marker to show purity during our EC isolation process. In fact, we did not observe a change in eNOS expression in Sptlc1 ECKO. We also did not detect elevated phospho-eNOS in Sptlc1 ECKO, in contrast to Sptlc2 ECKO adult mice (Figure 1 Supplement 4). Additionally, our work in the retina was performed in pups with postnatal gene deletion from P6-P17, which differs from the published Sptlc2 ECKO study. The differences in gene deletion strategy (early postnatal vs. adult) could result in differences in eNOS expression. We have added discussion about this issue.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Ciliates extensively rearrange their somatic genome every time a new somatic nucleus develops from the zygotic germline nucleus. In this manuscript, Feng et al report the sequencing, assembly and annotation of the germline and somatic genomes of Euplotes woodruffi and the germline genome of Tetmemena sp. (whose somatic genome was sequenced and assembled by the same lab in 2015). They present a comparative analysis of developmentally programmed genome rearrangements in these two species and in the model ciliate Oxytricha trifallax. Their major findings are that:

      (i) E. woodruffi and Tetmemena sp. eliminate a smaller fraction of their germline genome (~54%) from their somatic macronucleus (MAC) than O. trifallax (>80%)

      (ii) Transposable elements (TE) represent a smaller fraction of the germline genome (~2%) in the first two ciliates than in O. trifallax (~15%). TEs are mainly located at the boundaries of germline chromosomes and in intergenic regions, but can also be found inside IESs

      (iii) Several thousands of genes are scrambled in the germline genome of all three species

      The authors have also addressed the possible origin of gene scrambling. They report an interesting association with local paralogy and propose a model for the emergence of the odd-even pattern of gene unscrambling between two paralogous copies.

      Major comments:

      1. Based on the statistics presented in Table 1, genome assemblies are of good quality, with a reasonable N50 size of germline (MIC) contigs. It seems, however, that no entire MIC chromosome could be assembled, since no two-telomere contig is mentioned in the list. As proposed by the authors (p.7) the presence of numerous TEs at the boundaries of MIC contigs (Fig S1) may have hindered the assembly of MIC chromosome ends. I would have appreciated to have more information on the "other repeats" (which seem to differ from tandem repeats according to Fig 2) and their location along MIC contigs.

        Subcategories of “other repeats” were included in Table S2 based on Repeatmasker annotations. We now analyzed the locations of other repeats in MIC contigs and include those as well in new Figure S1B. About 30% of “other” transposable elements are present at the boundaries of MIC contigs, which may also hinder the assembly. Notably, 35-45% of “other TEs” are in assembled, intergenic regions.

      The definition of "Internal Eliminated Sequences" (IES) is not clear. The authors make a distinction between IESs and TEs. I understand that IESs are DNA segments that separate two macronuclear-destined sequences (MDS) in the germline genome. Thus they appear to be restricted to those regions that eventually yield gene-sized MAC chromosomes. IESs are eliminated between two pointers that may not be identical on both sides in case of scrambled genes. Some clarification is needed here.

      To illustrate my point: I found the statement "with many TE insertions within IESs, suggesting that TE insertions may have generated IESs" particularly confusing (p. 9 lines 5-6). Does this mean that IESs extend beyond the ends of inserted TEs? The legend of Fig S1 should also be clarified.

      We clarified the text and legend. IESs can extend beyond the ends of inserted TEs, even if the original IES is a decayed TE, due to subsequent sequence evolution at the boundaries or if the original insertion was into an existing IES. David Prescott referred to sequence evolution at the edges of IESs as “pointer sliding” (ref.36).

      p. 10 lines 2-4 and Fig S2: Could the authors explain the difference they make between MDS (in the text) and CDS (in Fig S2)? My understanding is that a CDS is the entire gene coding sequence and may be made of multiple MDSs. If this is correct, the sentence should read "We compared the number of MDSs between single-copy orthologs for single-gene MAC chromosomes across the three species and found that the orthologs have similar CDS lengths".

      Yes, we made the correction.

      p. 12 lines 10-15: the discovery that paralogous MDSs can be found in scrambled genomic loci is interesting. If the two paralogs can be distinguished based on the number of substitutions, it would be informative to go back to individual reads and check whether each of the two copies can be incorporated in the unscrambled CDS (and at which frequency). Would the pointers be compatible with this?

      The paralogous MDSs in the MIC are often not identical. The copy with the highest similarity is assigned as “preliminary match” by SDRAP (ref. 52), and others are assigned as “additional matches”. To validate SDRAP assignments, we did pairwise BLASTN alignments (“-task megablast”) of paralogous MIC MDSs and their corresponding MAC MDSs. We confirmed that in the three species, the preliminary match has the best or equally best pid (percentage of identity) in most cases. Therefore, the MDS assigned as preliminary match is more likely the paralog incorporated into the MAC chromosome.
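      The validation step described above can be illustrated with a minimal sketch. It assumes the BLASTN percent identities have already been parsed into a dictionary per MAC MDS; the function names, the record layout, and the example values are hypothetical and are not part of SDRAP or BLAST.

```python
def preliminary_is_best(pids, preliminary, tol=1e-9):
    """Return True if the preliminary match has the best (or tied-best) pid.

    pids: dict mapping a MIC paralog ID to its percent identity against
          the corresponding MAC MDS (hypothetical layout).
    preliminary: the paralog ID that was assigned as the preliminary match.
    """
    best = max(pids.values())
    return pids[preliminary] >= best - tol


def fraction_preliminary_best(records):
    """records: iterable of (pids, preliminary) pairs, one per MAC MDS."""
    records = list(records)
    if not records:
        return 0.0
    hits = sum(preliminary_is_best(p, prelim) for p, prelim in records)
    return hits / len(records)


# Hypothetical percent identities, for illustration only.
example = [
    ({"mic_a": 99.1, "mic_b": 93.4}, "mic_a"),  # preliminary is strictly best
    ({"mic_c": 97.0, "mic_d": 97.0}, "mic_c"),  # a tie counts as "equally best"
    ({"mic_e": 90.2, "mic_f": 95.8}, "mic_e"),  # preliminary is not best
]
print(fraction_preliminary_best(example))
```

      A fraction close to 1 would correspond to the statement that the preliminary match has the best or equally best pid in most cases.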

      We used genome assemblies of Euplotes woodruffi, which had the highest Nanopore coverage, to further investigate the frequency of MDS incorporation. We followed the reviewer’s suggestion and called SNP variants on both MAC and MIC genomes. For MAC SNP calling, we used Illumina reads as input for freebayes (ref a). For MIC SNP calling, we used Nanopore reads, instead of Illumina reads, to avoid non-specific short-read mapping on paralogous MDSs and to avoid the presence of any contaminating MAC reads. Variants were called and phased by PEPPER-Margin-DeepVariant (ref b), a new tool published in 2021 in Nature Methods, which has been reported to have similar accuracy to Illumina read variant calling, especially at high read coverage. We used the parameter “--pepper_min_coverage_threshold 20” to call confident variants when at least 20 reads cover the position. Only 92 MIC SNPs in the paralogous MDSs passed all filters of the program. Using this small set of MIC SNPs, we were unfortunately unable to distinguish which paralogous MIC MDS was incorporated into the MAC. Therefore, we cannot infer with what frequency one paralogous MDS is incorporated over another, until they become sufficiently diverged, which is compatible with the model.

      a. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907. 2012 Jul 17.

      b. Shafin K, Pesout T, Chang PC, Nattestad M, Kolesnikov A, Goel S, Baid G, Kolmogorov M, Eizenga JM, Miga KH, Carnevali P. Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads. Nature methods. 2021 Nov;18(11):1322-32.

      The hypothesis that odd-even scrambled loci have evolved from paralogous genes in E. woodruffi is supported by the existence of paralogous MDSs, length conservation of MDS/IES pairs and sequence similarity between corresponding MDS and IES in a pair. The correlations presented for Oxytricha and Tetmemena are much less convincing (Fig S5D and E). I recommend that the authors are even more cautious in their statement on p.13 ("For Oxytricha and Tememena, the MDS and IES lengths for such MDS/IES pairs also correlate positively, but more moderately").

      Thank you, we rephrased the text.
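      For readers who want to check a length correlation of this kind themselves, a minimal sketch follows. It computes a plain Pearson coefficient; the response does not state which correlation statistic the authors used, and the lengths below are hypothetical values for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical MDS and IES lengths (bp) for paired MDS/IES loci.
mds_lengths = [40, 85, 120, 200]
ies_lengths = [38, 90, 115, 210]
print(pearson_r(mds_lengths, ies_lengths))
```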

      p. 15 last paragraph: Why did the authors focus only on TBEs inserted in non-scrambled IESs to look for orthologous TBE insertions? Is there a reason to believe that no recent TBE insertion occurred at other genomic loci? Or was it only for practical reasons? It is also not clear to me whether the authors have considered full-length TBEs or the presence of at least one TBE ORF.

      This analysis was limited for practical reasons, because we identify position conservation of TBEs by aligning protein sequences of MAC genes. We only consider TBEs inserted in non-scrambled IESs in exons. It would be difficult and less meaningful to align completely non-coding MIC-limited regions.

      Partial TBEs are also included if they contain at least one TBE ORF (detected by BLAST).

      Furthermore, TE insertion cannot explain the origin of scrambled IESs, and TEs rarely map to scrambled IESs (Figure S1A), but there is a clear evolutionary model for the origin of nonscrambled IESs from decay of TBEs (ref. 49). Initial purifying selection would act on the TE to maintain its ability to self-excise, whereas we advocate for a different model for the origin of scrambled IESs by decay of paralogous MDSs.

      p. 16: the authors report that some introns of E. woodruffi map "near" Oxytricha/Tetmemena pointers. How near? Based on the information provided by the authors, I don't think this observation necessarily implies that IESs were converted to introns (or reciprocally) during evolution. If this were true, shouldn't at least one intron boundary coincide exactly with a pointer? The authors should clarify this (also in the discussion, on p. 20, top paragraph).

      We used a 20bp window (~7 amino acids), as described in the Methods, and added that to the Results. Full detail is provided in the Methods section, “Ortholog comparison pipeline and Monte Carlo simulations”. 103 E. woodruffi introns are within 20bp from the midpoint of Oxytricha/Tetmemena pointers. Among these, 43 intron boundaries overlap an Oxytricha or Tetmemena pointer. We observed 306 cases of precisely matching boundaries between any two species, where the exon junction of one species maps inside the MDS/IES pointer of another species, although we would only expect the boundaries of introns and IESs to coincide so precisely if they were recent conversions. Hence we feel that a window analysis is informative.
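      The window analysis can be sketched as follows. This is a simplified illustration, not the authors' pipeline: positions are assumed to be coordinates on a shared ortholog alignment, and the null model (uniform random placement of introns) is an assumption of this sketch. Only the 20 bp window size comes from the response; all names and example coordinates are hypothetical.

```python
import random

WINDOW_BP = 20  # window size stated in the response


def count_near(intron_positions, pointer_midpoints, window=WINDOW_BP):
    """Count intron positions lying within `window` bp of any pointer midpoint."""
    return sum(
        any(abs(i - p) <= window for p in pointer_midpoints)
        for i in intron_positions
    )


def monte_carlo_p(intron_positions, pointer_midpoints, gene_length,
                  n_iter=1000, window=WINDOW_BP, seed=0):
    """Return (observed count, empirical p-value) under random intron placement.

    Null model (assumption of this sketch): each intron is placed uniformly
    at random along a sequence of length `gene_length`.
    """
    rng = random.Random(seed)
    observed = count_near(intron_positions, pointer_midpoints, window)
    n = len(intron_positions)
    at_least = 0
    for _ in range(n_iter):
        simulated = [rng.randrange(gene_length) for _ in range(n)]
        if count_near(simulated, pointer_midpoints, window) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_iter + 1)


# Hypothetical coordinates, for illustration only.
introns = [105, 480, 900]
pointers = [100, 500, 1500]
observed, p_value = monte_carlo_p(introns, pointers, gene_length=2000)
print(observed, p_value)
```

      A window-based count of this kind tolerates small shifts in boundary positions, which is why it can remain informative even when exact coincidence of intron and pointer boundaries is expected only for recent conversions.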

      p. 19 2nd paragraph: the suggested mechanism explaining the 5' bias of IESs in E. woodruffi genes is unclear. How could germline recombination take place between a MIC chromosome and a MAC reverse transcript or nanochromosome? This would imply that DNA could be imported in the MIC. Is there evidence that this might occur?

      The ability of TEs to invade the MIC demonstrates that even foreign DNA can be incorporated into the MIC. Since MAC DNA is present at high copy number, it offers a potential source for a recombination template that could erase IESs, as could an errant reverse transcript of one of the long noncoding template RNAs. Any of these would be infrequent events that would matter on an evolutionary time scale even if developmentally rare.

      According to Figure 1, no scrambled genes have been reported in Paramecium tetraurelia. Within the frame of the proposed model, this is somewhat unexpected because this ciliate went through several whole genome duplications during evolution and harbors many paralogous gene pairs. Is there a reason why no gene scrambling took place in Paramecium?

      Paramecium uses only TA dinucleotide pointers for IES elimination, unlike the rich diversity of pointers in spirotrichous ciliates. This limitation in its machinery may explain why no scrambled loci have been observed in Paramecium, despite the abundance of paralogs. Our model suggests that local MIC paralogy is associated with the origin of scrambling. But most of the paralogy reported in Paramecium is at the level of whole chromosomes in the MAC (ref. 104) rather than local MIC paralogy.

      Minor comments:

      p. 4 (4th bottom line): To my knowledge, ref #28 presents a draft (incomplete) MIC assembly of the Paramecium genome.

      Thank you, we added reference 29 and adjusted the wording describing the quality of MIC genome draft assemblies.

      p. 7 (last paragraph): "encoding" should be replaced by "carrying"

      Thank you, we made the change.

      p. 10 (2nd paragraph): insert a missing "o" into "nanochromosomes"

      Thank you, corrected.

      p. 10 (same paragraph): the weak 5' bias of IES distribution in Tetmemena should be shown (either as an additional panel in Fig 3 or in a Sup Figure).

      Thank you, we added it as Figure S2C.

      p. 24 2nd paragraph: "a" is missing in "Trinity, which is a software..."

      Thank you, we made the correction.

      CROSS-CONSULTATION COMMENTS

      I agree with most comments of reviewer 3.

      The authors have actually defined "TE" in the introduction (p. 6). Depending on the journal's rules for abbreviation use, it may not be necessary to define it again in the results section.

      Reviewer #1 (Significance (Required)):

      Ciliates are unicellular models to study developmentally programmed genome rearrangements at the mechanistic, genome-wide and evolutionary levels. These aspects have so far mostly been addressed in three species: P. tetraurelia and Tetrahymena thermophila on the one hand, the spirotrichous ciliate O. trifallax on the other.

      One new piece of information that can be found in the present manuscript is the assembly and annotation of the germline genome of two novel species: Tetmemena sp, closely related to Oxytricha, and the more distant E. woodruffi. Feng et al establish that, similar to other ciliates, Tetmemena and Euplotes eliminate TEs and other germline-specific sequences during programmed genome rearrangements. They also undergo extensive gene unscrambling, which results in IES removal and MDS reordering to assemble coding sequences.

      A TE origin was discussed previously for Paramecium (Arnaiz et al PLoS Genet; Sellis et al 2021 PLoS Biol) and Tetrahymena IESs (Hamilton et al 2016 eLife). While this may also hold true in spirotrichous ciliates, the present manuscript proposes a completely new evolutionary scenario for IESs from scrambled genes. Here, Feng et al establish that scrambled genes of spirotrichous ciliates tend to be associated with local paralogy. They provide evidence supporting that IESs from scrambled genes may have evolved from paralogous MDSs.

      Although I am more an expert in the molecular mechanisms involved in genome rearrangements, I feel that the work reported here should draw the attention of a broader audience interested in genome dynamics and evolution, beyond the specific field of spirotrichous ciliate biology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Feng et al. provide a solid analysis of the evolution of genome rearrangement in spirotrich ciliates. The authors applied a variety of state-of-the-art sequencing and bioinformatic methods to investigate the intriguing and extremely complex patterns of genome architecture in this protist lineage. Methods (including statistical analyses) are adequate and explained in detail. Results and discussions reflect careful, clever analysis of the data and excellent linkage with the literature. Figures and tables complement the text in a compelling way. I have only minor suggestions:

      Summary: more gradually introduce Spirotrichea and the phylogenetic relationship among the three species analyzed. This would better position the reader to understand the evolutionary context you are working in. Also, it would be helpful to more clearly differentiate novel vs. existing data. A suggestion: "This study focuses on three spirotrich species: two in the family Oxytrichidae (Oxytricha trifallax and Tetmemena sp) and Euplotes woodruffi as an outgroup. To complement existing data, we sequenced, assembled and annotated the germiline and somatic genomes of E. woodruffi and the germline genome of Tetmemena sp."

      Thank you, we clarified the summary (abstract).

      Introduction, first paragraph: Replace "The species in this study..." for a more precise statement, such as "The three spirotrich species studied here..."

      Thank you, we have made this statement more precise.

      p. 4: This sentence is unclear: "These useful tools provide partial insight to guide selection of species for full genome sequencing, which allows construction of complete rearrangement maps of a MIC genome onto a MAC genome for a reference species."

      Thank you, we have clarified this sentence.

      p. 8: define TE on first mention.

      Defined on page 6.

      Table 1. Indicate which MIC and MAC data are from this study.

      References are included for published data and a note has been added to indicate data from this study.

      Reviewer #3 (Significance (Required)):

      The present work represents a significant advance in the field of evolutionary genomics. The focus of the paper is on ciliates, an ancient (2 billion-year old) and highly diverse eukaryotic phylum that presents many peculiarities, including sex, nuclear dimorphism, genome rearrangement, high numbers of paralogs and transposons, etc. While some data exist on a few model ciliates of disparate phylogenetic position, this work focuses on two species taxonomically placed in the same family, plus a more distant outgroup within the same class. This gives a novel dimension to this study, that goes beyond exploring genome architecture in a single clade. Instead, it allows to explore evolutionary trends in genome rearrangement among relatively closely related species. This paper should be of high interest not only for ciliate biologists (like me), but also in relation to comparative genomics of protists/eukaryotes and germ-soma biology. I highly recommend publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Ciliates extensively rearrange their somatic genome every time a new somatic nucleus develops from the zygotic germline nucleus. In this manuscript, Feng et al report the sequencing, assembly and annotation of the germline and somatic genomes of Euplotes woodruffi and the germline genome of Tetmemena sp. (whose somatic genome was sequenced and assembled by the same lab in 2015). They present a comparative analysis of developmentally programmed genome rearrangements in these two species and in the model ciliate Oxytricha trifallax. Their major findings are that:

      (1) E. woodruffi and Tetmemena sp. eliminate a smaller fraction of their germline genome (~54%) from their somatic macronucleus (MAC) than O. trifallax (>80%)

      (2) Transposable elements (TE) represent a smaller fraction of the germline genome (~2%) in the first two ciliates than in O. trifallax (~15%). TEs are mainly located at the boundaries of germline chromosomes and in intergenic regions, but can also be found inside IESs

      (3) Several thousands of genes are scrambled in the germline genome of all three species

      The authors have also addressed the possible origin of gene scrambling. They report an interesting association with local paralogy and propose a model for the emergence of the odd-even pattern of gene unscrambling between two paralogous copies.

      Major comments:

      (1) Based on the statistics presented in Table 1, genome assemblies are of good quality, with a reasonable N50 size of germline (MIC) contigs. It seems, however, that no entire MIC chromosome could be assembled, since no two-telomere contig is mentioned in the list. As proposed by the authors (p.7) the presence of numerous TEs at the boundaries of MIC contigs (Fig S1) may have hindered the assembly of MIC chromosome ends. I would have appreciated to have more information on the "other repeats" (which seem to differ from tandem repeats according to Fig 2) and their location along MIC contigs.

      (2) The definition of "Internal Eliminated Sequences" (IES) is not clear. The authors make a distinction between IESs and TEs. I understand that IESs are DNA segments that separate two macronuclear-destined sequences (MDS) in the germline genome. Thus they appear to be restricted to those regions that eventually yield gene-sized MAC chromosomes. IESs are eliminated between two pointers that may not be identical on both sides in case of scrambled genes. Some clarification is needed here.

      To illustrate my point: I found the statement "with many TE insertions within IESs, suggesting that TE insertions may have generated IESs" particularly confusing (p. 9 lines 5-6). Does this mean that IESs extend beyond the ends of inserted TEs? The legend of Fig S1 should also be clarified.

      (3) p. 10 lines 2-4 and Fig S2: Could the authors explain the difference they make between MDS (in the text) and CDS (in Fig S2)? My understanding is that a CDS is the entire gene coding sequence and may be made of multiple MDSs. If this is correct, the sentence should read "We compared the number of MDSs between single-copy orthologs for single-gene MAC chromosomes across the three species and found that the orthologs have similar CDS lengths".

      (4) p. 12 lines 10-15: the discovery that paralogous MDSs can be found in scrambled genomic loci is interesting. If the two paralogs can be distinguished based on the number of substitutions, it would be informative to go back to individual reads and check whether each of the two copies can be incorporated in the unscrambled CDS (and at which frequency). Would the pointers be compatible with this?

      (5) The hypothesis that odd-even scrambled loci have evolved from paralogous genes in E. woodruffi is supported by the existence of paralogous MDSs, length conservation of MDS/IES pairs and sequence similarity between corresponding MDS and IES in a pair. The correlations presented for Oxytricha and Tetmemena are much less convincing (Fig S5D and E). I recommend that the authors are even more cautious in their statement on p.13 ("For Oxytricha and Tememena, the MDS and IES lengths for such MDS/IES pairs also correlate positively, but more moderately").

      (6) p. 15 last paragraph: Why did the authors focus only on TBEs inserted in non-scrambled IESs to look for orthologous TBE insertions? Is there a reason to believe that no recent TBE insertion occurred at other genomic loci? Or was it only for practical reasons? It is also not clear to me whether the authors have considered full-length TBEs or the presence of at least one TBE ORF.

      (7) p. 16: the authors report that some introns of E. woodruffi map "near" Oxytricha/Tetmemena pointers. How near? Based on the information provided by the authors, I don't think this observation necessarily implies that IESs were converted to introns (or reciprocally) during evolution. If this were true, shouldn't at least one intron boundary coincide exactly with a pointer? The authors should clarify this (also in the discussion, on p. 20, top paragraph).

      (8) p. 19 2nd paragraph: the suggested mechanism explaining the 5' bias of IESs in E. woodruffi genes is unclear. How could germline recombination take place between a MIC chromosome and a MAC reverse transcript or nanochromosome? This would imply that DNA could be imported in the MIC. Is there evidence that this might occur?

      (9) According to Figure 1, no scrambled genes have been reported in Paramecium tetraurelia. Within the frame of the proposed model, this is somewhat unexpected because this ciliate went through several whole genome duplications during evolution and harbors many paralogous gene pairs. Is there a reason why no gene scrambling took place in Paramecium?

      Minor comments:

      • p. 4 (4th bottom line): To my knowledge, ref #28 presents a draft (incomplete) MIC assembly of the Paramecium genome.

      • p. 7 (last paragraph): "encoding" should be replaced by "carrying"

      • p. 10 (2nd paragraph): insert a missing "o" into "nanochromosomes"

      • p. 10 (same paragraph): the weak 5' bias of IES distribution in Tetmemena should be shown (either as an additional panel in Fig 3 or in a Sup Figure).

      • p. 24 2nd paragraph: "a" is missing in "Trinity, which is a software..."

      CROSS-CONSULTATION COMMENTS

      I agree with most comments of reviewer 3.

      The authors have actually defined "TE" in the introduction (p. 6). Depending on the journal's rules for abbreviation use, it may not be necessary to define it again in the results section.

      Significance

      • Ciliates are unicellular models to study developmentally programmed genome rearrangements at the mechanistic, genome-wide and evolutionary levels. These aspects have so far mostly been addressed in three species: P. tetraurelia and Tetrahymena thermophila on the one hand, the spirotrichous ciliate O. trifallax on the other.

      • One new piece of information that can be found in the present manuscript is the assembly and annotation of the germline genome of two novel species: Tetmemena sp, closely related to Oxytricha, and the more distant E. woodruffi. Feng et al establish that, similar to other ciliates, Tetmemena and Euplotes eliminate TEs and other germline-specific sequences during programmed genome rearrangements. They also undergo extensive gene unscrambling, which results in IES removal and MDS reordering to assemble coding sequences.

      • A TE origin was discussed previously for Paramecium (Arnaiz et al PLoS Genet; Sellis et al 2021 PLoS Biol) and Tetrahymena IESs (Hamilton et al 2016 eLife). While this may also hold true in spirotrichous ciliates, the present manuscript proposes a completely new evolutionary scenario for IESs from scrambled genes. Here, Feng et al establish that scrambled genes of spirotrichous ciliates tend to be associated with local paralogy. They provide evidence supporting that IESs from scrambled genes may have evolved from paralogous MDSs.

      • Although I am more an expert in the molecular mechanisms involved in genome rearrangements, I feel that the work reported here should draw the attention of a broader audience interested in genome dynamics and evolution, beyond the specific field of spirotrichous ciliate biology.

    1. Author Response

      Reviewer #1 (Public Review):

      1) While the authors identify the suppressors in known genetic interactors (GIs) of the yeast SEC53, it is worth testing if the compensatory mutations are rewiring the GIs, thereby explaining the lack of comparable compensations observed in reconstituted strains. If altered GIs explain the suppression, then while yeast serves as an excellent tool to perform these assays, the human context of the disease may require a different set of genetic suppressors and, therefore, a different target than the yeast PGM1 ortholog.

      Our data show that pgm1 mutations alone greatly improve growth of sec53-V238M strains. Our data also indicate other pathways of compensation. Whether each of these compensatory mechanisms translates to humans is unknown. However, the observed enrichment of compensatory mutations in genes whose human homologs are associated with Type 1 CDG suggests that many of these genetic interactions are likely to be conserved.

      Also, are Sec53 and Pgm1 proteins directly interacting in yeast and whether these mutations are on the interaction interface?

      As we mention above, there is no support for a direct physical interaction between Sec53 and Pgm1.

      2) Based on the data obtained between pACT1 and pSEC53-driven expression of the SEC53 mutant alleles, the pattern of suppressors appears to be different. Authors report that the variants expressed from strong pACT1 promoters show more suppressors than those driven by native promoters. Is this a general trend in experimental evolution that slower-growing strains tend to show lesser suppressors? For example, on Page 6, line 154, "compensating for Sec53-F126L dimerization defects are rare or not easily accessible". The statement suggests that the authors did obtain suppressors that compensate for the dimerization defect. At the same time, while rare (also, are authors suggesting suppression of dimerization defect as in better dimerization?), the rate of obtaining suppressors seems to be linked to the severity of the fitness defects of the strains. The lack of suppressors may be a limitation of the evolution experiments. Indeed later in the manuscript, the authors noticed that while PGM1 suppressors obtained in V238M can also suppress F126L alleles, the suppression was not as efficient. Could it be that evolution experiments in slower-growing strains predominantly enrich suppressors in other pathways (i.e., not in the CDG orthologs) that restore the growth better and compete out the relatively weaker suppressors in PGM1? In fact, the authors report similar effects on Page 7, lines 204-210. These two paragraphs are contradictory and should be explained further.

      All of our sequencing was performed on strains with sec53 under the control of the pACT1 promoter. While we did not identify unique sec53-F126L suppressors, we cannot exclude that sec53-F126L suppressors exist, so we describe them as “rare or not easily accessible”. While it is possible that the slower growth rate of the sec53-F126L allele could impact the likelihood of observing suppressors, we think it is more likely due to the nature of the variant (dimerization defect versus stability defect) rather than growth rate. In other laboratory evolution experiments the same beneficial mutation typically has a greater effect in slower-growing backgrounds (for example: doi.org/10.1126/science.1250939).

      3) Authors report that the LOF of PGM1 compensates for the SEC53 mutations. However, the evolution experiments did not capture any LOFs in PGM1. The fitness comparisons in evolution experiments are different as many different genotypes compete in a mix. Therefore, the fitness assays in a clonal population may not represent these differences well. To test this argument, authors can try to mimic the evolution experiments by mixing two genotypes to check competitive fitness, like the co-culture of pgm1 suppressor obtained via evolution experiments with pgm1Δ.

      Though we did not perform a direct head-to-head competition between a pgm1 suppressor and a pgm1Δ, our data suggest that the pgm1 deletion would outcompete some of the lower-fitness suppressors. In the Discussion we speculate as to why we do not see deletion mutations: “Given that most of the evolved clones containing pgm1 mutations are more fit than the reconstructed strains, it is possible that other evolved mutations interact epistatically only with non-loss-of-function pgm1 mutations.” Though it is beyond the scope of the present manuscript, it would be possible to rerun the evolution experiment in sec53-V238M strains carrying either a pgm1 missense suppressor or a pgm1Δ. Under the hypothesis of additional interacting loci, only the strains carrying pgm1 missense suppressors would be more likely to acquire additional compensatory mutations.

      Reviewer #3 (Public Review):

      Vignogna et al. used yeast genetics, experimental evolution and biochemistry to tackle human congenital disorders of glycosylation (CDG), a disease mostly caused by mutations in PMM2. They took advantage of the observation that the budding yeast gene SEC53 is almost identical to human PMM2, and used experimental evolution to find interactors of SEC53/PMM2. They found an overrepresentation of mutations in genes corresponding to other human CDG genes, including PGM1. Genetic and biochemical characterizations of the pgm1 mutations were carried out. This work is solid, although authors did not reveal why reduction of pgm1 activity could compensate for defects of a particular mutant allele of sec53.

      Out of curiosity, if the authors were to simply focus on the preexisting mutations, would they have gotten the materials for most of the experiments in this article? In other words, how important is the experimental evolution?

      The evolution experiment was crucial as the specific pgm1 mutations we identified here have not been reported elsewhere, nor have the orthologous mutations been identified in human PGM1.

      A strain table with full genotypes is needed.

      We added a strain genotype table (Supplemental Dataset 2).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2022-01541

      Corresponding author(s): Hubert Hilbi

      1. General Statements

      Upon infection of eukaryotic host cells, Legionella pneumophila forms a unique compartment, the Legionella-containing vacuole (LCV). While the role of vesicle trafficking pathways for LCV formation has been quite extensively studied, the role of putative membrane contact sites (MCS) between the LCV and the ER has been barely addressed. In our study, we provide a comprehensive analysis of the localization and function of protein and lipid components of LCV-ER MCS in the genetically tractable amoeba Dictyostelium discoideum.

      We would like to thank the three reviewers for their thorough and constructive reviews. Overall, the reviewers state that the study is of interest to researchers in the field of Legionella and other intracellular pathogens (Reviewer 2), as well as to cell biologists (Reviewer 3). Reviewer 1 does not ask for additional experiments but is critical about the overall structure of the manuscript and the proteomics approach. As requested by the reviewer, we have substantially restructured the revised manuscript, now clearly outline the hypotheses put forward in the study, and have streamlined the proteomics data. Reviewer 2 asks for additional experiments to support our model of LCV-ER MCS. In the revised manuscript, we have included additional experiments addressing lipid exchange at the MCS, and we plan to perform further co-localization experiments. Reviewer 3 appreciates the comprehensive LCV proteomics and asks for only minor revisions, which we have incorporated in the revised version of the manuscript. We include below a point-by-point response to all the comments made by the reviewers.

      2. Description of the planned revisions

      Reviewer #2

      Major comment

      1) MCS contain protein complexes or a group of proteins, but the proteins here are studied in isolation and do not support the model shown in Figure 7. Co-localization studies of the putative LCV-ER MCS proteins are critical, especially given that the authors hypothesize the proteins are working together to modulate PI(4)P levels.

      Response: As suggested by the reviewer, we will perform additional co-localization experiments with MCS components. To this end, we will construct mCherry-Vap, and we will co-transfect the parental D. discoideum strain Ax3 with plasmids producing mCherry-Vap and OSBP8-GFP or GFP-OSBP11. Using these dually fluorescently labelled D. discoideum strains, the co-localization of Vap with the OSBPs will be assessed at 1, 2, and 8 h post infection. The data will be presented as fluorescence micrographs, and co-localization of Vap with the OSBPs will be quantified using Pearson’s correlation coefficient and fluorescence intensity profiles. The data will be outlined in the text (l. 258 ff.) and shown in the new Fig. 2 and Fig. S4.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1 (Evidence, reproducibility and clarity):

      In the manuscript by Vormittag, et al., the authors perform proteomics identification of proteins associated with the Legionella-containing vacuole (LCV) in the model amoeba Dictyostelium discoideum comparing WT to atlastin knockout mutants. The authors find approximately half the D. discoideum proteome associated with the LCV, but there was enrichment of some proteins on the WT relative to the mutant. They focus on proteins involved in forming membrane contact sites (MCS) that previously were shown to be important for expansion of the Chlamydia-containing vacuole. Most significant are the oxysterol binding proteins (OSBP) and VapA (similar to that seen in Chlamydia). The authors show differential association of these proteins with either the LCV or presumably the ER associated with the LCV. Using a linear scale over 8 days, they show that mutations in some of the MCS reduce yields in two of the OSPB knockout mutants and the growth rate of the vap mutant is slowed but ultimate yield is increased. Using some nice microscopy techniques, they measure LCV size, and the osbK mutant appears particular small relative to other strains, whereas the osbH mutant generates large vacuoles. This doesn't necessarily correlate with the PI4P quantities on the vacuoles (which is higher in all of them), but I am not totally sure how this is measured, and whether is it PI4P/pixel or PI4P/LCV. In all cases, this was reduced by Sac1 mutation. Surprisingly, even though there was uniform increase in PI4P in each of the mutants, loss of PI4P only affects localization of some of the proteins. Finally, in what seems to be a peripherally related experiment, the authors show that a pair of Legionella translocated effectors are required to maintain PI4P levels, although it is not clear how this is related to the other data in the manuscript.

      It is not clear from the manuscript if the authors are just cataloging things or trying to test a hypothesis. This is an extremely difficult manuscript to read and reconstruct what the authors showed. I really think that the only people who will understand what is written are people who are familiar with the work in Chlamydia starting in 2011 in Engel's and Derre's laboratories, which clearly showed that MCS and most specifically Vap/OSBPs are involved in vacuole expansion. If the authors could rewrite the manuscript along these lines, perhaps comparing their data to the Chlamydia data it would help a lot. Otherwise, I don't think anyone else will understand why they are focusing on these things. I don't recommend new experiments (although re-analyzing data is necessary), but the manuscript has to be taken apart and claims removed, and data be interpreted properly. Otherwise, the manuscript seems like just a clearing house for data.

      Response: Thank you for the concise summary of our data and for pointing out the need to restructure the manuscript and to clearly outline the hypotheses underlying the study. According to the reviewer’s suggestions, we have now restructured the manuscript. In the revised manuscript the story unfolds from the observation that the ER tightly associates with (isolated) LCVs, and the proteomics approach is used as a validation of the presence of MCS proteins at the LCV-ER MCS.

      As suggested by the reviewer, we now highlight the seminal work on Chlamydia by the Engel and Derré laboratories not in the Discussion section (as in the original version of the manuscript) but already in the Introduction section (l. 142-148). We believe that it makes a stronger case to start out an analysis of LCV-ER MCS with a Legionella-specific cell biological finding (LCV-ER association) and an unbiased proteomics approach, as compared to a more derivative and defensive approach starting out with what is known about Chlamydia.

      The reviewer’s comment “This is an extremely difficult manuscript to read” appears overly harsh and conflicts with the positive evaluation of Reviewer #2 and Reviewer #3. Finally, we respectfully disagree with the reviewer’s statement that experiments characterizing L. pneumophila effectors implicated in the formation and function of LCV-ER MCS are peripheral. These experiments significantly contribute to a mechanistic understanding of how L. pneumophila forms and exploits LCV-ER MCS, and they are central for studies on pathogen-host interactions. The studies are analogous to the work on Chlamydia effectors by the Engel and Derré laboratories, but the mode of action of Legionella and Chlamydia effectors is obviously different. Another important distinction of our work to the studies on Chlamydia is the use of the genetically tractable amoeba, D. discoideum, which allows an analysis of LCV-ER MCS by fluorescence microscopy at high spatial resolution.

      Specific comments

      1. The problems start with the first figure, in which the authors state that almost half the D. discoideum proteome is LCV-associated. I doubt that this is correct, and they should base this on some selective criterion. Furthermore in Fig. 1A, they show Venn diagrams for how they whittled this down, but the Supplemental Dataset gives us no clue on how this was done. I can only sit down myself with the dataset and try to figure that out, but that is an unreasonable expectation for the reader. The dataset provided should have a series of sheets, describing how the large protein set was whittled down and how they were sorted, so the reader can evaluate how robust the final results were. To me (at least), if they said: "look we got this surprising result that suggests MCS are involved in promoting LCV formation, and although this is well recognized in Chlamydia but poorly recognized in Legionella", that would be satisfactory to me.

      Response: According to the reviewer’s suggestions, we have now thoroughly re-structured the manuscript. In the revised manuscript the story unfolds from the observation that the ER tightly associates with LCVs in infected cells and with isolated LCVs. The proteomics approach is now used as a validation of the presence of MCS proteins at the LCV-ER MCS and relegated to the Supplementary Information section (former Fig. 1, now Fig. S3).

      For the proteomics analysis, all protein identifications have been filtered for robustness by applying a constant false discovery rate (FDR) of 0.01 at both the protein level and the peptide spectrum match (PSM) level, which is a commonly accepted threshold in the field. Moreover, two identified unique peptides were required for protein identification. The parallel application of both filter criteria results in very robust and reliable data sets. This is outlined in the Material and Methods section (l. 683-693).
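      The two filter criteria described above can be illustrated with a minimal sketch. This is not the analysis pipeline used in the study; the record layout, the field names, and the use of "less than or equal to" for the 0.01 threshold are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProteinID:
    """Hypothetical record for one protein identification (illustrative only)."""
    accession: str        # placeholder identifier, not a real database entry
    protein_fdr: float    # protein-level false discovery rate
    psm_fdr: float        # PSM-level false discovery rate
    unique_peptides: int  # number of unique peptides supporting the protein


def filter_robust(ids, max_fdr=0.01, min_unique_peptides=2):
    """Keep identifications that pass both FDR cut-offs and the peptide count."""
    return [
        p for p in ids
        if p.protein_fdr <= max_fdr
        and p.psm_fdr <= max_fdr
        and p.unique_peptides >= min_unique_peptides
    ]


# Made-up example records, chosen so that each filter rejects one entry.
example = [
    ProteinID("P1", 0.005, 0.008, 3),  # passes all criteria
    ProteinID("P2", 0.020, 0.005, 4),  # fails the protein-level FDR
    ProteinID("P3", 0.004, 0.030, 5),  # fails the PSM-level FDR
    ProteinID("P4", 0.001, 0.001, 1),  # fails the unique-peptide requirement
]
kept = filter_robust(example)
print([p.accession for p in kept])
```

      Applying the criteria jointly, rather than one after the other on different tables, makes it explicit that a protein must satisfy all of them to be retained.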

      In the data set of LCV-associated proteins, 2,434 D. discoideum proteins have been identified (Table S1). This is 18.5% of the total of 13,126 predicted D. discoideum proteins (UniprotKB) and considerably less than “almost half the D. discoideum proteome”, as stated by the reviewer. Moreover, 1,224 L. pneumophila proteins have been identified (among 3,024 predicted L. pneumophila proteins in the database). This is a reasonable number of proteins identified from an intracellular vacuolar pathogen, given the LCV isolation and proteomics methods applied. We now outline these findings more extensively in the Results section (l. 207-213). Moreover, to render Table S1 more reader-friendly, we added to the datasheet “All data” the datasheets “Dictyostelium”, “Legionella” and “Info”.
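      The proportions quoted above follow directly from the counts given in the text. As a simple arithmetic check (the helper name `percent` is ours, introduced only for this illustration):

```python
def percent(identified, predicted):
    """Return the identified fraction of a predicted proteome, in percent."""
    return 100.0 * identified / predicted


# Counts as stated in the response above.
dicty = percent(2434, 13126)      # D. discoideum proteins on LCVs
legionella = percent(1224, 3024)  # L. pneumophila proteins identified

print(f"D. discoideum: {dicty:.1f}% of the predicted proteome")
print(f"L. pneumophila: {legionella:.1f}% of the predicted proteome")
```

      The first value reproduces the 18.5% stated in the response; the second, about 40.5%, is derived here from the same counts and is not a figure quoted in the text.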

      The Venn diagram in Fig. S3A (previously Fig. 1A) does not show a subset of proteins “whittled down” from the entire proteomes, but simply summarizes LCV-associated proteins, which were either identified exclusively in the parental strain Ax3 but not in the Δsey1 mutant strain, or only in Δsey1 but not in Ax3, thus identifying possible candidates relevant for the LCV-ER MCS. This information is now outlined more clearly in the text (l. 238-241). Moreover, we now explicitly define in the Material and Methods section (l. 697-704) the “on” and “off” proteins shown in Fig. S3A.
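      The comparison summarized by the Venn diagram amounts to two set differences and one intersection. The sketch below assumes each strain's LCV proteome is available as a collection of protein identifiers; the identifiers and the function name are placeholders, and the mapping of the two exclusive sets onto the “on” and “off” labels is deliberately left open, since it is defined in the Material and Methods section rather than here.

```python
def compare_proteomes(parental, mutant):
    """Split identifiers into strain-exclusive and shared sets."""
    parental, mutant = set(parental), set(mutant)
    return {
        "only_parental": parental - mutant,  # found in Ax3 but not in the mutant
        "only_mutant": mutant - parental,    # found in the mutant but not in Ax3
        "shared": parental & mutant,         # found in both strains
    }


# Placeholder identifiers, not real protein accessions.
ax3_ids = ["A", "B", "C", "D"]
sey1_ids = ["C", "D", "E"]

result = compare_proteomes(ax3_ids, sey1_ids)
for name, members in result.items():
    print(name, sorted(members))
```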

      The overall rationale for the comparative proteomics approach was our previous finding that compared to the D. discoideum parental strain Ax3, the Δsey1 mutant strain accumulates less ER around LCVs (PMID: 28835546, 33583106). This finding suggests that formation of the LCV-ER MCS might be compromised in the Δsey1 mutant strain. This hypothesis is now outlined at the beginning of the Results paragraph (l. 204-207).

      I am clueless regarding how Fig. 6 fits with the rest of the manuscript. If this is about MCS, there is no demonstration these effectors are directly involved in MCS other than the somewhat diffuse argument that there is some correlative connection to PI4P levels, that I am not particularly convinced by.

      Response: The PtdIns(4)P gradient between two different cellular membranes is an intrinsic feature of MCS. To date, a quantification of PtdIns(4)P levels on LCVs in response to the presence or absence of specific L. pneumophila effectors is lacking. Accordingly, we opted for quantifying the PtdIns(4)P levels on LCVs in the presence and absence of an L. pneumophila effector putatively generating PtdIns(4)P on LCVs, the phosphoinositide 4-kinase LepB, or titrating PtdIns(4)P on LCVs, the PtdIns(4)P-binding ubiquitin ligase SidC. To address the concerns of Reviewer 1 and Reviewer 3 (see below), we now outline in detail the rationale to assess the role of LepB and SidC for MCS function (l. 385-387). Importantly, we now also provide data that at LCV-ER MCS PtdIns(4)P/cholesterol lipid exchange is functionally important (new Fig. 6 and Fig. S10). In the revised version of the manuscript, these new data precede the experiments with the L. pneumophila effectors, which should render our choice of effectors more comprehensible to the reader and improve the flow of the manuscript.

      Line 146 and associated paragraph. We don't need a catalog of proteins in narrative. There is more detail in the narrative than there is in the tables and figures, which would be a more appropriate way to present the data.

      Response: As suggested by the reviewer, we summarized the LCV-associated D. discoideum proteins and considerably reduced the list in the text (l. 214-230).

      Line 186. There is nothing wrong with pursuing MCS based on the idea that this was seen before with Chlamydia and you wanted to test if this was a previously unappreciated aspect of Legionella biology. I don't see the rationale based on the proteomics, partly because I don't understand how the proteomics dataset was parsed.

      Response: As suggested by the reviewer, we thoroughly re-structured the manuscript and now highlight the seminal work on Chlamydia by the Engel and Derré laboratories already in the Introduction section (not in the Discussion section as in the original version of the manuscript). We believe that it makes a stronger case to start out an analysis of LCV-ER MCS with a Legionella-specific cell biological finding (LCV-ER association) and an unbiased proteomics approach, as compared to a more derivative and defensive approach starting out with what is known about Chlamydia.

      Figure 3: These growth curves are super-weird. I am not used to looking at 8 days of logarithmic growth in a linear scale and seeing no (apparent) growth for 4 days. Considering all the microscopy data are performed in the first 18 hrs of infection, it’s hard to see how this is related to data at 8 days post infection. If this were plotted in logarithmic scale, as microbiologists are used to doing, then perhaps we could see a connection. Also, in some cases, it might be helpful to calculate a growth rate, because it’s possible the author may now see some effects by comparing logarithmic growth rates.

      Response: We have been characterizing growth of L. pneumophila in D. discoideum in several studies using growth curves with RFU vs. time plotted in linear scale (e.g., Finsel et al., 2013, Cell Host Microbe 14:38; Rothmeier et al., 2013, PLoS Pathog 9: e1003598; Swart et al., 2020, mBio 11: e00405-20). The D. discoideum-L. pneumophila infection model is peculiar, since the amoebae do not survive temperatures beyond 26 °C. This is substantially below the optimal growth temperature of L. pneumophila (35-40 °C). This means that, owing to the many genetic tools available, D. discoideum is an excellent model to investigate cell biological aspects of the infection at early time points (ca. 1-18 h p.i.), but the amoebae are not an optimal system to quantify several rounds of intracellular growth.

      Figure 2: The images don't necessarily show what the bar graphs show. In particular, look at Osp8. That image doesn't make sense to me.

      Response: The individual channels of the merged images in Fig. 1 (formerly Fig. 2) are shown in Fig. S2. By looking at the individual channels, it becomes clear that OSBP8-GFP co-localizes with calnexin-mCherry (overlapping signals), but not with P4C-mCherry or AmtA-mCherry (adjacent signals). Co-localization was quantified in a non-biased manner by Pearson’s correlation coefficient. To further visualize co-localization, we now also provide fluorescence intensity profiles for all confocal micrographs (amended Fig. 1).
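      For readers unfamiliar with the metric, Pearson’s correlation coefficient between two fluorescence channels can be sketched as follows. This is a generic illustration on made-up intensity values sampled along a line, not the image analysis used for Fig. 1; the variable names and numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long intensity lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant channel has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Made-up intensities along a line crossing a vacuole (arbitrary units).
green = [10, 50, 200, 180, 20]          # e.g. a GFP fusion
red_overlap = [12, 55, 190, 170, 25]    # red signal peaking at the same pixels
red_adjacent = [200, 180, 20, 10, 190]  # red signal peaking next to the green

r_overlap = pearson(green, red_overlap)
r_adjacent = pearson(green, red_adjacent)
print(f"overlapping signals: r = {r_overlap:.2f}")
print(f"adjacent signals:    r = {r_adjacent:.2f}")
```

      Overlapping signals give a coefficient close to 1, whereas signals that peak at neighbouring but distinct positions give a low or negative value, which is the distinction drawn above between overlapping and adjacent signals.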

      In summary, I think the authors hit on something that is probably important for Legionella biology, but it’s not clear what they want to show. They are very invested in connecting everything to PI4P levels, which may or may not be correct, but it seems to me that perhaps taking more care in showing the importance of the Vap/OSPB nexus in supporting Legionella growth should be the first priority.

      Response: Given the importance of the PtdIns(4)P gradient for lipid exchange at MCS, we believe it is justified to put considerable emphasis on this lipid. To further substantiate a functional role of PtdIns(4)P at LCV-ER MCS, we now also show that an increase in PtdIns(4)P at the LCV correlates with a decrease of cholesterol (new Fig. 6 and Fig. S10). The inverse correlation of these two lipids is in agreement with the notion that cholesterol is a counter lipid of PtdIns(4)P at LCV-ER MCS.

      It is not clear from the manuscript if the authors are just cataloging things or trying to test a hypothesis.

      Response: In the revised version of the manuscript, we put forward several specific hypotheses, which we then tested in our study (l. 152-155).

      If I understand Fig. 1, only one of the candidates (VapA) was verified as being more enriched in WT relative to atlastin mutants. This argues even more strongly that the authors have to describe their criteria for choosing these candidates.

      Response: As outlined above (specific point 1), we have now re-structured the manuscript according to the reviewer’s suggestions. In the revised manuscript the story unfolds from the observation that the ER tightly associates with LCVs in infected cells and with isolated LCVs. The proteomics approach is now used as a validation of the presence of MCS proteins at the LCV-ER MCS and relegated to the Supplementary Information section (formerly Fig. 1, now Fig. S3). We consider the proteomics approach a powerful hypothesis generator, and the experimental identification of several MCS proteins by proteomics validated the cell biological and bioinformatics insights.

      Reviewer #1 (Significance (Required)):

      As stated above, the manuscript can't decide if it’s about MCS or PI4P, and I would argue strongly that the emphasis on PI4P detracts from the manuscript, as well as its inability to draw connection to previous work that is likely to be important.

      Response: We respectfully disagree with the reviewer on this important point and hold that proteins as well as lipids are crucial functional determinants of MCS. The PtdIns(4)P gradient is a pivotal driver of lipid exchange at MCS. Therefore, we believe it is justified to put considerable emphasis on this lipid. In the Introduction section, we now specify several hypotheses on the localization and function of lipids and proteins at LCV-ER MCS (l. 152-155). Moreover, we now also refer to the previous work on Chlamydia MCS in the Introduction section (l. 142-148).

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary of paper and major findings

      Membrane contact sites (MCS) are locations where two membranes are in close proximity (10-80nm). MCS have a defined protein composition which tether the membranes together and function in small molecule and lipid exchange. Typically, MCS proteins contain structural (e.g., tethers) and functional (e.g., exchange lipids) proteins, in addition to proteins which regulate the structure and function of the MCS. In this manuscript, Vormittag et al describe protein components of MCS between the Legionella-containing vacuole (LCV) and the host endoplasmic reticulum (ER) in the amoeba Dictyostelium. Proteomics of isolated LCVs followed by microscopy analysis identified several proteins which localize to either the LCV-associated ER (OSBP8), the LCV (OSBP11), or both (VAP and Sac1). The mammalian homologs of these proteins have been shown to play important roles in ER MCS, with VAP serving a structural role, Sac1 a PI(4P) phosphatase regulating PI(4)P levels, and OSBP8 and OSBP11 lipid transferring proteins. Given the importance of PI(4)P in formation and maintenance of the Legionella-containing vacuole, the authors used dicty mutants to determine the importance of these proteins in bacterial growth, LCV size, and PI(4)P levels on the LCV. While VAP and OSBP11 appear to promote Legionella infection, OSBP8 appears to restriction infection, although all identified MCS components appear to play a role in decreasing PI(4P) shortly after infection. Finally, VAP and OSBP8 localization to the LCV is PI(4)P-dependent. Overall, the authors conclude that these MCS components play a role in modulating PI(4)P levels on the LCV.

      Overall, this is an interesting study further exploring the role of PI(4)P in LCV-ER interactions, and how PI(4)P levels are regulated. The figures are clearly presented, there is an impressive amount of data, and rigor appears to be strong with appropriate replicates and statistical analysis. The phenotypes are often mild, but the authors are careful to not overinterpret the data. While this is an interesting study, additional experiments are necessary to support the overall model and the text needs to put the findings into the larger context.

      Response: We would like to thank the reviewer for this positive and constructive assessment. We performed and planned additional experiments to further strengthen the study and support our model.

      Major comments

      1) MCS contain protein complexes or a group of proteins, but the proteins here are studied in isolation and do not support the model shown in Figure 7. Co-localization studies of the putative LCV-ER MCS proteins are critical, especially given that the authors hypothesize the proteins are working together to modulate PI(4)P levels.

      Response: To further explore the possible interactions between Vap and OSBP proteins, we plan co-localization experiments using D. discoideum strains producing mCherry-Vap and either OSBP8-GFP or GFP-OSBP11, as outlined above (Section 2, new Fig. 2 and Fig. S4).

      Moreover, we included additional data on PtdIns(4)P/cholesterol lipid exchange (Fig. 6 and Fig. S10), which have been incorporated into the model (amended Fig. 8). Based on the available data, we do not postulate direct interactions between Vap and OSBP proteins. The previous model, which now has been amended, might have been misleading in that respect.

      2) The phenotypes are relatively mild, suggesting functional redundancy. Double knockouts, particularly in VAP and OSBP11, may generate a stronger phenotype that better supports the hypothesis and demonstrate the importance during infection.

      Response: Thank you for this interesting suggestion. Please see Section 4 below for our arguments, why we believe that this intriguing approach is beyond the scope of the current study.

      3) The timing of PI(4)P and MCS protein localization during infection is critical to understanding how MCS might be functioning. Based on Figure 6C, PI(4)P levels decrease on the LCV during infection, but this is not fully explained in the context of what's known in the literature and what is observed the previous figures. How does localization of different MCS components change during infection, and does this correlate with the changes in growth or LCV size? A better description in the Introduction on LCV-associated PI(4)P levels would be beneficial in orienting the reader to why PI(4)P levels are modulated.

      Response: As suggested by the reviewer, we added to the Introduction section more detail about the kinetics of PtdIns(4)P accumulation on LCVs (l. 65-71), and we discuss the limited spatial resolution of the IFC approach (formerly Fig. 6C, now Fig. 7C; l. 407-408). Importantly, we also provide new data showing that within 2 h p.i. an increase in PtdIns(4)P at the LCV coincides with a decrease of cholesterol (new Fig. 6 and Fig. S10). The new data is put into this context in the Discussion section (l. 449-454).

      4) OSW-1 has other targets besides OSBPs, and depleting Sac1 and Arf1 in A549 cells is not specifically targeting the MCS, as these proteins have other functions. The data in mammalian cells is not convincing and should be removed.

      Response: As suggested by the reviewer, we removed the data on depleting Sac1 in A549 cells (Fig. 3D, and Fig. S6BC). We propose to leave the pharmacological data on inhibition of L. pneumophila replication by OSW-1 in the manuscript, but to clearly point out that OSW-1 has other targets besides OSBPs (l. 297-299).

      Minor comments

      1) Figure 2 is missing details on number of experiments/replicates and statistical analysis.

      Response: Thank you for having noted this oversight. The number of independent experiments and statistical analysis have now been added to Fig. 1 (formerly Fig. 2) (l. 1009-1010).

      2) Can the authors hypothesize why VAP promotes growth early during infection, but appears to restrict growth at later timepoints (Figure 3A)?

      Response: Thank you for raising this intriguing point. The opposite effects on growth of Vap at early and later timepoints during infection might be explained by interactions with antagonistic OSBPs. Vap likely co-localizes with OSBP8 as well as with OSBP11 on the ER or the limiting LCV membrane, respectively (experiment to be performed; Fig. 2 and Fig. S4). The absence of OSBP8 (ΔosbH) or OSBP11 (ΔosbK) causes larger or smaller LCVs, and increased or reduced intracellular replication of L. pneumophila, respectively. Thus, OSBP8 seems to restrict and OSBP11 seems to promote intracellular replication. Accordingly, if Vap affects or interacts with OSBP11 early and with OSBP8 later during infection, opposite effects on growth of Vap might be explained. These reflections are now outlined in the Discussion section (l. 431-441).

      3) There is a large amount of data, which makes it difficult at times to follow. I suggest adding additional information to table 1, including LCV size and whether or not the protein's localization is PI(4)P-dependent.

      Response: Thank you for this suggestion. As proposed by the reviewer, we added the additional information to Table 1 (PtdIns(4)P-dependency of protein localization, LCV size).

      Reviewer #2 (Significance (Required)):

      Membrane contact sites during bacterial infection are a growing area of research. In Legionella, several papers point to the presence of MCS. Further, PI(4)P is known to be an important component on the LCV. This paper shows that MCS protein members are important in modulating LCV PI(4)P levels. The model as presented is not completely supported by the data as co-localization experiments are needed, along with more detailed analysis of how PI(4)P levels change over infection and the role of these MCS proteins in that process. This study will be of interest to those studying Legionella and other vacuolar pathogens. Area of expertise is on membrane contact sites and lipid biology.

      Response: Thank you very much for the overall positive and constructive evaluation.

      Reviewer #3 (Evidence, reproducibility and clarity):

      The authors perform proteomic analysis of Legionella-containing vacuoles. The observe association of membrane contact site (MCS) proteins including VAP, OSBPs, and Sac1. Functional data indicates that these proteins contribute to PI4P levels on LCVs and their ability to acquire lipid from the ER to enable LCV expansion/stability. Overall, the paper is an important contribution to the field and builds upon a growing appreciation for MCS in establishment of intracellular niches by microbial pathogens. I have only minor comments for the authors consideration.

      Response: We would like to thank the reviewer for this enthusiastic assessment.

      Minor comments:

      -line 145, "This approach revealed 3658 host or bacterial proteins identified on LCVs...". This number seems high... how does it compare to prior proteomic studies of pathogen-containing vacuoles?

      Response: As outlined above (Reviewer 1, point 1), we have now changed the text (l. 207-213): “This approach revealed 2,434 LCV-associated D. discoideum proteins (Table S1), of a total of 13,126 predicted D. discoideum proteins (UniprotKB). Moreover, 1,224 L. pneumophila proteins were identified (among 3,024 predicted L. pneumophila proteins), which is a reasonable number of proteins identified from an intracellular bacterial pathogen within its vacuole with the proteomics methods applied (Herweg et al., 2015; Schmölders et al., 2017).”

      • line 160. Can the authors comment on why mitochondrial proteins are observed in their proteomic analysis? Are these non-specific background signals or reflecting relevant organelle contact?

      Response: The dynamics of mitochondrial interactions with LCVs and the effects of L. pneumophila infection on mitochondrial functions have been thoroughly analyzed (PMID: 28867389). This seminal work is now cited in the text (l. 227-230).

      • line 268. It is reported that LCVs are smaller with MCS disruption at 2 and 8 h p,i.. Does this also lead to instability or rupture of LCVs? And related to this why would LCVs be bigger at 16h with MCS disruption?

      Response: MCS components affect LCV size positively or negatively. For example, the absence of OSBP8 (ΔosbH) or OSBP11 (ΔosbK) causes larger or smaller LCVs, and increased or reduced intracellular replication of L. pneumophila, respectively. However, as outlined in the Discussion section (l. 442-454), we believe that the relatively small size likely reflects a structural remodeling of the pathogen vacuole rather than a substantial LCV expansion. LCV rupture takes place only very late in the infection cycle (beyond 48 h) and is followed by lysis of the host amoeba (PMID: 34314090).

      • lines 288 and 299 "data not shown" this data should be included in a supplemental figure.

      Response: The data on the localization of GFP-Sac1 and GFP-Sac1_ΔTMD are included in Figs. 1A, 4A, 5AD, S2A, S7A, and S9 (l. 328, l. 339).

      • line 327. The authors choose to focus on the role of LepB and SidC in MCS modulation. The rationale for choosing these two amongst the ca 330 effectors was not given. Were other effectors also examined?

      Response: LepB and SidC were chosen due to their activities producing or titrating PtdIns(4)P, respectively, and their LCV localization. This rationale is now given in the text (l. 385-387). No other effectors were examined up to this point.

      Reviewer #3 (Significance (Required)):

      Comprehensive LCV proteomics of interest to field of cellular microbiology. Studies of MCS broadly relevant to cell biologists.

      Response: Thank you very much for the overall very positive evaluation.

      4. Description of analyses that authors prefer not to carry out

      Reviewer #2

      Major comment

      2) The phenotypes are relatively mild, suggesting functional redundancy. Double knockouts, particularly in VAP and OSBP11, may generate a stronger phenotype that better supports the hypothesis and demonstrate the importance during infection.

      Response: Thank you for raising the important question of functional redundancy. We now outline this concept in the Discussion section (l. 427-429). A further analysis of the genetic and biochemical relationship between Vap and OSBP11 or OSBP8 is without doubt one of the most interesting aspects of further studies on the topic of LCV-ER MCS.

      The construction of a D. discoideum double mutant strain is time-consuming and usually takes 1-2 months. Provided that a Vap/OSBP11 double deletion mutant strain is viable and can be generated, it takes another 1-2 months to thoroughly characterize the strain regarding intracellular replication of L. pneumophila (Fig. 3), LCV size (Fig. 4), and PtdIns(4)P score (Fig. 5). Moreover, there is already a large amount of data in the paper (to quote Reviewer #2), and therefore, adding new data might make it even harder to follow the story and focus on the key points. Finally, we believe that the planned colocalization experiments (Reviewer #2, point 1) and the new data on lipid exchange kinetics (new Fig. 6 and Fig. S10) fit the current story more coherently, and thus, are more straightforward and informative than the generation and characterization of double mutant strains. For these reasons, we believe that the generation and characterization of D. discoideum double mutant strains is beyond the scope of the current study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In the manuscript by Vormittag, et al., the authors perform proteomics identification of proteins associated with the Legionella-containing vacuole (LCV) in the model amoeba Dictyostelium discoideum comparing WT to atlastin knockout mutants. The authors find approximately half the D. discoideum proteome associated with the LCV, but there was enrichment of some proteins on the WT relative to the mutant. They focus on proteins involved in forming membrane contact sites (MCS) that previously were shown to be important for expansion of the Chlamydia-containing vacuole. Most significant are the oxysterol binding proteins (OSBP) and VapA (similar to that seen in Chlamydia). The authors show differential association of these proteins with either the LCV or presumably the ER associated with the LCV. Using a linear scale over 8 days, they show that mutations in some of the MCS reduce yields in two of the OSBP knockout mutants and the growth rate of the vap mutant is slowed but ultimate yield is increased. Using some nice microscopy techniques, they measure LCV size, and the osbK mutant appears particularly small relative to other strains, whereas the osbH mutant generates large vacuoles. This doesn't necessarily correlate with the PI4P quantities on the vacuoles (which are higher in all of them), but I am not totally sure how this is measured, and whether it is PI4P/pixel or PI4P/LCV. In all cases, this was reduced by Sac1 mutation. Surprisingly, even though there was a uniform increase in PI4P in each of the mutants, loss of PI4P only affects localization of some of the proteins. Finally, in what seems to be a peripherally related experiment, the authors show that a pair of Legionella translocated effectors are required to maintain PI4P levels, although it is not clear how this is related to the other data in the manuscript.

      It is not clear from the manuscript if the authors are just cataloging things or trying to test a hypothesis. This is an extremely difficult manuscript to read and reconstruct what the authors showed. I really think that the only people who will understand what is written are people who are familiar with the work in Chlamydia starting in 2011 in Engel's and Derre's laboratories, which clearly showed that MCS and most specifically Vap/OSBPs are involved in vacuole expansion. If the authors could rewrite the manuscript along these lines, perhaps comparing their data to the Chlamydia data, it would help a lot. Otherwise, I don't think anyone else will understand why they are focusing on these things. I don't recommend new experiments (although re-analyzing data is necessary), but the manuscript has to be taken apart and claims removed, and data be interpreted properly. Otherwise, the manuscript seems like just a clearing house for data.

      1. The problems start with the first figure, in which the authors state that almost half the D. discoideum proteome is LCV-associated. I doubt that this is correct, and they should base this on some selective criterion. Furthermore in Fig. 1A, they show Venn diagrams for how they whittled this down, but the Supplemental Dataset gives us no clue on how this was done. I can only sit down myself with the dataset and try to figure that out, but that is an unreasonable expectation for the reader. The dataset provided should have a series of sheets, describing how the large protein set was whittled down and how they were sorted, so the reader can evaluate how robust the final results were. To me (at least), if they said: "look we got this surprising result that suggests MCS are involved in promoting LCV formation, and although this is well recognized in Chlamydia but poorly recognized in Legionella", that would be satisfactory to me.
      2. I am clueless regarding how Fig. 6 fits with the rest of the manuscript. If this is about MCS, there is no demonstration these effectors are directly involved in MCS other than the somewhat diffuse argument that there is some correlative connection to PI4P levels, that I am not particularly convinced by.
      3. Line 146 and associated paragraph. We don't need a catalog of proteins in narrative. There is more detail in the narrative than there is in the tables and figures, which would be a more appropriate way to present the data.
      4. Line 186. There is nothing wrong with pursuing MCS based on the idea that this was seen before with Chlamydia and you wanted to test if this was a previously unappreciated aspect of Legionella biology. I don't see the rationale based on the proteomics, partly because I don't understand how the proteomics dataset was parsed.
      5. Figure 3: These growth curves are super-weird. I am not used to looking at 8 days of logarithmic growth in a linear scale, and seeing no (apparent) growth for 4 days. Considering all the microscopy data are performed in the first 18 hrs of infection, it's hard to see how this is related to data at 8 days post infection. If this were plotted in logarithmic scale, as microbiologists are used to doing, then perhaps we could see a connection. Also, in some cases, it might be helpful to calculate a growth rate, because it's possible the authors may now see some effects by comparing logarithmic growth rates.
      6. Figure 2: The images don't necessarily show what the bar graphs show. In particular, look at OSBP8. That image doesn't make sense to me.
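
      The growth-rate comparison suggested in point 5 could be sketched as follows. This is a minimal illustration only, not the analysis used in the manuscript: it assumes growth is exponential over the chosen time window, and the function names and the example values are hypothetical. It fits a straight line to the natural logarithm of the readout against time, so that the slope is the growth rate.

```python
import math

def exponential_growth_rate(times_h, counts):
    """Least-squares slope of ln(count) versus time, in units of 1/hour.

    Assumes exponential growth over the supplied window (hypothetical helper).
    """
    if len(times_h) != len(counts) or len(times_h) < 2:
        raise ValueError("need at least two (time, count) pairs of equal length")
    if any(c <= 0 for c in counts):
        raise ValueError("counts must be positive to take logarithms")
    logs = [math.log(c) for c in counts]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope: Sxy / Sxx
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx

def doubling_time(rate_per_h):
    """Doubling time in hours for a positive exponential growth rate."""
    return math.log(2) / rate_per_h

# Made-up example: a readout that doubles every 24 h
times = [0, 24, 48, 72, 96]
counts = [100 * 2 ** (t / 24) for t in times]
rate = exponential_growth_rate(times, counts)
print(rate, doubling_time(rate))
```

      Comparing the fitted rates between strains, restricted to the window in which each curve is log-linear, is one way to make differences visible that a linear-scale plot hides.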

      In summary, I think the authors hit on something that is probably important for Legionella biology, but it's not clear what they want to show. They are very invested in connecting everything to PI4P levels, which may or may not be correct, but it seems to me that perhaps taking more care in showing the importance of the Vap/OSBP nexus in supporting Legionella growth should be the first priority.

      If I understand Fig. 1, only one of the candidates (VapA) was verified as being more enriched in WT relative to atlastin mutants. This argues even more strongly that the authors have to describe their criteria for choosing these candidates.

      Significance

      As stated above, the manuscript can't decide if it's about MCS or PI4P, and I would argue strongly that the emphasis on PI4P detracts from the manuscript, as does its inability to draw connections to previous work that is likely to be important.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This paper examines the formation and repair of micronuclei in non-cancerous cells, specifically in mouse embryonic fibroblasts. This work was performed completely in culture and used a combination of western blot, confocal and superresolution microscopy to assess the contents of micronuclei over a repair period of 5 hours after 2 hours of induction of double strand breaks by treatment with etoposide. The authors found that the bodies colocalised with LC3, Beclin 1 and lysosomes suggestive of autophagy. However no evidence of autophagic flux has been demonstrated.

      Major issues are as follows:

      Figure 2

      A - Any sense of the autophagic flux? LC3B - I and LC3B - II seem to be in equal quantities most of the time. Maybe using the tandem LC3 in this system could provide further insight. Also remove the violin plots from this graph and from G and H, as there are too few data points.

      Thank you for your comment. We have evidence of a functional autophagic flux, since we observed an increasing number of acidic vesicles stained with Lysotracker in response to DNA damage, which were reduced after DNA repair. Some of the micronuclei were also co-stained with Lysotracker, suggesting their lysosomal degradation. We reorganized the data in the revised Figure 2A to better communicate these observations. We reproduce here the dynamics of Lysotracker staining; please note the increase in the abundance of acidic vesicles after 2 h of DNA damage. Further evidence of activation of functional autophagy is the dynamic intracellular distribution of both LC3 and BECN1, indicative of autophagy induction. Please note in revised Figure 2A that the number of LC3-surrounded vesicles increases after 2 h of DNA damage and diminishes when DNA is repaired. BECN1 in control MEFs is highly concentrated inside the nucleus, predominantly at the nucleolus, and after DNA damage it redistributes towards the cytoplasm. Finally, after DNA repair, BECN1 appears highly concentrated at the nucleus again. These dynamic changes correlate with autophagosome formation and successful fusion with lysosomes. In the revised manuscript we removed the violin plot as suggested. Since the elimination of nuclear components occurs in a subset of cells, the role of the autophagic machinery needs to be analyzed cell by cell. We also considered it better to remove the Western blot, as an analysis of the whole population does not provide information relevant for this study.

      • Can you reduce the brightness in the merge image, as I cannot see DAPI nor a convincing Beclin-1/LC3 co-localisation.

      Thank you for the observation. We improved the quality of the images and reorganized Figure 2 to convincingly show BECN1 and LC3 co-localization, together with Lysotracker, in nuclear alterations (buds and micronuclei). We modified the results text accordingly.

      • Although the data are convincing, it would be clearer if the brightness of the merge image was reduced.

      Thank you for your comment. We improved the images shown; these data are now integrated in the new Figure 2A.

      • Is the significant result the difference between 5h R Control si and 5h R Atg7? If so, there is no significant change in micronuclei at the same time point. Can you explain this disconnect? Are the buds being degraded prior to becoming micronuclei?

      That is correct: we found no statistically significant difference in the number of micronuclei formed upon silencing Atg7, although there was a trend towards a reduction. To consolidate the role of autophagy in nuclear bud and micronuclei formation, we studied Atg4-/- MEFs. We confirmed a statistically significant reduction of bud formation when autophagy is impaired (new Figure 2G). However, we observed that the number of micronuclei increased after 2 h of DNA damage in Atg4-/- MEFs, suggesting that autophagy does not contribute to micronuclei formation but to their elimination. Together, our results suggest that the origins of buds and micronuclei are mechanistically different. A difference in the biogenesis of buds and micronuclei has been previously suggested from studies of cells cultured under strong stress conditions that induce DNA amplification, as well as of cells under folic acid deficiency. While interstitial DNA without telomeres was more prevalent in buds than in micronuclei, telomeric DNA was more frequently observed in micronuclei (Fenech et al. 2011, Mutagenesis 26:125-132). We agree with the reviewer that it seems that not all the buds become micronuclei.

      Figure 3 A - nice microscopy showing the co-localisation of TOP2A and LC3-GFP. I'm interested in DAPI being on some bodies and not others. Do you have any sense of the dynamics of this?

      Thank you for the interesting question. Since removal of nuclear alterations such as nuclear buds and micronuclei is a very dynamic process, the damaged nuclear material we detect in the cytoplasm is at different stages of degradation. Nucleases could be degrading DNA in micronuclei. Another possible explanation for the lack of DAPI signal in some micronuclei containing TOP2A and GFP-LC3 is that TOP2A could be expelled from the nucleus with undetectable fragments of DNA or even without DNA, as a renewal process. We believe that nuclear buds can form without extruding DNA in some cases, perhaps to modulate proteostasis in addition to protecting genome stability. In the revised manuscript we discuss this possibility further.

      G - c shows a strand of mostly TOP2B coming from the nucleus. Is there any evidence that this occurs using either confocal microscopy or super resolution approaches? Could you try Z-stack to find these?

      Thank you for the suggestion. We analyzed Z-stack images and also tried to observe it by immunofluorescence. We could detect some tubular signal connecting the nucleolus with a micronucleus containing TOP2B and BECN1 (arrowhead in Fig. 3B, reproduced below), although we cannot be certain that we are detecting the same nuclear extrusion mechanism by electron microscopy as by immunofluorescence.

      Figure 4 C - Is there a significant increase in FBL-negative bodies? This would make sense if FBL is being degraded in the micronuclei during the repair process.

      We found that the number of micronuclei without FBL increased with statistical significance by two-way ANOVA followed by Dunnett's multiple comparison test (P=0.463 comparing cells after 2 h of DNA damage with control cells; P=0.0017 comparing cells after 5 h of DNA repair with untreated cells; n=5). We agree with the reviewer that a possible explanation is that FBL is being degraded in micronuclei during the repair process. It is also possible, however, that the nucleolus is less sensitive to Etoposide poisoning, or that the nucleolar DDR is mechanistically different.

      • Would it be possible to increase the n of these experiments to confirm either no change in FBL/LC3 co-loc, or evidence of increase?

      Thank you for the suggestion. We repeated the experiment two more times to increase the n to 5. We found no statistically significant difference in the number of nuclear buds or micronuclei containing both FBL and LC3 during DNA damage and repair. Therefore, it seems that the release of nucleolar components is not enhanced by Etoposide-induced DSB, suggesting that nucleolar DDR is a unique response, independent of DDR elsewhere in the genome (reviewed in Nucleic Acids Research, 2020, Vol. 48, No. 17, 9449–9461, doi: 10.1093/nar/gkaa713).

      Minor issues:

      Figure 4 and 5 legends are in a different font.

      Thank you. We corrected the font in the current manuscript.

      Reviewer #1 (Significance (Required)):

      There is little specific data on the role of autophagy in clearing micronuclei in cancer cells, so this may be suggestive of a new mechanism that occurs during normal cellular homeostasis. There are known links between lamin A defects and the formation of micronuclei, but not explicitly that the micronuclei are also Lamin A positive. It is likely that analogous processes occur in both cancer and non-cancer, so the impact of these data is not clear to me. This paper may be of interest to researchers interested in nuclear structure and DNA damage, but based on the data presented the significance is limited.

      The significance of the present work is to discover that autophagy is relevant both during physiological DNA damage and in response to an exogenous DNA damaging agent, to extrude damaged DNA, TOP2cc and Fibrillarin from the nucleus. This knowledge is relevant since insufficiencies in autophagy imply a risk of genomic instability, which in turn could drive the cell into a senescent or malignant state. We present data showing that autophagy regulates the dynamic formation and elimination of nuclear buds and micronuclei in a mechanistically differentiated way. While autophagy contributes to nuclear bud formation, it is necessary for micronuclei elimination. Our data suggest that nucleophagy could also be a mechanism to alleviate basal nucleolar stress. As the reviewer noticed, some micronuclei did not have DNA. It is conceivable then that nuclear buds and micronuclei also form for a proteostatic function, not necessarily involving DNA damage elimination. We believe the significance of our work contributes to our understanding of the cell, as well as to cancer research. Whether common mechanisms between cancerous and normal cells occur is relevant to know, to consider the specificity of potential therapeutic approaches.

      I don't have sufficient expertise to evaluate the super resolution microscopy beyond assessing the images.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Peer review of the manuscript with the number RC-2021-01181 by Muciño-Hernandez G et. al. at Review Commons and with the title "Nucleophagy contributes to genome stability though TOP2cc and nucleolar components degradation"

      1. Summary Muciño-Hernandez G et. al. show in this manuscript that mouse embryonic fibroblasts (MEFs) have basal levels of nuclear buds and micronuclei, which are indicators of genomic DNA damage. These basal levels of nuclear buds and micronuclei in MEFs increased after Etoposide treatment, which is known to induce DNA Double Strand Breaks (DSB). Interestingly, the nuclear buds and micronuclei co-localize with markers for nucleophagy (BECN1 and LC3) and acidic vesicles, suggesting that they are cleared by nucleophagy. The authors propose that basal levels of nucleophagy clear basal levels of genomic DNA damage that occurs as a result of DNA-dependent biological processes in the cell nucleus, thereby contributing to nuclear stability of MEFs under physiological conditions. These basal levels of nucleophagy increase after the action of factors that induce DNA damage and nuclear stress. The concepts proposed by Muciño-Hernandez G et. al. are novel, since most of the current published data on nucleophagy related to DNA damage have been obtained under pathological conditions, e.g. implementing cancer cells.

      The authors use in their manuscript various molecular biology techniques to obtain data that support their claims, including Western Blot analysis of protein extracts from MEFs, immunostaining on MEFs and neutral comet assays, complemented with state of the art imaging techniques, such as confocal microscopy, immunoelectron microscopy and super resolution microscopy. The quality of the data is sound. The structure of the manuscript supports the understanding of the reader. However, I would like to suggest several improvements that will help to increase the quality of the manuscript, so that it fits the standards of articles recently published in journals affiliated to Review Commons, such as the Journal of Cell Biology, the EMBO Journal or eLife.

      2. Major comments

      2.1 The authors have to improve the description of the results. Especially the description of those Figure panels containing plots that were generated using data from several experiments has to be improved.

      One example is the description of Figure 1D, which is in lines 137-151 of the current version of the manuscript. Whereas the authors describe in lines 137-147 observations related to representative pictures of confocal microscopy after immunostaining presented in Figure 1D (left), the description of the quantification from 9 independent experiments presented in the plots in Figure 1D (right) is relatively brief in lines 147-150, without mentioning any of the values used for creating the plots.

      "Interestingly, while the frequency of nuclear buds gradually increased after DNA damage and during DNA repair, the frequency of micronuclei also increased after DNA damage, but diminished upon DNA repair."

      The other plots presented in the different figure panels across the manuscript are described in a similar manner. I would like to suggest to the authors to improve their manuscript by including, in the description of their results, the values that were used for the generation of the plots presented in the manuscript. For example, in the specific case of Figure 1D above:

      "Interestingly, the percentage of MEFs with nuclear buds gradually increased from XY% (± XY SD) in control non-treated (Ctrl) MEFs to XY% (± XY SD; P=XY) after 2 h Etoposide-induced DSB in MEFs and XY% (± XY SD; P=XY) after DNA repair takes place in MEFs 5 h after stopping Etoposide treatment (Figure 1D, right). In contrast, the percentage of MEFs with micronuclei significantly increased from XY% (± XY SD) in Ctrl MEFs to XY% (± XY SD; P=XY) after 2 h Etoposide-induced DSB, whereas it was reduced to XY% (± XY SD; P=XY) 5 h after stopping Etoposide treatment (Figure 1D, right)."

      Descriptions of the plots as mentioned above will make the text more intuitive for the reader, and they will make it possible to read the Results Section without switching to the Figure Legends or the Material and Methods Section or to Supplementary Files. Even though the representative pictures from different microscopy techniques presented in the manuscript are of good quality and support the claims of the authors, it is important to mention that the quantifications presented in the plots demonstrate the statistical significance of these representative pictures. Thus, the authors should consistently include in the manuscript, during the description of their results, all the information (mean values, standard error of the means, P values, n values, etc.) that supports their interpretation of the results and demonstrates the statistical significance of their claims.

      Thank you for your clear and valuable advice. We followed it and in the revised manuscript we included the data in the results section.

      2.2 Following a similar line of argumentation as in the previous point, the authors should provide as Supplementary Material an Excel file containing a statistical summary, including all statistical relevant information from each one of the plots presented in each Figure panel, such as n values, P values, Test implemented, values used for the plots, numbers of experiments, etc. The information could be organized in the Excel file in different data sheets according to the Figure panels, in order that the reader can easily navigate through the data. In the current version of the manuscript, one cannot find the values used for the generation of the plots presented in the manuscript in any of the submitted files.

      Thank you for this suggestion. We have included in Table S1 an Excel file with a data sheet for each Figure panel, containing all the data collected and the statistical analysis performed.

      Minor comments

      3.1 In general, prior studies were appropriately referenced. Only a few references have to be added.

      Line 48: Add to the already included reference "Dobersch et al., 2021" also the reference Singh et al., 2015 PMID 26045162.

      Thank you, we added this reference.

      Line 53: Add the corresponding reference after the word "respectively".

      We added the corresponding reference.

      Line 82: Add the corresponding reference after the word "them".

      We added the corresponding reference.

      Line 125: Add the corresponding reference after the word "cells".

      We added the corresponding reference.

      Line 130: The expression "...by analyzing the recruitment of the phosphorylated histone γH2AX..." is the first time that the authors mention in the manuscript the DNA damage marker γH2AX. I suggest that it is better introduced as " ... by analyzing the recruitment of the DNA damage marker γH2AX (histone variant H2A.X phosphorylated at serine 139, Rogakou EP, et al., 1998, PMID 9488723) to DSB sites."

      Thank you very much for your suggestion. In the revised manuscript we corrected the text as suggested.

      Line 199: Add the corresponding reference after the word "formation".

      We added the corresponding reference.

      Line 205: Add the corresponding reference after the word "cells".

      We added the corresponding reference.

      3.2 The use of the English language is appropriate throughout the manuscript. However, there are minor errors in the use of punctuation marks and prepositions, as well as typos. I will list some of them below. However, I would like to recommend that the manuscript be corrected by a native English speaker.

      Thank you for your careful review of our manuscript. We corrected all the errors listed. A colleague proficient in English has reviewed the revised manuscript.

      Line 41: "...and reproductive systems; genome instability also..." the semicolon can be replaced by a period.

      Line 43: "Since early in development DNA is under constant endogenous..." between "development" and "DNA" there should be a comma.

      The sentence in lines 53-55 has to be rephrased.

      Lines 62-63: the expression "...throughout life." should be substituted.

      Line 70: The abbreviation "rDNA" has to be explained the first time that it is used.

      Lines 81-82: It has to be explained for the scientist that is not specialized in the field of nucleophagy, how the integrity of the genome is threatened by micronuclei and nuclei-derived material.

      Lines 106-110: The sentence is long. It would be easier to understand for the reader if this sentence is divided into two sentences.

      Lines 121-122: The subtitle should be rephrased.

      Lines 132-138: The sentence is long. It would be easier to understand for the reader if this sentence is divided into two sentences, e.g. with a period before the word "hence".

      Lines 143-144: "... in a subpopulation of healthy, untreated cells...". The interpretation of "healthy" might be subjective. I would like to suggest substituting in the complete manuscript the word "healthy" by "control".

      Line 163: The abbreviation for γH2AX was already introduced in line 130.

      Line 182: A comma after "cell lines" is missing.

      Line 183: delete "either".

      Lines 190-194: The sentence is long. It would be easier to understand for the reader if this sentence is divided into two sentences, e.g. with a period after the word "decreased" in line 191.

      Line 218: I assume that instead of "bus", it should be "buds".

      Line 220: I assume that instead of "iRNA", it should be "siRNA". In addition, it is the first time that the abbreviation is used. Thus, I suggest introducing it as "...was silenced by specific small interfering RNA (siRNA) previous to ..."

      Line 327: delete the word "chronic".

      Line 344: I assume that instead of "(figures 4C)", it should be "(Figure 4D)".

      3.3 The structure of the Figures is ok for the peer review process and it might be optimized during editing of the manuscript. Nevertheless, I would like to suggest to the authors to increase the lettering size throughout all the figures. It will make the figures more intuitive.

      Thank you for the suggestion. We increased the font size of the figures.

      Reviewer #2 (Significance (Required)):

      Significance

      The work presented by Muciño-Hernandez G et. al. will clearly be a significant contribution to the scientific community working on autophagy, DNA damage repair and cancer, among others. It will be of interest to a broad spectrum of scientists, as I will elaborate in the following lines. The authors propose that MEFs have basal levels of genomic DNA damage under physiological conditions, which are cleared by basal levels of nucleophagy. On the one hand, these findings are in line with various publications demonstrating that DNA-dependent biological processes in the cell nucleus, such as transcription, replication, recombination, and repair, involve intermediates with DNA breaks that may compromise the integrity of DNA. Thus, there must be mechanisms that ensure the integrity of the genome during these processes under physiological conditions, one of them seems to be nucleophagy. This perspective might explain the fact that proteins and histone modifications that were initially characterized during DNA repair also play a role during transcription, recombination, and replication. For example, phosphorylated H2AX at S139 (γH2AX) is often used as a marker for DNA-DSB [PMID 9488723]. However, accumulating evidence suggests additional functions of this histone modification [PMIDs 19377486; 22628289; 23382544]. In addition, McManus et al. [PMID 16030261] analyzed the dynamics of γH2AX in normal growing mammalian cells and found γH2AX in all phases of cell cycle with a maximum during M phase, suggesting that γH2AX may contribute to the fidelity of the mitotic process, even in the absence of ectopically induced DNA damage. Further, Singh et al [PMID 26045162] and Dobersch et al [PMID 33594057] report that γH2AX plays a role in transcriptional activation in response to TGFB-signaling. Moreover, classical DNA-repair complexes have been linked to DNA demethylation and transcriptional activation [PMIDs 17268471; 28512237; 25901318], and DNA-DSB is known to induce ectopic transcription that is essential for repair, supporting a tight mechanistic correlation between transcription, DNA damage, and repair [PMID 24207023]. Perhaps, the authors might consider introducing several of the aspects and the citations written above into the Discussion section of the revised version of their manuscript. On the other hand, most of the published data related to nucleophagy have been obtained from cancer cells. Muciño-Hernandez G et. al. obtained their data implementing MEFs to demonstrate that the proposed mechanisms also take place under non-pathological conditions, which is one of the novel aspects of the present work.

      I hope that my suggestions help the authors to improve their manuscript, thereby reaching the standards of manuscripts recently published in journals affiliated to Review Commons AND increasing the impact of their contribution to the scientific community.

      Thank you very much for your suggestions. They helped us to present a much-improved manuscript. We hope the revised work is now suitable for publication in the Journal of Cell Science.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Muciño-Hernández and colleagues suggest that basal formation of nuclear buds and micronuclei increases in primary mouse embryonic fibroblasts following etoposide-induced double strand breaks (DSBs). The study combines the use of biochemical methodologies with confocal and super resolution microscopy in an effort to explore the contribution of nucleophagy to genome stability. The authors provide evidence that autophagy is induced upon etoposide treatment. They detected GFP-LC3 and BECN1 signals in nuclear buds and micronuclei even in untreated control and to a higher extent in etoposide-treated cells. Then, the authors examined whether nucleophagy is required for the removal of nuclear buds and micronuclei, by treating fibroblasts with control and Atg7 siRNA. The authors claim that the percentage of cells with micronuclei or nuclear buds decreases upon Atg7 knockdown, suggesting that components of the autophagy machinery induce the formation of these nuclear abnormalities. Moreover, Type II DNA Topoisomerases (TOP2A and TOP2B) and the ribosomal protein fibrillarin were detected in nuclear buds and micronuclei in fibroblasts treated or not with etoposide. Again in this case, GFP-LC3 was detected in fibrillarin-containing nuclear alterations. Based on these observations, the authors suggest that nucleophagy contributes to the elimination of chromosomal fragments or nucleolar bodies exiting the nucleus under DNA damage-inducing conditions. Specifically, they propose a key role for nucleophagy in maintaining genome stability by eliminating Type II DNA Topoisomerase cleavage complex (TOP2cc) and nucleolar components such as fibrillarin.

      While it seems that there is a relationship between nuclear-extruded TOP2 with endogenous BECN1 and GFP-LC3 suggesting autophagic engagement, inconsistencies of fluorescent images between different figures indicate possible technical problems/limitations (please see specific comments, below), compromising authors' claims. LC3 immunoblotting and GFP-LC3 localization results appear over-interpreted (comments below). Neither TOP2 nor Fibrillarin have been shown to be actual autophagic substrates. Also, the link between genomic stability, micronuclei formation and autophagy has been previously reported (Zhao et al., PMID: 33752561).

      An additional major concern relates to nucleophagy being a selective type of autophagy. As such it requires efficient recognition and sequestration of the nuclear material destined to be degraded. Cargo specificity is mediated by receptor proteins, but no evidence for such receptors is provided in this study. Moreover, there is no real mechanistic insight on how nucleophagy mediates genome stability and how this can be interpreted in terms of cell survival under physiological and stress conditions. In other words, the biological significance of the findings presented has not been addressed.

      Specific comments are summarized below:

      The authors suggest that autophagy is induced after etoposide treatment and during the DNA repair process. However, the Western blot presented in Fig. 2A is not convincing and quantification does not support a significant autophagy induction in any of these cases. Autophagy appears to be induced 1h after etoposide removal, as evidenced by the LC3II/LC3I increase (Fig. 2A and S2A). Nevertheless, all these changes should be more rigorously assessed.

      Thank you for the observation. We removed the Western blot analysis of LC3II/LC3I from the revised manuscript because basal and induced elimination of nuclear components by the autophagic machinery occurs only in a subset of cells and therefore needs to be analyzed cell by cell; pooling all the cells together dilutes the observation. Nevertheless, the dynamic intracellular distribution of both LC3 and BECN1 indicates autophagy induction. Please note in revised Figure 2A that LC3 surrounding vesicles increases after 2h of DNA damage and diminishes when DNA is repaired. BECN1 in control MEFs is highly concentrated inside the nucleus, at the nucleolus, where it co-localizes with Fibrillarin (new Figure 4E), and after DNA damage it redistributes towards the cytoplasm. Finally, after DNA repair, BECN1 again appears highly concentrated in the nucleus. Further evidence of a functional autophagic flux is the increasing number of acidic vesicles stained with Lysotracker in response to DNA damage, which decreased after DNA repair. Some of the micronuclei were also co-stained with Lysotracker, suggesting their lysosomal degradation.

      Line 190 and Fig. 2A: It is totally unclear whether "autophagy activation" takes place during the two waves described. There is no LC3B-I to LC3B-II conversion to initially suggest "autophagy activation". It rather suggests that autophagy is stalled. Fig. 2F shows that GFP-LC3 is strongly fluorescent inside the Lysotracker-stained lysosomes, further pointing to possible functional or technical problems.

      As pointed out by reviewer 1, the images presented in original Figure 2F were over-exposed. In the current version we replaced those images with new images of better quality. We also reorganized the presentation of the data, and in revised Figure 2A we present images that more convincingly show a co-localization of BECN1 with LC3, with or without Lysotracker signal, in nuclear buds and micronuclei. We also performed immunolocalization of endogenous LC3 (new Figure 2D) to rule out a possible misinterpretation of GFP-LC3 aggregates. As explained before, we removed original Figure 2A.

      Fig. 2B and Sup. Fig. 2B: BECN1 staining looks problematic. There is extreme BECN1 accumulation in the nucleus. Are those nuclear patterns of endogenous BECN1 and GFP-LC3 normal (see also minor comment 6 and 7)? Is there literature supporting such a distribution?

      Yes, BECN1 localization in the nucleus has been documented during development and in response to DNA damage stimuli such as ionizing radiation, with a function related to DNA repair that is distinct from autophagosome formation (Fei Xu, et al. 2017, Scientific Reports | 7:45385 | DOI: 10.1038/srep45385). In the current manuscript we also detected endogenous LC3, to avoid a possible artifact with GFP-LC3 expression. We observed that endogenous LC3 also localized in the nucleus (new Figure 2D).

      It is hard to imagine how BECN1 is implicated in a (here hypothetical) nuclear lamina degradation event driven by LC3-lamin B1 direct interaction (Dou et al., 2015). BECN1 is a component upstream of LC3 and is a subunit of the PI3K complex catalyzing the local PI3P generation. The above should cause recruitment of the downstream autophagic machinery. Other subunits of the same complex or downstream effectors should be identified at the same spots to support authors' claims.

      Our proposal that BECN1 contributes to nucleophagy is supported by its co-localization with LC3 and Lysotracker-stained vesicles (new Figure 2A), as well as with TOP2 (Figure 3A-C). We appreciate the reviewer's interesting idea; we certainly did not analyze the presence of BECN1 interacting partners. We agree that further studies analyzing their localization could complement our current findings. Supporting our work, others have observed UVRAG in the nucleus, specifically in centromeric regions, and it also has a role in DNA repair through its interaction with DNA-PK (Dev Cell. 2012 May 15; 22(5): 1001–1016. doi: 10.1016/j.devcel.2011.12.027). Given the anti-tumorigenic role of several autophagic molecules, it is tempting to speculate that several of them could have triple roles in the nucleus: directly interacting with DNA repair machinery, eliminating irreparably damaged DNA and preventing excessive protein accumulation in the nucleus. Further experiments are necessary to test this hypothesis, but are beyond the scope of the present manuscript.

      U, 2h D and 5h R images of whole cells are necessary. The authors should also provide representative images of cells under different conditions i.e. control, etoposide-treatment and during DNA repair. Along similar lines, untreated control cells are not included in Fig. 2E and F. These images are needed for a better comparison between normal and DNA damage-inducing conditions.

      The reviewer is right. In the revised Figure 2 we included representative images of control cells, Etoposide-treated cells and cells during DNA repair. Images of whole cells are now shown in supplementary Figure 2S.

      The authors state that autophagy is required for nuclear buds and micronuclei formation. However, the data shown in Fig. 2G and H are hardly convincing given that the statistical difference between cells treated with control and Atg7 siRNA is not strong (for example, *p<0.5, 5h after etoposide removal). To provide further support to this notion, they should use cells from autophagy defective mutants and examine the appearance of nuclear abnormalities across different conditions compared to control cells.

      We agree with the reviewer and followed his/her suggestion. We established a collaboration with Dr. Sandra Cabrera, who kindly shared with us Atg4b-/- mice from which we isolated MEFs to compare the appearance of nuclear abnormalities side by side with WT MEFs. We confirmed a statistically significant reduction in the formation of nuclear buds in both conditions, silencing of Atg7 expression by siRNA and Atg4b-/- MEFs, suggesting that the autophagic machinery contributes to bud formation (new Figure 2F-G). Interestingly, we observed a different result when analyzing micronuclei. While we found no statistically significant difference in the percentage of cells with micronuclei upon silencing Atg7 expression by siRNA, we found a statistically significant increase in cells with micronuclei in Atg4b-/- MEFs (new Figure 2F-G). This apparent discrepancy suggests that nuclear buds and micronuclei have different mechanistic origins. A difference in the biogenesis of buds and micronuclei has been previously suggested by studies of cells cultured under strong stress conditions that induce DNA amplification, as well as of cells under folic acid deficiency. While interstitial DNA without telomeres was more prevalent in buds than in micronuclei, telomeric DNA was more frequently observed in micronuclei (Fenech et al. 2011, Mutagenesis 26:125-132).

      Lines 223-228: The role of autophagic machinery in the formation of nuclear buds is not supported and furthermore hard to conceptualize. How are the components of autophagy implicated during nuclear bud and micronuclei formation? Colocalization of autophagic proteins might mean that autophagy is engaged at some point after or during the above formation. The causal, mechanistic and temporal aspects of the above budding and nucleophagic events need experimental support and/or more accurate interpretation.

      We agree with the reviewer, and we now express our interpretation with more caution. The role of the autophagic machinery in the formation of nuclear buds is supported by the following findings: a) the localization of LC3 and BECN1 in nuclear buds; b) the inhibition of Atg7 expression by specific siRNAs reduced the number of cells with buds; and c) Atg4b-/- MEFs had a reduced number of cells with buds (new Figure 2G). How the components of the autophagic machinery are implicated in nuclear bud formation is an interesting question and deserves further investigation, beyond the scope of the present manuscript.

      The authors claim that nucleophagy eliminates topoisomerase cleavage complex because TOP2A and TOP2B appear to more extensively co-localize with GFP-LC3 and BECN1 after etoposide-induced DSBs. However, the quantification presented in Fig. 3D-F to support this statement does not, in general, show a statistically significant difference in fibroblasts across different conditions (normal, etoposide treatment, etoposide removal).

      Autophagic elimination of TOP2 protein is supported by the following findings: 1) both BECN1 and LC3 were detected in micronuclei in acidic vesicles (labeled with Lysotracker), which is indicative of the autolysosomal nature of the cytoplasmic compartment containing TOP2 (Figure 2A); 2) TOP2B was found by electron microscopy in some cells exiting the nucleus surrounded by LC3 (Figure 3G); 3) TOP2B accumulated in cells lacking ATG4, as expected if it is degraded by autophagy (Figure 3H).

      Why would BECLIN colocalise with TOP2B in Figure 3g, given that beclin is involved in the initiation process?

      We think that BECN1 is involved in additional functions to the initiation process of bud formation. For example, it has been shown by others that BECN1 interacts with TOP2 (Dev Cell. 2012 May 15; 22(5): 1001–1016. doi: 10.1016/j.devcel.2011.12.027). It could be working as an autophagic receptor targeting TOP2cc to buds and micronuclei. We are aware that further studies are necessary to test this hypothesis, but they are beyond the scope of this manuscript.

      Fig. 4A and B: There is no enrichment of GFP-LC3 in "the nuclear alterations containing Fibrillarin" as stated in lines 341-343 compared to the rest of the cellular GFP fluorescence.

      It is true that there is no local enrichment of GFP-LC3 comparable to the LC3 puncta normally reported in response to autophagy induction by starvation, for example. Nevertheless, we are confident of the specificity of the observation, as not every nuclear alteration was found to contain GFP-LC3. We detected GFP-LC3 in 72% (mean ± 3.61 SD) of the nuclear alterations containing Fibrillarin in untreated cells, in 65.7% (mean ± 1.97 SD) of cells with 2h of DNA damage and in 90.33% (mean ± 6.36 SD) after 5 h of DNA repair (in 5 independent experiments).

      Moreover, there is no statistical significance in Fig. 4C and D measurements limiting the safety of authors' conclusions in lines 341-346.

      We agree with the reviewer's observation. We repeated these experiments two more times and did not find a statistically significant difference in the percentage of cells with nuclear lesions containing Fibrillarin and GFP-LC3 after DNA damage or after DNA repair. These results suggest that nucleolar DDR is a particular response, independent of DDR elsewhere in the genome, as has been suggested (reviewed in Nucleic Acids Research, 2020, Vol. 48, No. 17 9449–9461; doi: 10.1093/nar/gkaa713). An alternative is that the release of nucleolar components is not enhanced by Etoposide at the dose and time used in this work.

      Lines 368-370: As discussed by the authors and reported in previous publication (Xu et al., 2017), "BECN1 interacts directly with TOP2B, which leads to the activation of DNA repair proteins, and the formation of NR and DNA-PK repair complexes", independent of its role in autophagy. Currently, there are no rigorous findings supporting the contribution of BECN1 (as a functional constituent of the core autophagic machinery) to nuclear damaged material extrusion (lines 382-384).

      We agree with the reviewer that we did not perform an assay to demonstrate that BECN1 contributes to TOP2 nuclear extrusion as a functional constituent of the core autophagic machinery. Nevertheless, the following data support the proposal of an autophagic elimination of TOP2cc: 1) TOP2B was detected in micronuclei containing BECN1 (Figure 3B); 2) BECN1 was found in micronuclei containing LC3 and in an acidic vesicle (labeled with Lysotracker), indicative of the autolysosomal nature of the compartment (Figure 2A); 3) TOP2 was found in some cells exiting the nucleus surrounded by LC3 (Figure 3G); 4) TOP2 accumulated in cells lacking ATG4, suggesting its autophagic degradation (Figure 3H).

      Lines 435-441 and Fig. 5: The current findings do not support the proposed model. It is hard to support and conceptualize the statement "proteasome and nucleophagy function in a dynamic way inside the nucleus".

      The reviewer is right. We made a mistake by integrating an interpretation within the summary of the actual findings of this work. We corrected the text in the current version.

      In Fig. 5, LC3 appears to decorate inner nuclear membrane and probably to interact with some of the other proteins depicted, which is misleading.

      We agree with the reviewer. We removed the scheme in the current manuscript.

      Beclin-1 appears to interact with Fibrillarin (Nucleolus).

      This is correct. We observed by immunofluorescence a co-localization of BECN1 with Fibrillarin (new Figure 4E), and demonstrated by co-immunoprecipitation that they are constituents of a complex (new Figure 4F).

      Most of the differences in Sup. Fig. 3 lack statistical significance compromising the authors' claims.

      We agree with the reviewer. Performing a separate statistical analysis of the percentage of cells with nuclear buds or micronuclei did not provide further information. We removed this analysis from the current version.

      Many conclusions are drawn by colocalisation-immunofluorescence analysis. Co-immunoprecipitation experiments should also be performed to show that TOP2B and fibrillarin interact with LC3/autophagic machinery.

      Thank you for your suggestion. We performed immunoprecipitation analysis and confirmed an interaction of Fibrillarin with BECN1; this result is now presented in Figure 4F. We found no co-immunoprecipitation of LC3 with either Fibrillarin or TOP2A, nor of TOP2B with BECN1.

      Additionally, colocalisation analysis should be performed using tools such as Pearson's correlation, as colocalisation is an initial indication of nucleophagy. In the case of fibrillarin, the immunofluorescence images do not indicate colocalisation; they need to be repeated.

      The transport of Fibrillarin out of the nucleus by micronuclei formation and its autophagic degradation implies that both proteins are contained in the same vesicular compartment; it does not necessarily require a direct interaction of Fibrillarin with LC3. Therefore, a co-localization detected by Pearson's analysis is not a necessary confirmation of the nucleophagic degradation of Fibrillarin. Indeed, Fibrillarin does not seem to interact with LC3, since we could not detect both proteins by co-immunoprecipitation. Nevertheless, we observed a nucleolar localization of BECN1 overlapping with Fibrillarin (new Figure 4E), and we confirmed by co-immunoprecipitation the presence of both BECN1 and Fibrillarin in a complex (new Figure 4F). Following the reviewer's advice, we repeated the analysis of Fibrillarin immunolocalization two more times. We corroborated its localization in micronuclei and nuclear buds in 5.86% (mean ± 5.03 SD) of untreated cells, indicating a basal level of nucleolar material exclusion from the nucleus. Interestingly, the percentage of cells with Fibrillarin in nuclear alterations did not increase with statistical significance upon Etoposide treatment. At 2 h of DNA damage we observed only a slight increase to 6.8% (mean ± 4.03 SD) of cells having nuclear buds and micronuclei with Fibrillarin, while the number of cells with nuclear lesions increased to 30.6% (mean ± 4.2 SD). Similarly, the proportion of cells having Fibrillarin in nuclear lesions after 5 h of DNA repair increased only to 7.66% (mean ± 6.08 SD), while the total number of cells having nuclear buds and micronuclei increased to 38.42% (mean ± 9.3 SD). These results suggest that nucleolar components are constantly sent out of the nucleus as a homeostatic process, and not significantly in response to Etoposide-induced DSB.
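      For readers unfamiliar with the metric the reviewer requests, the following is a minimal sketch of a Pearson's colocalisation coefficient computed over paired pixel intensities from two channels. It is not the authors' analysis: the function name `pearson_colocalization` and all intensity values are hypothetical, chosen only to show one overlapping and one mutually exclusive pattern.

```python
import math


def pearson_colocalization(channel_a, channel_b):
    """Pearson's r between two equally sized sequences of pixel intensities."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    # Sums of products of deviations from each channel's mean intensity
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical intensities for the same six pixels of one region of interest
fibrillarin = [10, 40, 80, 120, 200, 30]
lc3_overlapping = [12, 35, 90, 110, 190, 25]   # bright where Fibrillarin is bright
lc3_excluded = [200, 120, 80, 40, 10, 150]     # bright where Fibrillarin is dim

r_overlap = pearson_colocalization(fibrillarin, lc3_overlapping)
r_excluded = pearson_colocalization(fibrillarin, lc3_excluded)
print(f"overlapping pattern: r = {r_overlap:.2f}")
print(f"excluded pattern:    r = {r_excluded:.2f}")
```

      A coefficient near +1 indicates that the two signals rise and fall together across pixels, whereas a value near 0 or below indicates no covariation or mutual exclusion. As the response above notes, covariation of intensities is a different question from the two proteins sharing a compartment or interacting directly.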

      Measurement of LC3/fibrillarin positive puncta should be performed, under basal conditions, genotoxic, and nucleolar stress under control and Atg7 knockdown conditions.

      Since we observed no statistically significant change in the number of micronuclei with Fibrillarin under Etoposide-induced DSB or DNA repair, we did not perform the suggested experiment.

      Moreover, if nuclear proteins described are substrates of autophagy, then their levels would decrease upon autophagic induction i.e. starvation or in this case DNA damage and nucleolar stress. Thus, western blot analysis of relative protein levels can be performed.

      Thank you for the suggestion. Since only 5% of the cells have micronuclei with Fibrillarin, and this proportion did not increase significantly in response to DNA damage, a population analysis such as a Western blot is unlikely to detect a difference in the amount of Fibrillarin in response to autophagy manipulation. Nevertheless, we compared Fibrillarin abundance by Western blot in WT MEFs vs. Atg4-/- MEFs untreated (U), treated for 2 h with Etoposide (D) and after 5 h of DNA repair (5), shown in the top panel of the following figure. As expected, we found no statistically significant difference, as determined by two-way ANOVA followed by Sidak's multiple comparisons test (n=3). Adjusted P values are shown for each comparison (left graph).

      On the other hand, since the percentage of cells with TOP2B in micronuclei and nuclear buds increased in response to DNA damage and during DNA repair, it was possible to detect a statistically significant accumulation of TOP2B in cells lacking ATG4 after 5h of DNA repair (bottom panel and right graph in the figure above). This observation is now included in new Figure 3H. Supporting our finding, TOP2A is reduced in cancerous cells grown under glucose deprivation (Alchanati, I., et al. 2009. PLoS One. 4:e8104).

      Endogenous LC3 nuclear buds should also be detected to verify nucleophagy as GFP-LC3 has been shown to aggregate, causing artifacts under certain conditions.

      We agree with the reviewer. We detected endogenous LC3 by immunofluorescence. This result is now included in Figure 2D.

      Minor comments

      In the Discussion section, the paragraph focused on the role of the ubiquitin-proteasome system is not substantiated by the data presented in the manuscript. Along similar lines, formation of aggresomes following etoposide treatment and their subsequent removal has not been monitored.

      We apologize for the confusion. We corrected the text so that it now clearly distinguishes our findings from the published data to which we attempt to relate them.

      Western blots of better quality should be provided with assigned markers of protein size.

      The Western blots shown have markers of protein size.

      There are several language errors in the text that need to be corrected. Several sentences are too long and confusing or must be re-phrased. For example, see the lines: 123-125, 209-210,212, 218,221-222.

      We apologize for our language errors. We corrected all errors indicated and asked colleagues proficient in English to review our text.

      Fig. 1B. Place "μm" in parentheses.

      Sup. Fig. 1B: Replace "gH2AX" with "γH2AX".

      Fig. 1D: Separate DAPI and γH2AX channel images would be informative.

      We now show also separated channels.

      Fig. 2E: Enlarged separate DAPI, GFP-LC3 and lamin A/C channel images would be informative.

      We now show also separated channels.

      Line 218: Replace "bus" with "buds".

      Fig. 2B, 2E, 2F, 3A and probably Sup. Fig. 2B represent MEFs treated for 2h with etoposide. The pattern of GFP-LC3 in 2B looks extensively nuclear and almost absent from cytoplasm.

      We confirmed our finding detecting endogenous LC3.

      In addition, Fig. 2B and 3B represent MEFs treated for 2h with Etoposide. The pattern of endogenous BECN1 in Fig. 2B looks extensively nuclear and almost absent from cytoplasm. In Fig. 3B the pattern is notably different.

      BECN1 pattern of distribution is rather similar, predominantly in the nucleolus. We demonstrate it further by detecting BECN1 overlapping localization with Fibrillarin (new Figure 4E) and co-immunoprecipitation (new Figure 4F).

      Sup. Fig. 2C: Index box is not properly aligned.

      Thank you. We reviewed the alignment of each index box and reorganized the figure in the revised manuscript to add the whole blots of the new experiments we performed to analyze MEFs Atg4-/-.

      Lines 154, 343 and 837: Replace "DBS" with "DSB".

      Thank you, we corrected these typos.

      Fig. 4 panels are not clearly cited in the text.

      We apologize. We have checked that all panels are now clearly cited.

      Line 220: siRNA

      Thank you, we corrected the text.

      Lines 373-374: References "Lenain et al., 2015" and "Li et al., 2019" are missing.

      Thank you for noticing it; we added the missing references. We use EndNote X9 and did not expect it to fail.

      Lines 400-401 and 407: Probably the second "Latonen, 2011" reference needs "et al".

      It is correct. We now cite this paper properly.

      Line 427: Do authors refer to Fig. 1E rather than Fig. 2B?

      Yes, we are sorry for this mistake. Thank you for pointing it out.

      Line 434: Correct "clearance" spelling.

      Thank you, we corrected it.

      Reviewer #3 (Significance (Required)):

      The authors suggest that nucleophagy contributes to the elimination of chromosomal fragments or nucleolar bodies exiting the nucleus under DNA damage-inducing conditions. Specifically, they propose a key role for nucleophagy in maintaining genome stability by eliminating Type II DNA Topoisomerase cleavage complex (TOP2cc) and nucleolar components such as fibrillarin.

      However, neither TOP2 nor Fibrillarin have been shown to be actual autophagic substrates. Also, the link between genomic stability, micronuclei formation and autophagy has been previously reported (Zhao et al., PMID: 33752561).

      We found nuclear buds and micronuclei with markers of different stages of the autophagic pathway, suggesting an active role of autophagy proteins in bud formation and micronuclei removal. We detected TOP2 and Fibrillarin in micronuclei and propose their elimination by nucleophagy based on the following findings: 1) both BECN1 and LC3 were detected in micronuclei in acidic vesicles (labeled with Lysotracker), which is indicative of autolysosomes (Figure 2A); 2) TOP2B was found by electron microscopy in some cells exiting the nucleus surrounded by LC3 (Figure 3G); 3) TOP2B accumulated in cells lacking ATG4, as expected if it is degraded by autophagy (Figure 3H); 4) BECN1 has a dynamic cytoplasmic-nuclear traffic in response to DNA damage; 5) BECN1 co-localized with Fibrillarin in the nucleolus and both proteins were co-immunoprecipitated.

      The link between genomic stability, micronuclei formation and autophagy has been previously reported only in cancerous cells. Considering that physiological DNA damage occurs constantly in the cell, basal nucleophagy is potentially fundamental to keeping cells healthy.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper addresses an important question: whether the conduction velocity in white matter tracts is related to individual differences in memory performance. The authors use novel MRI techniques to estimate the "g-ratio" in vivo in humans - the ratio of the inner axon diameter to the diameter of the axon plus its outer myelin sheath. They find that autobiographical recall is positively related to the g-ratio in a specific white matter tract (the parahippocampal cingulum bundle) in a population of 217 healthy adults. This main finding is extended by showing that better memory is associated with larger inner axon diameters and lower neurite dispersion, which suggests more coherently organised neurites. The authors also argue that their results show that the magnetic resonance (MR) g-ratio can reveal novel insights into individual differences in cognition and how the human brain processes information.
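      As a reading aid, the g-ratio described above can be written out explicitly. The first expression is the single-fibre definition; the second is the aggregate, voxel-level form commonly used in the MR g-ratio literature, where AVF and MVF denote the axon and myelin volume fractions. Whether the manuscript uses exactly this aggregate expression is an assumption here, since the estimation pipeline is not described in this response.

```latex
g = \frac{d_{\text{axon}}}{d_{\text{fibre}}},
\qquad
g_{\text{MR}} = \sqrt{\frac{\mathrm{AVF}}{\mathrm{AVF} + \mathrm{MVF}}}
             = \sqrt{1 - \frac{\mathrm{MVF}}{\mathrm{AVF} + \mathrm{MVF}}}
```

      Under either form, a thinner myelin sheath relative to the axon pushes the ratio towards 1, and a thicker sheath pushes it towards 0.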

      The study is exploratory in nature and the analyses were not pre-registered. The technique has not been used before to associate cognitive performance with MR estimates of conduction velocity in candidate white matter tracts. It is therefore unknown how strong any associations are likely to be and what sort of sample size might be needed to observe them. Nevertheless, if the technique proves to be reliable, then it certainly offers a valuable new tool to understand individual differences in cognitive abilities. However, brain structure to behavior associations are notoriously variable across studies and have been argued to require very large sample sizes to obtain reproducible results.

      We respectfully disagree that the study was exploratory. We had distinct aims and hypotheses from the outset. Our prime interest is in autobiographical memory, the hippocampus and its connectivity. This motivated our focus on three specific white matter tracts. We also planned from the time of study design to examine the MR g-ratio, and even contributed to refining the pre-processing pipeline for this approach, as reported in a previous paper (Clark et al., 2021, Frontiers in Neuroscience). Moreover, in the current manuscript we outlined well thought through possible outcomes and declared specific predictions.

      Regarding pre-registration, due to the scope of this work, the experiment was planned eight years ago, and data collection commenced seven years ago. At that time, formal pre-registration was not common practice. However, it has been a long-standing feature of our Centre that proposed studies and their analysis plans undergo rigorous internal peer review, including presentation to the whole Centre, before data acquisition can commence. The proposal for the research under consideration here was presented on 26th September, 2014.

      As noted in our response to the Editors’ Public Evaluation Summary above, someone has to be the first to report a novel result, and we believe that the depth and transparency of our approach permits confidence in the findings. Not least, and to reprise, because we employed the most widely-used and best-validated method of testing autobiographical memory recall that is currently available – Levine’s Autobiographical Interview. Our primary analyses were performed using the behavioural outcome measure from this test, the results of which were directly compared to those from a closely-matched control measure to test whether significantly larger effects were observed for our variable of interest. The potential for false positives was further reduced by extracting microstructure data from hypothesised tracts of interest (instead of performing whole brain voxel-wise analyses), with statistical correction performed on all structure-behaviour analyses. Moreover, we performed partial correlations with age, gender, scanner and number of voxels in a region of interest (ROI) as covariates. Complementary investigations were also conducted using other commonly-reported measures, providing supporting evidence. We report all analyses (and provide all the source data), including those finding no relationships. The consistent results throughout were associations between autobiographical memory recall ability and the microstructure of the parahippocampal cingulum bundle only. Moreover, thanks to the excellent suggestions of the Reviewers, the revised version reports additional analyses that allow us to further corroborate and interpret our findings.
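      The partial correlations mentioned above can be illustrated with a short sketch. It uses the standard recursive formula for removing one covariate at a time and is not the authors' code: the function names, the single simulated covariate `age` and the simulated data are all hypothetical, and the study itself used four covariates (age, gender, scanner and number of voxels in the ROI).

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the linear effect of each covariate.

    Applies the first-order partial correlation formula recursively,
    peeling off one covariate per level of recursion.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated data (hypothetical): both measures depend on age, not on each other
random.seed(0)
age = [random.gauss(0, 1) for _ in range(2000)]
tract_measure = [2 * a + random.gauss(0, 1) for a in age]
memory_score = [2 * a + random.gauss(0, 1) for a in age]

raw_r = pearson(tract_measure, memory_score)
adjusted_r = partial_corr(tract_measure, memory_score, [age])
print(f"raw r = {raw_r:.2f}, partial r controlling for age = {adjusted_r:.2f}")
```

      In this simulation the raw correlation is large only because both variables track the covariate; once the covariate is partialled out, the association largely disappears. This is the kind of spurious structure-behaviour association that covariate adjustment is meant to guard against.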

      Our sample of 217 participants allowed for sufficient power to identify medium effect sizes when conducting correlation analyses at alpha levels of 0.01 and when comparing correlations at alpha levels of 0.05 (Cohen, 1992, Psychological Bulletin). While it has recently been suggested that thousands of participants are required in order to investigate brain structure-behaviour associations (Marek et al., 2022, Nature), other, more sophisticated, analyses suggest that samples of ~200 participants can be sufficient, in line with our estimates (Cecchetti and Handjaras, https://psyarxiv.com/c8xwe; DeYoung et al., https://psyarxiv.com/sfnmk). Given that our study was principled, well-controlled, analysed appropriately and produced very specific and consistent findings, we are confident that the findings are robust.
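      The power statement above can be checked approximately with the Fisher z transformation. This is a sketch under stated assumptions, not the authors' calculation: "medium" is taken to mean r = 0.30 (Cohen's convention for correlations), the test is assumed two-sided, and the 0.80 power target in the second function is an assumption, since the response does not state one. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def correlation_power(r, n, alpha):
    """Approximate two-sided power to detect a true correlation r with n participants."""
    # Expected Fisher z statistic: atanh(r) divided by its standard error 1/sqrt(n - 3)
    effect = math.atanh(r) * math.sqrt(n - 3)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in either tail
    return _STD_NORMAL.cdf(effect - z_crit) + _STD_NORMAL.cdf(-effect - z_crit)


def required_n(r, alpha, power):
    """Approximate sample size needed to reach the requested power (upper tail only)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_crit + z_power) / math.atanh(r)) ** 2 + 3)


power_at_217 = correlation_power(r=0.30, n=217, alpha=0.01)
n_for_80_percent = required_n(r=0.30, alpha=0.01, power=0.80)
print(f"power with n = 217: {power_at_217:.3f}")
print(f"n needed for 80% power: {n_for_80_percent}")
```

      Under these assumptions, a sample of 217 is comfortably above the size needed for a medium correlation at an alpha level of 0.01. The same functions show how quickly the required sample grows for smaller effects, which is the concern raised in the large-sample literature cited above.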

      The authors decided to analyse performance on a single task - the Autobiographical Memory Interview - and identified three candidate white matter tracts that connect the hippocampal region with other brain regions. While it is clear why these three tracts were chosen, it is less obvious why the authors chose to investigate associations with the Autobiographical Memory Interview and not other memory tests that were part of the battery of tests administered to the participants. It is reasonable to assume that something as general as the conduction velocity of a white matter tract would have an effect on memory ability across a range of tasks, so to single out one seems an unnecessarily narrow focus.

      Our main interest over many years, and hence the focus of this study, is autobiographical memory recall because it directly relates to how people function in real life. As noted above, autobiographical experiences occur in dynamic, multisensory, multidimensional, non-linear, ever-changing contexts; they involve actively engaging with the environment and other people; they are embodied; they span milliseconds to decades. Many of these features cannot be captured by laboratory-based episodic memory tests. This issue is increasingly being discussed (for example, see recent reviews by Nastase et al., 2020, NeuroImage; Mobbs et al., 2021, Neuron; Miller et al., 2022, Current Biology). It is further laid bare in McDermott et al.’s (2009, Neuropsychologia) meta-analysis of functional MRI studies which showed that laboratory-based and autobiographical memory retrieval tasks differ substantially in terms of their neural substrates. Consequently, we were not surprised to find that when we analysed laboratory-based memory test performance, there were no correlations with the MR g-ratio. Recall of vivid, detailed, multimodal, autobiographical memories may rely on inter-regional connectivity to a greater degree than simpler, more constrained laboratory-based memory tests. Therefore, as well as speaking to conduction velocity, these findings also contribute to wider discussions about real-world compared to laboratory-based memory tests. We thank the Reviewer for making the excellent suggestion to include these additional data, analyses and discussion points.

      The results of the study are interesting and highlight a key role of the parahippocampal cingulum bundle in autobiographical memory recall. The results are corrected for multiple comparisons across the three fiber tracts of interest and the recall of "external details" provides a nice control compared to the "internal details" which are the measure of interest. The main findings are extended to show that it is likely to be an increase in axon diameter and an increase in neurite coherency that characterize those individuals with better autobiographical recall. Despite these positives, it remains unclear whether memory recall, in general, is better in people with higher g-ratios in this tract (as implied in the Abstract), or if this effect is specific to scores on the Autobiographical Memory Interview.

      Our interest is in autobiographical memory, and so we employed the most widely-used and best-validated method of testing autobiographical memory recall that is currently available – Levine’s Autobiographical Interview. Not only does this test include a control measure, external details (as noted by the Reviewer), but we had independent raters score the autobiographical memory descriptions, and found that the intra-class correlation coefficients were very high (see Materials and Methods). Despite using this current, gold standard approach, at the request of the Reviewer we have now analysed data from eight additional laboratory-based memory tests. These are standard memory tests that are often used in neuropsychological studies: testing recall - the immediate and delayed recall of the Logical Memory subtest of the Wechsler Memory Scale IV, the immediate and delayed recall of the Rey Auditory Verbal Learning Test, the delayed recall of the Rey–Osterrieth Complex Figure; testing recognition memory - the Warrington Recognition Memory Tests for Words and Faces; testing semantic memory - the “Dead or Alive Test”. While these tests can assess some aspects of memory recall, they cannot be regarded simply as proxies for autobiographical memory recall, for the reasons we outlined in our response to the previous point. They do not capture key aspects of autobiographical memories. It is therefore all the more interesting that we found no associations between these laboratory-based memory tasks and the MR g-ratio of the parahippocampal cingulum bundle, in contrast to the relationship identified with autobiographical memory recall ability. Recall of vivid, detailed, multimodal, autobiographical memories may rely on inter-regional connectivity to a greater degree than simpler, more constrained laboratory-based memory tests. Therefore, as well as speaking to conduction velocity, these findings also contribute to wider discussions about real-world compared to laboratory-based memory tests. We thank the Reviewer once again for making the excellent suggestion to include these additional data, analyses and discussion points.

      Reviewer #2 (Public Review):

      In this study, Clark and colleagues tackle a very intriguing question: how are differences in autobiographical recall abilities reflected in human brain structure and function? To answer this question, they interviewed a large cohort of subjects and proceeded to acquire MRI data, specifically diffusion-weighted imaging and magnetization transfer data, to estimate the g-ratio, a measure of myelination deeply linked to conduction velocity. Looking at three specific white matter pathways of interest, all interconnecting the hippocampus with other brain structures, they studied the relationship between the g-ratio and the autobiographical recall abilities, together with many more measures from MRI. They found a significant positive association between the g-ratio of the parahippocampal cingulum bundle and the number of internal details from the interviews. These results can provide new potential directions to further study the underlying neural features beyond memory.

      I think that this is a very interesting article, it is well written, the methods are extensively explained, and the appendix provides further details for more expert readers. The authors put an effort into providing a comprehensive context in the introduction and in the discussion, and as a result, the paper seems overall quite suitable for both general and specialistic readerships.

      Thank you.

      The main issue I can currently see in the paper is that the mentioned relationship between g-ratio and recall abilities is then used to infer that better recall abilities are associated with higher conduction velocity and larger axons. The authors' line of reasoning is that given the hypothesized association, the increase in the g-ratio implies increases in myelin and axonal diameter. Despite this scenario being indeed possible given the current result, an increased g-ratio may also not indicate higher conduction velocity. In fact, the first potential inference would be that, without having any information on the axon size, the quantity of myelin can indeed be lower and, as a result, the conduction velocity would decrease. I understand that the authors expected higher conduction velocity associated with better autobiographical memory recall, but it is hard to see any experimental outcome that could have disproved this hypothesis: from the possible scenarios depicted in the introduction, any change in the g-ratio (and even not any change at all) could indicate higher conduction velocity. What would be then needed to corroborate one of these scenarios is some independent or complementary measure, which unfortunately is missing.

      The mentioned issue does not mean that the paper loses relevance - I think that it should focus on the very practical result, a change in myelination and microstructure, and discuss what are the potential implications, including the one that currently dominates the discussion section.

      Thank you for these comments and the opportunity to provide further clarification.

      First, we have now provided additional background information regarding the relationship between the MR g-ratio and conduction velocity. We explicitly note that while finding a significant relationship between the MR g-ratio and autobiographical memory recall suggests the existence of an association between autobiographical memory recall and parahippocampal cingulum bundle conduction velocity, it cannot speak to the direction of this association.

      Second, we have further noted that interpretation of the parahippocampal cingulum bundle MR g-ratio in relation to the underlying microstructure requires knowledge, or an assumption, about whether the associated change in conduction velocity is faster or slower. Given that faster conduction velocity is thought to promote better cognition (e.g. Brancucci, 2012; Dicke and Roth, 2016; Miller, 1994; Reed and Jensen, 1992), we interpreted our MR g-ratio findings under the assumption of faster conduction velocity, and now explicitly note in several places in the revised manuscript that this is an assumption.

      Third, we thank the Reviewer for the excellent suggestion that a complementary measure could help to further inform the findings. Consequently, we now also include additional analyses examining the relationship between the extent of myelination and autobiographical memory recall ability. This is possible using the magnetisation transfer saturation maps, which are optimised to assess myelination. Given our assumption of faster conduction velocity when interpreting our positive MR g-ratio correlations, no relationship between parahippocampal cingulum bundle magnetisation transfer saturation and autobiographical memory recall would be expected. On the other hand, if conduction velocity were actually decreasing, then a negative correlation between magnetisation transfer saturation values and autobiographical memory recall ability would be expected. In fact, we found no relationship between parahippocampal cingulum bundle magnetisation transfer saturation and autobiographical memory recall. This suggests that myelin was not associated with autobiographical memory recall ability, supporting our assumption that relationships with the MR g-ratio were indicative of faster, rather than slower, conduction velocity.

      We have now added these new data, analyses and discussion points to the revised manuscript.
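
      The ambiguity discussed in this exchange can be illustrated with simple arithmetic. The sketch below uses the standard geometric definition of the g-ratio (inner axon diameter divided by outer fibre diameter). The function name and all diameters are hypothetical values chosen for illustration; they are not measurements from the study, and the MR g-ratio itself is an aggregate estimate rather than a per-axon quantity.

```python
def g_ratio(axon_diameter_um, myelin_thickness_um):
    """Geometric g-ratio: axon diameter / (axon diameter + myelin on both sides)."""
    outer_diameter = axon_diameter_um + 2.0 * myelin_thickness_um
    return axon_diameter_um / outer_diameter


# Hypothetical fibre used as a reference point.
baseline = g_ratio(axon_diameter_um=1.0, myelin_thickness_um=0.25)

# Scenario A (assumed): larger axon, same myelin thickness.
larger_axon = g_ratio(axon_diameter_um=1.5, myelin_thickness_um=0.25)

# Scenario B (assumed): same axon, thinner myelin.
thinner_myelin = g_ratio(axon_diameter_um=1.0, myelin_thickness_um=0.20)

for name, value in [("baseline", baseline),
                    ("larger axon", larger_axon),
                    ("thinner myelin", thinner_myelin)]:
    print(f"{name}: g-ratio = {value:.3f}")
```

      Both hypothetical scenarios raise the g-ratio relative to the baseline, which is why a higher g-ratio alone cannot say whether conduction velocity is faster or slower, and why a complementary myelin-sensitive measure is informative.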

      It would also be helpful to include some paragraphs on both interpretation and methodological issues when it comes to MRI-based microstructural imaging, which at the moment is lacking. This would provide a better picture of the results for a more general readership.

      We agree, and additional consideration of interpretational and methodological limitations has now been included in the manuscript.

      As one of the first works using an MRI-based microstructural measure of myelin, the g-ratio, to study cognition in a large cohort of subjects, I think this work will be a needed and significant step towards merging the neuroscience and MRI physics community - the methodology presented here is robust and could be used in many other applications.

      Thank you.

      Reviewer #3 (Public Review):

      The manuscript adds useful information about how structural properties of the brain are related to individual differences in autobiographical memory. A novel metric is used to assess features of white matter in tracts that are important for information exchange between the hippocampus and other brain regions. In one of these, the parahippocampal bundle, a relationship between the MR g-ratio and autobiographical memory recall is identified. This represents new and interesting information. The authors interpret the results in line with the theory that speed of signal transmission is important for cognitive function.

      Thank you for this positive summary.

    1. Author Response

      Reviewer #1 (Public Review):

      Rasicci et al. have developed a FRET biosensor that is designed to light up when cardiac myosin folds. This structure is extremely important to understand, and its link to the super-relaxed (SRX) state has not been fully shown. Their study provides a comprehensive review of the literature and provides compelling data that the 15 heptad+leucine zipper+GFP construct does function well and that the DCM mutant E525K has a similar IVM velocity despite a reduced ATPase compared with HMM. They rely on the ionic strength-dependent changes in the rate of MantATP release to argue that the E525K mutation stabilizes the 'interacting heads motif' (IHM) state, which makes logical sense.

      Strengths:

      Well written and comprehensive.

      Utilizes the appropriate fluorescence-based sensor for measuring the folding of the myosin structure. Provides a detailed range of techniques to support the premise of the study.

      Weaknesses:

      The outcomes of this study are over-interpreted to mean that the IHM and SRX are the same. Similar studies, e.g. Anderson 2018 and Chu 2021, support the opposite view that IHM and SRX are not necessarily the same. Anderson (and Rohde 2018) point out that S1 has some element of a reduced ATPase, which clearly cannot be due to folding of the molecule. Also, mavacamten was used in these studies to show that even S1 is inhibited, suggesting that SRX and IHM are not connected. This is not to say that these observations cannot be over-ridden with enough supporting evidence; it is just not clear that there is enough in this study to support this conclusion.

      We have revised our discussion to emphasize that our results support a model in which the SRX state is enhanced by formation of the IHM, but given the S1 and 2HP data the IHM may not be required for populating the SRX biochemical state (see page 8).

      I felt that the authors passed over the recent Chu 2021 paper too quickly. The Thomas group used a FRET sensor as well, which provides a direct comparison as a technique, but reached opposite conclusions. They also have supporting data in Rohde 2018 that their constructs were less ionic strength sensitive. It would be useful to understand what the authors think about this.

      We have discussed the Rohde and Chu papers in more detail in the discussion (see page 8). In the Rohde paper they used proteolytically prepared HMM and S1. Rohde found 20% SRX at all KCl concentrations in S1, while HMM shifted from 50% to 20% SRX in low and high salt conditions, respectively. Our results are different in terms of the absolute fraction of the SRX state but the trend is similar in terms of S1 being salt-insensitive and HMM being salt-sensitive. The difference could be proteolytic HMM, which is a longer construct, and proteolytic S1, which is prone to internal cleavage that can impact ATPase activity. Another difference could be the mixed isoform of mantATP used in previous studies and the single isoform of mantATP used in our study (see page 5).

      Reviewer #2 (Public Review):

      The paper by Rasicci et al. examines the impact of the DCM mutation E525K in beta-cardiac myosin on its function and regulation by autoinhibition. The role of the auto-inhibited state of beta-cardiac myosin in fine-tuning cardiac contractility is an active and exciting area of current research related to muscle biology and cardiomyopathies. Several studies in the past have linked the destabilization of the autoinhibited, super-relaxed (SRX) state of myosin to the pathogenesis of hypertrophic cardiomyopathy. This timely study provides one of the first few examples where the hypocontractile phenotype of a DCM mutation has been linked to the stabilization of the SRX state.

      One of the strengths here is the utilization of a wide variety of both pre-existing and novel biochemical and biophysical assays for the study. The authors have characterized a new two-headed long-tailed myosin construct containing 15-heptad repeats of the proximal S2 (15HPZ), which they show allows myosin to form the SRX state in vitro using single ATP turnover assays. The authors go on to compare the E525K and WT proteins using the 15HPZ myosin construct in terms of their steady-state actin-activated ATPase activity, in-vitro actin-sliding velocity and single ATP turnover measurements. These assays reveal that the predominant effect of this mutation is the stabilization of the SRX state which is maintained even at 150 mM salt concentration where the WT SRX is largely disrupted. This is an important observation because DCM mutations so far have been believed to only affect the force-generating capacity of myosin.

      One of the biggest strengths of this study is the attempt to develop a FRET-based approach to directly ask if the biochemical SRX state here correlates well with the structural IHM state, which is an important unresolved question in the field. The authors have designed a FRET pair (C-terminal GFP and Cy3ATP bound to the active site) that is sensitive to the relative position of the heads and the tail, allowing them to distinguish between the low-FRET closed IHM conformation and the no-FRET open conformation. Remarkably, the authors show that the salt dependence of the FRET efficiency values closely follows their results from the salt dependence of the percent SRX for both WT and E525K proteins. The authors then attempt to substantiate their FRET results by a direct visual analysis of the conformational states populated by both WT and E525K proteins at low salt using negative staining EM analysis. The authors have optimized conditions to allow the deposition of the IHM state on grids without adding the small molecule mavacamten, which was found to be necessary in an earlier study to visualize the closed state using EM. The authors conclude that the SRX state correlates well with the IHM state and that the E525K mutation indeed stabilizes the folded-back conformation of myosin.

      This study significantly strengthens the previously illustrated correlation between the SRX and IHM states and provides methodological advances (especially visualization of the IHM state by negative EM in the absence of cross-linking agents) that will be very useful to the field going forward. The observation that a DCM mutation can lead to stabilization of the folded back state is a novel insight that should spark interest in the field to test how broadly this applies to other DCM mutations. The conclusions of the paper are mostly supported by the data; however, some clarifications and qualifications are needed.

      Weaknesses:

      The extremely low enzymatic activity of the M2β 15HPZ myosins as compared to the WT S1 control (which is a historical control not assayed in parallel with the 15HPZ proteins), is concerning for the low protein quality of the 15HPZ myosins. The authors attribute the low kcat to the high proportion of SRX population in their ensembles. However, the DRX rates reported for the WT and E525K 15HPZ proteins in the single ATP turnover assay are ~3-4 fold lower than those of their S1 counterparts. These rates reflect basal turnover of ATP in the open state and thus should not be affected by the presence of the S2 tail, which leads to concerns about the 15HPZ protein activity. In addition, the very high percentage of stuck filaments in the in vitro motility assay for the 15HPZ constructs (despite the use of dark actin) is also concerning for significant amounts of enzymatically inactive protein.

      We thank the reviewer for pointing out the differences in the S1 and HMM DRX rates. We performed additional single turnover measurements with S1, adding two sets of measurements from one additional preparation (N=3), and we demonstrate that there is a significant increase in the DRX rates of WT S1 compared to WT HMM (see pages 4-5, Table 3, Figure 3 – figure supplement 3). A faster rate in S1 was also reported in Rohde et al. 2018. Indeed, the DRX rates of E525K S1 are significantly higher than those of WT S1, which we also now report in the results (see page 5, Figure 3 – figure supplement 3). We addressed the concerns about 15HPZ activity by performing NH4+ ATPase assays to demonstrate that the number of active heads was similar in S1 and 15HPZ HMM (see page 4). It is possible that the higher percentage of stuck filaments in the HMM motility is due to myosin heads in the IHM state on the motility surface, which generate a drag force by non-specifically interacting with actin, but further study is necessary to examine this question.

      The authors assert that the E525K mutation represents a new mechanism by which DCM-causing mutations lead to decreased contractility - by stabilizing the sequestered state rather than affecting motor function. However, there is no evaluation of the motor function (actin-activated ATPase activity or in vitro motility) of the E525K S1, which would reveal the effects of the mutation without confounding effects due to the sequestering of heads. Interestingly, in the single ATP turnover assay, the DRX rate of the E525K S1 is >2-fold higher than the WT control, suggesting that the mutation may have effects beyond stabilization of the SRX state. The conclusion that the E525K mutation's effect on myosin function is mediated via stabilization of the SRX state would be strengthened if the effects of the mutation on the motor domain alone were also known.

      We thank the reviewer for this suggestion. We performed actin-activated ATPase assays with WT and E525K S1 and found that E525K increases kcat and lowers KATPase, demonstrating enhanced intrinsic motor activity in the mutant S1 construct (see page 4, Figure 2B). This adds an interesting dimension to the manuscript because we report a mutant that enhances the intrinsic motor activity but stabilizes the SRX/IHM (see Discussion page 10). We did not perform in vitro motility, because this assay depends on the surface attachment strategy, and we would like to compare all constructs with the same attachment strategy using a C-terminal GFP tag (mutant and WT S1 and 15HPZ HMM). Therefore, we are making the S1 construct with a C-terminal GFP tag for this purpose, to be examined in a future study.

      While the authors show strong qualitative correlations between the SRX and IHM states using single ATP turnover, FRET, and EM experiments, attempts to quantitatively compare the fraction of heads in the IHM state using the various experimental approaches are problematic. For example, the R0 value of the FRET pair used here does not allow the distances being probed to be measured precisely, yet the distances are reported and compared to predicted distances. The authors report that the R0 for their FRET pair is 63 Å. Surprisingly, the authors go on to use the steady-state FRET efficiency values to determine the average D-A distance (Fig 5B), which is 100 Å when all heads are in the IHM configuration and becomes larger than that when heads open. An R0 of 63 Å allows a precise distance measurement to be made in the 31.5-94.5 Å range, which corresponds to 0.5-1.5 R0. It is therefore technically incorrect to use the steady-state FRET efficiency values to determine the D-A distance here. Besides, there are several unknown factors here, like the orientation factor (κ2), which further complicate these calculations. Similarly, the quantification of IHM state molecules from the negative stain EM experiments is significantly hampered by the disruptive effect of the grid surface on the structure of the IHM state. The authors find that limiting the contact time with the grid to ~ 5s is necessary to preserve the IHM state.

      Despite that, only ~15% WT molecules were seen in the IHM state at low salt (Fig. 6B). In contrast, ~56% E525K molecules were in the IHM state. Both these proteins have similar SRX proportions (Fig. 3C) and similar FRET efficiency values (Fig. 5A) at this salt concentration. This mismatch highlights the problem arising due to not having a measure of the populations from the FRET data. It is not clear if the hugely different proportions of the IHM state in EM experiments are indicative of the relative stability of this state in the two proteins or a random difference in the electrostatic interactions of WT vs mutant with the grid. These experiments do not provide a correct idea of the %IHM in the two proteins. In the absence of any IHM population measurement, it is important to proceed with caution when quantitatively correlating the SRX and IHM.

      We thank the reviewer for pointing out that measuring precise distances by FRET can be difficult. We agree that the low FRET efficiency makes precise distance determination even more challenging. However, FRET is quite good at measuring a change in distance given a specific donor-acceptor pair. We feel our FRET biosensor clearly demonstrates FRET efficiencies that are salt-insensitive in E525K but a clear decrease in FRET at higher salt concentrations in WT. In order to compare the trend in the predicted FRET, based on the single turnover measurements, and the actual FRET, we thought it was important to plot the two together on the same graph. We understand that this could have given the misleading impression that we were reporting actual distances. We have now plotted the FRET efficiency instead of distance as a function of KCl concentration (Figure 5B), to prevent any confusion with reporting distances. In addition, we have emphasized that the data are plotted to allow for a comparison of the trend in the single turnover and FRET data (see pages 6 and 10, Figure 5B).
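
      The distance-sensitivity argument in this exchange follows from the standard Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below evaluates it for the R0 of 63 Å and the distances quoted by the reviewer. The function name is hypothetical, and the calculation assumes a single fixed donor-acceptor distance, which is a simplification of an ensemble of open and closed heads.

```python
def fret_efficiency(distance_angstrom, r0_angstrom):
    """Forster relation: E = 1 / (1 + (r / R0)^6) for one donor-acceptor pair."""
    return 1.0 / (1.0 + (distance_angstrom / r0_angstrom) ** 6)


R0 = 63.0  # Forster distance reported for the GFP / Cy3ATP pair, in angstroms

# Bounds of the 0.5-1.5 R0 window, R0 itself, and the ~100 angstrom IHM distance.
for distance in (0.5 * R0, R0, 1.5 * R0, 100.0):
    efficiency = fret_efficiency(distance, R0)
    print(f"r = {distance:6.1f} A  ->  E = {efficiency:.3f}")
```

      At distances beyond roughly 1.5 R0 the efficiency is small and changes slowly with distance, so small measurement errors in E translate into large uncertainty in r. This is consistent with reporting trends in FRET efficiency rather than absolute distances.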

      We agree that it is important to proceed with caution when comparing the EM to the FRET and single turnover data. The EM does not give a quantitative estimate of the fraction of IHM molecules, due to the disruptive effect of the grid surface on protein conformation. However, it does provide direct (though qualitative) evidence that the conformation underlying SRX and enhanced FRET is the IHM, and it is consistent with our interpretation that the E525K mutation enhances FRET and SRX by stabilizing the IHM. To consolidate this result, we have now performed EM experiments with a total of 3 preparations of WT and mutant (see pages 6-7 and Figure 6D). We find that while there is variability from experiment to experiment, likely because the grid surface is slightly different each time the experiment is performed, in all cases there was a ~4-fold higher fraction of folded molecules in the mutant. Since each WT/mutant experimental pair was studied in parallel, using identically prepared grids, the results provide further evidence that the mutant stabilizes the IHM. However, we agree that a quantitative, direct visual correlation of the SRX and IHM is not possible based on the current EM data.

      Finally, the utility of the methods described in the paper to the field would be greatly enhanced if they were described in more detail. As currently written, it would be difficult for others to replicate these experiments.

      Thank you for the comment. We have made significant changes in the methods to clarify the details of the experiments (see pages 11-14). In addition, we have added details to the results and figure legends.

    1. Author Response

      Reviewer #1 (Public Review):

      “This study investigates the dynamics of brain network connectivity during sustained experimental pain in healthy human participants. To this end, capsaicin was applied to the tongues of two cohorts of participants (discovery cohort, N=48; replication cohort, N=74). This procedure resulted in pain for several minutes. During sustained pain, pain avoidance/intensity ratings and fMRI scans were obtained. The analyses (i) compare the pain state with a resting state, (ii) assess the dynamics of brain networks during sustained pain, and (iii) aim to predict pain based on the dynamics of brain networks. To this end, the analyses focus on community structures of time-evolving networks. The results show that sustained pain is associated with the emergence of a brain network including somatomotor, frontoparietal, basal ganglia and thalamic brain areas. The somatomotor area of the tongue is particularly involved in that network while this area is decoupled from other parts of the somatomotor cortex. Moreover, the network configuration changes over time with the frontoparietal network decoupling from the somatomotor network. Frontoparietal-cerebellar connections were predictive of decreases of pain. Together, the findings provide novel and convincing insights into the dynamics of brain network during sustained pain.

      Strengths

      • The brain mechanisms of sustained pain are a timely and relevant topic with potential clinical implications.

      • Assessing the dynamics of sustained pain and relating it to the dynamics of brain networks is a timely and promising approach to further the understanding of the brain mechanisms of pain.

      • The study includes discovery and replication cohorts and pursues a cutting-edge analysis strategy.

      • The manuscript is very well-written and the results are visualized in an exemplary manner including a graphical outline and summary of the findings.”

      We thank the reviewer for the thoughtful summarization and evaluation of our study.

      “Weaknesses

      • It remains unclear whether the changes of brain networks over time simply reflect the duration of sustained pain or whether they essentially reflect different levels of pain intensity/avoidance.”

      We appreciate the editor and reviewer’s comment on this issue. With the current experimental paradigm, it is difficult to dissociate the pain duration from the level of pain because the delivery of oral capsaicin commonly induces initial bursting and then a gradual decrease of pain over time. That is, the pain duration is correlated with the pain intensity in our task.

      However, when we examined the time-course of the ratings at each individual level (as shown in Figure S2), the time duration explained 53.7% of the rating variance, R2 = 0.537 ± 0.315 (mean ± standard deviation). In addition, if we constrain the beta coefficient of the time duration to be negative (i.e., ratings should decrease over time), the explained variance decreases to 48.2%, R2 = 0.482 ± 0.457, leaving us enough unexplained variance (i.e., greater than 50%) for examining the distinct effects of time duration and ratings on the patterns of functional brain reorganization.

      Indeed, the two main analyses included in the manuscript—consensus community detection and predictive modeling—were designed to examine those two aspects of the task, i.e., time duration and pain avoidance ratings, respectively. First, through the consensus community detection analysis, we examined the community structure that changes over time, i.e., across the early, middle, and late periods (as shown in Figure 3). We then developed predictive models of pain avoidance ratings in the second main analysis (as shown in Figure 5).

      Though it is still a caveat that we cannot fully dissociate the effects of time duration versus pain ratings, we could interpret the first set of results to be more about time duration, while the second set of results is more about pain ratings.

      We have now added a description of the implication of predictive modeling for isolating the effects of pain ratings. In addition, we have added a discussion of the caveat of the current experimental design and relevant future directions.
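
      The variance-explained figures quoted in this response can be made concrete with a small sketch of a per-participant linear fit of ratings on time. The response does not describe the exact fitting procedure, so the function below is an assumption: it uses ordinary least squares with an intercept and, optionally, clips the slope at zero to mimic the constraint that ratings should not increase over time. The rating series are invented for illustration.

```python
def r_squared(times, ratings, force_nonpositive_slope=False):
    """R^2 of a straight-line fit of ratings on time (least squares with intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(ratings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, ratings))
    slope = sxy / sxx
    if force_nonpositive_slope:
        # Assumed constraint: ratings may only stay flat or decrease over time.
        slope = min(slope, 0.0)
    intercept = mean_r - slope * mean_t
    ss_res = sum((r - (intercept + slope * t)) ** 2 for t, r in zip(times, ratings))
    ss_tot = sum((r - mean_r) ** 2 for r in ratings)
    return 1.0 - ss_res / ss_tot


times = [0, 1, 2, 3, 4, 5]                        # hypothetical time bins
decreasing = [0.90, 0.80, 0.75, 0.50, 0.45, 0.20]  # hypothetical participant A
increasing = [0.10, 0.20, 0.30, 0.50, 0.60, 0.90]  # hypothetical participant B

print(r_squared(times, decreasing))
print(r_squared(times, decreasing, force_nonpositive_slope=True))
print(r_squared(times, increasing, force_nonpositive_slope=True))
```

      Under this assumed procedure, a participant whose ratings decline is unaffected by the constraint, whereas a participant whose ratings rise contributes an R2 of zero, which would lower the group mean and widen its spread.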

      Revisions to the main manuscript:

      p. 25: Moreover, developing models to directly predict the pain ratings is helpful to complement the group-level analysis, because the changes in consensus community structure over the early, middle, and late periods only indirectly reflect the different levels of pain.

      p. 27: This study also had some limitations. First, with the current experimental paradigm, it is difficult to dissociate the pain duration from the level of pain because the delivery of oral capsaicin commonly induces initial bursting and then a gradual decrease of pain over time. Though we aimed to model the effects of pain duration and pain avoidance ratings with our two primary analyses, i.e., consensus community detection and predictive modeling, we cannot fully dissociate the impact of time duration versus pain ratings.

      “• Although the manuscript is very well-written it might benefit from an even clearer and simpler explanation of what the consensus community structure and the underlying module allegiance measure assesses.”

      We thank you for the suggestion. We have now added additional (but simple) descriptions of the module allegiance and consensus community detection methods.

      Revisions to the main manuscript:

      pp. 8-9: Here, the consensus community means the group-level representative structures of the distinct community partitions of individuals. To determine the consensus community across different individuals and times, we first obtained the module allegiance (Bassett et al., 2011) from the community assignment of each individual. Module allegiance assesses how much a pair of nodes is likely to be affiliated with the same community label, and is defined as a matrix T whose element Tij is 1 when nodes i and j are assigned to the same community and 0 when assigned to different communities. This conversion of the categorical community assignments to the continuous module allegiance values allows group-level summarization of different community structures of individuals.

      p. 14: Here, high module allegiance indicates the voxels of two regions are likely to be in the same community affiliation, and vice versa.
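
      The module allegiance definition quoted above can be sketched in a few lines. The function names and the toy community labels are hypothetical, and averaging the matrices across individuals is shown only as one plausible way to obtain a group-level summary; the sketch stops before any community detection is run on the averaged matrix.

```python
def module_allegiance(labels):
    """Matrix T with T[i][j] = 1 if nodes i and j share a community label, else 0."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]


def mean_allegiance(label_sets):
    """Element-wise average of allegiance matrices over individuals (assumed summary)."""
    matrices = [module_allegiance(labels) for labels in label_sets]
    n = len(label_sets[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]


# Toy community assignments for four nodes in three hypothetical individuals.
# The third individual has the same grouping as the first, with different label names.
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

group_allegiance = mean_allegiance(partitions)
for row in group_allegiance:
    print([round(value, 2) for value in row])
```

      Because allegiance depends only on whether two nodes share a label, not on what the label is called, partitions from different individuals can be combined even though their community labels are arbitrary.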

      “• The added value of the assessment of the dynamics of brain networks remains unclear. Specifically, it is unclear whether the current analysis of brain networks dynamics allows for a clearer distinction between and prediction of pain and no-pain states than other measures of static or dynamic brain activity or static measures of brain connectivity.”

      The main goal (and thus, the added value) of the current study was to provide a “mechanistic” understanding of the brain processes of sustained pain, rather than the “prediction.” Even though we included the results from the predictive modeling, as in Figures 4-6, our focus was more on the interpretation of the model to quantitatively examine the functional changes in the brain, not on the maximization of the prediction performance.

      Indeed, maximizing the prediction performance was the main goal of our previous study (Lee et al., 2021), in which we developed a predictive model of sustained pain based on the patterns of dynamic functional connectivity. The model showed better prediction performances compared to the current study, but it was challenging to interpret the model because of the high dimensionality of the model and its features. In addition, functional connectivity itself provides only limited insight into how functional brain networks are structured and reconfigured over time.

      In this sense, the multi-layer community detection method has several advantages for achieving our goal. First, the community detection analysis allows us to summarize the complex, high-dimensional whole-brain connectivity patterns into neurobiologically interpretable subsystems. Second, the multi-layer community detection method allows us to study the temporal changes in community structure by connecting the same nodes across different time points.

      We have now added a description of the rationale for choosing the multi-layer community detection analysis over conventional functional connectivity methods, and of the added value of our study.

      Revisions to the main manuscript:

      p. 3: In this study, we examined the reconfiguration of whole-brain functional networks underlying the natural fluctuation in sustained pain to provide a mechanistic understanding of the brain responses to sustained pain.

      p. 7: In this study, we used this approach to examine the temporal changes of brain network structures during sustained pain, which cannot be done with conventional functional connectivity-based analyses (Lee et al., 2021).

      p. 27: However, the previous model provides a limited level of mechanistic understanding because of the high dimensionality of the model and its features. In addition, functional connectivity itself provides only limited insight into how functional brain networks are structured and reconfigured over time.

      Reviewer #2 (public Review):

      “The Authors J-J Lee et al., investigated cortical and subcortical brain networks and their organization in communities over time during evoked tonic pain. The paper is well-written, and the findings are interesting and relevant for the field. Interestingly, other than confirming well known phenomena (e.g., segregation within the primary somatomotor cortex) the Authors identified an emerging "pain supersystem" during the initial increase of pain, in which subcortical and frontoparietal regions, usually more segregated, showed more interactions with the primary somatomotor cortex. Decrease of pain was instead associated to a reconfiguration of the networks that sees subcortical and frontoparietal regions connected with areas of the cerebellum. The main novelty of the proposed analysis, lies in the resulting high performances of the classifier, that shows how this interesting link between frontoparietal network and subcortical regions with the cerebellum, is predictive of pain decrease. In summary, the main strengths of the present manuscript are: • Inclusion of subcortical regions: most of the recent papers using the Shaefer parcellation in ~200 brain areas1, do not consider subcortical areas, ignoring possible relevant responses and behaviors of those regions. Not only the Authors smartly addressed this issue, but most of their results showed how subcortical regions played a key role in the networks reconfiguration over time during evoked sustained pain.

      • Robust classification results: high accuracy obtained on training dataset (internal validation), using a leave-one-out approach, and on the available independent test dataset (external validation) of relatively large sample size (N=74).

      • Clarity in the description of aim and sub-aims and exhaustive presentation of the obtained results helped by appropriate illustrations and figures (I suggest less wording in some of them).

      • Availability of continuous behavioral outcome (track ball).”

      We appreciate the reviewer’s summary and positive evaluations.

      “Even though the results are mostly cohesive with previous literature, some of the results need to be discussed in relationship to recently published papers on the same topic as well as justifying some of the non-standard methodological procedures adding appropriate citations (or more detailed comments). The Authors do not touch upon the concept of temporal summation of pain, historically associated with tonic pain, especially when the study is finalized to better understanding brain mechanisms in chronic pain populations (chronic pain patients often exhibit increased temporal summation of pain2). I would suggest starting from the paper recently published by Cheng et al. that also shares most of the methodological pipeline3 to highlight similarities and novelties and deepen the comparison with the associated literature.”

      We thank the reviewer and editor for the comment on this important topic. Temporal summation of pain refers to a progressive increase in pain sensation during prolonged noxious stimulation (Price, Hu, Dubner, & Gracely, 1977), and has been suggested as a hallmark of chronic pain disorders including fibromyalgia (Cheng et al., 2022; Price et al., 2002). In a recent study by Cheng et al. (2022), the authors induced tonic pain using constantly high cuff pressure and examined whether the participants experienced increased pain in the late period compared to the early period of pain. In contrast, in our experimental paradigm, the capsaicin liquid initially delivered into the oral cavity is gradually washed out by saliva, and thus overall pain intensity decreased over time rather than increased (Figure 1B). Therefore, the temporal summation of pain may occur in a limited period (e.g., the early period of the run), but it is difficult to examine its effect systematically in our study.

      However, it is notable that Cheng et al.’s results overlap with our findings. For example, Cheng et al. reported the intra-network segregation within the somatomotor network and the inter-network integration between the somatomotor and other networks during the temporal summation of pressure pain in patients with fibromyalgia, which were similar to the findings we reported in Figure S9 and Figure 4. Although it is unclear whether these results reflect the temporal summation of pain, these network-level features shared across the two studies are likely to be an essential component of the sustained pain processes in the brain.

      We have now added a comment on the temporal summation of pain to the main manuscript.

      Revisions to the main manuscript (p. 26):

      Interestingly, a recent fMRI study on the temporal summation of pain in fibromyalgia patients reported results similar to ours (Cheng et al., 2022), including the intra-network dissociation within the somatomotor network and the inter-network integration between the somatomotor and other networks during pain. Although we cannot directly examine whether the temporal summation of pain gave rise to these network-level changes due to the limitation of our experimental paradigm, these consistent findings between the two studies may suggest that our findings could be generalized to clinical conditions.

      We thank the reviewer and editor for the information about this recent publication. Cheng et al. (2022) had not been published when we wrote the manuscript, and we were surprised that it shares many aspects with our study; e.g., both studies used multilayer community detection and reported similar findings, as described above.

      However, there were some differences between the two studies as well.

      First, the focus of our study was on the brain dynamics during the natural time-course of sustained pain from its initiation to remission in healthy participants, whereas the focus of Cheng et al. was on the temporal summation of pain (TSP) and the enhanced TSP in patients with fibromyalgia. Because of this difference in research focus, our study and Cheng et al. provide many nonoverlapping results and insights. For example, our study paid particular attention to the coping mechanisms of the brain (e.g., the network-level changes in the subcortical and frontoparietal network regions) and the brain systems that are correlated with the natural decrease of pain (e.g., the cerebellum in Figure 5). In contrast, Cheng et al. (2022) identified the brain connectivity and network features important for the increased TSP in fibromyalgia patients.

      Second, we were particularly interested in identifying and visualizing the fine-grained spatiotemporal patterns of functional brain network changes over the period of sustained pain. To utilize fine-grained brain activity information, we conducted our main analyses at a voxel-level resolution and in the native brain space, such as in Figures 2-3 and Figures S5, S7, and S8. With this fine-grained spatiotemporal mapping, we were able to identify small but important voxel-level dynamics.

      We have now cited Cheng et al. (2022) in multiple places and revised the manuscript accordingly.

      Revisions to the main manuscript (p. 26):

      Interestingly, a recent fMRI study on the temporal summation of pain in fibromyalgia patients reported results similar to ours (Cheng et al., 2022), including the intra-network dissociation within the somatomotor network and the inter-network integration between the somatomotor and other networks during pain. Although we cannot directly examine whether the temporal summation of pain gave rise to these network-level changes due to the limitation of our experimental paradigm, these consistent findings between the two studies may suggest that our findings could be generalized to clinical conditions.

      “Here the main significant weaknesses of the study:

      • The data analysis is entirely conducted on young healthy subjects. This is not a limitation per se, but the conclusion about offering new insights into understanding mechanisms at the basis of chronic pain is too far from the results. Centralization of pain is very different from summation and habituation, especially if all the subjects in the study consistently rated increased and decreased pain in the same way (it never happens in chronic pain patients). A similar pipeline has been actually applied to chronic pain patients (fibromyalgia and chronic back pain)3,4. Discussing the results of the present paper in relationship to those, could offer a more robust way to connect the Authors' results to networks behavior in pathological brains.”

      We are grateful for the opportunity to discuss the clinical implication of our study. First of all, we agree with the reviewer and editor that we cannot make a definitive claim about chronic pain with the current study, and thus, we revised the last sentence of the abstract to tone down our claim.

      Revisions to the main manuscript (p. 2, in the abstract):

      This study provides new insights into how multiple brain systems dynamically interact to construct and modulate pain experience, advancing our mechanistic understanding of sustained pain.

      However, as we noted above in E-4, some of our findings were consistent with the findings from a previous clinical study (Cheng et al., 2022), suggesting the potential to generalize our study to clinical pain conditions. In addition, we previously reported that a predictive model of sustained pain derived from healthy participants performed better at predicting the pain severity of chronic pain patients than the model derived directly from chronic pain patients (Lee et al., 2021), highlighting the advantage of the “component process approach.”

      The component process approach aims to develop brain-based biomarkers for basic component processes first, which can then serve as intermediate features for the modeling of multiple clinical conditions (Woo, Chang, Lindquist, & Wager, 2017). This has been one of the core ideas of the Research Domain Criteria (RDoC) (Insel et al., 2010) and the Hierarchical Taxonomy of Psychopathology (HiTOP) (Kotov et al., 2017). If the clinical pain of a patient group is modeled as a whole, it becomes unclear what is being modeled because of the multidimensional and heterogeneous nature of clinical pain (Melzack, 1999) as well as other co-occurring health conditions (e.g., mental health issues, medication use, etc.). The component process approach, in contrast, can specify which components are being modeled and are relatively free from heterogeneity and comorbidity issues by experimentally manipulating the specific component of interest in healthy participants.

      The current study was conducted on healthy young adults based on the component process approach. We used oral capsaicin to experimentally induce sustained pain, which unfolds over protracted time periods and has been suggested to reflect some of the essential features of clinical pain (Rainville, Feine, Bushnell, & Duncan, 1992; Stohler & Kowalski, 1999). Therefore, the detailed characterization of the brain processes of sustained pain will be able to serve as an intermediate feature of multiple clinical conditions in future studies.

      We have now added a discussion of the clinical generalizability issue to the Discussion section.

      Revisions to the main manuscript:

      pp. 25-26: An interesting future direction would be to examine whether the current results can be generalized to clinical pain. Experimental tonic pain has been known to share similar characteristics with clinical pain (Rainville et al., 1992; Stohler & Kowalski, 1999). In addition, in a recent study, we showed that an fMRI connectivity-based signature for capsaicin-induced orofacial tonic pain can be generalized to chronic back pain (Lee et al., 2021). Therefore, a detailed characterization of the brain responses to sustained pain has the potential to provide useful information about clinical pain.

      p. 26: Interestingly, a recent fMRI study on the temporal summation of pain in fibromyalgia patients reported results similar to ours (Cheng et al., 2022), including the intra-network dissociation within the somatomotor network and the inter-network integration between the somatomotor and other networks during pain. Although we cannot directly examine whether the temporal summation of pain gave rise to these network-level changes due to the limitation of our experimental paradigm, these consistent findings between the two studies may suggest that our findings could be generalized to clinical conditions.

      “Vice versa, the behavioral measure used to assess evoked pain perception (avoidance ratings), has been developed for chronic pain patients and never validated on healthy controls5. It might not be an appropriate measure considering the total absence of pain variability in the reported responses over forty-eight subjects6,7.”

      We acknowledge that pain avoidance measures are not fully validated in the healthy population. Nevertheless, we used this measure in this study for the following two main reasons that outweigh the limitations.

      First, a pain avoidance rating provides an integrative measure that can reflect the multi-dimensional aspects of sustained pain. One of the essential functions of pain is to avoid harmful situations and promote survival, and the avoidance motivation induced by pain is composed of not only sensory-discriminative, but also cognitive components including learning, valuation, and contexts (Melzack, 1999). According to the fear-avoidance model (Vlaeyen & Linton, 2012), if the pain-induced avoidance motivation is not resolved for a long time and is maladaptively associated with innocuous environments, chronic pain is likely to develop, suggesting the importance and clinical relevance of pain avoidance measures. In addition, our experimental design is particularly suitable for the use of avoidance ratings because the oral capsaicin stimulation is accompanied by the urge to avoid the painful sensation, which, as in chronic pain, cannot be resolved immediately. Moreover, capsaicin is sometimes experienced as intense but less aversive (or even appetitive), e.g., by people who crave spicy food (Stevenson & Yeomans, 1993). In this case, avoidance ratings can provide a more reasonable measure of pain than intensity ratings.

      Second, the avoidance measure provides a common scale on which we can compare different types of aversive experiences, allowing us to conduct specificity tests for a predictive model of pain. For example, a recent study successfully compared the brain representations of two types of pain and two types of aversive, but non-painful experiences (e.g., aversive auditory and visual experiences) using the same avoidance measure (Ceko, Kragel, Woo, Lopez-Sola, & Wager, 2022). These comparisons were possible because the avoidance measure provided one common scale for all the aversive experiences regardless of their types of stimuli.

      To provide a better justification for the use of the avoidance measure, we now included the specificity test results of our pain predictive models. More specifically, we tested our module allegiance-based SVM and PCR models of pain on the aversive taste and aversive odor conditions (Figure S13).

      Despite these advantages, the use of avoidance rating without thorough validation is a limitation of the current study, and thus future studies need to examine the psychometric properties of the avoidance rating, e.g., examining the relationship among pain intensity, unpleasantness, and avoidance measures. However, the current study showed that the predictive models derived with pain avoidance rating (Study 1) could be used to predict the pain intensity rating (Study 2). In addition, the overall time-course of pain avoidance ratings in Study 1 was similar to the time-course of pain intensity ratings in Study 2, providing some supporting evidence for the convergent validity of the pain avoidance measure.

      As to the following comment, “It might not be an appropriate measure considering the total absence of pain variability in the reported responses over forty-eight subjects,” there is evidence that the low between-individual variability of ratings is due to the characteristics of our experimental design, not to our use of the avoidance measure. As we discussed in more detail in our response to E-1, our experimental procedure based on capsaicin liquid commonly induces an initial burst of painful sensation followed by gradual relief in most participants (Figure 1B, left). A similar time-course pattern of ratings was observed in Study 2 (Figure 1B, right), which used the pain “intensity” rating, not the pain avoidance rating. In addition, previous studies with a similar experimental design (i.e., intra-oral capsaicin application) (Berry & Simons, 2020; Lu, Baad-Hansen, List, Zhang, & Svensson, 2013; Ngom, Dubray, Woda, & Dallel, 2001) also showed a similar time-course of pain ratings with low between-individual variability regardless of the rating types (e.g., VAS or irritation intensity), confirming that this observation is not unique to the pain avoidance rating.

      We have now added descriptions of the small between-individual variability of pain ratings and the use of avoidance ratings.

      Revisions to the main manuscript:

      pp. 5-7: Note that the overall trend of pain ratings over time was similar across participants because of the characteristics of our experimental design, which has also been observed in previous studies that used oral capsaicin (Berry & Simons, 2020; Lu et al., 2013; Ngom et al., 2001). However, also note that individual participants’ time-courses of pain ratings were not entirely the same (Figures S2 and S3).

      p. 26: However, there are also differences between the characteristics of capsaicin-induced tonic pain versus clinical pain. For example, clinical pain continuously fluctuates over time in an idiosyncratic pattern (Apkarian, Krauss, Fredrickson, & Szeverenyi, 2001), whereas capsaicin-induced tonic pain showed a similar time-course pattern across the participants—i.e., increasing rapidly and then decreasing gradually (Figure 1B). This typical time-course of pain ratings has been reported in previous studies that used oral capsaicin (Berry & Simons, 2020; Lu et al., 2013; Ngom et al., 2001).

      pp. 26-27: Note that Study 1 used a pain avoidance measure that is not yet fully validated in healthy participants. However, we chose to use the pain avoidance measure, which can provide integrative information on the multi-dimensional aspects of pain (Melzack, 1999; Waddell, Newton, Henderson, Somerville, & Main, 1993). It also has a clinical implication considering that the maladaptive associations of pain avoidance to innocuous environments have been suggested as a putative mechanism of transition to chronic pain (Vlaeyen & Linton, 2012). Lastly, the avoidance measure can provide a common scale across different modalities of aversive experience, allowing us to compare their distinct brain representations (Ceko et al., 2022) or test the specificity of their predictive models (Lee et al., 2021) (Figure S13). Although the psychometric properties of the pain avoidance measure should be a topic of future investigation, we expect that the pain avoidance measure would have a high level of convergent validity with pain intensity given the observed similarity between pain avoidance (Study 1) and pain intensity (Study 2) in their temporal profiles. The generalizability of our PCR model across Studies 1 and 2 also supports this speculation. However, there would also be situations in which pain avoidance is dissociated from pain intensity. For example, capsaicin can be experienced to be intense but less aversive or even appetitive in some contexts, such as cravings for spicy food (Stevenson & Yeomans, 1993). In addition, the gradual rise of avoidance ratings during the late period of the control condition in Study 1 would not be observed if the intensity measure was used. Future studies need to examine the relationship between pain avoidance and the other pain assessments and the advantage of using the pain avoidance measure.

      “• The dynamic measure employed by the Authors is better described from the term "windowed functional connectivity". It is often considered a measure of dynamic functional connectivity and it gives information about fluctuations of the connectivity patterns over time. Nevertheless, the entire focus of the paper, including the title, is on dynamic networks, which inaccurately leads one to think of time-varying measures with higher temporal resolution (either updating for every acquired time point, as the Authors did in their previous publication on the same dataset4, or sliding windows involving weighting or tapering8,9). This allows one to follow network reorganization over time without averaging 2-min intervals in which several different brain mechanisms might play an important role3,10,11. In summary, the assumption of constant response throughout 2-min periods of tonic pain and the use of Pearson correlations do not mirror the idea of dynamic analysis expressed by the Authors in title and introduction. I would suggest removing "dynamic" from the title, reduce the emphasis on this concept, address possible confounds introduced by the choice of long windows and rephrase the aim of the study in terms of brain network reconfiguration over the main phases of tonic pain experience.”

      We have now removed the word ‘dynamic’ from many places in the manuscript, including the title. In addition, we added a brief discussion of why we chose long, non-overlapping windows for the connectivity calculation.

      Revisions to the main manuscript (p. 8):

      Although the long duration of the time window without overlaps may obscure the fine-grained temporal dynamics in functional connectivity patterns, we chose to use this long time window based on previous literature (Bassett et al., 2011; Robinson, Atlas, & Wager, 2015), which also used long time windows to obtain more reliable estimates of network structures and their transitions.
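
      To make the windowing choice concrete, the sketch below computes one Pearson correlation matrix per non-overlapping time window. It is only an illustration: the function names, the `window_len` parameter (given in samples), and the toy time series are hypothetical, and the manuscript's actual preprocessing and window length in samples are not specified here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def windowed_connectivity(timeseries, window_len):
    """Return one correlation matrix per non-overlapping window.

    timeseries: list of region time series (all the same length).
    Trailing samples that do not fill a complete window are dropped.
    """
    n_regions = len(timeseries)
    n_windows = len(timeseries[0]) // window_len
    matrices = []
    for w in range(n_windows):
        start, stop = w * window_len, (w + 1) * window_len
        segments = [ts[start:stop] for ts in timeseries]
        matrices.append([
            [1.0 if i == j else pearson(segments[i], segments[j])
             for j in range(n_regions)]
            for i in range(n_regions)
        ])
    return matrices


# Toy example: two regions, eight samples, two windows of four samples each.
region_a = [1, 2, 3, 4, 1, 2, 3, 4]
region_b = [2, 4, 6, 8, 4, 3, 2, 1]
fc = windowed_connectivity([region_a, region_b], window_len=4)
print(len(fc))                   # number of windows
print(fc[0][0][1], fc[1][0][1])  # coupling flips sign between the two windows
```

      Each window yields a single connectivity estimate, so any change that happens within a window is averaged out; this is the trade-off between reliability and temporal resolution that the paragraph above acknowledges.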

      “• Procedure chosen for evoking sustained pain. To the best of my knowledge, capsaicin sauce on the tongue is not a validated tonic pain procedure. In favor of this argument is the absence of inter-subject variability in the behavioral results showed in the paper, very unusual for response to painful stimulations. The procedure is well described by the Authors, and some precautions like letting the liquid drying before the start of the scan, have helped reducing confounds. Despite this, the measures in figure 1B suggest that the intensity of the painful stimulation is not constant as expected for sustained pain (probably the effect washes out with the saliva). In this case, the first six-minute interval requires particular attention because it encapsulates the real tonic pain phase, and the following ones require more appropriate labels. Ideally the Author should cite previous studies showing that tongue evoked pain elicits a very specific behavioral response (summation, habituation/decrease of pain, absence of pain perception). If those works are missing, this response needs to be treated as a finding rather than an obvious point.”

      We have addressed this comment. In addition, we found previous studies that experimentally induced tonic pain through the application of capsaicin on the tongue (Berry & Simons, 2020; Boudreau, Wang, Svensson, Sessle, & Arendt-Nielsen, 2009; Green, 1991; Ngom et al., 2001), suggesting that our experimental procedure is in line with previous literature.

      Reviewer #3 (Public Review ):

      “In their manuscript, Lee and colleagues explore the dynamics of the functional community structure of the brain (as measured with fMRI) during sustained experimental pain and provide several potentially highly valuable insights into, and evaluate the predictive capacity of, the underlying dynamic processes. The applied methodology is novel but, at the same time, straightforward and has solid foundations. The findings are very interesting and, potentially, of high scientific impact as they may significantly push the boundaries of our understanding of the dynamic neural processes during sustained pain, with a (somewhat limited) potential for clinical translation.

      However (Major Issue 1), after reading the current manuscript version, not all of my doubts have been dissolved regarding the specificity of the results to pain. Moreover (Major Issue 2), some of the results (specifically, those related to the group level analysis of community differences) do not seem to be underpinned with a proper statistical inference in the current version of the manuscript and, therefore, their presentation and discussion may not be proportional to the degree of evidence. Next to these Major Issues (detailed below), some other, minor clarifications might also be needed before publication. These are detailed below or in the private part of the review ("Recommendations for the authors").

      Despite these issues, this is, in general, a high quality work with a high level of novelty and - after addressing the issues - it has a very high potential for becoming an important contribution (and a very interesting read) to the pain-research community and beyond.”

      We appreciate the reviewer’s thoughtful comments. We have revised the manuscript to address the Reviewer’s major concerns, as described below.

      “Major Issue 1:

      The main issue with the manuscript is that it remains somewhat unclear, how specific the results are to pain.

      Differences between the control resting state and the capsaicin trials might be - at least partially - driven by other factors, like:

      • motion artifacts

      • saliency, attention, anxiety, etc.

      Differences between stages over the time-course might, additionally, be driven by scanner drifts (to which the applied approach might be less sensitive, but the possibility is still there ) or other gradual processes, e.g. shifts in arousal, attention shifts, alertness, etc.

      All the above factors might emerge as confounding bias in both of the predictive models.

      This problem should be thoroughly discussed, and at least the following extra analyses are recommended, in order to attenuate concerns related to the overall specificity and neurobiological validity of the results:

      • reporting of, and testing for motion estimates (mean, max, median framewise displacement or anything similar)

      • examining whether these factors might, at least partially, drive the predictive models.

      • e.g. applying the PCR model on the resting state data and verifying of the predicted timecourse is flat (no inverse U-shape, that is characteristic to all capsaicin trials).

      Not using the additional sessions (bitter taste, aversive odor, phasic heat) feels like a missed opportunity, as they could also be very helpful in addressing this issue.”

      We thank the reviewer for this comment on the important issue regarding the specificity of our results and the potential influences of noise. The effects of head motion and physiological confounds are particularly relevant to pain studies because pain involves substantial physiological changes and often causes head motion. To address the related concerns of specificity, we conducted additional analyses assessing the independence of our predictive models (i.e., SVM and PCR models) from head movement and physiology variables and the specificity of our models to pain versus non-painful aversive conditions (i.e., bitter taste and aversive odor) in Study 1.

      First, we examined the overall changes in framewise displacement (FD) (Power, Barnes, Snyder, Schlaggar, & Petersen, 2012), heart rate (HR), and respiratory rate (RR) in the capsaicin condition (Figure S11). For the univariate comparison between the capsaicin and control conditions (Figure S11A), the results showed that, as expected, the capsaicin condition caused significant changes in head motion and autonomic responses. The mean FD and HR were significantly higher in the capsaicin condition than in the control condition, and the RR was lower, although the RR difference did not reach significance (FD: t₄₇ = 5.30, P = 2.98 × 10⁻⁶; HR: t₄₃ = 4.98, P = 1.10 × 10⁻⁵; RR: t₄₃ = -1.91, P = 0.063, paired t-test). In addition, the increased motion and autonomic responses were more prominent in the early period of pain (Figure S11B). The 10-binned (2 min per time-bin) FD and HR showed a decreasing trend while the RR showed an increasing trend over time in the capsaicin condition. The comparisons between the early (bins 1-3, 0-6 min) and late (bins 8-10, 14-20 min) periods of the capsaicin condition showed significant differences for both FD and HR, but not for RR (FD: t₄₇ = 6.45, P = 8.12 × 10⁻⁸; HR: t₄₃ = 6.52, P = 6.41 × 10⁻⁸; RR: t₄₃ = -1.61, P = 0.11, paired t-test). These results suggest that while participants were experiencing capsaicin tonic pain, particularly during the early period, head motion and heart rate increased, while breathing tended to slow down. Note that we had to exclude 4 participants’ data in this analysis due to technical issues with the physiological data acquisition.
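
      For readers unfamiliar with the motion measure, the sketch below computes framewise displacement in the sense of Power et al. (2012): the sum of absolute volume-to-volume changes in the six realignment parameters, with rotations converted to millimeters as arc length on a 50 mm sphere. The input layout (translations in mm, rotations in radians), the function name, and the toy values are assumptions for illustration and do not describe the authors' pipeline.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Compute framewise displacement (FD) per volume.

    motion: list of 6-tuples (tx, ty, tz in mm; rx, ry, rz in radians),
            one tuple per volume (assumed layout).
    Returns a list of FD values in mm; the first volume is assigned 0.0.
    """
    fd = [0.0]
    for prev, cur in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], cur[:3]))
        rotation = sum(abs(c - p) for p, c in zip(prev[3:], cur[3:]))
        # Arc length on a sphere of radius_mm converts radians to millimeters.
        fd.append(translation + radius_mm * rotation)
    return fd


# Toy realignment parameters for three volumes.
motion = [
    (0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0, 0.0, 0.0),    # 0.1 mm translation along x
    (0.1, 0.0, 0.0, 0.002, 0.0, 0.0),  # 0.002 rad rotation about x
]
fd = framewise_displacement(motion)
mean_fd = sum(fd) / len(fd)
print(fd, mean_fd)
```

      A per-run or per-bin mean of these values is the kind of summary that could then be compared between conditions or between early and late periods.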

      Next, we examined whether the changes in head motion and physiological responses influenced our predictive model performance (Figure S12). We first regressed out the mean FD, HR, and RR (concatenated across conditions and participants as we trained the SVM model) from the predicted values of the SVM model with leave-one-subject-out cross-validation (2 conditions × 44 participants = 88) and then calculated the classification accuracy again (Figure S12A). The SVM model showed a reduced, but still significant, classification accuracy for the capsaicin versus control conditions in a forced-choice test (n = 44, accuracy = 89%, P = 1.41 × 10⁻⁷, binomial test, two-tailed). We also did the same analysis for the PCR model (10 time-bins × 44 participants = 440), and the PCR model also showed a significant prediction performance (n = 44, mean prediction-outcome correlation r = 0.20, P = 0.003, bootstrap test, two-tailed, mean squared error = 0.159 ± 0.022 [mean ± s.e.m.]) (Figure S12B). These results suggest that our SVM and PCR models capture unique variance in tonic pain above and beyond the head movement and physiological changes.
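      The control analysis described above can be sketched in three steps: residualize the model outputs on the confounds, score a forced-choice comparison within each participant, and test the accuracy against chance. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, ordinary least squares with an intercept is assumed for the regression, and 39 of 44 correct is assumed as the count behind the reported 89%.

```python
from math import comb

def residualize(y, confounds):
    """Remove the least-squares fit of an intercept plus confounds from y.

    confounds is a list of columns (e.g. [fd, hr, rr]), each as long as y.
    Assumes the confounds are not perfectly collinear.
    """
    n = len(y)
    X = [[1.0] + [float(col[i]) for col in confounds] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    M = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    return [y[r] - sum(b * x for b, x in zip(beta, X[r])) for r in range(n)]

def forced_choice_accuracy(pain_scores, control_scores):
    """Fraction of participants whose pain score exceeds their control score."""
    pairs = list(zip(pain_scores, control_scores))
    return sum(p > c for p, c in pairs) / len(pairs)

def binomial_two_tailed(k, n):
    """Exact two-tailed binomial P-value against chance (p = 0.5)."""
    observed = comb(n, k)
    extreme = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return extreme / 2 ** n

# Assumption: 39 correct out of 44 (about 89%).
p_value = binomial_two_tailed(39, 44)
```

      Under that assumed count, the exact two-tailed P-value is approximately 1.41 × 10⁻⁷, in line with the value reported above.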

      Lastly, we examined the specificity of our predictive models to pain by testing the models on the non-painful but aversive conditions, namely the bitter taste (induced by quinine) and aversive odor (induced by fermented skate) conditions (Figure S13). All the model responses were obtained using leave-one-participant-out cross-validation. The overall model responses of the SVM model for the bitter taste and aversive odor conditions were higher than those for the control condition but lower than those for the capsaicin condition (Figure S13A). Classification accuracies for capsaicin vs. bitter taste and capsaicin vs. aversive odor were both significant (for capsaicin vs. bitter taste, accuracy = 79%, P = 6.17 × 10⁻⁵, binomial test, two-tailed, Figure S13C; for capsaicin vs. aversive odor, accuracy = 83%, P = 3.31 × 10⁻⁶, binomial test, two-tailed, Figure S13E), supporting the specificity of our SVM model of pain. Similarly, the model responses of the PCR model for the bitter taste and aversive odor conditions were lower than those for the capsaicin condition, and their temporal trajectories were less steep and fluctuating compared to the capsaicin condition (Figure S13B). The time-course of the model responses for the control condition was flatter than all other conditions and did not show the inverted U-shape. Furthermore, the model responses of the bitter taste and aversive odor conditions did not show significant correlations with the actual avoidance ratings (bitter taste: mean prediction-outcome correlation r = 0.05, P = 0.41, bootstrap test, two-tailed, mean squared error = 0.036 ± 0.006 [mean ± s.e.m.], Figure S13D; aversive odor: mean prediction-outcome correlation r = 0.12, P = 0.06, bootstrap test, two-tailed, mean squared error = 0.044 ± 0.004 [mean ± s.e.m.], Figure S13F), suggesting the specificity of the PCR model to pain.

      Overall, we have provided evidence that our models can predict pain ratings above and beyond the head motion and physiological changes and that the models are more responsive to pain compared to non-painful aversive conditions.

      We have now added descriptions of the specificity tests to the main manuscript and to the Supplementary Information.

      Revisions to the main manuscript (p. 20):

      Specificity of the module allegiance-based predictive models. To examine whether the predictive models were specific to pain and the prediction performances were not influenced by confounding variables such as head motion and physiological changes, we conducted additional analyses as shown in Figures S11-13. The SVM and PCR models showed significant prediction performances even after controlling for head motion (i.e., framewise displacement) and physiological responses (i.e., heart rate and respiratory rate) (Figures S11 and S12) and did not respond to the non-painful but aversive conditions including the bitter taste and aversive odor conditions (Figure S13), supporting the specificity of our predictive models to pain. For details, please see Supplementary Results.

      Revisions to the Supplementary Information (pp. 2-4):

      Specificity analysis (Figures S11-13). To examine whether the predictive models (i.e., SVM and PCR models) were specific to pain and not influenced by confounding noise, we conducted additional specificity analyses assessing the independence of the models from head movement and physiology variables and the specificity of our models to pain versus non-painful aversive conditions (i.e., bitter taste and aversive odor) in Study 1.

      First, we examined the overall changes of framewise displacement (FD) (Power et al., 2012), heart rate (HR), and respiratory rate (RR) in sustained pain (Figure S11). For the univariate comparison between the capsaicin vs. control conditions (Figure S11A), the results showed that, as expected, the capsaicin condition caused significant changes in motion and autonomic responses. The mean FD and HR were significantly higher, and the RR was lower in the capsaicin condition compared to the control condition (FD: t(47) = 5.30, P = 2.98 × 10⁻⁶; HR: t(43) = 4.98, P = 1.10 × 10⁻⁵; RR: t(43) = -1.91, P = 0.063, paired t-test). For the temporal changes of movement and physiology variables (Figure S11B), the results showed that the increased motion and autonomic responses were more prominent in the early period of pain. The 10-binned (2 mins per time-bin) FD and HR showed a decreasing trend while the RR showed an increasing trend over time in the capsaicin condition. Additional univariate comparisons between the early (1-3 bins, 0-6 min) vs. late (8-10 bins, 14-20 min) periods of the capsaicin condition showed that the differences were significant for FD and HR (FD: t(47) = 6.45, P = 8.12 × 10⁻⁸; HR: t(43) = 6.52, P = 6.41 × 10⁻⁸; RR: t(43) = -1.61, P = 0.11, paired t-test). This suggests that while participants were experiencing tonic pain, particularly in the early period, motion and heart rate were increased but breathing was slowed. Note that we needed to exclude 4 participants’ data due to technical issues with physiological data acquisition.

      Next, we examined whether the head movement and physiological responses were the main driver of our predictive models (Figure S12). For all the original signature responses from the SVM model (2 conditions × 44 participants = 88), we regressed out the mean FD, HR, and RR (concatenated across conditions and participants as the SVM model was trained) and calculated the classification accuracy (Figure S12A). Although the signature responses were controlled for movement and physiology variables, the SVM model still showed a high classification accuracy for the capsaicin versus control conditions in a forced-choice test (n = 44, accuracy = 89%, P = 1.41 × 10⁻⁷, binomial test, two-tailed). Similarly, for all the original signature responses from the PCR model (10 time-bins × 44 participants = 440), we regressed out the 10-binned FD, HR, and RR (concatenated across time-bins and participants as the PCR model was trained) and calculated the within-individual prediction-outcome correlation (Figure S12B). Again, the PCR model showed a significantly high predictive performance (n = 44, mean prediction-outcome correlation r = 0.20, P = 0.003, bootstrap test, two-tailed, mean squared error = 0.159 ± 0.022 [mean ± s.e.m.]) while controlling for movement and physiology variables. These results suggest that our SVM and PCR models capture unique variance in tonic pain above and beyond the head movement and physiological changes.

      Lastly, we examined the specificity of our predictive models to pain by testing the models on the non-painful but tonic aversive conditions, including bitter taste (induced by quinine) and aversive odor (induced by fermented skate) (Figure S13). All the signature responses were obtained using leave-one-participant-out cross-validation. The results showed that the overall signature responses of the SVM model for the bitter taste and aversive odor conditions were higher than those for the control condition, but lower than those for the capsaicin condition (Figure S13A). Classification accuracies for capsaicin vs. bitter taste and capsaicin vs. aversive odor were both significantly high (capsaicin vs. bitter taste: accuracy = 79%, P = 6.17 × 10⁻⁵, binomial test, two-tailed, Figure S13C; capsaicin vs. aversive odor: accuracy = 83%, P = 3.31 × 10⁻⁶, binomial test, two-tailed, Figure S13E), suggesting the specificity of the SVM model to pain. Similarly, the temporal trajectories of the signature responses of the PCR model for the bitter taste and aversive odor conditions did not overlap with that of the capsaicin condition (Figure S13B). Furthermore, the signature responses of the bitter taste and aversive odor conditions did not have a significant relationship with the actual avoidance ratings (bitter taste: mean prediction-outcome correlation r = 0.05, P = 0.41, bootstrap test, two-tailed, mean squared error = 0.036 ± 0.006 [mean ± s.e.m.], Figure S13D; aversive odor: mean prediction-outcome correlation r = 0.12, P = 0.06, bootstrap test, two-tailed, mean squared error = 0.044 ± 0.004 [mean ± s.e.m.], Figure S13F), suggesting the specificity of the PCR model to pain. Overall, we have provided evidence that the module allegiance-based models can predict pain ratings above and beyond the movement and physiological changes, and are more responsive to pain compared to non-painful aversive conditions, which suggests the specificity of our results to pain.

      “Major Issue 2:

      Another important issue with the manuscript is the (apparent) lack of statistical inference when analyzing the differences in the group-level consensus community structures (both when comparing capsaicin to control and when analysing changes over the time-course of the capsaicin-challenge).

      Although I agree that the observed changes seem biologically plausible and fit very well to previous results, without proper statistical inference we can't determine, how likely such differences are to emerge just by chance.

      This makes all results on Figs. 2 and 3, and points 1, 4 and 5 in the discussion partially or fully speculative or weakly underpinned, comprising a large proportion of the current version of the manuscript.

      Let me note, that this issue only affects part of the results and the remaining - more solid - results may already provide a substantial scientific contribution (which might already be sufficient to be eligible for publication in eLife, in my opinion).

      Therefore I see two main ways of handling Major Issue 2:

      • enhancing (or clarifying potential misunderstandings regarding) the methodology (see my concrete, and hopefully feasible, suggestions in the "private part" of the review),

      • de-weighting the presentation and the discussion of the related results.

      I believe there are many ways to test the significance of these differences. I highlight two possible, permutation testing-based ideas.

      Idea 1: permuting the labels ctr-capsaicin, or early-mid-late, repeating the analysis, constructing the proper null distribution of e.g. the community size changes and obtain the p-values. Idea 2: "trace back" communities to the individual level and do (nonparametric) statistical inference there.”

      We appreciate this important comment. We did not conduct statistical inference when comparing the group-level consensus community affiliations of the different conditions (Figure 2) or different phases (Figure 3) because of the difficulty in matching the community affiliation values of the networks to be compared.

      For example, let us assume that 800 of the 1,000 voxels of community #1 and 1,000 of the 4,000 voxels of community #2 in the control condition are commonly affiliated with the same community #3 in the capsaicin condition. To compare the community affiliation between the two conditions, we should first match the community label of the capsaicin condition (i.e., #3) to that of the control condition (i.e., #1 or #2), and here a dilemma occurs: if we prioritize the proportion of overlapping voxels for the matching, the common community should be labeled as #1 (80% vs. 25%), whereas if we prioritize the number of overlapping voxels, the label of the common community should be #2 (1,000 vs. 800 voxels). Although both choices look reasonable, neither is a perfect solution.
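      The dilemma in this example can be made concrete with a few lines of arithmetic. The snippet below simply encodes the numbers given above; the variable names are hypothetical and the two criteria are illustrative, not a matching procedure used in the study.

```python
# Control communities #1 and #2 and their overlap with capsaicin community #3.
overlap_with_c3 = {1: 800, 2: 1000}   # voxels shared with community #3
community_size = {1: 1000, 2: 4000}   # total voxels in each control community

# Criterion A: the largest proportion of the control community is shared.
match_by_proportion = max(
    overlap_with_c3, key=lambda c: overlap_with_c3[c] / community_size[c]
)

# Criterion B: the largest absolute number of voxels is shared.
match_by_count = max(overlap_with_c3, key=lambda c: overlap_with_c3[c])

print(match_by_proportion, match_by_count)  # prints "1 2"
```

      The two criteria disagree, which is why any single relabeling rule would bias a label-based comparison.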

      As the example above shows, it is impossible to exactly match the community affiliations of different networks. We must choose an imperfect criterion for the matching procedure, and this choice inevitably affects the comparison of network structure. This was the main reason that we limited our results of Figures 2-3 to a qualitative description based on visual inspection. Moreover, the group-level consensus community structures in Figures 2-3 are not a simple group statistic like a sample mean; they were obtained from multiple steps of analyses including permutation-based thresholding and unsupervised clustering, which could further complicate the interpretation of statistical tests.

      Alternatively, there is a slightly different but more rigorous approach to the comparison of community structures, which is the Phi-test (Alexander-Bloch et al., 2012; Lerman-Sinkoff & Barch, 2016). Instead of using the community labels directly, this method converts the community label of each voxel into a list of module allegiance values between the seed voxel and all the voxels of the brain (i.e., 1 if the seed and target voxels have the same community label and 0 otherwise). This allows quantitative comparisons of voxel-level community profiles between different conditions without arbitrary matching of the community labels. We adopted this Phi-test for our analyses to examine whether the regional community affiliation pattern is significantly different between (i) the capsaicin vs. control conditions and (ii) the early vs. late periods of pain (Figure S6), which correspond to the main findings of Figures 2 and 3 in our manuscript, respectively.

      More specifically, to compare the group-level consensus community structures between the capsaicin vs. control conditions and the early vs. late periods, we first obtained a seed-based module allegiance map for each voxel (i.e., using each voxel as a seed). Then, we calculated a correlation coefficient of the module allegiance values between the two conditions for each voxel. This correlation coefficient can serve as an estimate of the voxel-level similarity of the consensus community profile. Because module allegiance is a binary variable, these correlation values are Phi coefficients. A small Phi coefficient means that the spatial pattern of brain regions that have the same community affiliation as the given voxel is different between the two conditions. For example, if a voxel is connected to the somatomotor-dominant community during the capsaicin condition and the default-mode-dominant community during the control condition, the brain regions that have the same community label as the voxel will be very different, and thus the Phi coefficient will be small. Moreover, the Phi coefficient can be small even if a voxel is assigned the same (matched) community label in both conditions, when the spatial pattern of that community differs between conditions.
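      The seed-based allegiance vector and the Phi coefficient described above can be sketched as follows. This is an illustrative toy version with hypothetical function names and six "voxels"; it uses the standard 2 × 2 contingency formula for Phi, which equals the Pearson correlation of two binary vectors.

```python
from math import sqrt

def allegiance_vector(labels, seed):
    """1 where a voxel shares the seed voxel's community label, else 0."""
    return [1 if lab == labels[seed] else 0 for lab in labels]

def phi_coefficient(a, b):
    """Phi coefficient of two equal-length binary vectors."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")  # undefined when either vector is constant
    return (n11 * n00 - n10 * n01) / denom

# Toy community labels for six voxels in two conditions (hypothetical).
labels_control = [1, 1, 1, 2, 2, 2]
labels_capsaicin = [1, 1, 2, 2, 2, 2]

seed = 0
phi = phi_coefficient(
    allegiance_vector(labels_control, seed),
    allegiance_vector(labels_capsaicin, seed),
)
```

      Because only "same label as the seed or not" enters the calculation, renaming the communities in either condition leaves Phi unchanged, which is what removes the need for label matching.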

      To calculate the statistical significance of the Phi coefficient, we conducted permutation tests, in which we randomly shuffled the condition labels in each participant and obtained the group-level consensus community structure for each shuffled condition. Then, we calculated the voxel-level correlations of the module allegiance values between the two shuffled conditions. We repeated this procedure 1,000 times to generate the null distribution of the Phi coefficients, and calculated the proportion of null samples that have a smaller Phi coefficient (i.e., a more dissimilar regional community structure) than the non-shuffled original data.
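      The two generic pieces of this permutation scheme, shuffling condition labels within each participant and converting a null distribution into a one-tailed P-value, can be sketched as below. The function names are hypothetical, and the step that rebuilds the group-level consensus community structure from the shuffled data is deliberately left out because it depends on the full pipeline.

```python
import random

def shuffle_within_participants(cond_a, cond_b, rng):
    """Swap the two condition entries of each participant with probability 0.5."""
    out_a, out_b = [], []
    for a, b in zip(cond_a, cond_b):
        if rng.random() < 0.5:
            a, b = b, a
        out_a.append(a)
        out_b.append(b)
    return out_a, out_b

def one_tailed_p(observed_phi, null_phis):
    """Proportion of null Phi values smaller than the observed Phi."""
    return sum(1 for v in null_phis if v < observed_phi) / len(null_phis)

# Toy usage with placeholder per-participant data and a made-up null sample.
rng = random.Random(0)
shuf_a, shuf_b = shuffle_within_participants(["a1", "a2", "a3"], ["b1", "b2", "b3"], rng)
p = one_tailed_p(0.2, [0.1, 0.3, 0.4, 0.5])
```

      With 1,000 permutations the smallest non-zero P-value this definition can return is 0.001; some analysts add one to the numerator and denominator to avoid P = 0, which is a variant not described in the text above.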

      The results showed that multiple voxels reached statistical significance (permutation tests with 1,000 iterations, one-tailed) in the areas where the community affiliations of the two contrasting conditions were different (Figure S6). For example, the frontoparietal and subcortical regions for the capsaicin vs. control comparison (cf. Figure 2), and the frontoparietal, subcortical, brainstem, and cerebellar regions for the early vs. late period of pain (cf. Figure 3) contain voxels that survived thresholding at FDR-corrected q < 0.05, suggesting the robustness of our main results.
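      The text does not state which FDR procedure was used for the q < 0.05 threshold. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure applied to voxel-wise P-values; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvals, q=0.05):
    """Return a list of booleans marking P-values that survive BH-FDR at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that p_(k) <= q * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= q * rank / m:
            k_max = rank
    # Every P-value ranked at or below k_max is declared significant.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            survives[idx] = True
    return survives

voxel_p = [0.001, 0.04, 0.03, 0.2]   # made-up voxel-wise P-values
significant = benjamini_hochberg(voxel_p)
```

      In this toy input only the first voxel survives, because 0.03 and 0.04 exceed their rank-based thresholds of 0.025 and 0.0375.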

      In particular, the somatomotor and insular cortices showed statistical significance in the permutation test, which may reflect large changes in other areas that connect to the somatomotor and insular cortices across conditions. Statistical significance was also observed in the visual cortex, which was unexpected. Our interpretation is that the spatial distribution of the visual network community is very stable across conditions, so the permutation-based null distribution of Phi coefficients was very narrow. Therefore, even a small change in the community structure could achieve statistical significance.

      We have now added descriptions of the permutation tests.

      Revisions to the main manuscript:

      p. 9: Permutation tests confirmed that the community assignment in the frontoparietal and subcortical regions showed significant changes between the capsaicin versus control conditions (Figure S6A).

      p. 13: Permutation tests further confirmed that the community assignment in the frontoparietal, subcortical, and brainstem regions showed significant changes between the early versus late period of pain (Figure S6B).

      pp. 36-37: Permutation tests for regional differences in community structures. To test the statistical significance of the voxel-level difference of consensus community structures (Figures 2 and 3), we performed the following Phi-test (Alexander-Bloch et al., 2012; Lerman-Sinkoff & Barch, 2016). First, for each given voxel, we compared the community label of the voxel to the community label of all the voxels, generating a list of voxel-seed module allegiance values that allow quantitative comparison of voxel-level community profiles (e.g., [1, 0, 1, 1, 0, 0, ...], whose element is equal to 1 if the seed and target voxels were assigned to the same community and 0 otherwise). Next, a correlation coefficient was calculated between the module allegiance values of the two different brain community structures (i.e., capsaicin versus control, and early versus late). This correlation coefficient is an estimate of the regional similarity of community profiles (here, the correlation coefficient is a Phi coefficient because module allegiance is a binary variable). To estimate the statistical significance of the Phi coefficient, we performed permutation tests, in which we randomly shuffled the labels and then obtained the group-level consensus community structures from the shuffled data. Then, the Phi coefficient between the module allegiance values of the two shuffled consensus community structures was calculated. We repeated this procedure 1,000 times to generate the null distribution of the Phi coefficient for each voxel. Lastly, we examined the probability of observing a smaller Phi coefficient (i.e., a more dissimilar community profile) than the one from the non-shuffled original data, which corresponds to the P-value of the permutation test. All the P-values were one-tailed as the hypothesis of this permutation test is unidirectional.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Avar et al report on the development of a high-throughput method to screen modifiers of prion replication in cell lines using a genome-wide siRNA library. They identified a number of hits and further studied one candidate, the ribonucleoprotein Hnrnpk. The authors convincingly show the interest of their method. However, the claims that the ribonucleoprotein Hnrnpk impact prion propagation need to be more quantitatively and statistically substantiated.

      1. A large part of the manuscript is dedicated to the validation of the high-throughput assay (called QUIPPER). QUIPPER is made in 384-plates and provides great technological improvement. It works with different prion-permissive cell lines and different prion strains. QUIPPER is an antibody-FRET-based assay that detects a specific population of PrPSc that resists phospholipase C (PIPLC) treatment. Historically, PIPLC has been shown to cleave cell surface PrPC while preserving PrPSc (which is endocytic or inaccessible). I would recommend that the authors quantify the proportion of PIPLC-resistant PrPSc (PrPPIPLC) versus total PrPSc in their different models. First, PrPPIPLC proportion may be cell and strain dependent. Second and most importantly, as siRNA effects are studied using PrPPIPLC as readout, it is crucial to know if this form is a bona fide surrogate of PrPSc and infectivity or only a specific, subcellular, potentially minor form of PrPSc. This is particularly important as the effects of Hnrnpk knock-down in QUIPPER and western blot sound discordant; in QUIPPER, the effects are strong (> 5-fold) while by western blot, the effects are much more modest (

      We addressed this issue in several ways. Firstly, we quantified the proportion of PIPLC-resistant PrP (PrPPLC) versus PrPSc in two different models (Fig. 1B and D). Secondly, we directly compared residual infectivity of cells treated with PK or PIPLC (Figure 1C), using the standard scrapie cell assay. The results show that infectivity is retained upon PIPLC treatment. In addition, we assessed the 161 hits obtained via QUIPPER using PrPSc as a readout (Fig. 3B).

      To provide further data on the robustness of our PIPLC-based readout, we have performed western blotting of infected and uninfected cells upon PIPLC treatment and assessed the band patterns following PIPLC administration. This Figure is now incorporated in the manuscript as Supp. Fig. 1C and demonstrates that upon PIPLC digestion of NBH and RML infected CAD5 and GT-1/7 cells, PrP is barely detectable in the non-infected cells, while it is in the prion infected ones. The blots also show that the PIPLC-resistant PrP (PrPPLC) is resistant to PK digestion. These new data, together with those provided in Fig. 1B and Figure 1C, show that PrPPLC is equivalent to PrPSc in terms of PK resistance and infectivity.

      The reviewer pointed out a discordance between western blotting and QUIPPER. Although it is not clearly stated, we think the reviewer may be suggesting a discordance based on Fig. 3D. We would like to point out that Fig. 3D does not report fold changes, as the reviewer is suggesting, but Z-scores, which are expressed in standard deviations from the mean and therefore do not allow fold changes to be inferred. We quantified the effect of NT and HNRNPK targeting siRNAs on prion levels (Fig. 4A) and saw a three-fold change. We believe that the quantifications provided in the new version of the manuscript alleviate the concerns regarding any discordance.
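      Why a Z-score cannot be read as a fold change can be seen in a small numerical example. The snippet below uses the generic definition of a Z-score; how Z-scores were computed in the screen is not specified here, so the reference wells, the use of the population standard deviation, and all values are hypothetical.

```python
from statistics import mean, pstdev

def z_score(value, reference):
    """Distance of value from the reference mean, in reference standard deviations."""
    return (value - mean(reference)) / pstdev(reference)

def fold_change(value, reference):
    """Ratio of value to the reference mean."""
    return value / mean(reference)

# Two hypothetical sets of control wells with the same mean but different spread.
tight_controls = [95.0, 105.0]
noisy_controls = [75.0, 125.0]
hit_signal = 50.0

z_tight = z_score(hit_signal, tight_controls)
z_noisy = z_score(hit_signal, noisy_controls)
fc_tight = fold_change(hit_signal, tight_controls)
fc_noisy = fold_change(hit_signal, noisy_controls)
```

      The same halving of the signal gives a Z-score of -10 against the tight controls and -2 against the noisy ones, so the magnitude of a Z-score depends on assay variability as much as on effect size.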

      Technically, this is quite easy as it necessitates, after PIPLC treatment, the quantification of PrPSc in the supernatant versus PrPSc in the cell pellet. In Fig. 1C, the authors show that PrPPIPLC is infectious in a cell-scrapie assay. Using this approach, they could also quantify the infectivity of these species relative to the total infectivity content.

      We addressed this in Supplementary Fig. 1C, as described above. Supplementary Fig. 1C shows the similarity of the PrP species measured via QUIPPER and via the canonical PK digestion: upon digestion with PIPLC following a PK treatment, we detect PrPSc. Therefore, the experiment demonstrates that PrPPLC is similar in nature to PrPSc. The difference between the PK digested lanes (lanes 3&4) and the PIPLC treated then PK digested lanes (lanes 7&8) is the PrPSc that is released into the media following PIPLC digestion.

      • The authors identified a list of candidate prion modifiers. Surprisingly, the authors did not perform a pathway analysis to identify potential pathways that could impact prion propagation.

      Despite extensive efforts, there were no pathways that were enriched in our 40 hits, which is mentioned in the discussion part of the manuscript. Two analyses (for the 161 candidates and 40 hits) are now added to Supplementary Fig. 3C and pasted below.

      • The authors then studied in more detail one hit, the ribonucleoprotein Hnrnpk. They studied the impact of Hnrnpk knock-down on PrPC and PrPres levels in different cell lines. These data (Fig 4 and Fig S4) lack quantitative (on a higher number of wells) and statistical analyses. The western blots that are shown suggest that PrPC levels are slightly increased by the siRNA and that the increase in PrPres levels is modest, barely significant given the western blot method. Same comment after PSA treatment, at least in PG127-infected hovS cells.

      We quantified the western blots for all figures mentioned by the reviewers throughout the manuscript. These quantifications are incorporated into the manuscript in the following figures: Fig. 4A, Fig. 4B, Supplementary Fig. 4A, Supplementary Fig. 4C, Supplementary Fig. 4D, Supplementary Fig. 4F, Supplementary Fig. 4G.

      Additionally, statistical analyses have been incorporated into the manuscript in these figures: Fig. 4C, Fig. 4D, Fig. 4E, Fig. 4F, Fig. 4G, Fig. 4H, Supplementary Fig. 4F. The analyses and the quantitative data demonstrate that the effect of Hnrnpk downregulation and PSA treatment on prion levels is significant. Moreover, we also addressed the regulation of prions via HNRNPK using vacuoles as a read-out, as well as with a different mode of regulating HNRNPK expression using shRNAs. All these results point to HNRNPK as a true modulator of PrPSc.

      In Figure 4A and B, the use of POM1 and/or POM2 to detect PrPC / PrPres is confusing. POM2 is supposed to detect mostly full-length PrPC (Fig 4A top panel), but more than 3 glycoforms are detected. In Fig 4B, POM1 is used for PrPC but because it has a central epitope, it detects both PrPC and PrPSc.

      Both antibodies are able to recognize both PrPC and PrPSc as it has been shown in many publications from the Aguzzi lab as well as other labs in the field. https://pubmed.ncbi.nlm.nih.gov/19060956/

      Note also in Fig 4B, that DMSO alone seems to impact PrPC levels in PG127-infected hovS cells. This advocates again for a more quantitative analysis.

      We have quantified the western blots using the DMSO control as standard value. As DMSO was used to dilute PSA, this should take into account potential effects coming from DMSO (Fig. 4D, Fig. 4F, Fig. 4H and Supplementary Fig. 4F).

      • Psammaplysene A (PSA) is a pharmacological Hnrnpk binder. The authors used this molecule to further demonstrate that Hnrnpk is involved in prion propagation. I disagree with the authors' conclusion that "PSA effect does seem to be limited when HNRNPK shRNAs are applied". In Fig S4D, 1µM PSA seems to decrease PrPres levels at similar levels whether the shRNA is applied or not. Again, quantification and statistical analyses from several independent experiments would help support the authors' conclusions.

      We assessed this point carefully by quantifying the western blots (Fig. 4H) and providing statistical data (Student’s t-test) from three experiments. As we see a threefold lower decrease of prions with and without Hnrnpk regulation when PSA is present, we concluded that the effect we see from PSA likely arises through Hnrnpk. However, we cannot conclusively delineate the effect of PSA, because Hnrnpk ablation is not possible due to the essentiality of Hnrnpk. This has now been added to the discussion portion of our manuscript.

      • The authors finally tested PSA on organotypic brain slices (in that case, they provide statistical results) and on flies infected with ovine PG137 prions. PSA administration significantly reduced the locomotor deficits of prion-infected flies. The authors quantified the effects of PSA on prion accumulation in flies. Because the overall levels were not detectable by immunoblot, they used a cell-free assay termed RT-QuIC to address prion seeding activity in fly heads. I have specific comments about these experiments:
      • Maybe I missed it, but I could not find which recombinant PrP is used in RT-QuIC assay.

      This information is provided in the M&M section of the manuscript at hand. The relevant section on P25 reads, where HaPrP23-231 refers to hamster PrP:

      The reaction buffer of the RT-QuIC consisted of 1 mM EDTA (Life Technologies), 10 μM thioflavin T, 170 mM NaCl, and 1× PBS (incl. 130 mM NaCl) and HaPrP23-231 filtered using 100-kD centrifugal filters (Pall Nanosep OD100C34) at a concentration of 0.1 mg/ml.

      In addition, we added this information to the main text as well.

      - This is important as recombinant PrP self-polymerize after a period of time and here the authors have left the RT-QuIC assay running for unusually long period of times (RT-QuIC are stopped after 24h-48h).

      For prions, long RT-QuIC experiments are often performed (also see: https://pubmed.ncbi.nlm.nih.gov/32598380/, https://journals.asm.org/doi/10.1128/mBio.02451-14, https://www.nature.com/articles/s41598-021-84527-9, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3458796/ and others).

      In addition, this is controlled for in all experiments performed in the lab, as the prion-negative sample containing the same RT-QuIC substrate does not become positive after the entire duration of the assay (Fig. 5D).

      - Instead of titrating prion seeding activity by endpoint titration, the authors quantified PSA activity by measuring the effect on another parameter of the RT-QuIC, the length of the lag phase before the conversion reaction is visible. While this is an interesting criterion, reduction of seeding activity must be shown to unequivocally demonstrate that PSA has delayed prion pathogenesis in flies.

      Based on the data presented in the manuscript, we assessed prion pathogenesis in flies using a well-established climbing assay, demonstrating that treatment with PSA significantly improves locomotor behavior, which has been shown to be directly linked to prion levels and is known to have even greater sensitivity than the traditional mouse bioassay (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998032/, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6113635/, https://link.springer.com/article/10.1007/s00441-022-03586-0). The RT-QuIC presented here serves as a secondary read-out to the climbing assay, and lag-time quantification is routinely used for RT-QuIC (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3893511/, https://www.nature.com/articles/s41598-017-10922-w, https://journals.asm.org/doi/10.1128/mBio.02451-14, https://www.nature.com/articles/s41598-021-87295-8). Our results highlight the overlap between the two complementary read-outs.

      - Can the authors exclude any interfering effect of PSA on the RT-QuIC reaction, given the amount of material used to seed the reaction (1:20 diluted head homogenates)?

      We do not know how much PSA has reached the Drosophila brain; therefore, the experiment suggested by the reviewer cannot be tied to a 1:20 dilution. However, the concern of the reviewer is valid, and we therefore performed a spiking experiment on a prion-positive sample using 1 µM PSA (the highest amount used to treat cells, for which we saw a strong prion-reducing effect). We did not see any interference of PSA with the RT-QuIC signal in the reaction. This has been incorporated into Figure 5D.

      • could the authors comment on the fact that HNRNPK knock-out is not possible and that their siRNA and shRNA are not affecting the cell viability?*

      To select hits during the screening process, we apply a viability filter, excluding siRNAs that reduce viability by more than 50% compared to the non-targeting control siRNA (Supplementary Fig. 1F). For GT-1/7 cells, we do not see any effect of siRNA treatment on viability after 96 h. However, as downregulation of HNRNPK worsens the cytopathological vacuolation in the hovS model, as shown in Supp. Fig 4A, we do see an effect on cell fitness using both siRNA and shRNA. In addition, as knocking down HNRNPK will not lead to its complete loss, the remaining levels might be enough to sustain viability. Moreover, the longest knockdown experiment we performed lasted 7 days; we cannot exclude that longer exposure would have an impact on viability, but this question is outside the scope of the paper.

      • In the discussion the authors do not discuss how Hnrnpk could impact prion propagation. This may deserve a comment as this protein is present in the nucleus. As PrPC has been also identified in this compartment, can this specific form be involved in prion pathogenesis?*

      In the discussion, we now elaborate on how Hnrnpk might impact prion propagation, including a potential role of nuclear PrPSc and the sequencing data shown in Fig. 4I. In addition, we investigated how PSA affects some functional targets of Hnrnpk; this analysis is now shown in Supp. Fig 4G.

      Reviewer #1 (Significance (Required)):

      The QUIPPER method is a great conceptual and technological approach that could be applied to genome-wide analyses and screening for therapeutic molecules.

      * The study will interest a general audience interested in neurodegenerative diseases linked to protein misfolding. There are commonalities in pathways and modifiers of the conversion. Further PrP has emerged as a receptor for alpha-synuclein (Parkinson disease) and A-beta peptides (Alzheimer's disease).

      Expertise key words: prion diseases - prion pathogenesis in cell models*

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Prions are protein-based infectious agents that underlie neurodegenerative disease. For prion diseases (e.g., mad cow disease), the infectious agent is the cellular prion protein (PrPc). It exists in a normal conformation and carries out its normal cellular function. However, when it becomes misfolded and aggregates it can adopt an altered conformation, referred to as the prion conformation, or PrPSc. PrPSc aggregates can template the conversion of other PrPc molecules into the PrPSc form. In this way the prions can propagate from one cell to the next and throughout an organism. Prion diseases are truly devastating and identifying ways of stopping prion propagation is of great interest. In this manuscript by Aguzzi and colleagues, the authors designed a way to screen for prion propagation modifiers in mammalian cells. They built a highly sensitive readout of PrPSc propagation and adapted it to a 384-well plate format in adherent cells. They then used this to perform a genomewide siRNA screen, looking for genes that increased or decreased PrPSc propagation when knocked down.

      * They identified nearly 1,200 modulators of prion propagation and then subjected them to various validations and filtering to focus on only those hits that affected PrPSc but not PrPc (though hits that affect levels of PrPc could certainly be interesting). All this led to 40 genes (20 that increased and 20 that decreased prion propagation).*

      * Among these 40, the authors focused on one hit, hnRNPK, an essential RNA-binding protein with diverse cellular functions. They provide evidence that reducing levels of hnRNPK leads to increased prion levels.*

      * They next move to a marine compound called Psammaplysene A (PSA), which had previously been shown to have some neuroprotective properties and to be able to bind to hnRNPK. Because of the latter observation, the authors test if PSA can affect prion levels. They show that indeed treatment of their cell line prion infection model, or an organotypic slice model, or a fly model with PSA is sufficient to decrease prion levels.*

      * The authors propose that PSA works to reduce prion levels by increasing the activity of hnRNPK and that this also implies a role of RNA (because hnRNPK is an RNA-binding protein) in prion propagation. * In a nutshell, in my opinion the design and execution of this genomewide screen is ingenious and has yielded a treasure trove of potential prion modifiers. The ability to distinguish between modifiers of Prpc and PrpSc is super powerful. However, the follow-up and focus on hnRNPK and its connections (which seem tenuous) to the marine compound PSA are incomplete and raise more questions than answers. In its present form, it is hard to assess the potential significance of hnRNPK in prion propagation. I have some comments and suggestions for the authors to consider.

      * 1.To my eye, Fig. 4A looks like Hnrnpk siRNA leads to slightly increased levels of PrPc (detected with POM2 antibody) and this could explain the increase in PrPSc levels. Can the authors assess Prnp RNA levels and the effects of their siRNAs on Prnp expression? It would also be useful to provide quantification of immunoblots if possible.*

      We quantified the western blots as mentioned in our response to reviewer 1. The quantifications are now provided for Fig. 4A and Supplementary Fig. 4A, showing that the increase in prion levels is much stronger than that of PrPC. These confirm the results from the screen as seen in Fig. 3D. In addition, we would again like to point out that the use of shRNAs to knock down HNRNPK did not yield the aforementioned increase in PrPC levels, as evident from Supplementary Fig. 4D, which demonstrates a decrease of PrPC despite increasing PrPSc levels. Moreover, we show quantification of RNA levels upon downregulation of Hnrnpk and with PSA, which shows that downregulation of Hnrnpk via siRNAs indeed increases Prnp mRNA levels and that PSA does not change the RNA levels of either Hnrnpk or Prnp (Fig. 4C).

      • In Supplemental Fig. 4B it also looks like knocking down Hnrnpk results in decreased PrPc levels in this experiment and it's not clear how robust the increase in PrPSc levels is. Quantification of these experiments, if possible, would be helpful.*

      Please see our response above. We now provide quantification for all western blots.

      • The authors treat with PSA, which is supposed to bind to Hnrnpk. They state that this treatment does not affect PrPc levels but to my eye Supplemental Fig. 4C looks like highest doses of PSA cause a decrease in PrPc levels. Quantification of the immunoblots would also be useful here.*

      Please see our response above. We now provide quantification for all western blots and added a sentence to the manuscript.

      • The authors use Hnrnpk knockdown along with PSA to test if the effects of PSA depend on Hnrnpk. They see PSA decreases PrPSc levels and that this is, to my eye, only slightly attenuated by Hnrnpk reduction. I interpret these results slightly different than the authors. To me, it seems that this result indicates that PSA's effects are (mostly) independent of Hnrnpk.*

      Addressed in point 4 from reviewer one.

      • In the original paper identifying PSA and hnRNPK physical interaction, RNA-binding was important. In the authors' assays, does Hnrnpk's effect on prions depend on RNA-binding? Specific mutations to the RNA-binding domains can be made to assess this.*

      This is a very interesting point. We did try to obtain data to address this question; however, due to the essentiality as well as the tight control of Hnrnpk expression, we were not able to express different forms of Hnrnpk and acquire conclusive data. Therefore, how Hnrnpk might affect prion propagation is currently being pursued in the scope of another publication.

      • The genetic interaction in the vacuolation phenotype between Prnp and Hnrnpk that the authors report is very interesting (Supplemental Fig. 4A). It seems like this system and phenotype could be useful for the authors in exploring mechanisms by which HnrnpK is functioning.*

      We absolutely agree with the reviewer’s comment. As mentioned above, a second publication investigating the mechanisms of Hnrnpk’s antiprion function is under way; this is outside the scope of this study.

      • The authors propose that PSA increases activity of Hnrnpk but does it change any Hnrnpk RNA targets from their RNA sequencing? Some functional readout of Hnrnpk function would be useful here to test this hypothesis.*

      Although we do suspect RNA binding has an important role in the anti-prion function of Hnrnpk, we cannot exclude other modalities through which Hnrnpk might function, such as DNA binding and protein-protein interactions. Answering this question would therefore require a considerable effort exploring each of these potential modalities with regard to the anti-prion function of Hnrnpk. This extensive effort is outside the scope of the manuscript at hand. However, as suggested by the reviewer, we investigated the effect of PSA on some known functional targets of Hnrnpk using our sequencing data and added this analysis to the manuscript as Supplementary Fig. 4H. These results suggest that PSA increases the expression of DNA targets of Hnrnpk, potentially suggesting a modality of action. Moreover, we amended the discussion with regard to potential pathways, evidenced by the RNAseq data, that might yield the observed effect.

      • In the Introduction, the authors mention two yeast papers in introducing the concept of using unicellular model organisms to perform modifier screens. The first paper (Outeiro and Lindquist, 2003) is a classic but does not contain a yeast screen. The other one does include a loss of function screen in yeast (for polyQ toxicity modifiers) but those results seem to be due to loss of the [RNQ+] prion from certain deletion strains instead of from specific roles of modifier genes, so that paper might not be the best exemplar of yeast modifier screens.*

      We sincerely thank the reviewer for their careful read-through of the manuscript. The passage that refers to these papers as screens was amended, and two new citations for appropriate yeast screens were added to the manuscript.

      • The authors asked if any of their hits from their screen had human genetics connections to neurodegeneration. They mention one of their hits Dock3 right after saying that no hit reached statistical significance after multiple testing corrections. This seems a bit misleading since any time one makes a list of anything there will always be, by definition, one at the top of the list.*

      We amended the wording to improve clarity of the manuscript.

      • The authors perform RNA sequencing on prion infected cells that either had Hnrnpk siRNA or PSA and since these two treatments had opposite effects they looked for genes that went in the corresponding directions. They didn't find anything significant when looking for genes downregulated by Hnrnpk siRNA and upregulated by PSA. They did find glucose metabolism genes when looking in the opposite direction. The significance of this finding is unclear and the authors do not expand on it.*

      This is addressed in point 7 of reviewers 1 and 2; we expanded the discussion section of the manuscript with regard to these results.

      • To me, the data with PSA seem more robust than the Hnrnpk data and it seems that the authors are trying to perhaps over-fit them together. It is possible that PSA affects prion levels independent of Hnrnpk function. This would not dampen my enthusiasm at all for this finding and could be of interest to those in the prion field, in which the search for anti-prion compounds is of great interest.*

      Upon statistical analysis of the result in Fig. 4H, we see a three-fold decrease of PSA activity upon HNRNPK downregulation, suggesting that PSA activity might be linked to HNRNPK. However, the reviewer's point is well taken, and we emphasized the value of understanding the function of PSA, or mimicking its effect, as a potential therapy in the future.

      ***Cross-commenting:**

      All three reviewers seem to appreciate the novelty and impact of the new QUIPPER method the authors have developed to discover modifiers of prion propagation. All three reviewers also seem to be somewhat less convinced by the connection to hnRNPK, including how the compound PSA's anti-prion effects involve hnRNPK (or not).*

      * In my opinion, this manuscript presents important and novel work and a really ingenious new method to study prion propagation, which will be broadly useful to the prion field. I feel that the hnRNPK data could be strengthened, especially with more quantitative analyses. The PSA treatment data are compelling but it seems that the effects might be independent of hnRNPK and that the authors are trying to force a connection which might not be there.*

      * Reviewer #2 (Significance (Required)):*

      * *** Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. ****

      I have expertise in neurodegenerative disease, protein misfolding, yeast modifier screens, CRISPR modifier screens in human cells, and RNA-binding proteins. I have general knowledge about prions, including PrP, but I am not a prion expert.*

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors conducted an arrayed RNAi-based genome-wide high-throughput screening of all protein-coding modifier genes that affect prion propagation in cultured cells (murine and human cell lines) using a novel quantitative high throughput QUIPPER assay that they developed. They identified 1191 genes, of which 40 selectively affect PrPSc. Half of the 40 genes seem to inhibit PrPSc (limiters) whereas the other half do the opposite (stabilizers). One of the strong limiters, Hnrnpk, is an essential small heterogeneous nuclear ribonucleoprotein that has been implicated in a few protein misfolding diseases. The biological relevance of the findings is demonstrated by the detection of previously reported modifier genes as well as thorough verification of Hnrnpk as an effective prion limiter that seems to be independent of the two prion strains or host species (mouse and human cell lines as well as Drosophila).

      * The manuscript is very well written, the approach is novel, very well verified, and effective, the data are solid, and the main conclusions convincing.*

      * Two issues need to be discussed.*

      * Major comments:*

      * First, some genes encoding proteins involved in PrP processing, such as ADAM10 and ADAM8, are known to affect PrPC levels, but they are not among the modifier genes identified. Based on Table 2, ADAM8 expression is very low in the GT-1/7 cells. This points to one of the caveats of the RNAi screening approach in that potential roles of low expressing genes in the cell lines used could be missed. Although it is beyond the scope of this manuscript, it would be helpful to add discussions on complementary screening enhancing gene expression and the use of more cell lines that will allow identification of more modifiers.*

      We thank the reviewer for raising this concern. The point that the screen is less sensitive for genes expressed at low levels in the cell line in question is valid. As CRISPR-based technologies advance and are improved for use in combination with prions, we see their value. We added a sentence to the discussion about gene activation as a future alternative for performing a complementary screen.

      Second, the statement that PSA's anti-prion effect potentially arises through enhancing the activity of HNRNPK makes sense, but it is also possible that PSA can directly inhibit prion replication as well. It would be helpful to calculate the percentage reduction in PrPSc upon PSA treatment and to compare the percentages between shNT and shHNRNPK cells.

      We thank the reviewer for the careful read-through of the manuscript. The point was addressed in our response to reviewer 1, point 4. In addition, when PSA is added to the RT-QuIC, it does not prevent aggregate formation, indicating that PSA is unlikely to inhibit prion replication directly and that it rather depends on a host-intrinsic cellular molecule for its activity. However, in the discussion section of our manuscript we also elaborate further on other potential mechanisms by which Hnrnpk and PSA might regulate prion levels.

      Minor comments:

      * First, Figure 1C shows that the relative intensity for RML CAD5 cell lysate infected cells is less than with PIPLC treated or PK treated, which seems to be the opposite of what is expected, because PIPLC or PK treatment should not increase infectivity. Please explain.*

      We agree with the reviewer that the results were surprising. For the practicality of the screen, we wanted to show that the treatment does not eliminate the infectious species, which we were able to demonstrate. However, the increase in infectivity could stem from many different factors; e.g., the amount or duration of PK treatment might not harm but rather expose the infectious species, or PIPLC might remove cell surface molecules that could prevent infection of cells. As there is a plethora of possible scenarios and the question was not relevant for the study at hand, we did not go into further detail.

      Second, in Fig S1 e, the labels are too small to read. In Fig 3D, it would be easier to match the stabilizer or limiter genes with the corresponding Z score dots if the genes with negative Z scores were labelled on the left side and the genes with positive Z scores on the right side.

      We amended the figures as per the reviewer’s suggestion.

      Third, The following sentence on page 11 is confusing: "20 out of these 40 candidates reduce prion propagation upon silencing, and 20 candidates enhanced prion propagation, and henceforward are called stabilizers or limiters, respectively (Fig. 3D-E, Supplementary Table 1)." Did the author mean to say "....and 20 candidates enhanced prion propagation upon silencing, and hence..."?

      We reworded the sentence according to the reviewer’s comment.

      * Fourth, In the subheading "Hnrnpk expression limits of prion propagation in mouse and human cells", "of" should be deleted.*

      We addressed this in the main manuscript file.

      ***Cross-commenting:**

      I agree with Reviewer #2's assessment that more quantification will be helpful and the link between the effect of PSA treatment and hnRNPK can be strengthened. I want to stress that the knockdown data clearly shows the involvement of hnRNPK as a prion limiter in cultured cells. The question on PSA does affect the interpretation of the ex vivo and in vivo data.*

      * The blot in Fig. S4c seems to show some decrease in PrPC levels in NBH-treated GT-1/7 cells. This blot needs to be quantified to confirm whether the PrPC level is changed by PSA treatments. Whether PSA directly inhibits prion replication can be relatively easily assessed in RT-QuIC reactions. Alternative to the use of PSA, RNAi-mediated hnRNPK knockdown can also be done on cultured tissue slices or in brain, but this will require a lot more time and efforts and may be too much to ask for in this manuscript.*

      Quantifications for blots were added throughout the manuscript and the text was amended accordingly, and all the points mentioned have been addressed throughout this response letter.

      Reviewer #3 (Significance (Required)):

      * The findings are novel and very significant. They identified a large number of modifier genes, and established a solid foundation for future studies on prion modifier genes to study prion replication and pathogenesis and for novel therapies against prions and potentially some other protein misfolding diseases. HNRNPK seems to be good target for therapeutic intervention and PSA may be a good candidate for prion treatment. The novel QUIPPER assay can be used to screen for anti-prion compounds and potentially adapted to study other misfolding proteins associated with cells.*

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      All the reviewers’ comments are reproduced below, with our responses interspersed in [[brackets]]. Citations from the revised manuscript are included in “quotation marks”. The website accepts input only as plain text. Consequently, we had to transform the mathematical expressions into plain text. We apologize for the reduced readability.

      Reviewer #1

      1) The authors state that: "the conductance density mediated by the expression of the mutant was 2.5 times smaller than the wild type, although we transfected the same amount of plasmid DNA (Fig. 2E). Assuming that protein expression is independent of the mutation, the observation suggested that the unitary proton flux ratio RC of wild type to mutant channel was equal to 2.5" (lines 82‐85).

      Macroscopic conductance (G) depends on channel number (N), microscopic or unitary conductance (γ), and open probability (PO) by G=N γ PO. The authors assume that the level of WT and D174A mutant protein expression on plasma membrane, which determines N, are equal; however, this critical assumption does not appear to have been tested.
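
The ambiguity described in this comment can be illustrated with a short numerical sketch. All channel numbers, unitary conductances, and open probabilities below are hypothetical values in arbitrary units, chosen only to show that the same 2.5-fold ratio in macroscopic conductance can arise either from a lower unitary conductance or from a lower channel number.

```python
def macroscopic_conductance(n_channels, unitary_conductance, open_probability):
    """Return G = N * gamma * P_O (arbitrary units)."""
    return n_channels * unitary_conductance * open_probability

# Hypothetical wild-type reference (values are illustrative, not measured).
g_wt = macroscopic_conductance(n_channels=1000, unitary_conductance=1.0, open_probability=0.8)

# Scenario 1: the mutant has the same N and P_O but a 2.5-fold lower gamma.
g_mut_lower_gamma = macroscopic_conductance(1000, 0.4, 0.8)

# Scenario 2: the mutant has the same gamma and P_O but 2.5-fold fewer channels.
g_mut_lower_n = macroscopic_conductance(400, 1.0, 0.8)

ratio_lower_gamma = g_wt / g_mut_lower_gamma
ratio_lower_n = g_wt / g_mut_lower_n

print(f"WT/mutant ratio, lower gamma: {ratio_lower_gamma:.2f}")
print(f"WT/mutant ratio, lower N:     {ratio_lower_n:.2f}")
```

Both scenarios give the same macroscopic ratio, so a conductance-density measurement alone cannot separate a change in γ from a change in N or PO.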

      The fact that conductance density (nS/pF) is plotted in Fig. 2E does not alter this caveat because this procedure normalizes the data only for cell surface area (i.e., size). The authors conclude that "The conductance density relationship (Fig. 2E) compares the maximal conduction of both constructs; this is the fully open channel (open probability ≈ 1)" (lines 87‐88). However, neither raw currents nor G‐V data are shown. Typically, currents measured at large, near‐saturating PO are used to compare the relative conductances of WT and mutant ion channels. The currents shown in Fig. 2A and 2B exhibit prominent 'droop' at even modest depolarizing potentials (+10 mV for D174A and +30 mV for WT), indicating that the proton gradient has been substantially perturbed by the flow of ge depolarizing voltages needed to drive channels to near‐maximal PO. Furthermore, there is no evidence that maximal PO itself is also not different in WT and D174A channels. Indeed, maximal PO for native Hv1 channels measured using variance analysis is reported to be significantly smaller than 1.0, and assuming that PO = 1.0 for either WT or D174A is therefore not well supported. Maximal PO could be altered by the D174A mutation, which has a clear and strong effect on channel gating evidenced by the large (‐70 mV) negative shift in threshold potential reported both here and previously in the literature. Effects of mutations on maximal PO due to altered gating behavior could be separate and distinct from any change in plasma membrane channel number (N). Lastly, because D174A channels have a much higher PO than WT at 0 mV, the mutant will necessarily conduct inward proton currents at the physiological resting membrane potential (RMP) in tsa‐201 cells (perhaps ‐30 mV?). Inwardly directed proton currents will therefore cause intracellular acidification under resting conditions.

      The constitutive acid load in cells expressing D174A, but not WT, is likely to have a variety of physiological consequences, including decreased protein expression or plasma membrane targeting of D174A. There is evidence that another constitutively open Hv1 mutant (R205H) also generates a smaller macroscopic conductance than WT, and this phenomenon is likely to result from decreased cell surface expression. To conclude that the microscopic conductances of WT and D174A are unequal, the authors must demonstrate that N is not different. The authors' conclusion that D174A "conducts protons at a lower rate" (line 89) is therefore not well supported by the experimental data.

      [[

      We toned down our conclusions from the experiments to accommodate the reviewer's criticism: (page 4): " Consequently, the mutant channel is nearly fully open (Fig. 2D), readily seen when the membrane potential is 0 mV and external voltage is absent. The high open probability of the D174 mutant under symmetrical pH conditions is readily seen in the tail current amplitude reaching a quasi-saturation (Fig. 2A). The resulting outward currents have a higher amplitude in the wild-type (Fig. 2A+B). Interestingly, the conductance density mediated by the expression of the mutant was 2.5 times smaller than the wild type, although we transfected the same amount of plasmid DNA (Fig. 2E). Our observation suggests a reduced flux through the mutant if we assume that protein abundance in the plasma membrane is independent of the mutation."]]

      2) The authors indirectly measure apparent proton flux rates (λD) in LUVs containing WT and D174A mutant Hv1 channels using a fluorescence‐based approach, and conclude that λD is 2.4 times smaller for D174A than WT. However, the method for estimating λD is not performed under voltage clamp, and the driving force for proton current is neither known nor measured.

      [[

      The reviewer is mistaken. The method for estimating λD is performed under voltage clamp, and the driving force for proton current is known.

      Page 6: “To obtain λD, we encapsulated c_k^i=150 mM KCl in the HV1 containing large unilamellar vesicles (LUVs) and exposed these vesicles to a buffer with a K+ concentration c_k^o= 3 mM. The addition of valinomycin facilitated K+ efflux, thereby inducing a membrane potential, ψ. ψ constituted the driving force for H+ uptake. It can be calculated according to the Goldman equation:

      ψ = -(RT/F) ln ((c_k^i+(P_H/P_K ) c_H^i)/(c_k^o+(P_H/P_K ) c_H^o ))    (1)

      The ratio of the HV1-mediated proton permeability P_H to the valinomycin-mediated potassium permeability P_K is always smaller than 0.04. We base our conclusion on the observation that the CCCP-mediated proton permeability represents an upper limit for P_H since CCCP always induces a faster vesicular proton uptake than HV1 (Fig. 3). Accordingly, the maximum value of P_H/P_K can be estimated as the ratio of valinomycin to CCCP conductivities. The respective values are equal to 1.6 × 10^-3 Ω^-1 cm^-2 [1] and 4 × 10^-6 Ω^-1 cm^-2 [2]. At pH 7.5, we find c_H^o=10^(-7.5) M, i.e., c_k^o ≫ (P_H/P_K )c_H^o. Similarly, c_k^i ≫ (P_H/P_K ) c_H^i for a broad range of intravesicular pH. With these simplifications, Eq. 1 transforms into the Nernst equation yielding:

      ψ = -(RT/F) ln (c_k^i/c_k^o) = -100 mV    (2)

      ψ of such size may decrease intravesicular pH by nearly two units. Such acidification does not violate c_k^i ≫ (P_H/P_K ) c_H^i so that ψ remains constant throughout the experiment. That is, the vesicle experiments proceed under voltage clamp conditions. The simple explanation is that, due to the small proton concentration and the limited buffer capacity, the K+ conductance exceeds H+ conductance under all conditions. The conclusion is in line with simulations (32), confirming that the membrane potential is driven very near the Nernst potential for K+.”]]
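
The numbers in this passage can be checked with a short sketch. It evaluates Eq. 1 and Eq. 2 for the stated concentrations (150 mM KCl inside, 3 mM K+ outside, external pH 7.5) and the stated upper bound P_H/P_K = 0.04. The temperature of 23 °C and the intravesicular pH of 5.5 (two units below the external pH) are assumptions made here for illustration; the function and variable names are likewise illustrative.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1

def goldman_potential(c_k_in, c_k_out, c_h_in, c_h_out, ph_over_pk, temp_k):
    """Eq. 1: membrane potential in volts for K+ and H+ as permeant ions."""
    numerator = c_k_in + ph_over_pk * c_h_in
    denominator = c_k_out + ph_over_pk * c_h_out
    return -(R * temp_k / F) * math.log(numerator / denominator)

def nernst_potential(c_in, c_out, temp_k):
    """Eq. 2: Nernst potential in volts for a single monovalent cation."""
    return -(R * temp_k / F) * math.log(c_in / c_out)

TEMP_K = 296.15            # 23 degrees C (assumed)
C_K_IN, C_K_OUT = 0.150, 0.003   # mol/L, from the text
C_H_OUT = 10 ** -7.5       # external pH 7.5, from the text
C_H_IN = 10 ** -5.5        # assumed: vesicle acidified by two pH units
PH_OVER_PK = 0.04          # upper bound stated in the text

psi_nernst = nernst_potential(C_K_IN, C_K_OUT, TEMP_K)
psi_goldman = goldman_potential(C_K_IN, C_K_OUT, C_H_IN, C_H_OUT, PH_OVER_PK, TEMP_K)

# Steady-state acidification: proton electrochemical potential is equal on both sides.
delta_ph = -psi_nernst / (math.log(10) * R * TEMP_K / F)

print(f"Nernst potential:  {psi_nernst * 1e3:.1f} mV")
print(f"Goldman potential: {psi_goldman * 1e3:.1f} mV")
print(f"Steady-state pH drop: {delta_ph:.2f} units")
```

Under these assumptions the proton terms shift the Goldman potential by far less than a microvolt, and the steady-state pH drop equals log10(c_k^i/c_k^o), which is somewhat below two units.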

      The authors state that "Transmembrane voltage constituted the driving force for proton uptake into LUVs (Figure M). It resulted from facilitated K+ efflux out of the vesicles (30)", (lines 261‐262), but this voltage is unknown and not likely to equal the Nernst equilibrium potential for K+ once Hv1 channels begin to open.

      [[

      The reviewer is mistaken. The voltage is known (see the equations above). The opening of the HV1 channels does not alter the potential because c_k^o ≫ (P_H/P_K ) c_H^o and c_k^i ≫ (P_H/P_K ) c_H^i for a broad range of intravesicular pH (see above).]]

      Once Hv1 channels begin to open, changes in intra‐lumenal pH (pHi) will necessarily occur during the experiment. Such changes are likely exacerbated by a) the low proton buffering capacity of the system (5 mM HEPES) and b) the absence of any counter‐charge pathway to balance the effect of proton charge movement on the membrane potential.

      [[

      Vesicle acidification occurs. It signifies the presence of functional proton channels. Nevertheless, the membrane potential does not change (see Equation 1 above). The statement b) is not correct because the outward K+ movement counters the inward-directed proton charge movement.]]

      Given the small volume of LUVs, even a relatively modest difference in either membrane potential or pHi could substantially alter the driving force for proton movement. Together, these factors are highly likely to result in a rapid and potentially large change in the driving force for proton flux.

      [[

      As outlined above, membrane potential stays invariant. Vesicle acidification changes the driving force for proton flux. The steady state is reached when the electrochemical potentials for protons on the two sides of the membrane are equal to each other.]]

      Driving force changes may also be different for WT and D174A because their relative PO may be different under the experimental conditions used here. Because D174A activates at much more negative voltages, it is likely to open more quickly and to a higher PO than WT at early times after depolarization is initiated by addition of valinomycin (Fig. 3A). This fact will likely result in a larger initial inward current being carried by D174A than WT channels. The result would be a more rapid acidification of LUVs by D174A.

      [[

      The reviewer is mistaken. Assuming a transport rate of 20,000 potassium ions per second (G. Stark, B. Ketterer, R. Benz and P. Läuger; Biophys. J. 1971 Vol. 11 Pages 981-981) and a membrane capacity of 1 μF cm-2, it takes valinomycin about 10 ms to drive the vesicular potential to near Nernst values. Activation of the proton channel is at least 10 times slower. Thus, both mutant channel and wild type channel may open at roughly the same instant. The driving force is sufficient to open both channels to the same probability.]]
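
The order of magnitude of the charging time can be reproduced with a rough sketch. The transport rate (20,000 ions per second), the specific capacitance (1 μF cm-2), and the target potential (100 mV) are taken from the text. The vesicle diameter of 100 nm and a single valinomycin molecule per vesicle are assumptions made here, since the reply does not state them, and the estimate treats charging as linear in time.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C

def charging_time_s(diameter_m, capacitance_f_per_cm2, potential_v,
                    ions_per_s_per_carrier, carriers_per_vesicle):
    """Time for carriers to move the charge that polarizes a spherical vesicle."""
    diameter_cm = diameter_m * 100.0
    area_cm2 = math.pi * diameter_cm ** 2                        # sphere surface area
    charge_c = capacitance_f_per_cm2 * area_cm2 * potential_v    # Q = C * V
    ions_needed = charge_c / ELEMENTARY_CHARGE
    return ions_needed / (ions_per_s_per_carrier * carriers_per_vesicle)

# Assumed: 100 nm vesicle with one valinomycin molecule.
t_single_carrier = charging_time_s(100e-9, 1e-6, 0.100, 20_000, 1)
# More carriers per vesicle shorten the charging time proportionally.
t_ten_carriers = charging_time_s(100e-9, 1e-6, 0.100, 20_000, 10)

print(f"1 carrier:   {t_single_carrier * 1e3:.1f} ms")
print(f"10 carriers: {t_ten_carriers * 1e3:.2f} ms")
```

With these assumed values, roughly 200 elementary charges are needed per vesicle, which a single carrier delivers in about 10 ms; a different vesicle size or carrier number would scale the result accordingly.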

      The experimental data in Fig. 3A are consistent with the expectation that the proton gradient and driving force more rapidly approach equilibrium for D174A than WT channels: the apparent rate of AMCA fluorescence change is slower in D174A. Although the authors correctly interpret the experimental data to mean that the apparent λD is slower for D174A, they do not rule out the artifactual explanation for the measured differences. Indeed, the observation in Fig. 3A that AMCA fluorescence change eventually reaches a plateau and is not affected by CCCP means that the proton gradient has become exhausted during the experiment, and directly demonstrates that the proton driving force is uncontrolled under the current experimental conditions.

      [[

      The reviewer's interpretation of our results is flawed. Instead of becoming exhausted, the proton gradient builds up during the experiment. Initially, extravesicular and intravesicular pH values are equal to each other. Valinomycin-mediated K+ efflux results in a membrane potential that drives Hv1-mediated H+ influx.

      Page 8: “The number NC of reconstituted HV1 dimers per vesicle determines the acidification rate λ, i.e., the time that elapses before reaching the steady state. The final intraluminal pH is independent of NC. Similarly, CCCP addition in the steady state does not change the intraluminal pH of HV1-containing vesicles. But CCCP will affect the intraluminal pH of vesicles deprived of HV1 since H+ background permeability is too small to allow vesicle acidification within the time allotted for the experiment. Consequently, only HV1-free vesicles will acidify upon CCCP addition. That is, CCCP addition allows estimating the fraction of vesicles deprived of HV1.”]]

      In contrast to the authors' statement that "Our experiments with the purified and reconstituted channels corroborated the conclusion (Fig. 3A)", (lines 92‐93) it is not clear that unitary proton flux rates/unitary conductances are actually different in WT and D174A.

      [[

      The reviewer is mistaken. Since we measured under voltage clamp conditions, ensured rapid installment of the membrane potential, and selected a potential large enough to allow for the same open probability of wild-type and mutant channels, the measured transport rates, λ, are valid. Moreover, we determined the number of HV1 channels per vesicle and thus calculated the transport rate of an individual channel, λD. Since λD is different for WT and D174A, the unitary proton flux rates/unitary conductances are actually different in the wild type and mutant.]]

      3) The presumed differences in unitary conductances (i.e., 'transport rate') between WT and D174A are used to estimate Arrhenius activation energies (Ea): "The difference in measures transport rates allows a rough estimation of the Arrhenius activation energy Ea for HV1‐mediated proton flow. It amounts to 40 kJ/mol for the wild type and 23 kJ for the mutant. Thus, Ea exceeds the corresponding 15 kJ/mol barrier measured for gramicidin A (32, 33)", (lines 128‐130). The method for determining Ea in the current work is not well‐described. In Ref. 32, the authors estimate Arrhenius activation energy (Ea = 20 kJ/mol) for gramicidin D (not gramicidin A) from the slope of a line fit to measurements of currents at various temperatures. Here, the authors measure AMCA fluorescence decay rates at 4 °C and 23 °C and observe a similar temperature‐dependent difference in WT and D174A (Fig. S2). Given that the data indicate that WT and D174A are similarly temperature‐dependent, it is unclear how the authors arrive at different Ea values. The authors' conclusion that "The increment in Ea suggests that the transport mechanism may be different from a pure Grotthuss type, where the proton uses an uninterrupted water wire to cross the membrane", (lines 131‐133) therefore does not appear to be well‐supported.

      [[

      We removed both the calculation and discussion of activation energies. Knowledge and discussion of activation energies distract from the scope of the manuscript. We show the experiments at different temperatures solely to demonstrate that Hv1 and D174A facilitate proton transport at a decreased temperature where the background conductivity of the lipid bilayer to water is small.]]

      4) The authors report no difference in water permeability in WT vs. D174A (Fig. 5 and S1) and interpret the results to mean that proton currents are not associated with measurable bulk water flow. A similar conclusion was reached for native Hv1 channels using deuterium substitution (DeCoursey & Cherny, 1997).

      [[

      The comment of the reviewer is misleading:

      • Equal water permeabilities of WT and D174A would not exclude an association between proton currents and water flow. Accordingly, our manuscript does not contain the stipulated interpretation.
      • DeCoursey & Cherny (1997) did not evaluate bulk water flow through proton channels. They compared D+ and H+ currents across the plasma membrane of rat alveolar epithelial cells. Page 2: “Comparing deuterium ion and proton currents through the plasma membrane of rat alveolar epithelial cells, DeCoursey & Cherny (22) found an isotope effect exceeding that for hydrogen bond cleavage in bulk water. It suggested the involvement of an amino acid side chain in proton conduction (22). Alternatively, altered properties of confined water could have been responsible for the higher isotope effect.”]]

      However, the absence of bulk water flow does not itself rule out the possibility that 'trapped' waters within the Hv1 pore do not themselves carry the measured proton current. If intra‐pore water molecules are tethered by hydrogen bonds with protein atoms, they may not move when Hv1 channels open.

      [[

      The reviewer’s comment contains one misinterpretation and one unfounded statement:

      1. We never stated that 'trapped' waters within the Hv1 pore do not themselves carry the measured proton current. On the contrary, we envisioned the trapped waters delivering the protons to one or more titratable amino acid side chains and accepting the protons from them.
      2. The reviewer’s view that intra‐pore water molecules tethered by hydrogen bonds with protein atoms may not move when Hv1 channels open is a misconception. Page 12 bottom: “The contrasting opinion that instead of a channel obstruction hydrogen bonds may immobilize the pore water (19) is not convincing. First, the lifetime of a hydrogen bond is in the ps range while HV1’s mean open time exceeds 100 ms (41). Thus, hydrogen bonds may break more than 10^11 times during the open state, rendering them unfit for tethering intraluminal water molecules. Second, the effect of hydrogen bonds between water molecules and pore residues is limited to decreased water mobility in narrow channels (23). Their number, N_H, allows for predicting p_f (26). Specifically, every H-bond donating or receiving pore-lining residue contributes an average increment ΔΔG‡ of 0.1 kcal/mol to the Gibbs free energy of activation ΔG‡ (24). Equation (13) allows the calculation of ΔG‡:

      ΔG‡ = N_H ΔΔG‡ + ΔΔG‡_i (13)

      where ΔΔG‡_i = 2 kcal/mol (24). Since N_H = 6 (Fig. S1) in the open HV1 conformation, Eq. 13 predicts ΔG‡ = 2.6 kcal/mol. Eq. (14) allows calculating HV1’s p_f from this value (42):

      p_f = ν_0 v_w exp(-ΔG‡/RT) (14)

      where v_w = 3 × 10^-23 cm^3 is the volume of one water molecule and ν_0 is the universal attempt frequency, ν_0 = k_B∙T/h ≈ 6.2 × 10^12 s^-1 at room temperature (k_B is Boltzmann’s and h is Planck’s constant).”]]
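The estimate in Eqs. (13) and (14) can be reproduced numerically. The following is a minimal sketch, not the authors' code: the increments, N_H, and the water volume are taken from the quoted passage, while the temperature of 298 K and the physical constants are assumptions of this sketch, and all function names are hypothetical.

```python
import math

# Values quoted in the response above
DDG_PER_RESIDUE = 0.1      # kcal/mol per H-bonding pore-lining residue
DDG_INTRINSIC = 2.0        # kcal/mol, the constant term in Eq. (13)
V_WATER = 3e-23            # cm^3, volume of one water molecule

# Assumptions of this sketch (not given numerically in the text)
TEMPERATURE = 298.0        # K, taken as "room temperature"
K_B = 1.380649e-23         # J/K, Boltzmann constant
H_PLANCK = 6.62607015e-34  # J s, Planck constant
R_KCAL = 1.98720e-3        # kcal/(mol K), gas constant


def activation_free_energy(n_h):
    """Eq. (13): Gibbs free energy of activation in kcal/mol."""
    return n_h * DDG_PER_RESIDUE + DDG_INTRINSIC


def attempt_frequency(temperature=TEMPERATURE):
    """Universal attempt frequency k_B*T/h in 1/s."""
    return K_B * temperature / H_PLANCK


def unitary_water_permeability(n_h, temperature=TEMPERATURE):
    """Eq. (14): single-channel water permeability p_f in cm^3/s."""
    dg = activation_free_energy(n_h)
    boltzmann_factor = math.exp(-dg / (R_KCAL * temperature))
    return attempt_frequency(temperature) * V_WATER * boltzmann_factor


if __name__ == "__main__":
    # N_H = 6 is the open-state value quoted in the text; 16 is the upper
    # end of the range the authors mention for other models.
    for n_h in (6, 16):
        print(n_h, activation_free_energy(n_h), unitary_water_permeability(n_h))
```

Under these assumptions, N_H = 6 gives ΔG‡ = 2.6 kcal/mol and a p_f on the order of 10^-12 cm^3 s^-1, the magnitude Reviewer #3 cites in the summary.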

      Proton transfer through a hydrogen‐bonded network of waters requires only that the electronic structure of the network be rearranged during proton transfer; water is not required. As in the previous study (DeCoursey & Cherny, 1997), the lack of water flux reported here seems to reinforce the notion that H+ moves separately from its waters of hydration (i.e., hydronium, H3O+, is not the permeant species) and does not necessarily imply information about the mechanism of proton transfer (i.e., side chain ionization vs. Grotthuss‐type transfer in a water‐wire).

      [[

      The reviewer is mixing two unrelated issues. Of course, proton transport may be separated from mass transfer. Yet, charge transfer may or may not include one or several titratable amino acid side chains. If side chain ionization is not involved in proton transfer, a water wire must exist that connects the aqueous solutions on both sides of the membrane. In this case, an osmotic gradient will drive water molecules through the open channel. Since we did not observe such water flux, we conclude that the water wire is interrupted by at least one side chain. Thus, our experiments imply information about the mechanism of proton transfer.]]

      The authors state that: 1) "every H‐bond donating or receiving pore‐lining residue would have contributed an increment ΔΔ𝐺‡ of 0.1 kcal/mol to the Gibbs free energy of activation Δ𝐺‡ (25)" (lines 145‐147), and 2) calculating NH from this Δ𝐺‡ allows estimation of the channel's unitary water permeability (Eqn. 2). Although hydrogen bonding patterns will undoubtedly alter the free energy for channel activation, this is not the same free energy change as that for proton transfer.

      [[

      The reviewer's remark is in line with the previous and the current versions of our manuscript.]]

      Hv1 gating involves conformational changes that are both voltage and Δ pH-dependent, and the D174A mutation is known to alter the voltage dependence of gating (Fig. 2 and previous studies). The effect of D174A on Hv1 unitary conductance, however, is speculated but not unambiguous (see above).

      [[

      Our experiments unambiguously demonstrate the effect of D174A on Hv1 unitary conductance. The interpretation of the experiments is straightforward – there is no speculation involved. The contrasting opinion of the reviewer rests on his misinterpretations of (i) our measurements of proton transport rate λD for wild-type and mutant (see above) and (ii) the CCCP effect (see above).]]

      In the absence of definitive experimental data showing differences in the unitary conductance of WT vs. D174A, the authors' assumption that water permeability would be strongly temperature‐dependent (lines 154‐160) seems premature and their ensuing conclusion tenuous: "pore residues interrupt the HV1 spanning water wire, trapping the water molecules inside the HV1 channel. In contrast to water, protons cross the pore by hopping from one acidic residue to another through one or more bridging water molecules (Fig. 6)" (lines 161‐164).

      [[

      The reviewer chooses to misinterpret our lines. We did not assert that water permeability through the Hv1 channel would be strongly temperature‐dependent. We referred to the well-known fact that there is a strong temperature dependence of lipid bilayer water permeability - in contrast to the tiny effect of temperature on the water permeation across aqueous channels.

      Page 11, bottom: “Considering the stark dependence of the activation energy for background water flow across lipid bilayers (24), we repeated the experiments at a decreased temperature of 4°C. Thanks to the low background water permeability at 4°C, even tiny contributions of HV1 to Pf should be detectable. Yet, the channels did not contribute to the water flow through the vesicular membrane even though channel water permeability but weakly depends on temperature (24).”]]

      Furthermore, the authors calculate the number of hydrogen bonds (NH) that pore waters could form with pore lining residues based on an X‐ray structure of a chimeric proton channel protein (pdb: 3WKV) that: a) manifests discontinuous transmembrane water density and is known to represent a non‐conductive conformation, b) contains residues from Ci‐VSP in the critical S2‐S3 linker that form part of the proton transfer pathway, and c) exhibits structural features (i.e., highly conserved ionizable residues such as D185 and R205, which like D174 are reported to dramatically alter Hv1 gating, are packed into a solvent‐free crevice) that are inconsistent with physiological function. Given that all Hv1 ionizable mutant combinations tested so far (with the sole exception of D112V ‐ other nonionizable substitutions at D112 are tolerated) remain functional (Musset, Smith et al., 2011, Ramsey, Mokrab et al., 2010), the identities of water‐interacting residues are speculative.

      [[

      We replaced the X‐ray structure of the chimeric proton channel protein with the AlphaFold structure. We now provide views of the open and closed conformations in the Supplement based on the homology structure (13). Microsecond-long molecular dynamics simulations have optimized the latter.

      The experimental observation of mutants’ functionality (with the sole exception of D112V) supports our view that proton transfer occurs through a hydrogen‐bonded network of waters that is only once (at D112) interrupted by an amino acid side chain. The nature of the amino acids interacting with the proton transferring water molecules is of little importance.]]

      Interpreting differences in the calculated NH based on pdb: 3WKV therefore seems unlikely to reveal fundamentally important insights into Hv1 function. The authors' conclusion that "The observation rules out the formation of an uninterrupted water chain spanning the open channel from the aqueous solution at one side of the membrane to the other. NH would have governed water mobility if such a water wire had formed (24)", (lines 143‐145) therefore does not appear to be strongly supported.

      [[

      We did not base our conclusion of an obstructed water pathway on the analysis of structural models. In contrast, the conclusion is the result of our experiments. The structural models permitted the prediction of the expected water permeability. Depending on the model and the channel conformation, we find NH values between six and 16. All of these NH values translate into water permeabilities exceeding gramicidin’s water permeability. Thus, we would have been able to detect the water flux through an unobstructed proton channel.]]

      Reviewer #2:

      Summary: Voltage‐gated proton channels are peculiar members of the voltage‐gated ion channel family due to their absence of a canonical pore. Instead, protons permeate through their voltage‐sensing domain. The mechanisms of proton permeation in Hv1 channels are still unclear, with currently two competing hypotheses: (i) hopping through titratable residues within the protein; or (ii) via the Grotthuss mechanism involving proton jumping through a continuous water wire. So far, these hypotheses were only tackled by computation. The authors therefore aimed to experimentally test the two hypotheses. To do so, the authors measured the transport rates of protons and water through wild‐type and mutant D174A Hv1 reconstituted in lipid vesicles. Overall, the presented data are convincing and support their conclusion that proton conduction through the channel is not solely mediated by water transport. However, there are several aspects of the paper that I did not understand and would require clarification.

      [[

      We thank the reviewer for the positive evaluation.]]

      Major comments: My major concern is about the relevance of using the D174A mutant. The authors explain at the beginning of the paper that Hv1‐D174A is open at 0 mV, which allows measuring proton flux in systems in which voltage cannot be controlled. However, it seems from the proton flux experiments that wild‐type Hv1 can conduct protons perfectly well in the used experimental paradigm. So why test a mutant? It is actually not clear why wild‐type Hv1 can conduct protons in the proton conduction assay.

      [[

      We introduced the D174A mutation to measure water flux in a setting where the membrane potential is zero. We only performed the proton flux measurements to show that our reconstituted HV1 channels are functional. HV1 can conduct protons because we establish a transmembrane potential in the proton conduction assay. That is, only initially, extravesicular and intravesicular pH values are equal. Valinomycin addition results in a K+ efflux that, in turn, generates a membrane potential. This potential drives the HV1-mediated H+ influx.]]

      The authors should clearly state the trans‐membrane potential created by the K+ gradient across the vesicle, as well as the pH inside and outside the vesicle, and relate these conditions to their electrophysiology data to give us an idea of the open probability of wild‐type Hv1 in the conditions used in the proton conduction assays. This is critical to be able to compare the relative rates of proton transport between the wild‐type and the mutant.

      [[Page 6, bottom:

      " ...we encapsulated c_k^i=150 mM KCl in the HV1 containing large unilamellar vesicles (LUVs) and exposed these vesicles to a buffer with a K+ concentration c_k^o= 3 mM. The addition of valinomycin facilitated K+ efflux, thereby inducing a membrane potential, ψ. ψ constituted the driving force for H+ uptake. It can be calculated according to the Goldman equation:

      ψ = -RT/F ln ((c_k^i+(P_H/P_K ) c_H^i)/(c_k^o+(P_H/P_K ) c_H^o ))

      (1)

      The ratio of the HV1-mediated proton permeability P_H to the valinomycin-mediated potassium permeability P_K is always smaller than 0.04. We base our conclusion on the observation that the CCCP-mediated proton permeability represents an upper limit for P_H since CCCP always induces a faster vesicular proton uptake than HV1 (Fig. 3). Accordingly, the maximum value of P_H/P_K can be estimated as the ratio of CCCP to valinomycin conductivities. The valinomycin and CCCP conductivities are equal to 1.6 × 10^-3 Ω^-1 cm^-2 [1] and 4 × 10^-6 Ω^-1 cm^-2 [2], respectively. At pH 7.5, we find c_H^o = 10^(-7.5) M, i.e., c_k^o ≫ (P_H/P_K) c_H^o. Similarly, c_k^i ≫ (P_H/P_K) c_H^i for a broad range of intravesicular pH. With these simplifications, Eq. 1 transforms into the Nernst equation yielding:

      ψ = -(RT/F) ln (c_k^i/c_k^o) = -100 mV

      (2)

      ψ of such size may decrease intravesicular pH by nearly two units.

      Such acidification does not violate the condition c_k^i ≫ (P_H/P_K) c_H^i, so that ψ remains constant throughout the experiment. That is, the vesicle experiments proceed under voltage clamp conditions. The simple explanation is that, due to the small proton concentration and the limited buffer capacity, the K+ conductance exceeds H+ conductance under all conditions. The conclusion is in line with simulations (32), confirming that the membrane potential is driven very near the Nernst potential for K+.”]]
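The reduction of Eq. 1 to Eq. 2 can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the concentrations, the pH of 7.5, and the upper bound of 0.04 for P_H/P_K come from the quoted passage, whereas the temperature of 23 °C, the illustrative intravesicular pH of 5.5, and all function names are assumptions made here.

```python
import math

R_GAS = 8.314462618    # J/(mol K)
FARADAY = 96485.33212  # C/mol
TEMPERATURE = 296.15   # K, assumed 23 degrees C

# Values quoted in the response above
C_K_IN = 150e-3        # M, intravesicular K+
C_K_OUT = 3e-3         # M, extravesicular K+
PH_OUT = 7.5
PH_RATIO_MAX = 0.04    # quoted upper bound for P_H/P_K


def nernst_potential_mv(c_in=C_K_IN, c_out=C_K_OUT, temperature=TEMPERATURE):
    """Eq. 2: psi = -(RT/F) ln(c_in/c_out), returned in mV."""
    return -1e3 * R_GAS * temperature / FARADAY * math.log(c_in / c_out)


def goldman_potential_mv(ph_in, ph_out=PH_OUT, ratio=PH_RATIO_MAX,
                         temperature=TEMPERATURE):
    """Eq. 1 with K+ and H+ as the only permeant ions, returned in mV."""
    numerator = C_K_IN + ratio * 10.0 ** (-ph_in)
    denominator = C_K_OUT + ratio * 10.0 ** (-ph_out)
    return -1e3 * R_GAS * temperature / FARADAY * math.log(numerator / denominator)


def max_ph_shift(potential_mv, temperature=TEMPERATURE):
    """Largest pH difference a potential of this size can sustain at equilibrium."""
    mv_per_ph_unit = 1e3 * math.log(10.0) * R_GAS * temperature / FARADAY
    return abs(potential_mv) / mv_per_ph_unit


if __name__ == "__main__":
    psi = nernst_potential_mv()
    print("Nernst potential (mV):", psi)
    # pH 5.5 inside is a hypothetical value, two units below the external pH
    print("Goldman potential (mV):", goldman_potential_mv(ph_in=5.5))
    print("Maximal pH shift:", max_ph_shift(psi))
```

With these inputs the proton terms change the potential by far less than a microvolt, so the Goldman and Nernst values coincide near -100 mV, and the corresponding maximal acidification is about 1.7 pH units.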

      Similarly, the buffers and pH used for the water transport assay are not explicitly mentioned. Are they the same as for the proton transport assay or are the buffers inside and outside the vesicle symmetrical?

      [[

      We added the information about buffers and pH used to the legend. Except for 150 mM sucrose, the internal and external solutions were identical: 150 mM KCl, 5 mM HEPES (pH 7.5), and 0.5 mM EGTA.]]

      Finally, in the introduction the authors base their assumptions about water transport on an X‐ray structure of Hv1 in a closed conformation (3WKV). I do not think it is relevant to study permeation, which in theory should only happen in an open state. If the authors want to make assumptions about the number of hydrogen bonds in the pore and how many water molecules are in the pore (and I don't think they need to do it), they should rather base their assumptions on the computational models of Hv1 open state.

      [[

      We thank the reviewer for the advice. We added a figure to the Supplement. It shows Hv1 models from long-timescale molecular dynamics simulations (Geragotelis et al, Proc Natl Acad Sci U S A 2020 Vol. 117 Issue 24 Pages 13490-13498). The open structure reveals NH=6. We used this value for our calculations.]]

      Minor comments:

      1) Figure 6: the authors should specify that the model of proton conduction through Hv1 is just an assumption. The structural features of Hv1 open state are indeed unknown.

      [[We modified the figure based on the simulation results of Geragotelis et al. We indicated in the legend that the scheme is based on HV1 homology models.]]

      2) Page 9, lines 170‐171 "Drastically prolonged tail current kinetics might reflect a decreased voltage‐dependence of the deactivation in the D174 mutant". Or rather the prolonged kinetics reflect the stabilization of the open state by the mutation (as stated by the authors just after).

      [[Page 14:

      “Drastically prolonged tail current kinetics might reflect (i) a decreased voltage dependence of the deactivation in the D174A mutant or (ii) a stabilized open state (14).”]]

      3) Supplementary figures are displayed in an odd fashion. Figure S3 should be placed before Figures S1 and S2.

      [[We added two more Supplementary Figures and displayed them in the order in which they are mentioned in the text.]]

      4) In Figure 2, displaying the current trace corresponding to the 0 mV voltage step would improve readability of the figure, by showing that Hv1‐D174A mutants conduct protons at 0 mV and not wt Hv1.

      [[

      We show the current trace corresponding to the 0 mV voltage step for the D174A mutant in panel A and the trace for the wild-type in panel B of Fig. 2.]]

      5) Figure 2 legend "Pronounced inward H+ currents activate negatively to the reversal potential (here ‐70 mV)". I think the authors mean "Here 0 mV", ‐70 mV is the threshold potential. Panel (c), I guess the EH vs Vrev plot is for D174A mutants but it is not mentioned in the legend

      [[

      We corrected the legend. “Pronounced inward H+ currents activate negatively (here – 70 mV) to reversal potential (here – 8 mV), indicating a high open probability of the D174A mutant at 0 mV.” And “Comparison of calculated Nernst potential for protons (EH) and measured reversal potential (Vrev) for the D174A mutant.”]]

      6) Page 4, line 89: the fact that D174A conducts protons at a lower rate is, at this point, based on a lot of assumptions. I would just correct the last sentence by saying "Thus, D174A, while opening with less depolarization, seems to conduct protons at a lower rate"

      [[We toned down our statement and inserted a phrase very close to the one suggested.

      Page 5: “Our observation suggests a reduced flux through the mutant if we assume that the protein expression level is independent of the mutation.”]]

      7) Page 6, line 107. The word "therefore" is not necessary

      [[ok]]

      8) Page 7, line 128: "of" in "measures of transport" is missing

      [[We deleted the paragraph.]]

      9) Page 12, lines 261‐262: "Figure M" ??

      [[“Inset of Figure 3A”]]

      CROSS‐CONSULTATION COMMENTS I agree with the two other reviewers' comments. I think our reviews more or less raise the same weaknesses in the study.

      Significance

      This paper addresses a single question with a clearly defined experimental paradigm. Once the issues are addressed, the paper should be highly significant for the field of voltage‐gated ion channels since the nature of proton conduction in Hv1 was not known. It could help explain ion conduction in some channelopathies involving ion conduction through the voltage‐sensing domain. The audience is mainly the voltage‐gated ion channel community, as well as the community of membrane permeation mechanisms. My field of expertise is in ion channel structure‐function and pharmacology. I have little expertise in the described proton and water flow assays. Therefore I do not have sufficient expertise to evaluate the detailed experimental protocol that led to the measurements.

      Reviewer #3:

      Summary: This study addresses a fundamental question about the mechanism of proton conduction in the voltage gated proton channel Hv1 i.e., whether protons hop through an uninterrupted water wire, or move by other means involving titratable channel residues. The authors argue that an uninterrupted water wire entails a certain rate of water movement through the open channel, which they estimate to be around 10^-12 cm^3 s^-1 based on a structural model of Hv1 and previous work on other channels. They then measure water permeability of LUVs containing a purified Hv1 mutant expected to be open at 0 mV via light scattering, and proton flux using a pH sensitive fluorescent dye. They calculate a water permeability much lower than predicted and conclude that the water in the conduction pathway does not form an uninterrupted water wire. The manuscript is written clearly, and the experimental measurements are convincing.

      [[We thank the reviewer for the positive evaluation.]]

      There are nonetheless some ambiguities in the way the formation of water wires is discussed.

      Major comments: A protein like Hv1 is larger and more complex than small peptides like gramicidin. In this context, transient water wires, frequently interrupted by titratable residues, or by steric hindrance from hydrophobic sidechains etc. are likely. Can the authors provide an estimate for the maximum frequency and lifetime of uninterrupted proton wires compatible with their measurements? This would be helpful to evaluate whether short‐lived uninterrupted water wires could contribute significantly to proton conduction or not. Trapping usually implies restricted movement. So, for how long do water molecules need to stay inside the channel in order to be considered trapped? Are the water molecules really trapped or simply forming broken wires?

      [[Page 13, bottom:

      “The question arises whether the obstacle in the water pathway is permanent. HV1’s titratable residues or steric hindrance from fluctuating sidechains may frequently interrupt otherwise intact water wires. Yet, our calculations (Eqs. 7 – 11) show that proton diffusion from the bulk solution to the pore mouth is the transport limiting step. Undoubtedly, transient closure would have caused a detectable pore resistance because part of the protons arriving at the pore mouth could not enter the pore. If the pore was closed longer than one ps, an arriving H+ may diffuse out of the capture zone and vanish into the bulk:

      t_c = r_0^2/(6D) = 10^(-16)/(6 × 8.65 × 10^(-5)) s = 2 × 10^(-13) s

      (16)

      where tc denotes the time a proton requires to diffuse a distance equal to the capture radius r0. Since transient closures would give rise to experimentally undetected pore resistance, they must be ruled out. The observation agrees well with noise experiments, where Lorentzian time constants, albeit smaller than the time constants for H+ current activation but larger than 0.1 s were observed (41).

      We provided the calculations showing the diffusion limitations on page 9:

      “…we show that the transport limiting step is H+ diffusion to the pore (access resistance) and not transport through the pore. Therefore, we first calculate the maximum current Imax permitted by diffusion for a constantly open pore (35):

      I_max = 2π F r_0 D_H c_H

      (7)

      where F, r0, DH, and cH are Faraday's constant, the capture radius, the H+ diffusion constant, and the H+ concentration, respectively. The only unknown parameter is r0. Taking the gA estimate r0 = 0.87 Å (36), disregarding buffer effects and assuming DH = 8.65 × 10^-5 cm^2 s^-1, we find:

      I_max = 2π × 9.6 × 10^4 As/mol × 0.87 × 10^(-8) cm × 8.65 × 10^(-5) cm^2 s^(-1) × 4 × 10^(-7.5) mol/(1000 cm^3)

      (8)

      I_max=5.6 × 10^(-17) A

      (9)

      Eq. 8 considers that the approximately 25 % charged lipids in the bilayer induce an increase in surface proton concentration, i.e. it accounts for a surface potential of roughly -40 mV in 150 mM salt. The maximal unitary rate would then be equal to:

      q_max = (5.6 × 10^(-17) C/s)/(1.6 × 10^(-19) C) = 348 s^(-1)

      (10)

      Here we used the r0 value determined for gA (36). Acidic moieties at the entrance of HV1 and proton surface migration along the lipid bilayer could serve to increase that value (37, 38). The observation suggests transport limitations by poor proton availability. Calculation of the channel resistance, Rch (35), confirms the hypothesis:

      R_ch = R_pore+R_access =[l_ch+(π a_ch)/2] ρ/(π a_ch^2 )

      (11)

      where R_pore is the resistance of the pore proper and R_access is the access resistance. Assuming a channel radius, a_ch, of 0.15 nm, a length, l_ch, of 4 nm and solution resistivity (H+ as the sole conducted ion at bulk pH of 7.5 and a surface potential of -40 mV), ρ, of 2 × 10^5 Ω cm, we find R_ch = 4 × 10^13 Ω. Thus, the resulting current, Iρ, that we may expect for the vesicular membrane potential of 100 mV is equal to 3 × 10^-15 A. Accordingly, Iρ exceeds Imax by more than one order of magnitude. Consequently, we may safely conclude that HV1 conductance is limited by proton availability under our conditions.”]]
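The order-of-magnitude arguments in Eqs. 7 to 10 and Eq. 16 are easy to re-run. The sketch below is illustrative only and uses hypothetical function names. All inputs are the values quoted in the response, including the rounded constants F = 9.6 × 10^4 As/mol and e = 1.6 × 10^-19 C, and r_0^2 = 10^-16 cm^2 in Eq. 16. The channel resistance of 4 × 10^13 Ω is taken as quoted and is not recomputed from Eq. 11.

```python
import math

# Values quoted in the response above
FARADAY = 9.6e4              # As/mol, rounded as in Eq. 8
ELEMENTARY_CHARGE = 1.6e-19  # C, rounded as in Eq. 10
R0_CM = 0.87e-8              # cm, gramicidin A capture radius
D_H = 8.65e-5                # cm^2/s, H+ diffusion constant
BULK_PH = 7.5
SURFACE_ENHANCEMENT = 4.0    # factor for the -40 mV surface potential (Eq. 8)
R_CHANNEL_OHM = 4e13         # Ohm, quoted result of Eq. 11
MEMBRANE_POTENTIAL_V = 0.1   # V, vesicular membrane potential


def surface_proton_concentration(ph=BULK_PH, enhancement=SURFACE_ENHANCEMENT):
    """H+ concentration near the membrane in mol/cm^3."""
    return enhancement * 10.0 ** (-ph) / 1000.0


def max_diffusion_current(r0=R0_CM, d_h=D_H, c_h=None):
    """Eq. 7: diffusion-limited current I_max = 2*pi*F*r0*D_H*c_H in A."""
    if c_h is None:
        c_h = surface_proton_concentration()
    return 2.0 * math.pi * FARADAY * r0 * d_h * c_h


def max_unitary_rate(current):
    """Eq. 10: protons per second carried by the given current."""
    return current / ELEMENTARY_CHARGE


def capture_zone_escape_time(r0_squared=1e-16, d_h=D_H):
    """Eq. 16: t_c = r0^2 / (6 D) in s, with r0^2 in cm^2 as quoted."""
    return r0_squared / (6.0 * d_h)


def ohmic_channel_current(voltage=MEMBRANE_POTENTIAL_V, resistance=R_CHANNEL_OHM):
    """Current the pore itself would pass at the given potential, in A."""
    return voltage / resistance


if __name__ == "__main__":
    i_max = max_diffusion_current()
    print("I_max (A):", i_max)
    print("q_max (1/s):", max_unitary_rate(i_max))
    print("t_c (s):", capture_zone_escape_time())
    print("I_rho / I_max:", ohmic_channel_current() / i_max)
```

With these inputs the sketch returns roughly 5.7 × 10^-17 A and about 360 protons per second, close to the quoted 5.6 × 10^-17 A and 348 s^-1, and the ohmic current exceeds the diffusion limit by more than a factor of ten, which is the basis for the access-limitation argument.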

      The main conclusion of the paper rests on the negative results from the water permeability assay of Fig. 5. It is recommended to include a positive control (e.g., with gramicidin A), run under the same conditions and with a similar number of channels per LUV, to show what the results should look like in case of significant water permeability.

      [[We included the gramicidin measurements (Fig. 6) as requested.]]

      Figure 6 shows a simplified scheme of proton transport with trapped water molecules in Hv1. Panel A represents a resting state (nonconductive); panel B represents an open state (conductive), favored by the D174A mutation. So, what makes B conductive and A nonconductive? Is it the presence of two salt bridges in B vs. three salt bridges in A? This should be clarified.

      [[

      We modified the figure based on the simulation results of Geragotelis et al. We indicate with arrows the parts of the channel where the proton is free to move and with crosses the sites with insurmountable energy barriers.

      Legend to the figure (now Fig. 8): “In the region of the selectivity filter adjacent to D112, the channel is too narrow to let water molecules pass (see also Fig. S1). Yet, the proton may bypass the electrostatic barrier of the open channel at D112 (18), i.e., jump between the two neighboring water molecules. Removal of D174 shifts the voltage sensitivity so that most channels are already open at a transmembrane potential of 0 mV. B) The closed channel. It neither allows water nor proton transport. In its new location, D112 provides an insurmountable electrostatic barrier to proton passage.”]]

      Minor comments: The interpretation of Fig. 2E strongly depends on the assumption that the D174A mutation does not alter membrane trafficking. It is recommended to check the validity of this assumption, e.g., by colocalization with a plasma membrane marker. Images of SDS‐PAGE results for the studied Hv1 proteins should be provided to show preparation purity.

      [[

      We toned down the interpretation of Fig. 2E. As it stands now, Fig. 2 shows that the mutant (i) is functional and (ii) has a high open probability at 0 mV. These conclusions are independent of membrane trafficking. We included images of SDS-PAGE results for the studied HV1 proteins in the Supplement.]]

      CROSS‐CONSULTATION COMMENTS I agree with the comments from the other two reviewers. My major point is that refuting major water permeability in Hv1 is not the same thing as refuting that protons can be conducted by transient water wires, unless it is proved that the transient water wires cannot sustain enough proton movement to account for the single channel conductance. Reviewer #3 (Significance (Required)): The Hv1 channel plays important roles in the human body, including the immune, respiratory, and reproductive systems. Despite recent advances in understanding the mechanism of proton conduction by Hv1, whether or not protons hop within a continuous water wire in the open channel is a subject of debate (DeCoursey J. Physiol. 2017, Bennett & Ramsey J. Physiol. 2017). This work provides important insights on the debate by refuting the existence of a water wire that can sustain large water permeability. The findings reported here will be of interest to ion channel biophysicists like this reviewer, but also to biologists studying cellular pH homeostasis and the pathophysiology of Hv1.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      August 17, 2022

      RE: Review Commons Refereed Preprint #RC-2022-01442

      Dear Editor of the EMBO Journal,

      Please find our updated manuscript and response to the reviewers’ comments. We appreciate the effort that the reviewers have put into the evaluation of our manuscript.

      We are happy with the potential importance the reviewers see in the study:

      Reviewer 1: The finding that ubiquitination occurs inside mitochondria would be an important conceptual advance, which would open new perspectives both for ubiquitination and mitochondrial biology

      Reviewer 2: This work would represent a significant/exceptional discovery if supported by compelling data.

      Reviewer 3: the results are interesting and very important, as mentioned in the major comments section…

      With regard to the major comments raised by the reviewers, you will find below our specific response point by point with explanations and suggested novel experiments (highlighted in yellow). In summary we suggest the following actions to fully support our model:

      • We will perform α-complementation with ubiquitin (lacking the GG motif) fused at its C-terminus to the short fragment of β-galactosidase (α). Blue colonies with ωm will indicate import.
      • Figure S2, now added to the manuscript, shows detection of ubiquitinated proteins and mono-ubiquitin in extracts of mitochondria pre-treated with trypsin.
      • A bio-archives address of our other manuscript will be provided.
      • The use of α-complementation for protein localization was developed by us 15 years ago and has since been used by us and other groups, verifying its use as a screening tool. One point is clear: ωm and ωc do not leak into other subcellular compartments. Nevertheless, in research on specific genes, validation is important. Yes!!! ωm and ωc are exclusively located in mitochondria or the cytosol, respectively.
      • We will highly purify mitochondria on gradients and treat them with protease.
      • We cannot be sure that we will be able to detect a protein with ubiquitin modifying activity which functions solely on certain proteins in mitochondria, so publication cannot rely on this.
      • Repeat mass spectrometry with careful editing will be undertaken as suggested by the reviewer.
      • We will attempt to perform protease protection assays in the presence of specific detergents.

      Before tackling the very tough revision, we would like to know if EMBO Journal would positively consider acceptance of our manuscript based on the review and planned revision.

      Prof. Ophry Pines, Microbiology & Molecular Genetics, Hebrew University of Jerusalem, Jerusalem 91220, Israel


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this manuscript, Zhang et al. investigate whether ubiquitination occurs inside mitochondria of the budding yeast S. cerevisiae. They first observe thanks to a sensitive complementation assay that several components of the yeast ubiquitination (and deubiquitination) machinery can localize inside mitochondria. To be able to specifically probe ubiquitin conjugates assembled inside mitochondria they fused HA-tagged ubiquitin to a mitochondrial targeting sequence. Using this construct, they demonstrate that ubiquitin conjugates can be assembled in mitochondria. A series of elegant experiments demonstrates that the pattern of ubiquitin conjugates depends on the mitochondrial localization and the activity of the ubiquitin conjugating enzyme Rad6. Altogether, these results convincingly demonstrate that ubiquitination can occur inside yeast mitochondria when ubiquitin is intentionally targeted inside this organelle. It however remains unclear whether mitochondrial ubiquitination occurs in endogenous conditions (without targeting ubiquitin into this compartment) and whether it affects mitochondrial functions.

      Response: Regarding the question whether mitochondrial ubiquitination occurs in endogenous conditions, we feel that this is obvious based on our results. We detect numerous ubiquitination related enzymes (E1, E2, E3, DUB) eclipsed in mitochondria but none of the proteasome subunits. As pointed out by the reviewer “these results convincingly demonstrate that ubiquitination can occur inside yeast mitochondria”. With that said, additional data will be incorporated into the manuscript as suggested by the reviewer and can be seen below.

      Major comments:

      1) The materials and methods section is lacking important information (western blot protocol, details of antibodies, strains, plasmids...). It is thus difficult to evaluate how several experiments were performed and how their design (e.g. the promoters chosen to express tagged proteins) could impact the interpretation of the results. This is a major issue that needs to be corrected. The main text should also explicitly indicate whether tagged proteins used in the alpha-complementation assay are overexpressed or not.

      Response: The materials and methods section will be updated accordingly.

      2) Despite the previous comment, the data presented in the manuscript convincingly demonstrate that multiple components of the ubiquitination machinery can localize within mitochondria and that ubiquitin conjugates can be assembled in mitochondria when ubiquitin is modified to be intentionally targeted into this compartment. However, little data is shown to support the hypothesis that ubiquitin conjugates can be assembled in mitochondria when ubiquitin is not fused to a mitochondrial targeting sequence. Thus, in my opinion, the evidence presented in the current manuscript is not sufficient to conclude that ubiquitin conjugates are assembled in mitochondria in endogenous conditions (as this is done implicitly). Additional evidence is needed to draw this conclusion (see some experimental suggestions hereafter). Without further evidence, the speculative aspects of the claim that "ubiquitination occurs in the mitochondrial matrix" should be discussed explicitly.

      Response: See the discussion above for why we are confident that ubiquitination occurs in mitochondria. Our major problem with ubiquitin and the ubiquitination enzymes is that they are eclipsed in mitochondria. As suggested by the reviewer (item 4 of the review), we propose to perform α-complementation with ubiquitin fused at its C-terminus to the short fragment of β-galactosidase (α). Blue colonies with ωm will indicate import.

      3) The authors used a mass spectrometry approach to identify mitochondrial ubiquitination substrates. However, they have not yet succeeded in identifying a substrate whose modification is specifically regulated by a given component of the mitochondrial ubiquitination machinery. They have also not identified a phenotype or process impacted by mitochondrial ubiquitination. Thus, at this stage, the biological consequences of mitochondrial ubiquitination remain elusive.

      Response: We have not identified a substrate whose modification is dependent on a given component of the mitochondrial ubiquitination machinery, even though we have tried. Again, the problem is low levels of these proteins eclipsed in mitochondria. Even when we do find a protein that is ubiquitinated (e.g. Aco1), its ubiquitination is not exclusively dependent on Rad6. Thus, different ubiquitin enzymes may have the same substrates.

      4) The authors have not directly investigated whether ubiquitin itself (without a mitochondrial targeting sequence) localizes in mitochondria. I encourage them to address this question since it would provide an important piece of evidence suggesting that mitochondrial ubiquitination can occur in endogenous conditions. This could be done using the alpha-complementation assay and the results could be presented within Figure 1. Ideally this experiment should be performed without overexpressing ubiquitin. Note that if the authors decide to use a C-terminally tagged form of ubiquitin for this experiment, the GG motif of ubiquitin should be mutated to avoid cleavage of the alpha tag by cellular DUBs. This form of ubiquitin will not be conjugatable, but this is not an issue for this experiment since its aim is to determine whether ubiquitin can be targeted to mitochondria, not to probe conjugates.

      Response: We will perform experiments as suggested by the reviewer, including ubiquitin fused at its C-terminus to the short fragment of β-galactosidase (α); see item 2. We have previously made a PreSu9-Ubi lacking the GG motif, but will now look at a different combination of this and other constructs.

      5) In the top panels of Figure 2 and S1, free ubiquitin is well detectable in the total and cytosolic fractions. It is however not clear to me whether it is also detectable in the concentrated mitochondrial fraction. If yes and if it would be resistant to trypsin digestion, it would provide additional evidence that endogenous ubiquitin can be targeted to the mitochondrial matrix (see previous comment).

      Response: See Item 6.

      6) The data shown in the top panel of Figure 2 and S1 also suggest that free ubiquitin is less concentrated in mitochondria than in the cytosol (since it is more difficult to detect in the concentrated mitochondrial fraction than in the cytosolic fraction, see previous comment). It is thus possible that the use of preSu9-HA-Ubi (or preFum1-HA-Ubi) leads to an artificially high intra-mitochondrial concentration of free ubiquitin. As the concentration of free ubiquitin is known to impact ubiquitination processes, I encourage the authors to compare the relative levels of free ubiquitin present in the mitochondrial fraction prepared from WT and preSu9-HA-Ubi (or preFum1-HA-Ubi) expressing cells. If free ubiquitin is detectable in mitochondrial fractions and resistant to trypsin (see previous comment), this could be done by repeating the experiment shown in Figure 3B and probing the blot with an antibody that recognizes free ubiquitin.

      Response to 5 and 6: Detection of ubiquitin in mitochondria is extremely difficult even when mitochondria are 15-fold concentrated versus the cytosol and when HA-Ubi is overexpressed. Thus, ubiquitin is eclipsed in mitochondria. Nevertheless, the figure below, which was not part of the submitted manuscript yet was performed in parallel to experiments done early on, shows detection of very weak bands of free ubiquitin in extracts of mitochondria pre-treated with trypsin.

      Endogenous ubiquitination pattern in mitochondria of Δrad6 cells is restored to normal by Rad6-α. WT or Δrad6 cells containing a Rad6-α construct or an empty plasmid were subjected to subcellular fractionation. Mitochondrial fractions, with or without trypsin treatment, were probed for ubiquitin by WB. Aco1 is a mitochondrial matrix protein, and Tom70 is a mitochondrial outer membrane (MOM) protein facing the cytosol.

      7) I strongly encourage the authors to provide more data indicating that "ubiquitination occurs in mitochondria" by performing experiments that do not rely on the use of the preSu9-HA-Ubi or other forms of ubiquitin that are intentionally targeted to mitochondria. For instance, they could analyse the pattern of HA-Ubi conjugates of trypsin digested mitochondrial fractions prepared from wt, rad6-delta, and rad6-delta complemented with preSu9-Rad6-alpha-SL17. Note that if trypsin digested mitochondrial fractions are too contaminated by ubiquitinated proteins present outside mitochondria to perform this experiment, the authors may use the unspecific DUB Usp2 as an alternative protease to strip ubiquitinated proteins from the mitochondria periphery.

      Response: Concentrated mitochondrial extracts from WT and Δrad6 cells, untreated or treated with trypsin, were probed with anti-ubiquitin antibodies (Figure above). A band corresponding to free ubiquitin can be detected in extracts of mitochondria treated with trypsin, but it is very weak and at the limit of detection.

      Minor comments:

      1) Overall, the manuscript is well organized and easy to follow. The text is clearly written; the figures are well annotated.

      2) The authors should provide full images of all the blots with anti-ubiquitin and anti-HA antibodies so that one can see the bands corresponding to free ubiquitin (or free HA-Ubi). For instance, in Figure 3B, it is not possible to see the presence (or absence) of the band corresponding to free HA-Ubi because the very bottom of the image is cut.

      3) The authors should indicate whether the MTS of Su9 (and Fum1) are expected to be cleaved after import of preSu9-HA-Ubi (and preFum1-HA-Ubi) in mitochondria. They should also label on the corresponding immunoblots the presence (or absence) of the band corresponding to the free preSu9-HA-Ubi (and preFum1-HA-Ubi) (or HA-Ubi if the MTS is expected to be cleaved from these constructs).

      4) In Figure 3B, the ubiquitin conjugates produced with preSu9-HA-Ubi and preFum1-HA-Ubi have different migration patterns. I think this should be explicitly mentioned and discussed. Could it be due to the presence of lysine residues in the Su9 or Fum1 MTS that could lead to the assembly of artificial ubiquitin chains?

      5) The authors indicate that "endogenous Rad6 [...] is expressed at very low levels and can hardly be detected in the mitochondrial fraction by WB (Figure S5)". I did not manage to observe the band corresponding to endogenous Rad6 in the mitochondrial fraction in the pdf. The authors should provide a more contrasted or better quality image.

      CROSS-CONSULTATION COMMENTS I agree with reviewer 2 that proper validation of the complementation assay is crucial for this manuscript. I was myself wondering whether it uses endogenously tagged proteins or whether it is based on an overexpression system. I imagine this information will be detailed in the manuscript in preparation mentioned by the authors. I am therefore wondering whether it would be possible to ask the authors to provide the draft of this manuscript (or at least the validation part).

      Response: A bio-archives address of our other manuscript will be provided upon resubmission. See the other issues addressed in the response to Reviewer 2.

      I agree with most comments of reviewer 3. Regarding the hypothesis that preSu9-HA-Ubi could form aggregates on the cytosolic surface of the mitochondria, I think that the results presented in Figure 7B rather argue against it (since they indicate that Rad6 localized inside mitochondria can restore the pattern of ubiquitin conjugates). That's why (in my opinion) the major question the authors now need to address is whether intra-mitochondrial ubiquitination occurs in endogenous conditions (i.e. without forcing ubiquitin into this compartment and without E2 or E3 overexpression).

      Response: See response to the other reviewers

      Reviewer #1 (Significance (Required)):

      The finding that ubiquitination occurs inside mitochondria would be an important conceptual advance, which would open new perspectives both for ubiquitination and mitochondrial biology research. However, the significance of the current manuscript is limited because the presented evidence relies heavily on the use of artificial conditions (ubiquitin tagged with a mitochondrial-targeting sequence) that may trigger irrelevant ubiquitination events. The significance would be much higher if the authors provided further evidence indicating that intra-mitochondrial ubiquitination occurs in endogenous conditions and/or if they had identified a mitochondrial process specifically impacted by mitochondrial ubiquitination.

      Expertise of the reviewer: Ubiquitination, Yeast biology, protein-protein interactions. No specific expertise in mitochondrial biology

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In the manuscript by Yu et al., the authors test the concept that certain proteins are unevenly distributed within distinct cell compartments. Due to this localization discrepancy, protein detection in some subcellular compartments can be "eclipsed" by a predominant subset of specific protein localizing in another cell compartment their actual distribution. Therefore, tiny amounts of physiologically relevant proteins could be biologically relevant. Still, their function in some locations can be overlooked (or eclipsed) because of the high expression level of the same protein in another subcellular compartment(s). Although, this concept is not particularly novel. For example, it is already known that many different proteins can localize to distinct cellular locations (e.g., permanent mitochondrial and peroxisomal localization of many proteins or transient localization of particular proteins to separate cell compartments). The authors apply a yeast system and an α-complementation assay to test further the role of such eclipsed proteins in mitochondrial biology. Specifically, they focus on the ubiquitin (Ub, or as abbreviated incorrectly in this manuscript; Ubi) conjugation pathway, components of which have never been convincingly shown to localize inside the mitochondria. This work proposes that certain ubiquitination events can occur inside yeast mitochondria. This work would represent a significant/exceptional discovery if supported by compelling data. However, the major problem with this work is that the conclusions are based on the ectopic expression of distinct proteins. This approach is not failproof in precise protein expression/delivery to the specific subcellular locations and is likely to result in a non-specific localization. Thus, the problem of eclipsed proteins is addressed by the methodology that may lead to the artificial generation of eclipsed overexpressed proteins. 
      A more effective approach would be if the authors found a way to study this issue with endogenous proteins. The need for overexpression of mitochondria-targeted ubiquitin makes it challenging to reconcile the physiological role of these findings. In addition, some critical technical issues and omissions further reduce the potential impact of this work (see Specific comments below). For example, strong evidence of mitochondria fraction purity and additional evidence that all the essential constructs used in this work are not misdirected to a different compartment are needed.

      Response: “Although, this concept is not particularly novel” is a very disappointing remark by the reviewer!! While dual targeting of proteins has been known for many, many years, how widespread the phenomenon is was unknown, and it was thought to be negligible. We have been leaders for the last 30 years in the field of dual targeting and distribution, and in particular the distribution of single translation products. We coined the terms “echoforms” and “eclipsed distribution” and developed methods to detect and screen for dual targeting. The concept of eclipsed distribution, and in particular eclipsed targeting to mitochondria, is very new and is leading to a novel perception of the mitochondrial proteome (see MS submission). While the reviewer appears to be an expert on ubiquitination, we are experts on dual targeting.

      • Ub was abbreviated incorrectly in this manuscript, Ubi. Response: This will be corrected.

      Other comments will be referred to in the response to Specific comments.

      Specific comments

      1. The authors should demonstrate beyond doubt that the ω components of their assay (ω-C, which supposedly stays in the cytosol-ONLY and the ω-M component, which seemingly remains in the mitochondria-ONLY) are in the compartment that the authors claim. These two proteins are transfected into yeast cells and overexpressed. Therefore it is possible that they leak to other, not intended, subcellular compartments. The authors assume that ω-M and ω-C are exclusively located either in the mitochondria or the cytosol. However, this should be shown as validation of the assay. The indicated reference from 2005 (Ref.13) and others are irrelevant since assays have variations and are often researcher/lab dependent. This validation is very important since a misallocation of the overexpressed ω-M or ω-C, leaking into other subcellular compartments, may cause misdetection of the α-constructs.

      Response: The use of α-complementation for protein localization was developed by us 15 years ago and has since been used by us and other groups, verifying its use as a screening tool. One point is clear: ωm and ωc do not leak into other subcellular compartments. Nevertheless, in research on specific genes, validation is important. Yes!!! ωm and ωc are exclusively located in mitochondria or the cytosol, respectively.

      It is not surprising that Ub conjugates are detected in mitochondrial fractions. It could be due to ubiquitination of the OMM (coming from the cytosol) or perhaps since the subcellular fractions were not pure mitochondria free from contamination (the likely culprit could be the ER). The mitochondrial fractions in this work were obtained by 10,000 g separation between cytosolic and mitochondrial crude fractions. Indeed, these 10,000 g crude fractions are highly impure with membranes from other compartments (i.e., microsomes, lysosomes, and so on). Therefore, more sophisticated purification methods should be used. In addition, the authors should also test these fractions for non-mitochondrial proteins from other membrane organelles.

      Response: We agree with the reviewer and therefore will take the following approaches:

      i) We will treat isolated mitochondria with protease in order to remove adhering proteins and digest OMM proteins (see attached figure).
      ii) We will highly purify mitochondria on gradients; this will be straightforward since we are now employing such methods in other projects in the lab.
      iii) Matrix protein enrichment (by mass spec) in the IP of preSu9-HA-Ub conjugates is three-fold higher than for HA-Ub. In any case, the fact that we identify conjugates of proteins not known to be mitochondrial strongly supports our thesis.

      Figure 2. Coomassie blue staining does not show any signal in the "M" fraction. It can be interpreted that the authors do not get any mitochondria there, and therefore the lack of Ub signal is due to the absence of the protein in the samples. Using the same amount of protein from each fraction would probably reduce the necessity of 15x enrichment.

      Response: The Coomassie blue staining does show a signal in the "M" fraction, although it is weak. When a 15x enrichment is run, the protein level by Coomassie blue staining is similar to that of the cytosolic fraction.

      Figure 3. It is puzzling why the HA-UBQ presence is so strong in the crude mitochondrial fraction, but the preSu9-HA-Ub signal (mito-matrix) is comparatively weak. These data suggest that the crude mito-fraction could be highly contaminated with OTHER membranes. On the other hand, the preSu9-HA-UBQ signal is no more than 1-5% of the total mitochondrial signal. The high enrichment of the HA-Ubi in both cytosols and the mitochondria could indicate the OMM ubiquitination or (again) contamination by other compartments. The constructs with MTS are detected in the mitochondria. However, the localization of tagged MTS-Ubi in a non-targeted compartment (e.g., cytosol) should be excluded by additional exposure times. Because the manuscript talks about eclipsed proteins, this is important.

      Response: The HA-Ub is strong in the mitochondrial fraction, in the absence of trypsin, but is very weak in the presence of the protease indicating that most of the ubiquitinated proteins are externally attached to mitochondria. In contrast, PreSu9-HA-Ub is imported into the mitochondrial matrix and is protected from trypsin. This manuscript refers to “eclipsed in mitochondria” (not the cytosol) and this is true for ubiquitination enzymes as well as for ubiquitin.

      Figure 3C-E. These data indeed suggest that the Ub-conjugates could be formed inside the mitochondria. However, the above-discussed possibility that other than mitochondria compartments co-sediment in the 10,000g fractions makes the data interpretation highly challenging.

      Response: We will highly purify mitochondria on gradients; this will be straightforward since we are now employing such methods in other projects in the lab.

      Figure 4. Unsurprisingly, mitochondrial targeting of Ub leads to detecting some co-immunoprecipitating mitochondrial proteins. However, these data do not support the notion that Ub conjugation machinery acts inside the mitochondria and that the target proteins are indeed conjugated with Ub (the interaction with Ub is not equal to being conjugated). At the minimum, the authors should provide a validation that some of the detected mitochondrial matrix proteins are indeed ubiquitinated. To this end, purified mitochondria could be used for the candidate protein IP under denaturing conditions and then blotted for the candidate protein and Ub.

      Response: As shown in Table S2 and Figure S7, forms of Ilv5, a mitochondrial protein, are ubiquitinated in WT and Δrad6 cells. These modified forms of Ilv5 can be eluted from mitochondrial extracts of WT and Δrad6 cells. However, the ubiquitination of Ilv5 is not dependent on or affected by the Δrad6 mutation. We cannot be sure that we will be able to detect a protein with ubiquitin modifying activity which functions solely on certain proteins in mitochondria.

      Figure 5. The knock-out of the E2 Rad6 causes a change in the mitochondria ubiquitination pattern. This is an interesting observation, but again it does not prove that the change in the mitochondrial ubiquitination is due to the activity of Rad6 inside of the mitochondria, as opposed to ubiquitination of the OMM proteins or contaminating fractions. One also wonders why overexpression of mitochondria-targeted Ub would be necessary to detect the ubiquitination if this process was physiologically relevant, especially given that detecting endogenous Ub is not challenging. Furthermore, the apparent increase in ubiquitination in E2 mutant cells (Fig. 5) should also be addressed in more detail. Finally, data from one WB is shown, and quantification of several independent experiments should also be provided.

      Response: We show in the MS that Rad6 is exclusively targeted to mitochondria (Su9MTS) while unimported molecules are degraded (SL17; degron). This hybrid Rad6 can restore the WT ubiquitin pattern, while a Rad6 active site mutant cannot.

      Figure 6. Can the authors provide Western blot data showing the expression of Rad6? Furthermore, quantifying these rescue experiments is necessary to make this conclusion more solid.

      Response: Even though we did not succeed in making good Rad6 antisera, we can clearly detect Rad6-α fusion proteins (Figure 7B).

      Figure 7. The authors found that preSu9-Rad6-α has problems being imported into the mitochondrial matrix; therefore, they rebuilt it as a preSu9-Rad6-α-SL17 protein. SL17 is a degron that targets the cytosolic protein (not imported into the mitochondria) to the proteasome for degradation (Figs. 7A-B-C). These issues could be a red flag for the rest of the manuscript, suggesting that other constructs (that were not critically evaluated for their localization in this work) could leak to different cellular compartments.

      Response: The wording used by the reviewer is particularly disturbing since the current understanding in cell biology of eukaryotic cells does not accept “leaking” of proteins to different cellular compartments. One wouldn’t want DNases, RNases, proteases, etc. leaking from one compartment to another. The localization of proteins to different cellular compartments involves very precise signals on the proteins, and specific cellular components, such as translocases, are required to target proteins to their exact destination. This is true for Rad6; it contains an MTS-like sequence which, when removed, blocks import of the protein into mitochondria. Rad6, according to our analysis, is an eclipsed dual-targeted protein, so it is no surprise that it is in two compartments, and the trick with the SL17 degron solves the problem.

      The manuscript needs to be carefully edited: some references are not in the correct format, and there are issues with figure labels.

      Response: Careful editing will be undertaken as suggested by the reviewer.

      CROSS-CONSULTATION COMMENTS I agree with a great summary by reviewer 1. This discovery should be validated by top-quality data.

      Reviewer #2 (Significance (Required)):

      In the manuscript by Yu et al., the authors test the concept that certain proteins are unevenly distributed within distinct cell compartments. Due to this localization discrepancy, protein detection in some subcellular compartments can be "eclipsed" by a predominant subset of specific protein localizing in another cell compartment their actual distribution. Therefore, tiny amounts of physiologically relevant proteins could be biologically relevant. Still, their function in some locations can be overlooked (or eclipsed) because of the high expression level of the same protein in another subcellular compartment(s). Although, this concept is not particularly novel. For example, it is already known that many different proteins can localize to distinct cellular locations (e.g., permanent mitochondrial and peroxisomal localization of many proteins or transient localization of particular proteins to separate cell compartments). The authors apply a yeast system and an α-complementation assay to test further the role of such eclipsed proteins in mitochondrial biology. Specifically, they focus on the ubiquitin (Ub, or as abbreviated incorrectly in this manuscript; Ubi) conjugation pathway, components of which have never been convincingly shown to localize inside the mitochondria. This work proposes that certain ubiquitination events can occur inside yeast mitochondria. This work would represent a significant/exceptional discovery if supported by compelling data. However, the major problem with this work is that the conclusions are based on the ectopic expression of distinct proteins. This approach is not failproof in precise protein expression/delivery to the specific subcellular locations and is likely to result in a non-specific localization. Thus, the problem of eclipsed proteins is addressed by the methodology that may lead to the artificial generation of eclipsed overexpressed proteins. 
      A more effective approach would be if the authors found a way to study this issue with endogenous proteins. The need for overexpression of mitochondria-targeted ubiquitin makes it challenging to reconcile the physiological role of these findings. In addition, some critical technical issues and omissions further reduce the potential impact of this work (see Specific comments above). For example, strong evidence of mitochondria fraction purity and additional evidence that all the essential constructs used in this work are not misdirected to a different compartment are needed.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, the authors detected a set of components of a ubiquitination system in the mitochondrial matrix in budding yeast using the subcellular compartment-dependent α-complementation assay. The authors detected the conjugates of mitochondrial targeting signal sequence-directed HA-Ub (preSu9-HA-Ub) in the mitochondrial matrix. The immunoprecipitates of the preSu9-HA-Ubi conjugates were highly enriched for the mitochondrial matrix proteins. Subsequently, the authors focused on the Rad6 E2 ubiquitin conjugating enzyme in the mitochondrial matrix and evaluated its inactivation-altered ubiquitination pattern in the organelle. The authors conclude that ubiquitination occurs in the mitochondrial matrix because of the eclipsed targeted components of the ubiquitination machinery.

      Major comments: The authors argued that the proteins that were modified with preSu9-HA-Ubi, which was forced to be imported into the mitochondria, are present in the mitochondrial matrix, because these species are resistant to trypsin digestion. However, it is possible that they formed severe aggregates on the cytosolic surface of the mitochondria, and hence were resistant to the proteinase. In other words, a small amount of proteins that were not imported into the mitochondria could be deposited on the cytosolic surface of the mitochondria, where they were modified with preSu9-HA-Ubi by cytosolic Rad6. To confirm that the preSu9-HA-Ubi-modified proteins are really present in the mitochondrial matrix, the authors should perform the protease protection assay in the presence of an appropriate detergent (Figure 3D). In addition, subcellular fractionation of the organelle by density gradient centrifugation, indirect immunofluorescence microscopic analysis of the preSu9-HA-Ubi conjugates, and/or experiments on the in vitro import of preSu9-HA-Ubi and Rad6 into the mitochondria would strongly support the authors' conclusion. Another experiment that might support the authors' conclusion would be to test whether the band pattern for the preSu9-HA-Ubi conjugates changes when the mitochondrial import is impaired.

      Response: We will attempt to perform 1) Protease protection assay in the presence of a detergent (Figure 3D). 2) Subcellular fractionation of the organelle by density gradient centrifugation. 3) In vitro import of Rad6 into the mitochondria.

      Minor comments: In Figure 3B, the molecular weight distributions of the preSu9-HA-Ubi conjugates and those of the preFum-HA-Ubi conjugates are different. Is there any reason for this difference?

      In Figure 3E, the position of "-" (MG132) for lane 1 is not correct.

      In Figure 6A: The band pattern for preSu9-HA-Ubi (lane 13) in the rad6-delta cells expressing Ubc8-alpha is different from that of the wild-type cells expressing Ubc8-alpha (lane 12) as well as that obtained from the rad6-delta cells harboring empty plasmids (lane 9). Is there any explanation for this observation?

      In Figure 7B and S6: The level of preSu9-Rad6-alpha-SL17 in the rad6-delta cells is always lower than that in the wild-type cells (compare lanes 13 and 10 in Figure 7B, and lanes 13 and 12 in Figure S6). Is there any explanation for this observation? The protease protection assay (with detergent control) is needed to fully confirm that preSu9-Rad6-alpha-SL17 is present in the mitochondria.

      In Figure S7, the authors presented the matrix proteins, Ilv5 and Aco1, detected in the preSu9-HA-Ubi IPed samples and described this observation in the main text. However, the authors also showed the blots for Idh1 and Fum1, which were also pulled down with preSu9-HA-Ubi from the WT cells more than from the rad6-delta cells. Is this correct? If so, please elucidate this observation in the main text.

      Figure 8D and 8E are not cited in the main text. Although there are no explanations for these figures in the main text, it looks like Rad6-deltaN11-alpha resides in the mitochondrial fraction. However, the alpha-complementation assay suggests that it resides in the cytosol. Please explain this discrepancy.

      First page of the discussion section, item 6): E2 Rad6, but not E3 Rad6?

      Figure S7: HA-Ub (cytosolic form) control is needed in addition to the empty vector control.

      Figure S7, left panel: There is an unnecessary line break in "Hsp60" and "Ilv5."

      Figure S7, right panel: There is an unnecessary line break in "Hsp60."

      CROSS-CONSULTATION COMMENTS

      I agree with the comments of reviewers 1 and 2.

      - Validation of the complementation assay.

      - I also think that it is important to address whether intra-mitochondrial ubiquitination can be observed with endogenous levels of ubiquitin. If even a small amount of preSu9-HA-Ub is mistargeted to the cytosol, proteins at the cytosolic side of the mitochondrial outer membrane could be ubiquitinated and detected in the mitochondrial fraction.

      - Preparation of mitochondria with more sophisticated purification methods (i.e. high resolution density gradient) would be needed to separate mitochondria from ER and other organelles.

      - More information is needed in the materials and methods section.

      Reviewer #3 (Significance (Required)): Significance Although the results are interesting and very important, as mentioned in the major comments section, additional experiments are needed to support their model. However, researchers working on the mitochondrial biology and ubiquitin systems might be interested in and influenced by the reported findings.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors report a public browser in which users can easily investigate associations between PGSs for a wide range of traits, and a large set of metabolites measured by the Nightingale platform in UKBB. This browser can potentially be used for identifying novel biomarkers for disease traits or, alternatively, for identifying novel causal pathways for traits of interest.

      Overall I have no major technical concerns about the study, but I would encourage the authors to revisit whether they can find a more compelling example that can better showcase the work that they have done. I understand that this is partly a resource paper but I think the resource itself can have more impact if the paper provides a clearer use-case for how it can drive novel biological insight.

      Many thanks for your comments. We have undertaken a new application of bi-directional Mendelian randomization to demonstrate how users may use this approach to disentangle whether associations in our atlas likely reflect either causes or consequences of PGS traits/diseases. This example is described on page 9:

      ‘For example, we applied Mendelian randomization (MR) to further evaluate associations highlighted in our atlas with triglyceride-rich very low density lipoprotein (VLDL) particles. For instance, both VLDL particle average diameter size and concentration were associated with the PGS for body mass index (BMI) (Beta=0.04, 95% CI=0.033 to 0.046, P<1x10-300 & Beta=0.012, 95% CI=0.006 to 0.019, P=2.7x10-4 respectively) and coronary heart disease (CHD) (Beta=0.026, 95% CI=0.019 to 0.032, P<1x10-300 & Beta=0.035, 95% CI=0.028 to 0.042, P<1x10-300 respectively). Conducting bi-directional MR suggested that the associations with average diameter of VLDL particles are likely attributed to a consequence of BMI and CHD liability as opposed to the size of VLDL particles having a causal influence on these outcomes (Supplementary Table 6). In contrast, MR analyses suggested that the concentration of VLDL particles increases risk of CHD (Beta=1.28 per 1-SD change in VLDL particle concentration, 95% CI=1.25 to 1.65, P=2.8x10-7) which may explain associations between the CHD PGS and this metabolic trait within our atlas.’

      and discussed in the discussion on page 21:

      ‘We likewise conducted bi-directional MR to demonstrate that associations between the CHD PGS and VLDL particle size likely reflect an effect of CHD liability on this metabolic trait. In contrast, the association between the CHD PGS and VLDL concentrations are likely attributed to the causal influence of this metabolic trait on CHD risk, suggesting that it is the concentration of these triglyceride-rich particles that are important in terms of the aetiology of CHD risk as opposed to their actual size. We envisage that findings from our atlas, as well as other ongoing efforts which leverage the large-scale NMR data within UKB, should facilitate further granular insight into lipoprotein lipid biology.’
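
      The bi-directional logic described in these passages can be sketched with a fixed-effect inverse-variance weighted (IVW) estimator, which is one common MR estimator and is assumed here purely for illustration; the response does not state which estimator was used. The function name and all summary statistics below are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW estimate of the exposure-outcome effect and its standard error."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical summary statistics (illustrative numbers only, not from the atlas).
# Forward direction: instruments for the metabolic trait, outcome = disease.
forward = ivw_estimate([0.10, 0.20, 0.30], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
# Reverse direction: instruments for disease liability, outcome = metabolic trait.
reverse = ivw_estimate([0.10, 0.20, 0.30], [0.001, -0.002, 0.001], [0.01, 0.01, 0.01])

print("forward estimate, SE:", forward)
print("reverse estimate, SE:", reverse)
```

      In this toy input the forward estimate is clearly non-zero while the reverse estimate is close to zero, which is the pattern one would read as the metabolic trait influencing disease risk rather than the other way round.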

      PGS construction: It's unclear how well the PGS work. Should the reader prefer the stringent or lenient PGS? Perhaps there could be some validation with traits that have decent sample sizes in UKBB. Was there any filtering to remove traits with few GWS hits, low sample sizes, or low SNP heritability as these are unlikely to produce useful PGSs?

      An example of validation was previously included for the chronic kidney disease PGS and its association with circulating creatinine, although this has now been removed due to the feedback you provided in your comments below. However, we have now provided the weights for all of the PGS included in our web atlas should users want to use these scores for prediction purposes (page 7):

      ‘The specific weights for clumped variants used in all PGS can be found at https://tinyurl.com/PGSweights.’

      On page 8 we have mentioned that in this work we have used a more lenient threshold to facilitate endeavours in a ‘reverse gear Mendelian randomization’ framework. However, the more stringent threshold remains an option for users interested in this as an alternative:

      ‘In this paper, we have discussed findings using PGS that were derived using the more lenient criteria (i.e., P<0.05 & r2<0.1), although all findings based on both thresholds can be found in the web atlas.’

      ‘Specifically, we believe our findings can facilitate a ‘reverse gear Mendelian randomization’ approach to disentangle whether associations likely reflect metabolic traits acting as a cause or consequence of disease risk (Holmes and Davey Smith, 2019) as illustrated using triglyceride-rich very low density lipoprotein (VLDL) particles in the next section.’

      We have not filtered based on other criteria such as the number of SNPs, given that certain scores, despite only being constructed using few SNPs, may still prove useful to users. For example, our score for ‘Drinks per day’ based on the more stringent threshold (i.e. P<5x10-8) consists of only 6 SNPs. However, one of these is rs1229984, a missense variant located at the alcohol dehydrogenase ADH1B gene region and known to be a strong predictor of alcohol use (e.g. https://pubmed.ncbi.nlm.nih.gov/31745073/).
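
      As a minimal sketch of how the lenient and stringent thresholds discussed above change a score, the example below computes a PGS as a weighted sum of allele dosages over variants passing a P-value cut-off. It assumes clumping (e.g. r2<0.1) has already been applied upstream; the variant identifiers, weights and dosages are hypothetical and are not taken from the published weights.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    variant_id: str
    weight: float   # per-allele effect size from the GWAS (hypothetical values)
    p_value: float  # GWAS P-value for the variant


def polygenic_score(variants: List[Variant], dosages: Dict[str, float], p_threshold: float) -> float:
    """Sum of weight * allele dosage over variants with P below the threshold."""
    return sum(v.weight * dosages[v.variant_id] for v in variants if v.p_value < p_threshold)


# Already-clumped variants for one hypothetical trait.
clumped = [
    Variant("snp_a", 0.10, 1e-9),
    Variant("snp_b", -0.05, 0.01),
    Variant("snp_c", 0.02, 0.20),
]
# Effect-allele dosages (0 to 2) for one hypothetical individual.
dosages = {"snp_a": 2.0, "snp_b": 1.0, "snp_c": 1.0}

lenient = polygenic_score(clumped, dosages, p_threshold=0.05)    # keeps snp_a and snp_b
stringent = polygenic_score(clumped, dosages, p_threshold=5e-8)  # keeps snp_a only

print("lenient:", lenient, "stringent:", stringent)
```

      A score built from very few variants, as in the stringent case here, can still be informative when one of those variants has a large effect, which is the point made above about rs1229984.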

      Reviewer #2 (Public Review):

      The authors set out to create an atlas of associations between phenome-wide polygenic scores and circulating lipids, fatty acids, and metabolites. To do so, they utilize GWAS from 129 traits available in the OpenGWAS database to derive polygenic (risk) scores (PGS) along with the recently released NMR metabolomics data containing 249 biomarkers (and ratios) in ~120,000 UK Biobank participants. The authors create a publicly available web portal containing PGS to NMR biomarker associations:

      http://mrcieu.mrsoftware.org/metabolites_PGS_atlas/.

      The strength of this study is in the comprehensive nature of the atlas, containing associations for 129 traits phenome-wide, the large sample size of the UK Biobank NMR data, and the use of PGS for prioritising molecular traits for follow-up experiments, which is an emerging area of interest (International Common Disease Alliance, 2020; Ritchie et al., 2021a). To our knowledge this study is the first to explore this for circulating metabolites.

      In its current form the atlas has several limitations, which should be straightforward to address. Notably, results in the current atlas may be confounded by (1) technical variation in the NMR data (Ritchie et al., 2021b), and (2) major biological determinants of biomarker concentrations, including body mass index, fasting time, and statin usage.

      Firstly, thank you for the suggestion to use your ‘ukbnmr’ R package to help remove technical variations from the UK Biobank NMR metabolites data. We have applied it to remove outliers and variation in the individual data due to (1) the duration between sample preparation and sample measurement, (2) position of samples on shipment plates, and (3) different equipment (spectrometers) used. This meant that we needed to re-run our entire analysis pipeline for this project from scratch on the updated dataset. Results do not appear to have drastically changed, although we have nonetheless updated results from all downstream analyses in our online web atlas using this updated dataset provided by ‘ukbnmr’.

      Secondly, the reviewer is correct that biological factors, such as body mass index (BMI) and statin usage, are indeed strongly correlated with metabolite levels. However, we are not able to adjust for such biological factors directly in our analyses, given that they are potential colliders in the causal relationship between diseases/traits and metabolites. Statin usage may be caused by both a high genetic liability to coronary artery disease and abnormal lipoprotein lipid levels. Likewise, obesity (and changes in BMI) may result from a high genetic predisposition to cardiometabolic disorders and disrupted metabolism. Thus, adjusting for statin usage and BMI will induce collider bias (https://jamanetwork.com/journals/jama/fullarticle/2790247), which creates spurious associations between the disease/trait PGS and metabolites.

      To better illustrate this issue, we have added additional text on page 14 to justify this study design decision as well as added a new figure (Figure 3) to help demonstrate this clearly to the readers. Fasting time, on the other hand, we believe is unlikely to act as a collider and was adjusted for as a covariate in all linear regression models in this work. This is mentioned on page 25.
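
      The collider argument above can be illustrated with a small simulation. This is a sketch under simplifying assumptions (standard normal variables, linear effects, a continuous collider standing in for something like statin usage); the variable names and effect sizes are hypothetical and it is not the analysis used for the atlas. The PGS and the metabolite are simulated as independent, and both feed into the collider.

```python
import random
from typing import List, Tuple


def slope(x: List[float], y: List[float]) -> float:
    """Ordinary least-squares slope of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x: List[float], y: List[float]) -> List[float]:
    """Residuals of y after regressing it on x (with intercept)."""
    b = slope(x, y)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return [yi - mean_y - b * (xi - mean_x) for xi, yi in zip(x, y)]


def simulate(n: int = 20000, seed: int = 1) -> Tuple[float, float]:
    rng = random.Random(seed)
    pgs = [rng.gauss(0, 1) for _ in range(n)]
    metabolite = [rng.gauss(0, 1) for _ in range(n)]  # independent of pgs by construction
    # Collider: influenced by both the PGS and the metabolite.
    collider = [g + m + rng.gauss(0, 1) for g, m in zip(pgs, metabolite)]
    unadjusted = slope(pgs, metabolite)
    # Coefficient of pgs in metabolite ~ pgs + collider, obtained by
    # residualising both variables on the collider (Frisch-Waugh-Lovell).
    adjusted = slope(residuals(collider, pgs), residuals(collider, metabolite))
    return unadjusted, adjusted


unadjusted, adjusted = simulate()
print("unadjusted slope:", unadjusted)
print("collider-adjusted slope:", adjusted)
```

      Under this setup the unadjusted slope stays near zero, whereas adjusting for the collider produces a clearly negative slope even though no PGS-metabolite effect was simulated.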

      …Further, association results for two (of the 129) PGSs, systolic blood pressure (SBP) and diastolic blood pressure (DBP), are invalid (vastly inflated) as the GWASs used to construct these PGSs included UK Biobank samples.

      Many thanks for your suggestion. We have now removed the SBP and DBP PGS from our atlas due to overlapping samples in UKB. Furthermore, our colleagues at the University of Bristol have notified us that the Glioma GWAS data obtained from the OpenGWAS platform was uploaded with incorrect effect alleles. This PGS has also been subsequently removed from the atlas. Additionally, we removed the Alzheimer’s disease (without APOE) PGS because the pleiotropic effect of lipid associated genes is now systematically examined using lipid gene excluded PGS.

      To demonstrate how one might use these PGS to NMR biomarker associations to prioritise (or deprioritise) findings for follow-up, the authors select a biomarker of interest, glycoprotein acetyls (GlycA), to perform bi-directional Mendelian randomization to orient the direction of causal effects between GlycA and traits of associated PGS. However, the conclusions of this analysis are hampered by the heterogeneous nature of the GlycA biomarker, which captures the levels of five proteins in circulation (Otvos et al., 2015; Ritchie et al., 2019), making it a difficult target to appropriately instrument for Mendelian randomization analysis. This, however, does not detract from the broader point the authors make: that PGS can help prioritize molecular traits for experimental follow-up.

      We have now conducted further sensitivity analyses to evaluate the genetically predicted effects of each of the five proteins in the reference you have provided. This is discussed on page 11:

      ‘We also conducted further sensitivity analyses given that the NMR signal of GlycA is a composite signal contributed by the glycan N-acetylglucosamine residues on five acute-phase proteins, including alpha1-acid glycoprotein, haptoglobin, alpha1-antitrypsin, alpha1-antichymotrypsin, and transferrin (Otvos et al., 2015). Using cis-acting plasma protein (where possible) and expression quantitative trait loci (pQTLs and eQTLs) as instrumental variables for these proteins (Supplementary Table 12) did not provide convincing evidence that they play a role in disease risk for associations between PGS and GlycA (Supplementary Table 13). The only effect estimate robust to multiple testing was found for higher genetically predicted alpha1-antitrypsin levels on gamma glutamyl transferase (GGT) levels (Beta=0.05 SD change in GGT per 1 SD increase in protein levels, 95% CI=0.03 to 0.07, FDR=3.6x10-3), although this was not replicated when using estimates of genetic associations with GGT levels from a larger GWAS conducted in the UK Biobank data (Beta=1.6x10-3, 95% CI=-6.9 x10-3 to 0.01, P=0.71). For details of pleiotropy robust analysis and replication results see Supplementary Table 14.’

      There are also several important limitations to the study which cannot be addressed, which the authors discuss appropriately in the paper. First, the NMR data does not provide a comprehensive view of the metabolome - it is heavily focused on lipids and fatty acids. Many small metabolites in circulation cannot be measured by NMR spectroscopy, and further insights must wait for data from molecular profiling efforts planned or underway in UK Biobank (e.g. mass spectrometry). Second, the authors restricted analysis to participants of European ancestries. This is a pragmatic analysis choice given (1) the PGSs were derived from GWAS performed in European ancestries, (2) PGS associations are particularly susceptible to confounding from genetic stratification and differences in environment, and (3) the very small sample sizes for which NMR data is currently available in UK Biobank participants. Finally, although a large sample size, UK Biobank is not a random sample of the population: healthy adults are over-represented, meaning PGS to metabolite associations may be different in disease cases or less healthy individuals.

      Overall this study has strong potential, with straightforward to address limitations, and the resulting atlas will provide a useful characterisation of the relationships between NMR biomarkers and polygenic predisposition to various traits and diseases, which can be used by domain experts to prioritise biomarkers or traits for experimental follow-up.

      Reviewer #3 (Public Review):

      Fang et al. created an atlas for associations between the genetic liability of common risk factors or complex disorders and the abundance of small molecules as well as the characteristics of major apolipoproteins in blood. The whole study is well executed, and the statistical framework is sound. A clear strength of the study is the large array of common risk factors and diseases analyzed by means of polygenic risk scores (PGS). Further, the development of an open access platform with appealing graphical display of study results is another strength of the work. Such a reference catalog can help to identify novel biomarkers for diseases and possible causative mechanisms. The authors further show how such a systematic investigation can also help to distinguish cause from consequence. For example, an inflammatory molecule readily measured by the NMR platform and strongly associated in observational studies is likely to be a consequence rather than a cause of common complex diseases.

      However, in its current form, the study suffers from some weakness that would need to be addressed to improve the applicability of the 'atlas'. This includes a distinction of locus-specific versus real polygenic effects, that is, to what extent are findings for a PGS driven by strong single genetic variants that have been shown to have dramatic impact on small molecule concentrations in blood.

      Thank you for your suggestions to help refine our work. In line with this comment, we have 1) repeated all analyses after applying the ‘ukbnmr’ R package, as recommended by reviewer #2, to remove technical variations and outliers and 2) conducted sensitivity analyses to remove an established list of lipid gene loci from PGS construction. Full results can be interrogated in the web atlas to evaluate whether PGS associations may be driven by locus-specific effects at these regions, which may be particularly informative given the representation of lipoprotein lipid metabolites on the NMR panel. Findings are reported on page 19:

      ‘The polygenic nature of complex traits means that the inclusion of highly weighted pleiotropic genetic variants in PGS may introduce bias into genetic associations within our atlas. To provide insight into this issue, we constructed PGS excluding variants within the regions of the genome which encode the genes for 14 major regulators of NMR lipoprotein lipids signals which captured 75% of the gene-metabolite associations in the Finnish Metabolic Syndrome In Men (METSIM) cohort (Gallois et al., 2019). For details of these genes see Supplementary Table 5.

      For PGS with these lipid loci excluded, anthropometric traits such as waist-to-hip ratio (N=209), waist circumference (N=206) and body mass index (N=205) still provided strong evidence of association with the majority of metabolic measurements on the NMR panel based on multiple testing corrections. Elsewhere however, the Alzheimer’s disease PGS, which was associated with 60 metabolic traits robust to P<0.05/19 in the initial analysis including these lipid loci (Supplementary Table 17), provided no convincing evidence of association with the 249 circulating metabolites after excluding the lipid loci based on the same multiple testing threshold (Supplementary Table 18). Further inspection suggested that the likely explanation for this attenuation of evidence were due to variants located within the APOE locus which are recognised to exert their influence on phenotypic traits via horizontally pleiotropic pathways (Ferguson et al., 2020).’

      …Further, it is unclear how much NMR spectroscopy adds over and above established clinical biomarkers, such as LDL-cholesterol or total triglycerides. This is in particular important, since the authors do not adequately distinguish between small molecules, such as amino acids, and characteristics of lipoprotein particles, e.g., the cholesterol content of VLDL, LDL or HDL particles, the latter presenting the vast majority of measures provided by the NMR platform. Finally, the study would benefit from more intriguing or novel examples, how such an atlas could help to identify novel biomarkers or potential causal metabolites, or lipoprotein measures other than the long-established markers named in the manuscript, such as creatinine or lipoproteins.

      To address these comments, we have added a new example focusing on the granular measures of VLDL particles provided by the NMR data (on top of the examples listed at the start of the response to reviewers document), which, as the reviewer points out, is one of the strengths of the measures generated by this platform over long-established biomarkers (page 21):

      ‘We likewise conducted bi-directional MR to demonstrate that associations between the CHD PGS and VLDL particle size likely reflect an effect of CHD liability on this metabolic trait. In contrast, the association between the CHD PGS and VLDL concentrations are likely attributed to the causal influence of this metabolic trait on CHD risk, suggesting that it is the concentration of these triglyceride-rich particles that are important in terms of the aetiology of CHD risk as opposed to their actual size. We envisage that findings from our atlas, as well as other ongoing efforts which leverage the large-scale NMR data within UKB, should facilitate further granular insight into lipoprotein lipid biology.’

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1:


      1) The authors could consider qualifying the observations as preliminary as no mechanistic data or longer-term pathophysiology is investigated. Indeed, the latter is well beyond the current scope and may require generation of cell-type specific STING ki mice.

      Thank you for the comment. We have qualified our observations as preliminary (line 662). Indeed, generating cell-type specific STING ki mice is part of our future plans.

      2) The authors consistently write "NF-kB/inflammasomes" - these two pathways (although related) are quite distinct and should not be lumped together in such a way.

      Thank you for this important note; we have now corrected the text (for example, see section headings in the Results section, lines 338 and 415).

      3) Line 79: "NRLP3" should be corrected to NLRP3.

      Line 210: age of "adult mice" in weeks should be stated in the text and figure legend.

      Thank you, corrected.

      4) Line 262: In Figure 3B and D the images look very different and there is no indication of what a positive inclusion is. This should be indicated on the image.

      Thank you for the suggestion. We replaced the corresponding panels with new images, where we show the nuclei in blue and the Thioflavin S staining in magenta pseudo-color (current Figure 3E). We marked the outline of Thioflavin S positive cells in yellow. An inset showing the magnification of some neurons with inclusions is also presented.

      5) Line 280: The data of Ifi44 should also be mentioned in the text.

      Thank you. We performed new experiments to show the gene expression changes in the striatum and in the substantia nigra; therefore, the majority of the gene expression data from the cortex has been moved to the supplementary material (supplemental Figures 3 and 5) and is not discussed in detail in the current manuscript.

      6) Line 290: Figure 4, Examining IL-1B and Caspase-1 transcripts is not a readout of inflammasome activation. pro-IL1B is upregulated in response to NFkB activity. Inflammasome activation is commonly examined in other methods e.g. via ASC puncta formation (imaging based), active IL-1B secretion (ELISA), Caspase-1 and IL-1B cleavage via western blot.

      Thank you for this suggestion, we performed new experiments and added the data as Figure 6.

      1) We performed Western blot analysis to detect IL-1β cleavage and NLRP3 proteins from the striatum (Figure 6C-E). 2) We quantified the number of ASC puncta within microglia and astroglia from striatal sections (Figure 6F-I). 3) We also measured the protein levels of several additional immune mediators in the striatum of STING ki and KO animals (supplemental Figure 6, summary heatmap is on Figure 7A).

      7) Line 310: The NF-kB subunit examined should be stated (p65?). Furthermore, IRF3 translocation might be a better readout for STING activation.

      We indeed detected the p65 subunit of NF-kB (antibody is listed in the supplemental Table), and now it is also indicated in the text (line 366). We also performed subcellular fractionation and quantified IRF3 in the nuclear and cytoplasmic fractions. The data is now added on Figure 6A, B.

      8) Discussion: Given the findings here suggest a strong role for NF-kB, a short discussion of IFN vs non-IFN responses from STING should be included. There have been a number of seminal papers demonstrating the importance of non-IFN STING responses of late as well as much evidence from SAVI mice to suggest some non-IFN driven pathologies.

      Thank you for the suggestion. The data on inflammasomes were given a separate section in the results (from line 395). In the discussion, from line 535 we discuss the IFN dependent response and from line 548 we discuss the non-IFN driven pathways.

      9) Discussion: Is there any evidence from the human SAVI patients of neuroinflammation etc. This should be mentioned either way in the discussion.

      Thank you for this comment. The manifestation of neurological symptoms is not a core feature of the human SAVI disease. However, some patients suffer from various neurological symptoms, e.g. calcification of basal ganglia, spastic diplegia and episodes of seizure (Fremond et al., 2021). We inserted a short text in the discussion (lines 532-534).

      10) Discussion: There is a large body of work demonstrating STING-induced cell death in numerous cell types. Despite this it is not mentioned nor discussed but should be. It could represent how dopaminergic neurons are lost in the STING ki mice.

      Thank you for pointing out the gap in our discussion. We added additional text in lines 604-618.

      11) The resolution/quality of some of the imaging is not great but this may be due to PDF compression.

      Thank you, we have uploaded the figures at a higher resolution.

      Reviewer #2:


      1) The authors base their conclusions (line 215-216) on the neuroinflammatory status of their mice strongly on an assessment of the Iba1 and GFAP-positive area fraction. Increase of Iba1 and GFAP areas does not necessarily correlate with an increased cytokine production and release by the cells. Therefore, in addition to the measurement of cytokine mRNAs it would be necessary to measure cytokines also on protein level (see also #4 and #5).

      Thank you for this suggestion, we measured the protein levels of several immune mediators with LEGENDplex™ assay from the striatum, and the new data are included as Figure 7A and supplemental Figure 6.

      2) In the same context: Is the increase of Iba1 and GFAP-covered area due to increased proliferation of microglia and astrocytes or due to increased expression of these markers in activated glia? How is the number of Iba1/GFAP-positive cells affected?

      We quantified the number of glia cells in the striatum and in the substantia nigra of adult STING WT and STING ki mice and, in parallel with higher immunoreactivity for the corresponding markers, we detected an increased number of cells as well. The quantifications are now included in supplemental Figure 1.

      3) Nowadays we know that microglia and astrocytes can exist in a variety of activated states which can be either beneficial or detrimental. An analysis of disease-associated microglial markers (Keren-Shaul et al. 2017) would give a good picture of the state microglia are in.

      Thank you for the suggestion. In addition to the panel of immune modulators at the protein level (supplemental Figure 6), we performed qPCR analysis of an additional “M1” marker (Nos2) and additional “M2” markers (Il4, Fizz2, Ym1) (Gong et al., 2019). The data are included in Figure 7A and shown in supplemental Figure 6. The findings are described from line 431.

      4) It also would be of interest to determine which cell type is responsible for the observed neurodegeneration. Which cytokines are released by microglia or astrocytes upon STING activation? Even in vitro experiments would help here to get a more profound understanding.

      We agree with the suggestion; however, further in vitro experiments are beyond the scope of this study and will be the basis of a future project.

      5) In line 273 the authors describe that STING is known to activate NFkB and the inflammasome. As proof that this is also occurring in their mouse, they perform qPCR analysis of whole brain IL-1b, TNF-a and Casp1 expression. While this analysis indicates that there is indeed an increased mRNA production of proinflammatory cytokines in the brains of STING ki mice, it does not give any indication whether the inflammasome is active or not. The inflammasome is a protein complex largely regulated on protein level, meaning an assessment of the cleavage of Caspase 1 on protein level or the presence of cleaved IL-1b in comparison to uncleaved Pro-IL-1b by Western Blot as well as a staining for the number of inflammasomes would be required to draw these conclusions.

      Thank you for the suggestion. We performed additional experiments: 1) Western blot to detect pro-IL1b and IL1b and NLRP3 proteins from the striatum (Figure 6C-E), and 2) we quantified the number of ASC puncta within microglia and astroglia from striatal sections (Figure 6F-I).

      6) To conclude that NFkb/inflammasome pathway is the most active/crucial in astrocytes (line 354) a staining for ASC inflammasomes would be of importance, especially as astrocytes normally do not express NLRP3.

      Thank you for this comment. We stained brain sections for ASC specks and for microglia (Iba1) and astroglia (GFAP) markers (Figure 6F-I). Although the number of ASC specks in astroglia was lower than in microglia, we still found a substantial number of ASC specks in astroglia in the brains of STING ki animals.

      7) As already shown for ALS (Yu et al., 2020) and Parkin KO (Sliter et al. 2018), the authors want to further assess the relevance of the STING pathway to PD (line 27-28). Therefore, an in-depth analysis of key PD hallmarks beyond phosphorylated a-synuclein, loss of TH-stained neurons and dopamine reduction is needed. In the discussion the authors hypothesize that autophagy (line 467) may be linked to the observed phenotype. Therefore, autophagy/mitophagy as well as mitochondrial dysfunction and mtDNA should be analysed. In the same line of thought it would be important to know if and how the observed dopamine reduction affects mouse behaviour, thus mice should be subjected to the Rotarod or pole or beam walk test.

      Thank you for these suggestions. In the work by Yu et al. and Sliter et al., the STING pathway was shown to mediate neurodegeneration resulting from TDP-43 pathology and mitochondrial damage. Our work is complementary by investigating the effects of constitutive activation of STING. We have therefore focused on the signaling pathways downstream of STING. As mentioned above, the most important next step will be to separate the contributions of neuronal and glial cells by generating cell type specific STING activation. Of course, it will be interesting to see at a later time point whether STING activation feeds back. We also speculate that STING activation may cause TDP-43 pathology. Yet, this will be part of a future study. To acknowledge that the pathology is not specific to alpha-synuclein, we added a short statement from line 634.

      With respect to the comprehensive analysis of the PD phenotype, our work includes the classical parameters of TH neuron number, TH fiber density, dopamine concentration and synuclein pathology. With respect to mouse behavior, we note that the STING ki mice have severe inflammation in the lung, kidney and other (peripheral) organs, reduced body weight and reduced lifespan (Luksch et al., 2019; Motwani et al., 2019; Siedel et al., 2020). Motor deficits cannot be attributed to dopamine neuron degeneration and for this reason were not included (stated in the Discussion, lines 624-625). In order to expand the description of the PD phenotype we now included measurements of cytosolic reactive oxygen species, mitochondrial oxygen species and nitric oxide, which result from inflammation and are known to affect dopaminergic neurons (new Figure 8).

      Reviewer #3:


      1) The method for quantification of TH-positive cells is not sufficient. They just described how they stained every fifth section but did not mention how they counted. This is a critical point and they should carefully provide more information than just referring to their previous paper.

      Counting of dopaminergic neurons and quantification of fibers was described in a dedicated section of the methods. This section has now been expanded (from line 154).

      2) It is not persuasive that they did not investigate local inflammation in SN. They presented increased microglia and astrocytes in the striatum but did not analyze these cells in SN.

      Indeed, we also measured neuroinflammation in the substantia nigra. Although it was increased in STING ki mice, it was less pronounced than in the striatum. We now include the quantification of area fraction as well as cell number counting of microglia and astroglia in the substantia nigra of STING WT and STING ki animals (supplemental Figure 1), and also the expression of inflammatory mediators in Figure 4.

      3) In Figure 3, they analyzed alpha-synuclein phosphorylation and beta-sheet structure in the striatum. This is funny from the aspect of Parkinson's disease, which dominantly affects SN. They should perform similar experiments with SN samples. In a different aspect, the aggregates detected by Thio S may not be alpha-synuclein and could be tau, TDP43 or other substances. Phospho-synuclein of course does not mean aggregation, so they can consider electron microscopy.

      We agree with the reviewer. To complement our data, we therefore performed a solubility assay on both the striatum and the substantia nigra to quantify the ratio of alpha-synuclein in the Triton X-100 soluble and insoluble fractions (Figure 3C, D), as described previously (Szego et al., 2022; Szegő et al., 2019). Additionally, we quantified phosphorylated alpha-synuclein in the substantia nigra as well (Figure 3A,B).

      We also agree with the reviewer that the Thioflavin S-positive inclusions may also contain other beta-sheet-forming proteins, and noted this from line 634.

      4) In Figure 5, pSTAT3 increased in Iba1-negative cells, which appear to be neurons judging from the size of the nuclei. First, the authors should investigate the identity of pSTAT3-positive cells with GFAP and MAP2. If pSTAT3 is actually increased in neurons, what does it mean in the pathology? For instance, in viral infection, STAT3 activation triggers suicide of neurons to prevent further proliferation of viral particles in neurons. Is this a homologous function or a different one?

      Thank you for this suggestion. The brain sections were stained for Iba1 and GFAP. Nuclear pSTAT3 staining was indeed increased in non-glial cells, which, based on their morphology, we think are neurons. However, detailed characterization of the signal is beyond the scope of this (preliminary) study.

      5) In Figure 6 and overall, the cell types in which the three signaling pathways are activated were mixed up, making it hard to understand the actual situation in the brain.

      In our model, STING is activated in all cells. Consequently, we cannot determine the origin of immune mediators found elevated in the STING ki mice. This will require cell type specific STING activation. In order to react to the reviewer’s comment and be clearer, we have added more details about the brain region and age of mice used for each analysis also in the figures.

      6) In the method section, the original paper for generation of heterozygous STING N153S KI mice should be Warner et al, JEM 2017.

      We used a STING N153S ki mouse strain that was independently generated at the Technical University Dresden (Luksch et al., 2019).

      7) NF-κB stains seem located in cytoplasm in Figure 5B.

      We agree: especially in the young STING ki mice, cytoplasmic NF-kB staining is increased compared to STING WT mice. To quantify nuclear translocation, however, we counted the number of those cells where NF-kB signal was overlapping with the nuclear Hoechst staining.

      8) In Figures 4 and 6, why do the authors evaluate gene expression in the frontal cortex instead of the SN or striatum?

      As noted in several comments, we show here that the STING-induced pathology involves dopaminergic neurons, but believe that it is not specific for the dopaminergic system given that STING-ki is ubiquitously expressed. For practical reasons, we have used cortical samples for the expression analysis. For consistency, we now performed additional qPCR measurement from the striatum and from the substantia nigra and included them as new Figure 4 and supplemental Figure 6N-Q. The previous data from the cortex was moved to the supplemental Figures 3 and 5. Additionally, we measured the levels of several inflammatory modulators from the striatum of STING ki and KO animals (Figure 7A and supplemental figure 6A-M).

      9) In some groups (Sting-ki;ifnar1-/- in Fig 6C, 6E), the values separated into two groups, which makes readers doubt the soundness of their genotyping.

      Our genotyping protocol is highly standardized, and the genotypes of the animals were correctly assigned. Here we provide an example of gel images showing the products after PCR reactions for the STING N153S allele (Figure 1a), STING WT allele (Figure 1b), Ifnar1 WT allele (Figure 1c) and lack of Ifnar1 allele (Figure 1d) of the same animals. We note that a bimodal distribution of phenotypes is often observed in Ifnar-/- mice.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors evaluate the involvement of the hippocampus in a fast-paced time-to-contact estimation task. They find that the hippocampus is sensitive to feedback received about accuracy on each trial and has activity that tracks behavioral improvement from trial to trial. Its activity is also related to a tendency for time estimation behavior to regress to the mean. This is a novel paradigm to explore hippocampal activity and the results are thus novel and important, but the framing as well as discussion about the meaning of the findings obscures the details of the results or stretches beyond them in many places, as detailed below.

      We thank the reviewer for their constructive feedback and were happy to read that s/he considered our approach and results as novel and important. The comments led us to conduct new fMRI analyses, to clarify various unclear phrasings regarding our methods, and to carefully assess our framing of the interpretation and scope of our results. Please find our responses to the individual points below.

      1) Some of the results appear in the posterior hippocampus and others in the anterior hippocampus. The authors do not motivate predictions for anterior vs. posterior hippocampus, and they do not discuss differences found between these areas in the Discussion. The hippocampus is treated as a unitary structure carrying out learning and updating in this task, but the distinct areas involved motivate a more nuanced picture that acknowledges that the same populations of cells may not be carrying out the various discussed functions.

      We thank the reviewer for pointing this out. We split the hippocampus into anterior and posterior sections because prior work suggested a different whole-brain connectivity and function of the two. This was mentioned in the methods section (page 15) in the initial submission but unfortunately not in the main text. Moreover, when discussing the results, we did indeed refer mostly to the hippocampus as a unitary structure for simplicity and readability, and because statements about subcomponents are true for the whole. However, we agree with the reviewer that the differences between anterior and posterior sections are very interesting, and that describing these effects in more detail might help to guide future work more precisely.

      In response to the reviewer's comment, we therefore clarified at various locations throughout the manuscript whether the respective results were observed in the posterior or anterior section of the hippocampus, and we extended our discussion to reflect the idea that different functions may be carried out by distinct populations of hippocampal cells. In addition, we also now motivate the split into the different sections better in the main text. We made the following changes.

      Page 3: “Second, we demonstrate that anterior hippocampal fMRI activity and functional connectivity tracks the behavioral feedback participants received in each trial, revealing a link between hippocampal processing and timing-task performance.”

      Page 3: “Fourth, we show that these updating signals in the posterior hippocampus were independent of the specific interval that was tested and activity in the anterior hippocampus reflected the magnitude of the behavioral regression effect in each trial.”

      Page 5: “We performed both whole-brain voxel-wise analyses as well as regions-of-interest (ROI) analysis for anterior and posterior hippocampus separately, for which prior work suggested functional differences with respect to their contributions to memory-guided behavior (Poppenk et al., 2013, Strange et al. 2014).”

      Page 9: “Because anterior and posterior sections of the hippocampus differ in whole-brain connectivity as well as in their contributions to memory-guided behavior (Strange et al. 2014), we analyzed the two sections separately.”

      Page 9: “We found that anterior hippocampal activity as well as functional connectivity reflected the feedback participants received during this task, and its activity followed the performance improvements in a temporal-context-dependent manner. Its activity reflected trial-wise behavioral biases towards the mean of the sampled intervals, and activity in the posterior hippocampus signaled sensorimotor updating independent of the specific intervals tested.”

      Page 10: “Intriguingly, the mechanisms at play may build on similar temporal coding principles as those discussed for motor timing (Yin & Troger, 2011; Eichenbaum, 2014; Howard, 2017; Palombo & Verfaellie, 2017; Nobre & van Ede, 2018; Paton & Buonomano, 2018; Bellmund et al., 2020, 2021; Shikano et al., 2021; Shimbo et al., 2021), with differential contributions of the anterior and posterior hippocampus. Note that our observation of distinct activity modulations in the anterior and posterior hippocampus suggests that the functions and coding principles discussed here may be mediated by at least partially distinct populations of hippocampal cells.”

      Page 11: Interestingly, we observed that functional connectivity of the anterior hippocampus scaled negatively (Fig. 2C) with feedback valence [...]

      2) Hippocampal activity is stronger for smaller errors, which makes the interpretation more complex than the authors acknowledge. If the hippocampus is updating sensorimotor representations, why would its activity be lower when more updating is needed?

      Indeed, we found that absolute (univariate) activity of the hippocampus scaled with feedback valence, the inverse of error (Fig. 2A). We see multiple possibilities for why this might be the case, and we discussed some of them in a dedicated discussion section (“The role of feedback in timed motor actions”). For example, prior work showed that hippocampal activity reflects behavioral feedback also in other tasks, which has been linked to learning (e.g. Schönberg et al., 2007; Cohen & Ranganath, 2007; Shohamy & Wagner, 2008; Foerde & Shohamy, 2011; Wimmer et al., 2012). In our understanding, sensorimotor updating is a form of ‘learning’ in an immediate and behaviorally adaptive manner, and we therefore consider our results well consistent with this earlier work. We agree with the reviewer that in principle activity should be stronger if there was stronger sensorimotor updating, but we acknowledge that this intuition builds on an assumption about the relationship between hippocampal neural activity and the BOLD signal, which is not entirely clear. For example, prior work revealed spatially informative negative BOLD responses in the hippocampus as a function of visual stimulation (e.g. Szinte & Knapen 2020), and the effects of inhibitory activity - a leading motif in the hippocampal circuitry - on fMRI data are not fully understood. This raises the possibility that the feedback modulation we observed might also involve negative BOLD responses, which would then translate to the observed negative correlation between feedback valence and the hippocampal fMRI signal, even if the magnitude of the underlying updating mechanism was positively correlated with error. This complicates the interpretation of the direction of the effect, which is why we chose to avoid making strong conclusions about it in our manuscript. Instead, we tried discussing our results in a way that was agnostic to the direction of the feedback modulation. 
      Importantly, hippocampal connectivity with other regions did scale positively with error (Fig. 2B), which we again discussed in the dedicated discussion section.

      In response to the reviewer’s comment, we revisited this section of our manuscript and felt the latter result deserved a better discussion. We therefore took this opportunity to extend our discussion of the connectivity results (including their relationship to the univariate-activity results as well as the direction of these effects), all while still avoiding strong conclusions about directionality. The following changes were made to the manuscript.

      Page 11: Interestingly, we observed that functional connectivity of the anterior hippocampus scaled negatively (Fig. 2C) with feedback valence, unlike its absolute activity, which scaled positively with feedback valence (Fig. 2A,B), suggesting that the two measures may be sensitive to related but distinct processes.

      Page 11: Such network-wide receptive-field re-scaling likely builds on a re-weighting of functional connections between neurons and regions, which may explain why anterior hippocampal connectivity correlated negatively with feedback valence in our data. Larger errors may have led to stronger re-scaling, which may be grounded in a corresponding change in functional connectivity.

      3) Some tests were one-tailed without justification, which reduces confidence in the robustness of the results.

      We thank the reviewer for pointing us to the fact that our choice of statistical tests was not always clear in the manuscript. In the analysis the reviewer is referring to, we predicted that stronger sensorimotor updating should lead to stronger activity as well as larger behavioral improvements across the respective trials. This is because a stronger update should translate to a more accurate “internal model” of the task and therefore to a better performance. We tested this one-sided hypothesis using the appropriate test statistic (contrasting trials in which behavioral performance did improve versus trials in which it did not improve), but we did not motivate our reasoning well enough in the manuscript. The revised manuscript therefore includes the two new statements shown below to motivate our choice of test statistic more clearly.

      Page 7: [...] we contrasted trials in which participants had improved versus the ones in which they had not improved or got worse (see methods for details). Because stronger sensorimotor updating should lead to larger performance improvements, we predicted to find stronger activity for improvements vs. no improvements in these tests (one-tailed hypothesis).

      Page 18: These two regressors reflect the tests for target-TTC-independent and target-TTC-specific updating, respectively. Because we predicted to find stronger activity for improvements vs. no improvements in behavioral performance, we here performed one-tailed statistical tests, consistent with the direction of this hypothesis. Improvement in performance was defined as receiving feedback of higher valence than in the corresponding previous trial.
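The improvement definition quoted above (higher feedback valence than in the corresponding previous trial) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the integer valence coding (0 = low, 1 = medium, 2 = high), the toy target-TTC values, the function name `label_improvements`, and the reading of "corresponding previous trial" as either the immediately preceding trial (TTC-independent) or the most recent trial with the same target TTC (TTC-specific) are all hypothetical.

```python
from typing import Dict, List, Optional


def label_improvements(
    valence: List[int],
    target_ttc: List[float],
    same_target_only: bool,
) -> List[Optional[bool]]:
    """Label each trial as improved (True), not improved (False), or undefined (None).

    A trial counts as improved when its feedback valence is strictly higher than
    that of its reference trial. The reference is the immediately preceding trial,
    or, if same_target_only is set, the most recent trial with the same target TTC.
    """
    labels: List[Optional[bool]] = []
    last_by_target: Dict[float, int] = {}
    for i, (v, ttc) in enumerate(zip(valence, target_ttc)):
        if same_target_only:
            prev = last_by_target.get(ttc)
        else:
            prev = valence[i - 1] if i > 0 else None
        labels.append(None if prev is None else v > prev)
        last_by_target[ttc] = v
    return labels


# Hypothetical toy session: valence coded 0/1/2, two alternating target TTCs.
valence = [0, 1, 1, 2, 0, 2]
target_ttc = [0.5, 1.0, 0.5, 1.0, 0.5, 1.0]

ttc_independent = label_improvements(valence, target_ttc, same_target_only=False)
ttc_specific = label_improvements(valence, target_ttc, same_target_only=True)
```

Trials without a reference trial are left undefined rather than counted as "no improvement", which is a design choice of this sketch and not something the source specifies.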

      4) The introduction motivates the novelty of this study based on the idea that the hippocampus has traditionally been thought to be involved in memory at the scale of days and weeks. However, as is partially acknowledged later in the Discussion, there is an enormous literature on hippocampal involvement in memory at a much shorter timescale (on the order of seconds). The novelty of this study is not in the timescale as much as in the sensorimotor nature of the task.

      We thank the reviewer for this helpful suggestion. We agree that a key part of the novelty of this study is the use of the task that is typically used to study sensorimotor integration and timing rather than hippocampal processing, along with the new insights this task enabled about the role of the hippocampus in sensorimotor updating. As mentioned in the discussion, we also agree with the reviewer that there is prior literature linking hippocampal activity to mnemonic processing on short time scales. We therefore rephrased the corresponding section in the introduction to put more weight on the sensorimotor nature of our task instead of the time scales.

      Note that the new statement still includes the time scale of the effects, but it is no longer at the center of the argument. We chose to keep it in because we do think that the majority of studies on hippocampal-dependent memory functions focus on longer time scales than our study does, and we expect that many readers will be surprised by the immediacy with which hippocampal activity relates to ongoing behavioral performance (on ultrashort time scales).

      We changed the introduction to the following.

      Page 2: Here, we approach this question with a new perspective by converging two parallel lines of research centered on sensorimotor timing and hippocampal-dependent cognitive mapping. Specifically, we test how the human hippocampus, an area often implicated in episodic-memory formation (Schiller et al., 2015; Eichenbaum, 2017), may support the flexible updating of sensorimotor representations in real time and in concert with other regions. Importantly, the hippocampus is not traditionally thought to support sensorimotor functions, and its contributions to memory formation are typically discussed for longer time scales (hours, days, weeks). Here, however, we characterize in detail the relationship between hippocampal activity and real-time behavioral performance in a fast-paced timing task, which is traditionally believed to be hippocampal-independent. We propose that the capacity of the hippocampus to encode statistical regularities of our environment (Doeller et al. 2005, Shapiro et al. 2017, Behrens et al., 2018; Momennejad, 2020; Whittington et al., 2020) situates it at the core of a brain-wide network balancing specificity vs. regularization in real time as the relevant behavior is performed.

      5) The authors used three different regressors for the three feedback levels, as opposed to a parametric regressor indexing the level of feedback. The predictions are parametric, so a parametric regressor would be a better match, and would allow for the use of all the medium-accuracy data.

      The reviewer raises a good point that overlaps with question 3 by reviewer 2. In the current analysis, we model the three feedback levels with three independent regressors (high, medium, low accuracy). We then contrast high vs. low accuracy feedback, obtaining the results shown in Fig. 2AB. The beta estimates obtained for medium-accuracy feedback are being ignored in this contrast. Following the reviewer’s feedback, we therefore re-ran the model, this time modeling all three feedback levels in one parametric regressor. All other regressors in the model stayed the same. Instead of contrasting high vs. low accuracy feedback, we then performed voxel-wise t-tests on the beta estimates obtained for the parametric feedback regressor.
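The difference between the two modeling choices can be made concrete with a toy trial sequence. The sketch below only builds the trial-wise design columns and leaves out everything an actual fMRI model would add (onsets, HRF convolution, nuisance regressors). The level coding (low = 0, medium = 1, high = 2, then mean-centered) and the function names are assumptions for illustration, not taken from the source.

```python
from typing import Dict, List

LEVELS = ("low", "medium", "high")


def categorical_regressors(feedback: List[str]) -> Dict[str, List[int]]:
    """One indicator column per feedback level (the three-regressor model)."""
    return {lvl: [1 if f == lvl else 0 for f in feedback] for lvl in LEVELS}


def parametric_regressor(feedback: List[str]) -> List[float]:
    """A single mean-centered column coding the feedback level of every trial."""
    codes = [LEVELS.index(f) for f in feedback]
    mean = sum(codes) / len(codes)
    return [c - mean for c in codes]


# Hypothetical trial sequence.
feedback = ["low", "high", "medium", "high", "low", "medium"]

cat = categorical_regressors(feedback)   # three columns, one per level
param = parametric_regressor(feedback)   # one column covering all trials
```

In the three-column model, a high vs. low contrast assigns zero weight to the medium column, whereas the single parametric column gives every trial, including medium-accuracy ones, a value in the same regressor.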

      The results we observed were highly consistent across the two analyses, and all conclusions presented in the initial manuscript remain unchanged. While the exact t-scores differ slightly, we replicated the effects for all clusters on the voxel-wise map (on whole-brain FWE-corrected levels) as well as for the regions-of-interest analysis for anterior and posterior hippocampus. These results are presented in a new Supplementary Figure 3C.

      Note that the new Supplementary Figure 3B shows another related new analysis we conducted in response to question 4 of reviewer 2. Here, we re-ran the initial analysis with three feedback regressors, but without modeling the inter-trial interval (ITI) and the inter-session interval (ISI, i.e. the breaks participants took) to avoid model over-specification. Again, we replicated the results for all clusters and the ROI analysis, showing that the initial results we presented are robust.

      The following additions were made to the manuscript.

      Page 5: Note that these results were robust even when fewer nuisance regressors were included to control for model over-specification (Fig. S3B; two-tailed one-sample t tests: anterior HPC, t(33) = -3.65, p = 8.9 × 10⁻⁴, pfwe = 0.002, d=-0.63, CI: [-1.01, -0.26]; posterior HPC, t(33) = -1.43, p = 0.161, pfwe = 0.322, d=-0.25, CI: [-0.59, 0.10]), and when all three feedback levels were modeled with one parametric regressor (Fig. S3C; two-tailed one-sample t tests: anterior HPC, t(33) = -3.59, p = 0.002, pfwe = 0.005, d=-0.56, CI: [-0.93, -0.20]; posterior HPC, t(33) = -0.99, p = 0.329, pfwe = 0.659, d=-0.17, CI: [-0.51, 0.17]). Further, there was no systematic relationship between subsequent trials on a behavioral level [...]

      Page 17: Moreover, instead of modeling the three feedback levels with three independent regressors, we repeated the analysis modeling the three feedback levels as one parametric regressor with three levels. All other regressors remained unchanged, and the model included the regressors for ITIs and ISIs. We then conducted t-tests implemented in SPM12 using the beta estimates obtained for the parametric feedback regressor (Fig. 2C). Compared to the initial analyses presented above, this has the advantage that medium-accuracy feedback trials are considered for the statistics as well.
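For readers who want to relate the quoted test statistics to each other, the following sketch shows how a one-sample t statistic and Cohen's d are connected (d = mean / SD = t / √n). The toy beta values and function names are hypothetical. The Bonferroni helper reflects an assumption: the quoted pfwe values are roughly twice the uncorrected p values, which would be consistent with a correction over the two hippocampal ROIs, but the passage itself does not state which correction was used.

```python
import math
import statistics
from typing import List, Tuple


def one_sample_t_and_d(values: List[float]) -> Tuple[float, float, int]:
    """Return the one-sample t statistic (against zero), Cohen's d, and df."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    t = mean / (sd / math.sqrt(n))
    d = mean / sd
    return t, d, n - 1


def cohens_d_from_t(t: float, n: int) -> float:
    """For a one-sample test, d = t / sqrt(n)."""
    return t / math.sqrt(n)


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-corrected p value, capped at 1 (assumed correction scheme)."""
    return min(1.0, p * n_tests)


# Hypothetical per-participant beta estimates for one ROI.
toy_betas = [-0.5, -0.2, 0.1, -0.4, -0.3, -0.1]
t, d, df = one_sample_t_and_d(toy_betas)

# t(33) implies n = 34 in a one-sample test; this gives about -0.63.
d_quoted = cohens_d_from_t(-3.65, 34)
p_corrected = bonferroni(0.161, 2)
```

The p value itself is not computed here because the Student t distribution is not part of the Python standard library; a statistics package would be needed for that step.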

      6) The authors claim that the results support the idea that the hippocampus is finding an "optimal trade-off between specificity and regularization". This seems overly speculative given the results presented.

      We understand the reviewer's skepticism about this statement and agree that the manuscript does not show that the hippocampus is finding the trade-off between specificity and regularization. However, this is also not exactly what the manuscript claims. Instead, it suggests that the hippocampus “may contribute” to solving this trade-off (page 3) as part of a “brain-wide network” (pages 2,3,9,12). We also state that “Our [...] results suggest that this trade-off [...] is governed by many regions, updating different types of task information in parallel” (Page 11). To us, these phrasings are not equivalent, because we do not think that the role of the hippocampus in sensorimotor updating (or in any process really) can be understood independently from the rest of the brain. We do however think that our results are in line with the idea that the hippocampus contributes to solving this trade-off, and that this is exciting and surprising given the sensorimotor nature of our task, the ultrashort time scale of the underlying process, and the relationship to behavioral performance. We tried expressing that some of the points discussed remain speculation, but it seems that we were not always successful in doing so in the initial submission. We apologize for the misunderstanding, have adapted the corresponding statements in the manuscript, and now express even more carefully that these ideas are speculation.

      The following changes were made to the introduction and discussion.

      Page 2: Here, we approach this question with a new perspective by converging two parallel lines of research centered on sensorimotor timing and hippocampal-dependent cognitive mapping. Specifically, we test how the human hippocampus, an area often implicated in episodic-memory formation (Schiller et al., 2015; Eichenbaum, 2017), may support the flexible updating of sensorimotor representations in real time and in concert with other regions.

      Page 12: Because hippocampal activity (Julian & Doeller, 2020) and the regression effect (Jazayeri & Shadlen, 2010) were previously linked to the encoding of (temporal) context, we reasoned that hippocampal activity should also be related to the regression effect directly. This may explain why hippocampal activity reflected the magnitude of the regression effect as well as behavioral improvements independently from TTC, and why it reflected feedback, which informed the updating of the internal prior.

      Page 12: This is in line with our behavioral results, showing that TTC-task performance became more optimal in the face of both of these two objectives. Over time, behavioral responses clustered more closely between the diagonal and the average line in the behavioral response profile (Fig. 1B, S1G), and the TTC error decreased over time. While different participants approached these optimal performance levels from different directions, either starting with good performance or strong regularization, the group approached overall optimal performance levels over the course of the experiment.

      Page 13: This is in line with the notion that the hippocampus [...] supports finding an optimal trade off between specificity and regularization along with other regions. [...] Our results show that the hippocampus supports rapid and feedback-dependent updating of sensorimotor representations, suggesting that it is a central component of a brain-wide network balancing task specificity vs. regularization for flexible behavior in humans.

      Note that in response to comment 1 by reviewer 2, the revised manuscript now reports the results of additional behavioral analyses that support the notion that participants find an optimal trade-off between specificity and regularization over time (independent of whether the hippocampus was involved or not).

      7) The authors find that hippocampal activity is related to behavioral improvement from the prior trial. This seems to be a simple learning effect (participants can learn plenty about this task from a prior trial that does not have the exact same timing as the current trial) but is interpreted as sensitivity to temporal context. The temporal context framing seems too far removed from the analyses performed.

      We agree with the reviewer that our observation that hippocampal activity reflects TTC-independent behavioral improvements across trials could have multiple explanations. Critically, i) one of them is that the hippocampus encodes temporal context, ii) it is only one of multiple observations that we build our interpretation on, and iii) our interpretation builds on multiple earlier reports.

      Interval estimates regress toward the mean of the sampled intervals, an effect that is often referred to as the “regression effect”. This effect, which we observed in our data too (Fig. 1B), has been proposed to reflect the encoding of temporal context (e.g. Jazayeri & Shadlen 2010). Moreover, there is a large body of literature on how the hippocampus may support the encoding of spatial and temporal context (e.g. see Bellmund, Polti & Doeller 2020 for review).
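The regression effect described above can be illustrated with a least-squares fit of estimated against target intervals: a slope below 1 means that estimates are pulled toward the mean of the sampled intervals. This is a sketch on simulated values. The target intervals, the pull strength of 0.7, and the summary index 1 − slope are all hypothetical choices for illustration and are not the measure used in the study.

```python
from typing import List, Tuple


def ols_slope_intercept(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical target intervals (seconds) and simulated estimates that are
# pulled toward the mean of the targets by a fixed factor.
targets = [0.4, 0.8, 1.2, 1.6]
mean_target = sum(targets) / len(targets)
estimates = [mean_target + 0.7 * (t - mean_target) for t in targets]

slope, intercept = ols_slope_intercept(targets, estimates)
regression_index = 1.0 - slope  # 0 = veridical, 1 = always report the mean
```

With noise-free simulated data the fitted slope recovers the pull factor exactly; with real behavioral data the slope would be estimated per participant or per time window.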

      Because both hippocampal activity and the regression effect were linked to the encoding of (temporal) context, we reasoned that hippocampal activity should also be related to the regression effect directly. If so, one would expect that hippocampal activity should reflect behavioral improvements independently from TTC, it should reflect the magnitude of the regression effect, and it should generally reflect feedback, because it is the feedback that informs the updating of the internal prior.

      All three observations may indeed have independent explanations, but they are all also in line with the idea that the hippocampus does encode temporal context and that this explains the relationship between hippocampal activity and the regression effect. It therefore reflects a parsimonious and reasonable explanation in our opinion, even though it necessarily remains an interpretation. Of course, we want to be clear on what our results are and what our interpretations are.

      In response to the reviewer’s comment, we therefore toned down two of the statements that mention temporal context in the manuscript, and we removed an overly speculative statement from the result section. In addition, the discussion now describes more clearly how our results are in line with this interpretation.

      Abstract: This is in line with the idea that the hippocampus supports the rapid encoding of temporal context even on short time scales in a behavior-dependent manner.

      Page 13: This is in line with the notion that the hippocampus encodes temporal context in a behavior-dependent manner, and that it supports finding an optimal trade off between specificity and regularization along with other regions.

      Page 12: Because hippocampal activity (Julian & Doeller, 2020) and the regression effect (Jazayeri & Shadlen, 2010) were previously linked to the encoding of (temporal) context, we reasoned that hippocampal activity should also be related to the regression effect directly. This may explain why hippocampal activity reflected the magnitude of the regression effect as well as behavioral improvements independently from TTC, and why it reflected feedback, which informed the updating of the internal prior.

      The following statement was removed, overlapping with comment 2 by Reviewer 3:

      Instead, these results are consistent with the notion that hippocampal activity signals the updating of task-relevant sensorimotor representations in real-time.

      8) I am not sure the term "extraction of statistical regularities" is appropriate. The term is typically used for more complex forms of statistical relationships.

      We agree with the reviewer that this expression may be interpreted differently by different readers and are grateful to be pointed to this fact. We therefore removed it and instead added the following (hopefully less ambiguous) statement to the manuscript.

      Page 9: This study investigated how the human brain flexibly updates sensorimotor representations in a feedback-dependent manner in the service of timing behavior.

      Reviewer #2 (Public Review):

      The authors conducted a study involving functional magnetic resonance imaging and a time-to-contact estimation paradigm to investigate the contribution of the human hippocampus (HPC) to sensorimotor timing, with a particular focus on the involvement of this structure in specific vs. generalized learning. Suggestive of the former, it was found that HPC activity reflected time interval-specific improvements in performance while in support of the latter, HPC activity was also found to signal improvements in performance, which were not specific to the individual time intervals tested. Based on these findings, the authors suggest that the human HPC plays a key role in the statistical learning of temporal information as required in sensorimotor behaviour.

      By considering two established functions of the HPC (i.e., temporal memory and generalization) in the context of a domain that is not typically associated with this structure (i.e., sensorimotor timing), this study is potentially important, offering novel insight into the involvement of the HPC in everyday behaviour. There is much to like about this submission: the manuscript is clearly written and well-crafted, the paradigm and analyses are well thought out and creative, the methodology is generally sound, and the reported findings push us to consider HPC function from a fresh perspective. A relative weakness of the paper is that it is not entirely clear to what extent the data, at least as currently reported, reflects the involvement of the HPC in specific and generalized learning. Since the authors' conclusions centre around this observation, clarifying this issue is, in my opinion, of primary importance.

      We thank the reviewer for these positive and extremely helpful comments, which we will address in detail below. In response to these comments, the revised manuscript clarifies why the observed performance improvements are not at odds with the idea that an optimal trade-off between specificity and regularization is found, and how the time course of learning relates to those reported in previous literature. In addition, we conducted two new fMRI analyses, ensuring that our conclusions remain unchanged even if feedback is modeled with one parametric regressor, and if the number of nuisance regressors is reduced to control for overparameterization of the model. Please find our responses underneath each individual point below.

      1) Throughout the manuscript, the authors discuss the trade-off between specific and generalized learning, and point towards Figure S1D as evidence for this (i.e., participants with higher TTC accuracy exhibited a weaker regression effect). What appears to be slightly at odds with this, however, is the observation that the deviation from true TTC decreased with time (Fig S1F) as the regression line slope approached 0.5 (Fig S1E) - one would have perhaps expected the opposite i.e., for deviation from true TTC to increase as generalization increases. To gain further insight into this, it would be helpful to see the deviation from true TTC plotted for each of the four TTC intervals separately and as a signed percentage of the target TTC interval (i.e., (+) or (-) deviation) rather than the absolute value.

      We thank the reviewer for raising this important question and for the opportunity to elaborate on the relationship between the TTC error and the magnitude of the regression effect in behavior. Indeed, we see that the regression slopes approach 0.5 and that the TTC error decreases over the course of the experiment. We do not think that these two observations are at odds with each other for the following reasons:

      First, while the reviewer is correct in pointing out that the deviation from the TTC should increase as “generalization increases”, that is not what we found. It was not the magnitude of the regularization per se that increased over time; rather, the overall task performance became more optimal in the face of both objectives: specificity and generalization. This optimum is at a regression-line slope of 0.5. Generalization (or regularization, as we refer to it in the present manuscript) therefore did not increase per se on the group level.

      Second, the regression slopes approached 0.5 on the group-level, but the individual participants approached this level from different directions: Some of them started with a slope value close to 1 (high accuracy), whereas others started with a slope value close to 0 (near full regression to the mean). Irrespective of which slope value they started with, over time, they got closer to 0.5 (Rebuttal Figure 1A). This can also be seen in the fact that the group-level standard deviation in regression slopes becomes smaller over the course of the experiment (Rebuttal Figure 1B, SFig 1G). It is therefore not generally the case that the regression effect becomes stronger over time, but that it becomes more optimal for longer-term behavioral performance, which is then also reflected in an overall decrease in TTC error. Please see our response to the reviewer’s second comment for more discussion on this.
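
      To make the slope interpretation concrete, the following Python sketch (an illustration, not the analysis code used in the study) fits an ordinary least-squares line to estimated versus true TTC. The target TTC values, the trial counts and the helper name `regression_slope` are hypothetical. The sketch only shows that veridical responses yield a slope of 1, full regression to the mean yields a slope of 0, and responses halfway between the two yield 0.5.

```python
import numpy as np

def regression_slope(true_ttc, estimated_ttc):
    """Slope of the least-squares line relating estimated to true TTC."""
    slope, _intercept = np.polyfit(true_ttc, estimated_ttc, deg=1)
    return slope

# Hypothetical target TTCs in seconds (four levels), 10 trials each.
targets = np.array([0.5, 0.75, 1.0, 1.25])
true_ttc = np.tile(targets, 10)
grand_mean = true_ttc.mean()

veridical = true_ttc.copy()                            # perfectly accurate responses
full_regression = np.full_like(true_ttc, grand_mean)   # always respond with the mean
halfway = 0.5 * true_ttc + 0.5 * grand_mean            # midpoint between the two

slope_veridical = regression_slope(true_ttc, veridical)
slope_full = regression_slope(true_ttc, full_regression)
slope_halfway = regression_slope(true_ttc, halfway)
```

      In this toy setting, a participant starting near either extreme moves towards `slope_halfway` as responses shift towards the midpoint, which is the sense in which slopes can converge on 0.5 from both directions.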

      Third, the development of task performance is a function of two behavioral factors: a) the accuracy and b) the precision in TTC estimation. Accuracy describes how similar the participant’s TTC estimates were to the true TTC, whereas precision describes how similar the participant’s TTC estimates were relative to each other (across trials). Our results are a reflection of the fact that participants became both more accurate over time on average, but also more precise. To demonstrate this point visually, we now plotted the Precision and the Accuracy for the 8 task segments below (Rebuttal Figure 1C, SFig 1H), showing that both measures increased as the time progressed and more trials were performed. This was the case for all target durations.

      In response to the reviewer’s comment, we clarified in the main text that these findings are not at odds with each other. Furthermore, we made clear that regularization per se did not increase over time on group level. We added additional supporting figures to the supplementary material to make this point. Note that in our view, these new analyses and changes more directly address the overall question the reviewer raised than the figure that was suggested, which is why we prioritized those in the manuscript.

      However, we appreciated the suggestion a lot and added the corresponding figure for the sake of completeness.

      The following additions were made.

      Page 5: In support of this, participants' regression slopes converged over time towards the optimal value of 0.5, i.e. the slope value between veridical performance and the grand mean (Fig. S1F; linear mixed-effects model with task segment as a predictor and participants as the error term, F(1) = 8.172, p = 0.005, ε² = 0.08, CI: [0.01, 0.18]), and participants' slope values became more similar (Fig. S1G; linear regression with task segment as predictor, F(1) = 6.283, p = 0.046, ε² = 0.43, CI: [0, 1]). Consequently, this also led to an improvement in task performance over time on group level (i.e. task accuracy and precision increased (Fig. S1I), and the relationship between accuracy and precision became stronger (Fig. S1H); linear mixed-effect model results for accuracy: F(1) = 15.127, p = 1.3 × 10⁻⁴, ε² = 0.06, CI: [0.02, 0.11]; precision: F(1) = 20.189, p = 6.1 × 10⁻⁵, ε² = 0.32, CI: [0.13, 1]; accuracy-precision relationship: F(1) = 8.288, p = 0.036, ε² = 0.56, CI: [0, 1]; see methods for model details).

      Page 12: This suggests that different regions encode distinct task regularities in parallel to form optimal sensorimotor representations to balance specificity and regularization. This is in line with our behavioral results, showing that TTC-task performance became more optimal in the face of both of these two objectives. Over time, behavioral responses clustered more closely between the diagonal and the average line in the behavioral response profile (Fig. 1B, S1G), and the TTC error decreased over time. While different participants approached these optimal performance levels from different directions, either starting with good performance or strong regularization, the group approached overall optimal performance levels over the course of the experiment.

      Page 15: We also corroborated this effect by measuring the dispersion of slope values between participants across task segments using a linear regression model with task segment as a predictor and the standard deviation of slope values across participants as the dependent variable (Fig. S1G). As a measure of behavioral performance, we computed two variables for each target-TTC level: sensorimotor timing accuracy, defined as the absolute difference in estimated and true TTC, and sensorimotor timing precision, defined as coefficient of variation (standard deviation of estimated TTCs divided by the average estimated TTC). To study the interaction between these two variables for each target TTC over time, we first normalized accuracy by the average estimated TTC in order to make both variables comparable. We then used a linear mixed-effects model with precision as the dependent variable, task segment and normalized accuracy as predictors and target TTC as the error term. In addition, we tested whether accuracy and precision increased over the course of the experiment using separate linear mixed-effects models with task segment as predictor and participants as the error term.
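
      The two behavioral measures defined above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, averaging the absolute error across trials is assumed, and the sample standard deviation (`ddof=1`) is an arbitrary choice. Under these definitions both quantities are error-like, so smaller values correspond to better performance.

```python
import numpy as np

def timing_accuracy(estimated_ttc, true_ttc):
    """Mean absolute difference between estimated and true TTC."""
    estimated_ttc = np.asarray(estimated_ttc, dtype=float)
    return np.mean(np.abs(estimated_ttc - true_ttc))

def timing_precision(estimated_ttc):
    """Coefficient of variation: SD of the estimates divided by their mean."""
    estimated_ttc = np.asarray(estimated_ttc, dtype=float)
    return np.std(estimated_ttc, ddof=1) / np.mean(estimated_ttc)

def normalized_accuracy(estimated_ttc, true_ttc):
    """Accuracy divided by the average estimate, to be comparable with precision."""
    return timing_accuracy(estimated_ttc, true_ttc) / np.mean(estimated_ttc)

# Hypothetical estimates (in seconds) for one target TTC of 1.0 s.
estimates = np.array([0.9, 1.1, 1.0, 1.0])
acc = timing_accuracy(estimates, 1.0)
prec = timing_precision(estimates)
norm_acc = normalized_accuracy(estimates, 1.0)
```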

      2) Generalization relies on prior experience and can be relatively slow to develop as is the case with statistical learning. In Jazayeri and Shadlen (2010), for instance, learning a prior distribution of 11-time intervals demarcated by two briefly flashed cues (compared to 4 intervals associated with 24 possible movement trajectories in the current study) required ~500 trials. I find it somewhat surprising, therefore, that the regression line slope was already relatively close to 0.5 in the very first segment of the task. To what extent did the participants have exposure to the task and the target intervals prior to entering the scanner?

      We thank the reviewer for raising the important question about the time course of learning in our task and how our results relate to prior work on this issue. Addressing the specific reviewer question first, participants practiced the task for 2-3 minutes prior to scanning. During the practice, they were not specifically instructed to perform the task as well as they could nor to encode the intervals, but rather to familiarize themselves with the general experimental setup and to ask potential questions outside the MRI machine. While they might have indeed started encoding the prior distribution of intervals during the practice already, we have no way of knowing, and we expect the contribution of this practice on the time course of learning during scanning to be negligible (for the reasons outlined above).

      However, in addition to the specific question the reviewer asked, we feel that the comment raises two more general points: 1) How long does it take to learn the prior distribution of a set of intervals as a function of the number of intervals tested, and 2) Why are the learning slopes we report quite shallow already in the beginning of the scan?

      Regarding (1), we are not aware of published reports that answer this question directly, and we expect that this will depend on the task that is used. Regarding the comparison to Jazayeri & Shadlen (2010), we believe the learning time course is difficult to compare between our study and theirs. As the reviewer mentioned, our study featured only 4 intervals compared to 11 in their work, based on which we would expect much faster learning in our task than in theirs. We did indeed sample 24 movement directions, but these were irrelevant in terms of learning the interval distribution. Moreover, unlike Jazayeri & Shadlen (2010), our task featured moving stimuli, which may have added additional sensory, motor and proprioceptive information in our study which the participants of the prior study could not rely on.

      Regarding (2), and overlapping with the reviewer’s previous comment, the average learning slope in our study is indeed close to 0.5 already in the first task segment, but we would like to highlight that this is a group-level measure. The learning slopes of some subjects were closer to 1 (i.e. the diagonal in Fig 1B), and those of others were closer to 0 (i.e. the mean) in the beginning of the experiment. The median slope was close to 0.65. Importantly, the slopes of most participants still approached 0.5 in the course of the experiment, and so did the group-level slope the reviewer is referring to. This also means that participants’ slopes became more similar in the course of the experiment, and they approached 0.5, which we think reflects the optimal trade-off between regressing towards the mean and regressing towards the diagonal (in the data shown in Fig. 1B). This convergence onto the optimal trade-off value can be seen in many measures, including the mean slope (Rebuttal Figure 1A, SFig 1F), the standard deviation in slopes (Rebuttal Figure 1B, SFig 1G) as well as the Precision vs. Accuracy tradeoff (Rebuttal Figure 1C, SFig 1H). We therefore think that our results are well in line with prior literature, even though a direct comparison remains difficult due to differences in the task.

      In response to the reviewer’s comment, and related to their first comment, we made the following addition to the discussion section.

      Page 12: This suggests that different regions encode distinct task regularities in parallel to form optimal sensorimotor representations to balance specificity and regularization. This is well in line with our behavioral results, showing that TTC-task performance became more optimal in the face of both of these two objectives. Over time, behavioral responses clustered more closely between the diagonal and the average line in the behavioral response profile (Fig. 1B, S1G), and the TTC error decreased over time. While different participants approached these optimal performance levels from different directions, either starting with good performance or strong regularization, the group approached overall optimal performance levels over the course of the experiment.

      3) I am curious to know whether differences between high-accuracy and medium-accuracy feedback as well as between medium-accuracy and low-accuracy feedback predicted hippocampal activity in the first GLM analysis (middle page 5). Currently, the authors only present the findings for the contrast between high-accuracy and low-accuracy feedback. Examining all feedback levels may provide additional insight into the nature of hippocampal involvement and is perhaps more consistent with the subsequent GLM analysis (bottom page 6) in which, according to my understanding, all improvements across subsequent trials were considered (i.e., from low-accuracy to medium-accuracy; medium-accuracy to high-accuracy; as well as low-accuracy to high-accuracy).

      We thank the reviewer for this thoughtful question, which relates to question 5 by reviewer 1. The reviewer is correct that the contrast shown in Fig 2 does not consider the medium-accuracy feedback levels, and that the model in itself is slightly different from the one used in the subsequent analysis presented in Fig. 3. To reply to this comment as well as to a related one by reviewer 1 together, we therefore repeated the full analysis while modeling the three feedback levels in one parametric regressor, which includes the medium-accuracy feedback trials, and is consistent with the analysis shown in Fig. 3. The results of this new analysis are presented in the new Supplementary Fig. 3C.

      In short, the model included one parametric regressor with three levels reflecting the three types of feedback, and all nuisance regressors remained unchanged. Instead of contrasting high vs. low accuracy feedback, we then performed voxel-wise t-tests on the beta estimates obtained for the parametric feedback regressor. We found that our results presented initially were very robust: Both the observed clusters in the voxel-wise analysis (on whole-brain FWE-corrected levels) as well as the ROI results replicated across the two analyses, and our conclusions therefore remain unchanged.
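
      The difference between the two feedback models can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the SPM12 design used in the study: the numeric codes assigned to the three levels, the mean-centering, the trial sequence and the function names are all hypothetical, and no hemodynamic convolution is applied.

```python
import numpy as np

LEVELS = ("high", "medium", "low")

def indicator_regressors(feedback):
    """One 0/1 column per feedback level (three separate regressors)."""
    feedback = np.asarray(feedback)
    return np.column_stack([(feedback == lvl).astype(float) for lvl in LEVELS])

def parametric_regressor(feedback, codes=None):
    """Single mean-centered column coding the three levels on one scale."""
    # Assumed coding; the direction and spacing of the codes are illustrative only.
    codes = codes or {"high": 1.0, "medium": 2.0, "low": 3.0}
    values = np.array([codes[f] for f in feedback])
    return values - values.mean()

# Hypothetical trial sequence of feedback levels.
feedback = ["high", "low", "medium", "medium", "high", "low"]
X_separate = indicator_regressors(feedback)    # shape (6, 3)
x_parametric = parametric_regressor(feedback)  # shape (6,)
```

      With separate regressors, a high vs. low contrast ignores the medium column, whereas the single parametric column lets medium-accuracy trials contribute to the estimate.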

      We made multiple textual additions to the manuscript to include this new analysis, and we present the results of the analysis including a direct comparison to our initial results in the new Supplementary Fig. 3. The following textual additions were made.

      Page 5: Note that these results were robust even when fewer nuisance regressors were included to control for model over-specification (Fig. S3B; two-tailed one-sample t tests: anterior HPC, t(33) = -3.65, p = 8.9 × 10⁻⁴, pfwe = 0.002, d = -0.63, CI: [-1.01, -0.26]; posterior HPC, t(33) = -1.43, p = 0.161, pfwe = 0.322, d = -0.25, CI: [-0.59, 0.10]), and when all three feedback levels were modeled with one parametric regressor (Fig. S3C; two-tailed one-sample t tests: anterior HPC, t(33) = -3.59, p = 0.002, pfwe = 0.005, d = -0.56, CI: [-0.93, -0.20]; posterior HPC, t(33) = -0.99, p = 0.329, pfwe = 0.659, d = -0.17, CI: [-0.51, 0.17]). Further, there was no systematic relationship between subsequent trials on a behavioral level [...]

      Page 17: Moreover, instead of modeling the three feedback levels with three independent regressors, we repeated the analysis modeling the three feedback levels as one parametric regressor with three levels. All other regressors remained unchanged, and the model included the regressors for ITIs and ISIs. We then conducted t-tests implemented in SPM12 using the beta estimates obtained for the parametric feedback regressor (Fig. S2C). Compared to the initial analyses presented above, this has the advantage that medium-accuracy feedback trials are considered for the statistics as well.

      4) The authors modeled the inter-trial intervals and periods of rest in their univariate GLMs. This approach of modelling all 'down time' can lead to model over-specification and inaccurate parameter estimation (e.g. Pernet, 2014). A comment on this approach as well as consideration of not modelling the inter-trial intervals would be useful.

      This is an important issue that we did not address in our initial manuscript. We are aware of and agree with the reviewer’s general concern about model over-specification, which can be a big problem in regression as it leads to biased estimates. We did examine whether our model was overspecified before running it, but we did not report a formal test of it in the manuscript. We are grateful to be given the opportunity to do so now.

      In response to the reviewer’s comment, we repeated the full analysis shown in Fig. 2 while excluding the nuisance regressors for inter-trial intervals (ITI) and breaks (or inter-session intervals, ISI). All other regressors and analysis steps stayed unchanged relative to the one reported in Fig. 2. The new results are presented in a new Supplementary Figure 3B.

      Like for our previous analysis, we again see that the results we initially presented were extremely robust even on whole-brain FWE corrected levels, as well as on ROI level. Our conclusions therefore remain unchanged, and the results we presented initially are not affected by potential model overspecification. In addition to the new Supplementary Figure 3B, we made multiple textual changes to the manuscript to describe this new analysis and its implications. Note that we used the same nuisance regressors in all other GLM analyses too, meaning that it is also very unlikely that model overspecification affects any of the other results presented. We thank the reviewer for suggesting this analysis, and we feel including it in the manuscript has further strengthened the points we initially made.

      The following additions were made to the manuscript.

      Page 16: The GLM included three boxcar regressors modeling the feedback levels, one for ITIs, one for button presses and one for periods of rest (inter-session interval, ISI) [...]

      Page 16: ITIs and ISIs were modeled to reduce task-unrelated noise, but to ensure that this did not lead to over-specification of the above-described GLM, we repeated the full analysis without modeling the two. All other regressors including the main feedback regressors of interest remained unchanged, and we repeated both the voxel-wise and ROI-wise statistical tests as described above (Fig. S2B).

      Page 17: Note that these results were robust even when fewer nuisance regressors were included to control for model over-specification (Fig. S3B; two-tailed one-sample t tests: anterior HPC, t(33) = -3.65, p = 8.9 × 10⁻⁴, pfwe = 0.002, d = -0.63, CI: [-1.01, -0.26]; posterior HPC, t(33) = -1.43, p = 0.161, pfwe = 0.322, d = -0.25, CI: [-0.59, 0.10]), and when all three feedback levels were modeled with one parametric regressor (Fig. S3C; two-tailed one-sample t tests: anterior HPC, t(33) = -3.59, p = 0.002, pfwe = 0.005, d = -0.56, CI: [-0.93, -0.20]; posterior HPC, t(33) = -0.99, p = 0.329, pfwe = 0.659, d = -0.17, CI: [-0.51, 0.17]). Further, there was no systematic relationship between subsequent trials on a behavioral level [...]

      Reviewer #3 (Public Review):

      This paper reports the results of an interesting fMRI study examining the neural correlates of time estimation with an elegant design and a sensorimotor timing task. Results show that hippocampal activity and connectivity are modulated by performance on the task as well as the valence of the feedback provided. This study addresses a very important question in the field which relates to the function of the hippocampus in sensorimotor timing. However, a lack of clarity in the description of the MRI results (and associated methods) currently prevents the evaluation of the results and the interpretations made by the authors. Specifically, the model testing for timing-specific/timing-independent effects is questionable and needs to be clarified. In the current form, several conclusions appear to not be fully supported by the data.

      We thank the reviewer for pointing us to many methodological points that needed clarification. We apologize for the confusion about our methods, which we clarify in the revised manuscript. Please find our responses to the individual points below.

      Major points

      Some methodological points lack clarity which makes it difficult to evaluate the results and the interpretation of the data.

      We really appreciate the many constructive comments below. We feel that clarifying these points improved our manuscript immensely.

      1) It is unclear how the 3 levels of accuracy and feedback (high, medium, and low performance) were computed. Please provide the performance range used for this classification. Was this adjusted to the participants' performance?

      The formula that describes how the response window was computed for the different speed levels was reported in the methods section of the original manuscript on page 13. It reads as follows:

      “The following formula was used to scale the response window width: d ± ((k ∗ d)/2) where d is the target TTC and k is a constant proportional to 0.3 and 0.6 for high and medium accuracy, respectively.”

      In response to the reviewer’s comment, we now additionally report the exact ranges of the different response windows in a new Supplementary Table 1 and refer to it in the Methods section as follows.

      Page 10: To calibrate performance feedback across different TTC durations, the precise response window widths of each feedback level scaled with the speed of the fixation target (Table S1).
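
      As a worked illustration of the formula d ± ((k ∗ d)/2), the Python sketch below computes the response windows for one target TTC. It is a minimal sketch under stated assumptions: k is taken to be exactly 0.3 and 0.6, responses outside the medium window are treated as low accuracy, the target TTC of 1.0 s is a hypothetical value, and the function names are invented for this example. The exact ranges used in the study are those reported in Table S1.

```python
def response_window(d, k):
    """Return the (lower, upper) bounds of the window d ± (k * d) / 2."""
    half_width = (k * d) / 2
    return d - half_width, d + half_width

# Constants quoted in the methods text; using them directly as k is an assumption.
K_HIGH, K_MEDIUM = 0.3, 0.6

def feedback_level(estimated_ttc, d):
    """Classify a response as high, medium or low accuracy.

    Treating everything outside the medium window as low accuracy
    is an assumption of this sketch.
    """
    lower, upper = response_window(d, K_HIGH)
    if lower <= estimated_ttc <= upper:
        return "high"
    lower, upper = response_window(d, K_MEDIUM)
    if lower <= estimated_ttc <= upper:
        return "medium"
    return "low"

# Hypothetical target TTC of 1.0 s.
high_window = response_window(1.0, K_HIGH)      # roughly (0.85, 1.15)
medium_window = response_window(1.0, K_MEDIUM)  # roughly (0.70, 1.30)
```

      Because the half-width is proportional to d, the windows widen for longer target TTCs, which is how the feedback is calibrated across durations.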

      2) The description of the MRI results lacks details. It is not always clear in the results section which models were used and whether parametric modulators were included or not in the model. This makes the results section difficult to follow. For example,

      a) Figure 2: According to the description in the text, it appears that panels A and B report the results of a model with 3 regressors, ie one for each accuracy/feedback level (high, medium, low) without parametric modulators included. However, the figure legend for panel B mentions a parametric modulator suggesting that feedback was modelled for each trial as a parametric modulator. The distinction between these 2 models must be clarified in the result section.

      We thank the reviewer very much for spotting this discrepancy. Indeed, Figure 2 shows the results obtained for a GLM in which we modeled the three feedback levels with separate regressors, not with one parametric regressor. Instead, the latter was the case for Figure 3. We apologize for the confusion and corrected the description in the figure caption, which now reads as follows. The description in the main text and the methods remain unchanged.

      Caption Fig. 2: We plot the beta estimates obtained for the contrast between high vs. low feedback.

      Moreover, note that in response to comment 5 by reviewer 1 and comment 3 by reviewer 2, the revised manuscript now additionally reports the results obtained for the parametric regressor in the new Supplementary Figure 3C. All conclusions remain unchanged.

      Additionally, it is unclear how Figure 2A supports the following statement: "Moreover, the voxel-wise analysis revealed similar feedback-related activity in the thalamus and the striatum (Fig. 2A), and in the hippocampus when the feedback of the current trial was modeled (Fig. S3)." This is confusing as Figure 2A reports an opposite pattern of results between the striatum/thalamus and the hippocampus. It appears that the statement highlighted above is supported by results from a model including current trial feedback as a parametric modulator (reported in Figure S3).

      We agree with the reviewer that our result description was confusing and changed it. It now reads as follows.

      Page 5: Moreover, the voxel-wise analysis revealed feedback-related activity also in the thalamus and the striatum (Fig. 2A) [...]

      Also, note that it is unclear from Figure 2A what is the direction of the contrast highlighting the hippocampal cluster (high vs. low according to the text but the figure shows negative values in the hippocampus and positive values in the thalamus). These discrepancies need to be addressed and the models used to support the statements made in the results sections need to be explicitly described.

      The description of the contrast is correct. Negative values indicate smaller errors and therefore better feedback, which is mentioned in the caption of Fig. 2 as follows:

      “Negative values indicate that smaller errors, and higher-accuracy feedback, led to stronger activity.”

      Note that the timing error determined the feedback, and that we predicted stronger updating and therefore stronger activity for larger errors (similar to a prediction error). We found the opposite. We mention the reasoning behind this analysis at various locations in the manuscript e.g. when talking about the connectivity analysis:

      “We reasoned that larger timing errors and therefore low-accuracy feedback would result in stronger updating compared to smaller timing errors and high-accuracy feedback”

      In response to the reviewer’s remark, we clarified this further by adding the following statement to the result section.

      Page 5: “Using a mass-univariate general linear model (GLM), we modeled the three feedback levels with one regressor each plus additional nuisance regressors (see methods for details). The three feedback levels (high, medium and low accuracy) corresponded to small, medium and large timing errors, respectively. We then contrasted the beta weights estimated for high-accuracy vs. low-accuracy feedback and examined the effects on group-level averaged across runs.”

      b) Connectivity analyses: It is also unclear here which model was used in the PPI analyses presented in Figure 2. As it appears that the seed region was extracted from a high vs. low contrast (without modulators), the PPI should be built using the same model. I assume this was the case as the authors mentioned "These co-fluctuations were stronger when participants performed poorly in the previous trial and therefore when they received low-accuracy feedback." if this refers to low vs. high contrast. Please clarify.

      Yes, the PPI model was built using the same model. We clarified this in the methods section by adding the following statement to the PPI description.

      Page 17: “The PPI model was built using the same model that revealed the main effects used to define the HPC sphere.”

      Yes, the reviewer is correct in thinking that the contrast shows the difference between low vs. high-accuracy feedback. We clarified this in the main text as well as in the caption of Fig. 2.

      Caption Fig 2: [...] We plot results of a psychophysiological interactions (PPI) analysis conducted using the hippocampal peak effects in (A) as a seed for low vs. high-accuracy feedback. [...]

      Page 17: The estimated beta weight corresponding to the interaction term was then tested against zero on the group-level using a t-test implemented in SPM12 (Fig. 2C). The contrast reflects the difference between low vs. high-accuracy feedback. This revealed brain areas whose activity was co-varying with the hippocampus seed ROI as a function of past-trial performance (n-1).

      c) It is unclear why the model testing TTC-specific / TTC-independent effects (results presented in Figure 3) used 2 parametric modulators (as opposed to building two separate models with a different modulator each). I wonder how the authors dealt with the orthogonalization between parametric modulators with such a model. In SPM, the orthogonalization of parametric modulators is based on the order of the modulators in the design matrix. In this case, parametric modulator #2 would be orthogonalized to the preceding modulator so that a contrast focusing on the parametric modulator #2 would highlight any modulation that is above and beyond that explained by modulator #1. In this case, modulation of brain activity that is TTC-specific would have to be above and beyond a modulation that is TTC-independent to be highlighted. I am unsure that this is what the authors wanted to test here (or whether this is how the MRI design was built). Importantly, this might bias the interpretation of their results as - by design - it is less likely to observe TTC-specific modulations in the hippocampus as there is significant TTC-independent modulation. In other words, switching the order of the modulators in the model (or building two separate models) might yield different results. This is an important point to address as this might challenge the TTC-specific/TTC-independent results described in the manuscript.

      We thank the reviewer for raising this important issue. When running the respective analysis, we made sure that the regressors were not collinear and we therefore did not expect substantial overlap in shared variance between them. However, we agree with the reviewer that orthogonalizing one regressor with respect to the other could still affect the results. To make sure that our expectations were indeed met, we therefore repeated the main analysis twice: 1) switching the order of the modulators and 2) turning orthogonalization off (which is possible in SPM12 unlike in previous versions). In all cases, our key results and conclusions remained unchanged, including the central results of the hippocampus analyses.
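The order dependence discussed above can be illustrated with a minimal sketch. It assumes two hypothetical, partly correlated modulators and a plain projection-based orthogonalization; it is not SPM12's implementation, and the names `mod_a` and `mod_b` are invented for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def center(x):
    """Mean-center a sequence."""
    m = sum(x) / len(x)
    return [a - m for a in x]


def orthogonalize(target, reference):
    """Remove from `target` the component explained by `reference`."""
    scale = dot(target, reference) / dot(reference, reference)
    return [t - scale * r for t, r in zip(target, reference)]


# Two hypothetical, partly correlated parametric modulators (mean-centered).
mod_a = center([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
mod_b = center([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

# Order A then B: B keeps only the variance that A cannot explain.
b_given_a = orthogonalize(mod_b, mod_a)
# Order B then A: A keeps only the variance that B cannot explain.
a_given_b = orthogonalize(mod_a, mod_b)

# With orthogonalization off, both modulators enter the model unchanged.
print(abs(dot(b_given_a, mod_a)) < 1e-9)  # residual is orthogonal to mod_a
print(b_given_a != mod_b)                 # the later modulator was altered
```

Whichever modulator comes second loses its shared variance to the first, which is why the two orderings and the no-orthogonalization variant can in principle give different results.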

      Anterior (ant.) / Posterior (post.) Hippocampus ROI analysis with A) original order of modulators, B) switching the order of the modulators and C) turning orthogonalization of modulators off. ABC) Orange color corresponds to the TTC-independent condition whereas light-blue color corresponds to the TTC-specific condition. Statistics reflect p<0.05 at Bonferroni-corrected levels (*) obtained using a group-level one-tailed one-sample t-test against zero; A) pfwe = 0.017, B) pfwe = 0.039, C) pfwe = 0.039.

      Because orthogonalization did not affect the conclusions, the new manuscript simply reports the analysis for which it was turned off. Note that these new figures are extremely similar to the original figures we presented, which can be seen in the exemplary figure below showing our key results at a liberal threshold for transparency. In addition, we clarified that orthogonalization was turned off in the methods section as follows.

      Page 18: These two regressors reflect the tests for target-TTC-independent and target-TTC-specific updating, respectively, and they were not orthogonalized to each other.

      Comparison of old & new results: also see Fig. 3 and Fig. S5 in manuscript

      d) It is also unclear how the behavioral improvement was coded/classified: "we contrasted trials in which participants had improved versus the ones in which they had not improved or got worse". It appears that improvement computation was based on the change of feedback valence (between high, medium and low). It is unclear why performance wasn't used instead? This would provide a finer-grained modulation?

      We thank the reviewer for the opportunity to clarify this important point. First, we chose to model feedback because it is the feedback that determines whether participants update their “internal model” or not. Without feedback, they would not know how well they performed, and we would not expect to find activity related to sensorimotor updating. Second, behavioral performance and received feedback are tightly correlated, because the former determines the latter. We therefore do not expect to see major differences in results obtained between the two. Third, we did in fact model both feedback and performance in two independent GLMs, even though the way the results were reported in the initial submission made it difficult to compare the two.

      Figure 4 shows the results obtained when modeling behavioral performance in the current trial as an F-contrast, and Supplementary Fig 4 shows the results when modeling the feedback received in the current trial as a t-contrast. While the voxel-wise t-maps/F-maps are also quite similar, we now additionally report the t-contrast for the behavioral-performance GLM in a new Supplementary Figure 4C. The t-maps obtained for these two different analyses are extremely similar, confirming that the direction of the effects as well as their interpretation remain independent of whether feedback or performance is modeled.

      The revised manuscript refers to the new Supplementary Figure 4C as follows.

      Page 17: In two independent GLMs, we analyzed the time courses of all voxels in the brain as a function of behavioral performance (i.e. TTC error) in each trial, and as a function of feedback received at the end of each trial. The models included one mean-centered parametric regressor per run, modeling either the TTC error or the three feedback levels in each trial, respectively. Note that the feedback itself was a function of TTC error in each trial [...] We estimated weights for all regressors and conducted a t-test against zero using SPM12 for our feedback and performance regressors of interest on the group level (Fig. S4A). [...]

      Page 17: In addition to the voxel-wise whole-brain analyses described above, we conducted independent ROI analyses for the anterior and posterior sections of the hippocampus (Fig. S2A). Here, we tested the beta estimates obtained in our first-level analysis for the feedback and performance regressors of interest (Fig. S4B; two-tailed one-sample t tests: anterior HPC, t(33) = -5.92, p = 1.2 × 10⁻⁶, pfwe = 2.4 × 10⁻⁶, d = -1.02, CI: [-1.45, -0.6]; posterior HPC, t(33) = -4.07, p = 2.7 × 10⁻⁴, pfwe = 5.4 × 10⁻⁴, d = -0.7, CI: [-1.09, -0.32]). See section "Regions of interest definition and analysis" for more details.

      If the feedback valence was used to classify trials as improved or not, how was this modelled (one regressor for improved, one for no improvement? As opposed to a parametric modulator with performance improvement?).

      We apologize for the lack of clarity regarding our regressor design. In response to this comment, we adapted the corresponding paragraph in the methods to express more clearly that improvement trials and no-improvement trials were modeled with two separate parametric regressors - in line with the reviewer’s understanding. The new paragraph reads as follows.

      Page 18: One regressor modeled the main effect of the trial and two parametric regressors modeled the following contrasts: Parametric regressor 1: trials in which behavioral performance improved vs. parametric regressor 2: trials in which behavioral performance did not improve or got worse relative to the previous trial.

      Last, it is also unclear how ITI was modelled as a regressor. Did the authors mean a parametric modulator here? Some clarification on the events modelled would also be helpful. What was the onset of a trial in the MRI design? The start of the trial? Then end? The onset of the prediction time?

      The inter-trial intervals (ITIs) were modeled as a boxcar regressor convolved with the hemodynamic response function. They describe the time between the feedback-phase offset and the subsequent trial onset. Moreover, the start of the trial was the moment when the visual-tracking target started moving after the ITI, whereas the trial end was the offset of the feedback phase (i.e. the moment in which the feedback disappeared from the screen). The onset of the “prediction time” was the moment in which the visual-tracking target stopped moving, prompting participants to estimate the time-to-contact. We now explain this more clearly in the methods as shown below.

      Page 16: The GLM included three boxcar regressors modeling the feedback levels, one for ITIs, one for button presses and one for periods of rest (inter-session interval, ISI), which were all convolved with the canonical hemodynamic response function of SPM12. The start of the trial was considered as the trial onsets for modeling (i.e. the time when the visual-tracking target started moving). The trial end was the offset of the feedback phase (i.e. the moment in which the feedback disappeared from the screen). The ITI was the time between the offset of the feedback-phase and the subsequent trial onset.
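As a rough illustration of what "a boxcar regressor convolved with the hemodynamic response function" means, the sketch below builds an ITI boxcar and convolves it with a double-gamma response. The HRF parameters (gamma shapes 6 and 16, undershoot ratio 1/6), the 0.5 s sampling step, and the onsets and durations are assumptions chosen for illustration, not values taken from the manuscript or from SPM12's code.

```python
import math


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)


def double_gamma_hrf(dt, duration=32.0):
    """Double-gamma response sampled every `dt` seconds (assumed parameters)."""
    n = int(duration / dt)
    hrf = [gamma_pdf(i * dt, 6.0) - gamma_pdf(i * dt, 16.0) / 6.0 for i in range(n)]
    total = sum(hrf)
    return [h / total for h in hrf]


def boxcar(onsets, durations, n_samples, dt):
    """1.0 while an event is on, 0.0 otherwise."""
    x = [0.0] * n_samples
    for onset, dur in zip(onsets, durations):
        start = int(round(onset / dt))
        stop = int(round((onset + dur) / dt))
        for i in range(start, min(stop, n_samples)):
            x[i] = 1.0
    return x


def convolve(signal, kernel):
    """Causal convolution, truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


dt = 0.5  # seconds per sample (hypothetical)
# Hypothetical ITIs: from feedback offset to the next trial onset.
iti = boxcar(onsets=[10.0, 30.0], durations=[4.0, 6.0], n_samples=120, dt=dt)
regressor = convolve(iti, double_gamma_hrf(dt))
```

The resulting `regressor` is zero before the first ITI and rises with a delay of a few seconds after each ITI begins, which is the shape entered into the design matrix.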

      On a related note, in response to question 4 by reviewer 2, we now repeated one of the main analyses (Fig. 2) without modeling the ITI (as well as the Inter-session interval, ISI). We found that our key results and conclusions are independent of whether or not these time points were modeled. These new results are presented in the new Supplementary Figure 3B.

      Page 16: ITIs and ISIs were modeled to reduce task-unrelated noise, but to ensure that this did not lead to over-specification of the above-described GLM, we repeated the full analysis without modeling the two. [...]

      1. Perhaps as a result of a lack of clarity in the result section and the MRI methods, it appears that some conclusions presented in the result section are not supported by the data. E.g. "Instead, these results are consistent with the notion that hippocampal activity signals the updating of task-relevant sensorimotor representations in real-time." The data show that hippocampal activity is higher during and after an accurate trial. This pattern of results could be attributed to various processes such as e.g. reward or learning etc. I would recommend not providing such interpretations in the result section and addressing these points in the discussion.

      Similar to above, statements like "These results suggest that the hippocampus updates information that is independent of the target TTC". The data show that higher hippocampal activity is linked to greater improvement across trials independent of the timing of the trial. The point about updating is rather speculative and should be presented in the discussion instead of the result section.

      The reviewer is referring to two statements in the results section that reflect our interpretation rather than a description of the results. In response to the reviewer’s comment, we therefore removed the following statement from the results.

      Instead, these results are consistent with the notion that hippocampal activity signals the updating of task-relevant sensorimotor representations in real-time.

      In addition, we replaced the remaining statement by the following. We feel this new statement makes clear why we conducted the analysis that is described without offering an interpretation of the results that were presented before.

      Page 8: We reasoned that updating TTC-independent information may support generalization performance by means of regularizing the encoded intervals based on the temporal context in which they were encoded.

    1. Author Response

      Reviewer #2 (Public Review):

      This work will be of potential interest to biologists studying aging. While transposable elements have been reported to have higher expression as organisms age, it was previously unclear if their expression can exacerbate aging phenotypes or if they are a byproduct of aging. The authors present evidence in this manuscript that artificially increasing transposable element expression during the whole Drosophila life cycle can worsen aging phenotypes.

      Strengths

      The authors provide direct evidence that expression of their gypsy construct across the whole life of animals decreases fly lifespan (Figure 4), and that this outcome is dependent on reverse transcriptase (Figure 6).

      Monitoring TE mobilization can be difficult in general and is often expensive when using a sequencing approach. The authors accurately monitor gypsy mobilization from their ectopic copy through qPCR and sequencing.

      Weaknesses

      Experiment design, data interpretation, and story structure:

      The current model proposes that TE increases activity in aged animals and potentially contributes to the aging process. However, this paper artificially drives gypsy activation throughout the whole fly life cycle. Under this design, TE may already bring deleterious effects from early developmental stages or young age, thus ultimately shortening their life cycle. To truly test the function of TE during the aging process, the authors need to temporally control gypsy expression and only express their construct in aged animals.

      Figure 1: I am not sure I got any convincing messages from this figure. First, flies at 30 days of age should not be considered as old. Second, the authors try to claim that TE expression increased with aged FOXO mutants. However, there is no data to show the comparison between aged wild-type and FOXO mutants (panel e is young wt vs young FOXO null). Meanwhile, Figure 1 has nothing to do with Gypsy. How could this figure fit into the story?

      It is clear that we did not do a good job explaining this section. First, we did not mean to imply that the 30-day flies are old. They are simply older than the 5-day flies. The 30-day timepoint was chosen to match previous experiments and data sets in the literature. It was also chosen to minimize any survivor bias that could occur by doing the assay in very old flies. We have clarified this in the text and figures.

      Second, it is the number of transposons that show an increase in expression in the dFOXO null animals that we mean to highlight (18 for dFOXO vs only 2 for wDAH). Panel e is meant to illustrate that the transposon landscapes, even in young flies, differ by genotype, making a direct transposon-to-transposon comparison impossible. We have added text to clarify these points.

      Third, we also do not mean to imply that anything here is specific for gypsy. The work going forward in the paper uses gypsy as a tool because it is one of the better understood retrotransposons, there existed a validated active clone of the transposon and it has already been implicated in aging in the fly. We took gypsy as a model retrotransposon. We have added text to clarify here.

      Figure 3: While the data presented in this Figure is sound, it is unclear how this data fits into the overall narrative that transposon activity drives aging.

      Figure 3 is a continuation of the characterization of our ectopic gypsy. We wanted to rule out that there is a “hotspot” of insertion that would account for any phenotypes we observe. We find no hotspot in the males we use for analysis suggesting it is the act of transposition, not a specific target gene that is important. We have added to the text to clarify the motivation for these experiments.

      Figure 5: It is interesting to see the copies of gypsy are not increased after 5 days. Does gypsy still mobilize after this young age? If yes, the authors should observe increased gypsy gDNA in later time points, unless the cells having gypsy new insertions keep dying. The authors should specifically check tissues with low cell turnover (such as brain) or high cell turnover (such as gut).

      Reviewer 2 makes a great observation. In fact, using primer pairs that specifically detect the ectopic gypsy, we consistently see insertion numbers go down in very old animals (figure 5a&b). With our current understanding of retrotransposition, we should not be able to see loss of insertions unless the host cells are being lost from the analysis. We favor the idea that the reviewer suggests; that the cells that have high levels of insertion are dying and disappearing from the analysis. We think this is also reflected in the bias for intergenic or intronic sequences in our insertion mapping of figure 3. In an attempt to address this question we did measure insertions in heads versus bodies. In male flies aged 14 days there was no difference in the average number of insertions (although the variability was greater in heads). This data is reported in Supplemental Figure 6a.

      Figure 8: Using Ubiquitin GAL4 to drive both gypsy and FOXO expression could dilute the expression of each individual gene. Thus, it is possible the rescue effect seen by expressing FOXO in addition to gypsy may just be due to lower gypsy expression. Including qPCR data showing gypsy expression levels in Ubi>gypsy, UAS FOXO flies compared to Ubi>gypsy flies would be helpful.

      We included this data in Figure 2b and 8c. Unfortunately, we did not clearly direct the reader to compare the values. Comparing Figure 2b with Figure 8c shows that the RNA level of the ectopic gypsy is comparable in both cases, and perhaps even slightly higher in the UAS-FOXO case. We have added a sentence to make this clear.

      It is unclear if FOXO can rescue TE-specific aging phenotypes. While it appears that FOXO overexpression rescues the decrease in lifespan caused by gypsy expression, the authors did not test if FOXO overexpression could rescue the effects of gypsy in the paraquat resistance assays or rhythmicity experiments.

      We include in this revision data showing dFOXO overexpression rescues the paraquat resistance and lowers the levels of overall insertions in the animals.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors of the paper provide new evidence of how prefrontal cortex of mutant mice used as a disease model of schizophrenia differs from wild type littermates. By analyzing local network dynamics at the level of specific cell type, authors shed new light on the circuit mechanisms that underlie changes in network dynamics in these mice.

      The claims in the submitted manuscript are supported by the data. I have a few comments and questions that need to be clarified.

      We thank the reviewer for highlighting the novelty of our work and its relevance (…shed new light on the circuit mechanisms that underlie changes in network dynamics in these mice…) for the field and the validity of our data (…claims in the submitted manuscript are supported by the data).

      1) Average firing rates

      Authors claim that they saw a significant reduction in interneuron firing rates in Disc1 mutant mice compared to control mice Fig.1c. However, the difference could be general and not interneuron specific. Due to the high firing rates of interneurons, the statistical test will work better on interneurons than on pyramidal cells as pyramidal cells average firing rates are lower. What I suggest to do is to take interneuron cells that fire at a lower rate (lower 33% for example ) and compare the control and Disc1 groups. Also I would suggest to take pyramidal cells that have higher firing rates (upper 33% for example) and compare firing rates across the same groups. One would like to see if these differences are not due to changes in firing rates per se.

      We thank the reviewer for pointing out this important aspect. In our original analysis, we did not take into account that additional differences in the PYR population might be present but ‘masked’ by the overall lower firing rate of that neuronal population. As suggested by the reviewer, we separately considered the firing rate of the ‘top 33%’ of the PYR population, which did not significantly differ between genotypes (p=0.958, n=209 control and 245 Disc1 PYRs, Welch’s test). As suggested, we moreover considered the ‘bottom 33%’ of INT firing rates, for which the significantly lower rates of Disc1-mutant INTs remained (control: 4.2 ± 0.6 Hz vs. Disc1: 1.8 ± 0.2 Hz, n=26 and 34 neurons, p=0.013, Mann-Whitney U-test). Since only a few INTs were recorded per session in some cases (ranges: Disc1: 2-12/session; control: 2-19/session), we performed this analysis on the basis of individual cells (see also our reassessment of the main statistical comparisons in response to #1 by reviewer 2 and #4 by reviewer 3). These data are now reported in the new Fig. 1 – figure supplement 3 and referred to in line 76 ff. (line 72 ff. without tracked changes) of the revised manuscript.
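The tercile comparison described above can be sketched as follows. This assumes the split is made within each genotype and shows only the Mann-Whitney U statistic, not its p-value; the function names and firing rates are hypothetical and the authors' actual pipeline is not shown in the response.

```python
def tercile_split(rates):
    """Return (bottom_third, top_third) of a list of firing rates."""
    ordered = sorted(rates)
    k = len(ordered) // 3
    return ordered[:k], ordered[len(ordered) - k:]


def mann_whitney_u(x, y):
    """U statistic for x vs. y: count of (x, y) pairs with x > y; ties count 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical interneuron firing rates in Hz, one value per cell.
control_int = [3.1, 5.0, 12.4, 20.3, 8.8, 4.4]
disc1_int = [1.2, 2.0, 9.5, 15.1, 6.3, 1.9]

control_low, _ = tercile_split(control_int)
disc1_low, _ = tercile_split(disc1_int)
u_low = mann_whitney_u(control_low, disc1_low)
```

A U value at its maximum (the product of the two group sizes) means every low-rate control cell fired faster than every low-rate Disc1 cell in this toy data.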

      2) Optogenetic tagging

      Authors indicate that light triggered and spontaneous spike waveform are similar Fig.1d. This is nice, but would be better to see all the tagged neurons. I would suggest showing all optically tagged neurons spike features. Authors can impose with a different color spike features of tagged neurons in Fig.1a. I suspect that since all PVI are narrow spiking and they must fall into the area of blue colored cells in Fig.1a.

      Following the reviewer’s suggestions, we included the average waveforms with and without light for all opto-tagged PVIs in the revised Fig. 1f. Moreover, we included the kinetic features of opto-tagged PVIs in Fig. 1a (red dots), and separately for control and Disc1-mutant mice in the new Figure 1-figure supplement 2. As predicted by the reviewer, the PVIs indeed cluster with the other putative INTs. We would moreover like to point to our new analysis in response to #2 of reviewer 2 addressing the spike kinetics of optotagged PVIs versus untagged putative INTs, which are similar in their trough-to-peak duration and asymmetry index. These data are shown in the novel Fig. 1 – figure supplement 2.

      3) It was not clear why authors assessed only firing rates in last 25ms (line 348-349). If they have a clear justification for this they should provide it. But why not use the latency of the first spike also as an additional metric. A well tagged cell will respond to light pulse with short latency (within 5 ms). My concern is that non PVI cells may increase firing rate after 25ms of stimulation of PVI cells due to disinhibition.

      Despite the latency to the first spike being frequently used as a method to detect ChR2-positive neurons, the laser stimulation produced significant photoartefacts in our hands. We were therefore concerned that spikes that happen shortly after the onset of the light pulse might be missed, and hence the latency to the first spike might be misinterpreted. Selecting a later time point in the stimulation interval allowed us to assess the firing rate during light application without the interference by artefacts. Nevertheless, we fully agree with the reviewer’s concern that ChR2-negative non-PVIs might increase their rate due to disinhibition, and that these neurons might thus be falsely classified as PVIs. However, we are confident that that is not the case. First, optotagged PVIs cluster well within the population of electrophysiologically identified INTs (see our response to your first remark on ‘optogenetic tagging’) and were indistinguishable from this population in terms of spike kinetics (see our response to #2 of reviewer 2 and the new Fig. 1 – figure supplement 2), suggesting that no disinhibited PYRs were included in the optotagged sample of cells. Second, we performed an additional analysis to address the time course of firing rate changes in optotagged PVIs. We computed smoothed spike trains (convolved with a 5 ms SD Gaussian kernel), and extracted the average firing rate of each optogenetically identified PVI centered on the onset of the light pulses. This analysis revealed a rapid increase in firing rate upon light delivery, arguing against disinhibitory network effects. These new data are now shown in the new Fig. 1 – figure supplement 5 and reported in line 89 (85 without tracked changes) of the revised manuscript.
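The smoothing step mentioned above (spike trains convolved with a 5 ms SD Gaussian kernel, aligned to light onset) might look like the sketch below. The 1 ms bin width, the kernel truncation at 4 SD, and the spike times are assumptions for illustration only.

```python
import math


def gaussian_kernel(sd_ms, bin_ms=1.0, n_sd=4):
    """Unit-area Gaussian kernel truncated at n_sd standard deviations."""
    half = int(round(n_sd * sd_ms / bin_ms))
    weights = [math.exp(-0.5 * ((i * bin_ms) / sd_ms) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smoothed_rate(spike_times_ms, t_start_ms, t_stop_ms, sd_ms=5.0, bin_ms=1.0):
    """Firing rate in Hz per bin, smoothed with a Gaussian kernel."""
    n_bins = int(round((t_stop_ms - t_start_ms) / bin_ms))
    counts = [0.0] * n_bins
    for t in spike_times_ms:
        idx = int((t - t_start_ms) // bin_ms)
        if 0 <= idx < n_bins:
            counts[idx] += 1.0
    kernel = gaussian_kernel(sd_ms, bin_ms)
    half = len(kernel) // 2
    rate = [0.0] * n_bins
    for i, c in enumerate(counts):
        if c == 0.0:
            continue
        for j, w in enumerate(kernel):
            k = i + j - half
            if 0 <= k < n_bins:
                rate[k] += c * w
    return [r * 1000.0 / bin_ms for r in rate]


# Hypothetical spike times in ms relative to light onset at t = 0.
spikes = [-40.0, 2.0, 5.0, 9.0, 14.0, 20.0]
rate = smoothed_rate(spikes, t_start_ms=-50.0, t_stop_ms=50.0)
```

Averaging such traces across light pulses gives a peri-stimulus rate whose rise time can be inspected for a fast, direct response rather than a delayed, disinhibitory one.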

      4) Spike cross-correlations

      The authors show that spike transmission probability from PYR to PVI is reduced in Disc1 mice compared to the controls Fig.2d and Fig.2e, but what happens to PVI to PYR spike transmission probability? Is it different in those groups? Answering this question is important since the authors discuss this topic in line 185-193.

      Inhibitory synaptic interactions are indeed detectable by spike-train cross-correlation. However, we find these to be harder to quantitatively interpret than excitatory connections. Those interactions are not visible as spike transmission but rather as a reduction in spike transmission. Reliable estimates of the reduction in spike rate of postsynaptic PYRs require very large spike numbers of postsynaptic neurons that need to be sampled. For instance, Senzai et al., 2019 (Neuron 101: 500-513.e5) identified inhibitory interactions in continuous recordings lasting up to 68 h. Since we did not explicitly design our experiments to investigate inhibitory interactions, our recordings were substantially shorter than the required length. Using the method of Senzai et al., 2019 to identify inhibitory interactions, we detected only 5 INT-INT interactions (in the pooled Disc1-mutant and control data set). This low number does not allow the quantification of potentially reduced spike transmission. Thus, attempts to quantify inhibitory interactions properly would require a substantial amount of additional long-duration recordings. While the point raised by the reviewer is highly relevant and should be investigated in future, we think that given the extensive amount of experimentation needed to address this question, it is beyond the scope of the current manuscript.

      5) Authors could try to link oscillations with spike transmission probabilities. On line 180 authors discuss that lower synchrony between PVI might be responsible for observed reduction in gamma power in Disc1 mutant mice. With the available data authors could test this hypothesis. They can look at spike cross correlations in their pool of INT and PVI (if they have pairs of PVI recorded in the same session) population.

      We thank the reviewer for this excellent suggestion! We computed the cross-correlations for all simultaneously recorded putative INTs and quantified the baseline-subtracted mean cross-correlation within 10 ms around zero time lag. This analysis revealed weaker cross-correlation in Disc1-mutant mice (p=0.026, Mann-Whitney U test, tested on averages from n=7 control and Disc1 mice with at least 2 INTs recorded simultaneously), suggestive of reduced synchronization of putative INTs at short time lags. These new data are now included in the new Fig. 4 and reported in line 201 ff. (185 ff. without tracked changes) of the revised manuscript.
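A baseline-subtracted cross-correlation measure of the kind described above can be sketched as follows. The response does not state how the baseline was defined, so the choices here are assumptions: the central window is taken as ±10 ms around zero lag and the baseline as the outermost 20 ms on each side of a ±50 ms correlogram.

```python
def cross_correlogram(ref_ms, comp_ms, max_lag_ms=50, bin_ms=1):
    """Histogram of spike-time differences (comp - ref) within +/- max_lag_ms."""
    n_bins = int(2 * max_lag_ms / bin_ms)
    counts = [0] * n_bins
    for r in ref_ms:
        for c in comp_ms:
            lag = c - r
            if -max_lag_ms <= lag < max_lag_ms:
                counts[int((lag + max_lag_ms) // bin_ms)] += 1
    return counts


def central_excess(counts, bin_ms=1, center_ms=10, baseline_ms=20):
    """Mean count within +/- center_ms of zero lag minus the mean of the outer flanks."""
    n = len(counts)
    mid = n // 2
    c = int(center_ms / bin_ms)
    b = int(baseline_ms / bin_ms)
    central = counts[mid - c: mid + c]
    flanks = counts[:b] + counts[n - b:]
    return sum(central) / len(central) - sum(flanks) / len(flanks)


# Hypothetical spike times (ms) of two simultaneously recorded interneurons.
int_a = [100, 200, 300]
int_b = [102, 199, 303]
counts = cross_correlogram(int_a, int_b)
excess = central_excess(counts)
```

A positive `excess` indicates more near-coincident spiking than in the flanks; per-pair values would then be averaged per mouse before comparing genotypes.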

      6) An alternative way to link oscillations with lower spike transmission probabilities in PYR-PVI pairs is to use synchrony-triggered LFP analysis. One could take all time points when PVI and PYR cells fired acausal spikes within a 2 ms window and look at the LFP around this time point. Then take the average of the synchrony-triggered LFP and look at the power spectrum.

      The proposal to link spike transmission with LFP power is indeed intriguing. As suggested by the reviewer, we extracted the 60-90 Hz-filtered LFPs triggered by INT spikes that followed a spike in a presynaptic PYR by <2 ms and measured the average gamma amplitude in a window of 20 ms around the INT spike. This analysis revealed comparable gamma amplitudes in Disc1 and control pairs. This finding suggests that local PYR-INT loops are still capable of producing gamma oscillations, and that the gamma oscillation defect of Disc1 mice is likely not caused by such a local defect. To investigate the relationship between INT spike timing and gamma oscillations more generally, we further extracted gamma amplitudes of spike-triggered LFPs using all available spikes of the INTs. Moreover, we compared the data to gamma amplitudes measured at randomly selected time points. An ANOVA followed by Tukey tests performed on the level of mouse averages indicated that INT spiking-associated gamma amplitudes were significantly larger than those obtained at random time points in wild-type mice (p=0.001), whereas the same was not true for Disc1-mutant mice (p=0.591). Furthermore, this analysis revealed significantly reduced spike-triggered high gamma amplitudes in Disc1-mutant compared to control mice (p=0.011). While these results argue against a driving role of local connection alterations in gamma defects, they generally confirm the impaired synchrony of INT spiking relative to gamma oscillation that we observed in our analysis of phase coupling. These data are now shown in the new Fig. 4, which summarizes all new analyses regarding gamma oscillations and phase-coupling, and in figure 4 – figure supplement 2. The new results are described in the main text of the revised manuscript in line 188 ff. (172 ff. without tracked changes).

      Considering the reduced short time scale synchronization of INTs (see our new results towards the reviewer’s #5) and reduced gamma amplitude of INT spike-triggered LFPs, it is possible that impaired synchronization among prefrontal INTs might contribute to the observed reduction in gamma power of Disc1-mutant mice (thereby, essentially, reflecting impaired INT gamma (ING)). Additionally, reduced long-range excitatory drive maintaining local gamma oscillations might be a contributing factor. For example, recent work showed that high gamma oscillations in the mPFC occur synchronized with gamma oscillations in the olfactory bulb (Karalis & Sirota, 2022, Nat Commun 13:467). It remains to be investigated whether local INTs are rhythmically driven by input from the olfactory bulb (in a multi-synaptic pathway including olfactory cortex) and to what extent that drive maintaining afferent gamma might be altered in Disc1-mutant mice. While the current data set does not allow a systematic evaluation of these possibilities, they should be further explored in future experiments.

      7) Cell assembly analysis

      The authors used 10ms for testing synchronization among pairs of PYR neurons in Fig.4a but 25ms for analysis of assembly dynamics. I think the authors justified why they used 25ms bin size, but it was not clear why they used 10ms? Could the authors clarify the reasons behind this decision?

      The synchronization analysis was originally applied to PYRs converging on a common postsynaptic INT. English et al. (Neuron 95:505-520, 2017) systematically tested the effect of presynaptic cooperativity on spike transmission in the hippocampus (their Fig. 5). Their analysis revealed a maximum in cooperativity at ~10 ms. To maximize the sensitivity of our approach, we thus focused on 10 ms for this analysis. However, we agree that using the same time window as for assembly extraction is a reasonable proposal, in particular since we find no difference in the synchronization of identified presynaptic PYRs (Fig. 3e of the revised manuscript). Thus, we have recomputed cross-correlations using a 25 ms bin size. To further improve the analysis, we restricted it to neurons with at least 1000 spikes and simplified the quantification of excess spiking by using the ‘coincident_spikes’ function of the Python package neuronpy.utils.spiketrain. Excess synchrony is now estimated by quantifying the number of coincident spikes between a reference and a comparison spike train detected in a 25 ms time window, normalized by the number expected by chance (2 * frequency of comparison train * synchrony window * number of spikes in the reference train).
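The normalization described above can be written out as a small standalone sketch. It is not the neuronpy implementation: `coincident_count` and `excess_synchrony` are hypothetical helpers, a reference spike is assumed to count as coincident if any comparison spike falls within ± the window, and all times are assumed to be in ms.

```python
import bisect


def coincident_count(ref, comp, window):
    """Number of reference spikes with a comparison spike within +/- window."""
    comp = sorted(comp)
    n = 0
    for t in ref:
        i = bisect.bisect_left(comp, t - window)
        if i < len(comp) and comp[i] <= t + window:
            n += 1
    return n


def excess_synchrony(ref, comp, window, duration):
    """Coincident spikes divided by the chance expectation.

    Chance expectation = 2 * comparison-train rate * window * number of
    reference spikes, with the rate expressed per unit of `duration`.
    """
    comp_rate = len(comp) / duration
    expected = 2.0 * comp_rate * window * len(ref)
    return coincident_count(ref, comp, window) / expected


# Hypothetical spike times in ms over a 1000 ms recording.
ref_train = [10.0, 50.0, 90.0]
comp_train = [12.0, 70.0, 200.0]
ratio = excess_synchrony(ref_train, comp_train, window=5.0, duration=1000.0)
```

A ratio near 1 would correspond to chance-level coincidence; values above 1 indicate excess synchrony. With very few spikes, as in this toy example, the ratio is dominated by sampling noise, which is one motivation for a minimum spike count such as the 1000-spike criterion mentioned above.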

      By using this improved analysis with a 25 ms time window, we could replicate our original finding of enhanced synchronization of PYR spiking. However, when we averaged the data on the basis of individual mice as suggested in #1 of reviewer 2 and #4 of reviewer 3, we could not observe this effect (irrespective of whether we used the new, coincident spikes-based analysis or the original excess synchrony analysis at either 10 or 25 ms synchrony window). This result is now stated in line 215 ff. (199 ff. without tracked changes) of the revised manuscript.

      Reviewer #2 (Public Review):

      This is an interesting paper, in which the authors assessed spiking and network deficits in a well-established mouse model of schizophrenia. This mouse model carries a genetic deletion of the Disrupted-in-schizophrenia-1 (Disc1) gene, which is highly penetrant in the human condition. The authors combined behavioral analyses with state-of-the-art electrophysiological recordings in vivo, coupled to optogenetic tagging, to study a subnetwork formed by a major inhibitory neuron subclass (the parvalbumin (PV)-expressing interneuron) and principal excitatory pyramidal neurons in the medial prefrontal cortex. This work indicates reduced firing rates of PV cells in Disc1-KO mice, likely due to reduced coupling with pyramidal neurons, leading to alterations in local network activity. Indeed, the authors found that Disc1-KO mice exhibited reduced levels of gamma oscillations and somewhat hypersynchronous networks.

      Taking advantage of novel techniques and analytical strategies, the manuscript provides rich, novel insight into the neurobiology of a mouse model of this severe psychiatric condition. The data is of high quality, the findings interesting and the manuscript is well written.

      Overall, the results support the authors' conclusions, although some additional analyses are necessary to corroborate their interpretations.

      Although the paper does not give information on how PV cell dysfunctions are engaged during cognitive tasks, this study can be considered an important first step in advancing our knowledge of the basic dysfunctions of cortical networks in this model of schizophrenia.

      We thank the reviewer for praising the ‘high quality’ of our work, and the ‘rich, novel insights’ on the neurobiology of a mouse model of a psychiatric disorder.

      1) The major findings stem from the analysis of the spiking activity of individual neurons recorded using either silicon probes or arrays of tetrodes. Both techniques allow simultaneous recording of many neurons from a single animal; therefore, from a statistical point of view neurons recorded from one animal are pseudo replicas and cannot be considered as independent measurements. Throughout the manuscript, the authors perform two-sample tests on the pooled data from all recorded neurons to compare differences between genotypes; therefore, artifactually increasing the power of statistical tests. Comparisons between genotypes should be performed using each mouse as an independent measurement.

      To be able to compare the data on the basis of mouse averages, we performed additional recordings, which resulted in a final data set of 9 Disc1 and 7 control mice. We recomputed the main results of this study based on mouse averages. First, consistent with our original cell-by-cell analysis, we found significantly reduced firing rates of putative INTs but not of PYRs (line 72 (69 without tracked changes)). Moreover, we confirmed our results on decreased spike transmission probability at PYR-INT connections (line 121 (107 without tracked changes)), decreased spike transmission in the resonance window (line 163 (147 without tracked changes)), reduced high gamma power (line 173 ff. (157 ff. without tracked changes)), lower phase-coupling of INT spikes to high gamma oscillations (line 178 (162 without tracked changes)), and reduced strength of assembly activations in Disc1 compared to control mice (line 229 ff. (211 ff. without tracked changes)). Similarly, we performed new analyses on INT-INT synchronization and INT spike-triggered gamma amplitudes (as requested by reviewer 1 #5 & 6), which showed significant effects on the level of mouse averages (line 188 ff. (line 172 without tracked changes)). Second, our original finding on significant differences in the synchronization of individual PYR-PYR pairs could not be reproduced on the level of individual mice. This is reported in line 215 (199 without tracked changes) of the revised manuscript. Finally, the analyses based on optogenetically identified PVIs did not allow comparison by mouse averages due to the low number of experiments (n=3 mice each). Given that the vast majority of our conclusions are based on electrophysiologically identified INTs, with optogenetic identification experiments being only confirmatory in nature, and that performing additional experiments for optogenetic identification of PVIs would be very laborious, we report the results of these analyses as comparisons between neurons or connected pairs. This is clearly stated at the respective sections throughout the revised manuscript. We hope that the reviewer can agree with our decision.
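
      To make the distinction between pooled-neuron and per-mouse comparisons concrete, the following is a minimal sketch of collapsing neuron-level values to one mean per mouse before any between-genotype test. It is an illustration only: the record layout, the mouse identifiers, and the function name are hypothetical and are not taken from the manuscript.

```python
from collections import defaultdict
from statistics import mean


def mouse_averages(records):
    """Collapse neuron-level values to one mean per mouse, grouped by genotype.

    records: iterable of (genotype, mouse_id, value) tuples, one per neuron
    (hypothetical layout). Returns {genotype: [mean for each mouse]}.
    """
    per_mouse = defaultdict(list)
    for genotype, mouse_id, value in records:
        per_mouse[(genotype, mouse_id)].append(value)

    by_genotype = defaultdict(list)
    # Sorting keeps the output order deterministic across runs.
    for (genotype, _mouse_id), values in sorted(per_mouse.items()):
        by_genotype[genotype].append(mean(values))
    return dict(by_genotype)


# Toy firing rates (Hz); the numbers are invented for illustration.
neurons = [
    ("control", "m1", 10.0),
    ("control", "m1", 14.0),
    ("control", "m2", 8.0),
    ("disc1", "m3", 4.0),
    ("disc1", "m3", 6.0),
]
averages = mouse_averages(neurons)
```

      A two-sample test would then be run on the per-mouse lists, so that the sample size equals the number of animals rather than the number of recorded neurons.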

      2) The superficial layers of the mPFC are difficult to reach with a vertical approach of the probes due to the presence of a large blood vessel located medially in the frontal dura. Therefore, the authors are most likely reaching mPFC deep layers where PYR neurons produce fast spikes at high rates. If this is the case, this would make it difficult to sort the spiking of PYR from that of INs based on the spike kinetics and rate. The authors used opto-tagging of PVIs in a set of experiments. It would be reassuring to confirm that the spike waveform and kinetics that they extracted from PVIs are similar to those they assigned as INTs in their experiments with no opto-tagging. Identified PVIs should be statistically different from putative PYRs (not responding to light). Although opto-tagging of PVIs can solve this issue, the number of cells isolated remains low and the number of animals is not stated. Opto-tagged cells are subsequently used for further analyses but the statistical value of those remains unclear. Since the entire interpretation of the rest of the results depends on this result, this must be clarified.

      As correctly pointed out by the reviewer, we indeed targeted deep layers of the mPFC (~0.4 mm lateral of the midline; see also the histological information about the recording sites that is now included in Figure 1 – figure supplement 1), where higher spike rates are expected compared to superficial layers. To assess whether this might have influenced the identification of putative INTs, we separately plotted the duration and asymmetry index used to classify the neurons into PYRs and putative INTs for Disc1 and control mice. This analysis yielded well separated clusters in both cases. In addition, as suggested by the reviewer, we compared the kinetic properties (spike duration and asymmetry index) and rates of PYRs, putative INTs, and optotagged PVIs. In both genotypes, ANOVA followed by Tukey post-hoc testing revealed significant differences between the PYRs and both groups of INTs, both for rate (smaller in PYRs) and kinetic properties (longer spikes of PYRs), while we found no difference between putative INTs and PVIs. These results thus suggest that the method used to identify INTs works reliably. These new data are now shown in the revised Fig. 1a and the new Figure 1 – figure supplement 2 and mentioned in line 89 ff. (85 without tracked changes) of the revised manuscript.

      We agree that the number of experiments using PVI opto-tagging is low (n=3 mice per genotype; this information is now included in the main text in line 93 ff. (88 ff. without tracked changes)). However, our analysis of spike transmission probability using the population of untagged putative fast-spiking INTs revealed similar results as for the sample of optogenetically identified PVIs. We view the PVI optotagging experiment as an additional confirmation that the difference in firing rate and spike transmission likely did not arise from sampling from different INT types in Disc1 and control mice, as pointed out in line 80 (76 without tracked changes) of the revised manuscript. The limitation of the low number of PVIs in our study is critically reflected in the revised discussion in line 249 ff. (229 without tracked changes).

      3) Proportion of gamma coupled neurons. The authors mention the use of pairwise phase consistency (PPC). PPC is a good method to measure phase coupling independent of differences in firing rates. However, it is not entirely clear how PPC is used to measure the extent of phase locking. In the methods, the authors mention that they ran the PPC analysis after determining significant phase locking with Rayleigh's test. Moreover, they provide PPC values for high gamma oscillations but not for other frequency ranges. Perhaps it would be better to test significant coupling of all units by nonrandom spike-phase distributions crossing a confidence interval, estimated by Monte Carlo methods from an independent surrogate data set. These can be obtained by randomly jittering each spike time. Indeed, PPC values estimated by the authors for high gamma are higher for PYR than INT (Fig. 1- Fig. Suppl 4 b). This is at odds with previously published observations in V1 (e.g. Perrenoud et al., PLoS Biol. 2016 PMID: 26890123). Given the existing reports of reduced excitatory transmission in DISC-1 mice, phase locking of PYR to other frequency bands might be affected.

      Following the reviewer’s suggestion, we have revised our phase-coupling analysis. First, Perrenoud et al. (2016) show that gamma oscillations occur in short bursts of high power. To better reflect the coupling of putative INTs to those transient gamma events, we restricted the phase-coupling analysis to epochs within the largest quintile of gamma amplitude (assessed by the envelope of the gamma-filtered signal obtained by Hilbert transformation). Second, instead of the Rayleigh test, we obtained for each unit randomized spike trains by shuffling the inter-spike intervals (500 iterations). Significant phase locking was then obtained by testing whether two consecutive bins of the phase histogram exceeded the 95th percentile of the random distribution. This analysis was performed separately for the low (20-40 Hz) and high gamma bands (60-90 Hz) for both putative INTs and PYRs. Third, the depth of phase coupling was assessed by PPC for all significantly phase-coupled neurons. While this metric is more robust against changes in spike rates than traditional measures, it is still not completely independent of them. Perrenoud et al., for instance, showed using spike sub-sampling that the reliability in estimating PPC depends on spike rate (with >1000 spikes being optimal). However, our data set of PYRs contained fewer than 1000 spikes during high gamma events (mean Disc1: 657 ± 32, mean control: 840 ± 43). To better account for the effect of rate dependence, we restricted the analysis to neurons with >250 spikes. To further limit the potential impact of different spike counts across neurons, we used random subsampling with a fixed spike number of 250 (100 iterations per cell), computed PPC in each iteration, and averaged over the PPC estimates per cell. Finally, in response to the reviewer’s point 1, the results of all neurons (PYR and INT separately) were then averaged for each mouse.
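
      The subsampling step can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes the standard definition of PPC as the mean cosine of the phase difference over all pairs of spikes, computed here through the equivalent resultant-vector identity, and the function names, the fixed random seed, and the input format (a list of spike phases in radians) are assumptions.

```python
import cmath
import math
import random


def ppc(phases):
    """Pairwise phase consistency of spike phases given in radians.

    Uses the identity mean_{i != j} cos(theta_i - theta_j)
    = (|sum_k exp(i * theta_k)|^2 - N) / (N * (N - 1)).
    """
    n = len(phases)
    if n < 2:
        raise ValueError("PPC needs at least two spike phases")
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


def subsampled_ppc(phases, n_spikes=250, n_iter=100, seed=0):
    """Average PPC over random subsamples with a fixed spike count.

    n_spikes=250 and n_iter=100 mirror the numbers quoted in the text;
    the seed is only there to make the sketch reproducible.
    """
    if len(phases) < n_spikes:
        raise ValueError("unit has fewer spikes than the subsample size")
    rng = random.Random(seed)
    estimates = [ppc(rng.sample(phases, n_spikes)) for _ in range(n_iter)]
    return sum(estimates) / n_iter


# Perfectly locked toy unit: every spike at the same phase.
locked = [0.5] * 300
locked_ppc = subsampled_ppc(locked)
```

      Fixing the number of spikes per subsample means every unit contributes an estimate with the same sampling variability, which is the stated motivation for the subsampling.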

      Consistent with our original analysis, we found a significantly reduced proportion of phase-coupled INTs but unaltered PPC of significantly coupled INTs to the high gamma band. Moreover, we observed no significant effects for low gamma oscillations or for the phase-coupling of PYRs to either low or high gamma bands. These results are now shown in the new Fig. 4 and the new Figure 4 – figure supplement 1, and are described in line 170 ff. (154 without tracked changes) of the revised manuscript. In addition, we provide a detailed explanation of the revised phase coupling analysis, including a formal description of how PPC is computed, in the Methods section of the revised manuscript in line 524 ff. (486 without tracked changes).

      Using the revised phase-coupling analysis, we observed comparable PPC values of significantly coupled PYRs (0.013) and INTs (0.014) to high gamma in control mice. While the improved analysis thus resolved the paradoxical finding of lower PPC in INTs, we did not observe weaker phase-coupling of PYRs as reported in Perrenoud et al. (2016). A possible explanation for this discrepancy might be genuine differences in gamma coupling of the PYR population between visual cortex (Perrenoud et al., 2016) and the prefrontal cortex (our study), which will require further investigation in future.

      Reviewer #3 (Public Review):

      In the present study, the authors aim to assess network activity alterations in the prefrontal cortex of mice with a deletion variant in the schizophrenia susceptibility gene DISC1 ("DISC1 mutants"). Using silicon probe in vivo recordings from the prefrontal cortex, they find that mutant mice show reduced firing rates of fast-spiking interneurons, reduced spike transmission efficacy from pyramidal cells to interneurons, and enhanced synchronization and activation of cell assemblies. The authors conclude that "interneuron pathology is linked with the abnormal coordination of pyramidal cells, which might relate to impaired cognition in schizophrenia."

      The cellular and circuit basis of psychiatric disorders has received strong interest in the recent past. In particular, alterations of the "excitation-inhibition balance" in cortical circuits have been the focus of extensive scrutiny (reviewed in pmid 22251963). Specifically, in both human samples as well as in mouse models, disruption of interneuron development and function has been implicated in the pathogenesis of schizophrenia. In the DISC1 mouse model, studies have reported disrupted interneuron development (e.g. pmid 23631734, 27244370), reduced numbers of GABAergic neurons (e.g. pmid 18945897), reduced inhibition from GABAergic neurons ex vivo (e.g. pmid 32029441), and reduced firing rates of fast-spiking neurons in vivo in the basal forebrain (pmid 34143365).

      The present manuscript makes a potentially important contribution to this question by probing the microcircuitry of the prefrontal cortex in vivo in the DISC1 mouse model of schizophrenia. It goes beyond previous work in assessing circuit dynamics in vivo in more detail, albeit with indirect methods. The experiments and analysis have generally carefully been performed, though the statistical analysis raises some questions. The advances made by the present work compared to previous studies could be delineated more clearly.

      We thank the reviewer for praising the analysis of our data ‘…have generally carefully been performed..’ and the ‘important contribution’ of our work to the field.

  4. www.janeausten.pludhlab.org
    1. we women never mean to have anybody. It is a thing of course among us, that every man is refused, till he offers

      See also Emma "A woman may not marry a man merely because she is asked, or because he is attached to her" (chapter 7) and Mansfield Park "I think it ought not to be set down as certain that a man must be acceptable to every woman he may happen to like himself" (Chapter 35)

    1. Author Response

      Reviewer #1 (Public Review):

      Neural stem cells express cascades of transcription factors that are important for generating the diversity of neurons in the brain of flies and mammals. In flies, nothing is known about whether the transcription factor cascades are built from direct gene regulation, e.g. factor A binding to enhancers in gene B to activate its expression. Here, Xin and Ray show that one temporal factor, Slp1/2, is regulated transcriptionally via two molecularly defined enhancers that directly bind two other transcription factors in the cascade as well as integrating Notch signaling. This is a major step forward for the field, and provides a model for subsequent studies on other temporal transcription factor cascades.

      Thanks for the positive comments!

      Reviewer #2 (Public Review):

      The manuscript addresses an important question concerning the mechanisms regulating temporal transitions in Drosophila neural progenitors called neuroblasts. Here, they concentrate on a specific transition between the transcription factors Ey and Slp1/2 that are sequentially expressed within a cascade involving at least 6 temporal transcription factors. Using a combination of new transgenes, bioinformatics and genome-wide profiling of transcription factor binding sites (Dam-ID), they functionally characterize two enhancers of the Slp1/2 genes that are active during this transition. This led to the identification of the Notch pathway as an important facilitator of the transition. They also show that Notch signaling requires cell cycle progression and that Slp1/2 is a direct target of Ey, validating the importance of transcriptional cross-regulatory interactions among the temporal transcription factors to trigger progression.

      In my opinion, the study is very interesting, representing the first careful analysis of enhancers involved in temporal transitions in neural progenitors, and leading to new insights into the mechanisms promoting temporal progression.

      Thanks for the positive comments!

      Reviewer #3 (Public Review):

      In this manuscript, the authors present data to suggest that transcriptional activation of the Slp1/2 temporal factors in the medulla neuroblasts of the developing Drosophila optic lobe is dependent on two enhancer elements. The authors concluded that these two enhancers were able to be activated by Ey and Scro, two other factors identified to be involved in the temporal cascade of the medulla NB. The authors show that cell cycle progression is necessary for Notch signaling, and that Notch signaling activates and sustains the temporal transcription factor cascade. The authors use GFP reporter assays to correlate the enhancer activity to Slp1/2 expression and used DamID to show in-vivo binding of Su(H) and Ey to the enhancer fragments.

      I agree with the authors that it is important to define the mechanisms by which Notch, cell cycle control and these temporal transcription factors function through their cis-regulatory elements to establish this self-propagating cascade to generate diverse cell types during neurogenesis. However, the findings in this study offer limited new insights toward reaching this goal for a myriad of reasons. First, studies in invertebrate and vertebrate neurogenesis have agreed on the conceptual framework that transcriptional control plays a key role in regulating the generation of diverse cell types. The data showing the patterns of slp1/2 transcript simply reaffirm the proposed model as well as recently published single-cell transcriptomic analyses of fly optic lobe neuroblasts. Second, it remains unclear how physiologically relevant the enhancer analyses presented in this study are to the regulation of Slp1/2 expression, as the data can only suggest that they act redundantly to each other. It is also troubling to see that mutating binding sites of a single transcription factor appears to completely abolish enhancer activity while Slp1/2 protein expression is delayed in mutant clonal analyses. Third, the authors do not offer any explanation for how Notch signaling contributes to the timing of Slp1/2 expression, considering that Notch signaling should be active during the entire life of the neuroblast based on canonical Notch target gene expression. What role do Ey and Scro play in this timely enhancer activation, as both appear to be necessary to activate the enhancers along with Notch? Fourth, many studies including the Okamoto et al., 2016 study cited in this study have contributed to our appreciation of the role of proper cell cycle control in promoting generation of diverse neurons in vertebrate neurogenesis. It is unclear to me if findings from the current study contribute to significant advancement on this regulatory link.

      Thanks for raising these concerns. Here are our responses:

      First, we agree that there have been great advances in this field including classical studies in the ventral nerve cord, recent studies on type II lineages and medulla including our own scRNA-seq study of medulla neuroblasts. These studies have revealed the sequential expression of transcription factors in neuroblasts of different ages, and proposed that these transcription factors form a transcriptional cascade based on the cross-regulations among them. However, these cross-regulations were based on mutant phenotypes, and in most cases, the cis-regulatory elements of these TTFs have not been characterized, and it hasn’t been studied whether these cross-regulations are direct or not. Little is known about exactly how the timing of the transition is regulated and coordinated with cell-cycle control. We have addressed these questions and identified two enhancer elements for slp1/2, and demonstrated that the previous TTF Ey, another TTF Scro, and Notch signaling directly regulate slp expression. Further we demonstrated that Notch signaling is dependent on cell cycle progression in neuroblasts, and supplying Notch signaling rescues the delay in Slp expression in cell cycle mutants. We believe this study has provided important insights in this field and is another step forward.

      Second, now we provide evidence that deletion of both enhancers specifically abolished Slp1 and Slp2 expression in medulla neuroblasts.

      Regarding the concerns about binding site mutation:

      1) Ey: With loss of Ey, Slp is completely lost. The Ey binding site mutation phenotype is consistent with loss of Ey phenotype.

      2) Su(H): For the u8772 250bp enhancer, mutating all four predicted Su(H) binding sites did abolish the reporter expression. During the revision, we generated another construct, in which we mutated the two predicted Su(H) binding sites that are perfect matches to the consensus, and found that this dramatically reduced the reporter expression. For the d5778 850bp enhancer, mutation of Su(H) binding sites caused strong glial expression, which prevented us from precisely assessing the neuroblast expression. Thanks to the excellent advice from reviewer #3, we used repo-Gal4 and GFP-RNAi to remove the glial expression. This approach turned out to be very informative. We found that mutation of four or six out of six predicted Su(H) binding sites actually did not decrease the reporter expression in neuroblasts, suggesting that Notch signaling does not activate the d5778 850bp enhancer through these binding sites. However, we think this explains why this enhancer drives a delayed expression compared to the 220bp enhancer and the endogenous Slp. In addition, this also explains why, with loss of Notch signaling, endogenous Slp expression is only delayed but not completely lost: although the 220bp enhancer-driven expression is completely lost, the d5778 850 bp enhancer still directs a delayed expression of Slp, and this expression is not dependent on Notch signaling.

      3) Scro: Mutation of Scro binding sites caused a decreased expression level of the reporter, consistent with the scro RNAi phenotype on Slp, which is a decreased expression level.

      Third, regarding how Notch signaling, which is active throughout the entire neuroblast life, can act to activate Slp expression at a specific time: we tested genetic interactions between Ey, Scro, and Notch in the regulation of Slp expression. We found that with loss of Ey, supplying constitutively active Notch or Scro is not sufficient to rescue Slp expression. Thus Ey, as the previous TTF, may be required to prime the slp locus, so that Notch signaling and Scro can act to further activate Slp expression. Therefore, Notch signaling requires Ey to specifically further activate Slp at the correct time. We have added these experimental results and discussion.

      Fourth, the Okamoto et al., 2016 study actually concluded that cell cycle progression is not required for the temporal progression. In their experimental setup, they supplied Notch to maintain the undifferentiated status of cortical neural progenitors when they blocked cell cycle progression. They observed that temporal transitions still happened, and they concluded that cell cycle progression is not required for temporal transitions. However, they didn’t consider the possibility that Notch signaling, which is itself dependent on cell cycle progression, actually rescued the possible phenotype caused by arrest of cell cycle progression. Our results demonstrated that in the Drosophila medulla, supplying Notch signaling can rescue the delay in the transition to the Slp stage in cell-cycle arrested neuroblasts, and further showed that the mechanism is direct transcriptional regulation. We believe that publication of our results will be informative for vertebrate studies, prompting vertebrate researchers to reconsider the role of cell cycle progression and Notch signaling in temporal progression.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01528

      Corresponding author(s): Elena Taverna and Tanja Vogel

      1. General Statements [optional]

      We thank the reviewers for the comments and points they raised. We think what we have been asked is a doable task for us and we are confident we will manage to address all points in a satisfactory manner.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Reviewer’s comment: The manuscript investigated the role of DOT1L during neurogenesis especially focusing on the earlier commitment from APs. Using tissue culture method with single-cell tracing, they found that the inhibition of DOT1L results in delamination of APs, and promotes neuronal differentiation. Furthermore, using single cell RNA-seq, they seek possible mechanisms and changes in cellular state, and found a new cellular state as a transient state. Among differentially expressed genes, they focused on microcephaly-related genes, and found possible links between epigenetic changes led by DOT1L inhibition and epigenetic inhibition by PRC2. Based on these findings, they suggested that DOT1L could regulate neural fate commitment through epigenetic regulation. Overall, it is well written and possible links from epigenetic to metabolic regulation are interesting. However, there are several issues across the manuscript.

      Response to Reviewer and planned revision:

      We thank reviewer 1 for her/his comments and constructive criticism.

      We hope the revision plan will address the points raised by the reviewer in a satisfactory manner.

      Major issues:

      Reviewer’s comment: 1) It is not clear whether the degree of H3K79 methylation (or other histones) changes during development, and whether DOT1L is responsible for those changes. It is necessary to show the changes in histone modifications as well as the levels of DOT1L from APs to BPs and neurons, and to what extent treatment with EPZ changes the degree of histone methylation.

      Response to Reviewer and planned revision:

      • As for the level of DOT1L protein: we tried several commercially available antibodies, but they do not work in the mouse, even after multiple attempts and optimization. So, unfortunately, we will not be able to provide this piece of information.

      • As for the level of DOT1L mRNA: we will provide information on the DOT1L mRNA level in APs, BPs and neurons by using scRNAseq data from E12, E14, E16 WT cerebral cortex.

      • As for the levels of H3K79 methylation, we did not intend to claim that the histone methylation is responsible for the reported fate transition. We will edit the text to avoid any possible confusion. If it is deemed necessary to address the point raised by the reviewer, we have 3 options, which we list here in order of priority and ease of execution from our side.

      • immunofluorescence with an Ab against H3K79me2 using CON and EPZ-treated hemispheres.

      • FACS sort APs, BPs and neurons from CON and EPZ-treated hemispheres, followed by immunoblot for H3K79me2 to assess the H3K79me2 levels. As for the FACS sorting, we will use a combinatorial sorting in the lab on either a TUBB3-GFP or a GFP-reporter line using EOMES-driven mouse lines. This strategy has already been employed in the lab by Florio et al., 2015 and we will use it with minor modifications.
      • scCut&Tag for H3K79me2 from CON and EPZ-treated hemispheres. This option entails a collaboration with the Gonzalo Castelo-Branco lab in Sweden and might therefore require additional time to be established and carried out.

      Reviewer’s comment:

      Furthermore, the study mainly used pharmacological bath application. DOT1L has anti-mitotic effect, thus it is not clear whether the effect is coming from the inhibition of transmethylation activity.

      Response to Reviewer and planned revision:

      In a previous work we used a genetic model (DOT1L KO mouse) that showed microcephaly (Franz et al. 2019). For this study, we wanted to fill a gap in knowledge by understanding whether the DOT1L effect was mediated by its enzymatic activity. For this reason, we chose to use pharmacological inhibition with EPZ, whose effect on DOT1L activity has been extensively reported and documented in the literature (EPZ is a drug currently in phase 3 clinical studies).

      The stringent focus of this study on pharmacological inhibition is thus a step toward understanding what specific roles DOT1L can play, either as a scaffold or as an enzyme.

      Here, we concentrate on the enzymatic function and the scaffolding function is beyond the scope of this specific study. We can further discuss and elaborate on the rationale behind this in the revised manuscript.

      Reviewer’s comment:

      In addition, the study assumed that the effect of EPZ is cell autonomous. However, if EPZ treatment can change the metabolic state in a cell, it would be possible that the observed effects were non-cell autonomous. It would be important to address whether this effect arises in a cell-autonomous manner by other means, such as focal shRNA-KD by IUE.

      Response to Reviewer and planned revision:

      We did not claim that the effect of EPZ is cell autonomous, we are actually open on this point, as we consider both explanations to be potentially valid. We will edit the text to avoid any possible confusion on what we assume and what not.

      As a general consideration, it is entirely possible that the effects are non-cell autonomous. We will comment and elaborate on that in the revised manuscript.

      If the reviewer/journal considers this a point that must be addressed experimentally, then we will proceed as follows:

      • DOT1L shRNA-KD via in utero electroporation, followed by either
      • in situ hybridization for ASNS to check if ASNS transcript is increased upon DOT1L shRNA-KD compared to CON
      • FACS sorting of the positive electroporated cells (CON and DOT1L shRNA-KD), followed by qPCR to assess the levels of ASNS
      • If the reviewer wants us to check for a more downstream effect on fate, then we will immuno-stain the DOT1L shRNA-KD and CON with TUBB3 AB and/or TBR1 AB (as already done in the present version of the manuscript).

      Reviewer’s comment: 2) The possible changes in cell division and differentiation were found by a very nice single-cell tracing system. However, changes in division modes occurring in targeted APs, such as angles of mitotic division and the expression of mitotic markers, were not addressed. This information is critical for understanding the mechanisms underlying the observed phenotype, delamination, differentiation and fate commitment.

      Response to Reviewer and planned revision:

      Effects of DOT1L manipulation on the mitotic spindle were observed in a previous paper using the DOT1L KO mouse (Franz et al. 2019). Considering that in our experiments we use pharmacological inhibition, we will address this point by quantifying the spindle angle in CON and EPZ-treated cortical hemispheres.

      We will co-stain for DAPI to visualize the DNA/chromosomes, and for phalloidin (filamentous actin counterstain) that allows for a precise visualization of the apical surface and of the cell contour, as it stains the cell cortex.

      Of note, the protocols we are referring to are already established in the lab, based on published work from the Huttner lab (Taverna et al, 2012; Kosodo et al, 2005).

      Reviewer’s comment: 3) The scRNA-seq analysis indicated interesting results, but did not fully explain the observed results in histology. In fact, in single cell RNA-seq, the authors claimed that cells in TTS are increased after EPZ treatment, which are more similar to APs. However, in histological data, they found that EPZ treatment increased neuronal differentiation. These data conflict, thus I wonder whether "neurons" from histology data are actually neurons? Using several other markers simultaneously, it would be important to check the cellular state in histology upon the inhibition/KD of DOT1L.

      Response to Reviewer and planned revision:

      The reviewer’s comment is valid, and we indeed found that TTS cells are an intermediate state between APs and neurons in terms of transcriptional profile. This is the reason why we called this cell cluster “transient transcriptional state” (TTS).

      We plan to address this point by staining for TBR1 and/or CTIP2 in CON and EPZ-treated hemispheres, and to extend this analysis with EOMES and SOX2 co-staining.

      Minor issues:

      Reviewer’s comment: Figure 1 - It is not clear delaminated cells are APs, BPs or some transient cells (Sox2+ Tubb3+??). It is important to use several cell type-specific and cell cycle markers simulnaneously to characterize cell-type specific identity of the analysed cells by staining. These applied to Fig1B,D,E,F,G,as well as Fig2,3.

      Response to Reviewer and planned revision:

      We will address this point by using a combinatorial staining scheme for several fate markers such as TUBB3, EOMES and SOX2, as suggested by the reviewer.

      Reviewer’s comment: - Please provide higher magnification images of labelled cells (Fig 1H)

      Response to Reviewer and planned revision:

      In the revised manuscript, we will provide higher magnification for the staining.

      Reviewer’s comment: - Please provide clarification on the criteria of Tis21-GFP+ signal thresholding.

      Response to Reviewer and planned revision:

      In the revised manuscript, we will provide a clarification on the criteria of Tis21-GFP+ signal thresholding.

      Reviewer’s comment: - Splitting the GFP signal between ventricular and abventricular does not convincingly support the "more basal and/or differentiated" states after EPZ treatment.

      Response to Reviewer and planned revision:

      We will provide a clarification regarding this point.

      Reviewer’s comment: - Please explain the presence of Tis21-GFP+ cells at the apical VZ.

      Response to Reviewer and planned revision:

      Tis21-GFP+ cells at the apical VZ have been extensively reported in the literature, since the first paper by Haubensak et al. describing the generation of the Tis21-GFP line. In a nutshell, Tis21-GFP+ cells are present throughout the VZ (therefore also in the apical portion), as neurogenic, Tis21-GFP-positive cells undergo mitosis at the apical surface. Indeed, the presence of Tis21-GFP signal has been extensively used by the Huttner lab and collaborators to score apical neurogenic mitoses. In addition, since APs undergo interkinetic nuclear migration, it follows that Tis21-GFP+ nuclei will be present throughout the entire VZ.

      In the revised manuscript, we will explain this point and cite additional literature.

      Reviewer’s comment: - Order the legends in same order as the bars.

      Response to Reviewer and planned revision:

      We will follow the reviewer’s recommendation and order the legends accordingly.

      Reviewer’s comment: Figure 2 -Fig 2B) The difference between CON and EPZ apical contacts is not clear and does not match with the graph in Fig 2E.

      Response to Reviewer and planned revision:

      We will explain Fig. 2B in more detail and provide additional images in the revised manuscript.

      Reviewer’s comment: -Supp Fig 2 - are these injected slices cultured in control conditions? Please include this in the text and figure/figure legend

      Response to Reviewer and planned revision:

      In the revised manuscript, we will change the text, figure and figure legend to address this point and state the culture conditions more clearly.

      Reviewer’s comment: Fig 2C) The EPZ-treated DxA555+ cells exhibit morphological change of cell shape. Is this phenotype? please comment on the image shown for EPZ treatment panel.

      Response to Reviewer and planned revision:

      We thank the reviewer for having raised this point.

      The change in morphology might be a consequence of delamination and/or of a change in cell fate. In the revised manuscript, we will comment on this very relevant point in more detail and expand the discussion accordingly.

      Reviewer’s comment: Fig 2F - 2G) Data presented on EOMES+ and TUBB3+ % are counterintuitive. The authors claimed that TUBB3+ cells are increased and neuronal differentiation is promoted. However, no changes in EOMES+ are observed. What is the explanation? Did the author check the double positive cells? These could be TSS cells?

      Response to Reviewer and planned revision:

      We thank the reviewer for having raised this point.

      As envisioned by the reviewer, we suspect that the counterintuitive data might be due to TTS cells, which, based on our scRNA-seq data, express several cell type-specific markers at the same time. Since the EPZ treatment lasts 24 h, it is possible that cells (such as those in the TTS cluster) do not have enough time to completely eliminate the EOMES protein. If that were the case, we would expect to still detect EOMES immunoreactivity, as we indeed do.

      To address this point, we will:

      • Analyze the scRNA-seq data to determine the extent of co-expression of Eomes and Tubb3 mRNAs in the TTS population.
      • Check for EOMES and TUBB3 double-positive cells in the microinjection experiment.

      Reviewer’s comment: Figure 2 and Figure 3) the number of pairs analyzed for EPZ is twice as that of Con for comparison of the parameters taken into account. Please include n of each graph in the figure legend of the specific panel if not the same for all panels in that figure (i.e. for figure 3)

      Response to Reviewer and planned revision:

      We will revise the text accordingly.

      Reviewer’s comment:

      Figure 3) The data indicated that the number of daughter cell pairs in EPZ samples is almost double than Control. Is this the phenotype? More numbers of daughter cells in EPZ treated samples were observed from the same number of injections? or the number of injected cells were different?

      Response to Reviewer and planned revision:

      Due to technical reasons, we indeed performed a higher number of injections in EPZ-treated slices. We think this is the main reason behind the difference in number.

      If the reason were biological, one would expect to see the same trend in the IUE experiments, but this is not the case. This supports the idea that the difference is mainly technical.

      Reviewer’s comment: Figure 4)

      • Please clarify if the single cell transcriptomic analysis has been performed only once, and if yes, how statistical testing to compare the cell proportion is carried out with only one batch. Fig 4G)

      Response to Reviewer and planned revision:

      As for the scRNAseq on microinjected cells:

      The scRNA-seq analysis was done once, using cells pooled from 3 different microinjection experiments performed on 3 different days.

      As for the scRNAseq on IUE cells:

      The scRNA-seq analysis was done once, using cells pooled from 2-3 different IUE experiments performed on 3 different days.

      For all scRNAseq experiments, statistical testing is performed by intrasample comparisons according to established bioinformatics pipelines. We will explain this point in more detail in the revised manuscript.
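
      The pipelines are not named here. Purely as an illustration of what a comparison of cell proportions from a single pooled sample per condition can look like, the sketch below runs a two-proportion z-test on cluster membership counts. This is a swapped-in textbook test, not necessarily the method used in the manuscript. It treats every cell as an independent observation, so it cannot account for batch-to-batch variability. All counts are made-up placeholders.

```python
import math

def two_proportion_z_test(k1, n1, k2, n2):
    """Two-sided z-test for a difference between two proportions.

    k1 / n1: cells in the cluster of interest / all cells, condition 1.
    k2 / n2: the same counts for condition 2.
    Returns (z, p_value).
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 30 of 100 cells in a cluster under one condition, 10 of 100 under the other.
z, p = two_proportion_z_test(30, 100, 10, 100)
```

      A test of this kind only addresses sampling noise at the level of individual cells; it does not replace biological replicates.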

      Reviewer’s comment: Figure 4 and 5) - Figures are not supportive of the statement regarding APs' neurogenic potential upon DOT1L inhibition. TSS transcriptomic profile resembles more progenitors than neurons. Please comment on TSS neurogenic capacity taking into account the provided GO and RNAseq.

      Response to Reviewer and planned revision:

      We thank Reviewer 1 for raising this point. It is indeed true that TTS cells resemble APs more than neurons (as indicated in Fig. S5B, C). We took that to indicate that these cells are transient and therefore still maintain some AP features. Interestingly, TTS cells downregulate cell division markers, suggesting a restriction of proliferative potential, as one would expect for cells with an increased neurogenic potential. We will discuss this point in the revised manuscript.

      Reviewer’s comment: - Please provide GO analysis for APs and BPs.

      Response to Reviewer and planned revision:

      Following the reviewer’s suggestion, we will incorporate a more careful and in-depth analysis in the revised version of the manuscript.

      Reviewer’s comment: - Reconstruct figure 5A by listing genes in the same order in both Con and EPZ and prioritize EPZ-Con differences instead of cell-cell differences.

      Response to Reviewer and planned revision:

      We will revise Figure 5A based on the reviewer’s comment.

      Reviewer’s comment:

      Moreover, the presented genes in the heatmap is not the same in two conditions (i.e. NEUROG1 is present in EPZ but absent in Con). Please justify.

      Response to Reviewer and planned revision:

      This observation reflects the different activities of transcription factor networks in the control and EPZ conditions. The gene lists are not expected to be the same, as the cell states are altered and different TFs are expressed and active upon treatment in the different cell types. In the revised manuscript, we will justify this point.

      Reviewer’s comment: Fig 5D)

      • Please explain why binding of EZH2 on the promoter of Asns is strongly reduced in comparison to a mild significant reduction of H3K79me/H3K27me3 in EPZ compared to Control.

      Response to Reviewer and planned revision:

      Several explanations are possible.

      First, the variation can be due to batch effects.

      Second, the acute reduction of EZH2 might not be immediately accompanied by a reduction of the histone mark, which is removed either through cell division or by demethylases. Both processes might be slower than the loss of EZH2 from the respective site.

      Based on the reviewer’s comment, we will explain this point in the revised manuscript.


      Reviewer’s comment:

      Also is the changed directly medicated by DOT1L?

      Please test whether DOT1L can bind the promoter of Asns.

      Response to Reviewer and planned revision:

      To address this relevant issue we will proceed with the following protocol:

      • electroporate a tagged version of DOT1L into ESCs
      • select ESCs and differentiate them into NPC_48h.
      • treat NPC with DMSO (Con) or EPZ
      • harvest CON and EPZ-treated NPC
      • perform ChIP-qPCR for DOT1L at the Asns promoter

      Reviewer’s comment: Please provide the expression patterns of DOT1L and Asns during neuronal differentiation.

      Response to Reviewer and planned revision:

      As for Dot1l:

      Dot1l expression was shown in Franz et al. 2019, by ISH from E12.5 to E18.5.

      As for Asns:

      We will provide E14.5 in situ staining of Asns in the developing mouse brain from the GenePaint database (see Figure below).

      We will also show immunostainings for ASNS at mid-neurogenesis, provided that an antibody against ASNS works in the mouse.

      Other General comments:

      Reviewer’s comment: Please Indicate VZ, SVZ and CP on the side of the pictures/ with dot lines in the pictures both for primary figures and supplementary.

      Response to Reviewer and planned revision:

      We will revise the figures accordingly.

      Reviewer’s comment: - The Results and figures sometimes do not support the statement made by the authors

      Response to Reviewer and planned revision:

      We will carefully check on this and eliminate any overinterpretation or non-supported statements from the text.

      Reviewer’s comment: - Schemes are not informative/explanatory enough, i.e. time windows of treatment and sample collection, culture conditions details.

      Response to Reviewer and planned revision:

      We will revise the schemes to include more details. In particular, we plan to add a supplementary figure with a detailed visual description of the protocol, to match the detailed description presented in the materials and methods.

      Reviewer’s comment: - A more extensive characterization of TTS cells in terms of differentiation progression and integration would be enlightening

      Response to Reviewer and planned revision:

      In general, we are facing two main challenges while studying the TTS population: one is the lack of a specific marker gene for TTS, the other is the relatively small size of the TTS subpopulation.

      For these reasons, our ability to carry on an in-depth analysis of this cell state is limited.

      Considering the reviewer’s comment, in the revised manuscript we will expand the analysis and characterization of the differentiation potential of TTS using RNA velocity trajectory analysis.

      We can also expand the discussion on this point.

      Reviewer’s comment: - Picture quality can be improved, provide high magnification images.

      Response to Reviewer and planned revision:

      We will revise the figures to include higher magnification images.

      Reviewer #1 (Significance (Required)):

      Reviewer’s comment: The study could be important for the specific field in neural development. It aims to understand mutations in respective genes and brain malformation. If the link between epigenetic and metabolic changes is clearly shown, it will be interesting. However, the current manuscript is still rather descriptive, and clear mechanistic insights were not provided. The study have potentials and additional data will strength the value of study.

      Response to Reviewer and planned revision:

      We will address the direct impact of DOT1L and H3K79me2 on the Asns gene locus during the revision (see the rationale of the experimental strategy also in the revision plan above). We hope we will thus provide a mechanistic link between epigenetics and altered metabolome.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Reviewer’s comment: Appiah et al. present a concise manuscript that provides details and possible mechanisms of their previous work (Franz et al., 2019; Ferrari et al., 2020). The study uses diverse lines of investigation to arrive at most conclusions. However, as interesting as the data is, we find that at the present state, it is not sufficient to prove that, indeed, the asparagine metabolism is regulated by DOTL1/PRC2 crosstalk. The neurogenic shift presented in the first part of the paper is not comprehensive and, therefore, not very convincing. The quality of images provided in the main and supplementary data is less than ideal. Additional data analysis and interpretation of the scRNA seq data may be needed. The authors finally conclude with rescue experiments done in culture and in-vivo, which we believe is the stand-out part of this study. Overall the manuscript has some interesting observations that are often over-interpreted with less supporting data. The manuscript reads well but requires additional data and changes in the claims/interpretation to be suited for publication.

      Response to Reviewer and planned revision:

      In the revised manuscript, we hope to address the comments and concerns raised by the reviewer in a satisfactory manner.

      Comments

      Reviewer’s comment: 1) Abstract: Is this statement correct: "DOT1L inhibition led to increased neurogenesis driven by a shift from asymmetric self-renewing to symmetric neurogenic divisions of APs". AP undergoes symmetric division for self-renewal and asymmetric neurogenic divisions.

      Response to Reviewer and planned revision:

      Based on the current literature (cit. Huttner and Kriegstein), AP undergo:

      • symmetric division for proliferative division at early stages of neurogenesis
      • asymmetric self-renewing division, generating an AP and a BP, at mid-neurogenesis. This division is also described as neurogenic, as it produces a BP, which is a step further than an AP in terms of neurogenic potential.
      • symmetric consumptive division at late neurogenesis

      To avoid any possible confusion, we will re-phrase the sentence to include the adjective “consumptive” and specify the composition of the progeny.
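
      To make the terminology concrete, the division modes listed above can be written as a small lookup from the identity of the two daughter cells to the name of the division. This is only an illustrative sketch: the labels "AP", "BP" and "N" (neuron), the function name and the example pairs are assumptions introduced here, not data from the manuscript.

```python
from collections import Counter

# Daughter-pair composition mapped to division mode, following the list above.
DIVISION_MODES = {
    ("AP", "AP"): "symmetric proliferative",
    ("AP", "BP"): "asymmetric self-renewing",
    ("N", "N"): "symmetric consumptive",
}

def classify_division(daughter_1, daughter_2):
    """Return the division mode for a pair of daughter identities.

    Pairs that are not covered by the list above are reported as "other".
    """
    pair = tuple(sorted((daughter_1, daughter_2)))
    return DIVISION_MODES.get(pair, "other")

# Hypothetical daughter pairs scored in one condition.
pairs = [("AP", "BP"), ("BP", "AP"), ("N", "N"), ("AP", "AP"), ("BP", "N")]
mode_counts = Counter(classify_division(a, b) for a, b in pairs)
```

      Sorting the pair before the lookup makes the classification independent of the order in which the two daughters are scored.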

      In the revised manuscript, the sentence will read as follows:

      "DOT1L inhibition led to increased neurogenesis driven by a shift of APs from asymmetric self-renewing (generating one AP and one BP) to symmetric consumptive divisions (generating two neurons)"

      Reviewer’s comment: All the data is based on treatments with EPZ (DOTL1 inhibitor), yet no information is shown to support its targeted activity in this system. A proof of principle in the chosen experimental system is missing; for instance, examining the activity or protein level of DOTL1 and decreased methylation of the target(s) is essential.

      Response to Reviewer and planned revision:

      EPZ is a well-characterized drug that has been used previously in our lab and by others.

      As for our lab, the information regarding the inhibitor, its activity and efficiency in inhibiting DOT1L towards H3K79me2 was shown in Franz et al. Supplementary Fig. S6 D, E.

      In the present manuscript, an additional confirmation that EPZ targets DOT1L in regard to its H3K79me2 activity is shown in Fig. 5D.

      We will refer to this information more explicitly in the revised manuscript.

      Reviewer’s comment: 2) Figure 1: The scoring of centrosomes and cilia is insufficient to conclude delamination and increase in basal fates. The effect could be on ciliogenesis or centrosome tethering to the apical end-feet of the AP, and other possible explanations for this observation also exist. The images are too small; larger images or graphic representations could be helpful in addition to the data.

      Response to Reviewer and planned revision:

      We did not intend to claim that the change in centrosome location demonstrates delamination, only that it suggests delamination. This criterion has been extensively used as a proxy for delamination by several labs working on the cell biology of neurogenesis, such as the Huttner and Götz labs. If the issue persists, we can re-phrase the text referring to Figure 1 more cautiously to highlight that the data only suggest delamination.

      Reviewer’s comment:

      To make a statement regarding delamination, I would like to see either the dynamics of delamination (organotypic slices images), staining with BP markers, or morphological changes of AP (staining that will reveal loss of adherence) or comparable data to support the observation. In my opinion Supp. Figure 1 is insufficient; the single image is not convincing; I would like to see 3D reconstruction and better-quality images.

      Response to Reviewer and planned revision:

      We can certainly provide better images and co-stain with relevant markers.

      We think that embarking on live imaging is beyond the scope of the manuscript, as we are not studying the dynamics of delamination per se.

      Reviewer’s comment: Tis21 data (1H), again of low quality, is only a single piece of evidence and the conclusion "suggesting that the acquisition of a basal fate was paralleled by a switch to neurogenesis" is premature. I think other cell cycle exit reporters, Fucci markers, pHis, BrdU, NeuroD, or Tbr2 reporters (Li et al., 2020, (Haydar and Sestan labs)) to name a few, are necessary to establish the conclusions. The authors should show other markers such as PAX6, EOMES, or other upper-layer markers upon cell cycle exit in the SVZ/CP. These additional experiments will assist in cell fate analysis.

      Response to Reviewer and planned revision:

      We completely understand the points raised by the reviewer, and we plan to address them by co-staining with PAX6/SOX2, PH3 and/or EOMES.

      We think establishing the Fucci or EOMES mouse system is beyond the scope of the manuscript. In addition, given the present setting of all labs involved, it would be logistically unattainable (see also comments in the section below).

      We think the co-staining scheme and plan will be informative enough to satisfactorily address the concerns raised by the reviewer.

      Reviewer’s comment: 2) Figure 2: The microinjection experiments are elegant; the images, however, do not complement the experiment. The images of the microinjected cells seem not to be reconstructed from z-stacked optical slices, so often, processes are not continuous (panel B, for example); therefore, it is not clear if an apical process is indeed missing or just not seen.

      Response to Reviewer and planned revision:

      The mentioned images are reconstructed from continuous Z-stacks, as we always do given the type of data. We can provide better reconstructions and/or additional images.

      Reviewer’s comment:

      The data analysis should include other parameters; BrdU staining could have given information on cell cycle exit, PAX6, SOX2, and EOMES on the location of the cells in the VZ/sVZ. The quality of images showing EOMES and TUBB3 staining is so low that it makes the reader doubt the validity of the quantifications. "Taken together, these data suggest that the inhibition of DOT1L might favor the acquisition of a neuronal over BP cell fate" This interpretation should be subjected to more investigations. It is possible that this treatment just accelerates the AP-> BP -> Neuronal fate. The author's claim needs to be backed by additional experiments or be changed.

      Response to Reviewer and planned revision:

      To address this point, we will include in the revised manuscript staining and co-staining with PAX6, SOX2 (see also response above) and provide a BrdU labeling experiment.

      Reviewer’s comment: 3) Figure 3: The experiment concept and its performance are impressive, yet the data is insufficient. The images in A that are supposed to be representative show two cells; their location is not clear, and the expression of GFP is not clear; in fact, both pairs seem to be GFP negative (not clear what is the threshold for background). Staining with anti-GFP and a second method to follow neurogenesis is necessary.

      Response to Reviewer and planned revision:

      We did use different staining methods and schemes to follow neurogenesis. As specified above, we will deepen our analysis by using additional markers, such as TBR1.

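      Reviewer’s comment: 4) On page 9, lines 8-10, the authors claim that their number of cells was "sufficient" for single-cell analysis; the numbers are <500 for all samples. The authors need to justify this statement or articles that carefully analyze the number required for such a conclusion as references.

      Response to Reviewer and planned revision: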

      In the revised manuscript, we will include an analysis of how many cells are needed to identify clusters of 6 cell types in this paradigm, based for example on the algorithms developed in Treppner et al. 2021.
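
      For orientation only, a much simpler back-of-the-envelope estimate than the approach of Treppner et al. 2021 is a binomial sampling argument: if a cell type makes up a given fraction of the tissue, how many cells must be sequenced to capture at least a minimum number of cells of that type with a chosen probability? The sketch below is that swapped-in calculation, not the cited algorithm. The function names, cluster fraction, minimum cell count and confidence level are illustrative assumptions.

```python
from math import comb

def prob_at_least(n_cells, cluster_fraction, min_cells):
    """Probability that at least `min_cells` of `n_cells` sampled cells
    belong to a cluster with frequency `cluster_fraction` (binomial model)."""
    p_fewer = sum(
        comb(n_cells, k)
        * cluster_fraction ** k
        * (1 - cluster_fraction) ** (n_cells - k)
        for k in range(min_cells)
    )
    return 1.0 - p_fewer

def cells_needed(cluster_fraction, min_cells, confidence=0.95, max_cells=100_000):
    """Smallest number of sampled cells for which the cluster is represented
    by at least `min_cells` cells with probability >= `confidence`."""
    for n in range(min_cells, max_cells + 1):
        if prob_at_least(n, cluster_fraction, min_cells) >= confidence:
            return n
    raise ValueError("no sample size up to max_cells reaches the requested confidence")

# Hypothetical example: a cluster making up 5% of cells, at least 10 cells wanted.
n_required = cells_needed(cluster_fraction=0.05, min_cells=10)
```

      This only asks whether enough cells of a rare type are sampled. Whether a clustering algorithm can then separate them depends on sequencing depth and on how distinct the transcriptomes are, which the binomial argument does not capture.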

      Reviewer’s comment: 5) The authors use Seurat and RaceID without their appropriate citations in the first mention during the results. The authors also stop immediately after DEG analysis along with clustering. The authors could analyze their RNA-seq data with a trajectory; to say the least, the identification/characterization of TTS and neurons as Neurons I, II, and III are insufficient. There could be multiple ways to show the "fate" of cells in the isolated FACS, which the authors have missed.

      Response to Reviewer and planned revision:

      We will include the respective citations in the revised manuscript. We already provide differentiation trajectories but will include other methods, such as scVelo or FateID, to extend the trajectory analyses. We kindly ask the reviewer to also refer to the comments above regarding the characterization of the TTS cluster, as part of our effort to provide a better picture of the different clusters.

      Reviewer’s comment: 6) The authors detected candidates like Fgfr3, Nr2f1, Ofd1, and Mme as part of their treated (different approaches) datasets (from their DEG analysis). They correctly cite Huang et al., 2020 but fail to give us a sense of the consequences of these gene dysregulations. The authors can also validate if these proteins are expressed in their treated cells.

      Response to Reviewer and planned revision:

      In the revised manuscript we will comment on the function of the four genes mentioned.

      In addition, we will validate the expression of these genes at the protein and transcript levels through immunostaining (provided that the antibodies work in our system) or smFISH, respectively.

      Reviewer’s comment: 7) The authors list a few GO terms (page 10, lines 1-10) and associate them with reduced proliferation; they must cite relevant studies. The authors can also add supplementary data showing which genes in their data correspond to these GO terms.

      Response to Reviewer and planned revision:

      We thank the reviewer for pointing out the missing citations.

      We of course agree on the need to add them, and we will do so in the revised manuscript.

      Reviewer’s comment: 8) On Page 11, lines 3-7, the authors describe their method to arrive at the 17 targets with TF activity from the previous analysis. Can the authors describe the method used to correlate the two? The reviewer understands this could be MEME analysis or analysis of earlier datasets of Ferrari et al. 2020. But it must be explicitly stated, and a few examples in supplementary need to be exemplified as this analysis is key to discovering the three metabolic genes.

      Response to Reviewer and planned revision:

      In the revised manuscript, we will clarify the exact analysis that resulted in the identification of the 17 target genes. It used a specific tool for gene network analysis and is based on our scRNA-seq data alone, not on the Ferrari et al. 2020 data set.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      n/a

      4. Description of analyses that authors prefer not to carry out

      Reviewer’s comment: Tis21 data (1H), again of low quality, is only a single piece of evidence and the conclusion "suggesting that the acquisition of a basal fate was paralleled by a switch to neurogenesis" is premature. I think other cell cycle exit reporters, Fucci markers, pHis, BrdU, NeuroD, or Tbr2 reporters (Li et al., 2020, (Haydar and Sestan labs)) to name a few, are necessary to establish the conclusions. The authors should show other markers such as PAX6, EOMES, or other upper-layer markers upon cell cycle exit in the SVZ/CP. These additional experiments will assist in cell fate analysis.

      Response to Reviewer and planned revision:

      As pointed out above, we think establishing the Fucci or EOMES mouse system is beyond the scope of the manuscript, as it will not provide more information than what we will obtain from systematic and extensive co-staining experiments. In addition, all labs involved are facing logistic issues (animal house not ready yet, construction works, etc.) that make importing and setting up the colony unattainable for the next 6-10 months. If the reviewer and/or the editorial board think this is a major point compromising the entire revision, we kindly ask them to contact us again so that we can discuss the issue and arrive at a shared conclusion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Appiah et al. present a concise manuscript that provides details and possible mechanisms of their previous work (Franz et al., 2019; Ferrari et al., 2020). The study uses diverse lines of investigation to arrive at most conclusions. However, as interesting as the data is, we find that at the present state, it is not sufficient to prove that, indeed, the asparagine metabolism is regulated by DOTL1/PRC2 crosstalk. The neurogenic shift presented in the first part of the paper is not comprehensive and, therefore, not very convincing. The quality of images provided in the main and supplementary data is less than ideal. Additional data analysis and interpretation of the scRNA seq data may be needed. The authors finally conclude with rescue experiments done in culture and in-vivo, which we believe is the stand-out part of this study. Overall the manuscript has some interesting observations that are often over-interpreted with less supporting data. The manuscript reads well but requires additional data and changes in the claims/interpretation to be suited for publication.

      Comments

      1. Abstract: Is this statement correct: "DOT1L inhibition led to increased neurogenesis driven by a shift from asymmetric self-renewing to symmetric neurogenic divisions of APs". AP undergoes symmetric division for self-renewal and asymmetric neurogenic divisions.

      All the data is based on treatments with EPZ (DOTL1 inhibitor), yet no information is shown to support its targeted activity in this system. A proof of principle in the chosen experimental system is missing; for instance, examining the activity or protein level of DOTL1 and decreased methylation of the target(s) is essential.

      2. Figure 1: The scoring of centrosomes and cilia is insufficient to conclude delamination and increase in basal fates. The effect could be on ciliogenesis or centrosome tethering to the apical end-feet of the AP, and other possible explanations for this observation also exist. The images are too small; larger images or graphic representations could be helpful in addition to the data.

      To make a statement regarding delamination, I would like to see either the dynamics of delamination (organotypic slices images), staining with BP markers, or morphological changes of AP (staining that will reveal loss of adherence) or comparable data to support the observation. In my opinion Supp. Figure 1 is insufficient; the single image is not convincing; I would like to see 3D reconstruction and better quality images.

      Tis21 data (1H), again of low quality, is only a single piece of evidence and the conclusion "suggesting that the acquisition of a basal fate was paralleled by a switch to neurogenesis" is premature. I think other cell cycle exit reporters, Fucci markers, pHis, BrdU, NeuroD, or Tbr2 reporters (Li et al., 2020, (Haydar and Sestan labs)) to name a few, are necessary to establish the conclusions. The authors should show other markers such as PAX6, EOMES, or other upper-layer markers upon cell cycle exit in the SVZ/CP. These additional experiments will assist in cell fate analysis.

      2. Figure 2: The microinjection experiments are elegant; the images, however, do not complement the experiment. The images of the microinjected cells seem not to be reconstructed from z-stacked optical slices, so often, processes are not continuous (panel B, for example); therefore, it is not clear if an apical process is indeed missing or just not seen. The data analysis should include other parameters; BrdU staining could have given information on cell cycle exit, PAX6, SOX2, and EOMES on the location of the cells in the VZ/sVZ. The quality of images showing EOMES and TUBB3 staining is so low that it makes the reader doubt the validity of the quantifications.

      "Taken together, these data suggest that the inhibition of DOT1L might favor the acquisition of a neuronal over BP cell fate" This interpretation should be subjected to more investigations. It is possible that this treatment just accelerates the AP-> BP -> Neuronal fate. The author's claim needs to be backed by additional experiments or be changed.

      3. Figure 3: The experiment concept and its performance are impressive, yet the data is insufficient. The images in A that are supposed to be representative show two cells; their location is not clear, and the expression of GFP is not clear; in fact, both pairs seem to be GFP negative (not clear what is the threshold for background). Staining with anti-GFP and a second method to follow neurogenesis is necessary.

      4. On page 9, lines 8-10, the authors claim that their number of cells was "sufficient" for single-cell analysis; the numbers are <500 for all samples. The authors need to justify this statement or articles that carefully analyze the number required for such a conclusion as references.

      5. The authors use Seurat and RaceID without their appropriate citations in the first mention during the results. The authors also stop immediately after DEG analysis along with clustering. The authors could analyze their RNA-seq data with a trajectory; to say the least, the identification/characterization of TTS and neurons as Neurons I, II, and III are insufficient. There could be multiple ways to show the "fate" of cells in the isolated FACS, which the authors have missed.

      6. The authors detected candidates like Fgfr3, Nr2f1, Ofd1, and Mme as part of their treated (different approaches) datasets (from their DEG analysis). They correctly cite Huang et al., 2020 but fail to give us a sense of the consequences of these gene dysregulations. The authors can also validate if these proteins are expressed in their treated cells.

      7. The authors list a few GO terms (page 10, lines 1-10) and associate them with reduced proliferation; they must cite relevant studies. The authors can also add supplementary data showing which genes in their data correspond to these GO terms.

      8. On Page 11, lines 3-7, the authors describe their method to arrive at the 17 targets with TF activity from the previous analysis. Can the authors describe the method used to correlate the two? The reviewer understands this could be MEME analysis or analysis of earlier datasets of Ferrari et al. 2020. But it must be explicitly stated, and a few examples in supplementary need to be exemplified as this analysis is key to discovering the three metabolic genes.

      Significance

      Appiah et al. present a concise manuscript that provides details and possible mechanisms of their previous work (Franz et al., 2019; Ferrari et al., 2020). The study uses diverse lines of investigation to arrive at most conclusions. However, as interesting as the data is, we find that in its present state, it is not sufficient to prove that asparagine metabolism is indeed regulated by DOT1L/PRC2 crosstalk. The neurogenic shift presented in the first part of the paper is not comprehensive and, therefore, not very convincing. The quality of images provided in the main and supplementary data is less than ideal. Additional data analysis and interpretation of the scRNA-seq data may be needed. The authors finally conclude with rescue experiments done in culture and in vivo, which we believe is the stand-out part of this study.

      Overall the manuscript has some interesting observations that are often over-interpreted with less supporting data. The manuscript reads well but requires additional data and changes in the claims/interpretation to be suited for publication.

    1. I like to think of thoughts as streaming information, so I don’t need to tag and categorize them as we do with batched data. Instead, using time as an index and sticky notes to mark slices of info solves most of my use cases. Graph notebooks like Obsidian think of information as batched data. So you have a set of notes (samples) that you try to aggregate, categorize, and connect. Sure there’s a use case for that: I can’t imagine a company wiki presented as streaming info! But I don’t think it aids me in how I usually think. When thinking with pen and paper, I prefer managing streamed information first, then converting it into batched information later— a blog post, documentation, etc.

      There's an interesting dichotomy between streaming information and batched data here, but it isn't well delineated and doesn't add much to the discussion as a result. Perhaps distilling it down may help? There's a kernel of something useful here, but it isn't immediately apparent.

      Relation to stock and flow or the idea of the garden and the stream?

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors consider the problem of inferring transcription dynamics from smFISH data. They distinguish between two important experimental situations. The first one considers measurements of mature mRNAs, while the second one considers measurements of nascent mRNA through fluorescent probes targeting PP7 stem loops. The former problem has been previously dealt with extensively, but less work has been done on the context of the latter. The inference approaches are based on maximum likelihood estimation, from which point estimates for promoter-switching and transcription rates are obtained. The study focuses on steady state measurements only. The authors perform several analyses using synthetic data to understand the limitations of both approaches. They find that inference from nascent mRNA is more reliable than inference from mature mRNA distributions. Moreover, they show that accounting for different cell-cycle stages (G1 vs G2) is important and that pooling measurements across the cell-cycle can lead to quantitatively and even qualitatively different inferences. Both approaches are then used to analyze transcription in an experimental system in yeast, for which they find evidence of gene dosage compensation. I consider this an interesting and relevant study, which will appeal to the systems- and computational biology community. The paper is well written and the (computational) methods are described in detail. The experimental description is quite minimal and could profit from further details / explanations. I have several technical criticisms and questions, which I believe should be addressed before publication. Since I am a theorist, I will comment predominantly on the statistical / computational aspects.

      Major comments/questions:

      -A key reference that is missing is Fritzsch et al. Mol Syst Biol (2018). In this work, the authors have used nascent mRNA distributions and autocorrelations (obtained from live-imaging) to infer promoter- and transcription dynamics. I believe this work should be appropriately cited and discussed.

      Synthetic case study:

      -Inference and point estimates. The authors use a maximum-likelihood framework to extract point estimates of the parameters. Subsequently, relative absolute differences are used to assess the accuracy of the inference. However, as far as I have understood, this is performed for only a single simulated dataset, for each considered parameter configuration. The resulting metric, however, does not really capture the inference accuracy, since it is based on a single (random) realization of the MLE. I would recommend to at least repeat the inference multiple times for different realizations of the simulated dataset (per parameter configuration) to get a better feeling of the distribution of the MLE (e.g., its bias / variance). Alternatively, identifiability analyses based on the Fisher information could be performed for (some of) the different parameter configurations although this may be computationally more demanding.
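
      The repeated-inference check suggested above can be sketched as follows. This is a minimal illustration only: it uses a toy exponential model with a closed-form MLE as a stand-in for the promoter-switching model, and the function names, sample size, and replicate count are assumptions chosen for the example, not the authors' pipeline.

```python
import random
import statistics

def simulate(rate, n, rng):
    """Draw n exponential samples with the given rate (toy stand-in model)."""
    return [rng.expovariate(rate) for _ in range(n)]

def mle_rate(samples):
    """Closed-form MLE of an exponential rate: n / sum(x)."""
    return len(samples) / sum(samples)

def mle_distribution(true_rate, n, replicates, seed=0):
    """Repeat simulate -> infer, then summarise bias and variance of the MLE."""
    rng = random.Random(seed)
    estimates = [mle_rate(simulate(true_rate, n, rng)) for _ in range(replicates)]
    bias = statistics.fmean(estimates) - true_rate
    variance = statistics.variance(estimates)
    return bias, variance

# Hypothetical settings: 20 observations per dataset, 5000 simulated datasets.
bias, variance = mle_distribution(true_rate=2.0, n=20, replicates=5000)
print(f"bias={bias:.3f} variance={variance:.3f}")
```

      Even in this simple case the MLE is biased upward at small sample sizes, which a single realization cannot reveal; the same loop could wrap any simulator and likelihood optimizer.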

      -It would be useful to include confidence intervals based on profile likelihoods also for the synthetic case study, in particular for the 6 reported datasets. I would also find it helpful to see comprehensive profile likelihood plots for the key results / parameter inferences in the supplement. This would also provide useful insights into the identifiability of the parameters.
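
      For readers unfamiliar with the procedure, a profile-likelihood confidence interval can be sketched as below. The example assumes a two-parameter normal model (mean as the parameter of interest, variance as the nuisance parameter) purely for illustration; the grid, the data, and the function names are hypothetical and unrelated to the manuscript's model.

```python
import math
import random

def profile_loglik_mu(data, mu):
    """Profile log-likelihood of mu for a normal model.

    The nuisance parameter sigma^2 is maximised out analytically:
    sigma2_hat(mu) = mean((x - mu)^2).
    """
    n = len(data)
    sigma2_hat = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)

def profile_ci(data, grid, threshold=3.841):
    """Grid values with 2 * (max - profile) <= chi-square(1 dof, 0.95) quantile."""
    profile = [profile_loglik_mu(data, mu) for mu in grid]
    best = max(profile)
    inside = [mu for mu, ll in zip(grid, profile) if 2 * (best - ll) <= threshold]
    return min(inside), max(inside)

rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(50)]   # synthetic data, illustration only
grid = [3.0 + 0.01 * i for i in range(401)]       # candidate values 3.0 .. 7.0
lo, hi = profile_ci(data, grid)
print(f"95% profile-likelihood interval for mu: [{lo:.2f}, {hi:.2f}]")
```

      A flat profile (an interval that runs to the edge of the grid) would indicate a poorly identifiable parameter, which is the insight the requested plots are meant to provide.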

      Experimental case study:

      -Validation against live-cell data. In the simulation of the autocorrelation function, what was the ratio of cells initialized in G1 / G2, respectively? I'd expect this to have direct influence on the simulated ACF. Moreover, a linear fit is used to correct for "non-stationary effects" in the ACF that supposedly stem from cell-cycle dynamics. First, I don't think this terminology is really accurate, since non-stationarity would lead to an ACF that depends on two parameters (tau_1 and tau_2). I suppose the goal of the linear correction is to remove slow / static population heterogeneity? If yes, wouldn't it be easier / more direct to also change the simulations to non-synchronized cell-cycles? In this case, they should also display the very slow / static components as displayed in the data, which would eliminate the need for the post-hoc correction. I was also wondering whether other statistics (e.g., mean, variance, distributions) match between the simulations and the live-cell experiment? This could provide further validation of the inferred parameters.
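
      The point about static population heterogeneity can be illustrated with a small simulation. This is a sketch under stated assumptions: each cell is modelled as AR(1) fluctuations around its own fixed mean level, which is a toy model and not the simulation used in the manuscript; all names and parameter values are hypothetical.

```python
import random

def simulate_cell(mean, n_steps, rng, phi=0.8, noise_sd=1.0):
    """AR(1) fluctuations (started at 0) around a cell-specific mean level."""
    x, trace = 0.0, []
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, noise_sd)
        trace.append(mean + x)
    return trace

def pooled_autocov(traces, lag):
    """Autocovariance at one lag, pooled over cells around the global mean."""
    values = [v for t in traces for v in t]
    global_mean = sum(values) / len(values)
    total, count = 0.0, 0
    for t in traces:
        for i in range(len(t) - lag):
            total += (t[i] - global_mean) * (t[i + lag] - global_mean)
            count += 1
    return total / count

rng = random.Random(2)
cell_means = [rng.gauss(10.0, 2.0) for _ in range(200)]   # static cell-to-cell differences
traces = [simulate_cell(m, 300, rng) for m in cell_means]

# At a lag far beyond the AR(1) correlation time, the pooled autocovariance
# does not decay to zero; it plateaus near the variance of the cell means.
plateau = pooled_autocov(traces, lag=100)
print(f"autocovariance at lag 100: {plateau:.2f}")
```

      In this toy setting the offset appears directly in the simulated curve, so no post-hoc correction is needed when the simulation includes the same heterogeneity as the data.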

      -If I understood correctly, the signal intensity of the measured transcription spot is normalized by the median cytoplasmic spot brightness. Since the normalized intensity of a single complete transcript is 1, the cumulative intensity should give a lower bound on the nascent mRNAs. The histograms in Fig. 4b show intensity values in the range of 30, which would mean that at least 30 transcripts contribute to the transcription spot. The total number of nucleoplasmic and cytoplasmic mRNA, however, is in the range of 10 (Fig. 3a). I am probably missing something but how can we reconcile these numbers? The authors mention that the brightest spot just counts for one transcript, but argue that this has negligible influence on mature RNA counts. Could this be a possible explanation for the mismatch?

      Minor comments:

      -In the experimental case study, the authors argue that the "correct" inference result is the one that accounts for cell-cycle stage, while the other one is termed "incorrect". I find this terminology too strong, since every estimate is subject to uncertainty.

      -Page 2: "... in a asynchronous population" -> "... in an asynchronous population"

      -Page 7: "...parameters sets 3 and 4" -> "...parameter sets 3 and 4"

      -Figures 5a and 6a: parameter names and units should go on the y-axis.

      Significance

      Quantifying kinetic parameters from incomplete and noisy experimental data is a core problem in systems biology. I therefore consider this manuscript to be very relevant to this field. The contribution of this manuscript is largely methodological, although its potential usefulness is demonstrated using experimental data in yeast.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In recent years, the field has investigated crosstalk between cGMP and cAMP signaling (PMID: 29030485), lipid and cGMP signaling (PMID: 30742070), and calcium and cGMP signaling (PMID: 26933036, 26933037). In contrast to the Plasmodium field, which has benefited from proteomic experiments (ex: PMID 24594931, 26149123, 31075098, 30794532), second messenger crosstalk in T. gondii has been probed predominantly through genetic and pharmacological perturbations. The present manuscript compares the features of A23187- and BIPPO-stimulated phosphoproteomes at a snapshot in time. This is similar to a dataset generated by two of the authors in 2014 (PMID: 24945436), except that it now includes one BIPPO timepoint. The sub-minute phosphoproteomic timecourse following A23187 treatment in WT and ∆cdpk3 parasites is novel and would seem like a useful resource.

      CDPK3-dependent sites were detected on adenylate cyclase, PI-PLC, guanylate cyclase, PDE1, and DGK1. This motivated study of lipid and cNMP levels following A23187 treatment. The four PDEs determined to have A23187-dependent phosphosites were characterized, including the two PDEs with CDPK3-dependent phosphorylation, which were found to be cGMP-specific. However, cGMP levels do not seem to differ in a CDPK3- or A23187-dependent manner. Instead, cAMP levels are elevated in ∆cdpk3 parasites. This would seem to implicate a feedback loop between CDPK3, the adenylyl cyclase, and PKA/PKG: CDPK3 activity reduces adenylyl cyclase activity, which reduces PKA activity, which increases PKG activity. The authors don't pursue this direction, and instead characterize PDE2, which does not have CDPK3-dependent phosphosites and seems out of place in the study.

      Response:

      We agree with reviewer 1 that a feedback loop between CDPK3, the adenylyl cyclase and PKA/PKG is certainly one of several possibilities (and we acknowledge this in the manuscript).

      We felt, however, that given the observation that A23187 and BIPPO treatment leads to phosphorylation of numerous PDEs (hinting at the presence of a Ca2+-regulated feedback loop), it was entirely relevant to study these in greater detail. Coupled with the A23187 egress assay on ΔPDE2 parasites, our findings suggest that PDE2 plays an important role in this signalling loop (an entirely novel finding). While PDE2 appears to exert its effects in a CDPK3-independent manner (indeed suggesting that CDPK3 might exert its effects on cAMP levels in a different fashion), this does not detract from the important finding that PDE2 is one of the (likely numerous) components that is regulated in a Ca2+-dependent feedback loop to regulate egress.

      We have modified our writing to better reflect the fact that our decision to pursue study of the PDEs was not solely CDPK3-centric.

      While we feel that our reasoning for studying the PDEs is solid, we appreciate that further clarification on the putative CDPK3-Adenylate cyclase link would make it easier for the reader to follow the rationale.

      We have not studied the direct link between CDPK3 and the Adenylate Cyclase β in more detail, as ACβ alone was shown to not play a major role in regulating lytic growth (Jia et al., 2017).

      **MAJOR COMMENTS**

      1. Some of the key conclusions are not convincing.

      The data presented in Figure 6E, F, and G and discussed in lines 647-679 are incongruent. In Figure 6E, the plaques in the PDE2+RAP image are hardly visible; how can it be that the plaques were accurately counted and determined not to differ from vehicle-treated parasites?

      Are the images in 6E truly representative? Was the order of PDE1 and PDE2 switched? The cited publication by Moss et al. 2021 (preprint) is not in agreement with this study, as stated. That preprint determined that parasites depleted of PDE2 had significantly reduced plaque number and plaque size (>95% reduction); and parasites depleted of PDE1 had a substantially reduced plaque size but a less substantial reduction in plaque number.

      Response:

      The plaques for PDE2+RAP were counted using a microscope since they are difficult to see by eye. We thank the reviewer for detecting our incorrect reference to Moss et al. (2021). This has been corrected in the text. We confirm, however, that the images in 6E are representative of what we observed and do indeed differ from what was seen by Moss et al. We have acknowledged this clearly in the text.

      The differences cannot easily be explained other than by the different genetic systems used. Further studies of the individual PDEs will likely illuminate their role in invasion/ growth, but we feel this would be beyond the scope of this study.

      Unfortunately, the length of time required for PDE depletion (72h) is incompatible with most T. gondii cellular assays (typically performed within one lytic cycle, 40-48h). Although the authors performed the assays 3 days after initial RAP treatment, is there evidence that non-excised parasites don't grow out of the population? This should be straightforward to test: treat, wait 3 days, infect onto monolayers, wait 24-48h, fix, and stain with anti-YFP and an anti-Toxoplasma counterstain. The proportion of the parasite population that had excised the PDE at the time of the cellular assays will then be known, and the reader will have a sense of how complete the observed phenotypes are. As a reader, I will regard the phenotypes with some level of skepticism due to the long depletion time, especially since a panel of PDE rapid knockdown strains (depletion in

      Response:

      1. Cellular assays using KO parasites are commonly performed at the point at which protein depletion is detected. Both our western blots and plaque assay results demonstrate that, at the point of assay, there is no substantial outgrowth of non-excised parasites. The original manuscript also includes PCRs performed at the 72 hr time point (See Fig. 6B) to support this.
      2. We appreciate the reviewer’s comment re the panel of PDE KD strains. The reviewer notes that there are substantial limitations to conditional KO systems, which similarly apply to KD systems; there are notable pros and cons to each approach. When designing our strategy (pre-publication of Moss et al., 2022), we made a deliberate decision to use conditional KO strains in light of the fact that residual protein levels in KD systems can cause significant problems, particularly for membrane proteins (all of the investigated PDEs have a transmembrane domain). Tagging of proteins with the degradation domain can cause further issues, leading to protein mis-localisation, which we have experienced with several unrelated proteins in the lab.

      2. The authors should qualify some of their claims as preliminary or speculative, or remove them altogether.

      The claims in lines 240-260 are confusing. It seems likely that the two drug treatments have at least topological distinctions in the signaling modules, given that cGMP-triggered calcium release is thought to occur at internal stores, whereas A23187-mediated calcium influx likely occurs first at the parasite plasma membrane. The authors' proposed alternative, that treatment-specific phosphosite behavior arises from experimental limitations and "mis-alignment", is unsatisfying for the following reasons: (1) From the outset, the authors chose different time frames to compare the two treatments (15s for BIPPO vs. 50s for A23187); (2) the experiment comprises a single time point, so it does not seem appropriate to compare the kinetics of phosphoregulation. There is still value in pointing out which phosphosites appear treatment-specific under the chosen thresholds, but further claims on the basis of this single-timepoint experiment are too speculative. Lines 264-267 and 281-284 should also be tempered.

      Relatedly, graphing of the data in Figure 1G (accompanying the main text mentioned above) was confusing. Why is one axis a ratio, and the other log10 intensity? What does log10 intensity tell you without reference to the DMSO intensity? Wouldn't you want the L2FC(A23187) vs. L2FC(BIPPO) comparisons? Could you use different point colors to highlight these cases on plot 1E? Additionally, could you use a pseudocount to include peptides only identified in one treatment condition on the plot in 1E? (Especially since these sites are mentioned in lines 272-278 but are not on the plot)

      Response:

      1. The kinetics of the responses to A23187 and BIPPO are very different. This is why treatment timings are purposely different: they were selected to align pathways to a point where calcium levels peak just prior to calcium re-uptake. We make no mention of kinetic comparisons, and merely demonstrate that at the chosen timepoints, overall signalling correlation is very high. The observation that most of the sites that behave differently between conditions sit remarkably close to the threshold for differential regulation (in the treatment condition where they are not DR - see Fig. 1G) led us to speculate that many of these sites are likely on the cusp of differential regulation. While it is entirely possible that some of these differences are, in fact, treatment specific (and we clearly acknowledge this in the text), we simply state that we cannot confidently discern clear signalling features that allow us to distinguish between the two treatments. We feel that this is an entirely relevant observation given the observed preponderance of both A23187 and BIPPO-dependent DR phosphosites on proteins in the PKG signalling pathway (as current models place this upstream of Ca2+ release).
      2. Log10 intensity only serves to spread the data for easier visualisation. The only comparison being made relates to the LFCs. Fig. 1Gi shows the LFC scores (x axis) for all sites regulated following A23187 treatment (for which peptides were also identified in BIPPO treatment). On this plot we have highlighted the sites that are differentially regulated following BIPPO but not A23187 treatment (with red showing the DRup and blue showing the DRdown sites). This demonstrates that many of the sites that are regulated following BIPPO but not A23187 treatment cluster close to the threshold for differential regulation in the A23187 dataset - suggesting that many of these sites are likely on the cusp of differential regulation. Fig. 1Gii shows the reverse. While we could highlight the above-mentioned sites on the plot in Fig. 1E, we do not feel that it would demonstrate our point as clearly.

      We feel that including a pseudocount on Fig. 1E for peptides lacking quantification in one treatment condition would be visually misleading as the direct correlation being made in Fig. 1E is BIPPO vs A23187 treatment. The sites mentioned in lines 272-278 in the original manuscript (now lines 268-276) are available in the supplement tables.

      3. Additional experiments would be essential to support the main claims of the paper.

      Genetic validation is necessary for the experiments performed with the PKA inhibitor H89. H89 is nonspecific even in mammalian systems (PMID: 18523239) and in this manuscript it was used at a high concentration (50 µM). The heterodimeric architecture of PKA in apicomplexans dramatically differs from the heterotetrameric enzymes characterized in metazoans (PMID: 29263246), so we don't know what the IC50 of the inhibitor is, or whether it inhibits competitively. Two inducible knockdown strains exist for PKA C1 (PMID: 29030485, 30208022). The authors could request one of these strains and construct a ∆cdpk3 in that genetic background, as was done for the PDE2 cKO strain. Estimated time: 3-4 weeks to generate strain, 2 weeks to repeat assays.

      Response:

      1. While we appreciate that H89 is not 100% specific for PKA, this is not our only line of evidence that cAMP levels are altered. We demonstrate that cAMP levels are elevated in CDPK3 KO parasites – further substantiating our finding.

      The H89 concentration used in our experiment is in keeping with/lower than the concentrations used in other Toxoplasma publications (Jia et al., 2017), and both the Toxoplasma and Plasmodium fields have shown convincingly that H89 treatment phenocopies cKD/cKO of PKA (see Jia et al., 2017; Flueck et al., 2019).

      While we agree that the genetic validation suggested by reviewer 1 would serve to further support our findings (though it would not provide further novel insights), the suggested time frame for experimental execution was not realistic. Line shipment, strain generation, subcloning and genetic validation would take substantially longer than 3-4 weeks.

      cGMP levels are found to not increase with A23187 treatment, which is at odds with a previous study (lines 524-560). The text proposes that the differences could arise from the choice of buffer: this study used an intracellular-like Endo buffer (no added calcium, high potassium), whereas Stewart et al. 2017 used an extracellular-like buffer (DMEM, which also contains mM calcium and low potassium). An alternative explanation is that 60 s of A23187 treatment does not achieve a comparable amount of calcium flux as 15 s of BIPPO treatment, and a calcium-dependent effect on cGMP levels, were it to exist, could not be observed at the final timepoint in the assay. The experiments used to determine the kinetics of calcium flux following BIPPO and A23187 treatments (Fig. 1B, C) were calibrated using Ringer's buffer, which is more similar to an extracellular buffer (mM calcium, low potassium). In this buffer, A23187 treatment would likely stimulate calcium entry from across the parasite plasma membrane, as well as across the membranes of parasite intracellular calcium stores. By contrast, A23187 treatment in Endo buffer (low calcium) would likely only stimulate calcium release from intracellular stores, not calcium entry, since the calcium concentration outside of the parasite is low. Because calcium entry no longer contributes to calcium flux arising from A23187 treatment, it is possible that the calcium fluxes of A23187-treated parasites at 60 s are "behind" BIPPO-treated parasites at 15 s. The researchers could control these experiments by *either* (i) performing the cNMP measurements on parasites resuspended in the same buffer used in Figure 1B, C (Ringer's) or (ii) measuring calcium flux of extracellular parasites in Endo buffer with BIPPO and A23187 to determine the "alignment" of calcium levels, as was done with intracellular parasites in Figure 1C. No new strains would have to be generated and the assays have already been established in the manuscript. 
Estimated time to perform control experiments with replicates: 2 weeks. This seems like an important control, because the interpretation of this experiment shifts the focus of the paper from feedback between calcium and cGMP signaling, which had motivated the initial phosphoproteomics comparisons, to calcium and cAMP signaling. Further, the lipidomics experiments were performed in an extracellular-like buffer, DMEM, so it's unclear why dramatically different buffers were used for the lipidomics and cNMP measurements.

      Response:

      While the initial calibration experiments to measure calcium flux were indeed performed in Ringer’s buffer, the parasites were intracellular. We therefore chose to measure cNMP concentrations of extracellular parasites syringe lysed in Endo buffer, which is better at mimicking intracellular conditions than any other described buffer.

      As the reviewer suggested, we measured the calcium flux of extracellular parasites in Endo buffer upon stimulation with either A23187 or BIPPO.

      We found that peak calcium response to BIPPO in Endo buffer was similar to that of intracellular parasites (~15 seconds post treatment) (see Supp Fig. 6A). Upon treatment with A23187, extracellular parasites in Endo buffer had a much faster response compared to their intracellular counterparts, with peak flux measured at ~25 seconds post treatment (see Supp Fig. 6B). This does indeed suggest that extracellular parasites in Endo buffer respond differently to A23187 compared to their intracellular counterparts. However, peak calcium response is still occurring within the experimental time course and is not being missed, as the reviewer worries. Moreover, since we are able to detect increased cAMP levels in A23187-treated parasites, Ca2+ flux appears sufficient to alter cNMP signalling.

      We did notice, however, that the intensity of the calcium flux was much weaker in Endo buffer compared to intracellular parasites (see Supp Fig. 6B). We found that this was due to the lack of host-derived Ca2+, since supplementation of Endo buffer with 1 µM CaCl2 restored the intensity of the calcium response to match that of intracellular parasites (see Supp Fig. 6C). We therefore decided to repeat our cGMP measurements, this time using extracellular parasites in Endo buffer supplemented with 1 µM CaCl2. However, we found no differences in cGMP levels in the response to ionophore under these conditions (now Supp Fig. 6D) compared to the previous experiments, so the conclusions from the previous data do not change.

      As for the lipidomics experiments, we chose to use DMEM so that our dataset could be compared with other published lipidomic datasets (Katris et al., 2020; Dass et al., 2021) where DMEM was also used as a buffer when measuring global lipid profiles of parasites.

      We now acknowledge in the paper that Endo buffer has its shortcomings, and that this could be the reason why we do not detect changes in cGMP concentrations. We do, however, believe that Endo buffer is the best alternative to intracellular parasites, and that this choice is supported by its consistent use in numerous publications studying Toxoplasma signalling (McCoy et al., 2012; Stewart et al., 2017).

      Additional information is required to support the claim that PDE2 has a moderate egress defect (lines 681-687). T. gondii egress is MOI-dependent (PMID: 29030485). Although the parasite strains were used at the same MOI, there is no guarantee that the parasites successfully invaded and replicated. If parasites lacking PDE2 are defective in invasion or replication, the MOI is effectively decreased, which could explain the egress delay. Could the authors compare the MOIs (number of vacuoles per host cell nuclei) of the vehicle and RAP-treated parasites at t = 0 treatment duration to give the reader a sense of whether the MOIs are comparable?

      Response:

      Since PDE2 KO parasites have a substantial growth defect, we did notice that starting MOIs were consistently lower for the RAP-treated samples compared to the DMSO-treated samples. However, this was also the case for PDE1 KO parasites where we did not see an egress delay. We also found that the egress delay was still evident for ∆CDPK3 parasites, despite having higher starting MOIs than WT parasites in our experiments. Therefore there does not appear to be a link between starting MOIs and the egress delay.

      To be sure of our results, we also performed egress assays where we co-infected HFFs with mCherry-expressing WT parasites (WT ∆UPRT) and either GFP-expressing PDE2 cKO parasites (treated with DMSO or RAP) or ∆CDPK3 parasites. This recapitulated our previous findings, confirming that deletion of PDE2 leads to a delay in A23187-mediated egress.

      4. A few references are missing to ensure reproducibility.

      The manuscript states that the kinetic lipidomics experiments were performed with established methods, but the cited publication (line 497) is a preprint. These are therefore not peer reviewed and should be described in greater detail in this manuscript, including any relevant validation.

      Response:

      We thank the reviewer for pointing this out. We have included a greater description of the methods used in the materials and methods section such that the experiment is reproducible, as per the reviewer’s suggestion. We decided to still make mention of the BioRxiv preprint since we thought it was appropriate for the reader to be informed of ongoing developments in the field.

      Please cite the release of the T. gondii proteomes used for spectrum matching (lines 972-973).

      Response:

      We have included this as per the reviewer’s suggestion.

      Please include the TMT labeling scheme so the analysis may be reproduced from the raw files.

      Response:

      We have included this as per the reviewer’s suggestion in Supp Fig. 3A.

      5. Statistical analyses should be reviewed as follows:

      Have the authors examined the possibility that some changes in phosphopeptide abundance reflect changes in protein abundance? This may be particularly relevant for comparisons involving the ∆cdpk3 strain. Did the authors collect paired unenriched proteomes from the experiments performed? Alternatively, there may be enriched peptides that did not change in abundance for many of the proteins that appear dynamically phosphorylated.

      Response:

      We did not collect unenriched proteomes from the experiments performed (although we did perform unenriched mixing checks to ensure equal loading between samples), and believe that this wasn’t a necessity for the following reasons:

      1. For within-line treatment analyses, treatment timings are so short (a maximum of 15-50s in the single timepoint experiment) that it would be unlikely to detect substantial changes in protein abundance. Moreover, these unlikely events would affect all phosphosites across a protein, and therefore be detectable.

      2. In our CDPK3 dependency timecourse experiments, we normalise both the WT and ∆CDPK3 strain to 0s, and measure signalling progression over time. Therefore, any differences at timepoints other than 0s do not originate from basal differences. We also see a consistent increase/decrease in phosphosite detection across the sub-minute timecourse, further confirming that the observed changes are truly down to dynamic changes in phosphorylation and not protein levels.

      3. In the single timepoint CDPK3 dependency analyses (44 regulated sites identified, Data S2), we acknowledge that there could be some risk of altered starting protein abundance between lines. However, if protein abundance were responsible for the changes in phosphosite detection, we would expect all phosphosites across the protein to shift, and we do not observe this. Moreover, when we look at these CDPK3 dependent proteins and compare their phosphosite abundance in untreated WT and ∆CDPK3 lines, we find that for each protein, either all or the majority of phosphosites detected are unchanged (highlighting that there is no substantial difference in this protein’s abundance between lines). Where there are phosphosite differences between lines, these are only ever on single sites on a protein while most other sites are unchanged - implying that these are changes to basal phosphorylation states and not protein levels.
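
      The normalisation to 0s described above can be illustrated with a minimal sketch. The intensities are made up for illustration and the function name is hypothetical; this is not the authors' analysis pipeline, only an example of why a per-line baseline removes basal differences between strains.

```python
import math

def log2_fold_change_vs_t0(timecourse):
    """Return log2(intensity / intensity at 0 s) for each timepoint.

    `timecourse` maps time in seconds to a phosphosite intensity.
    """
    baseline = timecourse[0]
    return {t: math.log2(v / baseline) for t, v in sorted(timecourse.items())}

# Made-up intensities for a single phosphosite; not data from the manuscript.
wt = {0: 100.0, 15: 200.0, 30: 400.0, 60: 800.0}
ko = {0: 50.0, 15: 50.0, 30: 100.0, 60: 200.0}  # lower basal level in the mutant

wt_fc = log2_fold_change_vs_t0(wt)
ko_fc = log2_fold_change_vs_t0(ko)

# Both lines start at 0 by construction, so differences at later
# timepoints reflect signalling progression, not basal abundance.
for t in wt_fc:
    print(t, wt_fc[t], ko_fc[t], wt_fc[t] - ko_fc[t])
```

      In this toy example the mutant starts at half the WT intensity, yet after normalisation the only remaining difference between the lines is the delayed rise in the mutant.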

      It seems like for Figs. 3B and S5 the maximum number of clusters modeled was selected. Could the authors provide a rationale for the number of clusters selected, since it appears many of the clusters have similar profiles.

      Response:

      The number of clusters is chosen automatically by the Mclust algorithm as the value that maximizes the Bayesian Information Criterion (BIC). BIC in effect balances gains in model fit (increasing log-likelihood) against increasing the number of parameters (i.e. number of clusters).
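
      As a minimal sketch of this trade-off, the snippet below scores a set of hypothetical model fits using the sign convention documented for mclust (BIC = 2 × log-likelihood − number of parameters × log(n), larger is better). The log-likelihoods, parameter counts and function names are invented for illustration; in practice Mclust computes these values from the fitted mixture models.

```python
import math

def bic_mclust(log_likelihood, n_params, n_obs):
    """BIC in the sign convention used by mclust: larger is better."""
    return 2.0 * log_likelihood - n_params * math.log(n_obs)

def select_n_clusters(fits, n_obs):
    """Pick the cluster count with the highest BIC.

    `fits` maps number of clusters -> (log-likelihood, number of free parameters).
    """
    scores = {g: bic_mclust(ll, k, n_obs) for g, (ll, k) in fits.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical fits: the log-likelihood improves with more clusters, but with
# diminishing returns, while the parameter count keeps growing.
fits = {1: (-1500.0, 2), 2: (-1300.0, 5), 3: (-1290.0, 8), 4: (-1288.0, 11)}
best, scores = select_n_clusters(fits, n_obs=500)
print(best, scores)
```

      With these invented numbers, the fourth cluster improves the fit too little to pay for its extra parameters, so the penalised score peaks at a smaller model.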

      Please include figure panel(s) relating to gene ontology. Relevant information for readers to make conclusions includes p-value, fold-enrichment or gene ratio, and some sort of metric of the frequency of the GO term in the surveyed data set. See PMID: 33053376 Fig. 7 and PMID: 29724925 Fig. 6 for examples or enrichment summaries. Additionally, in the methods, specify (i) the background set, (ii) the method used for multiple test correction, (iii) the criteria constituting "enrichment", (iv) how the T. gondii genome was integrated into the analysis, (v) the class of GO terms (molecular function, biological process, or cellular component), (vi) any additional information required to reproduce the results (for example, settings modified from default).

      Response:

      We have included the additional information requested in the materials and methods.

      We purposely did not include GO figure panels as our analyses are being done across many clusters, making it very difficult to display this information cohesively. We have included all data in Tables S2-S5. These tables include all the relevant information on p-value, enrichment status, ratio in study/ratio in population, class of GO terms etc.
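
      For readers unfamiliar with the quantities listed in these tables, the sketch below shows how a "ratio in study/ratio in population" and an enrichment p-value can be computed for a single GO term. It assumes a one-sided hypergeometric test, which is a common choice but is not confirmed here as the authors' method; the gene counts and the function name are hypothetical, and multiple-test correction is omitted.

```python
from math import comb

def go_enrichment(study_hits, study_size, pop_hits, pop_size):
    """Fold enrichment and one-sided hypergeometric p-value for one GO term."""
    ratio_in_study = study_hits / study_size
    ratio_in_pop = pop_hits / pop_size
    fold = ratio_in_study / ratio_in_pop
    # P(X >= study_hits) when drawing study_size genes without replacement
    # from a population in which pop_hits genes carry the term.
    upper = min(study_size, pop_hits)
    total = comb(pop_size, study_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, study_size - k)
        for k in range(study_hits, upper + 1)
    )
    return fold, tail / total

# Hypothetical counts: 5 of 10 genes in a cluster carry the term,
# against 10 of 100 genes in the background set.
fold, p_value = go_enrichment(study_hits=5, study_size=10, pop_hits=10, pop_size=100)
print(fold, p_value)
```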

      The presentation of the lipidomics experiments in Figure 4A-C is confusing. First, the ∆cdpk3/WT ratio removes information about the process in WT parasites, and it's unclear why the scale centers on 100 and not 1. Second, the data in Figure S6 suggests a more modest effect than that represented in Fig. 4; is this due to day to day variability? How do the authors justify pairing WT and mutant samples as they did to generate the ratios?

      Response:

      This is a common strategy used by many metabolomics experts (Bailey et al., 2015; Dass et al., 2021; Lunghi et al., 2022). We had originally chosen to represent the data as a ratio since this form of representation helps get rid of the variability that arises between experiments and allows us to see very clear patterns which would otherwise go unnoticed. This variability arises from the amount of lipids in each sample which varies between parasites in a dish, the batch of FBS and DMEM used, and the solutions and even room temperature used to extract lipids on a given day.

      However, we agree with the reviewer that depicting the data in Figure 4A-C as a ratio of ∆CDPK3/WT parasites can be confusing, so we have now changed the graphs, plotting WT and ∆CDPK3 levels instead, and have moved the ratio of ∆CDPK3/WT to the Supplementary Figure 5.
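
      The difference between the two representations can be sketched as follows. The lipid amounts are invented, "KO" stands in for ∆CDPK3, and expressing the paired ratio as a percentage of WT is an assumption made for this example rather than a statement of how the original figure was scaled.

```python
def mol_percent(amounts):
    """Convert absolute lipid amounts (e.g. nmol) to mol% of the sample total."""
    total = sum(amounts.values())
    return {lipid: 100.0 * value / total for lipid, value in amounts.items()}

# Made-up amounts from two hypothetical experiments (days); the second day
# yields twice as much lipid overall, mimicking batch-to-batch variability.
experiments = [
    {"WT": {"DAG": 2.0, "PL": 90.0, "TAG": 8.0},
     "KO": {"DAG": 1.0, "PL": 91.0, "TAG": 8.0}},
    {"WT": {"DAG": 4.0, "PL": 180.0, "TAG": 16.0},
     "KO": {"DAG": 2.0, "PL": 182.0, "TAG": 16.0}},
]

results = []
for exp in experiments:
    wt = mol_percent(exp["WT"])
    ko = mol_percent(exp["KO"])
    # Paired mutant/WT ratio within one experiment, as a percentage of WT.
    ratio_pct = 100.0 * ko["DAG"] / wt["DAG"]
    results.append((wt["DAG"], ko["DAG"], ratio_pct))
    print(wt["DAG"], ko["DAG"], ratio_pct)
```

      Both mol% and the within-experiment ratio cancel the day-to-day difference in total lipid yield, but only the mol% values retain the information about what happens in WT parasites.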

      The significance test seems to be performed on the difference between the WT and ∆cdpk3 strains, but not relative to the DMSO treatment? Wouldn't you want to perform a repeated measures ANOVA to determine (i) if lipid levels change over time and (ii) if this trend differs in WT vs. mutant strain?

      Response:

      The reviewer correctly points out that ANOVA is often used for time courses, but it is not always strictly appropriate since it can overlook the purpose of the individual experimental design, which in this case is 1) to investigate the role of CDPK3 compared to the WT parental strain, and 2) specifically to find the exact point at which DAG begins to change after stimulus, to match the proteomics time course.

      Our data are clearly biased towards earlier time points (0, 5, 10, 30 and 45 seconds), where DAG levels are mostly unchanged, compared to the single 60-second timepoint, which shows a significant difference in DAG using our method of statistical comparison by paired two-tailed t-test. It would therefore be unwise to use ANOVA when we really want to see when the A23187 stimulus takes effect, which appears to be after the 45-second mark. Analysing the data by ANOVA would likely give a false negative result, where the result is non-significant but there is clearly more DAG in WT than ∆CDPK3 after 60 seconds. T-tests are commonly used when comparing the same cell lines grown in the same conditions with a test/treatment, and in this case the test/treatment is CDPK3 present or absent (Lentini et al., 2020).
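
      For reference, a paired t statistic at a single timepoint can be computed as in the sketch below. The values are made up, "KO" stands in for ∆CDPK3, and the function name is hypothetical. Only the t statistic and degrees of freedom are computed; the two-tailed p-value would then be obtained from the t distribution, for example with a statistics library such as SciPy's `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev

def paired_t(wt, ko):
    """Paired t statistic and degrees of freedom for matched replicates."""
    diffs = [a - b for a, b in zip(wt, ko)]
    n = len(diffs)
    # Mean of the within-pair differences divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Made-up DAG mol% values at one timepoint; each position is one experiment,
# so WT and mutant values at the same index were measured together.
wt_60s = [3.0, 4.0, 5.0]
ko_60s = [2.0, 3.0, 3.0]
t_stat, dof = paired_t(wt_60s, ko_60s)
print(t_stat, dof)
```

      Pairing by experiment means that day-to-day shifts affecting both lines equally cancel in the differences, which is the same rationale given above for presenting paired ratios.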

      In the main text, it would be preferable to see the data presented as the proteomics experiments were in Figure 4B and 4C, with fold changes relative to the DMSO (t = 0) treatment, separately for WT and ∆cdpk3 parasites.

      Response:

      We have now changed the way that we represent the data, plotting %mol instead of the ratio.

      Signaling lipids constitute small percentages of the overall pool (e.g. PMID: 26962945), so one might not necessarily expect to observe large changes in lipid abundance when signaling pathways are modulated. Is there any positive control that the authors could include to give readers a sense of the dynamic range? Maybe the DGK1 mutant (PMID: 26962945)?

      Response:

      DGK1 is perhaps not a good example because the DGK1 KO parasites effectively “melt” from a lack of plasma membrane integrity (Bullen et al., 2016), so this would likely be technically challenging. We don’t see the added value in including an additional mutant control since we can already see the dynamic change over time from no difference (0 seconds) to a significant difference (60 seconds) between WT and ∆CDPK3 for DAG and most other lipids. In the sub-minute timecourses we can also clearly see whether or not lipids change at the specific points at which the A23187 is added (0-5 seconds), the parasites acclimatise and the A23187 takes effect (10-30 seconds), and the parasite lipid response becomes visible by lipidomics (45-60+ seconds).

      Figure 4E: are the differences in [cAMP] with DMSO treatment and A23187 treatment different at any of the timepoints in the WT strain? The comparison seems to be WT/∆cdpk3 at each timepoint. Does the text (lines 562-568) need to be modified accordingly?

      Response:

      In WT (and ∆CDPK3) parasites, [cAMP] is significantly changed at 5s of A23187 treatment (relative to DMSO). We have modified our figures to include this analysis. The existing text accurately reflects this.

      Figure 6I: is the difference between PDE2 cKO/∆cdpk3 + DMSO or RAP significant?

      Response:

      In our original manuscript, there was no statistical difference in [cAMP] between PDE2cKO/∆CDPK3+DMSO and PDE2cKO/∆CDPK3+DMSO+RAP, likely due to the variation between biological replicates. To overcome the issues in variability between replicates, we have now included more biological replicates (n=7). This has led to a significant difference in [cAMP] between PDE2cKO/∆CDPK3 DMSO- and RAP-treated parasites and between PDE2cKO DMSO- and RAP-treated parasites (now Fig. 6I).

      **MINOR COMMENTS**

      1. The following references should be added or amended:

      Lines 83-85: in the cited publication, relative phosphopeptide abundances of an overexpressed dominant-negative, constitutively inactive PKA mutant were compared to an overexpressed wild-type mutant. In this experimental setup, one would hypothesize that targets of PKA should be down-regulated (inactive/WT ratios). However, the mentioned phosphopeptide of PDE2 was found to be up-regulated, suggesting that it is not a direct target of PKA.

      Response:

      We thank the reviewer for spotting this error; we have now modified our wording.

      Cite TGGT1_305050, referenced as calmodulin in line 458, as TgELC2 (PMID: 26374117).

      Response:

      We have included this as per the reviewer’s suggestion.

      Cite TGGT1_295850 as apical annuli protein 2 (AAP2, PMID: 31470470).

      Response:

      We have included this as per the reviewer’s suggestion.

      Cite TGGT1_270865 (adenylyl cyclase beta, Acβ) as PMID: 29030485, 30449726.

      Response:

      We have included this as per the reviewer’s suggestion.

      Cite TGGT1_254370 (guanylyl cyclase, GC) as PMID: 30449726, 30742070.

      Response:

      We have included this as per the reviewer’s suggestion.

      Note that Lourido, Tang and David Sibley, 2012 observed that treatment with zaprinast (a PDE inhibitor) could overcome CDPK3 inhibition. The target(s) of zaprinast have not been determined and may differ from those of BIPPO (in identity and IC50). The cited study also used modified CDPK3 and CDPK1 alleles, rather than ∆cdpk3 and intact cdpk1 as used in this manuscript. That is to say, the signaling backgrounds of the parasite strains deviate in ways that are not controlled.

      Response:

      While it is true that zaprinast targets have not been unequivocally identified, zaprinast-induced egress is widely thought to be the result of PKG activation, a conclusion that is further supported by the finding that Compound 1 completely blocks zaprinast-induced egress (Lourido, Tang and David Sibley, 2012). Similarly, BIPPO-induced egress is inhibited by chemical inhibition of PKG by Compound 1 and Compound 2 (Jia et al., 2017). Moreover, like zaprinast, BIPPO has been clearly shown to partially overcome the ∆CDPK3 egress delay (Stewart et al., 2017).

      2. The following comments refer to the figures and legends:

      Part of the legend text for 1G is included under 1H.

      Response:

      This has been corrected.

      Figure 1H: The legend mentions that some dots are blue, but they appear green. Please ensure that color choices conform to journal accessibility guidelines. See the following article about visualization for colorblind readers: https://www.ascb.org/science-news/how-to-make-scientific-figures-accessible-to-readers-with-color-blindness/. Avoid using red and green false-colored images; replace red with a magenta lookup table. Multi-colored images are only helpful for the merged image; otherwise, we discern grayscale better. Applies to Figures 1B, 5C, 6D. (Aside: anti-CAP seems an odd choice of counterstain; the variation in the staining, esp. at the apical cap, is distracting.)

      Response:

      We thank reviewer #1 for bringing this to our attention, and have modified our colour usage for all IFAs and Figures 1H and 3E.

      We chose CAP staining as the antibody is available in the laboratory and stains both the apical end (which has been shown to contain several proteins important for signalling as well as PDE9) and the parasite periphery, the location of CDPK3.

      Figure 1B: When showing a single fluorophore, please use grayscale and include an intensity scale bar, since relative values are being compared.

      Response:

      We have modified this as per the reviewer’s suggestion.

      Figure 1C: it is difficult to compare the kinetics of the calcium response when the curves are plotted separately. Since the scales are the same, could the two treatments be plotted on the same axes, with different colors? Additionally, according to the legend, a red line seems to be missing in this panel.

      Response:

      Fig. 1C is not intended to compare kinetics, merely to show peak calcium release in each separate treatment condition. We have removed mention of a red line in the figure legend.

      Figure 2A: Either Figure S4 can be moved to accompany Figure 2A, or Figure 2A could be moved to the supplemental.

      Response:

      Figure S4 has now been incorporated into Figure 2.

      Reviewer #1 (Significance (Required)):

      This manuscript would interest researchers studying signaling pathways in protozoan parasites, especially apicomplexans, as CDPK3 and PKG orthologs exist across the phylum. To my knowledge, it is the first study that has proposed a mechanism by which a calcium effector regulates cAMP levels in T. gondii. Unfortunately, the experiments fall short of testing this mechanism.

      Response:

      We thank reviewer #1 for their comments, but disagree with their assessment that the key points of the manuscript “fall short of experimental testing”.

      1. We demonstrate that, following both BIPPO and A23187 treatment, there is differential phosphorylation of numerous components traditionally believed to sit upstream of PKG activation (as well as several components within the PKG signalling pathway itself).
      2. We show that some of these sites are CDPK3 dependent, and that deletion of CDPK3 leads to changes in lipid signalling and an elevation in levels of cAMP (dysregulation of which is known to alter PKG signalling).
      3. We show that pre-treatment with a PKA inhibitor is able to largely rescue this phenotype.
      4. We demonstrate that a cAMP-specific PDE is phosphorylated following A23187 treatment (i.e. Ca2+ flux)
      5. We show that this cAMP specific PDE plays a role in A23187-mediated egress.
      6. While the latter PDE may not be directly regulated by CDPK3, these findings suggest that there are likely several Ca2+-dependent kinases that contribute to this feedback loop.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      In this manuscript, Dominicus et al investigate the elusive role of calcium-dependent kinase 3 during the egress of Toxoplasma gondii. Multiple functions have already been proposed for this kinase by this group including the regulation of basal calcium levels (24945436) or of a tyrosine transporter (30402958). However, one of the most puzzling phenotypes of CDPK3 deficient tachyzoites is a marked delay in egress when parasites are stimulated with a calcium ionophore that is rescued with phosphodiesterase (PDE) inhibitors. Crosstalk between cAMP, cGMP, lipid and calcium signalling has been previously described to be important in regulating egress (26933036, 23149386, 29030485) but the role of CDPK3 in Toxoplasma is still poorly understood.

      Here the authors first take an elegant phosphoproteomic approach to identify pathways differentially regulated upon treatment with either a PDE inhibitor (BIPPO) or a calcium ionophore (A23187) in WT and CDPK3-KO parasites. Not much difference is observed between BIPPO or A23187 stimulation which is interpreted by the authors as a regulation through a feedback loop.

      The authors then investigate the effect of CDPK3 deletion on lipid, cGMP and cAMP levels. They identify major changes in DAG, phospholipid, FFAs, and TAG levels as well as differences in cAMP levels but not for cGMP. Chemical inhibition of PKA leads to a similar egress timing in CDPK3-KO and WT parasites upon A23187 stimulation.

      As four PDEs appeared differentially regulated in the CDPK3-KO line upon A23187, the authors investigate the requirement of the 4 PDEs in cAMP levels. They show diverse localisation of the PDEs with specificities of PDE1, 7 and 9 for cGMP and of PDE2 for cAMP. They further show that PDE1, 7 and 9 are sensitive to BIPPO. Finally, using a conditional deletion system, they show that PDE1 and 2 are important for the lytic cycle of Toxoplasma and that PDE2 shows a slightly delayed egress following A23187 stimulation.

      **Major comments:**

      -Are the key conclusions convincing?

      The title is supported by the findings presented in this study. However I am not sure I understand why the authors imply a positive feedback loop. This should be clarified in the discussion of the results.

      Response:

      We propose a positive feedback loop because, upon A23187 treatment (resulting in a calcium flux), ΔCDPK3 parasites are able to egress, albeit in a delayed manner. This egress delay is substantially, but not completely, alleviated upon treatment with BIPPO (a PDE inhibitor known to activate the PKG signalling pathway). In conjunction with our phosphoproteomic data (where we see phosphorylation of numerous pathway components upstream of PKG upon BIPPO and A23187 treatment - both in a CDPK3 dependent and independent manner), these observations suggest that calcium-regulated proteins (CDPK3 among them) feed into the PKG pathway. As deletion of CDPK3 delays egress, it is reasonable to postulate that this feedback is one that amplifies egress signalling (i.e. is positive).

      The phosphoproteome analysis seems very strong and will be of interest for many groups working on egress. However, the key conclusion, i.e. that a substrate overlap between PKG and CDPK3 is unlikely to explain the CDPK3 phenotype, seems premature to me in the absence of robustly identified substrates for both kinases.

      Response:

      We certainly do not fully exclude the possibility of a substrate overlap but do lean more heavily towards a feedback loop given (a) the inability to clearly detect treatment-specific signalling profiles and (b) the phospho targets observed in the A23187 and BIPPO phosphoproteomes. We have further clarified our reasoning, and overall tempered our language in the manuscript as per the reviewer’s suggestion.

      I am not sure there is a clear key conclusion from the lipidomic analysis and how it is used by the authors to build their model up. Major changes are observed but how could this be linked with CDPK3, particularly if cGMP levels are not affected?

      Response:

      Our phosphoproteomic analyses identify several CDPK3-dependent phospho sites on phospholipid signalling components (DGK1 & PI-PLC), suggesting that there is indeed altered signalling downstream of PKG. To test whether these lead to a measurable phenotype, we performed the lipidomics analysis. We did not pursue this arm of the signalling pathway any further as we postulated that the changes in the lipid signalling pathway were less likely to play a role in the feedback loop. Nevertheless, we felt that it was worthwhile to include these findings in our manuscript as they support the conclusions drawn from the phosphoproteomics - namely that lipid signalling is perturbed in CDPK3 mutants. We, or others, may follow up on this in future.

      We agree with the reviewer that it is surprising that cGMP levels remain unchanged in our experiments when we treat with A23187. Given the measurable difference in cAMP levels between WT and ΔCDPK3 parasites, we postulate that CDPK3 directly or indirectly downregulates levels of cAMP. This would, in turn, alter activity of the cAMP-dependent protein kinase PKAc. Jia et al. (2017) have shown a clear dependency on PKG for parasites to egress upon PKAc depletion, but were also unable to reliably demonstrate cGMP accumulation in intracellular parasites. Similarly, their hypothesis that dysregulated cGMP-specific PDE activity results in altered cGMP levels has not been proven (the PDE hypothesised to be involved has since been shown to be cAMP-specific).

      While it is possible that our collective inability to observe elevated cGMP levels is explained by the sensitivity limits of the assay, it is similarly possible that cAMP-mediated signalling is exerting its effects on the PKG signalling pathway in a cGMP-independent manner.

      The evidence that CDPK3 is involved in cAMP homeostasis seems strong. However, the analysis of PKA inhibition is a bit less clear. The way the data is presented makes it difficult to see whether the treatment is accelerating egress of CDPK3-KO parasites or affecting both WT and CDPK3-KO lines, including both the speed and extent of egress. This is important for the interpretation of the experiment.

      Response:

      Fig. 4F shows that there is a significant amount of premature egress in both WT and ∆CDPK3 parasites following 2 hrs of H89 pre-treatment (consistent with previous reports that downregulation of cAMP signalling stimulates premature egress). When we subsequently investigated A23187-induced egress rates of the remaining intracellular H89 pre-treated parasites (Fig. 4Gi-ii) we found that the ∆CDPK3 egress delay was largely rescued. We have moved Fig. 4F to the supplement (now Supp Fig. 5E) in order to avoid confusion between the distinct analyses shown in 4F (pre-treatment analyses) and 4G (egress experiment). These experiments provided a hint that cAMP signalling is affected, which we then validate by measuring elevated cAMP levels in CDPK3 mutant parasites.

      The biochemical characterisation of the four PDEs is interesting and seems well performed. However, PDE1 was previously shown to hydrolyse both cAMP and cGMP (https://doi.org/10.1101/2021.09.21.461320) which raises some questions about the experimental set up. Could the authors possibly discuss why they do not observe similar selectivity? Could other PDEs in the immunoprecipitate mask PDE activity? In line with this question, it is not clear what % of "hydrolytic activity (%)" means and how it was calculated.

      The experiments describing the selectivity of BIPPO for PDE1, 7 and 9 as well as the biological requirement of the four tested PDEs are convincing.

      Response:

      We believe that the disagreement between our findings and those published by Moss and colleagues is due to differences in experimental conditions. We performed our assays at room temperature for 1 hour with a starting cAMP concentration of 1 µM, whereas they performed their assays at 37ºC for 2 hours with a 10-fold lower starting cAMP concentration (0.1 µM). We have now repeated this set of experiments using the Moss et al. conditions, and find that PDEs 1, 7 and 9 can be dual specific, while PDE2 is cAMP-specific, thereby recapitulating their findings (now included in the revised manuscript under Supp Fig. 7B). However, we have also now performed a timecourse PDE assay using our original conditions and show that the cAMP hydrolytic activity of PDE1 can only be detected following 4 hours of incubation, whereas cGMP activity can be detected as early as 30 minutes, suggesting that it possesses predominantly cGMP activity (see Supp Fig. 7C). We therefore believe that our experimental setup is more stringent, because if one starts with a lower level of substrate and incubates for longer and at a higher temperature, even minor dual activity could make a substantial difference in cAMP levels. Our data suggest that the cAMP hydrolytic activity of PDEs 1, 7 and 9 is substantially lower than the cGMP hydrolytic activity that they display.

      We have also included a clear description of how % hydrolytic activity was calculated in the methods section.

      -Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The claim that CDPK3 affects cAMP levels seems strong however the exact links between CDPK3 activity, lipid, cGMP and cAMP signalling remain unclear and it may be important to clearly state this.

      Response:

      We have modified our wording in the text to more clearly describe our current hypothesis and reasoning.

      -Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I think that the manuscript contains a significant amount of experiments that are of interest to scientists working on Toxoplasma egress. Requesting experiments to identify the functional link between above-mentioned pathways would be out of the scope for this work although it would considerably increase the impact of this manuscript. For example, would it be possible to test whether the CDPK3-KO line is more or less sensitive to PKG specific inhibition upon A23187-induced egress?

      -Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The above-mentioned experiment is not trivial as no specific inhibitors of PKG are available. Ensuring specificity of the investigated phenotype would require the generation of a resistant line which would require significant work.

      Response: We agree that this would be an interesting experiment to further substantiate our findings. As indicated by the reviewer, however, the lack of specific inhibitors of PKG means a resistant line would likely be required to ensure specificity.

      -Are the data and the methods presented in such a way that they can be reproduced?

      It is not clear how the % of hydrolytic activity of the PDE has been calculated.

      Response: We have included a clearer description of how % hydrolytic activity was calculated in the methods section.

      -Are the experiments adequately replicated and statistical analysis adequate?

      This seems to be performed to high standards.

      **Minor comments:**

      -Specific experimental issues that are easily addressable.

      I do not have any comments related to minor experimental issues.

      -Are prior studies referenced appropriately?

      Most of the studies relevant for this work are cited. It is however not clear to me why some important players of the "PKG pathway" are not indicated in Fig 1H and Fig 3E, including for example UGO or SPARK.

      Response:

      We have modified Fig 1H and 3E to include all key players involved in the PKG pathway.

      -Are the text and figures clear and accurate?

      While all the data shown here is impressive and well analysed, I find it difficult to read the manuscript and establish links between sections of the papers. The phosphoproteome analysis is interesting and is used to orientate the reader towards a feedback mechanism rather than a substrate overlap. But why do the authors later focus on PDEs and not on AC or CNBD, as in the end, if I understand well, there is no evidence showing a link between CDPK3-dependent phosphorylation and PDE activity upon A23187 stimulation?

      Response:

      We thank reviewer #2 and appreciate their constructive feedback on the flow of the manuscript.

      Our key findings from the phosphoproteomics study were that 1) BIPPO and A23187 treatment trigger near identical signalling pathways, 2) both A23187 and BIPPO treatment lead to phosphorylation of numerous components both upstream and downstream of PKG signalling (hinting at the presence of a Ca2+-regulated feedback loop) and 3) several of the abovementioned components are phosphorylated in a CDPK3 dependent manner.

      While several avenues of study could have been pursued from this point onwards, we chose to focus on the feedback loop in a broader sense as its existence has important implications for our general understanding of the signalling pathways that govern egress.

      We reasoned that, given the differential phosphorylation of 4 PDEs following A23187 and BIPPO treatment (none of which had been studied in detail previously), it was relevant to study these in greater detail.

      Coupled with the A23187 egress assay on PDE2 knockout parasites, our findings suggest that PDE2 plays a role in the abovementioned Ca2+ signalling loop. While PDE2 may not exert its effects in a CDPK3-dependent manner (and CDPK3 may, therefore, alter cAMP levels in a different fashion), this does not detract from the important finding that PDE2 is one of the (likely numerous) components that is regulated in a Ca2+-dependent feedback loop to facilitate rapid egress.

      We have modified our wording to better reflect our rationale for studying the PDEs irrespective of their CDPK3 phosphorylation status.

      While we feel that our reasoning for studying the PDEs is solid, we do appreciate that further clarification on the putative CDPK3-adenylate cyclase link would elevate the manuscript substantially. However, given the data indicating that ACβ does not play a sole role in the control of egress, this is likely a non-trivial task that requires substantial work.

      It is also unclear how the authors link CDPK3-dependent elevated cAMP levels with the elevated basal calcium levels they previously described. This is difficult to reconcile, particularly in a PKG-independent manner.

      Response:

      We previously postulated that elevated Ca2+ levels allowed ΔCDPK3 mutants to overcome a complete egress defect, potentially by activating other CDPKs (e.g. CDPK1). It is similarly plausible that elevated Ca2+ levels in ΔCDPK3 parasites may lead to elevated cAMP levels in order to prevent premature egress.

      As noted in our previous responses, we acknowledge that our inability to detect cGMP is surprising. However, given the clarity of our cAMP findings, and the phosphoproteomic evidence to suggest that various components in the PKG signalling pathway are affected, we postulate that we are either unable to reliably detect cGMP due to sensitivity issues, or that cAMP is exerting its regulation on the PKG pathway in a cGMP-independent manner. As noted previously, while the link between cAMP and PKG signalling has been demonstrated by Jia et al., it is not entirely clear how this is mediated.

      The presentation of the lipidomic analysis is also not really clear to me. Why do the authors show the global changes in phospholipids and not a more detailed analysis?

      Response:

      We performed a detailed phospholipid profile of WT and ∆CDPK3 parasites under normal culture conditions. However, due to the sheer quantity of parasites required for this detailed analysis, we were unable to measure individual phospholipid species in our A23187 timecourse. We therefore opted to measure global changes following A23187 stimulation.

      As the authors focus on the PI-PLC pathway, could they detail the dynamics of phosphoinositides? I understand that lipid levels are affected in the mutant, but I am not sure I understand how the authors interpret these massive changes in relation to the function of CDPK3 and the observed phenotypes.

      Response:

      Our phosphoproteomic analyses identified several CDPK3-dependent phospho sites on phospholipid signalling components (DGK1 & PI-PLC), suggesting that (in keeping with all of our other data), there is altered signalling downstream of PKG. To test whether these changes lead to a measurable phenotype, we performed the lipidomics analysis. Following stimulation with A23187, we found a delayed production of DAG in ∆CDPK3 parasites compared to WT parasites. Since DAG is required for the production of PA, which in turn is required for microneme secretion, our finding can explain why microneme secretion is delayed in ∆CDPK3 parasites, as previously reported (Lourido, Tang and David Sibley, 2012; McCoy et al., 2012).

      We did not follow this arm of the signalling pathway any further as we postulated that the changes in the lipid signalling pathway were less likely to play a role in the feedback loop. Nevertheless, we felt that it was worthwhile to include these findings in our manuscript as they support the conclusions drawn from the phosphoproteomics - namely that lipid signalling is perturbed in CDPK3 mutants. We, or others, may follow up on this in future.

      Finally, the characterisation of the PDEs is an impressive piece of work but the functional link with CDPK3 is relatively unclear. It would also be important to clearly discuss the differences with previous results presented in this preprint: https://doi.org/10.1101/2021.09.21.461320.

      My understanding is that, while the authors aim to investigate the role of CDPK3 in A23187-induced egress, the main finding related to CDPK3 is a defect in cAMP homeostasis that is not linked to A23187. Similarly, the requirement for PDE2 in cAMP homeostasis and egress is only indirectly linked to CDPK3. Altogether, I think that important results are presented here, but they are divided into three main and distinct sections: the phosphoproteomic survey, the lipidomic and cAMP level investigation, and the characterisation of the four PDEs. However, the link between the sections is relatively weak and the way the results are presented is somewhat misleading or confusing.

      Response:

      As mentioned in a previous response, we chose to study PDEs in greater detail because of our observation that both A23187 and BIPPO treatments lead to their phosphorylation (hinting at the presence of a Ca2+-regulated feedback loop). We were particularly intrigued to study the cAMP-specific PDE, as CDPK3 KO parasites suggested that cAMP may play a role in the Ca2+ feedback mechanism. As PDE2 may not be directly regulated by CDPK3, Ca2+ appears to exert its feedback effects in numerous ways. We have modified our wording to better reflect our rationale for studying the PDEs irrespective of their CDPK3 phosphorylation status.

      -Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      This is a very long manuscript written for specialists of this signalling pathway, and I would suggest that the authors place more emphasis on the important results and clearly state where links are still missing. This is obviously a complex pathway and one cannot elucidate it easily in a single manuscript.

      Response:

      We have included an additional summary in our conclusions to better illustrate our findings and clarify any missing links.

      Reviewer #2 (Significance (Required)):

      -Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This is a technically remarkable paper using a broad range of analyses performed to a high standard.

      -Place the work in the context of the existing literature (provide references, where appropriate).

      The cross-talk between cAMP, cGMP and calcium signalling is well described in Toxoplasma and related parasites. Here the authors show that, in Toxoplasma, CDPK3 is part of this complex signalling network. One of the most important findings within this context is the role of CDPK3 in cAMP homeostasis. With this in mind, I would change the last sentence of the abstract to "In summary we uncover a feedback loop that enhances signalling during egress and links CDPK3 with several signalling pathways together."

      Response:

      In light of feedback received from several reviewers, we have made our wording less CDPK3-centric, as our findings relate in part to CDPK3 and, in a broader sense, to a Ca2+-driven feedback loop.

      The genetic and biochemical analyses of the four PDEs are remarkable and highlight consistencies and inconsistencies with recently published work that would be important to discuss and will be of interest for the field.

      **Response:** We thank reviewer #2 and agree that the PDE findings are of significant importance to the field.

      While I understand the studied signalling pathway is complex, I think it would be important to better describe the current model of the authors. In the discussion, the authors indicate that "the published data is not currently supported by a model that fits most experimental results." I would suggest to clarify this statement and discuss whether their work helps to reunite, correct or improve previous models.

      **Response:** We have expanded on the abovementioned statement to clarify that the presence of a feedback loop is a major pillar of knowledge required for the complete interpretation of existing signalling data.

      Could the authors also speculate about a potential role of PDE/CDPK3 in host cell invasion, as cAMP signalling has been shown to be important for this process (30208022 and 29030485)?

      **Response:** Existing literature (Jia et al., 2017) suggests that perturbations to cAMP signalling play a very minor role in invasion, since parasites in which either ACα or ACβ is deleted show no impairment in invasion levels. We currently do not have substantial data on invasion, and are not sure that pursuing this is valuable given the minor phenotypes observed in other studies.

      -State what audience might be interested in and influenced by the reported findings.

      This paper is of great interest to groups working on the regulation of egress in Toxoplasma gondii and other related apicomplexan pathogens.

      -Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am working on the cell biology of apicomplexan parasites.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Dominicus et al aimed to identify the intersecting components of calcium, cyclic nucleotides (cAMP, cGMP) and lipid signaling through phosphoproteomic, knockout and biochemical assays in an intracellular parasite, Toxoplasma gondii, particularly when its acutely-infectious tachyzoite stage exits the host cells. A series of experimental strategies were applied to identify potential substrates of calcium-dependent protein kinase 3 (CDPK3), which has previously been reported to control the tachyzoite egress. According to earlier studies (PMID: 23226109, 24945436, 5418062, 26544049, 30402958), CDPK3 regulated the parasite exit through multiple phosphorylation events. Here, authors identified differentially-regulated (DR) phosphorylation sites by comparing the parasite samples after treatment with a calcium ionophore (A23187) and a PDE inhibitor (BIPPO), both of which are known to induce artificial egress (induced egress as opposed to natural egress). When the ΔCDPK3 mutant was treated with A23187, its delayed egress phenotype did not change, whereas BIPPO restored the egress to the level of the parental (termed as WT) strain, probably by activating PKG.

      The gene ontology enrichment of the up-regulated clusters revealed many probable CDPK3-dependent DR sites involved in cyclic nucleotide signaling (PDE1, PDE2, PDE7, PDE9, guanylate and adenylate cyclases, cyclic nucleotide-binding protein or CNBP) as well as lipid signaling (PI-PLC, DGK1). Authors suggest lipid signaling as one of the factors altered in the CDPK3 mutant, albeit lipidomics (PC, PI, PS, PT, PA, PE, SM) showed no significant change in phospholipids. To reveal how the four PDEs indicated above contribute to the cAMP and cGMP-mediated egress, they examined their biological significance by knockout/knockdown and enzyme activity assays. Authors claim that PDE1,7,9 proteins are cGMP-specific while PDE2 is cAMP-specific, and BIPPO treatment can inhibit PDE1-cGMP and PDE7-cGMP, but not PDE9-cGMP. Given the complexity, the manuscript is well structured, and most experiments were carefully designed. Undoubtedly, there is a significant amount of work that underlies this manuscript; however, from a conceptual viewpoint, the manuscript does not offer significant advancement over the current knowledge without functional validation of phosphoproteomics data (see below). A large body of work preceding this manuscript has indicated the crosstalk of cAMP, cGMP, calcium and lipid signaling cascades. This work provides a further refinement of the existing model. In a methodical sense, the work uses established assays, some of which require revisiting to reach robust conclusions and avoid misinterpretation. The article is quite interesting from a throughput screening point of view, but it clearly lacks the appropriate endorsement of the hits. The authors accept that identifying the phosphorylation of a protein does not imply a functional role, which is a major drawback as there is no experimental support for any phosphorylation site of the protein identified through phosphoproteomics. In terms of the mechanism, it is not clear whether and how lipid turnover and cAMP-PKA signaling control the egress phenotype (lack of a validated model at the end of this study).

      Response:

      We thank reviewer #3 for their comments, but respectfully disagree with their assessment that the work presented does not advance current knowledge.

      1. We demonstrate that, following both BIPPO and A23187 treatment, there is differential phosphorylation of numerous components traditionally believed to sit upstream of PKG activation (as well as numerous components within the PKG signalling pathway itself). While it may have been inferred from previous studies that A23187 and BIPPO signalling intersect, this has never been unequivocally demonstrated - nor has a feedback loop ever been shown.

      2. We provide a novel A23187-driven phosphoproteome timecourse that further bolsters the model of a Ca2+-driven feedback loop.

      3. We show that deletion of CDPK3 leads to a delay in DAG production upon stimulation with A23187.

      4. We show that some of the abovementioned sites are CDPK3 dependent, and that deletion of CDPK3 leads to elevated levels of cAMP (dysregulation of which is known to alter PKG signalling).

      5. We show that pre-treatment with a PKA inhibitor is able to largely rescue this phenotype.

      6. We demonstrate that a cAMP-specific PDE is phosphorylated following A23187 treatment (i.e. Ca2+ flux).

      7. We show that this cAMP-specific PDE plays a role in egress.

      While the latter PDE may not be directly regulated by CDPK3, these findings suggest that there are likely several Ca2+-dependent kinases that contribute to this feedback loop.

      We also firmly disagree with the reviewer’s assertion that without phosphosite characterisation, we have no support for our model. Following treatment with A23187 (and BIPPO), we clearly show broad, systemic changes (both CDPK3 dependent and independent) across signalling pathways previously deemed to sit upstream of calcium flux. Given the vast number of proteins involved in these signalling pathways, and the multitude of differentially regulated phosphosites identified on each of them, it is highly likely that the signalling effects we observe are combinatorial. Accordingly, we believe that mutating individual sites on individual proteins would be a very costly endeavour which is unlikely to substantially advance our understanding of signalling during egress. Moreover, introducing multiple point mutations in a given protein to ablate phosphorylation may lead to protein misfolding and would therefore not be informative. One of the key aims of this study was to assess how egress signalling pathways are interconnected, and we believe we have been able to show strong support for a Ca2+-driven feedback mechanism in which both CDPK3 and PDE2 play a role through the regulation of cAMP.

      While we agree with the reviewer’s statement that a large body of work preceding this manuscript has indicated the crosstalk of cAMP, cGMP, calcium and lipid signalling cascades, a feedback loop has not previously been shown. We believe that this finding is absolutely central to facilitate the complete interpretation of existing signalling data. Furthermore, no previous studies have gone to this level of detail in either proteomics or lipidomics to analyse the calcium signal pathway in any apicomplexan parasite. We argue that the novelty in our manuscript is that it is a carefully orchestrated study that advances our understanding of the signalling network over time with subcellular precision. The kinetics of signalling is not well understood and we believe that our study is likely the first to include both proteomic and lipidomic analyses over a timecourse during the acute lytic cycle stage of the disease. In doing so, we found evidence for a feedback loop that controls the signalling network spatiotemporally, and we characterise elements of this feedback in the same study.

      **Major Comments:**

      Based on the findings reported here there is little doubt that BIPPO and A23187-induced signaling intersect with each other, as very much expected from previous studies. The authors selected the 50s and 15s post-treatment timing of A23187 and BIPPO, respectively for collecting phosphoproteomics samples. At these time points, which were shown to peak cytosolic Ca2+, parasites were still intracellular (Line #171). How did authors make sure to stimulate the entire signaling cascade adequately, particularly when parasites do not egress within the selected time window? There is significant variability between phosphosite intensities of replicates (Line #186), which may also be attributed to insufficient triggers for the egress across independent experiments. This work must be supported by in vitro egress assays with the chosen incubation periods of BIPPO and ionophore treatment (show the induced % egress of tachyzoites in the 50s and 15s).

      Response:

      1. We appreciate that the reviewer acknowledges that our data clearly show that BIPPO and A23187-induced signalling intersect. While this may have been expected from previous studies, this has not previously been shown - and is therefore valuable to the field. Specifically, the fact that A23187 treatment leads to phosphorylation of targets normally deemed to sit upstream of calcium release is entirely novel and adds a substantial layer of information to our understanding of how these signalling pathways work together.

      2. Treatments were purposely selected to align pathways to a point where calcium levels peak just prior to calcium reuptake. At these chosen timepoints, we clearly show that overall signalling correlation is very high. We know from our egress assays using identical treatment concentrations (Fig. 2C) that the stimulations used are sufficient to result in complete egress. We are simply comparing signalling pathways at points prior to egress.

      3. As mentioned in point 2, we show convincingly that the treatments used are sufficient to trigger complete egress. As detailed clearly in the text, we believe that these variations in intensities between replicates are due to slight differences in timing between experiments (this is inevitable given the very rapid progression of signalling, and the difficulty of replicating exact sub-minute treatment timings). We demonstrate that the reporter intensities associated with DR sites correlate well across replicates (Supp Fig. 3C), suggesting that despite some replicate variability, the overall trends across replicates are very much consistent. This allows us to confidently average scores to provide values that are representative of a site’s phosphorylation state at the timepoint of interest.

      4. The reviewer’s suggestion that we should demonstrate % egress at the 50s and 15s treatment timepoints is redundant - we state clearly in the text that parasites have not egressed at these timepoints. Our egress assays (Fig. 2C) further support this.
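
The reasoning in point 3 (replicates differ in absolute intensity but follow the same trend, so their scores can be averaged) can be illustrated with a minimal sketch. The values below are made up for illustration and are not taken from the manuscript; `pearson`, `rep1` and `rep2` are hypothetical names, and the sketch does not reproduce the analysis behind Supp Fig. 3C.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up log2 fold changes for four phosphosites in two replicates.
# rep2 is uniformly lower than rep1 (as a small timing offset might cause),
# but the sites keep the same ranking and spacing.
rep1 = [1.2, 2.0, 0.8, 1.6]
rep2 = [0.9, 1.7, 0.5, 1.3]

r = pearson(rep1, rep2)

# Per-site average across replicates, used as the representative score.
site_means = [mean(pair) for pair in zip(rep1, rep2)]
```

A high correlation here means that the per-site averages preserve the relative ordering of the sites, even though neither replicate alone gives the "true" magnitude.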

      The authors discuss that CDPK3 controls the cAMP level and PKA through activation of one or more yet-to-be-identified PDE(s). cAMP could probably also be regulated by an adenylate cyclase, ACbeta, that was found to have CDPK3-dependent phosphorylation sites. If CDPK3 is indeed a regulator of cAMP through the activation of PDEs or ACbeta, it would be expected that the deletion of CDPK3 would perturb the cAMP level, resulting in dysregulation of PKAc1 subunit, which in turn would dysregulate cGMP-specific PDEs (PMID: 29030485) and thereby PKG. All these connections need to be explained more clearly and with experimental support (what is positively and what is negatively regulated by CDPK3).

      Response:

      1. We do not firmly state that CDPK3 regulates cAMP by phosphorylation of a PDE - this is one of the possibilities addressed. We acknowledge the possibility that this could also be via the adenylate cyclase (see line 792).

      2. PMID: 29030485 demonstrates clearly a link between cAMP signalling and PKG signalling, but does not demonstrate how this is mediated. The authors postulate that a cGMP-specific PDE is dysregulated given their observation that PDE2 is differentially phosphorylated in a constitutively inactive PKA mutant; however, this was not validated experimentally. We and others (Moss et al., 2022), however, demonstrate that PDE2 is cAMP-specific. This suggests that the model built by PMID: 29030485 requires revisiting. We acknowledge clearly in the text that Jia et al. have shown a link between cAMP and PKG signalling, and hypothesise that CDPK3’s modulation of cAMP levels may affect this (this is in keeping with our phosphoproteomic data).

      Moreover, the egress defect is not due to a low influx of calcium in the cytosol because when the ionophore A23187 was added to the CDPK3 mutant, its phenotype was not recovered. Rather, the defect may be due to the low or null activity of PKG that would activate PI4K to generate IP3 and DAG. The latter would be used as a substrate by DGK to generate PA that is involved in the secretion of micronemes and Toxoplasma egress. In this context, authors should evaluate the role of CDPK3 in the secretion of micronemes that is directly related to the egress of the parasite.

      1. We agree with the reviewer on their point about calcium influx, and have already acknowledged in the text that the feedback loop does not control release of Ca2+ from internal stores as disruption of CDPK3 does not lead to a delay in Ca2+

      2. We agree, and clearly address in the text, that the egress defect could be due to altered PKG/phospholipid pathway signalling.

      3. (Lourido, Tang and David Sibley, 2012; McCoy et al., 2012) have both previously shown that microneme secretion is regulated by CDPK3. We therefore do not deem it necessary to repeat this experiment, but have made clearer mention of their findings in our writing.

      When the Δcdpk3 mutant was evaluated with BIPPO treatment, it was observed that the parasite recovered the egress phenotype. It is concluded that CDPK3 could probably regulate the activity of cGMP-specific PDEs. CDPK3 could (in)activate them, or it could act on other proteins indirectly regulating the activity of these PDEs. Upon inactivation of PDEs, an increase in the cGMP level would activate PKG, which will, in turn, promote egress. From the data, it is not clear whether any phosphorylation by CDPK3 would activate or inactivate PDEs, and if so, then how (directly or indirectly). To reach unambiguous interpretation, authors should perform additional assays.

      Response:

      As mentioned previously, given the abundance of differentially regulated phosphosites, we do not believe that mutating individual sites on individual proteins is a worthwhile or realistic pursuit.

      We clearly show systematic A23187-mediated phosphorylation of key signalling components in the PKA/PKG/PI-PLC/phospholipid signalling cascade, and demonstrate that several of these are CDPK3-dependent. We demonstrate that CDPK3 alters cAMP levels (and that the ∆CDPK3 egress delay in A23187 treated parasites is largely rescued following pre-treatment with a PKA inhibitor). We similarly demonstrate that A23187 treatment leads to phosphorylation of numerous PDEs, including the cAMP specific PDE2, and show that PDE2 knockout parasites show an egress delay following A23187 treatment. While PDE2 may not be directly regulated by CDPK3 (suggesting other Ca2+ kinases are also involved), these findings collectively demonstrate the existence of a calcium-regulated feedback loop, in which CDPK3 and PDE2 play a role (by regulating cAMP).

      We acknowledge that we have not untangled every element of this feedback loop, and do not believe that it would be realistic to do so in a single study given the number of sites phosphorylated and pathways involved. We do believe, however, that we have shown clearly that the feedback loop exists - this in itself is entirely novel, and of significant importance to the field.

      On a similar note, a possible experiment that could improve the work would be to treat the CDPK3 mutant with BIPPO in conjunction with a calcium chelator (BAPTA-AM) to reveal which proteins are phosphorylated prior to activation of the calcium-mediated cascades.

      Response:

      We agree that this would be an interesting experiment to carry out but would involve significant work. This could be pursued in another paper or project but is beyond the scope of this work.

      The manuscript claims that PDE1, PDE7, PDE9 are cGMP specific, and BIPPO inhibits only cGMP-specific PDEs. All assays are performed with 1-10 micromolar cAMP and cGMP for 1h. There is no data showing the time, protein and substrate dependence. Given the suboptimal enzyme assays, authors should re-do them as suggested here. (1) Repeat the pulldown assay with a higher number of parasites (50-100 million) and measure the protein concentration. (2) Set up the PDE assay with saturating amount of cAMP and cGMP, which is critical if the PDE1,7,9 have a higher Km Value for cAMP (means lower affinity) compared to cGMP. An adequate amount of substrate and protein allows the reaction to reach the Vmax. Once you have re-determined the substrate specificity (revise Fig 5D), you should retest BIPPO (Fig 5E) in the presence of cAMP and cGMP. It is very likely that you would find the same result as PDE9 and PfPDEβ (BIPPO can inhibit both cAMP and cGMP-specific PDE), as described previously

      We have repeated our assay using the exact same conditions outlined by Moss et al. This involved using a similar number of parasites, a longer incubation time of 2 hours at a higher temperature (37ºC) and with a lower starting concentration of cAMP (0.1 µM). We demonstrate that we are able to recapitulate the results of both Moss et al. and Vo et al. (see Supp Fig. 7B). However, we noticed that these reactions were not carried out with saturating cAMP/cGMP concentrations, since all reactions had reached 100% completion at the end of the assay (i.e. all substrate was hydrolysed). We therefore believe, based on our original assay as well as the new PDE1 timecourse that we have performed (Supp Fig. 7C), that PDEs 1, 7 and 9 display predominantly cGMP hydrolysing activity, with moderate cAMP hydrolysing activity.

      We also repeated the BIPPO inhibition assay using the Moss et al. conditions, and still observe that the cGMP activity of PDE1 is the most potently inhibited of all 4 PDEs. We also see moderate inhibition of the cAMP activities of PDE1 and PDE9, suggesting that cAMP hydrolytic activity can also be inhibited. Interestingly, the cGMP hydrolytic activities of PDEs 7 & 9, which were previously inhibited using our original assay conditions, no longer appear to be inhibited. This is likely due to the longer incubation time, which masks the reduced activities of these two PDEs following treatment with BIPPO.

      The authors did not identify any PKG substrate, which is quite surprising as cAMP signaling itself could impact cGMP. Authors should show if they were able to observe enhanced cGMP levels in BIPPO-treated sample (which is expected to stimulate cGMP-specific PDEs). The authors mention their inability to measure cGMP level, but have they analyzed cGMP in the positive control (BIPPO-treated parasite line)? Why have they focused only on the CDPK3 mutant, whereas in their phosphoproteomic data they could see other CDPKs too? It could be that other CDPK-mediated signaling differs and needs PKA/PKG for activation.

      In the title, the authors have mentioned that there is a positive feedback loop between calcium release, cyclic nucleotide and lipid signaling. This is quite an extrapolation, as there is no clear experimental data supporting such a positive feedback loop, so the authors should change the title of the paper.

      Response:

      1. As addressed in our previous response to the reviewer, PMID: 29030485 demonstrates clearly a link between cAMP signalling and PKG signalling, but does not confirm how this is mediated. The authors surmise that a cGMP-specific PDE is dysregulated (although the PDE hypothesised to be involved has since been shown to be cAMP-specific), but are similarly unable to detect changes in cGMP levels. This suggests that their model may be incomplete.

      2. The BIPPO treatment experiment suggested by the reviewer was already included in the original manuscript (see Fig. 4D in original manuscript, now Fig. 4E). With BIPPO treatment we are able to detect changes in cGMP levels.

      3. We did not deem it to be within the scope of this study to study every single other CDPK. We chose to study CDPK3, as its egress phenotype was of particular interest given its partial rescue following BIPPO treatment. We reasoned that its study may lead us to identify the signalling pathway that links BIPPO and A23187 induced signalling.

      4. As addressed in greater detail in our response to reviewer #2, the fact that the feedback loop appears to stimulate egress implies that it is positive.

      **Minor Comments:**

      Materials & Methods

      Explanation of parameters is not clear (Line #360-367). Phosphoproteomics with A23187 (8 micromolar) treatment in CDPK3-KO and WT, for 15, 30 and 60s at 37°C incubation with DMSO control. Simultaneously passing the DR and CDPK3 dependency thresholds: CDPK3-dependent phosphorylation

      **Response:** We have modified the wording to make this clearer as per the reviewer’s suggestion.

      Line #368: At which WT-A23187 timepoint did the authors identify 2408 DR-up phosphosites (15s, 30s or 60s)? Or consistently in all? This should be clarified.

      **Response:** As already stated in the manuscript (see line 366 in original manuscript, now line 1047), phosphorylation sites were considered differentially regulated if at any given timepoint their log2FC surpassed the DR threshold.
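
The "any given timepoint" rule described above can be sketched in a few lines. This is an illustrative reading only: the threshold value, site names and fold changes are hypothetical, "surpassed" is assumed to mean strictly greater than, and the manuscript's actual DR thresholds are not reproduced here.

```python
from typing import Mapping

# Hypothetical cut-off for illustration; the manuscript defines its own DR threshold.
DR_UP_THRESHOLD = 1.0


def is_dr_up(log2fc_by_timepoint: Mapping[str, float],
             threshold: float = DR_UP_THRESHOLD) -> bool:
    """Return True if the site's log2FC exceeds the threshold at any timepoint."""
    return any(fc > threshold for fc in log2fc_by_timepoint.values())


# Made-up log2 fold changes (treated vs DMSO) at the three A23187 timepoints.
sites = {
    "siteA": {"15s": 0.2, "30s": 1.4, "60s": 0.9},  # passes at 30s only
    "siteB": {"15s": 0.1, "30s": 0.3, "60s": 0.5},  # never passes
}

dr_up_sites = [name for name, fcs in sites.items() if is_dr_up(fcs)]
```

Under this rule a site counts as DR-up even if it passes the threshold at a single timepoint, so the reported total is a union across 15s, 30s and 60s rather than a count at one timepoint.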

      A23187 treatment of the CDPK3-KO mutant significantly increased the cAMP levels at 5 sec post-treatment, but BIPPO did not show any change. The authors concluded that BIPPO presumably does not inhibit cAMP-specific PDEs. However, the dual-specific PDEs are known to be inhibited by BIPPO, as shown recently (https://www.biorxiv.org/content/10.1101/2021.09.21.461320v1). Authors do confirm that BIPPO-treatment can inhibit hydrolytic activity of PfPDEbeta for cAMP as well as cGMP (Line #612). Besides, it was shown in Fig 5E that BIPPO can partially though not significantly block cAMP-specific PDE2. The statements and data conflict with each other under different subtitles and need to be reconciled. Elevation of basal cAMP level in the CDPK3 mutant indicates the perturbation of cAMP signaling; however, the BIPPO data require additional supportive experiments to conclude its relation with cAMP or dual-specific PDE.

      Response:

      1. The manuscript to which the reviewer refers does not use BIPPO in any of their experiments. They show that continuous treatment with zaprinast blocks parasite growth in a plaque assay, but do not test whether zaprinast specifically blocks the activity of any of the PDEs.

      2. Having repeated the PDE assay using the Moss et al. conditions (as outlined above), we are now able to recapitulate their findings, showing that PDEs 1, 7 and 9 can display dual hydrolytic activity while PDE2 is cAMP specific. As explained further above, we believe that our original set of experiments is more stringent than the Moss *et al.* conditions. To confirm this, we also performed an additional experiment, incubating PDE1 for varying amounts of time using our original conditions (1 µM cAMP or 10 µM cGMP, at room temperature). This revealed that PDE1 is much more efficient at hydrolysing cGMP, and only begins to display cAMP hydrolysing capacity after 4 hours of incubation.

      3. We also measured the inhibitory capacity of BIPPO on the PDEs using the Moss *et al.* conditions. During the longer incubation time, it seems that BIPPO is unable to inhibit PDEs 7 and 9, while with the more stringent conditions it was able to inhibit both PDEs. We reasoned that since BIPPO is unable to inhibit these PDEs fully, the residual activity over the longer incubation period would compensate for the inhibition, eventually leading to 100% hydrolysis of the cNMPs. We also see that while the cGMP hydrolysing capacity of PDE1 is completely inhibited, its cAMP hydrolysing capacity is only partially inhibited. These findings and the fact that PDE2 is not inhibited by BIPPO are in line with our experiments where we measured [cAMP] and showed that treatment with BIPPO did not lead to alterations in [cAMP].

      The method used to determine the substrate specificity of PDE 1,2,7 and 9 resulted in the hydrolytic activity of PDE2 towards cAMP, while the remaining 3 were determined as cGMP-specific. However, PDE1 and PDE9 have been reported as being dual-specific (Moss et al, 2021; Vo et al, 2020), which questions the reliability of the preferred method to characterize substrate specificity by the authors. It is also suggested to use another ELISA-based kit to double check the results.

      Response:

      As outlined above, we have repeated the assay using the conditions described by Moss et al. (lower starting concentrations of cAMP, 2 hour incubation period at 37ºC) and find that we are able to recapitulate the results of both Moss et al. and Vo et al. However, under the Moss et al. conditions, the PDEs have hydrolysed 100% of the cyclic nucleotide, suggesting that these conditions are less stringent than the ones we used originally, with higher starting concentrations of cAMP and incubation for only 1 hour at room temperature. With enzymatic assays it is always important to perform them at saturating conditions (as already suggested by the reviewer) and therefore we believe that our original conditions are more stringent than the Moss et al. conditions.
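
The general point about endpoint assays can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' analysis: it assumes simple Michaelis-Menten kinetics, and the `vmax`, `km`, substrate and time values are hypothetical numbers chosen only to show the effect. With a low starting substrate and a long incubation, a fast and a slow activity both reach near-complete hydrolysis and look identical at the endpoint; with a higher starting substrate and a shorter incubation, the same two activities are clearly separated.

```python
def fraction_hydrolysed(s0, vmax, km, t_end, dt=0.01):
    """Fraction of substrate consumed after t_end, by Euler integration of
    d[S]/dt = -vmax * [S] / (km + [S]). Units are arbitrary but consistent
    (e.g. concentration in uM, time in minutes)."""
    s = s0
    for _ in range(int(round(t_end / dt))):
        rate = vmax * s / (km + s)
        s = max(s - rate * dt, 0.0)
    return 1.0 - s / s0


# Hypothetical kinetic parameters: one efficient and one inefficient activity.
FAST = {"vmax": 1.0, "km": 1.0}
SLOW = {"vmax": 0.05, "km": 1.0}

# "Lenient" endpoint: low substrate, long incubation.
lenient_fast = fraction_hydrolysed(s0=0.1, t_end=120, **FAST)
lenient_slow = fraction_hydrolysed(s0=0.1, t_end=120, **SLOW)

# "Stringent" endpoint: higher substrate, shorter incubation.
stringent_fast = fraction_hydrolysed(s0=10.0, t_end=60, **FAST)
stringent_slow = fraction_hydrolysed(s0=10.0, t_end=60, **SLOW)
```

In this toy setting both lenient endpoints exceed 99% hydrolysis, whereas under the stringent settings the slow activity has consumed well under half of its substrate. A timecourse, such as the PDE1 timecourse mentioned above, avoids the ambiguity altogether because it reports rates rather than a single endpoint.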

      Line #607-608: Authors found PDE9 less sensitive to BIPPO-treatment and concluded PDE2 as refractory to BIPPO inhibition; however, the reduction level of activity seems similar as seen in PDE9-BIPPO treated sample? This strong statement should be replaced with a mild explanation.

      Response: We have tempered our wording as per the reviewer’s suggestion.

      Figures and legends:

      The introductory model in Fig S1 is difficult to understand and ambiguous despite having it discussed in the text. For example, CDPK1 is placed, but only mentioned at the beginning, and the role of other CDPKs is not clear. In addition, the arrows in IP3 and PKG are confusing. The location of guanylate and adenylate cyclase is wrong, and so on... The figure should include only the egress-related signaling components to curate it. The illustration of host cell in orange color must be at the right side of the figure in connection with the apical pole of the parasite (not on the top). Figure legend should also be rearranged accordingly and citations of the underlying components should be included (see below).

      Response: We have modified Supp Fig. 1 as per the suggestions of reviewers #2 and #3. We have now modified the localisations of the proteins and have also removed the lines showing the cross talk between pathways. We have also highlighted to the reader that this is only a model and may not represent the true localisations of the proteins, despite our best efforts.

      In Figure 5D, would you please provide the western blot analysis of samples before and after pulling down to demonstrate the success of your immunoprecipitation assay. Mention the protein concentration in your PDE enzyme assay. Please refer to the M&M comments above to re-do the enzyme assays.

      Response:

      We have now included western blots for the pull downs of PDEs 1, 2, 7 and 9 (Supp Fig. 7A). We chose not to measure protein concentrations of samples since all experiments were performed using the same starting parasite numbers, and we do not see large differences in activities between biological replicates of the PDEs.

      Figure legend 1C: Line #194: There is no red-dotted line shown in graph! Correct it!

      Response: We have modified this.

      Figure 4Gi-ii: Shouldn't it be labelled i: H89-treatment and ii: A23187, respectively, instead of DMSO and H89 (based on the text, Line #579)?

      Response: Our labelling of Fig. 4Gi-ii is correct as panel i parasites were pre-treated with DMSO, while panel ii parasites were pre-treated with H89. Subsequent egress assays on both parasites were then performed using A23187.

      We have modified the figures to include mention of A23187 on the X axis, and modified the figure legend to clarify pre-treatment was performed with DMSO and H89 respectively.

      Bibliography:

      Line #57 and 58: Citations must be selected properly! Carruthers and Sibley 1999 revealed the impact of Ca2+ on the microneme secretion within the context of host cell attachment and invasion, not egress as indicated in the manuscript! The same applies to the reference Wiersma et al 2004, since the roles of cyclic nucleotides were suggested for motility and invasion. Also notable is the fact that several citations describing the localization, regulation and physiological importance of cAMP and cGMP signaling mediators (PMID: 30449726, 31235476, 30992368, 32191852, 25555060, 29030485) are either completely omitted or not appropriately cited in the introduction and discussion sections.

      Response:

      We have modified the citations as per the reviewer’s suggestions. We now cite Endo et al., 1987 for the first use of A23187 as an egress trigger, and Lourido, Tang and David Sibley, 2012 for the role of cGMP signalling in egress. We also cite all the GC papers when we make first mention of the GC. We have also removed the Howard et al., 2015 citation (PMID: 25555060) when referring to the fact that BIPPO/zaprinast can rescue the egress delay of ∆CDPK3 parasites.

      Grammar/Language

      Line #31: After "cAMP levels" use comma

      Response:

      We have modified this.

      36: Sentence is not clear. Does conditional deletion of all four PDEs support their important roles? If so, the role in egress of the parasite?

      Response:

      We have clarified our wording as per the reviewer’s suggestion. We state that PDEs 1 and 2 display an important role in growth since deletion of either these PDEs leads to reduced plaque growth. We have not investigated exactly what stage of the lytic cycle this is.

      40: "is a group involving" instead of "are"

      Response:

      We found no mention of “a group involving” in our original manuscript at line 40 or anywhere else in the manuscript, so we are unsure what the reviewer is referring to.

      108: isn't it "discharge of Ca++ from organelle stores to cytosol"?

      Response: We thank the reviewer for spotting this error. We have now modified this sentence.

      120: "was" instead of "were"

      Response: Since the situation we are referencing is hypothetical, ‘were’ is the correct tense.

      Reviewer #3 (Significance (Required)):

      There is a significant amount of work that underlies this manuscript; however, from a conceptual viewpoint, the manuscript does not offer significant advancement over the current knowledge without functional validation of phosphoproteomics data. In terms of the mechanism, it is not clear whether and how lipid turnover and cAMP-PKA signaling control the egress phenotype (lack of a validated model at the end of this study). In a methodical sense, the work uses established assays, some of which require revisiting to reach robust conclusions and avoid misinterpretation.

      Compare to existing published knowledge

      A large body of work preceding this manuscript has indicated the crosstalk of cAMP, cGMP, calcium and lipid signaling cascades. This work provides a further refinement of the existing model. The article is quite interesting from a throughput screening point of view, but it clearly lacks the appropriate endorsement of the hits.

      Response:

      Please refer to our first response to reviewer #3 for our full rebuttal to these points. We respectfully disagree with the assessment that the work presented does not advance current knowledge.

      Audience

      Field specific (Apicomplexan Parasitology)

      Expertise

      Molecular Parasitology

      References

      Bailey, A. P. et al. (2015) ‘Antioxidant Role for Lipid Droplets in a Stem Cell Niche of Drosophila’, Cell. The Authors, 163(2), pp. 340–353. doi: 10.1016/j.cell.2015.09.020.

      Bullen, H. E. et al. (2016) ‘Phosphatidic Acid-Mediated Signaling Regulates Microneme Secretion in Toxoplasma’, Cell Host & Microbe, pp. 349–360. doi: 10.1016/j.chom.2016.02.006.

      Dass, S. et al. (2021) ‘Toxoplasma LIPIN is essential in channeling host lipid fluxes through membrane biogenesis and lipid storage’, Nature Communications. Springer US, 12(1). doi: 10.1038/s41467-021-22956-w.

      Endo, T. et al. (1987) ‘Effects of Extracellular Potassium on Acid Release and Motility Initiation in Toxoplasma gondii’, The Journal of Protozoology, 34(3), pp. 291–295. doi: 10.1111/j.1550-7408.1987.tb03177.x.

      Flueck, C. et al. (2019) ‘Phosphodiesterase beta is the master regulator of cAMP signalling during malaria parasite invasion’, PLoS Biology. doi: 10.1371/journal.pbio.3000154.

      Howard, B. L. et al. (2015) ‘Identification of potent phosphodiesterase inhibitors that demonstrate cyclic nucleotide-dependent functions in apicomplexan parasites’, ACS Chemical Biology, 10(4), pp. 1145–1154. doi: 10.1021/cb501004q.

      Jia, Y. et al. (2017) ‘Crosstalk between PKA and PKG controls pH-dependent host cell egress of Toxoplasma gondii’, The EMBO Journal, 36(21), pp. 3250–3267. doi: 10.15252/embj.201796794.

      Katris, N. J. et al. (2020) ‘Rapid kinetics of lipid second messengers controlled by a cGMP signalling network coordinates apical complex functions in Toxoplasma tachyzoites’, bioRxiv. doi: 10.1101/2020.06.19.160341.

      Lentini, J. M. et al. (2020) ‘DALRD3 encodes a protein mutated in epileptic encephalopathy that targets arginine tRNAs for 3-methylcytosine modification’, Nature Communications. Springer US, 11(1). doi: 10.1038/s41467-020-16321-6.

      Lourido, S., Tang, K. and David Sibley, L. (2012) ‘Distinct signalling pathways control Toxoplasma egress and host-cell invasion’, EMBO Journal. Nature Publishing Group, 31(24), pp. 4524–4534. doi: 10.1038/emboj.2012.299.

      Lunghi, M. et al. (2022) ‘Pantothenate biosynthesis is critical for chronic infection by the neurotropic parasite Toxoplasma gondii’, Nature Communications. Springer US, 13(1). doi: 10.1038/s41467-022-27996-4.

      McCoy, J. M. et al. (2012) ‘TgCDPK3 Regulates Calcium-Dependent Egress of Toxoplasma gondii from Host Cells’, PLoS Pathogens, 8(12). doi: 10.1371/journal.ppat.1003066.

      Moss, W. J. et al. (2022) ‘Functional Analysis of the Expanded Phosphodiesterase Gene Family in Toxoplasma gondii Tachyzoites’, mSphere. American Society for Microbiology, 7(1). doi: 10.1128/msphere.00793-21.

      Stewart, R. J. et al. (2017) ‘Analysis of Ca2+ mediated signaling regulating Toxoplasma infectivity reveals complex relationships between key molecules’, Cellular Microbiology, 19(4). doi: 10.1111/cmi.12685.

      Vo, K. C. et al. (2020) ‘The protozoan parasite Toxoplasma gondii encodes a gamut of phosphodiesterases during its lytic cycle in human cells’, Computational and Structural Biotechnology Journal. The Author(s), 18, pp. 3861–3876. doi: 10.1016/j.csbj.2020.11.024.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      In this manuscript, Dominicus et al investigate the elusive role of calcium-dependent kinase 3 during the egress of Toxoplasma gondii. Multiple functions have already been proposed for this kinase by this group including the regulation of basal calcium levels (24945436) or of a tyrosine transporter (30402958). However, one of the most puzzling phenotypes of CDPK3 deficient tachyzoites is a marked delay in egress when parasites are stimulated with a calcium ionophore that is rescued with phosphodiesterase (PDE) inhibitors. Crosstalk between cAMP, cGMP, lipid and calcium signalling has been previously described to be important in regulating egress (26933036, 23149386, 29030485) but the role of CDPK3 in Toxoplasma is still poorly understood.

      Here the authors first take an elegant phosphoproteomic approach to identify pathways differentially regulated upon treatment with either a PDE inhibitor (BIPPO) or a calcium ionophore (A23187) in WT and CDPK3-KO parasites. Not much difference is observed between BIPPO or A23187 stimulation which is interpreted by the authors as a regulation through a feed-back loop. The authors then investigate the effect of CDPK3 deletion on lipid, cGMP and cAMP levels. They identify major changes in DAG, phospholipid, FFAs, and TAG levels as well as differences in cAMP levels but not for cGMP. Chemical inhibition of PKA leads to a similar egress timing in CDPK3-KO and WT parasites upon A23187 stimulation.

      As four PDEs appeared differentially regulated in the CDPK3-KO line upon A23187, the authors investigate the requirement of the 4 PDEs in cAMP levels. They show diverse localisation of the PDEs with specificities of PDE1, 7 and 9 for cGMP and of PDE2 for cAMP. They further show that PDE1, 7 and 9 are sensitive to BIPPO. Finally, using a conditional deletion system, they show that PDE1 and 2 are important for the lytic cycle of Toxoplasma and that PDE2 shows a slightly delayed egress following A23187 stimulation.

      Major comments:

      -Are the key conclusions convincing?

      The title is supported by the findings presented in this study. However, I am not sure I understand why the authors imply a positive feed back loop. This should be clarified in the discussion of the results. The phosphoproteome analysis seems very strong and will be of interest for many groups working on egress. However, the key conclusion, i.e. that a substrate overlap between PKG and CDPK3 is unlikely to explain the CDPK3 phenotype, seems premature to me in the absence of robustly identified substrates for both kinases.

      I am not sure there is a clear key conclusion from the lipidomic analysis and how it is used by the authors to build their model up. Major changes are observed but how could this be linked with CDPK3, particularly if cGMP levels are not affected?

      The evidence that CDPK3 is involved in cAMP homeostasis seems strong. However, the analysis of PKA inhibition is a bit less clear. The way the data is presented makes it difficult to see whether the treatment is accelerating egress of CDPK3-KO parasites or affecting both WT and CDPK3-KO lines, including both the speed and extent of egress. This is important for the interpretation of the experiment.

      The biochemical characterisation of the four PDEs is interesting and seems well performed. However, PDE1 was previously shown to hydrolyse both cAMP and cGMP (https://doi.org/10.1101/2021.09.21.461320) which raises some questions about the experimental set up. Could the authors possibly discuss why they do not observe similar selectivity? Could other PDEs in the immunoprecipitate mask PDE activity? In line with this question, it is not clear what % of "hydrolytic activity (%)" means and how it was calculated. The experiments describing the selectivity of BIPPO for PDE1, 7 and 9 as well as the biological requirement of the four tested PDEs are convincing.

      -Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The claim that CDPK3 affects cAMP levels seems strong; however, the exact links between CDPK3 activity, lipid, cGMP and cAMP signalling remain unclear and it may be important to clearly state this.

      -Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I think that the manuscript contains a significant amount of experiments that are of interest to scientists working on Toxoplasma egress. Requesting experiments to identify the functional link between above-mentioned pathways would be out of the scope for this work although it would considerably increase the impact of this manuscript. For example, would it be possible to test whether the CDPK3-KO line is more or less sensitive to PKG specific inhibition upon A23187-induced egress?

      -Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The above-mentioned experiment is not trivial as no specific inhibitors of PKG are available. Ensuring specificity of the investigated phenotype would require the generation of a resistant line which would require significant work.

      -Are the data and the methods presented in such a way that they can be reproduced?

      It is not clear how the % of hydrolytic activity of the PDE has been calculated.

      -Are the experiments adequately replicated and statistical analysis adequate?

      This seems to be performed to high standards.

      Minor comments:

      -Specific experimental issues that are easily addressable.

      I do not have any comments related to minor experimental issues.

      -Are prior studies referenced appropriately?

      Most of the studies relevant for this work are cited. It is however not clear to me why some important players of the "PKG pathway" are not indicated in Fig 1H and Fig 3E, including for example UGO or SPARK.

      -Are the text and figures clear and accurate?

      While all the data shown here is impressive and well analysed, I find it difficult to read the manuscript and establish links between sections of the paper. The phosphoproteome analysis is interesting and is used to orientate the reader towards a feedback mechanism rather than a substrate overlap. But why do the authors later focus on PDEs and not on AC or CNBD, as in the end, if I understand correctly, there is no evidence showing a link between CDPK3-dependent phosphorylation and PDE activity upon A23187 stimulation? It is also unclear how the authors link CDPK3-dependent elevated cAMP levels with the elevated basal calcium levels they previously described. This is particularly difficult to reconcile in a PKG independent manner.

      The presentation of the lipidomic analysis is also not really clear to me. Why do the authors show the global changes in phospholipids and not a more detailed analysis? As the authors focus on the PI-PLC pathway, could they detail the dynamics of phosphoinositides? I understand that lipid levels are affected in the mutant but I am not sure I understand how the authors interpret these massive changes in relationship with the function of CDPK3 and the observed phenotypes.

      Finally, the characterisation of the PDEs is an impressive piece of work but the functional link with CDPK3 is relatively unclear. It would also be important to clearly discuss the differences with previous results presented in this preprint: https://doi.org/10.1101/2021.09.21.461320. My understanding is that while the authors aim at investigating the role of CDPK3 in A23187-induced egress, the main finding related to CDPK3 is a defect in cAMP homeostasis that is not linked to A23187. Similarly, the requirement of PDE2 in cAMP homeostasis and egress is indirectly linked to CDPK3. Altogether I think that important results are presented here but divided into three main and distinct sections: the phosphoproteomic survey, the lipidomic and cAMP level investigation, and the characterisation of the four PDEs. However, the link between each section is relatively weak and the way the results are presented is somewhat misleading or confusing.

      -Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      This is a very long manuscript written for specialists of this signalling pathway and I would suggest that the authors emphasise the important results more and also clearly state where links are still missing. This is obviously a complex pathway and one cannot elucidate it easily in a single manuscript.

      Significance

      -Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This is a technically remarkable paper using a broad range of analyses performed to a high standard.

      -Place the work in the context of the existing literature (provide references, where appropriate).

      The cross-talk between cAMP, cGMP and calcium signalling is well described in Toxoplasma and related parasites. Here the authors show that, in Toxoplasma, CDPK3 is part of this complex signalling network. One of the most important findings within this context is the role of CDPK3 in cAMP homeostasis. With this in mind, I would change the last sentence of the abstract to "In summary we uncover a feedback loop that enhances signalling during egress and links CDPK3 with several signalling pathways together."

      The genetic and biochemical analyses of the four PDEs are remarkable and highlight consistencies and inconsistencies with recently published work that would be important to discuss and will be of interest for the field.

      While I understand the studied signalling pathway is complex, I think it would be important to better describe the current model of the authors. In the discussion, the authors indicate that "the published data is not currently supported by a model that fits most experimental results." I would suggest clarifying this statement and discussing whether their work helps to reunite, correct or improve previous models.

      Could the authors also speculate about a potential role of PDE/CDPK3 in host cell invasion as cAMP signalling has been shown to be important for this process (30208022 and 29030485)?

      -State what audience might be interested in and influenced by the reported findings.

      This paper is of great interest to groups working on the regulation of egress in Toxoplasma gondii and other related apicomplexan pathogens.

      -Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am working on the cell biology of apicomplexan parasites.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01384

      Corresponding author(s): Mary O’Riordan and Basel Abuaita

      1. General Statements [optional]

      We appreciated the positive feedback and helpful suggestions from the reviewers that pointed to a need for more clarity regarding the central focus of the study. Our goal was to take an unbiased approach to evaluating the role of neutrophils during S. Typhimurium (STM) infection of human intestinal epithelial cells (IEC), using human intestinal organoids as a model. An abundance of data point to important inflammatory roles for neutrophils during STM infection of human intestine but the critical mechanisms involved have not been fully elucidated. New data now included in the revised manuscript provide strong support for human PMN-derived IL1-beta as a driver of epithelial cell shedding in STM-infected HIOs, consistent with known differences in local inflammation between human and mouse infection, and this is the focus of the current study. Our data did not support a significant role for human neutrophils in controlling luminal bacterial numbers, but instead the primary human PMNs robustly stimulated epithelial cell responses that led to decreased intraepithelial bacteria. Several recent studies have suggested that caspase-1 is not a critical inflammasome component during STM infection of IEC, which instead use non-canonical inflammasomes, including caspases-4 and -11. Our data point to a human neutrophil-intrinsic function for caspase-1 and IL1-beta that contributes to the inflammatory tone of the intestinal milieu early in STM infection.

      2. Point-by-point description of the revisions

      Reviewer #1

      Major comments:

      Some important links are missing to fully support the mechanistic model proposed:

      1- PMN activity

      The authors may strengthen their evidence of PMN activities presented in lines 135 to 143 and in Fig.S2 and S3. In particular, the authors claim that PMNs form NETs in PMN-HIOs but the evidence displayed is limited. In fact, Fig S2 shows the same condition and same staining as Fig 1B but the MPO-positive structures are different. Clarification in the text or the figure would be welcome. Besides, as the authors insist on the relevance of NETs in the discussion, it seems that a clear demonstration and characterization of these structures in the PMN-HIO model would highly benefit the manuscript.

      While we commented on NETs in our original manuscript, our conclusions do not rely on the presence or absence of NETs. We have therefore removed the NET data and the reference to NETs. While NETs are potentially interesting in the context of intestinal infection, we understand the reviewer's concern about NETs and anticipate that a more quantitative characterization of NETs may be challenging given the structure and variability of the PMN-HIOs.

      Regarding the analyses of the culture supernatants (Fig.S3), only 3 out of the 5 displayed datasets are commented on in the text. The data obtained for BD2 and N-Gal should be either commented or removed from the figure. The author further suggests that Elafin expression in presence of PMN may restrict PMNs' ability to kill Salmonella. Repeating the experiment displayed in Fig S1 in the presence of Elafin as well as in the presence of the supernatant extracted from HIOs and PMN-HIOs would clarify the potential inhibition of PMN killing capacity in the PMN-HIO model.

      We now include a sentence on the antimicrobials BD2 and N-GAL in the text (lines 135-136). Elafin is one of many molecules that could potentially affect the ability of PMNs to kill Salmonella. We repeated the experiments in S3 Fig with recombinant Elafin. There was a very weak effect on killing in the presence of Elafin; however, Elafin can also kill Salmonella directly, complicating interpretation of these experiments. We have now added a sentence in the Discussion to speculate that Elafin is one example of how the epithelium may inhibit the ability of PMNs to kill (lines 366-372). These data are not central to our main conclusions and are only intended to provide context to the reader about possible explanations for why PMNs can kill Salmonella directly, but do not significantly alter total bacterial numbers in the HIO model.

      The author proposed that infected and uninfected cells are extracted from the epithelium due to PMN activation, suggesting that Salmonella infection of epithelial cells is only indirectly involved in cell shedding. This is an interesting hypothesis that could be tested by measuring cell shedding in a non-infected but PMN-activated (for instance with PMA) PMN-HIO model. This would clarify further the role of PMN in controlling epithelial response to the infection.

      We tested this possibility by microinjecting LPS into the lumen of PMN-HIOs (S6 Fig). There was significantly less TUNEL+ signal in LPS-injected PMN-HIOs compared to STM-infected PMN-HIOs, suggesting that active Salmonella infection is required for shedding of both infected and uninfected cells in the presence of PMNs.

      2- Specificity of RNA-seq profiling:

      The authors analyzed the transcriptomic profiling of PMN-HIOs and HIOs infected or not. While these experiments bring to light an interesting difference in inflammasome/cell death transcriptomic programs at the scale of the co-culture model, it is not possible to conclude from which cell type these transcriptomic shifts emerge. To clarify this, the authors stain the co-culture for ASC and observe that ASC-positive cells are PMNs. They conclude that PMNs are most likely the primary site of caspase-1 dependent production of IL1. While their model is theoretically consistent, more direct proofs are necessary to conclude on the cell-type specific transcriptomic program during infection of PMN-HIO and could be obtained by FACS sorting of the cells prior to RNA-seq, for instance using MPO to detect PMNs and E-cadherin to detect epithelial cells.

      We now provide evidence that pretreating PMNs with an irreversible Caspase-1 inhibitor before co-culturing with STM-infected HIOs prevented accumulation of luminal TUNEL+ cells (Fig 6B,C). Additionally, IL-1β treatment in the absence of PMNs recapitulated the cell death phenotype of the infected PMN-HIOs (Fig 6D,E) suggesting Caspase-1 activity in PMNs and IL-1β production are necessary for epithelial cell death in the PMN-HIOs.

      3- Roles of cytokine

      After showing an increased expression/release of IL1 and IL1RA in infected PMN-HIOs, the authors move on to testing the role of caspases on cell shedding. Yet, they do not test the impact of IL1 and IL1RA on cell shedding. As, according to their proposed model, IL1 is acting upstream of caspase-1 to promote cell shedding, testing cell shedding in infected PMN-HIOs in the presence of an IL1 inhibitor would clarify that link. The author also proposed that the decrease of IL33 in PMN-HIOs compared to HIOs could be due to PMN processing, which would give an additional role to PMNs in controlling the epithelial response to infection. In the context of this manuscript, it would be highly relevant to test this hypothesis by measuring the rate of cleaved IL-33.

      We now provide data to address these questions about IL-1 signaling. HIOs were microinjected with recombinant IL-1β during STM infection and PMN-HIOs were also treated with IL1RA during STM infection. Cell shedding was measured under these conditions in Fig. 6D-F. Cell shedding was dependent on IL-1 signaling and the model has been updated to reflect this.

      We also concentrated supernatants from STM-infected HIOs and PMN-HIOs, probed for cleaved IL33 via western blot and did see some cleavage. However, without being able to block this process it is not possible to conclude what role cleaved IL33 has during infection in the PMN-HIO and IL-1β seems to be sufficient to drive the cell shedding phenotype. Since the status of IL33 is not central to our conclusions, we have removed these data from the manuscript.

      4- Roles of caspase

      The interpretations of the role of Caspases in restricting bacterial burden are unclear and should be revised (see also minor comment). It appears that both Caspase-1 and Caspase-3 are necessary for efficient cell shedding (Fig4B), Caspase-1 (but not Caspase-3) decreases intraluminal bacterial burden (Fig4C) and Caspase-3 (but not Caspase-1) decreases epithelium-associated bacteria (Fig4D). To reconcile these observations with the hypothesis that cell shedding is responsible for the decrease of intraluminal and epithelium-associated bacterial burden, one may propose that caspase-3 (but not caspase-1) induces cell shedding of mainly non-infected cells (possibly bacteria-associated) and caspase-1 (but not caspase-3) induces cell shedding of infected cells. This could be tested by measuring the % of infected extruded cells upon caspase inhibitor treatments. In addition, these data do not support the proposal that Caspase-3 activation happens downstream of Caspase-1, as suggested by the authors in their abstract figure.

      It is difficult to accurately quantify the percent infected cells that are extruded since both infected and uninfected cells are extruded into a luminal space full of bacteria, which may associate with uninfected cells post-extrusion. However, we did observe cells positive for cleaved Caspase-3 when HIOs were treated with IL-1β leading us to infer that Caspase-1 mediated cytokine signaling through IL1R can trigger downstream Caspase-3 activation (Fig. 6G). We have expanded the Discussion to talk about differing roles of Caspases on bacterial burden and association with the epithelium (lines 374-397).

      Minor comments:

      The majority of the points listed below can be addressed with further analyses of pre-acquired data sets:

      Fig1E/1F/4D: each green dot is not likely to be an individual bacterium but rather a cluster of bacteria (based on their size). So the y-axis in Fig 1E and Fig4D should not be #STM.

      Y-axis labels have been changed to #STM objects.

      Fig2A: Variations in organoid size and epithelial thickness can be observed between figures. In particular, in Fig 2A, the HIO seems much younger than the other ones displayed in the manuscript.

      There is considerable natural variability between HIOs and between batches, a phenomenon observed by many HIO researchers (Hofer et al. Nature Reviews Materials 2021). HIOs were all treated the same way prior to infection, and based on our extensive observations, epithelial thickness does not correlate significantly with a particular experimental condition, as we now show in S10 Fig.

      Line 176 to 178, the authors mentioned the TUNEL+ cells in the mesenchyme but rule out the possibility that this phenotype could be infection or PMN-dependent because it is observed in the different conditions. As the picture displayed in Fig2A suggests high differences in the number of TUNEL+ cells in the mesenchyme under the 4 tested conditions, the authors should still quantify this phenomenon (possibly in the supplementary).

      This is likely an artifact of culturing and not due to the infection or PMNs. There is variability between HIO batches in the amount of TUNEL signal in mesenchymal cells (for example HIOs in Fig 4A and 5A have very low or no TUNEL positivity in the mesenchyme).

      "DAPI" should be written in blue.

      This has been corrected.

      Fig2C: Could the authors comment on the % of E-cadherin cells that are also TUNEL+? Is it 100%?

      On average about 75% of TUNEL+ cells are E-cadherin+. We think that this may be an underestimate because E-cadherin staining intensity decreases in many cells after shedding. This is commented on in the text (lines 178-179).

      Fig 2D: The point made on lines 182 to 186 that HIOs contain TUNEL + cells retained in the epithelial lining in the absence of PMNs is not very strongly supported by Fig 2D. Quantification of the number of intraepithelial TUNEL+ cells in the 4 compared conditions would make a more solid case.

      We quantified TUNEL intensity in epithelial cells retained in the monolayer (S7 Fig). We do note that there is some variability in this phenotype that correlates with different batches of HIOs.

      Fig2E: This experiment should be completed with a quantification of the percentage of TUNEL+ cells that are also cleaved caspase3-positive. The data, as currently displayed, do not prove that the cells negative for cleaved caspase 3 are apoptotic cells and thus do not support the sentence "suggesting that multiple forms of cell death were occurring in the PMN-HIO" (line 194).

      Cells negative for cleaved Caspase-3 that are TUNEL+ may be undergoing some other form of cell death that is not Caspase-3 dependent, such as necrosis. This possibility is consistent with the decreased TUNEL signal observed upon inhibition of Caspase-4 (Fig 5A,B). However, we have reworded our conclusion to identify more clearly what the data indicate, and where we are drawing inferences.

      Fig3A: "IL1RN" should be changed for "IL1RA (IL1RN)" for consistency with Fig 3B.

      The heatmap shows gene expression data so IL1RN is more consistent with the gene nomenclature. However, we have added an asterisk to the label on the heatmap, along with a sentence in the figure legend to elucidate.

      Fig 4C: The authors should provide the percentage of infected cells rather than the number of bacteria per cell (this information can be included in supplementary).

      Percent infected cells has been moved to Fig 4C and the number of bacteria per cell has been moved to Fig 4D.

      FigS2: The different thicknesses of the epithelial layer observed between PBS and STM panels suggest a difference in scale. This may be double-checked by the authors.

      The images are scaled similarly – as noted earlier (S10 Fig), there is considerable natural variability between HIOs that is not correlated with any experimental condition in this study.

      Line 197-199, the authors claimed that uninfected cells may be observed in the cell lumen. This seems hard to observe/conclude at this resolution. The authors may show a non-infected cell at higher magnification.

      We have added higher magnification images; uninfected cells are indicated with white arrows in S8 Fig.

      Discussion: Some important points should be added to the discussion. In particular, what is the fate of intracellular salmonellae after cell shedding? Can the bacterium survive cell apoptosis and burst out of the cell to re-infect the epithelium or be transferred to phagocytic cells during the clearance of intraluminal apoptotic cells? Previous studies showed that cytosolic hyper-replication could fuel cell shedding. The importance of bacterial load in PMN-induced cell shedding could be discussed.

      We have expanded the discussion to elaborate on what may happen to shed cells. One useful feature of the HIOs is that the enclosed lumen allows us to capture the cells and fully measure the extent of cell shedding. In the intestine, however, where there is flow, these cells would be washed away, which could help to reduce bacterial load. This point is now made in lines 386-388 of the discussion.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Major concerns

      1) The authors show that only ~5% of the neutrophils have migrated to the lumen, which is a barely noticeable increase compared to PBS treated organoids. Does this reflect that the mucosal layer of the organoids might not produce neutrophil chemoattractants and that neutrophil recruitment during Salmonella infection is a bystander effect from a different cell type?

      This number indicates that PMNs are ~5% of total cells in the PMN-HIO (including epithelial and mesenchymal cells) during Salmonella infection (not that only 5% of PMNs added migrated). Moreover, PMNs were added to a well containing multiple HIOs. We also show that HIOs do produce neutrophil chemoattractants during infection (S1 Fig).

      2) How quickly are neutrophils recruited to the HIOs? The authors show one time point of 8 hours. Related to the relatively low number of neutrophils seen in their HIOs, is this perhaps a result of the time point they chose? Will they see more neutrophils recruited if they go longer?

      It is likely that 5% of total cells in the PMN-HIO represents a significant recruitment of PMNs, and our data clearly indicate a marked effect on the infected epithelium. PMNs can cause substantial tissue damage, and their recruitment and activation is known to be tightly regulated. Due to the short-lived nature of human PMNs it would be difficult to extend this experiment to later timepoints. We have experimentally characterized PMN migration at 24h and by that time, most of the PMNs that we observe are non-viable, thus we focused our studies earlier.

      3) The authors show that PMNs did not kill STM in their organoids, but they do in pure culture. Is this simply because of the low levels of neutrophils present in their HIOs, which would result in lower concentrations of antimicrobials being produced in the HIO lumen? If the authors are able to get higher levels of neutrophils in their HIOs, would they see increased bacterial killing?

      Neutrophils have both inflammatory signaling and microbicidal functions. For example, Cho, et al (PLoS Pathogens 2012) find that neutrophil-derived IL-1 beta is sufficient to support abscess formation in the innate immune response to Staphylococcus aureus soft tissue infection. Similarly, a recent study showed that activation of neutrophils by keratinocyte defensins in a S. aureus skin infection led to neutrophil IL1 beta and CXCL2 release that amplified antibacterial defenses (Dong, et al Immunity 2022). Moreover, in the native environment of the gut with extensive microbiome colonization, direct neutrophil microbicidal activity might be less effective against infection than signaling. Recruitment of higher levels of neutrophils in vivo or in the HIO might exacerbate damage of the epithelial barrier. In the discussion, we speculate there may be proteins, like Elafin, that are upregulated during infection and inhibit some neutrophil functions as a trade-off to control host tissue damage. We reason that our data strongly support an inflammatory signaling role for neutrophils to promote innate immune responses of the intestinal epithelium.

      4) Related to the above point, if the authors treat their HIOs with known neutrophil chemoattractants, can they increase the number of neutrophils that migrate into their organoids?

      High levels of chemoattractants are already being produced in the HIO in response to infection (S1 Fig). The most effective number of neutrophils in the context of intestinal infection may not be the highest number, given that neutrophils can cause tissue damage. Since we see a marked phenotype with the neutrophils that are recruited, we propose that this PMN-HIO model reveals important inflammatory signaling roles for PMNs to promote intestinal epithelial immune function.

      5) The authors speculate that Salmonella may "employ specific mechanisms to overcome PMN effector functions in the HIO luminal environment". Are any such mechanisms known? If so, the authors could test this hypothesis by repeating these experiments with Salmonella mutants in which these mechanisms are ablated. In this case, they should see increased killing of Salmonella by PMNs in the HIO lumen.

      The focus of this study was to test how PMNs contribute to the host response against wildtype Salmonella. In the PMN-HIO model, we find that neutrophils direct a robust epithelial cell extrusion response and impact intracellular bacterial numbers, and that Salmonella luminal colonization is not affected by PMNs. Thus, our data point to an important inflammatory role for neutrophils in the infected intestine. Indeed, reliance on direct bactericidal mechanisms in the intestinal lumen, which in vivo would be colonized with the microbiota, might be a losing strategy for neutrophils, which would be hugely outnumbered.

      6) Furthermore, there is no information on the activation status of the neutrophils. How does the surface expression of CD16, CD62L, CD66 and CD11b look between the migrated and non-migrated and between infected and uninfected controls? Did the neutrophils de-granulate? Are they CD63+ or are the high levels of NGAL and S100 proteins an effect of lysis? The authors should also be careful in claiming that there is NETosis as the image in the supplement looks more like an artifact than actual NETs.

      Our new findings suggest that IL-1 production by PMNs is the biggest factor in driving the cell death phenotype. We have also added a figure with CD63 staining. We were able to visualize some localization of CD63 to the cell surface of PMNs, consistent with degranulation (S4 Fig).

      7) Why does ASC translocate to the nucleus? Is the IL-1b cleavage mediated through Caspase-1 or Caspase-11? The neutrophils stained positive in the lumen appear to be intact, does this mean that pyroptosis does not occur, or does the IL-1b come from cells that did not migrate through the mucosal membrane? Staining for IL-1 and the different caspases might help resolve this question.

      ASC does not appear to be translocating to the nucleus. In Fig 3D the green signal (ASC) is primarily excluded from the DAPI-stained area. In this human model, Caspase-11 is not present, and inhibition of Caspase-1 is sufficient to block the cell shedding phenotype (Fig. 5A,B and Fig. 6B,C). We are unable to distinguish whether IL-1 is being produced by intact PMNs or PMNs that are undergoing pyroptosis. Unfortunately, there are not suitable antibodies for fixed immunofluorescence staining for cleaved Caspase-1, and as a secreted protein, IL-1 beta likely will not remain localized with the producer cell.

      8) The authors comment that there is substantial TUNEL staining in the mesenchyme independent of STM or PMNs, however, there is no explanation for why this happens. Does this have any downstream effects on the neutrophils that do not migrate towards the lumen?

      TUNEL positivity in the mesenchyme is likely an artifact of culturing and we have noted this in the text (line 169-172). The extent of TUNEL+ mesenchymal cells appears to be dependent on the batch of HIOs as not all HIOs exhibit this phenotype (for example Figs 4A and 6B). In contrast, the extent of TUNEL+ luminal cells is significantly dependent on the presence of PMNs and Salmonella.

      Minor comments

      1) The authors should remove that MPO is neutrophil-specific, monocytes are known to have higher MPO expression than neutrophils.

      In this controlled co-culture system there are no monocytes, therefore we have modified our text to indicate that MPO is used as a neutrophil marker in the PMN-HIOs (line 161).

      2) If the authors performed flow cytometry as they say, they should provide the flow plots and the gating strategy they used in the supplement.

      Representative flow plots for the data presented in Fig 1A are now included in S2 Fig. The data shown in Figs. 1A and S2 Fig are not gated.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Major Comments

      1. Overall, the study is convincing and well conducted. This reviewer found it surprising that the PMNs did not alter the total levels of STM in the HIOs as neutrophils are expected to control the infection. Can the authors elaborate on what happens to STM in the lumen if the intraepithelial numbers are reduced? It would be convincing if the authors could extend the infection timeline to see if the neutrophils are capable of killing luminal STM.

      One of the limitations of the HIO model is the lack of flow in the lumen. It is likely that shed cells would be removed from the body following extrusion in vivo. In the HIOs, since the cells are trapped in the lumen, Salmonella could then reinvade and so this phenotype might be even stronger in a model that incorporates flow. We have added this point to the discussion (lines 387-390). Due to the short-lived nature of PMNs, it is difficult to extend the infection beyond 8h. While in vitro experiments with just neutrophils and STM as we and others have performed might set the expectation that neutrophils would alter luminal bacterial levels, there is little to no direct evidence that neutrophil bactericidal activity is critical in the context of the intestinal environment (vs. releasing ROS or inflammatory signals that may have complex indirect effects). Indeed, an advantage of the HIO model is that we are able to test the function of neutrophils in a multi-component system, but one that is still sufficiently simplified that we can do some mechanistic analysis.

      2. It would be powerful to conduct the caspase inhibition on neutrophils prior to HIO co-culturing to convincingly show that caspase inhibition affects neutrophils, which in turn affect the epithelium, disrupting the epithelial load of STM.

      We appreciated this suggestion. We pretreated the PMNs with a Caspase-1 inhibitor for 1h prior to co-culture with infected HIOs. We found that this was sufficient to block TUNEL cell accumulation in the lumen of infected PMN-HIOs. These results are now presented in Fig 6B,C.

      3. While other caspases are well-established to be involved in Salmonella-related cell death and epithelial shedding, why did the authors pick caspase 3 but not caspase 4/5 to show activation in Fig 2?

      We have now also tested the role of Caspase-4 on cell shedding using z-LEVD-fmk inhibitor. Consistent with prior published studies, we found that Caspase-4 inhibition reduced the accumulation of TUNEL-positive cells in the PMN-HIO lumen. These results are presented in Fig 5. There are no detectable levels of Caspase-5 in the HIOs (S9 Fig).

      Minor comments

      Fig 1C It is not clear how the total bacterial burden was determined. Please include details such as the timepoint and sufficient details of the technique both in the results section and the legend.

      These details have been added in the figure legend (line 605-607). Briefly, HIOs were washed with PBS and homogenized in PBS at 8hpi. CFU/HIO were enumerated by serial dilution and plating on LB agar.

      • Fig S2. Authors claim that the PMNs form NETs in the lumen, however, the marker used in the immunostaining is MPO. Although NETting is seen in the images, MPO staining is not sufficient to claim these are NETs. Additional staining is required to show if the neutrophils in the lumen are intact or formed NETs.

      As noted in response to Reviewer #1, although we commented on NETs in our original manuscript, our conclusions do not rely on the presence or absence of NETs and our new data implicate PMN IL-1 as necessary and sufficient for the cell shedding phenotype. We have therefore removed the NET data and the reference to NETs. While NETs are potentially interesting in the context of intestinal infection, we understand the reviewer's concern about NETs and anticipate that a more quantitative characterization of NETs may be challenging given the structure and variability of the PMN-HIOs.

    1. ABSTRACT

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.66), and the reviews have been published under the same license. These are as follows.

      Reviewer 1. Linzhou Li

      Are the data and metadata consistent with relevant minimum information or reporting standards? No. Geographic location (country and/or sea, region, latitude and longitude) is missing, as well as environmental context.

      Is there sufficient data validation and statistical analyses of data quality? No. The genome size and gene number of Dendrobium hybrid cultivar ‘Emma White’ differ greatly from the published Dendrobium genomes (e.g. Zhang et al. Scientific Reports 2016, Zhang et al. Horticulture Research 2021, Han et al. Genome Biology and Evolution 2020...). Specifically, the authors assembled a smaller genome and predicted a larger number of genes compared with the previous studies. Therefore, I strongly suspect that the assembled genome is incomplete and fragmented, resulting in more fragmented genes.

      Is the validation suitable for this type of data? No. There's not enough raw data (~24Gb) to assemble a 600Mb (or ~1.2Gb from the previous study) genome. I highly recommend the authors get more raw data and do a genome survey.
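
      As a rough illustration of the arithmetic behind this comment, average sequencing depth is the total number of sequenced bases divided by the genome size. The sketch below assumes that "~24Gb" means about 24 gigabases of raw reads; the function and variable names are hypothetical, and the depth needed for a usable assembly depends on the read type and assembler, which this sketch does not address.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Return average depth of coverage: sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumption: "~24Gb" of raw data is read as 24e9 sequenced bases.
RAW_BASES = 24e9

# The two genome sizes mentioned in the review (assembled vs. previously reported).
CANDIDATE_GENOME_SIZES = {
    "600 Mb (this assembly)": 600e6,
    "1.2 Gb (previous study)": 1.2e9,
}

if __name__ == "__main__":
    for label, size in CANDIDATE_GENOME_SIZES.items():
        depth = sequencing_depth(RAW_BASES, size)
        print(f"{label}: about {depth:.0f}x average depth")
```

      Under these assumptions the same data set gives about half the depth if the true genome size is closer to the larger estimate, which is why a genome survey matters here.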

      Additional Comments: The Complete BUSCOs only account for 16.6% which is quite low. The authors explain that the large loss of BUSCOs is due to the fact that the mutant genome has a lot of specific sequences, but these genes are very conserved in plants and should not be easily mutated.

      Reviewer 2. Stephanie Chen

      Is the language of sufficient quality? No. Most of the manuscript is written in a sufficient quality, but there are certain parts that require revision to improve readability. Please see detailed comments on the Word document.

      Are all data available and do they match the descriptions in the paper? No. The SRA link is coming up as a permission error, but I assume it will be released once the paper is available. There is no information on where to access the annotation file.

      Is the data acquisition clear, complete and methodologically sound? No. The contiguity (635,396 contigs, N50 of 1,620 bp) and completeness (16.60 %) of the genome are quite low and this may limit its downstream uses. It would be good to incorporate some long reads or increased sequencing coverage to improve your genome. There are a number of chromosome-level Dendrobium genomes that are available (e.g. D. chrysotoxum and D. huoshanense) and scaffolding off these may be attempted to improve the assembly. Scaffolding from existing assemblies may be a good option if generating more sequencing reads is not feasible.
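
      For readers unfamiliar with the contiguity metric cited here, N50 is conventionally the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that definition with a toy input; it is not the tool the authors used (they report QUAST), and the function name is hypothetical.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are sorted longest first; N50 is the length of the contig at
    which the running total first reaches at least half of the assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    # Toy example, not data from the manuscript.
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_contigs))
```

      An N50 of 1,620 bp therefore means that half of the assembled bases sit in contigs of 1,620 bp or shorter, which is short relative to typical plant gene lengths.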

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. Some details on the DNA extraction and library preparation steps are missing. In the methods section, there are also missing details for multiple programs in terms of the version and parameters (e.g. BUSCO version and database used, QUAST version, AUGUSTUS version, details on adapter removal and trimming). It is mentioned 'similarity score and description of each gene was filtered out using in-house pipeline'. The script and details of the pipeline are not provided; please add a reference or details in the manuscript e.g. link to GitHub repository.

      Is there sufficient data validation and statistical analyses of data quality? No. The reporting and interpretation of BUSCO results ('BUSCO version 5.2.2 analysis reveals 913 (56.57%) single-copy orthologs doesn’t match with any data bases indicates the unique and possible uncharacterized sequences in mutant genome of Dendrobium hybrid cultivar') needs to be revisited. There needs to be additional validation of the gene annotation (e.g. BUSCO, comparison with existing Dendrobium annotations) and also some validation of the genome size (e.g. GenomeScope and comparison with reported flow cytometry measures).

      Is the validation suitable for this type of data? Yes. The type of validation in the manuscript (BUSCO) is suitable to assess genome completeness, but reporting and discussion of the results needs to be revised. Some additional validation is also needed (see box above).

      Additional Comments: In this manuscript, the authors provide a draft genome of a gamma-ray induced mutant of a Dendrobium hybrid cultivar using Illumina sequencing that will assist with future breeding efforts and studies. However, I am not convinced of the genome's usefulness in its current form. There are some methods that need to be described in more detail to be reproducible. Revisions will also help improve the readability of the manuscript. As page and line numbers are not provided on the manuscript, please find additional comments directly added to manuscript file attached.

      https://gigabyte-review.rivervalleytechnologies.com/download-api-file?ZmlsZV9wYXRoPXVwbG9hZHMvZ3gvRFIvMzA2L1Jldmlld19TQ185Njc2XzA1MDIyMl9HaWdhYnl0ZV9HYW1tYSBXR1MgZGF0YW5vdGUgKDEpLmRvY3g~

      Re-review: Thank you to the authors for addressing the previous comments on the manuscript. I generally find the revisions satisfactory, although have some follow up comments. The addition of details on the genetic origin of the Dendrobium ‘Emma White’ hybrid cultivar and requested details on bioinformatic tool versions/parameters have strengthened the manuscript. The authors have not followed up on the suggestion to improve the genome via scaffolding, but provide an explanation that existing chromosome-level assemblies/sequencing data of Dendrobium species are not suitable as they are not related to the hybrid cultivar the authors studied, implying that they are highly diverged and scaffolding would not meaningfully improve the genome. Given this information, I think the Dendrobium ‘Emma White’ hybrid cultivar genome can still be useful for orchid breeding efforts despite low contiguity and completeness. However, I do not agree with the author’s point of, “Third, we used low coverage genome analysis with short reads of gamma mutant Dendrobium hybrid cultivar, as it was the first case study and obtained SRA, genome assembly and TSA accessions from NCBI. The genome assemblies of Dendrobium species from earlier studies used both long reads and short reads in their study. Construction of scaffolding from such database species using our contigs may be skewed and shall give unreliable data based on above points mentioned. Hence, I opinioned that suggestion given by Reviwer 2 on scaffolding suggestion may not be correct point.” Even if different types of sequencing technologies were used in comparison to Emma White genome, the availability of a contiguous closely related reference genome would still be useful for reference-guided scaffolding of the draft genome as well as comparative analyses. Lines 107-109: Reorder sentence to make the order of the steps clear i.e. adapter removal and quality filtering before assembly with MaSuRCA.
      Also, on the MaSuRCA GitHub (https://github.com/alekseyzimin/masurca), it says “Avoid using third party tools to pre-process the Illumina data before providing it to MaSuRCA, unless you are absolutely sure you know exactly what the preprocessing tool does. Do not do any trimming, cleaning or error correction. This will likely deteriorate the assembly.” Did the authors find that the pre-processing meaningfully improved the quality of the assembly, compared to if the raw reads were input straight into the assembler? Please justify the preprocessing of reads. Suggest to reword lines 137-139 “BUSCO version 5.2.2 analysis reveals 913 (56.57%) single-copy orthologs doesn’t match with any data bases indicates the impact from evolutionary development of hybrid cultivars and influence of gamma radiation. It is because, the genome of ‘Emma White’ hybrid cultivar of Dendrobium derived from five unique different species is complex genome and continuously hybridized repeatedly 11 times over a period of 68 years with selection process for economic trait improvement” to make the explanation clearer and also to include the number and/or percentage of complete BUSCOs. This was flagged in the previous comments, but not fully resolved and would benefit from revisiting the interpretation of BUSCO results. There are a large number of missing BUSCOs in your assembly, likely related to low contiguity (as well as radiation which is mentioned). Can you discuss if/how this may be a limitation for using this genome in further studies? You suggest that the BUSCOs are not found in the assembly due to many rounds of trait selection and radiation. It is possible that some of the BUSCOs are indeed missing from the particular plant sequenced, but how can you be certain that this is due to the breeding history and radiation applied as implied in the text, and not low genome contiguity?
      Some papers which characterised gamma irradiation-induced mutations in plants (DOIs: 10.1093/jrr/rraa059, 10.1186/s12864-019-6182-3, 10.1534/g3.119.400555) indicate that it is unlikely as many as 913 BUSCO genes have been affected. Even with stronger doses of radiation than used on the orchid, the number of mutations/genes affected is much lower. The genus name needs to be consistently italicised throughout the manuscript.

      Re-re-review: Thank you to the authors for addressing the previous comments on the manuscript. The authors have followed up on the suggestion to scaffold the genome by using the published Dendrobium huoshanense genome to scaffold their draft genome using RagTag. This is an appropriate tool to use and has improved the contiguity of the draft assembly which is good to see. In the methods, the version of RagTag is missing, as are the parameters used to run the program. Please also provide specification on the specific RagTag utilities used (correct, scaffold, patch and/or merge). The authors have added genome statistics for two other orchids and the scaffolded assembly in Table 1, however, have not added BUSCO results for their scaffolded assembly in Table 2. Also, can the authors provide a comment on if the low BUSCO values may be related to the fragmented assembly as brought up in the previous round of review? It will be interesting to see if BUSCO has improved with the scaffolding. BUSCO results for the other two species, D. catenatum and D. huoshanense, would also be a good point of comparison and this is relatively simple and quick to add. The authors could consider concatenating Table 1 and 2 in this case. The draft assembly has improved, and the authors should report numbers on the final version of the assembly presented in the paper (i.e. the scaffolded assembly) in terms of the analysis they have run. In the results and discussion section, it appears some of the statistics (e.g. 96,529 genes, 216,232 SSRs) still refer to the first draft assembly. The authors have clarified that raw reads were used as input into MaSuRCA (line 111) and have now included the necessary detail for the input and parameterisation of the program. Line 157-159: “Taxonomical analysis of mutant Dendrobium at raw sequence data also revealed limited synteny with its closest Dendrobium catenatum species at below 9% and genetically heterogeneous with outcrossing nature”.
      Details of how this analysis was done are missing from the methods. It may be more appropriate to perform synteny analysis at the genome level and compare the published D. catenatum genome with the scaffolded Dendrobium hybrid genome.

      Editors comment: Additional Editorial Board assessment and feedback was received during the review process.

    1. Reviewer #1 (Public Review):

      The present study aims to define the main immune cell subsets found in the hemolymph of the white shrimp, P. vannamei. This is significant because this species is heavily farmed around the world to meet the demand of the human consumption market. Yet, farmed shrimp suffer from infectious diseases and therefore we need to understand how their immune system works to design strategies that decrease infection losses.

      Classification of crustacean (and other invertebrate) hemocytes is difficult due to the lack of antibodies to use traditional flow cytometry approaches. Furthermore, hemocyte purification is not easy: cells die and clump, again precluding flow cytometry studies. Thus, the majority of what we know about shrimp hemocytes is based on morphological classification. This study contributes significantly to advancing our knowledge of shrimp immunobiology by defining hemocyte subsets based on their transcriptional profiles.

      Another strength of the paper is that some functional in vivo assays (phagocytosis) are presented in an attempt to validate the single-cell data. The authors try to frame their question from a more evolutionary angle, such as whether the macrophage-like cell is the evolutionary precursor of human macrophages. I think that this question is not really answerable because the evolution of innate immune systems may have diverged in many branches of the metazoan tree of life. The authors, however, identify gene markers that are conserved in macrophages from shrimp and humans and that is a fair conclusion. There are some methodological caveats to the study and the manuscript needs to be heavily edited to improve language as well as to increase the depth of the interpretation.

      In summary, there are interesting findings in this manuscript but the manuscript needs to be significantly improved so that its quality and impact are elevated.

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Dr Riley and colleagues reports a novel link between the molecular clock operative in skeletal muscle and titin mRNA, encoding an essential regulator of sarcomere length and muscular strength. Surprisingly, this clock-mediated regulation of titin occurs at the level of splicing, as demonstrated by SDS-VAGE analyses of skeletal muscle from muscle-specific Bmal1KO mice compared to their Bmal1wt counterparts. Concomitant with the switch of the predominant isoform of titin, skeletal muscle of muscle-specific Bmal1KO mice exhibited irregular sarcomere length. Moreover, the authors show that this shift of titin splicing is causal for such sarcomere length irregularity and for altered sarcomere length in muscle from the mice with compromised clock function. Importantly, the authors provide compelling evidence that Rbm20, encoding an RNA-binding protein that mediates splicing of titin, is cooperatively regulated by the Bmal1-Clock heterodimer and MyoD, via an enhancer element in intron 1 of Rbm20, thus identifying Rbm20 as a novel direct clock-regulated gene in skeletal muscle. Strikingly, rescue of Rbm20 in muscle-specific Bmal1KO animals results in rescue of the titin splicing pattern and protein size, suggesting that Rbm20 mediates the regulatory effect of Bmal1 on titin splicing and represents a mechanistic link between the clock and a regulator of sarcomere length and regularity.

      We thank reviewer 1 for the very kind comments. We agree that the circadian regulation of titin in any capacity is surprising. We are excited about the implications of our work for cardiac muscle and its therapeutic potential in human skeletal muscle.

      Reviewer #2 (Public Review):

      In this work the authors investigated whether deleting the BMAL1 gene, an integral component of the cellular clock that drives the circadian rhythms of cells, affects the giant protein titin. They report that deleting BMAL1 in skeletal muscle alters the splicing of titin and that this might underlie an increase in sarcomere length dispersion. They show that the effect is through the titin splicing factor RBM20. This work has high novelty and has the potential to add to our understanding of muscle physiology. It is unclear whether splicing of skeletal muscle titin indeed undergoes a circadian rhythm. This could be easily checked using protein gels or RNA seq in muscle samples collected at different times of the day.

      We appreciate the question and recognize that our original manuscript did not clearly outline that the circadian clock regulates both rhythmic and non-rhythmic gene expression. In this study, the target of the muscle clock is expression of Rbm20 mRNA which is not a rhythmically expressed gene in muscle. This has now been addressed in the manuscript.

      Based on the estimated turnover and incorporation rates of titin (Cadar et al., 2014), we do not believe that skeletal muscle titin splicing undergoes a circadian rhythm. However, we believe our data highlight the growing recognition of the molecular clock in regulating non-rhythmic processes. We have added data from a chronic phase advance model of circadian disruption in wildtype mice and found that disrupted circadian rhythms are sufficient to change Rbm20 expression in skeletal muscle (Figure 5).

      This work would be more convincing if the sarcomere length dispersion was investigated in greater detail. Showing this in one muscle type only (TA), in muscles fixed at one length only, and not showing sarcomere length dispersion in the rescue experiment of Figure 6, is rather limited.

      We agree that an analysis of sarcomere length dispersion across joint angles would be interesting, but we think it is beyond the scope of this study. As noted above, the premise of this study emerged from our early work, in which we found that skeletal muscle from two different genetic mouse models of circadian disruption, Bmal1 KO mice as well as Clock mutant mice, exhibits decreased maximum specific force with significant disruptions to sarcomere structure (Andrews et al., PNAS, 107 (44) 19090-19095 2010). The primary focus of this study was to address the mechanistic link between the muscle circadian clock and its transcriptional targets, with a focus on sarcomere structure, and our first clue was the expression of titin isoforms. We included analysis of sarcomere length as an outcome measure because it is a fundamental feature of skeletal muscle, it has links to mechanical function, and it is a structure that can be modified by titin spliceforms.

      A small increase in sarcomere length variation as suggested in Figure 2 is unlikely to have a great functional consequence. If it were, how can muscles that express naturally long titin isoforms (soleus, EDL, diaphragm, etc), function well?

      We did not intend to suggest that we see an increase in sarcomere length in Figure 2 and have clarified the figure and text accordingly. The change we see is in the variability of sarcomere length; we do not see any change in the average sarcomere length. The topic of titin spliceform specialization and its contribution to sarcomere structure and function across different muscle groups (soleus vs. EDL vs. diaphragm) is a really interesting question but beyond the scope of this study.

      Reviewer #3 (Public Review):

      This manuscript uses an inducible and skeletal muscle specific Bmal1 knockout mouse model (iMSBmal1-/-) that was published previously by the same group. In this study, they utilized the same mouse model and further investigated the effect of the core molecular clock gene Bmal1 on isoform switching of the giant sarcomeric protein titin and the sarcomere length change resulting from titin isoform switching. Lance A. Riley et al. found that iMSBmal1-/- mouse TA muscle expressed a longer titin due to additional exon inclusion in Ttn mRNA compared to iMSBmal1+/+ mice. They observed that sarcomere length did not significantly change but was more variable in iMSBmal1-/- muscle compared to iMSBmal1+/+ muscle. In addition, they identified significant exon inclusion in the proximal Ig region, so they measured the length of the proximal Ig domain and confirmed that it was significantly longer in iMSBmal1-/- muscle. Subsequently, they experimentally generated a shorter titin in C2C12 myotubes and observed that the shorter titin led to a shorter sarcomere length. Since RBM20 is a major regulator of Ttn splicing, they determined the RBM20 expression level and found that it was significantly lower in iMSBmal1-/- muscle. The reduced RBM20 expression was regulated by the molecular clock controlled transcription factor MyoD1. By performing a rescue experiment in vivo, the authors found that rescue of RBM20 in iMSBmal1-/- TA muscle restored titin isoform expression; however, they did not measure whether sarcomere length was restored. These data provide new information that the molecular cascades in the circadian clock mechanism regulate RBM20 expression and downstream titin isoform switching and sarcomere length change. Although the conclusion of this manuscript is mostly supported by the data, some aspects of experimental design and data analysis need to be clarified and extended.

      Strengths:

      This paper links the circadian rhythms to skeletal muscle structure and function through a new molecular cascade: the core clock component Bmal1-transcription factor MyoD1-RBM20 expression-titin isoform switching-sarcomere length change.

      Utilization of muscle specific Bmal1 knockout mice could rule out confounding factors from the molecular clock in other cell types.

      The authors performed the RNA sequencing and label free LC-MS analyses to determine the exon inclusion and exclusion through a side-by-side comparison which is a new approach to identify individual alternative spliced exons via both mRNA level and protein level.

      We agree that the side-by-side analysis of RNA-seq and LC-MS data is novel and provides a foundation for others wanting to study both titin mRNA and protein. In this version, we have expanded this work to include samples from our Rbm20 rescue model (Figure 6). Similar to our approach in the muscle specific Bmal1 knockout model, these results confirm our RNA-seq results and indicate that LC-MS is a suitable method to measure titin protein isoforms. We note that while more work is needed to confirm the broad utility of the LC-MS approach, it may be a suitable alternative to RNA-seq for measuring region-specific, and possibly exon-specific, changes in titin isoform expression.

      Weaknesses:

      Both RBM20 expression and titin isoform expression vary in different skeletal muscles. The authors only detected their expression in TA muscle. It is not clear why the authors only chose TA muscle.

      The reviewer, like Reviewer 2, raises a good point about muscle specificity, as this is a significant challenge for research in the field of skeletal muscle. As we noted above, our primary focus was on the TA because our goal was to study the molecular links between the muscle circadian clock and titin expression, including analysis of a structural outcome, sarcomere length variability. This muscle is well suited for the combination of approaches employed. We recognize the limits of using a single muscle, but we note that the ChIPseq data that provided the initial clues that CLOCK and BMAL1 bind to a site within intron 1 of the Rbm20 gene came from gastrocnemius and not TA muscle samples. Our targeted ChIP-PCR confirms that CLOCK and BMAL1 bind to the same intron 1 location in TA muscle samples. In addition, we have included data from quadriceps and TA muscles in our chronic jet lag model, in which we use an environmental manipulation to disrupt the muscle clocks. We believe that the edits to the text and the inclusion of these data strengthen our findings and extend them to other muscles through circadian disruption and not only a genetic knockout model.

      The sarcomere length data are self-contradictory. The authors stated that sarcomere length was not significantly changed in muscle specific KO mice in Line 149, however, in Line 163, the measurements showed significantly longer in muscle specific KO muscle. The significance is also indicated in Figures 2C and 3B.

      We apologize for the miscommunication. The significance indicated in Figure 2C refers to a significant difference in the variability of sarcomere length, not a significant difference in sarcomere length itself. The significance indicated in Figure 3B marks both a slightly but significantly longer sarcomere length compared with control and a significant difference in sarcomere length variability. To make this distinction clear, we have changed the symbol for significantly different variability from * to # in both Figures 2C and 3B. We hope this clarifies our findings.
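      The distinction between a shift in mean sarcomere length and a change in its variability can be illustrated with a minimal Python sketch. The values below are hypothetical and chosen only for illustration; they are not measurements from this study, and the names `control`, `knockout`, and `summarize` are assumptions.

```python
from statistics import mean, stdev

# Hypothetical sarcomere lengths in micrometres (illustrative only,
# not data from the study).
control = [2.4, 2.5, 2.5, 2.5, 2.6]
knockout = [2.1, 2.3, 2.5, 2.7, 2.9]


def summarize(lengths):
    """Return mean, sample standard deviation, and coefficient of variation."""
    m = mean(lengths)
    sd = stdev(lengths)
    return {"mean": m, "sd": sd, "cv": sd / m}


if __name__ == "__main__":
    for label, group in (("control", control), ("knockout", knockout)):
        stats = summarize(group)
        print(f"{label}: mean={stats['mean']:.2f} sd={stats['sd']:.3f} cv={stats['cv']:.3f}")
```

      In this toy example both groups share the same mean while the second group is more dispersed, which is the kind of pattern a separate symbol for variability is meant to flag. A formal comparison of variances would need an appropriate statistical test, which this sketch does not perform.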

      Manipulating titin size using U7 snRNPs to change sarcomere length and overexpressing RBM20 to switch titin size are concepts that have already been proven. These data do not directly support the impact of muscle specific Bmal1 KO on ttn splicing and RBM20 expression.

      We agree that the use of U7 snRNPs does not directly support the impact of muscle specific Bmal1 KO on titin splicing and RBM20 expression; however, that was not the goal of this set of experiments. Several papers have recently indicated titin’s role as a sarcomeric ruler (Tonino 2017, Brynnel 2018), but none of them have investigated the proximal Ig domain that we identified as regulated by the circadian clock disruption. Because of this, we thought it necessary to show this region specifically contributes to sarcomere length using our cell culture model. Further, we think this point strengthens our study as it suggests that in the absence of a clock effect, altering the proximal Ig domain of titin directly alters sarcomere length adding to the growing evidence base that titin acts as a sarcomeric ruler. We have edited the text of the results and the discussion to clarify this point.

      There is no evidence to show if interrupted circadian rhythms in mice change RBM20 expression and ttn splicing, which is critical to validate the concept that circadian rhythms are linked to Ttn splicing through RBM20.

      We recognize this concern and have performed a new study in which we used a model of chronic jet lag in normal adult C57BL6 mice as a model to disrupt the muscle clock (Wolff, Duncan and Esser, JAP 2013). This new data has been added in Figure 5 and shows that by altering the lights on: lights off schedule every 4 days for 8 weeks, mimicking repeated jet lag, we disrupt Rbm20 expression in TA and gastrocnemius muscle (note, this is new data for both the muscle and clock fields). Concomitant with changes in clock gene expression we reported in 2013, we found that mRNA expression of Rbm20 is altered as well. These findings confirm that normal muscle clock disruption is sufficient to alter expression of Rbm20.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      This is already a full revision, not a revision plan. All points were carefully addressed. TMF

      July 28, 2022

      RE: Review Commons Refereed Preprint #RC-2022-01555

      Dear Dr. Fuchs,

      Thank you for sending your manuscript entitled "Dissecting the invasion of Galleria mellonella by Yersinia enterocolitica reveals metabolic adaptations and a role of a phage lysis cassette in insect killing" to Review Commons. We have now completed the peer review of the manuscript. Please find the full set of reports below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript Saenger et al. concentrate on the pathophysiological details of insect larvae infection by Yersinia enterocolitica. The authors studied the colonisation, proliferation, tissue invasion, and killing activity of the bacteria in Galleria mellonella larvae. Their study provides valuable evidence for the biological relevance of Tc toxins and a neighboring holin-endolysin cassette during establishment of Y. enterocolitica infection in Galleria mellonella larvae through the oral route. The findings of the authors provide important novel insights, that can be used for the development of Tc toxins as biopesticides.

      In general, this is a nice study. The data and the methods are presented well so that they can be reproduced, and the key conclusions are convincing.

      Unfortunately, the manuscript is sloppily written in some places, including grammatical and formatting errors. Citations regarding the structure and mechanism of action of Tc toxins are arbitrarily chosen, often taking the wrong ones and important aspects are left out. I highly recommend that the authors read the review of Roderer and Raunser 2019 that nicely describes and summarizes the molecular mechanism of Tc toxins.

      Answer: We have now improved the writing of the manuscript and corrected several errors and typos. In particular, the review by Roderer and Raunser, as well as other literature in the field, is now considered and cited in the text.

      The abstract ends with a speculation: "Suggesting that this dual lysis cassette is an example for a phage-related function that has been adapted for the release of a bacterial toxin" - this is likely true, but not proven in this work. What if it is used for the release of something else like extracellular DNA needed for biofilm formation (see https://doi.org/10.1038/ncomms11220)?

      Answer: This sentence was carefully written as a hypothesis strengthened by the data obtained in our study. Experimental evidence for this assumption is the strong correlation of toxin and HE cassette phenotypes of mutants (see abstract), the highly conserved localisation of the cassette within Tc loci of distinct bacterial genera (see discussion for literature), and the synchronous regulation of both the toxin and the lysis genes (manuscript in preparation). Moreover, strain W22703 is unable to form biofilms in contact with invertebrates (Spanier et al., AEM 2010). Therefore, and in accordance with the other reviewers, we would like to keep this statement in the text. However, to address this interesting point, we now mention the finding of Turnbull et al. in the discussion (see last paragraph).

      In addition to that, several outstanding issues must be addressed:

      1. Line 45 3-D structural analysis of the tripartite Tc suggests a 4:1:1 stoichiometry of the A, B and C subunits, with the A subunit forming a cage-like pentamer that associates with a tightly bound 1:1 sub-complex of B and C. This is wrong. The stoichiometry is 5:1:1 and the structure is not a cage. The statement was taken from citation 3. However, citation 3 should not be used, since the stoichiometry as well as the structure that was determined there is wrong. Use Landsberg et al. 2012 PNAS, Gatsogiannis et al. 2013 Nature instead.

      Answer: We apologize for misunderstanding the literature. Reference Lee et al. was removed here, and the two papers plus Meusch et al. (Nature, 2014) are now cited. The stoichiometry was corrected, “cage” was removed.

      "Few bacteria are known to successfully colonize and infect invertebrates" - needs a reference.

      Answer: This was modified to “Several bacteria…”, and we cite the recent paper by Weber and Fuchs (in press) that in Table 7g lists more than 40 bacterial species pathogenic towards insects.

      "Their oral insecticidal activity is comparable to that of the Bacillus thuringiensis- (Bt)- toxin" - reference missing.

      Answer: The reference is now cited (Bowen et al., Science 1998). Please see the last paragraph of the paper.

      "Type a, type b and type c" subunits is not usual for the literature. Please use TcA, TcB, TcC. A-, B-, and C-components should be abbreviated as TcA, TcB and TcC respectively in order to be in line with recent literature on the topic.

      Answer: This was corrected accordingly.

      Is TccC an ADP-ribosyltransferase or does it have a different biochemical activity?

      Answer: This is unknown with respect to the Tc of Y. enterocolitica. In the introduction, we now refer to P. luminescens and no longer attribute such a function to the TcC of Y. enterocolitica. In the abstract, we replaced “ADP-ribosylating” with “toxic”.

      "The toxic and highly variable carboxyl-terminus of TccC that has recently been demonstrated to ADP-ribosylate actin and Rho-GTPases" - this is only certain for TccC3 and TccC5 from P. luminescens. There are many such C-termini, called HVRs which have not had their activities determined yet, see here: https://doi.org/10.1371/journal.ppat.1009102

      Answer: We agree and cite this article. See also the response to comment 5 above.

      "is probably followed by receptor-mediated endocytosis" - more recent references exist for the receptor binding of Tc toxins.

      Answer: We added two references pointing to glycans as receptors of the Tc (line 52).

      "A pH decrease then triggers the injection of a translocation channel formed by the pentameric TcaA subunits into the endosomal vacuole, followed by the subsequent release of the BC subcomplex into the cytosol of the target cell" - this again is incorrect. Please read the above mentioned review and correct this passage accordingly.

      Answer: We agree. This phrase was rewritten to “The attachment of the Tc to the host cell membrane is either followed by receptor-mediated endocytosis or release of the ADP-ribosyltransferase into the target cell (Landsberg, 2011; Sheets, 2011; Meusch, 2014). In a pH-dependent manner, the TcA translocation channel is injected into the membrane of the host cell. Conformational changes then allow the toxic component to be released into the translocation channel of TcA and from there into the cytosol (Meusch, 2014; Roderer, 2019).” (Lines 51-56)

      What is meant by "environmental Yersinia species"?

      Answer: This was corrected to “…and in Y. mollaretii.”

      In the relevant W22703 pathogenicity island sequence (https://www.ncbi.nlm.nih.gov/nuccore/AJ920332) previously submitted by the same group, something odd is going on with the TcA component: it appears to be split into three polypeptides (tcaA, tcaB1, tcaB2). In the manuscript you state TcA is made up from only tcaA and tcaB. Could you please address this?

      Answer: Shotgun sequencing was performed 15 years ago, and mapping revealed a frameshift within tcaB that resulted in the split annotation of tcaB. Even if this frameshift is not the result of a sequencing error, it obviously does not result in Tc inactivation. As this frameshift was not identified in most other Tc-PAI of yersiniae, we assume our statement to be correct.

      "And their products were recently shown to act as a holin and an endolysin, respectively" - missing reference.

      Answer: The reference is now cited (Springer et al., JB 2018).

      "Its Tc proteins are produced at environmental temperatures, but silenced at 37°C." versus "Remarkably, HolY and ElyY lyse Y. enterocolitica at body temperature, but not at 15°C". Please address the issue that HolY/ElyY lyse the bacteria at temperatures where Tc proteins are not produced.

      Answer: In the absence of in vitro conditions activating the HE gene cassette, we used the pBAD system to artificially overexpress the two genes and showed cell lysis at 37°C, but not at 15°C (Springer et al., JB, 2018). This finding points to a lack of cell lysis as a prerequisite for Tc release and strengthens the hypothesis of a new secretion system, as now corroborated in the last paragraph of the discussion. To avoid confusing readers, the sentence was removed from the manuscript.

      "Nematodes, which are easily maintained in the laboratory without raising ethical issues, have successfully been used to identify virulence-related genes in a broad set of bacterial pathogens" - what is the relevance of this for the current manuscript?

      Answer: Invertebrates are introduced here as infection models. Nematodes are mentioned here for two reasons: yersiniae are nematocidal due to the Tc, and their immune system is less elaborated than that of G. mellonella, thus explaining its preferred use as insect model. We shortened the sentence by deleting the phrase in commas.

      Fig. 1C - no description is given for the labels 1-8.

      Answer: This is given below figures 1E-H. The labels are valid for all figure panels to ease reading.

      "The hemolymph of these cadavers was found full of Y. enterocolitica cells" - injected CFUs are provided here, but not final CFUs in the cadavers (although referred to in a later section). Please address this.

      Answer: These were preliminary experiments to identify the optimal infection dose. Hemolymph content was plated, but cell numbers in the hemolymph were not enumerated. This sentence therefore now reads: “…and the hemolymph of these cadavers contained Y. enterocolitica cells.” (lines 113-114).

      What is the inducing agent used for pACYC-tcaA and pACYC-HE? Why would "slight leakiness of the pBAD-promoter" make pBAD-tccC non-inducible? Were colonies taken from the cadavers to verify that the bacteria still contained these plasmids?

      Answer: Within pACYC, the genes tcaA and holY/elyY (HE) are under the control of their own promoters, as indicated in Table S2. In general, pACYC vectors are often and successfully used for complementation due to their medium copy number.

      This now reads “Due to the slight leakiness of the pBAD-promoter, arabinose was not added to further induce tccC transcription.” (lines 133-134).

      The presence of the plasmids in vivo was confirmed by periodic plating on selective and non-selective plates, not revealing differences in cell numbers.

      Can the authors please address the TD50 of 1.83 days for W22703 ΔHE/pACYC-HE versus 3.67 days for WT bacteria? This would mean that the former kill larvae twice as fast as usual. I would not call this "did not significantly differ in their insecticidal activity".

      Answer: This statement is indeed not very intuitive given the variations of the TD50-values. However, the significance here (and elsewhere in the text) is based on a statistical calculation. For the Kaplan-Meier-plot, we used an application (K.T.Bogen, Advances in Molecular Toxicology, 2016; Exponent Health Sciences, Oakland, CA, United States; Johann Kummermehr, Klaus-Rüdiger Trott, Stem Cells, 1997; Academic Press, London, San Diego) based on all data of a graph. However, to consider this point and to not confuse the readers, the phrase was modified to “…did not significantly differ in their insecticidal activity from that of the parental strain W22703 after one week, demonstrating…” (lines 135-138).
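      As a rough illustration of how a TD50 can be read off a Kaplan-Meier curve, the following minimal Python sketch computes the product-limit estimate and the first time point at which estimated survival is at or below 50%. It is not the application used for the analysis above; the function names `kaplan_meier` and `median_survival` and the example data are hypothetical, and the sketch does not test whether two curves differ significantly.

```python
from collections import Counter
from fractions import Fraction


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  observation time per larva (days)
    events: True if death was observed at that time, False if censored
    Returns a list of (time, survival) pairs at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    at_risk = len(times)
    survival = Fraction(1)
    curve = []
    for t in sorted(set(times)):
        d = deaths.get(t, 0)
        if d:
            # Exact fractions avoid floating-point drift around 0.5.
            survival *= 1 - Fraction(d, at_risk)
            curve.append((t, survival))
        # Every larva observed at time t (dead or censored) leaves the risk set.
        at_risk -= sum(1 for x in times if x == t)
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 50%, or None if never reached."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None


if __name__ == "__main__":
    # Hypothetical data: 8 larvae, 5 deaths, 3 still alive (censored) at day 7.
    times = [1, 2, 2, 3, 4, 7, 7, 7]
    events = [True, True, True, True, True, False, False, False]
    curve = kaplan_meier(times, events)
    for t, s in curve:
        print(f"day {t}: S = {float(s):.3f}")
    print("TD50 (days):", median_survival(curve))
```

      Because such a median depends on only a few death times when groups are small, two strains can show different TD50 values without their whole survival curves differing significantly; a curve-level comparison such as a log-rank test would be needed to assess that.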

      Fig. 2 is missing survival data for larvae infected with tcaA, HE, and tccC KO bacteria.

      Answer: These data are shown and are equal to the LB control, i.e. the survival rate of larvae infected with W22703 strains lacking HE, tcaA, or tccC was 100%.

      "And a slight colouring of some of the larvae from one h p.i. on (data not shown)" - best show the data or remove this statement.

      Answer: Although we observed this phenomenon regularly, monitoring and documentation cannot be provided and would not substantially strengthen the manuscript. We therefore deleted this phrase.

      The infection of larvae by W22703 ΔtccC/pBAD-tccC is missing, the other bacterial variants are present. Please address this.

      Answer: Infections with W22703 ΔtccC are not shown so as not to overload the figure; please see the panel below. W22703 ΔtccC/pBAD-tccC infections have not been documented by photos. Figure legend 3 now reads “Infections with W22703 ΔtccC and ΔtccC/pBAD-tccC are not shown.”

      "initially proliferated from an application dose of 4.0 × 10⁵ CFU and 4.0 × 10⁵ CFU, respectively, to 2.2 × 10⁶ CFU and 2.8 × 10⁶ CFU, but could not be detected from day three on. This finding strongly suggests that TcaA is involved in adherence to epithelial cells and thus in midgut colonization". Please address the "initially proliferated" (which day post-infection?), their elimination from the larvae (how, why?), why the tccC KO bacteria were more virulent than tcaA KO bacteria, and where the suggestion about TcaA involvement specifically in adherence comes from.

      Answer: “initially proliferated” was rewritten to “proliferated within the first day p.i.”. (line 163)

      Elimination: This now reads “…was completely absent six days p.i., probably due to passage through the gut followed by excretion”. (lines 161-162)

      In our view, the tccC knockout mutant is not more virulent than W22703 ΔtcaA (see Fig. 2), but it replicates during the first day post infection, whereas the cell numbers of the tcaA KO mutant decrease strongly within the first 24 h p.i. This prompted us to speculate that Tc is involved in two infection steps, e.g. adherence and hemocyte inactivation. For clarity, this sentence was modified to: “This discrepancy suggests that TcaA is involved in adherence to epithelial cells and thus in midgut colonization, without requiring TccC.” (lines 165-166)

      In Fig. 4, the CFUs for W22703 ΔtccC/pBAD-tccC are essentially the same as for the other rescued KOs and WT, while in the text a point about weaker growth is made. Is this justified? Also, even though the CFU data is present here, data on infection of larvae by W22703 ΔtccC/pBAD-tccC is missing unlike the other bacterial variants. Please explain.

      Answer: We agree that this part of the results is misleading. We want to stress that the complementation restores the wildtype phenotype very well. The weaker growth of ΔtccC may be due to the distinct vector system used here. This part was therefore shortened and rephrased to: “When larvae were infected with 4.0 × 10⁵ CFU of the ΔtcaA and ΔHE mutants, and with 1.4 × 10⁶ CFU of strain W22703 ΔtccC/pBAD-tccC, all of which carried the deleted genes on recombinant plasmids, the bacterial burden at days one to six p.i. increased approximately to that of the parental strain W22703 applied with 9.0 × 10⁵ CFU, indicating a successful complementation of the gene deletions.” (lines 166-170).

      Missing data on W22703 ΔtccC/pBAD-tccC infection in Fig. 3: please see the answer to point 20 above.

      Fig. 6b - The presence of an anti-RFP signal is not obvious in any of the bottom row images. The top row images are missing the same kind of annotation provided for Fig. 6a, without which non-histologists will find understanding the figure difficult.

      Answer: The anti-RFP signal is visible only on the left photo of the bottom panel, and not in the other three photos as explained in the text. We understand that the signals are not very strong, but they are visible on the screen.

      "In the absence of the lysis cassette, however, TcaA::Rfp was not detected despite the presence of W22703 ΔHE tcaA::rfp cells." + "To test whether or not the promoter of the lysis cassette is active in vivo, we infected G. mellonella larvae with strain W22703 PHE::rfp. Although Y. enterocolitica cells densely proliferated within the hemolymph (FIG. 6B), no staining signal that would point to the presence of TcaA was obtained, possibly due to no or weak PHE activity." Does this mean that without HE, tcaA does not express?

      Answer: No, we performed Western Blots showing that TcaA is detected in cells lacking HE. Therefore, a negative feedback regulation (e. g. increasing intracellular amounts of TcaA repress its own transcription) can be excluded. This is also in line with the low transcriptional activity of the lysis cassette in vivo (new Fig. S1B).

      "These data suggest that the HE cassette is responsible for the extracellular activity of the insecticidal Tc." Please explain how the preceding paragraph leads to this conclusion.

      Answer: This was poorly written and now reads “…for the transport…” (line 224).

      "As expected, bacterial cells, e.g. Y. enterocolitica, are visible in the hemolymph obtained from W22703-infected animals, but not in all other preparations." - which figure are the authors referring to?

      Answer: We have indeed identified, but not immunostained, bacterial cells in those preparations, but they are not visible in Fig. 7. This sentence was removed. However, the presence of W22703, but not its tc-PAIYe-mutants, in the hemolymph is demonstrated in Fig. 6A.

      "To delineate the transcriptional profile of Y. enterocolitica during infection of G. mellonella, we applied immunomagnetic separation to isolate Y. enterocolitica from the larvae 12 h and 24 h after infection" - do the authors store the bacteria for up to 24 h at 4 °C, as indicated in the methods section?

      Answer: Yes, the probes were stabilized with RNAlater and then stored up to 24 h to synchronize all samples of one experiment.

      "The endolysin located within Tc-PAIYe was significantly up-regulated after 24 h, but not after 12 h, pointing to its possible role in the release of the Tc" - I could not find the endolysin in Table S1. Could the authors mark it clearly? Also, why is the holin also not upregulated?

      Answer: The endolysin gene is missing from Table S1 because its FC is 1.02. We have now added a table to Fig. S1 that shows the FC values of all genes from Tc-PAIYe. The FC value of the holin gene is 0.87, pointing to very weak transcription of this lysis gene, as discussed, which prevents cell death.

      "This is in line with the fact that a T3SS is lacking in strain W22703" - Is a complete genomic sequence available for this strain, so readers could validate this statement?

      Answer: The genome sequence is available, and the reference is now cited (line 358). The common virulence plasmid of yersiniae, pYV, which encodes the T3SS, is missing in this strain. We do not mention here the presence of a second, but probably incomplete, chromosomally encoded T3SS in strain W22703 so as not to overload the manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This is a very, very nice study as it actually describes the role of different Tc toxin components in a model infection system using an important bacterium- really for the first time in a properly controlled manner. The mutants lacking either the syringe (AB) or the bullet (C) make 'sense' for a loss of function perspective. The description of the phage cassette in loss of function is also interesting and could do with some more speculation? For example, some groups of Photorhabdus bacteria release their oral toxicity (Tc's) into their bacterial supernatants- whereas in others it remains cell associated. The likely role of this phage cassette in this process should be discussed (is cell suicide required for release?).

      Answer: We now discuss the possible role of the lysis cassette in more detail, including the possibility that a subpopulation commits cell suicide (see lines 375-396).

      Reviewer #2 (Significance (Required)):

      This is highly significant finding as despite all of the very elegant structural studies done on these important toxins there is still very little work in vivo. These studies clearly show the role of the different components of these ABC toxins in vivo. It should be published with priority.

      Congratulations to the authors.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors analyze the phases of infection of Galleria mellonella by Yersinia enterocolitica following forced oral feeding. They study different phases of infection, including survival within the gut and invasion of the hemolymph. By analyzing differences in the genes up- and down regulated, they show that for example transporters for food sources from the hemocoel are regulated for making those sources available for the bacteria.

      Major comments: This is an interesting paper demonstrating genes of Y. enterocolitica dependent for colonization, growth and crossing of the epithelial gut barrier in G. mellonella.

      Major points which have to be addressed:

      Introduction: line 54: the BC subcomplex is not released into the cytosol! It is only the hypervariable region (enzymatic part) which enters the cytosol. This has to be corrected.

      Answer: This has been corrected accordingly.

      Fig.2/3: Why have different CFU been used for the distinct bacterial strains? This does not allow a direct comparison of their toxicity. For me the dead larvae shown in Fig. 3 are not represented in Fig 2 (data are not concordant), because of the loss before day one depicted in Fig. 2: The curves should be normalized to the same starting point (should be 100 %)?

      Answer: We would like to stress that infection doses are hard to reproduce if frozen and diluted stocks are used. We opted for overnight cultures to better mimic natural conditions and checked the viable cell number of each culture by plating. Moreover, we chose the infection doses conservatively, i.e. the number of mutant cells was higher than that of the parental strain.

      The data of Fig. 3 are concordant with Fig. 2 for two reasons. First, this experiment was performed in replicates with a total of 36 larvae per strain (see Fig. 2 legend), so representative photos are shown. Second, larvae were considered dead if they failed to respond to touch, and many larvae without strong signs of melanisation had already been killed.

      We analysed the algorithm of the Kaplan-Meier plot. All graphs start at 100%; this is now mentioned in the legend. There are no data between day 0 and day 1, and a stepwise graph is essential for this plot.
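      For illustration only, the stepwise shape described above can be sketched as follows. This is a minimal Kaplan-Meier calculation without censoring; the function names and the daily death counts are hypothetical, and only the group size of 36 larvae is taken from the Fig. 2 legend. It is not the authors' plotting code.

```python
from bisect import bisect_right


def kaplan_meier(n_start, deaths_per_day):
    """Return [(day, surviving fraction)], starting at 1.0 (100%) on day 0.

    Assumes no censoring: every larva is followed until death or the end.
    """
    curve = [(0, 1.0)]
    at_risk = n_start
    survival = 1.0
    for day, deaths in enumerate(deaths_per_day, start=1):
        if at_risk > 0:
            survival *= (at_risk - deaths) / at_risk
        at_risk -= deaths
        curve.append((day, survival))
    return curve


def survival_at(curve, t):
    """Step-function lookup: a value holds until the next observation."""
    if t < 0:
        raise ValueError("time must be non-negative")
    days = [day for day, _ in curve]
    return curve[bisect_right(days, t) - 1][1]


# Hypothetical counts: 36 larvae, deaths recorded once per day.
curve = kaplan_meier(36, [18, 6, 3])
# Between day 0 and day 1 the curve keeps its day-0 value,
# because nothing is recorded before the day-1 observation.
print(survival_at(curve, 0.5))
```

      Because observations exist only at whole days, any drop in survival appears as a vertical step at day 1 rather than as a slope between day 0 and day 1.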

      Fig. 3: Why is the strain W22703 delta tccC/pBAD - tccC missing in the data set?

      Answer: Infections with W22703 ΔtccC are not shown to avoid overloading the figure; please see the panel below. W22703 ΔtccC/pBAD-tccC infections were not documented by photos. Figure legend 4 now reads “Infections with W22703 ΔtccC and ΔtccC/pBAD-tccC are not shown.”

      Minor: line 221: "the" is doubled

      Answer: This has been corrected accordingly.

      Reviewer #3 (Significance (Required)):

      The manuscript shows the use of G. mellonella as a straightforward method to study gene functions of pathogenic bacteria, which is significant knowledge for scientists in the field.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary: Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Answer: There are already three sections that summarize the results and the methods applied, namely the abstract, the last paragraph of the introduction, and the conclusion following the discussion. In our view, a further summary would overload the manuscript. Nevertheless, depending on the journal in which the manuscript is published, an additional authors' summary could be provided.

      Outlines proposed role of lysis cassette in oral infection of Galleria as a model insect for host pathogen interaction, data which is fortified through use of histology and RNAseq.

      Introduction could extend to additional background eg Aleniz et al and other entomopathogen transcriptome data, more so other studies using Yersinia and Galleria as a model (refer references provided in the below comments)

      Answer: We again carefully screened PubMed for studies in the field and added a few papers. However, in vivo transcriptome analyses are still rare, as indicated by the lack of such investigations with the highly relevant entomopathogen Photorhabdus luminescens. The literature suggested by the reviewer is now cited in the introduction and the discussion (see below for details).

      The strength of the paper lies in understanding the progression of the disease in the insect host as mentioned L316-317 and clearance of the bacteria via in TcaA mutant

      Major comments: - Are the key conclusions convincing? Yes for mode of action Fig 5 could have additional panels -this is a strength of the paper

      Answer: We agree that this time course is a strength of the paper, and we carefully selected representative photos. Several more could be shown but, in our view, they are illustrative rather than providing substantial additional value.

      Fig 6 legend could better describe the observed insect components

      Answer: The insect components are now indicated in Fig. 6B and in Fig. 5.

      Figure 7 may be lost in PDF conversion -the figure appears un resolved? are there more high resolution photos

      Answer: Fig. 7 was present in the merged PDF provided by the publisher. We used the photos with the best resolution.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? the data provided is in places rudimentary (i.e. validation of the role of the lysis cassette in virulence) and could be bolstered with the construction and use of a lysis translational reporter etc I was left unsure how the HE::rfp and TcA::rfp constructs were made. I had assumed red florescent protein however it appears an antibody is used. This needs to be clarified as I then found it hard to interpret the results.

      Answer: The transcriptional PHE::rfp fusion is mentioned in the results section, but immunostaining failed, probably due to a very low promoter activity (line 223). This is well in line with the transcriptome data. Please see below for a detailed answer on how HE::rfp and tcaA::rfp were constructed. We applied the RFP antibody for two reasons: first, fluorescence microscopy did not reveal clear red fluorescence in the tissue sections, and second, a TcaA antibody failed to meet the quality criteria for this purpose.

      It appear l114-125 that their may be enough data to derive a LD50 values and or LT value at a fixed dose - if so reporting this data of interest. It may also allude as to why a 10e5 dose was selected for subsequent expts

      Answer: This is an interesting point. The LD50 (dose of cells that kills 50% of all larvae) is usually not calculated in publications in this field of research, because its calculation requires a very large separate data set that cannot be used to answer the questions addressed here. Such a data set is not available. We published the dose-dependent toxicity of Y. enterocolitica W22703 upon subcutaneous injection, and from these data, we determined an LD50 for this strain of approximately 2 x 10^4 cells. The paper is cited in our manuscript. The 10E05 dose was selected based on our preliminary work and the reproducibility of the experimental phenotypes.
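      To make concrete why an LD50 needs its own dose series, the sketch below estimates it by interpolating on a log10 dose scale between the two doses that bracket 50% mortality. This is a simple interpolation, not a probit or logistic fit, and it is not the method used in the cited publication; the function name and all dose and mortality values are hypothetical.

```python
import math


def ld50_log_interpolation(doses, mortality):
    """Estimate the LD50 from ascending doses and mortality fractions (0..1).

    Interpolates linearly in log10(dose) between the two neighbouring doses
    whose mortalities bracket 0.5. Returns None if 50% is never bracketed.
    """
    pairs = list(zip(doses, mortality))
    for (d_lo, m_lo), (d_hi, m_hi) in zip(pairs, pairs[1:]):
        if m_lo <= 0.5 <= m_hi and m_hi > m_lo:
            frac = (0.5 - m_lo) / (m_hi - m_lo)
            log_ld50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ld50
    return None


# Hypothetical dose series (cells per larva) and observed mortality fractions.
doses = [1e3, 1e4, 1e5]
mortality = [0.1, 0.4, 0.8]
estimate = ld50_log_interpolation(doses, mortality)
print(f"estimated LD50: {estimate:.3g} cells")
```

      A single fixed infection dose, as used for the survival curves, never brackets the 50% point, so the function would return None for such data.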

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Use of lysis the reporter - discuss commonalties of the in host transcriptome with other Yersinia Galleria systems eg Paulson etc al (refer below). Are there any thoughts on the host range of this Yersinia and can this be placed in a pathogen host evolutionary context?

      Answer: Paulson et al. are now cited twice in the text. To our knowledge, the host range of Yersinia enterocolitica has not been investigated. However, its nematocidal activity has been described by Spanier et al., and larvae of the tobacco hornworm, Manduca sexta, are also killed by W22703 (see references). Moreover, there are two copies of tccC in the genome of strain W22703 encoding the cytotoxic Tc subunit with its hypervariable C-terminus, which is assumed to contribute to host specificity. This is discussed in great detail by Song et al. (see references).

      Evolution: Yes, this has been addressed by Waterfield et al. 2004 (see references) where insects are hypothesized as a source of emerging pathogens. We placed our findings in the context of this article in lines 91-94 and 305-310.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes

      • Are the data and the methods presented in such a way that they can be reproduced? yes but I think some vector construction methodology is missing e.g. ::rfp (refer above)

      Answer: The plasmids used to construct the two strains W22703 tcaA::rfp and W22703 PHE::rfp are listed in Table S2. References for details are given (Starke et al., 2013; Starke and Fuchs, 2014). Briefly, we used a suicide vector (pUTs) carrying the gene encoding the red fluorescent protein (RFP). This vector replicates in E. coli helper strains such as SM10, but not in Y. enterocolitica. Strain SM10 is now listed in Table 2. Following conjugation, the construct is chromosomally inserted upon recombination via the fragments cloned into the plasmid. In the case of tcaA, we cloned the 3´-end of the gene to generate a translational fusion, and in the case of HE its promoter, resulting in a transcriptional fusion with the reporter RFP.

      Fig 2 I am a little lost mortality seems quick on day 0 is this a result of aberrant injection damage mortality or are the authors observing a different effect across mutants through the initial 24 hours? If data available could this time plot be extended out 0-24 hours. The dash used for W222703 tcaA /TccC look similar can a different symbol be used.

      Answer: The reviewer is right that mortality is high on the first day. However, monitoring larvae for up to nine days is standard in the literature. No data are available for a better resolution of the first 24 h, which, however, were investigated in more detail in the time course of Fig. 5. Moreover, we observed changes in motility and colouring of some of the larvae from 1 h p.i. onwards (data not shown). Aberrant injection damage was avoided, and damaged larvae or larvae that did not completely take up the infection solution were not considered further in the experiment. This is mentioned in lines 107-109.

      A different symbol is now used for W22703 ΔtccC/pBAD-tccC.

      • Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments: - Specific experimental issues that are easily addressable. - Are prior studies referenced appropriately? Other entomopathogenic transcriptome studies could be compared to and or cross referenced (I have provided references in the response

      Answer: Repetition of our answer above: We again carefully screened PubMed for studies in the field and added a few papers. However, in vivo transcriptome analyses are still rare, as indicated by the lack of such investigations with the highly relevant entomopathogen Photorhabdus luminescens. The literature suggested by the reviewer is now cited in the introduction and the discussion (see below for details).

      I am unsure on the use of immuno pulldown and efficiency of recovering the Yersinia using this method as opposed to direct sequencing total RNA has this method been used in other systems,

      Answer: Isolating RNA from in vivo samples of infected insects encounters two challenges: first, possible contamination with commensal bacteria, and second, an excessive amount of host RNA that reduces the number of sequence reads. This might be the reason for the relatively low sequencing depth found in related papers in the field of in vivo transcriptomics. We overcame these problems by immunomagnetic separation (IMS), which is easily applicable and enriches the samples for Yersinia cells; this is now mentioned in the results. We also cite a study (Prax et al.) in which we established the IMS protocol.

      • Are the text and figures clear and accurate? Yes though in places better naming of insect components could be listed

      Answer: This was done, see above.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      As listed above potential use of reporters and or comparison and transcriptome analysis to other systems and an evolutionary pathogen host context (refer comments above) would strengthen the manuscript

      Answer: Please see answer to comments above. We explained the use of the reporter fusions, and put the transcriptome analysis into the context of related studies.

      Minor comments as per below When first mentioned good to state the larval instar used

      Answer: We used larvae of instar 5-6 according to Jorjao et al. (2018), this is now mentioned and cited in the M&M section, line 434.

      l 78 lon protease? what type? this is an important SOS protease affecting many regulatory systems please clarify

      Answer: This is a Lon A endopeptidase, and its function in the temperature-dependent activity of the lysis cassette has been described (Springer et al. 2021, see references). Its relevance for the thermodependent regulation of Yersinia virulence has been documented by Herbst et al. (PMID: 19468295) and Jackson et al. (https://doi.org/10.1111/j.1365-2958.2004.04353.x).

      l103-113 an description of the elemental tract which is depicted, perhaps this could be placed in the Fig. 1 figure legend

      Answer: We agree and substantially shortened the first paragraph of the results. Relevant aspects are now mentioned in Figure legend 2, redundancies with the figure legend were removed.

      l 133 use of the word larvae in place of the word animals might be more appropriate

      Answer: This was corrected accordingly.

      l 133 clarify delta HE mutant description when first mentioned

      Answer: The abbreviation HE is now introduced in the introduction in line 74.

      Lines 220-234 hard to follow mainly as I am unsure how then strains are constructed, perhaps clarify what rfp is how was it made :: demotes and insertion but yet then they seek to detect TcaA? I could not find the methodology on its or HE::rfp construction

      Answer: The plasmids used to construct the two strains W22703 tcaA::rfp and W22703 PHE::rfp are listed in Table S2. References for details are given (Starke et al., 2013; Starke et al., 2014). Briefly, we used a suicide vector (pUTs) carrying the gene encoding the red fluorescent protein (RFP). Following conjugation, the construct is chromosomally inserted upon recombination via the fragments cloned into the plasmid. In the case of tcaA, we cloned the 3´-end of the gene to generate a translational fusion, and in the case of HE its promoter, resulting in a transcriptional fusion with the reporter RFP.

      Please see above why we used RFP-antibodies to detect TcaA.

      l247 immuno-magnetic separation to isolate Yersinia - is there an efficiency behind this method, might be good to mention (I am unfamiliar with this technique)

      Answer: We here repeat our answer to the point above: Isolating RNA from in vivo samples of infected insects encounters two challenges: first, possible contamination with commensal bacteria, and second, an excessive amount of host RNA that reduces the number of sequence reads. This might be the reason for the relatively low sequencing depth found in related papers in the field of in vivo transcriptomics. We overcame these problems by immunomagnetic separation (IMS), which is easily applicable and enriches the samples for Yersinia cells; this is now mentioned in the results. We also cite a study (Prax et al.) in which we established the IMS protocol.

      l313 alludes to role of Tca in hemoceol which contradicts an earlier statements in l 130 please clarify

      Answer: The reviewer is right. The sentence in former line 130 (now lines 123-124) was corrected to “…suggesting that the Tc plays a main role in the initial phases of infection”. This statement does not exclude its activity towards hemocytes. Moreover, subcutaneous infection is very artificial and was therefore replaced by oral application in our study to mimic natural routes of infection. This is now elaborated in more detail in the discussion (Lines 305-310).

      For clarity table 1 could colour highlight (different colours) tc and lysis genes

      Answer: We now added a table to Fig. S1 that shows the FC values of all genes from Tc-PAIYe.

      CROSS-CONSULTATION COMMENTS I am in agreement with all points of reviewer 1 who has a clear understanding on Tc toxin composition TcA pentamer etc. Being familiar to the field I regret I did not pick up on these errors

      Answer: This has been corrected according to R1.

      Point 13 agree and should possibly bring in other researchers who have used Galleria as a model. It also needs to be kept in mind that the target host for many Tcs has yet to be determined hence the importance of oral activity of this isolate

      Answer: This has been corrected according to R1.

      I am similarly in agreement with comments of reviewer 3

      Reviewer 4 I over looked the LT50 data -- apologies but agree with reviewer 1 where WT should be the more potent strain --I still think if possible LD50 for WT would be of value more so to define its oral activity

      Answer: We repeat our answer from above. This is an interesting point. The LD50 (dose of cells that kills 50% of all larvae) is usually not calculated in publications in this field of research, because its calculation requires a very large separate data set that cannot be used to answer the questions addressed here. Such a data set is not available. We published the dose-dependent toxicity of Y. enterocolitica W22703 upon subcutaneous injection, and from these data, we determined an LD50 for this strain of approximately 2 x 10^4 cells. The paper is cited in our manuscript. The 10E05 dose was selected based on our preliminary work and the reproducibility of the experimental phenotypes.

      Reviewer #4 (Significance (Required)):

      SECTION B - Significance ========================

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Extends from work of Fuchs - research group Extends from work of Palmer et al on lysis cassettes as potential T10SS Extends from work off Vesga Pseudomonas and Paulson Yersinia(refs provided below) on insect transcriptomics

      Of interest and possibly understated is the oral activity of enterocolitica in the insect host as mentioned L316-317 and how this might relate to the lifestyle/evolution of this microbe further elaboration here would be of interest

      Answer: We agree that this is an important aspect. Therefore, we added the following sentences here: “In contrast to subcutaneous injection in the use of insect larvae as a model for bacterial virulence properties towards mammals, oral application mimics natural routes of infection that in particular take place during the bioconversion of animal cadavers by bacteria, fungi, and larvae (Carter, 2007). Together with the broad cytocidal host spectrum of bacterial toxins (Mendoza-Almanza, 2020), investigation of yet neglected natural infections of invertebrates will contribute to a better understanding of microbial pathogenicity (Waterfield, 2004).” (lines 305-310)

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Relevant Transcriptome papers which could be referred to in the discussion i.e. are similar genes in play or is their a point of difference? https://doi.org/10.1093/g3journal/jkaa024;https://doi.org/10.1038/s41396-020-0729-9; https://doi.org/10.1099/mic.0.000311

      Answer: Paulson et al. mainly address virulence factors, whereas metabolism is not covered. We now cite similarities with respect to hemolysis and iron scavenging. The focus of Vesga et al. is on the interaction of a plant pathogen with wheat and two insect hosts, including their transcriptome. Although metabolic details are missing, there is an interesting overlap with the paper by Vesga et al. (hemocoel as a permissive environment for proliferation) and a difference (upregulation of chitinases was not observed), both of which are now cited in the discussion. The Alenzi paper mainly investigated the general virulence of a Y. enterocolitica strain. We cite its finding on the importance of motility, thus confirming our transcriptome analysis.

      • State what audience might be interested in and influenced by the reported findings. The oral activity of enterocolitica towards Galleria of interest and an evolutionary context insect vs mammalian activity in the discussion could be provided. Potential role of TcaA in gut association For the targeted journal I feel additional technical data is required and a broader context to other global systems (bacterial species) provided

      Answer: All points were addressed carefully and in detail. We refer to our answers to points detailed above.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Reviewers expertise entomopathogens, their toxins and pathogen ecology
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank both reviewers for their constructive comments. Please see our point-by-point response to all the comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      • Summary
      • The authors of this manuscript confirm data found by others by determining replication kinetics of the ancestral B.6 SARS-CoV-2 virus, Delta and Omicron BA.1 and BA.2 in Calu-3 cells. The authors quantify barrier integrity between variants and interferon induction to conclude that Delta is more cytopathic and induced less interferon than Omicron, possibly leading to its increased pathogenesis. In addition, the authors identify CuCl2 and FeSO4 as potential antivirals.

      Major comments

      1. Reviewer comment: The authors argue that Omicron's slower replication on Calu-3 cells correlates with mild disease; however, many publications show that Omicron replicates more efficiently/rapidly in primary human airway cultures:
      2. Hui et al., (Nature, 2022) doi: https://doi.org/10.1038/s41586-022-04479-6
      3. Peacock et al., (bioRxiv) doi: https://doi.org/10.1101/2021.12.31.474653
      4. Lamers et al., (bioRxiv) doi: https://doi.org/10.1101/2022.01.19.476898

      Response: Previous reports, including the citations indicated by the reviewer, have shown that the Omicron variant replicates at lower levels in lung tissue than in cells of bronchial origin or of the upper respiratory tract. In fact, the Omicron variant was shown not to productively infect alveolar type II cells at all. Omicron replication was severely compromised in Calu-3 cells grown in 96-well plates (https://doi.org/10.1080/22221751.2021.2023329), which is consistent with our observations.

      Reviewer comment: Can the authors explain why air-liquid grown Calu-3 cells appear to display similar viral titers for Omicron and Delta at 24 and 36 h.p.i (Figure 5B), however lower viral replication in Figure 3B? If the cells in Figure 3B are submerged, then the authors should identify why ALI grown Calu-3 cells are more susceptible to Omicron.

      Response: Cells were grown in plastic multi-well plates for the growth curve experiments shown in Figure 3. Under these conditions the cells are not polarized, and the virus titers represent the total amount of virus released into the culture supernatant. The infections in Figure 5 were performed under air-liquid interface culture conditions with polarized cells. Therefore, the virus titers are only from the basolateral chamber. The outcomes of Figure 3 and Figure 5 are not comparable due to these technical differences. We will add this explanation to the results section.

      Reviewer comment: The authors suggest that Delta disrupts epithelial barrier integrity to a larger extent compared to B.6 and Omicron, however this may be due to fewer infected cells (despite equal viral titers, the nucleocapsid staining in Figure 2 and 5C suggests fewer infected cells). Have the authors imaged B.6 or Omicron at a later timepoint (or normalized virus input for equal infected cells) to determine barrier integrity when the amount of infected cells is equal? Alternatively, the authors should discuss this as a possible limitation of their study, especially since they argue this is a major reason why Delta has a growth advantage (lines 345 to 349).

      Response: We performed confocal imaging of transwells from the air-liquid interface model using a 20X objective and obtained data showing that the percentage of infected cells is similar between the Omicron and Delta variants. We will include these data in the revised manuscript. In an in vitro system, once infection has set in, the infected cells eventually die and the TEER reaches background levels. We propose a delay in the disruption of barrier integrity, most probably due to the lower cytopathogenicity of the Omicron variant. As per the reviewer’s suggestion, we will discuss the possible limitations of the models and provide additional interpretations.

      Minor comments

      A) Reviewer comment: Line 118: Implications of this sentence are too strong. The authors have not shown the causality of Ct values and transmission, therefore they should reword the sentence: "indicating a high viral burden in patients during this period resulting in increased transmission of the virus among the contacts" to "likely attributing to increased transmission..."

      Response: We will correct this.

      B) Reviewer comment: Line 289: The authors suggest that infection with the Omicron variant generated higher levels of antibodies to the Delta variant, however these individuals are already vaccinated and elicit cross-neutralizing antibodies against Delta even before their Omicron infection. Therefore the Delta response is boosted and the Omicron response is essentially a primary response since vaccination elicits almost no cross-protection in itself. Therefore the authors should compare primary Delta infected individuals to primary Omicron infected individuals to determine cross-protection levels.

      Response: We agree with the reviewer’s argument. Please note that the two vaccines used in India are against the ancestral virus (inactivated) or the spike protein expressed by the adenovirus vector backbone. As over 90% of the population in India have been fully vaccinated with these two vaccines and a majority of them may also have been infected with delta variant and now with omicron, it is practically impossible to compare primary delta cases vs primary omicron cases at this stage. As part of another study in mid 2021, after the second wave of COVID-19 infections due to the Delta variant in India, we randomly selected 55 samples which had a detectable FRNT50 value for the delta variant, to test for their ability to neutralize the Omicron variant. Only twenty of the 55 samples had detectable levels of neutralizing antibodies against the Omicron variant. By assigning a FRNT50 value of 10 for the samples which had no detectable levels of antibodies in the starting dilution (1:20) of the assay, we obtained a GMT of 22.5 (95% CI: 16, 31) for these 55 samples. This value was 20-fold lower than the GMT of Delta variant which was 404 (95% CI:248, 658). This clearly indicates that even during the peak of delta wave, there were barely any cross-reactive antibodies to the Omicron variant. This study was recently published [NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-022-31170-1]. It would be interesting to eventually compare the antibody responses in reinfections with other sub-lineages of Omicron variant which is beyond the scope of our manuscript. We will add this description in the results and discussion section of the revised manuscript.
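      The GMT calculation described in this response can be sketched as follows, purely as an illustration. Undetectable samples are assigned a titer of 10, as stated above for the 1:20 starting dilution; everything else is an assumption: the function name, the use of None for undetectable samples, the normal approximation for the 95% CI, and the example titers, which are not study data.

```python
import math
from statistics import NormalDist, mean, stdev


def geometric_mean_titer(titers, lod=20.0, fill=10.0, level=0.95):
    """Return (GMT, lower CI bound, upper CI bound) for FRNT50-style titers.

    Titers that are None or below `lod` are replaced by `fill` before
    averaging on the log scale. The CI uses a normal approximation, which is
    an assumption of this sketch and may differ from the published analysis.
    """
    values = [fill if t is None or t < lod else t for t in titers]
    logs = [math.log(v) for v in values]
    centre = mean(logs)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(logs) / math.sqrt(len(logs))
    return math.exp(centre), math.exp(centre - half_width), math.exp(centre + half_width)


# Hypothetical titers; None marks a sample with no detectable neutralization.
example = [None, 40, 160]
gmt, ci_low, ci_high = geometric_mean_titer(example)
print(f"GMT {gmt:.1f} (95% CI: {ci_low:.1f}, {ci_high:.1f})")
```

      Averaging on the log scale is what makes the result a geometric mean, and the substituted value of 10 keeps undetectable samples in the denominator instead of dropping them.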

      C) Reviewer comment: There appears to be no reference to Figure 6G, however this reference is most likely missing from line 306.

      Response: Thank you for bringing this to our notice. We will insert the reference to Figure 6G.

      D) Reviewer comment: Line 359-362: The authors suggest that waning antibody titers increase susceptibility to new variants of concern, however their cohort already possessed very low antibody titers against Omicron a month after vaccination (Figure 7F) suggesting they could be equally susceptible to Omicron 1 and 6 months after vaccination.

      Response: Please note that nine out of 15 samples had an FRNT50 value above the level of detection after vaccination in June 2021. The number of samples positive for Omicron antibodies fell to six out of 15 by Dec 2021, suggesting that relatively more people were without protective antibodies against the Omicron variant by Dec 2021. Around 70% of the population was seropositive by Aug 2021 (https://doi.org/10.1016/j.ijid.2021.12.353), and most adults in India received both doses of their vaccine after June 2021, which would have boosted the humoral and cellular response to SARS-CoV-2. This is corroborated by a recently published report, in which we showed that 36 out of 55 previously infected subjects had neutralizing antibodies against the Omicron variant after receiving a single dose of inactivated vaccine. Therefore, in the context of hybrid immunity in India, we speculate that waning antibody titers could have played a significant role in the emergence and spread of the Omicron variant, in addition to its ability to escape neutralization, replicate more efficiently in the upper respiratory tract, etc. The fact that booster doses of vaccines developed against the ancestral virus/viral protein were capable of increasing the level of neutralizing antibodies to the Omicron variant suggests that antibody levels above a certain threshold may play a significant role in protecting against the Omicron variant.

      Reviewer #1 (Significance (Required)):

      • Reviewer comment: Many of the conclusions based on replication and barrier integrity may not represent the situation in primary human tissues and does not explain the rapid spread of Omicron. In addition, interferon induction has already been described for these variants and this finding is not novel. The manuscripts most interesting and novel finding is the role of CuCl2 and FeSO4 as antivirals. It would be interesting to test these salts in primary human airway cultures.

      Response: The study was conducted from January to March 2022, and the first version of the results was uploaded to a preprint server in March 2022. The process of journals handling the manuscript and obtaining reviews is not under our control. We cannot argue against the comments on novelty when the Omicron variant is barely six months old and new variants continue to emerge. The deluge of publications should not result in reviewers branding most of the efforts as not novel or insignificant. We have been trying for three months to obtain primary cells, but the distributors are unable to supply them. We will continue to try to obtain cells from one source or another. Transwells are back-ordered with expected delivery dates in three months. Meanwhile, we now have HBEC3-KT cells, which are normal human bronchial epithelial cells immortalized with CDK4 and hTERT. We will perform the inhibition experiments in this cell line to address the reviewers' concerns.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *In the manuscript entitled "BA.1 and BA.2 sub-lineages of Omicron variant have comparable replication kinetics and susceptibility to neutralization by antibodies" the authors assess the kinetics of growth of SARS-CoV-2 variants in Calu-3 cells and their effects on epithelial junction, and the interferon response. The authors also analyze the capacity of metal salts to block SARS CoV-2 replication in Calu-3 cells. Finally, the authors characterize the ability of vaccinated and/or COVID-19 patients to develop neutralizing antibodies to different variants using FRNT and specific binding assays (ELISA). *

      • The paper largely confirms several previous reports on the replication capacity and interferon responses of the different variants. Although the title and abstract focus on the Omicron sub-lineages, the paper is mostly focused on comparing original CoV2, with Kappa, Delta and Omicron. *
      • Figures 1-5 compare the replication kinetics, interferon responses, and epithelial barrier disruption of Kappa, Delta and the original Omicron (B.1.1.529) to the original B6 variant. On a separate note, Figure 7 shows the ability of metal salts (especially iron, copper, and zinc) to block viral RNA-dependent RNA polymerase activity (RdRp) in vitro. The authors also show the effect on virus replication in Calu-3 cells (Delta and Omicron B.1.1.529 only). The data mainly focus on the variants Delta and Omicron (B.1.1.529 and not the BA.1 and BA.2 sub-lineages) except in Fig 6A, B, G. *

      • __Reviewer comment: __Most importantly, a major limitation of the paper is that when human samples are analyzed, the authors assume that the patients have been infected with a specific variant according to the "peak" of infection, but sequencing is never performed. When neutralization and binding of antibodies are analyzed, the information on the patients is unclear - for example, were the patients exposed to Delta or Omicron or any of their sub-lineages? What was the vaccination status of SARS CoV-2 positive patients? And why were non-tested individuals showing symptoms included in the study (lines 302-304)? *

      Response: We thank the reviewer for the comments. Over 90% of the population in India is vaccinated. All the participants of the study were vaccinated in 2021. The participants were enrolled into the study almost 4 weeks after recovery from illness. We enrolled participants who reported having had fever or COVID-19-like symptoms in the preceding weeks, with or without confirmed RT-PCR test results. Testing is an individual and voluntary choice now. Therefore, it would be difficult to find RT-PCR confirmed cases. Our assumption about exposure is based on a nationwide sequencing effort of thousands of samples every week and this approach is reliable and credible. As indicated in the text and in the supplementary figure, Omicron lineages BA.1 followed by BA.2 were the circulating virus lineages since Jan 2021 in India.

      *__Reviewer comment: __The authors show that BA.1 and BA.2 have similar replication kinetics in Calu-3 cells and induce similar neutralizing antibodies in the patients tested. However, there is a large disconnection with the rest of the paper that is mostly focused on Kappa, Delta, and Omicron B.1.1.529. Also, no comparisons between these variants and BA.1 or BA.2 have been shown. Similarly, a large assumption in the paper is that the patients who tested positive for COVID-19 have had "natural Omicron infection" (lines 36-37; lines 307-311) when it could be any other variants or Omicron sub-lineages as well. *

      Response: Please note that the B.1.1.529 which was used at the beginning of the study is the BA.1 sub-lineage which has been compared with Kappa and Delta variants. BA.2 emerged at later stages and therefore we have compared the kinetics and neutralization titer between BA.1 and BA.2. It is unreasonable to expect to repeat all the comparisons with BA.2 considering the cost and challenges of working in a BSL-3 environment. The initial version of this data was uploaded on preprint server in March 2022 when only two sub-lineages of Omicron namely BA.1 and BA.2 existed. Our data from the national SARS-CoV-2 sequencing consortium clearly shows that there were no other sub-lineages circulating at that time.

      Reviewer #2 (Significance (Required)):

      *__Reviewer comment: __In light of the fact that most of the paper does not look at the subvariants BA.1 and BA.2 of Omicron- either the authors compare BA.1 and BA.2 more comprehensively with Omicron B.1.1.529 or rewrite the conclusions and claims of the current paper. Similar to the experiments comparing B6 with Kappa, Delta and Omicron, Omicron B.1.1.529 should be compared similarly to BA.1 and BA.2 in a separate figure. In any case, the novelty compared to other papers -also cited by the authors- remains limited. *

      Response: We will revise the conclusions and claims of the paper as per the suggestions. Please see our response to reviewer 1 with regards to the novelty of our observations. The B.1.1.529 variant was later classified as the BA.1 variant. Our study was uploaded on the preprint server in March 2022 and the entire review process has taken four months. It is unfair to now demand comparison of BA.2 with Kappa or Delta variant which does not add any additional value to our observations.

      *__Reviewer comment: __In addition to the concerns mentioned above, there are more pressing variants circulating right now, such as BA.4 and BA.5. These variants are not referred to in the paper. It might be beyond the scope of the paper, but including more analyses with BA.1, BA.2 (as the ones done with B.1.1.529) and adding some key data with BA.3, BA.4, BA.5 might substantially increase the relevance and importance of the paper. *

      Response: Please see our comments above. Our efforts are continuing in this direction to further look at antibody responses and replication kinetics of newer variants which have emerged recently. However, the scarcity of positive clinical samples and the lower probability of getting samples that would be suitable for virus isolation are the challenges we are dealing with. We think testing newer variants which have emerged during the review process is certainly valuable but is extremely difficult under the current circumstances. We will have to apply for import permits to obtain these sub-lineages or enrol patients with symptoms and keep testing them to isolate and culture the virus and obtain whole genome sequences. We will have to establish neutralization assays with newer sub-variants to test in parallel with other Omicron lineages. All this is beyond the scope of our manuscript and will take a few months of paperwork and experimentation.

    1. Reviewer #3 (Public Review):

      The present study aims to elucidate posterior cingulate cortex (PCC) function with both single-unit and population-level depth electrodes. The results clearly show that the dorsal PCC (dPCC) is involved in executive functions (search and add), but that it also contains neurons that are selective for episodic memory (past and future) and rest conditions. With this impressive study design, the authors are able to reconcile discrepancies between human and primate studies. Furthermore, the derived conclusion that PCC function is more diverse than merely its participation in the DMN is of great importance for the field. Thus, I believe that this work will have a great impact on how we think about the PCC, by (1) emphasizing its participation in executive processes and (2) providing evidence of distinct single-unit response profiles that do not manifest on a population level.

      The main strength of this work is the combination of population-level measurements that clearly show the participation of dPCC in executive processes with microelectrode single-unit measurements and an unsupervised hierarchical clustering approach that allows for the identification of 4 distinct SU response profiles within the dPCC. In addition, the population-level electrodes mostly engaged in executive function cluster around an fMRI meta-analysis peak related to executive processing derived from neurosynth, providing a bridge to human fMRI research.

      Nevertheless, there is one concern regarding the data collected within the ventral PCC (vPCC) in this study and the way the authors integrated it into their conclusions.

      Specifically, the conclusion that "Together, they [the findings] inform a view of PCC as a heterogeneous region composed of dorsal and ventral subregions specializing in executive and episodic processing respectively" may not be completely supported by the data. The dPCC macroelectrode data does clearly show a functional specialization in executive processing, but does the data from vPCC presented in this manuscript also support the claim? While taking a closer look at the vPCC data, several inconsistencies stood out: First, the total number of vPCC electrodes was much smaller (6 vs 29 microelectrodes and one microwire probe that was not analyzed). Second, it is not clear which of the presented electrodes in figure 3 were considered to be ventral. From comparing figure 3 with the dorsal/ventral split displayed in figure 1B, it seems as if only one electrode was unambiguously placed in vPCC. Third, BBG statistics of these 6 electrodes are not presented, thus the claim that they show vPCC functional specialization is not statistically supported.

    1. Author Response

      Reviewer #1 (Public Review):

      Jones et al. investigated the relationship between scale free neural dynamics and scale free behavioral dynamics in mice. An extensive prior literature has documented scale free events in both cortical activity and animal behavior, but the possibility of a direct correspondence between the two has not been established. To test this link, the authors took advantage of previously published recordings of calcium events in thousands of neurons in mouse visual cortex and simultaneous behavioral data. They find that scale free-ness in spontaneous behavior co occurs with scale free neuronal dynamics. The authors show that scale free neural activity emerges from subsets of the larger population - the larger population contains anticorrelated subsets that cancel out one another's contribution to population-level events. The authors propose an updated model of the critical brain hypothesis that accounts for the obscuring impact of large populations on nested subsets that generate scale free activity. The possibility that scale free activity, and specifically criticality, may serve as a unifying theory of brain organization has suffered from a lack of high-resolution connection between observations of neuronal statistics and brain function. By bridging theory, neural data, and behavioral dynamics, these data add a valuable contribution to fields interested in cortical dynamics and spontaneous behavior, and specifically to the intersection of statistical physics and neuroscience.

      Strengths:

      This paper is notably well written and thorough.

      The authors have taken a cutting-edge, high-density dataset and propose a data-driven revision to the status-quo theory of criticality. More specifically, due to the observed anticorrelated dynamics of large populations of neurons (which doesn't fit with traditional theories of criticality), the authors present a clever new model that reveals critical dynamics nested within the summary population behavior.

      The conclusions are supported by the data.

      Avalanching in subsets of neurons makes a lot of sense - this observation supports the idea that multiple, independent, ongoing processes coexist in intertwined subsets of larger networks. Even if this is wrong, it's supported well by the current data and offers a plausible framework on which scale free dynamics might emerge when considered at the levels of millions or billions of neurons.

      The authors present a new algorithm for power law fitting that circumvents issues in the KS test that is the basis of most work in the field.

      Weaknesses:

      This paper is technically sound and does not have major flaws, in my opinion. However, I would like to see a detailed and thoughtful reflection on the role that 3 Hz Ca imaging might play in the conclusions that the authors derive. While the dataset in question offers many neurons, this approach is, from other perspectives, impoverished - calcium intrinsically misses spikes, a 3 Hz sampling rate is two orders of magnitude slower than an action potential, and the recordings are relatively short for amassing substantial observations of low probability (large) avalanches. The authors carefully point out that other studies fail to account for some of the novel observations that are central to their conclusions. My speculative concern is that some of this disconnect may reflect optophysiological constraints. One argument against this is that a truly scale free system should be observable at any temporal or spatial scale and still give rise to the same sets of power laws. This quickly falls apart when applied to biological systems which are neither infinite in time nor space. As a result, the severe mismatch between the spatial resolution (single cell) and the temporal resolution (3 Hz) of the dataset, combined with filtering intrinsic to calcium imaging, raises the possibility that the conclusions are influenced by the methods. Ultimately, I'm pointing to an observer effect, and I do not think this disqualifies or undermines the novelty or potential value of this work. I would simply encourage the authors to consider this carefully in the discussion.

      R1a: We quite agree with the reviewer that reconciling different scales of measurement is an important and interesting question. One clue comes from Stringer et al’s original paper (2019 Science). They analyzed time-resolved spike data (from Neuropixel recordings) alongside the Ca imaging data we analyzed here. They showed that if the ephys spike data was analyzed with coarse time resolution (300 ms time bins, analogous to the Ca imaging data), then the anticorrelated activity became apparent (50/50 positive/negative loadings of PC1). When analyzed at faster time scales, anticorrelations were not apparent (mostly positive loadings of PC1). This interesting point was shown in their Supplementary Fig 12.

      This finding suggests that our findings about anticorrelated neural groups may be relevant only at coarse time scales. Moreover, this point suggests that avalanche statistics may differ when analyzed at very different time scales, because the cancelation of anticorrelated groups may not be an important factor at faster timescales.

      In our revised manuscript, we explored this point further by analyzing spike data from Stringer et al 2019. We focused on the spikes recorded from one local population (one Neuropixel probe). We first took the spike times of ~300 neurons and convolved them with a fast rise/slow fall, like a typical Ca transient. Then we downsampled to a 3 Hz sample rate. Next, we deconvolved using the same methods as those used by Stringer et al (OASIS nonnegative deconvolution). And finally, we z-scored the resulting activity, as we did with the Ca imaging data. With this Ca-like signal in hand, we analyzed avalanches in four ways and compared the results. The four ways were: 1) the original time-resolved spikes (5 ms resolution), 2) the original spikes binned at 330 ms time resolution, 3) the full population of slow Ca-like signal, and 4) a correlated subset of neurons from the slow Ca-like signal. Based on the results of this new analysis (now in Figs S3 and S4), we found several interesting points that help reconcile potential differences between fast ephys and slow Ca signals:

      1. In agreement with Sup Fig 12 from Stringer et al, anticorrelations are minimal in the fast, time-resolved spike data, but can be dominant in the slow, Ca-like signal.

      2. Avalanche size distributions of spikes at fast timescales can exhibit a nice power law, consistent with previous results with exponents near -2 (e.g. Ma et al Neuron 2019, Fontenele et al PRL 2019). But, the same data at slow time scales exhibited poor power-laws when the entire population was considered together.

      3. The slow time scale data could exhibit a better power law if subsets of neurons were considered, just like our main findings based on Ca imaging. This point was the same using coarse time-binned spike data and the slow Ca-like signals, which gives us some confidence that deconvolution does not miss too many spikes.

      In our opinion, a more thorough understanding of how scale-free dynamics differs across timescales will require a whole other paper, but we think these new results in our Figs S3 and S4 provide some reassurance that our results can be reconciled with previous work on scale free neural activity at faster timescales.
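
      As an illustration only, the forward-modeling steps described in R1a (convolve spikes with a Ca-like transient, downsample to roughly 3 Hz, z-score) might be sketched as below. This is a minimal sketch under stated assumptions, not the analysis code used for Figs S3 and S4: the function name `ca_like_signal`, the kernel time constants, and the synthetic Poisson spikes are all hypothetical, and the OASIS deconvolution step is deliberately omitted rather than guessed at.

```python
import numpy as np

def ca_like_signal(spike_counts, dt=0.005, target_rate_hz=3.0,
                   tau_rise=0.05, tau_decay=1.0):
    """Turn binned spikes (n_neurons, n_bins; bin width dt seconds) into a
    slow, z-scored Ca-like signal. Time constants are assumed values."""
    # Kernel with a fast rise and slow decay (difference of exponentials),
    # normalized to unit area.
    t = np.arange(0.0, 5.0 * tau_decay, dt)
    kernel = np.exp(-t / tau_decay) - np.exp(-t / tau_rise)
    kernel /= kernel.sum()

    # Causal convolution of each neuron's spike train with the kernel.
    n_bins = spike_counts.shape[1]
    conv = np.array([np.convolve(row, kernel)[:n_bins] for row in spike_counts])

    # Downsample by averaging non-overlapping blocks of fast bins.
    block = int(round(1.0 / (target_rate_hz * dt)))
    n_blocks = n_bins // block
    slow = conv[:, :n_blocks * block].reshape(len(conv), n_blocks, block).mean(axis=2)

    # A deconvolution step would go here; it is omitted in this sketch.

    # z-score each neuron, guarding against silent neurons.
    mu = slow.mean(axis=1, keepdims=True)
    sd = slow.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0
    return (slow - mu) / sd

# Synthetic stand-in for real spike data: 20 neurons, 60 s, 5 Hz Poisson firing.
rng = np.random.default_rng(0)
spikes = rng.poisson(5.0 * 0.005, size=(20, 12000))
z = ca_like_signal(spikes)
```

      With 5 ms bins, 67 fast bins are averaged into each slow sample, so the effective sample rate is close to, but not exactly, 3 Hz.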

      Reviewer #2 (Public Review):

      The overall goal of the paper is to link spontaneous neural activity and certain aspects of spontaneous behavior using a publicly available dataset in which 10,000 neurons in mouse visual cortex were imaged at 3 Hz with single-cell resolution. Through careful analysis of the degree to which bouts of behavior and bouts of neural activity are described (or not) by power-law distributions, the authors largely achieve these goals. More specifically, the key findings are that (a) the size of bouts of whisking, running, eye movements, and pupil dilation are often well-fit by a power-law distribution over several decades, (b) subsets of neurons that are highly correlated with one of these behavioral metrics will also exhibit power-law distributed event sizes, (c) neuron clusters that are uncorrelated with behavior tend to not be scale-free, (d) crackling relationships are generally not found (i.e. size with duration exponent (if there is scaling) was not predicted by size power-law and duration power-law), (e) bouts of behavior could be linked to bouts of neural activity. In the second portion of the paper, the authors develop a computational model with sets of correlated and anti-correlated neurons, which can be accomplished under a relatively small subset of connection architectures: out of the hundreds of thousands of networks simulated, only 31 generated scale-free subsets/non-scale-free population/anti correlated e-cells/anti-correlated i-cells in agreement with the experimental recordings.

      The data analysis is careful and rigorous, especially in the attention to fitting power laws, determining how many decades of scaling are observed, and acknowledging when a power-law fit is not justified. In my view, there are two weaknesses of the paper, related to how the results connect to past work and to the set-up and conclusions drawn from the computational modeling, and I discuss those in detail below. While my comments are extensive, this is due to high interest. I do think that the authors make an important connection between scale-free distributions of neural activity and behavior, and that their use of computational modeling generates some interesting mechanistic hypotheses to explore in future work.

      My first general reservation is in the relationship to past work and the overall novelty. The authors state in the introduction, "according to the prevailing view, scale-free ongoing neural activity is interpreted as 'background' activity, not directly linked to behavior." It would be helpful to have some specific references here, as several recent papers (including the Stringer et al. 2019 paper from which these data were taken, but also papers from McCormick lab and (Anne) Churchland lab) showed a correlation between spontaneous activity and spontaneous facial behaviors. To my knowledge, the sorts of fidgety behavior analyzed in this paper have not been shown to be scale-free, and so (a) is a new result, but once we know this, it seems that (e) follows because we fully expect some neurons to correlate with some behavior.

      R2a: We agree with the reviewer that our original introductory, motivating arguments needed improvement. We have now rewritten the last 2 paragraphs of the introduction. We hope we have now laid out our argument more clearly, with more appropriate supporting citations. In brief, the logic is this:

      1. Previous theory, modeling, and experiments on the topic of scale-free neural activity suggest that this phenomenon is an autonomous, internally generated thing, independent of anything the body is doing.

      2. Relatively new experiments (including those by Churchland’s lab and McCormick’s lab: Stringer 2019; Salkoff 2020; Clancy 2019; Musall 2019) suggest a different picture with a link between spontaneous behaviors and ongoing cortical activity, but these studies did not address any questions about scale-free-ness.

      3. Moreover, these new experiments show that behavioral variables only manage to explain about 10-30% of ongoing activity.

      4. Is this behaviorally explainable 10-30% scale-free, or do the scale-free aspects of cortical dynamics perhaps fall within the other 70-90%? Our goal is to find out.

      Digging a bit more on this issue, I would argue that results (b) and (c) also follow. By selecting subsets of neurons with very high cross-correlation, an effective latent variable has emerged. For example, the activity rasters of these subsets are similar to a population in which each neuron fires with the same time-varying rate (i.e., a heterogeneous Poisson process). Such models have been previously shown to be able to generate power-law distributed event sizes (see, eg., Touboul and Destexhe, 2017; also work by Priesemann). With this in mind, if you select from the entire population a set of neurons whose activity is effectively determined by a latent variable, do you not expect power laws in size distributions?

      Our understanding is that not all Poisson processes with a time-varying rate will result in a power law. It is essential that the fluctuations in rate themselves be power-law distributed. As a clear example of how this breaks down, consider a Poisson rate that varies according to a sine wave with fixed period and amplitude. In this case, the avalanche size distribution is definitely not scale-free; it would have a clear typical scale. Another point of view on this comes from some of the simplest models used to study criticality – e.g. all-to-all connected probabilistic binary neurons (like in Shew et al 2009 J Neurosci). These models do generate spiking with a time-varying Poisson rate when they are at criticality or away from criticality. But only when the synaptic strength is tuned to criticality is the time-varying rate going to generate power-law distributed avalanches. I think the Priesemann & Shriki paper made this point as well.
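
      The sine-wave counterexample can be checked numerically with a small simulation. The sketch below is illustrative only: it assumes an avalanche is a contiguous run of population activity above a threshold (here the median) and that its size is the summed excess over that threshold, which may differ from the definition used in the paper. The function name `avalanche_sizes` and all rate, period, and amplitude values are hypothetical.

```python
import numpy as np

def avalanche_sizes(activity, threshold):
    """Size of each contiguous run above threshold, as summed excess activity."""
    above = activity > threshold
    # Pad with False so that every run has both a start and an end edge.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, ends = edges[::2], edges[1::2]
    excess = activity - threshold
    return np.array([excess[s:e].sum() for s, e in zip(starts, ends)])

# Population spike count per 10 ms bin, driven by a sinusoidal Poisson rate
# (assumed values: mean 50 spikes/bin, 80% modulation, 2 s period, 60 s total).
rng = np.random.default_rng(0)
dt, duration, period = 0.01, 60.0, 2.0
t = np.arange(0.0, duration, dt)
mean_count = 50.0 * (1.0 + 0.8 * np.sin(2.0 * np.pi * t / period))
population = rng.poisson(mean_count).astype(float)

sizes = avalanche_sizes(population, np.median(population))
```

      Under these assumptions, the largest events are bounded by roughly the area of one positive half-cycle of the sine wave, so a histogram of `sizes` on log-log axes would be expected to show a cutoff at a characteristic scale rather than a straight line over several decades.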

      My second reservation has to do with the generality of the conclusions drawn from the mechanistic model. One of the connectivity motifs identified appears to be i+ to e- and i- to e+, where potentially i+/i- are SOM and VIP (or really any specific inhibitory type) cells. The specific connections to subsets of excitatory cells appear to be important (based on the solid lines in Figure 8). This seems surprising: is there any experimental support for excitatory cells to preferentially receive inhibition from either SOM or VIP, but not both?

      R2b: There is indeed direct experimental support for the competitive relationship between SOM, VIP, and functionally distinct groups of excitatory neurons. This was shown in the paper by Josh Trachtenberg’s group: Garcia-Junco-Clemente et al 2017. An inhibitory pull-push circuit in frontal cortex. Nat Neurosci 20:389–392. However, we emphasize that we also showed (lower left motif in Fig 8G) that a simpler model with only one inhibitory group is sufficient to explain the anticorrelations and scale-free dynamics we observe. We opted to highlight the model with two inhibitory groups since it can also account for the Garcia-Junco-Clemente et al results.

      In the section where we describe the model, we state, “We considered two inhibitory groups, instead of just one, to account for previous reports of anticorrelations between VIP and SOM inhibitory neurons in addition to anticorrelations between groups of excitatory neurons (Garcia-Junco-Clemente et al., 2017).”

      More broadly, I wonder if the neat diagrams drawn here are misleading. The sample raster, showing what appears to be the full simulation, certainly captures the correlated/anti-correlated pattern of the 100 cells most correlated with a seed cell and 100 cells most anti-correlated with it, but it does not contain the 11,000 cells in between with zero to moderate levels of correlation.

      R2c: We agree that our original model has several limitations and that one of the most obvious features lacking in our model is asynchronous neurons (The limitations are now discussed more openly in the last paragraph of the model subsection). In the data from the Garcia-Junco-Clemente et al paper above there are many asynchronous neurons as well. To ameliorate this limitation, we have now created a modified model that now accounts for asynchronous neurons together with the competing anticorrelated neurons (now shown and described in Fig S9). We put this modified model in supplementary material and kept the simpler, original model in the main findings of our work, because the original model provides a simpler account of the features of the data we focused on in our work – i.e. anticorrelated scale-free fluctuations. The addition of the asynchronous population does not substantially change the behavior of the two anticorrelated groups in the original model.

      We probably expect that the full covariance matrix has similar structure from any seed (see Meshulam et al. 2019, PRL, for an analysis of scaling of coarse-grained activity covariance), and this suggests multiple cross-over inhibition constraints, which seem like they could be hard to satisfy.

      R2d: We agree that it remains an outstanding challenge to create a model that reproduces the full complexity of the covariance matrix. We feel that this challenge is beyond the scope of this paper, which is already arguably squeezing quite a lot into one manuscript (one reviewer already suggested removing figures!).

      We added a paragraph at the end of the subsection about the model to emphasize this limitation of the model as well as other limitations. This new paragraph says:

      While our model offers a simple explanation of anticorrelated scale-free dynamics, its simplicity comes with limitations. Perhaps the most obvious limitation of our model is that it does not include neurons with weak correlations to both e+ and e- (those neurons in the middle of the correlation spectrum shown in Fig 7B). In Fig S9, we show that our model can be modified in a simple way to include asynchronous neurons. Another limitation is that we assumed that all non-zero synaptic connections were equal in weight. We loosen this assumption allowing for variable weights in Fig S9, without changing the basic features of anticorrelated scale-free fluctuations. Future work might improve our model further by accounting for neurons with intermediate correlations.

      The motifs identified in Fig. 8 likely exist, but I am left with many questions about what we learned regarding connectivity rules that would account for the full distribution of correlations. Would starting with an Erdos-Renyi network with slight over-representation of these motifs be sufficient? How important is the assumption of homogeneous connection weights from each pool - would allowing connection weights with some dispersion change the results?

      R2e: First, we emphasize that our specific goal with our model was to identify a possible mechanism for the anticorrelated scale-free fluctuations that played the key role in our analyses. We agree that this is not a complete account of all correlations, but this was not the goal of our work. Nonetheless, our new modified model in Fig S9 now accounts for additional neurons with weak correlations. However, we think that future theoretical/modeling work will be required to better account for the intermediate correlations that are also present in the experimental data.

      We confirmed that an Erdos-Renyi network of E and I neurons can produce scale-free dynamics, but cannot produce substantial anticorrelated dynamics (Fig 8G, top right motif). Additionally, the parameter space study we performed with our model in Fig 8 showed that if the interactions between the two excitatory groups exceed a certain tipping point density, then the model behavior switches to behavior expected from an Erdos-Renyi network (Fig 8F). Finally, we have now confirmed that some non-uniformity of synaptic weights does not change the main results (Fig S9). In the model presented in Fig S9, the value of each non-zero connection weight was drawn from a uniform distribution [0,0.01] or [-0.01,0] for excitatory and inhibitory connections, respectively. All of these facts are described in the model subsection of the paper results.
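
      For concreteness, a weight matrix of the kind described for Fig S9 (Erdos-Renyi connectivity, with non-zero weights drawn uniformly from [0,0.01] for excitatory and [-0.01,0] for inhibitory connections) could be generated roughly as follows. This is a hedged sketch rather than the model code: the function name `random_ei_weights`, the population sizes, the connection probability, and the exclusion of self-connections are assumptions introduced for illustration.

```python
import numpy as np

def random_ei_weights(n_exc, n_inh, p_connect, w_max=0.01, seed=0):
    """Random E/I weight matrix W[post, pre] with uniformly distributed weights.

    Columns 0..n_exc-1 are excitatory presynaptic neurons (weights >= 0);
    the remaining columns are inhibitory (weights <= 0).
    """
    rng = np.random.default_rng(seed)
    n = n_exc + n_inh

    # Erdos-Renyi connectivity: each directed connection exists independently
    # with probability p_connect (self-connections excluded by assumption).
    mask = rng.random((n, n)) < p_connect
    np.fill_diagonal(mask, False)

    # Weight magnitudes drawn uniformly from [0, w_max).
    magnitudes = rng.uniform(0.0, w_max, size=(n, n))

    # The sign is set by the presynaptic neuron type.
    sign = np.ones(n)
    sign[n_exc:] = -1.0
    return mask * magnitudes * sign[np.newaxis, :]

# Assumed sizes: 80 excitatory and 20 inhibitory neurons, 10% connection density.
W = random_ei_weights(n_exc=80, n_inh=20, p_connect=0.1)
```

      Setting all non-zero magnitudes to a single constant instead of drawing them from `rng.uniform` would give the equal-weight case that the original model assumed.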

      As a whole, this paper has the potential to make an impact on how large-scale neural and behavioral recordings are analyzed and interpreted, which is of high interest to a large contingent of the field.

      Reviewer #3 (Public Review):

      The primary goal of this work is to link scale free dynamics, as measured by the distributions of event sizes and durations, of behavioral events and neuronal populations. The work uses recordings from Stringer et al. and focuses on identifying scale-free models by fitting the log-log distribution of event sizes. Specifically, the authors take averages of correlated neural sub-populations and compute the scale-free characterization. Importantly, neither the full population average nor random uncorrelated subsets exhibited scale-free dynamics, only correlated subsets. The authors then work to relate the characterization of the neuronal activity to specific behavioral variables by testing the scale-free characteristics as a function of correlation with behavior. To explain their experimental observation, the authors turn to classic e-i network constructions as models of activity that could produce the observed data. The authors hypothesize that a winner-take-all e-i network can reproduce the activity profiles and therefore might be a viable candidate for further study. While the paper is well written, I find that there are a significant number of potential issues that should be clarified. Primarily I have three main concerns: 1) The data processing seems to have the potential to distort features that may be important for this analysis (including missed detections and dynamic range), 2) The analysis jumps right to e-i network interactions, while there seems to be a much simpler and more general explanation that could describe their observations (which has to do with the way they are averaging neurons), and 3) The relationship between the neural and behavioral data could be further clarified by accounting for the lop-sidedness of the data statistics. I have included more details about my concerns below.

      Main points:

      1) Limits of calcium imaging: There is a large uncertainty that is not accounted for in dealing with smaller events. In particular there are a number of studies now, both using paired electro-physiology and imaging [R1] and biophysical simulations [R2], that show that small neural events are often not visible in the calcium signal. Moreover, this problem may be exacerbated by the fact that the imaging is at 3Hz, much lower than the more typical 10-30Hz imaging speeds. The effects of this missing data should be accounted for, as it could be a potential source of large errors in estimating the neural activity distributions.

      R3a: We appreciate the concern here and agree that event size statistics could in principle be biased in some systematic way by spikes missed during deconvolution of Ca signals. To directly test this possibility, we performed a new analysis of spike data recorded with high time resolution electrophysiology. We began with a forward-modeling process to create a low-time-resolution, Ca-like signal, and then deconvolved it using the same algorithm (OASIS) that was used to generate the data we analyzed in our work here. In agreement with the reviewer’s concern, we found that spikes were sometimes missed, but the loss was not extreme and did not impact the neural event size statistics in a significant way compared to the ground truth we obtained directly from the original spike data (with no loss of spikes). This new work is now described in a new paragraph at the end of the subsection of results related to Fig 3 and in a new Fig S3. The new paragraph says…

      Two concerns with the data analyzed here are that it was sampled at a slow time scale (3 Hz frame rate) and that the deconvolution methods used to obtain the data here from the raw GCAMP6s Ca imaging signals are likely to miss some activity (Huang et al., 2021). Since our analysis of neural events hinges on summing up activity across neurons, could it be that the missed activity creates systematic biases in our observed event size statistics? To address this question, we analyzed some time-resolved spike data (Neuropixel recording from Stringer et al 2019). Starting from the spike data, we created a slow signal, similar to that we analyzed here by convolving with a Ca-transient, down sampling, deconvolving, and z-scoring (Fig S3). We compared neural event size distributions to “ground truth” based on the original spike data (with no loss of spikes) and found that the neural event size distributions were very similar, with the same exponent and same power-law range (Fig S3). Thus, we conclude that our reported neural event size distributions are reliable.

      However, although loss of spikes did not impact the event size distributions much, the time-scale of measurement did matter. As discussed above and shown in Fig S4, changing from 5 ms time resolution to 330 ms time resolution does change the exponent and the range of the power law. However, in the test data set we worked with, the existence of a power law was robust across time scales.
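      The forward-modeling steps named above (convolve with a Ca transient, downsample, deconvolve, z-score) can be illustrated with a minimal sketch. It is not the pipeline used in the paper: the function name `forward_model`, the exponential kernel, the decay constant `tau`, and the simple AR(1) inversion used in place of OASIS are all assumptions made for illustration.

```python
import numpy as np

def forward_model(spikes, dt=0.005, tau=1.0, frame_dt=0.33):
    """Turn a binned spike train into a slow, Ca-like, z-scored signal.

    spikes   : 1-D array of spike counts in bins of width dt (seconds)
    tau      : assumed decay constant of the Ca transient (seconds)
    frame_dt : assumed imaging frame period (about 3 Hz)
    """
    # 1) Convolve with an exponentially decaying Ca transient (assumed shape).
    t = np.arange(0, 5 * tau, dt)
    kernel = np.exp(-t / tau)
    calcium = np.convolve(spikes, kernel)[: len(spikes)]

    # 2) Downsample by averaging all fine bins that fall in one imaging frame.
    step = int(round(frame_dt / dt))
    n_frames = len(calcium) // step
    slow = calcium[: n_frames * step].reshape(n_frames, step).mean(axis=1)

    # 3) Stand-in deconvolution (NOT OASIS): invert the AR(1) decay
    #    frame by frame and clip negative values to zero.
    g = np.exp(-frame_dt / tau)
    deconv = np.empty_like(slow)
    deconv[0] = slow[0]
    deconv[1:] = slow[1:] - g * slow[:-1]
    deconv = np.clip(deconv, 0, None)

    # 4) z-score the result.
    return (deconv - deconv.mean()) / deconv.std()
```

      Event size distributions computed from the output of such a function could then be compared with those computed directly from the original spike train, which is the comparison described for Fig S3.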

      2) Correlations and power-laws in subsets. I have a number of concerns with how neurons are selected and partitioned to achieve scale-free dynamics. 2a) First, it's unclear why the averaging is required in the first place. This operation projects the entire population down in an incredibly lossy way and removes much of the complexity of the population activity.

      R3b: Our population averaging approach is motivated by theoretical predictions and previous work. According to established theoretical accounts of scale-free population events (i.e. non-equilibrium critical phenomena in neural systems) such population-summed event sizes should have power law statistics if the system is near a critical point. This approach has been used in many previous studies of scale-free neural activity (e.g. all of those cited in the introduction in relation to scale-free neuronal avalanches). One of the main results of our study is that the existing theories and models of critical dynamics in neural systems fail to account for small subsets of neurons with scale-free activity amid a larger population that does not conform to these statistics. We could not make this conclusion if we did not test the predictions of those existing theories and models.

      2b) Second, the authors state that it is highly curious that subsets of the population exhibit power laws while the entire population does not. While the discussion and hypothesizing about different e-i interactions is interesting I believe that there's a discussion to be had on a much more basic level of whether there are topology independent explanations, such as basic distributions of correlations between neurons that can explain the subnetwork averaging. Specifically, if the correlation to any given neuron falls off, e.g., with an exponential falloff (i.e., a Gaussian Process type covariance between neurons), it seems that similar effects should hold. This type of effect can be easily tested by generating null distributions using code bases such as [R3]. I believe that this is an important point, since local (broadly defined) correlations of neurons implying the observed subnetwork behavior means that many mechanisms that have local correlations but don't cluster in any meaningful way could also be responsible for the local averaging effect.

      R3c: We appreciate the reviewer’s effort, trying out some code to generate a statistical model. We agree that we could create such a statistical model that describes the observed distribution of pairwise correlations among neurons. For instance, it would be trivial to directly measure the covariance matrix, mean activities, and autocorrelations of the experimental data, which would, of course, provide a very good statistical description of the data. It would also be simple to generate more approximate statistical descriptions of the data, using multivariate gaussians, similar to the code suggested by the reviewer. However, we emphasize, this would not meet the goal of our modeling effort, which is mechanistic, not statistical. The aim of our model was to identify a possible biophysical mechanism from which certain observed statistical features of the data emerge. We feel that a statistical model is not a suitable strategy to meet this aim. Nonetheless, we agree with the reviewer that clusters with sharp boundaries (like the distinction between e+ and e- in our model) are not necessary to reproduce the cancelation of anticorrelated neurons. In other words, we agree that sharp boundaries of the e+ and e- groups of our model are not crucial ingredients to match our observations.

      2c) In general, the discussion of "two networks" seems like it relies on the correlation plot of Figure 7B. The decay away from the peak correlation is sharp, but there does not seem to be significant clustering in the anti-correlation population, instead a very slow decay away from zero. The authors do not show evidence of clustering in the neurons, nor any biophysical reason why e and i neurons are present in the imaging data.

      R3d: First a small reminder: As stated in the paper, the data here is only showing activity of excitatory neurons. Inhibitory neurons are certainly present in V1, but they are not recorded in this data set. Thus we interpret our e+ and e- groups as two subsets of anticorrelated excitatory neurons, like those we observed in the experimental data. We agree that our simplified model treats the anticorrelated subsets as if they are clustered, but this clustering is certainly not required for any of the data analyses of experimental data. We expect that our model could be improved to allow for a less sharp boundary between e+ and e- groups, but we leave that for future work, because it is not essential to most of the results in the paper. This limitation of the model is now stated clearly in the last paragraph of the model subsection.

      The alternative explanation (as mentioned in (b)) is that there is a more continuous set of correlations among the neurons with the same result. In fact I tested this myself using [R3] to generate some data with the desired statistics, and the distribution of events seems to also describe this same observation. Obviously, the full test would need to use the same event identification code, and so I believe that it is quite important that the authors consider the much more generic explanation for the sub-network averaging effect.

      R3e: As discussed above, we respectfully disagree that a statistical model is an acceptable replacement for a mechanistic model, since we are seeking to understand possible biophysical mechanisms. A statistical model is agnostic about mechanisms. We have nothing against statistical models, but in this case, they would not serve our goals.

      To emphasize our point about the inadequacy of a statistical model for our goals, consider the following argument. Imagine we directly computed the mean activities, covariance matrix, and autocorrelations of all 10000 neurons from the real data. Then, we would have in hand an excellent statistical model of the data. We could then create a surrogate data set by drawing random numbers from a multivariate gaussian with same statistical description (e.g. using code like that offered by reviewer 3). This would, by construction, result in the same numbers of correlated and anticorrelated surrogate neurons. But what would this tell us about the biophysical mechanisms that might underlie these observations? Nothing, in our opinion.
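      The surrogate construction described in this argument can be sketched in a few lines. This is an illustrative sketch only, not the code offered by the reviewer: the function name `gaussian_surrogate` is hypothetical, and for brevity it matches only the means and the covariance matrix, drawing time points independently, so autocorrelations are not reproduced.

```python
import numpy as np

def gaussian_surrogate(data, seed=0):
    """Draw surrogate activity with the same mean and covariance as `data`.

    data : array of shape (n_timepoints, n_neurons)
    Returns an array of the same shape sampled from a multivariate
    gaussian. Time points are independent draws, so temporal structure
    (autocorrelation) is deliberately NOT preserved in this sketch.
    """
    rng = np.random.default_rng(seed)
    mean = data.mean(axis=0)            # per-neuron mean activity
    cov = np.cov(data, rowvar=False)    # neuron-by-neuron covariance
    return rng.multivariate_normal(mean, cov, size=data.shape[0])
```

      By construction the surrogate reproduces the pairwise correlation structure, including anticorrelated pairs, without saying anything about how that structure is generated.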

      2d) Another important aspect here is how single neurons behave. I didn't catch if single neurons were stated to exhibit a power law. If they do, then that would help in that there are different limiting behaviors to the averaging that pass through the observed stated numbers. If not, then there is an additional oddity that one must average neurons at all to obtain a power law.

      R3f: We understand that our approach may seem odd from the point of view of a central-limit-theorem-type argument. However, as mentioned above (reply R3b) and in our paper, there is a well-established history of theory and corresponding experimental tests for power-law distributed population events in neural systems near criticality. The prediction from theory is that the population summed activity will have power-law distributed events or fluctuations. That is the prediction that motivates our approach. In these theories, it is certainly not necessary that individual neurons have power-law fluctuations on their own. In most previous theories, it is necessary to consider the collective activity of many neurons before the power-law statistics become apparent, because each individual neuron contributes only a small part to the emergent, collective fluctuations. This phenomenon does not require that each individual neuron have power-law fluctuations.

      At the risk of being pedantic, we feel obliged to point out that one cannot understand the peculiar scale-free statistics that occur at criticality by considering the behavior of individual elements of the system; hence the notion that critical phenomena are “emergent”. This important fact is not trivial and is, for example, why there was a Nobel prize awarded in physics for developing theoretical understanding of critical phenomena.

      3) There is something that seems off about the range of \beta values inferred with the ranges of \tau and \alpha. With \tau in [0.9,1.1], the denominator 1-\tau is in [-0.1, 0.1], which the authors state means that \beta (found to be in [2,2.4]) is not near \beta_{crackling} = (\alpha-1)/(1-\tau). It seems that the opposite is true, as the range of possible values of \beta_{crackling} is huge due to the denominator, and so \beta is in the range of possible \beta_{crackling} almost vacuously. Was this statement just poorly worded?

      R3g: The point here is that theory of crackling noise predicts that the fit value of beta should be equal to (1-alpha)/(1-tau). In other words, a confirmation of the theory would have all the points on the unity line in the rightmost panels of Fig 9D and 9E, not scattered by more than an order of magnitude around the unity line. (We now state this explicitly in the text where Fig 9 is discussed.) Broad scatter around the unity line means the theory prediction did not hold. This is well established in previous studies of scale-free brain dynamics and crackling noise theory (see for example Ma et al Neuron 2019, Shew et al Nature Physics 2015, Friedman et al PRL 2012). A clearer single example of the failure of the theory to predict beta is shown in Fig 5A,B, and C.
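      For reference, the relation quoted in this reply can be written out explicitly. The block below assumes the usual convention, which is not spelled out in the exchange itself, that \tau and \alpha are the exponents of the event size and event duration distributions and that \beta is the exponent relating mean event size to duration.

```latex
\[
P(S) \sim S^{-\tau}, \qquad P(T) \sim T^{-\alpha}, \qquad \langle S \rangle (T) \sim T^{\beta}
\]
\[
\beta_{\mathrm{crackling}} = \frac{1-\alpha}{1-\tau} = \frac{\alpha-1}{\tau-1}
\]
\[
\frac{\partial \beta_{\mathrm{crackling}}}{\partial \tau} = -\frac{\alpha-1}{(\tau-1)^{2}}
\]
```

      The derivative grows without bound as \tau approaches 1, so a small uncertainty in \tau near 1 corresponds to a very wide range of predicted values. This is consistent with testing the prediction recording by recording against the unity line rather than by checking whether the ranges of the fitted and predicted values overlap.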

      4) Connection between brain and behavior:

      4a) It is not clear if there is more to what the authors are trying to say with the specifics of the scale free fits for behavior. From what I can see those results are used to motivate the neural studies, but aside from that the details of those ranges don't seem to come up again.

      R3h: The reviewer is correct, the primary point in Fig 2 is that scale-free behavioral statistics often exist. Beyond this point about existence, reporting of the specific exponents and ranges is just standard practice for this kind of analysis; a natural question to ask after claiming that we find scale-free behavior is “what are the exponents and ranges”. We would be remiss not to report those numbers.

      4b) The primary connection between neuronal and behavioral activity seems to be Figure 4. The distribution of points in these plots seems to be very lopsided, in that some plots have large ranges with few-to-no data points. It would be very helpful to get a sense of the distribution of points, which is a bit hard to see given the overlapping points and superimposed lines.

      R3i: We agree that this whitespace in the figure panels is somewhat awkward, but we chose to keep the horizontal axis the same for all panels of Fig 4B, because this shows that not all behaviors, and not all animals, had the same range of behavioral correlations. We felt that hiding this would be a bit misleading, so we kept the white space.

      4c) Neural activity correlated with some behavior variables can sometimes be the most active subset of neurons. This could potentially skew the maximum sizes of events and give behaviorally correlated subsets an unfair advantage in terms of the scale-free range.

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, Scalabrino et al. show persistent cone-mediated RGC signaling despite changes in cone morphology and density with rod degeneration in the CNGB1 mouse model of retinitis pigmentosa. The authors use a linear-nonlinear receptive field model to measure functional changes (spatial and temporal filters and gain) across the RGC populations with space-time separable receptive fields. At mesopic and photopic conditions, receptive field changes were minor until rod death exceeded 50%, while response gain decreased with photoreceptor degeneration. Using information theory, the authors evaluated the fidelity of RGC signaling and demonstrated that mutual information decreased with rod loss, but cone-mediated RGC signaling was relatively stable and was more robust for natural movies than for artificial stimuli. This work reveals the preservation of cone function and a robustness in encoding natural movies across degeneration. This manuscript is the first demonstration of using information theory to evaluate the effects of neural degeneration on sensory coding. The study uses a systematic evaluation of rod and cone function in this model of rod degeneration to make the following findings: (1) cone function persists for 5-7 months, (2) spatial and temporal changes to the ganglion cell receptive fields were not monotonic with time, (3) mutual information between spikes and photopic stimuli remained relatively constant up to 3-5 months, and (4) information rates were higher for natural movies than for checkerboard noise stimuli.

      The strengths of this paper include the following:

      A systematic evaluation of potentially confusing data. The authors do an excellent job of organizing the results in terms of light levels and time points. The results themselves are confusing and it is difficult to draw conclusions across metrics, but the data are presented as clearly as possible. The work is especially well executed and presented.

      The insight that cone responses remain relatively stable despite rod loss. The study clearly demonstrates that despite cone loss and morphological changes, cone-mediated responses remain robust and functional.

      The application of information theory to degeneration is the first of its kind and the study clearly shows the utility of the metric.

      The results are thoughtfully interpreted.

      We thank the reviewer for these comments.

      The weaknesses of this study include the following:

      The inability to follow the same ganglion cell types over time is a major weakness that could confound the interpretation in terms of whether the changes are happening from artifacts of the recording method or from dynamic changes in the pooled population of ganglion cells. Is there even a single cell class, for example the ON-OFF direction-selective ganglion cells, that this group has so well quantified on the MEA, that the study could track over time, in addition to examining the pooled population changes over time? Tracking a single cell type for each of the metrics would make the population data more convincing or could clearly show that not all ganglion cells follow the population trend.

      As suggested by the reviewer, we have added a cell type that is tracked through all the analyses: ON brisk sustained RGCs. Example receptive field mosaics, temporal receptive fields, and spike train autocorrelation functions for WT and 4M Cngb1neo/neo animals are shown in Figure 2-figure supplement 1E-F. These RGCs follow the trends displayed by the larger populations of RGCs in each analysis. We chose this cell type because they are readily identified by their spike train autocorrelation functions compared to other RGC types and they have approximately space-time separable receptive fields (RFs). There are many text changes associated with adding an analysis of the ON Brisk sustained RGCs (see lines 202-207; 227-229; 264-267, etc).

      We chose not to focus on direction selective RGCs because we are analyzing the spatial and temporal RFs of RGCs in Figures 3-5 and direction-selective RGCs do not have space-time separable RFs (see example in Figure 2C-D). Thus, those cells could not be used to track those receptive field properties across degeneration. Also, we did not collect responses to drifting gratings or bar responses across a range of speeds or contrasts, so we are unable to reliably distinguish the different types of direction-selective RGCs (e.g., ON vs ON-OFF) from these data.
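      One common way to quantify space-time separability, shown here only as an illustrative sketch, is to ask how much of a spatiotemporal receptive field is captured by a single spatial profile multiplied by a single temporal profile. The function name `separability_index` and the use of a singular value decomposition are assumptions; the selection criterion actually used in the study is the one described with Figure 2.

```python
import numpy as np

def separability_index(sta):
    """Fraction of receptive field energy captured by a rank-1 (separable) fit.

    sta : array of shape (n_pixels, n_time_lags), e.g. a spike-triggered
          average with space flattened along the first axis.
    Returns a value in (0, 1]; 1 means perfectly space-time separable.
    An all-zero input is not handled and would divide by zero.
    """
    # Singular values of the space-by-time matrix, largest first.
    s = np.linalg.svd(np.asarray(sta, dtype=float), compute_uv=False)
    # Energy in the first component relative to the total energy.
    return float(s[0] ** 2 / np.sum(s ** 2))
```

      A receptive field built as an outer product of one spatial and one temporal profile gives an index of 1, whereas a field whose spatial profile shifts with time lag, as expected for a direction-selective cell, spreads its energy over several components and gives a lower value.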

      While the non-monotonic changes are interesting, they are also difficult to make sense of. Can the authors speculate in the Discussion on what underlying mechanisms could give rise to the non-monotonic changes? In the absence of potential mechanisms, the concern of recording artifacts arises.

      Thank you for raising this point. We have added some speculation for the cause of these non-monotonic changes in the Discussion (lines 455-462). “While we do not know why non-monotonic changes are occurring for some RF properties, they largely occurred in the 3-5M range. During this time, there is a transient decrease in the rate of rod death (4-5M) and cone death begins (Figure 1). Consequently, there may be complex changes to retinal circuitry as the retina reacts to a temporary stabilization in rod numbers and an acceleration in cone death. Intracellular studies of the light-driven synaptic currents impinging onto bipolar cells and RGCs during this time will be important for understanding the origin of these non-monotonic changes in RF properties.”

      The mutual information calculation seems to be correlated with the spike rate despite the argument made in Fig 10E-F. Can the authors show this directly by calculating the bits per spike in Figures 8 and 9? Of all the metrics, the gain function and the mutual information seem to be more consistent with each other. Can the authors demonstrate or refute a connection between the spike rate and information rates?

      We added a supplementary figure to each of the information figures (see figure supplement 1 for each of Figures 8-10) showing that the trends hold after dividing the information rate by the spike rate. Certainly, changing spike rates are contributing, but there are also clear changes in the bits/spike plots (Figure 8-figure supplement 1D; Figure 9-figure supplement 1D, Figure 10-figure supplement 1D).

      Can the authors provide an explanation for why the mutual information calculation remains stable despite lower SNR and lower gain, especially after the contributions of oscillations have been ruled out?

      The mutual information depends more strongly on the precision of spiking (both in terms of time and spike number within a small time bin) than the mean spike rate (averaged over the stimulus). Diminishing the total number of spikes (because of reduced gain) will have a relatively small effect on the information rate if the spike trains continue to exhibit low variability (high precision). Indeed, spike generation by RGCs is distinctly sub-Poisson (Berry, Warland, and Meister 1997), indicating it can exhibit relatively high information rates even when spike rates are relatively low. We clarified this in Results at lines 493-496.
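      The notion of sub-Poisson spiking used in this reply can be made concrete with the Fano factor, the ratio of spike count variance to mean across repeated presentations of the same stimulus. The sketch below is illustrative and is not the information rate calculation used in the study; the function name `fano_factor` and the input layout are assumptions.

```python
import numpy as np

def fano_factor(counts):
    """Mean variance-to-mean ratio of spike counts across stimulus repeats.

    counts : array of shape (n_repeats, n_bins), spike counts per time bin
             for repeated presentations of the same stimulus.
    A Poisson process gives a value near 1; values below 1 indicate
    sub-Poisson (more precise) spiking.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=0)            # mean count in each time bin
    var = counts.var(axis=0, ddof=1)      # across-repeat variance per bin
    active = mean > 0                     # skip bins with no spikes at all
    return float(np.mean(var[active] / mean[active]))
```

      A cell whose gain is reduced fires fewer spikes per bin, but if its Fano factor stays well below 1 the response to each stimulus segment remains reproducible, which is the property the argument above relies on.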

      Lack of age-matched WT controls to accompany the different time points. It is known that photoreceptor degeneration can occur naturally in WT mice. Though the authors have used controls pooled from across the ages used in the CNGB1 mutants, it would be informative to know if there are age-dependent changes in any of the metrics for WT mice.

      WT recordings were pooled from retinas from littermate control mice between 2 and 7 months of age (n=3 at 2M; n=1 each at 4M, 6M, and 7M). We have added data points from individual retinal recordings to the figure supplements for Figures 2-6 and 8-10 to illustrate the consistency between these recordings, which allowed us to confidently pool the results.

      Can the authors elaborate on why cone function persists despite the rod loss and morphological changes? This is unique for other models of rod loss and is worth extra discussion.

      This is something we are also very interested in, but outside the scope of this study. The Sampath Lab (co-author and collaborator) has data from single cell recordings in late stage rd10 retinas that show abnormal cone signaling (and structure similar to the 7M Cngb1neo/neo cones), yet relatively normal cone bipolar cell and horizontal cell responses. Thus, somehow there is either compensation or a high level of redundancy in the transmission of signals from cones to 2nd-order neurons that makes the responses of the 2nd-order neurons robust to deteriorating cone function. These results suggest our observations in Cngb1neo/neo mice are not unique to this model of RP. Future experiments are needed to understand how this compensation is occurring.

      Reviewer #2 (Public Review):

      In this study, the authors assess the decline of retinal function in a mouse model of slow photoreceptor degeneration - the Cngb1neo/neo. Rod loss occurs between 1-7 months and complete cone loss occurs by 8-9 months. The authors characterize cone loss in the first 7 months and find that 70% of cones are still there at 7 months, though their outer segments are highly degraded. They then use MEA recordings to characterize retinal function using a variety of measures. First, they use spike-triggered averaging to determine the spatial and temporal receptive fields, restricting this analysis to RGCs that have separable spatial and temporal receptive fields. They find that both rod and cone receptive fields are surprisingly intact over the first 5 months, identifying primarily a reduction in contrast response functions (and a reduction in the number of rods that are light responsive, though this is not quantified). Second, they show that oscillatory activity does not appear until after photoreceptors are completely deteriorated, in sharp contrast to other PR degeneration models (e.g. rd10) in which oscillatory activity appears while there are still light-evoked responses. Third, they use information theory to assess the reliability of signaling. When examining the 10% of RGCs with the highest information rates they see a significant decrease at mesopic light levels, while information rates were mostly stable at photopic light levels. Finally, they showed that at photopic light levels, the mutant retinas conveyed more information about natural movies than a repeating checkerboard, and this was maintained across light levels.

      My primary question is whether this represents a significant advance. There have been many studies regarding the changing retinal circuits in various rodent models of photoreceptor degeneration. The authors make a few arguments regarding the uniqueness of this study.

      One is that this is a novel analysis that is not limited to particular cell types but rather characterizes the retina as a "whole". But this point is also its weakness. First, one cannot speak to the retina as a "whole" since they state that there is a reduction in the number of light-responsive cells across degeneration - yet they do not quantify it. This seems incredibly important to know because, even presuming the remaining cells have perfect receptive field structure, if only 10% of cells are left, assessing the receptive fields of only the remaining cells is clearly not a characterization of the retention of visual function.

      We never claim that we have assessed the “retina as a whole”. We do state that we are measuring certain features of RGC signaling that reflect the “net changes” induced by photoreceptor degeneration (e.g., changes in photoreceptor function, retinal rewiring, homeostatic mechanisms, etc.) on those features. In fact, we are explicit that we are only measuring certain RF properties in certain RGC types, such as the linear spatial and temporal RFs in cells with space-time separable RFs: Figure 2 makes this point explicitly. We do not measure changes in direction-selectivity, object motion sensitivity, orientation selectivity, edge detection, looming detection, luminance encoding, chromatic opponency, contrast adaptation, motion reversal signaling, etc., because doing so would produce a manuscript with at least one figure for every RGC type (e.g., 45 figures). This would clearly be an unreasonable amount for a single study.

      We agree with the Reviewer that explicitly quantifying the number of light responsive RGCs is important, and we now include this information as a function of degeneration time point in Figure 2-figure supplement 1. Under photopic conditions, this fraction is quite stable until 5M and then begins to deteriorate. We also observe a decrease in the number of RGCs with space-time separable RFs at 5M (Figure 2F), suggesting (but not proving) that these RGCs are representative of changes across all RGCs. We also described these results in the Results (lines 167-174).

      Second, it is hard to assess whether this mouse model is better than existing models for human disease. Their phenotype is different than the rat model of this same disease. It also shows a lack of oscillatory activity that is apparent in rd models.

      We are not making the claim that this model is better than other models. Each model has value. However, because the degeneration in this model is relatively slow, it may be more representative of changes that occur in slower forms of human retinal degeneration (emphasis on “may be”). This is a discussion point, not something that we are aiming to prove. We also believe the utility of a model depends on the questions being asked. In this case, we aimed to track changes over time during photoreceptor loss to better understand the extent to which retinal output is impaired.

      Also, retinitis pigmentosa is a heterogeneous disease with a spectrum of phenotypes that may or may not be genotype specific. A patient with a PDE6B mutation presents with differing phenotypes than a patient with a CNGB1 mutation, despite both having an RP diagnosis. It is a fallacy to assume a mouse is the exact same as a human, just as it is incorrect to assume clinical presentations are identical for all patients for one broad disease that is known to have a diverse set of underlying causes. Studying a range of models is thus essential to understanding the disease. Given that mutations causing RP have different impacts on retinal signaling, we believe it is important to contextualize findings to their mutation. We make this point in Discussion: Comparison to previous studies of RGC signaling in retinitis pigmentosa (beginning on line 436).

      Finally, the model we study does not lack oscillatory activity; it simply arises later than in rd1 or rd10 mice and does so only after all the photoreceptors have died (Figure 7). To our knowledge, it is not clear when or even if RGCs exhibit oscillations in human patients with RP. We discuss why oscillations might arise at different time points in different genetic models of RP in lines 555-570.

      Reviewer #3 (Public Review):

      In the manuscript by Scalabrino et al. a rigorous characterization of the functionality of retinal ganglion cells in a mouse model of rod photoreceptor degeneration is presented. The authors analyzed the degeneration of cone photoreceptors, which is known to be linked to rod degeneration. Based on the time course of cone degeneration they investigated the functional properties of retinal ganglion cells aged between 1 month and seven months.

      The most interesting finding is robust preservation of functional properties, as reflected in little changes of the receptive fields (spatial and temporal characteristics) or signaling fidelity/information rate. In contrast to other mouse models, the present one shows no oscillatory activity until a complete loss of cone photoreceptors occurred at an age of nine months.

      Although the receptive fields of retinal ganglion cells remain nearly intact, the number of ganglion cells with identifiable receptive fields decreases significantly with age (Fig.2F). Could the authors comment on whether this might imply a "patchy" vision?

      Visual field loss is a predominant clinical observation in patients with retinitis pigmentosa, including those with Cngb1 mutations. We connect to this observation in the Discussion at lines 521-529: “At the latest stages of photoreceptor degeneration in the Cngb1neo/neo mice (5-7M), we did observe a decrease in the fraction of RGCs with spike rates that were strongly modulated by checkerboard noise (Supplemental Figure 2). It is possible these RGCs were losing their light response completely, or that changes in their light response properties made them relatively unresponsive to checkerboard noise. If the former, it is possible that light responsive RGCs are becoming sparser at the later stages of degeneration which may result in inhomogeneous, or “patchy”, visual sensitivity described by RP patients (see reviews by Hull et al., 2017; Nassisi et al., 2021).”

      Reviewer #4 (Public Review):

      Scalabrino et al. report the remarkable persistence of cone-driven retinal ganglion cell responses in a mouse model of retinitis pigmentosa (i.e., Cngb1 KO mice). The authors first map the time course of primary rod and secondary cone degeneration in Cngb1 KO mice. Approximately 30% of rods are gone at one month (1M), and all rods are lost by 7M in Cngb1 KO retinas. The cone morphology changes progressively as rods degenerate; cone outer segments shrink and are largely absent by 5M. Cones die between 8-9M. Scalabrino et al. next perform multielectrode array recordings from wild-type and Cngb1 KO retinas from 1M to 5M in mesopic and photopic stimulus conditions. They find that spatiotemporal receptive fields remain relatively stable in the face of photoreceptor degeneration, whereas contrast gain gradually decreases. Oscillatory spontaneous ganglion cell activity emerges late (~9M) in Cngb1 KO mice compared to other retinal degeneration models. Finally, the authors analyze mutual information between stimuli (white noise and naturalistic movies) and ganglion cell spike trains and find that the encoding of the most informative ganglion cells is preserved relatively late into photoreceptor degeneration and that information rates decline less in photopic vs. mesopic conditions and for naturalistic movies vs. white noise stimuli.

      Overall, this is an exciting study that shows remarkable preservation of cone-driven ganglion cell light responses in advanced stages of a retinitis pigmentosa model when most rods have died and cone morphologies are dramatically altered. The results are presented clearly in the text and figures and are discussed in a scholarly manner. Nonetheless, the authors should address a few specific comments to clarify and better support some of the conclusions they draw.

      Specific comments:

      1) In describing the results on information encoding, the authors write and show data (panels A of Figures 8-10) that suggest that most ganglion cells, even in recordings from wild-type retinas, respond unreliably to white noise stimuli and naturalistic movies. Why does such a large fraction of cells have such low repeat reliability? Does this reflect unreliable spike detection and sorting, poor cell or tissue health, or true variability in the responses of healthy retinal ganglion cells? The latter does not seem to align with results from patch-clamp recordings targeted to specific ganglion cell types. The limited repeat reliability also raises questions about how well the linear-nonlinear model, which the authors use to compare responses between wild-type and Cngb1 KO mice of different ages, predicts the responses of these cells. Comparing model parameters (receptive field size, temporal filtering, and contrast sensitivity) between genotypes and ages only makes sense if the model is a good description in the acquired datasets.

      We agree with the reviewer that this is an important point to be clear about. In Figures 8-10, some RGCs exhibit high repeatability while others exhibit low repeatability, as quantified by their information rates. The reviewer is concerned about those cells with low repeatability and the ability to capture their responses with an LN model. This is a valid concern, but to be clear, we are not fitting an LN model to cells with low information rates. In Figures 3-6, where an LN model is being used to estimate the spatial and temporal components of the RFs, we are fitting a subset of all the RGCs: those with space-time separable RFs (see Figure 2). Those particular cells exhibit high information rates and highly reproducible responses, and an LN model captures ~60% of the explainable variance in the spike rate (see Figure 2-figure supplement 1A-B; also see lines 157-151). This is typical for LN models that approximately predict the responses of RGCs to checkerboard noise. Thus, we think the LN model reasonably captures the responses of cells for which we use the LN model. The information rate estimates include these cells as well as other cells that are not well described by an LN model. Note that the LN model is not used to calculate the mutual information rates. We have added text in the Results (lines 324-327) to clarify this.
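      As an illustration of what "fraction of explainable variance" can mean in this context, here is a minimal sketch in Python. It is not taken from the manuscript: the function name, the input layout (repeats by time bins), and the noise correction are assumptions chosen for clarity, and the authors' actual estimator may differ.

```python
import numpy as np

def explainable_variance_fraction(responses, prediction):
    """Fraction of explainable (stimulus-driven) variance captured by a model.

    responses  : array (n_repeats, n_bins), firing rate on each repeat
                 of the same stimulus (hypothetical layout).
    prediction : array (n_bins,), model-predicted firing rate.
    """
    responses = np.asarray(responses, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    n_repeats = responses.shape[0]

    mean_response = responses.mean(axis=0)
    # Trial-to-trial noise variance, averaged over time bins.
    noise_var = responses.var(axis=0, ddof=1).mean()
    # The trial average still carries noise_var / n_repeats of noise,
    # so subtract it to estimate the stimulus-driven variance.
    noise_in_mean = noise_var / n_repeats
    explainable_var = mean_response.var() - noise_in_mean
    if explainable_var <= 0:
        raise ValueError("no explainable variance: responses are dominated by noise")

    # Model error against the trial average, with the same noise correction.
    residual_var = np.mean((mean_response - prediction) ** 2) - noise_in_mean
    return 1.0 - residual_var / explainable_var
```

      Under these assumptions, a value near 0.6 would correspond to the "~60%" figure quoted above, and a cell whose responses barely repeat across trials would raise the error rather than return a misleading number.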

      In addition, the information rates we estimated in mouse are consistent with past studies from guinea pig (Koch et al., 2004 and Koch et al., 2006). We think cells with very low repeatability are not well driven by checkerboard noise or the particular 10 s natural movies we showed. We have updated the example neurons to better reflect the reliability of the cells near the median of the MI distributions in Figures 8-10.

      2) The authors should, maybe in figure supplements and parts of the main figures, break results down by recordings. Inter-experimental variability has been well documented (e.g., Shah et al. Neuron 2022, Zhao et al Sci Rep 2020), and it would be reassuring to see that the conclusions drawn by the authors are supported by statistics in which n = number of recordings (e.g., there is a somewhat difficult to explain broadening of temporal filters in 4M Cngb1 KO retinas that recover by 5M).

      We agree that inter-experiment variability can be large and is important to control for. We now show each analysis broken down by experiment in the Supplemental Figures (2, 3, 4, 5, 6, 8, 9, and 10). None of the trends we describe or highlight in the manuscript were driven by inter-experiment variability.

      3) At different points in their manuscript, the authors conclude that their results "suggest that homeostatic mechanisms in the retina serve to compensate for deteriorating photoreceptors" (or similar). I think that this may well be the case. However, in its present form, the study provides no evidence that retinal circuits in Cngb1 KO mice change to preserve function compared to the alternative that the observed stability is evidence for functional redundancy or resilience in retinal circuits (as they are) without the need for adjustments. Distinguishing between these alternatives would be conceptually important. For example, Care et al. Cell Rep 2019 and Care et al. Cell Rep 2020 used partial stimulation to activate fewer photoreceptors and compare light responses in downstream neurons to those in retinas with fewer photoreceptors. Other studies have directly observed changes in circuit wiring in models of retinal degeneration. If the authors cannot provide experimental evidence for homeostatic changes, it would be good to reflect this in the interpretation and discussion.

      The reviewer raises a terrific point and potential alternative interpretation. We agree. We have not been able to identify an equivalent analysis to that in Care et al. 2019 that we can run that will cleanly distinguish between these two possibilities, without doing many more experiments across timepoints of degeneration. We have thus rewritten portions of the Introduction and the Discussion to recognize the potential of this alternative interpretation.

      Introduction (lines 39-44): Alternatively, homeostatic plasticity or redundancy in retinal circuitry may compensate for photoreceptor loss (Care et al., 2020; Lee et al., 2021; Shen et al., 2020). Such mechanisms could facilitate reliable signaling at the level of retinal output, despite deterioration in photoreceptor function. Identifying the extent to which changes in photoreceptor morphology impact retinal output will inform treatment timepoints for gene therapies aimed at halting rod loss to preserve cone-mediated vision.

      Discussion (lines 514-520): There are two potential classes of mechanisms for this compensation. First, homeostatic plasticity has been documented in models of photoreceptor loss in which the retina remodels to preserve signal transmission (Care et al., 2019; Keck et al., 2013, 2011, 2008; Leinonen et al., 2020; Shen et al., 2020). Alternatively, functional redundancy within the circuit could explain how robust retinal signaling is retained longer than the changes in cone morphology would suggest (Care et al., 2020). This study did not distinguish between the two compensation models.

      4) The authors do not attempt to classify retinal ganglion cells into functional types as functional changes from degeneration may confound such classifications. However, it would be beneficial to separate some categorical response types (direction-selective ON-OFF and ON ganglion cells, maybe orientation-selective [horizontal, vertical, ON, OFF] ganglion cells) and compare how their responsiveness, reliability, and information encoding change with degeneration. This would provide additional insights and address concerns that changes caused by degeneration may be obscured by the differences between ganglion cell types in the present analysis.

      We agree. We now track ON brisk sustained RGCs across degeneration time points for the RF analyses and mutual information analyses. These RGCs are likely the ON sustained alpha cells because they generate very large spikes on the MEA, as would be expected for cells with large somata. Example receptive field mosaics, temporal receptive fields, and spike train autocorrelation functions for WT and 4M Cngb1neo/neo animals are shown in Figure 2-figure supplement 1E-F. These RGCs follow the trends displayed by the larger populations of RGCs in each analysis. We chose this cell type because they are readily identified by their spike train autocorrelation functions compared to other RGC types and they have approximately space-time separable receptive fields (RFs). There are many text changes associated with adding an analysis of the ON brisk sustained RGCs (see lines 202-207; 227-229; 264-267, etc.).

      We chose not to focus on direction selective RGCs because we are analyzing the spatial and temporal RFs of RGCs in Figures 3-5 and direction-selective RGCs do not have space-time separable RFs (see example in Figure 2C-D). Thus, those cells could not be used to track those receptive field properties across degeneration. Also, we did not collect responses to drifting gratings or bar responses across a range of speeds or contrasts, so we are unable to reliably distinguish the different types of direction-selective RGCs (e.g., ON vs ON-OFF) from these data.

    1. Author Response:

      Reviewer 2 (Public Review):

      Weaknesses 1. I had difficulty following the ANOVA results for Figure 1. I assume ANOVA was performed with factors of session and block. However, a single F statistic is reported. I do not know what this is referring to. It would be more appropriate to either perform repeated measures ANOVA with session and block as factors for each dependent variable or even better, multiple analyses of variance for the three dependent measures concurrently. Then report the univariate ANOVA results for each dependent measure. The graphs in Figure 1 (C-E) suggest a main effect of block, but as reported, I cannot tell if this is the case. Further, why was sex not included as an ANOVA factor?

      For the sake of transparency, we had included plots showing sessions split by each block, whereas the statistics related to the right-side bar plots, in which data are collapsed across risk (which was done to minimize effects from ‘missing’ data). We appreciate that this may have caused confusion. In the revised manuscript we specify the exact figure for each statistical result, have added a better description in the methods, and have updated the statistics (Table 1) with the ANOVA and post-hoc results.

      Previously we had used a mixed effects model because one subject did not complete any risk trials in session 3, but in the revised manuscript we decided to remove that subject’s sessions to permit RM ANOVA. As requested by the reviewer, we performed a multivariate analysis on risk and no risk trials. Because of the repeated measures design, we opted to use the multRM function of the MANOVA.RM package developed by Friedrich et al. (2019), which permits multivariate analysis on repeated measures data with minimal assumptions and bootstrapped p-values for small sample sizes. We found significant interactions for session (or treatment) and risk (see tables below). This justifies the two-way univariate ANOVA which is now reported in the rest of the manuscript. Sex differences were not included in the ANOVA because the study was not intended to assess sex differences but, rather, was designed according to NIH requirements for inclusion of males and females.

      Note: The MATS test was used, following the authors' recommendation in Friedrich et al. (2019) for cases in which the covariance matrix is singular, which was reported. To replicate, use a random seed of 8675309.

      Package link: https://rdrr.io/github/smn74/MANOVA.RM/man/multRM.html

      Publication: Friedrich, S., Konietschke, F., & Pauly, M. (2019). Resampling-based analysis of multivariate data and repeated measures designs with the R package MANOVA.RM. R J., 11(2), 380.

      1. The authors describe session 1 as characterized by 'overgeneralization' due to increased reward latencies. I do not follow this logic. Generalization typically refers to a situation in which a response to one action or cue extends to a second, similar action or cue. In the authors' design, there is only one cue and one action. I do not see how generalization is relevant here.

      This wording has been changed to “non-specific” in the results and discussion.

      1. The authors consistently report dmPFC and VTA 'neural activity'. The authors did not record neural activity. The authors recorded changes in fluorescence due to calcium influx into neurons. Even if these changes have similar properties to neural activity measured with single-unit recording, the authors did not record neural activity in this manuscript.

      We do not imply that we recorded unit activity in these studies and state in the introduction that fiber photometry is an indirect measure of neural activity. We have also reworded much of the text in the manuscript to use “calcium activity.”

      1. Comparing the patterns in Figures 2 and 3, it appears that dmPFC change in fluorescence was similar in non-shocked and shock trials up until shock delivery. However, the VTA patterns differ. No cue differences were observed for sessions 1-3 on shock trials (Figure 3A), yet differences were observed on non-shocked trials (Figure 2F). Further, changes in fluorescence between sessions 1 and 2/3 appeared to emerge just as foot shock would have been delivered. A split should be evident in Figure 3B - but it is not. Were these differences caused by sampling issues due to foot shock trials being rarer?

      We agree, although some of this could be because footshock trials were collapsed across blocks 2 and 3 (as no differences in shock response were observed between blocks). Nevertheless, we have added a caveat about cue responses to the results (see page 11-bottom and 15-top). Regarding the lack of a split in Figure 3A, this difference may be due to shock onset time. The permutation tests indicate that the differences in action activity in Figure 2B emerge about 0.5–0.8 seconds after action, which is when the shock begins. So it is not surprising that the results in 2F do not match well with 3A, given the rapid and robust response to the footshock.
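      For readers unfamiliar with the permutation tests mentioned above, the following is a minimal sketch of a two-sample permutation test on a difference in means, written in Python. It is a generic illustration only: the function name, the default number of permutations, and the use of one scalar per trial (for example, mean fluorescence in a fixed window) are assumptions, not the authors' pipeline, which compares traces across time.

```python
import numpy as np

def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])

    count = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by permuting the pooled values.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:a.size].mean() - shuffled[a.size:].mean())
        if diff >= observed:
            count += 1
    # Add-one correction so the p-value is never exactly zero.
    return (count + 1) / (n_permutations + 1)
```

      Applied independently at each time bin, a test of this kind yields the sort of "differences emerge about 0.5–0.8 seconds after action" statement made in the response, though a real analysis would also need to account for multiple comparisons across bins.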

      1. Similar to Figure 1, I could not follow the ANOVA results for the effects of diazepam treatment on trials completed, action latency and reward latency (Figure 4). Related, from what session do the bar plot data in Figure 4B come from? Is it the average of the 6% and 10% blocks? I cannot tell.

      Please see our response to comment 1 for the analysis relevant to this comment. Yes, the bar plot for the risk blocks shows the average of the 6% and 10% blocks. A better explanation of what the bar plot data represent has been added to the methods.

      1. For the diazepam experiment, did all rats receive saline and diazepam injections in separate sessions? If so, were these sessions counterbalanced? And further, how did the session numbers relate to sessions 1-3 of the first study? All of these details are extremely relevant to interpreting the results and comparing them to the first study, as session # appeared to be an important factor. For example - the decrease in dmPFC fluorescence to reward during the No-Risk block appeared to better match the fluorescent pattern seen in sessions 1 and 2 of the first experiment. In which case, the saline vs. diazepam difference was due to saline rats not showing the expected pattern of fluorescence.

      Subjects received saline and diazepam in separate sessions. Furthermore, diazepam was not tested until animals had at least 3 sessions of training (range of sessions 4-8). Clarification has been added to the methods.

      The new AUC analysis for Reviewer 1 allows for better assessment of the potential differences between earlier sessions and saline (see Figure 7 – supplements 2 and 3). We also found the effect with reward and diazepam perplexing and somewhat modest. However, even after comparing only Saline and Session 3 PFC AUC data, we found no significant effect of session or session*risk interaction for action or reward (F values < 1.3, p values > .27).

      1. The authors seem convinced that fiber photometry is a surrogate for neural activity. Although significant correlation coefficients are found during action and reward, these values hover around 0.6 for the dmPFC and 0.75 for the VTA. Further, no correlations are observed for cue periods. A strength of the calcium imaging approach is that it permits the monitoring of specific neural populations. This would have been very valuable for the VTA, in which dopamine and GABA neurons must show very different patterns of activity. Opting for fiber photometry and then using a pan-neuronal approach fails to leverage the strength of the approach.

      The parent paper (Park & Moghaddam, 2017) used unit recording in this task (including reporting data from dopamine and non-dopamine VTA units). We assure the reviewer that we do not claim that fiber photometry is a perfect surrogate for direct recording of neural activity. However, a key question we wanted to answer in this study was whether the response of PFC and VTA to the footshock changes during task acquisition (please see last paragraph of introduction), hence the choice to use fiber photometry. We note in the results and discussion that this approach is not optimal for detecting cue or other rapid responses (see pages 15 and 23).

      Reviewer 3 (Public Review):

      Probably the biggest overall issue is that it is unclear what is being learned specifically. There is no probe test at the end to dissociate the direct impact of shock from its learned impact. And the blocks are not signaled in some other way. And though there seems to be some evidence that the shock effects get more pronounced with a session, it is not clear if the rats are really learning to associate specific shock risks with the particular trials. Indeed, with so few sessions and so few actual shocks, this seems really unlikely, especially since, without an independent cue, the shock and its frequency are the cue for the block switch. It seems especially unlikely that there is a strong dichotomy in the rats' model of the environment between 6% and 10% blocks. This may be quite relevant for understanding foraging under risk. But I think it means some of the language in the paper about contingencies and the like should be avoided.

      While the parent paper (Park & Moghaddam, 2017) delved more deeply into this question, we agree that what exactly is learned may be difficult to ascertain. To address this (please also see response to reviewer #1’s first comment), we have toned down our use of “contingency learning” throughout the manuscript and use the word contingency in relation to the underlying reinforcement/punishment schedules.

      The second issue I had was that I had some trouble lining up the claims in the results with what appeared to be meaningful differences in the figures. Just looking at it, it seems to me that VTA shows higher activities at higher shocks, particularly at the time of reward but also when comparing safe vs risky anyway for the cue and action periods. DmPFC shows a similar pattern in the reward period. […] But these results are not described at all like this. The focus is on the action period only and on ramping? I don't really see ramping. It says "Anxiogenic contingencies also did not influence the phasic response to reward...". But Fig. 3 seems to show clearly different reward responses? The characterization of the change is particularly important since to me it looks like the diazepam essentially normalizes these features of the response. This makes sense to me […].

      We initially believed that many of the differences in reward (with the exception of Session 2 in the PFC) were from carryover of differences in the peri-action period. However, upon quantifying these responses again using AUC change scores to adjust for pre-event differences in the signal, we observed small reward-related increases (data are in Figure 7 – supplements 2/3) and have updated the results and the discussion.

      Although some lessening of reward response may be apparent across the diazepam session in the VTA (Figure 7 – supplement 2/3G), we do not have statistical support for this as no significant differences were observed in permutation comparisons to saline and only session 3 deviated from the first session for the reward period in the AUC analyses.
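      The "AUC change score" used in this response can be pictured as the area under the signal after an event minus the area over an equally long window before it. The Python sketch below illustrates that idea; the function names, the window arguments, and the trapezoidal integration are assumptions made for illustration and are not drawn from the authors' analysis code.

```python
import numpy as np

def trapezoid_area(t, y):
    """Area under y(t) by the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(t)))

def auc_change_score(t, signal, pre_window, post_window):
    """Post-event AUC minus pre-event AUC for one event-aligned trace.

    t           : time stamps in seconds, with the event at t = 0.
    signal      : fluorescence values (e.g. z-scored dF/F) at those times.
    pre_window  : (start, end) of the baseline window, e.g. (-1.0, 0.0).
    post_window : (start, end) of the response window, e.g. (0.0, 1.0).
    Windows are assumed to have equal duration so the areas are comparable.
    """
    t = np.asarray(t, dtype=float)
    signal = np.asarray(signal, dtype=float)
    pre = (t >= pre_window[0]) & (t <= pre_window[1])
    post = (t >= post_window[0]) & (t <= post_window[1])
    return trapezoid_area(t[post], signal[post]) - trapezoid_area(t[pre], signal[pre])
```

      Because the baseline area is subtracted, a signal that is already elevated before the event (for example, carryover from the peri-action period) contributes nothing to the score; only a change relative to the pre-event level does.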

    1. Historical Hypermedia: An Alternative History of the Semantic Web and Web 2.0 and Implications for e-Research. .mp3. Berkeley School of Information Regents’ Lecture. UC Berkeley School of Information, 2010. https://archive.org/details/podcast_uc-berkeley-school-informat_historical-hypermedia-an-alte_1000088371512. archive.org.

      https://www.ischool.berkeley.edu/events/2010/historical-hypermedia-alternative-history-semantic-web-and-web-20-and-implications-e.

      https://www.ischool.berkeley.edu/sites/default/files/audio/2010-10-20-vandenheuvel_0.mp3

      headshot of Charles van den Heuvel

      Interface as Thing - book on Paul Otlet (not released, though he said he was working on it)

      • W. Boyd Rayward 1994 expert on Otlet
      • Otlet on annotation, visualization, of text
      • TBL married internet and hypertext (ideas have sex)
      • V. Bush As We May Think - crosslinks between microfilms, not in a computer context
      • Ted Nelson 1965, hypermedia

      t=540

      • Michael Buckland book about machine developed by Emanuel Goldberg antecedent to memex
      • Emanuel Goldberg and His Knowledge Machine: Information, Invention, and Political Forces (New Directions in Information Management) by Michael Buckland (Libraries Unlimited, March 31, 2006)
      • Otlet and Goldberg were precursors as well

      four figures in his research:

      • Patrick Geddes - biologist, architect, diagrams of knowledge, metaphorical use of architecture; classification
      • Paul Otlet - Brussels born
      • Wilhelm Ostwald - Nobel Prize in Chemistry
      • Otto Neurath - philosopher, designer of Isotype

      Paul Otlet

      Otlet was interested in both the physical as well as the intangible aspects of the Mundaneum including as an idea, an institution, method, body of work, building, and as a network. (#t=1020)

      Early iPhone diagram?!?

      (roughly) armchair to do the things in the web of life (Nelson quote) (get full quote and source for use) (circa 19:30)

      compares Otlet to TBL


      Michael Buckland 1991 <s>internet of things</s> coinage - did I hear this correctly? https://en.wikipedia.org/wiki/Internet_of_things lists different coinages

      Turns out it was "information as thing". See: https://hypothes.is/a/kXIjaBaOEe2MEi8Fav6QsA


      Suzanne Briet and Otlet
      "everything can be in a document"
      importance of evidence


      The idea of evidence implies a passiveness. For evidence to be useful then, one has to actively do something with it, use it for comparison or analysis with other facts, knowledge, or evidence for it to become useful.


      transformation of sound into writing
      movement of pieces at will to create a new combination of facts - combinatorial creativity idea here. (circa 27:30 and again at 29:00)
      not just efficiency but improvement and purification of humanity

      put things on system cards and put them into new orders
      breaking things down into smaller pieces, whether books or index cards....

      Otlet doesn't use the word interfaces, but makes these with language and annotations that existed at the time. (32:00)

      Otlet created diagrams and images to expand his ideas

      Otlet used octagonal index cards to create extra edges to connect them together by topic. This created more complex trees of knowledge beyond the four sides of standard index cards. (diagram referenced, but not contained in the lecture)

      Otlet is interested in the "materialization of knowledge": how to transfer an idea into an object. (How does this relate to mnemonic devices for daily use? How does it relate to broader material culture?)

      Otlet inspired by work of Herbert Spencer

      space and time are forms of thought, I hold myself that they are forms of things. (get full quote and source) from Spencer; influence of Plato's forms here?

      Otlet visualization of information (38:20)

      S. R. Ranganathan may have had these ideas about visualization too

      atomization of knowledge; atomist approach 19th century examples: S. R. Ranganathan, Wilson, Otlet, Richardson (atomic notes are NOT new either...) (39:40)

      Otlet creates interfaces to the world:

      • time with cyclic representation
      • space
      • moving cube along time and space axes as well as levels of detail
      • comparison to Ted Nelson and zoomable screens, even though Ted Nelson didn't have screens but simulated them in paper
      • globes

      Katy Börner - semantic web; claims that reporting a scholarly result won't be a paper, but a nugget of information that links to other portions of the network of knowledge.
      (so not just one's own system, but the global commons system)

      Mention of Open Annotation (Consortium) Collaboration:

      • Jane Hunter, University of Queensland, Brisbane, Australia
      • Tim Cole, University of Illinois at Urbana-Champaign
      • Herbert Van de Sompel, Los Alamos National Laboratory

      annotations of various media

      see:

      • https://www.researchgate.net/publication/311366469_The_Open_Annotation_Collaboration_A_Data_Model_to_Support_Sharing_and_Interoperability_of_Scholarly_Annotations
      • http://www.openannotation.org/spec/core/20130205/index.html
      • http://www.openannotation.org/PhaseIII_Team.html

      trust must be put into the system for it to work

      coloration of the provenance of links goes back to Otlet (~52:00)

      Creativity is the friction of the attention space at the moments when the structural blocks are grinding against one another the hardest. —Randall Collins (1998) The Sociology of Philosophies. Cambridge, MA: Harvard University Press (p.76)

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      It is the common view of all three reviewers that we have not provided adequate in vitro/biochemical evidence to support the idea that SATB1 protein undergoes liquid-liquid phase separation. We agree with the reviewers that our manuscript lacks biochemical evidence to support such a notion. Nevertheless, we find it quite interesting, and we would like to suggest for the first time in the field of chromatin organization and function, based upon the action of SATB1, that this protein exists in at least two polypeptide isoforms (764 and 795 amino acids long) which display different phase separation propensities and therefore confer different actions in regulating the (patho)physiological properties of a murine T cell.

      Every research group that works on SATB1 has so far considered only a single protein isoform, that is, the shorter isoform of 764 amino acids, and no tools, such as isoform-specific antibodies, have been developed to discriminate between the two isoforms and thus to assign unique functions to each isoform. We do understand that such a report, suggesting the presence of two protein isoforms with potentially quite diverse functions, would question (not necessarily by the authors of this manuscript, since no such comment is included in our manuscript) the conclusions drawn in the literature assigning all biochemical properties to a single, short isoform of SATB1. Moreover, all the genetically modified mice that have been analyzed so far (including those of our group) had both Satb1 isoforms deleted. Our future research approaches should, from now on, consider unraveling the isoform-specific functions of SATB1 and their involvement in physiology and disease. This could also prove useful in explaining the quite diverse, both positive and negative, effects of SATB1 in transcription regulation. Another major objection of the reviewers was that we should provide cumulative supporting evidence for the existence of the long SATB1 isoform, or at least evaluate the specificity of our custom-made antibody.

      Taking into consideration the aforementioned constructive criticism of the three reviewers, we would like to perform additional experiments to support our claims in the manuscript (most of the suggested experiments have already been performed). These experiments are described below as a point-by-point reply to each point raised by the reviewers.

      In line with the aforementioned rationale, we propose changing the title of our manuscript to “Two SATB1 isoforms display different phase separation propensity”, if our manuscript is considered for publication.

      2. Description of the planned revisions

      **Reviewer #1**:

      4) Lack of in vitro reconstitution experiments with purified long and short SATB1

      **PLANNED EXPERIMENT #1**

      We do realize this shortcoming of our work. We have to note that purifying recombinant SATB1 protein is quite a challenging task, yet we have (1) cloned the Satb1 cDNAs for both the long and short isoforms and (2) successfully expressed both proteins in high quantity and quality, and we are willing to perform these experiments if our work is considered for publication.

      This proposed experiment has also been requested by Reviewers #2 and #3.

      **Reviewer #2**:

      1. Moreover, an important and direct experiment would be to clone the long isoform in a suitable vector and overexpress in the cell line (as done for the canonical isoform in Supp Fig 1a). This would unequivocally show the efficacy of the antibody and thus the following usage of the same for various assays.

      **PLANNED EXPERIMENT #2**

      This is a great suggestion. We have cloned the long and short Satb1 cDNAs in the pEGFP-C1 vector. We will transfect these plasmids into NIH 3T3 fibroblasts and perform Western blot analysis, utilizing the antibody raised against the extra 31-amino-acid peptide present only in the long SATB1 isoform, for the following samples: 1. NIH 3T3 whole cell protein extracts, 2. protein extracts from NIH 3T3 fibroblasts transiently transfected with the pEGFP-C1 plasmid, 3. protein extracts from NIH 3T3 fibroblasts transiently transfected with the pEGFP-long_Satb1_ plasmid and 4. protein extracts from NIH 3T3 fibroblasts transiently transfected with the pEGFP-short_Satb1_ plasmid.

      This experiment will constitute further proof of the specificity of the antibody raised against the extra 31-amino-acid peptide present only in the long SATB1 isoform.

      **Minor comments:**

      1. On pg 6, related to Figure 1, the authors mention 'It should also be noted that when investigating the SATB1 protein levels, we have to bear in mind that the antibodies targeting the N-terminus of SATB1 protein cannot discriminate between the short and long isoforms'. The authors reason that their sizes are too close. It is indeed possible, and widely studied in biochemistry to assess various factors on protein migration (such as PTMs). The authors should validate this aspect (as it is important as per their premise) and perform separation based on charge as well and also use a commercial antibody to validate the same.

      (Experiments already performed)

      We have adapted the text so that it does not imply that the two isoforms cannot be separated by size. This part in lines 102-107 then reads: “It should also be noted that when investigating the SATB1 protein levels, we have to bear in mind that the antibodies targeting the N-terminus of SATB1 protein cannot discriminate between the short and long isoforms, thus we can only compare the amount of the long SATB1 isoform to the total SATB1 protein levels in vivo conditions. To overcome this limitation and to specifically validate the presence of the long SATB1 protein isoform in primary murine T cells, we designed a serial immunodepletion-based experiment (Fig. 1e, Supplementary Fig. 1a).”

      In the revised version of the manuscript, we now provide several additional lines of evidence supporting both the presence of the long isoform and the specificity of the long isoform-specific antibody. As evident from the text cited above, the revised Fig. 1e,f and revised Supplementary Fig. 1a,b present two immunodepletion experiments, which alone should address the Reviewer’s concerns. We also added Supplementary Fig. 1c, demonstrating that the long isoform-specific antibody does not detect any protein in cells with conditionally depleted SATB1 (Satb1_fl/fl_Cd4-Cre+), which supports its specificity. The custom-made and publicly available antibodies targeting all SATB1 isoforms were also verified in Supplementary Fig. 1d. Finally, the long-isoform and all-isoform antibodies display similar localization in the nucleus (Supplementary Fig. 1e; their co-localization based on super-resolution microscopy is also quantified in Supplementary Fig. 5a).

      In our accompanying revised manuscript Zelenka et al., 2022 (https://doi.org/10.1101/2021.07.09.451769), we will provide yet another piece of evidence, consisting of bacterially expressed short and long SATB1 protein isoforms detected by western blot using either the long isoform-specific or the non-selective all SATB1 isoform antibodies.

      **PLANNED EXPERIMENT #3**

      Although we think that the revised version of the manuscript provides sufficient evidence for the existence of the long isoform in primary murine thymocytes, we would like to try the following approach, as suggested by this Reviewer.

      The pI values of the two SATB1 isoforms are quite similar: 6.09 for the short isoform and 6.18 for the long isoform. We will perform 2D PAGE coupled to Western blotting, utilizing the antibodies detecting the long and all SATB1 isoforms. Given that both isoforms are post-translationally modified to varying degrees, it will be extremely difficult to discriminate between the unmodified and the post-translationally modified forms of the long and short proteins, especially in the absence of an antibody specific for the short isoform only.

      **Reviewer #3**

      1. Hexanediol is another assay frequently used in phase-separation studies. However, hexanediol has many deleterious effects on the cell, even at a fraction of the concentration normally used in phase-separation studies. Authors should show controls of cell viability, control proteins that do not phase-separate, etc. See https://www.jbc.org/article/S0021-9258(21)00027-2/fulltext.

      Secondly, hexanediol treatment should cause phase-separated protein aggregates to disperse. It is difficult to determine from the images whether or not the aggregates actually disperse or there is just less protein. In any case, small aggregates remain even after treatment, and this appears different from most other hexanediol experiments reported in the literature where the signals become more dispersed and uniform. This is likely because the samples are fixed.

      One of the main features of using hexanediol in phase-separation is to show that upon washout, LLPS aggregates can reform. Because the cells are fixed, the critical aspect of this assay is not performed. A washout and LLPS recovery would control for cell viability issues described above and would provide the opportunity to show that total SATB1 protein levels did not change, but its distribution did, which is the essence of this assay in the context of LLPS. This review from the Tjian group is very informative and may be a good resource:

      http://genesdev.cshlp.org/content/33/23-24/1619

      In line with our reply to point #1 of this Reviewer (page 26 of this document), we should again emphasize that we utilized the hexanediol treatment in primary murine developing T cells, as this is the only way to investigate the properties of SATB1 speckles under physiological conditions. This also explains why some small insoluble structures remain after the hexanediol treatment. Note that under physiological conditions, several protein variants contribute (for example, forms carrying different PTMs), some of which will tend to form more stable structures while others could undergo LLPS. It is not clear how the washout experiment could be applied to primary cells, which require fixation, as the heterogeneity and large variation among cells would make such data analysis highly unreliable.

      **PLANNED EXPERIMENT #1**

      As we answered to point #4 of Reviewer 1 (page 2), we propose the following experiment. Although the purification of recombinant SATB1 protein is quite a challenging task, we have 1. cloned both Satb1 cDNAs for the long and short isoforms and 2. successfully expressed both proteins in great quantity and quality, and we are willing to perform in vitro reconstitution experiments if our work is considered for publication.

      1. The major difference between the long and short isoform of SATB1 is the 31aa segment within the IDR. However the authors find that neither the long or short isoform SATB1 forms LLPS aggregates, and the IDR alone forms aggregates in the cytoplasm (Fig5) but they do not respond to Cry2 light activation. When forced to localize to the nucleus, it does not form aggregates as well (Fig6). The short isoform also did not form any aggregates. These results seem to argue against any isoform specific phase-separation. This experiment seems critical for the story, yet it does not support their overall conclusions. The authors might consider using a different cell line or perhaps do an in vitro assay using purified protein.

      I am not certain what to make of the cytoplasmic aggregation, which appears to not form upon localization to the nucleus. Because of this, it is difficult to place weight on the significance of the S635A mutation and the role that a phosphorylation of SATB1 contributes to phase-separation, let alone function There are many additional points of concern, but the ones listed above are perhaps the most significant in terms of the overall conclusions of the paper.

      In Fig. 5c we show that the full-length long SATB1 isoform often aggregates, unlike the short isoform. These data are accompanied by the results for the IDR region, where the situation is even more obvious (Fig. 5f,g). However, in the latter case, we have to bear in mind the absence of the multivalent N-terminal part of the protein, which seems to be essential for the overall phase behavior of the protein, as indicated in Fig. 4b,c.

      **PLANNED EXPERIMENT #1**

      To further support LLPS of SATB1, we are considering performing the following in vitro experiment, as we answered to point #4 of Reviewer 1 (page 2). Although the purification of recombinant SATB1 protein is quite a challenging task, we have 1. cloned both Satb1 cDNAs for the long and short isoforms and 2. successfully expressed both proteins in great quantity and quality, and we are willing to perform in vitro reconstitution experiments if our work is considered for publication.

      1. Description of the revisions that have already been incorporated in the transferred manuscript

      **Reviewer #1 (Evidence, reproducibility and clarity)**:

      This paper looks at an important nuclear matrix protein SATB1, which is a well known global chromatin organizer and help chromatin loop attach to the nuclear matrix. The paper starts with identification of novel short and long form of SATB1. Both the isoform consist of a prion like low complexity domains, but the long isoform additionally contain an extra EPF domain next the Prion like low complexity domain. The paper reports that in murine cells the long isoform is 3-4 fold more abundant than the short isoform. By using STED microscopy they show SATB1 foci lie next to transcription sites in the nucleus. They conclude by looking at the spherical shape of the SATB1 foci and the susceptibility of SATB1 staining after 1,6 hexanediol treatment that SATB1 forms the small foci in the nucleus due to LLPS. The authors also use RAMAN spectroscopy to conclude a change in nuclear chemical space in absence of SATB1 but without much explanation about which chemical bond or nuclear sub structure change correspond to the change in principal component analysis from Raman spectroscopy. The authors use the light inducible aggregation cry2 tag with the PrD domain of SATB1 and compare it with the Cry2-FUS-LC domain to conclude that the SATB1 LC domain can undergo LLPS. The authors hint at involvement of RNA and also DNA in the LLPS of the SATB1 but without going into any detail. Reviewer: The paper reports that in murine cells the long isoform is 3-4 fold more abundant than the short isoform.

      Actually, on page 5 (lines 94-96) of the manuscript we write: “We confirmed that in murine thymocytes the steady state mRNA levels of the short Satb1 transcripts were about 3-5 fold more abundant compared to the steady state mRNA levels of the long Satb1 transcripts (Fig. 1d).” Although the steady-state mRNA levels of the long isoform are lower than those of the shorter isoforms, the long isoform protein levels are almost comparable to those of the short isoform, as deduced from immunofluorescence experiments. Moreover, using our two immunodepletion experiments, we quantified the difference and estimated the long isoform to be 1.5× to 2.62× less abundant than the short isoform (Fig. 1f and Supplementary Fig. 1b; compare lanes 2 & 3 in the lower panel).

      • Regarding the RAMAN spectroscopy experiments, please see Minor Comment #1 of this Reviewer (page 10).

      The key conclusions of the paper are- A) SATB1 undergoes LLPS. But this conclusion is drawn after correlative experiments as detailed below-

      It is true that, for primary murine T cells, this conclusion rests on correlative experiments only, because these cells do not allow targeted manipulation. However, the use of cell lines in vitro allowed us to validate these findings with optogenetic approaches and additional experiments.

      1) observation of spherical punctae by STED-which could also seem spherical due to their small size. The resolution limit achieved by the STED microscopy used in this paper is not determined or mentioned clearly.

      In the revised version of the manuscript, we have specified the resolution of our systems, for STED in Lines 745-746: “This system enables super-resolution imaging with 35 nm lateral and 130 nm axial resolution.” and for SIM in Lines 759-761: “Images were acquired over the majority of the cell volume in z-dimension with 15 raw images per plane (five phases, three angles), providing ~120-135 nm lateral and ~340-350 nm axial resolution for 488/568 nm lasers, respectively.” The size of the observed speckles is thus above the resolution limit with sizes ranging between 40-80 nm.

      The resolution of our systems is routinely verified by the following methods: The resolution of our OMX (SIM-3D) system was tested using an ARGO-SIM slide containing a pattern of 36 µm long lines with gradually increasing spacing ranging from (left to right) 0 to 390 nm, with a step of 30 nm (Fig. 1 below). Our SIM system was able to clearly resolve two lines separated by 120 nm.
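
      The resolution argument above can be restated as a small check. The following Python sketch uses only the figures quoted in this reply (35 nm lateral STED resolution, 40-80 nm speckle sizes, ARGO-SIM spacings from 0 to 390 nm in 30 nm steps, and 120 nm resolved by SIM). The function and variable names are illustrative, and the "at least as large as the limit" criterion is a simplifying assumption rather than the analysis used for the manuscript.

```python
# Figures quoted in the reply above; all names here are illustrative.
STED_LATERAL_NM = 35
SIM_RESOLVED_NM = 120
SPECKLE_SIZES_NM = (40, 80)


def is_resolvable(size_nm, resolution_nm):
    """Treat a feature as resolvable if it is at least as large as the limit."""
    return size_nm >= resolution_nm


# ARGO-SIM pattern: line spacings from 0 to 390 nm in 30 nm steps.
argo_spacings_nm = list(range(0, 391, 30))

# Smallest pattern spacing separated by a system with the stated SIM resolution.
smallest_resolved = min(
    s for s in argo_spacings_nm if is_resolvable(s, SIM_RESOLVED_NM)
)

speckles_above_sted_limit = all(
    is_resolvable(s, STED_LATERAL_NM) for s in SPECKLE_SIZES_NM
)
speckles_above_sim_limit = all(
    is_resolvable(s, SIM_RESOLVED_NM) for s in SPECKLE_SIZES_NM
)

print(len(argo_spacings_nm), smallest_resolved)
print(speckles_above_sted_limit, speckles_above_sim_limit)
```

      With these numbers, the 40-80 nm speckle sizes lie above the stated STED lateral resolution but below the stated SIM lateral resolution.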

      2) No live cell FRAP experiment with fluorescent SATB1 long or short isoform to show that these foci are liquid like

      We did perform FRAP experiments for the SATB1 N-terminus optogenetic construct as demonstrated in Fig. 4f. We did not perform FRAP in the primary murine T cells as this is not technically feasible without creating a new mouse line with fluorescently labeled protein. In the revised version of the manuscript, we additionally performed FRAP experiments for the full length short and long isoform of SATB1 labeled with EGFP and transfected into the NIH-3T3 cell line (Supplementary Figure 6f).

      5) LLPS is strongly coupled to the cellular concentration of the proteins. Authors should quantify the cellular concentration of the long and short isoform in the cells.

      We did consider protein concentration in our analyses of optogenetic constructs in Fig. 4b,d,e and Supplementary Fig. 6a,b,c. Quantifying the physiological cellular concentration of the short and long SATB1 protein isoforms in primary T cells is not feasible, because the two available antibodies cannot fully discriminate between the isoforms and Satb1 isoform-specific knockout mice are not available.

      However, an approximation of the cellular concentration can be obtained from our immunodepletion experiments. In addition to the original immunodepletion experiment, which we now present in Supplementary Fig. 1a,b, the revised version of the manuscript includes a repeat of the experiment in Fig. 1e,f. Comparison of the two bands for the long and short SATB1 isoforms in the lower panel of the western blot figures suggests that the long SATB1 isoform protein levels are 1.5× to 2.62× less abundant than the short isoform, according to the original and new immunodepletion experiment, respectively. This is now also included in the main text in Lines 110-116: “This experiment can also be used for approximation of the cellular protein levels of SATB1 isoforms in primary murine thymocytes. Comparison of the two bands for long (lane 2) and short SATB1 (lane 3) isoform in the lower panel of Fig. 1f and Supplementary Fig. 1b, suggests that the long SATB1 isoform protein levels may be about 1.5× to 2.62× less abundant than the short isoform, according to the two replicates of our immunodepletion experiment, respectively.”
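
      As a worked illustration of what these fold differences imply, the sketch below converts "the long isoform is r-fold less abundant than the short isoform" into the fraction of total SATB1 contributed by the long isoform. It assumes, for illustration only, that total SATB1 is the sum of just these two isoforms. The function name is hypothetical.

```python
def long_isoform_fraction(fold_less_abundant):
    """Fraction of total SATB1 that is long isoform, if short = fold * long
    and no other isoform contributes (an assumption made for this sketch)."""
    if fold_less_abundant <= 0:
        raise ValueError("fold difference must be positive")
    return 1.0 / (1.0 + fold_less_abundant)


# Fold differences reported for the two immunodepletion replicates.
for fold in (1.5, 2.62):
    fraction = long_isoform_fraction(fold)
    print(f"{fold}x less abundant -> long isoform = {fraction:.1%} of total")
```

      Under this two-isoform assumption, the reported range corresponds to the long isoform making up roughly 28% to 40% of total SATB1 protein.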

      Major conclusion B)- SATB1 regulates transcription and splicing.

      This was also shown previously and in this paper they show the close proximity of the transcription site and SATB1 foci by microscopy. Hexanediol treatment which lead to loss of colocalization between FU foci and SATB1 is also taken as an evidence in regulation of transcription is not right as the transcription foci itself can be dissolved using 1,6 Hexanediol. Although the rate of transcription is not measured quantitatively.

      As mentioned in comment #3 (page 29) of this Reviewer, unfortunately there is no better tool to investigate these questions in primary cells than microscopy approaches in conjunction with hexanediol treatment. However, we should also note that there is an accompanying manuscript from our group that is currently under revision in another journal (preprint available: Zelenka et al., 2021; https://doi.org/10.1101/2021.07.09.451769). In the preprint, we showed that: 1. the long SATB1 isoform binding sites have higher chromatin accessibility than expected by chance (Fig. 3b), 2. there is a drop in chromatin accessibility at SATB1 binding sites in Satb1 cKO mice (Fig. 3c) and 3. this drop in chromatin accessibility is especially evident at the transcription start sites of genes (Supplementary Fig. 1i).

      We believe that, together, these data suggest a direct involvement of SATB1 in transcription regulation. Also note the vast transcriptional deregulation that occurs in Satb1 cKO T cells, affecting the expression of 1,641 genes (Fig. 2f, this revised manuscript). That is why we believe that the co-localization analysis using super-resolution microscopy, presented in Fig. 2c and quantified in Fig. 3g, provides useful additional support for our claims. Moreover, in the revised version of the manuscript we now present a positive correlation between SATB1 binding and deregulation of splicing (Supplementary Fig. 4d), which also supports its direct involvement in the regulation of transcriptional and co-transcriptional processes.

      In the revised version of the manuscript we have made this clear in Lines 182-194: “Satb1 cKO animals display severely impaired T cell development associated with largely deregulated transcriptional programs as previously documented [19,37,38]. In our accompanying manuscript [19], we have demonstrated that long SATB1 isoform-specific binding sites (GSE173446 [19]) were associated with increased chromatin accessibility compared to randomly shuffled binding sites (i.e. what expected by chance), with a visible drop in chromatin accessibility in Satb1 cKO. Moreover, the drop in chromatin accessibility was especially evident at the transcription start site of genes, suggesting that the long SATB1 isoform is directly involved in transcriptional regulation. Consistent with these findings and with SATB1’s nuclear localization at sites of active transcription, we identified a vast transcriptional deregulation in Satb1 cKO with 1,641 (922 down-regulated, 719 up-regulated) differentially expressed genes (Fig. 2f). Specific examples of transcriptionally deregulated genes underlying SATB1-dependent regulation are provided in our accompanying manuscript [19]. Additionally, there were 2,014 genes with altered splicing efficiency (Supplementary Fig. 4d-e; Supplementary File 3-4). We should also note that the extent of splicing deregulation was directly correlated with long SATB1 isoform binding (Supplementary Fig. 4d).”

      Major conclusion C)-Post transcriptional modification is important for SATB1 function.

      This point is just barely touched upon in the last figure of the paper

      We would not call the identification of the novel phosphorylation site a main conclusion of our manuscript. However, it is already known that post-translational modifications of SATB1 are important for its function, as they can act as a molecular switch rendering SATB1 either an activator or a repressor (Kumar et al., 2006; https://doi.org/10.1016/j.molcel.2006.03.010).

      In the revised manuscript, we support the effect of serine phosphorylation on the DNA binding capacity of SATB1 by another experiment. We have performed DNA affinity purification experiments utilizing primary thymocyte nuclear extracts treated with phosphatase (Supplementary Fig. 7b). We found that SATB1’s capacity to bind DNA (RHS6 hypersensitive site of the TH2 LCR) is lost upon treatment with phosphatase (Supplementary Fig. 7c). These results are in line with the data presented in Supplementary Fig. 7d, indicating the lost ability of SATB1 to bind DNA upon mutating the discovered phosphorylation site S635. Given the importance of post-translational modifications of proteins for LLPS, we found it relevant to include this in our manuscript, even more so because we identified SATB1 aggregation upon mutation of this phosphorylation site (Fig. 6d).

      Overall I find that the major conclusion-point A and B, is based on very indirect experiments and needs much more convincing data and the role of SATB1 LLPS in cells should be demonstrated more rigorously. And conclusion C is barely described and needs a lot more cell biological and genetic evidence.

      One of the major assets of our work is that most of our data are based on the analysis of primary murine T cells and thus address the biological roles of the endogenous SATB1 protein under physiological conditions. We apologize that we did not make it clear to this Reviewer that our system has certain inherent limitations due to the use of primary cells.

      I do not recommend publishing the paper in current state. The story needs much more experiment to convincingly prove the major conclusions. Further, the MS needs more careful thinking and presentation to make it streamlined.

      We hope that in the revised version we have significantly improved the quality of our manuscript by implementing the suggested changes.

      Minor comments: One of the major flaw of the paper is the use too many techniques without proper explanation. E.g. use of STED and RAMAN microscopy need controls and explanation on what is being quantified. The use of Raman microscopy to quantify the nuclear environment of nucleus is not related to the chromatin organization or LLPS of SATB1 at all. And no information is provided at all which aspect of nuclear organization is being measured in Raman and what it means for the LLPS of SATB1.

      We do provide quite a thorough explanation of Raman spectroscopy and the underlying quantification in Lines 224-231: “we employed Raman spectroscopy, a non-invasive label-free approach, which is able to detect changes in chemical bonding. Raman spectroscopy was already used in many biological studies, such as to predict global transcriptomic profiles from living cells [42], and also in research of protein LLPS and aggregation [43–47]. Thus we reasoned that it may also be used to study phase separation in primary T cells. We measured Raman spectra in primary thymocytes derived from both WT and Satb1 cKO animals and compared them with spectra from cells upon 1,6-hexanediol treatment. Principal component analysis of the resulting Raman spectra clustered the treated and non-treated Satb1 cKO cells together, while the WT cells clustered separately (Fig. 3h).” We also do provide controls as the method was performed on both treated and untreated WT and Satb1 cKO cells.

      Regarding the RAMAN spectroscopy experiments we now provide more information on the changes of chemical bonds altered between wild type and Satb1 cKO thymocytes. Following principal component analysis, we have extracted the two main principal components that were used for the clustering of our data. The differences are presented in Supplementary Fig. 5d.
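
      To make explicit how principal components relate back to spectral features, here is a minimal sketch of PCA on Raman-like spectra using NumPy. The spectra are synthetic (two groups that differ by a single Gaussian band placed at a hypothetical 1000 cm^-1 position), the `pca` and `band` helpers are illustrative, and nothing here reproduces the data or the processing behind Supplementary Fig. 5d.

```python
import numpy as np


def pca(spectra, n_components=2):
    """PCA via SVD on mean-centered spectra (one spectrum per row).

    Returns per-spectrum scores, per-wavenumber loadings, and the fraction
    of variance explained by each retained component.
    """
    centered = spectra - spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


rng = np.random.default_rng(0)
wavenumbers = np.linspace(600, 1800, 121)  # hypothetical Raman shift axis, cm^-1


def band(center, width=15.0):
    """Gaussian band used to build the synthetic spectra."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


# Synthetic spectra: both groups share a band at 1450 cm^-1,
# only group A carries an extra band at 1000 cm^-1.
n_per_group = 20
noise_a = 0.02 * rng.standard_normal((n_per_group, wavenumbers.size))
noise_b = 0.02 * rng.standard_normal((n_per_group, wavenumbers.size))
group_a = band(1450) + 0.8 * band(1000) + noise_a
group_b = band(1450) + noise_b
spectra = np.vstack([group_a, group_b])

scores, loadings, explained = pca(spectra)

# The wavenumber with the largest absolute PC1 loading points to the
# spectral region that drives the separation between the groups.
top_wavenumber = wavenumbers[np.argmax(np.abs(loadings[0]))]
print(f"PC1 explains {explained[0]:.1%}; strongest loading near {top_wavenumber:.0f} cm^-1")
```

      In this kind of analysis the scores position each cell in the PCA plot, while the loadings indicate which Raman shifts, and hence which classes of chemical bonds, contribute most to each component.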

      We do realize that RAMAN spectroscopy, although a quite novel approach utilized to study LLPS, has not been used to study LLPS in live cells. If deemed appropriate, we are willing to omit these results from this manuscript.

      Similarly for Hexanediol treatment, duration of treatment is missing. Hexanediol can also dissolve the liquid like transcription foci. And hence a decrease in correlation between SATB1 foci and FU foci cannot be taken as a measure of SATB1 foci connection to transcription alone

      The duration of hexanediol treatment was 5 minutes as presented in Line 724 and in the revised version of the manuscript also in Lines 1206-1207. We should also note that additionally, we performed experiments with different hexanediol concentrations and timing varying from 1 minute to 10 minutes with results consistent with the data presented.

      It is not very clear how many times the STED or Raman microscopy is done on how many samples and biological replicates. Similarly for RNA sequencing number of samples and description of controls are missing. Also if the sequencing data is made publicly available is not clear.

      Data availability is clearly stated in Lines 506-509: “RNA-seq experiments and SATB1 binding sites are deposited in Gene Expression Omnibus database under accession number GSE173470 and GSE173446, respectively. The other datasets generated and/or analyzed during the current study are available upon request.”

      The Reviewer’s token is “wjwtmeeeppovzqx”.

      RNA sequencing was performed in a biological triplicate for each genotype as stated in the GEO repository and now also in Line 566 of the revised manuscript.

      In Lines 180-181, we also state that it was performed on Satb1 cKO animals and WT mice as a control: “we performed stranded-total-RNA-seq experiments in wild type (WT) and Satb1fl/flCd4-Cre+ (Satb1 cKO) murine thymocytes”.

      In Lines 739-740, we now also state that all imaging approaches were performed on at least two biological replicates (different mice). Please also note that all findings were based on data from both STED and 3D-SIM methods, which minimizes the risk of detecting artifacts. In the Raman spectroscopy figure, each point represents measurements from an individual cell, and for each condition we used 2-5 biological replicates (Lines 831-832 & Line 1169).

      Similarly, in Lines 129-132 we provided a quite detailed description of the differences between STED and 3D-SIM, even though these techniques are not as rare as Raman spectroscopy in biology research.

      Additional control is needed to report the resolution limit of Superresolution techniques-STED and 3D-SIM systems used by them.

      We have already provided this information in our reply to comment #1 of this Reviewer (pages 6-7): In the revised version of the manuscript, we have specified the resolution of our systems, for STED in Lines 745-746: “This system enables super-resolution imaging with 35 nm lateral and 130 nm axial resolution.” and for SIM in Lines 759-761: “Images were acquired over the majority of the cell volume in z-dimension with 15 raw images per plane (five phases, three angles), providing ~120-135 nm lateral and ~340-350 nm axial resolution for 488/568 nm lasers, respectively.” The resolution of our systems is routinely verified by the following methods: The resolution of our OMX (SIM-3D) system was tested using an ARGO-SIM slide containing a pattern of 36 µm long lines with gradually increasing spacing ranging from (left to right) 0 to 390 nm, with a step of 30 nm (Fig. 1 below). Our SIM system was able to clearly resolve two lines separated by 120 nm.

      Would be very helpful if the zonation was plotted for the FluoroUridine (FU) also to show that Zone1 (heterochromatin) is completely depleted of FU, and is present in other regions.

      In the revised version of the manuscript, we performed the suggested analysis and in Supplementary Fig. 3a we now show that indeed FU is significantly less localized to Zone 1 (heterochromatin) and has the most abundant localization in Zones 3 and 4, similar to the localization of SATB1 protein, as demonstrated in Fig. 2b.

      Scale bar needed figure 3d

      In the revised version of the manuscript, we included scale bars which are both 0.5 µm (line 1213).

      Perfectly rounded SATB1 foci- this does not mean LLPS. For LLPs measurement, protein condensate dynamics measurement by FRAP or fusion experiments is required. What is the size of condensates? and cellular concentration of SATB1? Will SATB1 undergo LLPS in vitro at similar concentrations? does SATB1 interact with DNA or RNA to undergo LLPS ?

      We toned down this sentence which now reads: “Here we demonstrated its connection to transcription and found that it forms spherical speckles (Fig. 1g), markedly resembling phase separated transcriptional condensates. (Lines 200-202)”.

      Moreover, as explained in earlier replies to comments of this Reviewer, we cannot perform FRAP on primary murine T cells without generating a new mouse line. We did, however, use FRAP and other approaches, including visualization of droplet fusion, in ex vivo experiments utilizing cell lines. We are also willing to demonstrate the LLPS properties of SATB1 using purified SATB1 protein in vitro, as indicated in the suggested experiment of Point #4 (page 2).

      After careful reading of the MS I conclude that the main conclusions of the paper are very preliminary and need much more detailed experiments. So does not qualify to get published at all at this stage.

      **Reviewer #1 (Significance)**:

      The present manuscript tries to connect the phase separation of SATB1 to understanding the mechanism of SATB1 function in cells. One of the major hallmarks of phase separation is dynamic, liquid-like behaviour and in absence of these measurements, it is very difficult to say that the current manuscript has made any contribution to showing that SATB1 can phase separate.

      The presence of 2 isoforms of SATB1 is a novel finding and the paper could have focused more on this. E.g. elucidate expression of the isoform during thymocyte development and maturation.

      As a reviewer my expertise are cell biology experiments, microscopy, in vitro reconstitution assays, RNA binding proteins, RNA and RBP condensate formation. And I feel that the reconstitution experiments are an important tool for understanding phase behaviour of proteins and also to gauge if this behaviour can occur or not in cellular concentration and conditions.

      I do not have sufficient expertise in Raman microscopy and hence the information provided in the MS on this part was not enough to understand the experiment and conclusions drawn from it.

      **Reviewer #2 (Evidence, reproducibility and clarity)**:

      The authors have reported the existence of a 'long' SATB1 isoform which also undergoes LLPS. The authors tried to draw multiple comparisons and pointed out distinction between phase properties of SATB1 isoforms. The authors also touch upon two functional roles of SATB1. Although a wide array of assays are used, the data presented and hence the manuscript makes multiple transitions into disparate hypotheses without diving deep into a single hypothesis. As a result, the connections drawn are unclear, and do not converge at best. The authors have used number of techniques, however, the results do not support their conclusions and they appear hastily drawn. It is not clear why the authors jump from one context to the other, discussing LLPS first, then transcription, splicing, post-translational modification and finally cancer. The link between all of these isn't clear and not fully supported by data. It appears that the authors wish to focus on Satb1's physiological role in development, hence the data on breast cancer is confusing. Thus, this work suffers from multiple pitfalls. Specific comments are given below:

      Major comments 1. Importantly, in Fig 1d, there is no statistics shown. There is no mention of number of replicates as well in the legends. Proper statistical evaluation is critical for interpreting this result.

      Please note that Fig. 1d only serves as a control to the sequencing experiment in Fig. 1b. In Line 566, we now state that for the RNA-seq: “A biological triplicate was used for each genotype.” To validate these data, we further designed an RT-qPCR experiment, which was performed on three technical replicates from a male and a female mouse. We now state this in Line 636. With such a low number of samples, statistical tests are not reliable, but we have nevertheless added a t test to Fig. 1d and specified it in the figure legend in Lines 1169-1170.
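
      The caveat about small sample sizes can be illustrated with a minimal sketch of a two-sample Student's t test on three values per group. The input values below are hypothetical placeholders, NOT measurements from the manuscript, and the equal-variance (pooled) form is an assumption, since the reply does not specify which variant of the t test was used.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample Student's t statistic (equal-variance form) and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical relative expression values, NOT data from the manuscript.
short_transcript = [4.1, 3.8, 4.4]
long_transcript = [1.0, 1.1, 0.9]

t_stat, dof = pooled_t(short_transcript, long_transcript)

# Two-sided 5% critical value of Student's t for 4 degrees of freedom.
T_CRIT_DF4 = 2.776
print(f"t = {t_stat:.2f}, df = {dof}, exceeds critical value: {abs(t_stat) > T_CRIT_DF4}")
```

      With three values per group the test has only 4 degrees of freedom, so the variance estimate is coarse and the significance threshold is high, which is the sense in which such tests are of limited reliability here.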

      1. Figure 1f presents one of the weakest evidences in the manuscript. There are a number of corrections needed. Firstly, being their major and only validation figure for their custom antibody, the immunoblot is not clean, bands are fuzzy. Importantly, as the authors claim that the antibody is highly specific to 'long' SATB1, after the IP there should be only a single band (like input) of Satb1 long. But that does not seem to be the case, rather an array of bands are visible below (lane 2 top panel). This could easily mean that the shorter isoforms or non-specific protein bands are also pulled down with the 'long' form specific antibody. Therefore, raising a critical concern regarding the specificity of the antibody.

      • The long-isoform antibody was raised in mice inoculated with the extra peptide present only in the long isoform. It is therefore not possible for this antibody to precipitate the shorter isoforms, which lack the sequence of the extra peptide (EP, Figure 1a).
      • We have repeated the immunodepletion experiment and we now provide the results in Fig. 1f and Supplementary Fig. 1b. The western blot in Fig. 1f is now cleaner and supports quite convincingly the presence of a long SATB1 isoform. Given the lack of isoform-specific knockouts, which we could have used to immunoprecipitate or detect the different isoforms in a single cell (or cell population), immunodepletion followed by western blotting is the approach we chose to implement.
      • As shown in Fig. 1f and Supplementary Figure 1b, the long isoform SATB1 antibody recognizes the long isoform in murine thymocyte protein extracts but not the short SATB1 isoform (please compare lane 3 in the two western blots, probed with either the antibody for the long isoform, top panel, or the antibody that detects both isoforms, lower panel).
      • We have performed immunofluorescence experiments utilizing the antibody detecting the long SATB1 isoform in thymocytes isolated from either C57BL/6 or Satb1 cKO mice. The antibody is specific to the SATB1 protein, since there is no signal in immunofluorescence experiments utilizing the knockout cells (Supplementary Figure 1c).
      • We have performed immunofluorescence experiments utilizing thymocytes and either the antibody detecting the long SATB1 isoform or a commercially available antibody detecting all SATB1 isoforms. The pattern of SATB1 subnuclear localization is similar for both antibodies (Supplementary Figure 1e).
      • In our accompanying revised manuscript Zelenka et al., 2022 (https://doi.org/10.1101/2021.07.09.451769), we provide yet another piece of evidence, consisting of bacterially expressed short and long SATB1 protein isoforms detected by western blot using either the long isoform-specific or the non-selective all SATB1 isoforms antibodies.
      • Regarding the additional bands detected in the immunoprecipitation experiment presented in the original Supplementary Figure 1b (lane 2), it is not surprising that additional bands appear in a protein extract that is handled for several hours during the immunoprecipitation experiments, whereas the “input” sample is simply protein extract that is frozen at -80 °C right after preparation until use. It is well established that SATB1 is a target of proteases, which may well be active during the immunoprecipitation steps (two consecutive immunoprecipitation steps take place). Therefore, the immunoprecipitated material is not necessarily a copy of the input material displaying a single protein band, even if protease inhibitors are included in the buffers.

      Taken together, the experiments described here show that the antibody raised against the extra 31 aa long peptide, present only in the long SATB1 isoform, is specific for this isoform.

      1. Related to Fig. 2 a, the authors state on Pg 5, '....the euchromatin and interchromatin regions (zones 3 & 4, Fig. 2a, b).' Although the DAPI correlation seems clear, there is no mention on how they reached the above said correlation. They should at least show a parallel speckle staining for HP1 or signature modification such as H3K4me9 STEDs for making supporting such a claim. DAPI alone is not sufficient. The authors should rectify the text thoroughly for many such interpretations without validation/reference or provide relevant data.

      This is a great suggestion, which we have taken into consideration. We added the following experiments and made the appropriate changes in the revised version of our manuscript.
      • We modified the text and added a reference to Miron et al., 2020 (https://doi.org/10.1126/sciadv.aba8811) supporting our claims regarding SATB1 localization in relation to DAPI staining.
      • We have also added new microscopy images for HP1, H3K4me3 and fibrillarin staining and quantified the localization of FU-stained sites of active transcription in nuclear zones, to further support our claims.
      • This whole modified part in Lines 139-167 then reads: “The quantification of SATB1 speckles in four nuclear zones, derived based on the relative intensity of DAPI staining, highlighted the localization of SATB1 mainly to the regions with medium to low DAPI staining (zones 3 & 4, Fig. 2a, b). A similar distribution of the SATB1 signal could also be seen from the fluorocytogram of the pixel-based colocalization analysis between the SATB1 and DAPI signals (Supplementary Fig. 2a). SATB1’s preference to localize outside heterochromatin regions was supported by its negative correlation with HP1β staining (Supplementary Fig. 2b). Localization of SATB1 speckles detected by antibodies targeting all SATB1 isoforms and/or only the long SATB1 isoform, revealed a significant difference in the heterochromatin areas (zone 1, Fig. 2b), where the long isoform was less frequently present (see also Fig. 2a and Fig. 3c). Although, this could indicate a potential difference in localization between the two isoforms, due to the inherent difficulty to distinguish the two based on antibody staining, we refrain to draw any conclusions. The prevailing localization of SATB1 corresponded with the localization of RNA-associated and nuclear scaffold factors, architectural proteins such as CTCF and cohesin, and generally features associated with euchromatin and active transcription32. 
This was also supported by colocalization of SATB1 with H3K4me3 histone mark (Supplementary Fig. 2c), which is known to be associated with transcriptionally active/poised chromatin. Given the localization of SATB1 to the nuclear zones with estimated transcriptional activity32 (Fig. 2b, zone 3), we investigated the potential association between SATB1 and transcription. We unraveled the localization of SATB1 isoforms and the sites of active transcription labeled with 5-fluorouridine. Sites of active transcription displayed a significant enrichment in the nuclear zones 3 & 4 (Supplementary Fig. 3a), similar to SATB1. As detected by fibrillarin staining, SATB1 also colocalized with nucleoli which are associated with active transcription and RNA presence (Supplementary Fig. 3b). Moreover, we found that the SATB1 signal was found in close proximity to nascent transcripts as detected by the STED microscopy (Fig. 2c). Similarly, the 3D-SIM approach indicated that even SATB1 speckles that appeared not to be in proximity with FU-labeled sites in one z-stack, were found in proximity in another z-stack (Supplementary Fig. 3c). Additionally, a pixel-based colocalization of SATB1 and sites of active transcription is quantified later in the text in Fig. 3g, supporting their colocalization.”

      1. The authors mention, '...of the different SATB1 isoforms, uncovered by the use of the two different antibodies, relied in the heterochromatin areas (zone 1), where the long isoform was less frequently...' There is no supporting figure number mentioned. The authors need to show a zone-by-zone comparison images for 'all iso' vs 'long' iso of SATB1. Just to reiterate, there is a need for a heterochromatin mark to unambiguously call out the distinction.

      We note that there is an inherent difficulty in accurately comparing the localization of the short and long SATB1 isoforms in primary cells, especially given the lack of Satb1 isoform-specific knockout mice. There is no way to detect only the short isoform in these primary cells, as the available antibodies target either the long isoform or all SATB1 isoforms. Therefore, we cannot set up additional experiments probing these questions.

      In line with this, in the revised version of the manuscript, we toned down our statements regarding the differential localization of the two isoforms in primary cells. We only refer to it as an indication and we support it by adding references to the relevant figures. This part now reads: “Localization of SATB1 speckles detected by antibodies targeting all SATB1 isoforms and/or only the long SATB1 isoform, revealed a significant difference in the heterochromatin areas (zone 1, Fig. 2b), where the long isoform was less frequently present (see also Fig. 2a and Fig. 3c). Although, this could indicate a potential difference in localization between the two isoforms, due to the inherent difficulty to distinguish the two based on antibody staining, we refrain to draw any conclusions. (Lines 145-150)”

      1. On the same lines, '....Given the localization of SATB1 to the nuclear zones with estimated transcriptional activity (Fig. 2b, zone 3)....' How was the region labelled as transcriptionally active? For the statistical analysis of speckle count for the two antibodies' staining, the claim posited is a bit bigger. This could simply be true for that cell. The authors thus need to statistically analyse the speckle counts for multiple cells. This needs to be done for all imaging statistics done in multiple figures throughout the manuscript.

      As mentioned in our replies to the two previous comments of this Reviewer, the relationship between transcriptional activity and nuclear zonation is well established in the literature. To make this clear, we have now added the reference to Miron et al., 2020 (https://doi.org/10.1126/sciadv.aba8811) supporting our claims, and we have additionally included HP1, H3K4me3 and fibrillarin staining and quantification of the FU signal in the nuclear zones. Moreover, it is not clear to which particular cell the comment refers. Each dot in Fig. 2b represents an individual cell, and the relative proportion of speckles in each nuclear zone is plotted on the y axis. In the revised version of the manuscript, we added the number of cells scored to the figure and adapted the figure legend so that it is absolutely clear that we have analyzed multiple cells:

      “Nuclei of primary murine thymocytes were categorized into four zones based on the intensity of DAPI staining and SATB1 speckles in each zone were counted. Images used represented a middle z-stack from the 3D-SIM experiments. The graph depicts the differences between the long and all SATB1 isoforms’ zonal localization in nuclei of primary murine thymocytes. (Lines 1189-1193)”
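
The zonal quantification described in this legend can be sketched as follows. This is only an illustration and not the authors' pipeline: it assumes that the four zones are defined by intensity quartiles of the DAPI signal within one nucleus (zone 1 = brightest DAPI, i.e. heterochromatin; zone 4 = dimmest), which the response does not specify. The function names and toy intensities are hypothetical.

```python
def zone_thresholds(dapi_values):
    """Return three cut-offs that split nuclear DAPI intensities into quartiles.

    Assumption: zones are quartile-based; the actual zone definition
    used in the manuscript is not stated in the response.
    """
    ordered = sorted(dapi_values)
    n = len(ordered)
    return [ordered[(n * k) // 4] for k in (1, 2, 3)]


def assign_zone(intensity, thresholds):
    """Map a DAPI intensity to a zone: 1 = brightest, 4 = dimmest."""
    q1, q2, q3 = thresholds
    if intensity >= q3:
        return 1
    if intensity >= q2:
        return 2
    if intensity >= q1:
        return 3
    return 4


def speckle_zone_proportions(dapi_at_speckles, thresholds):
    """Proportion of speckles per zone for one cell.

    `dapi_at_speckles` holds the DAPI intensity at each speckle position.
    """
    if not dapi_at_speckles:
        raise ValueError("no speckles to classify")
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for value in dapi_at_speckles:
        counts[assign_zone(value, thresholds)] += 1
    total = len(dapi_at_speckles)
    return {zone: count / total for zone, count in counts.items()}


# Toy example for a single nucleus (hypothetical intensities).
nucleus_pixels = list(range(1, 101))
cuts = zone_thresholds(nucleus_pixels)
proportions = speckle_zone_proportions([10, 20, 30, 40, 60, 80], cuts)
```

Computing one set of proportions per cell, as here, is what yields one dot per cell and zone in a plot such as Fig. 2b, so statistics can then be run across cells rather than within a single image.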

      1. For figure 2c. the authors have used 5 Fluorouridine for nascent RNA speckles. 5FU is known to have a spread signal type (with strong association to nucleolus as well). This is not the case for the image presented 2c. The authors should resolve this by showing different sets of images.

      Developing and naive T cells have distinctive metabolic features and thus should not be directly compared with other cell types. Therefore, we would not expect to see the spread FU pattern previously shown for other cell types. Having said that, we could not find any reference publication that utilized super-resolution microscopy to detect the localization of FU-stained sites of active transcription in developing primary T cells. However, we performed additional immunofluorescence experiments to demonstrate the colocalization, or lack thereof, between SATB1 and HP1 (Supplementary Fig. 2b), H3K4me3 (Supplementary Fig. 2c) and fibrillarin (Supplementary Fig. 3b). Moreover, we provide additional regions of SATB1 and FU staining in Supplementary Fig. 3c. The modified text reads:

      “We unraveled the localization of SATB1 isoforms and the sites of active transcription labeled with 5-fluorouridine. Sites of active transcription displayed a significant enrichment in the nuclear zones 3 & 4 (Supplementary Fig. 3a), similar to SATB1. As detected by fibrillarin staining, SATB1 also colocalized with nucleoli which are associated with active transcription and RNA presence (Supplementary Fig. 3b). Moreover, we found that the SATB1 signal was found in close proximity to nascent transcripts as detected by the STED microscopy (Fig. 2c). Similarly, the 3D-SIM approach indicated that even SATB1 speckles that appeared not to be in proximity with FU-labeled sites in one z-stack, were found in proximity in another z-stack (Supplementary Fig. 3c). Additionally, a pixel-based colocalization of SATB1 and sites of active transcription is quantified later in the text in Fig. 3g, supporting their colocalization. (Lines 157-167)”

      1. Fig 2 d., the authors have suddenly jumped solely to 'all iso' Satb1 here for IP MS. Is there a reason for that? The authors either need to do this with 'long iso' antibody or remove the analysis from the manuscript as it does not add to their primary aim of the manuscript. Also, the authors have only selectively talked about two clusters? What about chromatin related proteins? It is quite intuitive to have highest enrichment of these given previous literature and even IP MS data by other groups. Thus, it is necessary to revise this thoroughly or remove it.

      We appreciate the acknowledgment by the Reviewer that our IP-MS data identified anticipated factors. In the revised version of the manuscript we modified the underlying text to accommodate references to these former findings revealing interactions between SATB1 and chromatin modifying complexes: “Apart from subunits of chromatin modifying complexes that were also detected in previous reports25,33–36, unbiased k-means clustering of the significantly enriched SATB1 interactors revealed two major clusters consisting mostly of proteins involved in transcription (blue cluster 1; Fig. 2d and Supplementary Fig. 4c) and splicing (yellow cluster 2; Fig. 2d and Supplementary Fig. 4c). (Lines 170-174)”

      Please note that many subunits of chromatin modifying and chromatin-related complexes are in fact characterized as transcription-related factors, therefore our statements are not in disagreement with the former findings. Note also that we provide Supplementary File 1 & 2 with comprehensive description of our IP-MS data for the readers’ convenience. Please also note that we are the first group to report on the existence of the long isoform. Therefore, we find it absolutely reasonable to perform IP-MS experiment for all SATB1 isoforms which can then be used for a comparison with other publicly available datasets. We believe that there is no contradiction in this experimental setup in relation to the rest of the manuscript. We discuss the two major clusters simply because they are the two major clusters identified as indicated in Fig. 2d. Additionally, in Supplementary Fig. 4c, we provide a comprehensive description of all significantly enriched interactors including their cluster annotation and thus anyone can investigate the data if needed.

      1. In relation to Fig. 2f, the authors have not mentioned any of the previously published work on Satb1 CD4 specific KO, not even the RNA seq studies the other groups have reported under the same condition. Only an unpublished reference of their own (preprint) is cited. It is imperative to show how much their data corroborates with other published studies. Additionally, what is the binding site status of dysregulated genes?

      In the revised version of the manuscript, we have included the references to other studies using the same Satb1 conditional knockout. Moreover, we have clarified the relationship between SATB1 binding and gene transcription. The modified part in Lines 182-194 now reads: “Satb1 cKO animals display severely impaired T cell development associated with largely deregulated transcriptional programs as previously documented19,37,38. In our accompanying manuscript19, we have demonstrated that long SATB1 isoform specific binding sites (GSE17344619) were associated with increased chromatin accessibility compared to randomly shuffled binding sites (i.e. what expected by chance), with a visible drop in chromatin accessibility in Satb1 cKO. Moreover, the drop in chromatin accessibility was especially evident at the transcription start site of genes, suggesting that the long SATB1 isoform is directly involved in transcriptional regulation. Consistent with these findings and with SATB1’s nuclear localization at sites of active transcription, we identified a vast transcriptional deregulation in Satb1 cKO with 1,641 (922 down-regulated, 719 up-regulated) differentially expressed genes (Fig. 2f). Specific examples of transcriptionally deregulated genes underlying SATB1-dependent regulation are provided in our accompanying manuscript19. Additionally, there were 2,014 genes with altered splicing efficiency (Supplementary Fig. 4d-e; Supplementary File 3-4). We should also note that the extent of splicing deregulation was directly correlated with long SATB1 isoform binding (Supplementary Fig. 4d).”

      1. In context of Figure 3a and b, the authors write .'...The long SATB1 isoform speckles evinced such sensitivity as demonstrated by a titration series with increasing concentrations of 1,6-hexanediol treatment followed...' Whereas it is apparent from the image at least that overall numbers of individual speckles are instead increased at both 2 and 5%. There is although a clear spreading of restricted speckles compared to the controls. The authors should revise their figures to substantiate the associated text. Furthermore, there needs to be 'all iso' SATB1 3D SIM imaging and not just quantitation for comparison. This is also true for panel c in order to demonstrate the effect.

      In the revised Fig. 3a we provide new images which better reflect the underlying data analysis. Moreover, in Fig. 3c and Fig. 3d we provide an additional comparison between SATB1 all isoforms and long isoform staining and their changes upon hexanediol treatment, detected by both the 3D-SIM and STED approaches. It is true that upon treatment, there tend to be more speckles, however these are much smaller as they are gradually being dissolved. Depending on the treatment duration, the cells are swollen which is reflected in increased spreading of speckles. Nevertheless, the nuclear size was considered in all the quantification analyses. We believe that the new images provide better evidence of SATB1’s sensitivity to hexanediol treatment.

      1. Fig. 3 d also does not clearly demonstrate what the authors have claimed '...hexanediol treatment highly decreased colocalization between...' The figure shows at best decreased signal intensity for both SATB1 and FU. We suggest that the authors should give a statistical analysis as well for the colocalization points between the two using multiple source images. Lastly, the two images shown (control and treated), there seems to be a clearly visible magnification difference. The authors should clarify this.

      • In the revised version of the manuscript in Figure 3d, we have provided scale bars, which are both 0.5 µm (line 1213). The difference observed by this Reviewer is actually the main reason why we provided this image. Figure 3d demonstrates that upon hexanediol treatment, the speckles are mostly missing or significantly reduced in size, for both FU and SATB1 staining.
      • Moreover, the suggested statistical analysis is also provided in Figure 3e. In Figure 3e, we performed pixel-based colocalization analysis which is a method that allows both quantification and statistical comparison of colocalization between two factors and between different conditions. Please note especially the decreased colocalization between long SATB1 isoform and FU-stained sites of active transcription in the left graph, which is in agreement with our claims in the manuscript.
      • Moreover, our data are compared to a negative control, i.e. 90 degrees rotated samples, which is a common method in colocalization experiments as described for example in Dunn et al., 2011 (https://doi.org/10.1152/ajpcell.00462.2010).
      • Additionally, we provide Costes’ P values which are based on randomly scrambling the blocks of pixels (instead of individual pixels, because each pixel’s intensity is correlated with its neighboring pixels) in one image, and then measuring the correlation of this image with the other (unscrambled) image. Please see Costes et al., 2004 (https://doi.org/10.1529%2Fbiophysj.103.038422) for more details.
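
The three ingredients named in this reply (a pixel-based correlation, a 90-degree rotated negative control, and a block-scrambling significance test in the spirit of Costes et al.) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the software or settings the authors used: images are small square lists of rows, the block size is arbitrary rather than matched to the point spread function, and all function names and toy values are hypothetical.

```python
import random


def pearson(img_a, img_b):
    """Pearson correlation between two equally sized 2D images."""
    xs = [v for row in img_a for v in row]
    ys = [v for row in img_b for v in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rotate90(img):
    """Rotate a square image 90 degrees clockwise (negative control)."""
    return [list(row) for row in zip(*img[::-1])]


def scramble_blocks(img, block, rng):
    """Shuffle non-overlapping block x block tiles of a square image.

    The image side length must be a multiple of `block`.
    """
    size = len(img)
    starts = range(0, size, block)
    tiles = [[row[c:c + block] for row in img[r:r + block]]
             for r in starts for c in starts]
    rng.shuffle(tiles)
    out = [[0] * size for _ in range(size)]
    per_row = len(starts)
    for idx, tile in enumerate(tiles):
        r0 = (idx // per_row) * block
        c0 = (idx % per_row) * block
        for dr, tile_row in enumerate(tile):
            out[r0 + dr][c0:c0 + block] = tile_row
    return out


def costes_p(img_a, img_b, block=2, n_iter=200, seed=0):
    """Fraction of block-scrambled images that correlate less than observed."""
    rng = random.Random(seed)
    observed = pearson(img_a, img_b)
    below = sum(
        pearson(scramble_blocks(img_a, block, rng), img_b) < observed
        for _ in range(n_iter)
    )
    return below / n_iter


# Toy 4x4 channels (hypothetical values); identical channels colocalize fully.
channel_a = [[9, 8, 1, 0], [8, 9, 0, 1], [1, 0, 2, 1], [0, 1, 1, 2]]
channel_b = [row[:] for row in channel_a]
observed_r = pearson(channel_a, channel_b)
rotated_r = pearson(rotate90(channel_a), channel_b)
p_value = costes_p(channel_a, channel_b)
```

Scrambling whole blocks rather than single pixels preserves the local autocorrelation of the signal, which is the reason given in the reply for preferring this randomization.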

      1. Figure 3f. The authors show the PC plot for Raman spectroscopy for phase behaviour due to Satb1. The experiment and its related text seems misinterpreted; the authors write...' ese bonds were probably enriched for weak interactions responsible for LLPS that are susceptible to hexanediol treatment. This shifted the cluster of WT treated cells towards the Satb1 cKO cells. However, the remaining covalent bonds differentiated the WT samples from Satb1 cKO cells......' whereas the clusters are clearly far away in 3D for both WT and KO while being closer to their respective treatments. Which is also intuitive given the sensitivity of Raman spectroscopy. Thus, it is more likely to be treatment effect and KO effect as separate. Treatment of WT leads to KO like spectra is far-fetched. Thus, the authors need to show separate PCs and modify their text thoroughly.

      We do not present any 3D graph; hence it is not clear what the Reviewer refers to. Please also note that, as stated in Lines 817-818, we used a customized Raman spectrometer. This approach allowed us to measure Raman spectra at cellular and even sub-cellular levels. For example, solely by utilizing Raman spectroscopy, we can now distinguish euchromatin and heterochromatin, methylated and unmethylated DNA and RNA, etc. This, together with other reports, such as Kobayashi-Kirschvink et al., 2018 (https://doi.org/10.1016/j.cels.2018.05.015) and Kobayashi-Kirschvink et al., 2022 (https://doi.org/10.1101/2021.11.30.470655), indicates a potential use of Raman in biological research. In our manuscript, we used this method as a supplementary approach; however, we do find it noteworthy. We should also emphasize that in the revised Raman spectroscopy Fig. 3h, each point represents measurements from an individual cell and for each condition we used 2-5 biological replicates (Lines 831-832 & Lines 1225-1226). We specifically refer to principal component 1 (PC1), which differentiates the samples. Therefore, there are certain spectra (representing certain chemical bonding) that allowed us to differentiate between WT and Satb1 cKO. The same type of bonding was then affected when WT samples were treated with hexanediol and we also had controls to rule out the impact of hexanediol on the resulting spectra.
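
To make the PC1 argument concrete, the sketch below projects per-cell spectra onto their first principal component and checks whether two groups separate along it. It is a toy illustration only: the three-channel "spectra" are invented numbers rather than Raman data, the first component is obtained by simple power iteration instead of a full PCA routine, and the function name is hypothetical.

```python
def first_principal_component(rows, n_iter=100):
    """Return (PC1 direction, PC1 score per row) via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores


# Hypothetical per-cell intensities at three spectral channels.
group_a = [[1.0, 5.0, 2.0], [1.1, 5.2, 2.1], [0.9, 4.9, 1.9]]
group_b = [[1.0, 2.0, 2.0], [1.1, 2.1, 2.1], [0.9, 1.9, 2.0]]
direction, scores = first_principal_component(group_a + group_b)
scores_a, scores_b = scores[:3], scores[3:]
```

In this toy case the groups differ mainly in the second channel, so PC1 points along it and the two sets of scores do not overlap. The sign of a principal component is arbitrary, so only the separation matters, not which group lands on the positive side.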

      1. In Fig 4. b, The authors have shown the propensity of SATB1 N terminus to phase separate using different optodroplet constructs. Although the imaging is clear, why are the regions selected not uniform when comparing various constructs?

      We have selected images that best represent each category. Please note that this was live-cell imaging of photo-responsive constructs, so there are many limitations regarding area selection. Very often, even the brief exposure to bright light needed to locate cells may trigger protein clustering. Upon disassembly, every new light exposure of the same cell then triggers much faster assembly, which skews the overall results. It is therefore desirable to work fast, at the expense of selecting equally sized cells. Moreover, it is not clear how the proposed change would improve the quality of our manuscript.

      1. Figure 5a, the disassembly should be shown for 'long' SATB1 as well. On pg 13, the authors write '....cytoplasmic protein aggregation has been previously described for proteins containing poly-Q domains and PrLDs..' no reference given.

      • In the revised version of the manuscript, we present the assembly and disassembly for both short and long full length SATB1 optogenetic constructs. To increase clarity, we present the behavior of the short and long isoforms as two separate images in Figure 5a and Figure 5b, respectively.
      • Moreover, we provided references to the statement regarding aggregation of PrLD and poly-Q-containing proteins in Lines 305-309, which now reads: “Since protein aggregation has been previously described for proteins containing poly-Q domains and PrLDs8,11,38,39, we next generated truncated SATB1 constructs encoding two of its IDR regions, the PrLD and poly-Q domain and in the case of the long SATB1 isoform also the extra peptide neighboring the poly-Q domain (Fig. 1a and 4a).”

      1. Fig. 5d, Is there an amino-acid specific reasoning to support the authors claim of the phase behaviour due to extra peptide? They need to show a proper control with equal extra (unrelated) peptide to show the specificity. Are the shorter isoform aggregates responsive to light?

      • We have referred to the amino acid composition bias in Fig. 5c. In the revised version of the manuscript, we made this clear by showing the composition bias in the new revised Fig. 5e. The related part of the main text then reads: “Computational analysis, using the algorithm catGRANULE37, of the protein sequence for both murine SATB1 isoforms indicated a higher propensity of the long SATB1 isoform to undergo LLPS with a propensity score of 0.390, compared to 0.379 for the short isoform (Fig. 5d). This difference was dependent on the extra peptide of the long isoform. Out of the 31 amino acids comprising the murine extra peptide, there are six prolines, five serines and three glycines – all of which contribute to the low complexity of the peptide region3 (Fig. 5e).” (Lines 298-304)
      • Moreover, we should note that the low complexity extra peptide of the long SATB1 isoform directly extends the PrLD and IDR regions as indicated in Fig. 4a, which we now state directly in Lines 304-305: “Moreover, the extra peptide of the long SATB1 isoform directly extends the PrLD and IDR regions as indicated in the Fig. 4a.”
      • We show in Fig. 4 that the N terminus of SATB1 undergoes LLPS. Since this part of SATB1 is shared by both isoforms, it is reasonable to assume that both isoforms would undergo LLPS. This is also in line with the observed photo-responsiveness of both short and long full length SATB1 isoforms in CRY2 optogenetic constructs in revised Fig. 5a,b, and similar FRAP results for both short and long full length SATB1 isoform constructs transiently transfected in NIH-3T3 cells in the revised Supplementary Fig. 6f. However, the main reason why we think that the difference in LLPS propensity between the isoforms is important is that the long isoform is more prone to aggregation than the short isoform, as documented in Fig 5c,f,g and Supplementary Fig. 5f.
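
The composition bias quoted above can be restated as simple arithmetic. The sketch uses only the counts and scores given in the reply (31 residues; six prolines, five serines, three glycines; catGRANULE scores of 0.390 and 0.379); the peptide sequence itself is not reproduced here, and the variable and function names are hypothetical.

```python
EXTRA_PEPTIDE_LENGTH = 31

# Residue counts as reported for the murine extra peptide.
reported_counts = {"Pro": 6, "Ser": 5, "Gly": 3}


def composition_fractions(counts, length):
    """Fraction of the peptide made up by each listed residue type."""
    return {residue: count / length for residue, count in counts.items()}


fractions = composition_fractions(reported_counts, EXTRA_PEPTIDE_LENGTH)
biased_residues = sum(reported_counts.values())
biased_fraction = biased_residues / EXTRA_PEPTIDE_LENGTH

# Difference between the reported propensity scores (long minus short).
score_difference = round(0.390 - 0.379, 3)

print(f"{biased_residues} of {EXTRA_PEPTIDE_LENGTH} residues "
      f"({biased_fraction:.1%}) are Pro, Ser or Gly")
print(f"catGRANULE score difference: {score_difference}")
```

Three residue types thus account for 14 of the 31 positions, roughly 45% of the peptide, while the reported difference in propensity score between the isoforms is 0.011.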

      1. Fig 6c., It is important that authors show the data for NLS+short iso data as well to prove their hypothesis.

      As shown in original Figure 5d, the long SATB1 isoform undergoes cytoplasmic aggregation, unlike the short SATB1 isoform (as shown in the same Figure). Therefore, an image of the NLS + short isoform would not be related to our hypothesis. Actually, we wanted to reverse the long SATB1 isoform’s relocation, from the aggregated form in the cytoplasm into the nucleus. Nevertheless, to show the complete picture, in the revised version of the manuscript in Figure 6c, we now provide data for both short and long SATB1 isoforms.

      1. Fig 6d., The authors claim that mutating a specific P site changes the phase behaviour of the 'short iso'. Does it also increase for the long isoform? The authors need to confirm this in order to verify the effect of a single P site outside of oligomerization domain. ...' phosphorylation status; when phosphorylated it remains diffused, whereas unphosphorylated SATB1 is localized to PML bodies....' This being an important premise, thus should be moved to the results text.

      In the revised version of the manuscript, we moved the part regarding PML into the Results section, as suggested by the Reviewer. Moreover, we included additional experiments probing how the association between PML and the two full-length SATB1 isoforms affects their dynamics. The modified section in Lines 357-368 now reads: “In relation to this, a functional association between SATB1 and PML bodies was already described in Jurkat cells64. We should note that PML bodies represent an example of phase separated nuclear bodies65 associated with SATB1. Targeting of SATB1 into PML bodies depends on its phosphorylation status; when phosphorylated it remains diffused, whereas unphosphorylated SATB1 is localized to PML bodies66. This is in line with the phase separation model as well as with our results from S635A mutated SATB1, which has a phosphorylation blockade promoting its phase transitions and inducing aggregation. To further test whether SATB1 dynamics are affected by its association with PML, we co-transfected short and long full length SATB1 isoforms with PML isoform IV. The dynamics of long SATB1 isoform was affected more dramatically by the association with PML than the short isoform (Supplementary Fig. 7e), which again supports a differential behavior of the two SATB1 isoforms.”

      Moreover, given that the discussed phosphorylation site lies in the DNA binding region of SATB1, we tested its impact on DNA binding, as documented in the revised Supplementary Fig. 7d. Additionally, as noted in our answer to Major Comment C of this Reviewer, to further support the effect of serine phosphorylation on the DNA binding capacity of SATB1, we performed DNA affinity purification experiments utilizing primary thymocyte nuclear extracts treated with phosphatase (Supplementary Fig. 7b). We found that SATB1’s capacity to bind DNA (RHS6 hypersensitive site of the TH2 LCR) is lost upon treatment with phosphatase (Supplementary Fig. 7c).

      1. Pg 16,. The authors have tried to explain multiple things (concepts of self-regulation, accessibility) which is quite tangential. There is no inference to Fig 6f., which is showing the opposite to what the authors had postulated. This portion should either be removed or explained with a rationale. The writing also needs to be revised thoroughly in this section. Similarly, the discussion should also be modified.

      The rationale for the original Fig. 6f (revised Fig. 6g) was described in great detail in Lines 330-343 of the original manuscript. It is not clear why the Reviewer assumes that it shows the opposite to our hypothesis. As we explained, the increased accessibility allows faster read-through by RNA polymerase, and thus the exon with higher accessibility is more likely to be skipped. The exact relationship is shown in the revised Fig. 6g where the increased accessibility is associated with the expression of the short isoform, whereas the long isoform expression needs lower chromatin accessibility which allows the splicing machinery to act on the specific exon to be included. We reason that these findings are important and relevant because: 1) we suggest a potential regulatory mechanism for the SATB1 isoforms production. This is highly relevant to this manuscript given the fact that this is the first report on the existence of the long SATB1 isoform, and 2) the differential production of the long/short SATB1 isoforms has a potential relevance to breast cancer prognosis. In the revised version of the manuscript we added Fig. 6f, which now indicates the differential chromatin accessibility in human breast cancer patients and accordingly the expression of the long SATB1 isoform are associated with worse patient prognosis as indicated in Fig. 6h and Supplementary Fig. 8a,b. In the revised version of the manuscript, we substantially modified the text in Lines 374-408, to make the relevance of all these conclusions clear. The modified text now reads: “Therefore, we reasoned that a more plausible hypothesis would be based on the regulation of alternative splicing. In our accompanying manuscript19, we have reported that the long SATB1 isoform DNA binding sites display increased chromatin accessibility than what expected by chance (Fig. 3b in 19), and chromatin accessibility at long SATB1 isoform binding sites is reduced in Satb1 cKO (Fig. 
3c in 19), collectively indicating that long SATB1 isoform binding promotes increased chromatin accessibility. We identified a binding site specific to the long SATB1 isoform19 right at the extra exon of the long isoform (Fig. 6e). Moreover, the study of alternative splicing based on our RNA-seq analysis revealed a deregulation in the usage of the extra exon of the long Satb1 isoform (the only Satb1 exon affected) in Satb1 cKO cells (deltaPsi = 0.12, probability = 0.974; Supplementary File 4). These data suggest that SATB1 itself is able to control the levels of the short and long Satb1 isoforms. A possible mechanism controlling the alternative splicing of Satb1 gene is based on its kinetic coupling with transcription. Several studies indicated how histone acetylation and generally increased chromatin accessibility may lead to exon skipping, due to enhanced RNA polymerase II elongation48,49. Thus the increased chromatin accessibility promoted by long SATB1 isoform binding at the extra exon of the long isoform, would increase RNA polymerase II read-through leading to decreased time available to splice-in the extra exon and thus favoring the production of the short SATB1 isoform in a negative feedback loop manner. This potential regulatory mechanism of SATB1 isoform production is supported by the increased usage of the extra exon in the absence of SATB1 in Satb1 cKO (Supplementary File 4). To further address this, we utilized the TCGA breast cancer dataset (BRCA) as a cell type expressing SATB150. ATAC-seq experiments for a series of human patients with aggressive breast cancer51 revealed differences in chromatin accessibility at the extra exon of the SATB1 gene (Fig. 6f). 
In line with the “kinetic coupling” model of alternative splicing, the increased chromatin accessibility at the extra exon (allowing faster read-through by RNA polymerase) was positively correlated with the expression of the short SATB1 isoform and slightly negatively correlated with the expression of the long SATB1 isoform (Fig. 6f). Moreover, we investigated whether the differential expression of SATB1 isoforms was associated with poor disease prognosis. Worse pathological stages of breast cancer and expression of SATB1 isoforms displayed a positive correlation for the long isoform but not for the short isoform (Fig. 6g and Supplementary Fig. 6c). This was further supported by worse survival of patients with increased levels of long SATB1 isoform and low levels of estrogen receptor (Supplementary Fig. 6d). Overall, these observations not only supported the existence of the long SATB1 isoform in humans, but they also shed light at the potential link between the regulation of SATB1 isoforms production and their involvement in pathological conditions.”

      1. The authors should not draw conclusions based on any data which is not shown '....ed differences in chromatin accessibility at the extra exon of the SATB1 gene (data not shown), suggesting its potential involvement in alternative splicing regulation according to the "kinetic coupling" model...'. This has led to overspeculation and needs correction.

      In the revised version of the manuscript, we included the ATAC-seq data from human breast cancer patients in the revised Fig. 6f. The legend of this figure now reads: “Human TCGA breast cancer (BRCA) patient-specific ATAC-seq peaks51 span the extra exon (EE: extra exon; labeled in green) of the long SATB1 isoform. Note the differential chromatin accessibility in seven selected patients, emphasizing the heterogeneity of SATB1 chromatin accessibility in cancer. Chromatin accessibility at the promoter of the housekeeping gene DNMT1 is shown as a control.” (Lines 1281-1285) Accordingly, we have also modified the main text: “ATAC-seq experiments for a series of human patients with aggressive breast cancer68 revealed differences in chromatin accessibility at the extra exon of the SATB1 gene (Fig. 6f). In line with the “kinetic coupling” model of alternative splicing, the increased chromatin accessibility at the extra exon (allowing faster read-through by RNA polymerase) was positively correlated with the expression of the short SATB1 isoform and slightly negatively correlated with expression of the long SATB1 isoform (Fig. 6g).” (Lines 395-339)

      Minor comments: 1. On pg 4, the authors state 'Here, we utilized primary murine T cells, in which we have identified two full-length SATB1 protein isoforms.' Whereas only one 'long' isoform is identified and the other is the canonical version. The authors should correct the statement.

      In the revised version of the manuscript, we modified this statement as follows: “In this work, we utilized primary developing murine T cells, in which we have identified a novel full-length long SATB1 isoform and compared it to the canonical “short” SATB1 isoform.” (Lines 64-66)

      1. Fig. 1 a , Is there a specific reason to generate a custom-made antibody for 'all' SATB1, using similar regions that are already commercially available. This becomes redundant otherwise, because there is no apparent difference in detection compared to the commercial one (Suppl. Fig 1a). Antibody generation strategy (1a) should be moved to supplementary. Additionally, authors have obtained the custom antibodies from a commercial source, therefore, the text should reflect the same alongside relevant details.

      The custom-made SATB1 antibody targeting the amino-terminal region of the protein was developed to detect the native form of the protein. Unlike commercially available antibodies, which were raised against either short peptides or denatured forms of the protein, this antibody was raised against the native form of the amino-terminal part of the protein. It was raised primarily for use in ChIP-seq experiments, since no commercially available antibody is of high quality for this approach. Moreover, the original Figure 1a was intended to provide an overview of the SATB1 protein structure, which is highly relevant for understanding its biophysical properties, and not to present the strategy for raising a custom-made antibody for SATB1.

      1. Fig 3e: what is the control used here? In their Pearson correlation analysis, there seem to be significant reduction in control sets as well upon treatment. This needs to be clarified.

      We used scans rotated by 90° which served as a negative control, as stated in Line 769: “SATB1 scans rotated by 90° served as a negative control for the colocalization with FU.” Note that this is a commonly used control in colocalization experiments as described for example in Dunn et al., 2011 (https://doi.org/10.1152/ajpcell.00462.2010).

      Additionally, we provide Costes’ P values which are based on randomly scrambling the blocks of pixels (instead of individual pixels, because each pixel’s intensity is correlated with its neighboring pixels) in one image, and then measuring the correlation of this image with the other (unscrambled) image. Please see Costes et al., 2004 (https://doi.org/10.1529%2Fbiophysj.103.038422) for more details. Moreover, it was actually anticipated to see a decrease in colocalization upon hexanediol treatment even in the negative control, as hexanediol significantly reduces both SATB1 and FU speckles as established in Fig. 3a-d.
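      The two controls described above (a 90° rotated scan and Costes' block scrambling) can be sketched in a few lines. This is only an illustrative sketch with NumPy, not the authors' analysis pipeline: the function names, the block size of 4 pixels, and the number of randomizations are assumptions made for the example, and the images are assumed to be square 2D arrays.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two images of equal shape."""
    a = a.ravel().astype(float)
    b = b.ravel().astype(float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def scramble_blocks(img, block, rng):
    """Shuffle block x block tiles of an image (dimensions must be multiples of block)."""
    h, w = img.shape
    tiles = img.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    tiles = tiles.reshape(-1, block, block)
    tiles = tiles[rng.permutation(len(tiles))]
    tiles = tiles.reshape(h // block, w // block, block, block).swapaxes(1, 2)
    return tiles.reshape(h, w)

def costes_p(ch1, ch2, block=4, n_iter=200, seed=0):
    """Observed correlation and fraction of scrambled correlations below it."""
    rng = np.random.default_rng(seed)
    h, w = ch1.shape
    h -= h % block  # crop so that the image tiles exactly
    w -= w % block
    ch1, ch2 = ch1[:h, :w], ch2[:h, :w]
    observed = pearson(ch1, ch2)
    null = np.array([pearson(scramble_blocks(ch1, block, rng), ch2)
                     for _ in range(n_iter)])
    return observed, float(np.mean(null < observed))

def rotated_control(ch1, ch2):
    """Negative control: correlation after rotating one (square) channel by 90 degrees."""
    return pearson(np.rot90(ch1), ch2)
```

      A P value close to 1 means that almost no scrambled image correlates with the second channel as well as the original image does, so the observed colocalization is unlikely to arise by chance.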

      1. Pg 10, the authors claim that '..., thus we reasoned that it may also be used to study phase separation...' But there have been numerous reports starting from 2018, which have utilized this technique in correlation to phase behaviour (albeit individual proteins). The authors should include proper citations as they are extending an idea from the same field to their specific need.

      In the revised version of the manuscript, we included relevant citations to support the use of Raman spectroscopy in LLPS research: “Raman spectroscopy was already used in many biological studies, such as to predict global transcriptomic profiles from living cells42, and also in research of protein LLPS and aggregation43–47. Thus we reasoned that it may also be used to study phase separation in primary T cells.” (Lines 225-228)

      1. For Fig 5b, there should be a comparative image for 'short' isoform.

      In the revised Figure 5c we have included a comparative image for the short SATB1 isoform.

      1. In the context of Figure 5c, the authors claim ...' Note also the higher LLPS propensity of the human long SATB1 isoform compared to the murine SATB1...' Why suddenly human and mouse comparisons are drawn? This figure should be moved to supplementary.

      The comparison between the human and mouse SATB1 isoforms has been implemented because it is relevant for our claims regarding the increased SATB1 aggregation in human cells in relation to the revised Fig. 6f,g,h and Supplementary Fig. 6c,d. This is also discussed in Lines 479-482, which read: “This is particularly important given the higher LLPS propensity of the human long SATB1 isoform compared to the murine SATB1 (Fig. 5d). Therefore, human cells could be more susceptible to the formation of aggregated SATB1 structures which could be associated with physiological defects.”

      **Reviewer #3 (Evidence, reproducibility and clarity):**

      Zelenka et al., focus on a T cell genome-organizing protein, SATB1, to show that SATB1 undergoes liquid liquid phase-separation (LLPS), and distinct isoforms confer different LLPS-related biophysical properties. They generate a long-isoform specific antibody and conduct several experiments to test for LLPS and compare LLPS properties between the long-isoform relative to the whole SATB1 protein population. Given that SATB1 plays important roles in T cell development and in cancer, interrogating SATB1 biophysical properties is an important question. However, there are multiple problems with the experimental setup and data that weaken their support of the conclusions. I will detail some of the major issues below:

      Regarding phase-separation: There are several assays to determine whether a protein undergoes LLPS. 1. One of the first the authors address is the sphericity or roundness. Indeed, formation of spherical droplets is one evidence of the liquid nature of a protein. However, the authors use fixed preparations (which can introduce artifacts), not free-floating protein, and determine roundness by showing a 2D image. Roundness should take into account the diffraction-limits of fluorescent imaging, as many structures can be imaged to appear round by the detector. There are quantifiable measurements that can be taken on 3D images to show roundness. This would best be shown using non-fixed protein.

      • We thank this Reviewer for several insightful comments. Although we agree with most of them, we should highlight the main goal of our manuscript, i.e. to investigate the SATB1 protein with an emphasis on its physiological roles in primary developing murine T cells. We highlight this already in the introduction in Line 64, “In this work, we utilized primary developing murine T cells,...”, and also in the respective part of the results section: “To probe differences in phase separation in mouse primary cells, without any intervention to SATB1 structure and expression, we first utilized 1,6-hexanediol treatment, which was previously shown to dissolve the liquid-like droplets34.” (Lines 203-205)

      • We believe that this is a very important aspect of our study that should not be overlooked. The majority of proteins perhaps behave differently under physiological and in vitro conditions. However, due to the extensive post-translational modifications affecting the properties of SATB1, its completely different localization patterns between primary developing T cells and other cell types but especially cell lines and many other aspects, it was of utmost importance to focus our research on primary T cells. Unfortunately, this was accompanied with multiple difficulties, such as that we have to use fixed cells as this is the only way to visualize SATB1 in these cells. Alternatively, one could create a new mouse line expressing a fluorescently tagged SATB1 protein, but this is beyond the scope of our work.

      • However, we should also note that many LLPS-related studies pay no attention to the primary physiological functions of proteins and simply investigate a protein’s artificial behavior under in vitro conditions. Having said that, we too extended our experiments in primary cells to ex vivo studies in cell lines to further support our claims. In these experiments, we utilized live cell imaging in Fig. 4-6, quantified the sphericity in Supplementary Fig. 6, showed the ability of speckles to coalesce in Fig. 4c, and also used FRAP in Fig. 4f and, in the revised version of the manuscript, in Supplementary Figure 6f. Moreover, we should note that most of these experiments were designed and performed during 2017 and 2018, conforming with the standards of that time. We are well aware of the progress in the field and of the impact of fixation on LLPS, as described in Irgen-Gioro et al., 2022 (https://doi.org/10.1101/2022.05.06.490956), but after over seven months of review process in another journal we also believe that these aspects should be weighed against the risk of further delaying progress in the SATB1 field.

      Regarding the isoform specificity of SATB1 biophysical properties 1. The authors generate a long isoform-specific antibody. However, the western blot is not convincing that this is indeed specific to the long isoform as there is a rather large smear. Can this be improved with antibody preabsorption? Since this is a key reagent for the manuscript, improvement in antibody quality is essential.

      The custom-made antibody for the long isoform has been raised against the unique 31 amino acids long peptide present in the long SATB1 isoform. The polyclonal serum has undergone affinity chromatography utilizing the immobilized peptide (antigen) to purify the antibody. In the revised version of the manuscript we have included another immunodepletion experiment with cleaner bands (Fig. 1f). Moreover, please read our answer to Major comment #2 of Reviewer 1 that follows: • The long antibody was raised in mice inoculated with the extra peptide present in the long isoform only. Therefore, it is not possible for this antibody to precipitate the shorter isoforms, which do not contain the sequence of the extra peptide (EP, Figure 1a).

      • We have repeated the immunodepletion experiment and we now provide the results in Fig. 1f and Supplementary Fig. 1b. The western blot in Fig. 1f is now cleaner and supports quite convincingly the presence of a long SATB1 isoform. Given the lack of isoform-specific knockouts which we could utilize to immunoprecipitate or detect the different isoforms in a single cell (or cell population), the utilized approach of immunodepletion and subsequent western blotting is the approach we thought of implementing.

      • As shown in Fig. 1f and Supplementary Figure 1b, the long isoform SATB1 antibody has the capacity to recognize the long isoform in murine thymocyte protein extracts but not the short SATB1 isoform (please compare lane 3 in the two western blots utilizing either the antibody for the long isoform (top panel) or the antibody that detects both isoforms (lower panel)).

      • We have performed Immunofluorescence experiments utilizing the antibody detecting the long SATB1 isoform in thymocytes isolated from either C57BL/6 or Satb1 cKO mice. The antibody is specific to the SATB1 protein since there is no signal in immunofluorescence experiments utilizing the knockout cells (Supplementary Figure 1c).

      • We have performed Immunofluorescence experiments utilizing thymocytes and the antibody detecting the long SATB1 or a commercially available antibody detecting all SATB1 isoforms. The pattern of SATB1 subnuclear localization is similar for both antibodies (Supplementary Figure 1e).

      • In our accompanying revised manuscript Zelenka et al., 2022 (https://doi.org/10.1101/2021.07.09.451769), we provide yet another piece of evidence, consisting of bacterially expressed short and long SATB1 protein isoforms detected by western blot using either the long isoform-specific or the non-selective all SATB1 isoforms antibodies.

      • Regarding the additional bands detected in the immunoprecipitation experiment presented in the original Supplementary Figure 1b (lane 2), it is not surprising that additional bands appear in a sample of protein extracts that is used for several hours for the immunoprecipitation experiments, while the “input” sample simply denotes protein extract that is frozen at -80 °C right after the preparation of protein extracts until use. It is well-established that SATB1 is the target of proteases which might as well be active during the immunoprecipitation steps (2 consecutive immunoprecipitation steps take place). Therefore, the immunoprecipitated material cannot necessarily be a copy of the input material displaying a single protein band even if protease inhibitors are included in the buffers.

      Taken together, the experiments described here show that the antibody raised against the extra 31 aa long peptide, present only in the long SATB1 isoform, is specific for this isoform.

      1. Fig 4 Optodroplet experiment appears to show that the N-terminus of SATB1 can undergo LLPS. The results of this assay show that SATB1 has a domain that can undergo phase-separation in isolation, but it does not show that the protein itself is a phase-separating protein. The FRAP assay methods are not provided by the authors, but this is important, as continued light activation means proteins are continuously forming aggregates, and the bleaching for FRAP should be balanced with the levels of Cry2 activation. A very good description of the methods is described in the original Optodroplet paper: https://www.sciencedirect.com/science/article/pii/S009286741631666X?via%3Dihub#sec4

      We should note that we did follow the FRAP protocol provided by the recommended study Shin et al., 2017 (https://doi.org/10.1016/j.cell.2016.11.054). Indeed, these experiments are very tricky to perform and interpret, as every cell expresses slightly different amounts of protein, which is directly associated with the different speed of optoDroplet formation, and thus its propensity to aggregate upon overactivation. On the other hand, there needs to be continuous activation during the FRAP experiment, as the lack of the activation laser would result in fast disassembly of the optoDroplets, counteracting the FRAP results. Moreover, the optoDroplets actively move around the cell in all dimensions, which makes the accurate measurement of signal intensity really challenging, even with an adjusted pinhole. Therefore, we do not think that FRAP is the best approach to examine the behavior of optoDroplets.

      Either way, we have now described the detailed FRAP protocol in Lines 889-898, which read: “For the FRAP experiments, cells were first globally activated by 488 nm Argon laser illumination (alongside with DPSS 561 nm laser illumination for mCherry detection) every 2 s for 180 s to reach a desirable supersaturation depth. Immediately after termination of the activation phase, light-induced clusters were bleached with a spot of ∼1.5 μm in diameter. The scanning speed was set to 1,000 Hz, bidirectionally (0.54 s / scan) and every time a selected point was photobleached for 300 ms. Fluorescence recovery was monitored in a series of 180 images while maintaining identical activation conditions used to induce clustering. Bleach point mean values were background subtracted and corrected for fluorescence loss using the intensity values from the entire cell. The data were then normalized to mean pre-bleach intensity and fitted with exponential recovery curve in Fiji or in frapplot package in R.”
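      The quantification steps named in the protocol above (background subtraction, correction for acquisition bleaching using the whole-cell intensity, normalization to the pre-bleach mean, and an exponential recovery fit) can be sketched as follows. This is an illustrative sketch only: it does not reproduce the Fiji or frapplot fitting routines, the single-exponential model y(t) = y0 + A(1 - exp(-t/tau)) and the grid search over tau are assumptions chosen to keep the example dependency-free, and all function names are hypothetical.

```python
import numpy as np

def normalize_frap(spot, cell, background, n_prebleach):
    """Background-subtract, correct for fluorescence loss, normalize to pre-bleach mean."""
    spot = np.asarray(spot, dtype=float) - background
    cell = np.asarray(cell, dtype=float) - background
    # Whole-cell intensity relative to its pre-bleach mean tracks acquisition bleaching.
    loss = cell / cell[:n_prebleach].mean()
    corrected = spot / loss
    return corrected / corrected[:n_prebleach].mean()

def fit_recovery(t, y, taus=None):
    """Fit y(t) = y0 + A * (1 - exp(-t / tau)) by scanning tau and solving y0, A linearly."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    if taus is None:
        taus = np.geomspace(0.1, 1000.0, 400)  # candidate time constants in seconds
    best = None
    for tau in taus:
        design = np.column_stack([np.ones_like(t), 1.0 - np.exp(-t / tau)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(((design @ coef - y) ** 2).sum())
        if best is None or sse < best[0]:
            best = (sse, float(tau), float(coef[0]), float(coef[1]))
    _, tau, y0, amplitude = best
    return {"y0": y0, "amplitude": amplitude, "tau": tau,
            "half_time": tau * np.log(2.0)}
```

      Here `t` would be the post-bleach time axis (for example, frame index times the 0.54 s scan interval quoted above) and `y` the normalized post-bleach intensities; the recovery half-time follows from the fitted tau as tau·ln 2.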

      1. Description of analyses that authors prefer not to carry out

      **Reviewer #1**:

      Can they use the all and long isoform antibodies together, then subtract the signal from long isoform to conclude about the localization of the short isoform?

      We thank the Reviewer for the suggestion, though given the differential efficiency of antibodies and other limitations of imaging experiments, we do not expect the suggested experiment to improve the quality of our manuscript. However, we should note that we have performed a pixel-based colocalization experiment between the signals detected by the all isoform and long isoform SATB1 antibodies. In the revised Supplementary Fig. 5a, a fluorocytogram of the pixel-based colocalization, based on 3D-SIM data, is provided on the left, with the quantified colocalization on the right.

      3) Lack of better staining with antibody against the long and short SATB1 isoforms after treatment with 1,6 Hexanediol. 1,6 Hexanediol treatment can change many other chromatin associated proteins to which SATB1 can be bound to indirectly. This experiment can

      We do understand the controversy and difficulties of experiments using 1,6-hexanediol treatment. However, we have to note that there is no better approach available for the investigation of LLPS in our primary murine T cells. We did use alternative approaches in ex vivo experiments, utilizing cell lines to validate our hypothesis without the involvement of 1,6-hexanediol.

      **Reviewer #2**:

      1. The authors mention, '...of the different SATB1 isoforms, uncovered by the use of the two different antibodies, relied in the heterochromatin areas (zone 1), where the long isoform was less frequently...' There is no supporting figure number mentioned. The authors need to show a zone-by-zone comparison images for 'all iso' vs 'long' iso of SATB1. Just to reiterate, there is a need for a heterochromatin mark to unambiguously call out the distinction.

      We should note that there is an inherent difficulty in accurately comparing the localization of short and long SATB1 isoforms in primary cells, especially due to the lack of Satb1 isoform-specific knockout mice. There is no way to detect only the short isoform in these primary cells as there are only antibodies targeting the long or all SATB1 isoforms. Therefore, we cannot set up additional experiments probing these questions.

      In line with this, in the revised version of the manuscript, we toned down our statements regarding the differential localization of the two isoforms in primary cells. We only refer to it as an indication and we support it by adding references to the relevant figures. This part now reads: “Localization of SATB1 speckles detected by antibodies targeting all SATB1 isoforms and/or only the long SATB1 isoform, revealed a significant difference in the heterochromatin areas (zone 1, Fig. 2b), where the long isoform was less frequently present (see also Fig. 2a and Fig. 3c). Although, this could indicate a potential difference in localization between the two isoforms, due to the inherent difficulty to distinguish the two based on antibody staining, we refrain to draw any conclusions. (Lines 145-150)”

      1. Fig. 6a, The authors wished to see the effect of RNA on Satb1 nuclear localization. This is not related to the main theme of the paper, thus should be moved to supplementary (true for b as well). Importantly, the experiments should be performed with total cells to show the divergence of localization (like the paper the authors referred to) instead of matrix for clarity.

      • We did not wish to see the effect of RNA on SATB1 localization. In fact, there is a long history of SATB1 research that is inherently linked with the concept of nuclear matrix, a putative nuclear structure which is highly associated with nuclear RNAs. SATB1 was described many times as a nuclear matrix protein (https://doi.org/10.1016/0092-8674(92)90432-c; https://doi.org/10.1128/mcb.14.3.1852-1860.1994; https://doi.org/10.1074/jbc.272.17.11463; https://doi.org/10.1128/mcb.17.9.5275; https://doi.org/10.1021/bi971444j; https://doi.org/10.1083/jcb.141.2.335; https://doi.org/10.1101/gad.14.5.521; https://doi.org/10.1038/ng1146).

      • Moreover, our data discussed in comments 4-7 of this Reviewer, such as (i) the localization of SATB1 to the nuclear zones associated with RNA and nuclear scaffold factors (Fig. 2b, Supplementary Fig. 1c), (ii) colocalization of SATB1 with actively transcribed RNAs (Fig. 2c, Fig. 3g, Supplementary Fig. 2a, Supplementary Fig. 2c), (iii) its association with nucleoli (Supplementary Fig. 3b), and (iv) its computationally predicted interaction with Xist lncRNA (Agostini et al., 2013; https://doi.org/10.1093/nar/gks968) as a notable factor of nuclear matrix, all suggest that the interaction between RNA and SATB1 is plausible and potentially relevant for its function and/or at least its subnuclear localization. It is even more relevant when considering numerous reports on the ability of RNA-binding, poly-Q and PrLD-containing proteins to undergo LLPS (https://doi.org/10.1016/j.molcel.2015.08.018; https://doi.org/10.1042/bcj20160499; https://doi.org/10.1016/j.cell.2018.03.002; https://doi.org/10.1016/j.cell.2018.06.006; https://doi.org/10.1093/nar/gkaa681), including RNAs specifically regulating LLPS behavior, especially for poly-Q and PrLD-containing proteins, such as SATB1 (https://doi.org/10.1126/science.aar7366; https://doi.org/10.1126/science.aar7432; https://doi.org/10.1016/j.ceb.2019.03.007; https://doi.org/10.1038/s41598-020-57994-9; https://doi.org/10.1016/j.molcel.2015.09.017; https://doi.org/10.1038/s41598-019-48883-x; https://doi.org/10.1038/s41467-019-11241-6).

      • It should also be noted that SAF and various hnRNPs, as the most prominent proteins of the nuclear matrix, have been reported many times to phase separate (https://doi.org/10.1016/j.molcel.2019.10.001; https://doi.org/10.1074/jbc.ra118.005120; https://doi.org/10.1016/j.celrep.2019.12.080; https://doi.org/10.1038/s41467-019-09902-7; https://doi.org/10.1016/j.molcel.2017.12.022; https://doi.org/10.1074/jbc.tm118.001189). All these aspects show that the relation between nuclear matrix, SATB1 and RNA is quite relevant to our manuscript.

      • Moreover, in light of the aforementioned information, we believe that it is much clearer to follow the protocol we did – i.e. to remove soluble proteins by CSK treatment and then, upon RNase treatment, extract the released proteins using ammonium sulfate. In an experiment utilizing whole cells, one would need to microinject RNase A into the nucleus, which 1. is very challenging for primary T cells having a radius of 3-5 micrometers, 2. is of low throughput, 3. would not allow for released protein removal which would thus make the results hard to interpret. Please note that in the reference paper, the authors used cell lines overexpressing heterologous GFP-tagged proteins, which is not related to our setup.

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Reply to the Reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses + The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A). + Though this is not stated in the MS 2. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      1. The text contains several...

      Response: We added...

      Referee #2

    1. If he had taken care to specify that when he said “we”, “us”, and “our” he meant each one of us, acting and responding as a user of quantum mechanics, he would have had it exactly right. But it seems to me likely that he was using the first person plural collectively, to mean all of us together, thereby promulgating the Copenhagen confusion
      • PEIERLS
      • (I THINK) "OUR"==="OBJECTIVE", "COLLECTIVE"

      • (I DON'T UNDERSTAND) "promulgating the Copenhagen CONFUSION"

      • (google) I found it in ANOTHER book by Mermin, in reference to a "CRITIQUE" by BELL of the use of the word "KNOWLEDGE" in the "explanations" of Heisenberg and PEIERLS (!!!)

      • BELL: Whose knowledge? Knowledge about WHAT?

      • (google) also here [MASTERPIECE: SEE LAST 3 PARAGRAPHS]

      • [Peierls replies to Bell]: "That leaves the question: whose knowledge should be represented in the density matrix? In general there will be many who may have some information about the state of a physical system. Each of them has to use his or her density matrix. These may differ, as the nature and amount of knowledge may differ. People may have observed the system by different methods, with more or less accuracy; they may have seen part of the results of another physicist. However, there are limitations to the extent to which their knowledge may differ. This is imposed by the uncertainty principle. For example if one observer has knowledge of s_z of our Stern-Gerlach atom, another may not know s_x, since the measurement of s_x would have destroyed the other person's knowledge of s_z, and vice versa. This limitation can be compactly and conveniently expressed by the condition that the density matrices used by the two observers must commute with each other."

      • (I THINK) Has he answered the question??? I DON'T THINK SO!

      • It reminds me of Bohr's "reply" to EPR
      • "BOTH" once again "repeat" how they "apply" QM, but without "answering" the question!
    1. As We May Think

      It suggests that new inventions have extended man's physical powers, but not the powers of his mind. Bush therefore says that instruments are at hand which, if properly developed, will give man access to knowledge.

    1. Author Response:

      Reviewer #1:

      Charpentier et al. use facial recognition technology to show that mothers in a group of mandrills lead their offspring to associate with phenotypically similar offspring. Mandrills are a species of primate that live in large, matrilineal troops, with a single, dominant male that fathers the majority of the offspring. Male breeder turnover and extra-pair mating by females can lead to variation in relatedness between group members and the potential for kin-selected benefits from preferentially cooperating with closer relatives within the group. The authors argue that the strategy of influencing the social network of their offspring could be favoured by "second-order kin selection", a mechanism by which inclusive fitness benefits are accrued to female actors through kin-selected benefits to their offspring. This interpretation is supported by a theoretical model.

      The paper highlights a previously unappreciated mechanism for favouring association between non-kin in social groups and also contributes a nice insight into the complexity of social interactions in a relatively understudied wild primate species. The conclusions are strengthened by data showing associations between mothers were not influenced by the facial similarity of their offspring -- this suggests that mothers are making decisions based on the appearance of offspring and not their mothers.

      Some remaining questions regarding the strength of the authors' interpretation exist: Given the challenges of studying mandrills in the field, the fact that the study reports data from a single group is understandable but potential issues remain with the independence of data points. There may be an additional issue arising from the fact that this troop is semi-captive.

      The study group is not semi-captive. Instead, it originated from two release events of a few captive individuals into the wild (in 2002 and 2006). The population is now composed of more than 250 individuals and all of them, except for 7 founder females (<3%), were born in the wild. In addition, the study group is not fed and only occasionally wanders into a fenced protected area. The fences of the park do not represent a boundary for mandrills and most of the time (ca. 80% of days) the study group ranges outside the park. We have clarified this misunderstanding.

      Regarding the independence of data points, we would be grateful if this reviewer could clarify her/his thoughts. As a tentative response, we indeed have access to a single (although large) study group, but that is unfortunately often the case when studying primates or other large mammals. Regarding our study questions, we have clearly demonstrated increased nepotism among paternally related mandrills in two different social groups (Charpentier et al. 2007: semi-captive mandrills; Charpentier et al. 2020: wild mandrills). More generally, we do not see any parsimonious explanation for why the studied mandrills would behave differently from other wild mandrill groups, or would have experienced selective pressures that shaped their genetic structure and social organization differently.

      The number of genotyped offspring is relatively small (n = 15) and paternity is inferred from the identity of the dominant male. However, the authors also refer to the fact that it's normal for female mandrills to mate with several males during ovulation.

      Indeed, both sexes mate promiscuously during the mating season. We have very recently (June 2022) obtained new genetic profiles for a subset of the study infants (it took two years to obtain these data). We have now increased our sample size of infants with a known father, from 15 to 32. With these new data, we were able to distinguish between four categories of infant-infant dyads: those sharing the same father (PHS), those not sharing the same father (not PHS), those conceived during the same alpha male tenure, and those that were not (both infants with unknown dads). The graph below shows the average facial distance among individuals for each of these four categories. It shows that infants conceived during the same alpha male tenure are significantly more similar to each other than infants sired by different fathers or during the tenure of different alpha males, but they are also significantly less similar to each other than infants born to the same father (the four categories are all significantly different from each other, except when comparing infants born to different fathers with those conceived during different alpha male tenures). As suggested by this reviewer, the fact that females mate predominantly with the alpha male, but to some extent also with other males, likely explains the difference between “same father” and “same alpha male tenure”. Importantly, however, considering all infants conceived during the same alpha male tenure as “PHS” is highly conservative. It is thus likely that knowing the paternity of every infant would produce even clearer effects (and indeed, increasing the data set from 15 to 32 strengthened this result). We have now updated this result (first model) based on this new sample.
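      The four dyad categories described above can be made explicit with a short sketch. The data structure, field names, and identifiers below are hypothetical and are not the authors' data or code; in particular, the sketch assumes that dyads with exactly one genotyped father fall into none of the four categories, which the response does not state.

```python
from itertools import combinations
from statistics import mean

def dyad_category(a, b):
    """Assign an infant-infant dyad to one of the four categories (or None)."""
    father_a, father_b = a.get("father"), b.get("father")
    if father_a is not None and father_b is not None:
        # Both fathers are known from genotyping.
        return "PHS" if father_a == father_b else "not PHS"
    if father_a is None and father_b is None:
        # Both fathers unknown: fall back on the alpha male tenure at conception.
        return "same tenure" if a["tenure"] == b["tenure"] else "different tenure"
    return None  # assumption: mixed dyads are left unclassified

def mean_distance_by_category(infants, distance):
    """Average facial distance per category; distance is keyed by sorted ID pairs."""
    groups = {}
    for (id_a, a), (id_b, b) in combinations(sorted(infants.items()), 2):
        category = dyad_category(a, b)
        if category is not None:
            groups.setdefault(category, []).append(distance[(id_a, id_b)])
    return {category: mean(values) for category, values in groups.items()}
```

      Treating every "same tenure" dyad as paternal half-sibs is the conservative choice mentioned in the response, because some of those infants were in fact sired by other males.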

      What evidence is there to support a beneficial effect of nepotism in this species?

      In mandrills, females who affiliate more (groom more/associate more) with their groupmates (kin or non-kin) during juvenility also reproduce 1 year earlier than those females that are poorly socially integrated (Charpentier et al. 2012). These results are similar to what is known in many mammalian species (reviewed in Snyder-Mackler et al. 2020). However, the positive effects of a rich social life are generally triggered by all group members, not only close kin. Nevertheless, if beneficial social relationships impact the direct fitness of individuals, as reported in mandrills and other species, then kin selection theory predicts that these effects should further translate into indirect fitness benefits.

      We have now added this relevant reference (Charpentier et al. 2012) in the revised version of our manuscript and present the results of this early study on mandrills.

      What form could nepotism take and does it necessarily have to involve full sibs?

      We are unsure why this reviewer is mentioning full-sibs here. For this reviewer's information, of the 2556 study dyads (model 1 on the impact of maternal and paternal origins on facial distance), only one dyad was a full-sib pair. Full-sibs are therefore very rare in the study population due to male migration patterns and generally short alpha male tenures.

      If a female did not associate with offspring as shown here, would nepotistic interactions simply arise between her offspring and offspring that were less facially similar?

      We guess that facial similarity would not be a predictor of spatial association anymore. Indeed, we think that young mandrills do not use self-referent phenotype matching, precluding the self-evaluation of those infants that look like them. However, as stated below, we cannot fully exclude the possibility that other social partners, such as fathers, may also influence infant-infant relationships, although we think that this alternative mechanism is less parsimonious than the one we propose and test.

      Reviewer #2:

      This paper uses data on patterns of spatial association and facial similarity in mandrills to develop a new hypothesis for the evolution of kin recognition based on facial cues. Previous work on this system has shown that, among females, paternal half-sibs resemble each other visually more than maternal half-sisters do. The authors hypothesise that this paternally inherited facial similarity provides opportunities for kin selection, but it is unclear how offspring themselves could recognise kin using phenotype matching since they are unable to see their own face. One answer to this puzzle is that third parties -- mothers -- may promote social interactions between their own offspring and other offspring that resemble them since these other offspring are likely to share the same father. In support of this hypothesis, the authors find that mothers and offspring show spatial proximity to infants that are facially more similar than average. They also use an analytical evolutionary model to confirm the logic of this hypothesis. The model shows that mothers can gain inclusive fitness benefits by encouraging reciprocal social interaction among their offspring and other paternally-related offspring. They term this idea 'second-order' kin selection and identify a range of other circumstances in which it might play an important role in shaping the evolution of social behaviour.

      The main strengths of the paper are the interesting mandrill data and the cutting-edge methods used to analyse facial similarity, which have stimulated the development of a theoretically interesting hypothesis about the evolution of facially based kin recognition. The theoretical model enhances the generality and rigour of the work. The paper will be of wide interest and the concept of second-order kin selection may be applicable to other social circumstances, such as interactions among in-laws in close-knit family groups. Thus, I can see that this paper will be a stimulus for future work.

      We are grateful for these positive comments.

      The data are, I think, rather overinterpreted in terms of the degree to which they support the hypothesis. The spatial proximity data are interesting, but on their own, they are not definitive support for the hypothesis or model. A more critical approach to the hypothesis, clearly setting out the limitations of the data, and what tests in future could be used to falsify the hypothesis or model, would make for a stronger paper.

      We agree with this general comment and have addressed it by 1. adding a model on grooming relationships between females and infants, 2. toning down our interpretation throughout the manuscript and 3. proposing future directions of research.

      Overall the authors have presented data that support a fascinating new mechanism by which natural selection can influence social interactions among the members of family groups, in potentially surprising ways. I also find it remarkable that 60 years after the development of the kin selection theory new implications of this theory are still being uncovered. The concept of second-order kin selection may prove important in understanding the evolution of social organisation and behaviour in species that live in groups containing a mixture of kin and non-kin, such as many primates and of course humans.

      We are grateful to this reviewer for this very positive comment. We fully agree that, 60 years after kin selection theory emerged, we are still discovering further implications!

      Reviewer #3:

      This is a very interesting and impressive manuscript. It is complex in its multiple components, and in some ways that makes it a difficult manuscript to evaluate. There is a lot in it, including empirical analyses of a face dataset and of behavioral association data, combined with a theoretical model.

      We are very grateful for this positive comment and are glad that you liked our manuscript.

      The three main findings are: 1) Paternal siblings look alike (similar to, and building on, a recent manuscript the authors published elsewhere); 2) Infants that are more facially similar tend to associate; and 3) mothers tend to be found in association with other unrelated infants that look more like their own infants. Such results are interesting, and indeed one potential interpretation, perhaps even the most likely, is that mothers are behaving in such a way that promotes association between their own infants and the paternal kin of their infants.

      Nonetheless, the evidence provided is logically only consistent with the authors' hypothesis, rather than being strong direct evidence for it. As such, the current framing and indeed the title, "Primate mothers promote proximity between their offspring and infants who look like them", are both problematic. (In addition, the title should be about mandrills, not "primates", since this manuscript does not provide evidence from any other species.) The evidence provided is consistent with the hypothesis, but also consistent with other potential hypotheses. The evidence given to dismiss other potential hypotheses is not strong, and rests on the fact that many males are not around all year to influence things, and that "males that were present during a given reproductive cycle are not responsible for maintaining proximity with either infants or their mothers (MJEC and BRT, pers. obs.)".

      We agree with this comment. After examining several alternative mechanisms in the light of the natural history of mandrills, we are confident that the proposed mechanism is at work in that species, although we cannot firmly exclude some of these alternative mechanisms. To address this comment, we have changed the title of our manuscript, which now reads “Mandrill mothers associate with infants who look like their own offspring using phenotype matching”. We have also included an additional model on grooming relationships (see response to R1) and have toned down the interpretation of our results throughout our revised manuscript. Finally, we have further discussed alternative scenarios, in particular the one involving fathers (see details above).

      My opinion is that these are really interesting analyses and data, which are being somewhat undermined by the insistence that only one hypothesis can explain the observed association patterns. It could easily be presented differently, as a demonstration that paternal siblings look alike and that they associate. The authors could then go on to explore different possible explanations for this using their association data, make the case that maternal behavior is the most plausible (but not the only) explanation, and present their model of how such behavior could bring fitness benefits.

      In my view, such a presentation would be both more cautious and more appropriate, without in any way reducing the impact or importance of the data. In the current iteration, I think there are issues because the data do not provide sufficient support for the surety of the title and conclusion, as presented.

      We think that the current organization of our manuscript was not that different from the one proposed here and follows a reasoning already proposed in a former manuscript (Charpentier et al. 2020). Indeed, we first start by reminding the reader what we already know from that previous study: paternal siblings look alike and they associate. We then go on to explore different mechanisms. That being said, and as suggested, we have been more cautious in interpreting our results, which are indeed only correlative.

    1. Author Response

      Reviewer 1

      Bailon-Zambrano and colleagues were trying to answer the general question: what contributes to phenotypic variation when a gene of strong effect is mutated?

      The work has several major strengths for answering this interesting question. First, they decided to study mef2ca in zebrafish for which they had previously shown that mutants displayed highly variable facial phenotypes. To learn how phenotypic variation depends on phenotypic severity, they realized they needed to study more alleles, and so induced two more alleles to have three different types of molecular lesions (start codon mutation, premature stop codon, and full coding gene deletion). Investigating these alleles showed that increasingly severe alleles had more variation among individuals in the population but not necessarily more variation between the left and right sides of the face within individuals.

      Over several years, these investigators had spent considerable effort to select lines of fish that segregate the start-codon mutation and have either severe or weak effects on facial phenotypes. They wondered: what factors were selected out of the original genetic background that would increase or decrease phenotypic severity? They hypothesized that one or more of the five mef2 paralogs in zebrafish might help to ameliorate the phenotype in the low line or reciprocally intensify the phenotype in the high line. They studied expression of the mef2 paralogs in neural crest cells by single-cell transcriptomics. They found that paralogs were downregulated in the high-penetrance line with respect to an unselected line, a result expected if expression of the paralogs contributed to buffering phenotypic severity. This experiment has two weaknesses, first that the method only examined neural crest cells but we know that signals from the ectodermal and endodermal epithelia contribute to craniofacial morphologies by diffusible signals. If genes regulating craniofacial morphologies that act in epithelia had genetic variation that contributes to severity, those genes would not be investigated in these crest-only experiments. A minor problem (which is associated with the expense of the experiment) is that the scRNA-seq experiments compared only the high and unselected lines, not the low line. To address both problems, the investigators performed qPCR on RNAs extracted from whole heads of genetically mef2ca-wild types from the high and low line. In these qPCR experiments, however, they did not investigate the unselected line. Leaving out the low line in one approach and leaving out the unselected line in the other approach somewhat weakens the strength with which one can draw conclusions (e.g., the qPCR conclusion assumes that the unselected line would be intermediate between the two selected lines) but is unlikely to change the basic conclusions the authors drew. 
In addition, using whole heads in the qPCR experiments, while it has the advantage that it includes epithelia, does not distinguish between genes expressed only in the crest and genes expressed in other cell types, and these experiments did not test for any genes known to affect craniofacial development that are epithelium-specific.

      In response to this comment, and those below, we removed the scRNA-seq comparing neural crest cells from unselected and high-penetrance strains. We replaced those data with important new results which considerably advance our model. We found significant paralog expression variation among unselected zebrafish families (Fig. 4D). These results strongly suggest that our breeding selected upon standing paralog variation in the unselected parental strains. See more below.

      Finally, in key experiments that are a major strength of the work and require significant effort, the researchers systematically made mutations in four of the five zebrafish mef2 paralogs (mef2aa, mef2b, mef2cb, and mef2d, all except mef2ab, which didn't become mutated despite significant effort) in the genetic background of the low-penetrance strain and studied them in single homozygotes, in double mutants, and in various heterozygous combinations. These important experiments showed that some paralogs provided significant buffering in the low-penetrance strain, the strain that up-regulated expression of these paralogs. It would be helpful in the discussion to mention that mef2ab couldn't be mutated and a phrase added about what that means for the general conclusions - in the opinion of this reviewer, the impact of this is not great but it should be acknowledged.

      We acknowledge that mef2ab couldn’t be mutated and consider what that means for the general conclusions in the text.

      A strength of the experiments is that the workers quantified effects of various genotypes by focusing on the length of the symplectic, a convenient element for quantification both within single individuals and among fish in a population. It would be helpful to have a statement on the evidence that this measure is a good representative for other aspects of the phenotype.

      We provide new data indicating that the symplectic cartilage length is significantly correlated with another mef2ca-associated phenotype (Fig. 1-figure supplement 2). See more below.

      Finally, the paper presents a model for understanding the results presented that does a good job of summarizing the data and, importantly, suggests ways to move the analysis deeper. Missing from the description of the model is a discussion about whether the genetic variation that was selected and ultimately upregulated mef2 paralogs is in regulatory elements of the mef2 paralogs themselves or whether it might be in trans-acting transcriptional regulators that simultaneously regulate all mef2 paralogs due to the authors' hypothesized 'cryptic vestigial' functions.

      We considerably revised the discussion, thoroughly considering both these possibilities.

      This work is likely to have a significant impact on the fields of developmental biology, the interpretation of human mutational variation (in for example the concept of phenotypic expansion), and the way people think about the evolution of new morphologies over time. A brief comparison of the authors' results and interpretations to those of C.H. Waddington's concept of genetic assimilation would provide improved historical context and broaden the potential impact of the work.

      We now include a discussion of our study in the context of Waddington’s genetic assimilation.

      Reviewer 2

      Bailon-Zambrano et al study the possible mechanisms that contribute to the oft-observed phenomenon that an individual mutation may be associated with variable expression of a phenotype. They focus on loss-of-function of the mef2ca gene of zebrafish, which is needed for the normal development of several craniofacial structures. They demonstrate that recessive putative loss-of-function mutant alleles of the mef2ca gene of zebrafish are associated with a range of expressivity. By focusing on one aspect of the mutant phenotype, the length of the symplectic cartilages that support the jaw, they find a correlation between the average strength of the phenotype of an allele (measured as reduction in length) and the extent of variability between mutant individuals that carry the allele. I am concerned about this conclusion and generalizations that may be drawn from focus on a single quantifiable character, the symplectic cartilage. Perhaps there is always a fixed variation in the length of this cartilage. As stronger alleles produce shorter cartilage pieces, variations in size may appear to be of greater significance when affecting shorter average length.

      We now show that the symplectic cartilage length is a good proxy for other craniofacial phenotypes (Fig. 1-figure supplement 2). Further, we clarify in the text that we use the coefficient of variation (standard deviation/mean) which is the accepted best practice for determining and comparing variation. We also use the F-test statistic which is the standard statistical method to test for equality of two variances. This test tells us if the standard deviations from two datasets are significantly different.
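
      As an illustration of the two statistics named in this response, the following minimal sketch computes the coefficient of variation (standard deviation/mean) and the variance-ratio F statistic for two samples. The cartilage lengths are made-up values, and the function names are hypothetical; nothing here is taken from the authors' analysis code. A p-value would additionally require the F distribution with the returned degrees of freedom, which is not computed here.

```python
from statistics import mean, stdev, variance


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def variance_ratio_f(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator), larger variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical symplectic cartilage lengths (arbitrary units), not real data.
weak_allele = [98.0, 102.0, 100.0, 101.0, 99.0]
strong_allele = [40.0, 70.0, 55.0, 30.0, 65.0]

cv_weak = coefficient_of_variation(weak_allele)
cv_strong = coefficient_of_variation(strong_allele)
f_stat, df_num, df_den = variance_ratio_f(weak_allele, strong_allele)
print(cv_weak, cv_strong, f_stat, df_num, df_den)
```

      Because the coefficient of variation is scaled by the mean, it can be compared between alleles whose average cartilage lengths differ, whereas the F statistic compares the raw variances.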

      The authors hypothesize that one factor that contributes to the varied phenotypic expression of an allele (expressivity) is the co-expression of paralogs that may provide wildtype function and thus partially or wholly rescue the mutant phenotype. They test this hypothesis by "fixing" conditions where a single mutation may be expressed with low or high penetrance. By selective breeding based on phenotype, they create two sets of strains that carry an identical mef2ca mutation: one strain has high penetrance of the mutant phenotype and the other low penetrance. They then investigate the factors that are likely responsible for the high vs low penetrance. Historically we would call these factors "genetic modifiers". There is extensive literature on the nature of genetic modifiers and there are many current screens in both mice and Drosophila to identify genetic modifiers and uncover their nature, but there is little reference to these studies in the current manuscript. Further, there is previously published work that hypothesizes that one important function of paralogs in multicellular organisms is to provide a buffer to stabilize levels of gene expression needed for developmental decisions.

      Following this reviewer’s suggestion, we now include many new references (increased from ~50 to >80) incorporating much of the important work leading up to our study. These include referencing both genetic modifier mutagenesis screens, paralogous buffering in other systems, and “natural” modifier studies that set the stage for our work.

      The authors find that paralogs of the mef2ca gene are expressed in cells that normally express mef2ca, and that these paralogs are expressed at higher levels in the mutant strain with low penetrance than in the mutant strain with high penetrance. They say that selection for high penetrance of the mef2ca mutant phenotype "leads to down-regulation" of paralog expression. As the authors only show that paralog expression is at lower levels in high penetrance vs low penetrance strains, it is not clear what they mean by "down-regulation". Perhaps their breeding scheme has only "captured" what is natural variation and there is no active mechanism of "down-regulation". The authors need to clarify what they mean.

      Thank you for this suggestion. We clarified that we do not mean active down or up regulation but rather selection on preexisting genetic variation. This conclusion is supported by new data (Fig. 4D).

      The authors also find that individuals from the high penetrance strains that don't carry the mef2ca mutation (they are wildtype for this gene) sometimes exhibit mef2ca mutant characters. They suggest the reduced paralog expression is responsible for the occasional emergence of the mef2ca mutant characters. In contrast with this suggestion, the authors later claim the paralogs "have no function" in craniofacial development. The authors need to clarify their thoughts about what is paralog function in craniofacial development and why reduced paralog function might contribute to the expression of mef2ca mutant characters. This topic is worthy of discussion.

      We considerably revised our discussion of this topic, including our interpretation that the decreased expression of mef2ca in the high-penetrance strain led to the phenotypes we observe in mef2ca wild types from this strain. We are also more careful with our language, stating that the paralog mutants are indistinguishable from wild types, rather than stating that paralogs do not function in craniofacial development. In fact, they do function in craniofacial development, as buffers. Thank you for this suggestion that strengthened our manuscript.

      The authors claim there is both up-regulation of paralogs in low penetrance strains and down-regulation of paralogs in high penetrance strains. As they only compare steady state levels of expression in each strain, they can only reasonably conclude that there are differences - they seem to imply a mechanism and they need to be clear about what they are thinking.

      Excellent point. In the revised manuscript, we are clear that there is not active up or down regulation but rather selection upon preexisting variation.

      They hypothesize that paralog expression in the low penetrance strain masks the effects of loss of mef2ca. They test this by creating CRISPR-engineered mutations of two paralogs and examining the effects of the paralog mutations in wildtype fish or in fish carrying the mef2ca mutation. They find the putative loss-of-function mutations in the paralogs have no effect in wildtype backgrounds and conclude these paralog genes have no function in craniofacial development. However, the paralog mutations enhance the mutant phenotype in fish that carry the mef2ca mutation. This provides strong evidence consistent with the model that the elevated expression of the paralogs functions to reduce the severity of the phenotype associated with the mef2ca mutation.

      Reviewer 3

      In this elegant genetic study, Bailon-Zambrano et al. draw on classical genetic concepts to address the clinically pertinent question of how genetic variants in the same gene can yield wildly different phenotypes in different individuals. They focus on the Mef2c gene, which is required for craniofacial and cardiac development in humans and model organisms yet shows highly variable phenotypes across and within individuals. Previous work by this lab had established that zebrafish mef2ca craniofacial phenotypes are highly variable and, importantly, that this variability is heritable and can be selectively bred for low vs. high penetrance. The authors hypothesize that vestigial expression of paralogous genes variably compensates for loss of mef2ca, such that individuals with higher levels of paralogous genes will show lessened severity and vice versa. To test their hypothesis, they methodically quantify the penetrance, expressivity, and variability of all known mef2ca-associated craniofacial phenotypes in fish carrying 1) different mef2ca mutations, 2) the same mutation but after selecting for high vs. low penetrance for many generations, and 3) mef2ca mutations combined with mutations in paralogous genes. They find that not only does allele severity directly correlate with variation, but also that different paralogs buffer the severity and variability of different craniofacial phenotypes. Another particularly interesting finding is that some of the craniofacial phenotypes are apparent even in mef2ca wildtypes from the high penetrance strain, which they explain by the very low expression of paralogs on this background. A weakness of the study is that the authors do not directly show whether paralog expression is increased in the low-penetrance strain relative to the initial, unselected genetic background. It is therefore not clear whether the selection for low penetrance worked in this manner, as the authors imply. 
Overall, the authors have achieved an important step forward in understanding the genetic basis for the high variability of human faces among both healthy individuals and those with craniofacial anomalies.

      We can’t go back (over ten generations) to survey the original parental strain. However, we can use the unselected AB strain as a proxy for the initial unselected genetic background. In an important addition to the manuscript, we found significant paralog expression variation between unselected AB families (Fig. 4D). These results strongly suggest that there is cryptic, standing paralog expression variation that we selected upon. We would like to thank the reviewer for this excellent critique, which motivated these important new experiments that considerably advance our model.

    1. Author Response

      Reviewer 1

      Ting Tang et al. present the results of a species x genotype diversity experiment within BEF China. The authors assess the relative impacts of species and genotype diversity on community-level primary productivity of the trees and the potential mediation of this effect via interactions of plants with soil fungi and herbivores. The results show that both species and genotype diversity influence productivity via changes in herbivory, soil fungal diversity, and other unknown mechanisms. Most of the species diversity effects could be directly related to functional diversity, while genotype diversity effects were not well represented by the way functional diversity was measured in this study.

      Thanks for the positive comments on the paper.

      The study is based on an impressive experiment that will certainly allow achieving major insights into the role of genotype and species diversity on ecosystem functioning. However, there are some significant shortcomings in the methods that limit this study. In particular, the incomplete assessment of functional traits, herbivory, and fungal diversity across the subplots used for this study reduces statistical power. Specific measurements of traits, herbivory and fungal diversity in each plot would substantially simplify the design and the analyses and likely also reduce the unexplained variance observed in the study. However, this is nothing that can be changed now and has the likely explanation of feasibility constraints.

      Thank you for the positive comments on the paper and the understanding of the feasibility constraints. In our study, functional traits of all the seed families of the four species across all the species × genetic diversity combinations were sampled, but to reduce circularity, we used the seed-family means across all tree diversity combinations to calculate functional diversity for every subplot instead of only using the functional trait measures obtained in that particular subplot. We have taken up the suggestion to also calculate functional diversity based on trait measurements of individual trees, but also here used data across all plots to reduce circularity. Additionally, we now acknowledge the incomplete assessment of herbivory in the Methods and state that fungal diversity in plant species mixtures was sampled on plot level because of feasibility constraints.

      Lines 334–337: “To reduce circularity, we used the seed-family means across all species × genetic diversity combinations to calculate FDis values per subplot that did not only depend on the functional trait measures obtained in that particular subplot. Using traits measured in a particular subplot to calculate FDis for that subplot bears the risk that the measured traits reflect a response to the local environment, yet we want to use FDis as a predictor variable for the performance of that subplot.”
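
      The idea of computing FDis from pooled seed-family means, rather than from traits measured in the focal subplot, can be sketched as follows. This is only an illustration under stated assumptions: functional dispersion is taken as the mean Euclidean distance of each seed family to the centroid of the families present, all families are weighted equally, traits are assumed to be already standardized, and the family names and trait values are hypothetical.

```python
from math import dist


def functional_dispersion(trait_vectors):
    """Mean Euclidean distance of each entity to the unweighted centroid."""
    n = len(trait_vectors)
    centroid = [sum(column) / n for column in zip(*trait_vectors)]
    return sum(dist(vector, centroid) for vector in trait_vectors) / n


# Hypothetical seed-family trait means, pooled across ALL diversity
# combinations (assumption: two standardized traits per family).
family_means = {
    "fam1": (0.0, 0.0),
    "fam2": (2.0, 0.0),
    "fam3": (0.0, 2.0),
    "fam4": (2.0, 2.0),
}


def subplot_fdis(families_present, means):
    """FDis of a subplot from pooled family means, not local measurements."""
    return functional_dispersion([means[f] for f in families_present])


print(subplot_fdis(["fam1"], family_means))
print(subplot_fdis(["fam1", "fam2"], family_means))
print(subplot_fdis(["fam1", "fam2", "fam3", "fam4"], family_means))
```

      Because the trait values come from the pooled means, the FDis of a subplot depends only on which seed families were planted there, not on how the trees in that subplot responded to their local environment.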

      Lines 380–382: “The mean value of herbivore damage per species × genetic diversity level was used to fill in missing values in a few subplots with tree individuals lacking herbivory data (Table S3).”
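
      The gap-filling step described in this passage amounts to group-mean imputation. A minimal sketch, assuming each subplot record carries a diversity-level label and an herbivory value that may be missing (the field names and numbers are hypothetical, not taken from Table S3):

```python
from collections import defaultdict
from statistics import mean


def fill_with_group_mean(records, group_key, value_key):
    """Return copies of records with missing values set to their group mean."""
    observed = defaultdict(list)
    for record in records:
        if record[value_key] is not None:
            observed[record[group_key]].append(record[value_key])
    group_means = {group: mean(vals) for group, vals in observed.items()}

    filled = []
    for record in records:
        new_record = dict(record)
        if new_record[value_key] is None:
            # Remains None if no subplot in the group has an observed value.
            new_record[value_key] = group_means.get(new_record[group_key])
        filled.append(new_record)
    return filled


# Hypothetical subplots; "level" stands for a species x genetic diversity level.
subplots = [
    {"subplot": "A", "level": "1.1", "herbivory": 0.10},
    {"subplot": "B", "level": "1.1", "herbivory": 0.20},
    {"subplot": "C", "level": "1.1", "herbivory": None},
    {"subplot": "D", "level": "4.4", "herbivory": 0.30},
]
filled = fill_with_group_mean(subplots, "level", "herbivory")
print(filled[2]["herbivory"])
```

      The original records are left unchanged, so imputed and observed values can still be told apart afterwards.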

      Lines 385–388: “Soil fungal diversity was used as a proxy of unspecified trophic interactions. To be consistent with the species and genetic diversity treatment design, soil samples were taken on subplot level for the 1.1 and 1.4 diversity treatments, but, due to feasibility constraints, on plot level for the 4.1 and 4.4 diversity treatments in 2017.”

      The writing of the manuscript is generally good. However, given the somewhat diffuse results obtained for genetic diversity effects, they receive a lot of attention in the discussion, while species diversity effects are little mentioned. This could be better balanced and also referred back to the hypotheses. For example, I miss the discussion of the very clear hypothesis that genotype diversity effects are positive in species monocultures but neutral in species mixtures. How do your results fit with this hypothesis? My general impression is that the study is very well framed, but fails to stick to this frame in the discussion. I am aware that this might be a challenge with the results obtained, but worth trying.

      Thank you for the positive comments on the writing and pointing out the unclear part of the genetic diversity effects. To better connect the discussion to our hypothesis that genotype diversity effects are “more important in species monocultures than in species mixtures” (lines 114–115), we have rewritten the corresponding Discussion section.

      Lines 248–164: “In contrast to our second hypothesis, we found that the effects of genetic diversity via functional diversity and multi-trophic feedbacks were negative in species monocultures but positive in the species mixture (Fig. 5 and Fig. S3). We found genetic diversity had positive effects on tree functional diversity and soil fungal diversity, which supports the trade-offs between genetic and species diversity discussed in the previous section. However, the hypothesized positive effects of tree functional diversity on productivity turned negative in species monoculture. This result indicates that functional diversity may not have positive effects on the ecosystem functioning under low environmental heterogeneity, i.e. species monocultures in our study (Hillebrand and Matthiessen 2009). Therefore, our findings show that the different effects of genetic diversity on tree productivity between species monocultures and mixtures depend not only on the different effects of genetic diversity on functional diversity and trophic interaction but also on the differing consequences of functional diversity and trophic interaction for tree productivity between species monocultures and mixtures. Moreover, other aspects of tree genetic diversity seem to play an important role not only for productivity in tree species mixtures (see previous section) but also for productivity in tree species monocultures. These may include unmeasured functional traits such as root traits (Bardgett et al., 2014) or unknown mechanisms underpinning effects of tree genetic diversity.”

      Given the complex results obtained, I also feel that the title and main message received in the abstract do not fully reflect the results. Genetic diversity effects on productivity, but also on herbivory and fungal diversity, are not general (e.g. Fig. 2) nor are all genetic diversity effects on productivity mediated by functional diversity and trophic feedback. I think the title and main message of the study should be articulated more precisely.

      In this study we did not find direct effects of genetic diversity on tree productivity in the binary analyses (Fig. 2), but we did find indirect effects of genetic diversity on tree productivity via functional diversity and trophic feedbacks in the path analysis (Fig. 4). Now we have pointed this out in the Discussion.

      Lines 201–204: “Although only species diversity but not genetic diversity was found to affect tree productivity in binary analyses, both kinds of diversity positively affected tree community productivity and trophic interactions via functional diversity according to our structural equation models (SEMs) depicted in the corresponding path-analysis diagrams (see Fig. 4).”

      We agree that not all genetic diversity effects on productivity were mediated by functional diversity and trophic feedbacks. This may have been because we did not include all relevant functional traits and trophic interactions in this study. Nevertheless, our findings support the hypothesis that genetic diversity can affect productivity via functional diversity and trophic feedbacks and suggest more possibilities for further research. We have explained this in the Discussion.

      Lines 230–238: “Even after accounting for tree functional diversity and trophic feedbacks, we still detected a direct negative effect of tree genetic diversity on tree productivity, while the direct effect of tree species diversity was fully explained by functional diversity and trophic feedbacks. This suggests that aspects of genetic diversity that do not contribute to functional diversity or trophic interactions as measured in this study may reduce ecosystem functioning, e.g. due to trade-offs between genetic diversity and species diversity. For example, it has been shown that in species-diverse grassland ecosystems, niche-complementarity between species can increase at the expense of reduced variation within species (van Moorsel et al., 2018; van Moorsel et al., 2019; Zuppinger-Dingley et al., 2014; Zvereva et al., 2012).”

      Lines 260–264: “Moreover, other aspects of tree genetic diversity seem to play an important role not only for productivity in tree species mixtures (see previous section) but also for productivity in tree species monocultures. These may include unmeasured functional traits such as root traits (Bardgett et al., 2014) or unknown mechanisms underpinning effects of tree genetic diversity.”

      Reviewer 2

      This study aims to disentangle the contributions of genetic and species diversity to tree community fitness. It confirms the role of genetic diversity in functional and ecological traits but shows how these effects change when plant species diversity is increased, which can potentially add to our understanding of the interplay between plant diversity at various levels and community and ecosystem functions. It would be desirable to emphasize whether the differences between the effects of genetic and species diversity are comparable, since they can act at complementary but different levels. It is hard to establish whether the effects of species diversity override the effects of genetic diversity by shared mechanisms, or whether a high species diversity reduces plant intraspecific interactions and the consequent effects of genetic diversity by density-dependent effects. However, this point has to be emphasized in the discussion.

      Thank you for your positive comments on this paper. In the binary analyses in this paper, we used general linear mixed-model analysis to detect the effects of genetic diversity within species. Now we have clarified this in the Methods and the Results section. However, in Fig. 2 we also indicate the significance of the main effect of genetic diversity. We do not focus on this because of the interaction between species and genetic diversity. In statistical terms, fitting genetic diversity effects separately for species monocultures and mixture (2 degrees of freedom) is equivalent to (i.e. has the same sum of squares as) fitting the main effect of genetic diversity (1 degree of freedom) and the species × genetic diversity interaction (1 degree of freedom).

      Lines 415–424: “To determine how species and genetic diversity and their interaction affected tree functional diversity and trophic interactions, linear mixed-effects models (LMMs) were fitted with two types of contrast coding. In the first, we used the ordinary 2-way analysis of variance with interaction and in the second we replaced the genetic diversity main effect and the interaction with separate genetic diversity effects for species monocultures and the species mixture (Table S6). Note that as our design was orthogonal, fitting sequence did not matter in either of the codings. However, we focused our major analysis on the second type of coding to make it consistent with our hypotheses. Main effects of genetic diversity are presented in inset panels in Fig. 2. Our second contrast coding ensured that we tested the effects of genetic diversity separately in species monocultures and species mixture, but within the same analysis.”
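
      The equivalence of the two contrast codings described above can be checked numerically. The following is a minimal sketch on simulated data, assuming a balanced 2 × 2 design and ordinary least squares rather than the mixed models with plot as a random term used in the study; the variable names, effect sizes and the helper `model_ss` are illustrative assumptions, not taken from the manuscript.

```python
import numpy as np

def model_ss(X, y):
    """Regression sum of squares for a least-squares fit of y on X (X includes an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return float(np.sum((fitted - y.mean()) ** 2))

rng = np.random.default_rng(0)

# Hypothetical balanced design: species diversity (0 = monoculture, 1 = mixture)
# crossed with genetic diversity (0 = low, 1 = high), 10 units per cell.
n_per_cell = 10
sp = np.repeat([0, 0, 1, 1], n_per_cell)
gd = np.repeat([0, 1, 0, 1], n_per_cell)
y = 1.0 + 0.5 * sp - 0.3 * gd + 0.8 * sp * gd + rng.normal(size=sp.size)

ones = np.ones(sp.size)
base = np.column_stack([ones, sp])  # intercept + species diversity only

# Coding 1: genetic diversity main effect (1 df) + interaction (1 df).
X1 = np.column_stack([ones, sp, gd, sp * gd])
# Coding 2: genetic diversity within monocultures (1 df) + within mixtures (1 df).
X2 = np.column_stack([ones, sp, gd * (1 - sp), gd * sp])

ss_base = model_ss(base, y)
ss_coding1 = model_ss(X1, y) - ss_base  # SS for main effect + interaction
ss_coding2 = model_ss(X2, y) - ss_base  # SS for the two within-level effects

df_coding1 = np.linalg.matrix_rank(X1) - np.linalg.matrix_rank(base)
df_coding2 = np.linalg.matrix_rank(X2) - np.linalg.matrix_rank(base)

print(ss_coding1, ss_coding2, df_coding1, df_coding2)
```

      Both codings span the same column space (the within-monoculture column equals the main-effect column minus the interaction column), so the pooled 2-degree-of-freedom sum of squares is identical; only the way it is partitioned into single-degree-of-freedom tests differs.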

      Lines 120–121: “Using linear mixed-model analyses, we tested the effects of species diversity and genetic diversity within species on trophic interactions and community productivity.”

      Meanwhile, to emphasize that species diversity and genetic diversity could affect each other, we discussed that the trade-offs between species and genetic diversity could contribute to the effects of tree diversity on tree community productivity. We also discussed that the different effects of genetic diversity between species monocultures and mixtures may arise because different levels of species diversity create different biotic environments.

      Lines 232–238: “This suggests that aspects of genetic diversity that do not contribute to functional diversity or trophic interactions as measured in this study may reduce ecosystem functioning, e.g. due to trade-offs between genetic diversity and species diversity. For example, it has been shown that in species-diverse grassland ecosystems niche-complementarity between species can increase at the expense of reduced variation within species (van Moorsel et al., 2018; van Moorsel et al., 2019; Zuppinger-Dingley et al., 2014; Zvereva et al., 2012).”

      Lines 250–260: “We found genetic diversity had positive effects on tree functional diversity and soil fungal diversity, which supports the trade-offs between genetic and species diversity discussed in the previous section. However, the hypothesized positive effects of tree functional diversity on productivity turned negative in species monoculture. This result indicates that functional diversity may not have positive effects on the ecosystem functioning under low environmental heterogeneity, i.e. species monocultures in our study (Hillebrand and Matthiessen 2009). Therefore, our findings show that the different effects of genetic diversity on tree productivity between species monocultures and mixtures, not only depend on the different effects of genetic diversity on functional diversity and trophic interaction but also on the varied tree productivity consequences from functional diversity and trophic interaction on tree productivity between species monocultures and mixtures.”

      The experimental design has to be explained in more detail, in particular how plants were planted in the species monocultures. It is not stated whether the same or different species were used in the plots or in subplots. The design lacks proper replication for the treatment with high genetic diversity in species monocultures (n=2) which could lead to a biased result, especially if those plots were located in the same area.

      Thank you for the valuable comments on the experimental design. In total, we used four species and eight seed families per species for the whole experiment, and now we have added a diagram of the experimental design to the supplementary material (Fig. S5) to show the species and seed-family information for every subplot. Furthermore, we have added a table to the supplementary material to indicate the number of occurrences of each species and each seed family across all the tree diversity-treatment combinations (Table S2). The high genetic diversity in species monoculture (1.4 treatment) was replicated 2 times per species and thus had 8 replications (Fig. S5). However, because we did not have enough seedlings, we could only establish these treatments at subplot level and thus put the different species for the 1.4 treatment into only two plots. Now we have added more explanation of the plot design in the Methods part. The plot distribution was completely randomized across the experimental site and plots of the same treatments were mostly located at least 50 m from each other (see Fig. 1 from Bongers et al., 2020, pasted here further below). The reason that there are more plots for the 1.1 treatment is that typically in biodiversity experiments more plots are needed at the lowest diversity treatment because of the desire to have all seed families occurring in any mixture also present as monoculture. Regarding the point that the four diversity treatments varied between rather than within plots, we ensured that diversity effects were tested at the plot level by including plot as a random-effects term in the mixed models.

      Lines 305–323: “For each of the four species, we collected seeds from eight mother trees to allow for two replications of four-family mixtures per species. Furthermore, to avoid the effects of unequal representation of particular seed families and correlations between seed family presence and diversity treatments, we made sure that every seed family occurred the same number of times at each diversity level (see Table S2, small deviations from the rule were required where not enough seeds from a seed family could be obtained). Due to budget limitations and the number of replicates required per single seed family, the 1.1 and 1.4 diversity treatments were applied at subplot level (0.25 mu) and replicated 32 and 8 times, respectively. The 4.1 and 4.4 diversity treatments were applied at plot level (1 mu) and were replicated 8 and 6 times, respectively (Fig. S5; see also Fig. 1 in Bongers et al., 2020). To allow for simpler analysis, we obtained most community measures at subplot level also for the 4.1 and 4.4 diversity treatments and thereafter used the subplots for all tests of diversity effects on these community measures, including plots as error (i.e. random-effects) term for testing the diversity effects in the corresponding mixed models. In total, because one 1-mu plot could not be established due to logistic constraints, the number of subplots used was 92 (32 subplots of 1.1, 8 subplots of 1.4, 28 subplots of 4.1 and 24 subplots of 4.4 diversity treatment). Note that in biodiversity experiments lower richness levels represent more different communities and thus require more plots. For the highest richness level, where there is typically only one species composition, this same community is typically replicated multiple times, as we did here for the 4.4 diversity treatment.”
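
      The subplot total quoted above can be reproduced with a short tally. This is only a bookkeeping sketch: the figure of four subplots per plot follows from the stated plot (1 mu) and subplot (0.25 mu) sizes, while the count of seven established 4.1 plots is an inference from the 28 subplots reported, i.e. it assumes that the one plot that could not be established belonged to the 4.1 treatment.

```python
# Four 0.25-mu subplots fit into one 1-mu plot.
SUBPLOTS_PER_PLOT = 4

# Assumption: 8 planned 4.1 plots minus the one that could not be established.
ESTABLISHED_41_PLOTS = 7
ESTABLISHED_44_PLOTS = 6

subplots = {
    "1.1": 32,  # applied directly at subplot level
    "1.4": 8,   # applied directly at subplot level
    "4.1": ESTABLISHED_41_PLOTS * SUBPLOTS_PER_PLOT,
    "4.4": ESTABLISHED_44_PLOTS * SUBPLOTS_PER_PLOT,
}

total = sum(subplots.values())
print(subplots, total)
```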

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript describes the formation of supernumerary centriole protein assemblies ("cenpas") upon silencing of the E3 ubiquitin ligase TRIM37. These "cenpas" resemble centrioles, centriole precursors, or electron-dense striped structures, termed "tigers". Similar observations are made in cells from patients lacking functional alleles of TRIM37. The "cenpas" usually lack the full complement of centriolar proteins, but contain increased amounts of the pro-centriole marker centrobin. It is further shown that the formation of "cenpas" depends on centrobin, or on a parallel pathway involving Plk1 and SAS-6. Overall, the experiments in this study are of high technical quality and most of them are carefully controlled. The discovery of centrobin-containing striped protein assemblies ("tigers") is very interesting and provokes the question of their molecular composition and their mechanistic role in centriole assembly. Since striated fibres containing the protein rootletin have a similar periodicity of stripes (75nm) as the "tigers" in this study (Vlijm et al., PNAS 2018, 115:E2246-53), I was wondering whether the authors couldn't simply test for colocalization of their "tiger"-stripes with rootletin. A potential identity of "tigers" with striated fibres would help understanding the mechanisms of "cenpas" and centriole assembly upon depletion of TRIM37: striated fibres or "tigers" might be controlling the balance of centriole cohesion vs. disengagement and thereby centriole duplication, or they might play a role in the recruitment of additional proteins involved in pro-centriole assembly.

      We are grateful to the reviewer for this interesting suggestion. Accordingly, we will test the distribution of Rootletin and potentially CEP68 by immunofluorescence analysis of cells depleted of TRIM37.

      In the same context, did the authors correct for the experimentally induced sample expansion in Figure 5B, when comparing inter-stripe distances between U-ExM and EM samples?

      Yes, we did. We will clarify the text of the revised manuscript to make this more explicit.

      Other major points: The amount of TRIM37-depletion upon siRNA-treatment should be indicated prominently. I see in the "Materials and Methods" and in Fig. S4 that quantitative RT-PCR has been performed. Could Western blotting be performed to have direct information on the protein levels? Fig. 2C demonstrates that this is possible in cells from human patients, so why are there no data on the majority of other experiments in this manuscript?

      We previously reported Western blot analysis to estimate the extent of TRIM37 depletion upon siRNA treatment (Balestra et al., 2013). However, following the suggestion of the reviewer, we will repeat this analysis for select experiments of this study.

      Moreover, what is the transfection efficiency in the siRNA experiments? Is there variability between cells that might explain variability in the "cenpas" phenotypes?

      The reviewer brings up an interesting point. However, in the absence of an antibody to detect endogenous TRIM37 by immunofluorescence analysis, we cannot provide an accurate figure in this case. We will mention this limitation explicitly in the text of the revised manuscript.

      Minor point: In line 353 (page 12), it is stated that centrobin in si-TRIM37 cells migrates slower (Fig. 4D), suggesting that TRIM37 regulates the post-translational state of centrobin. It looks to me as if the corresponding gel in Fig. 4D was "smiling" (see curvature of centrobin in the neighboring lane). I think that the authors should tone down their statement, or replace Fig. 4D with a more convincing image.

      We thank the reviewer for having noticed this. We will provide another gel that is not “smiling”; the difference in migration has been observed reproducibly.

      Reviewer #1 (Significance):

      The findings of this manuscript are highly significant for our understanding of centriole biogenesis. They should be of interest to a large community of cell biologists working on mitosis and on the centrosome, and they are of further importance for biomedical research related to developmental growth abnormalities (Mulibrey nanism). The manuscript shows for the first time a mechanistic link between TRIM37-dependent control of centrobin protein levels, and their impact on the formation of centriole precursors during the cell cycle. The manuscript is well presented, and the relevant scientific literature is cited correctly. However, I would prefer that a potential relationship between "cenpas", "tigers", and the well-described rootletin-containing striated fibres be discussed, if not controlled by additional experiments.

      We thank the reviewer for her/his appreciation of our work and support for publication.

      Field of expertise of this reviewer: centrosome, microtubules, mitosis, cell culture, light and electron microscopy, biochemistry.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In this work, the authors investigated the roles of TRIM37 in the regulation of centriole numbers. It was previously observed that depletion of TRIM37 results in supernumerary centrioles and centriole-like structures (Balestra et al., Dev. Cell, 2013; Meitinger et al., 2016). Here, the authors characterized these centriolar protein assemblies (Cenpas). Cenpas were formed following an atypical de novo pathway and eventually trigger centriole assembly. They observed that Centrobin is frequently present in Cenpas from the early stage and other centriolar components are sequentially recruited. Furthermore, they established that Cenpas formation upon TRIM37 depletion requires PLK4 activity. TRIM37 depletion also activates PLK1-dependent centriole multiplication.

      1. They propose that the tiger structure acts as a platform for PLK4-dependent Cenpas assembly. Cenpas may evolve into centriole-like structures after a stepwise incorporation of other centriolar proteins. Fig. 6E suggests that a series of events seems to occur within G2 phase. Therefore, this reviewer suggests performing a detailed time-course experiment at G2 phase. According to the model, the Centrobin-positive tiger structures may appear first, and then a Centrobin- and centrin-2-double positive structure starts to appear.

      We fully agree with the reviewer that this is an important experiment, which we will perform by analyzing TRIM37 depleted cells at successive time points after release from a double thymidine block, using antibodies against Centrobin and Centrin.

      2. They claim that Mulibrey patient cells exhibited evidence of chromosome mis-segregation, as would be expected from multipolar spindle assembly, and conclude that Cenpas are present and active also in Mulibrey patient cells. Chromosome mis-segregation may be observed in normal cells, too. Therefore, they have to perform statistical analysis on Fig. 2D.

      In response to this suggestion and to the related comment of reviewer 3 (see below), we will conduct additional immunofluorescence analysis and quantification of patient and normal cells, assessing the distribution of Centrin, Centrobin, microtubules and γ-tubulin, as well as scoring the extent of chromosome mis-segregation.

      3. In Fig. 2A, they claimed that mitotic microtubules were disrupted by cold treatment for 30 min. In our experience, cold treatment for 30 min is not sufficient to disrupt mitotic microtubules. They could show a control panel before microtubule regrowth.

      We will show the control panel as requested.

      Reviewer #2 (Significance):

      The significance of this work resides in the identification and description of Cenpas as a novel centriole assembly pathway. The authors used cutting-edge microscopy techniques to visualize Cenpas. The manuscript raised more questions than answers. Nonetheless, the manuscript is worth publishing after revision.

      We thank the reviewer for supporting publication after revision.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Balestra and colleagues investigate the function of Trim37 in centrosome biogenesis. Trim37 is a ubiquitin ligase that has previously been identified by the authors as a regulator of centriole duplication. Mutations in Trim37 cause a rare syndrome named Mulibrey that is responsible for a severe form of dwarfism. Here they show that depletion of Trim37 in human cells results in the assembly of structures that they name Cenpas. They follow the possibility that Trim37 localises to the centrosome, which might inhibit the assembly of these structures. Further, they show that Trim37-depleted cells (or patient fibroblasts) undergo multipolar mitosis. Further analysis shows that what the authors defined as abnormal centriole structures are formed in Trim37-depleted cells. These structures recruit centrobin, a daughter centriole component, and this process requires the activity of PLK4 and PLK1. Major comments: This study characterizes Trim37 and its possible role in centriole biogenesis. Most conclusions are convincing, although some of the claims made by the authors might require more data to be corroborated.

      1) The major point to be taken into consideration in my opinion relates to the Cenpas structure. According to the beautiful cryo-EM data shown in Fig 3, I wonder why the authors describe these structures as centriole-like or centriole-related. I think these appear very different from centrioles and this might be even quite interesting if these structures nucleate microtubules and can participate in mitotic spindle assembly.

      We have a different opinion on this point. Most of the “centriole-like” or “centriole-related” structures do resemble the organelle, in that they contain microtubule bundles and are of a related size (in addition to bearing centriolar markers). However, recognizing that the distinction between these two categories of structures is somewhat arbitrary, we will combine them into the more prudent term “centriole-related”, and further explain in the revised manuscript that they comprise a range of structures.

      The authors regard these non-canonical centriole structures as possible microtubule nucleators that might be responsible for multipolar configurations like in Fig 2D. This correlation has to be established. In Figure 2D, the authors analyze configurations of mitotic cells in terms of centrosome number and characterize the frequency of extra foci. To me the foci they show are quite different in nature. Poles 1 and 3 have both centrin and g-tubulin (presumably centrioles), pole 2 has only a tiny amount of centrin and no g-tubulin, while pole 4 appears to contain both but less of each protein. So the question is: are they all nucleating microtubules and participating in spindle assembly? This is particularly important in light of what the authors then mention, which is the occurrence of chromosome mis-segregation in patient cells (this is not shown). Also they describe these extra poles, and then say that Cenpas are active in patient cells. But active in which manner? By nucleating microtubules? First, in either siRNA cells or in patient cells the authors should analyze microtubules and show that all the extra poles (made of non-canonical centrioles) nucleate microtubules and participate in spindle assembly.

      In response to this suggestion and to the related comment of reviewer 2 (see above), we will conduct additional immunofluorescence analysis and quantification of patient and normal cells, assessing the distribution of Centrin, Centrobin, microtubules and γ-tubulin, as well as scoring the extent of chromosome mis-segregation.

      If they want to propose that this might be the cause of genome integrity loss in patients (as stated in the abstract and suggested a few times throughout the paper), they have to show that cells divide abnormally and generate aneuploid progeny.

      See response just above.

      2) Another important point that is only partially addressed is the function of Trim37 in stabilizing centrobin. Does Trim37 ubiquitinate centrobin? While the western blot in Figure 4 shows an increase at 8 hrs in Trim37 RNAi, this is also the case for tubulin (Fig 4E). But the overall levels appear only slightly increased when compared to its levels at time point zero (Fig. 4F). I can see that in siRNA Ctrl Trim37 levels go down, but it is still present, so how do they explain the lack of Cenpas in this case? Is there a threshold that supports centriole duplication without any major defect, beyond which accumulation of centrobin generates Cenpas? Can the authors generate Cenpas just by over-expressing centrobin directly?

      It appears from the comment of the reviewer that we were not sufficiently clear here. The experiment reported in Figure 4E and 4F is done in the presence of cycloheximide to analyze the half-life of Centrobin in control conditions and upon TRIM37 depletion. We will clarify the text in the revised manuscript to facilitate understanding.

      In Figure 2, they analyze configurations of mitotic cells in terms of centrosome number and characterize the frequency of extra foci. To me the foci they show are quite different in nature. Poles 1 and 3 have both centrin and g-tubulin (presumably centrioles), pole 2 has only a tiny amount of centrin and no g-tubulin, while pole 4 appears to contain both but less of each protein. So the question is: are they all nucleating microtubules and participating in spindle assembly? This is particularly important in light of what the authors then mention, which is the occurrence of chromosome mis-segregation in patient cells without showing it. Also they describe these extra poles, and then say that Cenpas are active in patient cells. But active in which manner? By nucleating microtubules? This has to be shown. Also analysis of mitosis should be included to back up a defect in chromosome segregation and also to identify which type of defect.

      The above section is a copy/paste mistake (as indicated also in a correspondence between Review Commons and the reviewer).

      So in conclusion, the link between Cenpas and multipolarity has to be better investigated in my opinion. This should not be time consuming and also not extremely costly. Authors should label spindle MTs in patient fibroblasts to show that indeed Cenpas are nucleating microtubules. Ideally Cenpas would be distinguished by centrobin labeling. In siRNA depleted cells maybe time lapse microscopy can be used to image mitosis and show a correlation between Cenpas and multipolarity?

      As mentioned above, we will conduct additional immunofluorescence analysis and quantification of patient and normal cells, assessing the distribution of Centrin, Centrobin, microtubules and γ-tubulin, as well as scoring the extent of chromosome mis-segregation.

      The data are presented without statistical analysis on the figures, only in the figure legends. This is really difficult for the reader. The number of experiments and cells analyzed should perhaps also be included in each figure.

      We had kept this information to the legends merely to have lean figures, but will consider moving it to the figure panels in the revised manuscript.

      Minor comments: Some pictures lack scale bars.

      Apologies. This will be fixed.

      The localization of GFP-Trim37: in Figure 1, the authors describe a different localization when it is fused to an NES. It is true that a dotty signal is seen in the NES panel (Figure 1D), but a nuclear signal is not seen for Trim37-GFP in any of the images provided. Shouldn't this be the case?

      There is some GFP-TRIM37 nuclear signal in the left panel of Figure 1D, although it is very weak. We will explore the possibility of providing an inset with adjusted brightness/contrast to emphasize this point.

      Fig 1C is missing a siCtrl.

      The control quantification will be included (no extra centrioles are present in this case).

      Why does Trim37-GFP not completely rescue the assembly of the extra foci?

      In general, there can be many reasons why rescue in such an experimental setting is not complete, including slightly different protein levels, distribution, or interaction with partner proteins. Such possibilities will be discussed explicitly in the revised manuscript.

      In Fig 6E, are the authors sure that in the condition of siTrim37 plus siCentrobin and Plk1 inhibition, cells are not stuck in S-phase? Cells not being in a permissive G2 phase might explain the lack of Cenpas.

      Although Plk1 inhibition is not expected to block cells in S phase, we cannot rule out this possibility from the data currently available. Therefore, we plan to conduct FACS analysis in a repeat of this experiment to assess cell cycle status.

      The data are presented without statistical analysis on the figures. This can be found in the figure legends, but it is better to include it on the figures to facilitate the reader's job. Should the number of experiments and cells analyzed also be included in each figure?

      As mentioned above also, we had kept this information to the legends merely to have lean figures, but will consider moving it to the figure panels in the revised manuscript.

      Reviewer #3 (Significance):

      Interesting findings and quite novel since a role for Trim 37 in centriole biogenesis has never been reported. Also quite interesting the possible link between multipolarity (needs better characterization) and Mulibrey syndrome.

      We thank the reviewer for recognizing the interest and novelty of our work.

    1. I was thinking about everything, and that includes general relativity theory, since actually this theory is rather complicated. It has many branches and there was a lot of material which had been worked out for many years. People have studied it, and quantum gravity is extremely complicated. I was just lucky that such beautiful things were at the surface so I could see them. You see, my mind is not very technical. I work best of all in those places where I can use my intuition.

      Lightman: That's very interesting. I'd like to start asking you questions about that. I've noticed from your technical papers and in your paper in Physics Today[6] and your lectures that you describe things intuitively, with pictures and so forth. I know there have been certain physicists in the past who have used images and visualization and pictures more than other physicists. I think Einstein used a lot of visual images. All of his Gedanken experiments were based on mental images rather than on writing out equations. Even here [at Harvard] we make a joke in the physics department that Weinberg is very technical and [Sheldon] Glashow is very intuitive. So there do seem to be different styles of doing physics. One question that I've been very interested in, and some psychologists are interested in too, is how physicists use mental pictures. Maybe not exactly pictures but, for example, the way we say in quantum mechanics that sometimes things act as particles and sometimes as waves. I guess we're attempting to make a connection to our daily experience with the world. How do you use images in your work? Do you find images useful or harmful?

      Linde: Typically, I just use them. Of course, I use mathematics, certainly.

      Lightman: Of course.

      Linde: But first we usually have a rough idea of how it could work and why, and what is the purpose. Without understanding the purpose of what we are doing, you may try many different ways and you just solve equations without understanding why it is necessary.
      • "METHOD"
    1. Author Response

      Reviewer 1

      In this article Farrell et al. leverage existing datasets which measure frailty longitudinally in mice and humans to model 'robustness' (the ability to resist damage) and 'resilience' (the ability to recover from damage), their dynamics across age, and their relative contributions to overall frailty and mortality. The concept of separating damage/robustness from recovery/resilience is valid and has many important applications including better assessment and prediction of effective intervention strategies. I also appreciate the authors' sophisticated attempts to effectively model longitudinal data, which is a challenge in the field. The use of human and mouse data is another strength of the study, and it is quite interesting to see overlapping trends between the two species.

      While I find the rationale sound and appreciate the approach taken at a high level, there are a few key considerations of the specific data used which are lacking. The authors conceptualize resilience based on studies which primarily use short time scales and dynamic objective measures (ex. complete blood cell counts in Pyrkov et al.) often in conjunction with an acute stress stimulus. For example, they heavily cite Ukraintseva et al. who define resilience as "the ability to quickly and completely recover after deviation from normal physiological state or damage caused by a stressor or an adverse health event."

      Resilience and robustness are typically studied at short time-scales, with small numbers of continuous health attributes. We study transitions of binary health attributes, which we call damage and repair, and which we suggest should be thought of as resilience and robustness. Our approach is well suited for studying large numbers of binary health attributes over long time-scales without acute external stimuli. How resilience and robustness in these limits (binary, large numbers, long times, intrinsic dynamics) compare with resilience and robustness as has been typically measured (continuous, short times, acute stimuli) is an interesting and important question that arises from our work.

      Given these definitions, the human data used seem to fit within this framework, but we should carefully consider the mouse data. The mouse frailty index is a very useful tool for efficiently measuring the organismal state in large cohorts. A tradeoff for quickly measuring a broad range of health domains is that the individual measurements are low resolution (categorical) and involve inherent subjectivity (which may be considered part of the measurement error). Some transitions in individual components are due to random measurement error and I believe this is especially likely with decreases (or 'resilience' transitions).

      The reason I think the resilience transitions are subject to high measurement error is that I am skeptical as to whether many of the deficits in the mouse index are reversible under normal physiologic conditions. For example, it is exceptionally unlikely for a palpable/visible tumor to resolve in an aged mouse over the time scales studied here, thus any reversal that was observed is very likely due to random measurement error. Other components whose reversibility I doubt are alopecia, loss of fur color, loss of whiskers, tumors, kyphosis, hearing loss, cataracts, corneal opacity, vision loss, rectal prolapse, and genital prolapse.

      In summary, I applaud the authors' efforts in generating complex models to better understand longitudinal aging data. This is an important area that needs further development. I appreciate their conceptualization of resilience and robustness and think this framework has an important place in aging research. I also appreciate their cross-species approach. However, the authors may have over-conceptualized and made some assumptions about the mouse data which may not be valid. It will be important to assess the results with careful consideration of the time scales of the underlying biology and the resolution and measurement error inherent to these tools.

      For each of our mouse attributes, there are published studies demonstrating reversibility (see our new Supplementary Table 1). Nevertheless, we cannot distinguish what causes the observed discrete transitions (measurement error, stochastic fluctuations in underlying organismal features, or logistic-like continuous transitions in underlying continuous variables). We analyze the discrete data as given.

      The question of time-scale is interesting. From survival curves of individual binarized attributes, we obtain reasonable fits to exponential models (i.e. a single timescale); see Fig 5 supplement 1 and 2. For the human data there is a broad range of timescales for both robustness and resilience. For the mouse data there appears to be a similarly broad range (note the logarithmic scale), though with considerable uncertainty. We work with the data we have, so we are unable to probe timescales shorter than the measurement interval (months for mice, and years for humans). We have reinforced this caveat in the discussion.
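
As a rough illustration of what a single-timescale (exponential) description of attribute survival means, the sketch below estimates one timescale from the durations an attribute spent in a state. It uses the standard maximum-likelihood estimate for an exponential model with right-censoring (total observed time divided by number of observed transitions). The function name and the toy numbers are hypothetical, and this is not claimed to be the fitting procedure used by the authors.

```python
import math


def exponential_timescale(durations, observed):
    """Estimate a single timescale for an exponential survival model.

    durations: time each attribute spent in its state (same units throughout).
    observed:  True if the transition out of the state was seen,
               False if follow-up ended first (right-censored).
    Returns total observed time / number of observed transitions,
    the maximum-likelihood estimate of the mean time to transition.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have equal length")
    total_time = float(sum(durations))
    n_events = sum(1 for seen in observed if seen)
    if n_events == 0:
        # No transition was seen, so the data only bound the timescale from below.
        return math.inf
    return total_time / n_events


# Hypothetical toy data: two observed transitions and one censored record.
toy_timescale = exponential_timescale([2.0, 4.0, 6.0], [True, True, False])
```

Because the estimate is built from records spaced at the measurement interval, it cannot resolve timescales shorter than that interval, which is the caveat stated in the response.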

      Reviewer 2

      This study uses repeated measurements of the frailty index (FI), composed of multiple binary parameters. It is posited that newly detected changes in the number of these parameters represent damage and that the parameters that have previously been detected but are not detected currently represent damage repair. Statistical treatment then follows, deriving resilience and robustness and their changes over time. This is an interesting idea. Strengths of the study include analyses across species (mice and humans), including multiple datasets in mice.

      To be clear, our data analysis is on the binary health attributes that are used in the FI. By considering the damage/repair (binary transitions) of individual attributes, we can obtain the aggregate damage/repair rates.
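
The counting of binary transitions described above can be sketched as follows: a 0 to 1 change between consecutive measurements is tallied as damage and a 1 to 0 change as repair, summed over all attributes. This is a minimal sketch assuming one individual's data is stored as a time-by-attribute array of 0/1 values; the function name and the example array are hypothetical, and the authors' actual model additionally accounts for age, time intervals, and mortality.

```python
import numpy as np


def count_transitions(states):
    """Count damage (0 -> 1) and repair (1 -> 0) events for one individual.

    states: 2D array-like, rows are consecutive measurement times,
            columns are binary health attributes (0 healthy, 1 damaged).
    Returns (damage, repair, at_risk_damage, at_risk_repair), where the
    at-risk counts are the numbers of attribute-intervals that started
    in state 0 and in state 1, respectively.
    """
    states = np.asarray(states, dtype=int)
    before, after = states[:-1], states[1:]
    damage = int(np.sum((before == 0) & (after == 1)))
    repair = int(np.sum((before == 1) & (after == 0)))
    at_risk_damage = int(np.sum(before == 0))
    at_risk_repair = int(np.sum(before == 1))
    return damage, repair, at_risk_damage, at_risk_repair


# Hypothetical record: three visits, three attributes.
example = [
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
]
counts = count_transitions(example)
```

Dividing the event counts by the corresponding at-risk counts gives crude per-interval damage and repair fractions, which is the aggregate quantity the individual transitions feed into.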

      What are the elements of FI that increase at each period of life, and what are those that decrease? For example, humped phenotype or alopecia are more likely to appear in old mice and are essentially irreversible, whereas weight loss due to infection may be more common in young mice and is reversible. Therefore, the choice of health deficits would affect the model and, for example, may artificially lead to a decreased value of what the authors call damage repair.

      More generally, information on the frailty index lacks sufficient details. I doubt that this method has sufficient accuracy to draw conclusions from as few as 32 female mice (21 + 11 animals in datasets 1 and 2) and 63 males (13 + 6 + 44 animals in datasets 1, 2 and 3). Also, only 25 enalapril-treated mice of each sex were analyzed, and only 17 exercised mice (11 females and 6 males). The number of human participants is large, but the total follow-up period is not shown, and the subjects were assessed based on 23 parameters only.

      We have not examined other choices of health attributes. While we picked standard sets from available data, we do not know whether other attributes would behave differently. It would be difficult to do our detailed modelling on single attributes in the mouse data, since the data is so sparse. Our approach was developed specifically to be able to draw conclusions from limited mouse data. Where possible we aggregate the individual mice, sex, health attributes, studies, and measurement times. The analysis of human data shows that the approach generalizes.

      While we have mostly not studied individual attributes (we have considered survival times, but without age or time effects), we would expect that some of them may have behavior that qualitatively differs from our aggregate results. If attribute selection was biased towards (or away from) qualitatively distinct behaviors that would, of course, be reflected in aggregate results. We suspect that this would be unlikely, but that any such distinctive behavior would be interesting and important to identify and understand. We have added some discussion on this point, since we cannot exclude this possibility.

      A key assumption in this work is that increased FI is equivalent to the rise in damage. However, the relationship between changes in FI and damage is unknown. One can imagine a situation when damage increases, but protection also increases. In this case, fitness may increase, decrease or remain unchanged. What is the basis for calling an increased number of health deficits damage? Is there a more reliable method to measure damage that could support the authors' claims?

      See also discussion point #1 in essential revisions. We call binary states 0 “healthy” and 1 “damaged”, but we could instead say “more healthy for most individuals” and “less healthy for most individuals” – where “healthy” means associated with desirable (low FI and low mortality) health outcomes. We have not explored other measures of organismal damage. We have not explored how interactions between variables could affect resilience or robustness for individuals. We do not think that alternative approaches would be easy to study without much more data (for mice) that is more finely resolved in time (for mice and humans). We are quite happy to have found an approach to use with binarized data, but would welcome viable alternative approaches to compare with.

      Reviewer 3

      In this work, the authors aimed at investigating two related components of aging-related processes of health deficits accumulation in mice and humans: the processes of damage (representing the robustness of an organism) and repair (corresponding to resilience), and at determining how different interventions (the angiotensin-converting enzyme inhibitor enalapril and voluntary exercise) in mice and a representative measure of socio-economic status (household wealth) in humans affect the rates of damage and repair. Two key elements in this study allowed the authors to achieve their goals: 1) the use of relevant data containing repeated measurements of health deficits from which they were able to compute the cumulative indices of health deficits in mice and humans and which are also necessary to evaluate the processes of damage and repair; 2) the methodological approach that allowed them to formulate the concepts of damage and repair, model them and estimate from the available data. This methodological framework coupled with the data resulted in important findings about the contribution of the age-related decline in robustness and resilience in health deficits accumulation with age and the differential impact of interventions on the processes of damage and repair. This provides important insights into these key components of the process of aging and this research should be of interest to both lab researchers who plan experimental studies with laboratory animals to study potential mechanisms and interventions affecting health deficits accumulation as well as researchers working with human longitudinal studies who can apply this approach to further investigate the impact of different factors on robustness and resilience and their contribution to the overall health deterioration, onset of diseases and, eventually, death.

      The key strength of this work is a rigorous analytic approach that includes joint modeling of longitudinal measurements of health deficits and mortality (in mice). This approach avoids biased inference which would be observed if longitudinal data were analyzed alone, ignoring attrition due to mortality. Another strength is a comprehensive analysis of both laboratory animal data that allows exploring the impact of different interventions on the processes of damage and repair and human data that allows investigating disparities in these processes in individuals with different socioeconomic backgrounds (represented by household wealth).

      One weakness (which is commonplace for human studies) is self-reported data on health deficits in humans which makes it difficult to compare with lab data where deficits are assessed objectively by lab researchers. The subjective nature of health deficits measurements complicates the interpretation of findings, especially about repairs of deficits. In addition, it is not clear whether the availability/absence of caregivers at different exams/interviews factors into the answers on difficulty/not difficulty with specific activities constituting health deficits and, respectively, into their change over time reflected in damage/repair estimates.

      Variability of the evaluator is expected in any longitudinal study, and amounts to a variety of measurement error. The question of whether there are age-effects in the measurement error, such as bias or age-dependent variability, is interesting. For the mouse data, evaluator training is designed to minimize such errors and inter-evaluator differences are not large (Feridooni et al, 2015; Kane et al, 2017). For the human self-report data any such age-effects are unavoidable.

    1. Author Response

      Reviewer 1

      Sadeh and Clopath analyze two mouse datasets from the Allen Brain Atlas and show that sensory representations can have apparent representational drift that is entirely due to behavioral modulation. The analysis serves as a caution against over-interpreting shifts in the neural code. The analysis of data is coupled with careful modeling work that shows that the behavioral state reliably shifts sensory representations independently of stimulus modulation (rather than acting as a gain factor), and further show that it is reproducibly shifted when the behavioral state is adequately controlled for. The methods presented point towards a more careful consideration and measurement of behavioral states during sensory recordings, and a re-analysis of previous findings. The findings held up for both standard drifting grating stimuli as well as natural movies.

      The fact that neurons may have different tuning depending on the behavioral state of the animal raises obvious questions about readout. The authors show that neurons with strong behavioral shifts should simply be ignored and that this can be achieved if the downstream decoder weights inputs with more stimulus information. While questions remain about why behavior shifts representations and how that could be more effectively utilized by downstream circuits, the results presented clearly show that sensory representations might not always be simply drifting over time, and will spark some careful analysis of past and future experimental results.

      Many thanks for a clear summary of the work and emphasizing the significance of the results.

      Reviewer 2

      Studies from recent years have shown that neuronal responses to the same stimuli or behavior can gradually change with time - a phenomenon known as representational drift. Other recent studies have shown that changes in behavior can also modulate neuronal responses to a given sensory stimulus. In this manuscript, Sadeh and Clopath analyzed publicly available data from the Allen Institute to examine the relationship between animal behavioral variability and changes in neuronal representations. The paper is timely and certainly has the potential to be of interest to neuroscientists working in different fields. However, there are currently several important issues with the analysis of the data and their interpretations that the authors should address. We believe that after these concerns are addressed, this study will be an important contribution to the field.

      We really appreciate the time and the effort the reviewer(s) have taken to evaluate our results and analysis in detail. Their comments are very relevant and critical to the improvement of the manuscript. We explain below how we addressed their various comments and concerns.

      1. The manuscript raises a potential problem: while previous work suggested that the passage of time leads to gradual changes in neuronal responses, the causality structure is different: i.e., the passage of time leads to gradual changes in behavior, which in turn lead to gradual changes in neuronal responses. The authors conclude that "variable behavioral signal might be misinterpreted as representational drift". While this may be true, in its current form, the paper lacks critical analyses that would support such a claim. It is possible that both factors - time and behavior - have a unique contribution to changes in neuronal responses, or that only time elicits changes in neuronal responses (and behavior is just correlated with time). Thus, the authors should demonstrate that these changes cannot be explained solely by the passage of time and elucidate the unique contributions of behavior (and elapsed time) to changes in representations.

      This is a very important point and we addressed it with new analyses, by dedicating a new figure (Figure 1–figure supplement 5) and a new part of the Results section to it. The results of our new analyses show that strong representational drift mainly exists in those animals/sessions with large behavioral changes between the two blocks, and that in animals/sessions with small behavioral changes, such drift is minimal, despite the passage of time (see our responses below to Major comments for further details).

      1. There are also several issues with the analysis of the data and the presentation of the results. The most concerning is that the data shows a non-linear (and non-monotonic) relationship between behavioral changes and representational similarity. In many of the presented cases, the data points fall into two or more discrete clusters. This can lead to the false impression that there is a monotonic relationship between the two variables, even though there is no (or even opposite) relationship within each cluster. This is a crucial point since the clusters of data points most likely represent different blocks that were separated in time (or separation between within-block and across-block comparisons).

      This is an important concern. To address this, we analyzed the source of the non-monotonic relationship / opposite trend in the data and demonstrated the results in a new figure (Figure 4–figure supplement 2). Our results show that the non-monotonic relationship does not compromise the result of our previous analysis. Furthermore, it suggests that the non-monotonic / opposite trend is emerging as a result of more complex interactions between different aspects of behavior. We have also shown, in separate analyses, that the passage of time is not the main contributing factor to representational drift, rather large behavioral changes are correlated with strong drifts between the two blocks of presentation (Figure 1—figure supplement 5, and Figure 3—figure supplement 2).

      More generally, we did not intend to claim that the relationship with behavioral changes is linear and/or monotonic. We used linear analysis just to show the main trend of decrease in representational similarity with large behavioral changes. Any other analysis would have to assume some form of nonlinearity, but because the nonlinear relationships between behavior and activity were complex, it was not easy to choose one.

      We in fact tried to use two other ways of analysis, nonlinear correlations and generalized linear models (GLM), but there were issues hindering a proper use of each analysis. Nonlinear correlations assume a specific type of nonlinearity, but the nature of nonlinearity underlying the data is not clear (in fact, it looks to be different in different example non-monotonic trends in the data). We could not, therefore, assume a nonlinearity that best fitted all the data; we believe the nature of this nonlinearity, or how behavior modulates neuronal activity in a nonlinear manner, is in itself an interesting and open question for future investigation, but beyond the scope of this study. GLM did not provide useful results either, as the relationship between behavioral changes and neural activity/representational similarity was state-dependent and transitioning between nonlinear states, therefore hindering the usage of linear methods.

      We therefore opted for the simplest analysis which can show and quantify this dependence - emphasizing that further analyses are in fact needed to get to the bottom of the exact nonlinear relationship (for further details, see the responses below to Major comments).

      1. The authors also suggest that using measures of coding stability such as 'population-vector correlations' may be problematic for quantifying representational drift because it could be influenced by changes in the neuronal activity rates, which may be unrelated to the stimulus. We agree that it is important to carefully dissociate between the effects of behavior on changes in neuronal activity that are stimulus-dependent or independent, but we feel that the criticism raised by the authors ignores the findings of multiple previous papers, which (1) did not purely attribute the observed changes to the sensory component, and (2) did dissociate between stimulus-dependent changes (in the cells' tuning) and off-context/stimulus-independent changes (in the cells' activity rates).

      That’s a very valid point. As population vector correlations are used quite often in (experimental and theoretical) works on representational drift, we wanted to highlight the pitfalls of such a metric in dissociating between sensory-evoked and sensory-independent components. However, as the reviewers have mentioned, these two aspects have been separated and addressed independently in some of the past literature in the field. For instance, as we discussed in the Discussion, Deitch et al. (Current Biology, 2021) have calculated this for different metrics, including tuning curve correlations, which can potentially alleviate this problem:

      A recent analysis of similar datasets from the Allen Brain Observatory reported similar levels of representational drift within a day and over several days[5]. The study showed that tuning curve correlations between different repeats of the natural movies were much lower than population vector and ensemble rate correlations[5]; it would be interesting to see if, and to which extent, similarity of population vectors due to behavioural signal that we observed here may contribute to this difference.

      We tried to highlight these contributions better in the revised manuscript (see further on this below in our responses to Major comments).

      1. Another important issue relates to the interchangeable use of the terms 'representational drift' and 'representational similarity'. Representational similarity is a measure to identify changes in representations, and drift is one such change. This may confuse the reader and lead to the misconception that all changes in neuronal responses are representational drift.

      We thank the reviewer(s) for raising this point. We have clarified our use of the terms representational similarity and representational drift in the revised manuscript. Specifically, we have quantified representational drift index between the two blocks according to a previously used metric (RDI; Marks & Goard, 2021) in our new analysis (Figure 1–figure supplement 5).

      For the main part of the paper, however, we have decided to base our analysis on representational similarity (RS), and to evaluate the drop of RS with changes in behavior. Our reasoning for this is twofold. First, any measure of representational drift should ultimately be a function of the representational similarity. The measure we used above, for instance, is calculated as RDI = (RS_ws - RS_bs)/(RS_ws + RS_bs) (Marks & Goard, 2021), with RS_ws and RS_bs referring to the average representational similarity within a session and between different sessions, respectively. However, RS contains more information, especially with regard to fine-tuned changes - the above metric, for instance, averages all the changes within each block of presentation. By focusing on the basic function of representational similarity, we could capture both the gross changes between the blocks as well as more nuanced changes that can arise within them, especially with regard to behavioral changes. Second, an aspect that would have been lost by only using the usual metric of representational drift is the direction of change. In our analysis, we in fact found that the average RS increased within the second block of presentation, which might be contrary to the usual direction of drift. We found this unconventional change of RS interesting and informative too, and presenting the raw RS was a better strategy for highlighting it. For these reasons, we think representational similarity is a better metric to base our analyses upon, although we have now calculated a conventional representational drift index for comparison too.
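
The drift index quoted above, (RS_ws - RS_bs)/(RS_ws + RS_bs), can be sketched in a few lines. This is a minimal illustration assuming the within-session and between-session similarities are already available as lists of correlation values; the function name and the toy numbers are hypothetical and are not taken from the manuscript or its data.

```python
from statistics import fmean


def representational_drift_index(rs_within, rs_between):
    """Drift index from within- and between-session similarities.

    rs_within:  representational similarity values for repeat pairs
                taken from the same session (or block).
    rs_between: values for repeat pairs taken from different sessions.
    Returns (mean_ws - mean_bs) / (mean_ws + mean_bs).
    """
    ws = fmean(rs_within)
    bs = fmean(rs_between)
    if ws + bs == 0:
        raise ValueError("index is undefined when the two means sum to zero")
    return (ws - bs) / (ws + bs)


# Hypothetical values: similarity drops from 0.6 within to 0.2 between sessions.
rdi = representational_drift_index([0.6, 0.6], [0.2, 0.2])
```

Because both inputs are averaged before the ratio is taken, the index collapses each block to a single number, which is why the raw RS values retain information (such as changes within a block and their direction) that the index discards.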

      Reviewer 3

      Although it is increasingly realized that cortical neural representations are inherently unstable, the meaning of such "drift" can be difficult or impossible to interpret without knowing how the representations are being read out and used by the nervous system (i.e. how it contributes to what the experimental animal is actually doing now or in the future). Previous studies of representational drift have either ignored or explicitly rejected the contribution of what the animal is doing, mostly due to a lack of high-dimensional behavioural data. Here the authors use perhaps the most extensive open-source and rigorous neural data available to take a more detailed look at how behaviour affects cortical neural representations as they change over repeated presentations of the same visual stimuli.

      The authors apply a variety of analyses to the same two datasets, all of which convincingly point to behavioural measures having a large impact on changing neural representations. They also pit models against each other to address how behavioural and stimulus signals combine to influence representations, whether independently or through behaviour influencing the gain of stimuli. One analysis uses subsets of neurons to decode the stimulus, and the independent model correctly predicts the subset to use for better decoding. However, one caveat may be that the nervous system does not need to decode the stimulus from the cortex independently of behaviour; if necessary, this could be done elsewhere in the nervous system with a parallel stream of visual information.

      Overall the authors' claims are well-supported and this study should lead to a re-assessment of the concept of "representational drift". Nonetheless, a weakness of all analyses presented here is that they are all based on data in head-fixed mice that were passively viewing visual stimuli, such that it is unclear what relevance the behaviour has. Furthermore, the behavioural measurements available in the open-source dataset (pupil movements and running speed) are still a very low dimensional representation of what the mice were actually doing (e.g. detailed kinematics of all body movements and autonomic outputs). Thus, although the authors here as well as other large-scale neural recording studies in the past decade or so make it clear that relatively basic measures of behaviour can dramatically affect cortical representations of the outside world, the extent to which any cortical coding might be considered purely sensory remains an important question. Moreover, it is possible that lower-dimensional signals are overly represented in visual areas, and that in other areas of the cortex (e.g. somatosensory for proprioception), the line between behaviour parameters and sensory processing is blurred.

      Many thanks for the clear and insightful summary of the results, significance and caveats of our analysis. We totally agree with this critical evaluation - and suggestions for future work.

    1. Author Response

      Reviewer 2

      In the manuscript, the cellular deformation that is due to the shear stress generated in a classical microfluidic channel is used to deform detached cells that are moving in the flow. A very elegant point of the paper is that the same cells are used in the provided software to determine the fluid flow, which is a key parameter of the method. This matters because an independent way to crosscheck the fluid flow against the expected values is important for the reliability of the method. Instead of the complicated shape analyses that are required in other microfluidic methods, here the authors simply use the elongation of the cell and the orientation angle with respect to the fluid flow direction. The nice thing here is that a well-known theory from R. Roscoe can be successfully used to relate these quantities to the viscoelastic shear modulus. Thanks to the knowledge of the fluid flow profile, the mechanical properties can be related to the tank treading frequency of the cells, which in turn depends on the position in the channel, and the flow speed. Hence, after knowing the flow profile, which can be determined with a sufficiently fast camera, and the actual static cell shape, it is possible to obtain frequency dependent information. Assuming then that cells do have a statistically accessible mean viscoelastic property, the massive and quick data acquisition can be used to get the shear modulus over a large span of frequencies.

      The very impressive strength of the paper is that it opens the door for basically any non-specialized cell biology lab to perform measurements of the viscoelastic properties of typically used cell types in solution. This makes it possible to include global mechanical properties in any future analysis, and I am convinced that this method can become a main tool for a rapid viscoelastic characterization of cell types and cell treatment.

      Although it is both elegant and versatile, there remain a couple of important questions open to be further studied before the method is as reliable as it is suggested by the authors. A main problem is that the model and the data simply don't really work together. This is most prominent in Figure 3a. This is explained by the authors as a result of non-linear stress stiffening. Surely this is a possible explanation, but the fact that the question is not fully answered in the paper makes the whole method seem not sufficiently backed. I agree that the tests with the elastic beads are beautiful, but also here the results obtained with the microfluidic method and the AFM seem not to match sufficiently to simply use the proposed model in conjunction with a single power law approach to fully translate the single frequency data into a frequency dependent plot. There are more and more hints that two power law models are more reasonable to describe cell mechanics. If true this would abolish the approach to exploit only a single image to get the mechanical power law exponent and the prefactor in a single image. Despite all the excitement about the method, I have the feeling that the used models are stretched to their extreme, and the fact that the only real crosscheck (figure 3a) does not work for the power law exponent reinforces this impression.

      We had assumed that the probing frequency equals the tank treading frequency. This is incorrect. As the cell undergoes a full rotation, any given volume element inside the cell is compressed twice and elongated twice. Hence, the frequency with which the cell is probed is twice the tank-treading frequency. This correction shifts the G’ and G” versus frequency curves to the right (by a factor of two), and in addition, the G” data points are shifted (increased) by a factor of two (Eq. 17). This also increases the fluidity alpha (and hence the slope of the power-law relationship) roughly by a factor of two (Eq. 22), and since the actual slope of the G’ and G” versus frequency data “cloud” is unchanged by the correction, the single power-law description now describes the data much better (see new Fig. 3a).

      Regarding the critique that models are stretched to their extreme: The Roscoe model assumes that cells behave as the visco-elastic continuum-mechanics equivalent of a Kelvin-Voigt body consisting of an elastic spring in parallel with a resistive (or viscous) dash-pot element. This then gives rise to a complex shear modulus with storage modulus G’ and loss modulus G”, measured at twice the tank treading frequency 𝜔. Roscoe makes no assumptions whatsoever about how G’ and G” might change as a function of frequency. Hence, our “raw” G’ and G” data, e.g. in Fig. 3a, are obtained without any power law assumption.

      One could leave it at that, as the reviewer suggests below, and only present the raw G’ and G” vs. frequency plots. However, this would also make it nearly impossible to compare our measurements to those obtained with other techniques that operate at different, non-overlapping time- or frequency-scales. For such a comparison to work, one needs a model to predict how G’ and G” scale with frequency.

      A commonly used and very simple model to predict how G’ and G” scale with frequency, which is also the model used by Fregin et al. and many others, is that of a Kelvin-Voigt body consisting of an elastic spring in parallel with a resistive element (dash-pot), both with a frequency-independent stiffness and resistance (viscosity), respectively. However, our data show that G’ and G” of different cells, all measured at different tank-treading frequencies, exhibit a behavior that is very unlike that of a simple Kelvin-Voigt body with a constant, frequency-independent stiffness and resistance. For such a body, G’ would be flat (power law exponent of zero), and G” would increase in proportion to frequency (power law exponent of unity). This is clearly not what our data show.
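      To make this contrast concrete, the following minimal sketch evaluates a Kelvin-Voigt body at two frequencies and reports the log-log slopes of G’ and G”. The stiffness and viscosity values and the function names are hypothetical, chosen only for illustration; they are not taken from the measurements.

```python
import math


def kelvin_voigt_moduli(omega, stiffness, viscosity):
    """Return (G', G'') of a Kelvin-Voigt body at angular frequency omega.

    G' is the frequency-independent spring stiffness,
    G'' is the dash-pot viscosity times the frequency.
    """
    return stiffness, viscosity * omega


def loglog_slope(func, x1, x2):
    """Power-law exponent of func estimated between x1 and x2."""
    return (math.log(func(x2)) - math.log(func(x1))) / (math.log(x2) - math.log(x1))


# Hypothetical parameters, for illustration only.
stiffness = 100.0  # Pa
viscosity = 2.0    # Pa*s

slope_storage = loglog_slope(
    lambda w: kelvin_voigt_moduli(w, stiffness, viscosity)[0], 1.0, 10.0
)
slope_loss = loglog_slope(
    lambda w: kelvin_voigt_moduli(w, stiffness, viscosity)[1], 1.0, 10.0
)

print(f"exponent of G': {slope_storage:.3f}")
print(f"exponent of G'': {slope_loss:.3f}")
```

      A weak power law with the same exponent for both moduli, as described in the next paragraph, would instead give two equal slopes between zero and one.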

      Rather, we find that G’ and G” increase with increasing frequency according to a power law, with the same exponent 𝛼 for G’ and G”. At high frequencies (beyond the range of our microfluidic method, but in the range of our AFM measurements), G” increases more strongly with frequency, akin to a Newtonian viscosity (power law exponent of unity), which we take into account in the case of the AFM measurements. A large number of publications have shown that many types of cells, including cells in suspension, follow power law rheology, regardless of the measurement method. Also the AFM measurements that we include in this study support the validity of power-law rheology.

      Power law rheology predicts a peculiar behavior: The ratio of G”/G’ in the low-frequency regime (where the high-frequency viscous term is not yet dominating) must be equal to tan(𝛼𝜋/2), for mathematical reasons (Eq. 22). With our correction (that the probing frequency is twice the tank-treading frequency), we find that Eq. 22 correctly predicts the power-law exponent of the G’ and G” vs. frequency data.

      Note that we actually do not fit a power law model (Eq. 1) to the population data of G’ and G” vs. frequency in Fig. 3a. The G’ and G” data are obtained by applying Roscoe-theory, without any further assumptions such as power-law rheology. Only the lines shown in Fig. 3a that go nicely through the data are a prediction of how a typical cell (selected from the mode of the joint probability density of alpha and k, see Fig. 3b) would behave if we had measured it at different frequencies, under the assumption that this cell follows power law rheology, based on Eq. 22. With this assumption, we can directly convert the measured G’ and G” of any cell into a stiffness k and power law exponent 𝛼 using Eqs. 21 and 22 - no fit is needed here.
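      The conversion from a single (G’, G”) pair to a power-law exponent can be sketched as follows. Only the relation G”/G’ = tan(𝛼𝜋/2) is taken from the text above. The forward model used for the round-trip check, G* = k (i𝜔/𝜔0)^𝛼, is an assumed simplified form that omits any prefactor Eq. 1 may contain, so the recovered k is illustrative and is not a substitute for Eq. 21. All function names are hypothetical.

```python
import math


def fluidity_from_moduli(g_storage, g_loss):
    """Exponent alpha such that G''/G' = tan(alpha * pi / 2)."""
    return 2.0 / math.pi * math.atan2(g_loss, g_storage)


def power_law_moduli(k, alpha, omega, omega0=1.0):
    """ASSUMED form G* = k * (i*omega/omega0)**alpha (no extra prefactor).

    Returns (G', G''), the real and imaginary parts of G*.
    """
    magnitude = k * (omega / omega0) ** alpha
    phase = alpha * math.pi / 2.0
    return magnitude * math.cos(phase), magnitude * math.sin(phase)


# Round trip with hypothetical values: one frequency, two measured numbers,
# two recovered parameters, no fitting involved.
omega = 4.0
g_storage, g_loss = power_law_moduli(k=50.0, alpha=0.3, omega=omega)
alpha_recovered = fluidity_from_moduli(g_storage, g_loss)
k_recovered = math.hypot(g_storage, g_loss) / omega ** alpha_recovered

print(f"alpha = {alpha_recovered:.3f}, k = {k_recovered:.3f}")
```

      A purely elastic material (G” = 0) gives an exponent of zero and a purely viscous one (G’ = 0) gives unity, which matches the limiting cases discussed above.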

      Since we only measure two parameters for any given cell at twice its tank-treading frequency, namely strain and alignment angle, we can only extract two parameters for each cell (i.e., G’ and G”, or k and alpha) but not a third parameter. In essence, the reviewer expresses concerns that the G' and G" behavior of a typical cell, when extrapolated to higher or lower frequencies, may not necessarily match the frequency behavior of the entire cell population (Fig. 3a). However, our data show that a single (typical) cell that was measured at a single mid-range frequency comes remarkably close to describing the G’ and G” versus frequency behavior of all other cells.

      The reviewer suggests that a power law model with two exponents may be able to even more accurately describe the mechanics of the cell population. This is certainly correct, and in particular when cell mechanics is measured over a larger range of frequencies or strain rates, as we have done here using AFM, we find that at higher frequencies, G” deviates from a weak power law and merges into a different power law with a larger slope (i.e., power law exponent) that approaches unity or a value close to unity, akin to a Newtonian viscous term. Therefore, the single power law expression (Eq. 1) is not sufficient for the AFM data, and we use Eq. 2 instead. However, in the case of our shear stress cytometry measurements, the tank-treading frequency remains below the range where this second power law behavior becomes prominent. Therefore, the Newtonian viscosity term of Eq. 2 cannot be fitted with reasonable fidelity to the data from a single measurement.

      In the case of polyacrylamide beads, we start to see a hint of an upward trend in G” versus frequency at tank-treading frequencies of around 10 Hz, and therefore have performed a global fit with Eq. 2 to the shear flow data where we keep the Newtonian viscous term constant for all conditions (different shear stresses and bead stiffnesses).

      The reviewer furthermore cautioned that mechanical non-linearities such as strain stiffening may distort or otherwise bias the results. As the reviewer brings up this issue in more detail below, we have addressed it there.

      Regarding the concern that “results obtained with the microfluidic method and the AFM seem not to match sufficiently to simply use the proposed model in conjecture with a single power law approach to fully translate the single frequency data into a frequency dependent plot.”:

      First, we tend to agree more with the opinion of Reviewer #1, who found it remarkable that results obtained with the microfluidic method and the AFM method are actually fairly similar. Now that we have introduced the correction that the probing frequency is twice the tank-treading frequency, the cells in suspension turn out to be softer and more fluid-like compared to the cells measured with AFM. But there are many more commonalities between the AFM data and the shear flow data, which we list above in our reply to Reviewer #1; the most relevant here is that cells show power-law behavior both when measured with AFM and with our new method.

      Second, we did not use a single power law to fit the AFM data. Rather, we used Eq. 2, which contains two power law relationships (the second power law exponent of unity for the Newtonian viscosity term is usually not explicitly written). However, the Newtonian viscosity term arises mainly from the hydrodynamic drag of the cantilever with the surrounding liquid, and less so from the cells. This hydrodynamic drag is absent in our shear flow deformation cytometry method, and moreover the tank treading frequency of most cells remains far below 10 Hz, where an additional Newtonian viscosity term does not yet come into play.

      Third, we disagree that Fig. 3a is “the only real crosscheck for the power law exponent”. The inverse relation that we see between the power law exponent and the stiffness of individual cells (Fig. 3b) has been previously reported for different cell types and methods. Moreover, we find a power law exponent close to zero for PAA beads at small strain values, which is to be expected for a predominantly elastic material such as PAA. We think that this last result is a particularly convincing experimental cross-check.

  5. Jul 2022
    1. Author Response

      Reviewer #1 (Public Review):

      My primary criticism of this paper is that it misses the opportunity to give some key details about the statistics of neural activity during 'ripples' rather than studying identified replay events. A secondary criticism is that they limit their analyses to neurons that have place fields in both environments. I think the activity of the other 3 categories of neurons (active in Track 1 only, active in Track 2 only, and not active in either track) are also of critical interest.

      We agree with the reviewer that it is important to demonstrate that the main observations are not due to a small subset of neurons or replay events. We have described above the inclusion of Figure 1-figure supplement 6, where the threshold for replay detection is made less stringent, and the ratio of significant replay events to candidate replay events is now reported in the manuscript. To address the concern that the analysis is limited to neurons with place fields on both tracks, we have added four more subpanels to Figure 1-figure supplement 6, where we perform our regression analysis on all spatially tuned (pyramidal) neurons (Figure 1-figure supplement 6E), neurons with place fields on only one track (track 1 and track 2 neurons will be in the upper right and lower left quadrant of the plot, respectively; Figure 1-figure supplement 6F), neurons with peak amplitude <1 Hz on each track (Figure 1-figure supplement 6G) and, finally, interneurons (Figure 1-figure supplement 6H). Consistent with our previous findings, we observe significant regressions for POST replay events for all spatially tuned neurons and for neurons with place fields on only one track. Conversely, neurons that were not active on either track and interneurons are not rate modulated by experience during replay.

      It is important to note that replay detection uses all spatially tuned cells, but the regression analysis is limited to cells active on both tracks in the main analysis. The reason for this is now explained in more detail in the revised manuscript (page 5):

      “It is important to note that a significant regression would be expected when analyzing neurons with a place field only on one track, as they are expected to participate in replay events of this track, while being silent during the replay of the other track. As such, our regression analysis only analyzed place cells active on both tracks and stable across the whole run (Figure 1-figure supplement 1B and see Methods).”

      Reviewer #2 (Public Review):

      This study by Tirole et al. addresses to what extent differences in firing rate that occur during the awake experience of two different tracks are replayed during SWRs.

      In principle, this is a topic broadly relevant to our understanding of the circuit-level mechanisms and neural coding of memory, because it can provide insight into the ways in which experience is transformed into memory traces, and in particular, whether an entire coding modality (firing rate patterns) is available for replay. However, I didn't have an easy time situating this study in the context of the existing literature. When I first read the title, I expected this work was going to address the question of if there is replay of rate-remapped experiences, which is still an understudied topic (but see Takahashi, 2015) and would be important to examine. But once I realized that the two experiences here are actually more like global remapping, it was less clear to me what is novel here.

      My best guess about what's novel is that even though on the one hand, many studies have shown a distinguishable replay of two (or more) distinct experiences, e.g. different mazes like in Karlsson et al. 2009, different arms of a T-maze in Gupta et al. 2010, the overlapping central stem element of different trajectories in various mazes (Takahashi, 2015 and work from the Jadhav lab). On the other hand, there have been extremely detailed examinations of the contributions of firing rate changes (as distinct from temporal order or synchrony) as in Farooq et al. 2019. But perhaps the authors think that the intersection of those two kinds of work has not been studied, that is, how much do firing rate changes specifically contribute to the replay of two distinct experiences? In any case, regardless of whether I understood that correctly or not, the authors need to be more explicit in the introduction and discussion in contextualizing their work. I also suspect that the current findings are a direct logical consequence of putting together these well-established previous results; this would not mean the current work isn't a useful advance, but it would moderate the novelty and general interest.

      Beyond this overall question of how the work relates to the extant literature, I have a suggested modification to the data analysis. I think that the quality of the data and the care taken in the analyses were very high in general, so I do not have any major concerns, and the conclusions are very thoroughly supported. However, I wonder if there is a way to simplify some of the analyses and make them a bit more straightforward to interpret. As the authors have realized, there is potential for a circularity in the analysis, in the sense that to compare firing rate differences for two tracks between Track and Replay, Replay events first need to be assigned to one or the other (decoded) Track. But then any firing rate differences may be contributing to the output of the decoder, rendering the analysis circular. I understand the authors use various methods like the firing-rate-insensitive method in Figure 2 to deal with this crucial issue. But wouldn't a simpler way be to leave the cell whose firing rates are being analyzed out of the decoding step, so that the labeling of Replay events is independent of that cell? This seems an intuitive and rigorous way to address the central question the authors have. Is there some reason why that isn't done?

      We thank the reviewer for this feedback, and agree it is important to emphasize the novel contributions of the manuscript (as we see them), and to clarify this further if needed. The reviewer is correct that there are several studies that have looked at rate remapping during reactivation. We have cited some of these, but have now updated our citations in the introduction and discussion based on the comments here. While we avoided directly criticizing any particular study in our earlier draft of the manuscript, these previous studies are generally affected by several issues: 1) Replay detection methods were sensitive to rate modulation, creating a circular argument for the existence of rate modulation in replay [our study thoroughly addresses this with several controls]. 2) Reactivations rather than replay were analyzed, which lacks the statistical rigor of sequence detection [we have focused on replay using a strict threshold for significance]. 3) Replay/reactivations were analyzed for a single environment, making it difficult to distinguish between rate modulation and changes in the overall excitability levels of neurons maintained over behavior and sleep [our study uses two tracks to avoid this potential issue]. 4) When multiple contexts were decoded, neurons that only fired in one context were not removed from the analysis, artificially “inflating” any observed rate modulation [we have circumvented this issue by only analyzing neurons with place fields in both environments].

      The suggestion to repeat the analysis and leave one neuron out for replay detection is excellent. However, we avoided this because of the required processing time: running our complete analysis takes more than a week, and repeating it for each possible “leave-one-out” combination would take significantly longer (this has to be done independently for each neuron). We used multiple controls (track rate shuffle, replay rate shuffle, rank-order correlation; Figure 2, Figure 2-figure supplement 2) to eliminate any possibility that a neuron’s firing rate could influence replay detection. Specifically, for rank-order correlation based replay detection, each burst of spikes is only treated as a single event (median of spike times in the burst), which directly circumvents the problem of firing rate biasing replay event selection.
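      The reason a median-based rank-order score is insensitive to firing rate can be illustrated with a minimal sketch. It assumes one burst per cell per candidate event, no tied ranks, and a hand-written Spearman correlation; the function names, spike times and place-field positions are hypothetical and this is not the analysis code used in the manuscript.

```python
from statistics import median


def ranks(values):
    """Rank of each value (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two cells")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def rank_order_score(spikes_by_cell, field_peak_by_cell):
    """Correlate the median spike time of each active cell with its field position."""
    cells = [c for c in spikes_by_cell if spikes_by_cell[c]]
    burst_times = [median(spikes_by_cell[c]) for c in cells]
    field_peaks = [field_peak_by_cell[c] for c in cells]
    return spearman(burst_times, field_peaks)


# Hypothetical place-field peaks (cm) and two events that differ only in
# how many spikes cell "a" fires.
peaks = {"a": 10.0, "b": 50.0, "c": 90.0}
event_low_rate = {"a": [0.010], "b": [0.020], "c": [0.030]}
event_high_rate = {"a": [0.008, 0.010, 0.012], "b": [0.020], "c": [0.030]}

score_low = rank_order_score(event_low_rate, peaks)
score_high = rank_order_score(event_high_rate, peaks)
print(score_low, score_high)
```

      Because each burst collapses to a single time point, tripling the spike count of one cell leaves the score unchanged in this example.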

    1. Author Response

      Reviewer 1

      In general, I consider that the manuscript reflects a huge effort in terms of work done and data collection, the manuscript is very well written, and it brings new knowledge in terms of cooperative breeding and its connection with group size in ostriches. My major concerns are about the title and introduction, which are in my opinion too broad and insufficiently detailed.

      In the introduction the scientific background that led to this research is lacking, and the manuscript would benefit from a more supported introduction, which makes it difficult to understand how far this study went comparatively to previous studies. The research work was well conducted, and adjusted to the study aims. However, it would benefit from including more details on the observational data collected by the authors.

      I think the research topic is interesting, and the study was well performed, but the manuscript would benefit from a more clear approach to the working hypothesis, expected results and background theories/hypotheses.

      We are very grateful for the positive and constructive feedback. The title and introduction have been revised according to the reviewer’s suggestions. We provide a more extensive introduction to the hypotheses being tested, which are now explicitly stated. The observational data we collected have been described in more detail and we integrate our observational and experimental data more thoroughly.

      In the evaluation summary, the reviewer highlights that we did not address some aspects of groups, such as relatedness and parentage. We have now added additional analyses to show these do not change the conclusions of our study (for details please see responses to reviewer 2 who raises similar concerns more extensively). These were not originally included in the manuscript as the aim of our study was to examine how group size and composition influence the average reproductive success for any given individual, irrespective of variation in relatedness and parentage within groups.

      Reviewer 2

      This work sets out to investigate experimentally the effect of differences in group size and group composition on reproductive behavior and success in ostrich groups. Direct field observations of the relationship between group composition/group size and reproductive success, do not allow for causal inference, as there may be several reasons why patterns may arise. For example, observing individuals having a higher reproductive success in larger groups than in smaller groups may not be a direct result of a larger group size per se, but it may be that higher quality individuals manage to establish themselves more often in larger groups. Hence, experimental manipulation of group size and group condition in natural contexts is important. 96 experimental groups of ostriches were established in fenced off areas in the Karoo in South Africa, varying the number of males (1 / 3) and the number of females (1 / 3 / 4 / 6) across groups. Groups were followed for almost a year, studying a period without parental care (eggs were removed and incubated in an incubator to measure reproductive success) and a period with parental care (eggs were left in the enclosures).

      In the latter case, behavioral observations were done to study nest incubation, and sexual conflict (interruptions of incubation). The study was done for seven years, and having such data on experimental manipulations in semi-wild conditions is very valuable. The combination of behavioral analysis, with careful tracking of the fate of eggs (by daily nest checks), the experimental nature, and measuring reproductive success make for a very complete analysis of the breeding ecology of this system and can serve as a blueprint for more of such work in the fields of cooperation, group living and breeding ecology.

      Some aspects, however, deserve more attention. First, at present, the origin and familiarity and possible relatedness among the group members of the experimentally composed groups is not discussed, and it may be that these factors play a role in shaping the results. Second, the reproductive measure used was the average number of chicks per sex, but it was not calculated at the individual level. No genetic analyses were done to establish which individuals were actually successful in terms of reproduction. Since individual level selection is likely very important in this system, the results of average reproductive success need to be interpreted with great care. Third, the study was done under semi-natural conditions, meaning that the effects of other factors possibly shaping the success of group size and group composition in the wild (e.g., possible nest predation) were weakened. Finally, a closer connection between the experimental results on optimal group size, and whether this can actually be found in the dataset on natural variation in group size and group composition, can be explored.

      We are very grateful for the careful review of our work and positive feedback. The suggestions and comments have been extremely helpful in revising the manuscript, which have led to the following changes:

      1) We have added details about the origin and familiarity of group members, together with extra analyses verifying that our results are not confounded by variation in within-group relatedness. The study population has a nine-generation pedigree allowing us to accurately estimate relatedness between individuals. In the design phase of the experiment, relatedness amongst individuals was kept low in accordance with data from natural populations, but there were related individuals of the same sex in some groups. We tested if the average relatedness within groups influenced the average number of chicks individuals produced and found no significant relationship (Supplementary file 1 – Tables S16 and S17).

      2) We have included genotyping analyses of 3227 offspring to verify that our non-genetic estimates of average reproductive success per sex (total chicks produced by groups / number of same sex individuals) accurately reflect measures obtained using genetic estimates of individual reproductive success. Genetic and non-genetic measures were highly correlated (R >0.95). We have added these verification analyses to the manuscript. The text has also been edited to further clarify that our aim is to estimate the average reproductive benefits for any given individual of being in a group of a particular size, rather than examining differences in reproductive success between individuals within groups, for which genetic methods are required.

      3) We have clarified the advantages and limitations of experimental studies. As reviewer 2 highlights, observational studies alone do not provide causal insight into the factors influencing group size, but as reviewer 1 indicates, experimental studies can lack ecological context. Consequently, both have their merits. Experimental manipulations of entire social groups are currently lacking on large vertebrate cooperative breeders, but can be used to estimate the costs and benefits of living in different group sizes that arise independently of ecological conditions. The results of such experimental studies can be used as a benchmark against which other data can be compared, such as observational data on wild groups subject to ecological pressures, including nest predation. The discrepancies between experimental and observational data can then be used to infer the relative importance of social versus ecological factors in shaping social groups.

      4) We have added a figure (Figure 1 - figure supplement 1) and extended the discussion to better connect our experimental data with our observations of natural variation in group size.
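      The non-genetic measure described in point 2 (total chicks produced by a group divided by the number of same-sex individuals) can be written as a short sketch. The function name and the example counts are hypothetical and serve only to show the arithmetic.

```python
def average_success_per_sex(total_chicks, n_males, n_females):
    """Return (chicks per male, chicks per female) for one group.

    Every chick is attributed to the group as a whole, so the measure is an
    average per individual of each sex, not an individual-level estimate.
    """
    if n_males <= 0 or n_females <= 0:
        raise ValueError("a breeding group needs at least one male and one female")
    return total_chicks / n_males, total_chicks / n_females


# Hypothetical group with 3 males and 4 females that produced 12 chicks.
per_male, per_female = average_success_per_sex(total_chicks=12, n_males=3, n_females=4)
print(per_male, per_female)
```

      Differences in success between individuals of the same sex within a group are invisible to this measure, which is why genetic parentage data are needed for that question.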

    1. Author Response

      Reviewer 1

      This manuscript attempts to explain the well-known difference in DNA mutation rates between father vs. mother (paternal mutation is 4 times higher than maternal mutation in humans). Although the mutation rate difference was believed to arise from the number of cell divisions (male germ cells undergo many more divisions compared to female germ cells), recent studies suggested that most mutations arise from DNA damage (which will be proportional to the absolute time) rather than DNA replication-induced mutations (which will be proportional to the number of cell divisions). The authors thus revisited the question as to why the paternal mutation rate is higher (if absolute time is more important than the number of cell divisions in causing mutations). They used 'taxonomic approaches' comparing paternal/maternal mutation rates of mammals, birds, and reptiles, correlating them to specifics of reproductive mode in these species. To measure paternal vs. maternal mutation rate, they compared the mutation rates of neutrally evolving DNA sequences between the X chromosome vs. autosomes, as well as the Z chromosome vs. autosomes (utilizing the fact that the X chromosome will spend twice as many generations in females as in males, while autosomes spend equal time in each sex. Likewise, the Z chromosome will spend twice as much time in males as in females, while autosomes spend equal time in each).
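      The logic of the X (Z) versus autosome comparison can be illustrated with the classic expectation for the ratio of neutral substitution rates. The sketch assumes that sex chromosomes and autosomes differ only in the fraction of time spent in each sex (X: 2/3 in females; Z: 2/3 in males; autosomes: 1/2 in each). It is the textbook relationship, not the regression pipeline used in the manuscript, and the function names are hypothetical.

```python
def x_to_autosome_ratio(alpha):
    """Expected X/A substitution-rate ratio for male-to-female ratio alpha."""
    return 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha))


def z_to_autosome_ratio(alpha):
    """Expected Z/A substitution-rate ratio for male-to-female ratio alpha."""
    return 2.0 * (1.0 + 2.0 * alpha) / (3.0 * (1.0 + alpha))


def alpha_from_x_ratio(ratio):
    """Invert the X/A expectation; defined for 2/3 < ratio <= 4/3."""
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


def alpha_from_z_ratio(ratio):
    """Invert the Z/A expectation; defined for 2/3 <= ratio < 4/3."""
    return (3.0 * ratio - 2.0) / (4.0 - 3.0 * ratio)


# With a fourfold paternal bias, the X evolves more slowly than autosomes
# and the Z evolves faster.
x_ratio = x_to_autosome_ratio(4.0)
z_ratio = z_to_autosome_ratio(4.0)
print(x_ratio, z_ratio)
print(alpha_from_x_ratio(x_ratio), alpha_from_z_ratio(z_ratio))
```

      With no sex bias (alpha of 1) both ratios equal one, so any departure from one carries the signal, provided the assumption above holds.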

      They first confirm the paternal bias across a broad range of species (amniotes), eliminating many species-specific parameters (longevity, sex chromosome karyotype (XY vs. ZW), etc.) as contributors to the paternal bias. This implies that something common to males across this broad range of species causes paternal bias. They show that in mammals, the paternal bias correlates with generation time. They propose that the total mutation is determined by the combination of the mutation rate during early embryogenesis (when both male and female have the same mutation rate) and the later mutation rate when the two sexes exhibit different mutation rates. This model seems to explain why generation time correlates well with the extent of paternal bias in mammals. However, this does not explain at all why birds do not exhibit any correlation with generation time. The speculation on this feels rather weak (although there is nothing they can do about this. Fact is fact).

      The logic behind their analysis is well laid out and seems mostly sound. Their finding is of broad interest in the field.

      • I am confused by this statement (the last sentence in the result section): “If indeed the developmental window when both sexes have a similar mutation rate is short in birds then, under our model, generation times are expected to have little to no influence on α.” Based on their model, if the early period, when the mutation rates are similar between the sexes, is gone, it intuitively feels that generation time should influence α even more. Am I missing something? (If the period with the same mutation rate is gone, then females and males are mutating at different rates the whole time.)

      We apologize for the lack of clarity, as we should have made clear that here we are assuming a fixed ratio of paternal to maternal generation times. Under that assumption, if female and male germ cells are accumulating mutations at a fixed rate over time, then for each sex, the number of mutations accumulated with time is a line that goes through the origin, and the ratio of the paternal-to-maternal slopes (α) will be constant regardless of the age of reproduction. In other words, if Me=0 in equation 1, then α would be constant for any fixed ratio Gm/Gf. We have revised this sentence to be clearer; lines 334-338 now read:

      If indeed the mutation rate in the two bird sexes differs from very early on in development (i.e., if term Me ≈ 0 in equation 1), then assuming a fixed ratio of paternal-to-maternal generation times, our model predicts the sex-averaged age of reproduction will have little to no influence on α.
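      This argument can be checked numerically with a minimal sketch. Equation 1 is not reproduced here, so the code assumes a simplified two-phase form in which each sex accumulates a shared early contribution (Me) plus a sex-specific rate multiplied by that sex's generation time. The rates and generation times are hypothetical.

```python
def alpha_two_phase(early, rate_male, rate_female, gen_male, gen_female):
    """Paternal-to-maternal mutation ratio under an ASSUMED two-phase model.

    early: mutations accrued in the early phase, identical in both sexes (Me)
    rate_*: per-year mutation rate after the sexes diverge
    gen_*: generation time of each sex in years (Gm, Gf)
    """
    paternal = early + rate_male * gen_male
    maternal = early + rate_female * gen_female
    return paternal / maternal


# Hypothetical rates with a fourfold male excess and equal generation times.
rate_male, rate_female = 2.0, 0.5

# Me = 0: both sexes follow lines through the origin, so alpha is constant.
alpha_no_early_short = alpha_two_phase(0.0, rate_male, rate_female, 5.0, 5.0)
alpha_no_early_long = alpha_two_phase(0.0, rate_male, rate_female, 30.0, 30.0)

# Me > 0: the shared early phase dilutes the bias more in short generations.
alpha_early_short = alpha_two_phase(5.0, rate_male, rate_female, 5.0, 5.0)
alpha_early_long = alpha_two_phase(5.0, rate_male, rate_female, 30.0, 30.0)

print(alpha_no_early_short, alpha_no_early_long)
print(alpha_early_short, alpha_early_long)
```

      In this sketch the ratio stays at the ratio of the two rates when Me is zero, and rises with generation time toward that value when Me is positive.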

      • The authors state that this paper provides a simple explanation as to why paternal biases arise without relying on the number of cell divisions. However, it seems to me that the entire paper relies on the recent findings that mutation arises based on absolute time (instead of cell division number), and the novelty in this paper is the idea of 'two-phase mutation rates' to explain the observed numbers of paternal bias in various species. Yet it fails to explain the mutation rate difference in birds. There is not enough speculation or explanation as to what determines different mutation rates in males of various species. Although the modeling seems to be sound and there is nothing that can be done experimentally, I felt somewhat unsatisfied at the end of the manuscript.

      We agree with the reviewer that our paper does not address why the ratio of paternal-to-maternal mutation rates is lower in birds than mammals, and had stated so explicitly (lines 358-360): “Another question raised by our findings is why, after sexual differentiation of the germline, mutation appears to be more paternally-biased in mammals (∼4:1) than in birds and snakes (∼2:1).”

      To try to gain more insight into this question, we are now analyzing mutations in a set of three generation pedigrees from birds and reptiles, which should allow us to obtain a direct estimate of α and characterize sex differences in the mutation spectra, which we can then compare to what is seen in mammals. While this analysis is beyond the scope of this manuscript, we now note how this question might be pursued (lines 360-362):

      In that regard, it will be of interest to collect pedigree data from these taxa, with which to compare mutation signatures to those typically seen in mammals.

      Reviewer 2

      The primary goal of this paper is to re-assess the cause for the excess of male over female germline mutations seen in many animals. By re-analyzing X (Z) and autosomal substitution rates across 42 species of mammals, birds, and snakes, and fitting a model that allows for a constant and equal-sex embryonic mutation rate, along with a mutation rate that increases with age, the authors show that there is no need to invoke the model that assumes mutation rate depends strictly on numbers of cell divisions.

      Strengths

      1. The paper challenges a dogma in evolutionary genomics, which states that males have a higher germline mutation rate than females. It establishes convincingly that the count of premeiotic mitotic divisions is NOT the primary driver of the excess male mutations, but instead, it is the intrinsic mutation rate in males (balance of DNA damage vs DNA repair) that accumulates over time.

      2. The authors establish a simple model where the number of mutations that accumulate each generation depends on the embryonic mutation rate (which is shown empirically to not differ between the sexes) and a post-maturity mutation rate, which has elevated male mutation (driven presumably by a shift in the balance between DNA damage and DNA repair). The model is very clearly and intuitively described.

      3. The paper is extremely carefully thought-out, planned, and executed. Criteria for inclusion and exclusion of species in the phylogenetic work are clearly laid out. Similarly, decisions about filtering genomic regions (avoiding repeats, etc.) are well done and exhaustively documented. The standard of scholarship is very high - for example, the analysis of de novo mutation rates in mammals pulled in data from no fewer than 15 published studies.

      Weaknesses

      1. The method of estimating alpha relies on the assumption that the mutation process (and rates) are the same in autosomes and sex chromosomes. There is an attempt to control for GC content and replication timing, but it is easy to imagine other factors at play, including the inactivation of one X in females and the extensive differences in chromatin modifications, especially of the X, that differ in males vs. females. The case of the cat X chromosome, with its 50 Mb of recombination cold spot and corresponding oddly slow substitution rate, might be just one example of features in other species that cause other perturbations in the substitution rate of the X. This does not seriously erode confidence in the results, but there is more potential for intrinsic mutation rates of sex chromosomes and autosomes to differ than is suggested by the authors.

      We agree with the reviewer that despite our attempts, we do not control for all factors that distinguish X and autosomes beyond exposure to sex. We had written that “while our pipeline may not account for all the differences between autosomes and X (Z) chromosomes unrelated to sex differences in mutation, the qualitative patterns are reliable.” and have now included a sentence to make this limitation clearer (lines 165-167):

      “Nonetheless, it is unlikely that our regression model perfectly accounts for all the genomic features that differ between sex chromosomes and autosomes other than exposure to sex.”

      In turn, the assumption that mutation rates in X (Z) and autosomes differ only with regard to their exposure to sex (after accounting for base composition and other genomic features) is unproven; we now state this assumption explicitly in the Methods (lines 678-681). Nonetheless, it seems warranted by the high concordance of evolutionary- and pedigree-based estimates of alpha in humans, mice and cattle. With regard to the specific factors mentioned by the reviewer, excluding CpG sites has little effect on our qualitative conclusions for mammals (see Fig S1E), suggesting that DNA methylation differences between X and autosomes are not having a major influence on our findings. Moreover, X-inactivation in the germline of mammals (as distinct from the soma) is likely quite short-lived, given that it lasts around three days in early development of mice (Chuva de Sousa Lopes et al. 2008) and at most four weeks in humans (Guo et al. 2015). Thus, it is unlikely to be an important mutation rate modifier. We have now reworked three paragraphs in the main text to make the limitations above clearer (lines 127-175).
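      To make the stated assumption concrete: if mutation rates on the X and the autosomes differ only through their exposure to each sex, the classic relationship of Miyata et al. (1987) links the X-to-autosome substitution ratio to alpha. The sketch below uses that textbook closed form for an XY system. It is an illustration only, not the regression pipeline described in the manuscript, and the function names are hypothetical.

```python
def x_to_autosome_ratio(alpha):
    """Expected X/A substitution-rate ratio for a male-to-female mutation ratio.

    The X spends 2/3 of its time in females and 1/3 in males, whereas an
    autosome spends half of its time in each sex.
    """
    return (2.0 / 3.0) * (2.0 + alpha) / (1.0 + alpha)


def alpha_from_ratio(ratio):
    """Invert the relationship above; defined only for 2/3 < ratio <= 4/3."""
    if not (2.0 / 3.0 < ratio <= 4.0 / 3.0):
        raise ValueError("X/A ratio outside the range this model can explain")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


for a in (1.0, 2.0, 4.0):
    r = x_to_autosome_ratio(a)
    print(a, round(r, 3), round(alpha_from_ratio(r), 3))
```

      Because the ratio is confined to a narrow interval, modest X-specific effects unrelated to sex can move the inferred alpha noticeably: under this closed form a ratio of 0.80 corresponds to alpha = 4, whereas 0.84 corresponds to roughly 2.8.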

      1. The authors point out that the human mutations in spermatogonia are due to mutation signatures SBS5/40 (which are known not to be correlated with cell division rates). The work on the nonhuman species could be greatly extended with this mutation spectrum approach. For each species, one could ask: Are the mutation spectra of the embryonic mutations consistent between males and females? What about the mutation spectra for the post-puberty individuals? Is alpha consistent across mutation signatures? Does the GC bias correction impact these inferences?

      Unfortunately, there is not enough de novo data to address this question outside of humans. In turn, the analysis of substitution data is unreliable, because of the differential impact of repeated substitutions at a site and the effects of GC-biased gene conversion.

      1. While the data do not suggest reasons WHY males display a higher mutation rate, it is fair to ask whether the evolutionary drive for a higher mutation rate might shape the mechanism whereby it happens. There is a certain amount of speculation in the paper as it is, and it is done in a way that is often well supported by data after the fact. Speculation about why males have an elevated mutation rate would not erode the overall quality of the paper, and I would expect that many readers would be eager to see what the authors have to say on the subject.

      As we envisage it, along the lines of Lynch’s models for the evolution of germline mutation (Lynch 2010), there is likely selection to keep the mutation rate as low as possible, subject to the constraints of the need to replicate DNA, repair damage, etc. efficiently. Why the attainable lower limit would be higher in males than in females is unclear to us, both mechanistically and in terms of evolutionary selection pressures. As we now note in lines 353-355, a potential proximal cause is a greater effect of reactive oxygen species, a major source of DNA damage, in male germ cells than in oocytes (Smith et al. 2013; Rodríguez-Nuevo et al. 2022). Potential evolutionary causes are even less clear to us, but could be related to the greater competition among sperm vs. oocytes (added in lines 354-357).

      Another way to think about these results is as shifting the question somewhat, broadening it from the long-standing puzzle of the selection pressures shaping sex differences to asking what determines the relative mutation rates of different cell types, including oocytes and spermatogonia but also somatic cell types/tissues. We had previously written that “our results recast long standing questions about the source of sex bias in germline mutations as part of a larger puzzle about why certain cell types (here, spermatogonia versus oocytes) accrue more mutations than others.” We have revised the final paragraph of the Discussion to try to emphasize this point.

      Overall, the paper achieves its intended goal of toppling the dogma that the excess male mutation rate is driven by the number of rounds of cell division in spermatogenesis (compared to oogenesis).

    1. Author Response

      Reviewer 3

      The number of identified anti-phage defense systems is increasing. However, the general understanding of how phages can overcome such bacterial defense mechanisms is a black box. Srikant et al. apply an experimental evolution approach to identify mechanisms of how phages can overcome anti-phage defense systems. As a model system, the bacteriophage T4 and its host Escherichia coli are applied to understand genome dynamics resulting in the deactivation of phage-defensive toxin-antitoxin systems.

      Strengths: The application of a coevolutionary experimental design resulted in the discovery of a gene operon: dmd-tifA. Using immunoprecipitation experiments, the interaction of TifA with ToxN was demonstrated. This interaction results in the inactivation of ToxN, which enables the phage to overcome the anti-phage defense system ToxIN. The characterization of the genomes of T4 phages that overcome the phage-defensive ToxIN revealed that the T4 genome can undergo large genomic changes. As a driving force to manipulate the T4 phage genome, the authors identified recombination events between short homologous sequences that flank the dmd-tifA operon. The discovery of TifA is well supported by data. The authors prepared several mutant strains to start the functional characterization of TifA and can show that TifA is present in several T4-like phages.

      In addition, they describe T4 head protein IPIII as another antagonist of a so far unknown defense system.

      In summary, the application of a coevolutionary approach to discover anti-phage defense systems is a promising technique that might be helpful to study a variety of virus-host interactions and to predict phage evolution techniques.

      Weaknesses: The authors apply Illumina sequencing to characterize genome dynamics. This NGS method has the advantage of identifying point mutations in the genome. However, the identification of repetitive elements, especially their absolute quantification in the T4 genome, cannot be achieved using this method. Thus, the authors should combine Illumina sequencing with a long-read sequencing technology to characterize the genome of T4 in more detail.

      We think the combination of Illumina-based sequencing and PCR analyses presented is more than sufficient to arrive at the conclusions drawn about the repeats that emerge in our evolved T4 clones.

      To characterize the influence of TifA during infection, T4 phage mutants are generated using a CRISPR-Cas-based technique. The preparation of these phages is unclearly described in the methods section. The authors should describe in detail whether a b-gt deficient strain was applied to prepare the mutants. Information about the used primers and cloning schemes of the Cas9 plasmid would allow the community to repeat such experiments successfully.

      We have added details to the Methods section to clarify and expand on our mutagenesis approach.

      The discovery of TifA would benefit from additional data, e.g. structure-based predictions, that describe the protein-protein interaction TifA/ToxN in more detail.

      We were unable to predict the ToxN-TifA interaction interface using AlphaFold, and we are currently conducting follow-up work to characterize how TifA neutralizes ToxN.

      Several publications have described that antitoxins can arise rapidly during a phage attack. The authors should address that this concept has been described before as well by citing appropriate publications.

      We believe that we have already addressed this point sufficiently in the Introduction of the manuscript, in which we discuss (1) the emergence of phage-encoded pseudo-toxI repeats to overcome P. atrosepticum toxIN and (2) the presence of the naturally-occurring antitoxins Dmd and AdfA in T4 and T-even phages, respectively. We also discuss the similarities between TifA, Dmd, and AdfA in the discussion of the manuscript. To our knowledge, these are the only known examples of antitoxins arising during phage attack outside of TifA, but we are happy to include additional citations of which the reviewers are aware.

      The authors propose that accessory genomes of viruses reflect the integrated evolutionary history of the hosts they infected. However, the experimental data do not support such a claim.

      We disagree with the reviewer’s comment, as our evolution experiment demonstrates the plasticity of the T4 genome during adaptation to different hosts, as well as showing that the T4 accessory genome includes genes necessary for infection of some, but not all hosts. The proposal also comes as the last sentence of the Abstract and is framed not as a conclusion, but as a proposal based on the work done here, with the clear intention of providing a sense of how future work may build off our work.

    1. Author Response

      Reviewer 1

      They adopted a comprehensive experimental and analytic approach to understand molecular and cellular mechanisms underlying tissue-specific responses against 3-CePs. They used two cell lines - BxPC-3 and HCT-15 - as example models for responsive and non-responsive cell lines, respectively. Although mutation rates didn’t differ by the drug treatment, they observed changes in cell cycle and expression of genes involved in DNA damage, repair and so on. Furthermore, they combined RNA-seq and ATAC-seq data and applied two approaches, pairwise and crosswise, to identify a number of gene groups that are altered in each cell line upon the drug treatment. Finally, they calculated enrichment of up/down genes in different cell lines, tumor types and samples to estimate potential responsiveness to the drug. This study is unique in its in-depth analysis of RNA-seq and ATAC-seq data to identify the genetic signature underlying drug treatment. This study has the potential to be applied to different drugs and cell lines.

      We thank the reviewer for the precise and kind summary of our work.

      However, several major concerns need to be resolved. First of all, the biological and clinical performance of 3-CePs is not clearly described. They referenced several papers but they seem to have focused on the chemical properties of the drug. Without proven activity of 3-CePs against cancers in vitro and in vivo, the rationale of the study would be compromised.

      We apologize for not being clear enough when introducing previous findings on the differential sensitivity of HCT-15 and BxPC-3 cancer cell lines to 3-CePs. In the revised manuscript, we now cite references on the preferential activity of these agents against the pancreatic cancer cell line in 2D and 3D in vitro cancer models (see lines 71-74, 128-129). These compounds have been selected to exemplify the use of the pipeline in drug discovery and the early stages of drug development: indeed, only cellular data are available for these molecules, which have not yet been characterized in vivo. The pipeline itself offered a final perspective on directions to take for their further development, i.e. the most sensitive tumor types to target (PAAD, KIRC).

      Their RNA-seq analysis was focused on discovering differentially expressed genes between cell lines, time points, etc. Interestingly, they found that DNA damage and repair signal was specifically increased in HCT-15. But is this approach capable of finding signals that are constitutively expressed in different cell lines? In other words, what if the differential responsiveness to 3-CePs was already there even before the drug was introduced?

      We thank the reviewer for pointing out such a key concept. The premise for the developed approach is that factors determining the overall cellular sensitivity to a treatment must be determined by intrinsic characteristics of the cell line. For this reason, we built the sensitivity signature on basal transcriptome profiles, where we prioritized a subset of genes based on perturbational evidence (perturbation-informed basal signature).

      Beyond signature genes, we show in figure R1 (see above) the results of a GSEA analysis on the whole overlap (300 genes) between DE genes from the baseline comparison (BxPC-3 ctrl vs HCT-15 ctrl) and those from the 6 h M treatment comparison, in the sensitive cell line (BxPC-3 M 6 h vs BxPC-3 ctrl). Pathways like ribosome biogenesis, ROS metabolism, UPR also arise, attesting that genes activated in response to the treatment also have a constitutively different expression in unperturbed cells.

      Are there any overlapping signals between pairwise vs crosswise approaches?

      We thank the reviewer for this question. To make it easier for the reader to compare the output from the two types of integration and to intuitively grasp their functional overlap, we changed the visualization of the results from the pairwise approach (Figure 4 D). Indeed, some functional pathways, both new and already emerging from the previous analysis, arise from both integrations. This overlap is now directly discussed from the functional point of view in the main text (from line 348 and in the following crosswise integration paragraph).

      Genes used as input in both types of integration are DE or DAR-associated, so this means that many of the hits that we find having the same double regulation (pairwise) also appear in CoCena modules. Among them, only a few hits show both 1) the same double regulation in a specified comparison (as suggested by crosswise) and also 2) end up having a similar pattern of regulation across all conditions (contributing to the same CoCena module, one of the strengths of the crosswise integration). Indeed, while the pairwise integration checks a single comparison at a time, CoCena checks the pattern throughout conditions, providing a more holistic view of the gene regulation (e.g., one gene can have a different pattern across conditions at the transcriptional and chromatin level). This is due to the biological fact that RNA and chromatin regulation is not 1:1 (also, for instance, from a timing perspective).

      The major added value of the two approaches consists in their intrinsically different output information. Within a specific comparison, the pairwise integration detects genes consistently activated at the transcriptome and chromatin level. At this information level gene set enrichment can simplify the coherent functional role of this set of genes; we now report this extra information in figure 4 to provide a more granular description of the pairwise integration. Instead, CoCena analyzes the pattern throughout conditions, and clusters together genes and peaks that behave similarly. Functional annotation of genes behaving similarly can put together promoters and/or transcripts that together may orchestrate a specific process (as highlighted by GSEA on each module).
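      As a rough illustration of the pairwise idea described above (not the authors' scripts), the sketch below keeps genes whose RNA-seq fold change and associated ATAC-seq accessibility change point in the same direction within one comparison. The gene names, the log2 fold changes, the threshold, and the `concordant_genes` helper are all hypothetical.

```python
def concordant_genes(rna_lfc, atac_lfc, min_abs_lfc=1.0):
    """Return {gene: 'up' | 'down'} for genes regulated in the same direction
    in both layers, with both effects at least min_abs_lfc in magnitude."""
    hits = {}
    # Only genes measured in both layers can be compared.
    for gene in rna_lfc.keys() & atac_lfc.keys():
        r, a = rna_lfc[gene], atac_lfc[gene]
        if abs(r) < min_abs_lfc or abs(a) < min_abs_lfc:
            continue
        if r > 0 and a > 0:
            hits[gene] = "up"
        elif r < 0 and a < 0:
            hits[gene] = "down"
    return hits


# Invented values for a single comparison (e.g. treated vs control).
rna = {"GENE_A": 2.1, "GENE_B": -1.8, "GENE_C": 1.5, "GENE_D": 0.2}
atac = {"GENE_A": 1.4, "GENE_B": -1.2, "GENE_C": -1.1, "GENE_E": 2.0}
print(sorted(concordant_genes(rna, atac).items()))
```

      A crosswise approach such as CoCena would instead cluster genes and peaks by their pattern across all conditions, which this per-comparison filter does not attempt.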

      Probably a similar question with the above: is this methodology applicable to other drugs in addition to 3-CePs?

      To address this extremely important point, which we agree with the reviewer is key to proving the versatility of our approach, we further applied the pipeline to the prediction of cancer cell lines’ sensitivity to cisplatin, a thoroughly reported broad-acting chemotherapeutic also acting as a DNA damaging agent. Results strongly supported the broad applicability of our approach, which was able to predict sensitivity to this reference drug with extremely high accuracy.

      Reviewer 2

      Carraro et al. describe a framework to understand MoA and susceptibility of drug candidates by integrating RNA-seq and ATAC-seq information. More specifically, by collecting drug responses from high-sensitive and low-sensitive cell lines, the authors identified a key set of pathways with co-expression analysis, and further predicted sensitivity of different cancer cell lines.

      The authors provided a new bioinformatics pipeline to integrate multi-omics data (RNA-seq and ATAC-seq) in a drug response study. This approach increased detection power and identified additional key pathways that are associated with drug 3-CePs. This framework has the potential to be applied to the general drug discovery process.

      We thank the reviewer for the precise summary of our study.

      However, the current manuscript failed to describe the integration methodology in a clear and concise way. Without a full understanding of the methodology, it’s tough to evaluate the downstream results in an unbiased manner.

      We apologize for not having included sufficient details in describing the difference between CoCena and the other two horizontal and vertical approaches. As already discussed in the response to Reviewer 1, we now included a more detailed description not only in the Methods section (from line 894) but also in the main text (lines 393-400).

      In addition, the authors didn’t mention how much additional value this multi-omics approach provided compared to the single-omic data set, as multi-omics approaches are more expensive and labor-intensive.

      We thank the reviewer for this valuable point. To better support the claim for multi-omics approaches, we have extended the Introduction (lines 96-98), as successful integration of information derived from multiple omic layers usually strengthens the determination of the major observed cellular responses. Here, this information helps dissect and predict how perturbations (here by drugs) can affect the overall cellular dynamics and mechanisms underlying a certain level of sensitivity. We agree with the reviewer that current costs are still prohibitive for large scale use of multi-layer omics in many settings, mainly when it comes to clinical use or drug development. However, significantly less expensive technologies (90% cost-reductions, lines 53-55) have recently been announced, which assures us that approaches as outlined here will be applicable to many more clinical questions in the near future. Further, we show evidence that some cellular responses to the drug-induced perturbation were only revealed by applying multilayer analysis, but not by a single omics layer, e.g. TGF beta and EMT signaling (see lines 456-459).

      Reviewer 3

      Carraro et al. utilize systems biology approaches to decode the mechanism of action of 3-chloropiperidines (a novel class of cancer therapeutics) in cancer cell lines and build a drug-sensitivity model from the data that they evaluate using samples from The Cancer Genome Atlas and cancer cell lines. The approach provides a framework for integrating transcriptomic and open-chromatin data to better understand the mechanism of action of drugs on cancer cell types. The authors’ approach is of sound design, is clearly explained, and is bolstered by validation via holdout sets and analysis in new cell lines, which lends the findings and approach credibility.

      The major strength of this approach is the depth of information provided by performing RNA-seq and ATAC-seq on cells treated with 3-CePs at various time points, and the authors’ utilization of this data to perform pairwise and crosswise analyses. Their approach identified gene modules that were indicative of why one cell type was more sensitive to a particular drug compared to another. The data was then used to build a sensitivity model which could be applied to samples from The Cancer Genome Atlas, and the authors evaluated their sensitivity predictions on a set of cancer cell lines which validated the predictions.

      We thank the reviewer for the accurate recapitulation of our work.

      The major drawback to this type of approach is that it relies on next-generation sequencing (somewhat costly) and requires intricate bioinformatics analyses. While I agree with the authors’ perspective that this approach can be applied to additional classes of drugs and cancer samples, I disagree with their view that it is efficient and versatile. However, for research teams with the means to perform both transcriptomic and open-chromatin studies, I think this integrated approach has promise for evaluating novel classes of drugs, particularly in cancer cell lines that are easy to manipulate in vitro.

      We thank the reviewer for this insightful comment. As with almost every technology, the early years are more difficult and at times adventurous. However, we have seen enormous improvements in robustness of the technology and significant cost reduction with more to come. Only recently have sequencing technologies been introduced into the market with a further 90% cost reduction (as stated in lines 53-55). We are convinced that due to their increasing affordability and robustness, RNA-seq and ATAC-seq will be implemented routinely into clinical contexts. As a group working at the cross-section between drug discovery and bioinformatics, we hope that our current work, accompanied by a fair and detailed sharing of our scripts, will give a head start to others in the field who are not (yet) so close to bioinformatics and computational biology and who wish to run this type of analysis.

      While there are examples of similar frameworks being applied to drug development, this work will add to the body of literature utilizing an integrated systems biology approach for pairing drugs with specific tumor or cancer types and understanding their mechanism of action on an epigenetic level.

      We thank the reviewer for this very positive statement and the support for our approach and her/his interest in the described pipeline.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01501

      Corresponding author(s): Prachee Avasthi

      [The “revision plan” should delineate the revisions that authors intend to carry out in response to the points raised by the referees. It also provides the authors with the opportunity to explain their view of the paper and of the referee reports.

      The document is important for the editors of affiliate journals when they make a first decision on the transferred manuscript. It will also be useful to readers of the reprint and help them to obtain a balanced view of the paper.

      If you wish to submit a full revision, please use our "Full Revision" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We thank the reviewers for their careful reading and evaluation of our manuscript. The reviewers have emphasized the need for several important changes which we plan to address.

      First, they request better evidence and specificity of the BCI target in Chlamydomonas. We have created double mutants between the dusp6 ortholog mutants and found severe defects in ciliogenesis similar to what we see with BCI treatment. We plan to include this data in the paper as well as the subsequent analyses we performed with the single dusp6 ortholog mutants. This data will provide stronger evidence that this pathway regulates ciliary length in Chlamydomonas aside from the other potential off target effects that could be impacting this pathway that we may be seeing through the use of BCI.

      Second, the reviewers have requested more consistency and clarity both in statistics and descriptions of the data and to expand upon our findings in the discussion. We will create a clear guideline for our use of statistics and adjust the descriptions of the data to fit this guideline more strictly and prevent overstating/oversimplifying results. We will also add more discussion and information related to off target effects of BCI, the importance of the subtle defects in NPHP4 protein expression in the transition zone, and the relevancy of the membrane trafficking data in light of this study.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):


      SUMMARY:


      The authors investigated the effects of an allosteric inhibitor of DUSP (BCI) on cilia length regulation in Chlamydomonas. Among seven conclusions summarized in Fig. 7, BCI is found to severely disrupt cilia regeneration and microtubule reorganization. Additionally, changes in kinesin-II dynamics, ciliary protein synthesis, transition zone composition and membrane trafficking are also explored. All these aspects have been shown to affect cilia length regulation. Findings from this body of work may give insights on how MAPK, a major player in cilia length regulation, functions in various avenues. Additionally, the study of BCI and other specific phosphatase inhibitors may provide a unique addition to the toolset available to uncover this important and complicated mechanism.

      MAJOR COMMENTS

      Major comment 1

      The addition of BCI increases phosphorylated MAPK in Chlamydomonas based on Fig 1B. However, the claim that BCI inhibits Chlamydomonas MKPs is not supported at all. SF1A shows CrMKP2, 3 and 5 are related to each other but distant from HsDUSP6 and DrDUSP6. At the same time, 2 out of 3 predicted BCI interacting residues are different from the Hs and Dr DUSP6 in SF1B, contradicting "well conserved" in line 172. Consistently, mutants of these orthologs have little to no ciliary length and regeneration defects compared to BCI treatment (see major comment 6 about statistical significance). I am not convinced that BCI inhibits the identified orthologs or any MKPs in Chlamydomonas. It's possible that BCI inhibits a broad range of phosphatases including the ones listed and/or those for upstream kinases. But such a point is not demonstrated by the presented data.

      While BCI is predicted to interact with these residues, it is also predicted to interact with the “general acid loop backbone” by fitting in between the α7 helix and the acid loop backbone (Molina et al., 2009).

      MKP2 has ciliary length defects compared to wild type, though it regenerates normally. In addition, we have crossed these mutants together and have found that cells (2x3 12.2 and 3x5 29.4) cannot generate cilia. We will include this data in the supplement and perform follow up analyses on these double mutants. Because these structures are not 100% conserved (we have changed the text to “partially conserved” to reflect this), it is possible that BCI is hitting all of these DUSPs rather than just one, or the DUSPs may serve compensatory functions that rescue ciliary length.

      Major comment 3

      The claims that "BCI inhibits KAP-GFP protein expression" (line 271) and "BCI inhibits ciliary protein synthesis" (line 286) are not convincingly demonstrated. Overlooking that only KAP is investigated instead of kinesin-II, none of the relative intensity from the WB in 30 or 50 µM BCI and the basal body fluorescence intensity indicates a statistically significant difference. The washout made no difference in any of the assay and it's not explained how phosphatase inhibition by BCI might affect overall ciliary protein synthesis. The claims about protein expression may need a fair amount of effort and time investment to demonstrate, therefore I suggest leaving these out for this manuscript.

      It is, though, very interesting to see in SF 2C that cilia in 20 µM BCI treatment can regenerate slowly. In line 162, the authors claimed "In the presence of (30 µM) BCI, cilia could not regenerate at all (Fig 1E)". Since Fig 1E only extends to 2 hours, I think it's important to clarify whether in 30 µM BCI cilia indeed cannot regenerate even after 6 or 8 hours.

      We have altered the text to be more specific with our wording that KAP-GFP is investigated rather than kinesin-2, and we have added text to indicate that downstream phosphorylation events could impact transcription and translation of proteins necessary for ciliary maintenance. This interpretation of the data mentioned above is correct; KAP-GFP is not significantly altered at the basal bodies or in accordance with the steady state western blots. What we see here and demonstrated in Figure 2F-I is the depleted KAP-GFP protein which is not restored following a 2 hour regeneration in BCI. We likely do not see a difference in steady state conditions because the protein is not degraded, just being moved around in the cell. We can only see the difference when the majority of KAP-GFP, which the data suggests is mostly present in cilia, is physically removed through ciliary shedding. This protein is not replaced during a 2 hour regeneration which allows us to conclude that this protein is inhibited due to BCI.

      The washout made a small difference in the double regeneration, whereby we see cilia begin to form in washed out conditions, though this was not statistically significant. It is possible that BCI has a potent effect on the cell similar to how other drugs, such as colchicine, cannot be easily washed out. The purpose here is to show that regardless of the statistical significance, cells can begin to regenerate their cilia after BCI washout, though this occurs 4 hours after washout in doubly regenerated cells, and we do not see this potent effect on the singly regenerated cells in SF 2C. Though in SF2C, as mentioned, we do see slowly growing cilia, and this could, once again, be due to the potent inhibition BCI has on ciliary protein synthesis. We will confirm and clarify whether cells in 30 µM BCI cannot regenerate cilia even after 6 or 8 hours.

      Major comment 5

      It is very interesting that BCI disrupts microtubule reorganization induced by deciliation and colchicine. Data in Fig 6B and C are presented differently than those in SF 4C. For example, in SF 4C, BCI treatment for 60 min has close to 50 % cells with microtubule partially reorganized while in Fig 6C about 20% cells with microtubule fully (or combined?) reorganized. The nature of the difference is unclear to me without an assay comparing the two directly. Hence the implied claim that BCI affects colchicine induced microtubule reorganization differently than deciliation induced one is hard to interpret (line 398, line 388 vs line 403).


      The fact that taxol doesn't rescue the cilia regeneration defect caused by BCI is very interesting. Here taxol treatment results in fully regenerated cilia while Junmin Pan's group (Wang et al., 2013) reported much shorter regenerated cilia. It might be worthwhile to compare the experimental variance as this is a key data point in both instances. The relationship between cilia regeneration and microtubule dynamics is not in one direction. On one side, there's a significant upregulation of tubulin after deciliation. On the other, many microtubule depolymerization factors such as katanin and kinesin-13 positively regulate cilia assembly (though not without exceptions). It is hard to determine that the BCI induced cilia regeneration defect can't be rescued by other forms of microtubule stabilization. Microtubule reorganization is one of the most striking defects related to BCI treatment. I suggest changing the oversimplified claim to a more limited one (such as "PTX stabilized microtubule ...") and an expansion on the discussion about microtubule dynamics and cilia length regulation beyond the use of taxol. Meanwhile, I strongly encourage authors to continue to investigate this aspect and its connection to the cilia regeneration.

      We will remove data regarding “partially” formed cytoplasmic microtubules and only include fully formed for each of these experiments for clarity.

      It is important to note the different taxol concentration used here. While Wang et al., 2013 used 40 µM taxol to study ciliary effects, we use 15 µM, where stabilization still occurs. There have been reports of varied cell responses to higher vs. lower doses of taxol (see Ikui et al., 2005, Pushkarev 2009, Yeung 1999), mostly with regard to the cell’s mitotic/apoptotic response. We could be seeing altered responses at this lower concentration because Chlamydomonas cells also behave differently in higher vs. lower taxol concentrations. Thank you for your suggestions. We have adjusted the text to be more specific to PTX treatment as opposed to general stabilization.

      Major comment 7:

      There are several places where the technical detail or presentation of the data are missing or clearly erroneous.

      Fig 1B: pMAPK and MAPK antibodies used in the WB are not described in the Material and methods. It's not clear if the same #9101, CST antibody used for RPE1 cell in Fig 1J is used.

      We have updated the materials and methods to include that this antibody was used for both RPE1 and Chlamydomonas cells.


      line 260 and Fig 3A state 20 µM BCI was used while Fig 3 legend repeatedly states 30 µM until (J). Also 30 µM in SF 2A.

      We have corrected the text to 20 µM BCI in the mentioned places.

      Fig 6C, the two lines under the p value on top most likely start from the second column (B) instead of the first (D). Fig 6G, the line is perhaps intended for the second and fourth columns?

      We will make these comparisons more clear. We had performed a chi-square analysis and were comparing the difference between DMSO and BCI before PTX stabilization or MG132 treatment to after. We will add brackets to more clearly show these comparisons.

      Fig 6C, legends indicate bars representing each category. But only one bar is shown for each column. Same for 6G?

      This is the same as the previous comment for the way we represented the statistics. We will make this clearer with brackets to show the comparisons.

      Minor comments:

      1. A number of small errors in text were noted above. Done.

      "orthologs" is misused in place of "ortholog mutants": line 176, 352, 421 (first), 879, 882, 898, 902, 938 , 939.

      Done.

      Capitalized names are misused as mutant names (e.g. "MKP2" should be "mkp2"): line 178, SF 1C, 1D and 1E, SF 3C, SF 6A

      Done.

      In several places, the lines indicating statistical comparisons are chosen confusingly. The simplest example is in Fig 1D, where the comparison between 0 and 45 is less important than 0 and 30. The same applies to Fig 1H and 1I. The line ends are inconsistent as well. They end either in the middle or at the edge of the columns/data points (such as in SF 4B), and some have vertical lines (SF 2B, SF 4A, SF 6B). I suggest adding vertical lines pointing to the middle to indicate the compared datasets clearly.

      Thank you for this suggestion. We agree and will update the figures to reflect this and provide clarity for statistical comparisons.

      line 101 remove "the"

      Done.

      line 120 "modulate" to "alter"

      Done.

      line 198 "N=30" should be "N=3"

      Done.

      line 212. The legend for p value is likely for (G)

      Done.

      line 284, "singly" should be "single"

      Done.

      The dataset for "Pre" and "0m" in Fig 6D and 6E are clearly the same. Consider combining the two as in Fig 6C.

      This is correct. We will combine the data sets.

      Fig 6E, "BCI" on the X-axis should be "DMSO".

      This is correct. We will correct this.

      line 685, remove "?".

      Done.

      line 894: "Fig 3J" instead of "Fig 3H"

      Done.

      SF 1 legend, (C) and (D) are inverted.

      Done.

      SF 4A "Recovered" should be "Full"

      Done.

      SF 5, row 5, under second arrow perhaps missing +PTX

      Done. We greatly appreciate this close reading of the text and the list of changes making these errors easy to find. We will make these changes in the manuscript.

      Reviewer #1 (Significance (Required)):


      Increasing evidence indicates that several MAPKs activated by phosphorylation negatively control cilia length, while few studies focus on how MAPK dephosphorylation affects cilia length regulation, largely due to the unknown identity of the phosphatase(s) specifically involved in cilia length regulation. The authors set out to investigate the effect of BCI on cilia length control. BCI specifically inhibits DUSP1 and DUSP6, both of which are known MAPK phosphatases, and therefore may provide a unique opportunity to understand how the MAPK pathway is controlled by specific phosphatase activity in cilia length regulation.


      Overlooking some inconclusive results and oversimplified interpretations, I find the most striking findings are the BCI's effects including ciliogenesis, kinesin-2 ciliary dynamics and microtubule reorganization. I believe these findings have significant relevance to the stated goal (line 131) and conclusions (line 57) and readers may find them a good starting point for further investigation of the role phosphatases play in cilia length regulation.

      Cilia length regulation is a complicated mechanism that is affected by many aspects of the cell and functions differently in various systems. My field of expertise may be summarized as cilia biology, cilia length regulation, IFT, kinesin, kinases (MAPKs), and microtubules. The role of membrane trafficking in cilia length regulation is somewhat unfamiliar to me. Additionally, the authors used a number of statistical tests and corrections in various assays. The nuance of these choices is not clear to me, nor is it explained to general readers.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript, "ERK pathway activation inhibits ciliogenesis and causes defects in motor behavior, ciliary gating, and cytoskeletal rearrangement," Dougherty et al investigate how BCI, an activator of MAPK signaling, regulates ciliary length. Despite advances in our understanding of the structure and function of cilia, a fundamental question remains as to what are the mechanisms that control ciliary length. This is a critical question because cilia undergo dynamic changes in structure during the cell cycle where they must disassemble as they enter the cell cycle and must rebuild after cell division. This work contributes to a growing body of work to determine mechanisms that regulate cilia length.

      The authors use a well-established model system, Chlamydomonas, to study cilia dynamics. This work expands on previous findings from these authors that inhibition of MAPK signaling using U0126 lengthens cilia as well as other publications that implicate MAPK signaling in controlling ciliary length. However, the authors only observe a few significant phenotypes with other subtle trends, leaving the conclusion regarding the role of MAPK signaling murky. Furthermore, it is unclear through what mechanism BCI impacts ciliary length. Several issues must be addressed:

      MAJOR ISSUES

      1. The basis for this study is the use of the ERK activator BCI, which the authors show activates MAPK signaling. While the authors do use putative DUSP6 ortholog mutants to corroborate some of the phenotypes, the majority of the data (and conclusions) uses BCI. However, there may be off target effects and the authors do not address this limitation of the study. The authors only use 1 pharmacological tool to manipulate MAPK signaling, so it is unclear whether these ciliary disruptions are specifically due to increased MAPK. It is necessary to clarify the following questions about BCI action to interpret the results:
      a. What are off target effects of BCI? Does BCI impact proliferation? Why is the BCI phenotype of cilia shortening transient and dose dependent? Why does the phenotype of cilia length and regeneration capacity in Chlamydomonas differ from both ortholog mutants and hTERT-RPE1 cells?

      While we do mention, following supplemental figure 1, that other MKPs could be the target for BCI, we also cite Molina et al., 2009, who showed specificity for BCI hydrochloride in zebrafish. BCI targets primarily DUSP6, but also exhibited some activity towards DUSP1. In this study, the authors had also used zebrafish embryos to check expression of 2 other FGF inhibitors, spry 4 and XFD, in the presence of BCI but found that their effects were not reversed. In addition, they checked the ability of BCI to suppress activity of other phosphatases including Cdc25B, PTP1B, or DUSP3/VHR and found that BCI could not suppress these phosphatases. BCI inhibition has previously been found to be more specific to MAPK phosphatases. In addition, we have previously confirmed that U0126 has a slight lengthening effect on Chlamydomonas which further implicates this pathway in cilium length tuning (Avasthi et al. 2012).

      While cell proliferation assays may provide more support for MAPK signaling, they do not rule out off target effects that could also contribute to this same phenotype. We do provide a cell proliferation assay for RPE1 cells where we show that higher concentrations of BCI result in cellular senescence as well (Fig 1I).

      The BCI phenotype of cilia shortening is likely transient and dose dependent due to its effect on ciliary protein synthesis demonstrated in Figure 3J. The increase in drug likely increases its substrate binding to exert its effects on the cell faster, even if this includes off target proteins.

      In RPE1 cells, we are likely seeing differences in regeneration capacity potentially due to their different mechanisms of ciliogenesis (RPE1 cells partake in intracellular ciliogenesis where axonemal assembly begins in the cytosol whereas Chlamydomonas cells partake in extracellular ciliogenesis where axonemal assembly begins after basal bodies dock to the apical membrane), or it could be that we’re missing a delay in regeneration in RPE1 cells after waiting 48 hours for ciliogenesis. We do not check this process sooner. There may be a defect that cells overcome. Additionally, among ortholog mutants and RPE1 compared to BCI-treated wild-type Chlamydomonas, there indeed could be off target effects or the drug could be targeting all of these MKPs rather than just one. We will add this to the discussion for clarity.

      Reviewer #2 (Significance (Required)):


      see above

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      SUMMARY:

      In this study, the authors used a pharmacological approach to explore the function of the ERK pathway in ciliogenesis. It has been reported that the alteration of FGF signaling causes abnormal ciliogenesis in several animal models including Xenopus, zebrafish, and mice. However, the molecular detail of how the ERK pathway is associated with the cilia assembly process remains elusive. The authors found that the ERK1/2 activator/DUSP6 inhibitor, BCI, inhibits ciliogenesis, highlighting the importance of ERK during ciliogenesis. Overall, this paper is well written, and the data are solid and convincing. This paper will be of great interest to many researchers who are interested in understanding ciliogenesis. The following comments are not mandatory requests but suggestions to improve the paper's significance and impact.

      MAJOR COMMENTS:

      - Combination of chemical blocker experiments were well controlled and data are solid. The authors are aware of the side effects of BCI, thus they carefully characterized the phenotypes of Mkp2/3/5 in Chlamydomonas. This reviewer wonders if the levels of ERK1/2 phosphorylation are activated in these mutants. Did the authors examine the levels of ERK1/2 phosphorylation in these mutants?

      While we do not include the data showing ERK activation in these mutants, we have checked pMAPK activation and found that it is not significantly upregulated in these mutants. This could likely be due to compensatory pathways preventing persistent pMAPK activation. For example, constant ERK activation can lead to negative feedback to regulate this signal for cell cycle progression (Fritsche-Guenther et al., 2011). The ERK pathway has not been fully elucidated in Chlamydomonas, but it is possible that these similar mechanisms are in place for MAPKs. We will include this data in the supplement.

      Reviewer #3 (Significance (Required)):


      Accumulated studies suggest that the FGF signaling pathway plays a pivotal role in ciliogenesis. Disruption of either FGF ligands or its FGF receptor results in defective ciliogenesis in Xenopus and zebrafish. On the other hand, FGF signaling negatively controls the length of cilia in chondrocytes that would cause skeletal dysplasias seen in achondroplasia. Therefore, there is strong evidence suggesting that FGF signaling participates in ciliogenesis in cell-type and tissue-context dependent manners. However, the detailed mechanism of the downstream of FGF signaling in ciliogenesis is still unclear. In this regard, this paper is beneficial for the cilia community to expand the knowledge of how ERK1/2 kinase contributes to the regulation of ciliogenesis.


      This reviewer therefore suggests that the authors may want to add more discussion to explain how their finding possibly moves the field forward to understand the pathogenesis of multiple ciliopathies.

      We will add a description of this to the discussion.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Reviewer 1:


      Major comment 4

      A single panel in Fig 4A also can't support the shift in protein density in the TZ in line 317. As line 324 implies protein synthesis defect by BCI, the very minor (in amount and significance) reduction of the NPHP4 fluorescence should not be interpreted as any disruption at all to the transition zone. I suggest checking other TZ proteins such as CEP290 etc or leave this section out.

      Also, the additive effect from BFA and BCI treatment in Fig 5A suggests BCI affects cilia length independent of Golgi. The "actin puncta" and arpc4 mutant are not sufficiently introduced. More importantly, how does an increase in actin puncta explain the shorter cilia length caused by BCI when actin puncta are absent in the arpc4 mutant, which has shorter cilia? Also, the Arl6 fluorescence signal "increase" is not significant at either time point. I suggest leaving this section out as well.

      We agree that one EM image cannot support a protein shift and have removed our observation in the text. However, we do see a statistically significant decrease in NPHP4 fluorescence in BCI treated cells, which we consider a disruption in the sense that the structural composition is altered. We will change the word “disruption” to “alteration” for clarity. Though this is a minor defect, we believe it is still worth noting. We believe this data still adds to the model that, though the EM-visible structure is unaltered, finer details within the transition zone are indeed altered, and we cannot rule out that these smaller changes impact protein entry into cilia. Awata et al. 2014 shows that NPHP4 is important for controlling trafficking of ciliary proteins at the transition zone, and its loss from the transition zone has been found to have effects on ciliary protein composition. Because we see decreased NPHP4 expression, we believe this is a notable finding, as we see effects on the abundance of a protein which is known to affect ciliary protein composition, and have therefore chosen to leave the data in the manuscript. We will adjust the language to most accurately describe our findings.

      We also agree with the interpretation that the additive effect seen from BFA and BCI treatment could suggest independent pathway collapse separate from the Golgi which we have mentioned in the manuscript.

      We have provided more information to introduce actin puncta and ARPC4 with regards to membrane trafficking. Bigge et al. 2020 shows that ARPC4, a subunit of the ARP2/3 complex which is an actin binding protein important for nucleating actin branches, has a role in ciliary assembly. ARPC4 mutants have repressed ability to regenerate their cilia. One feature they noticed in regenerating cells is the immediate formation of actin puncta which are reminiscent of yeast endocytic pits. This observation in addition to altered membrane uptake pathways in Chlamydomonas suggests that ciliogenesis involves reclaiming plasma membrane for use in ciliogenesis (because of the diffusion barrier preventing a contiguous membrane). Here, we incorporate this assay to assess the ability for the cell to reclaim membrane during BCI treatment and find that there is increased actin puncta. This could indicate that there is increased number of endocytic pits or alternatively that the lifetime of these pits is increased (perhaps due to incomplete endocytosis) such that we are able to detect more of them at a fixed point in time. While we cannot say which is happening here, we have previously found that these actin puncta are likely endocytic and needed to reclaim membrane for early ciliogenesis. An increase in these puncta may suggest dysregulated endocytosis in one way or another. ARPC4 cells cannot form the actin puncta in the first place, whereas we are seeing defects following puncta formation. We have taken out the Arl6 data.

      Major comment 6

      Throughout this manuscript, the standard the authors used to interpret statistical significance is erratic. In a few instances, the threshold for p value is clearly indicated such as in Fig 1 legend. Though other times, much higher p values are considered differences. Here are some examples:

      SF 1C, p=0.1167 is considered "(mkp5) shorter than wildtype ciliary lengths" (also line 177 "SF 1C" instead of "SF 1D")

      Fig 3C, p=0.083 interpreted as "slightly less" in line 262 and possibly as "(KAP-GFP) not being able to enter (cilia)" in line 268

      Fig 3G, p=0.1087 is considered "not decrease after two hours" line 267

      SF 3C, p=0.2929 for mkp2 mutant (misuse of "orthologs" in line 352) is considered "fewer actin puncta compared to wild type cells" (line 352).

      SF 6B, p=0.1565 for mkp3 mutant (line 421: misuse of "orthologs" and correct use of "ortholog mutants") is considered not to be able to "fully reorganize their microtubules" (line 421).

      These instances sometimes serve as basis for major conclusions and should be clarified or more carefully characterized.

      We agree the interpretations are very erratic in places and greatly appreciate this detailed list making it easy to find and correct these interpretations. We have adjusted the text in the mentioned places to reflect these changes, and we have made a statement in the text and under statistical methods that say we consider p

      Reviewer 2:

      In multiple instances the conclusions are overstated, and the author must clarify the interpretation of the results to reflect the data presented. Here are some examples:

      a. The conclusion that protein synthesis is disrupted is incorrect in two instances (line 258 and 275) as the experiments in figure 3 do not directly examine changes in synthesis (they look at cilia regeneration as a proxy).

      We show that KAP-GFP expression is not normal during regeneration at 120 minutes, which suggests, in addition to the inability of cilia to grow in BCI, that synthesis is inhibited because this protein is not replaced. In addition, blocking the proteasome did not rescue this decrease in KAP-GFP expression, indicating that this is not a matter of KAP-GFP protein being degraded rapidly. We use regeneration and KAP-GFP readout as a proxy for protein synthesis. We have clarified this in the text.

      b. The conclusion that BCI disrupts membrane trafficking is too broad when the authors only examined trafficking of one membrane protein, Arl6.

      While we only looked at one membrane protein specifically, we assess other membrane trafficking paths. We looked at BCI vs. BFA to assess Golgi trafficking (Dentler 2010) in addition to formation of actin puncta, which is used in Bigge et al. 2020 as an assay for membrane uptake from the plasma membrane for incorporation into cilia.

      c. The conclusion that the transition zone is disrupted is too broad based on a decrease in the expression of one transition zone protein, NPHP4.

      We have changed the text to be more specific to NPHP4.

      Highlighting the overstatement, the conclusion of the header and figure caption on page 10 contradict one another. The manuscript states that "BCI partially disrupts the transition zone" (line 313) and that "The TZ structure is structurally unaltered with BCI treatment" (line 329).

      In the manuscript, we show that the EM-visible structure is indeed unaltered. Because we see a decrease in NPHP4 fluorescence, we concluded that while the EM-visible structure is unaltered, protein composition within the transition zone is altered which suggests that BCI partially disrupts the transition zone.

      Why is kinesin-2 the only target studied for ciliogenesis? Ciliogenesis is a complex process that involves many other critical proteins and investigating kinesin-2 alone is not sufficient to conclude why BCI prevents cilia assembly.

      We use kinesin-2 because it is the only ciliary anterograde motor in Chlamydomonas, and it is required for proper ciliogenesis. By assessing kinesin-2, we were able to address whether this protein alone was the cause of inhibited ciliary assembly (and we find that it’s not) and whether its ability to enter was impacted (likely owing to defects in other protein entry), and we were able to use this protein to understand how its protein expression was affected. Because KAP-GFP is a cargo adaptor protein and interacts with IFT complexes and other cargoes, defects in this protein can have a wide range of implications. We agree, and the data agree, that kinesin-2 alone is not sufficient to conclude why BCI prevents cilia assembly. Because of this, we assessed other pathways including membrane trafficking and microtubule stabilization to better understand why we see defects in ciliary assembly. Certainly many other proteins are important in ciliogenesis and we hope that this study sparks further work in this area to identify additional causative explanations for impaired ciliogenesis upon MAPK activation.

      Tagged ciliary proteins are sensitive to disruptions in function and expression within cilia. It is important to include proper controls in the study using KAP-GFP Chlamydomonas cells to ensure that KAP-GFP maintains endogenous expression levels and normal function as untagged KAP. Furthermore, if this information is available through the resource where the cells were purchased, then this needs to be discussed.

      KAP-GFP expressing Chlamydomonas has previously been validated as described in Mueller et al., 2005. We will provide details in the text about validation of this strain.

      The authors need to provide clear explanations to a general audience of why this technique is used and how the authors reached the interpretations. There are several instances where the authors use techniques that are cited as fundamental papers in Chlamydomonas. Here are two examples:

      a. It is unclear how the authors concluded that decreased frequency and velocity of train size shows that kinesin entry, specifically, is disrupted.

      We have expanded on this in the text. Please see response to reviewer 1, Major comment 2 above.

      b. It was impossible to follow how the experiment where cells treated with cycloheximide could not regenerate their cilia following BCI treatment shows that BCI inhibits protein synthesis.

      We have adapted the text to be more clear regarding this experiment. In this experiment, we deplete the ciliary protein pool by forcing ciliary shedding two times. Following the first shedding, there is enough protein to assemble cilia to half length (Rosenbaum, 1969). We ensure that the protein pool is completely used up by inhibiting further ciliary protein synthesis with cycloheximide. For the second shedding event, completely new ciliary protein must be synthesized for ciliogenesis to occur, which is why ciliogenesis takes much longer compared to a single regeneration where half of the ciliary protein pool still remains and can be immediately incorporated into cilia (SF 2C). In the presence of BCI, cilia cannot grow at all as expected; but 4 hours after BCI is washed out, we see ciliogenesis just beginning to occur, which indicates that there is protein present for ciliogenesis to begin, whereas in cells where BCI is not washed out, we do not see any ciliogenesis.

      The impact of BCI treatment on membrane trafficking as presented is confusing. BCI exacerbated the effects of BFA treatment on Golgi, yet the authors do not address that this could be an indirect effect of BCI or an off-target effect of BCI.

      This is addressed in the discussion (paragraph 4).

      The discussion section includes many interpretations of the results, but leaves the reader confused as to what the authors think might be happening. The manuscript would be far clearer if the authors would provide a working model for why BCI impacts cilia length. It is fine for this to be left for future work but, as the experts, the authors must have relevant thoughts to share with the field.

      Figure 7 provides a model with as much as we can conclude given the data. What we show is that BCI inhibits many different processes in the cell, but we do not necessarily show links between these processes to provide a complete working model of how these are all interconnected; we have provided a summary model that depicts the various, still disconnected processes that are inhibited by BCI. MAP kinases such as ERK have dozens of downstream targets both within and outside the nucleus. Ciliogenesis is also a complex process coordinating many cellular mechanisms. The intersection of these two seems to have a multi-fold effect that results in a dramatic ciliary phenotype through a combination of factors, though not one that fully explains the severity upon initial deciliation in BCI/MAPK activation. Further work is needed to identify the precise cause of completely inhibited cilium growth from zero length.

      MINOR ISSUES

      1. The title of the manuscript is inaccurate and overstates the pathway involvement in cilia. The authors do not directly show that ERK pathway activation causes the ciliary phenotypes due to the use of BCI, a drug that modulates ERK.

      We have adjusted the title to “The ERK activator, BCI, causes…”

      When discussing results of data that are not statistically significant it creates confusion to state that the results "increased/decreased slightly".

      We agree that references to statistics are inconsistent or confusing throughout the text and have adjusted these references accordingly.

      Reviewer 3:

      Major comment:

      - If the authors want to emphasize that their finding is associated with MAP kinases, it would also be beneficial to examine other major MAP kinase pathways such as P38/JNK. If not, then this reviewer suggests revising the text to ERK throughout this manuscript to avoid confusion.

      Because the ERK pathway has not been fully elucidated in Chlamydomonas, we have refrained from using “ERK” as a descriptor because this particular MAPK shares equal identity with multiple MAPKs in Chlamydomonas. Further, BCI may be targeting more than one MAPK phosphatase resulting in the myriad phenotypes we have discovered. At this time, we lack a level of gene-level resolution to map to known MAPK pathways.


      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.


      Reviewer 1:

      Major comment 2

      The claim that "BCI treatment decreases kinesin-2 entry into cilia" (line 236) is a misinterpretation of the data presented. The data indicates KAP-GFP have reduced accumulation in cilia, decreased IFT (anterograde) frequency, velocity and injection size associated with BCI treatment. Though as shown in Fig 1D and Fig 2C, cilia length is also shorter due to BCI treatment. Ludington et. al, 2013 showed a negative correlation of cilia length and KAP injection rate in various treatments that affect cilia length. It's essential to rule out that the KAP dynamics reported in the current manuscript is not an outcome of shortened cilia in order to claim as line 236 seems to suggest. One way to demonstrate specific effect by BCI would be to compare KAP dynamic in cilia with equal or similar length, either by only selecting the shorter cilia from wt or use other treatments that are known to decrease cilia length (chemicals, cell cycle, mutants etc.). Given the capability and resource represented in this manuscript, I don't expect a significant cost and time investment for these experiments.

      Ludington et al., 2013 shows that injection size decreases with increasing length. Our data show that the shorter cilia have decreased injection size and rate, which is inconsistent with shortened length alone being the cause. In other words, in figure 2C and 2G, we see decreased KAP-GFP fluorescence in shorter cilia, as opposed to the greater fluorescent signal in shorter cilia seen in Ludington et al., 2013. This data, in combination with the decreasing frequency of KAP-GFP entry over time in figure 2E and decreased velocity in figure 2F, supports decreased kinesin-2 entry into cilia. If entry were unaltered, we would expect increased KAP-GFP fluorescence in the cilia over time in BCI-treated cells.


      Reviewer 2:

      The authors state that the decreased length of cilia following BCI treatment could be a result of reduced assembly or increased disassembly. Disruptions to cilia assembly and disassembly are not mutually exclusive and both must be evaluated. The authors do not test whether cilia disassembly is disrupted in BCI treatment and therefore cannot conclude that BCI solely disrupts cilia assembly.

      While effects on disassembly remain a possibility, the striking inability to increase from zero length upon deciliation and the effects on anterograde IFT shown by the TIRFM assays suggest an effect on assembly. There may be effects on disassembly and likely many other cilia related processes not investigated, but we feel it remains accurate to conclude that assembly is affected by BCI treatment.

      Reviewer 3:

      - If time allows, in addition to examining NPHP4, it would be beneficial to examine other TZ/TF markers such as CEP164 to confirm if BCI partially disrupts the TZ.

      Given the known outcomes of NPHP4 loss in Chlamydomonas (Awata et al., …) in affecting ciliary protein composition, we suspect the changes in NPHP4 abundance at the transition zone will have a significant impact and agree it would be interesting in a follow up study to see how other transition zone proteins (particularly ones known to interact with NPHP4 or others critical for TZ function) are impacted following BCI treatment.


      MINOR COMMENTS:

      - I suggest moving supplemental figure 1 to the main figures (Fig. 1?) so that readers appreciate the authors' careful examination of BCI throughout this manuscript.

      Thank you for your suggestion and kind critique. We have included this data in the supplement for consistency with mutant data in all of the other supplemental figures.


    1. Instructor: Jeff Toister (author, consultant, trainer). Course details: 1h 22m, Beginner, updated 11/18/2020.

       Do your customers feel valued? When they do, they keep coming back. When they don't, your business suffers. In this course, writer and customer service consultant Jeff Toister teaches you the three crucial skill sets needed to deliver outstanding customer service and increase customer loyalty. Learn how to build winning relationships, provide the right assistance at the right times, and effectively handle angry customers. He also shares ways to find out what your customers really think about your service, and use their feedback to improve.

       Learning objectives:

       - Explore how you can use customer surveys to build rapport.
       - Name three ways you can use active listening to serve your customers more effectively.
       - Identify the different types of needs that must be addressed in order to solve problems.
       - Explain the benefits of taking ownership of a problem.
       - Define “preemptive acknowledgment” and recognize its impact on customer service.
       - List three types of attitude anchors and explain their differences.
       Transcript: Determine the value of outstanding customer service

       When people think about outstanding customer service, there's often an employee who goes above and beyond to be the hero. Think about an experience where you received outstanding customer service. There's a good chance that an individual employee went above and beyond to make it happen. Have you ever wondered why they gave that extra effort? People go above and beyond, because they get something out of it. Even if it's just the satisfaction of knowing they made a difference. Let's explore some of the ways you, your coworkers and even your organization might benefit when you make the effort to provide outstanding customer service. You can download the value of outstanding service worksheet to help you, or just jot down some notes on a blank piece of paper. A good place to start is to look at how you personally benefit from providing your customers with service that exceeds their expectations. Make a list of what you gain from putting in that extra effort.
It may help to think about a specific situation where you went out of your way to delight a customer. Here's some examples that might be on your list. Happy customers are easier to serve. You enjoy helping people, and you feel a sense of accomplishment when you are able to help someone else solve a problem. We can also have a positive impact on our coworkers when we personally provide outstanding service. Try making a list of ways your extra effort might benefit the people you work with. This time, it might be helpful to think about how you felt when one of your coworkers delivered outstanding service. Here's some examples that might be on that list. Your coworkers will have to fix fewer problems. Great service brings positive energy to the entire team, and you can be a positive role model to your colleagues. Customers often look at the people who serve them as representatives of the entire organization. As a third step in this exercise, make a list of benefits your organization receives when you personally provide outstanding customer service. Here are a few examples that might be on that list. Increased profits, retained customers, and positive word of mouth from customers who refer your organization to others. Hopefully this exercise helped you identify some reasons that providing outstanding service is important to you. Whenever you have a tough day, reread the list you've just created and reflect on why you worked so hard to help your customers. Customer service isn't always easy. But the important thing to remember is that you can choose to give that extra effort to be outstanding.

      customer service

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      • *The authors claim that bin2 has a "confused" phenotype, which they define as high variability in shoot versus root lengths along with a low degree of response to water limitation. bin2-1 is a semi-dominant gain-of-function mutant, which can only be propagated as a heterozygote (homozygous individuals are viable, but don't produce seeds). There is no mention in the manuscript about genotyping or selection of homozygous bin2-1 individuals for the phenotyping assays. Could the high variability observed in fact be caused by the authors looking at a segregating population of bin2-1?*

      By propagating plants under optimal growth conditions over > 4 months at the TUMmesa ecotron, we were in fact able to obtain over 24 individual homozygous bin2-1 plants. We distinguish homo- and heterozygous seed by (i) adult phenotype, (ii) segregation in the next generation, (iii) root:shoot ratios from dark-grown seedlings on plate and (iv) sequencing of the TREE domain (as shown in Fig. 2e). Therefore, we are sure to have used only homozygous mutants in our analysis. This is now specified in the supplementary method S5.

      *The authors state that bin2 mutants had considerably more severe phenotypes than other BR biosynthesis, perception, or transcription factor mutants. This is like comparing apples to oranges, as the set of mutants they've examined consists of gain-of-function and partial loss-of-function alleles. Null alleles for BR biosynthesis (e.g. cpd, dwf4), perception (bri1brl1brl3 triple mutants) and transcription factors (bzr1bes1beh1-4 sextuple mutants) are described in the literature and would need to be tested before arriving at such a conclusion. *

      This is an important point and the nature of all alleles was and still is clearly outlined in Table S1 “Lines used in this study”. We have obtained and propagated bri1brl1brl3 triple mutant seed from Christian Hardtke (Kang et al., 2017), as well as null cpd alleles from NASC and these now complement or replace det2-1 and bri1-6 in our analysis. We compare null alleles, semi-dominant or dominant or higher order null alleles with each other. To make these comparisons clear we have highlighted these different allele types in the manuscript as depicted in the table, with null in regular font, semi-dominant or dominant in bold and higher order mutants underlined. This is described in Table S1 and in the figure legends, where applicable. We have not been able to obtain and propagate enough seed in the period of review to extend the analysis to sextuple transcription factor mutants. Therefore, we have removed the comparison between brassinosteroid mutants and now refer to the importance and role of the brassinosteroid pathway in general and, more specifically, to BR signaling rather than to BIN2.

      *For most of the phenotyping experiments a "RQ ratio" is presented. This is the ratio adjustment of the mutant/ratio adjustment of WT. While this derived quantity is useful for interpretation, we're missing plots of the raw data, and particularly those that show the underlying distribution of data points. *

      We understand that the RQratio (Fig. 4e) value is a step removed from the raw data. Please note that we also show the RQshoot (Fig. S8a) and the RQroot (Fig. S8b) in the supplement. We now depict violin plots in Fig. 4a-c and Fig. S7 as the best representation of the raw data, as follows:

      Results page 10: “The violin plots compare organ length distributions in mutants versus the corresponding wild-type ecotype, which depicts dwarfism in some brassinosteroid mutants. It is also apparent that wild-type (Col-0) root length varies under water-deficit in the dark (Fig. S7). Although we have optimized protocols for PEG plates to the best of our ability, there is still a lot-to-lot and plate-to-plate variation. This emphasizes the need for normalizing each mutant line to its corresponding wild-type ecotype on the same (PEG) plate in the same experiment. To this end, the response to water stress in the dark was represented as a normalized response quotient (RQ), which is an indication of how much the mutant deviates from the corresponding wild type (Fig. 4e; see methods).”

      The RQhypocotyl, RQroot and RQratio are a necessary consequence of the variance in the data, and we consider them to be the most relevant metrics. Representative experiments were chosen from at least three replicates on the basis of RQ and P values (as specified in the legends of Fig. 3 and Fig. S10).
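The normalization described above can be sketched in a few lines of Python. This is only an illustration, not the authors' method: the exact RQ formula is in the manuscript's methods and is not reproduced here. The sketch follows the reviewer's description (adjustment of the mutant divided by adjustment of the wild type) and assumes that "adjustment" means the ratio of mean organ length under water deficit in the dark to mean organ length in the dark. The function names and the lengths are hypothetical.

```python
from statistics import mean

def adjustment(stressed, control):
    """Fold-change of mean organ length under stress relative to control (assumed definition)."""
    return mean(stressed) / mean(control)

def response_quotient(mut_stressed, mut_control, wt_stressed, wt_control):
    """Mutant adjustment divided by the adjustment of the wild type grown on the same plate."""
    return adjustment(mut_stressed, mut_control) / adjustment(wt_stressed, wt_control)

# Hypothetical root lengths in mm, made up purely for illustration.
wt_dark = [10.0, 11.0, 9.0]
wt_darkW = [15.0, 16.0, 14.0]
mut_dark = [8.0, 8.0, 8.0]
mut_darkW = [8.0, 9.0, 7.0]

rq_root = response_quotient(mut_darkW, mut_dark, wt_darkW, wt_dark)
print(f"RQ root = {rq_root:.2f}")
```

Under this assumed definition, a value near 1 would mean the mutant adjusts to water deficit much as the wild type on the same plate does, and a value below 1 would mean a weaker response. Dividing by the same-plate wild type is what absorbs the plate-to-plate variation mentioned above.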

      *Root growth involves both cell division in meristematic cells at the tip of the root and subsequent elongation as cells exit the meristem and begin to differentiate. The authors claim a nine-fold difference in CycB1,1:GUS in the root meristem in dark vs darkW, however their images show similar CycB1,1:GUS expression patterns. Furthermore, the meristems of darkW are actually smaller than dark, which would be unexpected if cell division was increased.*

      We have reviewed the raw data again, applying blinding to avoid bias, and chosen a more representative image for the dark; the mitotic indexes are represented in a violin plot (Fig. 6c) to better show the distribution of datapoints. The conclusions are unchanged. We reimaged the wild-type under light, dark and darkW, specifically focusing on meristem properties and on final cell length. The results are presented in Fig. 6, Fig. S14, Fig. S15 and described as follows:

      Results page 14:

      “It is generally accepted that root growth correlates with the size of the root apical meristem (RAM; Beemster and Baskin 1998). Meristem size was assessed by computing the number of isodiametric and transition cells (González-García et al., 2011; Verbelen et al., 2006; Method S8). In addition, we applied a Gaussian mixed model of cell length to distinguish between short meristematic cells and longer cells in the elongation zone (Fig. S14; Fridman et al., 2021). Meristem size was shortest under water deficit in the dark (Fig. 6a; Fig. S15a,b) and, surprisingly, did not correlate well with final organ length (Fig. 1c; Fig. 6g).”
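The idea of separating short meristematic cells from longer elongating cells by fitting a model to cell lengths can be illustrated with a small sketch. It reads the "Gaussian mixed model" above as a two-component Gaussian mixture, which is an assumption, and fits it with a plain expectation-maximization loop. It does not reproduce the procedure of Fridman et al. (2021) or the authors' Method S8. The function names and the cell lengths are hypothetical.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(lengths, iterations=100):
    """Fit a two-component 1D Gaussian mixture by expectation-maximization."""
    xs = sorted(lengths)
    n = len(xs)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the sorted data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    var = [overall_var, overall_var]
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each cell length.
        resp = []
        for x in xs:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weights, means and variances from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, 1e-6)
    return weight, mu, var

def is_short(x, weight, mu, var):
    """True if x is more likely to belong to the component with the smaller mean."""
    short = 0 if mu[0] <= mu[1] else 1
    p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
    return p[short] >= p[1 - short]

# Hypothetical cell lengths in µm, made up purely for illustration.
cell_lengths_um = [5.0, 6.0, 5.5, 6.5, 5.8, 6.2, 40.0, 55.0, 48.0, 60.0, 52.0, 45.0]
weight, mu, var = fit_two_gaussians(cell_lengths_um)
short_cells = [x for x in cell_lengths_um if is_short(x, weight, mu, var)]
print(len(short_cells), "cells assigned to the short (meristem-like) component")
```

In a sketch like this, the cells assigned to the short component along a cell file would give a model-based estimate of meristem size, as an alternative to counting isodiametric and transition cells by eye.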

      Discussion page 16:

      “it appears counterintuitive that meristem size and organ length do not correlate in our conflict-of-interest scenario. Questions arise as to why the meristem is smaller under water deficit in the dark even though the mitotic index is higher than in the dark, and how growth is promoted under our additive stress scenarios. An important difference between our conditions and those described by others is that we germinated seed under limiting conditions in the dark in the absence of a carbon source… When water stress was applied in the dark, the mitotic index increased, but the newly produced meristematic cells immediately elongated, thereby exiting the meristem. As a consequence, meristem size remained small despite the increased number of mitotic cells. It appears that what our study shows is a novel paradigm for root growth under limiting conditions, which depends not only on shoot-versus-root trade-offs in the allocation of limited resources, but also on an ability to deploy different strategies for growth in response to abiotic stress cues.”

      We are not aware of any other study that has addressed root growth under water deficit in the dark and in the absence of a carbon source.

      • *In addition, the authors claim that the longer root length in dark water stress was at least in part due to increased elongation (Fig. 7c). Elongation was only assessed by looking at the first elongating cell (~10-14 µm) and the differences found are on the order of ~2 µm, but final cell size in Arabidopsis roots often reaches several hundred µm. Therefore, a comparison of final cell size would be more appropriate.*

      Results page 14:

      “mature cell length… was highest in the dark, the condition with the shortest roots (Fig. 6b). Thus, neither meristem size nor mature cell length account for the fold-change in final organ length (Fig. 6g).”

      *Finally, the authors phenotype plt1/2 double mutants and show that they fail to elongate in response to water limitation. Their interpretation is that this supports a centralized control model for the root apical meristem. PLT1/2 are important determinants of meristem function and are necessary to maintain stem cell identity. Given the strong phenotype of plt1/2 double mutants it is not surprising that they are unable to elongate in response to this stimulus. This does not necessarily indicate that only the RAM controls root growth, but rather that functional stem cells are required for root growth, which also involves subsequent steps such as cell elongation. *

      This is an important point and we thank the reviewer for pointing it out. We now write:

      Results page 15:

      “Taken together, the cell length and anisotropy curves (Fig. 6) and genetic analyses (Fig. 6; Fig. S15f; Fig. S16) suggest that root length under our different environmental conditions is regulated by (i) the mitotic index, (ii) the timing of cell elongation or of exit from the meristem and (iii) cell geometry. We also conclude that these are differentially modulated to account for increased root length under different environmental conditions (Fig. 6c-e).”

      We have also tempered the conclusion and model (Fig. 7c) to state that RAM function accounts “in part” for root growth. However, it is to be noted that mature cell length in our study did not correlate with root length (Fig. 6b, 6g). Our conclusion is now based not solely on plt1plt2 but also on a careful and quantitative cellular analysis of the root apical meristem in the wild type and in bin2-3bil1bil3 mutants. The major contribution of our study, however, is the difference between the different conditions, and the ability to respond to stimulus.

      *Reviewer #1 (Significance (Required)): *

      * While the study system and some of the findings in this manuscript are interesting, there are major flaws in the authors' primary claims. *

      Contested claims have been (i) deleted where unessential to the storyline or (ii) substantiated by independent methods.

      *Reviewer # 2 *

      1. I recommend exchanging shoot for hypocotyl where hypocotyls were examined, to avoid confusing the readers.

      We thank the reviewer for pointing this out and have exchanged shoot for hypocotyl throughout.

      2. The authors have chosen SnRK2 (and should also indicate it in all Figures as SnRK2, to not confuse the readers with SnRK1), and implement ABA signaling in parallel to BR action, but this must be proven in higher order mutants of both pathways; at the moment the results are too preliminary to allow conclusions. *

      We concur with the reviewer that higher order mutants between the BR and ABA pathways would be required to make this claim. We also concur that this would require numerous generations and therefore that it does not lie within the scope of this manuscript.

      • When the authors are interested in shoot dominance/photosynthetic activity, why didn't they look on snrk1 mutants, which are known to regulate those processes. *

      The issue of energy signaling is a key one, and we mention this in the final “perspective” paragraph of the discussion (p. 18) as follows:

      “As a limited budget is an essential component of our screen conditions, the role of energy sensing and signaling (Baena-González and Hanson, 2017) in growth tradeoffs will need to be elucidated.”

      • In Fig6d the authors propose a sketch of the mechanism, but the data of this study don't show direct interaction of the pathways and as indicated in the figure text parts of the information are taken from other papers, I recommend to remove this sketch or shift it to the supplements. *

      We concur with the reviewer and have deleted former panels 6d, 6e and 6f as well as reference to the mutants these included. We now focus on the BR pathway, as discussed below.

      *To discriminate the role of downstream BR signaling events from other roles of BIN2, I suggest to complement the data with pharmacological experiments (eBL or bikinin where appropriate), and if possible to implement phenotyping of OE lines. *

      In response to this comment, we attempted bikinin experiments. Unfortunately, it is difficult to germinate seed on bikinin and seedlings grow poorly on this shaggy-like kinase inhibitor. As the assay relies on seed germination rather than on seedling transfer, applying bikinin was suboptimal. Because of the requirement for germination in the dark, and in lieu of eBL or PPZ or a combination thereof, we now include a null allele of a BR biosynthesis mutant, cpd, in Fig. 3b, to replace the leaky det2-1 mutant we had previously used.

      How many independent ko lines were tested? Can the authors confirm that the BR-independent phenotype indeed corresponds to BIN2 activity and not to an off-target effect?

      Four independent bin2 mutants (B1, bin2-1, ucu1, dwarf12) were analyzed in our study. In total, 83000 M2 seed were used in our forward genetic screen; of these and for BIN2 the B1 line is the one we rescreened, mapped and characterized. We complemented B1 with bin2-1 and ucu1 alleles and compared it to bin2-1, ucu1 and dwarf12 alleles at the BIN2 locus; these three published mutant lines exhibited the same behavior as B1, including semi-dominance and phenotypes under single versus multiple stress conditions (Fig. 2c cf Fig. 3d; Fig. S6). Fine mapping (Fig. 2d), segregation analysis (Table S2), allele sequencing (Fig. 2e), backcrossing, outcrossing and complementation analysis provide independent lines of evidence that B1 is a BIN2 allele. Please note that the conclusions regarding BIN2 in this manuscript are based not on B1 but on the published bin2-1 and bin2-3bil1bil2 lines.

      We write results page 10:

      “We complemented B1 with bin2-1 and ucu1 alleles and compared it to bin2-1, ucu1 and dwarf12 (Perez-Perez et al., 2002; Choe et al., 2002) alleles at the BIN2 locus; these three published mutant lines exhibited the same behavior as B1, including semi-dominance and partial etiolation.”

      *I further recommend to exchange the pictures in Fig7a showing BRI1-GFP to pictures showing fewer cells, but with higher resolution. *

      We now show higher resolution images in Fig. 7b.

      • Regarding the implementation of photoreceptor mutants and the claim that photoreceptors are more abundant in shoot, I want to point out that the situation is more complex, as the root also reacts differently to light of different quality and quantity, with different responses in the meristem, by inhibiting cell proliferation, or in the elongation zone by triggering negative phototropism. This should be corrected in the text. *

      We are aware that light, especially when Arabidopsis is grown on media, is perceived by photoreceptors within the root system. Phototropic growth would not have affected measurements of root length as measurements were performed in ImageJ with the freehand tool. This is described in the methods on page 6, and in the supplementary method S5. For the model, we have now modulated our discussion as follows:

      Discussion p. 16-17:

      “we postulate that a hypocotyl to root (basipetal) signal coordinates trade-offs in organ growth in response to light (Fig. 7c green arrow). However, and even though photoreceptors are considerably more abundant in the hypocotyl than in the root (van Gelderen et al., 2018), it needs to be borne in mind that photoreceptors in the root could be playing a role in root responses to light or to darkness (Mo et al., 2015).”

      *The data and methods are presented in a clear and sufficient way, as well as the statistical analysis. *

      We thank the reviewer for this positive assessment.

      *Altogether, the hypothesis and amount of work are worth recognizing, but parts of the manuscript read more like a review. I would suggest shortening those parts, reducing the number of lines described and focusing strictly on the BR pathway in response to the environmental changes. Before implementing photoreceptors and the ABA/SnRK2 pathway into the story, the authors should either test higher order mutants between the signaling pathways of interest or come up with a pharmacological screen connecting the data. Therefore I suggest reducing the number of mutants investigated and focusing on BIN2 action, implementing also a pharmacological screen to track a fluorescently tagged BIN2 upon the mentioned treatments, and, if possible, adding proteomics and phosphoproteomics to better understand what changes occur in the bin2 mutant vs WT upon stress. *

      We thank the reviewer for suggesting that we “focus strictly on the BR pathway, in response to the environmental changes”, as this has truly supported us in tightening the story line.

      We have removed the sections of the manuscript that resembled a review and focus entirely on the BR pathway, with additional or tighter mutants. We also look at BIN2 more closely and at a cellular level, with SEM micrographs for the hypocotyl and CSLM for the root tip. The BIN2 interactome on BIOGRID comprises 36 well annotated interactions (https://thebiogrid.org/12898/summary/arabidopsis-thaliana/bin2.html), of which 2 are documented by multiple lines of evidence and 27 are from low throughput studies. Adding adequately validated interactions to this exceeds the scope of this manuscript. Furthermore, as we no longer make the claim that BIN2 mutants are the most severely impacted (see response to reviewer #1), BIN2 is no longer the primary focus of this study; we now refer more loosely to the BR pathway, or to facets thereof referred to as BR biosynthesis, perception, signaling or BR-responsive gene expression. We have also updated and extended the reference list to include references on light perception and energy sensing or signaling. Phosphoproteomics is an important suggestion that we have also taken into the perspective.

      In brief, the manuscript has a new focus on what we consider is its true contribution: a cellular analysis of cell division, elongation and anisotropy in the wild type and in BR mutants under resting or additive stress conditions.

      *Reviewer #3 *

      1. *My major concern is that in the search of a decision mutant the authors performed the first screening not under 'a conflict of interest' scenario but under dark conditions. Can the authors explain the reasons behind this more clearly? *

      The reason we did not use the dark water stress condition as an initial but as a secondary screen is the variability of the response. In the new violin plots (Fig. 4a-c; Fig. S7), the variance especially in root length can be seen to be considerably greater in darkW than in dark even for the wild-type. This is why we initially screened individual M2 seed in the dark and then rescreened M3 populations under darkW conditions. Due to the relatively high variance, all conclusions in the manuscript are drawn on populations of seedlings rather than on individuals.

      We write in the results section on page 9:

      “We initially screened in the dark because the high variance in root growth under water deficit in the dark in the wild-type (see below) would obscure the distinction between putative mutants versus stochastically occurring wild-type seedlings with short roots under darkW.”

      • Related to the above, the role of the BR pathway in etiolation has been well established with the prominent constitutive photomorphogenesis phenotypes of BR related mutants; since both bin2 alleles are impaired in light responses this mutant may behave in dark vs darkW like a wildtype plant in light vs. lightW (maybe also partially as shown in SFig. 5a). However, the authors show that the growth tradeoff was not evident under light conditions (Fig 2). I think that to conclude that bin2 is a decision mutant, more evidence is required to exclude a defect in efficient sensing and signaling of dark conditions as the primary source of the 'confused' phenotype. In addition to the phenotype in SFig. 5a where light responses are attenuated in B1 when compared to Wt, a comparison of gene expression analysis of some established light regulated genes could help to show that bin2 is able to efficiently sense the absence of light. *

      This is an important point. We have looked at the expression levels of the light responsive gene LHCB1.2 via qPCR in wild-type Ws-2 versus bin2-3bil1bil2. The data show that the gene expression is light-regulated in bin2-3bil1bil2 seedlings (Fig. S12) and are described in the Results on page 13.

      In addition, Fig. S10 and Fig. S11 are dedicated to a careful analysis of light responses in all the BR pathway mutants we analyze. In Fig. S10d, bin2-1 can be seen to have a significant (P-value […]

      We write, in the Results on page 13:

      “Interestingly, the BR mutant lines with the strongest etiolation phenotypes (cpd and bri1-116brl1brl3, Fig. S11a,b) in the dark were not the ones with the strongest deviation from the wild-type under water deficit in the dark (Fig. S8).”

      3. Cells that fail to elongate in the dark may not be able to reduce their cell length further in the darkW conditions, or only to a limited extent. Since BR-mutants fail to expand hypocotyl cells in the dark, an analysis of the hypocotyl epidermis cell length in bin2 mutants compared to wt in light vs dark vs darkW (as in Fig. 8c) could be a feasible experiment to exclude that the general BR-related cell elongation defects led to the confused phenotypes of this mutant.

      This is an excellent suggestion and we thank the reviewer for pointing it out. Accordingly, bin2-1 mutants were imaged via scanning electron microscopy (SEM) and cellular parameters assessed. We also investigated root meristem properties in bin2-3bil1bil2, which had the most aberrant root response to water stress in the dark (Fig. 3e; Fig. S8b). Our new observations are described in Fig. 5, Fig. 6h-j, Fig. S16 and in the results on pages 13-15 as follows:

      “To explore whether general BR-related cell elongation defects led to the confused phenotypes of some BR pathway mutants, we analysed bin2-1 mutants, which were among the most severely impaired hypocotyl response to water stress in the dark (Fig. S8a). The data show a most striking impact of bin2-1 on growth anisotropy, assessed in 2D as length/width (Fig. 5f). Indeed, in a comparison between dark and dark with water stress (darkW), the anisotropy of hypocotyl cells decreased considerably in the wild type (Fig. 5c), but showed no adjustment in bin2-1 (Fig. 5f). Cell length alone showed the elongation defect typical of bin2-1 mutants, with a much greater deviation from the wild type under darkW than under dark or light conditions; nonetheless, there was a significant length adjustment to water stress in the dark, even in bin2-1 (Fig. 5e). These observations suggest that the impaired bin2-1 hypocotyl response can be attributed to an inability to differentially regulate cell anisotropy in response to the simultaneous withdrawal of light and water. ….

      Meristem size and mature cell length followed the same trends in a comparison between bin2-3bil1bil2 (Fig. S16a, S16b) and the wild type (Fig. 6a, 6b), but the extent of elongation in cells proximal to the QC differed (Fig. S16c). Indeed, bin2-3bil1bil2 length and anisotropy curves lacked the steep slopes characteristic for darkW in the wild type (compare the green arrows in Fig. 6d, 6f & 6j to the purple arrows in Fig. 6j & Fig. S16c). We conclude that bin2-3bil1bil2 mutants fail to adjust their root length due to an inability to differentially regulate the elongation of meristematic cells in the root in response to water stress in the dark.”

      • The experiments with the BR-deficient and signaling mutant and the bypass mutant may suggest that BR hormone is playing a relative minor role in the 'decision activity' of BIN2. bri1-6 was described to respond like wildtype (page10 line 6-8). Since this seems because of normal root responses in dark vs. darkW (Fig. 5) it could also be caused by the role of BRL1 and BRL3 in root drought responses (Fabregas et al., 2018). To verify if functional BRL1 and BRL3 in bri1-6 could cause the root response to water stress an additional experiment with bri1,brl1,brl3 triple mutant is required; In my opinion this is very important to state if the BR input is at all required for BIN2 signal integration or not. *

      We have extended our analysis to include bri1brl1brl3 lines (Kang et al., 2017). These are dwarf mutants, yet able to respond to water stress in the dark with reduced hypocotyl and increased root growth (former Fig. 5c, replaced by new Fig. 3c, shown left). Note that the lines have a null bri1-116 allele and segregate (bri1-/+ brl1-/- brl3-/-) quite clearly, as was verified by propagating seedlings on plate after the scan on day 10 (Supplementary Method S5).

      ***Minor comments:** *

      *5. The authors separate conceptually growth tradeoffs in sensing, signaling, decision making and execution processes. A clearer explanation of the expected phenotypes from mutants in only decision making with and without stress would be interesting to add (page 8)? *

      We have now moved up phya phyb cry1 cry2 quadruple photoreceptor mutant and write:

      Results on page 9

      “Perception mutants would fail to perceive light or water stress; a good example of this is the phya phyb cry1 cry2 quadruple photoreceptor mutant, which had a severely impaired light response (Fig. S4d), but a “normal” response to water stress in the dark (Fig. S4e). In contrast, execution mutants may have aberrantly short hypocotyls or roots that are nonetheless capable of differentially (and significantly) increasing in length depending on the stress conditions. Decision mutants would differ from perception or execution mutants as they would clearly perceive the single stress factors yet fail to adequately adjust their hypocotyl/root ratios in response to a gradient of single or multiple stress conditions. Failure to adjust organ lengths would be seen as a non-significant response, or as a significant response but in the wrong direction as compared to the wild-type. We thus used organ lengths, the hypocotyl/root ratio and the significance of the responses as decision read outs. We specifically looked for mutants in which at least one organ exceeded wild-type length under darkW.”

      Later in the results on page 11 and in the legend to Fig. 4 we pick up on this as follows:

      “For bin2-1, the response to water stress in the dark was severely impaired: the hypocotyl and root responses were non-significant …bin2-3bil1bil2 mutants fit the above definition of decision mutants as they have a significant root response but in the wrong direction as compared to the wild-type, as denoted by red asterisks (Fig. 3e)…

      Figure 4. … bin2-3bil1bil2 mutants qualified as decision mutants on 3 counts: (i) failure to adjust the hypocotyl/root ratio to darkW (the ratio for darkW is the same as for dark in panel c), (ii) low or non-significant P-value (see panel f below) and (iii) one organ (here the hypocotyl in panel a) exceeded wild-type length under darkW.”

      Line 26 page 17: BR responses in the epidermis of the hypocotyl have been shown to be already sufficient to control hypocotyl growth (Savaldi-Goldstein et al 2007), showing that not all cells of the hypocotyl need to receive the signal (at least in the case of brassinosteroids)

      We have deleted the sentence because it is too speculative. However, the issue of different tissue layers is now mentioned in the perspective on page 18, as follows:

      “3D imaging will be required to assess the impact of abiotic stress and/or of BR signalling on different cell files or tissue layers in the root (see Hacham et al., 2011; Fridman et al., 2014; Fridman et al., 2021; Graeff et al., 2021).”

      Because of the importance of distinguishing between different cell files and cell layers, we have now removed the confocal images of BRI1-GFP under the different environmental conditions (formerly Fig. 7a); this needs to be extended to a 3D analysis, which is not within the scope of this manuscript.

      1. *Page 6 Line 11: In the volcano plots the mean RQ ratio is shown in Fig. 6c and 6f. *

      We thank the reviewer for pointing this out. We had accidentally written median RQ ratio; this has been rectified in the results text.

      *Some parts of the ms could be shortened and the amount of Fig. could be reduced. Fig. 1-3 could be merged as one figure showing the optimal conditions to analyze tradeoffs in shoot vs. root growth and all the conditions not suitable could be supplementary figures. *

      We concur with the reviewer and have merged the first three figures as suggested. Reviewer #2 has also requested that we slim the manuscript and all reviewers request that we strengthen our conclusions on the brassinosteroid pathway mutants. To reduce the number of figure panels, we have removed the analysis of all mutants that are not in the BR pathway, with the exception of the quadruple photoreceptor mutant in Fig. S4d,e and plethora mutants in Fig. S15. Nonetheless, incorporating the new data generated in response to reviewer comments leaves us with 7 main and 16 supplementary figures.

      *In the ms several experiments are described as 'screen' this is confusing with the forward genetic screen that was performed. *

      This is indeed ambiguous. We now use the terms “single versus multiple stress conditions/additive stress/conflict-of-interest scenario ” versus “forward genetic screen”.

      *Reviewer #3 (Significance (Required)): *

      * Mechanisms how growth trade-offs between multiple stresses are controlled are highly interesting. Growth vs. biotic stress tradeoffs have already been investigated and were found to be interdependent with light (Leone et al. 2014; Campos et al 2016; Fernandez-Milmanda et al. 2020) and hormone signaling (Lozano-Duran and Zipfel et al., 2016 and Ortiz-Morea et al 2020; van Butselaar and van den Ackerveken, 2020). Less is known about growth tradeoffs between two abiotic stress responses (Bechtold and Field, 2018; Hayes et al., 2019). The separation of root meristem growth and cell expansion in the hypocotyl is interesting. Whether the two directional root-to-shoot and shoot-to-root signals are independent or whether they may employ the same mechanism with a different output remains open. Different sensitivities of organs and cell types to BRs have for example been reported (Müssig et al. 2003 and Fridman et al. 2014). The finding that BIN2 most likely acts to integrate multiple signals is in line with the reported roles of BIN2 in crosstalk with several pathways (reviewed by Nolan et al. 2020). In my point of view, it remains to be strengthened if this is through 'decision making' and not through signaling and execution. I think if the authors carefully separate the defects in bin2 this work will be interesting to many plant biologists. *

      We thank the reviewer for highlighting references we had not referred to in the former draft. The references pertaining to the growth versus defense trade-off are now included in the introduction (page 3) and the ones on abiotic stress factors in the Discussion on page 18:

      “In addition to its role in light and drought responses… BIN2 has been implicated in regulating hypocotyl elongation in response to far-red light and salt stress (Hayes et al., 2019). Studies on responses to abiotic stress factors have typically addressed growth arrest or tradeoffs between growth and acclimation (Bechtold and Field, 2018). Indeed, root growth is inhibited by, for example, phosphate deprivation or salt stress (Balzergue et al., 2017; West et al., 2004). Recent efforts have addressed strategies for engineering drought resistant or tolerant plants that do not negatively impact growth (Fàbregas et al., 2018; Yang et al., 2019). In contrast to other studies, here we look at two abiotic stress factors that promote organ growth. Indeed, hypocotyl growth is promoted by darkness or low light and primary root growth by water deficit in this study.”

      We emphasize the above point about decision making in the discussion. In the introduction and early on in the results we introduce conceptual frameworks for decision making. Yet after a forward genetic screen and mutant characterization, we revise this in the Discussion on page 18 as follows:

      “In the judgement and decision-making model for plant behaviour put forth by Karban and Orrock (2018), signal integration might be considered integral to judgement. ….Whether judgement and decision making can be distinguished from each other empirically remains unclear. As BR signalling regulates cell anisotropy and growth rates in the hypocotyl and root apical meristem, it may play a role not only in signal integration but also in the execution of decisions (or in an implementation of the action; González-García et al., 2011; Vilarrasa-Blasi et al., 2014). Thus, this study does not enable us to empirically distinguish between decision making on the one hand and signalling and execution on the other.”

    1. https://niklas-luhmann-archiv.de/bestand/zettelkasten/zettel/ZK_2_SW1_001_V

      One may notice that Niklas Luhmann's index within his zettelkasten is fantastically sparse. As an example, we might look at the index entry for "system", which links to only one card. For someone who spent a large portion of his life researching systems theory, this may seem fantastically bizarre.

      However, it's not as odd as one may think given the structure of his particular zettelkasten. The single reference gives an initial foothold into his slip box, where shuffling through the cards beyond that idea will reveal a number of closely related cards which follow it. Regular use and work with the system would have allowed Luhmann better memory with respect to its contents, and searching through threads of thought would have potentially sparked new ideas and threads. Thus he didn't need to spend the time and effort to highly index each individual card; he just needed a starting place and could follow the links from there. This tends to minimize the indexing work he needed to do regularly, but simultaneously makes it harder for the modern person who may wish to read or consult those notes.

      Some of the difference here is the idea of top-down versus bottom-up construction. While thousands of his cards may have been tagged as "systems" or "systems theory", over time and with increased scale they would have become nearly useless as a construct. Instead, one may consider increasing levels of sub-topics, but these too may be generally useless with respect to (manual) search, so the better option is to only look at the smallest level of link (and/or their titles) which is only likely to link to 3-4 other locations outside of the card just before it. This greater specificity scales better over time on the part of the individual user who is broadly familiar with the system.
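
      As a rough illustration of this "one foothold, then follow the links" structure, one might sketch it as below. This is only a toy model: the card numbers, titles and links are invented for the example and are not taken from Luhmann's archive, and `explore` is a hypothetical helper, not a feature of any existing tool.

```python
from collections import deque

# Hypothetical miniature slip box: card id -> (title, ids of linked cards).
# All ids, titles and links here are invented for illustration.
cards = {
    "21/3": ("System: entry point", ["21/3a", "21/4"]),
    "21/3a": ("System and environment", ["21/3b"]),
    "21/3b": ("Boundaries", []),
    "21/4": ("Complexity", ["54/1"]),
    "54/1": ("Communication", []),
}

# Sparse index: a keyword points at a single starting card only.
index = {"system": "21/3"}


def explore(keyword, max_cards=10):
    """Walk breadth-first from the index entry, following card-to-card links."""
    start = index[keyword]
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_cards:
        card = queue.popleft()
        order.append(card)
        for linked in cards[card][1]:
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return order


if __name__ == "__main__":
    for card_id in explore("system"):
        print(card_id, "-", cards[card_id][0])
```

      The index stays at one entry per keyword no matter how many cards are added; all of the growth happens in the links between cards, which is roughly the trade-off described above.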


      Alternatively, for those in shared digital spaces who may maintain public facing (potentially shared) notes (zettelkasten), such sparse indices may not be as functional for the readers of such notes. New readers entering such material generally without context, will feel lost or befuddled that they may need to read hundreds of cards to find and explore the sorts of ideas they're actively looking for. In these cases, more extensive indices, digital search, and improved user interfaces may be required to help new readers find their way into the corpus of another's notes.


      Another related idea to that of digital, public, shared notes, is shared taxonomies. What sorts of word or words would one want to search for broadly to find the appropriate places? Certainly widely used systems like the Dewey Decimal System or the Universal Decimal Classification may be helpful for broadly crosslinking across systems, but this will take an additional level of work on the individual publishers.

      Is or isn't it worthwhile to do this in practice? Is this make-work? Perhaps not in analog spaces, but what about the affordances in digital spaces, which are generally more easily searched as a corpus?


      As an experiment, attempt to explore Luhmann's Zettelkasten via an entryway into the index. Compare and contrast this with Andy Matuschak's notes which have some clever cross linking UI at the bottoms of the notes, but which are missing simple search functionality and have no tagging/indexing at all. Similarly look at W. Ross Ashby's system (both analog and digitized) and explore the different affordances of these two which are separately designed structures---the analog by Ashby himself, but the digital one by an institution after his death.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      __Manuscript number: __RC-2022-01357

      __Corresponding author(s): __Peter Novick and Gang Dong

      1. General Statements [optional]

      We would like to thank both reviewers for their thorough and constructive evaluation and comments on our manuscript. Following their suggestions, we have reworked our manuscript and added several pieces of new data to address their questions, including (1) evaluation of how the M7 mutant of Sso2 affects its interaction with Sec3 using three independent methods (in vitro); (2) investigation of how the M7 mutant affects the interaction of Sso2 with Sec3 by co-immunoprecipitation (in vivo). We hope that, with all these changes, this manuscript will be suitable for publication in your journal. Detailed point-by-point responses are shown below.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): *

      Using the entire cytoplasmic domain of Sso2 and protein crystallization, Peer and colleagues show that two N-terminal peptides (NPY) of Sso2 synergistically interact with the Sec3 PH domain. This interaction provides an additional low affinity binding site to the previously published interface between the Sso2 four-helix bundle and the PH domain. Mutagenesis, in particular of both NPY motifs, results in reduced cell growth, in the accumulation of transport vesicles at the budding site, and in decreased secretion of invertase and Bgl2. The paper is well written, the data are convincing and the characterization of these novel peptide interaction sites clearly advances the field. Although the exact role of the Sso2 NPY - Sec3 interaction still needs to be established, the overall functional relevance is apparent and thus the paper could be published with minor changes. *

      __Response: __We thank the reviewer for his/her positive comments and clear, constructive feedback.

      *Nevertheless, the authors may consider to address the following issues to improve the manuscript. - To strictly exclude the possibility that the Sso2 NPY motif also interacts with other components of the exocytosis machinery (e.g. Sec1), thereby causing the observed phenotypes, Sec3 mutagenesis of the NPY motif-binding site would be required. *

      __Response: __It would be a good idea to generate reverse mutants on Sec3. However, the pocket on Sec3 bound by the NPY motifs of Sso2 is mostly hydrophobic and contains many semi-buried residues that are in close contact with other residues in the hydrophobic core of the structure (including L78, Y82, I109, V112, V208, etc.; see Fig. S3D, E) and are thus essential in maintaining the folding of Sec3. Making mutations on these residues would destabilize the folding of Sec3. This is why we have not carried out the mutagenesis suggested by the reviewer.

      *- The authors suggest that the NPY-peptide binding contributes to the initial interaction/recruitment of Sso2 to the exocytosis site, defined by the localization of Sec3 (exocyst). Further data sustaining this concept/hypothesis could improve the impact of the manuscript. Thus, an experiment analyzing the co-distribution of the Sec3 with Sso2 would directly support the authors' conclusion. (In Figure 7, the authors already show the highly polarized distribution of Sec3-3xGFP.) The M7 mutant could impact the distribution of Sso2. In addition, it would be helpful to determine to which degree the Sso2 NPY - Sec3 PH domain interaction increases the overall affinity of Sso2 for the Sec3 PH domain; e.g. comparison of the binding of Sso2 (1-270) wt and M7 to Sec3 PH domain using ITC. *

      Responses:

      • We greatly value the reviewer’s suggestion. For the suggestion to investigate how the M7 mutant affects the co-distribution of Sso2 with Sec3 in yeast, we have tried a variety of conditions with both the original serum and affinity purified Sso antibodies. In neither case did we see a clear concentration at sites where we would expect to see Sec3, such as the tips of small buds. We were able to see some detectable concentration of HA-tagged Sso2 in small buds using anti-HA Ab, but it would be difficult to tag the M7 mutant at the same site since it is so close to the M7 mutation. We are also worried that the tag might interfere with Sec3 binding due to the proximity. Given the lack of detectable concentration of WT Sso2, it would not be possible to see a loss of localization in M7.
      • For the suggestion to check the binding of Sec3 with either the WT or M7 mutant of Sso2 (aa1-270), we have generated the M7 mutant within the same fragment of Sso2 as the WT (i.e. aa1-270) and carefully checked how this M7 mutant affects the interaction of Sso2 with the Sec3 PH domain using three independent methods. Our ITC data show that WT Sso2 bound Sec3 very robustly, with a Kd of approximately 2 µM (Fig. 8C). Surprisingly, however, the M7 mutant of Sso2 (aa1-270) completely abolished its interaction with Sec3 (Fig. 8D). To further verify this observation, we carried out electrophoretic mobility shift assays (EMSA) and size-exclusion chromatography (SEC). Our EMSA data on a native PAGE gel show that WT Sso2 (aa1-270) bound Sec3, whereas the M7 mutant did not (Fig. S5A, B). Similarly, our SEC data demonstrate that Sec3 was co-eluted with WT Sso2 in the higher molecular weight peak; in contrast, Sec3 and the M7 mutant of Sso2 (aa1-270) were eluted in separate peaks and no stable complex of the two was formed (Fig. S5C, D). All these new data confirm that the NPY motifs play an essential role in maintaining the stable interaction between Sso2 and Sec3, which would explain why the M7 mutant gave such dramatic phenotypes in vivo (Fig. 4B-E; Fig. 5D-F; Fig. 6D, E).

      *Minor point: In the discussion, the authors should mention to which degree the NPY binding site within Sec3 is accessible for / occupied by other known exocyst components, or PI(4,5)P2, etc. *

      __Response: __Thank you for the suggestion. A new diagram has been added to Fig. 9E to compare the structures of the previously reported Sec3/Rho1 complex and the Sso2/Sec3 complex determined by us. It shows that the NPY binding site on Sec3 is on the opposite side of the membrane-binding surface patch. The NPY binding site is also far away from the Rho1 interacting site on Sec3 and thus does not interfere with Rho1 binding either.

      *Reviewer #1 (Significance (Required)):

      The manuscript significantly contributes to our understanding of how the vesicle tethering machinery interacts and coordinates the assembly of the membrane fusion machinery and will be of broad interest in the field of membrane trafficking. I am not an expert in X-ray crystallography. *

      __Response: __We sincerely appreciate this reviewer’s positive feedback.

      ***Referees cross-commenting**

      I agree with the comments of the other reviewer. It would be nice to show the effect of the M7 mutant in a reconstituted liposome fusion assay, but as already mentioned this may require an additional collaboration. Whether the relatively weak Sec3 - NPY interaction can be resolved in the liposome fusion assay needs to be shown.*

      __Response: __Please check our detailed answer to the other reviewer’s question about this.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): * The manuscript of Peer et al. describes the structural characterization of the interaction of the syntaxin-like Sso2 protein with the exocyst subunit Sec3. The authors identify here a dual NPY motif at the N-terminal part of Sso2 that binds to Sec3 and thus confers functionality. Using x-ray crystallography, they show a nearly full-length Sso2 in complex with Sec3, which reveals how Sso2 binds to Sec3. Subsequent mutagenesis shows that both NPY motifs act together in binding, and are both required for functionality in vivo, using established assays in localization of exocyst subunits, secretion assays and growth tests. Their data suggest an overall model of how Sso2 is efficiently recruited by exocyst to promote vesicle secretion.

      This is __an overall very complete and clear manuscript__, where the authors nicely demonstrate how the two NPY motifs both contribute to efficient Sso2 interaction with Sec3. Their data further show that each motif alone can contribute to function, whereas loss of both motifs (the M7 mutant) results in deficient binding. Likewise, their established assays to determine cellular importance of the NPY motifs in Sso2 show that trafficking and localization in the secretory pathway is strongly impaired in the mutant. I only have a few questions and suggestions. *

      __Response: __Thank you for the positive feedback.

      *1. The authors present in Figure 4 the mutants. I recommend to show the alignment of the mutants (M5,M6,M7) similar to panel A in Figure S4 here to orient the reader. They could also be listed in Figure 3, since the authors have here the sequences. *

      __Response: __Alignment of M5-M7 has been added in Fig. 4A as suggested. Thank you.

      2. The authors previously showed that Sso2 mutants affect the Sec3 driven assembly and also the fusion. I am wondering if they have the tools ready to also conduct this assay with their M7 mutant, which has the strongest defect. I am aware that this may be challenging if the tools are not established here as in the previous collaboration (Yue et al., 2017). It may provide additional information on the functional crosstalk.

      Responses:

      • Thank you for the suggestion. However, we do not think it is necessary to perform such an assay based on our new results. As shown in Fig. 8C, D and Fig. S5, we found that the M7 mutant of Sso2 (aa1-270) completely abolished its interaction with Sec3, which is in contrast to the robust interaction between the WT Sso2 (aa1-270) and Sec3. Therefore, we expect that the M7 mutant would fail to accelerate liposome fusion in the same way as we had previously seen for the WT Sso2.
      • On the other hand, we have to admit that performing such an assay would indeed be challenging for us, as the PhD student who had carried out the in vitro liposome fusion assay has left our previous collaborator’s lab, and it would take quite a while to re-establish the assay in our own group and to optimize various parameters in that assay.

      *3. Along the same line, it would be good if the authors show that the mutation also impairs the interaction of Sec3 and Sso2 in vivo.*

      __Response: __We appreciate the reviewer’s suggestion and have carried out co-immunoprecipitation of Sec3-3×Flag and Sso2 from yeast extract to find out how the M7 mutant affects Sso2’s interaction with Sec3 (Fig. S6). Our results show that in contrast to the clear signal of WT Sso2 pulled down by Sec3-3×Flag, the pull-down band for the M7 mutant was much weaker and at a similar level to the negative control. This is consistent with what we saw in our in vitro binding assays (Fig. 8D; Fig. S5).

      *4. I really like the similarity of the different Munc18-Syntaxin interactions and the Sec3-Sso2 interaction. Do the authors think that Sec3 is an ancestral fragment of a Sec1 like protein, which just maintained this interaction? *

      __Response: __This is a very interesting idea. However, it seems too speculative to us to draw such a conclusion. It could also be due to co-evolution in function for Sec3 to use a simpler structure (i.e. PH domain) to mimic syntaxin binding of SM proteins and to employ the extra “add-on” NPY motifs as a handle to facilitate and regulate their interaction.

      *5. Small mistake in the discussion: "plasmas membrane"*

      __Response: __This has been corrected. Thank you.

      *Reviewer #2 (Significance (Required)): Important advance in our understanding of Exocyst function, which deserves publication. I only had minor issues that can be addressed quickly. *

      __Response: __We sincerely appreciate the reviewer’s positive feedback and constructive suggestions.

    1. Author Response

      Reviewer 3

      This is work by an internationally recognized group with strong expertise in sophisticated single-molecule microscopy assays in cells. They present here a single-molecule fluorescence-based assay for proximity in the nanometer range.

      It has long been reported that cyanine dyes such as Cy3, Cy5 and derivatives such as AF555, AF647 can undergo a photoswitching mechanism by which the shorter wavelength dye when being excited can switch the longer wavelength dye which is in a dark state back into the bright state. And it has furthermore been reported that this switching mechanism is not based on FRET, as the distance requirement is more stringent (up to ~ 2 nm). However, this mechanism has not been fully explored for the investigation of molecular interactions yet.

      The authors in the present work present a similar mechanism for a different class of rhodamine-based fluorophores, specifically JF549 and JFX650. They describe the discovery of this mechanism in dual-color labeling of a pentameric protein and initial characterization to distinguish it from UV-light-mediated recovery from a pumped dark state as reported for (d)STORM-like measurements. They extend their observation to TMR, JF529 as lower wavelength "senders" and JF646 and JFX646 as longer wavelength "receivers" that can become reactivated into the ground state upon illumination of a nearby "sender". The authors then test activation pulse length and distance dependence and find that longer pulses lead to more recovery and that PAPA of JF549/JFX650 has, unlike previously observed for the Cy3/Cy5 pair, a smaller distance dependence than FRET of the same fluorophore pair. The authors then move on to use both the UV-light-mediated direct reactivation "DR" and proximity-assisted photoactivation "PAPA" to activate different molecules that are either double-labeled for PAPA or singly labeled with JFX650 for DR. They succeeded in four different cases to identify clear population shifts to distinguish molecules of different mobility.

      Overall, I think the authors made an interesting discovery and characterizing this previously poorly characterised interaction for cellular single-molecule experiments is certainly an important effort. The authors make an honest and quite complete effort to work out the practical details of this interaction and designed experiments that convincingly highlight the basic capabilities this technique offers to the detection of verified interactions and the mobility of interacting molecules in cells.

      The weakness is that these capabilities do not seem to be as clear-cut as the reviewer hoped for when starting to read this manuscript. It remains unclear to this reviewer to what extent PAPA molecules can be separated from DR molecules. In all but the last diffusion experiment(s) in Figure 4, PAPA molecules seem to be significantly perturbed by DR molecules, casting doubt on the usefulness in real experiments. Similarly, in Figure 5, a difference is seen but does not allow for quantification. This certainly is not the case for other methods of sensing as well, but maybe the authors could more specifically compare their efforts and the dynamic range to other sensors for example in Figure 5? This would make it easier for the reader to make up their mind if the assay is worthwhile adopting for their system.

      We agree that a problem with PAPA at present is that although PAPA trajectories are significantly enriched for double-labeled complexes, they are still “contaminated” with single-labeled molecules. As we described in the Discussion (and as pointed out by Reviewer 1), we think that one major contribution to this background arises from chance proximity of sender and receiver molecules independent of direct physical interaction. Additionally, some background is expected from continual spontaneous (a.k.a. “thermal”) reactivation of molecules from the dark state.

      In response to the reviewers’ comments, we have tried to quantify more precisely how much PAPA enriches for one population over another by fitting the diffusion spectra of 2-component mixtures to linear combinations of the corresponding individual components (Figure 4–figure supplement 4). We estimate that the fold enrichment of double-labeled molecules ranged from 3.7 to 37-fold between different 2-component mixtures.
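
      The fitting step described above can be illustrated with a minimal sketch. It assumes that each diffusion spectrum is a vector sampled on a shared grid of diffusion coefficients, and it uses synthetic peaks rather than real data. The function names (`fit_mixture`, `fold_enrichment`) are hypothetical, and defining fold enrichment as an odds ratio between PAPA and DR trajectories is an assumption made here for illustration; the response does not spell out the exact definition used.

```python
import numpy as np


def fit_mixture(mixture, comp_a, comp_b):
    """Least-squares weights (w_a, w_b) with mixture ~ w_a*comp_a + w_b*comp_b.

    Negative weights are clipped to zero and the result is normalized to sum to 1.
    """
    design = np.column_stack([comp_a, comp_b])
    weights, *_ = np.linalg.lstsq(design, mixture, rcond=None)
    weights = np.clip(weights, 0.0, None)
    return weights / weights.sum()


def fold_enrichment(papa_weights, dr_weights):
    """Odds of component A vs. B in PAPA trajectories relative to DR trajectories.

    This odds-ratio definition is an illustrative assumption.
    """
    papa_odds = papa_weights[0] / papa_weights[1]
    dr_odds = dr_weights[0] / dr_weights[1]
    return papa_odds / dr_odds


# Synthetic single-component spectra on a log-spaced diffusion-coefficient grid.
grid = np.logspace(-2, 2, 200)


def log_normal_peak(center, width=0.3):
    peak = np.exp(-0.5 * ((np.log10(grid) - np.log10(center)) / width) ** 2)
    return peak / peak.sum()


slow = log_normal_peak(0.1)   # stands in for the double-labeled component
fast = log_normal_peak(10.0)  # stands in for the single-labeled component

# Hypothetical mixture spectra for PAPA- and DR-reactivated trajectories.
papa_spectrum = 0.8 * slow + 0.2 * fast
dr_spectrum = 0.3 * slow + 0.7 * fast

papa_w = fit_mixture(papa_spectrum, slow, fast)
dr_w = fit_mixture(dr_spectrum, slow, fast)
enrichment = fold_enrichment(papa_w, dr_w)
print(papa_w, dr_w, enrichment)
```

      With real spectra the fit will not be exact, so the residual of the least-squares fit is worth inspecting before interpreting the weights.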

      We fully agree that it is critical that researchers who use PAPA be aware of its limitations, so that they do not fallaciously assume that all green-reactivated localizations are protein complexes. To avoid committing a bait-and-switch against our readers, we now state explicitly in the Introduction that PAPA in its current form enriches for complexes but does not provide perfect selectivity. In Appendix 2, we now discuss the problem of background reactivation in more detail and outline what we think will be required to correct quantitatively for this background. Though we believe that such corrections will ultimately be possible, at least in some cases, figuring out how to do this rigorously will require substantial additional development of experimental and computational methods, which we hope the editor and reviewers agree is beyond the scope of the current paper.

      At the end of Appendix 2, we briefly mention another technical problem that we have noticed with SNAP ligand background staining. While this background was negligible for the experiments described in this paper, which involved highly expressed SNAPf transgenes, it may pose a more significant problem for SNAPf-tagged proteins with lower expression levels. We think it is worth mentioning this problem to make readers aware of it and hopefully to motivate the development of better orthogonal pairs of self-labeling tags.

      While there are obviously limitations to PAPA, we think this should not overshadow the fact we have identified a novel photophysical property of commonly used fluorophores and harnessed it to detect molecular interactions in live cells. Our initial proof-of-concept study provides a foot in the door of this new biophysical approach, which we and others will continue to refine. Immediate applications of PAPA could include disambiguation of peak assignments in complex diffusion spectra, confirmation of proposed interactions between proteins (and subsequent investigations into the molecular mechanisms supporting such interactions), or integration into SPT-based high-throughput screening (https://www.eikontx.com/technology) to provide a useful additional readout for each experimental condition.

    1. Author Response:

      Reviewer #1 (Public Review):

      The authors show that the unmitigated generation interval of the original variant of SARS-CoV-2 is longer than originally thought. They argue that in the absence of interventions that limit transmission late in the course of infection, the fraction of transmission events that occur before symptom onset would be considerably lower, and the fraction of transmission events occurring 10 days or more after infection of the index case would be substantially higher.

      These findings improve our ability to accurately estimate the basic reproductive number (R0), to evaluate quarantine and isolation policies, and to model counterfactual intervention-free scenarios. Many applied analyses rely on accurate generation interval estimates. To head off confusion, it would be helpful if the authors could provide more comprehensive guidance about which applied analyses should be informed by the unmitigated generation interval, or the observed generation interval. (E.g. the unmitigated interval is useful for quarantine and isolation policies, but would we ever want to use the unmitigated interval to estimate R?).

      The unmitigated generation-interval should be used for estimation of the R0 of the initial epidemic phase, but not for the R(t) of the current epidemics. Estimation of R(t) must account for changes in generation-interval distributions caused by the invasion of new variants and changes in behavior. When analyzing policies of quarantine, isolation or contact tracing, the unmitigated interval should also be adopted to account for late transmissions.

      We added a few sentences at the end of our introduction to clarify this point:

      “The estimated unmitigated generation-interval distribution could be adopted for answering questions about quarantine and isolation policy, as well as for estimating the original R0 at the initial spread in China. However, estimation of instantaneous R(t) should account for changes in generation-interval distributions, reflecting mitigation effects and the current variant.”

      The analysis estimates a longer generation interval after accounting for three main sources of bias or error that are common in other analyses:

      1. Recently infected individuals are intrinsically overrepresented in data on a growing epidemic. Thus, shorter incubation periods and forward serial intervals are more likely to be observed, even in the absence of interventions. This analysis adjusts for these dynamical biases.
      2. Interventions or behavioral changes can prevent transmission late in the course of infection. This can shorten the generation and serial intervals over the course of an epidemic. This analysis focuses specifically on transmission pairs observed very early, before the adoption of interventions.
      3. The incubation period and generation interval should be correlated - infectors that progress relatively quickly to symptoms should also become infectious sooner (symptom onset occurs near the peak of viral titers). Most existing analyses assume these intervals are uncorrelated, but this analysis accounts for their correlation.

      Overall, the conclusions seem reasonable and well-supported. The observation that the generation interval decreases over the course of an epidemic is also consistent with existing studies that show the serial interval has similarly decreased over time. But given the implications of the findings, I hope the authors can address a few questions about potential additional sources of bias:

      1. It is somewhat reassuring that the generation interval decreases relatively smoothly as the cutoff date increases (Fig. S6), but it would be helpful if the authors address the potential impact of ascertainment biases. One of the main reasons that the authors estimate a shorter generation interval is that they define January 17th, early in the outbreak before interventions and behavioral changes had taken place, as the cutoff point for the infector's date of symptom onset. This cutoff eliminates biases from interventions, but it also severely limits the size of the transmission-pair dataset (Fig. S3), and focusing on this very early subset of cases may increase the influence of ascertainment bias. Prior to January 17th, should we expect observed transmission pairs to involve more severe cases on average? And is the unmitigated generation interval correlated with case severity?

      We thank the reviewer for identifying a source of possible bias that we overlooked. Following the comment, we performed a new sensitivity analysis for the inclusion of the severe cases, summarized in Appendix 1—figure 11.

      Severity of the cases was reported only in Ali et al.’s data, for some of the individuals. In these cases, individuals are divided into one of three conditions: mild, severe (non-fatal) and death. As non-mild cases represent a small fraction of the dataset, we combine them into one category, which we denote as severe.

      Severe cases (including deaths) were overrepresented in the period prior to January 17, with 8 out of 77 cases, compared to 18 out of 745 in the period of January 18-31. The effect of inclusion of severe cases was analyzed by comparing the means of the estimated generation-interval distribution, separately for the two periods in question, using the inference framework with 30 bootstrapping runs. For the earlier period, the estimated means were compared between the dataset with or without the severe cases. For the later period, we also consider the “enriched” dataset, in which severe cases are oversampled for each bootstrap such that the proportion of severe cases matches that during the earlier period (10%). In both cases we see that the effect on the estimated mean generation interval is small.
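
      The “enriched” bootstrap described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `enriched_bootstrap` is hypothetical, and drawing severe and non-severe cases separately (stratified resampling with replacement) is an assumed way of matching the target proportion. Only the counts (8 of 77 early cases, 18 of 745 later cases) come from the response.

```python
import numpy as np


def enriched_bootstrap(is_severe, target_fraction, rng):
    """Bootstrap indices in which severe cases make up about target_fraction.

    Severe and non-severe cases are resampled separately, with replacement,
    and the total sample size equals the original dataset size.
    """
    is_severe = np.asarray(is_severe, dtype=bool)
    n_total = is_severe.size
    n_severe = int(round(target_fraction * n_total))
    severe_idx = np.flatnonzero(is_severe)
    mild_idx = np.flatnonzero(~is_severe)
    picked_severe = rng.choice(severe_idx, size=n_severe, replace=True)
    picked_mild = rng.choice(mild_idx, size=n_total - n_severe, replace=True)
    return np.concatenate([picked_severe, picked_mild])


rng = np.random.default_rng(0)

# Proportion of severe cases in the early period (8 of 77).
early_fraction = 8 / 77

# Later period: 18 severe cases among 745 (placeholder labels, no real records).
late_is_severe = np.zeros(745, dtype=bool)
late_is_severe[:18] = True

sample = enriched_bootstrap(late_is_severe, early_fraction, rng)
enriched_fraction = late_is_severe[sample].mean()
print(f"severe fraction before: {late_is_severe.mean():.3f}, after: {enriched_fraction:.3f}")
```

      Each such resample would then be passed through the inference framework, and the resulting mean generation intervals compared with those from ordinary bootstrap samples.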

      2. The analysis assumes the incubation period follows a fixed distribution, whose parameterization comes from a meta-analysis of previously estimated incubation periods. But p.5 discusses the idea that observed incubation periods are affected by the same dynamical biases as forward serial intervals, "For example, when the incidence of infection is increasing exponentially, individuals are more likely to have been infected recently. Therefore, a cohort of infectors that developed symptoms at the same time will have shorter incubation periods than their infectees on average, which will, in turn, affect the shape of the forward serial-interval distribution." Has the incubation period been adjusted for these dynamical biases, or should it be?

      In our analysis, we use the incubation period distribution from Xin et al. 2021, which already accounts for the backward bias caused by the expanding epidemic, using a corrected growth rate of 0.1/d. Xin et al. showed in their meta-analysis that the mean incubation period reported by the various sources changed according to the dates used by the source. Incubation periods prior to the peak of the epidemic in China were shorter than those from after the peak, in a manner that coincided with the backward correction they performed (using a similar derivation to that suggested by Park et al. 2021). Accordingly, the distribution of incubation period they report is the intrinsic incubation period, after correction for the growth rate of the initial spread in China. We added two sentences in our methods section to clarify this point:

      “In their meta-analysis, Xin et al. found an increase of the incubation period following the introduction of interventions in China, matching the theoretical framework shown above. Their inferred incubation period distribution includes a correction for the growth rate of the early spread, accordingly.”

      Furthermore, we perform a sensitivity analysis for the shape of the incubation period distribution, and show that it has a minor effect on our conclusions (Appendix 1—figure 10).
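
      To illustrate the size of the backward bias discussed above, the sketch below assumes the standard exponential-growth weighting, in which a cohort with simultaneous symptom onset has incubation periods distributed in proportion to f(t)·exp(−rt), where f is the intrinsic distribution and r the growth rate. It further assumes a gamma-distributed intrinsic incubation period, for which the weighted distribution is again gamma with the same shape and scale θ/(1 + rθ). The shape and scale values are hypothetical placeholders, not the parameters reported by Xin et al.; only r = 0.1/d is taken from the response.

```python
import numpy as np


def backward_mean(shape, scale, growth_rate):
    """Closed-form mean of a gamma(shape, scale) delay weighted by exp(-r*t)."""
    return shape * scale / (1.0 + growth_rate * scale)


def backward_mean_numeric(shape, scale, growth_rate, t_max=200.0, n=200_001):
    """Same quantity by direct numerical weighting on a uniform grid."""
    t = np.linspace(0.0, t_max, n)
    density = t ** (shape - 1.0) * np.exp(-t / scale)  # unnormalized gamma density
    weights = density * np.exp(-growth_rate * t)       # exponential-growth weighting
    return float((t * weights).sum() / weights.sum())


# Hypothetical intrinsic incubation period: gamma with mean 4 * 1.5 = 6 days.
shape, scale = 4.0, 1.5
growth_rate = 0.1  # per day

intrinsic_mean = shape * scale
observed_mean = backward_mean(shape, scale, growth_rate)
observed_mean_check = backward_mean_numeric(shape, scale, growth_rate)
print(intrinsic_mean, observed_mean, observed_mean_check)
```

      Under these assumptions the observed (backward) mean is shorter than the intrinsic mean by a factor of 1 + rθ, which is the direction of bias the correction is meant to remove.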

      3. It appears that correlation parameter estimates co-vary with estimates of the mean generation interval (Fig. S6; S13b). Are the authors confident that the correlation parameter is identifiable? How much would the median generation interval estimate in the main analysis change if the correlation parameter had been fixed to 0 (which isn't realistic) or to 0.5 (which might be plausible)?

      As the reviewer pointed out, the correlation parameter estimates co-vary with estimates of the mean generation interval. We further analyzed this relation following the comment. The analysis is summarized in supplemental figures S19-20.

      We first examine the relation between the mean generation interval and the correlation parameter based on the uncertainty estimates, consisting of 1000 bootstrap runs. Appendix 1—figure 12 shows a joint bivariate scatter plot of the parameters, together with contours of equal probability. As can be seen, there is a connection between the parameters. The estimates are centered around the maximum likelihood estimate, with a correlation parameter of 0.75 and a mean generation interval of 9.7 days. The confidence interval for the correlation parameter of 0.45-0.95 corresponds to mean generation intervals in the range of 8-11 days, supporting the conclusion of this study.

      Next, we reanalyzed the dataset while fixing the correlation parameter, as suggested by the reviewer. Appendix 1—figure 13 shows the estimated mean generation interval for fixed correlation parameters with values of 0, 0.25, 0.5, 0.75, 0.9. For each fixed correlation parameter, we performed 100 bootstrapping runs. As can be seen, the results reflect the same connection that can be seen in Appendix 1—figure 12, with probable values in the range of 8-11 days for correlation parameters in the range of 0.5-0.9. Assuming no correlation would cause underestimation of the mean generation interval, matching previous literature (Hart, Maini, and Thompson 2021; Park et al. 2022).

      Reviewer #2 (Public Review):

      There have been several estimates of the generation time and serial interval published for SARS-CoV-2, but as the authors note, estimates can be subject to biases including shifted event timing depending on the phase of the epidemic, correlation in characteristics between infector and infectee, and impact of control measures on truncating potential infectiousness. This study, therefore, has several strengths. It first collates data on transmission events from the earliest phase of the COVID-19 pandemic, then makes adjustments for these potential biases to estimate the generation time in absence of control measures, and finally discusses implications for transmission.

      Given many subsequent aspects of the COVID-19 pandemic have been defined relative to earlier phases (e.g. relative transmissibility of variants, relative duration of infectiousness), understanding the baseline characteristics of the infection is crucial. I thought this paper makes a useful contribution to this understanding, generating adjusted estimates for infectiousness (which is longer than previous estimates) and corresponding values for the reproduction number (which is remarkably similar to earlier estimates, presumably because of the different sources of bias in the growth rate and generation time distribution somehow end up canceling each other out).

      However, there are some weaknesses at present. The study correctly flags several potential sources of bias in estimates, but in making adjustments uses estimates from the literature that themselves could suffer from these biases, e.g. the distribution of incubation period from a 2021 meta-analysis. Although the authors conduct some sensitivity analysis it would be worth including some more explicit consideration of whether they would expect any underlying bias to propagate through their calculations. The authors also conduct some sensitivity analysis around the underlying data (e.g. ordering of transmission pairs), but again it would be useful to know whether there could be systematic biases in these early data. Specifically, the paper references Tsang et al (2020), which highlighted variability in early case definitions - is it possible that early generation times are estimated to be longer because intermediate cases in the transmission chain were more likely to go undetected than later in the epidemic?

      We recognize the potential biases in the transmission pairs data. We therefore developed an extensive framework of sensitivity analyses for identifying biases that could substantially affect the results. In the results section and figure 5, we show that the main study result, that the unmitigated generation-interval distribution is longer than previously estimated, is robust to reasonable amounts of ascertainment bias. We discuss this point at length and have added several supplemental figures to support this claim.

      As reviewer #3 mentioned, we conducted a sensitivity analysis for the inclusion of the longest serial intervals, to investigate possible effects of missing links in the longest transmission pairs. We also discuss why we think it’s not necessary to explicitly model the short intervals that may be unobserved due to missing links.

      “Second, we considered the possibility that long serial intervals may be caused by omission of intermediate infections in multiple chains of transmission, which in turn would lead to overestimation of the mean serial and generation intervals. Thus, we refit our model after removing long serial intervals from the data (by varying the maximum serial interval between 14 and 24 days). We also considered “splitting” these intervals into smaller intervals, but decided this was unnecessarily complex, since several choices would need to be made, and the effects would likely be small compared to the effect of the choice of maximum, since the distribution of the resulting split intervals would not differ sharply from that of the remaining observed intervals in most cases.”

      We added text to the discussion regarding the effect of possible bias in the dataset, explicitly addressing ascertainment bias.

      “Our analysis relies on datasets of transmission pairs gathered from previously published studies and thus has several limitations that are difficult to correct for. Transmission pairs data can be prone to incorrect identification of transmission pairs, including the direction of transmission. In particular, presymptomatic transmission can cause infectors to report symptoms after their infectees, making it difficult to identify who infected whom. Data from the early outbreak might also be sensitive to ascertainment and reporting biases which could lead to missing links in transmission pairs, causing serial intervals to appear longer (For example, people who transmit asymptomatically might not be identified). Moreover, when multiple potential infectors are present, an individual who developed symptoms close to when the infectee became infected is more likely to be identified as the infector. These biases might increase the estimated correlation of the incubation period and the period of infectiousness. We have tried to account for these biases by using a bootstrapping approach, in which some data points are omitted in each bootstrap sample. The relatively narrow ranges of uncertainty suggest that the results are not very sensitive to specific transmission pairs data points being included in the analysis. We also performed a sensitivity analysis to address several potential biases such as the duration of the unmitigated transmission period, the inclusion of long serial intervals in the dataset, and the incorrect ordering of transmission pairs (see Methods). The sensitivity analysis shows that although these biases could decrease the inferred mean generation interval, our main conclusions about the long unmitigated generation intervals (high median length and substantial residual transmission after 14 days) remained robust (Figure 5).”

      It would also be helpful to have some clarifications about methodology, particularly in how the main results about generation time are subsequently analyzed. For example, estimates such as the conversion of generation time to R0 and VOC scalings are described very briefly, so it is currently unclear exactly how these calculations are being performed.

      Following the reviewer comments we made edits to the Methods section in order to make it more readable and clearer. We added subheadings for the various sections. Moreover, we added a section explaining the derivation of the basic reproduction number and clarified the section regarding the VOCs extrapolations.
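
      As a rough illustration of how a generation-interval distribution converts an epidemic growth rate into a basic reproduction number, the sketch below uses the Euler-Lotka relation R0 = 1 / ∫ g(t)·exp(−rt) dt. This is a standard relation and is assumed here; it is not necessarily the derivation added to the Methods. The gamma form of the generation-interval distribution, the shape value, and the growth rate are hypothetical placeholders; only the mean of 9.7 days echoes the estimate quoted earlier in this response.

```python
import numpy as np


def r0_gamma(growth_rate, shape, scale):
    """Closed form of the Euler-Lotka relation for a gamma generation interval."""
    return (1.0 + growth_rate * scale) ** shape


def r0_numeric(growth_rate, gi_density, t_max=200.0, n=200_001):
    """Euler-Lotka relation evaluated numerically on a uniform time grid.

    gi_density may be unnormalized; the grid spacing cancels in the ratio.
    """
    t = np.linspace(0.0, t_max, n)
    g = gi_density(t)
    return float(g.sum() / (g * np.exp(-growth_rate * t)).sum())


# Hypothetical generation-interval distribution: gamma with mean 9.7 days.
shape = 2.0
scale = 9.7 / shape
growth_rate = 0.1  # per day, illustrative


def gamma_density(t):
    return t ** (shape - 1.0) * np.exp(-t / scale)  # unnormalized


r0_closed = r0_gamma(growth_rate, shape, scale)
r0_check = r0_numeric(growth_rate, gamma_density)
print(r0_closed, r0_check)
```

      For a fixed growth rate, a longer or less dispersed generation interval gives a larger R0 in this relation, which is why the shape of the distribution matters as well as its mean.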

      Reviewer #3 (Public Review):

      Sender & Bar-On et al. perform robust analyses of early SARS-CoV-2 line list data from China to estimate the intrinsic generation interval in the absence of interventions. This is an important topic, as most SARS-CoV-2 data are from periods when transmission-reducing interventions are in place, which will lead to underestimation of the potential infectious period.

      The authors highlight two shortcomings in previous approaches. First, the distribution of 'observed' serial intervals (the time between symptom onset in the infector and symptom onset in the infectee) depends not only on the timeline of each infector's infection, but also the epidemic growth rate, which weights the proportion of observed short vs. long serial intervals. The authors argue that by accounting for this weighting, more accurate estimates of the intrinsic generation interval - the metric on which isolation policies are based - can be obtained. Second, the authors find that the original SARS-CoV-2 generation interval distribution has both a higher mean and longer tail than previous estimates when using only data prior to the introduction of interventions. Finally, the authors use publicly available data on viral load trajectories to extrapolate their estimates to other SARS-CoV-2 variants, finding that alpha, delta, and omicron may have shorter generation intervals than original SARS-CoV-2. These findings are important, as case isolation policies are based on assumptions for how long individuals remain infectious. More broadly, these methods will be important for future work to correctly estimate generation intervals in other outbreaks.

      The conclusions are well supported by the data, and a suite of sensitivity analyses give confidence that the findings are robust to deviations from many of the key assumptions. The code is well documented and publicly available, and thus the findings are easily reproducible. Key strengths of the paper include the clarity and rigor of the modeling methods, and the exhaustive consideration of potential biases and corresponding sensitivity analyses - it is very difficult to think of potential biases that the authors have not already considered! I think this is a well-written and well-executed study. The work is likely to be impactful for reconsidering SARS-CoV-2 isolation policies and revisiting generation interval estimates from other data sources. I also expect this to be a key reference and method for future studies estimating the generation interval.

      I have some minor comments on potential weaknesses and interpretation:

      1. Uncertainty in early generation interval estimates. One of the conclusions is that the estimated mean generation interval is longer than the observed mean serial interval. However, this conclusion does not seem justified given that the observed mean serial interval (9.1 days) is well within the 95% CI of 8.3-11.2 days for the mean generation interval. The confidence intervals for the serial interval in figure 2 are also wide for pre-Jan 17th (though presumably these would be reduced if all pre-Jan 17th serial intervals were combined). Further, only 77 of the ~1000 transmission pairs are actually from pre-January 17th. The actual sample size used for these estimates is much smaller than suggested by Figure S1 and thus this should be made clear. Therefore, although the intuition for why observed serial intervals may differ from the generation interval is correct, I do not think that the data alone demonstrate this. A related issue is on ascertainment bias - could the early serial interval data be biased longer because ascertainment is initially poor and thus more intermediate infectors are missed? The authors consider removing particularly long serial intervals to try and account for this, but that does not deal with e.g. chains of multiple short serial intervals being incorrectly recorded as a single long serial interval (but still within 16 days).

      We agree with the reviewer that, due to the large uncertainty, we cannot deduce that the mean generation interval is longer than the mean serial interval. We changed the phrasing to emphasize that this statement is supported by mathematical theory.

      “We note that our estimated mean generation-interval is longer than the observed mean serial-interval (9.1 days) of the period in question. This is supported by the theory (Park et al. 2021) of the dynamical effects of the epidemic -- in contrast to the common assumption that the mean generation and serial intervals are identical. During the exponential growth phase, the mean incubation period of the infectors is expected to be shorter than the mean incubation period of the infectee - this effect causes the mean forward serial interval to become longer than the mean forward generation interval of the cohorts that developed symptoms during the study period. However, these cohorts of infectors with short incubation periods will also have short forward generation (and therefore serial) intervals due to their correlations. When the latter effect is stronger, the mean forward serial interval becomes shorter than the mean intrinsic generation interval, as these findings suggest.”

      Following the comment, we added to Figure S1 the filtering according to date, to reflect the true sample size we use for the main analysis (We renamed it: Appendix 1—figure 1).

      We recognize the potential biases in the transmission pairs data. We therefore developed an extensive framework of sensitivity analyses for identifying biases that could substantially affect the results. In the results section and figure 5, we show that the main study result, that the unmitigated generation-interval distribution is longer than previously estimated, is robust to reasonable amounts of ascertainment bias. We discuss this point at length and have added several supplemental figures to support this claim.

      As reviewer #3 mentioned, we conducted a sensitivity analysis for the inclusion of the longest serial intervals, to investigate possible effects of missing links in the longest transmission pairs. We also discuss why we think it’s not necessary to explicitly model the short intervals that may be unobserved due to missing links.

      “Second, we considered the possibility that long serial intervals may be caused by omission of intermediate infections in multiple chains of transmission, which in turn would lead to overestimation of the mean serial and generation intervals. Thus, we refit our model after removing long serial intervals from the data (by varying the maximum serial interval between 14 and 24 days). We also considered “splitting” these intervals into smaller intervals, but decided this was unnecessarily complex, since several choices would need to be made, and the effects would likely be small compared to the effect of the choice of maximum, since the distribution of the resulting split intervals would not differ sharply from that of the remaining observed intervals in most cases.”
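
      The truncation check quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: the serial intervals are synthetic (gamma-distributed with arbitrary parameters), the sample mean stands in for the authors' refitted model, and the helper name `truncated_means` is hypothetical.

```python
import random

def truncated_means(serial_intervals, cutoffs):
    """Mean serial interval after dropping intervals longer than each cutoff."""
    results = {}
    for cutoff in cutoffs:
        kept = [s for s in serial_intervals if s <= cutoff]
        results[cutoff] = sum(kept) / len(kept)
    return results

rng = random.Random(1)
# Synthetic serial intervals in days (assumed gamma shape/scale, mean about 9).
synthetic = [rng.gammavariate(2.5, 3.6) for _ in range(1000)]

# Vary the maximum serial interval between 14 and 24 days, as in the quoted text.
means = truncated_means(synthetic, range(14, 25, 2))
for cutoff, mean_value in means.items():
    print(f"max serial interval {cutoff:2d} d -> mean {mean_value:.2f} d")
```

      In this toy version the mean can only rise as the cutoff is relaxed, which is the direction of bias the sensitivity analysis is meant to bound.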

      We added to the discussion text regarding the effect of possible bias in the dataset, explicitly specifying the ascertainment bias.

      “Our analysis relies on datasets of transmission pairs gathered from previously published studies and thus has several limitations that are difficult to correct for. Transmission pairs data can be prone to incorrect identification of transmission pairs, including the direction of transmission. In particular, presymptomatic transmission can cause infectors to report symptoms after their infectees, making it difficult to identify who infected whom. Data from the early outbreak might also be sensitive to ascertainment and reporting biases which could lead to missing links in transmission pairs, causing serial intervals to appear longer (For example, people who transmit asymptomatically might not be identified). Moreover, when multiple potential infectors are present, an individual who developed symptoms close to when the infectee became infected is more likely to be identified as the infector. These biases might increase the estimated correlation of the incubation period and the period of infectiousness. We have tried to account for these biases by using a bootstrapping approach, in which some data points are omitted in each bootstrap sample. The relatively narrow ranges of uncertainty suggest that the results are not very sensitive to specific transmission pairs data points being included in the analysis. We also performed a sensitivity analysis to address several potential biases such as the duration of the unmitigated transmission period, the inclusion of long serial intervals in the dataset, and the incorrect ordering of transmission pairs (see Methods). The sensitivity analysis shows that although these biases could decrease the inferred mean generation interval, our main conclusions about the long unmitigated generation intervals (high median length and substantial residual transmission after 14 days) remained robust (Figure 5).”
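
      As a rough illustration of how bootstrap resampling reflects sample size, the sketch below computes a percentile bootstrap interval for the mean of synthetic serial intervals. It assumes gamma-distributed data with arbitrary parameters and resamples the mean only, whereas the authors refit their full model in each bootstrap sample; `bootstrap_mean_ci` is a hypothetical helper.

```python
import random

def bootstrap_mean_ci(data, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(sum(rng.choices(data, k=n)) / n for _ in range(n_boot))
    lower = boot_means[int(n_boot * alpha / 2)]
    upper = boot_means[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

rng = random.Random(42)
# Synthetic serial intervals in days (assumed parameters, not the study data).
large_sample = [rng.gammavariate(2.5, 3.6) for _ in range(1000)]
small_sample = large_sample[:77]  # same size as the pre-January 17th subset

for label, sample in (("n=1000", large_sample), ("n=77", small_sample)):
    lower, upper = bootstrap_mean_ci(sample)
    mean_value = sum(sample) / len(sample)
    print(f"{label}: mean {mean_value:.2f} d, 95% CI ({lower:.2f}, {upper:.2f})")
```

      The interval for 77 pairs is several times wider than for 1000, which is why the effective sample size matters when comparing a mean serial interval with a generation-interval estimate.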

      2. Frailty of using viral loads to extrapolate generation intervals. The authors take the observation that variants of concern demonstrate faster viral clearance on average to estimate shorter generation intervals for alpha, delta, and omicron. The authors rightly point out in the discussion that using viral load as a proxy for infectiousness has many limitations. I would emphasize even further that it is very difficult to extrapolate from viral load data in this way, as infectiousness appears to vary far more between variants than can be explained by duration positive or peak viral load. Other factors are potentially at play, such as compartmentalization in the respiratory tract, aerosolization, receptor binding, immunity, etc. Further, there is considerable individual-level variation in viral trajectories and thus the use of a population-mean model overlooks a key component of SARS-CoV-2 infection dynamics. An important reference, which came out recently and thus makes sense to have been missed from the initial submission, is Puhach et al. Nature Medicine 2022 https://doi.org/10.1038/s41591-022-01816-0.

      We agree with the reviewer about the frailty of using viral loads to extrapolate generation intervals. We therefore expanded our discussion of the limitation of using viral load data for inferring infectiousness including many of the points mentioned by the reviewer. We use viral load data in the most minimal way to try to enable some discussion of new VOC, and try to emphasize the needed caution.

      Viral load trajectories data have potential for informing estimates of the infectiousness profile. However, the relationship between viral load, culture positivity, symptom onset, and infectivity is complex and not well characterized. Due to this limitation, we tried to use viral loads in a more limited way, extrapolating our results to variants of concern (which lack unmitigated transmission data). Following the comment, we added a detailed discussion of the limitations of using viral loads as a proxy for infectiousness, including the variation of viral loads across individuals. We also added supplementary figures (Figure 6—figure supplements 1-2) to show the possible effect of an individual's viral loads in relation to the infectiousness and for comparison with new viral load and culture results (Chu et al. 2022; Killingley et al. 2022). As the viral load trajectories data for the different VOCs are given only as a function of time from the onset of symptoms, it is not possible to directly link them to the fraction of transmission post 14 days from infection. We made changes to Figure 6 to clarify the possible connection of viral load with the TOST (time from symptoms onset to transmission) distribution and the resulting extrapolation to the unmitigated generation-interval distributions.

      “SARS-CoV-2 viral load trajectories serve an important role in understanding the dynamics of the disease and modeling its infectiousness (Quilty et al. 2021; Cleary et al. 2021). Indeed, the general shapes of the mean viral load trajectories and culture positivity, based on longitudinal studies, are comparable with our estimated unmitigated infectiousness profile (Figure 6—figure supplements 1-2, comparison with (Chu et al. 2022; Killingley et al. 2022; Kissler et al. 2021)). However, the nature of the relationship between viral load, culture positivity, symptom onset, and real-world infectivity is complex and not well characterized. Therefore, the ability to infer infectiousness from viral load data is very limited, especially near the tail of infectiousness, several days following symptom onset and peak viral loads. Viral load models are usually made to fit the measurements during an initial exponential clearance phase and in many cases miss a later slow decay (Kissler et al. 2021). Furthermore, there is considerable individual-level variation in viral trajectories that isn’t accounted for in population-mean models (Kissler et al. 2021; Singanayagam et al. 2021). Other factors limiting the ability to compare generation-interval estimates with viral loads models are the variability of the incubation periods and its relation to the timing of the peak of the viral loads, and the great uncertainty and apparent non-linearity of the relation between viral loads and culture positivity (Jaafar et al. 2021; Jones et al. 2021). Due to these caveats and in order to avoid over interpretation of viral load data, we restrict our extrapolation of new VOCs’ infectiousness to a single parameter characterizing the viral duration of clearance.”

      We also edited another paragraph in the discussion:

      “Our extrapolations are necessarily crude given the complex relationship between viral load, symptomaticity, and infectiousness discussed above. Moreover, compartmentalization in the respiratory tract, aerosolization, receptor binding affinity, and immune history can also play important roles in determining the infectiousness profiles of SARS-CoV-2 variants (Puhach et al. 2022).”

      3. Lack of validation with other datasets. This study hinges on data from a single setting in a short window of time. Although the data are from multiple publications, the fact that so many reported the same transmission pair data demonstrates that these are overlapping datasets. As the authors note, there are potential biases, e.g., ascertainment rates and behavioral changes, which will impact the generation interval estimates. Thus, generalizability to other settings is limited.

      We agree with the reviewer that the dataset used in our study is limited, and consists of overlapping transmission pairs. We perform some analysis of the possible bias caused by inclusion of each dataset, as can be seen in Appendix 1—figure 4.

      The best validation would have been a comparison with another independent dataset from the early spread of the epidemic, but no such dataset exists. We added a sentence to the discussion to emphasize this point.

      “Due to the nature of early spread of a new unknown disease it is nearly impossible to find two completely unrelated datasets from the period prior to mitigation, limiting the ability of further validation of the current results.”

      4. The impact of epidemic dynamics on infector vs. infectee serial intervals. It took me a long time to get my head around the assertion that the forward serial interval distribution will be longer during epidemic growth due to the overrepresentation of short incubation periods among infectors relative to infectees. A supplementary figure, similar to the way Figure 1 is laid out, to illustrate this concept may go a long way to aid the reader's understanding.

      We added an explanation to the paragraph in order to make it clearer:

      “A cohort of individuals that develop symptoms on a given day is a sample of all individuals who have been previously infected. When the incidence of infection is increasing, recently infected individuals represent a bigger fraction of this population and thus are over-represented in this cohort. Therefore, we are more likely to encounter infected individuals with a short incubation period in this cohort compared to an unbiased sample. The forward serial-interval is calculated for a cohort of infectors who developed symptoms at the same time and therefore is sensitive to this bias. These dynamical biases are demonstrated using epidemic simulations by Park et al.”
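
      The cohort effect described in this passage can be reproduced with a small simulation. The sketch below is not the simulation of Park et al.; it assumes infection times with incidence proportional to exp(r·t), a lognormal incubation period with illustrative parameters, and a hypothetical helper `cohort_mean_incubation` that averages incubation periods over everyone whose symptoms begin in a one-day window.

```python
import math
import random

def cohort_mean_incubation(growth_rate, onset_day=60.0, window=1.0,
                           n_infections=200_000, seed=0):
    """Mean incubation period among people with symptom onset near onset_day."""
    rng = random.Random(seed)
    t_max = onset_day + window
    cohort = []
    for _ in range(n_infections):
        u = rng.random()
        if growth_rate == 0:
            t_infection = u * t_max  # flat incidence
        else:
            # Inverse-CDF draw from a density proportional to exp(r * t) on [0, t_max].
            t_infection = math.log(1 + u * (math.exp(growth_rate * t_max) - 1)) / growth_rate
        incubation = rng.lognormvariate(1.6, 0.5)  # assumed parameters, in days
        if onset_day <= t_infection + incubation < onset_day + window:
            cohort.append(incubation)
    return sum(cohort) / len(cohort)

flat_mean = cohort_mean_incubation(0.0)      # stable incidence
growing_mean = cohort_mean_incubation(0.2)   # exponential growth, r = 0.2 per day
print(f"flat incidence:    mean incubation {flat_mean:.2f} d")
print(f"growing incidence: mean incubation {growing_mean:.2f} d")
```

      With flat incidence the cohort mean sits near the intrinsic mean of the assumed distribution, while under growth it is noticeably shorter, because recently infected people with short incubation periods dominate the cohort.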

      5. Simulations to illustrate concepts and power. Given the assertion that observed serial intervals will depend on epidemic growth rates, reporting, and timing of interventions, I think a simple simulation to illustrate some of these ideas would be very helpful. For example, a simple agent-based model with simulated infectivity profiles and incubation periods using the estimated bivariate distribution would be extremely helpful in illustrating how serial intervals and estimates of the generation interval can differ from the true intrinsic generation interval (I coded such a simulation to help me understand this paper in a couple of hours with <100 lines of R code, so I do not think this would be much work). This would also be very helpful for illustrating statistical power re. comment 1.

      The current paper is based on a strong theoretical foundation provided by previous works, specifically Park et al. 2021, which used simulations similar to the reviewer’s suggestions to demonstrate the dynamical biases. We now mention these simulations in the introduction:

      “These dynamical biases are demonstrated using epidemic simulations by Park et al.”

    1. Any available documents regarding student-led activism on cam

      I don't know if this would count as one of the sources we should be using, but perhaps you could also look into which schools offer on-campus gender-affirming health care. Overall, I think the project pitch is well thought out and organized, with a good plan of action in place. The research question is specific enough to produce intriguing results (that are not general), and it may also make it easier to know what to search for when it comes to sources.

    1. Author Response

      Reviewer #1 (Public Review):

      Bohère, Eldridge-Thomas and Kolahgar have studied the effect of mechanical signalling in tissue homeostasis in vivo, genetically manipulating the well-known mechano-transductor vinculin in the adult Drosophila intestine. They find that loss of vinculin leads to accelerated, impaired differentiation of the enteroblast, the committed precursor of mature enterocytes, and stimulates the proliferation of the intestinal stem cell. This leads to an enlarged intestinal epithelium. They show that this effect is mediated through its interaction with alpha-catenin and the reinforcement of the adherens junctions, rather than with talin and integrin-mediated interaction with the basal membrane. This result aligns well, as the authors note, with previous observations from Choi, Lucchetta and Ohlstein (2011) doi:10.1073/pnas.1109348108. Bohère et al then explore the impact that disrupting mechano-transduction has on the overall fitness of the adult fly, and find that vinculin mutant adult flies recover faster after starvation than wild types.

      The main conclusions of the paper are convincing and informative. Some important results would benefit from a more detailed description of the phenotypes, and others could have alternative explanations that would warrant some additional clarification.

      1) - Interpretation of phenotypes in vinc[102.1] mutants

      The paper presents several adult phenotypes of the homozygous viable, zygotic null mutant vinculin[102.1], where the fly gut is enlarged (at least in the R4/5 region). In many cases, they correlate this phenotype with that of RNAi knockdown of vinculin in the gut induced in adult stages. This is a perfectly valid approach, but it presents the difficulty of interpretation that the zygotic mutant has lacked vinculin throughout development and in every fly tissue, including the visceral mesoderm that wraps the intestinal epithelium and that also seems enlarged in the vinc[102.1] mutant. So this phenotype, and others reported, could arise from tissue interactions. To me, the quickest way to eliminate this problem would be to express vinculin in ISCs and/or EBs in the vinc[102.1] background, either throughout development or after pupariation or emergence, and observe a rescue.

      We agree with the reviewer that we cannot exclude additional vinculin role(s) in other tissues during or after development that might have an impact on the intestinal epithelium. Our attempts to express a full-length Vinculin construct (Maartens et al, 2016) in the vinc102.1 flies, either in adulthood or throughout development, were not very conclusive: although we observed some degree of rescue, it was not fully penetrant. This was in contrast to the complete rescue observed with the genomic rescue of vinculin. Thus, it is possible that some form of tissue interaction contributes to the phenotype observed, for example if vinculin loss affects muscle structure. Alternatively, just like it was shown that too much active vinculin is detrimental to the fly (Maartens et al, 2016), our experiment suggests that too much vinculin may be deleterious to the intestine.

      In any case, because of cell-specific knockdowns in the adult gut, we are confident that EB reduction of vinculin levels or activity is sufficient to accelerate tissue turnover, at least in a specific portion of the posterior midgut. We have amended the text to acknowledge a role for tissue interactions (see page 6 (end of first paragraph), page 7 (start of last paragraph), and page 12 (starvation experiments)).

      An experiment where this is particularly difficult is with the starvation/refeeding experiment. The authors explored whether the disruption of tissue homeostasis, as a result of vinculin loss, matters to the fly. So they tested whether flies would be sensitive to starvation/re-feeding, where cellular density changes and vinculin mechano-sensing properties may be necessary. They correctly conclude that mutant flies are more resistant to starvation, and suggest that this may be due to the fact that intestines are larger and therefore more resilient. However, in these animals vinculin is absent in all tissues. It is equally likely that the resistance to starvation was due to the effect of Vinculin in the fat body, ovary, brain, or other adult tissues singly or in combination. The fact that the intestine recovers transiently to a size slightly larger than that of the fed flies seems anecdotal, considering the noise within the timeline of fed controls. I am not sure this experiment is needed in the paper at all, but to me, the healthy conclusion from this effort is that more work is needed to determine the impact of vinculin-mediated intestinal homeostasis in stress resistance, and that this is out of the scope of this paper.

      Please see the new data presented in Figure 8A-B (text page 12).

      2) - Cell autonomy of the requirement of Vinculin and alpha-Catenin

      Authors interpret that Vinculin is needed in the EB to maintain mechanical contact with the ISC, restrict ISC proliferation through contact inhibition, and maintain the EB quiescent. This interpretation explains seemingly well the lack of an obvious phenotype when knocking down vinculin in ISCs only, while knockdown in ISCs and EBs, or EBs only, does lead to differentiation problems. It also sits well with the additional observation that vinculin knockdown in mature ECs does not have an obvious phenotype. However, a close examination makes the results difficult to explain with this interpretation only. If the authors were correct, one would expect that in mutant clones, eventually, vinculin-deficient EBs will be produced, which should mis-differentiate and induce additional ISC proliferation. However, the clones only show a reduction in ISC proportions; the most straight forward interpretation of this is that vinculin is cell-autonomously necessary for ISC maintenance (which is at odds with the phenotype of vinculin knockdown using the ISC and ISC/EB drivers).

      We apologise that we were unclear in the text. With hindsight, the confusion may have been caused by our describing the phenotype of MARCM clones before reporting the accumulation of EBs in the vinc102.1 guts. Therefore, we swapped these two sections and improved the description of these experiments in the manuscript (see section: “The pool of enterocyte progenitors expands upon vinculin depletion” pages 6-8).

      In brief, we do not think that our results are at odds with the phenotype of vinculin knockdown using the ISC and ISC/EB drivers. We realise the text was misleading and hope to have clarified our observations in the revised manuscript (pages 7 and 8). From cell conditional RNAi experiments, like the reviewer, we would predict that vinculin knockdown or loss of function in mitotic clones (MARCM experiments, Figure 4E-G) will induce accelerated differentiation of vinculin deficient enteroblasts, which in turn will increase proliferation. We observed that vinc102.1 or vinc RNAi mitotic clones contained a similar number of cells compared to control clones, but a reduced proportion of stem cells (Figure 4G). We interpret this as indicating that to maintain an equivalent clone size, stem cells must have divided more frequently, with some divisions producing two differentiated daughter cells. This type of symmetric division would increase the EB pool (as seen in Figure 4-figure supplement 2B), at the expense of the ISC population, in turn decreasing long term clonal growth potential. Altogether, the results obtained with MARCM clones highlight changes in tissue dynamics compatible with those observed with cell-specific vinculin knockdowns.

      Also, from the authors' interpretation, it would follow that the phenotype of vinculin knockdown in ISCs+EBs and in EBs only should be the same. However, in ISCs+EBs vinculin knockdown, differentiation accelerates, which is likely accompanied by increased proliferation (judging by the increase in GFP area; PH3 staining would be more definitive).

      Indeed, the accelerated differentiation observed with esgGal4>UAS VincRNAi is accompanied by increased proliferation with the two independent RNAi lines used. We have added this result in Figure 1-figure supplement 1G (and in text, page 5).

      This contrasts with the knockdown only in EBs, which leads to accumulation of EBs due to misdifferentiation, and increased proliferation, mostly of ISCs, as measured directly with PH3 staining, but not additional late EBs or mature ECs. The authors call this "incomplete maturation due to accelerated differentiation". I think that one should expect to find incomplete differentiation/maturation when the rate of the process is very slow, not the other way around. To me, these are different phenotypes, which could perhaps be explained if vinculin was also needed in the ISC to transmit tension to the EB and prevent its differentiation, and removing it only in the EB may be revealing an additional, cell-autonomous requirement in maturation.

      When vinculin is knocked down in EBs, cells appear bigger than controls (as judged by the RFP+ nuclei in Figure 5E). This, compared to yw and vinc102.1 guts shown in Figure 4D suggests that these cells are more advanced in their differentiation. We have removed the sentence, to not confuse the reader, and clarified the text (see page 8). The discrepancy in the differentiation index between the esgGal4 and KluGal4 experiments might result from differences in the drivers, or an additional role of vinculin in EC differentiation, which we now mention in the text (page 8).

      So far, we have no evidence to support the idea that vinculin is also needed in the ISC to transmit tension to the EB and prevent its differentiation. For example, we observed no phenotype when we knocked down vinculin specifically in ISCs (Figure 3): notably, there was no increase in ISC ratio and no increase in cell density (unlike the reduction seen in Figure 1F with ISC+EB knockdown).

      Another unexpected result, considering the authors' interpretation, is that the overexpression of activated Vinculin (vinc[CO]) does not seem to have much of an effect. It does not change the phenotype of the wild type (where there is very little basal turnover to begin with) and it only partially rescues the phenotype of the vinc[102.1] mutants, when the rescue transgene vinc:RFP does. This again suggests that there may be tissue interactions, in development or adulthood, that may explain the vinc[102.1] phenotypes. It could also be that this incomplete rescue is due to the deleterious effect of Vinc[CO] (this is another reason for doing the vinc[102.1]; esg-Gal4; UAS-vincFL experiments suggested above). An alternative experiment to perform this rescue would be to knock down the vinculin gene while overexpressing the Vinc[CO] transgene - this may be possible with the RNAi HSM02356, which targets the vinculin 3'UTR and is unlikely to affect UAS-vinc[CO].

      Please refer to essential point 2c; as VincCO is not a simple overactive protein, like a constitutively active kinase, additional effects in the tissue can be expected.

      The claims of the authors would be more solid if the reporting of the phenotypes was more homogeneous, so one could establish comparisons. Sometimes conditions are analysed by differentiation index, others by extension of the GFP domains, others with phospho-histone-3 (PH3), others through nuclear size or density, and combinations. I do not think the authors should evaluate all these phenotypes in all conditions, but evaluating mitotic index and abundance of EBs and "activated EBs/early ECs" to monitor proliferation and differentiation rates should be done across the board (ISC, ISC+EB, EB drivers).

      To improve consistency, in all conditions we have compared cell type ratios and cellular density upon vinculin knockdown: see Figure 1E-F for ISC+EB, Figure 3B-C for ISC, and Figure 5 – figure supplement 1C-E for EB (with panel E newly added). As we did not observe any effect on ratio or density, we did not monitor cell proliferation for ISC knockdown.

      Nonetheless, we added the mitotic index for the ISC+EB driver (new Figure 1- figure supplement 1G) to be consistent with the results from the EB driver (Figure 5- figure supplement 1C).

      If the primary role of Vinculin is to induce contact inhibition in the ISC from the EB and prevent the EB differentiation and proliferation, one would expect that over expression of Vinc[CO] (or perhaps VincFL or sqhDD) in EBs should prevent or delay the differentiation and proliferation induced by a presumably orthogonal factor, like infection with Pseudomonas entomophila or Erwinia carotovora.

      This is indeed an exciting prediction, but outside the scope of this manuscript.

      3) - Relationship between Vinculin and alpha-Catenin

      The authors establish a very clear difference in the phenotypes between focal adhesion components and Vinculin, whereas the similarity of alpha-catenin and vinculin knockdowns is very compelling. Therefore I am sure the authors are on the right path with their interpretation of this part of the paper. However, some of the alpha-Catenin experiments are not very clear. The result from the rescue experiment of alpha-Cat knockdown with alpha-Cat-deltaM1b does not seem to show what the authors claim, and differentiation does not seem affected, only the amount of extant older ECs (which may be due to other reasons as this is a non-autonomous effect).

      Like the reviewer, we were surprised about the milder rescue with M1b compared to M1a and are unsure of the reasons for this. Nevertheless, quantifications of the differentiation and retention indices show significant differences for M1a and M1b compared to the FL control (Figure 6F-G), with phenotypes resembling the vinc knockdown. In Figure 6E, we have added a row of zoomed views to better highlight the similarity of phenotype between M1a and M1b and have acknowledged the mild differences in the text (bottom of page 9). For the sake of rigour, we think it is important to include results from both M1 deletions, even if there is not yet a logical reason to explain why they have different effects.

      Ulrich Tepass produced a UAS-alpha-catenin construct with the full deletion of the M1 region, perhaps that would show a clearer phenotype.

      This is a good suggestion; however, for technical reasons this is not possible. The strategy devised by Ken Irvine and his group relies on rescuing the RNAi with an RNAi-resistant construct, which is not the case for the constructs generated in the Tepass lab. Furthermore, we cannot adopt a MARCM strategy as α-cat is too close to the centromere (80F).

      Also, the autonomy of the phenotype is difficult to address with these experiments alone. It would be expected that the phenotype of alpha-catenin knockdown should be similar to that of vinculin knockdown in the ISCs only or EBs only.

      This is not what our understanding of cadherin-mediated adhesion would predict. Forming cadherin adhesions requires cadherins and catenins in both cells, so we would expect similar phenotypes in ISCs only and EBs only. What is exciting about our findings is that the mechanosensitive machinery is not equally important in the two adherent cells, i.e. the EB is using vinculin to measure force on the contact and regulate differentiation, whereas the ISC needs to resist that force, but does not use vinculin to sense that force and regulate its behaviour.

      We have added new data showing the role of the vinculin/α-catenin interaction in ISCs or EBs by co-expressing α-Cat RNAi and α-Cat ΔM1a. We observed that absence of VBS in α-catenin has no effect in ISCs but promotes EB differentiation and increase in numbers (new Figure 6 – figure supplement 2), similar to our observations with vincRNAi (see text page 10).

      Reviewer #2 (Public Review):

      Vinculin functions as an important structural bridge that connects cadherin and integrin-mediated adhesions to the F-actin cytoskeleton. This manuscript carefully examined the mutant phenotype of vinc in the Drosophila intestine and found that loss of vinc in EBs causes significant increases in EB to EC differentiation, stem cell proliferation, and tissue growth. By analyzing the mutant phenotype of the cadherin adaptor alpha-catenin, the authors suggest that vinc functions through the cell-cell junctions instead of cell-ECM adhesions in EBs. Finally, manipulation of myosin activity in EBs phenocopies the vinc mutant, suggesting that vinculin is regulated by the mechanical tension transduced through the cytoskeleton.

      The authors claim that the vinculin mutant phenotype is opposite compared to the loss of the major integrin components, suggesting a function independent of the cell-ECM adhesions. However, the phenotype of vinc and integrin may not be completely opposite. Besides loss of ISCs, both mys and talin knockdown in ISCs clearly causes ISCs differentiation into EC cells (Fig.3A), suggesting a possible involvement of integrin in EB to EC differentiation. Therefore, it will be important to test the phenotype of integrin KD in EBs using EB-specific Gal4.

      The reviewer raised an important point. To test this we had to overcome the ISC defect of mys or talin RNAi, and specifically tested their function in enteroblasts using the KluGal4 driver. This revealed a similar phenotype of accelerated differentiation, assayed with the ReDDM system (see new Figure 6 – figure supplement 4). Thus, as the reviewer suggested, both integrins and cadherins function in this process, and we have amended the text to indicate this (see page 10, and sentence in the discussion page 12). It appears however that, unlike vinculin, they also have a key role in ISCs.

      The authors proposed a model that the cell-cell adhesion between ISC and EBs is required for vinculin mediated differentiation suppression. However, this model is not directly supported by the data as the EB-ISC adhesion and EB-EC adhesion have not been tested separately.

      This is an important point and we have amended the text to address this.

      We have focussed our model on EB-ISC adhesion as the adherens junctions are stronger between progenitor cells than EBs-ECs, and because of previous data from the Ohlstein lab (Choi et al, 2011) demonstrating the relationship between adherens junction stability and EB differentiation/ISC proliferation. Nonetheless we agree it is possible that EB-EC adhesion might contribute to this mechanism and have modified the last sentence of the result section (page 12) and the legend associated to the model (Figure 8) to take this into account.

      In addition, previous short-term manipulation of E-cadherin in ISCs and EBs shows no change in cell proliferation (Liang J. et al. 2017), which seems to contradict the authors' model. To support the authors' conclusion, long-term manipulation of E-cadherin in ISCs and EBs must be tested.

      A main feature of the vinculin phenotype is the regional accelerated differentiation observed in R4/5, potentially reflecting areas more subject to mechanical forces. Strikingly, this accelerated differentiation is rarely observed more anteriorly (such as region R4a/b studied in Liang et al, 2017). In fact, these regional differences were previously reported with E-cadherin knockdown by the Adachi-Yamada group (see Figure S1, Maeda et al, 2008). This highlights the importance of considering regional control of cell fate for the field.

      To test our hypothesis further, we have knocked down E-cadherin and α-catenin in EBs only (with Klu-Gal4). As shown in new Figure 6-figure supplement 3, we observed an accumulation of EBs as early as 3 days after induction, reminiscent of vinculin loss of function phenotype. Longer E-cadherin EB knock-down with KluGal4 appears particularly detrimental for survival as all flies died after 4 days of continuous RNAi expression preventing any further observations (see new text page 10). These observations support our model that junctional stability slows down EB differentiation. Our results are also in agreement with the work described in Choi et al (2011), whereby after 6 days of E-Cadherin RNAi expression in progenitors or EBs (using a different driver from us, Su(H)Gal4), the mitotic index increases, showing a feedback regulation on ISC proliferation. Therefore, our work and the Liang et al 2017 study are not in fact contradictory: the differences in the contribution of junctions to tissue dynamics might reflect the variety of molecular mechanisms involved along the small intestine.

      The result of MARCM analysis seems inconsistent with the rest of the data. In MARCM, no significant change of clone sizes is observed between WT and vinc mutant (Fig. 3E). However, vinc mutant in EBs clearly promotes ISC proliferation in other experiments such as esg>vinc-RNAi and the EB>vinc-RNAi (Fig. 1A, Fig. 4).

      Please refer to point 2a, essential revisions. We do not think that our results are at odds with the phenotype of vinculin knockdown using the ISC and ISC/EB drivers - we realise the text was misleading and hope to have clarified our observations in the revised manuscript (pages 7-8).

      In Fig. 4H, the authors suggest that vinculin mutant prevent terminal EC formation. However, this may be simply caused by longer retention of Klu expression in the newborn ECs. To test if EB differentiation is indeed affected, the EC marker pdm1 staining will provide more convincing evidence. Another experiment to strengthen the conclusion will be the tracking of clone sizes generated from a single EB cell using the UAS-Flp system (such as G-trace).

      These are good suggestions to strengthen our findings. Unfortunately, we have not managed to obtain a working Pdm1 antibody (or other commercially available EC marker), which is why we assayed nuclear size and the tracking of KluReDDM cells. Therefore, we have not been able to test if Klu is retained in newborn ECs.

      As we agree this section of the text was misleading, we have rephrased and highlighted that the phenotype seen with KluGal4ReDDM resembles the accumulation of activated EBs and newborn ECs observed in vinc102.1 guts. (page 8).

      In Fig. 6D, the survival rates of WT and vinc mutant flies were compared. However, as there is no additional assay about the feeding behavior or metabolic rate, the systemic mutant of vinc does not provide a direct link between animal survival and intestinal EBs. Therefore, an experiment with vinc level specifically manipulated in fly intestine using esg>vinc-RNAi or EB>vinc-RNAi will be more relevant.

      This experiment has now been added in Figure 8B and the text modified to acknowledge the limitations of the survival experiments with whole mutant flies (see point 3, essential revisions above).

      Reviewer #3 (Public Review):

      Prior work had identified essential roles for Integrin signaling in regulating intestinal stem cell (ISC) proliferation, and the authors' studies were motivated by trying to understand whether Vinculin (Vinc) might participate in this. However, Vinc is involved in mechanotransduction at both focal adhesions (FA) and adherens junctions (AJ), and their results revealed that Vinc phenotypes do not match those of FA proteins like Integrin. Conversely, they do match a-catenin (a-cat) RNAi phenotypes, and together with the localization of Vinc and the phenotypes associated with a-cat mutants that can't bind Vinc, this led to the conclusion that Vinc is acting at AJ rather than FA in this tissue. The results here are convincing, with clear presentation, nice images, and appropriate quantitation. It's also worth emphasizing that initial characterization of Vinc mutant flies failed to reveal any essential roles for this protein in Drosophila, so finding a mutant phenotype of any sort is significant.

      While the manuscript is strong as a descriptive report on the requirement for Vinc in the Drosophila intestine, it doesn't provide us with much understanding of the mechanism by which Vinc exerts its effects, nor how its requirement is linked to intestinal physiology.

      There is always more to learn, and the importance of our work so far is that it demonstrates a very specific role for vinculin as a mechanoeffector in regulating cell fate decisions in specific regions of the midgut, and provides the foundation for future work addressing the detailed mechanism of this function and physiological role.

      Prior work has shown that mechanical stretching of intestines stimulates ISC proliferation (presumably through Integrin signaling), which is opposite to what Vinc does here.

      We would like to stress that very little mechanistic knowledge is available regarding how mechanical stretching stimulates ISC proliferation, in Drosophila or mammalian systems. To our knowledge, the only work linking gut mechanical stretching to cell fate decisions in Drosophila identified Msn/Hippo pathway (Li et al., 2018) and the ion channel Piezo requirement (He et al., 2018). We agree with the reviewer that integrin signaling would most likely contribute, especially given the composition of gels for organoid cultures (Gjorevski et al, 2016), yet the actual molecular mechanisms remain to be elucidated.

      There is a suggestion that Vinc is involved in maintaining homeostasis, but how it's regulated remains a bit murky. The authors report that reductions in myosin activity result in phenotypes reminiscent of Vinc phenotypes, which they interpret as supporting a model where Vinc's role is to help maintain tension at AJ. Of course it could also be reversed - maybe they are similar because tension is needed to maintain Vinc recruitment to AJ? The lack of epistasis tests and lack of analysis of whether Vinc localization to AJ in EBs is affected by tension or the M2 deletion of a-cat leaves us uncertain as to the actual basis for the relationship between Vinc and myosin phenotypes.

      Thank you for all these suggestions. New experiments have been done to test the relationship between cellular tension and vinculin at junctions (see essential point 1).

    1. Author Response:

      Reviewer #3 (Public Review):

      Murphy et al. further develop the linked selection model of Elyashiv et al. (2016) and apply it to human genetic variation data. This model is itself an extension of the McVicker et al. (2009) paper, which developed a statistical inference method around classic background selection (BGS) theory (Hudson and Kaplan, 1995, Nordborg et al., 1996). These methods fit a composite likelihood model to diversity data along the chromosome, where the level of diversity is reduced by a local factor from some initial "neutral" level π0 down to observed levels. The level of reduction is determined by a combination of both BGS and the expected reduction around substitutions due to a sweep (though the authors state that these models are robust to partial and soft sweeps). The expected reduction factor is a function of local recombination rates and genomic annotation (such as exonic and phylogenetically conserved sequences), as well as the selection parameters (i.e. mutation rates and selection coefficients for different annotation classes). Overall, this work is a nice addition to an important line of work using models of linked selection to differentiate selection processes. The authors find that positive selection around substitutions explains little of the variation in diversity levels across the genome, whereas a background selection model can explain up to 80% of the variance in diversity. Additionally, their model seems to have solved a mystery of the McVicker et al. (2009) paper: why the estimated deleterious mutation rate was unreasonably high. Throughout the paper, the authors are careful not only in their methodology but also in their interpretation of the results. For example, when interpreting the good fit of the BGS model, the authors correctly point out that stabilizing selection on a polygenic trait can also lead to BGS-like reductions.

      Furthermore, the authors have carefully chosen their model's exogenous parameters to avoid circularity. The concern here is that if the input data into the model - in particular the recombination maps and segments likely to be conserved - are estimated or identified using signals in genetic variation, the model's good fit to diversity may be spurious. For example, often recombination maps are estimated from linkage disequilibrium (LD) data which is itself obtained from variation along the chromosome. Murphy et al. use a recombination map based on ancestry switches in African Americans which should prevent "information leakage" between the recombination map and the BGS model from leading to spuriously good fits. Likewise, the authors use phylogenetic conservation maps rather than those estimated from diversity reductions (such as McVicker et al.'s B maps) to avoid circularity between the conserved annotation track and diversity levels being modeled. Additionally, the authors have carefully assessed and modified the original McVicker et al. algorithm, reducing relative error (Figure A2).

      One could raise the concern that non-equilibrium demography confounds their results, but the authors have a very nice analysis in Section 7 of the supplementary material showing that their estimates are remarkably stable when the model is fit separately in different human populations (Figure A35). Supporting previous work that emphasizes the dependence between BGS and demography, the authors find evidence of such an interaction with a clever decomposition of variance approach (Figure A37). The consistency of BGS estimates across populations (e.g. Figures A35 and A36) is an additional strong bit of evidence that BGS is indeed shaping patterns of diversity; readers would benefit if some of these results were discussed in the main text.

      We appreciate the reviewer’s kind remarks. With regards to the results included in the main text vs the supplement, we attempted to strike a balance between having the main text remain communicative to a larger readership and providing experts with details they may find useful. We have, however, done our best for the supplementary analyses to be written clearly.

      I have three major concerns about this work. First, it's unclear how accurate the selection coefficient estimates are given the non-equilibrium demography of humans (pre-Out of Africa split, and thus not addressed by the separate population analyses). The authors do not make a big point about the selection coefficient estimates in the main section of the paper, so I don't find this to be a big problem. Still, some mention of this issue might be helpful to readers trying to interpret the results presented in the supplementary text.

      As the reviewer notes, we chose not to emphasize the inferred distributions of selection coefficients. Our main reason for this choice is the technical issue addressed in Appendix Section 1.5 (L561-564): “Second, thresholding potentially biases our estimates of the distribution of selection effects. While this bias is probably smaller than the bias without thresholding, its form and magnitude are not obvious. This is why we decided not to report the inferred distributions of selection effects in the Main Text.” We agree that if we were to focus on our estimates of the distribution of selection effects, the effects of demographic history would also need to be considered. This is, however, not the focus here.

      Second, I'm curious whether the composite likelihood BGS model could overfit any variance along the chromosome - even neutral variance. At some level, the composite likelihood approach may behave like a sort of smoothing algorithm, albeit with a functional form and parameters of a BGS model. The fact that there is information sharing across different regions with the same annotation class should in principle prevent overfitting to local noise. Still, there are two ways I think to address this overfitting concern. First, a negative neutral control could help - how much variation in diversity along the chromosome can this model explain in a purely neutral simulation? I imagine very little, likely less than 5%, but I think this paper would be much stronger with the addition of a negative control like this. Second, I think the main text should include the R2 values from out-sample predictions, rather than just the R2 estimates from the model fit on the entire data. For example, one could fit the model on 20 chromosomes, use the estimated θΒ parameters to predict variation on the remaining two. The authors do a sort of leave-one-out validation at the window level (Figure A31); however, this may not be robust to linkage disequilibrium between adjacent windows in the way leaving out an entire chromosome would be.
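
      The chromosome-level hold-out described in this comment can be sketched as follows. This is a minimal illustration and not the authors' pipeline: a one-predictor least-squares line stands in for the composite-likelihood BGS fit, and the function names (`fit_line`, `r_squared`, `held_out_r2`) and the toy window values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Assumption: this simple line is only a stand-in for fitting the BGS model.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def held_out_r2(windows_by_chrom, test_chroms):
    """Fit on all chromosomes except test_chroms, then score on test_chroms only."""
    train = [w for c, ws in windows_by_chrom.items()
             if c not in test_chroms for w in ws]
    test = [w for c in test_chroms for w in windows_by_chrom[c]]
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    predicted = [slope * x + intercept for x, _ in test]
    return r_squared([y for _, y in test], predicted)


# Hypothetical toy windows: (model-predicted reduction, observed diversity).
toy = {
    "chr1": [(0.5, 0.0006), (0.7, 0.0008), (0.9, 0.0010)],
    "chr2": [(0.6, 0.0007), (0.8, 0.0009), (1.0, 0.0011)],
    "chr3": [(0.55, 0.00065), (0.95, 0.00105)],
}
score = held_out_r2(toy, test_chroms={"chr3"})
```

      Holding out whole chromosomes, rather than single windows, keeps windows that are in linkage disequilibrium with the test data out of the training set, which is the robustness property the comment asks for.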

      The two requested analyses were done and their results are described above, in response to essential revisions (p. 2-3 here). In brief, there is no overfitting of neutral patterns or otherwise. We elaborate on why this finding is expected below.

      Finally, I feel like this paper would be stronger with realistic forward simulations. The deterministic simulations described in the supplementary materials show the implementation of the model is correct, but it's an exact simulation under the model - and thus not testing the accuracy of the model itself against realistic forward simulations. However, this is a sizable task and efforts to add selection to projects like Standard PopSim are ongoing.

      We agree that forward simulations would be a nice addition, but believe that it is a project in itself. Indeed, a major complication is that when, for computational tractability, purifying selection is simulated in small populations with realistic population-scaled parameters, the reduction in diversity due to selection at unlinked sites has a major effect on neutral diversity levels (see, e.g., Robertson 1961). We hope to address this issue in future work. Meanwhile, we note that the theory that we rely on has been tested against simulations in the past (e.g., Charlesworth et al., 1993; Hudson and Kaplan, 1995; Nordborg et al., 1996).

    1. Author response

      Reviewer #1 (Public Review):

      In their paper, Kroell and Rolfs use a set of sophisticated psychophysical experiments in visually-intact observers, to show that visual processing at the fovea within the 250ms or so before saccading to a peripheral target containing orientation information, is influenced by orientation signals at the target. Their approach straddles the boundary between enforcing fixation throughout stimulus presentation (a standard in the field) and leaving it totally unconstrained. As such, they move the field of saccade pre-processing towards active vision in order to answer key questions about whether the fovea predicts features at the gaze target, over what time frame, with what precision, and over what spatial extent around the foveal center. The results support the notion that there is feature-selective enhancement centered on the center of gaze, rather than on the predictively remapped location of the target. The results further show that this enhancement extends about 3 deg radially from the foveal center and that it starts ~ 200ms or so before saccade onset. They also show that this enhancement is reinforced if the target remains present throughout the saccade. The hypothesized implications of these findings are that they could enable continuity of perception trans-saccadically and potentially, improve post-saccadic gaze correction.

      Strengths:

      The findings appear solid and backed up by converging evidence from several experimental manipulations. These included several approaches to overcome current methodological constraints to the critical examination of foveal processing while being careful not to interfere with saccade planning and performance. The authors examined the spatial frequency characteristics of the foveal enhancement, as well as hit rates and false alarm rates for detecting a foveal probe that was congruent or incongruent in terms of orientation to the peripheral saccade target embedded in flickering, dynamic noise (1/f) images. While hit rates are relatively easy to interpret, the authors also reconstructed key features of the background noise to interpret false alarms as reflecting foveal enhancement that could be correlated with target orientation signals. The study also - in an extensive Supplementary Materials section - uses appropriate statistical analyses and controls for multiple factors impacting experimental/stimulus design and analysis. The approach, as well as the level of care towards experimental details provided in this manuscript, should prove welcome and useful for any other investigators interested in the questions posed.

      Weaknesses:

      I find no major weaknesses in the experiments, analyses or interpretations. The conclusions of the paper appear well supported by the data. My main suggestion would be to see a clearer discussion of the implications of the present findings for truly naturalistic, visually-guided performance and action. Please consider the implication of the phenomena and behaviors reported here when what is located at the gaze center (while peripheral targets are present), is not a noisy, relatively feature-poor, low-saliency background, but another high-saliency target, likely crowded by other nearby targets. As such, a key question that emerges and should be addressed in the Discussion at least is whether the fovea's role described in the present experiments is restricted to visual scenarios used here, or whether they generalize to the rather different visual environments of everyday life.

      This is a very interesting question. While we cannot provide a definite answer, we have added a paragraph discussing the role of foveal prediction in more naturalistic visual contexts to the Discussion section (‘Does foveal prediction transfer to other visual features and complex natural environments?’). We pasted this paragraph in response to another comment in the ‘Recommendations for the authors’ section below. We suggest that “the pre-saccadic decrease in foveal sensitivity demonstrated previously[9] as well as in our own data (Figure 2B) may boost the relative strength of fed-back signals by reducing the conspicuity of foveal feedforward input”, presumably allowing the foveal prediction mechanism to generalize to more naturalistic environments with salient foveal stimulation.

      Reviewer #2 (Public Review):

      Humans and other primates move their eyes with rapid saccades to reposition the high-resolution region of the retina, the fovea, over objects of interest. Thus, each saccade involves moving the fovea from a pre-saccadic location to a saccade target. Although it has been long known that saccades profoundly alter visual processing at the time of saccade, scientists simply do not know how the brain combines information across saccades to support our normal perceptual experience. This paper addresses a piece of that puzzle by examining how eye movements affect processing at the fovea before it moves. Using a dynamic noise background and a dual psychophysical task, the authors probe both the performance and selectivity of visual processing for orientation at the fovea in the few hundred milliseconds preceding a saccade. They find that hit rates and false alarm rates are dynamically and automatically modulated by the saccade planning. By taking advantage of the specific sequence of noise shown on each trial, they demonstrate that the tuning of foveal processing is affected by the orientation of the saccade target suggesting foveal specific feedback.

      A major strength of the paper is the experimental design. The use of dynamic filtered noise to probe perceptual processing is a clever way of measuring the dynamics of selectivity at the fovea during saccade preparation. The use of a dual-task allows the authors to evaluate the tuning of foveal processing as well and how it depends on the peripheral target orientation. They show compellingly that the orientation of the saccade target (the future location of the fovea) affects processing at the fovea before it moves.

      There are two weaknesses with the paper in its current form. The first is that the key claim of foveal "enhancement" relies on the tuning of the false alarms. A more standard measure of enhancement would be to look at the sensitivity, or d-prime, of the performance on the task. In this study, hits and false alarms increase together, which is traditionally interpreted as a criterion shift and not an enhancement. However, because of the external noise, false alarms are driven by real signals. The authors are aware of this and argue that the fact that the false alarms are tuned indicates enhancement. But it is unclear to me that a criterion shift wouldn't also explain this tuning and the change in the noise images. For example, in a task with 4 alternative choices (Present/Congruent, Present/Incongruent, Absent/Congruent, Absent/Incongruent), shifting the criterion towards the congruent target would increase hits and false alarms for that target and still result in a tuned template (because that template is presumably what drove the decision variable that the adjusted criterion operates on). I believe this weakness could be addressed with a computational model that shows that a criterion shift on the output of a tuned template cannot produce the pattern of hits and false alarms.
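
      The distinction between a sensitivity change and a criterion shift raised in this comment can be made concrete with the standard equal-variance Gaussian signal detection model, where d' = z(H) - z(FA) and c = -(z(H) + z(FA)) / 2. The sketch below is illustrative only: the function names and the numeric values are hypothetical, and the model ignores the external-noise structure that the comment goes on to discuss.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sdt_measures(hit_rate, fa_rate):
    """Return (d_prime, criterion) under equal-variance Gaussian SDT.

    Rates must lie strictly between 0 and 1, otherwise the z-transform is undefined.
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


def rates_from(d_prime, criterion):
    """Hit and false alarm rates implied by a given sensitivity and criterion."""
    hit = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    fa = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit, fa


# Hypothetical observer with fixed sensitivity who only relaxes the criterion.
conservative = rates_from(d_prime=1.0, criterion=0.5)
liberal = rates_from(d_prime=1.0, criterion=0.0)

d_conservative, c_conservative = sdt_measures(*conservative)
d_liberal, c_liberal = sdt_measures(*liberal)
```

      In this toy case both the hit rate and the false alarm rate rise when the criterion is relaxed, while the recovered d' is unchanged, which is the pattern traditionally read as a criterion shift rather than enhancement.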

      We thank the reviewer for this comment. We will present three arguments, each of which suggests that our effects are perceptual in nature and cannot be explained by a shift in decision criterion: (1) the temporal specificity of the difference in Hit Rates (HRs), (2) the spatial specificity of the difference in HRs and (3) the phenomenological quality of the foveally predicted signal. In general, a criterion shift would indeed affect hits and false alarms alike. Nonetheless, the difference in HRs only manifested under specific and meaningful conditions:

      First, the increase in congruent as compared to incongruent HRs, i.e., enhancement, was temporally specific: congruent and incongruent HRs were virtually identical when the probe appeared in a baseline time bin or one (Figure 2B) or even two (Figure 4A) early pre-saccadic time bins. Based on another reviewer’s comment, we collected additional data to measure the time course and extent of foveal enhancement during fixation. While pre-saccadic enhancement developed rapidly, enhancement started to emerge 200 ms after target onset during fixation. Crucially, these time courses mirror the typical temporal development of visual sensitivity during pre-saccadic attention shifts and covert attentional allocation, respectively[8,33]. We are unaware of data demonstrating similar temporal specificity for a shift in decision criterion. One could argue that a template of the target orientation needs to build up before it can influence criterion. Nonetheless, this template would be expected to remain effective after this initial temporal threshold has been crossed. In contrast, we observe pronounced enhancement in medium but not late stages of saccade preparation in the PRE-only condition (Figure 4A).

      Second, it has been argued that a defining difference between innately perceptual effects and post-perceptual criterion shifts is their spatial specificity[53]: in opposition to perceptual effects, criterion shifts should manifest in a spatially global fashion. Due to a parafoveal control condition detailed in our reply to the next comment, we maintain the claim that enhancement is spatially specific: congruent HRs exceeded incongruent ones within a confined spatial region around the center of gaze. We did not observe enhancement for probes presented at 3 dva eccentricity even when we raised parafoveal performance to a foveal level by adaptively increasing probe contrast. The accuracy of saccade landing or, more specifically, the mean remapped target location (Figure 3B) influenced the spatial extent of the enhanced region in a fashion that is reconcilable with previous findings[30]. A criterion shift that is both spatially and temporally selective, follows the time course of pre-saccadic or covert attention depending on observers’ oculomotor behavior, does not remain effective throughout the entire trial after its onset, is sensitive to the mean remapped target location across trials, and does not apply to parafoveal probes even after their contrast has been increased to match foveal performance, would be unprecedented in the literature and, even if existent, appear just as functionally meaningful as sensitivity changes occurring under the same conditions.

      Lastly and on a more informal note, we would like to describe a phenomenological percept that was spontaneously reported by 6 out of 7 observers in Experiment 1 and experienced by the author L.M.K. many times. On a small subset of trials, participants in our paradigms have the strong phenomenological impression of perceiving the target in the pre-saccadic center of gaze. This percept is rare but so pronounced that some observers interrupt the experiment to ask which probe orientation they should report if they had perceived two on the same trial (“The orientation of the normal probe or of the one that looked exactly like the target”). Interestingly, the actual saccade target and its foveal equivalent are perceived simultaneously in two spatiotopically separate locations, suggesting that this percept cannot be ascribed to a temporal misjudgment of saccade execution (after which the target would have actually been foveated). We have no data to prove this observation but nonetheless wanted to share it. Experiencing it ourselves has left us with no doubt that the fed-back signal is truly – and almost eerily – perceptual in nature.

      The analysis suggested by the reviewer is very interesting. Yet for several reasons stated in the ‘Suggestions to the authors’ section, our dataset is not cut out for an analysis of noise properties at this level of complexity. We had always planned to resolve these concerns experimentally, i.e., by demonstrating specificity in HRs. We believe that our arguments above provide a strong case for a perceptual phenomenon and have incorporated them into the Discussion of our revised manuscript.

      The second weakness is that the authors' claim that feedback is spatially selective to the fovea is confounded by the fact that acuity and contrast sensitivity are higher in the fovea. Therefore, the subject's performance would already be spatially tuned. Even the very central degree, the foveola, is inhomogeneous. Thus, finding spatially-tuned sensitivity to the probes may simply indicate global feature gain on top of already spatially tuned processing in the fovea. Another possible explanation that is consistent with the "no enhancement" interpretation is that the fovea has increased. This is consistent with the observation that the congruency effects were aligned to the center of gaze and not the saccade endpoint. It looks from the Gaussian fits that a single gain parameter would explain the difference in the shape of the congruent and incongruent hit rates, but I could not figure out if this was explicitly tested from the existing methods. Additional experiments without prepared saccades would be an easy way to address this issue. Is the hit rate tuned when there is no saccade preparation? If so, it seems likely that the spatial selectivity is not tuned feedback, but inhomogeneous feedforward processing.

      We fully agree. We do not consider a fixation condition diagnostic to resolve this question since, as of now, correlates of foveal feedback have exclusively been observed during fixation. In those studies, it was suggested that the effect, i.e., a foveal representation of peripheral stimuli, reflects the automatic preparation of an eye movement that was simply not executed[11,12,14]. To address another reviewer’s comment, we collected additional data in a fixation experiment. The probe stimulus could exclusively appear in the screen center (as in Experiment 1) and observers maintained fixation throughout the trial. While pre-saccadic congruency effects were significantly more pronounced and developed faster, congruency effects did emerge during fixation when the probe appeared 200 ms after the target. If pre-saccadic processes indeed spill over to fixation tasks to some extent and trigger relevant neural mechanisms even when no saccade is executed, we could expect a similar feedback-induced spatial profile during fixation. Since this matches the reviewer’s prediction if the pre-saccadic profiles resulted from inhomogeneous feedforward processing, we do not consider a fixation condition suitable to distinguish between both hypotheses.

      To test whether the tuning of enhancement is effectively a consequence of declining visual performance in the parafovea/periphery, we instead raised parafoveal performance to a foveal level by adaptively increasing the opacity of the probe: while leaving all remaining experimental parameters unchanged, we presented the probe in one of two parafoveal locations, i.e., 3 dva to the left or right of the screen center. Observers were explicitly informed about the placement of the probe. We administered a staircase procedure to determine the probe opacity at which performance for parafoveal target-incongruent probes would be just as high as foveal performance had been in the preceding sessions. While the foveal probe was presented at a median opacity of 28.3±7.6%, a parafoveal opacity of 39.0±11.1% was required to achieve the same performance level. As a result, the gray dot at 0 dva in the figure below represents the incongruent HR in the center of gaze and lies at ~80% on the y-axis. The gray dots at ±3 dva represent incongruent parafoveal HRs and also lie at ~80% on the y-axis. Using the reviewer’s terminology, we effectively removed the influence of acuity- (or contrast-sensitivity-) dependent spatial tuning. If the spatial profiles had indeed been the result of “global feature gain on top of already spatially tuned processing”, this manipulation should render parafoveal feature gain just as detectable as foveal feature gain. Instead, congruent and incongruent parafoveal HRs were statistically indistinguishable (away from the saccade target: p = .127, BF10 = 0.531; towards the saccade target: p = .336, BF10 = 0.352), inconsistent with the idea of a spatially global feature gain.

      We had included these data in our initial submission. They were collected in the same observers that contributed the spatial profiles (Experiment 2). The data points at 0 dva in the reduced figure above correspond to the foveal probe location in Figure 2D. The data points at ±3 dva had been plotted and discussed in our initial submission, yet only very briefly. Based on this and another reviewer’s comment, we realize that we should have explained this condition more extensively in the main text rather than in the Methods and have added a dedicated paragraph to the Results section.

      This paper is important because it compellingly demonstrates that visual processing in the fovea anticipates what is coming once the eyes move. The exact form of the modulation remains unclear and the authors could do more to support their interpretations. However, understanding this type of active and predictive processing is a part of the puzzle of how sensory systems work in concert with motor behavior to serve the goals of the organism.

      Reviewer #3 (Public Review):

      This manuscript examines one important and at the same time little investigated question in vision science: what happens to the processing of the foveal input right before the onset of a saccade. This is clearly something of relevance as humans perform saccades about 3 times every second. Whereas what happens to visual perception in the visual periphery at the saccade goal is well characterized, little is known about what happens at the very center of gaze, which represents the future retinal location where the saccade target will be viewed at high resolution upon landing. To address this problem the authors implemented an elegant experiment in which they probed foveal vision at different times before the onset of the saccade by using a target, with the same or different orientation with respect to the stimulus at the saccade goal, embedded in dynamic noise. The authors show that foveal processing of the saccade target is initiated before saccade execution, resulting in the visual system being more sensitive to foveal stimuli whose features match those of the stimulus at the saccade goal. According to the authors, this process enables a smooth transition of visual perception before and after the saccade. The experiment is well designed and the results are solid. Overall, I think this work represents a valuable contribution to the field, and its results have important implications. My comments below:

      1. The change in the overall performance between the baseline condition and when the probe is presented after the saccade target is large, but I wonder if there are other unrelated factors that contribute to this difference, for example, simply presenting the probe after vs before the onset of a peripheral stimulus, or the fact that in the baseline the probe is presented right after a fixation marker, but in the other condition there was a longer time interval between the presentation of the marker and the probe transient. The authors should discuss how these confounding factors have been accounted for.

      We thank the reviewer for this helpful comment. We would like to clarify that the probe was never presented right after the fixation dot. In the baseline condition, fixation dot and target were separated by 50 ms, i.e., the duration of one noise image. Since the fixation dot was an order of magnitude smaller than the probe (0.3 vs 3 dva in diameter) and since two large-field visual transients caused by the onset of a new background noise image occurred between fixation dot disappearance and probe appearance, we consider it unlikely that the performance difference was caused by any kind of stimulus interaction such as masking. Nonetheless, we had been puzzled by this difference already when inspecting preliminary results and wondered if it may reflect observers’ temporal expectations about the trial sequence. We therefore explicitly instructed and repeatedly reminded observers that the probe could appear before the peripheral target. Since the difference persisted, we ascribed it to a predictive remapping of attention to the fovea during saccade preparation, as we had stated in the Discussion.

      Another contributing factor may be that observers approached the oculomotor and perceptual detection tasks sequentially. In early trial phases, they may have prioritized localizing the target and programming the eye movement. After motor planning had been initiated, resources may have been freed up for the foveal detection task. Since on the majority of probe-present trials, the probe appeared after the saccade target, this strategy would have been mostly adaptive. Crucially, however, observers yielded similar incongruent Hit Rates in the baseline and last pre-saccadic time bin (70% vs 74%). While we observed pronounced enhancement in the last pre-saccadic bin, congruent and incongruent Hit Rates in the baseline bin were virtually identical. We therefore conclude that lower overall performance in the baseline bin did not prevent congruency effects from occurring. Instead, congruency effects started developing only after target appearance. We have added this potential explanation to the Results.

      1. Somewhat related to point 3, the authors conclude that the effects reported here are the result of saccade preparation/execution, however, a control condition in which the saccade is not performed is missing. This leaves me wondering whether the effect is only present during saccade preparation or if it may also be present to some extent or to its full extent when covert attention is engaged, i.e when subjects perform the same task without making a saccade.

      Foveal feedback has, as of now, exclusively been demonstrated during fixation (see references in Introduction and Discussion). In most of these studies, it was suggested that these effects (i.e., the foveal representation of a peripheral stimulus) may reflect the automatic preparation of an eye movement that was simply not executed[11,12,14]. Since foveal feedback has been demonstrated during fixation, and since eye movement preparation may influence foveal processing even when the eyes remain stationary, we considered it likely that congruency effects would emerge during fixation. Nonetheless, we agree with the reviewer that an explicit comparison between saccade preparation and fixation would enrich our data set and allow for stronger conclusions. We therefore collected additional data from seven observers. While all remaining experimental parameters were identical to Experiment 1, observers maintained fixation throughout each trial. We found that pre-saccadic foveal enhancement was more pronounced and emerged earlier than foveal enhancement during fixation. We present these data in the Results section (Figure 5) and have updated the Methods section to incorporate this additional experiment. We have furthermore added a paragraph to the Discussion which addresses potential mechanisms of foveal enhancement during fixation and saccade preparation.

      Furthermore, the reviewer’s comment helped us realize that we never stated a crucial part of our motivation explicitly. We now do so in the Introduction:

      “Despite the theoretical usefulness of such a mechanism, there are reasons to assume that foveal feedback may break down while an eye movement is prepared to a different visual field location. First and foremost, saccade preparation is accompanied with an obligatory shift of attention to the saccade target[6-8] which in turn has been shown to decrease foveal sensitivity[9]. Moreover, the execution of a rapid eye movement induces brief motion signals on the retina[20] which may mask or in other ways interfere with the pre-saccadic prediction signal. On a more conceptual level, the recruitment of foveal processing as an ‘active blackboard’[21] may become obsolete in the face of an imminent foveation of relevant peripheral stimuli – unless, of course, foveal processing serves the establishment of trans-saccadic visual continuity.”

      We believe that the additional data and the revisions to the Introduction and Discussion have strengthened our manuscript and thank the reviewer for this comment.

      1. Differently from other tasks addressing pre-saccadic perception in the literature here subjects do not have to discriminate the peripheral stimulus at the saccade goal, and most processing resources are presumably focused at the foveal location. Could this have influenced the results reported here?

      This is true. We intentionally made the features of the peripheral target as task-irrelevant as possible, contrary to previous investigations. We wanted to ensure that the enhancement we find would be automatic and not induced by a peripheral discrimination task, as we state in the Discussion and the Methods. We agree that the foveal detection task likely focused processing resources on the center of gaze in Experiment 1. In Experiment 2, however, we measured the spatial profile of enhancement which involved two different conditions:

      1. In each observer’s first six sessions, the probe could be presented anywhere on a horizontal axis of 9 dva length. On a given trial, an observer could not predict where it would appear, and therefore could not strategically allocate their attention. Nonetheless, enhancement of target-congruent orientation information was tuned to the fovea.
      2. In the final, seventh session, the probe appeared exclusively in one of two possible peripheral locations: 3 dva to the left or 3 dva to the right of the screen center. Observers were explicitly informed that the probe would never appear foveally, and processing resources should therefore have been allocated to the peripheral probe locations. The general performance level in this condition was comparable to performance in the fovea (see reply to the next comment). Nonetheless, we did not find peripheral enhancement of target-congruent information.

      Importantly, the magnitude of the foveal congruency effect in the PRE-only condition of Experiment 1 (i.e., when the target disappeared before the eyes landed on it) was comparable to the foveal congruency effect in Experiment 2 (PRE-only throughout), suggesting that the format of the task – i.e., purely foveal detection or foveal and peripheral detection – did not alter our findings.

      1. The spatial profile of the enhancement is very interesting and it clearly shows that the enhancement is limited to a central region. To what extent is this profile influenced by the fact that the probe was presented at larger eccentricities and was therefore less visible at 4.5 deg than at 0 deg? According to the caption, when the probe was presented more eccentrically the performance was raised to a foveal level by adaptively increasing probe transparency. This is not clear: was this done separately based on performance at baseline? Does this mean that the contrast of the stimulus was different for the points at ±3 dva but the performance was comparable at baseline? Please explain.

      Based on the previous comment and comments of Reviewer #2, we realize that we should have explained this condition more extensively in the main text rather than in the Methods and have adapted the manuscript accordingly. As stated in our reply to the previous comment, Experiment 2 involved one session in which we addressed whether the lack of parafoveal/peripheral enhancement could be due to a simple decrease in acuity as mentioned by the reviewer. Observers were explicitly informed that the to-be-detected stimulus (the probe) would appear either 3 dva to the left or right but never in the screen center and were shown slowed-down example trials for illustration. Observers then performed a staircase procedure which was targeted at determining the probe contrast at which performance for parafoveal target-incongruent probes would be just as high as foveal performance for target-incongruent probes had been in the previous six sessions. While the foveal probe was presented at a median opacity of 28.3±7.6%, an opacity of 39.0±11.1% was required to achieve the same performance level at a 3 dva eccentricity. Therefore, the gray curve in Figure 2D that represents incongruent Hits reaches its peak just under 80% on the y-axis. The gray dots at ±3 dva also lie at ~80% on the y-axis. The performance level for target-incongruent probes (‘baseline’ here) in the parafovea is thus equal to foveal performance for target-incongruent probes. Target-congruent parafoveal feature information had the same “chance” to be enhanced as foveal information in the preceding sessions. Although performance was equated, we found no parafoveal enhancement. This suggests that enhancement is a true consequence of visual field location and not simply mediated by visual acuity at that location.
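
      The reply does not specify which staircase rule was used. As an illustration only, the sketch below shows a weighted up-down rule, one common way to make a staircase converge on a chosen hit rate (here the ~80% level mentioned above). The function names, the step size, and the choice of rule are assumptions and are not taken from the manuscript.

```python
def update_opacity(opacity, hit, target=0.8, step_down=0.01):
    """Apply one weighted up-down staircase step to the probe opacity.

    Assumed rule (not from the manuscript): after a hit the opacity drops by
    step_down; after a miss it rises by step_down * target / (1 - target).
    The result is clamped to the valid opacity range [0, 1].
    """
    step_up = step_down * target / (1.0 - target)
    new_opacity = opacity - step_down if hit else opacity + step_up
    return min(1.0, max(0.0, new_opacity))


def expected_drift(hit_rate, target=0.8, step_down=0.01):
    """Average opacity change per trial for an observer with a given hit rate."""
    step_up = step_down * target / (1.0 - target)
    return hit_rate * (-step_down) + (1.0 - hit_rate) * step_up
```

      With these step sizes the expected change in opacity is zero exactly when the observer's hit rate equals the target, positive when performance is below the target, and negative when it is above, which is why the opacity settles near the level that yields the desired performance.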

      1. The enhancement is significant within a region of 6.4 dva around the center of gaze. This is a rather large region, especially considering that it extends also in the direction opposite to the saccade. I was expecting the enhancement to be more confined to the central foveal region. Was the effect shown in Figure 2D influenced by the fact that saccades in this task were characterized by a large undershoot (Fig 1 D)? Did the effect change if only saccades landing closer to the target were included in the analysis? There may not be enough data for resolving the time course, but maybe there are differences in the size of the main effect.

      Width of the profile: In general, the width of the enhancement profile is likely to be influenced by two experimental/analysis choices: the size of the probe stimulus presented during the experiment and the width of the moving window combining adjacent probe locations for analysis.

      Probe size: Since the probe itself had a comparably large diameter of 3 dva, even the leftmost significant point at -2.6 dva could be explained by an enhancement of the foveal portion of the probe. We had mentioned this briefly in the Discussion but realize that this point is crucial and should be made more explicit.

      Moving window width: We designed the experiment with the intention to densely sample a range of spatial locations during data collection and combine a certain number of adjacent locations using a moving window during analysis (see preregistration: https://osf.io/6s24m). To ensure the reliability of every data point, the width of this window was chosen based on how many trials were lost during preprocessing. We chose a window width of 7 locations as this ensured that each data point contained at least 30 trials on an individual-observer level. Nonetheless, the width of the resulting enhancement profile depends on the width of the moving window:

      We added these caveats to the Results section and incorporated the figure above into the Supplements. We now state explicitly that…

      “the main conclusions that can be drawn are that enhancement i) peaks in the center of gaze, ii) is not uniform throughout the tested spatial range as, for instance, global feature-based attention would predict, and iii) is asymmetrical, extending further towards the saccade target than away from it.”

      For the above reasons, the absolute width of the profile should be interpreted with caution.
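
      The dependence of the profile width on the window width can be illustrated with a minimal sketch. It assumes per-location hit and trial counts ordered from left to right; the function name and the toy numbers in the usage note are hypothetical and do not reproduce the authors' analysis code or data.

```python
def moving_window_hit_rates(hits, trials, width=7):
    """Pool adjacent probe locations with a sliding window.

    hits[i] and trials[i] are the hit count and trial count at probe
    location i. Returns one (hit_rate, n_trials) pair per window position;
    hit_rate is NaN if a window contains no trials.
    """
    if len(hits) != len(trials):
        raise ValueError("hits and trials must have the same length")
    if width < 1 or width > len(hits):
        raise ValueError("width must be between 1 and the number of locations")
    pooled = []
    for start in range(len(hits) - width + 1):
        n_hits = sum(hits[start:start + width])
        n_trials = sum(trials[start:start + width])
        rate = n_hits / n_trials if n_trials else float("nan")
        pooled.append((rate, n_trials))
    return pooled
```

      For example, with nine locations of five trials each and hits only at the central location, a window of 3 yields three non-zero windows and a window of 5 yields five: an effect confined to one location is smeared across as many windows as the window is wide, which is one reason the absolute width of a pooled profile should be read with caution.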

      Saccadic landing accuracy: To address the reviewer’s question, we inspected the spatial enhancement profile separately for trials in which the saccade landed on the target (i.e., within a radius of 1.5 dva from its center) or off-target but still within the accepted landing area. This trial separation criterion, besides appearing meaningful, ensured that all observers contributed trials to every data point. We had never resolved the time course in this experiment and could therefore not collapse across time points as suggested by the reviewer. To increase the number of trials per data point, we instead increased the width of the moving window sliding across locations from 6 to 9 neighboring locations (but see caveat above).

      Considering only saccades that landed on the target (‘accurate’; A) yielded significant enhancement from -2.6 to 2.1 dva and from 3.2 dva throughout the measured range towards the saccade target. Saccades that landed off-target (‘inaccurate’; B) showed a more pronounced asymmetry. When only considering inaccurate saccades, enhancement reached significance between -1.1 and 4.4 dva.

      The increased asymmetry for inaccurate saccades may be related to predictive remapping: since inaccurate saccades were hypometric on average, the predictively remapped location of the target was shifted towards the target by the magnitude of the undershoot. Asymmetric enhancement would therefore have boosted congruency at the remapped target location across all trials. In consequence, we inspected if aligning probe locations to the remapped target location on an individual-trial level would lead to a narrower profile for inaccurate saccades. This was not the case. Instead, we observed two parafoveal maxima (C). Their position on the x-axis equals the mean remapping-dependent leftwards (2.0 dva) and rightwards (1.9 dva) displacement across trials. In other words, they correspond to the pre-saccadic center of gaze. Note that these profiles could not be fitted with a mixture of Gaussians and were fitted using polynomials instead.  

      In sum, while we do not observe a clear narrowing of the enhancement profile for accurate saccades, the profile’s asymmetry is more pronounced for inaccurate eye movements. An increase in asymmetry could bear functional advantages since it would boost congruency at the remapped target location across all trials. Importantly though, this adjustment seems to rely on an estimate of average rather than single-trial saccade characteristics: aligning probe locations to the remapped attentional locus on an individual trial level provides further evidence that, irrespective of individual saccade endpoints, enhancement was aligned to the fovea. We have added these analyses to the Results section (Figure 3). We have also added the remapped profiles for all saccades and accurate saccades only to the Supplements.

      1. Is the size of the enhanced region around the center of gaze related to the precision of saccades? Presumably, if saccades are less precise a larger enhanced area may be more beneficial.

      This is a very interesting point. To address this question, we estimated each observer’s saccadic precision by computing bivariate kernel densities from their saccade landing coordinates. As we measured the horizontal extent of enhancement in our experiment, we defined the horizontal bandwidth as an estimate of saccadic imprecision. To estimate the size of the enhanced region for each observer, we created 10,000 bootstrap samples for each observer’s congruent and incongruent HRs (4 locations combined at each step). We then determined the difference between the bootstrapped congruent and incongruent HRs and defined significantly enhanced locations as all locations for which <= 5% of these differences fell below zero. We then defined the width of the enhancement profile as the maximum number of consecutive significant locations.
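
      The bootstrap criterion and the width definition described above can be sketched as follows. This is a minimal illustration, assuming that trial outcomes are stored as lists of 0/1 values per (already pooled) location and that trials are resampled with replacement within each condition; the function names, the seed, and the data layout are assumptions rather than the authors' implementation.

```python
import random


def enhanced_locations(congruent, incongruent, n_boot=10_000, alpha=0.05, seed=0):
    """Flag locations where congruent hit rates reliably exceed incongruent ones.

    congruent[i] and incongruent[i] are lists of 0/1 trial outcomes at
    location i. A location counts as enhanced if at most `alpha` of the
    bootstrapped hit-rate differences fall below zero.
    """
    rng = random.Random(seed)
    flags = []
    for con, inc in zip(congruent, incongruent):
        below_zero = 0
        for _ in range(n_boot):
            hr_con = sum(rng.choices(con, k=len(con))) / len(con)
            hr_inc = sum(rng.choices(inc, k=len(inc))) / len(inc)
            if hr_con - hr_inc < 0:
                below_zero += 1
        flags.append(below_zero / n_boot <= alpha)
    return flags


def longest_run(flags):
    """Return the maximum number of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best
```

      Calling `longest_run(enhanced_locations(congruent, incongruent))` then gives the width of the enhanced region in units of locations, in the sense defined above.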

      Instead of a positive correlation, we observed a negative correlation between the bandwidth of landing coordinates (i.e., saccadic imprecision) and the size of the enhanced window (r = -.56, p = .117). In other words, there was a non-significant tendency that the less precise an observer’s saccades, the narrower their estimated region of enhancement. We furthermore inspected the magnitude of enhancement per position within the enhanced region. To do so, we computed the mean difference between congruent and incongruent HR across all positions in the enhanced region. The sizes of the orange circles in the figure above represent the resulting values (ranging from 2.9% to 13.3%). As saccadic precision decreases, the magnitude of enhancement per data point in the enhanced region tends to decrease as well. We therefore suggest that high saccadic precision is a sign of efficient oculomotor programming, which in turn allows peri-saccadic perceptual processes to operate more effectively. We added this analysis to the Supplements and refer to it in the Results section of the revised manuscript.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01158

      Doi preprint: https://doi.org/10.1101/2021.11.16.468835

      Corresponding author(s): Salah, MECHERI

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      2. Point-by-point description of the revisions

      This section is mandatory. Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Whole sporozoite vaccines confer sterilizing protection against Plasmodium infection. However, further improvement of whole sporozoite vaccines is needed and requires a thorough understanding of the immune processes that mediate protection and the deployment of novel strategies that further augment protective immunity while limiting the impact of factors that are detrimental to protection. Work from the Mecheri laboratory and others had previously established that IL-6 signaling plays a critical role in the immune response to a liver stage infection; engagement of IL-6 signaling promotes the initial control of a liver stage infection and enhances the protective adaptive immune response. Given this potent protective role for IL-6, Belhimeur and colleagues design a parasite strain in rodent malaria parasites that encodes and secretes murine IL-6 during liver stage infection. They show that upon infection of wildtype mice, these transgenic parasites i) are unable to transition to blood stage infection, ii) produce IL-6 and iii) induce a durable adaptive immune response that can protect against sporozoite challenge. This study is novel and intriguing. However, a superficial analysis of the transgenic parasite strain, an incomplete analysis of the immune response to infection and the lack of data regarding the possibility of IL-6 mediated immunopathology have dampened this reviewer's enthusiasm for the work.

      **Major Concerns:**

      1) The data in Figure 3b-3d clearly indicate that the IL-6 encoding transgenic parasites exhibit a defect in parasite development within HepG2 cells that is maintained in vivo. The authors propose that an arrest of these parasites in the liver stage precludes their transition to blood stage infection and that this arrest is dependent on IL-6 signaling. To better support that claim the authors should:

      a. Better characterize in vivo liver stage arrest using infected liver tissue analysis with immunofluorescence microscopy to determine when and how precisely IL-6 transgenic parasites are impacted in development.

      Done. New data in figure 3B, C, D

      b. Determine if arrested development of IL-6 transgenic parasites is truly dependent on IL-6 signaling using antibody blockade of IL-6 signaling and mice with genetic defects in IL-6 signaling.

      Experiments were done using anti-IL-6 receptor blocking antibodies, but they did not work. This is commented on in the text and shown in Supplementary Fig 2.

      2) The authors claim that IL-6 production and secretion into the liver tissue augments the adaptive immune response to liver stage infection. This in turn results in durable adaptive immune responses that protect against infection. However, the mechanistic underpinning of IL-6 signaling in the liver that is induced by their transgenic parasites and the impact on adaptive immune responses is poorly characterized:

      a. There is no evidence that the protective adaptive immune response induced by IL-6 transgenic parasite infection is dependent on IL-6 signaling. Is superior protection and immunogenicity lost in IL-6 signaling deficient animals that are infected with IL-6 transgenic parasites?

      Not addressed but the point is that IL-6 leads to attenuation.

      b. What elements of the adaptive immune response are impacted? One can imagine that IL-6 mediated killing of infected hepatocytes might introduce more parasite antigen that can be acquired by antigen presenting cells, or that IL-6 mediated pro-inflammatory signaling might regulate the maturation of antigen presenting cells, increased differentiation of helper T cells, the downregulation of regulatory T cell function and frequency and/or the differentiation of effector CD8 T cells into long-lived hepatic memory CD8 T cells. The authors should conduct a more comprehensive analysis of how parasite-encoded IL-6 impacts adaptive immunity.

      Done. An extensive analysis of CD4 and CD8 phenotype and status of activation is represented in Fig 9.

      3) While IL-6 transgenic parasites induce a potent and durable adaptive immune response, the authors should show how this compares to published whole sporozoite immunizations. The authors should determine if immunization with IL-6 transgenic parasites is superior to, for example, immunization with radiation-attenuated sporozoites and genetically attenuated sporozoites.

      This is not the point. The work presented here emphasizes the proof of concept that the proposed new strategy works. Follow-up studies will compare this model to previous ones.

      4) IL-6 signaling is a major player in inflammatory diseases and the induction of immunopathology. As such the authors should carefully examine the duration and magnitude of IL-6 protein production in the liver, and serum after IL-6 Tg parasite infection and determine if IL-6 signaling promotes liver immunopathology.

      Not done, but this point is discussed in the text. We also made clear in the Materials and Methods section that, given how the construct was made, IL-6 production is restricted to the first 48 h of liver infection, because expression of the IL-6 gene is under the control of the LISP-2 promoter. Therefore, there is no persistence of IL-6 production by liver stage parasites.

      Reviewer #1 (Significance (Required)):

      The paper reports a novel strategy to generate a whole sporozoite vaccine: expression of IL-6 in a transgenic parasite. This could be a significant contribution to the field if additional experiments as outlined in the critique are conducted. The work might also inform vaccine design for other pathogens.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript describes the construction of a Plasmodium berghei that expresses murine interleukin-6 in exoerythrocytic (liver stage) parasites and the analysis of mice infected with sporozoites of this parasite line. They find that such parasites do not complete development in liver cells and therefore do not produce subsequent infection in red blood cells. The effect of prior infection with these parasites on the ability of the host to resist both wild type and heterologous species challenge is then examined.

      The key assumption that underlies the study is that the observed phenotypes result from parasite expression of bioactive IL-6 that functions to modulate the immune system. Other explanations are not considered, for example the over-expression of secreted IL-6 may prevent the complete maturation of the intracellular parasite by clogging up the parasite secretory pathway. The authors use the 'wild type' parasite as the control but not only does the wild type not express IL-6 it also does not express the human DHFR gene used as a selection system. A much better control parasite would be one that expresses a non-bioactive IL-6 so that the potential effects on parasite maturation can be differentiated from those on the mouse immune system. Another control to be considered would be comparison with a genetically attenuated parasite with a block in late stage development, and which does not produce a host cytokine.

      Interesting comment, but a key novel result is that co-infection studies show a reversed phenotype of IL-6 transgenic parasites, likely because wild type parasites counteract the IL-6 effect (Supplementary Fig 1).

      Another assumption is that IL-6 is secreted from the infected liver cell and mediates its effects, presumably by binding to its cell surface receptor. The expectation of IL-6 secretion from the parasite is that it would accumulate in the parasitophorous vacuole - how would it get out of the infected host cell? While evidence is provided of IL-6 in the in vitro culture supernatant of infected cells, this might arise from damaged cells in rather artificial conditions. Have the authors considered doing the experiment of concurrent mouse infection with both wild type and recombinant parasites? If the mechanism of parasite killing in infected liver cells is as proposed, then a reduction of wild type parasites in the subsequent asexual blood stage would be expected.

      Experiments done. We discussed both experiments: the IL-6 receptor blocking antibody experiment (Suppl Fig 2) and the mixed infection (Suppl Fig 1).

      Figure 3 indicates that IL-6 TgPbA/LISP2 parasites are as efficient or better than wild type parasites at invading host cells but then they do not develop to maturity. What is the evidence that the key factor in their ability to immunize the host is expression of IL-6 rather than the effect of an attenuated parasite?

      This is an interesting observation made by the reviewer. With the available data, we cannot really tell which of the two possibilities is operating in this system. It could also be that the two options are interconnected.

      In this model malaria infection, it looks like there are two lethal outcomes: one associated with experimental cerebral malaria at relatively low blood stage parasitemia (which I understand is a controversial model for human cerebral malaria) and the second associated with high blood stage parasitemia. Some of the protocols affect which outcome occurs (see for example Fig 6), but this observation is not properly discussed.

      On many occasions in the past, we have seen a discrepancy between anti-parasite immunity and anti-disease protection. In this particular experiment (Fig 6), we explored the dose effect of the IL-6 mutant. What is clear from this model is that at the high dose, 10^4 SPZ, we observe both anti-parasite and anti-disease protection and immunity, whereas at the lower doses, 10^3 and 10^2 SPZ, although there was no efficient anti-parasite immunity, mice did not die from cerebral malaria but much later from hyperparasitemia. We consider that the two low doses of IL-6 transgenic parasites did protect against disease expression.

      For the data presented in Fig 7, why was there a challenge with WT PbA sporozoites before the heterologous Py challenge? If this step is excluded is there still an effect against P. yoelii? Why was the parasite chosen for the heterologous challenge Py17XNL? Since this parasite is largely restricted to reticulocytes in the blood stream would a different effect have been observed if the heterologous challenge parasite was, for example, P. chabaudi?

      Out of scope.

      Although the expectation is that IL-6 expression would not occur in the asexual blood stage, I think it would be important to demonstrate experimentally that this is the case.

      Done. IL-6 transgenic parasites, when inoculated as infected erythrocytes, have no developmental defect and grow normally in infected mice.

      In Fig. 4A the y-axis is labelled IL-6 rRNA when it should be IL-6 mRNA.

      Corrected

      Reviewer #2 (Significance (Required)):

      The significance of the report does depend on whether or not the experimental evidence is sufficient to support the claim that parasite expression of IL-6 is important in generating immunity. There has been a number of studies to show that infection with sporozoites that have been genetically attenuated to not complete subsequent development in the infected liver cell can provide immunity to subsequent infection; what is different about this study is that the authors specifically target the parasite to express a host protein that is likely to be important in acquisition of immunity. Therefore for the study to have high significance they have to show convincingly that it is the expression and activity of IL-6 that is important and I do not think this is the case with the experiments reported. If the authors are correct, then the idea of manipulating the host response by expression of host proteins by the parasite may be an attractive approach to dissect the key elements of immunity to sporozoite infection. At the moment, although there is a lot of focus on developing an attenuated whole sporozoite vaccine against malaria, and this study may provide proof of principle for including a host component in the parasite, there would still be long way to go before any practical application of this approach.

      The key message was toned down. As formal demonstration that the expression and activity of IL-6 are directly involved in the protective immunity conferred by IL-6 transgenic parasites is lacking, we suggest toning down the message by saying that IL-6 attenuates parasite virulence, the mechanism likely being a detrimental effect of IL-6 signaling on parasite development.

      The audience would be those interested in parasite immunology.


      Reviewer expertise: malaria parasite cell and molecular biology; host immunity.

      **Referees cross commenting**

      I think all reviewers are of the opinion that there needs to be a better demonstration that the observed phenotype is mediated by expression and signaling of IL-6, for example by antibody blockade or using a mouse line with a genetic defect in IL-6 signaling. Looking at all the issues that have been raised by the reviewers and need to be addressed with further experimentation, my feeling is that this will take longer than 6 months.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary** This study explores the expression of murine IL-6 by rodent Plasmodium berghei as a means to generate transgenic parasites whose development in the liver is arrested, which may be used as a genetically attenuated pre-erythrocytic vaccine against malaria. The authors conclude that IL-6-expressing Plasmodium parasites elicit CD8+ T-cell mediated immune responses that protect against a subsequent challenge with infectious sporozoites.

      **Major Comments**

      In Figure 3, the authors show the results of qRT-PCR analysis of mouse livers infected with WT or transgenic parasites. They then use HepG2 cells to assess hepatic parasite numbers and development. Why didn't the authors assess this also in vivo, in liver sections of infected mice?

      Done. New data are presented in Fig 3B, C, D

      Linked to the above, a more complete analysis of the parasite's behavior in HepG2 cells should be provided. The authors write in the discussion that "IL-6 transgenic parasites develop perfectly well in cultured hepatocytic cells". Does this mean that they develop to the production of infectious merozoites? This could be confirmed by allowing the infected cultures to progress for 60-70 hours and then collecting the supernatants of these cultures and injecting them into naïve mice, to understand whether or not infectious merozoites are formed in vitro.

      New analyses demonstrate that IL-6 transgenic parasites actually display a developmental defect at the pre-erythrocytic stage in vivo.

      Figure 3C: The authors mention this result almost in passing but fail to provide an explanation for this observation. Why is the number of transgenic parasite EEFs approximately double that of WT parasite EEFs?

      A new Figure 3 is provided and shows that the EEF density (Fig 3B) was drastically reduced both at 24h and at 48h in mice infected with the IL-6 transgenic parasites as compared to those infected with WT PbA parasites, although the differences were not statistically significant. We also examined the size (Fig. 3C) of EEFs and found the same tendency, namely a reduced size and diameter of IL-6 transgenic EEFs as compared to those of WT PbA EEFs, with a statistical difference only at 40h.

      Figure 3D: The EEF area units (mm2) on the YY axis are certainly wrong. However, they cannot be um2 either, as 15-30 um2 would be far too small for EEFs at 48 hours post-infection. What is it then?

      New data are now provided in a new Fig 3.

      The authors write "... suggest that the failure of IL-6 Tg-PbANKA/LISP2 parasites to develop in the liver of infected mice is likely due to an active anti-parasite immune response mediated by parasite-encoded IL-6 in vivo". I have several issues with this statement. 1) as mentioned above, the in vitro data cannot be used to draw definitive conclusions about the parasites' behavior in vivo; 2) the transgenic parasites do not "fail to develop in the liver of infected mice". If anything, they develop less than their WT counterparts, which is different from "failing to develop". Clarifying how much they do develop would be important (see next comment).

      We provide new in vivo data on the development of IL-6 transgenic parasites. A new Figure 3 is provided and shows that the EEF density (Fig 3B) was drastically reduced both at 24h and at 48h in mice infected with the IL-6 transgenic parasites as compared to those infected with WT PbA parasites, although the differences were not statistically significant. We also examined the size (Fig. 3C) of EEFs and found the same tendency, namely a reduced size and diameter of IL-6 transgenic EEFs as compared to those of WT PbA EEFs, with a statistical difference only at 40h. We replaced "failure" with "a defect in development".

      In connection with the above, I would like to know more about the time when the development of IL-6 Tg-PbANKA/LISP2 parasites is arrested in vivo, in the liver. Are these early- or late-arresting parasites? Is the liver stage of infection compromised during parasite development or at egress? To clarify this, the manuscript would benefit from a timecourse analysis of liver sections of mice infected with this parasite, including data on EEF numbers and sizes up to and beyond 48 h after sporozoite inoculation.

      Done. See new figure 3.

      Still linked to the issue of parasite arrest in vivo and the possibility of breakthroughs, the manuscript would benefit from an experiment where mice were injected with a high number of transgenic sporozoites and parasitemia is monitored thereafter, much like what was done in Figure 2D, but starting off with a larger inoculum of at least 5 x 10^5 sporozoites.

      This was done and there was no breakthrough even with doses as high as 10^6 sporozoites.

      While the results shown suggest that secreted IL-6 restricts the parasite's liver stage development in vivo, this could be more definitely demonstrated by performing an infection with the transgenic parasites in the context of blocking or absence of the host's IL-6 receptor.

      This experiment was done but unfortunately did not work (Suppl. Fig 2). That is, the treatment of mice infected with IL-6 transgenic parasites with anti-IL-6 receptor blocking antibodies did not reverse the infection phenotype. This was also discussed in the manuscript.

      **Minor Comments**

      The manuscript needs to be improved in terms of both language and format. Some examples, solely from the abstract, are listed below, but the manuscript needs to be appropriately revised in terms of language, grammar, punctuation and format throughout:

      -Space missing between "P." and "berghei"

      Done

      -Gene names should be italicized

      Done

      -Rephrase "Considering IL-6 as a critical proinflammatory signal..." to "Considering that IL-6 is a critical proinflammatory signal..."

      Done

      -"transgenic IL-6 sporozoites" should be "transgenic IL-6-expressing P. berghei sporozoites"

      Done

      -"impairs Plasmodium infection at the liver stage" should be "impairs the liver stage of Plasmodium infection"

      Done

      INTRODUCTION

      The sentence "Among them, parasites lacking integrity of the parasitophorous vacuole, or late during development, and..." appears to be incomplete and needs rephrasing.

      Done

      The references used in sentence "During the last decade, in search of key mechanisms that determine the host inflammatory response, a set of host factors turned out to be critical for malaria parasite liver stage development (Mathieu et al., 2015); (Demarta-Gatsi et al., 2017; Demarta-Gatsi et al., 2016) (Grand et al., 2020)" do not all relate to the liver stage of infection. The authors need to select references that are relevant for their statement or else change the statement.

      Rephrased

      RESULTS

      I suggest the authors change the title of Results section "Transgenic P. berghei parasites expressing IL-6 during the liver stage lose infectivity to mice" not only to improve the quality of the English language employed but also to better clarify the notion that they are talking about hepatic infectivity.

      On the same section, please correct "timely specific timely".

      Done

      Transfectants are not "verified". If anything, the insertion of the gene in the parasite's genome is verified or, better still, confirmed.

      Done

      Sentence "The two lines behave similarly" is redundant.

      Done

      The legend of Figure 1 must include the definitions of all the acronyms in that figure.

      Acronyms in the whole manuscript are defined elsewhere

      "IL-6 transgenic sporozoites" is not an appropriate designation. If anything, they should be called IL-6-expressing P. berghei sporozoites".

      Done

      Figure 2 B: The YY axis should clarify that it refers to sporozoite numbers, as there are many other parasite stages in mosquitoes.

      Done

      Figure 2C: This scheme is hardly necessary. It would suffice to label the plots in D and E with the names of the parasite lines employed rather than "Group 1", "Group 2", "Group 3".

      The scheme is provided for more clarity and easy reading of the accompanying figures.

      Figure 2D, 2E: Why didn't the authors use the same scale on the XX axis of the two plots?

      The qRT-PCR data per se do not substantiate the statement "Therefore, RT-qPCR analysis in the liver confirms that the loss of infectivity of IL-6 Tg-PbANKA/LISP2 SPZ is due to a defect in liver stage development in vivo", as a defect in invasion of hepatocytes cannot be excluded. The term "loss of infectivity" is also misleading. Do the authors mean loss of blood stage infectivity?

      Yes

      Sentence "... all parasites were able to invade and develop inside HepG2 cells." is misleading. The authors probably mean "parasites of both lines".

      Changed

      Figure 4: Why did the authors swap the order of the two experimental groups from one plot to the next? The same order should be used, to avoid confusion! Also, the authors should make the width of the bars similar between the two plots.

      Done

      The authors should consider moving Figure 5 to the Supplementary materials.

      Reviewer #3 (Significance (Required)):

      *Nature and significance of the advance. Compare to existing published knowledge. Audience.*

      This study extends our current knowledge on genetically attenuated malaria vaccine candidates and validates the concept of suicide parasites for immunization against malaria. This paper will be of interest to researchers working on malaria vaccination, as well as all those interested in transgenic Plasmodium parasites, and the biology and immunology of liver stage infection by malaria parasites.

      *Your expertise.*

      The co-reviewer and the reviewer are experts on the liver stage of Plasmodium infection and on pre-erythrocytic malaria vaccination.

      **Referees cross commenting**

      I agree with all of Reviewers 1 and 2's remarks and, upon consideration, I would like to revise my "Estimated time to Complete Revisions" to become between 3 and 6 months

    1. Author Response

      Reviewer #1 (Public Review):

      The general idea of comparing response patterns to stress in the offspring generation is new and very interesting.

      We thank Reviewer 1 for their time and thoughtful comments. We agree that these comparisons are new and very interesting and have added multiple revised analyses to the manuscript based on the reviewer comments that we think will further enhance the impact of and conclusions made in this study.

      However, the data that are presented are in several ways preliminary. The phenotype comparisons are mostly convincing, although statistical treatments are partly unclear, given that each "replicate" includes itself many individuals.

      The statistical treatments for groups of individuals are the same as in Burton et al., 2017, Burton et al., 2020, and Willis et al., 2021, which include the original reports of the intergenerational responses studied here. Replicates that include many individuals are relatively common when working with C. elegans and are usually compared using ANOVA or Student's t-tests (depending on the number of comparisons) to analyze the variation in batch effects as well as differences between populations of animals.

      We believe this ability to assay hundreds or even thousands of animals, in total, for each comparison in this study makes our data substantially stronger and more reliable. However, we are happy to perform any additional statistical tests the reviewer might want to see.

      The transcriptomic data are minimal (only three replicates)

      To address this comment, we compared our original three replicates of RNA-seq from F1 animals from C. elegans parents exposed to P. vranovensis BIGb0446 to a second, independent set of three replicates of F1 animals from C. elegans parents exposed to a second P. vranovensis isolate (BIGb0427 – the data for this second P. vranovensis isolate was already part of Fig. 4 of this manuscript).

      By comparing these three new replicates to our previous findings from three original replicates we found that 515 of the 562 genes that exhibited a >2-fold change and were significant at padj <0.01 in the original three replicates were also changed at >2-fold and padj <0.01 in the new three replicates. We believe our findings that 91.6% of genes change >2-fold and remain significant at padj<0.01 even when the number of replicates is doubled (and a different isolate of P. vranovensis is used!) suggests that adding additional replicates would not substantially change the conclusions of this manuscript.
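
      The overlap quoted above can be re-derived from the two gene counts in the response. The following is only a minimal arithmetic check; the variable names are illustrative and are not taken from the authors' analysis.

```python
# Arithmetic check of the replicate overlap quoted in the response.
# Only the two counts (562 and 515) come from the text; names are illustrative.
genes_original = 562  # genes with >2-fold change and padj < 0.01 in the original replicates
genes_shared = 515    # of those, genes that also pass both cutoffs in the new replicates

overlap_fraction = genes_shared / genes_original
overlap_percent = round(overlap_fraction * 100, 1)
genes_not_reproduced = genes_original - genes_shared

print(f"{overlap_percent}% reproduced; {genes_not_reproduced} genes not reproduced")
```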

      We would also like to highlight, as above, that because this analysis was done on populations of thousands of similarly staged animals, as opposed to individuals, variability between replicates is further reduced. In addition, much of our transcriptomic data from each species was then compared across species, and genes were only analyzed if they changed in multiple different species, each of which represents a separate set of three additional replicates [i.e., genes that change in all 4 species analyzed have to exhibit significant (>2-fold, padj <0.01) changes across 12 total replicates].
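
      The cross-species criterion described above (a gene must pass both cutoffs in every species, each analyzed in triplicate) can be sketched as a simple set intersection. This is a toy illustration, not the authors' pipeline: the species labels, gene names and values are hypothetical, and it assumes that a >2-fold change in either direction counts.

```python
# Toy sketch of a cross-species filter: keep genes passing both cutoffs in every species.
# All species labels, gene names and numbers below are hypothetical.
FOLD_CUTOFF = 2.0           # ">2-fold change"
PADJ_CUTOFF = 0.01          # "padj < 0.01"
REPLICATES_PER_SPECIES = 3  # triplicates per species, as stated in the response


def passes(fold_change, padj):
    """True if the gene changes more than 2-fold (up or down) with padj < 0.01."""
    magnitude = max(fold_change, 1.0 / fold_change)  # treats 0.2x the same as 5x
    return magnitude > FOLD_CUTOFF and padj < PADJ_CUTOFF


def shared_genes(results):
    """Genes that pass both cutoffs in every species of `results`."""
    per_species = [
        {gene for gene, (fc, padj) in table.items() if passes(fc, padj)}
        for table in results.values()
    ]
    return set.intersection(*per_species)


# gene -> (fold change relative to control, adjusted p-value)
results = {
    "species_1": {"gene_a": (4.0, 0.001), "gene_b": (3.0, 0.002), "gene_c": (1.5, 0.001)},
    "species_2": {"gene_a": (0.2, 0.005), "gene_b": (2.5, 0.2)},
    "species_3": {"gene_a": (2.5, 0.0001), "gene_b": (5.0, 0.001)},
    "species_4": {"gene_a": (3.0, 0.009), "gene_b": (2.2, 0.003)},
}

conserved = shared_genes(results)
total_replicates = REPLICATES_PER_SPECIES * len(results)
print(sorted(conserved), total_replicates)
```

      In this toy input only gene_a survives, because gene_b misses the padj cutoff in one species; the replicate count of 12 follows from four species analyzed in triplicate.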

      Our new findings comparing six replicates did not substantially change the number of genes identified when compared to using three replicates, and the fact that for all of the main conclusions of this manuscript each set of triplicates from one species was then compared across 9 additional replicates from three other species from pools of thousands of animals makes us very confident that our results are robust and highly reproducible.

      and lack comparison to the stress responses in the parental animals.

      We agree with Reviewer 1 that comparisons to parental animals are interesting and important. Comparisons of F1 progeny gene expression patterns to parental animals were not included here because such comparisons were previously published in some of our original reports of these intergenerational effects (For example, see Burton et al., 2020). In summary, we found that most, but not all, of the effects on gene expression in F1 animals were also detected in parental animals. However, the transcriptional responses only turn on in F1 animals post gastrulation and do not appear to be due to the simple deposition of parental mRNAs into embryos (Burton et al., 2020).

      We have updated the text to highlight these findings.

      The analysis of the transcriptome data is limited to counting overlaps between significantly changed genes, without deeper discussion of the genes and pathways that are affected.

      In the revised manuscript we have completely redone all of the transcriptomic analysis to use a stricter set of cutoffs for significance – both padj <0.01 and requiring a >2-fold change in expression based on the helpful comments of Reviewer 1 – which we agree with – see below.

      As part of this new analysis we have now also included a deeper discussion of the genes that exhibited similar changes across species, including using g:Profiler to examine the genes that exhibited changes across all four species.

      In addition, we have now paired our phenotypic and transcriptomic data across species to identify 19 new genes that we predict are highly likely to be involved in intergenerational responses to stress based on their expression patterns across species. These 19 genes come out of highly filtered analyses across species that identified a total of 23 genes that change only in species that adapt to P. vranovensis or osmotic stress and not in species that do not adapt.

      Interestingly, this analysis identified nearly all of the previously known genes involved in intergenerational adaptations to these stresses including rhy-1, cysl-1, cysl-2 and gpdh-1. Thus, we predict the remaining 19 genes that came out of this analysis are highly likely to be involved in the responses to these stresses. Furthermore, in the revised text we highlight that our new list of 19 genes includes multiple conserved factors that are required for animal viability including genes involved in nuclear transport (imb-1 and xpo-2), the CDC25 phosphatase ortholog cdc-25.1, and the PTEN tumor suppressor ortholog daf-18. This new analysis will likely form the basis for future investigations into the mechanisms underlying these exciting intergenerational effects.

      We believe this additional analysis greatly improves this manuscript. We are also happy to include any specific additional analysis the reviewer would like to see.

      The top response genes that are directly tested have been discovered before. Hence, while interesting patterns are evident from the data, this work largely confirms prior work, including that described in Burton et al. 2020.

      We have revised the text to highlight that the aims of this particular study were to determine if multigenerational responses to stress were evolutionarily conserved at any level, as well as to determine the potential costs of such effects and the specificity of the responses. These questions were not addressed in any previous study of multigenerational effects, including Burton et al., 2020. Because of the aims of this study, we believe it was critical to focus on genes that had an established role in these intergenerational responses in C. elegans and to compare and contrast the behavior and requirement of these genes in intergenerational responses in other species. (Although we note that in this newly revised manuscript we have now also reported 19 new top response genes – see above).

      In addition to our original goals, in this study we were able to determine the extent to which intergenerational transcriptional responses are conserved and the extent to which intergenerational transcriptional changes persist transgenerationally (which we find to be effectively not at all using our revised stricter analysis). We believe these findings are not only novel, but perhaps will be surprising to much of the intergenerational and transgenerational field and have a major impact on both how multigenerational studies are interpreted and how they are conducted in the future. This is especially the case for studies in C. elegans which is one of the leading model organisms to study the mechanisms underlying both intergenerational and transgenerational responses to stress.

      For example, we note that several landmark studies of transgenerational effects (persisting into F3 or later generations) in C. elegans performed RNA-seq on F1 progeny (For example, Moore et al., Cell 2019 or Ma et al., Nature Cell Biology 2019). Our new findings reported here suggest that it is possible that none of the transcriptional effects detected in F1 animals will persist in F3 progeny. Furthermore, our studies demonstrate the importance of comparing C. elegans transcriptional effects to related Caenorhabditis species as we found that only a subset of the effects detected in C. elegans are conserved in any other Caenorhabditis species. (Such comparisons are important for determining if and to what extent observations of intergenerational and/or transgenerational effects observed in C. elegans represent conserved phenomena).

      For all of these reasons, we believe our data are highly exciting, will be of broad interest to the field, and represent novel and potentially unexpected findings that were not previously reported in any prior work, including Burton et al., 2020.

      Reviewer #2 (Public Review):

      Transgenerational effects (TE) (usually defined as multigenerational effects lasting for at least three generations) generated a lot of interest in recent years but the adaptive value of such effects is unclear. In order to understand the scope for adaptive TE we need to understand i) whether such effects are common; ii) whether they are stress-specific; and iii) if there are trade-offs with respect to performance in different environments. The last point is particularly important because F1, F2 and F3 descendants may encounter very different environments. On the other hand, intergenerational effects (lasting for one or two generations) are relatively common and can play an important role in evolutionary processes. However, we do not know whether intergenerational and transgenerational effects have same underlying mechanisms.

      This study makes a big step towards resolving these questions and strongly advances our understanding of both phenomena. Much of the previous work on mechanisms of multigenerational effects has been conducted in C. elegans and this work uses the same approach. They focus on bacterial infection, Microsporidia infection, larval starvation and osmotic stress. I did not quite understand why the authors chose to focus on P. vranovensis rather than P. aeruginosa PA14 that has been used in previous studies of transgenerational effects in C. elegans. However, this is a minor point because I guess they were interested in broad transgenerational responses to bacterial infection rather than in strain-specific ones. The authors used different Caenorhabditis species, which is another strength of this study in addition to using multiple stresses.

      We thank the reviewer for these comments. We’d like to briefly highlight that P. vranovensis was also shown to elicit the same transgenerational effects as P. aeruginosa in the bioRxiv version of the same papers that reported transgenerational effects for P. aeruginosa (Kaletsky et al., 2020 – GRb0427 is an isolate of P. vranovensis).

      It is not clear to us why this result was not included in the final published version of this manuscript, but we in fact used P. vranovensis for these studies in part because of this bioRxiv paper and because we failed to detect any robust intergenerational effects using P. aeruginosa PA14 in any of our assays – including at the RNA-seq level (unpublished).

      Nonetheless, we have since confirmed with Coleen Murphy’s lab that they do find P. vranovensis elicits the same transgenerational effect on behaviour as P. aeruginosa. We expect that future investigations into the conditions under which P. vranovensis elicits effects that are lost/erased after 1 generation and the conditions under which effects might be maintained for more than 3 generations will be highly interesting.

      They found 279 genes that exhibited intergenerational changes in all Caenorhabditis species tested, but most interestingly, they show that a reversal in gene expression corresponds to a reversal in response to bacterial infection (beneficial in two species and deleterious in one). This is very intriguing! This was further supported by similar observations of osmotic stress response.

      We thank Reviewer 2 for their excitement and we agree that these findings were highly exciting.

      They also report that intergenerational effects are stress-specific and that they have deleterious effects in mismatched environments, and, importantly, when worms were subject to multiple stresses. It is quite likely that offspring will experience a range of environments and that several environmental stresses will be present simultaneously in nature. I really liked this aspect of this work as I think that tests in different environments, especially environments with multiple stresses, are often lacking, which limits the generality of the conclusions.

      Another interesting piece of the puzzle is that beneficial and deleterious effects could be mediated by the same mechanisms. It would be interesting to explore this further. However, this is not a real criticism of this work. I think that the authors collected an impressive dataset already and every good study generates new research questions.

      Given these findings, I was particularly keen to see what comes of transgenerational effects. The general answer was that there aren't many, and the authors conclude that all intergenerational effects that they studied are largely reversible and that intergenerational and transgenerational effects represent distinct phenomena. While I think that this is a very important finding, I am not sure whether we can conclude that intergenerational and transgenerational effects are not related.

      In my view, an alternative interpretation is that intergenerational effects are common while transgenerational effects are rare. Because intergenerational effects are stress-specific, transgenerational effects could be stress-specific as well.

      We agree with reviewer 2 that our findings suggest that intergenerational effects are common and transgenerational effects are either rare in comparison or only occur under specific conditions. We have updated the text to include this interpretation.

      Perhaps different mechanisms regulate intergenerational responses to, say, different forms of starvation (e.g. compare opposing transgenerational responses to prolonged larval starvation (Rechavi et al. doi:10.1016/j.cell.2014.06.020) and rather short adulthood starvation (Ivimey-Cook et al. 2021 https://doi.org/10.1098/rspb.2021.0701)). Perhaps some (most?) forms of starvation generate only intergenerational responses and do not generate transgenerational responses. But some do. Those forms of starvation that generate both intergenerational and transgenerational effects could do so via the same mechanisms and represent the same phenomenon. I am by no means saying this is the case, but I am not sure that the absence of evidence of transgenerational effects in this study necessarily suggests that inter- and trans-generational effects are different phenomena.

      We agree and, similar to above, have updated the text accordingly to state that it is also very possible that transgenerational effects only occur under certain conditions.

      The only real concern was the lack of phenotypic data on F3 beyond gene expression. Ideally, I would like to see tests of pathogen avoidance and starvation resistance in F3. However, given the amount of work that went into this study, the lack of a strong signature of potential transgenerational effects in gene expression, and the fact that most of these effects were shown previously to last only one generation, I do not think this is crucial.

      We thank reviewer 2 for these comments and agree that phenotypic investigations of F3 effects are also very interesting.

      We have previously investigated the phenotypic effects of all of the stresses used in this paper on F3 animals using the assays described here and consistent with our new gene expression findings we previously found that most of these stresses do not exert phenotypic effects in F3 animals (Burton et al. 2020, Willis et al 2021, Hibshman et al., 2016).

      Separately, we have also attempted to investigate the effects of pathogen exposure on pathogen avoidance, as these effects have previously been reported to occur transgenerationally, but to date have been unable to consistently replicate these findings. We expect that this is likely due to what might be subtle differences in conditions between labs (differences in water used for the media prep, air humidity, potential differences in N2 wild-type strains, etc.) because assays such as behavioral avoidance are known to be very sensitive to many different environmental inputs.

      We currently believe that our experiences as they relate to intergenerational and transgenerational effects support the general conclusion of this manuscript that while intergenerational effects are common and easy to initiate across multiple labs (the intergenerational effects studied here have now been successfully reproduced in labs in the US, UK, and Canada), transgenerational effects might be more specific and/or only occur/be initiated under more stringent conditions – perhaps with the aim of avoiding the costs of such multigenerational effects.

      Future studies of exactly when and under what conditions C. elegans initiates intergenerational vs transgenerational effects are likely to be very interesting.

      It would be very interesting to compare gene expression and other phenotypic responses in F1 and F3 between P. vranovensis and PA14. Also, it would be interesting to test the adaptive value of intergenerational and transgenerational effects after exposure to both strains in different environments. This would be very informative and help with understanding the evolutionary significance of transgenerational epigenetic inheritance of pathogen avoidance as reported previously. Why is the response to P. vranovensis erased while the response to PA14 is maintained for four generations? Are nematodes more likely to encounter one species than the other? Again, however, this is not something necessary for this study.

      We completely agree with Reviewer 2 and have indeed attempted these experiments both in Burton et al., 2020 and in unpublished results.

      With regards to the transgenerational F3 effects, as mentioned above, P. vranovensis has been reported to elicit the same transgenerational effect as P. aeruginosa PA14 – at least as reported in the Kaletsky et al., 2020 bioRxiv version of the manuscript from the same studies. (GRb0427 is an isolate of P. vranovensis).

      To date, however, in our laboratory we have been unable to detect any transgenerational effects for either P. vranovensis or P. aeruginosa infection on gene expression data from RNA-seq experiments (data from this manuscript and unpublished data).

      It is not yet clear why this is the case, but we note that the RNA-seq analysis from the transgenerational PA14 studies (published in Moore et al., Cell 2019) was performed on F1 animals and thus was looking at intergenerational effects – to our knowledge no RNA-seq on F3 progeny from animals exposed to PA14 has ever been published. Thus, as it stands there are no existing F3 gene expression studies using PA14 for us to compare our results to, but it remains possible that PA14 does not elicit specific effects on F3 gene expression when analyzed by RNA-seq.

      For F1 effects we have published a gene expression comparison for P. vranovensis and P. aeruginosa F1 effects in a previous manuscript (Burton et al 2020) and will add a mention of this to the text. Briefly, we detected very few F1 effects on gene expression when exposing adults to P. aeruginosa for 24 hours and parental infection by P. aeruginosa did not result in protection for offspring from P. vranovensis infection (Burton et al., 2020). We concluded that the intergenerational adaptation to P. vranovensis was not initiated by P. aeruginosa and was at least somewhat specific to P. vranovensis as well as the new species of Pseudomonas described in this manuscript which does cross protect.

      The main strengths of this paper are i) use of multiple stresses; ii) use of multiple species; iii) tests in different environments; and iv) simultaneous evaluation of intergenerational and transgenerational responses. This study is the first of its kind, and it provides several important answers, while highlighting clear paths for future work.

      Excellent work and I think it will generate a lot of interest in the community, definitely want to see it published in eLife.

      We agree with Reviewer 2 and thank them for their kind comments.

      Reviewer #3 (Public Review):

      In this manuscript, the authors address whether the mechanisms mediating intergenerational effects are conserved in evolution. This question is important not only to frame this phenomenon in an evolutionary context, but to address several interlinked questions: is there a mechanism in common between adaptive versus deleterious effects? What makes some effects last one instead of several generations? What is the ecological relevance for those mechanisms? Using Caenorhabditis elegans as a model of reference, they compare four types of intergenerational effects in three additional Caenorhabditis species.

      The authors used previously characterized models of intergenerational inheritance, focusing on those that are likely to have adaptive significance. This is relevant, because the adaptive relevance of other published examples of inter- and transgenerational inheritance is not clear. They used functional studies to probe for conservation of mechanisms for bacterial infection and resistance to osmolarity stress, which is a major strength of this study. The data supports the claim of conservation in some types of intergenerational inheritance and divergence in others. One major question addressed in this manuscript is whether there is a potential overarching mechanism that confers stress-resistance across generations. Their experiments convincingly show that this is not the case, but that instead, there are stress-specific mechanisms responsible for intergenerational inheritance.

      We agree and thank Reviewer 3 for their kind comments.

    1. Author Response

      Reviewer #1 (Public Review):

      The relationship between genetic disease and adaptation is important for biomedical research as well as understanding human evolution. This topic has received considerable attention over the past several decades in human genetics research. The present manuscript provides a much more comprehensive and rigorous analysis of this topic. Specifically, the authors select a set of ~4000 human Mendelian disease genes and examine patterns of recent positive selection in these genes using the iHS and nSL tests (both haplotype-based tests) for selection. They then compare the signals of sweeps to control genes. Importantly, they match the control set to the disease genes based upon many different genomic variables, such as recombination rate, amount of background selection, expression level, etc. The authors find that there is a deficit of selective sweeps in disease genes. They test several hypotheses for this deficit. They find that the deficit of sweeps is stronger in disease genes at low recombination rate and those that have more disease mutations. From this, the authors conclude that strongly deleterious mutations could be impeding selective sweeps.

      Strengths

      The manuscript includes a number of important strengths:

      1) It tackles an important question in the field. The question of selection in disease genes has been very well-studied in the past, with conflicting viewpoints. The present study examines this topic in a rigorous way and finds a deficit of sweeps in disease genes.

      2) The statistical analyses are rigorously done. The genome is a confusing place and there can often be many reasons why a certain set of genes could differ from another set of genes, unrelated to the variable of interest. Di et al. carefully match on these genomic confounders. Thus, they rigorously demonstrate that sweeps are depleted in disease genes relative to control genes. Further, the pipeline for ranking the genes and testing for significance is solid.

      3) The Introduction of the manuscript nicely relates different evolutionary models and explanations to patterns that could be seen in the data. As such, the present manuscript isn't just merely an exploratory analysis of patterns of sweeps in disease genes. Rather, it tests specific evolutionary scenarios.

      Weaknesses

      1) The authors did not discuss or test a basic explanation for the deficit of sweeps in disease genes. Namely, certain types of genes, when mutated, give rise to strong Mendelian phenotypes. However, mutations in these genes do not result in variation that gives rise to a phenotype on which positive selection could occur. In other words, there are just different types of genes underlying disease and positive selection. I could think that such a pattern would be possible if humans are close to the fitness optimum and strong effect mutations (like those in Mendelian disease genes) result in moving further away from the fitness optimum. On the other hand, more weak effect mutations could be either weakly deleterious or beneficial and subject to positive selection. I'm not sure whether these patterns would necessarily be captured by the overall measures of constraint which the disease and non-disease genes were matched on.

      We thank the reviewer for suggesting that alternative explanation. It is indeed important that we compare it with our own explanation. To rephrase the reviewer’s suggestion, it is possible that disease genes may just have a different distribution of fitness effects of new mutations. Specifically, mutations in disease genes might have such large effects that they will consistently overshoot the fitness optimum, and thus not get closer to this optimum. This would prevent them from being positively selected. Two predictions can be derived from this potential scenario. First, we can predict a sweep deficit at disease genes, which is what we report. Second, we can also predict that disease genes should exhibit a deficit of older adaptation, not just recent adaptation detected by sweep signals. Indeed, the decrease in adaptation due to (too) large effect mutations would be a generic, intrinsic feature of disease genes regardless of evolutionary time. This means that under this explanation, we expect a test of long-term adaptation such as the McDonald-Kreitman test to also show a deficit at disease genes.

      This latter prediction differs from the prediction made by our favored explanation of interference between deleterious and advantageous variants. In this scenario, the sweep deficit at disease genes is caused by the presence of deleterious, and most importantly currently segregating disease variants. Because the presence of the segregating variants is transient during evolution, our explanation does not predict a deficit of long-term adaptation. We can therefore distinguish which explanation (the reviewer’s or ours) is the most likely based on the presence or absence of a long-term adaptation deficit at disease genes.

      To test this, we now compare protein adaptation in disease and control genes with two versions of the MK test called ABC-MK and GRAPES (refs). ABC-MK estimates the overall rate of adaptation, and also the rates of weak and strong adaptation, and is based on Approximate Bayesian Computation. GRAPES is based on maximum likelihood. Both ABC-MK and GRAPES have been shown to provide robust estimates of the rate of protein adaptation thanks to evaluations with forward population simulations (refs). We find no difference in long-term adaptation between disease and control non-disease genes, as shown in new figure 4. This shows that the explanation put forward by the reviewer of an intrinsically different distribution of mutation effects at disease genes is less likely than an interference between currently segregating deleterious variants with recent, but not with older long-term adaptation. We even show in the new figure 4 that disease genes and their controls have more, not less strong long-term adaptation compared to the whole human genome baseline (new figure 4C). Also, disease genes in low recombination regions and with many disease variants have experienced more, not less strong long-term adaptation than their controls. Therefore, far from overshooting the fitness optimum due to stronger fitness effects of mutations, it looks like these stronger fitness effects might in fact be more frequently positively selected in these disease genes.
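
      As a rough illustration of the logic shared by MK-type tests, the sketch below computes the classic point estimate of the proportion of adaptive amino acid substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). This is the simple textbook estimator, not the ABC-MK or GRAPES procedures, which additionally model weakly selected and segregating variants. The function name and all counts are hypothetical and chosen only to show two gene sets with the same estimated alpha.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classic MK point estimate of the proportion of adaptive substitutions.

    dn, ds: nonsynonymous / synonymous fixed differences (divergence)
    pn, ps: nonsynonymous / synonymous polymorphisms
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    # Under neutrality Dn/Ds is expected to equal Pn/Ps; an excess of Dn
    # relative to that expectation is attributed to adaptive substitutions.
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only (not data from the manuscript).
disease = mk_alpha(dn=120, ds=400, pn=60, ps=300)
control = mk_alpha(dn=110, ds=400, pn=55, ps=300)
print(f"disease alpha = {disease:.3f}, control alpha = {control:.3f}")
```

      Similar alpha values in disease genes and matched controls would correspond to the "no long-term deficit" outcome described above, whereas a lower alpha in disease genes would point to a constitutive deficit.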

      We now provide these new results P15L418:
      “Disease genes do not experience constitutively less long-term adaptive mutations
      A deficit of strong recent adaptation (strong enough to affect iHS or nSL) raises the question of what creates the sweep deficit at disease genes. As already discussed, purifying selection and other confounding factors are matched between disease genes and their controls, which excludes that these factors alone could possibly explain the sweep deficit. Purifying selection alone in particular cannot explain this result, since we find evidence that it is well matched between disease and control genes (Figure 2 and Figure 4-figure supplement 1). Furthermore, we find that the 1,000 genes in the genome with the highest density of conserved elements do not exhibit any sweep deficit (bootstrap test + block-randomized genomes FPR=0.18; Methods). Association with mendelian diseases, rather than a generally elevated level of selective constraint, is therefore what matters to observe a sweep deficit. What then might explain the sweep deficit at disease genes?

      As mentioned in the introduction, it could be that mendelian disease genes experience constitutively less adaptive mutations. This could be the case for example because mendelian disease genes tend to be more pleiotropic (Otto, 2004), and/or because new mutations in mendelian disease genes are large effect mutations (Quintana-Murci, 2016) that tend to often overshoot the fitness optimum, and cannot be positively selected as a result. Regardless of the underlying processes, a constitutive tendency to experience less adaptive mutations predicts not only a deficit of recent adaptation, but also a deficit of more long-term adaptation during evolution. The iHS and nSL signals of recent adaptation we use to detect sweeps correspond to a time window of at most 50,000 years, since these statistics have very little statistical power to detect older adaptation (Sabeti et al., 2006). In contrast, approaches such as the McDonald-Kreitman test (MK test) (McDonald and Kreitman, 1991) capture the cumulative signals of adaptive events since humans and chimpanzees had a common ancestor, likely more than six million years ago. To test whether mendelian disease genes have also experienced less long-term adaptation, in addition to less recent adaptation, we use the MK tests ABC-MK (Uricchio et al., 2019) and GRAPES (Galtier, 2016) to compare the rate of protein adaptation (advantageous amino acid changes) in mendelian disease gene coding sequences, compared to confounding factors-matched non-disease controls (Methods). We find that overall, disease and control non-disease genes have experienced similar rates of protein adaptation during millions of years of human evolution, as shown by very similar estimated proportions of amino acid changes that were adaptive (Figure 5A,B,C,D,E). This result suggests that disease genes do not have constitutively less adaptive mutations.
This implies that processes that are stable over evolutionary time such as pleiotropy, or a tendency to overshoot the fitness optimum, are unlikely to explain the sweep deficit at disease genes. If disease genes have not experienced less adaptive mutations during long-term evolution, then the process at work during more recent human evolution has to be transient, and has to have limited only recent adaptation. It is also noteworthy that both disease genes and their controls have experienced more coding adaptation than genes in the human genome overall (Figure 5A), especially more strong adaptation according to ABC-MK (Figure 5C). The fact that the baseline long-term coding adaptation is lower genome-wide, but similarly higher in disease and their control genes, also shows that the matched controls do play their intended role of accounting for confounding factors likely to affect adaptation. The fact that long-term protein adaptation is not lower at disease genes also excludes that purifying selection alone can explain the sweep deficit at disease genes, because purifying selection would then also have decreased long-term adaptation. A more transient evolutionary process is thus more likely to explain our results.”

      Then P22L613: “More importantly, the fact that constitutively less adaptation at disease genes combined with more power to detect sweeps in low recombination regions does not explain our results is made even clearer by the fact that disease genes in low recombination regions and with many disease variants have in fact experienced more, not less long-term adaptation according to an MK analysis using both ABC-MK and GRAPES (Figure 5F,G,H,I,J). ABC-MK in particular finds that there is a significant excess of long-term strong adaptation (Figure 4H, P<0.01) in disease genes with low recombination and with many disease variants, compared to controls, but similar amounts of weak adaptation (Figure 5G, P=0.16). It might be that disease genes with many disease variants are genes with more mutations with stronger effects that can generate stronger positive selection. The potentially higher supply of strongly advantageous variants at these disease genes makes it all the more notable that they have a very strong sweep deficit in recent evolutionary times. This further strengthens the evidence in favor of interference during recent human adaptation: the limiting factor does not seem to be the supply of strongly advantageous variants, but instead the ability of these variants to have generated sweeps recently by rising fast enough in frequency.”

      2) While I think the authors did a superb job of controlling for genome differences between disease and non-disease genes, the analysis of separating regions by recombination rate and number of disease mutations does not seem as rigorous. Specifically, the authors tested for enrichment of sweeps in disease genes vs control and then stratified that comparison by recombination rate and/or number of disease mutations. While this nicely matches the disease genes to the control genes, it is not clear whether the high recombination rate genes differ in other important attributes from the low recombination rate genes. Thus, I worry whether there could be a confounder that makes it easier/harder to detect an enrichment/deficit of sweeps in regions of low/high recombination.

      We thank the reviewer for emphasizing the need for more controls when comparing our results in low or high recombination regions. We have now compared the confounding factors between low recombination disease genes and high recombination disease genes, as classified in the manuscript. As shown in the new Figure 6-figure supplement 1, confounding factors do not differ substantially between low and high recombination disease genes, and are all within a range of +/- 25% of each other. It would take a larger difference for any confounding factor to explain the sharp sweep deficit difference observed between the low and high recombination disease genes. The only factor with a 35% difference between low and high recombination mendelian disease genes is McVicker’s B, but this is completely expected; B is expected to be lower in low recombination regions.
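
      The kind of check described above can be sketched as a relative-difference comparison of group means. This is a minimal sketch under stated assumptions: the factor names, the mean values, and the choice of the high recombination group as the denominator are all hypothetical and are not taken from the supplement.

```python
def relative_difference(low_mean, high_mean):
    """Difference of the low-recombination mean relative to the high-recombination mean."""
    return (low_mean - high_mean) / high_mean


def flag_confounders(low_means, high_means, threshold=0.25):
    """Return the factors whose group means differ by more than `threshold`."""
    flagged = {}
    for name, low in low_means.items():
        diff = relative_difference(low, high_means[name])
        if abs(diff) > threshold:
            flagged[name] = diff
    return flagged


# Hypothetical group means, for illustration only.
low = {"gene_length": 0.9, "expression": 1.1, "B": 0.65}
high = {"gene_length": 1.0, "expression": 1.0, "B": 1.0}
print(flag_confounders(low, high))
```

      With these made-up values only B exceeds the 25% threshold, mirroring the situation described in the response where B is the single factor outside that range.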

      We now write P20L569: “Further note that only moderate differences in confounding factors between low and high recombination mendelian disease genes are unlikely to explain the sweep deficit difference (Figure 6-figure supplement 1).”

      Regarding the potential confounding effect of statistical power to detect sweeps differing in low and high recombination regions, please see our earlier response to main point 2.

      Reviewer #2 (Public Review):

      This paper seeks to test the extent to which adaptation via selective sweeps has occurred at disease-associated genes vs genes that have not (yet) been associated with disease. While there is a debate regarding the rate at which selective sweeps have occurred in recent human history, it is clear that some genes have experienced very strong recent selective sweeps. Recent papers from this group have very nicely shown how important virus interacting proteins have been in recent human evolution, and other papers have demonstrated the few instances in which strong selection has occurred in recent human history to adapt to novel environments (e.g. migration to high altitude, skin pigmentation, and a few other hypothesized traits).

      One challenge in reading the paper was that I did not realize the analysis was exclusively focused on Mendelian disease genes until much later (the first reference is not until the end of the introduction on pages 7-8 and then not at all again until the discussion, despite referring to "disease" many times in the abstract and throughout the paper). It would be preferred if the authors indicated that this study focused on Mendelian diseases (rather than a broader analysis that included complex or infectious diseases). This is important because there are many different types of diseases and disease genes. Infectious disease genes and complex disease genes may have quite different patterns (as the authors indicate at the end of the introduction).

      We want to apologize profusely for this avoidable mistake. We have now made it clearer from the very start of the manuscript that we focus on mendelian non-infectious disease genes. We have modified the title and the abstract accordingly, specifying mendelian and non-infectious as required.

      The abstract states "Understanding the relationship between disease and adaptation at the gene level in the human genome is severely hampered by the fact that we don't even know whether disease genes have experienced more, less, or as much adaptation as non-disease genes during recent human evolution." This seems to diminish a large body of work that has been done in this area. The authors acknowledge some of this literature in the introduction, but it would be worth toning down the abstract, which suggests there has been no work in this area. A review of this topic by Lluis Quintana-Murci (ref. 1) was cited, but diminished many of the developments that have been made in the intersection of population genetics and human disease biology. Quintana-Murci says "Mendelian disorders are typically severe, compromising survival and reproduction, and are caused by highly penetrant, rare deleterious mutations. Mendelian disease genes should therefore fit the mutation-selection balance model, with an equilibrium between the rate of mutation and the rate of risk allele removal by purifying selection", and argues that positive selection signals should be rare among Mendelian disease genes. Several other examples come to mind. For example, comparing Mendelian disease genes, complex disease genes, and mouse essential genes was the major focus of a 2008 paper (ref. 2), which pointed out that Mendelian disease genes exhibited much higher rates of purifying selection while complex disease genes exhibited a mixture of purifying and positive selection. This paper was cited, but only in regard to their findings of complex diseases.
A similar analysis of McDonald-Kreitman tables (ref. 3) was performed around Mendelian disease genes vs non-disease genes, and found "that disease genes have a higher mean probability of negative selection within candidate cis-regulatory regions as compared to non-disease genes, however this trend is only suggestive in EAs, the population where the majority of diseases have likely been characterized". Both of these studies focused on polymorphism and divergence data, which target older instances of selection than iHS and nSL statistics used in the present study (but should have substantial overlap since iHS is not sensitive to very recent selection like the SDS statistic). Regardless, the findings are largely consistent, and I believe warrant a more modest tone.

      We thank the reviewer for their recommendation. We should have written more about what is currently well known or unknown about recent adaptation in disease genes, and in more nuanced terms. Instead of writing “Understanding the relationship between disease and adaptation at the gene level in the human genome is severely hampered by the fact that we don't even know whether disease genes have experienced more, less, or as much adaptation as non-disease genes during recent human evolution”, we now write in the new abstract:

      “Despite our expanding knowledge of gene-disease associations, and despite the medical importance of disease genes, their recent evolution has not been thoroughly studied across diverse human populations. In particular, recent genomic adaptation at disease genes has not been characterized as well as long-term purifying selection and long-term adaptation. Understanding the relationship between disease and adaptation at the gene level in the human genome is hampered by the fact that we don’t know whether disease genes have experienced more, less, or as much adaptation as non-disease genes during the last ~50,000 years of recent human evolution.”

      We also toned down the start of the introduction. We now write P3L74:

      “Despite our expanding knowledge of mendelian disease gene associations, and despite the fact that multiple evolutionary processes might connect disease and genomic adaptation at the gene level, these connections are yet to be studied more thoroughly, especially in the case of recent genomic adaptation.”

      Although we agree that others have made extensive efforts to characterize older adaptation or purifying selection at disease genes compared to non-disease genes, we still believe that our results are novel and more conclusive about recent positive selection. Our initial statement was however poorly phrased. To our knowledge, our study is the first to look at the issue using specifically sweep statistics that have been shown to be robust to background selection, while also controlling for confounding factors. These sweep statistics have sensitivity for selection events that occurred in the past 30,000 or at most 50,000 years of human evolution (Sabeti et al. 2006). This is a very different time scale compared to the millions of years of adaptation (since divergence between humans and chimpanzees) captured by MK approaches.

      We also want to note that we did cite the Blekhman et al. paper for their result of stronger purifying selection in our initial manuscript. It is true however that we did not specify mendelian disease genes, which was confusing. We want to apologize again for it:

      From the earlier manuscript: “Multiple recent studies comparing evolutionary patterns between human disease and non-disease genes have found that disease genes are more constrained and evolve more slowly (lower ratio of nonsynonymous to synonymous substitution rate, dN/dS, in disease genes) (Blekhman et al., 2008; Park et al., 2012; Spataro et al., 2017)”

      “Among other confounding factors, it is particularly important to take into account evolutionary constraint, i.e. the level of purifying selection experienced by different genes. A common intuition is that disease genes may exhibit less adaptation because they are more constrained (Blekhman et al., 2008)”

      It is important to remember that, as we mention in the introduction, previous comparisons did not take potential confounding factors at all into account. It is therefore unclear whether their conclusions were specific to disease genes, or due to confounding factors. We have now made this point clearer in the introduction, as we believe that we have made a substantial effort to control for confounding factors, and that it is a substantial departure from previous efforts:

      P7L201: “In contrast with previous studies, we systematically control for a large number of confounding factors when comparing recent adaptation in human mendelian disease and non-disease genes, including evolutionary constraint, mutation rate, recombination rate, the proportion of immune or virus-interacting genes, etc. (please refer to Methods for a full list of the confounding factors included).”

      P9L253: “These differences between disease and non-disease genes highlight the need to compare disease genes with control non-disease genes with similar levels of selective constraint. To do this and compare sweeps in mendelian disease genes and non-disease genes that are similar in ways other than being associated with mendelian disease (as described in the Results below, Less sweeps at mendelian disease genes), we use sets of control non-disease genes that are built by a bootstrap test to match the disease genes in terms of confounding factors (Methods)”.
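
      The general idea of comparing disease genes against confounder-matched controls can be sketched as follows. This is a simplified illustration, not the bootstrap test of the manuscript: it assumes each gene carries a dictionary of confounder values and a 0/1 sweep flag, matches controls within a relative tolerance on every confounder, and reports the observed sweep count divided by the mean count across resampled control sets. All names, the tolerance, and the number of control sets are hypothetical.

```python
import random
from statistics import mean


def matched_controls(gene, candidates, tolerance):
    """Candidate non-disease genes whose confounders all lie within a relative tolerance."""
    return [
        c for c in candidates
        if all(abs(c["confounders"][k] - v) <= tolerance * abs(v)
               for k, v in gene["confounders"].items())
    ]


def sweep_fold_enrichment(disease_genes, candidates, tolerance=0.25,
                          n_sets=200, seed=0):
    """Observed sweep count in disease genes over the mean count in matched control sets."""
    rng = random.Random(seed)
    pools = [matched_controls(g, candidates, tolerance) for g in disease_genes]
    if any(not pool for pool in pools):
        raise ValueError("a disease gene has no matched control")
    observed = sum(g["in_sweep"] for g in disease_genes)
    # Each control set draws one matched control per disease gene (with replacement).
    control_counts = [
        sum(rng.choice(pool)["in_sweep"] for pool in pools)
        for _ in range(n_sets)
    ]
    expected = mean(control_counts)
    if expected == 0:
        raise ValueError("no sweeps in any control set")
    return observed / expected
```

      A ratio below 1 would indicate fewer sweeps at disease genes than at comparable non-disease genes; assessing significance would additionally require the null distribution and false positive rate estimation described in the Methods, which this sketch does not attempt.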

      Furthermore, we have now added a comparison of older adaptation in disease and non-disease genes using a recent version of the MK test called ABC-MK, that can take background selection and other biases such as segregating weakly advantageous variants into account. Also controlling for confounding factors, we find no difference in older adaptation between disease and non-disease genes (please see our response to main point 2).

      Therefore, contrary to the reviewer’s claim that the sweep statistics and MK approaches should have substantial overlap, we now show that it is clearly not the case. We further show that the lack of overlap is expected under our explanation of our results based on interference between recessive deleterious and advantageous variants (see our responses to main point 1 and to reviewer 1 weakness 1).

      Previous analyses were using much smaller mendelian disease gene datasets, less recent polymorphism datasets and, critically, did not control for confounding factors. We also note that reference 3 (Torgerson et al. Plos Genetics 2009) does not make any claim about recent positive selection in mendelian disease genes compared to other genes. Their dataset at the time also only included 666 mendelian disease genes, versus the ~4,000 currently known.

      In short, we do think that we have a claim for novelty, but the reviewer is entirely right that we did a poor job of giving due credit to previous important work. These previous studies deserved much better credit than no credit at all. We want to thank the reviewer for sparing us the embarrassment of not citing important work.

      We now cite the papers referenced by the reviewer as appropriate in the introduction, based on the scope of their results:

      P3L93: “Multiple recent studies comparing evolutionary patterns between human mendelian disease and non-disease genes have found that mendelian disease genes are more constrained and evolve more slowly (Blekhman et al., 2008; Quintana-Murci, 2016; Spataro et al., 2017; Torgerson et al., 2009). An older comparison by Smith and Eyre-Walker (Smith and Eyre-Walker, 2003) found that disease genes evolve faster than non-disease genes, but we note that the sample of disease genes used at the time was very limited.”

      P5L134: “Among possible confounding factors, it is particularly important to take into account evolutionary constraint, i.e. the level of purifying selection experienced by different genes. A common intuition is that mendelian disease genes may exhibit less adaptation because they are more constrained (Blekhman et al., 2008; Spataro et al., 2017; Torgerson et al., 2009),”

      There are some aspects of the current study that I think are highly valuable. For example, the authors study most of the 1000 Genomes Project populations (though the text should be edited since the admixed and South Asian populations are not analyzed, so all 26 populations are not included, only the populations from Africa, East Asia, and Europe are analyzed; a total of 15 populations are included Figures 2-3). Comparing populations allows the authors to understand how signatures of selection might be shared vs population-specific. Unfortunately, the signals that the authors find regarding the depletion of positive selection at Mendelian disease genes is almost entirely restricted to African populations. The signal is not significant in East Asia or Europe (Figure 2 clearly shows this). It seems that the mean curve of the fold-enrichment as a function of rank threshold (Figure 3) trends downward in East Asian and European populations, but the sampling variance is so large that the bootstrap confidence intervals overlap 1). The paper should therefore revise the sentence "we find a strong depletion in sweep signals at disease genes, especially in Africa" to "only in Africa". This opens the question of why the authors find the particular pattern they find. The authors do point out that a majority of Mendelian disease genes are likely discovered in European populations, so is it that the genes' functions predate the Out-of-Africa split? They most certainly do. It is possible that the larger long-term effective population size of African populations resulted in stronger purifying selection at Mendelian disease genes compared to European and East Asian populations, where smaller effective population sizes due to the Out-of-Africa Bottleneck diminished the signal of most selective sweeps and hence there is little differentiation between categories of genes, "drift noise"). 
It is also surprising to note that the authors find selection signatures at all using iHS in African populations while a previous study using the same statistic could not differentiate signals of selection from neutral demographic simulations (4).

      We want to thank the reviewer profusely for putting us on the right track thanks to their insightful suggestion. As described in our response to reviewer 1 weakness 1, we have now shown with simulations that the interference of deleterious variants on advantageous variants is strongly decreased during a bottleneck of a magnitude similar to the Out of Africa bottlenecks experienced by East Asian and European populations. This decrease of interference is likely strong enough to not require any other explanation, even if other processes may also be at work, such as a decrease of the sweeps signals as suggested by the reviewer.

      About the Granka et al. paper, the last author of the current manuscript has already shown in a previous paper (ref) that the type of approach used to quantify recent adaptation is likely to be severely underpowered due to a number of confounding factors, notably the comparison of genic and non-genic windows that are not far enough apart to avoid overlapping the same sweep signals. Our results are also based on much more recent and less biased sets of SNPs used to measure the sweep statistics.

      The authors find that there is a remarkably (in my view) similar depletion across all but one MeSH disease classes. This suggests that "disease" is likely not the driving factor, but that Mendelian disease genes are a way of identifying where there are strongly selected deleterious variants recurrently arising and preventing positively selected variants. This is a fascinating hypothesis, and is corroborated by the finding that the depletion gets stronger in genes with more Mendelian disease variants. In this sense, the authors are using Mendelian disease genes as a proxy for identifying targets of strong purifying selection, and are therefore not actually studying Mendelian disease genes. The signal could be clearer if the test set is based on the factor that is actually driving the signal.

      Based on the reviewer’s comment, we have now better explained why our results are unlikely to be a generic property of purifying selection alone. As we explain in our response to main point 3, our results cannot be explained by purifying selection alone, because we match purifying selection between disease genes and the controls. Indeed, we now show with additional MK analyses and GERP-based analyses that our controls for confounding factors already account for purifying selection. This is shown by the fact that disease genes and their controls have similar distributions of deleterious fitness effects.
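The idea of matching controls to disease genes on confounding factors can be illustrated with a minimal sketch. This is not the authors' procedure (which is a bootstrap test described in their Methods); the function name, the dictionary layout, the factor names, and the relative tolerance of 20% are all assumptions made for illustration.

```python
def matched_controls(disease_gene, candidates, factors, tolerance=0.2):
    """Return the names of candidate control genes that match a disease gene.

    A candidate matches when, for every confounding factor, its value lies
    within +/- tolerance (relative) of the disease gene's value.
    All names and the tolerance are hypothetical, for illustration only.
    """
    matches = []
    for name, values in candidates.items():
        if all(
            abs(values[f] - disease_gene[f]) <= tolerance * abs(disease_gene[f])
            for f in factors
        ):
            matches.append(name)
    return matches
```

Under such a scheme, a control gene is retained only if it resembles the disease gene on every listed factor at once, which is why matched disease and control sets are expected to have similar distributions of constraint.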

      In addition, we added a comparison that shows that purifying selection alone does not explain our results. Instead of comparing sweeps at disease and non-disease genes, we compared sweeps (in Africa) between the 1,000 genes with the highest density of conserved, constrained elements and other genes in the genome. If purifying selection is the factor that drives the sweep deficit at disease genes, then we should see a sweep deficit among the genes with the most conserved, constrained elements compared to other genes in the genome. However, we see no such sweep deficit at genes with a high density of conserved, selectively constrained elements (bootstrap test + block randomization of genomes, FPR=0.18). See P15L424. Note that for this comparison we had to remove the matching of confounding factors corresponding to functional and purifying selection densities (new Methods P40L1131).

      Again, our results are better explained not just by purifying selection alone, but more specifically by the presence of interfering, segregating deleterious variants. It is perfectly possible to have highly constrained parts of the genome without having many deleterious segregating variants at a given time in evolution.

      The similarity across MeSH classes can be readily explained if what matters is interference with deleterious segregating variants. Because all types of diseases have deleterious segregating variants, then it is not surprising that different MeSH disease categories have a similar sweep deficit. We make that point clearer in the revised manuscript:

      P26L707: “The sweep deficit is comparable across MeSH disease classes (Figure 8), suggesting that the evolutionary process at the origin of the sweep deficit is not disease-specific. This is compatible with a non-disease specific explanation such as recessive deleterious variants interfering with adaptive variants, irrespective of the specific disease type.”

      One of the most important steps that the authors undertake is to control for possible confounding factors. The authors identify 22 possible confounding factors, and find that several confounding factors have different effects in Mendelian disease genes vs non-disease genes. The authors do a great job of implementing a block-bootstrap approach to control for each of these factors. The authors talk specifically about some of these (e.g. PPI), but not others that are just as strong (e.g. gene length). I am left wondering how interactions among other confounding factors could impact the findings of this paper. I was surprised to see a focus on disease variant number, but not a control for CDS length. As I understand it, gene length is defined as the entire genomic distance between the TSS and TES. Presumably genes with larger coding sequence have more potential for disease variants (though number of disease variants discovered is highly biased toward genes with high interest). CDS length would be helpful to correct for things that pS does not correct for, since pS is a rate (controlling for CDS length) and does not account for the coding footprint (hence pS is similar across gene categories).

      Based on our response to the previous point, it is clear that a high density of coding sequences, or conserved constrained sequence in general are not enough to explain our results. Furthermore, we want to remind the reviewer that we already control for coding sequence length through controlling for coding density, since we use windows of constant sizes.

      The authors point out that it is crucial to get the control set right. This group has spent a lot of time thinking about how to define a control set of genes in several previous papers. But it is not clear if complex disease genes and infectious disease genes are specifically excluded or not. Number of virus interactions was included as a confounding factor, so VIPs were presumably not excluded. It is clear that the control set includes genes not yet associated with Mendelian disease, but the focus is primarily on the distance away from known Mendelian disease genes.

      We are sorry that we were not more explicit from the start of the manuscript. We now make it clearer throughout the manuscript what the set of disease genes does and does not include, by repeating that we focus specifically on mendelian, non-infectious disease genes. By non-infectious, we mean that we excluded genes with known infectious disease-associated variants. This does not exclude most virus-interacting genes since most of them are not associated at the genetic variant level with infectious diseases. It is also important to note that the effect of virus interactions is accounted for by matching the number of interacting viruses between mendelian disease genes and controls.

      We write P29L818: “By non-infectious, we mean that we excluded genes with known infectious disease-associated variants. This does not exclude most VIPs since most of them are not associated at the genetic variant level with infectious diseases. It is important to note that the effect of virus interactions is accounted for by matching the number of interacting viruses between mendelian disease genes and controls.”

      Minor comments:

      On page 13, the authors say "This artifact is also very unlikely due to the fact that recombination rates are similar between disease and non-disease genes (Figure 1)." However, Figure 1 shows that "deCode recombination 50kb" is clearly higher in disease genes and comparable at 500kb. The increased recombination rate locally around disease genes seems to contradict the argument formulated in this paragraph.

      We apologize for the lack of precision in this sentence. What we meant is that the recombination rates are not different enough for the hypothetical artifact mentioned to explain our results. We also neglected to remind the reader at this point in the manuscript that we match recombination between disease genes and controls. We now use more precise language:

      P28L772 “The recombination rate at disease genes is also only slightly different from the recombination rate at non-disease genes (Figure 1), and we match the recombination rate between disease genes and controls.”.

      Reviewer #3 (Public Review):

      In this paper, the authors ask whether selective sweeps (as measured by the iHS and nSL statistics) are more or less likely to occur in or near genes associated with Mendelian diseases ("disease genes") than those that are not ("non-disease genes"). The main result put forward by the authors is that genes associated with Mendelian diseases are depleted for sweep signatures, as measured by the iHS and nSL statistics, relative to those which are not.

      The evidence for this comes from an empirical randomization scheme to assess whether genes with signatures of a selective sweep are more likely to be Mendelian disease genes than not. The analysis relies on a somewhat complicated sliding threshold scheme that effectively acts to incorporate evidence from both genes with very large iHS/nSL values, as well as those with weaker signals, while upweighting the signal from those genes with the strongest iHS/nSL values. Although I think the analysis could be presented more clearly, it does seem like a better analysis than a simple outlier test, if for no other reason than that the sliding threshold scheme can be seen as a way of averaging over uncertainty in where one should set the threshold in an outlier test (along with some further averaging across the two different sweep statistics, and the size of the window around disease-associated genes that the sweep statistics are averaged over). That said, the particular approach to doing so is somewhat arbitrary, but it's not clear that there's a good way to avoid that.
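A sliding rank-threshold comparison of the kind described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name, the use of a plain dictionary of per-gene sweep scores, and the definition of fold enrichment as "disease genes in the top k" divided by "mean number of control genes in the top k" are all hypothetical choices.

```python
def fold_enrichment_curve(scores, disease, control_sets, thresholds):
    """Fold enrichment of disease genes among top-ranked sweep candidates.

    scores       : dict mapping gene name -> sweep statistic (higher = stronger)
    disease      : set of disease gene names
    control_sets : list of sets of control gene names (e.g. matched resamples)
    thresholds   : iterable of rank thresholds k (top k genes by score)

    Returns a dict k -> observed / expected, where "expected" is the mean
    count of control genes in the top k across the control sets.
    """
    # Rank genes from strongest to weakest sweep signal.
    ranked = sorted(scores, key=scores.get, reverse=True)
    curve = {}
    for k in thresholds:
        top = set(ranked[:k])
        observed = len(top & disease)
        expected = sum(len(top & c) for c in control_sets) / len(control_sets)
        # Leave the ratio undefined when no control gene reaches the top k.
        curve[k] = observed / expected if expected > 0 else float("nan")
    return curve
```

A curve that stays below 1 across many thresholds would correspond to a depletion of sweep signals at disease genes; evaluating many thresholds rather than a single cutoff is what averages over the uncertainty in where an outlier threshold should be set.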

      In addition to reporting that extreme values of iHS/nSL are generally less likely at Mendelian disease genes, the authors also report that this depletion is strongest in genes from low recombination regions, or which have >5 specific variants associated with disease.

      Drawing on this result, the authors read this evidence to imply that sweeps are generally impeded or slowed in the vicinity of genes associated with Mendelian diseases due to linkage to recessive deleterious variants, which hitchhike to high enough frequencies that the selection against homozygotes becomes an important form of interference. This phenomenon was theoretically characterized by Assaf et al. 2015, to whom the authors point for support. That such a phenomenon may be acting systematically to shape the process of adaptation is an interesting suggestion. It's a bit unclear to me why the authors specifically invoke recessive deleterious mutations as an explanation though. Presumably any form of interference could create the patterns they observe? This part of the paper is, as the authors acknowledge, speculative at this point.

      We thank the reviewer for their comments. We are sorry that we did not provide a clear explanation of why recessive deleterious mutations specifically are expected to interfere more than other types of deleterious variants. This was shown by Assaf et al. (2015), and we should have stated it explicitly. Recessive deleterious variants interfere more than additive or dominant ones because they can hitchhike together with an adaptive variant to substantial frequencies before negative selection actually acts, which happens only once a significant number of individuals homozygous for the deleterious mutation appear in the population. In contrast, dominant mutations do not reach the same high frequencies in linkage with an adaptive variant, because they are selected against as soon as they appear in the population.

      We now write P18L496: “In diploid species including humans, recessive deleterious mutations specifically have been shown to have the ability to slow down, or even stop the frequency increase of advantageous mutations that they are linked with (Assaf et al., 2015). Dominant variants do not have the same interfering ability, because they do not increase in frequency in linkage with advantageous variants as much as recessive deleterious do, before the latter can be “seen” by purifying selection when enough homozygous individuals emerge in a population (Assaf et al., 2015).”

      We have also confirmed with SLiM forward simulations that recessive deleterious variants interfere with adaptive variants much more than dominant ones (Table 1).
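The intuition for why recessive variants escape selection at low frequency can be made concrete with a standard one-locus calculation. The sketch below is not the SLiM simulation reported in Table 1 and ignores linkage to the adaptive variant entirely; it assumes random mating and genotype fitnesses 1, 1 - h*s, and 1 - s, and the function name and example values are illustrative.

```python
def marginal_selection(q, s, h):
    """Effective selection against a deleterious allele at frequency q.

    Assumes random mating with genotype fitnesses
    1 (no copies), 1 - h*s (heterozygote), 1 - s (homozygote).
    A copy of the allele is paired with another copy with probability q
    and with the wild-type allele with probability 1 - q, so its marginal
    fitness is 1 - s * (h * (1 - q) + q). The returned value is that
    fitness reduction.
    """
    return s * (h * (1.0 - q) + q)


# Illustrative values only: compare recessive (h=0), additive (h=0.5)
# and dominant (h=1) alleles with s = 0.1 as frequency increases.
for q in (0.01, 0.1, 0.5):
    print(q, [round(marginal_selection(q, 0.1, h), 4) for h in (0.0, 0.5, 1.0)])
```

With h = 0 the reduction is s*q, so a rare recessive allele is nearly invisible to selection and is only exposed as its frequency rises, whereas with h = 1 the full cost s applies from the moment the allele appears.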

      I'm also a bit concerned by the fact that the signal is only present in the African samples studied. The authors suggest that this is simply due to stronger drift in the history of European and Asian samples. This could be, but as a reader it's a bit frustrating to have to take this on faith.

      We thank the reviewer for pointing out this issue with our manuscript. We have now shown, as detailed above in our response to main point 1, reviewer 1 weakness 1, that a weaker sweep deficit at disease genes in Europe and East Asia is an expected feature under the interference explanation, due to the weakened interference of recessive deleterious variants during bottlenecks of the magnitude observed in Europe and East Asia. We therefore believe that these new results strengthen our previous claim regarding the role of interference between deleterious and advantageous variants. We want to thank the reviewer for forcing us to examine the difference between results in Africa and out of Africa, as the manuscript is now more consistent and our results substantially better explained.

      There are other analyses that I don't find terribly convincing. For example, the analysis showing that iHS signals are no less depleted at genes associated with >5 diseases than at genes associated with 1 does little to convince me of anything. It's not particularly clear that the number of diseases associated with a given gene should predict the degree of pleiotropy experienced by a variant emerging in that gene with some kind of adaptive function. Failure to find any association here might just mean that this is not a particularly good measure of the relevant pleiotropy.

      We agree with the reviewer that the number of associated diseases may not be a good measure of pleiotropy. Unfortunately, to our knowledge there is currently no good measure of gene pleiotropy in human genomes. Given that the evidence in favor of interference of deleterious variants is now strengthened, we have chosen to remove this analysis from the manuscript. As we now explain throughout the manuscript, pleiotropy is an unlikely explanation in the first place because disease genes have not experienced less long-term adaptation (see the details on our new MK test results in the response to main point 2).

      P16L447: “We find that overall, disease and control non-disease genes have experienced similar rates of protein adaptation during millions of years of human evolution, as shown by very similar estimated proportions of amino acid changes that were adaptive (Figure 5A,B,C,D,E). This result suggests that disease genes do not have constitutively less adaptive mutations. This implies that processes stable over evolutionary time such as pleiotropy, or a tendency to overshoot the fitness optimum, are unlikely to explain the sweep deficit at disease genes.”.

      A last parting thought is that it's not clear to me that the authors have excluded the hypothesis that adaptive variants simply arise less often near genes associated with disease. The fact that the signal is strongest in regions of low recombination is meant to be evidence in favor of selective interference as the explanation, but it is also the regime in which sweeps should be easiest to detect, so it may be just that the analysis is best powered to detect a difference in sweep initiation, independent of possible interference dynamics, in that regime.

      We thank the reviewer for stating these important alternative explanations that needed more attention in our manuscript. In our response to main point 2 above, we explain that higher statistical power in low recombination regions alone is unlikely to explain our results, because we show that a substantial sweep deficit requires not only low recombination but also a higher number of disease variants. We also describe in our response to main point 2 how our new MK-test results on long-term adaptation make it very unlikely that mendelian disease genes experience constitutively less adaptation. We want to thank the reviewer again for pointing out this issue with our manuscript, since it was indeed an important missing piece.

    1. Author Response

      Reviewer #2 (Public Review):

      (1) Much of the cited literature that is used to make the case for their hypothesis is very old and actually refers to active HIV infection and patient studies prior to ART. Also, the literature they cite regarding the role of H2S as an antimicrobial agent seem to be limited to tuberculosis infection.

      We have revised the list of literature and included more relevant references from the post-ART era. The antimicrobial role of H2S has recently been examined comprehensively in the context of tuberculosis. Given the close association of TB with HIV, we think our study is timely and essential. However, we would like to point out that references showing the effect of H2S on infection caused by respiratory viruses are included in the manuscript (7-9). Further, recent findings showing the influence of H2S in the context of SARS-CoV-2 infection are also included in the revised manuscript.

      (2) The choice of the latently infected model cell lines is rather unfortunate. There are much better defined models out there these days than J1.1 or U1 cells, such as the J-LAT cells from the Verdin lab or the various reporter cell lines generated by Levy and co-workers. In particular, U1 cells should not be considered as latently infected, as the virus has a defect in the Tat/TAR axis and is mostly just transcriptionally attenuated. It is unclear why the authors only use J-LAT cells for one of the last experiments.

      As suggested by the reviewer, we have generated new data using J-LAT cells in the revised manuscript. First, we confirmed that PMA-mediated HIV-1 reactivation in J-LAT cells is associated with the down-regulation of cbs, cth, and mpst transcripts (Figure 1-figure supplement 1C-D in the revised manuscript). Additionally, we have performed several other mechanistic experiments in J-LAT cells to validate the data generated in U1 (see below response to # 3).

      (3) It is further unclear why the authors perform most of the experiments using U1 cells, which are considered promonocytic, but in the end seek to demonstrate the influence of H2S on latent HIV-1 infection in CD4 T cells. Performing all experiments in J1.1 or better J-LAT cells would have seemed more intuitive.

      The choice of U1 was based on our earlier studies showing that U1 cells uniformly recapitulate the association of redox-based mechanisms and mitochondrial bioenergetics with HIV-latency and reactivation (10-12). We have validated key findings of U1 cells in J1.1 and J-Lat cell lines. We genetically and chemically silenced the expression of CTH in J-Lat cells and examined the effect on HIV-1 reactivation. Consistent with U1 and J1.1, genetic silencing of CTH using CTH-specific shRNA (shCTH) reactivated HIV-1 in J-Lat (Figure 2-figure supplement 1F-G in the revised manuscript). Supporting this, pre-treatment of J-Lat with non-toxic concentrations of a well-established CTH inhibitor, propargylglycine (PAG) further stimulated PMA-induced HIV-1 reactivation (Figure 2-figure supplement 1H-I in the revised manuscript). Altogether, using various cell line models of HIV-1 latency, we confirmed that endogenous H2S biogenesis counteracts HIV-1 reactivation.

      (4) The authors suggest that H2S production would control latent HIV-1 infection and reactivation. Regarding the idea that CBS, CTH or possibly MPST would control latent infection as a function of their ability to produce H2S from different sources, there are several questions. First, if H2S is the primary factor, why would the presence of e.g. MPST not compensate for the reduction of CTH? Second, why would J1.1 and U1 cells both host latent HIV-1 infection events, however, their CBS/CTH/MPST composition is completely different? Third, natural variations in CTH expression caused by culture over time are larger than variations caused by PMA activation.

      These questions are important and complex. CBS, CTH, and MPST produce H2S in the sulfur network. CBS and CTH reside in the cytoplasm, whereas MPST is mainly involved in cysteine catabolism and is localized to mitochondria. The lack of compensation of CTH by MPST could be due to the compartmentalization of their activities. Furthermore, CTH and CBS activities are regulated by diverse metabolites, including heme, S-adenosyl methionine (SAM), and nitric oxide/carbon monoxide (NO/CO). In contrast, MPST activity responds to cysteine availability. How substrate/cofactor availability and enzyme choices are regulated in the cellular milieu of J1.1 and U1 is an interesting question for future experimentation.

      Moreover, the tissue-specific expression/activity of CBS and CTH dictates their relative contributions in H2S biogenesis and cellular physiology (13). Some of these factors are likely responsible for differential expression of CBS, CTH, and MPST in J1.1 and U1 cells. Regardless of these concerns, viral reactivation uniformly reduces the expression of CTH in U1, J1.1, and J-Lat. While we cannot completely rule out natural variations in CTH expression over prolonged culturing, in our experimental setup CTH remained stably expressed and consistently showed down-regulation upon PMA treatment as compared to untreated conditions.

      (5) Also, the statement that H2S production as exerted per loss of CTH would control reactivation is not supported by the kinetic data. In latently HIV-1 infected T cell lines or monocytic cell lines, PMA-mediated HIV-1 reactivation at the protein level is usually almost complete after 24 hours, but at this time point the difference between e.g. CTH levels only begins to appear in U1 cells. The data for J1.1 are even less convincing.

      We have performed the kinetics of p24 production and CTH in U1 cells. We showed that the levels of p24 gradually increased from 6 h and kept on increasing till the last time point, i.e., 36 h post-PMA-treatment (Fig. 2D in the revised manuscript). The p24 ELISA detected a similar kinetics of p24 increase in the cell supernatant (Fig. 2E in the revised manuscript). The CTH levels show reduction at 24 h and 36 h. Based on these data, we report that HIV-1 reactivation is associated with diminished biogenesis of endogenous H2S. We have not made any claims that depletion of CTH precedes HIV reactivation. However, our CTH knockdown data clearly showed that diminished expression of CTH reactivates HIV-1 in the absence of PMA, which is consistent with our hypothesis that H2S production is likely to be a critical host component for maintaining viral latency.

      (6) Figure 2F. PMA is known to induce an oxidative stress response, however, in the experiments the data suggest that PMA results in a downregulated oxidative stress response. Maybe the authors could explain this discrepancy with the literature. In fact, both shRNA transductions, scr and CTH-specific seem to result in a lower PMA response.

      In our experiment, PMA treatment for 24 h results in down-regulation of oxidative stress genes. However, the effect of PMA on the oxidative stress responsive genes is time-dependent. In our earlier publication, we showed that 12 h PMA treatment induces oxidative stress responsive genes in U1 cells (12), whereas at 24 h, the expression of genes is down-regulated (10). Genetic silencing of CTH resulted in elevated mitochondrial ROS and GSH imbalance, which is in line with a further decrease in the expression of oxidative stress responsive genes as compared to PMA alone. As a consequence, PMA-treatment of U1-shCTH induced HIV-1 reactivation, which supersedes that stimulated by PMA or shCTH alone.

      (7) Given that the authors in subsequent experiments use GYY4137, which is supposed to mimic the increased release of H2S, the authors should have definitely included experiments in which they would overexpress CTH, e.g. by retroviral transduction. Specifically in U1 cells, which seemingly do not express CBS, overexpression of CBS should also result in a suppressed phenotype.

      We have explored the role of elevated H2S levels using GYY4137. Treatment with GYY4137 suppressed HIV reactivation in multiple cell lines and primary CD4+ T cells. As suggested by the reviewer, overexpression of CTH could be another strategy to validate these findings. However, since the transsulfuration pathway and active methyl cycle are interconnected and share metabolic intermediates (e.g., homocysteine), overexpression of CTH could disturb this balance and may lead to metabolic paralysis. Owing to these potential limitations, we used a slow-releasing H2S donor (GYY4137) to chemically complement CTH deficiency during HIV reactivation. We thank the reviewer for this comment.

      (8) Figure 4F: The authors need to explain how they can measure a 4-fold gag RNA expression change in untreated cells. Also, according to Figure 4A, 300 µM GYY produces much less H2S than 5mM, yet the suppressive effect of 300 µM GYY is much higher?

      The four-fold expression in untreated cells is likely due to leaky control of viral transcription in J1.1 cells (14-16). However, to avoid confusion, we have replotted the results by normalizing the data generated upon PMA-mediated HIV reactivation to the PMA-untreated cells (Figure 4F in the revised manuscript). The suppressive effect of GYY4137 at the lower concentration is intriguing but consistent with the findings that high and low concentrations of H2S have profound and distinct effects on cellular physiology (3,17). One possibility is that the high concentration of H2S induces the mitochondrial sulfide oxidation pathway to avert toxicity. This might modulate mitochondrial activity and ROS, resulting in the suppression of the GYY4137 effect. Consistent with this, higher concentrations of H2S have been shown to cause pro-oxidant effects, DNA damage and genotoxicity (3,18). We have discussed these possibilities in the revised manuscript.
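The replotting described above amounts to expressing each condition relative to the untreated sample. A minimal sketch of that arithmetic follows; the function name, the condition labels, and the idea of storing one expression value per condition are assumptions for illustration, and no values from the manuscript are used.

```python
def fold_over_untreated(values, untreated_key="untreated"):
    """Express each condition as a fold change over the untreated sample.

    values        : dict mapping condition label -> expression value
                    (hypothetical layout, e.g. relative gag RNA level)
    untreated_key : label of the untreated condition used as the baseline
    """
    baseline = values[untreated_key]
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    return {condition: value / baseline for condition, value in values.items()}
```

After this normalization the untreated condition is 1 by construction, so any leaky basal transcription no longer appears as an apparent fold change in the untreated cells.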

      (9) Initially, the authors argue "that the depletion of CTH could contribute to redox imbalance and mitochondrial dysfunction to promote HIV-1 reactivation"(p. 9). Less CTH would suggest less produced H2S. However, later on in the manuscript they demonstrate that addition of a H2S source (GYY4137) results in the suppression of HIV-1 replication and supposedly HIV-1 reactivation. This is somewhat confusing.

      We show that depletion of endogenous H2S by diminished expression of CTH (U1-shCTH) resulted in higher mitochondrial ROS and GSH/GSSG imbalance. Both of these alterations are known to reactivate HIV-1 and promote replication (10,11,19). The addition of GYY4137 chemically compensated for the diminished expression of CTH, and prevented HIV-1 reactivation in U1-shCTH. These events are expected to suppress HIV-1 replication and reactivation. We have made this distinction clear in the revised manuscript.

      (10) CTH, or for that matter CBS or MPST do not only produce H2S, however, they also are part of other metabolic pathways. It would have been interesting and important to study how these metabolic pathways were affected by the genetic manipulations and also how the increased presence of H2S (GYY4137) would affect the metabolic activity of these enzymes or their expression.

      We fully agree with the reviewer. In fact, our NanoString data show that upon CTH knockdown (U1-shCTH), MPST levels were down-regulated and CBS remained undetectable (Fig. 2F in the revised manuscript). Additionally, GYY4137 treatment induced the expression of CTH but not MPST upon PMA addition (Fig. 5A in the revised manuscript). We have incorporated these findings in the revised manuscript. Given that CBS and CTH catalyze at least eight H2S-generating steps and two cysteine-producing reactions, the modulation of CTH by HIV is likely to have a widespread influence on transsulfuration pathway and active methyl cycle intermediates. Our future strategies are to generate a comprehensive understanding of sulfur metabolism underlying HIV latency and reactivation. These experiments require multiple biochemical and genetic technologies with appropriate controls. We hope that the reviewer will agree with our view that these experiments should be a part of future investigation. We thank the reviewer for this comment.

      (11) H2S has been reported to cause NFkB inhibition by sulfhydration of p65; as such, the findings here are not particularly novel or surprising. Also, H2S induced sulfhydration is rather not targeted to a specific protein, let alone a HIV protein, making this approach a very unlikely alternative to current ART forms.

      We believe that NF-kB inhibition is not the only mechanism by which H2S exerts its influence on HIV latency. Recent studies point towards the importance of the Nrf2-Keap1 axis in sustaining HIV latency (20). Our data suggest an important role for Nrf2-Keap1 signaling in mediating the influence of H2S on HIV latency. Additionally, recruitment of the epigenetic silencer YY1 is also affected by H2S. Interestingly, YY1 activity is modulated by redox signaling (21), suggesting H2S could be an important regulator of YY1 activity in HIV-infected cells. We have, so far, no evidence for viral proteins targeted by H2S. However, experiments to examine global S-persulfidation of host and HIV proteins are ongoing in the laboratory to fill this knowledge gap. Lastly, our findings raise the possibility of exploring H2S donors together with current ART (not as an alternative to ART) for reducing virus reactivation. We have toned down the clinical relevance of our findings.

      (12) The description of the primary T cell model used to generate the data in Figure 6 is slightly misleading. Also, the idea of this model was originally to demonstrate that "block and lock" by didehydro-cortistatin is possible. In this application, the authors did not investigate whether GYY4137 would actually induce a HIV "block and lock" over an extended period of time.

      As suggested by the reviewer, we have cited the didehydro-cortistatin studies as the basis of our strategy. Our idea was to adapt the primary T cell model to begin understanding the role of H2S in blocking HIV rebound. Our results indicate the future possibility of investigating GYY4137 to lock HIV in deep latency for an extended period of time. However, comprehensive investigation would require long-term experiments and samples from multiple HIV subjects. In the current pandemic times with overburdened Indian clinical settings, we cannot plan these experiments. However, we hope our data form a solid foundation for HIV researchers to perform extended “block and lock” studies using H2S donors.

      (13) However, the authors never provide evidence that endogenous H2S is altered in latently HIV-1 infected cells (which may actually be an impossible task). By the end of the manuscript, the authors have not provided clear evidence that the effects of e.g. CTH deletion would be mediated by the production of H2S, and not by another function of the enzyme. Similarly, the inability of stimuli to trigger efficient HIV-1 reactivation following the provision of unnaturally high levels of H2S is not surprising given reports on the effect of GYY4137 as anti-inflammatory agent and suppressor NF-kB activation. Unless the authors were to demonstrate a true "block and lock" effect by GYY4137 the data will likely have limited impact on the HIV cure field.

      It is difficult to measure H2S levels in latently infected primary cells owing to the limited sensitivity of the assay and the insufficient number of cells latently infected with HIV-1. However, in the revised manuscript we have clearly shown that cysteine levels are not affected by CTH depletion and that cysteine deprivation does not reactivate HIV-1. These results indicate that the effects of CTH depletion are likely mediated by H2S. This is consistent with our data showing that GYY4137 specifically complements CTH deficiency and blocks HIV-1 reactivation in U1-shCTH. Further, we carried out an in-depth investigation to show that the effect of GYY4137 is not due to impaired activation of CD4+ T cells.

      Lastly, since CTH catalyzes multiple reactions during H2S production, we cannot rule out the effect of other metabolites in this process. However, we think that this is outside the scope of the present study. Our study focuses on understanding how H2S modulates redox, mitochondrial bioenergetics, and gene expression in the context of HIV latency. These insights are likely to positively impact future studies exploring the role of H2S in HIV cure.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors sought to establish a standardized quantitative approach to categorize the activity patterns in a central pattern generator (specifically, the well-studied pyloric circuit in C. borealis). While it is easy to describe these patterns under "normal" conditions, this circuit displays a wide range of irregular behaviors under experimental perturbations. Characterizing and cataloguing these irregular behaviors is of interest to understand how the network avoids these dysfunctional patterns under "normal" circumstances.

      The authors draw upon established machine learning tools to approach this problem. To do so, they must define a set of features that describe circuit activity at a moment in time. They use the distributions of inter-spike intervals (ISIs) and spike phases of the LP and PD neurons as these features. As the authors mention in their Discussion section, these features are highly specialized and adapted to this particular circuit. This limits the applicability of their approach to other circuits with neurons that are unidentifiable or very large in number (the number of spike phase statistics grows quadratically with the number of neurons).

      We agree with the reviewer that the size of the feature vectors as described grows quadratically with the number of neurons. The feature sets we describe are most suited for “identified” neurons – neurons whose identity and connectivity are known and can be reliably recorded from multiple animals. The method described here is best suited for systems with small numbers of identified neurons. For other systems, other feature vectors may be chosen, as we have suggested in the Discussion: Applicability to other systems.
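
      The quadratic growth mentioned above can be illustrated with a minimal sketch. It is not the pipeline used in the paper: the choice of percentiles, the function names, and the assumption of one block of phase statistics per ordered pair of neurons are all hypothetical and serve only to show how a fixed-length feature vector can be built from spike times and how its size scales.

```python
def isi_percentiles(spike_times, percentiles=(10, 50, 90)):
    """Return selected percentiles of one neuron's inter-spike intervals.

    The percentile choice is an illustrative assumption, not the paper's.
    """
    isis = sorted(b - a for a, b in zip(spike_times, spike_times[1:]))
    if not isis:
        return [float("nan")] * len(percentiles)
    out = []
    for p in percentiles:
        # Linear interpolation between the two closest ranks.
        rank = (len(isis) - 1) * p / 100.0
        lo = int(rank)
        hi = min(lo + 1, len(isis) - 1)
        out.append(isis[lo] + (isis[hi] - isis[lo]) * (rank - lo))
    return out


def feature_count(n_neurons, n_percentiles=3):
    """Count features, assuming one ISI block per neuron and one phase
    block per ordered pair of distinct neurons (hypothetical layout)."""
    isi_features = n_neurons * n_percentiles
    phase_features = n_neurons * (n_neurons - 1) * n_percentiles
    return isi_features + phase_features
```

      Under these assumptions, two neurons give 12 features while ten neurons give 300, which is why such feature sets suit small circuits of identified neurons.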

      The main results of the paper provide evidence that ISIs and spike phase statistics provide a reasonable descriptive starting point for understanding the diversity of pyloric circuit patterns. The authors rely heavily on t-distributed stochastic neighbor embedding (tSNE), a well-known nonlinear dimensionality reduction method, to visualize activity patterns in a low-dimensional, 2D space. While effective, the outputs of tSNE have to be interpreted with great care (Wattenberg, et al., "How to Use t-SNE Effectively", Distill, 2016. http://doi.org/10.23915/distill.00002). I think the conclusions of this paper would be strengthened if additional machine learning models were applied to the ISI and spike phase features, and if those additional models validated the qualitative results shown by tSNE. For example, tSNE itself is not a clustering method, so applying clustering methods directly to the high-dimensional data features would be a useful validation of the apparent low-dimensional clusters shown in the figures.

      We thank the reviewer for these suggestions, and agree with the reviewer that t-SNE is not a clustering method, and directly clustering on t-SNE embeddings is rife with complexities. Instead we have used t-SNE to generate a visualization that allows domain experts to quickly label and cluster large quantities of data. This makes a previously intractable task feasible, and offers some basic guarantees on quality (e.g., no one data point can have two labels, because labels derive from position of data points in two dimensional space). In addition:

      • We used uMAP, another dimensionality reduction algorithm, to perform the embedding step, and colored points by the original t-SNE embedding. (Figure 3—figure supplement 3). Large sections of the map are still strikingly colored in single colors, suggesting that the manual clustering did not depend on the details of the t-SNE algorithm, but is rather informed by the statistics of the data.

      • We validated our method using synthetic data. We generated synthetic spike trains from different “classes” and embedded the resultant feature vectors using t-SNE. Data from different classes are not intermingled, and form tight “clusters” (Figure 2 -- figure supplement 4).

      • Finally, we attempted to use hierarchical clustering to cluster the raw feature vectors, and were not able to find a reasonable partitioning of the linkage tree that separated qualitatively different spike patterns (Figure at the top of this document). We speculate that this is because feature vectors may contain outliers that bias clustering algorithms that attempt to preserve global distance to lump the majority of the data into a single cluster, in order to differentiate outliers from the bulk of the data.

      The authors do show that the algorithmically defined clusters agree with expert-defined clusters. (Or, at least, they show that one can come up with reasonable post-hoc explanations and interpretations of each cluster). The very large cluster of "regular" patterns -- shown typically in a shade of blue -- actually looks like an archipelago of smaller clusters that the authors have reasoned should be lumped together. Thus, while the approach is still a useful data-driven tool, a non-trivial amount of expert knowledge is baked into the results. A central challenge in this line of research is to understand how sensitive the outcomes are to these modeling choices, and there is unlikely to be a definitive answer.

      We agree with the reviewer entirely.

      Nonetheless, the authors show results which suggest that this analysis framework may be useful for the community of researchers studying central pattern generators. They use their method to qualitatively characterize a variety of network perturbations -- temperature changes, pH changes, decentralization, etc.

      In some cases it is difficult to understand the level of certainty in these qualitative observations. A first look at Figure 5a suggests that three different kinds of perturbations push the circuit activity into different dysfunctional cluster regions. However, the apparent spatial differences between these three groups of perturbations might be due to animal-level differences (i.e. each preparation produces multiple points in the low-D plot, so the number of effective statistical replicates is smaller than it appears at first glance). Similarly, in Figure 9, it is somewhat hard to understand how much the state occupancy plots would change if more animals were collected -- with the exception of proctolin, there are ~25 animals and 12 circuit activity clusters which may not be a favorable ratio. It would be useful if a principled method for computing "error bars" on these occupancy diagrams could be developed. Similar "error bars" on the state transition diagrams (e.g. Fig 6a) would also be useful.

      We agree with the reviewer. Despite this paper containing data from hundreds of animals, the dataset may not be sufficiently large to perform some necessary statistical checks. We agree with the reviewer that a more rigorous error analysis would be useful, but is not trivially done.
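
      One possible way to approach the "error bars" the reviewer asks for is to resample whole animals rather than individual data points, so that the many segments contributed by one preparation are not treated as independent replicates. The sketch below is only an illustration under that assumption; the function names, the percentile interval, and the data layout (one list of state labels per animal) are hypothetical and are not part of the published analysis.

```python
import random


def occupancy(segments_by_animal, state):
    """Fraction of all segments, pooled across animals, labeled `state`."""
    labels = [s for segs in segments_by_animal for s in segs]
    return labels.count(state) / len(labels)


def bootstrap_occupancy(segments_by_animal, state, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for occupancy of `state`,
    resampling animals (not segments) with replacement."""
    rng = random.Random(seed)
    n = len(segments_by_animal)
    estimates = []
    for _ in range(n_boot):
        resampled = [segments_by_animal[rng.randrange(n)] for _ in range(n)]
        estimates.append(occupancy(resampled, state))
    estimates.sort()
    lo = estimates[int(0.025 * n_boot)]
    hi = estimates[int(0.975 * n_boot) - 1]
    return lo, hi
```

      With few animals relative to the number of states, intervals obtained this way would be expected to be wide, which is consistent with the concern raised above.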

      Finally, one nagging concern that I have is that the ISIs and spike phase statistics aren't the ideal features one would use to classify pyloric circuit behaviors. Sub-threshold dynamics are incredibly important for this circuit (e.g. due to electrical coupling of many neurons). A deeper discussion about what is potentially lost by only having access to the spikes would be useful.

      We agree with the reviewer that spike times aren’t the ideal feature to use to describe circuit dynamics. This is especially true in the STG, where synapses are graded, and coupling between cells can persist without spiking. However, the data required simply do not exist, as it requires intracellular recordings, which are substantially harder to perform (and maintain over challenging perturbations) than extracellular recordings.

      Finally, the signal to the muscles – arguably the physiologically and functionally relevant signal – is the spike signal, suggesting that spike patterns from the pyloric circuit are a useful feature to measure. Nevertheless, this is an important point, and we thank the reviewer for raising it, and we have included it in the section titled Discussion: Technical considerations.

      Overall, I think this work provides a useful starting point for large-scale quantitative analysis of CPG circuit behaviors, but there are many additional hurdles to be overcome.

      Reviewer #2 (Public Review):

      This manuscript uses the t-SNE dimensionality reduction technique to capture the rich dynamics of the pyloric circuit of the crab.

      Strengths:

      • The integration of a rich data-set of spiking data from the pyloric circuit

      • Use of nonlinear dimension reduction (t-SNE) to visualise that data

      • Use of clusters from that t-SNE visualisation to create subsets of data that are amenable to consistent analyses (such as using the "regular" cluster as a basis for surveying the types of dynamics possible in baseline conditions)

      • Innovative use of the cluster types to describe transitions between dynamics within the baseline state and within perturbed states (whether by changes to exogenous variables, cutting nerves, or applying neuromodulators)

      • Some interesting main results:

      o Baseline variability in the spiking patterns of the pyloric circuit is greater within than between animals

      o Transitions to silent states often (always?) pass through the same intermediate state of the LP neuron skipping spikes

      Weaknesses:

      • t-SNE is not, in isolation, a clustering algorithm, yet here it is treated as such. How the clusters were identified is unclear: the manuscript mentions manual curation of randomly sampled points, implying that the clusters were extrapolations from these. This would seem to rather defeat the point of using unsupervised techniques to obtain an unbiased survey of the spiking dynamics, and raises the issue of how robust the clusters are

      We have used t-SNE to visualize the circuit dynamics in a two-dimensional map. We have exploited t-SNE’s ability to preserve local structure to generate an embedding where a domain expert can efficiently identify and label stereotyped clusters of activity by hand. As the reviewer points out, this is a manual step, and we have emphasized this in the manuscript. The strength of our approach is to combine the power of a nonlinear dimensionality reduction technique such as t-SNE with human curation to make a task that was previously impossible (identifying and labelling very large datasets of neural activity) feasible.

      To address the question of how robust the manually identified clusters are, we have:

      1) We used another dimensionality reduction technique, uMAP, to generate an embedding and colored points by the original t-SNE map (Figure 3 – figure supplement 3). To a rough approximation, the coloring reveals that a similar clustering exists in this uMAP embedding.

      2) We generated synthetic spike trains from pre-determined spike pattern classes and used the feature vector extraction and t-SNE embedding procedure as described in the paper. We found that this generated a map (Figure 2—figure supplement 4) where classes of spike patterns were well separated in the t-SNE space.

      • the main purpose and contribution of the paper is unclear, as the results are descriptive, and mostly state that dynamics in some vary between different states of the circuit; while the collated dataset is a wonderful resource, and the map is no doubt useful for the lab to place in context what they are looking at, it is not clear what we learn about the pyloric circuit, or more widely about the dynamical repertoire of neural circuits

      • in some places the contribution is noted as being the pipeline of analysis: unfortunately as the pipeline used here seems to rely on manual curation, it is of limited general use; moreover, there are already a number of previous works that use unsupervised machine-learning pipelines to characterise the complexity of spiking activity across a large data-set of neurons, using the same general approach as here (quantify properties of spiking as a vector; map/cluster using dimension reduction), including Baden et al (2016, Nature), Bruno et al (2015, Neuron), Frady et al (2016, Neural Computation).

      • Some key limitations are not considered:

      o the omission of the PY neuron activity means that the map as given is incomplete: potentially there are many more states, and hence transitions, within or beyond those already found that correspond to changes in PY neuron activity

      We agree with the reviewer that the omission of the PY neurons’ activity means that the map is incomplete. There are likely many more states, and hence many more transitions, than the ones we have identified. In addition, we note that there are other pyloric neurons whose activity is also missing (AB, IC, LPG, VD). However, measuring just LP and PD allows us to monitor the activity of the most important functional antagonists in the system (they effectively form a half-center oscillator, because PD is electrically coupled to AB). In general, the more neurons one measures, the richer the description of the circuit dynamics will be. Collecting datasets at this scale (~500 animals) from all pyloric neurons is challenging, and we have revised the manuscript to make this important point (see Discussion: Technical considerations).

      o The use of long, non-overlapping time segments (20s) - this means, for example, that the transitions are slow and discrete, whereas in reality they may be abrupt, or continuous.

      We agree with the reviewer. There are tradeoffs in choosing a bin size in analyzing time series – choosing longer bins can increase the number of “states” and choosing shorter bins can increase the number of transitions. We chose 20 s bins because they are long enough to include several cycles of the pyloric rhythm, even when decentralized, yet short enough to resolve slow changes in spiking. We have included a statement clarifying this (see Discussion: Technical considerations).
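
      The binning tradeoff discussed above can be made concrete with a small sketch. It assumes spike times in seconds and a list of per-segment state labels; the function names and the example labels are hypothetical, and the code is meant only to show why transitions can be detected no more finely than the bin size.

```python
from collections import Counter


def segment_spikes(spike_times, bin_size=20.0, t_end=None):
    """Split spike times into consecutive non-overlapping bins of
    `bin_size` seconds, starting at t = 0."""
    if t_end is None:
        t_end = max(spike_times, default=0.0)
    n_bins = int(t_end // bin_size) + 1
    bins = [[] for _ in range(n_bins)]
    for t in spike_times:
        idx = int(t // bin_size)
        if 0 <= idx < n_bins:
            bins[idx].append(t)
    return bins


def count_transitions(labels):
    """Count changes of state between consecutive segments.

    Self-transitions (same label twice in a row) are not counted.
    """
    return Counter((a, b) for a, b in zip(labels, labels[1:]) if a != b)
```

      Because each segment carries a single label, a change of state that happens within a bin is only registered at the next bin boundary, and two rapid changes inside one bin may be missed entirely.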

      o tSNE cannot capture hierarchical structure, nor has a null model to demonstrate that the underlying data contains some clustering structure. So, for example, distances measured on the map may not be strictly meaningful if the data is hierarchical.

      We agree with the reviewer. t-SNE can manifest clusters when none exist (Section 4 of https://distill.pub/2016/misread-tsne/) and can obscure or merge true clusters. We have restricted analyses that rely on distances measured in the map to cases where there are qualitative differences in behavior (e.g., with decentralization, Fig 7) or have compared distances within subsets of data where a single parameter is changed (e.g., pH or temperature, Fig 5). The only conclusion we draw from these distance measures is that data are more (or less) spread out in the map, which we use as a proxy for variability. We have included a statement discussing the limitations of using t-SNE (Discussion: Comparison with other methods).

      • the Discussion does not include enough insight and contextualisation of the results.

      We have completely rewritten the discussion to address this.

      Reviewer #3 (Public Review):

      Gorur-Shandilya et al. apply an unsupervised dimensionality reduction (t-SNE) to characterize neural spiking dynamics in the pyloric circuit in the stomatogastric ganglion of the crab. The application of unsupervised methods to characterize qualitatively distinct regimes of spiking neural circuits is very interesting and novel, and the manuscript provides a comprehensive demonstration of its utility by analyzing dynamical variability in function and dysfunction in an important rhythm-generating circuit. The system is highly tractable with small numbers of neurons, and the study here provides an important new characterization of the system that can be used to further understand the mapping between gene expression, circuit activity, and functional regimes. The explicit note about the importance of visualization and manual labeling was also nice, since this is often brushed under the rug in other studies.

      Major concern:

      While the specific analysis pipeline clearly identifies qualitatively distinct regimes of spike patterns in the LP/PD neurons, it is not clear how much of this is due to t-SNE itself vs the initial pre-processing and feature definition (ISI and spike phase percentiles). Analyses that would help clarify this would be to check whether the same clusters emerge after (1) applying ordinary PCA to the feature vectors and plotting the projections of the data along the first two PCs, or (2) defining input features as the concatenated binned spike rates over time of the LP & PD neurons (which would also yield a fixed-length vector per 20 s trial), and then passing these inputs to PCA or tSNE. As the significance of this work is largely motivated by using unsupervised vs ad hoc descriptors of circuit dynamics, it will be important to clarify how much of the results derive from the use of ISI and phase representation percentiles, etc. as input features, vs how much emerge from the dimensionality reduction.

      We agree with the reviewer that it is important to clarify how much of our results comes from the data itself and from how we parameterize it using ISIs and phases, and how much comes from the choice of t-SNE as a dimensionality reduction algorithm. We have addressed this concern in the following ways:

      1. We used principal components analysis on the feature vectors and measured triadic differences in features such as the period and duty cycle of the PD neuron. We found that triadic differences were lower in the t-SNE embedding than in the first two PCA features, or in shuffled t-SNE embeddings (Figure 2– Figure supplement 2), suggesting that the embedding is creating a useful representation that captures key features of the data.

      2. We have used uMAP to reduce the dimensionality of the feature matrix to two dimensions and found that it too preserved the coarse features of the embedding that we observe with t-SNE. Coloring the uMAP embedding by the t-SNE labels revealed that the overall classification scheme was intact (Fig 3 – figure supplement 3).

      3. We generated a synthetic dataset and applied the unsupervised part of our algorithm to it (conversion to ISIs, phases, etc., then t-SNE). We colored the points in the t-SNE embedding by the category in the synthetic dataset. We found that categories were well separated in the t-SNE plot, and each cluster tended to have a single color. This validates the overall power of our approach and shows that it can recover clustering information in large spike sets (Figure 2—figure supplement 4).

      4. We have run k-means and hierarchical clustering on the feature vectors directly and shown that our method is superior to these naïve clustering algorithms running on the feature vectors. We speculate that this is because these clustering methods attempt to partition the full space using global distances, at the expense of distance along the manifold on which the data is located. Algorithms like t-SNE are biased towards local distances, and discount global distances between points outside a neighborhood, and are thus better suited here.

    1. Author Response

      Reviewer 1

      Panda and co-workers analyzed RS fMRI recordings from healthy subjects and from patients with two types of disorders of consciousness: UWS and MCS. They characterized the time-resolved functional connectivity in terms of metastability (time-variance of the Kuramoto order parameter), spatiotemporal patterns via non-negative tensor factorization, and its relationship to the eigenmodes of structural connectivity. Finding greater metastability and non-stationarity of the DMN network in healthy subjects and MCS patients than in UWS patients, they found that the best discriminators to classify the different DoCs are the number of excursions (nonstability) from the DMN, salience and FPN networks extracted by the NNTF analysis. Interestingly, the data-driven NNTF yielded a novel sub-network comprising the FPN and some subcortical structures. The excursions and dwell times from this FPN subnetwork were shown to be significantly lower in the UWS patients than in MCS. Surrogate data testing assures that the different methods and fits are effectively expressing the functional connectivity matrices measured.

      Overall, I think that the results are correct and they advance in the characterization and understanding of the brain under DoC. However, some improvements can be made in the way the results, and the rationale behind them, are presented.

      We thank Prof. Patricio Orio for his assessment.

      While reading the Results section, it is easy to have the impression of a disconnected set of analyses that just happened to be together. In particular, the section about the structural eigenmodes and their relationship with the time-resolved FC seems to have little connection with the rest of the work, except for confirming (yet again) that DoC patients have a less dynamic FC. More elaboration about the relevance of these results, and what they say about DoC (that other dynamical FC analyses don't), is needed both in the introduction and discussion. Although a clear explanation is given in the introduction, the bottom line seems to be yet another measure of metastability. Perhaps, a better explanation of what underlies the 'modulation strength of eigenmodes expression' will be helpful for distinguishing this analysis from others. How novel is the connection that is being done with the structural connectivity and why is this important? Moreover, the eigenmodes analysis has little-to-none importance in the discrimination of patients done at the end; thus, its place within the big picture is hard to evaluate.

      We understand the reviewer’s position. Part one of our work covers time-resolved FC and spatiotemporal networks in DoC. Part two covers the relationship between time-resolved FC and eigenmodes of the structural network. The rationale for including part two is the following: there is a large literature showing that eigenmodes of the structural network can be considered as ‘building blocks’ or basis functions/vectors for spatiotemporal networks at the functional level (Aqil et al., 2021; Atasoy et al., 2016, 2018; Deslauriers-Gauthier et al., 2020; Gabay et al., 2018; Gabay and Robinson, 2017; Robinson et al., 2016; Robinson, 2021; Tewarie et al., 2019, 2020; Wang et al., 2017). Ideally, to link parts one and two, one would take this notion further by analysing whether the magnitude of the eigenmode coefficients differed between UWS, MCS and healthy controls and how this would relate to dwell times or expression of spatiotemporal networks. However, this would lead to an immense multiple testing issue, which would be impossible to overcome with our sample size. An important link between parts one and two of our work is the relationship between change in eigenmode expression and metastability. Our measure for metastability is only a proxy for metastability. The lack of change in eigenmode expression seems to confirm this result of metastability.

      To allow for better integration of part one and two of our work, we have added to the introduction:

      “These eigenmodes can be considered as patterns of ‘hidden connectivity’ that come to expression at the level of functional networks. It has been postulated that eigenmodes form elementary building blocks for spatiotemporal dynamics (Aqil et al., 2021). There is evidence that the well-known resting state networks can be explained by activation of a small set of eigenmodes (Atasoy et al., 2018).”

      We have also clarified in the result section:

      “As resting-state network activity can be explained by activation of structural eigenmodes, we next analyse the role of fluctuations in eigenmode expression over time.”

      Something that I find counter-intuitive and that may confuse some readers, is the (apparent) contradiction between the diminished metastability in the DoC conditions and the reduced dwell times (Figure S1; also "the inability to sequentially dwell for prolonged times in a different set of eigenmodes", as stated in the Discussion). Fewer excursions and shorter dwell times can only mean that some networks are just less visited and maybe this would be enough to distinguish between conditions. Further explaining this will help to understand better the implications of the work.

      We understand the reviewer’s point; however, we disagree that diminished metastability is in contradiction with the findings on dwell times. We show that dwell times are reduced in the posterior DMN, FPN and sub-FPTN networks; however, there is very long dwelling in the residual network in DoC. Hence, the brain resides in fewer network states in DoC, which is in agreement with reduced metastability. Our proxy for metastability is the standard deviation of the Kuramoto order parameter. Whenever there are more visits to network states, or more switching between network states, as is the case for healthy controls in our data, this leads to phase uncoupling followed by phase synchronization, which in turn boosts the standard deviation of the Kuramoto order parameter (a proxy for metastability).
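
      The proxy described above can be sketched in a few lines. The sketch assumes that instantaneous phases (in radians) have already been extracted for each region at each time point, for example with a Hilbert transform, which is not shown here. The function names and the use of the population standard deviation are illustrative assumptions, not necessarily the choices made in the manuscript.

```python
import cmath
import statistics


def kuramoto_order(phases):
    """Magnitude of the mean phase vector across regions at one time
    point: 1 means full synchrony, values near 0 mean incoherence."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def metastability(phase_series):
    """Standard deviation over time of the Kuramoto order parameter.

    `phase_series` is a sequence of time points, each a sequence of
    per-region phases in radians.
    """
    r = [kuramoto_order(phases) for phases in phase_series]
    return statistics.pstdev(r)
```

      A system that stays in one synchronization regime yields a nearly constant order parameter and hence a value close to zero, whereas repeated alternation between synchronized and desynchronized epochs raises the value, in line with the argument above.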

      We agree with the reviewer that the sentence starting “the inability to sequentially dwell for prolonged….” is confusing. We have now removed this statement.

      We have now added to the result section:

      “These findings of very short dwell times in the posterior DMN, FPN and sub-FPTN and long dwell time in the residual network can be considered as a contraction of the functional network repertoire in DoC, which is in agreement with a loss in metastability in these patients.”

      Finally, some comments about the connection(s) of these analyses with the commonly used FCD analysis (based on sliding windows of pair-wise correlations) will be useful, to put better this work into the big picture of time evolution of the functional connectivity.

      We have now discussed sliding window-based analysis in the context of our work in the methodology section.

      “Lastly, we have used a high temporal resolution method to estimate time-resolved connectivity at every time point instead of a sliding window-based method. Previous studies using sliding window approaches have provided novel insights into the brain dynamics of loss of consciousness, such as recurring patterns of functional connectivity, known as brain states, and the alteration of their temporal properties (i.e., the probability of pattern occurrence and the between-pattern transition probabilities) in DoC patients (Demertzi et al., 2019) and in anaesthesia-induced loss of consciousness (Barttfeld et al., 2014a; Uhrig et al., 2018). However, sliding window approaches have limited sensitivity to non-stationarity in the fMRI BOLD signals (Hindriks et al., 2016) and do not capture spatial alterations of classical functional brain networks. The exploration of the spatiotemporal aspects of well-known resting state networks is an important step forwards for better understanding the relation between brain function and consciousness, in a way that is impossible to achieve at the whole brain level. In addition, recent work on time-resolved connectivity shows that brief periods of co-modulation in BOLD signals are an important driving factor for functional connectivity (Esfahlani et al., 2020; Hindriks et al., 2016).”

      Reviewer 2

      The study is of high significance, rigor, and novelty. Despite the many studies of repertoire, dynamic connectivity, etc., in the study of consciousness, there is (surprisingly, as I confirmed with a literature search) a dearth of application of these approaches to disorders of consciousness. The manuscript is well-written and transparent about its limitations. The author should consider the following recommendations:

      We thank the reviewer for his/her assessment of our work.

      1) There is frequent reference to "subcortical" and related networks, but I see no description in the text of which subcortical structures are involved. Panel N of figure 2 is helpful but I think that more explicit detail is important, especially given the specific predictions of mesocircuit theory.

      We have provided details for the subcortical networks presented in the Panel N of Figure 2. In the manuscript we provide a textual description of the brain areas that are part of the network. To improve the clarity of the description of the network, we also now refer to it as “subcortical fronto-temporoparietal (Sub-FTPN)”.

      In the result section, it reads as: “This modulated subcortical fronto-temporoparietal network consists of the following brain regions: bilateral thalamus, caudate, right putamen, bilateral anterior and middle cingulate, inferior and middle frontal areas, supplementary motor cortex, middle and inferior temporal gyrus, right superior temporal, bilateral inferior parietal and supramarginal gyrus.”

      2) Similarly, although the global neuronal workspace does posit a critical role for recurrent frontal-parietal networks, can the authors be more specific about the nodes of the proposed workspace and what they found empirically?

      As mentioned above, we have provided more details about the regions that are part of the “subcortical fronto-temporoparietal” network. As the reviewer rightly noted, this network also shows some overlap with the Global Neuronal Workspace. We refer to that in more detail in the discussion, highlighting how our functional networks overlap with and differ from the two networks (i.e., one feedforward only, one with recurrent activity), and from the predictions of the mesocircuit model. For more detail, please refer to the reply to point 1 of “Recommendations for the authors”.

      3) The classification sensitivity/specificity did not, in my opinion, add much to the manuscript, especially since the number of patients is not remotely close to what would be required for a population-based diagnostic approach. If the authors chose to include this with any reference to diagnosis (highlighted in the introduction and elsewhere), I would encourage a comparison with similar data from other clinical or neuroimaging-based diagnostic approaches. However, I think the value of the study resides more with mechanistic understanding than diagnosis.

      We agree with your suggestion that the primary aim of our work is to provide a mechanistic understanding of loss of consciousness. Therefore, we have removed the classification part from the paper and now explain our findings with a focus on the mechanism of pathological unconsciousness rather than on their potential as a clinical diagnostic tool. This change has required several textual edits throughout the manuscript.

    1. When memes or the subjects of a meme are used for commercial purposes without permission, the meme creator may sue, as the effect of the commercial use on the market value of the original meme usually prevents a finding of fair use. In 2013, the owners of the cats featured in the “Nyan Cat” and “Keyboard Cat” memes won a lawsuit against Warner Bros. and 5th Cell Media for respectively distributing and producing a video game using images of their cats.

      Big corporations use other creators' work more often than we think. It is unreal to think that people's work can be stolen from the internet easily and sometimes it could be hard to prove. Fortunately, these two cases were able to win their lawsuit.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript investigates a role for YAP in replication. Previous work from this group has shown that Yap knock-down leads to accelerated S-phase and an abnormal progression of DNA replication in the frog eye. Here they extend this to show that YAP depletion accelerates S-phase and DNA replication in the frog embryo, and that YAP binds a DNA replication regulator called Rif1. Combing assays suggest that YAP acts on origin firing. This is an interesting new aspect of YAP function. I am not an expert on DNA replication, however, I feel that the manuscript would have been improved if more mechanistic insight was gained into how Rif1 and YAP interact, and how that interaction influences replication timing.

      In the revised version of the manuscript, we have strengthened our conclusion that Yap regulates the dynamics of DNA replication. We now provide further experiments in addition to DNA combing and nascent strand analysis by agarose gel electrophoresis: Rhodamine-dUTP incorporation/nucleus, 32P-dCTP incorporation, western blotting for replication fork proteins. All show that DNA synthesis and origin activation are increased after Yap depletion.

      Moreover, in the revised manuscript we also directly compared the effects of YAP depletion to those of Rif1 depletion alone (page 7, New Figure 4). As for Yap depletion, we first quantified rhodamine-dUTP incorporation after Rif1 depletion by direct fluorescence microscopy, which demonstrated a clear increase of DNA synthesis, consistent with Alver et al. 2017. Second, we performed DNA combing experiments after Rif1 depletion in egg extracts that show a marked increase in DNA replication and fork density like those seen after Yap depletion, spanning from very early to mid S-phase. We therefore found that Rif1 depletion and Yap depletion qualitatively show the same main effects: an increase of DNA synthesis and fork density, which is more pronounced in early S-phase. We also noticed quantitative differences in the direct fluorescence after rhodamine incorporation of whole nuclei and in fork density, with stronger effects after Rif1 depletion compared to Yap depletion. This suggests that there might be an additional mechanism for Rif1 in regulating origin activation.

      The title of the manuscript is "A non-transcriptional function of YAP orchestrates the DNA replication program". It is not clear that YAP "orchestrates" DNA replication - for this to be true, it would have to be signal responsive. Since the authors did not reveal any links to YAP activity (such as YAP phosphorylation or nuclear/cytoplasmic distribution) it is not "orchestrating" DNA replication.

      We have replaced “orchestrates” by “regulates”.

      Figure 1 shows that YAP is recruited onto chromatin after MCM2 and MCM7 and at the same time as PCNA and the start of DNA synthesis. Addition of geminin, an inhibitor of Cdt and MCM loading inhibits YAP loading onto chromatin. YAP immuno depletion leads to premature DNA synthesis or replication. Fig 1 B is quite confusing- the labeling in Figure 1B is likely incorrect.

      We apologize for this confusion. This has been corrected and the Figure 1B is now properly labelled.

      Figure 2 investigates if YAP depletion affects origin firing or fork speed, using DNA combing. Fig 2A shows that there is increased activated replication origins and decreased distance between origins. The authors say that the increase of fork density is more pronounced than the decreased distance, suggesting YAP is regulating the activation of origins. The number of replicates is low. This is especially true for the conclusion that eye length is unaltered -it appears that there is a subset of eye length that is increased in 2F, which might reach significance if triplicates were performed.

      As the referee points out, both the observed increase of fork density and the decrease of origin distances argue that origin activation is increased after Yap depletion. The fact that the increase of the fork density seems more pronounced than the local decrease of neighbouring origins allows a more detailed interpretation, explicitly that whole clusters of origins are activated on top of origins inside already active clusters. This can be observed in the two independent experiments probing many fibers for eye distances and eye numbers.

      Concerning Figure 2F, the scatter plot gives the impression that there are more eyes with larger sizes after Yap depletion, but please note that there are also more EL measured, as stated in the legend (Mock n=182 versus Yap n=311). To highlight this parameter, we added these numbers below the scatter plot in the revised Figure 2F, as we have done consistently for all of the experiments presented in the revised Figures. The means of the two EL distributions are numerically different, but since both distributions are not Gaussian (tested by the D'Agostino and Pearson test), only non-parametric tests can apply (Mann-Whitney or Kolmogorov-Smirnov test). The results of the two non-parametric tests show that the distributions are not significantly different, as mentioned in the legend. However, we cannot rule out that after Yap depletion some larger eyes may arise from fusions of forks or from a higher fork speed, but again, the tests, applied to a high number of measurements, show no significant statistical differences.
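
To make the statistical reasoning above concrete, here is a minimal sketch of a two-sided Mann-Whitney U test for two samples of unequal size. It is not the authors' analysis pipeline, which the response does not describe: the function name `mann_whitney_u`, the normal approximation with tie correction, and the eye-length values are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided p-value).
    Assumes the pooled values are not all identical.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the mean of the ranks i+1 .. j.
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2
        tie_term += (j - i) ** 3 - (j - i)
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / var_u ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


# Hypothetical eye lengths (kb); group sizes may differ, as in Figure 2F.
mock = [4.1, 5.0, 6.2, 7.5, 9.8, 12.4]
yap_depleted = [3.9, 5.1, 5.8, 6.6, 8.0, 9.1, 11.7, 15.3]
u_stat, p = mann_whitney_u(mock, yap_depleted)
print(f"U = {u_stat:.1f}, p = {p:.3f}")
```

Because the test works on ranks rather than raw values, it does not require Gaussian distributions, and a few large eyes in the bigger sample do not by themselves produce a significant difference.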

      The authors conducted AP-MS on egg extracts to identify proteins that co-IP with YAP. One of many proteins identified was RIF1. Figure 3 shows a co-IP with RIF1 and YAP. It is a very weak co-IP.

      We agree that the Rif1/Yap co-IP is weak, but it is reproducible in several independent experiments with different extracts. There could be many reasons for this. Co-IPs with a high molecular weight partner like Rif1 (250 kDa) are generally tedious (poor gel migration and WB transfer). Further, Rif1 has been described as having a subnuclear localisation and as associating with the nuclear lamina and heterochromatin. These characteristics are known to make proteins highly insoluble. These technical limitations have been reported for the mouse Rif1, for instance (Sukackaite R. et al. Sci Rep 2017 May 18;7(1):2119). In fact, similar “weak co-IPs” were also obtained between Rif1 and Nanog (Wang J. et al. Nature 2006 (444), 364–368) as well as with PP1 (Hiraga S. et al. EMBO Rep. 2017 Mar;18(3):403-419). Finally, it could also be that this interaction is not permanent but dynamic, making it difficult to capture in a co-IP. Taken together, these parameters mean that the identification of the interaction is in itself challenging. What we did manage to provide is a reciprocal co-IP using the endogenous proteins, which we believe best reflects native conditions.

      Figure 4 shows that YAP levels increase during development and that depletion of YAP or RIF1 leads to increased cell division. The authors use Trim-away to deplete YAP and RIF1 and find that depletion of either leads to an increased number of small cells. The YAP depletion shown in Fig 4B is clear, as is the increased number of small cells in YAP depletion or RIF1 depletion.

      Figure 4 supplement 1 is arguing that trim away and morpholino combined are more effective. Quantitation of the western blots in panel A is needed for this to be convincing.

      The quantification is now presented in new Figure 5-figure supplement 1A. At the 2-cell stage, we observe some fluctuations in the amounts of Yap between samples, the origin of which we do not fully understand. At the 4-cell stage, a reduction in Yap is observed regardless of the depletion strategy used. It is from the 8-cell stage onwards that differential effects between the depletion methods can be appreciated. From this stage onwards, the quantifications confirm that the TRIM-Away and morpholino combined are more effective than taken separately.

      Figure 5 shows that RIF1 is expressed in the eye in RSC and that loss of RIF1 leads to a small eye. Panel B shows that by western blot analysis RIF1 antibody is specific. However, antibodies can have very different abilities in western vs staining. The RIF1 and YAP antibodies should be validated in staining. Also, the staining in Fig5C is at low resolution for both YAP and RIF1 and the identification of foci is unclear.

      This is indeed an important issue. To address this point, we performed immunostaining on retinal sections from embryos depleted of the target protein and compared the fluorescent signal obtained in control versus depleted samples. We show that upon depletion of Yap or Rif1, the signal from the immunostaining is severely reduced for Yap or Rif1, respectively, which attests to the specificity of the antibodies used in this study. We have added an additional supplementary Figure to show this control (Figure 6-figure supplement 1).

      We agree with the reviewers that the quality of the images could be improved. We now provide confocal images with a better resolution (Figure 6C).

      For Rif1, we observe a clear nuclear staining, rather non-homogenous which is consistent with data reported in the literature. Indeed, Rif1 localisation has been shown to be highly dynamic during the cell cycle and also during S-phase (Cornacchia D. et al. EMBO J. 2012). Some brighter foci could be observed at specific phases (such as G1-phase) but overall, the general pattern appears rather “granular” and restricted to the nucleus. This is what we are also observing. Interestingly, Rif1 does not appear to colocalize with the replication fork or with the replicative helicase MCM3 (Cornacchia D. et al. EMBO J. 2012). The replication foci observed in this study are therefore to be understood independently of the Rif1 localisation pattern.

      For Yap, we do not detect any granular expression but observe rather homogeneous nuclear and cytoplasmic staining, which is also consistent with reported data showing YAP nucleo-cytoplasmic shuttling (see for instance Manning S.A. et al. Curr Biol. 2018). STED microscopy might be necessary for higher resolution.

      It is difficult to see the points the authors wish to communicate in Figure 6. There is almost no Edu in the YAP-MO, which questions the ability to recognize the different patterns in this region of the eye.

      Our observations show that there are fewer EdU positive cells in the Yap-MO but not “no EdU”. The fluorescence intensity in the green-labelled nuclei in Figure 7C after Yap MO does not appear different from that in the control-MO. Under these conditions, there is no reason to think that one pattern is more difficult to recognise than the other one.

      Reviewer #2 (Public Review):

      This paper is of potential interest within the field of DNA replication, as it identifies a novel role for YAP protein in DNA replication dynamics. However, the conclusions are not supported by properly controlled data. Several aspects of data analysis and representation need to be revised.

      In this manuscript, the authors characterized YAP function in the control of DNA replication dynamics, taking advantage of the Xenopus laevis system.

      They found that YAP is recruited to replicating-chromatin and showed that its chromatin enrichment depends on the assembly of pre-RC proteins. In addition, they show that the immuno-depletion of YAP leads to increased DNA synthesis and origin activation, revealing YAP's possible role in the regulation of replication dynamics.

      The authors were also interested in finding YAP potential partners that could mediate its function. They identified Rif1, a major regulator of replication timing, as a novel YAP interactor during DNA replication.

      As RIF1 expression in vivo is restricted to the stem cell compartment of the Xenopus retina, similar to YAP, the authors assessed whether Rif1 could regulate the spatial-temporal program of DNA replication in stem cells. They showed that depletion of Rif1 at early stages of Xenopus embryos development leads to alterations in replication foci of retinal stem cells, resembling the effect observed following YAP down-regulation.

      Finally, they studied the impact of YAP and RIF1 down-regulation at early stages of development, showing that their absence results in the acceleration of cell division rate of Xenopus embryos, where RNA transcription is absent. Based on these results they concluded that YAP has a role in S-phase independent from transcription.

      The higher rate of DNA synthesis observed in the absence of Yap in Figure 1D is not very evident from the gels in Figure 1, supplement 3B. The timing of the experiments is continuously changing throughout the figures. It is therefore difficult to compare them. Also, comparisons across different gels are difficult to interpret. Most importantly, relative quantification on gel images cannot support the claim of increased DNA synthesis in the absence of YAP. To accurately quantify the replication of DNA added to the extract, the total amount of DNA synthesized must be quantified.

      Although we do not agree that relative quantification on gel images cannot support the claim of increased DNA synthesis in the absence of Yap, we thank the reviewer for his suggestion since we now provide additional data clearly strengthening our conclusion.

      Many studies, published in high-standard journals and coming from different Xenopus replication laboratories, have quantified DNA synthesised after 32P-dCTP incorporation and separation by agarose gel electrophoresis (Shechter et al, 2004; Trenz et al, 2008; Guo et al, 2015; Walter & Newport, 1997; Suski et al, 2022, Nature). Nevertheless, as the referee suggested, we quantified the total amount of DNA synthesized in three new independent experiments. These new results, presented on page 5, lines 34-39, and shown in Figure 1G, support our conclusion, as they also show that Yap depletion increases total DNA synthesis. Please note that the DNA combing results presented in Figure 2 also show that replication is increased after Yap depletion. Finally, we also added another set of experiments to Figure 1 to further confirm these findings. We used the incorporation of Rhodamine-dUTP followed by the quantification of the fluorescence intensity within nuclei. This nuclei-fluorescence-based method is frequently used in proliferation assays to assess nucleotide incorporation resulting from the DNA replication process in other organisms. Our new results demonstrate that DNA synthesis is increased 1.5-fold in six biological replicates and represent a third independent method, in addition to DNA combing and 32P-dCTP incorporation, showing that DNA synthesis is increased upon YAP depletion. These new results are now presented on page 5, lines 27-24, and shown in Figure 1D-F.

      As explained in the MM section page 14 in the original manuscript, the replication extent (percent of replication) differs for a specific time point from one extract to another, because each egg extract prepared from one batch of eggs replicates nuclei with its own replication kinetics. To overcome this problem and to compare different independent experiments performed using different egg extracts, the data points of each sample were normalized to maximum incorporation value.
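
The normalization described above can be sketched in a few lines. This is only an illustration of dividing each time course by its own maximum so that extracts with different kinetics become comparable. The function name `normalize_to_max` and the incorporation values are hypothetical and do not come from the manuscript.

```python
def normalize_to_max(incorporation):
    """Scale one time course so that its maximum incorporation equals 1.0."""
    peak = max(incorporation)
    if peak <= 0:
        raise ValueError("time course has no positive incorporation value")
    return [value / peak for value in incorporation]


# Hypothetical incorporation values at the same time points in two egg extracts.
time_courses = {
    "extract_a": [5.0, 20.0, 60.0, 80.0],
    "extract_b": [2.0, 9.0, 30.0, 40.0],
}

# After normalization both curves end at 1.0 and can be overlaid or averaged.
normalized = {name: normalize_to_max(tc) for name, tc in time_courses.items()}
for name, values in normalized.items():
    print(name, [round(v, 3) for v in values])
```

One consequence of this choice is that absolute differences in total synthesis between extracts are discarded. Only the shape of each replication time course is compared.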

      It is also necessary to analyze the dynamics and the abundance of chromatin-bound replication proteins associated with the active replication fork after Yap depletion using chromatin binding assays. This would further confirm the increase in the fork density observed by DNA combing experiments.

      We thank the referee for this suggestion and we added a western blot of chromatin bound proteins after Yap depletion. This shows that two replication proteins associated with the active replication fork, namely Cdc45 and PCNA, are enriched after Yap depletion compared to the control at the beginning of S-phase. This observation further supports the DNA combing results showing that more forks are active after YAP depletion. This new data is now presented page 6 lines 25-32 and displayed in Figure 2H.

      We would like to stress here that, with the additional methods included in the revised version, five different methods in total (Rhodamine-dUTP incorporation/nucleus, 32P-dCTP incorporation - total synthesis, 32P-dCTP incorporation - nascent strand analysis, DNA combing, western blotting for replication fork proteins) show that DNA synthesis and origin activation are increased after Yap depletion.

      The quantification of the amount of YAP in Figure 1B is confusing. The legend of the chart states "Control in light grey and presence of geminin in black", but the bar colors are of different shades of grey. It is not clear how to evaluate them.

      We apologize for this confusion. This has been corrected and the Figure 1B is now properly labelled.

      The efficiency of depletion for both Rif1 and YAP is different in Figure 4B and Figure 4A, supplement 1.

      We agree with the referee that the efficiency of depletion is different in both figures. This is explained by the fact that the extent of the depletion varies from experiment to experiment. We work with different batches of in vitro fertilized embryos and extracts, so these differences simply reflect the technical/biological variability.

      Moreover, the combined use of the TRIM-away approach with injections of MO led to a stronger and prolonged YAP depletion but also triggered toxicity in the tadpoles, which display severe abnormalities.

      It is important to point out that abnormal development is not always attributable to a toxic effect. Many losses of gene function result in malformations without being ascribed to toxicity or unspecific effects. However, we agree with the reviewers on the need to present a rescue experiment, which is now shown in new Figure 5C and new Figure 5-figure supplement 1B. In addition, we also provide gain-of-function (GOF) data for YAP in early embryos. In brief, we find that the Yap GOF leads to opposite outcomes than those of its depletion with embryos at the same stage of development, having fewer and larger cells than the control. Furthermore, we show that the effects of Yap depletion, i.e. embryos with more and smaller cells than the control at the same developmental stage, are rescued by the injection of MO-resistant Yap mRNA to restore the protein level. This is true for both embryonic divisions (new Figure 5C) and development, as we obtained normal-looking neurula after Yap rescue (new Figure 5-figure supplement 1B). Overall, these data now clearly show that Yap is both sufficient and necessary to maintain the rate of embryonic divisions and that this phenotype is specific since it can be rescued by expressing Yap alone. These new data are presented page 8, lines 2-10.

      Reviewer #3 (Public Review):

      The article by Garcia et al clearly describes a set of experiments establishing Yap as a novel regulator of DNA replication dynamics. Its characterization as both a RIF1 interaction partner as well as playing its own role in replication initiation will likely have a significant impact on the field, as currently little is known about how DNA replication during early embryonic cell divisions is regulated.

      The authors aim to identify a non-transcriptional function of YAP through the use of the Xenopus in vitro replication system and Yap depletion. Strengths of the paper include the particularly appropriate use of the Xenopus in vitro replication system, as well as the combined use of Trim-Away and morpholino oligonucleotides to deplete Yap and Rif1. Moreover, these experiments were elegantly complemented by single-molecule molecular combing and in vivo studies. Identifying Yap as a novel regulator of DNA replication dynamics, the authors achieved their aim. The characterization of Yap as both playing a role in replication initiation and as a Rif1 interaction partner will likely have a significant impact on the field, as currently little is known about how DNA replication during early embryonic cell divisions is regulated. A weakness of the paper is that some of the representative data does not appear to be very representative of the entire data set.

      We replaced representative data in Figure 2 A, which we think better reflects the main conclusions of the entire data set.

    1. Reviewer #1 (Public Review):

      1: The authors formulate competing hypotheses on the behavioral impact of alpha oscillations using signal detection theory (SDT) (Intro and Fig. 1). SDT is indeed well suited for this, as it is used to compute the orthogonal behavioral metrics d' (discriminability) and criterion (bias). However, soon the authors write:

      "The higher d' for conservative trials may be due to the more skewed mapping between the false alarm (FA) rate to its Z-value in our d' computation. Specifically, when criterion (or the decision boundary) intersects the noise distribution at its right tail, small changes in FA rate are nonlinearly exaggerated after Z-transformation. As we did not observe a difference in accuracy between conservative and liberal trials, which is a more robust measure of perceptual discriminability when target presence rate equals 50%, we argue that the observed statistically significant d' difference is equivocal."

      And also:

      "For the binning analyses, we mainly focused on the percentage correct (i.e., accuracy), and hit and FA rates, because these metrics scale linearly (as opposed to d', which scales nonlinearly as the hit rate increases or FA rate decreases linearly) and are well defined for both behavioral data and MVPA outputs."

      And indeed from Fig. 3 onwards they do not really use SDT anymore, which is confusing given the Introduction and Fig. 1. I think it's also problematic, as accuracy, hit-rate and fa-rate are not orthogonal and are therefore much less suited to arbitrate between their competing hypotheses. As a result, I'm not convinced the paper accomplishes what it sets out to do in the Introduction.
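
For readers less familiar with SDT, the quantities at issue can be written out as a short sketch. It uses the standard equal-variance formulas d' = Z(hit) - Z(FA) and c = -(Z(hit) + Z(FA)) / 2. The function name `sdt_metrics` and the rates are illustrative assumptions and are not taken from the manuscript's data.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF, i.e. the Z-transform of a rate.
z = NormalDist().inv_cdf


def sdt_metrics(hit_rate, fa_rate):
    """Return (d', criterion) under the equal-variance Gaussian SDT model.

    Rates must lie strictly between 0 and 1.
    """
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# The same 0.02 drop in FA rate moves Z much more in the tail of the
# noise distribution than near its middle (hypothetical rates).
tail_shift = z(0.03) - z(0.01)
mid_shift = z(0.30) - z(0.28)
print(f"tail shift in Z: {tail_shift:.3f}, mid shift in Z: {mid_shift:.3f}")

d_prime, criterion = sdt_metrics(hit_rate=0.60, fa_rate=0.03)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

The sketch shows the nonlinearity the authors cite, but also why d' and criterion remain the natural pair for the competing hypotheses: hit rate and FA rate both move when criterion shifts, so they cannot separate sensitivity from bias on their own.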

      2: Related, if indeed the authors choose to deviate from SDT, they should put the metric "% yes-choices" on equal footing with accuracy. For example, in Fig. 3A, we can see that alpha oscillations predict a reduction of hit-rate as well as fa-rate; this suggests that the main effect is actually on choice bias (% yes-choices) rather than accuracy. If that's true, then the title of this manuscript is misleading.

      3: Have the authors considered testing for non-monotonic effects of alpha oscillations on cortical computation and behavior?

      4: The authors use challenging and sophisticated methods, but these are introduced very casually. For example:

      "To obtain a more fine-grained picture of the alpha power modulation of behavior, we applied generalized linear mixed models (GLMMs; see Methods) to account for both between-subjects and within-subject trial-by-trial response variability, and to estimate the effects of alpha oscillatory power on d' and criterion simultaneously."

      And:

      "To evaluate the quality of visual information coding, we used multivariate pattern analysis (MVPA), operationalizing the quality of visual representation as the neural classifier's classification performance. We used the priming trials to train binary classifiers to classify target-present vs. absent trials in a time-resolved manner, [...]".

      It would help a lot if the authors could unpack their rationale some more. For example, why did they consider between-subjects effects, and could they show some scatter plots with between-subjects correlations before turning to the GLMM? Also, what is the question the authors wanted to answer that required training the classifier in a time-resolved manner (which I like, on a personal note)?

      5: Throughout, the label "liberal trials" is odd, given that group-average criterion > 0 on those trials (Fig. 2C).

      6: It would be nice to explicitly bridge to the literature on (pupil-linked) arousal predicted shifts in decision-making, and to findings on the relationship between alpha oscillations and (pupil-linked) arousal.

    1. But I think we may go still further. The right to regulate the use of wealth in the public interest is universally admitted. Let us admit also the right to regulate the terms and conditions of labor, which is the chief element of wealth, directly in the interest of the common good.

      On July 5, 1935, President Roosevelt signed the Wagner Act, also known as the National Labor Relations Act. The Act included many things such as entitlement to wages and benefits, hours of work, overtime arrangements and overtime compensation, and leave for illness, maternity, vacation or holiday. Labor and working conditions. (n.d.). https://firstforsustainability.org/risk-management/understanding-environmental-and-social-risk/environmental-and-social-issues/labor-and-working-conditions/ National Labor Relations Act (1935). (2021, November 22). National Archives. https://www.archives.gov/milestone-documents/national-labor-relations-act

    1. Well, this was a true early morning treat! You reeeeally botched that one. Like 180 degrees misinterpreted it. That thread is about how Luhmann developed a personal approach that worked for him (as we all do and should), and that there is no one way to work/do a zettelkasten. Ie. We all must (and inevitably will) interpret Luhmann's take on zettelkasten method (and any other tools/method/etc we encounter) in light of what our needs are. What's super dope, is that my whole jam in this ZK world is about showing the thread/lineage of these techniques and helping people specifically wrestle with some of the principles and practices Luhmann employed so that in the end they can apply them in whatever way they see fit. And yet, somehow....you actually miss that? Also, this.... (you) "We approach these methods from such a top down manner, in part, because our culture has broadly lost the thread of how these note taking practices were done historically. Instead of working with something that has always existed and been taught in our culture, and then using it to suit our needs, we're looking at it like a new shiny toy or app and then trying to modify it to make it suit our needs." ... Is this.... (me) "We're coming at [zettelkasten] top-down. We're appropriating something and trying to retrofit it in a desire to "be better." In doing so, we're trying "clean it up a bit." I'm critiquing this approach 😂 I'm saying we come at it top-down bc we see it as a reified object (which is incorrect) that is set in stone, when in fact those who present the "one true way" are actually presenting a "cleaned up version" of Luhmann's very personal approach and calling it "official." Again, I'm critiquing that! I am, by design and punk ethos, kinda against "official." Silly, dude. The whole thread is about not looking at it as a "shiny new toy" and seeing it as a more fluid aspect of note-taking and personal practice. It's about recognizing that the way to recreate Luhmann is to be flexible, interpret these methods for yourself. Why? Bc that's exactly what Luhmann did. "Let the principles and practices guide your zettelkasten work. Throw them in a box with your defined workflow issues. Let them hash it out. Shake the box and let them tell you the "kind" of zk you should be working with." (thread the day before the above mentioned) Also, and you're gonna love this.... Here's you above.... "People have been using zettelkasten, commonplace books, florilegium, and other similar methods for centuries, and no one version is the "correct" one." And here's me..... "The most well-known slip-boxes in the world have been employed by writers in service of their writing. Variations of the system date back to the 17th c., [3] and modern writers such as, Umberto Eco, Arno Schmidt, and Hans Blumenberg are all known for employing some version of the slip-box to capture, collect, organize, and transform notes into published work. Of course, today, the most famous zettelkasten is the one used...." Sound familiar? It's me citing you, ya dum dum 😂 Footnote numero tres.... https://writing.bobdoto.computer/zettelkasten-linking-your-thinking-and-nick-milos-search-for-ground/ Such a funny thing to see this fine Friday morning! ☀

      Sadly I think we're talking past each other somehow; I broadly agree with all of your original thread. Perhaps there's also some context collapse amidst our conversations across multiple platforms which doesn't help.

      Maybe my error was in placing my comment on your original thread rather than a sub branch on one of the top several comments? I didn't want to target anyone in particular as the "invented by Luhmann myth" is incredibly widespread and is unlikely to ever go away. It's obvious from some of the responses I've seen to your thread here in r/antinet that folks without the explicit context of the history default to the misconception that Luhmann invented it. This misconception tends to reinforce the idea that there's "one true way" (the often canonically presented "perfect" Luhmann zettelkasten, rather than the messier method that he obviously practiced in reality) when, instead, there are lots of methods, many of which share some general principles or building blocks, but which can have dramatically different uses and outcomes. My hope in highlighting the history was specifically to give your point more power, not take the opposite stance. Not having the direct evidence to the contrary, you'll notice I hedged my statement with the word "seems" in the opening sentence. I apologize to you that I apparently wasn't more clear.

      I love your comparison of LYT and zettelkasten by the way. It's reminiscent of the sort of comparison I'm hoping to bring forth in an upcoming review of Tiago Forte's recent book. His method (ostensibly a folder-based digital commonplace book, which is similar to Milo's LYT) can be useful, but he doesn't seem to have the broader experience of history or the various use cases to be able to advise a general audience which method(s) they may want to try or for which ends. I worry that while he's got a useful method for potentially many people, too many may see it and his platform as a recipe they need to follow rather than having a set of choices for various outcomes they may wish to have. Too many "thought leaders" are trying to "own" portions of the space rather than presenting choices or comparisons the way you have. Elizabeth Butler is one of the few others I've seen taking a broader approach. A lot of these explorations also mean there are multiple different words to describe each system's functionality, which I think only serves to muddy things up for potential users rather than make them clearer. (And doing this across multiple languages across time is even more confusing: is it zettelkasten, card index, or fichier boîte? Already the idea of zettelkasten (in English speaking areas) has taken on the semantic meaning "Luhmann's specific method of keeping a zettelkasten" rather than just a box with slips.)

    1. Author Response

      Reviewer 1

      Strengths:

      This manuscript combines experimental, exploratory, and observational methods to investigate a big question in the innovation literature: why some animals innovate while others do not, and how information about innovations spreads. By combining a variety of methods, the manuscript tackles this question in a number of ways, and finds support for previous work showing that animals can learn about foods via social olfactory inspection (i.e., muzzle to muzzle contact), and also presents data intended to investigate the role of dispersing animals in innovation and information spread.

      Using data from a previously-published experiment, the manuscript illustrates how investigators can address numerous interesting questions while limiting the disturbances to wild animals. The manuscript's attempt at using exploratory analysis is also exciting, as exploratory analyses provide a useful tool for behavior research; indeed, Tinbergen insisted that behavior must first be described.

      Weaknesses:

      The manuscript's introduction is a bit unclear as to how the fact that dispersing males may be an important source of information ties to innovations in response to disruptions due to climate change, humans, or new predators, if at all. An introduction regarding the role of dispersed animals in introducing novel behaviors and social transmission would better prepare readers for the questions presented in the manuscript. As it stands now, the manuscript only provides one sentence discussing the theoretical relevance of investigating the role of dispersing animals in innovations.

      We have added some information about this to the introduction (lines 66 – 69 and 121-123) and maintain our discussion of it in the discussion.

      Additionally, while the manuscript attempts to use exploratory analysis, it does not provide enough theoretical background as to why certain questions were asked while the data were explored. While the discussion provides some background as to the role of dispersing males in innovation, the introduction provides little background, and thus does not properly frame the issue. It is unclear how dispersing males became of interest and why readers should be interested in them. As the manuscript reads now, it may be that dispersing males became interesting only as a result of the exploratory analysis, except that the predictions explicitly mention dispersing males. Thus, the manuscript at present makes it difficult to know if the questions surrounding immigrant males resulted from the exploratory analysis, or were questions the analyses were intended to answer from the beginning. If this question only came out after first reviewing the results, then this needs to be made clear in the introduction. I see no issue with reporting observations that were the result of investigations into earlier results, but it needs to be reported in a way that can be replicated in future research: I need to know the decision process that took place during the data exploration.

      We hope this is clearer from our new research aims (lines 125-173)

      The manuscript never clearly defines what counts as an immigrant male; presumably, in this species, all adult males in the group should be immigrants, as females are the philopatric sex. Sometimes, the manuscript uses "recently" to modify immigrant males, but doesn't define exactly what counts as recent, except to say that the males that innovated were in their respective groups for fewer than 3 months, but never explains why three months should be an important distinction in adult male tenure.

      We realise that how we wrote about this previously was not clear and perhaps misleading. We noticed that the males that innovated had been in the group for less than three months. We do not know if this is necessary for them to innovate or not. We also added to the discussion a description of the male in AK19 who had been in the group for four months and did not innovate – as he had many other traits which we would expect to exclude him from criteria for innovation (e.g. very old, post-prime, and inactive – died within months of the experiment).

      Due to the above weaknesses, the provided predictions are a bit murky. It is not clear how variation between groups in accordance with who innovated, or initiated eating a novel food, or demographics is related to the central issue. The manuscript does contribute to the literature by looking at changing rates of muzzle contact over exposure to a novel food source, and provides a good extension of previous findings; that, if muzzle contacts help animals learn about new foods, then rates of muzzle contacts involving novel foods should decrease as animals become familiar with the food. However, this point isn't explicit in the manuscript.

      This is now addressed in the new aims paragraph (lines 125-173)

      Finally, it is also unclear as to why changing rates of muzzle contact AND whether certain individual level variables like knowledge, sex, age, and/or rank might influence muzzle contacts during opportunities to innovate.

      We are not sure exactly what the reviewer means here, but hope that the substantial revisions we have made now address their concern.

      As for the methods, the manuscript doesn't provide enough details as to why certain decisions were made. For example, no reason is given as to why only the first four sessions after an animal ate were considered, why the first three months of tenure (but not four, as seen in one group that didn't innovate) was considered to be a critical time during which immigrant males may innovate, why (including the theoretical reasons) the structure of models for one analysis was changed (dropping one variable, adding interactions), or even how the beginning and ending of a trial was decided, despite reporting that durations varied widely, from 5 minutes to two hours.

      Please see: above about the male with 4 month tenure; and top of document for description of our updated models.

      The discussion contains results that are never presented elsewhere in the manuscript (2a: Individual variation in uptake of a novel food according to who ate first).

      It was just an error in the sub-title in the discussion – this is now amended. But all the other corresponding details were already there, in the list of research aims in the introduction and in the results as well.

      Finally, the largest issue with the manuscript is that its results are not as convincing as the conclusions made. An issue with all the analyses is that some grouping variables are included in some analyses but not others, despite the fact that all of the analyses contain multiple groups (necessitating group as a grouping variable) and multiple observations of the same individuals (i.e., immigrant males tested in multiple groups, necessitating animal identity as a random effect), and that individual exposure to the experiment is not accounted for when considering whether animals ate the food in the allotted period (an important consideration given the massive differences in trial times), making these results difficult to interpret in their current forms. As for the results regarding muzzle contact, the analyses have a number of issues that make it difficult to determine if the claims are supported. These issues include not explaining why rank calculated a year before the experiments took place was valid or if rank was calculated among all group members or within age and sex classes, not explaining how rank was normalized, and not conducting any kind of formal model comparisons before deciding the best model.

      Mostly addressed at the top of this document. Regarding rank calculations: rank was not calculated a year before the experiments; it was calculated using a year’s worth of data up to the beginning of the experiments, and ranks were calculated among all group members. We have made this clearer in the methods. We also explained our method of normalisation, and noted that it was an error to include non-normalised rank in one of the models – this has now been rectified.

      As for the results regarding immigrant males and innovation, little is done to help the fact that these results are from very few observations and no direct analyses. It is possible that something that occurs relatively often but in small sample sizes, like dispersing animals, could have immense power in influencing foraging traditions, and observation is a necessary step in understanding behavior. However, the manuscript doesn't consider any alternative hypotheses as to why it found what it found. No other possible difference between the groups was considered (for example, the groups that rapidly innovated appear to be considerably smaller than the groups that did not), making the claim that immigrant males were what allowed groups to innovate unconvincing. This is particularly true given that some groups in this study population have experimental histories (though this goes unmentioned in the current manuscript), which likely influenced neophobia, especially given work by the same research group showing that these animals are more curious compared to their unhabituated counterparts.

      We have added more discussion of alternative hypotheses to the discussion (line numbers mentioned above).

      Regarding the comment about rapid innovation in smaller groups – we are not sure what the reviewer means here – all groups except BD were of similar size. The second largest group, NH, had one of the quickest innovations and a smaller group (KB) innovated only at the third exposure. Unless the reviewer instead refers to the spread of the innovation here? This is also not quite what we see in the data – BD is the largest group and one of the fastest to spread, and KB is the smallest group and the slowest to spread. Regarding the groups' experimental histories, all five studied groups have already been used in field experiments. The group with the least experimental history (LT) was the one with the greatest proportion of individuals eating the novel food at the first exposure and over the four exposures (see Fig. 2), while one of the groups with the most experimental history (NH) had a smaller proportion of individuals eating the food across the experiment. This is discussed in the discussion (lines 370-380).

      Reviewer 2

      I have separated my issues with the manuscript into three sub-headings (Conceptual Clarity, Observational Detail and Analysis) below.

      1) Conceptual clarity

      There are a number of areas where it would greatly benefit the manuscript if the authors were to revisit the text and be more specific in their intentions. At present, the research questions are not always well-defined, making it difficult to determine what the data is intended to communicate. I am confident all of these issues could be fixed with relatively minor changes to the manuscript.

      For example, Line 104: Question 1 is not really a question, the authors only state that they will "investigate innovation and extraction of eating the food", which could mean almost anything.

      We re-wrote the research questions paragraph and results with this advice in mind – hope it is clearer now. We keep the innovation part just descriptive and hope this is less problematic now.

      Question 2a (line 98) is also very vague in its wording, and I'm left unclear as to what the authors were really interested in or why. This is not helped by Line 104 which refuses to make predictions about this research question because it is "exploratory". Empirical predictions are not simply placing a bet on what we think the results of the study will be, but rather laying out how the results could be for the benefit of the reader. For instance, if testing the effects of 10 different teaching methods on language acquisition-rate: Even if we have no a priori idea of which method will be most effective, we can nevertheless generate competing hypotheses and describe their corresponding predictions. This is a helpful way to justify and set expectations for the specific parameters that will be examined by the methods of the study. In fact, in the current paper, the authors had some very clear a priori expectations going into this study that immigrant males would be vectors of behavioural transmission (clear, that is, from the rest of the introduction, and the parameters used in their analysis, which were not chosen at random).

      We have now updated the whole research aims (lines 125-173).

      The multiple references to 'long-lived' species in the abstract (line 16) and introduction (lines 39, 56) are a bit confusing given the focus of this study. Although such categorisations are arbitrary by nature (a vervet is certainly long-lived compared to a dragonfly), I would not typically put vervet monkeys (or marmosets, line 62) in the same category as apes (references 8 and 9) or humans (line 62) in this regard.

      When we use “long-lived” in the introduction, we explain that we mean animals with slow generational turnover for whom genetic adaptation is relatively slow – too slow to adapt to very rapid environmental change. Within the distinctions the reviewer makes here, we feel that vervets and marmosets are much more similar to apes than to dragonflies etc. in this respect, and we think the comparisons that we make are valid in this context (though we do agree that for other reasons we would not find them appropriate). We have modified the sentence in the introduction (line 4042) and hope this is clearer now. The study in reference 9 is about crop-raiding, which is something vervets can learn to do within one generation too. In addition, reference 8 is used as it was one of the earlier and long-standing definitions of innovation which we are using here – we are not comparing vervets to apes directly, but we do not think a different definition of innovation is required.

      This contributes a little towards the lack of overall conceptual focus for the manuscript: beginning in this fashion suggests the authors are building a "comparative evolutionary origins" story, hinting perhaps at the phylogenetic relevance of the work to understanding human behaviour, but the final paragraph of the study contextualises the findings only in terms of their relevance to feeding ecology and conservation efforts. I would recommend that the authors think carefully about their intended audience and tailor the text accordingly. This is not to say that readers interested in human evolution will not be interested in conservation efforts, but rather that each of these aspects should be represented in each stage of the manuscript (otherwise - conservationists may not read far into the Introduction, and cultural evolution fans will be left adrift in the Conclusion).

      We agree that the line running through the whole paper needed to be clearer and have tried to improve this.

      2) Observational detail

      There are a number of areas of the manuscript which I found to be lacking in sufficient detail to accurately determine what occurred in these experimental sessions, making the data difficult to interpret overall. All of this additional information ought to be readily available from the methods used (the experiments were observed by 3-5 researchers with video cameras (line 341)) and is all of direct relevance to the research questions set out by the authors.

      We added more details about the experiment in the method section.

      While I appreciate that it will take quite a bit of work to extract this information, I am certain that it would greatly improve the robustness and explanatory power of this study to do so.

      The data on who was first to innovate/demonstrate successful extraction of the food in each group (Question 1) and subsequent uptake (Question 2), as well as the actual mechanism by which that uptake occurred (the authors strongly imply social learning in their Discussion, but this is never directly examined) is difficult to interpret based on the information presented. Some key gaps in the story were:

      We did not intend to claim that muzzle contact was the specific mechanism by which individuals learned to extract and eat peanuts – we rather use this experiment to evaluate the function of muzzle contact in the presence of a novel food.

      We did not record observation networks in all groups during experiments and cannot obtain accurate ones from all our videos – we hope it is clearer in our text now. Our group’s previous study (Canteloup et al., 2021) already shows social transmission of the opening techniques using data of two of our groups (NH and KB).

      • Which/how many individuals encountered the food and in what order? I.e., were migrants/innovators simply the first to notice the food?

      No, and we have now added some info about other individuals approaching the box and inspecting the peanuts before innovation took place

      • Did any individuals try and fail to extract the food before an "innovator" successfully demonstrated?
      • How many tried and failed to extract the nuts before and after observing effective demonstrators?

      We have added the number of individuals that inspected the peanuts (visually and with contact)

      • Were individuals who observed others interact with the food more likely to approach and/or extract it themselves?
      • Did group-members use the same methods of extraction as their 'innovators'?

      Yes – this is the topic of Canteloup et al. 2021 – and these data are not presented again here. That study was on two of the groups presented here (KB and NH), with up to 10 exposures in each of those groups, and presents a fine-grained analysis of the peanut-opening techniques used by the monkeys. We hope this is clearer now in the text where we refer to this paper.

      • How many tried and succeeded without having directly observed another individual do so (i.e. 'reinvention' as per Tennie et al.)?

      For this, and the above points: We did not record an observation network for the groups added in this study and are not able to answer this – it is not the focus of this study. For this reason, we do not make claims along these lines in the present study, and are cautious with our social learning related language. Whilst we examine the role of muzzle contact in acquiring information about a novel food, we do not expect this behaviour to be a necessary prerequisite for being able to extract and eat this food – indeed many individuals who learned to eat did not perform muzzle contacts. This aspect of the study is about using this novel food situation to explore whether muzzle contact serves information acquisition – which our evidence suggests it does.

      Moreover, the processing of this food is not complex and is similar to natural foods in their environment, and we do expect individuals to be capable of reinventing it easily (and this point with Tennie’s hypothesis is actually discussed in Canteloup et al. 2021 paper) – but the point here is that their natural tendency is to be neophobic to unknown food, and therefore they do not readily eat it until they see a conspecific doing so, after which they do. And we also used this opportunity, though in a very small sample size, to investigate which individuals would overcome that neophobia and be the first to eat successfully.

      The connective tissue between the research questions set out by the authors is clearly social learning. In short: the thesis is that Migrants/Innovators bring a novel behaviour to the group, then there is 'uptake' (social learning), which may be influenced by demographic factors and muzzle-contact (biases + mechanisms). Given this focus (e.g. lines 224-264 of the Discussion), I would expect at least some of the details above to be addressed in order to provide robust support for these claims.

      See above – the reason we talk about ‘uptake’ rather than social learning is that we really see this as a case of social disinhibition of neophobia, rather than more detailed social learning such as copying or imitation, as it would be in a tool-use setting, for example (though in Canteloup et al. 2021 paper, evidence is found that the specific methods to open peanuts are socially transmitted).

      Question 2a (Lines 136-146): This data is hard to interpret without knowing how much of the group was present and visible during these exposures.

      Please see response to reviewer 1 on this.

      For example: 9% uptake in NH group does not sound impressive, but if only 10% of the total group were present while the rest were elsewhere, then this is 90% of all present individuals. Meanwhile if 100% of BD group were present and only experienced 31% uptake, then this is quite a striking difference between groups.

      Experiments were done at sunrise at monkeys’ sleeping site in AK, LT, NH and KB where most of the group was present in the area; we added more precision on this point in the Method section (lines 615-619).
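
The reviewer's arithmetic on uptake relative to the individuals actually present can be checked with a small sketch. The group size of 100 and the helper name `uptake_among_present` are illustrative assumptions, not values or code from the study; only the percentages (9%, 10%, 31%, 100%) come from the reviewer's example.

```python
def uptake_among_present(eaters, group_size, fraction_present):
    """Return uptake as a fraction of the individuals actually present."""
    present = group_size * fraction_present
    if present <= 0:
        raise ValueError("no individuals present")
    return eaters / present


# Hypothetical group of 100 animals: 9 eat (9% of the total group),
# but only 10% of the group is present at the exposure.
nh_like = uptake_among_present(eaters=9, group_size=100, fraction_present=0.10)

# Hypothetical group of 100 animals: 31 eat, and everyone is present.
bd_like = uptake_among_present(eaters=31, group_size=100, fraction_present=1.0)

print(f"uptake among present: {nh_like:.0%} vs {bd_like:.0%}")
```

The same raw counts therefore rank the two groups in opposite order depending on whether the denominator is the whole group or only those present, which is why attendance at each exposure matters for interpreting uptake.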

      Of course, there is also an issue of how many individuals can physically engage with the novel food even if they want to - the presence of dominant individuals, steepness of hierarchy within that group, etc, will significantly influence this (and is all of interest with regards to the authors' research questions).

      We discuss this with respect to the result showing that higher-ranked individuals were more likely to extract and eat the food at the first exposure and over all four exposures.

      Muzzle-contact behaviour: The authors use their data to implicate muzzle-contact in social learning, but this seems a leap from the data presented (some more on this in the Analysis section).

      We hope our distinction between information acquisition and information use is clearer now.

      For example: - What is the role of kinship in these events?

      We did not analyse kinship here, but we see a lot of targeting towards adult males, and we do not have reliable kinship data for them. We also checked (see response to reviewer 3) the muzzle contacts initiated by knowledgeable adult females, and they are mostly towards adult males, not towards related juveniles (see new figure 4D and lines 497-500).

      • Did they occur when the juvenile had free access to the food (i.e. not likely to be chased off by a feeding adult)?

      We recorded muzzle contacts visible within 2m of the box, so individuals were not necessarily eating at the box at the time of engaging in muzzle contacts. However, the majority of muzzle contacts that we could record took place directly at the edge of the box – at the location where the food is accessed – so an individual would not likely be there if they were not able to have access to the food. It is possible they could be there and not eating, but they would not have been chased off, otherwise they would not be able to engage in muzzle contacts there. But it is not entirely clear what the reviewer’s point is here.

      • Did they primarily occur when adults had a mouthful of food? (i.e. could it simply be attempted pilfering/begging)

      This is not typical of this species. Very few specific individuals remove food from others’ mouths, and they do it with their hands, usually beginning by grooming the victim’s face and cheekpouches, before prising the mouth open and removing food from the cheekpouches.

      • What proportion of PRESENT (not total) individuals were naïve and knowledgeable in each group for each trial (if 90% present were knowledgeable, then it is not surprising that they would be targeted more often)?

      We agree somewhat with this statement, but given the multiple ways we show the effect of knowledge – both at the individual level and the group level (effect of exposure number i.e. overall group familiarity) – we feel we present enough evidence to establish the link between knowledge of the food and muzzle contacts. We find that the model showing the interaction between exposure number and number of monkeys eating on the overall rate of muzzle contacts actually addresses this issue, because we see that when many monkeys are eating during later exposures, when many were indeed knowledgeable, the rate of muzzle contacts is massively decreased. Moreover, if 90% of the individuals present are knowledgeable, then only 10% of the individuals present are naïve, and we show both that knowledgeable individuals are targeted, but also that naïve individuals are initiators.

      • Did these events ever lead to food-sharing (In other words, how likely are they to simply be begging events)?

      We do not observe food-sharing in vervets.

      • Did muzzle-contact quantifiably LEAD to successful extraction of the food? If the authors wish to implicate muzzle-contact in social learning, it is not sufficient to show that naïve individuals were more likely to make muzzle-contact, they must also show that naïve individuals who made more muzzle-contact were more likely to learn the target behaviour.

      We disagree here, because there is a distinction between information acquisition and information use - obtaining olfactory information about a novel resource that conspecifics are eating is not the same as learning a complex tool use behaviour for which detailed observation of a model is required. We are not claiming that muzzle contact is THE mechanism by which the monkeys learn how to eat the food – but we do believe that the clear separation between naïve individuals initiating and knowledgeable individuals being targeted, and the decrease in the rate of this behaviour as groups’ familiarity with the food increases, is good evidence that this behaviour functions to acquire information about a novel food.

      3) Analysis

      There are a number of issues with the current analysis which I strongly recommend be addressed before publication. Some of these are likely to simply require additional details inserted to the manuscript, whereas others would require more substantial changes. I begin with two general points (A & B), before addressing specific sections of the manuscript.

      A) My primary issue with each of the analyses in this manuscript is that the authors have fit complex statistical models for each of their analyses with no steps to ascertain whether these models are a good fit for the data. With a relatively small dataset and a very large number of fixed effects and interactions, there is a considerable risk of overfitting. This is likely to be especially problematic when predictor variables are likely to be intercorrelated (age, sex and rank in the case of this analysis).

      We have now checked for overfitting in our models.

      The most straightforward way to resolve this issue is to take a model-comparison approach. This means fitting either a) a full suite of models (including a 'null' model) with each possible permutation of fixed effects and interactions (since the authors argue their analysis is exploratory) or b) a smaller set of models which the authors find plausible based on their a priori understanding of the study system. These models could then be compared using an information criterion to determine which structure provides the best out-of-sample predictive fit for the data, and the outputs of this model interpreted. Alternatively, a model-averaging approach can be taken, where the effects of each individual predictor are averaged and weighted across all models in the set. Both of these approaches can be performed easily using the R package 'MuMIn'. There are also a number of tutorials that can be found online for understanding and carrying out these approaches.

      Please see our answer at the beginning of the document, detailing how we have updated our models.
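
As a rough illustration of the information-criterion comparison the reviewer describes, the sketch below ranks a candidate set by AIC and converts the differences into Akaike weights. It is written in plain Python rather than with 'MuMIn', and the model names, log-likelihoods and parameter counts are hypothetical placeholders, not results from the manuscript.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to 1 across the model set."""
    best = min(aic_values.values())
    rel = {name: math.exp(-0.5 * (v - best)) for name, v in aic_values.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}


# Hypothetical candidate set: name -> (log-likelihood, number of parameters).
candidates = {
    "null": (-120.0, 2),
    "age + sex": (-112.0, 4),
    "age + sex + rank": (-111.5, 5),
}

aics = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
weights = akaike_weights(aics)
best_model = min(aics, key=aics.get)

for name in sorted(aics, key=aics.get):
    print(f"{name:18s} AIC={aics[name]:.1f} weight={weights[name]:.3f}")
```

In this made-up set the extra rank term improves the log-likelihood too little to offset its parameter penalty, so the simpler model is preferred. The same weights are what a model-averaging approach would use to weight each predictor's estimate across the set.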

      B) It does not seem that interobserver reliability testing was carried out on any of the data used in these analyses. This is a major oversight which should be addressed before publication (or indeed any re-analysis of the data).

      We have added this now and mention it above already.

      Line 444: Much more detail is needed here. What, precisely, was the outcome measure? Was collinearity of predictors assessed? (I would expect Age + Rank to be correlated, as well as Sex + Rank).

      This is now addressed (please see details above) – we use VIFs to assess multicollinearity of predictors in our models and find they are all satisfactory (see R code).
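
For readers unfamiliar with variance inflation factors, here is a minimal sketch of the idea for the simplest case of two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them. The `age` and `rank` values are invented for illustration. The authors' own check is in their R code and may differ in detail; with more than two predictors, r² is replaced by the R² from regressing each predictor on all the others.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Invented data in which older animals tend to hold higher normalised rank.
age = [1, 2, 3, 4, 5, 6, 7, 8]
rank = [0.2, 0.1, 0.4, 0.3, 0.7, 0.5, 0.9, 0.8]

vif = vif_two_predictors(age, rank)
print(f"r = {pearson_r(age, rank):.2f}, VIF = {vif:.2f}")
```

A VIF of 1 means the predictors are uncorrelated, and the value grows without bound as the correlation approaches ±1. That growth is the inflation of coefficient variance the reviewer is concerned about for Age, Sex and Rank.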

      Line 452. A few comments on this muzzle-contact analysis:

      The comments below are a little confusing as some seem to refer to the muzzle-contact rate model (previously line number 452), and some seem to refer to the initiator/receiver model. We have tried to figure out which comments refer to which, and answer accordingly.

      "We investigated muzzle contact behaviour in groups where large proportions of the groups started to extract and eat peanuts over the first four exposures"

      What were the criteria for "a large proportion"?

      All groups are now included in this analysis.

      The text for this muzzle-contact analysis would indicate that this model was not fit with any random effects, which would be extremely concerning. However, having checked the R code which the authors provided, I see that Individual has been fit as a random effect. This should be mentioned in the manuscript. I would also strongly recommend fitting Group (it was an RE in the previous models, oddly) and potentially exposure number as well.

      The model about muzzle contact rate never contained individual as a random effect because individuals are not relevant in this model – it is the number of muzzle contacts occurring during each exposure. However, the reviewer might refer here to the model that we forgot to provide the script for. Nonetheless, we have substantially revised this model, it now (Model 3) includes all groups, and has group as a random effect.

      Following on from this, if the model was fit with individual as a random effect it becomes confusing that Figure 3 which represents this data seemingly does not control for repeated measures (it contains many more datapoints than the study's actual sample size of 164 individuals). This needs to be corrected for this figure to be meaningfully interpretable.

      Figure 3 is not related to the model described in (original) line 452.

      The numbers were referring to the number of muzzle contacts, and this was written in the figure caption. However, we no longer present these details on the new figure (see Fig 4).

      Finally, would it make sense to somehow incorporate the number of individuals present for this analysis? Much like any other social or communicative behaviour, I would predict the frequency of occurrence to depend on how many opportunities (i.e. social partners) there are to engage in it.

      We have now included the number of monkeys eating in our muzzle contact rate model (Model 3) as, upon further thought, we found that this was the issue leading us to want to exclude exposures, and only include the groups where many monkeys were eating. We have resolved this by including all groups and not dropping exposures; instead, we include an interaction between number of monkeys eating and exposure number. We feel this addresses our hypothesis here much more satisfactorily. We hope these updates also address the reviewer's concerns adequately.

      Line 460: "For BD and LT we excluded exposures 4 and 3, respectively, due to circumstances resulting in very small proportions of these groups present at these exposures"

      What was the criterion for a satisfactory proportion? Why was this chosen?

      See above – this is now addressed.

      Line 461: "We ran the same model including these outlier exposures and present these results in the supplementary material (SM3)."

      The results of this supplemental analysis should be briefly stated. Do they support the original analysis or not?

      We no longer present the analysis in this way. We revised the model examining muzzle contact rate substantially and included the number of individuals eating in the model rather than excluding groups where this number was low. The results of the new model show good support for our hypothesis.

      Line 465: "Due to very low numbers of infants ever being targets of muzzle contacts, we merged the infant and juvenile age categories for this analysis."

      This strikes me as a rather large mistake. The research question being asked by the authors here is "How does age influence muzzle-contact behaviour?"

      Then, when one age group (infants) is very unlikely to be a target of muzzle-contact, the authors have erased this finding by merging them with another age category (juveniles). This really does not make sense, and seriously confounds any interpretation of either age category.

      Yes, we agree with this issue and no longer do that. Rather, we remove the infant data from this model, which is now Model 6, because of the large amount of error they introduced into the model due to the small sample size. We show the process in the R code, and we describe our reasons in the text (lines 713-719). Since we are now only comparing within age- and sex-categories (see below), we do not find that this decision introduces any bias.

      Lines 466-474: Why was rank removed for the second and third models? Why is Group no longer a random effect (as in the previous analysis)? The authors need to justify such steps to give the reader confidence in their approach.

      This is now addressed and discussed in descriptions of our new models.

      Furthermore - because of the way this model is designed, I do not think it can actually be used to infer that these groups are preferentially targeted, merely that adult female and adult males are LESS likely to target others than to be targeted themselves, which is a very different assertion.

      Because the specific outcome measure was not described here, this only became apparent to me after inspecting Figure 3, where outcome measure is described as "Probability of (an individual) being a target rather than initiator" - so, it can tell us that adults are more often targeted rather than initiating, but does not tell us if they are targeted more frequently than juveniles (who may get targeted very often, but initiate so often that this ratio is offset).

      We thank the reviewer for noticing this as we had indeed chosen an inappropriate model for what we were intending to measure – this has been addressed now with two additional models (Models 4 and 5; see details at the top of document). We nonetheless found the aspects of this model to still be highly interesting, so have re-framed it to focus on them.

      Lines 467-473: "Our first simple model included individuals' knowledge of the novel food at the time of each muzzle contact (knowledgeable = previously succeeded to extract and eat peanuts; naïve = never previously succeeded to extract and eat peanuts) and age, sex and rank as fixed effects. Individual was included as a random effect. The second model was the same, but we removed rank and added interactions between: knowledge and age; and knowledge and sex. The third model was the same as the second, but we also added a three-way interaction between knowledge, age and sex."

      This is a good example of some of the issues I describe above. What is the justification for each of these model-structures? The addition and subtraction of variables and interactions seems arbitrary to the reader.

      For Model 6, we no longer include rank at all, because we had no hypothesis-driven reason to (see lines 723-725). We now begin with the three-way interaction and remove only this term, because it is not significant and because the model had problems converging due to its complexity. We show this in the R script. We retain only the two separate interactions, and we do not include group as a random effect in this model due to the complexity AND because we do not think there is a theoretical requirement for it to be included here (this is explained in lines 730-735 of the manuscript). We report the results of the three-way interaction in the supplementary material (SM3 Table S2).

      Reviewer 3

      In this study, the authors introduce a novel food that requires handling time to five vervet monkey groups, some of which had previous experience with the food. Through the natural dispersal of males in the population, they show that dispersing individuals transmit behavioral innovations between groups and are often also innovators. They also examine muzzle contact initiations and targets within the groups as a way to determine who is seeking social information on the new food source and who is the target of information seeking. The authors show that knowledgeable adults are more often the target of muzzle contacts compared to young individuals and those that are not knowledgeable.

      This is a very interesting study that provides some novel insights. The methods employed will be useful to others that are considering an experimental approach to their field research. The data set is good and analyzed appropriately and the conclusions are justified. However, there are several areas where the paper could be improved for readers in terms of its clarity.

      1) It wasn't until the Discussion that it became clear to me that the actual physiological and personality traits of dispersers were being linked with innovation. From the Title, Abstract, and Introduction, it seemed as though the focus was on dispersing males bringing their experience with a novel food to a new group to pass it on. I think it needs to be made clear much earlier in the manuscript that the authors are investigating not only the transmission of behavioural adaptation but also how the traits of dispersers might make them more likely to innovate.

      We have now addressed this above.

      2) Early in the paper on line 28, the authors state that continued initiation of muzzle contacts by adult females could have been an effort to seek social information. This is true but another interpretation is that females were imparting or giving social information. It seems important here and elsewhere (lines 322-323) to consider and report the target of these initiations. If these were directed at more knowledgeable individuals, it supports the idea that this was social information seeking. If muzzle contacts were directed to younger or unknowledgeable individuals, it would imply a form of teaching, which is possible but perhaps unlikely, so I think the authors need to be totally clear here.

      We thank the reviewer for pointing this out. We looked into our data and now present Figure 4D, showing that almost all knowledgeable adult females’ muzzle contacts were targeted towards knowledgeable adult males, and we discuss this in the Discussion (lines 499-500).

      3) The argument made on lines 344-350 needs more fleshing out to be convincing or it should be deleted. The link between number of dispersers, social organization, and large geographic range seems a little muddled. There are many dispersing individuals in species that are not typically in large multi-male, multi-female social organizations. Indeed, in many species both sexes disperse. Think of pair living birds where both sexes disperse and geographic range can be enormous. There are also no data or references presented here to show that species in multi-male, multi-female social organizations do have larger geographic ranges than those that are not in these social organizations. It seems to me that, even if this is the case, niche is more important than social organization, for instance not being dependent on forests to constrain much of your range.

      We have removed this section.

    2. Reviewer #2 (Public Review)

      I have separated my issues with the manuscript into three sub-headings (Conceptual Clarity, Observational Detail and Analysis) below.

      1) Conceptual clarity

      There are a number of areas where it would greatly benefit the manuscript if the authors were to revisit the text and be more specific in their intentions. At present, the research questions are not always well-defined, making it difficult to determine what the data is intended to communicate. I am confident all of these issues could be fixed with relatively minor changes to the manuscript.

      For example, Line 104: Question 1 is not really a question, the authors only state that they will "investigate innovation and extraction of eating the food", which could mean almost anything.

      Question 2a (line 98) is also very vague in its wording, and I'm left unclear as to what the authors were really interested in or why. This is not helped by line 104, which refuses to make predictions about this research question because it is "exploratory". Empirical predictions are not simply placing a bet on what we think the results of the study will be, but rather laying out how the results could be for the benefit of the reader. For instance, if testing the effects of 10 different teaching methods on language acquisition-rate: Even if we have no a priori idea of which method will be most effective, we can nevertheless generate competing hypotheses and describe their corresponding predictions. This is a helpful way to justify and set expectations for the specific parameters that will be examined by the methods of the study. In the current paper, the authors in fact had some very clear a priori expectations going into this study that immigrant males would be vectors of behavioural transmission (that much is clear from the rest of the introduction and from the parameters used in their analysis, which were not chosen at random).

      The multiple references to 'long-lived' species in the abstract (line 16) and introduction (lines 39, 56) are a bit confusing given the focus of this study. Although such categorisations are arbitrary by nature (a vervet is certainly long-lived compared to a dragonfly), I would not typically put vervet monkeys (or marmosets, line 62) in the same category as apes (references 8 and 9) or humans (line 62) in this regard. This contributes a little towards the lack of overall conceptual focus for the manuscript: beginning in this fashion suggests the authors are building a "comparative evolutionary origins" story, hinting perhaps at the phylogenetic relevance of the work to understanding human behaviour, but the final paragraph of the study contextualises the findings only in terms of their relevance to feeding ecology and conservation efforts. I would recommend that the authors think carefully about their intended audience and tailor the text accordingly. This is not to say that readers interested in human evolution will not be interested in conservation efforts, but rather that each of these aspects should be represented in each stage of the manuscript (otherwise - conservationists may not read far into the Introduction, and cultural evolution fans will be left adrift in the Conclusion).

      2) Observational detail

      There are a number of areas of the manuscript which I found to be lacking in sufficient detail to accurately determine what occurred in these experimental sessions, making the data difficult to interpret overall. All of this additional information ought to be readily available from the methods used (the experiments were observed by 3-5 researchers with video cameras (line 341)) and is all of direct relevance to the research questions set out by the authors.

      While I appreciate that it will take quite a bit of work to extract this information, I am certain that it would greatly improve the robustness and explanatory power of this study to do so.

      The data on who was first to innovate/demonstrate successful extraction of the food in each group (Question 1) and subsequent uptake (Question 2), as well as the actual mechanism by which that uptake occurred (the authors strongly imply social learning in their Discussion, but this is never directly examined) is difficult to interpret based on the information presented. Some key gaps in the story were:

      - Which/how many individuals encountered the food and in what order? I.e., were migrants/innovators simply the first to notice the food?
      - Did any individuals try and fail to extract the food before an "innovator" successfully demonstrated?
      - How many tried and failed to extract the nuts before and after observing effective demonstrators?
      - Were individuals who observed others interact with the food more likely to approach and/or extract it themselves?
      - Did group-members use the same methods of extraction as their 'innovators'?
      - How many tried and succeeded without having directly observed another individual do so (i.e. 'reinvention' as per Tennie et al.)?

      The connective tissue between the research questions set out by the authors is clearly social learning. In short: the thesis is that Migrants/Innovators bring a novel behaviour to the group, then there is 'uptake' (social learning), which may be influenced by demographic factors and muzzle-contact (biases + mechanisms). Given this focus (e.g. lines 224-264 of the Discussion), I would expect at least some of the details above to be addressed in order to provide robust support for these claims.

      Question 2a (Lines 136-146): This data is hard to interpret without knowing how much of the group was present and visible during these exposures.

      For example: 9% uptake in NH group does not sound impressive, but if only 10% of the total group were present while the rest were elsewhere, then this is 90% of all present individuals. Meanwhile, if 100% of BD group were present and only experienced 31% uptake, then this is quite a striking difference between groups.
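      The arithmetic behind this point can be sketched as follows. This is only an illustration: the function name and the head counts are hypothetical, chosen to reproduce the percentages quoted above, and are not taken from the authors' data.

```python
def uptake_rates(n_learners, n_present, group_size):
    """Return uptake as a share of the whole group and of those present."""
    if not 0 < n_present <= group_size:
        raise ValueError("n_present must be between 1 and group_size")
    if not 0 <= n_learners <= n_present:
        raise ValueError("n_learners must be between 0 and n_present")
    return n_learners / group_size, n_learners / n_present


# Hypothetical counts: a group of 100 with only 10 animals present (NH-like case)
nh_of_total, nh_of_present = uptake_rates(n_learners=9, n_present=10, group_size=100)

# Hypothetical counts: a group of 100 with everyone present (BD-like case)
bd_of_total, bd_of_present = uptake_rates(n_learners=31, n_present=100, group_size=100)

print(f"NH-like: {nh_of_total:.0%} of group, {nh_of_present:.0%} of those present")
print(f"BD-like: {bd_of_total:.0%} of group, {bd_of_present:.0%} of those present")
```

      Under these assumed counts, the group with the lower headline uptake has the higher uptake among the animals that actually had the opportunity to learn, which is why the number present matters for interpretation.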

      Of course, there is also an issue of how many individuals can physically engage with the novel food even if they want to - the presence of dominant individuals, steepness of hierarchy within that group, etc, will significantly influence this (and is all of interest with regards to the authors' research questions).

      Muzzle-contact behaviour: The authors use their data to implicate muzzle-contact in social learning, but this seems a leap from the data presented (some more on this in the Analysis section).

      For example:
      - What is the role of kinship in these events?
      - Did they occur when the juvenile had free access to the food (i.e. not likely to be chased off by a feeding adult)?
      - Did they primarily occur when adults had a mouthful of food? (i.e. could it simply be attempted pilfering/begging)
      - What proportion of PRESENT (not total) individuals were naïve and knowledgeable in each group for each trial (if 90% present were knowledgeable, then it is not surprising that they would be targeted more often)?
      - Did these events ever lead to food-sharing (In other words, how likely are they to simply be begging events)?
      - Did muzzle-contact quantifiably LEAD to successful extraction of the food? If the authors wish to implicate muzzle-contact in social learning, it is not sufficient to show that naïve individuals were more likely to make muzzle-contact, they must also show that naïve individuals who made more muzzle-contact were more likely to learn the target behaviour.

      3) Analysis

      There are a number of issues with the current analysis which I strongly recommend be addressed before publication. Some of these are likely to simply require additional details inserted to the manuscript, whereas others would require more substantial changes. I begin with two general points (A & B), before addressing specific sections of the manuscript.

      A) My primary issue with each of the analyses in this manuscript is that the authors have fit complex statistical models for each of their analyses with no steps to ascertain whether these models are a good fit for the data. With a relatively small dataset and a very large number of fixed effects and interactions, there is a considerable risk of overfitting. This is likely to be especially problematic when predictor variables are likely to be intercorrelated (age, sex and rank in the case of this analysis).

      The most straightforward way to resolve this issue is to take a model-comparison approach. This would involve fitting either a) a full suite of models (including a 'null' model) with each possible permutation of fixed effects and interactions (since the authors argue their analysis is exploratory) or b) a smaller set of models which the authors find plausible based on their a priori understanding of the study system. These models could then be compared using an information criterion to determine which structure provides the best out-of-sample predictive fit for the data, and the outputs of this model interpreted. Alternatively, a model-averaging approach can be taken, where the effects of each individual predictor are averaged and weighted across all models in the set. Both of these approaches can be performed easily using the R package 'MuMIn'. There are also a number of tutorials that can be found online for understanding and carrying out these approaches.
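      As a rough illustration of the information-criterion comparison described above, the sketch below ranks a small candidate set by AIC and converts the AIC differences into Akaike weights. It is written in plain Python rather than R, it does not use or reproduce 'MuMIn', and the model names, log-likelihoods, and parameter counts are invented for the example; they are not results from the manuscript.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Convert a {model: AIC} mapping into normalised Akaike weights."""
    best = min(aic_values.values())
    relative = {name: math.exp(-(value - best) / 2) for name, value in aic_values.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}


# Hypothetical candidate set: (maximised log-likelihood, number of parameters)
candidates = {
    "null": (-120.0, 2),
    "knowledge": (-112.0, 3),
    "knowledge_x_age": (-110.5, 5),
}

aics = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
weights = akaike_weights(aics)
best_model = min(aics, key=aics.get)

for name in sorted(aics, key=aics.get):
    print(f"{name:16s} AIC={aics[name]:.1f} weight={weights[name]:.3f}")
```

      With these made-up inputs, the more complex interaction model improves the likelihood slightly but is penalised for its extra parameters, so the simpler "knowledge" model ranks first. The same weights are what a model-averaging approach would use to weight each predictor's estimate across the set.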

      B) It does not seem that interobserver reliability testing was carried out on any of the data used in these analyses. This is a major oversight which should be addressed before publication (or indeed any re-analysis of the data).

      Line 444: Much more detail is needed here. What, precisely, was the outcome measure? Was collinearity of predictors assessed? (I would expect Age + Rank to be correlated, as well as Sex + Rank).

      Line 452. A few comments on this muzzle-contact analysis:

      "We investigated muzzle contact behaviour in groups where large proportions of the groups started to extract and eat peanuts over the first four exposures"

      What was the criterion for "a large proportion"?

      The text for this muzzle-contact analysis would indicate that this model was not fit with any random effects, which would be extremely concerning. However, having checked the R code which the authors provided, I see that Individual has been fit as a random effect. This should be mentioned in the manuscript. I would also strongly recommend fitting Group (it was an RE in the previous models, oddly) and potentially exposure number as well.

      Following on from this, if the model was fit with individual as a random effect it becomes confusing that Figure 3 which represents this data seemingly does not control for repeated measures (it contains many more datapoints than the study's actual sample size of 164 individuals). This needs to be corrected for this figure to be meaningfully interpretable.

      Finally, would it make sense to somehow incorporate the number of individuals present for this analysis? Much like any other social or communicative behaviour, I would predict the frequency of occurrence to depend on how many opportunities (i.e. social partners) there are to engage in it.

      Line 460: "For BD and LT we excluded exposures 4 and 3, respectively, due to circumstances resulting in very small proportions of these groups present at these exposures"

      What was the criterion for a satisfactory proportion? Why was this chosen?

      Line 461: "We ran the same model including these outlier exposures and present these results in the supplementary material (SM3)."

      The results of this supplemental analysis should be briefly stated. Do they support the original analysis or not?

      Line 465: "Due to very low numbers of infants ever being targets of muzzle contacts, we merged the infant and juvenile age categories for this analysis."

      This strikes me as a rather large mistake. The research question being asked by the authors here is "How does age influence muzzle-contact behaviour?"

      Then, when one age group (infants) is very unlikely to be a target of muzzle-contact, the authors have erased this finding by merging them with another age category (juveniles). This really does not make sense, and seriously confounds any interpretation of either age category.

      Lines 466-474: Why was rank removed for the second and third models? Why is Group no longer a random effect (as in the previous analysis)? The authors need to justify such steps to give the reader confidence in their approach.

      Furthermore - because of the way this model is designed, I do not think it can actually be used to infer that these groups are preferentially targeted, merely that adult female and adult males are LESS likely to target others than to be targeted themselves, which is a very different assertion.

      Because the specific outcome measure was not described here, this only became apparent to me after inspecting Figure 3, where outcome measure is described as "Probability of (an individual) being a target rather than initiator" - so, it can tell us that adults are more often targeted rather than initiating, but does not tell us if they are targeted more frequently than juveniles (who may get targeted very often, but initiate so often that this ratio is offset).
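      A small numerical sketch may make this distinction concrete. The counts below are entirely hypothetical and serve only to show how the two quantities can point in opposite directions; the function name is likewise an illustrative choice.

```python
def target_share(times_targeted, times_initiated):
    """Probability of being the target rather than the initiator, given involvement."""
    total = times_targeted + times_initiated
    if total == 0:
        raise ValueError("individual or class was never involved in a muzzle contact")
    return times_targeted / total


# Hypothetical muzzle-contact counts per age class (not the authors' data)
counts = {
    "adult": {"targeted": 30, "initiated": 10},
    "juvenile": {"targeted": 60, "initiated": 140},
}

shares = {
    age: target_share(c["targeted"], c["initiated"]) for age, c in counts.items()
}

for age, c in counts.items():
    print(f"{age}: targeted {c['targeted']} times, target share {shares[age]:.2f}")
```

      In this made-up case adults have the higher target share (0.75 against 0.30), yet juveniles are targeted twice as often in absolute terms. A target-versus-initiator outcome therefore cannot, on its own, show which class is preferentially targeted.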

      Lines 467-473: "Our first simple model included individuals' knowledge of the novel food at the time of each muzzle contact (knowledgeable = previously succeeded to extract and eat peanuts; naïve = never previously succeeded to extract and eat peanuts) and age, sex and rank as fixed effects. Individual was included as a random effect. The second model was the same, but we removed rank and added interactions between: knowledge and age; and knowledge and sex. The third model was the same as the second, but we also added a three-way interaction between knowledge, age and sex."

      This is a good example of some of the issues I describe above. What is the justification for each of these model-structures? The addition and subtraction of variables and interactions seems arbitrary to the reader.

    1. Author Response

      Reviewer #1 (Public Review):

      This is an extremely well-done study, revealing a fascinating phenotype of the mes-4 mutant, which they show upregulates X-linked genes, leading to PGC death. These X-linked genes are mostly oogenesis genes, upregulation of which likely impedes normal proliferation of PGCs. The results are very concrete, support their conclusion, and contribute significantly to the field. I do not have any major concerns except for a couple of conceptual issues. First, the title 'germline immortality' does not seem to be well aligned with the results. It is not wrong that PGCs die in the mes-4 mutant, and thus the germline is 'mortal': however, the term 'germline immortality' implies multi-generational passages of germline, and the data in the present study, where mutant PGCs just die in the offspring, do not necessarily point to 'germline immortality' per se. So, I suggest changing the title to better reflect the contents of the paper.

      Good point. We changed germline immortality to germline survival and/or development throughout the paper.

      Second, although the authors speculate (in the discussion) why X activation is toxic to germ cells (discussing that upregulated X-linked genes are oogenesis genes, whose precocious activation is toxic to PGCs), there is not sufficient discussion as to why the effect is mostly limited to X chromosome, and why mes-4 is specifically involved in this. Is it because all oogenesis genes are concentrated on X chromosome? (likely not). Are autosomal genes that are upregulated in mes-4 mutant also oogenesis genes? Is this related to dosage compensation? I would like to see fuller discussions as to why X chromosome requires special regulation, also discussing the role of mes-4 in this context. I understand that the authors might have refrained from expanding discussions on matters that do not have any data, but without this discussion, I feel that many readers will be left wondering 'why?'.

      As noted in Point #5 above, we added to Discussion whether up-regulation of X genes in mes-4 mutant PGCs and EGCs reflects a defect in dosage compensation or a defect in keeping the oogenesis program (which is enriched for X-linked genes) quiet in the nascent germline (see lines 604-630). Based on new analyses showing up-regulation of oogenesis genes (on the X and autosomes) in mes-4 and PRC2 nascent germlines and the points in Discussion, we favor the view that the essential function of MES-4 and PRC2 is to repress X-linked oogenesis genes in PGCs and EGCs (see Figures 6 and 7, associated figure supplements, and lines 389-417).

      Reviewer #2 (Public Review):

      This manuscript makes substantial progress in resolving a long-standing mystery regarding the precise role of the histone methyltransferase MES-4 in promoting germline development. MES-4 maintains the histone modification H3K36me3 and germ cell survival, but prior evidence was unable to distinguish among several possibilities for target pathways. This paper utilizes a transcriptional profiling approach at the critical time of germline development to definitively demonstrate that the essential function of MES-4 is to repress X gene expression in germ cells. This result is surprising because X repression is an indirect effect of MES-4 activity (MES-4 does not localize to the X), while the direct effect of maintaining germline gene expression is not essential. To buttress this finding, the authors also utilize a series of elegant genetic experiments to independently test whether expression from the X is sufficient to cause germ cell degeneration. They then go further to identify a single X-linked target, lin-15b, as a primary contributor to the inappropriate X-linked gene expression in mes-4 mutants, by showing that loss of lin-15b activity rescues both the germline degeneration and X mis-expression of mes-4 mutants. Finally, the authors demonstrate that PRC2, the H3K27me3 histone methyltransferase and MRG-1, a candidate H3K36me3 effector protein, are also involved in promoting X silencing through lin-15b.

      The manuscript's strengths lie in the development or application of novel techniques, including the profiling of individual pairs of PGCs (a non-trivial advancement), as well as some very well-designed and conceptually innovative genetic assays. These were used to address specific and important gaps in knowledge regarding the phenotype of mes-4, which had been elusive despite having been studied for almost 30 years. Although specific to C. elegans in some ways, the findings are clearly relevant to conserved regulatory events, such as epigenetic memory mechanisms and establishment of opposing chromatin states. Thus, this work provides a substantial advance in the field overall.

      One limitation of this study is the lack of clarity about the conclusions regarding the relationship between the two H3K36me3 histone methyltransferases mes-4 and met-1, and between X vs autosomal gene expression. The authors do not precisely state what genes (X or A) are affected in the met-1 and mes-4 mutants. Ultimately, this confusion muddles the final message of X chromosome upregulation being the critical contributor to the mes-4 germline degeneration phenotype. The experiment presented in figure 3B indicates that loss of mes-4 or met-1 is sufficient to prevent germline development even when the Xs are repressed, indicating that failure to activate autosomal gene expression is also an underlying cause of the degeneration. Perhaps this cannot be definitively concluded without directly assessing met-1 and met-1;mes-4 mutant PGCs (or EGCs) for gene expression changes. If technically possible, this would be a very valuable experiment to directly examine autosomal gene expression changes in the double mutant.

      We profiled met-1 PGCs and observed very few mis-regulated genes (Figure 7 – supplemental figure 1). We tried to profile met-1; mes-4 double mutant PGCs, which completely lack both MET-1 and MES-4 and inherit chromosomes that lack H3K36me3. That was not feasible, due to the high level of embryonic lethality and rapid deterioration of PGCs dissected from met-1; mes-4 double mutant larvae. Notably, this demonstrates that germlines that lack both maternal K36me3 HMTs are sicker than those that lack just 1 of the HMTs. The high degree of embryo lethality suggests an essential function for MET-1 and MES-4 in the soma. As requested, we generated and included a list of X and autosomal genes mis-regulated in met-1, mes-4, and other mutant PGCs (see Figure 7—figure supplement 1).

      The sterility of hermaphrodites with a met-1; mes-4 mutant XspXsp germline and lacking either maternal MES-4 or maternal MET-1 may be due to mis-regulation of autosomal genes, or it may reflect that the X chromosomes are not repressed in met-1; mes-4 XspXsp germlines that lack H3K36me3. To test that, we would need to profile those XspXsp PGCs. It is not feasible to identify mutant F1 larvae with XspXsp PGCs immediately after hatching, which is required for transcript profiling. We think that the main message from analyzing met-1; mes-4 mutant XspXsp germlines -- that inherited H3K36me3 marking is not critical for germline development but re-establishment of marking is important and requires both enzymes – does not require our delving into the cause of sterility of mutant XspXsp germlines lacking MET-1.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Autophagy of the endoplasmic reticulum (ER-phagy) is a fundamental process that is essential for maintaining cellular homeostasis and quality control. We recently identified a novel mechanism regulating ER-phagy in both plants and animals that is based on the ubiquitin-like protein modifiers ATG8 and UFM1, and the ER-associated protein, C53. Here, we use a combination of evolutionary, biochemical, and physiological experiments to investigate the evolution and regulation of this process. We reveal the dynamic evolution of UFM1 and the ubiquity of C53-mediated autophagy across eukaryotes. Leveraging these results, we then identify an ancestral molecular toggle switch, mediated by shuffled ATG8-interacting motifs (sAIMs), that controls C53-mediated autophagy through competitive binding between UFM1 and ATG8. These findings provide new insights into the evolution of UFM1, reveal a conserved mechanism for the regulation of ER-phagy, and raise new and exciting hypotheses about the diversity and function of the UFMylation pathway. We believe that this work will be of interest to those studying autophagy and cellular stress response but will also serve as an interesting example of the benefits of combining evolutionary analyses with biochemical and cellular experiments.

      Our manuscript has been reviewed by three reviewers through Review Commons, whose comments, and our responses, can be found below. Two of the reviewers (Reviewers 1 and 3) were supportive of our work and its significance, whereas Reviewer 2 questioned the novelty of our findings.

      Each of the reviewers’ comments can be addressed through a few supporting experiments as well as an improved manuscript which clarifies the novelty and significance of our results. While being supportive of our work, Reviewer 1 requested minor additional experiments to support our mechanistic conclusions and Reviewer 3 suggested that we expand our characterizations of C53 function to additional eukaryotic supergroups. These experiments are straightforward to perform, the materials and protocols to accomplish them are already established, and our overall conclusions are robust to the resulting outcomes.

      In contrast, Reviewer 2 did not suggest any additional experiments but rather challenged the novelty of our results as well as some of our interpretations. In particular, Reviewer 2 was uncertain of how our phylogenomic analyses built upon a previous study, published in 2014, which used comparative genomics to identify ubiquitin-related machinery across eukaryotes. Although it was an oversight to not reference this study (we cited a more recent article showing the same results), we were aware of their conclusions that UFMylation was present in the last eukaryotic common ancestor but absent in Fungi. We now clearly outline, both below and within the manuscript, our key phylogenomic results. These were acquired after implementing more advanced and comprehensive comparative genomic searches which allowed us to identify dynamic patterns in UFMylation evolution and permitted co-evolutionary analyses which were not only important for informing our experimental hypotheses but generated new functional questions. Our phylogenomic analyses are also linked to biochemical and physiological data, providing, for the first time, experimental support for our conclusions regarding UFMylation evolution. Similarly, Reviewer 2 suggested that our mechanistic results were an incremental extension of our previous work. Although our current work does of course build on our initial identification of C53-mediated autophagy, this manuscript provides novel insights into the importance and function of this process by revealing its ubiquity across eukaryotes and by characterizing the mechanistic details of its regulation. Ultimately, we disagree with Reviewer 2 but appreciate that this misunderstanding likely resulted from a lack of context and clarity in our manuscript which we have now resolved.

      As outlined in detail below, we will address the reviewers’ concerns through additional experiments, analyses, and improvements to the text.

      Thank you for considering our manuscript. We look forward to hearing from you.

      Description of the planned revisions

      We thank the reviewers for carefully evaluating our manuscript and for providing us with an opportunity to respond to their suggestions and criticisms. As you can see below in our point-by-point response, we address each of the points raised by the reviewers through the addition of supporting experiments, analyses, and an improved text. Altogether, we think these additional experiments and textual changes will significantly improve the manuscript. Therefore, we would like to thank all the reviewers and editors for their time and input.

      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript Picchianti et al. provide novel insights into the interaction of C53 with UFM1 and ATG8. Initially, the authors show that protein modification by UFM1 exists in the unicellular organism Chlamydomonas reinhardtii. To that end they demonstrated that pure Chlamydomonas UBA5, UFC1 and UFM1 proteins can charge UFC1. Then, they showed that C53 interacts with ATG8 and UFM1. Specifically, they found that the sAIMs are essential for the interaction with UFM1, while substituting this motif with a canonical AIM prevents the binding of UFM1 but not of ATG8. Since binding of C53 to ATG8 recruits the autophagy machinery, the authors suggest that ufmylation of RPL26 releases UFM1 from C53, which allows the binding of ATG8. Overall, the authors demonstrate that C53, which forms a complex with UFL1, connects protein ufmylation and autophagy through its ability to bind both UBLs. Here the authors revisited the assumption that only multicellular organisms have the UFM1 system. Using bioinformatic tools they show that it also exists in unicellular organisms. Also, they show that in some organisms the E3 complex of UFL1, UFBP1 and C53 exists but not UBA5, UFC1 or UFM1. This is a very interesting observation that suggests an additional role for this complex. In Fig 1C the authors show that in Chlamydomonas RPL26 undergoes ufmylation. Please use IP against RPL26 and then a blot with anti-UFM1. From the current experiment it is not clear how the authors know that it is indeed RPL26 that undergoes ufmylation.

      RPL26 is highly conserved across eukaryotes, so by comparing our western blots with previous studies (Walczak et al., 2019; Wang et al., 2020), we concluded that these bands corresponded to UFMylated RPL26. However, we agree with the reviewer that we need to confirm the identity of RPL26 with additional assays. Since the submission of the manuscript, we have tested RPL26 antibodies in Chlamydomonas and shown that they work well. We will therefore update our figure with the confirmation westerns.

      In the second part of the manuscript the authors characterize the interaction of C53 with ATG8 and UFM1. This is a continuation of their previous published work (Stephani et al, 2020). Here the reviewer thinks that further data on the binding of these proteins to C53 is required. Specifically, defining the Kd of these interactions using ITC or other biophysical method can contribute to the study.

      We agree with the reviewer. To obtain the KD values, we will perform ITC experiments with C53 wild type, a C53 sAIM mutant and a C53 cAIM variant titrated with ATG8 and UFM1.
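      To illustrate why measured KD values matter for a competition model, the following is a minimal sketch, not the analysis planned for the manuscript. It assumes a single shared binding site, simple 1:1 competitive binding at equilibrium, and free ligand concentrations in excess of the receptor; the C53 IDR contains several motifs, so this is a deliberate simplification. The function name, parameter names, and numbers are hypothetical.

```python
def competitive_occupancy(atg8_conc, atg8_kd, ufm1_conc, ufm1_kd):
    """Fractional occupancy of one shared site by two competing ligands.

    Assumes 1:1 binding at equilibrium and free-ligand concentrations
    (ligands in excess). Concentrations and KD values must share units.
    Returns (fraction bound by ATG8, fraction bound by UFM1).
    """
    if atg8_kd <= 0 or ufm1_kd <= 0:
        raise ValueError("KD values must be positive")
    if atg8_conc < 0 or ufm1_conc < 0:
        raise ValueError("concentrations must be non-negative")
    atg8_term = atg8_conc / atg8_kd
    ufm1_term = ufm1_conc / ufm1_kd
    denominator = 1.0 + atg8_term + ufm1_term
    return atg8_term / denominator, ufm1_term / denominator


# Hypothetical values only (arbitrary but consistent units).
atg8_fraction, ufm1_fraction = competitive_occupancy(
    atg8_conc=10.0, atg8_kd=10.0, ufm1_conc=5.0, ufm1_kd=5.0
)
print(atg8_fraction, ufm1_fraction)
```

      Under these assumptions, raising the concentration of one ligand or lowering its KD shifts occupancy away from the other, which is the qualitative behaviour that quantitative binding measurements would allow one to test.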

      Under normal conditions, the authors suggest that C53 binds UFM1 and that this keeps it inactive. The reviewer thinks that this claim needs further support. Using IP (maybe with a crosslinker), the authors can show that C53, under normal conditions, binds more UFM1 than ATG8. Also, since the interaction of UFM1 with C53 is noncovalent, it would be nice to show how alterations in UFM1 expression levels affect the activation of C53.

      We thank the reviewer for this suggestion. Since the submission of the manuscript, we have obtained UFM1 overexpression lines. We will pull on C53 using our C53 antibody and check for ATG8 levels in wild type and UFM1 overexpressing lines under normal and stress conditions. We think this will show how alterations in UFM1 levels can affect C53 activation.

      Finally, the authors suggest that ufmylation of RPL26 allows binding of ATG8 to C53 and this, in turn, leads to C53 activation. Can the authors show that in cells lacking UBA5, under normal conditions or with tunicamycin treatment, ATG8 does not activate C53 because UFM1 does not leave C53?

      In Stephani et al., we showed that C53-mediated autophagy requires the UFMylation machinery. In ufl1 and ddrgk1 mutants, C53 becomes insensitive to ER stress. However, to supplement these results, we will perform autophagic flux assays using the native C53 antibody to test autophagic degradation of C53 in a uba5 and ufc1 mutant under normal and tunicamycin stress conditions. The uba5 mutant that we have is a knockdown, so that’s why we will include the ufc1 mutant in our experiments.

      Significance

      This manuscript advances our understanding of the connection of ufmylation to autophagy which is mediated by C53.

      Thank you!

      Referee #2

      Evidence, reproducibility and clarity

      The manuscript from Picchianti et al. seeks to define the role of CDK5RAP3 (hereinafter referred to as C53) during autophagy and its interplay with UFMylation. Together with UFL1 and DDRGK1, C53 is a component of a trimeric UFM1 E3 ligase complex that modifies the 60S ribosomal protein RPL26 at the endoplasmic reticulum (ER) surface upon ribosomal stalling (among other proposed functions that are not addressed). Several previous studies have implicated the UFMylation pathway in autophagy or ER-phagy although a non-autophagic fate for UFM1-tagged ribosomal subunits has also been reported. A previous study from the same authors (PMID: 32851973) identified an intrinsically disordered region (IDR) in C53 that is necessary and sufficient for interaction between C53 and autophagy receptor, ATG8. They reported that this IDR comprises four non-canonical ATG8 interacting motifs (AIM), named shuffled AIMs (sAIMs) and showed that combinatorial mutagenesis of sAIM1, sAIM2, and sAIM3 abrogates ATG8 binding. A similar effect was observed for plant C53, though an additional canonical AIM (cAIM) in the C53 IDR had to be mutated to completely abolish C53 and ATG8 interaction. The earlier study reported that C53 IDR also interacts with UFM1, and this interaction can be disrupted in vitro by adding increasing concentrations of ATG8, suggesting that ATG8 and UFM1 may compete with one another for C53 binding. The present paper attempts to build on this previous work by using phylogenomics to infer a coevolutionary relationship between UFMylation machinery and sAIMs in C53, which the authors argue, constitutes further evidence of the primary importance of a role for UFMylation in ER homeostasis. The manuscript includes a lot of biochemical data using variations of in vitro and in vivo pull-down experiments to define the roles of individual AIMs in mediating the binding of C53 to ATG8 and to UFM1.
      They also use NMR spectroscopy in an attempt to define the structural basis of the UFM1 and ATG8 binding to C53, concluding that plant C53 interacts with UFM1 mainly through sAIM1, while interaction with ATG8 requires cAIM as well as sAIM1 and sAIM2. Finally, the authors attempt to contextualize these findings by conducting studies on Arabidopsis mutants, showing that replacing sAIMs with cAIMs causes increased sensitivity to ER stress and apparently increases formation of C53 intracellular puncta that may colocalize with ATG8. From these data the authors concluded that the dual ATG8 and UFM1 binding of C53 IDR regulates C53 recruitment to autophagosomes in response to ER stress. Major Issues: 1) The phylogenomics analysis conclusion that UFM1 is common in unicellular lineages and did not evolve in multicellular eukaryotes is not novel, as another comprehensive analysis of UFM1 phylogeny, published eight years ago - in 2014 - by Grau-Bové et al. (PMID: 25525215), also reported that UFM1, UBA5, UFC1, UFL1 and UFSP2 were likely present in LECA and lost in Fungi. Although the phylogenomic analysis by Picchianti et al. is also extended to DDRGK1 and C53 proteins, and some parasitic and algal lineages, their findings are incremental. Their proposed coevolution of sAIM and UFM1 is based on presence-absence correlation observed within five species (i.e., Albugo candida, Albugo laibachii, Piromyces finnis, Neocallimastix californiae, Anaeromyces robustus). However, this coevolutionary relationship must be further investigated by substantially increasing the taxonomic sampling within the UFM1-lacking group.

      We were aware that previous studies had investigated the distribution of UFMylation proteins across eukaryotes and that these analyses had predicted the presence of UFMylation in LECA and subsequent loss in Fungi. We included a more recent citation noting this (Tsaban et al. 2021) but apologise for not citing Grau-Bové et al. (2014), which we have now included. We must emphasize that our results are not incremental. Although we had made a point of emphasizing the presence of UFM1 in LECA, this was to counter a recent and highly cited paper in the field which claimed that UFMylation evolved in plants and animals (Walczak et al. 2019). Below we note the novel and important results from our phylogenomic analyses: 1. We used improved taxonomic sampling and more advanced comparative genomics methods to identify UFMylation components sensitively and specifically across eukaryotes. This involved the inclusion of additional eukaryotic genomes, phylogenetic annotation of orthologs, and genomic searches to complement proteome predictions. These methods are essential for accurately identifying UFMylation components and yield more robust results than using sequence similarity clustering (Tsaban et al. 2021) or un-curated Pfam HMMER search results (Grau-Bové et al. 2014). 2. By placing our UFMylation reconstructions in a modern phylogenetic context we were not only able to support previous observations which noted the presence of UFM1 in LECA and its loss in Fungi (Grau-Bové et al. 2014) and Plasmodium (Tsaban et al. 2021), but also to identify novel patterns in the evolution of UFMylation. This included the observation of recurrent losses in diverse but trophically-related lineages (such as algae and parasites) and revealed the retention of certain UFMylation components in the absence of UFM1. 
      We identified the frequent co-retention of UFL1 and DDRGK1 following UFM1 loss in multiple eukaryotic groups, including Fungi, which were previously thought to be devoid of UFMylation machinery. These previously uncharacterized patterns suggest that these proteins could have alternative functions and may be functionally associated with life history. These results therefore expand on and add complexity to our understanding of the evolution of UFMylation. 3. By conducting a comprehensive and accurate survey of UFMylation components we were able to use our data to examine co-evolutionary trends between C53 and UFM1, which would have been incomplete and inaccurate using previously curated datasets. As the reviewer noted, only five species were identified that encoded C53 but lacked UFM1. This is not a reflection of insufficient taxon sampling, but rather of the strong co-evolution between C53 and UFM1 (i.e., when UFM1 is lost, C53 is almost always lost as well). We attempted to identify additional cases by searching hundreds of fungal and oomycete genomes as well as those from other eukaryotes, but no other species were found. We agree with the reviewer that additional taxa would have made our analyses stronger, but importantly, we do not rely on genomic correlations to infer function. Rather, we use these correlations to generate functional hypotheses, which we then test experimentally. In this way, our conclusions do not depend on the strength of the correlations. We have now revised the manuscript to include additional context (including citations) and have improved the clarity of the text to better convey the novelty of our findings.

      2) The manuscript presents an overwhelming amount of biochemical and structural data obtained from a variety of protein binding techniques (i.e., NMR spectroscopy, in vitro GST-pulldown, fluorescence microscopy-based on-bead binding assays, and native mass-spectrometry). The results are poorly explained and not organized in a logical manner. Moreover, no attempt was made to explain the rationale behind using one technique over the other or how one method complements another to build a stronger conclusion than any individual approach. Given that none of the methods employed report quantitative measurement of binding affinities between C53 IDR and UFM1 or ATG8, it is not clear how the data presented in this manuscript contribute to our understanding of the proposed competition model for UFM1 and ATG8 binding to C53 IDR. To conclude that an interaction is "stronger" or "weaker" it is necessary to measure equilibrium binding constants. Fortunately, there are suitable techniques, including surface plasmon resonance (SPR), microscale thermophoresis (MST), fluorescence anisotropy, or calorimetry that are available to dissect these complex competitive binding interactions and to build models.

      We thank the reviewer for their suggestion. Although we attempted to describe the rationale behind each experiment (please see lines 135-137; on-bead binding assays, lines 154-157; NMR, lines 177-181), we agree that the volume of data and variety of techniques warrant additional explanation. We will revise the manuscript to further explain our rationale for using each of the different approaches. As we noted above in our response to Reviewer 1, we will also perform relevant ITC binding assays to quantify the interaction between C53, ATG8, and UFM1.

      3) The NMR studies have the potential to dissect the types of dynamic binding inherent in unstructured proteins. However, the abundant NMR data presented combined with the aforementioned binding studies, remarkably, do not seem to significantly advance our understanding of how the system is organized or even how UFM1 and ATG8 bind C53, beyond the rather vague and somewhat circular conclusion stated in the abstract: "...we confirmed the interaction of UFM1 with the C53 sAIMs and found that UFM1 and ATG8 bound the sAIMs in a different mode." Or on line 165: "Altogether these results suggested that ATG8 and UFM1 bind the sAIMs within C53 IDR, albeit in a different manner".

      We agree that NMR has the potential to dissect the complex binding interactions between UFM1, ATG8, and C53, but disagree with the reviewer’s interpretation that our NMR data fail to achieve this. In summary, our NMR data: 1. Revealed the structural basis of the interaction of the C53 IDR with ATG8 and UFM1 at atomic resolution by showing that UFM1 binds preferentially to sAIM1 in fast-intermediate exchange (Fig. 4 and Fig. S7B), whereas ATG8 binds cAIM in slow-intermediate exchange and, once cAIM is occupied, binds sAIM1 and sAIM2 with lower affinity in fast-intermediate exchange (Fig. 4 and Fig. S7D). 2. Determined conformational changes in the C53 IDR upon binding of ATG8, but not UFM1 (Fig. S7E), which lead to increased dynamics in distinct regions of the C53 IDR. These data could explain how binding of the first ATG8 would trigger C53-dependent recruitment of the tripartite complex to autophagosomes. 3. Identified how UFM1 binds to an atypical hydrophobic patch in the C53 sAIM, similar to what was shown for the UBA5 LIR/UFIM. Together, these results shed light on how both UBLs interact with C53: sAIM1 is the highest-affinity binding site for UFM1, while ATG8 binds cAIM preferentially before occupying sAIM1 and sAIM2. To provide more detailed information on the atomic details of the interaction between C53 and the UBLs, we will perform molecular docking studies using the restraints obtained from the experimental NMR data.

      4) The functional assays performed in Arabidopsis do not support the competitive model between UFM1 and ATG8 for binding to C53 during C53-mediated autophagy. The fluorescence microscopy images do not provide convincing evidence of colocalization between C53 and ATG8. In fact, in contrast to the claims made in the text or the quantification, mCherry-C53 fluorescence does not seem to localize in discrete puncta and its signal does not seem to overlap with ATG8A.

      We disagree with the reviewer’s interpretation of these results, although we acknowledge that there is some subtlety in interpreting the co-localization data. Importantly, Arabidopsis has 9 ATG8 isoforms and C53 can bind to most of them with varying affinities (see Stephani et al., 2020). Because of this, we do not expect C53 puncta to fully colocalize with ATG8A puncta. Additionally, the C53 puncta are smaller and more subtle than ATG8 puncta, which label the entire autophagosome. To address this, we will quantify the effect by performing colocalization analyses under normal and stress conditions. We will also upload all the raw images as supporting material, so that anyone can independently assess our images.
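      As an illustration of the kind of colocalization analysis referred to here, the following is a minimal sketch of a Pearson correlation coefficient computed over paired pixel intensities from two fluorescence channels. It is not the authors' pipeline: the function name and the flattened pixel lists are hypothetical, and real analyses would normally also involve background subtraction and region-of-interest selection, which are omitted.

```python
import math


def pearson_colocalization(channel_a, channel_b):
    """Pearson correlation between two equally sized lists of pixel intensities.

    channel_a and channel_b hold the intensities of the same pixels in two
    channels (for example mCherry-C53 and a GFP-tagged ATG8; hypothetical).
    Returns a value between -1 and 1.
    """
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must contain the same number of pixels")
    n = len(channel_a)
    if n < 2:
        raise ValueError("at least two pixels are required")
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return covariance / math.sqrt(var_a * var_b)


# Hypothetical flattened intensities for the same five pixels in each channel.
c53_pixels = [12, 40, 35, 8, 60]
atg8_pixels = [10, 38, 30, 9, 55]
print(pearson_colocalization(c53_pixels, atg8_pixels))
```

      A coefficient near 1 indicates that intensities rise and fall together across pixels, while values near 0 indicate no linear relationship; partial overlap, as expected when only some ATG8 isoforms are imaged, would give intermediate values.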

      Minor Issues: 1. The authors might choose to avoid teleological arguments such as (line 135): "As the phylogenomic analysis suggested that the sAIMs have been retained to mediate C53-UFM1 interaction..."

      We thank the reviewer for this suggestion and will modify the text accordingly.

      2. The authors refer on multiple occasions to C53 "autoactivation" without defining what they mean by this. Do they propose that C53 UFMylates itself?

      We refer to C53 activity as the ability to recruit the autophagy machinery and initiate cargo sequestration and degradation in the vacuole. We attempted to explain this in lines 57-61 but we will reword it more clearly, as suggested by the reviewer.

      3. The paper might want to avoid preachy philosophical statements like "Our evolutionary analysis also highlights why we should move beyond yeast and metazoans and instead consider the whole tree of life when using evolutionary arguments to guide biological research." (333-335). While this is indeed a laudable goal, given the rather limited insights from this study, it is unclear how this paper exemplifies the notion.

      We added this statement as we were intrigued by our evolutionary analyses’ ability to link C53 to UFM1 (an association which took years to identify experimentally) and generate useful functional hypotheses about the interaction between C53 sAIMs and UFM1. As we mentioned above, we also wanted to highlight this point in reference to a recent prominent study in the field which drew conclusions after only considering animals, plants, and fungi (Walczak et al., 2019). We believe this point is important and underappreciated by some cell biologists, but we will modify the text to make it more generic: “This work highlights the utility of using evolutionary analyses and eukaryotic diversity to generate mechanistic hypotheses for cellular processes”.

      Significance

      Overall, while the manuscript contains an abundance of new data, the overall conclusion of the work, stated in the title: "Shuffled ATG8 interacting motifs form an ancestral bridge between UFMylation and C53-mediated autophagy" does not constitute a significant advance beyond other published phylogenomic analysis (below) and the two previous papers by the same authors, including the 2020 paper "A cross-kingdom conserved ER-phagy receptor maintains endoplasmic reticulum homeostasis during stress" (PMID: 32851973) and the 2021 paper "C53 is a cross-kingdom conserved reticulophagy receptor that bridges the gap between selective autophagy and ribosome stalling at the endoplasmic reticulum" (PMID: 33164651). While a regulatory interaction between UFMylation and autophagy is of potential importance, the data in this manuscript do not constitute a major advance and fail to provide new mechanistic insight to explain the role of C53 IDR in autophagy and its interplay with UFMylation.

      We disagree with the reviewer’s suggestion that our work does not constitute a significant advance. We outlined above in detail the novel insights that were obtained from our phylogenomic analysis which involved using improved methods to reveal a much more dynamic and informative picture of UFMylation evolution than has been described previously. Likewise, this manuscript builds substantially on our previous mechanistic work. In our 2020 paper (which is summarized in the mentioned 2021 review article), we identified C53 as an ER-associated protein that binds ATG8 through sAIMs and interacts with the phagophore after RPL26 UFMylation. This work linked C53 activity to ER-phagy and highlighted its importance in plant and animal stress response. However, key questions remained unanswered prior to our current work such as whether this mechanism is conserved across eukaryotes, especially in unicellular species, how C53 activity is regulated, and how UFM1 and ATG8 interact with C53. Our current manuscript builds on this work with the following key results: 1. We use a combination of phylogenomic and experimental analyses to demonstrate that C53 function is conserved across eukaryotes. 2. We reveal a mechanism whereby UFM1 and ATG8 compete for binding at the sAIMs in the C53 IDR and characterize how each of these ubiquitin-like proteins interacts in an alternative way (see the NMR results described above). 3. We show how the sAIMs are required for the regulation of C53-mediated autophagy and reveal the importance of UFM1-ATG8 competition in preventing C53 autoactivation, which causes unnecessary autophagic degradation and impairs cellular stress responses.

      These insights are fundamental for understanding the mechanisms regulating C53-mediated autophagy which were unknown before this work. We will therefore adjust our manuscript to more clearly and explicitly explain how our data build on previous observations so that the novelty and significance of our results are clearer.

      Referee #3

      Evidence, reproducibility and clarity

      Picchianti and colleagues have investigated a conserved molecular framework that orchestrates ER homeostasis via autophagy. For this, they have carried out phylogenomics and large-scale gene family analyses across eukaryote diversity as well as a barrage of molecular lab work. The amount of work carried out as well as the overall quality of the study is impressive.

      Thank you!

      I have only a few comments that should be very easy to tackle. (1) Maybe I missed it, but please upload all alignments used for phylogenetics and phylogenomics for reproducibility to e.g. Zenodo, Figshare or other suitable OA databases.

      We included the alignments in the supplementary data, but as suggested, we will upload all the source data including the scripts and the alignments to Zenodo.

      (2) "Why these non-canonical motifs were selected during evolution, instead of canonical ATG8 interacting motifs remains unknown" --> Maybe there is no "why" and these were not selected at all. Could be random... drift, non-adaptive constructive neutral evolution. I am not saying that asking "why" in evolutionary biology is wrong. It, however, often does not yield satisfactory answers--or any answer at all.

      The reviewer is completely right that “why” is not the right way to frame an evolutionary question. Thank you for pointing this out. We will revise the text and make sure that we remove these kinds of deterministic statements.

      (3) The authors make a case for UFMylation in LECA and I am fully sympathetic with this. However, getting rid of misfolded/problematic proteins and subcellular entities is something that prokaryotes also, to a certain degree, must have mastered (and still do). Are inclusion bodies or export their only answers (I don't know)? Of course, in eukaryotes with all their intracellular complexity this is likely more of an issue. Given the scope of this manuscript (i.e. shedding light on that ancient framework, deep evolutionary roots in eukaryote evolution etc. etc.) it would be very interesting to read the authors' thoughts on this and also pinpoint the prokaryote/eukaryote divide in light of the machinery discussed here.

      Thank you for this suggestion. We did indeed check whether any of the UFMylation machinery were present in prokaryotes and only found homologs of UFSP2. These results are consistent with Grau-Bové et al. (2014) who conducted an equivalent analysis and concluded that UFMylation machinery were derived during eukaryogenesis. We will make reference to this in the revised manuscript.

      Significance

      This study not only impresses with the volume of experiments and data, but also the courage to show conservation of a molecular framework by working with such a range of distantly-related eukaryotes. The results and conclusions from this study should be interesting to anyone working in the broad fields of cellular stress and/or autophagy--both extremely timely topics.

      We thank the reviewer for understanding our take-home message and the advances made. We especially thank the reviewer for understanding the challenge of connecting in silico genomic data with in vivo and in vitro experiments.

      CROSS-CONSULTATION COMMENTS

      Referee #2 The challenge in providing a fair review of this manuscript is to clearly define what contributions are novel, significant advances. It is difficult to tell from the way the manuscript is written, as it is unclear how the new data - which are voluminous - actually advance the model already put forth by the same authors in two previous publications. It is also unfortunate that the authors overlooked the 2014 phylogenomics paper. There clearly are some new pieces of information here, but the overall increment in knowledge is rather minimal. Response from Referee #3 I agree that the authors somehow steamroll the reader with a wealth of data. But I think this can be addressed by requesting a lot more justification from the authors and by giving them the opportunity to put the significant advances into their own words. This is, in my opinion, quite doable in the course of a revision. Overall I have to say that I am very sympathetic with the cross-eukaryote reactivity approach that the authors have taken. It is quite intriguing.

      We thank the reviewers for this useful exchange. We agree that our manuscript was not clear enough to emphasize the novelty of our results which likely resulted from the volume and diversity of the experiments and analyses that were presented. We have now revised the manuscript to improve the context and rationale for the study, the intent and hypotheses behind each experiment, and the novel results and insights obtained in each section.

      Response from Referee #2 I agree that the cross-eukaryote approach is intriguing. Shouldn't we be concerned that the 2014 publication already made two of their key points (i.e., present in LECA, loss in Fungi)? What is the incremental insight from this paper? I'd appreciate an opinion from an evolutionary biologist as to how strongly one can conclude functional co-evolution from such correlative data, especially given the rather small number of supporting examples. Is it also necessary to consider counter-examples, i.e., species that have sAIMs but no UFM1 (I believe that they found a few such cases)?

      Importantly, we do not conclude functional co-evolution from our correlative data. Instead, we used these correlations to generate hypotheses that we tested with various experiments in different model systems. For example, the apparent correlation between C53 sAIMs and UFM1 prompted us to test whether or not UFM1 and sAIMs interact. Regardless of sample size or statistical significance, phylogenomic analyses can never demonstrate functional links, only correlations, which is why we combined these two approaches. Although only a few species encoded C53 without UFM1, each of these contained C53 cAIMs and lacked sAIMs (Figure 2c). There are species with UFM1 that lack C53 but this makes sense as UFM1 is used in other processes besides ER-phagy. We have revised the text to make our approach and reliance on certain data clearer.
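      To make concrete what a presence-absence correlation of this kind measures, the following is a minimal sketch that computes the phi coefficient between two binary profiles across species. It is an illustration only, not the method used in the manuscript, and it ignores phylogenetic non-independence between species. The function names and the example profiles are hypothetical and are not data from the study.

```python
import math


def contingency_counts(profile_x, profile_y):
    """Count species with both features, only x, only y, or neither."""
    if len(profile_x) != len(profile_y):
        raise ValueError("profiles must cover the same species")
    both = only_x = only_y = neither = 0
    for x, y in zip(profile_x, profile_y):
        if x and y:
            both += 1
        elif x:
            only_x += 1
        elif y:
            only_y += 1
        else:
            neither += 1
    return both, only_x, only_y, neither


def phi_coefficient(profile_x, profile_y):
    """Phi coefficient (Pearson correlation for two binary variables)."""
    a, b, c, d = contingency_counts(profile_x, profile_y)
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a feature never varies")
    return (a * d - b * c) / denominator


# Hypothetical profiles: 1 = present, 0 = absent, one entry per species.
ufm1_present = [1, 1, 1, 0, 0, 0, 1, 0]
saim_present = [1, 1, 1, 0, 0, 0, 1, 1]
print(phi_coefficient(ufm1_present, saim_present))
```

      A high coefficient only indicates that the two features tend to be gained or lost together; as stated above, it cannot by itself demonstrate a functional link, which is why the correlation is treated as a source of hypotheses for experimental testing.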

      Response from Referee #3 Well, with these deep evolutionary questions this is always a challenge. Where does one stop to sample more homologs for one's analyses (one from each supergroup [which are no longer recognised by the community])? In that sense, the authors are right to make the parsimonious base assumption that if X and Y interact in species A and B (no matter how distantly they are related) then X and Y interacted in the last common ancestor of A and B. That being said, if I had designed this study, I would have sampled more broadly for my in vitro cross-eukaryote approach. But this too, I think, could be carried out by the authors in a reasonable timeframe. Specifically, they have now sampled from Amorphea and Archaeplastida; they should add one from TSAR, one Haptista, one Cryptista, and one CRuM. If they synthesised the proteins via a company, they could have the constructs in a few weeks for about 1K Euro - I do not think that this would be an unreasonable request.

      We agree that testing C53 function in additional species would strengthen our understanding of the conservation of this pathway across eukaryotes, as it cannot be assumed that orthologous proteins will function in the same way across all species. To our knowledge there is no other work showing experimentally that the UFMylation pathway is working in a single-celled organism. We focussed our efforts on the unicellular green alga, Chlamydomonas due to its relative experimental tractability. However, testing this was not trivial as it required us to establish expression and purification protocols, isolate Chlamydomonas mutants, optimize physiological stress assays, and perform the experiments.

      Nevertheless, we agree that we could expand our in vitro assays with C53 orthologs from additional species. As suggested by reviewer 3, we will now synthesize 6 more C53 isoforms from two TSAR representatives (the alveolate, Tetrahymena thermophila, and the stramenopile, Phytophthora sojae), as well as a representative from Haptista (Emiliania), Cryptista (Guillardia), Diplomonada (Trypanosoma), and CRuMs (Rigifila). We will test their interaction with human and plant ATG8 and UFM1 proteins. We have also added two species from CRuMs into our phylogenomic analysis.

      The list of experiments that we can do to address the reviewer’s concerns:

      1. Repeat the experiment in Figure 1C, probing with α-RPL26.
      2. To calculate KD values, perform ITC experiments with C53 wild-type, C53 sAIM mutant and C53 cAIM variant titrated with ATG8 and UFM1.
      3. Perform CoIP experiments using the C53 antibody in wild-type and UFM1-overexpressing lines and detect ATG8 association under normal and stress conditions.
      4. Test autophagic degradation of C53 in uba5 and ufc1 mutants under normal and tunicamycin stress conditions by performing autophagic flux assays using the native C53 antibody.
      5. Perform molecular docking studies to see C53’s structural rearrangements leading to ATG8 and UFM1 binding.
      6. Revisit the figures from the co-localization experiments in Figure 5G and perform additional co-localization analyses, such as the Pearson coefficient, under normal and stress conditions. We will also upload all the raw images as supporting material, so that anyone can independently assess our images.
      7. Upload all the source data for the phylogenomic analyses, including scripts and alignments, to Zenodo.
      8. Test the interaction of 6 newly synthesised C53 isoforms from (1) an alveolate (tsAr, Ciliate), (2) a stramenopile (tSar, Phaeodactylum), (3) a haptophyte (Emiliania), (4) a cryptophyte (Guillardia), (5) a diplomonad (Trypanosoma) and (6) a CRuM with human and plant ATG8 and UFM1 proteins.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript from Picchianti et al. seeks to define the role of CDK5RAP3 (hereinafter referred as C53) during autophagy and its interplay with UFMylation. Together with UFL1 and DDRGK1, C53 is a component of a trimeric UFM1 E3 ligase complex that modifies the 60S ribosomal protein RPL26 at the endoplasmic reticulum (ER) surface upon ribosomal stalling (among other proposed functions that are not addressed). Several previous studies have implicated the UFMylation pathway in autophagy or ER-phagy although a non-autophagic fate for UFM1-tagged ribosomal subunits has also been reported.

      A previous study from the same authors (PMID: 32851973) identified an intrinsically disordered region (IDR) in C53 that is necessary and sufficient for interaction between C53 and autophagy receptor, ATG8. They reported that this IDR comprises four non-canonical ATG8 interacting motifs (AIMs), named shuffled AIMs (sAIMs), and showed that combinatorial mutagenesis of sAIM1, sAIM2, and sAIM3 abrogates ATG8 binding. A similar effect was observed for plant C53, though an additional canonical AIM (cAIM) in the C53 IDR had to be mutated to completely abolish C53 and ATG8 interaction. The earlier study reported that C53 IDR also interacts with UFM1, and this interaction can be disrupted in vitro by adding increasing concentrations of ATG8, suggesting that ATG8 and UFM1 may compete with one another for C53 binding.

      The present paper attempts to build on this previous work by using phylogenomics to infer a co-evolutionary relationship between UFMylation machinery and sAIMs in C53, which, the authors argue, constitutes further evidence of the primary importance of a role for UFMylation in ER homeostasis. The manuscript includes a lot of biochemical data using variations of in vitro and in vivo pull-down experiments to define the roles of individual AIMs in mediating the binding of C53 to ATG8 and to UFM1. They also use NMR spectroscopy in an attempt to define the structural basis of the UFM1 and ATG8 binding to C53, concluding that plant C53 interacts with UFM1 mainly through sAIM1, while interaction with ATG8 requires cAIM as well as sAIM1 and sAIM2. Finally, the authors attempt to contextualize these findings by conducting studies on Arabidopsis mutants, showing that replacing sAIMs with cAIMs increases sensitivity to ER stress and apparently increases formation of C53 intracellular puncta that may colocalize with ATG8.

      From these data the authors concluded that the dual-ATG8 and UFM1 binding of C53 IDR regulates C53 recruitment to autophagosomes in response to ER stress.

      Major Issues:

      1. The phylogenomics analysis conclusion that UFM1 is common in unicellular lineages and did not evolve in multicellular eukaryotes is not novel, as another comprehensive analysis of UFM1 phylogeny, published eight years ago - in 2014 - by Grau-Bové et al. (PMID: 25525215), also reported that UFM1, UBA5, UFC1, UFL1 and UFSP2 were likely present in LECA and lost in Fungi. Although the phylogenomic analysis by Picchianti et al. is also extended to DDRGK1 and C53 proteins, and some parasitic and algal lineages, their findings are incremental. Their proposed coevolution of sAIM and UFM1 is based on presence-absence correlation observed within five species (i.e., Albugo candida, Albugo laibachii, Piromyces finnis, Neocallimastix californiae, Anaeromyces robustus). However, this coevolutionary relationship must be further investigated by substantially increasing the taxonomic sampling within the UFM1-lacking group.
      2. The manuscript presents an overwhelming amount of biochemical and structural data obtained from a variety of protein binding techniques (i.e., NMR spectroscopy, in vitro GST-pulldown, fluorescence microscopy-based on-bead binding assays, and native mass-spectrometry). The results are poorly explained and not organized in a logical manner. Moreover, no attempt was made to explain the rationale behind using one technique over the other or how one method complements another to build a stronger conclusion than any individual approach. Given that none of the methods employed report quantitative measurement of binding affinities between C53 IDR and UFM1 or ATG8, it is not clear how the data presented in this manuscript contribute to our understanding of the proposed competition model for UFM1 and ATG8 binding to C53 IDR. To conclude that an interaction is "stronger" or "weaker" it is necessary to measure equilibrium binding constants. Fortunately, there are suitable techniques, including surface plasmon resonance (SPR), microscale thermophoresis (MST), fluorescence anisotropy, or calorimetry that are available to dissect these complex competitive binding interactions and to build models.
      3. The NMR studies have the potential to dissect the types of dynamic binding inherent in unstructured proteins. However, the abundant NMR data presented, combined with the aforementioned binding studies, remarkably do not seem to significantly advance our understanding of how the system is organized or even how UFM1 and ATG8 bind C53, beyond the rather vague and somewhat circular conclusion stated in the abstract: "...we confirmed the interaction of UFM1 with the C53 sAIMs and found that UFM1 and ATG8 bound the sAIMs in a different mode." Or on line 165: "Altogether these results suggested that ATG8 and UFM1 bind the sAIMs within C53 IDR, albeit in a different manner".
      4. The functional assays performed in Arabidopsis do not support the competitive model between UFM1 and ATG8 for binding to C53 during C53-mediated autophagy. The fluorescence microscopy images do not provide convincing evidence of colocalization between C53 and ATG8. In fact, in contrast to the claims made in the text or the quantification, mCherry-C53 fluorescence does not seem to localize in discrete puncta and its signal does not seem to overlap with ATG8A.
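
      The point in major issue 2 about equilibrium binding constants can be made concrete with a minimal sketch. It assumes the simplest possible competition model: a single site on the C53 IDR, two ligands competing for it, and free ligand concentrations that are not depleted by binding. Whether ATG8 and UFM1 actually compete for one site is the hypothesis under review, and the function name, concentrations and KD values below are hypothetical placeholders, not measurements.

```python
def occupancy(conc_a, kd_a, conc_b=0.0, kd_b=1.0):
    """Fraction of a single binding site occupied by ligand A.

    Ligand B competes for the same site. Concentrations and KD values
    must share the same unit; free (unbound) concentrations are assumed.
    """
    a = conc_a / kd_a  # ligand A concentration in units of its KD
    b = conc_b / kd_b  # competitor concentration in units of its KD
    return a / (1.0 + a + b)


# Hypothetical values: both ligands at 1 uM with KD = 1 uM.
alone = occupancy(conc_a=1.0, kd_a=1.0)
with_competitor = occupancy(conc_a=1.0, kd_a=1.0, conc_b=1.0, kd_b=1.0)
```

      Under this model, calling one interaction "stronger" than another is a statement about `kd_a` relative to `kd_b`, which is why measured equilibrium constants are needed before the competition model can be evaluated.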

      Minor Issues:

      1. The authors might choose to avoid teleological arguments such as (line 135): "As the phylogenomic analysis suggested that the sAIMs have been retained to mediate C53-UFM1 interaction..."
      2. The authors refer on multiple occasions to C53 "autoactivation" without defining what they mean by this. Do they propose that C53 UFMylates itself?
      3. The paper might want to avoid preachy philosophical statements like "Our evolutionary analysis also highlights why we should move beyond yeast and metazoans and instead consider the whole tree of life when using evolutionary arguments to guide biological research." (333-335). While this is indeed a laudable goal, given the rather limited insights from this study, it is unclear how this paper exemplifies the notion.

      Referees cross-commenting

      Referee #2

      The challenge in providing a fair review of this manuscript is to clearly define what contributions are novel, significant advances. It is difficult to tell from the way the manuscript is written, as it is unclear how the new data, which are voluminous, actually advance the model already put forth by the same authors in two previous publications. It is also unfortunate that the authors overlooked the 2014 phylogenomics paper. There clearly are some new pieces of information here, but the overall increment in knowledge is rather minimal.

      Response from Referee #3

      I agree that the authors somehow steamroll the reader with a wealth of data. But I think this can be addressed by the authors by requesting a lot more justification and by giving them the opportunity to put the significant advances into their own words. This is, in my opinion, quite doable in course of a revision. Overall I have to say that I am very sympathetic with the cross-eukaryote reactivity approach that the authors have taken. It is quite intriguing.

      Response from Referee #2

      I agree that the cross-eukaryote approach is intriguing. Shouldn't we be concerned that the 2014 publication already made two of their key points (i.e., present in LECA, loss in Fungi)? What is the incremental insight from this paper?

      I'd appreciate an opinion from an evolutionary biologist as to how strongly one can conclude functional co-evolution from such correlative data, especially given the rather small number of supporting examples. Is it also necessary to consider counter-examples, i.e., species that have sAIMs but no UFM1 (I believe that they found a few such cases)?

      Response from Referee #3

      Well with these deep evolutionary questions this is always a challenge. Where does one stop to sample more homologs for one's analyses (one from each supergroup [which are no longer recognised by the community])? In that sense, the authors are right to make the parsimonious base assumption that if X and Y interact in species A and B (no matter how distant they are related) then X and Y interacted in the last common ancestor of A and B. That being said, if I would have designed this study, I would have sampled more broadly for my in vitro cross-eukaryote approach. But also this, I think, could be carried out by the authors in a reasonable timeframe. Specifically, they have now sampled from Amorphea and Archaeplastida, they should add one from TSAR, one Haptista, one Cryptista, and one CRuM. If they synthesised the proteins via a company, they could have the constructs in a few weeks for about 1K Euro - I do not think that this would be an unreasonable request.

      Significance

      Overall, while the manuscript contains an abundance of new data, the overall conclusion of the work, stated in the title: "Shuffled ATG8 interacting motifs form an ancestral bridge between UFMylation and C53-mediated autophagy" does not constitute a significant advance beyond other published phylogenomic analysis (below) and the two previous papers by the same authors, including the 2020 paper "A cross-kingdom conserved ER-phagy receptor maintains endoplasmic reticulum homeostasis during stress" (PMID: 32851973) and the 2021 paper "C53 is a cross-kingdom conserved reticulophagy receptor that bridges the gap between selective autophagy and ribosome stalling at the endoplasmic reticulum" (PMID: 33164651). While a regulatory interaction between UFMylation and autophagy is of potential importance, the data in this manuscript do not constitute a major advance and fail to provide new mechanistic insight to explain the role of C53 IDR in autophagy and its interplay with UFMylation.

    1. Reviewer #2 (Public Review):

      The authors utilize the publicly available dHCP dataset to ask an interesting question: how do postnatal experience and prenatal maturation influence the development of the visual system? The authors report that experience and prenatal maturation differentially contribute to different aspects of development. Namely, the authors quantify cortical thickness, myelination, and lateral symmetry of function as three different metrics of development. The homotopy and preterm infant analyses are strengths that, on their own, could have justified reporting. However, I have concerns about the analytic approaches that were used and the conclusions that were drawn. Below I list my major concerns with the manuscript.

      PMA vs. GA vs. PT

      1. The authors seek to understand the contribution of experience and prenatal development, yet I am unsure why the authors focused on the variables they did. There are three variables of interest used throughout this study: Gestational age at birth (GA), postnatal time (PT), and postmenstrual age at the time of scan (PMA). The last metric, PMA, is straightforwardly related to GA and PT since PMA = GA + PT. In most (but not all) of the manuscript, the authors use PMA and PT, with GA used without justification in some cases but not in others.

      It is unclear why PMA is used at all: PMA is necessarily related to PT and GA, making these variables non-independent. Indeed, the authors show that PMA and PT are highly correlated. The authors even say that "the contribution of postnatal experience to the development was not clarified because PMA reflects both prenatal endogenous effect and postnatal experience." So, why not use GA at birth instead of PMA? Clearly, GA is appropriate in some cases (e.g., Figure S4 or in some of the ANOVA applications), and to me, it seems to isolate the effect the authors care about (i.e., duration of prenatal development). Perhaps there is some theoretical justification for using PMA, but if so, I am unaware.

      That said, I expect that replacing all analyses involving PMA with GA will substantially change the results. I do not see this as a bad thing as I think it will make the conclusions stronger. As is, I am left unsure about what the key takeaways of this paper are.

      2. Using GA instead of PMA will have several benefits: 1) It will be much simpler to think of these two variables since they contrast the duration of fetal maturation and time postnatally. 2) This will help the partial correlation analyses performed since the variance between the variables is more independent. It will also mean that the negative relationships observed between PT and cortical thickness when controlling for PMA (e.g., Figure 2h) might disappear (reversed signs for partial correlations are common when two covariates are correlated). 3) This will allow the authors to replace Figure 1a with a more informative plot. Namely, they could use a scatter of GA and PT, giving insight into the descriptive statistics of both dimensions.

      3. I suspect that one motivation for the use of PMA over GA is for the analysis in Figure 6. In this analysis, the authors pick a group of term infants with a PMA equal to the preterm infants. Since PMA is the same, the only difference between the groups (according to the authors) is the amount of postnatal experience. However, this is not the only difference between the groups since they also vary in GA (and now PT and GA are negatively correlated almost perfectly). I don't know how to interpret this analysis since both the amount of prenatal maturation and postnatal experience vary between the groups.
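
      The sign-reversal concern raised in point 2 can be illustrated with a small simulation. This is only a sketch on synthetic data: the age ranges, effect sizes and noise level are invented, and the outcome `y` is a hypothetical measure that increases with both GA and PT (more strongly with GA). Because PMA = GA + PT, controlling for PMA makes the partial correlation with PT negative even though the simulated effect of PT is positive, whereas controlling for GA recovers the positive sign.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 400
ga = [rng.uniform(24, 42) for _ in range(n)]  # synthetic GA at birth (weeks)
pt = [rng.uniform(0, 8) for _ in range(n)]    # synthetic postnatal time (weeks)
pma = [g + p for g, p in zip(ga, pt)]         # PMA = GA + PT by definition

# Hypothetical outcome: positive effect of both GA and PT, GA effect larger.
y = [1.0 * g + 0.5 * p + rng.gauss(0, 1.0) for g, p in zip(ga, pt)]

r_pt_given_pma = partial_corr(y, pt, pma)  # negative: y = 1.0*PMA - 0.5*PT + noise
r_pt_given_ga = partial_corr(y, pt, ga)    # positive, matching the simulated effect
```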

      Justification of conclusions and statistical considerations

      I had concerns about some of the statistical tests and conclusions that the authors made. I refer to some of these in other sections (e.g., the homotopy analyses), but I raise several here.

      4. I am not sure what evidence the authors are using to make this claim: "we found that the cortical myelination and overall functional connectivity of ventral cortex developed significantly with the PMA but was not directly influenced by postnatal time." Postnatal time is significantly correlated with cortical myelination, as shown in Figures 2g, 2h, 3b, 3c, and postnatal time is significantly correlated with functional connectivity, as shown in Figures 4h, 5c, 5d, and 5e. Hence, this general claim that "the development of CT was considerably modulated by the postnatal experience while the CM was heavily influenced by prenatal duration" doesn't seem to be supported: both myelination and thickness are affected by postnatal experience and prenatal duration (as measured by PMA). A similar sentiment is expressed in the abstract. Perhaps the authors suggest different patterns in the strength of change for PMA vs. PT across these metrics, but if so, then statistical tests need to support that conclusion, and the claims need to reflect that sentiment.

      Interestingly, Figure S4 presents a compelling ANOVA that does support this conclusion. Still, this result is relegated to the supplement, and it also uses GA, rather than PMA, making it hard to reconcile with the other claims made in the main text. Moreover, it uses ANOVAs, which dichotomizes a continuous variable. Here and elsewhere in the manuscript (e.g., Figures 3d, 3e), the authors split the infants into quartiles and compare them with ANOVAs. Their use for visualization is helpful, but it is unclear what the statistical motivation for this is rather than treating these as continuous variables like is possible with linear mixed-effects models. Moreover, it is unclear why the authors excluded half the data from the study (i.e., quartiles 2 and 3) in this ANOVA when all four quartiles could be used as factors.

      5. It is unclear what the evidence is to support the following claim: "Both CT and CM show higher correlation with PMA in the posterior than anterior region, and higher correlation in the medial than lateral part within the anatomical mask (Figure 2a and Figure S2b-c [sic])" From Figure 2 or Figure S2, I don't see a gradient. From Figure S3, there might be a trend in some plots, but it is hard to interpret since it is non-monotonic. More generally, is there a statistical test to support this claim?

      6. "and the interaction [sic] was more prominent in CM (simple effect: t = 10.98, p < 10^-9) that in than CT (t = 2.07, p < 0.05)." Does 'more prominent' mean it is 'significantly stronger'? If not, then the authors should adjust this claim.

      7. Are the authors Fisher Z transforming their correlations? In numerous places, correlation values seem to be added together or used as the input to other correlation analyses. It is unclear from the methods whether the authors are transforming their correlation values to make that use appropriate.
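
      As a minimal sketch of the transformation asked about in point 7: correlations are mapped to z = atanh(r) before being averaged (or fed into further analyses), and the result is mapped back with tanh. The function name and the example values are illustrative only and do not come from the manuscript.

```python
import math


def fisher_mean(correlations):
    """Average correlation coefficients in Fisher z space.

    Each r must lie strictly between -1 and 1, since atanh is
    undefined at the endpoints.
    """
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


example = [0.9, 0.1]                 # illustrative values only
naive = sum(example) / len(example)  # plain arithmetic mean of r
fisher = fisher_mean(example)        # mean taken in z space
```

      The two averages differ most when some correlations are large, because the sampling distribution of r is increasingly skewed as |r| approaches 1.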

      Homotopy analyses

      The homotopy section is a strength of the paper, but I have doubts about the approach taken to analyze this data and some of the conclusions drawn. I don't expect any of my suggestions to change the takeaway of this section, but I do think they are essential criticisms to address.

      8. I do not think that the non-homotopic control condition is appropriate. In Arcaro & Livingstone (2017), the authors had 3 categories for this analysis: homotopic pairs (e.g., left V1 vs. right V1), adjacent pairs (e.g., left V1 vs. right V2), and distal pairs (e.g., left V1 vs. right PHA1). In the homotopy analysis performed by Li and colleagues, they compare homotopic pairs with all other pairs. I don't think that is generous to the test since non-homotopic pairs include adjacent pairs that should be similar and distal pairs that shouldn't be similar. This may explain why some non-homotopic distribution overlaps with the homotopic distribution in Figure 4c.

      9. Regardless of this decision, I think the authors should reconsider their statistical test. I think the authors are using a between samples t-test to compare the 34 homotopic pairs with the hundreds of non-homotopic pairs. This is statistically inappropriate since the items are not independent (i.e., left V1 vs. right V1 is not independent of left V1 vs. right V2, which is also not independent of left V3 vs. right V2). This means the actual degrees of freedom are much lower than what is used. Moreover, I am unsure how the authors do this analysis across participants since this test can be done within participants. The authors should clarify what they did for this analysis and justify its appropriateness.

      10. Could the authors speculate on why the correlations in homotopic regions are so much lower than what Arcaro and Livingstone (2017) found? I can think of a few possibilities: higher motion in infants, less rfMRI data per participant, different sleep/wake states, and different parcellation strategies. Regarding the last explanation, I think this is a real possibility: the bilateral correlation may be reduced if the Glasser atlas combines functionally heterogeneous patches of the cortex. Hence, the authors should consider this and other possible explanations.

      11. The authors assume that the homotopic analyses mean that there are lateral connections between hemispheres (e.g., "Furthermore, the connections among the ventral visual cortex have developed during this early stage. Specifically, the homotopic connections between bilateral V1 and between bilateral VOTC both increased with GA, indicating an increased degree of functional distinction"). While this might be true, it doesn't need to be. Functional connectivity can be observed between regions that lack anatomical connectivity. Instead, two regions could both be driven by another region. In this case, the thalamus might drive symmetrical activity in the visual cortex.
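
      One way to sidestep the non-independence problem described in point 9 is to reduce each participant to a single number and test across participants. The sketch below is one possible approach, not the authors' method: it assumes each participant has already been summarised as the mean Fisher-z correlation of homotopic pairs minus that of non-homotopic pairs, and it runs an exact two-sided sign-flip permutation test on those differences. The eight values are hypothetical.

```python
import itertools


def sign_flip_p_value(differences):
    """Exact two-sided sign-flip test of 'mean difference = 0'.

    Enumerates all 2**n sign assignments, so it is only practical for
    small n; larger samples would need random sampling of sign patterns.
    """
    n = len(differences)
    observed = abs(sum(differences))
    count = 0
    for signs in itertools.product((1.0, -1.0), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, differences)))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            count += 1
    return count / 2 ** n


# Hypothetical per-participant summaries (homotopic minus non-homotopic).
diffs = [0.21, 0.35, 0.18, 0.40, 0.27, 0.31, 0.15, 0.29]
p_value = sign_flip_p_value(diffs)
```

      Here the unit of analysis is the participant, so the degrees of freedom reflect the number of infants rather than the number of region pairs.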

      Miscellaneous

      12. I am not sure what the motivation of this line is: "Moreover, those studies did not fully control the visual experience in the first few weeks of the subjects, thus cannot give a clear conclusion whether the innate functional connectivity is unrelated to postnatal visual experience." Arcaro, Schade, Vincent, Ponce, & Livingstone (2017) did control the visual experience of subjects. Moreover, the research here doesn't control infant experience in the way this sentence implies: it implies an experimental manipulation (i.e., fully control) rather than the statistical control that is done here. Consider rephrasing.

      13. I am not sure why this claim is made: "Area V1 was selected because this region is the most basic region for visual processing and probably is the most experience-dependent area during early development". Is there evidence supporting this claim? Plasticity is found throughout the visual cortex, and I think which region is most plastic depends on the definition of plasticity. For instance, most people have the same tuning properties to Gabor gratings (e.g., a cardinality bias), but there is enormous variability in face tuning across cultures.

      14. The abstract says 783 infants were included in this study, but far fewer are actually used. The authors should report the 407 number in the abstract if any number at all.

      15. Any comparisons of preterms and terms ought to be given the caveat that the preterm environment can be very different than the term environment: whereas a term infant goes home and sees friends and family without restriction, the preterm environment can be heavily regulated if they are in a NICU. Authors should either provide details about the environments of the preterms in their study, or they should consider how differences in the richness of visual experience - regardless of quantity - may affect visual development.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Referee #1

      Evidence, reproducibility and clarity

      1. This manuscript constructs a gene expression model with various factors. Specifically, the effect of cell size on gene expression is considered, which is often ignored by previous studies. One interesting finding is that the absolute number of the gene products and the concentration can have different distributions. Some predictions of the models are validated by experimental data on E. coli and yeast. This manuscript uses the mean-field approximation for cell volume, which has good accuracy when the number of stages is large. The usage of the power spectrum has a satisfactory effect on studying the concentration oscillation.

      Response: Thank you for the positive comments.

      1. Overall the paper was very difficult to follow and digest easily because of all the different factors and mechanisms invoked. It is mainly an issue of providing sufficient details for each of the factors and organizing them in a systematic and logical way. Although there is a supplementary appendix, it was hard to keep track of all the elements in the main manuscript. Perhaps something like Fig 1 of the Appendix can be presented in the main body to outline all the ingredients and how they affect each other.

      Response: In the revised manuscript, we moved Supplementary Fig. S1 in the previous version into the main text to outline all the ingredients and how they affect each other (see page 8, Fig. 2). Moreover, we provided many details for each of the biological factors and tried to organize them in a more systematic and logical way (see pages 3-7).

      1. It might be good to provide a more detailed description of the goal (studying gene product number and concentration under different parameters) after introducing the full and the reduced models. A table of symbols would also be helpful.

      Response: In the revised manuscript, we added a table explaining the meaning of all model parameters (see page 4, Table 1). Moreover, we provided a detailed description of the goal of the present paper after introducing the full and reduced models (see page 7).

      1. Some technical details in the Methods section are in fact helpful in understanding the conclusions. They can be moved to the Results section.

      Response: In the revised manuscript, we moved many technical details in Methods and Supplementary Notes to the main text to help the readers better understand the conclusions (see pages 5-10).

      1. One concern is that the central concept of this manuscript, “stage”, is not thoroughly discussed. This concept should have some significant biological meaning, not just be coined for mathematical convenience.

      Response: In the revised manuscript, we explained in detail the biological meaning of the effective cell cycle stages (see page 4). Specifically, recent studies have revealed that in many cell types, the accumulation of some activator to a critical threshold is used to promote mitotic entry and trigger cell division, a strategy known as activator accumulation mechanism. In E. coli, the activator was shown to be FtsZ; in fission yeast, it is believed to be a protein upstream of Cdk1, the central mitotic regulator, such as Cdr2, Cdc25, and Cdc13. Biophysically, the N effective stages can be understood as different levels of the key activator. Moreover, we pointed out that the power law form for the rate of cell cycle progression may come from cooperativity of the key activator that triggers cell division.

      1. Fig. 1(b) is a little strange. For the left panel, the x-axis (stage) is discrete, then the volume (y-axis) should be a step function, not a straight red line.

      Response: In the revised manuscript, we added some red dots in the stage-volume plot to show the dependence of the mean cell volume $v_k$ on cell cycle stage k for the mean-field model (see page 3, Fig. 1). Moreover, we emphasized that the joining of these dots by a straight red line is simply a guide to the eye.

      Significance

      1. The main advance is a more complete model of gene expression under more realistic organism growth conditions.

      Response: Thank you for acknowledging the results of the manuscript.

      Referee #2

      Evidence, reproducibility and clarity

      1. Jia et al. introduce a modeling framework to represent stochastic gene expression, with an explicit representation of cell volume growth, cell cycle progression (and its dependency on cell volume) and gene dosage compensation. The model is very elegant and general in that it can represent a variety of situations, simply as a matter of parametrization. Under a simplifying assumption, the authors derive a number of metrics (including the stationary distribution of gene product and the power spectrum of gene product fluctuation dynamics), for both absolute number and concentration of gene product molecules. They use their model and derivations to examine under which conditions cells can achieve homeostasis in the concentration of the expressed gene product, despite changes in cell volume and gene copy number following replication. They also present and discuss the conditions giving rise to specific features (i.e. bimodality in the stationary distribution, a peak in the power spectrum) and examine these features in experimental data to infer the underlying homeostasis strategies. The model is rather general and powerful. The simplifying assumption seems reasonable (and the authors investigate to some extent its limitations, i.e. Fig. 2). The conclusions are overall convincing.

      Response: Thank you for the positive comments.

      Major comments 1. My main concern is that the metrics that the authors use to assess concentration homeostasis (i.e. the γ parameter and the presence / absence of a peak in the power spectrum) do not seem quite appropriate to describe how much variability / fluctuations in concentration are driven by cell cycle effects. Indeed, the γ parameter measures how much the *average* concentration in each cell cycle stage varies throughout the cell cycle. However, this variability should be compared to the total variability due to both cell cycle effects and stochastic bursting dynamics. A given level of cell-cycle dependency (say γ = 0.2) could be very visible if gene expression is weakly noisy (e.g. B low and $\langle n \rangle$ high) and completely invisible if gene expression is highly bursty (large B and small $\langle n \rangle$). In the latter situation, cell-cycle effects would be meaningless for the cell to minimize. In essence, reusing the authors' notation, I think $\gamma/\phi^{1/2}$ would be a more relevant metric to observe.

      Response: In the revised manuscript, we showed that the total concentration noise φ can be decomposed as $\phi = \phi_{\mathrm{ext}} + \phi_{\mathrm{int}}$, where $\phi_{\mathrm{ext}}$ is the extrinsic noise, which characterizes the fluctuations between different stages due to cell cycle effects, and $\phi_{\mathrm{int}}$ is the intrinsic noise, which characterizes the fluctuations within each stage due to stochastic bursty synthesis and degradation of the gene product (see page 11). Based on the above decomposition, we introduced a new metric $\gamma = \phi_{\mathrm{ext}}/\phi$, which characterizes the accuracy of concentration homeostasis. Clearly, the new metric γ reflects the relative contribution of cell cycle effects to the total concentration variability. All discussions about concentration homeostasis are based on the new metric γ in the revised manuscript. Moreover, all figures have been updated by using this new metric.

      1. Similarly, when inspecting the peak in the power spectrum, the weight of the Lorentzian function(s) creating the peak should be compared to the stationary component (λ_N, u_N in the authors’ notation).

      Response: We cannot quite understand why the weights u_k of the Lorentzian functions should be compared to the stationary component u_N. In fact, all the weights u_k except u_N are complex numbers and we are not sure about the meaning of u_k/u_N. However, in the revised manuscript, we emphasized that the power spectrum G(ξ) is normalized so that G(0) = 1 throughout the paper (see page 13). To better understand concentration oscillations and their relation to homeostasis, we depicted both γ and H as functions of B and ⟨n⟩ (see Supplementary Fig. S5). As expected, the off-zero peak becomes lower as B increases and as ⟨n⟩ decreases, since both changes correspond to an increase in concentration fluctuations, which counteracts the regularity of oscillations; noise above a certain threshold can even completely destroy oscillations. Furthermore, we found that γ and H have a similar dependence on B and ⟨n⟩. This again shows that the occurrence of concentration oscillations is intimately related to the visibility of cell cycle effects in concentration fluctuations.
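      To illustrate how an off-zero peak can vanish, here is a toy sketch that is not the manuscript's G(ξ). It assumes an autocorrelation of the form exp(−a|t|)·cos(ω0·t), whose spectrum is a sum of two Lorentzians centred at ±ω0, and normalizes it so that G(0) = 1. The names `toy_spectrum`, `off_zero_peak`, `decay` and `omega0` are hypothetical, and taking the peak height as the maximum of the normalized spectrum is an assumption about how H might be read off.

```python
def toy_spectrum(xi, decay, omega0):
    """Sum of two Lorentzians centred at +/- omega0, normalised so that G(0) = 1."""
    def raw(x):
        return (1.0 / (decay ** 2 + (x - omega0) ** 2)
                + 1.0 / (decay ** 2 + (x + omega0) ** 2))
    return raw(xi) / raw(0.0)


def off_zero_peak(decay, omega0, xi_max=20.0, steps=20000):
    """Grid search for the maximum of the toy spectrum on [0, xi_max].

    Returns (location, height); location 0.0 and height 1.0 mean no off-zero peak.
    """
    best_xi, best_g = 0.0, 1.0
    for i in range(1, steps + 1):
        xi = xi_max * i / steps
        g = toy_spectrum(xi, decay, omega0)
        if g > best_g:
            best_xi, best_g = xi, g
    return best_xi, best_g
```

      For this toy spectrum the off-zero peak exists only when ω0 > a/√3, so increasing the decay rate a (faster loss of correlation) at fixed ω0 eventually removes the peak.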

      A complementary analysis including these two points and a discussion of the relative contribution of cell-cycle effects and bursting dynamics to the total variability/fluctuation of concentrations would be important to include.

      Response: In the revised manuscript, we added a complementary analysis and discussion of the relative contribution of cell cycle effects and stochastic birth-death dynamics to the total variability of concentrations (see pages 11-14).

      Minor comments 3. The dashed line on Fig. 3a is defined as $\kappa = \sqrt{2}^{\,1-\beta}$. First, is this empirical or does it come from a derivation? Second, it seems incomplete since it should depend on w. Intuitively, this line should correspond to the value of κ that would best mimic balanced biosynthesis in the case where β ≠ 1. In other words, κ should be such that ⟨ρB0/V(t)⟩_prereplication = ⟨κρB0/V(t)⟩_postreplication, which yields $\kappa = 2^{w(1-\beta)} \cdot \frac{1-w}{w} \cdot \frac{2^{w(\beta-1)} - 1}{2^{(1-w)(\beta-1)} - 1}$. This indeed simplifies into $\kappa = \sqrt{2}^{\,1-\beta}$ when w = 0.5.

      Response: Thank you for providing such a beautiful derivation. In the revised manuscript, we added this derivation into the main text (see pages 12-13). Moreover, we also made it clear that this relation can also be obtained from the perspective of power spectrum (see page 14).
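      The balance condition in the comment above can be checked numerically. The following is a minimal sketch, assuming exponential volume growth V(t) = V_0·2^(t/T) over a cycle of duration T, replication at time wT, and a synthesis rate per unit volume proportional to V(t)^(β−1) with burst frequency ρ before and κρ after replication. The function name `kappa_balanced` is hypothetical.

```python
def kappa_balanced(w, beta):
    """Value of kappa that equalises the time-averaged synthesis rate per unit
    volume before and after replication (assumes exponential volume growth)."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie strictly between 0 and 1")
    e = beta - 1.0
    if abs(e) < 1e-12:
        # beta = 1: synthesis already scales with volume, so kappa = 1
        return 1.0
    # Time averages of 2**(e * t / T) over [0, wT] and [wT, T];
    # the common factor 1 / (e * ln 2) cancels in the ratio.
    pre = (2.0 ** (w * e) - 1.0) / w
    post = (2.0 ** e - 2.0 ** (w * e)) / (1.0 - w)
    return pre / post
```

      Under these assumptions the result at w = 0.5 equals 2^((1−β)/2), that is √2 raised to the power 1−β, which matches the dashed line discussed in the comment.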

      1. η is used in the caption of Fig. 2, which is cited on page 4. But it is defined only 2 sections later, on page 6.

      Response: In the revised manuscript, we gave the definition of η in both Table 1 and the caption of Fig. 3 (Fig. 2 in the old version). Please see page 4, Table 1 and page 9, Fig. 3.

      1. w is used in the main text, but only defined in the caption of Fig. 3.

      Response: In the revised manuscript, we gave the definition of w in both Table 1 (see page 4) and on page 7.

      1. w is defined as “the proportion of cell cycle before replication”. Is this in terms of cell cycle stages (i.e. w = N0/N) or actual time?

      Response: In the revised manuscript, we made it clear that w represents the proportion of cell cycle duration before replication, which should be distinguished from the proportion N0/N of cell cycle stages before replication (see page 7). This is because the transition rate between cell cycle stages is an increasing function of cell size, which means that earlier (later) stages have longer (shorter) durations.

      1. Fig. 3 indicates that power spectra are normalized so that G(0) = 1, but G(0) = 10 on the first two graphs.

      Response: Corrected as suggested (see page 12, Fig. 4). Thank you.

      1. Page 11: “bimodality in the concentration distribution is significantly less apparent”. I would suggest rephrasing “bimodality in the concentration distribution is absent” since there should be no reference to “significance” and bimodality is either present or absent (binary), not less apparent.

      Response: Corrected as suggested. Thank you.

      Referees cross-commenting

      1. Regarding the comment from reviewer 3 that “a direct validity test should use data sets of at least two types (total, nascent RNA, etc)”. I almost made a related comment in my review, but then I held it off: the issue with using nascent RNA data is that their model does not allow an ON state. They assume that gene products are produced in instantaneous bursts, which is a fair assumption if the lifetime of gene products is large compared to the time the gene stays ON. This is ok if the considered “gene products” are mRNA or proteins, but not nascent RNAs (for which the lifetime is the time to transcribe the gene). I did not make this comment in the end because I think the model is useful regardless. To comply with reviewer 3’s request, maybe the authors could use distributions of mRNA and protein products, but I’m not sure that such data exist (since they need cell-cycle-resolved data).

      Response: It is not possible to validate our model with nascent mRNA data because the model in its present form cannot predict nascent mRNA fluctuations. This is because unlike mature mRNA, nascent mRNA cannot be assumed to decay via first-order kinetics. A detailed response is provided below to the original comment made by Referee 3. Regarding the comment on the use of cell-cycle-resolved data measuring mRNA and protein expression – while we agree it would make an excellent test of our model, we could not find such a dataset in the literature. We point out that our model, in its present form, is interesting as it is, as a detailed biological model of mature mRNA and protein number / concentration fluctuations in growing cells. Its predictions are yet to be fully confirmed and hence may stimulate the development of further experimental single-cell studies.

      Significance

      1. The advance of this paper is essentially technical. The authors present a model that incorporates and unifies previously studied effects (cell volume homeostasis, concentration homeostasis, bursting transcription). There is no major conceptual novelty, but the combination of these different aspects and the derivations that the authors present are very valuable and might be applicable to interpret data in various species.

      Response: Thank you for acknowledging the results of the manuscript.

      Referee #3

      Evidence, reproducibility and clarity

      1. The manuscript analyses a phenomenological model of stochastic gene expression. The model couples bursty transcription with cell growth, division and DNA replication. The cell cycle is divided into a large number of stages whose exponential lifetimes depend on the cell volume. It is argued that concentrations of gene products are distributed according to mixed Gamma distributions, whereas the copy numbers follow mixed negative binomial distributions. The number of modes can be different for concentrations and copy numbers, for instance the copy numbers can be unimodal while concentrations are bimodal. The case when the mean concentration does not depend on the cell cycle stage is called perfect homeostasis. It is argued that perfect homeostasis leads to a Gamma distribution of the gene product concentration and that deviations from a Gamma distribution result mainly from deviations of the concentration from perfect homeostasis. It is also proposed that concentration homeostasis is difficult to obtain. These qualitative predictions of the model are tested using two data sets, one for E. coli and another for fission yeast.

      Response: Thank you for acknowledging the results of the manuscript.

      Major comments 1. A huge number of states called “cell cycle stages” have exponential lifetimes. In my opinion, this sequence of stages is just a technicality for keeping the model within a discrete Markovian framework. More natural choices are possible, such as piecewise deterministic Markov processes, age-structured diffusions, etc. The biological significance (if there is any) of such states should be explained.

      Response: In the revised manuscript, we explained in detail the biological meaning of the effective cell cycle stages (see page 4). Specifically, recent studies have revealed that in many cell types, the accumulation of some activator to a critical threshold is used to promote mitotic entry and trigger cell division, a strategy known as activator accumulation mechanism. In E. coli, the activator was shown to be FtsZ; in fission yeast, it is believed to be a protein upstream of Cdk1, the central mitotic regulator, such as Cdr2, Cdc25, and Cdc13. Biophysically, the N effective stages can be understood as different levels of the key activator. Moreover, we pointed out that the power law form for the rate of cell cycle progression may come from cooperativity of the key activator that triggers cell division.

      1. The timescales of stochastic gene expression are not correctly taken into account. It is considered that during an exponential stage the bursting approximation describes gene expression in terms of Gamma distributions for concentrations and in terms of negative binomial distributions for copy numbers. This approximation is only valid if the lifetime of a stage is much larger than the time needed to generate a burst. For RNA, this condition cannot be fulfilled for a large number of states N and/or for two-state promoters with a relatively long ON state. For the protein and/or in the case of translational bursting, the condition is even more difficult to fulfil. I agree with Reviewer 2 that once the master equation is accepted the results make sense. But my criticism is different and concerns the master equation itself. In this equation the burst is considered instantaneous, whereas it needs finite time in reality. Concerning nascent mRNA, ON/OFF etc. I disagree. The notion of instantaneous burst with well defined burst size and burst frequency on a stage has a meaning if the lifetime of this stage (which is not mRNA or protein lifetime) is short. The model validity should be clearly stated.

      Response: Thank you for pointing out this important issue. When we talk about the validity of the model, we should stick to the full model, instead of the mean-field model. This is because once the full model makes sense, the mean-field model must work well when N ≥ 15, as we have shown in Fig. 3 and Supplementary Fig. S3. Hence our reply is based on the validity of the full model. We will reply to the above comments from the following three aspects. First, we agree with the referee that in our model, we assume that the gene product is produced in instantaneous bursts with the reaction scheme
      $$G \xrightarrow{\rho p^k (1-p)} G + kM, \quad k \ge 1, \qquad M \xrightarrow{d} \varnothing, \tag{1}$$
      where the mean burst size scales as $V(t)^\beta$. Of course, in reality there is a finite time for the bursts to occur. A more general assumption is that within each cell cycle, the gene expression dynamics is characterized by the following three-stage model:
      $$G \xrightarrow{\rho} G^*, \quad G^* \xrightarrow{r} G, \quad G^* \xrightarrow{s V(t)^\beta} G^* + M, \quad M \xrightarrow{u} M + P, \quad M \xrightarrow{v} \varnothing, \quad P \xrightarrow{d} \varnothing, \tag{2}$$
      where the first two reactions describe the switching of the gene between an inactive state $G$ and an active state $G^*$, the middle two reactions describe transcription and translation, and the last two reactions describe the degradation of the mRNA M and the protein P. Here the synthesis rate of mRNA depends on cell volume via a power law form with power β ∈ [0, 1]. Dosage compensation can be modeled by a decrease in the gene activation rate (for each gene copy) from ρ to κρ/2 upon replication. Previous studies have revealed that the bursting of mRNA and protein has different biophysical origins: transcriptional bursting is due to a gene that is mostly inactive, but transcribes a large number of mRNA when it is active (r ≫ ρ and s/r is finite), whereas translational bursting is due to rapid synthesis of protein from a single short-lived mRNA molecule (v ≫ d and u/v is finite).
Under the above timescale separation assumptions, both mRNA and protein are produced in a bursty manner with the reaction scheme described by Eq. (1). The burst frequencies for mRNA and protein are both ρ before replication and κρ after replication. The mean burst size for mRNA is $(s/r)V(t)^\beta$ and the mean burst size for protein is $(su/rv)V(t)^\beta$, both of which have a power law dependence on cell volume (see pages 5-6). In Supplementary Figs. S1 and S2, we compare the mRNA and protein distributions for the bursty model with the reaction scheme given by Eq. (1) and the three-stage model with the reaction scheme given by Eq. (2), where both models under consideration have a cell cycle and cell volume description. It can be seen that the distributions for the two models are very close to each other under the above timescale separation assumptions, with the bursty model being more accurate as r/ρ and v/d increase. Moreover, we find that the accuracy of the bursty model is insensitive to the number of stages N. Here the values of N are chosen so that the ratio of the average time spent in each stage (T/N, where T ≈ (log 2)/g is the mean cell cycle duration) to the mean burst duration time (1/ρ) ranges from about 0.5 to 2. This shows that the effectiveness of the bursty model does not require that the lifetime of a cell cycle stage be sufficiently long. Due to mathematical complexity, we only focus on the bursty model in the present paper. The consistency between the gene product distributions for the two models justifies our bursty assumption. Second, while we assume bursty expression here, our model naturally covers non-bursty expression since the latter can be regarded as a limit of the former. Hence all the conclusions in the present paper are applicable to both bursty and non-bursty expression. In the revised manuscript, we emphasized this point (see page 4 for a detailed explanation).
Last but not least, if the lifetime of the gene product is much shorter than the lifetime of each cell cycle stage, then the gene expression dynamics will rapidly relax to a quasi-steady state for each stage. In this case, the gene product fluctuations at each stage can be characterized by a gamma distribution in terms of concentrations and by a negative binomial distribution in terms of copy numbers, and hence the distribution of concentrations (copy numbers) for a population of cells is naturally a mixture of N gamma (negative binomial) distributions. However, the strength of our analytical distribution (see page 10, Eq. (8)) is that it serves as an accurate approximation when N ≫ 1 without making any timescale assumptions. The effectiveness of our analytical distributions is validated in Supplementary Fig. S3 for three different cases: (i) the degradation rate d of the gene product is much smaller than the cell cycle frequency f; (ii) d and f are comparable; (iii) d is much larger than f. In the revised manuscript, we also emphasized these points (see page 10).
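      The link between the three-stage scheme of Eq. (2) and the burst-size law p^k(1−p) of Eq. (1) can be sketched as follows. This is a minimal illustration, assuming that during one ON period transcription (rate s) competes with switching off (rate r) as two independent exponential clocks, so that the number of transcripts made per ON period is geometric with p = s/(s + r). The function names are hypothetical and the volume dependence enters only through the effective synthesis rate s·V^β.

```python
def burst_size_pmf(k, s, r):
    """Probability that exactly k transcripts are made during one ON period.

    Each event is a synthesis with probability p = s / (s + r) and a
    switch-off with probability 1 - p, giving a geometric law p**k * (1 - p).
    """
    p = s / (s + r)
    return (p ** k) * (1.0 - p)


def mean_burst_size(s, r, volume=1.0, beta=0.0):
    """Mean burst size (s / r) * V**beta when the synthesis rate is s * V**beta."""
    return (s * volume ** beta) / r


def mean_from_pmf(s, r, k_max=5000):
    """Numerical mean of the geometric law, truncated at k_max."""
    return sum(k * burst_size_pmf(k, s, r) for k in range(k_max + 1))
```

      The numerical mean of the geometric law agrees with s/r, and replacing s by s·V^β reproduces the power law dependence of the mean burst size on cell volume quoted in the response.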

      1. DNA replication is a stochastic event and does not occur after a fixed number of exponential stages as it is considered in this model. Concerning replication: in the model this occurs after exactly N0 steps. In reality, replication occurs somewhere between the start of S and G2/M. N0 is in fact a random variable. Probably a new mean field assumption is needed here with some justification, but I have seen nothing in the paper.

      Response: We agree with the referee that replication of the whole genome occurs in the S phase, which occupies a considerable portion of the cell cycle and thus cannot be assumed to occur after a fixed number of exponential stages. However, our model is for a single gene and since the replication time of a particular gene is much shorter than the total duration of the S phase, it is reasonable to consider it to be instantaneous. In addition, recent experiments have shown that the time elapsed from birth to replication for a particular gene occupies an approximately fixed proportion of the cell cycle, which is called the stretched cell cycle model. This is also consistent with our assumption that replication of the gene of interest occurs after exactly N0 stages. Although replication occurs after a fixed number of stages, the time of replication is nevertheless stochastic since each stage has a random lifetime. In the revised manuscript, we emphasized these points (see pages 4-5).
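      The statement that a fixed number of stages still gives a random replication time can be illustrated with a short simulation. This is a simplified sketch, assuming that all N0 pre-replication stages have the same constant exit rate, whereas in the manuscript's model the rate depends on cell volume. Under this simplification the replication time is Erlang distributed with mean N0/rate and squared coefficient of variation 1/N0. The function name and parameter values are hypothetical.

```python
import random
import statistics


def sample_replication_times(n0, stage_rate, n_cells, seed=0):
    """Draw replication times as sums of n0 independent exponential stage lifetimes.

    Assumption: every stage has the same constant rate (not volume dependent).
    """
    rng = random.Random(seed)
    return [sum(rng.expovariate(stage_rate) for _ in range(n0))
            for _ in range(n_cells)]


times = sample_replication_times(n0=20, stage_rate=20.0, n_cells=20000)
mean_time = statistics.fmean(times)                    # theory: n0 / stage_rate
cv2 = statistics.pvariance(times) / mean_time ** 2     # theory: 1 / n0
```

      Increasing N0 at a fixed mean makes the replication time more sharply peaked, which is one way to read the role of the number of stages as a control on timing variability.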

      1. The results in the Methods were derived heuristically and their relation to the master equation (12) is not explicit (except for the part concerning moments and their power spectrum). Furthermore, one would like to have some estimates of the biases introduced by the mean field approximation. Concerning biases introduced by the mean field approximation: Figure 2 is a numerical simulation, some analytical estimates could be better. As Figure 2 looks rather convincing, I reclassify this as minor comment.

      Response: We agree with the referee that the derivation of moments is rigorous, but the derivation of the analytical distribution given in Methods is not rigorous and cannot be directly obtained from the master equation. In the revised manuscript, we emphasized that the analytical distribution is not exact but it serves as a very good approximation (see pages 10 and 22). We showed that the analytical distribution agrees well with stochastic simulations when the number of cell cycle stages N ≥ 15 (see page 9, Fig. 3 and Supplementary Fig. S3). The logic behind our approximate distribution is that while the gene product may produce a complex distribution of concentrations (copy numbers), when the number of cell cycle stages is large, the distribution must be relatively simple within each stage and thus can be well approximated by a simple gamma (negative binomial) distribution (see page 22). Due to the complexity of our model, it is very difficult to provide any analytical estimates of the bias introduced by the mean-field approximation. Often the bias of an approximation can be estimated when the approximation emerges from a systematic method such as van Kampen’s system-size expansion (see Ref. [21]). However, our mean-field model cannot be seen as the zeroth-order term of some expansion and hence it is not possible to calculate the next-order correction which would be needed to estimate the error. Nevertheless, we have tested very large swathes of parameter space and found that the mean-field approximation always works well when N ≥ 15, which is the physiologically relevant regime for most types of cells (see discussion on P. 7).

      1. The model is not minimal and depends on a huge number of parameters. It is not clear how these parameters were found and if overfitting was avoided. One may have doubts about the identifiability of the parameter N. What difference is between N = 59 and N = 60 (the value of N for the cyanobacterium)?

      Response: In the revised manuscript, we used synthetic data to show that all the model parameters involved in our model (except d and β which can be determined based on a priori knowledge) can be accurately estimated from cell-cycle resolved lineage data of cell volume and gene expression (see Supplementary Note 7). We provided details of the parameter inference method, compared the input parameters with the estimated ones and verified that they are identifiable (see Supplementary Table 1). We did not use real data to test our inference method because we could not find cell-cycle resolved lineage data for mRNA or proteins. As we noted, this is in principle possible via cell-cycle fluorescent markers. We also note that parameter inference for less detailed but similar models has been performed in our previous papers: the parameters related to cell volume dynamics have been inferred in E. coli (see Ref. [51]) and fission yeast (see Ref. [52]) using the method of distribution matching, and the parameters related to gene expression dynamics have been estimated in E. coli (see Ref. [40]) using the method of power spectrum matching. Moreover, for our purpose, i.e. to investigate the effect of cell cycle and cell volume on gene expression, we do believe that our model is minimal.
We captured cell growth with only one parameter g, the degree of balanced biosynthesis with one parameter β (β = 0 corresponds to the case where the synthesis rate is independent of cell volume and β = 1 corresponds to the case where the synthesis rate scales linearly with cell volume), the variability in cell cycle duration with only one parameter N, gene replication with only one parameter N0, gene dosage compensation with only one parameter κ (κ = 1 corresponds to perfect dosage compensation and κ = 2 corresponds to no dosage compensation), and the variation of size control strategy across the cell cycle with two parameters α0 and α1 (αi → 0 corresponds to timer, αi = 1 corresponds to adder, and αi → ∞ corresponds to sizer). The biological meaning of the cell cycle stages was clarified in the revised manuscript (see page 4). For our purpose, we believe that our model cannot be simpler.

      1. The authors should make clear which cell biology aspects are important, which are less important, and which were neglected in the context of their problem. Thus, in their model, cell cycle acts on gene expression mainly by duplication of burst sources and thus by increase of burst frequency after replication. Another important source of gene expression variability during the cycle, the mitotic transcription repression, is neglected.

      Response: In the revised manuscript, we clarified which cell biology aspects are important for gene expression dynamics (see page 17). Specifically, in our model, cell cycle and cell volume act on gene expression mainly by (i) the dependence of the burst size on cell volume; (ii) the increase in the burst frequency upon replication; (iii) the change in size control strategy upon replication; (iv) the partitioning of molecules at division. Point (iv) strongly affects copy number fluctuations, while it has little influence on concentration fluctuations. In addition, in the revised manuscript, we also elucidated the limitations of our model including mitotic transcription repression and others (see pages 19-20).

      1. The validity test of the model is indirect. It was tested that the concentration distribution deviates from Gamma and that the deviation correlates positively to the lack of accuracy of the concentration homeostasis. However, many models can have this behaviour. A direct validity test should use data sets of at least two types (total, nascent RNA, etc.) allowing direct estimates of some model parameters (such as burst size and frequency using nascent RNA). Concerning parsimony, I think that the authors should test it. Are all the parameters identifiable? Is there any overfitting? They could use parameter uncertainty, comparison of training /testing errors, etc. Some details about the parameter fitting method should be provided.

      Response: Regarding the parameter fitting and identifiability we have provided a detailed response to a previous comment above. However we emphasize that for the generation of Fig. 7, we did not need to estimate all model parameters from data. Hence in the previous version of the manuscript, no such estimation was done — we simply extracted the homeostasis accuracy γ, the height H of the off-zero peak of the power spectrum, and the Hellinger distance D of the concentration distribution from its gamma approximation directly from data. Finally, we point out that our model can be used to predict the dynamics of mature mRNAs, but it cannot be used to describe the dynamics of nascent mRNAs. This is because nascent mRNAs do not decay via a first-order reaction but their removal, i.e. their detachment from the gene which leads to mature mRNA, is better approximated by a reaction with a fixed decay time. This models the elongation time of nascent transcripts which does not suffer from much noise because the RNAP velocity is to a good approximation constant along the gene. See e.g. the following two papers for details: H. Xu, S. O. Skinner, A. M. Sokac, I. Golding, Stochastic kinetics of nascent RNA. Phys. Rev. Lett. 117, 128101 (2016). S. Braichenko, J. Holehouse, R. Grima. Distinguishing between models of mammalian gene expression: telegraph-like models versus mechanistic models. J. R. Soc. Interface 18, 20210510 (2021). Because of the fixed delay, the delay telegraph model (the telegraph model with a delayed degradation reaction) is non-Markovian and very different from the usual Markovian telegraph model which describes the dynamics of mature mRNA within each cell cycle. See e.g. the Supplementary Information of the following paper: X. Fu, et al. Accurate inference of stochastic gene expression from nascent transcript heterogeneity. bioRxiv (2021). 
Given the mathematical complexity introduced by a fixed delay, using it to describe the dynamics of nascent mRNA within each cell cycle leads to a non-Markovian model that is even more analytically intractable than the present one for mature mRNA. While an interesting research question, this is clearly far removed from the scope of our current manuscript.
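      The Hellinger distance D mentioned in this response can be computed from binned data along the following lines. This is a minimal sketch, assuming the standard definition D = sqrt(1 − Σ sqrt(p_i q_i)) for two discrete distributions on the same bins; the manuscript may use a different normalization or a continuous version, and the function name is hypothetical.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two histograms defined on the same bins.

    Both inputs are normalised to sum to one before comparison.
    Returns a value between 0 (identical) and 1 (disjoint support).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    sp, sq = sum(p), sum(q)
    # Bhattacharyya coefficient: overlap between the two distributions
    bc = sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))
```

      In the setting of the response, `p` would be the measured concentration histogram and `q` the binned gamma approximation with matched mean and variance.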

      Minor comments 8. The introduction could be more pedagogical. Right now it is just an accumulation of loosely related and sometimes abruptly introduced statements. For instance, we understand that the authors want to oppose their approach to other extant approaches. However, extant approaches should be better reviewed; some of them are age-structured and perfectly suited for analysing cell cycle data. It would be useful for the reader if an example of an observation explained by their model and not explained by other models (age-structured or not) were discussed in detail. The model of this work does not explain size control, it just assumes that this holds, and does not discuss cell population aspects. A more nuanced positioning of this approach with respect to the literature would be useful for judging its value.

      Response: In the revised manuscript, we rewrote the introduction part to make it more pedagogical (see pages 1-2). In particular, we compared three popular models describing the cell size dynamics and the associated size homeostasis. The advantages and disadvantages of the three models were discussed.

      1. The meaning of N should be discussed from the very start when the model is introduced.

      Response: In the revised manuscript, we explained in detail the biological meaning of the effective cell cycle stages (see page 4). Specifically, recent studies have revealed that in many cell types, the accumulation of some activator to a critical threshold is used to promote mitotic entry and trigger cell division, a strategy known as activator accumulation mechanism. In E. coli, the activator was shown to be FtsZ; in fission yeast, it was believed to be a protein upstream of Cdk1, the central mitotic regulator, such as Cdr2, Cdc25, and Cdc13. Biophysically, the N effective stages can be understood as different levels of the key activator. Moreover, we pointed out that the power law form for the rate of cell cycle progression may come from cooperativity of the key activator that triggers cell division.

      1. The authors call constitutive expression the situation when the mean copy number does not depend on the volume. This choice should be clarified as in general constitutive as opposed to specific, localised or transitory expression refers to non-regulated gene expression. It seems to me that in this context, expression is only partially constitutive (independent of the volume).

      Response: In the present paper, constitutive expression means that the gene product is produced one at a time and is not produced in a bursty manner. It does not mean that the mean copy number does not depend on the volume. In the revised manuscript, we provided a more detailed discussion about how constitutive expression can be viewed as a limit of bursty expression (see page 4).

      1. In figure 1b and for exponential growth the y axis should be log(volume) instead of volume. The mean field approximation is called both “of novel type” (Discussion) and “which has a long history of successful use in statistical physics” (p4). If something is novel, then one should clearly explain why.

      Response: In fact, the y-axis in Fig. 1(b) should be volume instead of log(volume). This is because the x-axis represents the cell cycle stage instead of the real time. Note that for the adder strategy (α0 = α1 = 1), it follows from Eq. (3) on page 7 that the mean cell volume at stage k is $v_k = v_1 + (k-1)M_0/N_0$, which depends linearly on k. This explains why the red curves in Fig. 1(b) are straight lines instead of exponential curves. In the revised manuscript, we also explained why the mean-field approximation used is novel (see page 7). Specifically, we pointed out that the mean-field approximation is not made for the whole cell cycle; rather, we make the approximation for each stage and thus different stages have different mean cell volumes. This type of piecewise mean-field approximation, as far as we know, is novel and has not been used in the study of concentration fluctuations before.
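      The linear dependence of the mean volume on the stage index quoted in this response can be tabulated directly. This is a small sketch of that formula only, assuming it applies to the N0 pre-replication stages under the adder strategy; the function name and the example values are hypothetical.

```python
def mean_stage_volumes_adder(v1, m0, n0):
    """Mean cell volume at stages k = 1..n0 under the adder strategy.

    Implements v_k = v_1 + (k - 1) * M_0 / N_0 as quoted in the response.
    Assumption: the formula is applied to pre-replication stages only.
    """
    return [v1 + (k - 1) * m0 / n0 for k in range(1, n0 + 1)]


# Hypothetical example: birth volume 1.0, M_0 = 0.5, N_0 = 5
volumes = mean_stage_volumes_adder(v1=1.0, m0=0.5, n0=5)
increments = [b - a for a, b in zip(volumes, volumes[1:])]
```

      Every increment equals M_0/N_0, so a plot of mean volume against stage index is a straight line even though volume grows exponentially in real time.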

      1. The word “cyclo-stationarity” is used without much definition. If this means just the stationary distribution of the gene products, why not simply use “stationarity” instead? What does “cyclo” mean? A number of properties were called “rare” but it is not clear on what grounds.

      Response: In the revised manuscript, we removed the term “cyclo-stationarity” and simply assumed that the copy number and concentration distributions of the gene product at each cell cycle stage have reached the steady state (see page 8). In addition, for each property that was called “rare”, we explained the reasons in detail (see pages 14 and 17).

      1. I did not find a proof that the copy number distribution has fewer modes than the concentration distribution.

      Response: In fact, it is very difficult to prove that the concentration distribution has fewer modes than the copy number distribution. However, we have tested very large swathes of parameter space and found that the number of modes of the concentration distribution is always less than or equal to that of the copy number distribution. In the revised manuscript, we emphasized this point (see page 16).

      Significance

      1. The strength of this work is that it incorporates in a stochastic gene expression model a number of ideas on size control and dosage compensation that were discussed elsewhere from a cell population point of view. However, the proposed model is based on a number of artificial choices that are difficult to justify biologically: a huge number of discrete cell cycle states and inappropriate handling of the timescales characterizing stochastic gene expression. Furthermore, the model is not minimal but depends instead on a huge number of parameters. I found the paper difficult to read, and the presentation of the results is not suitable for biologists, who would need more details on the justification of the modelling choices and on the experimental validation of the model.

      Response: All these points have been addressed in previous replies.

      1. For mathematicians, the calculations are rather standard and may seem trivial.

      Response: Our model is complex due to the coupling between gene expression dynamics, cell volume dynamics, and cell cycle events. It is far more complex than standard models of gene expression (see e.g. Refs. [2,84,85]) because of the large amount of biology encapsulated in it and we presented a first analytical- and simulation-based analysis of concentration fluctuations when concentration homeostasis is broken.

      The computations of many quantities in the present paper are non-trivial. First, we showed that the generalized added volumes before and after replication both have an Erlang distribution. Using this property, we computed the mean cell volume in each cell cycle stage, which is needed in the mean-field approximation. Furthermore, the computations of the power spectrum of concentration fluctuations are also highly non-trivial. The analytical expression of the power spectrum allows us to precisely determine the onset of concentration homeostasis. While the computations of moments of concentration fluctuations are standard, we used the moments to construct an analytical concentration distribution which serves as an accurate approximation when N is large. Our concentration distribution is generally valid when concentration homeostasis is broken and goes far beyond recent models for growing cells which require concentration homeostasis and which do not take into account DNA replication, dosage compensation and size control mechanisms that vary with the cell cycle phase (e.g. Ref. [26]).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript analyses a phenomenological model of stochastic gene expression. The model couples bursty transcription with cell growth, division and DNA replication. The cell cycle is divided into a large number of stages whose exponential lifetimes depend on the cell volume. It is argued that concentrations of gene products are distributed according to mixed Gamma distributions, whereas the copy numbers follow mixed negative binomial distributions. The number of modes can be different for concentrations and copy numbers; for instance, the copy numbers can be unimodal while concentrations are bimodal. The case when the mean concentration does not depend on the cell cycle stage is called perfect homeostasis. It is argued that perfect homeostasis leads to a Gamma distribution of the gene product concentration and that deviations from a Gamma distribution result mainly from deviations of the concentration from perfect homeostasis. It is also proposed that concentration homeostasis is difficult to obtain. These qualitative predictions of the model are tested using two datasets, one for E. coli and another for fission yeast.

      Major comments:

      The model encompasses a number of artificial choices:

      • A huge number of states called "cell cycle stages" have exponential lifetimes. In my opinion, this sequence of stages is just a technicality for keeping the model within a discrete Markovian framework. More natural choices are possible, such as piecewise deterministic Markov processes, age-structured diffusions, etc. The biological significance (if there is any) of such states should be explained.
      • The timescales of stochastic gene expression are not correctly taken into account. It is considered that during an exponential stage the bursting approximation describes gene expression in terms of Gamma distributions for concentrations and in terms of negative binomial distributions for copy numbers. This approximation is only valid if the lifetime of a stage is much larger than the time needed to generate a burst. For RNA, this condition cannot be fulfilled for a large number of states N and/or for two states promoters with a relatively long ON state. For the protein and/or in the case of translational bursting, the condition is even more difficult to fulfil.
      • DNA replication is a stochastic event and does not occur after a fixed number of exponential stages as it is considered in this model. The results in the Methods were derived heuristically and their relation to the master equation (12) is not explicit (except for the part concerning moments and their power spectrum). Furthermore, one would like to have some estimates of the biases introduced by the mean field approximation. The model is not minimal and depends on a huge number of parameters. It is not clear how these parameters were found and if overfitting was avoided. One may have doubts about the identifiability of the parameter N. What is the difference between N=59 and N=60 (the value of N for the cyanobacterium)? The authors should make clear which cell biology aspects are important, which are less important, and which were neglected in the context of their problem. Thus, in their model, the cell cycle acts on gene expression mainly by duplication of burst sources and thus by an increase of burst frequency after replication. Another important source of gene expression variability during the cycle, the mitotic transcription repression, is neglected.

      The validity test of the model is indirect. It was tested that the concentration distribution deviates from Gamma and that the deviation correlates positively with the lack of accuracy of the concentration homeostasis. However, many models can have this behaviour. A direct validity test should use datasets of at least two types (total, nascent RNA, etc.) allowing direct estimates of some model parameters (such as burst size and frequency using nascent RNA).

      Minor comments:

      The introduction could be more pedagogical. Right now it is just an accumulation of loosely related and sometimes abruptly introduced statements. For instance, we understand that the authors want to contrast their approach with other extant approaches. However, extant approaches should be better reviewed; some of them are age-structured and perfectly suited for analysing cell cycle data. It would be useful for the reader if an example of an observation explained by their model and not explained by other models (age-structured or not) were discussed in detail. The model of this work does not explain size control, it just assumes that this holds, and does not discuss cell population aspects. A more nuanced positioning of this approach with respect to the literature would be useful for judging its value.

      The meaning of N should be discussed from the very start when the model is introduced.

      The authors call constitutive expression the situation when the mean copy number does not depend on the volume. This choice should be clarified, as in general constitutive, as opposed to specific, localised or transitory expression, refers to non-regulated gene expression. It seems to me that in this context, expression is only partially constitutive (independent of the volume).

      In figure 1b and for exponential growth the y axis should be log(volume) instead of volume.

      The mean field approximation is called both "of novel type" (Discussion) and "which has a long history of successful use in statistical physics" (p4). If something is novel, then one should clearly explain why.

      The word "cyclo-stationarity" is used without much definition. If this just means a stationary distribution of the gene products, why not simply use "stationarity" instead? What does "cyclo" mean?

      A number of properties were called "rare" but it is not clear on what grounds.

      I did not find a proof that the copy number distribution has fewer modes than the concentration distribution.

      Referees cross-commenting

      Part 1

      I agree with Reviewer 2 that once the master equation is accepted the results make sense. But my criticism is different and concerns the master equation itself. In this equation the burst is considered instantaneous, whereas it needs finite time in reality.

      Part 2 (response to Part 2 of Rev2)

      • concerning replication: in the model this occurs after exactly N_o steps. In reality, replication occurs somewhere between the start of S and G2/M. N_o is in fact a random variable. Probably a new mean field assumption is needed here with some justification, but I have seen nothing in the paper
      • concerning biases introduced by the mean field approximation: Figure 2 is a numerical simulation, some analytical estimates could be better. As Figure 2 looks rather convincing, I reclassify this as minor comment.
      • concerning nascent mRNA, ON/OFF etc. I disagree. The notion of instantaneous burst with well defined burst size and burst frequency on a stage has a meaning if the lifetime of this stage (which is not mRNA or protein lifetime) is short. The model validity should be clearly stated.
      • concerning parsimony, I think that the authors should test it. Are all the parameters identifiable? Is there any overfitting? They could use parameter uncertainty, comparison of training /testing errors, etc. Some details about the parameter fitting method should be provided.

      Significance

      The strength of this work is that it incorporates in a stochastic gene expression model a number of ideas on size control and dosage compensation that were discussed elsewhere from a cell population point of view. However, the proposed model is based on a number of artificial choices that are difficult to justify biologically: a huge number of discrete cell cycle states and inappropriate handling of the timescales characterizing stochastic gene expression. Furthermore, the model is not minimal but depends instead on a huge number of parameters.

      I found the paper difficult to read, and the presentation of the results is not suitable for biologists, who would need more details on the justification of the modelling choices and on the experimental validation of the model. For mathematicians, the calculations are rather standard and may seem trivial. I am a systems biologist with a background in mathematics and theoretical physics.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Jia et al. introduce a modeling framework to represent stochastic gene expression, with an explicit representation of cell volume growth, cell cycle progression (and its dependency on cell volume) and gene dosage compensation. The model is very elegant and general in that it can represent a variety of situations, simply as a matter of parameterization. Under a simplifying assumption, the authors derive a number of metrics (including the stationary distribution of gene product and the power spectrum of gene product fluctuation dynamics), for both absolute number and concentration of gene product molecules. They use their model and derivations to examine under which conditions cells can achieve homeostasis in the concentration of the expressed gene product, despite changes in cell volume and gene copy number following replication. They also present and discuss the conditions giving rise to specific features (i.e. bimodality in the stationary distribution, a peak in the power spectrum) and examine these features in experimental data to infer the underlying homeostasis strategies.

      Major comments:

      The model is rather general and powerful. The simplifying assumption seems reasonable (and the authors investigate to some extent its limitations, i.e. Fig. 2). The conclusions are overall convincing.

      1. My main concern is that the metrics that the authors use to assess concentration homeostasis (i.e. the γ parameter and the presence/absence of a peak in the power spectrum) do not seem quite appropriate to describe how much variability/fluctuation in concentration is driven by cell cycle effects. Indeed, the γ parameter measures how much the average concentration in each cell cycle stage varies throughout the cell cycle. However, this variability should be compared to the total variability due to both cell cycle effects and stochastic bursting dynamics. A given level of cell-cycle dependency (say γ=0.2) could be very visible if gene expression is weakly noisy (e.g. B low and <n> high) and completely invisible if gene expression is highly bursty (large B and small <n>). In the latter situation, cell-cycle effects would be meaningless for the cell to minimize. In essence, re-using the authors' notations, I think γ / ϕ^1/2 would be a more relevant metric to observe.
      2. Similarly, when inspecting the peak in the power spectrum, the weight of the Lorentzian function(s) creating the peak should be compared to the stationary component (λ_N, u_N in the authors' notations).

      A complementary analysis including these two points and a discussion of the relative contribution of cell-cycle effects and bursting dynamics in the total variability/fluctuation of concentrations would be important to include.

      Minor comments:

      1. The dashed line on Fig. 3a is defined as κ = sqrt(2)^(1-β). First, is this empirical or does it come from a derivation? Second, it seems incomplete since it should depend on ω. Intuitively, this line should correspond to the value of κ that would best mimic balanced biosynthesis in the case where β≠1. In other words, κ should be such that <ρB' / V(t)>_prereplication = <κρB' / V(t)>_postreplication, which yields κ = 2^(ω(1-β)) * (1-ω)/ω * [2^(ω(β-1))-1]/[2^((1-ω)(β-1))-1]. This indeed simplifies into κ = sqrt(2)^(1-β) when ω=0.5.
      2. η is used in the caption of Fig. 2, which is cited on page 4. But it is defined only 2 sections later, on page 6.
      3. ω is used in the main text, but only defined in the caption of Fig. 3.
      4. ω is defined as "the proportion of cell cycle before replication". Is this in terms of cell cycle stages (i.e. ω=N_0/N) or actual time?
      5. Fig. 3 indicates that power spectra are normalized so that G(0)=1, but G(0)=10 on the first two graphs.
      6. Page 11: "bimodality in the concentration distribution is significantly less apparent". I would suggest rephrasing "bimodality in the concentration distribution is absent" since there should be no reference to "significance" and bimodality is either present or absent (binary), not less apparent.
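The reduction claimed in minor comment 1 can be checked numerically. The sketch below is an illustration only, not the referee's or the authors' code: it assumes exponential volume growth V(t) = V0 * 2^(t/T) over a cycle of duration T, replication at time ωT, and a burst size scaling as V^β, so that the balance condition compares the time-averaged V^(β-1) before and after replication. The function name `kappa_balanced` is hypothetical, and the prefactor is written as (1-ω)/ω, the sign for which κ is positive and equals sqrt(2)^(1-β) at ω=0.5.

```python
import math


def kappa_balanced(omega: float, beta: float) -> float:
    """Return the kappa that equalises <V^(beta-1)> before and after replication.

    Assumptions (not taken from the manuscript): V(t) = V0 * 2**(t/T),
    replication at t = omega*T, burst size proportional to V**beta.
    V0 and T cancel in the ratio, so they do not appear here.
    """
    if not 0.0 < omega < 1.0:
        raise ValueError("omega must lie strictly between 0 and 1")
    if math.isclose(beta, 1.0):
        # Limit beta -> 1: V^(beta-1) is constant, so no compensation is needed.
        return 1.0
    e = beta - 1.0
    # Time average of 2**(e*t/T) over [0, omega*T], up to a common factor 1/(e*ln 2).
    pre = (2.0 ** (omega * e) - 1.0) / omega
    # Time average over [omega*T, T], with the same common factor dropped.
    post = 2.0 ** (omega * e) * (2.0 ** ((1.0 - omega) * e) - 1.0) / (1.0 - omega)
    return pre / post


if __name__ == "__main__":
    for omega in (0.3, 0.5, 0.7):
        for beta in (0.0, 0.5, 2.0):
            print(omega, beta, kappa_balanced(omega, beta), math.sqrt(2) ** (1 - beta))
```

Under these assumptions the balancing κ matches sqrt(2)^(1-β) at ω=0.5 and drifts away from it otherwise, which is the dependence on ω that the comment asks about.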

      Referees cross-commenting

      Part 1.

      I agree with reviewer 1 that a table of symbols would be helpful. On reviewer 3's second Major Comment, I don't think that "the lifetime of a stage [has to be] much larger than the time needed to generate a burst". From how the authors write and solve the master equation, I don't think that such a separation of timescales is necessary. The authors should indeed clarify this and if reviewer 3 is correct, then that's indeed a major limitation. On reviewer 3's comment "DNA replication [...] does not occur after a fixed number of exponential stages", I don't think I agree with this statement. Cell cycle progression relies on an ensemble of biochemical reactions. Representing this as a set of exponential waiting-time distributions with different means is probably amongst the most general and agnostic ways of representing this. Whether these exponential waiting-times only depend on cell volume is another question. This actually links back to reviewer 3's first Major comment and reviewer 1's comment that the concept of "stage" should be better discussed.

      Regarding the need for "estimates of the biases introduced by the mean field approximation" (reviewer 3), I guess that's the goal of figure 2. Maybe reviewer 3 should make more explicit what she/he would like to see.

      Regarding the comment from reviewer 3 that "a direct validity test should use datasets of at least two types (total, nascent RNA, etc)": I almost made a related comment in my review, but then I held it off. The issue with using nascent RNA data is that their model does not allow an ON state. They assume that gene products are produced in instantaneous bursts, which is a fair assumption if the lifetime of gene products is large compared to the time the gene stays ON. This is ok if the considered "gene products" are mRNA or proteins, but not nascent RNAs (for which the lifetime is the time to transcribe the gene). I did not make this comment in the end because I think the model is useful regardless. To comply with reviewer 3's request, maybe the authors could use distributions of mRNA and protein products, but I'm not sure that such data exists (since they need cell-cycle-resolved data).

      I disagree with the statements that "the proposed model is based on a number artificial choices that are difficult to justify biologically" and that "the model is not minimal but depends instead on a huge number of parameters." In my opinion, the model is elegantly simple to capture the mechanisms under study (i.e. the effect of cell cycle and cell volume on stochastic gene expression). It is expressed so that the model captures a broad range of situations (i.e. it reduces to simpler models as a matter of choosing parameter values, e.g. β=0 => transcription independent of cell cycle; α => ∞ cell cycle depends only on size ...). I do not think that a series of exponential distributions for cell cycle progression is inappropriate; it is the most agnostic and general way of representing an ensemble of biochemical reactions that would be meaningless to describe explicitly. Instead, only their dependency on cell volume is taken into account (and in a very general way, i.e. parameters 'a' and α). It is fair to ask the authors to clarify the concept of "stage", but I see this model as being as simple as possible, but not simpler, for the authors' purpose.

      Finally, I agree that the paper is probably "not suitable for biologists" but disagree that "for mathematicians, the calculations are rather standard and may seem trivial."

      Part 2. Resp. to reviewer 3 on the master equation (Part 1 of Rev3):

      Ok, I understand your comment better. What you mean by "the time needed to generate a burst" is the time during which the gene produces RNAs, not the lifetime of the gene product (which is 1/d). That's true. It is essentially the same idea as what I wrote in my previous comment about nascent RNA data not being well captured by the model. Again, I think this is fine for "gene products" that are somewhat stable (not the case for nascent RNAs, but ok for mRNAs and proteins). This is fine by me as long as the authors make this limitation of their model more explicit.

      Part 3. Response to Reviewer 3 (Part 2 of Rev 3)

      • concerning replication: Note that the mean field approximation is on cell volume, not on stage progression ("To simplify this model, [...] we ignore volume fluctuations at each stage but retain fluctuations in the time elapsed between two stages", p3). So the time at which replication occurs is already a random variable in the model. It is the sum of all the exponentially distributed random variables corresponding to stages 1 to N_0. The resulting distribution of replication time from the start of cell cycle is a random variable, which can be anything from very deterministic (N_0 very high) to very variable (N_0 very low).
      • concerning nascent mRNA, ON/OFF etc. : I'm not sure I get your objection, but the best is probably to let the authors respond to your original comment.
      • concerning parsimony: Ok, you're right. The authors should test it.
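The point about replication timing can be illustrated with a small simulation. This is a minimal sketch that assumes all N_0 stages share one constant rate (in the model the rates depend on cell volume, which is ignored here); `replication_time_cv` and its parameters are hypothetical names. With identical rates the sum of the stage durations is Erlang distributed and its coefficient of variation is 1/sqrt(N_0).

```python
import math
import random


def replication_time_cv(n0: int, rate: float = 1.0,
                        samples: int = 20000, seed: int = 0) -> float:
    """Estimate the coefficient of variation of the time to replication.

    The replication time is simulated as the sum of n0 independent
    exponential stage durations with a common rate (a simplifying
    assumption; volume-dependent rates are not modelled here).
    """
    rng = random.Random(seed)
    times = [sum(rng.expovariate(rate) for _ in range(n0))
             for _ in range(samples)]
    mean = sum(times) / samples
    var = sum((t - mean) ** 2 for t in times) / (samples - 1)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    for n0 in (1, 5, 25, 100):
        # Compare the simulated CV with the Erlang value 1/sqrt(n0).
        print(n0, replication_time_cv(n0), 1.0 / math.sqrt(n0))
```

A small N_0 gives a broad, nearly exponential replication time, while a large N_0 gives an almost deterministic one, in line with the range described in the bullet on replication.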

      Significance

      The advance of this paper is essentially technical. The authors present a model that incorporates and unifies previously studied effects (cell volume homeostasis, concentration homeostasis, bursting transcription). There is no major conceptual novelty, but the combination of these different aspects and the derivations that the authors present are very valuable and might be applicable to interpreting data in various species.

      The paper is suitable for a physics/mathematics/computational audience. It is rather technical and would not be understood by readers with only a biology background.

      Field of expertise of the reviewer: Gene regulation, single-molecule imaging, stochastic modeling.

    1. The hypocrisy and the cruelty are maddening.

      I have a general idea of Amanda Knox's story but I had never heard any specific details about the story like names or places or how she was treated. I find that with most aspects of society, especially with online activities, people do tend to go for the crazy and outlandish stories. Once most people make up their minds about a person or story, it can be hard to change their viewpoints. No matter how many times Amanda may want to show the proof that she is an innocent person caught in the wrong place at the wrong time, those people who paint her in a certain light will never change their viewpoints. Another story I can think of that shares some similarities is the Gypsy Rose case. Now, Gypsy was active in the crime whereas Amanda was not active in her alleged crime. The main similarities between the two stories are how the media grabbed hold of them and that there are shows, movies, characters, etc., that are based on these real-life people and the real things that happened to them. There are plenty of people who want to hold Gypsy as accountable as her then-boyfriend and others who think she was innocent but a product of her surroundings. The way this little girl, as we were told at the time of the crime, was painted as a monster is insane to think about. But if that can happen to a young woman, then anything can be thought about a mid-twenties adult woman in a foreign country. The way the public romanticizes or dehumanizes a person for actions they may or may not have taken can be insane to think about. The people who get treated this way almost never get to go back to being a normal everyday person.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We wish to thank all three reviewers for their thorough examination of our manuscript and their constructive criticism that allowed us to increase its quality. You will see that, following their recommendations, we have included a good amount of new data in the manuscript. Specifically, we added a new figure with experiments proposed by the reviewers (now Fig. 4), as well as Figs. S3 and S4. In addition, we expanded one paragraph of our Discussion to comment on a very recent article published by Huang et al. in Nature Structural and Molecular Biology with conclusions pertaining to the interplay of Rpd3 and Gcn5 in PHO5 gene regulation. Below we include the point-by-point response (in blue) with the changes we have implemented to address their specific points. All the additions and changes in the manuscript are made in red.

      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Novačić et al. investigate the mechanism of non-coding transcription-driven regulation of the phosphate-responsive PHO5 gene. The authors employ a CRISPRi system to discern the direct contribution of antisense non-coding transcription (CUT025), expressed during phosphate-rich conditions, to transcriptional repression of the yeast PHO5 gene, thereby challenging a previous study from Svejstrup's lab that proposed a positive role for non-coding transcription in the control of the PHO5 gene. They propose a model where non-coding transcription represses PHO5 by mediating recruitment of the Rpd3 histone deacetylase, leading to altered chromatin structure at the PHO5 promoter due to reduced recruitment of the RSC chromatin remodelling complex. Overall, the data presented in the manuscript are of good quality, and the experiments are well controlled and nicely presented. The manuscript is well written. My specific comments are below:

      1. I am somewhat confused by the data presented in Figure 5. While there is a similar impact on the chromatin structure seen in rrp6Δ and air1Δ air2Δ strains (Fig 5C) that corresponds to a more "closed" configuration of chromatin, it is not consistent with H3 ChIP data that show higher nucleosome occupancy across the PHO5 UAS in rrp6Δ but loss of nucleosomes in the double mutant (or is there perhaps a mistake in plotting the data?)

      We now realize that the data was plotted confusingly, and we apologize for it. While doing the H3 ChIP experiment we only prepared the +Pi samples for the air1Δ air2Δ double mutant. In the figure we only included this one data point for the double mutant, which could lead to the false conclusion that at other timepoints there are no histones at its PHO5 promoter region. We decided to remove this data point from the figure to avoid the confusion and only keep the air1Δ air2Δ data for the ClaI assay. We believe that this should not be an issue as this data point is not critical for the conclusions we are making.

      1. To further explore direct link between nc transcription, Rpd3 and rrp6 mediated effect, I suggest to test the effect on PHO5 induction upon rpd3 and rrp6 deletions in CRISPRi CUT025 background.

      We performed this experiment and now include it as Fig. S3 in the manuscript. As expected, expressing the CRISPRi system only made difference when Rpd3 was present.

      1. It seems that most noticeable effect of blocking nc transcription by an elegant approach that utilizes CRISPRi system on the phosphatase activity is seen between 0-1.5h of induction. I suggest taking additional time points at 30-45 min.

      We took additional timepoints and the results were incorporated as the new Fig. 5E. The CRISPRi effect resulting in higher acid phosphatase activity was still most noticeable after 1.5 h of induction. This was mostly in line with the fact that the difference in PHO5 mRNA levels was most pronounced after 30 min of induction (Fig. 5D), as the time needed to achieve a measurable protein level after induction can lag significantly for secretory proteins, such as acid phosphatase. Secretory proteins are cotranslationally translocated into the ER, after which they traverse the secretory pathway and undergo modifications before being finally exported to the periplasm where their activity can be measured. Consequently, the increase in acid phosphatase activity upon induction is only measurable after at least an hour.

      1. How do authors explain that the effect of the exosome mutations are reversed and phosphatase activity is increased at later time point (20 h, Fig 2A)? I suggest using more distinct colour for dis3 mutants.

      That effect is indeed somewhat surprising. We hypothesize that the effects we are seeing after 20 h reflect the specific conditions of prolonged induction, i.e. keeping the chromatin open or semi-open for a very long period of time, which do not necessarily reflect the early gene induction period that we are using as a read-out of the effect of different mutations on acid phosphatase expression kinetics. We previously noticed a similar effect with chromatin remodeler-related mutants (e.g. rsc2Δ, unpublished result from S. Barbarić group), which speaks in favour of the prolonged induction conditions resulting in a chromatin state with its own specialized cofactor requirements. We therefore consider the chromatin state after prolonged induction a topic for another study; however, we now comment on this effect in the manuscript. The dis3 mutants are now shown in more distinct colours.

      1. Figure 5A -label "H3 ChIP"

      The label was added.

      1. Error bars are quite high in Fig 1C, perhaps it is worth repeating the experiment

      Since significant differences in PHO5 mRNA levels can be seen between wt and rrp6Δ mutant cells at 0.75 and 3 h of induction, we feel that the higher error bars at 5 h of induction do not justify repeating the experiment, especially since the values are bound to converge to a similar one after a longer induction period, as demonstrated in Fig. 1D.

      Significance

      significant of interest for general audience

      Referee #2

      Evidence, reproducibility and clarity

      The authors study the PHO5 locus, which is known to have an antisense transcript that has previously been shown to be important for activation of Pho5 sense transcription. The authors challenge this idea with extensive analyses. They show that Pho5-AS represses sense transcription, and thus fits in the category of AS repressors instead of activators. They show correlative data that when antisense goes down, sense goes up. They show that increased antisense levels lead to decreased sense levels. They use mutants of decay pathways to increase the levels of antisense transcription. Moreover, they used CRISPRi to repress the antisense transcript. Lastly, they show that histone deacetylation represses Pho5 sense. The data in the manuscript are convincing and well presented. One thing that needs further clarification is the strategy to increase antisense levels by deletion mutants of decay or depletion of decay pathways. While it is clear that this stabilizes the pho5-AS and decreases pho5-sense, it is not clear that this causes an increase in transcription. Perhaps it is possible that the antisense transcript itself has a repressive effect. If one really wanted to increase antisense transcription, then the antisense promoter should be increased in strength. On the other hand, the CRISPRi experiment is very convincing. I am surprised how well the CRISPRi system works; it is thought to be not so efficient at blocking elongating polymerase and good at blocking initiation.

      We thank the reviewer for this feedback. We performed additional experiments which you will find described below. Based on the results, we would like to keep the point about AS transcription causing the effect.

      Major comments: - Are the key conclusions convincing? Perhaps the conclusion that increased transcription leads to repression is not completely convincing. The authors use mutants in rrp6, exosome, and nrd1 to increase Pho5-AS transcription elongation. However, I have always been under the impression that these mutants stabilize the transcript. And the authors acknowledge this in their manuscript. So how do you discriminate between increased stability versus increased elongation? I support the conclusion that inhibition of Pho5-AS leads to increased Pho5-S. However, an increase in elongation is not directly demonstrated. While still possible, it is equally possible that a more stable pho5-AS transcript has a repressive effect on Pho5-AS. - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? See above. - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. If the authors want to keep the message that increased transcription of Pho5-AS leads to more repression, they may need to consider additional experiments. For example, increasing transcription from the antisense promoter.

      We performed the proposed experiment and now include it in the manuscript as Fig. 4AB. Briefly, we inserted the strong constitutive TEF1 promoter in the antisense configuration downstream of the PHO5 gene ORF, so that it drives AS transcription. The results of this experiment very clearly show the inverse relationship between PHO5 mRNA and AS transcripts levels at +Pi conditions. Importantly, this strong constitutive AS transcription had an even more pronounced effect on PHO5 gene expression than deletion mutant backgrounds (in which, like in wt cells, the AS promoter is presumably weak), and did not allow for full level of PHO5 gene expression to be reached. To verify that the AS RNA itself does not have a regulatory role, but rather the act of its transcription represses the corresponding gene, we performed an additional experiment with appropriate diploid strains. The design of this experiment is standardly used to test whether an AS transcript can work in trans (for example see Nevers et al. 2018 NAR Fig. 6). This experiment is now included as Fig. 4C. Together, the results of these experiments paint a clear picture of AS transcription, and not AS level/stability itself, driving the repression of the PHO5 gene.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. To me this is an optional experiment, but it would benefit the manuscript
      • Are the data and the methods presented in such a way that they can be reproduced? yes - Are the experiments adequately replicated and statistical analysis adequate? yes

      Minor comments: - Specific experimental issues that are easily addressable. - Are prior studies referenced appropriately? yes - Are the text and figures clear and accurate? Yes - Do you have suggestions that would help the authors improve the presentation of their data and conclusions? no

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The manuscript challenges previous work where it was claimed that Pho5-AS is important for activation of Pho5-S. As such, it is important work. In the field of noncoding transcription, Pho5-AS fits in a class of AS transcripts that has been well described.
      • Place the work in the context of the existing literature (provide references, where appropriate). See above.
      • State what audience might be interested in and influenced by the reported findings. Researchers in the fields of transcription, chromatin, and more specifically yeast gene regulation.
      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Chromatin, transcription, yeast.

      Referee #3

      Evidence, reproducibility and clarity

      Novačić et al. present a manuscript entitled "Antisense non-coding transcription represses the PHO5 model gene via remodeling of promoter chromatin structure", which is a locus-specific follow-up to previous studies from the Soudet and Stutz groups on genome-wide analysis of transcription interference mediated by antisense transcripts in S. cerevisiae. Critically, the authors here employ a CRISPRi approach to prevent antisense transcription from reaching the PHO5 promoter and in doing so show that kinetics of PHO5 induction are increased, as would be predicted from their previous model. Additionally, they show predicted epistasis between rpd3 and rrp6 on PHO5 expression, and between gcn5 and rrp6, that is consistent with their model. Comments are relatively minor but should be addressed. Introduction p3. "This mechanism was subsequently explored genome-wide in yeast, which revealed a group of genes that in the absence of Rrp6 accumulate AS RNAs and are silenced in an HDAC-dependent manner (14)." This sentence appears awkward; perhaps move "in the absence of Rrp6" to after "AS RNAs"?

      Corrected as proposed.

      p3 "Under a high phosphate concentration Pho4 undergoes phosphorylation by the cyclin-dependent-kinase (Pho80-Pho85)" Since "the" is used, don't use parentheses around Pho80-Pho85.

      Corrected as proposed.

      Methods: Give the amount/concentration of glycine used in quenching formaldehyde for ChIP. Give the exact wash conditions and buffers, not "extensively".

      All of those details are now provided in the manuscript.

      Figure 4C. Describe schematic in legend.

      It is now described.

      Figure 4D. Indicate time of induction in legend.

      This was lacking for Figs. 4B-C (now 5B-C) so we added it there.

      Figure 5A. air∆ data are missing from later time points?

      Please see our first response to Reviewer 1. We removed the air1Δ air2Δ double mutant data, as we only had one data point for it in this assay.

      Figure 6. Legend needs to indicate what the Pi conditions are. Since PHO5 is expressed, it appears to be low Pi. An issue that needs to be discussed is that rpd3∆ appears to decrease expression of PHO5 AS. Is this simply because of increased PHO5 expression? Does rpd3∆ have any effects on AS in high Pi? This is important for interpreting whether the effects of rrp6 and rpd3 are epistatic or additive.

      We thank the Reviewer for bringing this to our attention. To explore the effect of rpd3Δ on PHO5 AS level, we quantified the PHO5 AS transcript by RT-qPCR with cells grown in (chemically defined) high Pi medium, which we now include in Fig. 7A. We find that rpd3Δ mutation has practically no effect on PHO5 AS transcript level both in the wt and the rrp6Δ mutant background. This result speaks in favor of rrp6Δ and rpd3Δ being epistatic rather than additive.

      Figure 7. Sth1-CHEC data are hard to interpret. Some sort of quantification might be required, as the effects are not clear from the browser track, nor is it clear from the browser track that the results are reproducible. Examination of Sth1-AA effects in a gcn5∆ background might provide more compelling evidence that the effect on RSC is via acetylation. Otherwise it is a bit hard to say, as RSC could be functioning in parallel to the acetylation-dependent pathways implicated.

      We agree that the presumption that histone acetylation recruits RSC to the PHO5 gene promoter had to be tested. We therefore include the experiment involving Sth1-AA depletion in the gcn5Δ background as Fig. 8A. This experiment was complicated by the fact that RSC is highly abundant (and at the same time essential for cell viability), but we resolved this by starting to deplete RSC two hours before gene induction. These results position RSC and Gcn5 in the same pathway. In contrast, more complete Sth1 depletion severely impaired viability of the rrp6Δ mutant, making it hard to interpret the effect, so we now include this result as Fig. S4.

      To show the effect of AS transcription on RSC recruitment to the PHO5 promoter more quantitatively, we re-analyzed the Sth1-CHEC data (for two independent biological replicates) and now include the log2 values for the changes in Sth1 binding in the text of the manuscript.
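      As a minimal sketch of how a log2 change in binding signal can be computed per replicate and then averaged, consider the example below. The signal values and variable names are hypothetical placeholders chosen for illustration; they are not taken from the Sth1-CHEC data, and the authors' actual normalisation procedure is not described here.

```python
import math


def log2_fold_change(treated, control):
    """Return log2(treated / control) for two positive signal values."""
    if treated <= 0 or control <= 0:
        raise ValueError("signals must be positive")
    return math.log2(treated / control)


# Hypothetical normalised binding signal at a promoter (arbitrary units),
# one entry per independent biological replicate.
replicates = [
    {"control": 8.0, "treated": 4.0},
    {"control": 10.0, "treated": 5.0},
]

# One log2 value per replicate, then their mean.
changes = [log2_fold_change(r["treated"], r["control"]) for r in replicates]
mean_change = sum(changes) / len(changes)

print(changes, mean_change)
```

      A negative value indicates reduced binding relative to the control; reporting the per-replicate values alongside the mean shows how reproducible the change is.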

      Significance

      The work is focused and narrower in impact but important because direct tests of locus-specific effects are performed, validating models from previous genomic analyses.

      **Referees cross-commenting**

      I think the other reviews are very reasonable. I would just suggest that the authors think carefully about the reviews and decide what they consider most valuable for improving the work and its presentation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Revision Plan

      1. General Statements

      We really appreciate the reviewers' positive comments and suggestions on our submitted manuscript. We think we will be able to address the issues raised by the reviewers by adding new data and revising the text as detailed below.

      2. Description of the planned revisions

      Reviewer #1:

      Major comments

      Localization analysis of a transiently expressed MAP70 transgene with inactivating phosphosite mutations would be important to see whether the identified conserved phosphosites are relevant for MAP70 interaction with MTs. This experiment could be performed rapidly using transient expression in BY-2 cells.

      We agree on the importance of this analysis. Therefore, we are currently preparing fluorescent markers of Nt-MAP70-2-like and its phospho-blocked (Ala) version to coexpress with MT and nuclear markers in BY-2 cells. We estimate that we need three more months to complete this experiment.

      The authors propose that PP2 blocks phragmoplast formation by preventing phosphorylation of class II Kinesin-12 proteins. In support, the authors show that PP2 treatment correlates with a decrease in KIN12A phosphopeptide count (not fully abolished) and its failure to localize to emerging phragmoplasts in BY-2 cells and Physcomitrium. As class II Kinesin-12 proteins have been previously implicated in phragmoplast assembly, this is a fairly reasonable hypothesis, but it would benefit from the analysis of transgenic KIN12A variants carrying inactivating (A) or potentially activating (D/E) phosphosite mutations. Is loss of phosphorylation sufficient to prevent phragmoplast localization? Can an activated variant rescue PP2-induced KIN12A localization and cell division defects? As above, using transient expression in BY-2 cells would be a fast approach to tackle these questions.

      We are currently preparing fluorescent markers of phospho-blocked (Ala) and phospho-mimic (Asp) versions of KIN12A (PAKRP1) to coexpress with MT and nuclear markers in BY-2 cells. We will check whether they localize to the phragmoplast and also test the effects of PP2. We would need three more months to complete these analyses.

      Reviewer #2:

      Major comments

      • The manuscript would strongly benefit from being revised by a native English speaker. There are many unusual or awkward formulations, in particular in the abstract.

      We apologize for the unnatural sentences. After adding new data and correcting the manuscript, we will ask a native English speaker to revise it.

      Reviewer #3:

      Major comments

      The major concern is lack of evidence to connect MAP70 and MT disruption upon treatment with PD-180970, in contrast to PP2, which was shown to affect localization of Kinesin-12. I wonder if authors could use taxol to stabilize MTs, then observe the localization of MAP70 with application of PD-180970?

      As we responded to reviewer 1, we are preparing the fluorescent marker of Nt-MAP70-2-like to coexpress with MT and nuclear markers in BY-2 cells. By using this multi-color marker, we will test whether PD-180970 affects the localization of MAP70 on MTs, also using taxol. However, in our experience, taxol is not a very effective inhibitor and may not work in our transient expression system in BY-2 cells. In that case, we will analyze whether the phospho-mimic (Asp) version can prevent MT disruption in the presence of PD-180970 to assess the relationship between PD-180970, MAP70 and MT disruption.

      I have another concern about the action of PD-180970. PD-180970 appears to affect proteins that are ubiquitously indispensable for MTs. If PD-180970 disrupts MTs by inhibiting phosphorylation of some MAPs, time must be needed for the turnover of proteins phosphorylated before PD-180970 was applied. In the proteomics experiment, the authors treated the cells with the compounds for 8-9 hr. On the other hand, in BY-2 cells, PD-180970 disrupted MTs only 30 min after its application. I wonder if proteins were replaced during the 30 min. Could the authors examine how long it takes to affect interphase MTs? If PD-180970 disrupts MTs in 5-10 min, like oryzalin, it is unlikely that inhibition of phosphorylation of proteins like MAP70 caused MT disruption. Rather, it may inhibit some proteins that have activity to disrupt microtubules but are usually inactivated by phosphorylation, or inhibit something directly without phosphorylation.

      We agree that there is no evidence that PD-180970 disrupts MTs by inhibiting phosphorylation of MAP70. In our live-imaging system, in which reagents are added to liquid cultivation medium, the time from the reagent application to the arrival to each cell varies. Therefore, in order to accurately measure the time required for the inhibitor to take effect, it is necessary to design a new assay system, such as using fluorescent dyes to monitor the reagent's diffusion. In addition, since some reactions mediated by protein phosphorylation occur rapidly, minute-order observations might not be sufficient. Therefore, as an alternative strategy to assess the direct involvement of MAP70 phosphorylation on MT stabilization, we will examine whether PD-180970 induces MT disruption using strains expressing the phospho-blocked (Ala) and phospho-mimic (Asp) versions of MAP70 described above.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1:

      Minor comments

      The authors identified the analogs PD-166326 and PP1 as potent inhibitors of cell division. For completeness, it would be interesting to include a description of these arrest phenotypes and how they compare with that of PD-180970 or PP2.

      We have added the effects of all tested compounds on Arabidopsis embryos in Fig. S3C and Table S1. Based on these data and the results from tobacco BY-2 cells, we have compared the effects of PD-166326 and PD-180970, and of PP1 and PP2, in the Results.

      Although there are two more obvious candidates in the phosphoproteome datasets on which the authors focus, there is very little discussion of the other top hits and whether they might be involved in cell division. On a related note, there is no discussion of the specificity of these compounds and the likelihood of phenotypes unrelated to cell division.

      We have added “Similar proteins in Arabidopsis” and “Description and putative functions” information for all identified candidates for PD-180970 and PP2 in Tables S2 and S3, respectively. With reference to this information, we have added sections to the Results describing the possible contributions of these candidates to MT organization and phragmoplast formation. In addition, we have described the specificity of these compounds and the phenotypes unrelated to cell division in the section on the results for Arabidopsis roots (Fig. S2A).

      1st results section:

      "...developed into the globular stage without causing morphological defects..."

      Should omit the word "causing" or replace with "any/detectable"

      We have omitted the word "causing".

      Reviewer #2:

      Even if the identification of the kinase(s) targeted by these two compounds is missing, the characterisation of at least two downstream effectors of these elusive kinase(s) inhibited by PD-180970 and PP2 is an important step forward. I would recommend making this point very clear in the writing (e.g. already in the abstract). Upon a superficial reading, the reader could assume that MAP70s and PAKRP1s are the direct molecular targets of these compounds.

      We appreciate the very positive comments. To clarify this point, in addition to the following responses to each suggestion, we have changed the last sentence of the abstract to “These properties make PD-180970 and PP2 useful tools for transiently controlling plant cell division at key manipulation nodes that are conserved in diverse plant species”.

      Major comments

      • I would modify the title to shift the emphasis from the methodology to the biological targets identified.

      We have changed the title to “Identification of novel compounds inhibiting microtubule organization and phragmoplast formation in diverse plant species”.

      • Concerning MAP70s the authors claim that there is little functional data about this family. Yet, a recent paper (https://www.science.org/doi/10.1126/sciadv.abm4974) identifies MAP70-5 as necessary for the proper organisation of CMTs in the endodermis and its ability to actively remodel to accommodate emergence of the lateral root primordium in Arabidopsis thaliana. This could provide a functional context to test several of the predictions that the authors list in the discussion.

      We have cited this paper in Results and Discussion, as “MAP70-5 was reported to increase MT length in vitro and to reorganize cortical MTs to alter the endodermal cell shape for lateral root initiation, suggesting that MAP70-5 mediates dynamic change of MT arrays”.

      Minor comments

      • The narrative would be improved by moving the section "PD-180970 and PP2 do not irreversibly damage viability" before the phosphoproteomic section.

      We have moved the “irreversibly” section to before the “phosphoproteomics” section.

      Reviewer #3:

      Minor comments

      In the supplemental data, the authors show only 12 or 14 candidate targets. It would be interesting to see how other MAPs, including homologues of MAP70 and Kinesin-12 in BY-2 cells, were scored in the phosphoproteomics assay. I suggest the authors show longer proteomics lists that include other MAPs. This would be valuable information for the research community.

      We apologize for not providing the complete dataset. We have added Dataset S1, containing the total protein sequences that we predicted from published RNA-seq data of BY-2 cells, and all proteins identified in the phosphoproteomics assay for PD-180970 and PP2 in Datasets S2 and S3, respectively. We have moved the lists of top candidates to Tables S2 and S3.

      In the Abstract, the authors should mention that the two compounds reduced the phosphorylation level of diverse proteins including MAP70 and Kinesin-12. This is a very important result and, otherwise, it may cause misunderstanding of the activity of the compounds. In addition, it would be better to replace the phrase "presumably by inhibiting MT-associated proteins (MAP70)" with "presumably by inhibiting phosphorylation of MT-associated proteins (MAP70)."

      To avoid such a misunderstanding, we have changed the descriptions in the Abstract to “Phosphoproteomic analysis showed that these compounds reduced phosphorylation level of diverse proteins. In particular, PD-180970 inhibited phosphorylation of the conserved serine residues in MT-associated proteins (MAP70). PP2 significantly reduced the phosphorylation of class II Kinesin-12, and impaired its localization at the phragmoplast emerging site”. Due to this change, the suggested sentence was eliminated. Also in the Discussion, we have mentioned the reduction of phosphorylation of various proteins by stating, "we found that PD-180970 and PP2 reduced the phosphorylation levels of diverse proteins". These parts may be further modified depending on the results of the phospho-blocked (Ala) and phospho-mimic (Asp) analyses.

      Page 7, 1st line: it would be better to insert "of MAP70 family" after "in the conserved MT-binding domain" because the MT-binding domains are unique to the MAP70 family. I could not understand why this is (2nd line) "consistent with PD-180970 severely disrupting all the tested MT structures". At the current stage, there is no evidence that dephosphorylation of MAP70 caused the microtubule disruption. I suggest the authors remove the sentence (", which was~MT structures").

      We agree on both points and have corrected them as the reviewer suggested.

    1. the exhibition of Miss Clack’s character.

      I think this is an interesting way to phrase this. Taken by itself, we may have taken Clack's narrative as the truth, but when examined beside the other narratives, her unreliability is exposed and her character traits (and flaws) become clear.

    1. Author Response

      Reviewer #1 (Public Review):

      Viola et al. compared the electron transfer efficiency of two types of oxygenic far-red photosystem II (PSII) with the "conventional" PSII and analyzed how these far-red PSII use the limited energy from infrared photons to carry out photosynthesis. Oxygenic photosynthesis is an energy-intensive process, and a large headroom is also needed to prevent harmful back-reactions, which can produce singlet oxygen, from occurring. This research investigated how the far-red PSII managed to do their work with limited energy.

      The authors measured and compared the forward reactions of different kinds of PSII (Chl-a-PSII, Chl-d-PSII and Chl-f-PSII), including the flash-induced chlorophyll fluorescence decay and S-states turnover. These results led to a conclusion that the forward reaction quantum efficiency was not changed between "conventional" PSII and far-red PSII. However, the back-reactions of three types of PSII are different based on the measurements of the prompt fluorescence decay, delayed luminescence decay, and thermoluminescence band locations. The authors concluded that the two far-red PSII (Chl-d-PSII and Chl-f-PSII) have a different strategy for utilizing infrared light. Indeed, the authors showed that Chl-d-PSII containing cyanobacteria produced more singlet oxygen than other types, and this result was explained by the energy profile in the electron transfer chain.

      The major strength of this research is that the authors made a direct comparison of different far-red PSII under the same conditions. It's exciting to have a side-by-side comparison between two types of far-red PSII. In addition, the authors also measured the singlet oxygen produced from all types of PSII, which clearly showed the differences in the routes of recombination.

      We thank the reviewer for the interest demonstrated in our work and for the thoughtful comments, which we have addressed below.

      However, there are some concerns:

      1) The flash-induced fluorescence decay, thermoluminescence, delayed luminescence and S-states turnovers of the Chl-d-PSII and Chl-f-PSII have been characterized before (ref 5, 26, 39), but from intact cells compared to isolated membranes in this study, and similar conclusions have been achieved. The authors mentioned four reasons (lines 115-120, see the manuscript for the authors' arguments "i." to "iv.") why it's important to use isolated membranes. However, in my opinion, these reasons are not sufficiently strengthened:

      i. The transmembrane potentials from cells can be collapsed by adding uncouplers;

      ii. The authors mentioned the quinone pool in the cells is uncontrollable, but the authors didn't actually measure or manipulate the quinone pool in the membrane (e.g., the ratio of QB/QB-/empty-pocket in the samples);

      iii. The phycobilisomes can be controlled by different conditions through state transitions;

      iv. The isolation of membranes may not remove membrane-related quenching mechanisms (e.g., PSII quenching in State II, spillover, etc.).

      We do not agree with the reviewer on this point. We consider the use of membranes (or isolated PSII) as being the best solution to limit the effects listed at the end of the Introduction and to provide consistency between the different measurements, some of which cannot be performed in intact cells (i.e., the UV absorption measurements). More specifically:

      i) The effectiveness of uncouplers in dissipating the membrane potential is likely to vary between species (e.g., Chroococcidiopsis cells form aggregates encapsulated by a protective layer of excreted polymers) and should be assessed by directly measuring the membrane potential. ElectroChromic Shift-based measurements of the membrane potential in cyanobacteria have only been demonstrated in Synechocystis sp. PCC6803 and Synechococcus elongatus sp. PCC7942 (Viola et al. 2019, https://doi.org/10.1073/pnas.1913099116) and still need to be adapted to the far-red species used here. Additionally, commonly used uncouplers such as CCCP and FCCP are ADRY reagents, which interfere with PSII water splitting by directly reducing TyrZ (Ghanotakis et al. 1982, https://doi.org/10.1016/0005-2728(82)90115-3), and would affect all the measurements presented in this work.

      ii) In the dark, the redox state of the PQ pool in cyanobacterial cells has been observed to be kept in a highly reduced state by respiration, with potential consequences on the QB/QB- ratio. This could well vary between species, based on their different physiologies and growth conditions. In isolated cyanobacterial membranes and PSII, the QB/QB- ratio is expected to be around 50% after a short dark adaptation. This seems to be the case in our samples, based on the flash-dependent oscillations of the S2QB- and S3QB- thermoluminescence shown in Appendix 2 compared to the literature (Rutherford et al. 1982, https://doi.org/10.1016/0005-2728(82)90061-5), assuming an initial ~75% S1 population, as confirmed by the flash-dependent oxygen evolution and UV absorption. This is now mentioned in Appendix 2.

      iii) The control of state transitions requires specific illumination regimes incompatible with the conditions required for our experiments. Moreover, state transitions remain largely uncharacterised in the far-red species used in the present work. In some of these species, the situation is further complicated by the presence of both visible and far-red light-absorbing phycobilisomes that have a different spatial distribution in the cell (MacGregor-Chatwin et al. 2022, https://doi.org/10.1126/sciadv.abj4437).

      iv) Non-photochemical energy quenching in cyanobacteria seems to occur in phycobilisomes, due to the action of the Orange Carotenoid Protein (OCP). Both OCP and the phycobilisomes, if present in cyanobacterial cells (and that depends on the strains), are removed when membranes are isolated. It’s been proposed that direct quenching of the PSII core occurs in Synechococcus elongatus 7942 cells in state II (Choubeh et al. 2018, https://doi.org/10.1016/j.bbabio.2018.06.008), but since the mechanism has not been elucidated, no conclusion can be made on whether this could occur in membranes. The same is true for spill-over. Additionally, neither of the two mechanisms could be better controlled in cells than in membranes, so there would be no advantage here from working in vivo.

      In addition, the authors reached the conclusion that the Chl-f-PSII-containing species should suffer from membrane potential spikes induced by fluctuating light, but did not actually measure this in physiologically relevant preparations. It would be more beneficial to use intact cells instead of isolated membranes. I suggest the authors either restrict their conclusions to what the isolated membranes clearly show or make measurements in intact cells.

      The proposal that the far-red forms of PSII (both Chl-d-PSII and Chl-f-PSII) should suffer from increased charge recombination induced by spikes of membrane potential in fluctuating light is not new (see for example Nürnberg et al. 2018, https://doi.org/10.1126/science.aar8313), and is based on the observations made in plant PSII (Davis et al. 2016, https://doi.org/10.7554/eLife.16921) and assumed to be universal in oxygenic photosynthesis. In PSII, the transfer of electrons from the primary donor chlorophyll to QA occurs vectorially in the membrane, against the trans-membrane electric field, thanks to these electron transfer steps being exergonic. Spikes in the electric field due to sudden intensity fluctuations increase the probability of backward electron transfer. If the overall drop in the energy of the electron from the primary donor to QA is smaller (in a long wavelength PSII), it should result in a higher probability of backward transfer for a given trans-membrane electric field, and therefore a greater susceptibility to spikes in the electric field. We did not measure these effects and we do not claim to have done so. As already mentioned in the answer to point i) above, doing so would require the development of ElectroChromic Shift-based measurements of the membrane potential in the cyanobacterial species containing far-red photosystems. This is a separate research project beyond the scope of the present work.

      In conclusion, we believe that our statement justifying the use of isolated membranes at the end of the Introduction is valid.
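      The energetic argument above (a smaller energy drop from the primary donor to QA makes backward transfer more likely for a given trans-membrane field) can be illustrated with a simplified two-state Boltzmann estimate. This is an illustrative sketch and not a calculation taken from the manuscript: the symbols ΔG (the free-energy drop, taken as positive), Δψ (the trans-membrane potential) and f (the assumed fraction of Δψ spanned by the electron transfer step) are introduced here for illustration only.

```latex
% Simplified two-state Boltzmann sketch (illustrative assumption).
% \Delta G   : free-energy drop stabilising the electron on Q_A (positive)
% \Delta\psi : trans-membrane electric potential difference
% f          : assumed fraction of \Delta\psi spanned by the step, 0 < f \le 1
\[
  \frac{[\text{upstream state}]}{[\text{stabilised state}]}
  \;\approx\;
  \exp\!\left(-\,\frac{\Delta G - f\,e\,\Delta\psi}{k_{\mathrm{B}} T}\right)
\]
% At T = 298 K, k_B T \approx 25.7 meV, so lowering the effective gap
% (\Delta G - f e \Delta\psi) by 60 meV raises this ratio by
% \exp(60/25.7) \approx 10.
```

      Under these assumptions, a photosystem that starts with a smaller ΔG reaches a given backward-transfer probability at a smaller Δψ, which is the sense in which it would be more susceptible to spikes in the electric field.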

      2) The authors measured the fluorescence decays as part of the evidence to show the stability of S2QA-. I have several concerns about these measurements:

      i. In figure 2B, the WL C. thermalis (blue) trace has a unique decay phase with a lifetime of about 0.2s, which the authors denoted as S2QA- recombination. Could the author elaborate on how this phase was assigned to this state?

      All decay kinetics in the presence of DCMU are bi-phasic (with an additional faster phase in the WL and FR C. thermalis samples, attributed to a small fraction of centres where DCMU did not bind). In the manuscript we did originally assign both phases as arising from S2QA- recombination, but it is true that the middle phase, which is slightly faster in WL C. thermalis, is too fast to originate from that. This phase can rather be ascribed to TyrZ•(H+)QA- recombination occurring in a fraction of intact PSII centres before the full stabilization of charge separation, as shown in Debus et al. 2000 (https://doi.org/10.1021/bi992749w), or in centres lacking a Mn-cluster. We have now modified the paragraph regarding the fluorescence decay in the presence of DCMU accordingly (L. 142-145): “The shorter lifetime (~0.22-1 s) of the middle decay phase (amplitude 15-20%) was compatible with it originating from TyrZ•(H+)QA- recombination occurring either in centres lacking an intact Mn-cluster (24) or in intact centres before charge separation is fully stabilised, as proposed in (23).”.

      A luminescence decay phase with a similar lifetime was initially ascribed, incorrectly, only to TyrZ•(H+)QA- recombination occurring in centres devoid of an intact Mn-cluster, in Appendix 5. This has now been rectified.

      ii. In figure S1 (the full version of 2B), all the fluorescence traces seem to rise at the end of the measurements. Could the authors check whether the measuring light intensity was actinic?

      This rise is significant only in the A. marina dataset (now Figure 2-figure supplement 1), and given the low signal-to-noise ratio in the last points of the fluorescence curve, we consider this small anomaly to be a measuring artefact. The rise is absent in the other traces in Figure 2-figure supplement 1 and in Figure 2B, except for the last point of the A. marina dataset in Fig. 2B. The corresponding Source data show that a rise in the last point of the measurements is only present in one of the three A. marina replicates (#2), while the non-decaying fluorescence is present in all A. marina samples and is discussed in the text. Except for this last anomalous point, the decay curves of the A. marina replicate #2 do not differ significantly from the other two replicates. This clearly suggests an artefact, and is not consistent with the measuring light being actinic. A clarifying sentence has been added in the legend of Figure 2-figure supplement 1.

      iii. In figure S2, it seems to me that the fluorescence decay of Synechocystis + DCMU (Green open squares) was slower than the WL C. thermalis and is similar to the FRL C. thermalis in figure 2B. If the Synechocystis + DCMU is indeed similar to FR C. thermalis, would that be consistent with the authors' conclusions?

      When fitting the Synechocystis+DCMU fluorescence decay kinetics (in what is now Appendix 1-figure 1), we obtain two decay phases with, respectively: an amplitude of ~12% and lifetime of ~0.22 s, and an amplitude of ~81% and lifetime of ~7.9 s. These values are similar to those reported for WL C. thermalis in Table 1, with an overall fluorescence decay faster than in FR C. thermalis. Nonetheless, because of the limited number of Synechocystis biological replicates, we limit ourselves to a qualitative comparison. The luminescence decay kinetics are also faster in Synechocystis (as in WL C. thermalis) than in FR C. thermalis (now Figure 5- figure supplement 2).

      These data are consistent with our conclusions: the energy gap between QA- and Phe in Chl-f-PSII is at least as large as in Chl-a-PSII, or could even be larger, as suggested by the slower S2QA- recombination measured by fluorescence (Figure 2) and luminescence (Figure 3) decay.
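      To make the two-phase description of the Synechocystis+DCMU decay concrete, the sketch below evaluates a sum of two exponentials using the approximate amplitudes and lifetimes quoted above (~12% at ~0.22 s and ~81% at ~7.9 s). Treating the remaining ~7% as a non-decaying offset is an assumption made here for illustration, as are the function and variable names; this is not the authors' fitting procedure.

```python
import math

# (amplitude fraction, lifetime in seconds) for each decay phase,
# using the approximate fit values quoted in the response above.
PHASES = [(0.12, 0.22), (0.81, 7.9)]

# Assumption: whatever is not covered by the two phases does not decay
# on this time scale and is modelled as a constant offset.
OFFSET = 1.0 - sum(amplitude for amplitude, _ in PHASES)


def fluorescence_remaining(t, phases=PHASES, offset=OFFSET):
    """Fraction of the initial variable fluorescence left at time t (s)."""
    return offset + sum(a * math.exp(-t / tau) for a, tau in phases)


for t in (0.0, 0.22, 1.0, 7.9, 30.0):
    print(f"t = {t:5.2f} s  remaining = {fluorescence_remaining(t):.3f}")
```

      With these parameters the fast phase is essentially complete within about one second, so the overall decay on the seconds time scale is dominated by the slower phase.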

      iv. It's known that DCMU will alter the redox potential of QA/QA- in plants. Would it have similar effects to the PSII studied in this research? If so, it will be meaningful to include these effects in the energy diagram in fig 7.

      Yes, we do expect DCMU to change the QA/QA- redox potential in our samples, as it does in plants and other cyanobacteria, although the actual effect in different PSII types would need to be measured. The energy gap values in what is now Figure 8 are only estimates based on literature values and on the relative changes reported here; they are not calculated from any of our data and do not specifically refer to the experimental conditions we used, including the use of DCMU. For this reason, we think that adding the effects of DCMU to the diagram would not be particularly useful and could be confusing.

      1. The authors didn't use WL C. thermalis for measuring oxygen evolution and the authors claimed that the PSII content in WL C. thermalis is too low. Is that a technical issue (e.g., cannot purify PSII enriched membranes) or a biological issue (i.e., white light condition produced less PSII)? In Fig S9C, the oxygen generated from WL C. thermalis is comparable to FR C. thermalis. Could the author explain how they reached the conclusion that PSII in WL C. thermalis was low? In addition, the author should also provide evidence showing that the samples of WL C. thermalis do not have significant PSII activity under far-red light.

      We did measure the flash dependence of oxygen evolution in WL C. thermalis membranes, and we did observe oscillations with visible flashes (but not with far-red flashes, as expected). However, the data were not good enough to be able to perform any significant analysis. Unfortunately, in the case of WL C. thermalis, we have not been able to isolate O2-evolving cores, as stated in L. 194-195. The WL C. thermalis data have now been added in Figure 3- figure supplement 1, together with the non-normalised traces of all other samples (following the suggestion by reviewer #3), and the text has been modified accordingly. The data in Figure 3- figure supplement 1 also provide evidence that the samples of WL C. thermalis do not have significant PSII activity under far-red light (although this was already clearly demonstrated in Nürnberg et al. 2018).

      We do have evidence that the PSII content per chlorophyll is lower in WL C. thermalis than in FR C. thermalis, based on fluorescence emission spectra, yield of isolated PSII and PSI from purification procedures, and O2 evolution per chlorophyll, as can be seen for example in Figure 3- figure supplement 1. The levels of PSII accumulation depend on the growth stage (among other factors) in model species such as Synechocystis. Since C. thermalis cells grow more slowly than other cyanobacteria species and their physiology has not been studied in detail yet, it is difficult to control the levels of PSII accumulation. This explains the inter-sample variability in the rates of O2 evolution per chlorophyll measured with the Clark electrode, that have now been added in Appendix 6-table 1.

      1. The authors used an indirect method, which used chemical trap histidine and oxygen consumption, for measuring the production of singlet oxygen from different types of PSII. I have several concerns about this approach.

      i. Why not use a probe that reacts directly with singlet oxygen probes like SOSG or EPR probes to unambiguously confirm the production of singlet oxygen? The difficulties of not using SOSG mentioned in Rehman et al (SI Ref#22) should be no longer problems when isolated membranes were used. The advantage would be a validation of the results and perhaps increased sensitivity.

      Although SOSG or EPR probes could also be used to detect singlet oxygen production, these other methods seem to be significantly less sensitive than histidine trapping. For example, Fufezan et al. 2007 (https://doi.org/10.1074/jbc.M610951200) used the EPR spin trap TEMPO and needed 30 minutes of illumination. Extended illumination (up to 1 hour) has also been used to detect singlet oxygen using SOSG (Flors et al. 2006, https://doi.org/10.1093/jxb/erj181). With the histidine trapping method used here, less than 2 minutes of illumination were required to measure the singlet oxygen production rates. This minimised the potential problems of prolonged illumination (e.g. a loss of intact PSII centres due to photodamage), and allowed us to confirm the results obtained in isolated membranes with those obtained in intact cells.

      As shown in now Figure 6- figure supplement 1E, the histidine-dependent oxygen consumption was suppressed by the singlet oxygen quencher sodium azide, as also shown in Rehman et al. 2013 (https://doi.org/10.1016/j.bbabio.2013.02.016). We also independently confirmed that the singlet oxygen generated by illumination of the dye Rose Bengal can be efficiently detected with the histidine trapping method and suppressed by the addition of sodium azide (Figure 6- figure supplement 1F). For these reasons, we are confident that what we measure with the histidine trapping method is singlet oxygen production.

      ii. In Rehman et al (SI Ref#22), wild-type Synechocystis cells showed significant production of singlet oxygen in the presence of DCMU and His (Figure 3A in SI Ref#22), however, the amount of singlet oxygen measured from the membranes in this study seemed to be less (Fig S10E). Could the authors provide some explanations?

      Fig. 3A in Rehman et al. showed that the production of singlet oxygen was about 10% with respect to the oxygen evolution activity in absence of additions (open squares). The light saturation curves in Fig. 4B of the same paper also show that at saturating light intensity the singlet oxygen production rate is about 10% compared to the O2 evolution rate. The traces we show in Figure 6-figure supplement 1 are only representative. The comparison should be made between the results in Rehman et al. and the averages of biological replicates that we show in Fig. 6 (membranes) and Appendix 6-figure 4A (cells). For WL and FR C. thermalis, we measure singlet oxygen production rates that are about 20% of the O2 evolution rates, slightly higher than those measured in Synechocystis in Rehman et al. Considering the variability between biological replicates, we consider our values in line with those in Rehman et al.

      iii. Can the presented results distinguish the production of singlet oxygen from recombination or other sources (e.g., antenna, free chlorophyll)? Some key controls are needed to strengthen the authors' claims.

      This is difficult to demonstrate unequivocally, but we have different lines of evidence that support the conclusion that the increase in singlet oxygen production in A. marina originates from differences in PSII charge recombination with respect to the other samples:

      i) The high levels of singlet oxygen production are observed in intact cells as well as in membranes. In neither of these samples do we expect to have significant amounts of damaged PSII or free chlorophyll, so these seem highly unlikely as the main sources of the singlet oxygen in our measurements. This is now stated more explicitly in L. 305 and Appendix 6.

      ii) According to the data in Appendix 6-figure 1B, singlet oxygen production in A. marina membranes shows a similar light saturation to that of maximal O2 evolution. This suggests that the singlet oxygen production we measure is related to PSII photochemistry. We have now stated this explicitly in L. 288-290.

      iii) Our thermoluminescence and delayed luminescence results indicate that in Chl-d-PSII the energy gap between Phe and QA is smaller than in Chl-a-PSII, as already suggested in the literature, and Chl-f-PSII. Therefore, this indicates more charge recombination going via repopulation of Phe- in Chl-d-PSII, with a consequent increase of singlet oxygen production.

      The antenna chlorophylls could form triplets under high light, by inter-system crossing, but in intact antennas the chlorophyll triplets are expected to be mostly quenched by nearby carotenoids (see https://www.jstor.org/stable/24030848 for a review on the subject). The generation of antenna triplet states in non-photoinhibitory conditions has been demonstrated in plant and algal thylakoids (Santabarbara et al 2002, 2007 doi: 10.1021/bi0201163, doi: 10.1016/j.bbabio.2006.10.007). Yet, these signals, which are attributed to a small population of damaged antennas, are small compared to those of triplets generated by charge recombination. Due to its apparently stochastic nature, the generation of antenna triplets by inter-system crossing is not expected to be significantly different between the different PSII complexes investigated in this study.

      On the other hand, it is generally recognised that in the PSII reaction centre, the carotenoid on the D1 side is not close enough to ChlD1 to directly quench its triplet state, when formed (see Telfer et al. 1994, https://doi.org/10.1016/S0021-9258(17)36825-4). The singlet oxygen produced in the reaction centre could disrupt the coupling between chlorophylls and carotenoids in the antenna, resulting in singlet oxygen production also from the antenna, in a cascade effect. This can happen with prolonged strong illumination (Fufezan et al. 2002, https://doi.org/10.1016/S0014-5793(02)03724-9).

      iv. I could not fully understand the singlet oxygen production experiments with tris-washed samples. In my opinion, the Mn-cluster depleted PSII should have accelerated charge recombination (100 ms between the YZ/QA, vs ~ 5 sec between the S2/QA), which should lead to an increase in singlet oxygen production. Correct me if I'm wrong about this, but if my reasoning is correct then how do the authors explain the discrepancy?

      Our rationale for performing the tris-washing experiment was indeed to see if this would lead to an increase in singlet oxygen production, thus implying that the high production in the A. marina samples could arise from a higher fraction of PSII centres without the Mn-cluster, as explained both in the main text and in Appendix 6. The fact that the treatment did not increase the singlet oxygen production suggests that this does not specifically arise from PSII lacking the Mn-cluster.

      The lack of singlet oxygen increase following tris-washing is not necessarily controversial, as the fact that TyrZ•QA- recombination is faster than S2QA- recombination does not necessarily imply that more of it occurs via backward electron transfer from QA- to Phe. The removal of the Mn-cluster could decrease the production of singlet oxygen by charge recombination, since it causes an increase in the redox potential of QA and, therefore, of the energy gap between Phe and QA, thus decreasing the probability of charge recombination going via the repopulation of Phe-. This is proposed to be a mechanism to protect PSII during photoactivation of the Mn-cluster (see Johnson et al 1995, https://doi.org/10.1016/0005-2728(95)00003-2).

      Our data show that the singlet oxygen production in A. marina is not specifically related to PSII lacking the Mn-cluster and are not in conflict with what is expected based on our knowledge of PSII energetics.
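      The energetic argument above (a larger Phe/QA energy gap lowers the probability of recombination via repopulation of Phe-) can be made concrete with a simple Boltzmann estimate. The sketch below is an assumption-laden illustration, not the authors' model: it treats the thermally activated back-reaction as scaling with exp(-ΔE / kBT), and the 60 meV gap increase used in the example is a hypothetical number chosen only to show the trend, not a measured value.

```python
import math

# Boltzmann constant in eV per kelvin.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def boltzmann_factor(gap_ev, temperature_k=298.0):
    """Relative thermal population of a state lying gap_ev above the
    starting state (simple two-state Boltzmann picture, an assumption)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(-gap_ev / (BOLTZMANN_EV_PER_K * temperature_k))


def relative_phe_repopulation(gap_increase_ev, temperature_k=298.0):
    """Factor by which repopulation of Phe- changes when the Phe/QA gap
    grows by gap_increase_ev. In this picture the ratio is independent
    of the starting gap."""
    return boltzmann_factor(gap_increase_ev, temperature_k)


if __name__ == "__main__":
    # HYPOTHETICAL example: a 60 meV increase in the gap at 298 K.
    factor = relative_phe_repopulation(0.060)
    print(f"Relative Phe- repopulation after +60 meV: {factor:.3f}")
```

      In this simplified picture even a modest increase in the gap reduces the Phe- route roughly tenfold, which is in line with the qualitative statement that raising the QA redox potential can lower singlet oxygen production despite faster overall recombination.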

      v. The y-axes in Figure S10 should either contain "delta" (Δµmol O2 ml-1) or use the measured absolute oxygen concentration. I'd suggest the latter, since the reaction is oxygen consuming, it's good to show that all the samples started with similar amounts of dissolved oxygen. Low O2 levels could decrease 1O2 production, though this would be more of an issue with cells than membranes.

      The y-axis labels in the figures (now Figure 6-figure supplement 1 and Appendix 6-figures 1D and E, 2, 3 and 4A) have been changed to Δµmol O2 ml-1. We prefer to show the traces after subtraction of the baseline recorded in the dark (now explicitly indicated in the corresponding figure legends) for a better visual comparison. All samples were left to equilibrate with air (stirred) before starting the measurements, so all started with similar levels of dissolved oxygen. This is especially important when measuring PSI-dependent oxygen consumption (Appendix 6-figure 3), because the addition of ascorbate and TMPD leads to a transient drop in oxygen concentration in the sample, which leads to artefacts in the absence of the equilibration step. This information has been added to the corresponding Materials and Methods section (4.5). Additionally, when using Rose Bengal to generate singlet oxygen, the histidine-dependent oxygen consumption was about 10 times higher than in any of the measurements done with biological samples, and still we did not observe saturation of the signal in the illumination time used (added panel F in Figure 6-figure supplement 1). Therefore, we are confident that the singlet oxygen measurements in membranes and cells were not skewed by limiting oxygen concentrations in the measuring chamber.

      The y-axis labels of what is now Appendix 6-figure 1B and C have also been corrected (as ml-1 was used instead of h-1).

      Reviewer #3 (Public Review):

      In this manuscript, Viola and co-authors address the question of how far-red-light-adapted (FRL) Photosystem II (PSII) is able to bypass the "red limit", or the minimum photon energy/frequency for charge separation to proceed effectively. They attempt to do so primarily by measuring the consequence of failure to overcome the red limit: charge recombination. From this work they have concluded that FRL PSIIs are able to achieve similar efficiency of flash-induced water-oxidizing complex turnover to those adapted to standard visible light. However, they conclude that FRL PSII which uses chlorophyll-d is significantly more susceptible to charge recombination and singlet oxygen formation, leading to increased sensitivity to high-light conditions. FRL PSII which uses chlorophyll-f, however, is adapted to be more resistant to photodamage. These strategies are differentiated by the number and type of far-red chlorophyll used and tuning of redox potentials of cofactors in PSII.

      The methods employed are well-chosen to present complementary evidence to address the questions posed. The authors have supported themselves using polarography, fluorescence decay, absorption, luminescence and thermoluminescence, and spectrometry, all of which are employed in a manner well-established in the quantification of processes in standard PSII preparations. The results, however, have some loss of data such as total yields which would be useful in interpretation as the authors have chosen to extensively normalize data for ease of visual comparison of certain features.

      Overall, the authors have adequately achieved their aims and their conclusions are well-supported. The authors also clearly state their own expectations of the impact of their work at the end of the Discussion; thanks to these results, we can better understand the ecological niche of each type of FRL-PSII and how these significantly disparate systems may be used in future agricultural research and development.

      We thank the reviewer for the positive evaluation of our work.

      Following the reviewer’s suggestions, the total yields (on a chlorophyll basis) of the flash-dependent oxygen evolution have been provided in Figure 3- figure supplement 1. These include the flash-dependent oxygen evolution data measured in WL C. thermalis membranes, that were previously omitted because of the unsatisfactory quality, and are still omitted from Figure 3 (normalised data and fits) for the same reason. The S-state distributions calculated from the fits of the flash-dependent oxygen evolution have been added in Table 2.

      Additionally, the non-normalised oxygen evolution and consumption rates used for Figure 6A and Appendix 6-figure 4 are now provided in Appendix 6-table 1.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Answers to reviewers’ comments

      (Reviewers' comments are in italics. Text modifications in the manuscript file are in blue.)

      Overall, we acknowledge the referees' careful reading of the paper and their comments, which we think have helped to further improve the manuscript.

      On the attached pages are our detailed point-by-point responses to the referees' comments, along with a description of how the manuscript was modified accordingly.

      New data included:

      In response to the comments and suggestions of both reviewers 1 and 3, we conducted new experiments to test genetic interactions between different actors of the BMP and activin pathways. These new results confirm and complement the analyses described in the original manuscript. Furthermore, as suggested by reviewer 2, we have further studied the phenotypes of hiPSC-CM, by analyzing gene expression profiles and by analyzing the morphological changes induced as a result of PAX9 knockdown.

      NB: The title has been slightly modified to highlight the conserved features of the genetic architecture of cardiac performance revealed in the study.

      Former title: Genetic architecture of natural variation of cardiac performance in flies.

      Novel title: Genetic architecture of natural variation of cardiac performance: From flies to humans.

      Reviewer 1

      1. The authors utilized the RNAi-mediated knockdown approach in their functional validation studies. It is not clear how each genetic variation (SNP) affects its associated genes. Could some of the SNPs activate the candidate gene expression? For the 4 candidate genes that failed to show cardiac defects, could the overexpression of these 4 genes alter cardiac performance?

      Answer 1- Of course, we cannot predict the direction of the effect of the variants on the function of the genes. In this context, loss-of-function experiments are subject to a risk of false negatives. It is indeed possible that, where loss of function has no effect, a gain of function could reveal one. But gain-of-function experiments are difficult to control and are often subject to non-specific effects, because it is complicated to control the level of over-expression relative to endogenous expression. This did not seem suitable for an extensive analysis of a large number of genes. We therefore chose to test only for loss of function.

      In addition, our approach to testing heart-specific RNAi aims to assess the quality of the association results by comparing RNAi for genes identified by GWAS to randomly selected genes. It is not intended to describe precisely the involvement of each gene individually.

      (See also the answer to reviewer 2 comment n°2 and the modifications made to the manuscript to address this criticism.)

      2. babo is the type I activin receptor, not type 2.

      Answer 2- Thank you, we have corrected this error.

      3. The authors show BMP and activin pathway genetically interacts to affect cardiac performance. But it is interesting to find that these interactions are in a trait-dependent manner. For example, it seems that babo and dpp epistatically interact to regulate FS, while they additively regulate HP and DI. The authors need to discuss the complex genetic interaction further.

      Answer 3- See reply to reviewer 3, comment N°2 below.

      4. Both snoo and sog are identified from GWAS. How about babo and dpp? Are there any identified SNPs associated with babo and dpp?

      Answer 4- Considering GWAS for mean phenotypes, there are no variants in dpp among the 100 best-ranked SNPs or among the variants identified using fast epistasis. But given the size of the DGRP population we are far from being exhaustive, as we do not reach saturation. It is therefore difficult to comment on these ‘negative’ results. However, we do identify one variant in babo using fast epistasis (see figure 2B and Table S3).

      5. It is unclear why the mad KD behaves oppositely to dpp mutant, although both proteins are involved in BMP pathway. In Figure S5, the mad KD shows reduced FS and HP, but dpp LOF mutant shows increased FS and HP (Figure S4). Can the authors perform RNAi to knockdown dpp specific in the heart to reexamine the role of dpp in the regulation of cardiac function. The whole body LOF mutant dpp-d14 might not target cardiac tissue directly to control heart performance like mad KD.

      Answer 5- (see also answer to reviewer 3 comment n°2) We did perform heart specific dpp RNAi experiments together with other tests for interactions using new allelic combinations of activin and BMP pathways and therefore can compare heart specific knock down to heterozygotes for amorphic mutations for both dpp and mad.

      Regarding dpp, congruent effects on HP, DI, SI, ESD and EDD were observed between mutant and RNAi, while RNAi had the opposite effect on FS compared to heterozygous dppd14 mutants (decreased and increased FS compared to control, respectively). In the case of mad, heterozygous mutants had no effect on FS, EDD and ESD but, similarly to dpp mutants, showed increased SI, DI and HP. mad RNAi uniquely decreased HP, DI and SI and increased AI. However, similarly to dpp RNAi, it induced a decrease in FS.

      Thus, systemic versus heart-specific knockdown of genes induces specific effects, suggesting cardiac non-autonomous interactions. This complex picture of TGFb involvement is now discussed in the Results section (see below, Reviewer 3, major comment 2).

      6. The authors selected two novel genes to study the conserved regulation in both flies and human iPSC cells. Besides testing these novel genes, the authors should also verify whether the conserved pathways, like TGF-beta, regulate heart performance in human iPSC cells similar to the flies.

      Answer 6- We focused on poxm/Pax9 and sr/Egr2 because neither of these TFs was known to have a cardiac function in flies or in mammals. Our parallel analyses in flies and hiPSC-CMs illustrate how the description of the genetic architecture of cardiac traits in flies can accelerate discovery in mammals.

      There is extensive literature describing the involvement of the TGFb/BMP and activin pathways in heart development and disease in humans, hence the choice not to focus on these pathways in hiPSC-CMs.

      Reviewer 2:

      1. It will be interesting to compare this fly GWAS to human heart disease GWAS data (for example, cardiomyopathy, arrhythmia, heart failure) from patients. Such cross comparison could make the data set more valuable.

      Answer 1- We actually did make this comparison (Table 2, Table S11) and we agree that it significantly validates our approach. This identified a set of orthologous genes associated with cardiac traits both in Drosophila and humans, supporting the conservation of the genetic architecture of cardiac performance traits, from arthropods to mammals.

      2. RNAi is the only experimental approach in this manuscript to validate the functional significance from data analyses. Authors may consider using genetic mutations such as deficiency lines or P-element lines to offer an alternative approach. This is simply a suggestion to improve the rigor and reproducibility, not absolutely required.

      Answer 2- In an attempt to provide a consistent analysis of loss of gene function, our strategy was to concentrate our analysis on the effects of heart specific knock down. This allows us to compare -in a global way- the effects of the knock down of genes identified by GWAS to those of randomly selected genes.

      Our objective was to provide a global view of the heart-specific effects of the identified genes, and not to characterize precisely the involvement of each of them using a combination of mutant alleles, RNAi and gain of function. Given the experimental burden of analyzing cardiac function, such a strategy would indeed have required us to concentrate on only a very small number of genes.

      We however recognize that this strategy has limitations:

      • Some variants may lead to gain-of-function effects of genes, and our strategy is not able to test for these effects.

      • Some variants may come from non-cell-autonomous effects, which would not be replicated by our targeted RNAi strategy in the heart.

      Therefore, the false negative rate of our experiments is difficult to estimate.

      We have tried to put this into perspective and to highlight the limitations of our analysis in the results section describing RNAi validation of GWAS results.

      “To assess in an extensive way whether mutations in genes harboring SNPs associated with variation in cardiac traits contributed to these phenotypes ….. (…)

      …… These results therefore supported our association results. It is important to emphasize that our approach is limited to testing the effect of tissue-specific gene knock down. Since some of the variants may lead to increased gene function and/or expression, this can lead to a false negative rate that is difficult to estimate. In addition, some of the associated variants may influence heart function by non cell-autonomous mechanisms, which would not be replicated by cardiac specific RNAi knock down.”

      3. In order to validate the roles of predicted TF binding sites, the best approach would be introducing point mutations using CRISPR/Cas9 within the binding motif then testing out molecular and physiological outcomes. Rather, the authors chose to test this indirectly by knocking down those TFs. If so, the authors need to at least acknowledge the potential caveats of such an approach and the limitations in related data interpretation.

      Answer 3- The reviewer is right: definitive proof of the involvement of a potential TF binding site in the regulation of a gene located in cis requires mutating the binding site and analyzing the effect on the expression of the corresponding gene. But this may not be sufficient to demonstrate that the potential TF is indeed a regulator of that gene (the binding motif may be the target of yet another TF): definitive proof may require motif/TF DNA-binding domain swaps. This would have been beyond the scope of the present study. In addition, the effects on heart performance of mutating one TFBS at a time (among several dozen) may be too weak to allow their characterization with available tools and approaches.

      We acknowledge, however, that our approach provides only an indirect validation of the transcription factor binding site predictions. This was, in our opinion, the most efficient way to evaluate the potential effect of the predicted transcription factors.

      We clarify this in the result section:

      “We did not test individually the effects on cardiac performance of mutations in predicted TFBSs located near the SNPs because any individual effect would probably be too small to be detectable by the available methods. Rather, we tested the potential involvement of their cognate TFs by cardiac specific RNAi mediated KD”

      4. hiPSC-CM data is somewhat limited by only showing the HR and AP duration data. It is recommended to include some immunocytochemistry data to show the morphology, sarcomere structure of these hiPSC-CMs. Gene expression data generated by qPCR or RNA-seq in particular focusing CM structure and function genes would be helpful too.

      Answer 4- As suggested by referee 2, we have now performed gene expression analysis and immunostaining for the PAX9 KD, which gave the strongest phenotype in iPSC-CM (Figure 4 J-M). This revealed increased expression of Na+ and K+ channels, which is in line with the APD shortening phenotype, as well as downregulation of CASQ2, consistent with calcium transient shortening. Expression analysis also revealed increased expression of sarcomeric genes and of NPPA/B, which was consistent with increased CM size as quantified by the area of TNNT2 staining per nucleus.

      These new data are described at the end of the result section:

      “APD shortening for PAX9 KD was coincident with increased expression of Na+ and K+ ion channels (SCN5A, KCNH2 and KCNQ1) (Figure 4J), supporting the APD shortening phenotype. In this context, the AP kinetics also correlated with shorter calcium transient duration (Figure S8A-D and H-K), including faster upstroke and downstroke calcium kinetics and increased beat rate (peak frequency) (Figure S8E-G and L, M), consistent with decreased expression of Calsequestrin 2 isoform (CASQ2) associated with PAX9 KD (Figure 4J). Finally, assessment of the PAX9 KD effect on sarcomeric content revealed an increase in sarcomeric gene expression (Figure 4K), and an upregulation of genes associated with a hypertrophic response (NPPA, NPPB and NPR1; Battistoni et al., Circulating biomarkers with preventive, diagnostic and prognostic implications in cardiovascular diseases, Int J Cardiol, 2012, vol. 157), which was coincident with increased CM size as quantified by the area of TNNT2 staining per cardiac nucleus (Figure 4 L, M).

      Collectively, these data illustrate conserved functions for poxm/PAX9 and sr/EGR2 in setting the cardiac rhythm and identify PAX9 as a novel and key regulator of cardiac performance at the cellular level, via the integrated regulation of expression of genes controlling electrophysiology, calcium handling and sarcomeric functions in hiPSC-CMs.”

      Reviewer 3

      Major Comments:

      1- There is an assumption in the use of RNAi knockdown to validate the genes identified in the quantitative analysis, and that is that natural variants are themselves hypomorphic. It is possible that some of the variants identified are hypermorphic, or that some variants in transcription factor binding sites lead to increased factor binding. While RNAi knockdown is an excellent choice to begin validation, I do not think the authors can conclude that a gene not functionally validated by their RNAi tests has no role in cardiac function.

      Answer 1. Please see our answers to reviewer 1 comment n°1 and reviewer 2 comment n°2.

      2- After performing RNAi knockdown to validate genes identified by GWAS the authors focus on the TGFbeta signaling pathway for downstream analysis. To do so they examine heterozygotes for sog, a repressor of BMP signaling, and snoo, an activator of Activin pathway. The data from the snoo/sog heterozygote is compelling in its disruption of heart phenotypes, and the authors conclude a "coordinated action of activin and BMP." snoo, however, also works as a transcriptional repressor in the BMP pathway, so it's possible that the effects the authors are seeing here could be confined to an increase in BMP signaling. Unlike snoo and sog, mutations in babo and dpp are both expected to have negative effects on Activin and BMP signaling, respectively. The babo/dpp interaction is not as quantitatively convincing as the snoo/sog data, despite the integral roles both babo and dpp play in their respective pathways. If both pathways are connected, why do snoo/sog heterozygotes affect SI phenotypes, while babo/dpp heterozygotes affect fractional shortening? I think the authors' data suggest an interesting potential interaction between these pathways, which could be confirmed by examining further mutant combinations, knockdowns or increased expression transgenes, but falls short of a "confirmed synergistic genetic interaction." It does, however, underscore the value of the data in the paper for opening up new avenues for future study.

      Answer 2 (and reviewer 1 comments 3 and 5).

      These comments led us to reconsider the analysis of the phenotypes associated with loss of function of the TGFb pathway, and to analyze other pathway components combinations.

      We acknowledge reviewer 3's criticisms of the snoo/sog experiments, which are difficult to interpret given the broad action snoo may have on both BMP and activin pathways. We have addressed this in the Results section.

      We have also analyzed other allelic combinations of BMP and activin pathway components, which strengthen the analysis performed on dpp/babo. Indeed, we tested babo/tkv heterozygotes (respectively specific activin and BMP receptors) and found significant genetic interactions for ESD and EDD. Although not significant, babo/tkv double heterozygotes display a tendency towards non-additive effects on FS (p = 0.054). mad/smox heterozygotes (respectively specific downstream TFs of BMP and activin pathways) display interactions (non-additive effects) on HP, SI, DI, ESD and EDD. These new results (Supplemental Figure 4) thus support the hypothesis of genetic interactions between the pathways, but also reveal, as suggested by reviewer 3, a complex relationship between the two pathways, since interactions are revealed for specific traits in each of the mutant combinations analyzed.
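      The notion of a "non-additive effect" in double heterozygotes, as tested by the interaction term of a two-way ANOVA, can be sketched as follows. This is a minimal illustration for a balanced 2×2 design (control, each single heterozygote, double heterozygote), not the authors' analysis pipeline; the function names are hypothetical, and the numbers in the usage example are made-up placeholders, not measurements from the study.

```python
from statistics import mean


def interaction_contrast(control, single_a, single_b, double):
    """Departure of the double heterozygote from additivity.

    Zero means the two single-heterozygote effects simply add up.
    """
    return mean(double) - mean(single_a) - mean(single_b) + mean(control)


def interaction_f_statistic(control, single_a, single_b, double):
    """F statistic (1 numerator df) for the interaction term of a
    balanced 2x2 two-way ANOVA."""
    groups = [control, single_a, single_b, double]
    n = len(control)
    if n < 2 or any(len(g) != n for g in groups):
        raise ValueError("expected a balanced design with >= 2 replicates")
    contrast = interaction_contrast(control, single_a, single_b, double)
    # In a balanced 2x2 design the interaction sum of squares reduces to
    # n * contrast^2 / 4.
    ss_interaction = n * contrast ** 2 / 4
    # Within-group (error) sum of squares and its degrees of freedom.
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_error = 4 * (n - 1)
    return ss_interaction / (ss_error / df_error)


if __name__ == "__main__":
    # PLACEHOLDER values for illustration only (arbitrary units).
    control = [1.0, 1.2, 0.8]
    single_a = [2.0, 2.2, 1.8]
    single_b = [3.0, 3.2, 2.8]
    double = [6.0, 6.2, 5.8]
    print(interaction_contrast(control, single_a, single_b, double))
    print(interaction_f_statistic(control, single_a, single_b, double))
```

      The F statistic would then be compared against an F distribution with 1 and 4(n-1) degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch within the standard library.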

      The phenotypes related to the individual loss of function of each of the actors of these pathways (dpp, tkv and mad for BMP; babo and smox for activin) are however very similar. When they have an effect, heterozygous amorphic alleles of these genes display increased phenotypes related to rhythmicity (HP, DI, SI, AI) and FS, but decreased cardiac diameters (ESD and EDD).

      Finally, as pointed out by reviewer 1, the picture is certainly even more complex, since the phenotypes of RNAi-mediated heart-specific loss of function are not always similar to those of systemic loss of function. Indeed, mad RNAi causes a reduction of HP, DI, SI and FS (Figure S5) whereas heterozygotes for mad12 have either no or opposite effect on these phenotypes, and mad RNAi causes a significant increase in AI whereas mad12 has no effect (Figure S4). The discrepancy between tissue-specific RNAi and heterozygous background was also found in the case of dpp, but specifically for FS. Indeed, as suggested by reviewer 1, we have analyzed the loss of function of dpp by heart-specific RNAi. dpp RNAi results in a reduction of FS (like mad RNAi) whereas loss of function in the whole body results in an increase of FS.

      We therefore re-wrote the whole corresponding section of the results and modified Figure S4 to include babo/tkv; smox/mad and dppRNAi data.

      “We further focused on the TGFb pathway, since members of both BMP and activin pathways were identified in our analyses. We tested different members of the TGFb pathway for cardiac phenotypes using cardiac-specific RNAi knockdown (Figure 2C), and confirmed the involvement of the activin agonist snoo (Ski orthologue) and the BMP antagonist sog (chordin orthologue). Notably, Activin and BMP pathways are usually antagonistic (Figure 2D). Their joint identification in our GWAS suggests that they act in a coordinated fashion to regulate heart function. Alternatively, it may simply reflect their involvement in different aspects of cardiac development and/or functional maturation. In order to discriminate between these two hypotheses, we tested whether different components of these pathways interacted genetically. Single heterozygotes for loss-of-function alleles show dosage-dependent effects of snoo and sog on several phenotypes, providing an independent confirmation of their involvement in several cardiac traits (Figure S4). Importantly, compared to each single heterozygote, snooBSC234/sogU2 double heterozygous flies showed non-additive SI phenotypes (two-way ANOVA p = 2.1 × 10⁻⁷), suggesting a genetic interaction (Figure 2E and Figure S4A). It is worth noting, however, that snoo is also a transcriptional repressor of the BMP pathway (PMID: 16951053). The effect observed in snooBSC234/sogU2 double heterozygotes could therefore alternatively arise as a consequence of increased BMP signaling without affecting the activin pathway. We thus tested other allelic combinations of loss-of-function alleles of the BMP and activin pathways. babo/tkv heterozygotes (respectively activin and BMP type 1 receptors) displayed non-additive ESD and EDD phenotypes (Figure S4C). Synergistic interaction of BMP and activin pathways was also suggested by the analysis of fractional shortening in loss-of-function mutants for babo and dpp, the BMP ligand (Figure S4B).
Of note, babo/tkv double heterozygotes also displayed a tendency toward non-additive effects on FS, albeit non-significant (two-way ANOVA p = 0.054). In addition, mad/smox heterozygotes (specific downstream TFs of the BMP and activin pathways) displayed non-additive effects on several traits, including phenotypes related to rhythmicity (HP, SI, DI) and contractility (ESD and EDD) (Figure S4D). Altogether, cardiac performance in response to allelic combinations of activin and BMP supported a coordinated action of both pathways in the establishment and/or maintenance of cardiac activity. This was further supported by the observation that single heterozygotes for the tested loss-of-function alleles displayed similar trends with respect to cardiac performance, irrespective of the pathway considered (dpp, tkv and mad for BMP; babo and smox for activin). Indeed, they displayed either no effect or increased fractional shortening and rhythmicity phenotypes (HP, DI, SI, AI), and decreased cardiac diameters (ESD and EDD). This suggests coordinated activity of both pathways. Importantly, the genetic interactions were tested using amorphic alleles that lead to systemic loss of function. The observed phenotypes may thus not reveal cardiac-specific effects of the pathways. In support of this, mad cardiac-specific RNAi knockdown was tested (see below, Figure S5) and led to decreased HP, DI, SI and FS, whereas heterozygotes for mad12 have either no (FS) or opposite (HP, DI, SI) effect on these phenotypes (Figure S4D). Conversely, mad RNAi caused a significant increase in AI whereas mad12 had no effect. However, heart-specific dpp RNAi knockdown (Figure S4E) led to similar phenotypic trends compared to dppd14 (increased HP, DI, SI, decreased EDD and ESD), with the notable exception of FS, which was reduced following cardiac-specific KD (Figure S4E) but increased in dppd14 heterozygotes (Figure S4B).
Taken together, these data point to a complex picture of TGFb pathway activity in regulating cardiac performance, involving both the activin and the BMP pathways as well as gene specific effects with both systemic and tissue-specific contributions.”
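
      The "non-additive" effects tested by two-way ANOVA in the passage above can be illustrated with a minimal Python sketch. For a balanced 2×2 design (control, each single heterozygote, double heterozygote) it computes the deviation from additivity and the corresponding interaction F statistic. The function name, argument names and any numbers passed to it are hypothetical illustrations; this is not the authors' analysis code, and a p-value would still have to be read from an F distribution with (1, df_error) degrees of freedom.

```python
from statistics import mean


def interaction_f(ctrl, het_a, het_b, double_het):
    """Interaction contrast and F statistic for a balanced 2x2 design.

    Each argument is a list of trait values (e.g. SI) with the same length n.
    Under pure additivity: double_het - ctrl == (het_a - ctrl) + (het_b - ctrl).
    """
    groups = [ctrl, het_a, het_b, double_het]
    n = len(ctrl)
    if n < 2 or any(len(g) != n for g in groups):
        raise ValueError("balanced design with n >= 2 per group required")

    m_ctrl, m_a, m_b, m_ab = (mean(g) for g in groups)

    # Deviation of the double heterozygote from the additive expectation.
    contrast = m_ab - m_a - m_b + m_ctrl

    # Interaction sum of squares (1 degree of freedom) in a balanced 2x2 design.
    ss_interaction = n * contrast ** 2 / 4

    # Within-group (error) sum of squares and its degrees of freedom.
    ss_error = 0.0
    for g in groups:
        g_mean = mean(g)
        ss_error += sum((x - g_mean) ** 2 for x in g)
    df_error = 4 * (n - 1)

    f_stat = ss_interaction / (ss_error / df_error)
    return contrast, f_stat
```

      A contrast near zero corresponds to additive single-mutant effects; a large F statistic indicates that the double heterozygote departs from that expectation for the trait considered.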

      *Minor Comments: *

      * There is an enormous amount of data in this paper, but there are places where things are summarized a little too briefly. For example, there are no definitions given at the beginning of the Results section for traits like "Heart Period" or "Systolic Interval," which would make this work significantly more accessible for other Drosophila researchers. (They do touch on this when they explain later in the paper that certain variants are "associated with quantitative traits linked to heart size and contractility" but more background earlier would be helpful.) When we consider heart performance traits, what is the baseline from known mutants? In other words, where is the line between variation and defect? *

      Answers:

      • We have detailed the description of the traits analyzed at the beginning of the Results section. We hope this improves the ease of reading in the direction suggested by the reviewer. “Seven cardiac traits were analyzed across the whole population (Dataset S1 and Table 1). As illustrated in Figure 1A, we analyzed phenotypes related to the rhythmicity of cardiac function: the systolic interval (SI) is the time elapsed between the beginning and the end of one contraction, the diastolic interval (DI) is the time elapsed between two contractions and the heart period (HP) is the duration of a total cycle (contraction + relaxation, DI + SI). The arrhythmia index (AI, std-dev(HP)/mean(HP)) is used to evaluate the variability of the cardiac rhythm. In addition, 3 traits related to contractility were measured: the diameters of the heart in diastole (End Diastolic Diameter, EDD) and in systole (End Systolic Diameter, ESD), and the Fractional Shortening (FS), which measures the contraction efficacy ((EDD − ESD)/EDD).”

      • With respect to the baseline of cardiac performance, there is no simple answer. The baseline is influenced by the genetic background and the experimental conditions. This is the reason why any analysis of mutants or RNAi is conducted in comparison with its own control, analyzed at the same time. Concerning the DGRP lines, no baseline can be defined, since the objective is to measure the diversity of cardiac performance traits within a natural population.
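
      The trait definitions quoted above translate directly into a few lines of Python. This is a minimal sketch under stated assumptions: per-beat SI and DI values are taken as already extracted, the function and key names are hypothetical, and the sample standard deviation is used for AI because the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def cardiac_traits(si, di, edd, esd):
    """Summarise rhythmicity and contractility traits for one heart.

    si, di : per-beat systolic and diastolic intervals (same length, seconds)
    edd, esd: end-diastolic and end-systolic diameters (same unit)
    """
    if len(si) != len(di) or len(si) < 2:
        raise ValueError("need matching SI/DI lists with at least two beats")

    # Heart period of each beat: contraction plus relaxation.
    hp = [s + d for s, d in zip(si, di)]
    mean_hp = mean(hp)

    return {
        "SI": mean(si),
        "DI": mean(di),
        "HP": mean_hp,
        # Arrhythmia index: std-dev(HP) / mean(HP); sample std-dev is an assumption.
        "AI": stdev(hp) / mean_hp,
        "EDD": edd,
        "ESD": esd,
        # Fractional shortening: (EDD - ESD) / EDD.
        "FS": (edd - esd) / edd,
    }
```

      For example, `cardiac_traits([0.2, 0.2], [0.3, 0.5], edd=80, esd=50)["FS"]` evaluates to 0.375, i.e. the heart diameter shrinks by 37.5% during contraction.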

    1. For what purpose? So that the process of what Becker calls “self-transcendence” may begin. And he describes the process of self-transcendence this way: Man breaks through the bounds of merely cultural heroism; he destroys the character lie that had him perform as a hero in the everyday social scheme of things; and by doing so he opens himself up to infinity, to the possibility of cosmic heroism …. He links his secret inner self, his authentic talent, his deepest feelings of uniqueness … to the very ground of creation. Out of the ruins of the broken cultural self there remains the mystery of the private, invisible, inner self which yearned for ultimate significance. …This invisible mystery at the heart of [the] creature now attains cosmic significance by affirming its connection with the invisible mystery at the heart of creation. “This,” he concludes, “is the meaning of faith.” Faith is the belief that despite one’s “insignificance, weakness, death, one’s existence has meaning in some ultimate sense because it exists within an eternal and infinite scheme of things brought about and maintained to some kind of design by some creative force (90, 91).” This, then, is what we might call good faith, not a flight into some immortality system. And clearly, some Christians, some Buddhists–at least the Zen Buddhists Becker himself mentions!–have faith in this sense, a faith that Becker characterizes as growing out of tasting one’s own death, embracing one’s own nothingness, and affirming–not a known ultimate meaningful–but an “invisible mystery” of ultimate meaning.

      Embrace the mystery, the sacred - accepting that one will be gone forevermore is a mighty task as our culture teaches us to seek recognition. The last thing we want to be is unrecognized, a nobody. And yet, when we are dead and dissipated back into the rest of the world, that is exactly what we will become.

      But we have to accept that reality before we can build and think beyond it to a deeper possibility of meaning. Reality brought us forth to begin with. Every moment is already sacred.

    1. Author Response

      Reviewer #2 (Public Review):

      1) “…it was important that the output response was intimately linked to the bound state of the receptor, in this case the TCR, with ligand unbinding rapidly reversing all proofreading steps. This means that dissociation of a single TCR should disrupt signaling, and implicitly assumes a direct physical connection between the bound receptor and the KP modifications. However, this mechanism becomes much harder to argue when the KP steps are physically uncoupled from bound TCR, such as in LAT microclusters or DAG production.”

      We agree that signaling events in the kinetic proofreading chain must be linked to ligand unbinding. We have added discussion to the paragraphs beginning on page 20 line 440 of recent work from Yi et al. 2019 and Lo et al. 2018 suggesting a physical link between bound TCRs and LAT clusters. The full paragraphs are reproduced below.

      “The kinetic proofreading model requires all intermediate steps to reset upon unbinding of the ligand (Fig. 1A). This means that information about the receptor’s binding state must be communicated to all proofreading steps. If kinetic proofreading steps exist beyond the T cell receptor, how is unbinding information conveyed to these effectors? Importantly, there is evidence of physical proximity of LAT with the receptor. While TCR/Zap-70 and LAT/PLCγ microclusters form spatially segregated domains, these domains remain adjacent to one another (Yi et al., 2019). Lo et al. demonstrated that the protein Lck binds Zap-70 with its SH2 domain and LAT with its SH3 domain, potentially bridging the two signaling domains together and propagating binding information (Lo et al., 2018).

      An attractive reset mechanism is the segregation of CD45 away from bound receptors, creating spatial regions in which TCR and LAT associated activating events can occur (S. J. Davis & van der Merwe, 2006). Super-resolution microscopy by Razvag et al. measured TCR/CD45 segregated regions within seconds of antigen contact at the tips of T cell microvilli (Razvag et al., 2018). Upon unbinding, these regions of phosphatase exclusion collapse, allowing CD45 to dephosphorylate receptor ITAMs and LAT clusters. However, the rate of dephosphorylation for LAT and receptor ITAMs could differ. LAT clusters exclude CD45 in reconstituted bilayer systems, potentially limiting the dephosphorylation to LAT molecules at the edges of the cluster thus slowing reset (Su et al., 2016). The kinetics of multivalent protein-protein interactions within TCR and LAT clusters can also influence dephosphorylation and dissociation rates (Goyette et al., 2022).

      A CD45-mediated reset mechanism would restrict proofreading to membrane-bound signaling events occurring within a CD45-depleted region. Downstream events that dissociate away from the membrane or diffuse out of the segregated region could not directly participate in the proofreading chain, as the collapse of a CD45 segregated region could not reset signaling entities released into the cytosol (e.g. release of IP3 in the cleavage of PIP2 to DAG).”

      2) …The data clearly demonstrate a time delay between receptor binding and the measured outputs, but it is not so surprising that this lag would exist in propagating the signal through the intracellular network.

      We apologize for this point of confusion in our methodology. We are unable to measure the time lag between receptor binding and signal propagation through the network because our system is terminated by blue light. Binding is stochastically initiated much like native ligand/receptor interactions. The time values reported in our dataset are the average ligand binding half-lives of the LOV2 ligand under various intensities of constant blue-light illumination, as measured by separate in vitro kinetic washout experiments. Our model is fit to the steady-state signaling output achieved after a 3 minute exposure of cells to LOV2 ligands of an average ligand binding half-life enforced by constant blue light illumination. We clarify this point by including the following paragraphs beginning on page 8 line 170.

      “We are unable to control when binding events start since our optogenetic system is inhibited by blue-light, as opposed to being activated by blue-light. The initiation of binding after blue-light inhibition is a function of both the stochastic relaxation of inhibited LOV2 back into the binding-state as well as the diffusion of binding-state LOV2 from outside the previously illuminated area. Without temporal control over the start of binding, it is difficult to measure the time delay between ligand binding and a downstream signaling event (Yi et al., 2019). Such studies typically require careful single-molecule imaging of numerous stochastic binding events (Lin et al., 2019).

      To overcome this technical limitation of our system, we chose instead to measure the steady-state output of the antigen signaling cascade achieved several minutes after ligand binding. Kinetic proofreading systems behave differently than non-proofreading systems at steady-state. A non-proofreading system’s steady-state output is set by the number of ligand-bound receptors and not the binding half-lives of those ligands (Fig. 3D, left). In contrast, a kinetic proofreading system can produce different steady-state outputs in response to ligands of different binding half-lives, even when ligand densities are adjusted to achieve equivalent occupancy (Daniels et al., 2006) (Fig. 3D, right). Signaling events take varying amounts of time to occur after ligand binding (Lin et al., 2019; Yi et al., 2019). However, the temporal delays between steps are on the order of tens of seconds. By imaging the cells after minutes of constant exposure to a set ligand binding half-life, we measure the steady state output achieved at a signaling event in the cascade on a longer timescale than these delays (Tischer & Weiner, 2019).”
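
      The contrast drawn above between proofreading and non-proofreading systems at steady state can be sketched in Python. The expression used here is the standard textbook form of kinetic proofreading, in which a bound receptor must complete n sequential steps (each at rate kp) before the ligand unbinds; it is an illustrative assumption and not necessarily the equation the authors fit. The function names and the default kp value are hypothetical.

```python
import math


def koff_from_half_life(t_half):
    """Convert a binding half-life (s) into an unbinding rate (1/s)."""
    return math.log(2) / t_half


def steady_state_output(occupancy, t_half, n_steps, kp=1.0):
    """Steady-state output of a chain of n_steps proofreading steps.

    occupancy : fraction (or density) of ligand-bound receptors
    t_half    : ligand binding half-life in seconds
    n_steps   : number of proofreading steps (0 = no proofreading)
    kp        : rate of each proofreading step in 1/s (hypothetical default)
    """
    koff = koff_from_half_life(t_half)
    # Probability that a bound receptor completes all steps before unbinding.
    p_complete = (kp / (kp + koff)) ** n_steps
    return occupancy * p_complete


# Without proofreading, only occupancy matters.
no_kp_short = steady_state_output(0.5, t_half=2.0, n_steps=0)
no_kp_long = steady_state_output(0.5, t_half=10.0, n_steps=0)

# With proofreading, equal occupancy gives different outputs.
kp_short = steady_state_output(0.5, t_half=2.0, n_steps=3)
kp_long = steady_state_output(0.5, t_half=10.0, n_steps=3)
```

      With `n_steps=0` the two outputs are identical at equal occupancy, whereas with `n_steps=3` the longer half-life gives the larger output, which is the steady-state signature the quoted paragraph describes.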

      3) The authors use a simple equation for KP to fit their datasets in Figure 4, equivalently to their previous work. However, no goodness-of-fit metric is provided for these fits, and by manual inspection it is hard to see the defining curves of their KP model in the datasets, especially not for LAT and DAG, where the datasets look much more like vertical bars. The estimated values of steps (n) may well be the best fit to the data, but they are not necessarily a 'good' fit.

      To assist readers in assessing how well our models fit our datasets, we have included heatmaps of the residuals from each model fit (Fig 4S3) on page 52, along with discussion (reproduced below) of the residual plots of regions where our models imperfectly capture our dataset on page 13 line 283.

      “To assess our model fits, we evaluated the residuals of each model subtracted from their respective dataset. For Zap70 recruitment, our model underestimates the degree of activation at moderate binding half-lives and receptor occupancies, as indicated by the positive region in the center of the heatmap. It is possible that Zap70 recruitment reaches saturation at shorter ligand binding half-lives than our model predicts (Fig. 4S3 A). For both LAT clustering and DAG generation, our models performed poorest in the region of lowest occupancy and shortest half-life (Fig. 4S3 B&C). In this region of our dataset, the fluorescent signal from bound LOV2 above the background fluorescence of unbound LOV2 is smallest. To compensate for fluorescence of unbound LOV2, we subtract off the local background fluorescence of unbound LOV2 around each cell. In doing so we may be underestimating the amount of LOV2 bound to each cell, leading to an underestimation of signaling output by the models. Future studies at LOV2 densities approaching single molecule would better capture this regime of receptor occupancy, but cell-to-cell variation in activation would be too high to be compatible with our current steady-state analysis (Lin et al., 2019).”

      4) The values of n are also very high, which would imply that the kp rate constant might be very fast to compensate; no estimates of this value are presented. Recent data from the Dushek lab (Pettmann et al, eLife 2021) measured n to be ~3, which seems much more physically realistic. Furthermore, in their previous published work, Tischer & Weiner measured n to be 2.7 for DAG production but in the present study it is now n=11.3, using the same equation

      We are unable to estimate the kp rate constant, as our datasets are at steady state and do not provide temporal information. To assess the plausibility of our higher n value fits, we explored the steady-state model presented in Ganti et al. PNAS 2020, which defines a kp rate of 0.1 s⁻¹. This model predicts the minimum number of signaling steps required to achieve a defined Hopfield error rate at defined cognate-ligand/self-ligand concentration and half-life ratios. Our exploration of this model is shown in Fig. 4S4 on page 53 and discussed on page 14 line 299.

      “In our previous work our model fit fewer (N=2.7) steps to DAG generation. We now fit a higher number of steps (N=11.3) to DAG generation. This change could be due to the incorporation of ICAM into our current study, which has been shown to potentiate ligand discrimination (Pettmann et al., 2021). Furthermore, our previous antibody-based adhesion may have short-circuited some proofreading steps by irreversibly holding the cell membrane close to the supported lipid bilayer. To evaluate whether our higher fit values are indeed the best-fit values for our datasets, we fit our model to each dataset while holding the value of N constant in the range of zero to fourteen steps, and evaluated the average residual value for each model fit (Fig 4S3 D). For all signaling steps, the fit value of N was near the minimum of the average residual and had a lower average residual value than a model with 3 proofreading steps.
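
      The procedure described above (hold N fixed, refit the remaining parameters, compare average residuals across N) can be sketched as follows. Everything here is an assumption made for illustration: the model is a textbook proofreading expression rather than the authors' equation, the only refit parameter is a per-step rate kp found by grid search, the mean absolute residual stands in for "average residual", and the data are synthetic and noise-free.

```python
import math


def kp_model(occupancy, t_half, n_steps, kp):
    """Textbook proofreading output: occupancy * (kp / (kp + koff)) ** N."""
    koff = math.log(2) / t_half
    return occupancy * (kp / (kp + koff)) ** n_steps


def profile_n(data, n_values, kp_grid):
    """For each fixed N, return the smallest mean absolute residual over kp.

    data: list of (occupancy, t_half, observed_output) tuples
    """
    profile = {}
    for n in n_values:
        profile[n] = min(
            sum(abs(obs - kp_model(occ, t, n, kp)) for occ, t, obs in data)
            / len(data)
            for kp in kp_grid
        )
    return profile


# Synthetic illustration only: data generated from N = 5, kp = 1.0.
kp_grid = [k / 10 for k in range(1, 51)]
synthetic = [
    (occ, t, kp_model(occ, t, 5, 1.0))
    for occ in (0.2, 0.5, 0.8)
    for t in (1.0, 3.0, 10.0)
]
profile = profile_n(synthetic, range(0, 15), kp_grid)
best_n = min(profile, key=profile.get)
```

      Plotting `profile` against N gives the kind of curve the passage refers to: the fitted N should sit near its minimum, and the value at N = 3 can be read off directly for comparison.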

      To assess the plausibility of a larger number of proofreading steps, we implemented the steady-state kinetic proofreading model from Ganti et al. (Ganti et al., 2020). The model estimates the minimum number of proofreading steps required to discriminate between cognate-ligands and self-ligands with different binding half-lives, present at a given concentration ratio, at a given Hopfield error rate (Hopfield, 1974). First, we evaluated what combinations of ligand half-lives and concentration ratios an 11-step kinetic proofreading network could discriminate at an error rate less than 10⁻³ (Fig 4S4 A). We chose the error rate of 10⁻³, as it is an order of magnitude less than the theorized 10⁻⁴ upper limit error rate of the native TCR (Ganti et al., 2020). At moderate half-life ratios, an 11-step network can discriminate cognate peptides present in small concentrations (e.g. 1 cognate-ligand per 1000 self-ligands at a half-life ratio of 6).

      In our optogenetic system, the ratio of the average ligand binding half-life between the longest suppressive half-life and the shortest fully activated half-life is about 2. However, an 11-step network is insufficient to discriminate between ligands with a half-life ratio of 2, even at the high ligand ratio of 1 (equal concentrations of cognate- and self-ligand). This suggests our cells are unlikely to be detecting the average ligand binding half-life of each blue-light condition, but are more likely detecting longer-lived binding events from the underlying distribution of half-lives. Another possibility is that our in vitro washout measurements, which measure average ligand binding half-lives of soluble ligands diffusing in three dimensions, differ from the half-lives of ligand-receptor interactions between the cell’s plasma membrane and the supported lipid bilayer diffusing in two dimensions (J. Huang et al., 2010).

      To better explore the kinetic proofreading model space, we generated heatmaps reporting the required number of steps to discriminate combinations of ligand and half-life ratios at an error rate of 10⁻³ (Fig 4S4 B). To discriminate between ligands with a half-life ratio of two, at least 14 steps are needed when the ligands are at equal concentrations, and more than 25 steps are needed if cognate-ligands are 1 per 1000 self-ligands. The required number of proofreading steps decreases rapidly as the half-life ratio increases, reaching a minimum of 8 steps needed for a concentration ratio of 1/1000 and a half-life ratio of 10, which is more in line with physiological half-life ratios between agonist and non-agonist peptides (M. M. Davis et al., 1998).

      After comparing our results with the Ganti model, this analysis suggests that our number of fit proofreading steps may be somewhat inflated as a function of our use of the average ligand binding half-lives from three-dimensional washout experiments in place of the two-dimensional single-molecule information T cells use to make activation decisions. However, the higher fit N values are more consistent with the required number of steps to discriminate ligands under more physiological conditions than our previous measurements of ~3 steps, which would not be expected to discriminate ligands with a half-life ratio of 10 even at a ligand ratio of 1 (Fig 4S4 B, right).”
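
      The way the required number of steps scales with half-life ratio and concentration ratio can be illustrated with an idealized bound. The Python sketch below assumes that each proofreading step suppresses self-ligand signal by the full half-life ratio (the limit of slow proofreading steps) and defines the error rate as self-derived over cognate-derived signal. Both are simplifying assumptions, so the function (a hypothetical name) gives only a lower bound and does not reproduce the Ganti et al. model or the values in Fig 4S4.

```python
from fractions import Fraction


def min_steps_ideal(half_life_ratio, self_per_cognate, error_rate):
    """Smallest n with self_per_cognate / half_life_ratio**n <= error_rate.

    half_life_ratio : cognate half-life divided by self half-life (> 1)
    self_per_cognate: number of self-ligands per cognate-ligand
    error_rate      : tolerated ratio of self to cognate signal
    Exact rational arithmetic avoids floating-point edge cases.
    """
    ratio = Fraction(half_life_ratio)
    if ratio <= 1:
        raise ValueError("half_life_ratio must exceed 1")
    error = Fraction(self_per_cognate)
    target = Fraction(error_rate)
    n = 0
    while error > target:
        error /= ratio  # one more idealized proofreading step
        n += 1
    return n


eta = Fraction(1, 1000)
equal_conc_ratio_2 = min_steps_ideal(2, 1, eta)      # 10
rare_cognate_ratio_2 = min_steps_ideal(2, 1000, eta)  # 20
rare_cognate_ratio_10 = min_steps_ideal(10, 1000, eta)  # 6
```

      These idealized bounds (10, 20 and 6 steps) sit below the 14, more than 25, and 8 steps quoted above, as expected when each real step discriminates less than the full half-life ratio, while showing the same steep drop in required steps as the half-life ratio grows.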

      5) If the fitted value of n provides no realistic insight into the KP mechanism, it should not be discussed as though it does.

      The many assumptions of our simplistic model likely result in error in determining the absolute number of fit proofreading steps. We feel the strength of our model lies in capturing the relative increase in the strength of proofreading as signal propagates through the cascade, and not in determining the absolute number of proofreading steps, though it is comforting that our values are broadly consistent with the expectations of Ganti et al. To highlight the point that relative values are the most important feature of our experiments, we are open to normalizing our n fit values by the fit n of Zap70 for all discussion of our results and the proofreading strength increase shown in Fig 4D if the reviewers think this will better highlight the relative increase in proofreading strength.

      6) While it is good to confirm it, the result that downstream signaling complexes reset more slowly than distal ones is surely to be expected, given the increased number of steps over which ligand unbinding must traverse, as in their Erlang distribution. You would not expect ERK phosphorylation to decrease at the same rate as LAT cluster dissociation for this same reason. However, the fact that the lifetime of LAT clustering (14.2s) or ZAP70 (9.6s) is so different to LOV2 (3.3s) provides good evidence that it is not proofreading, as by definition the measured outputs should rapidly return to the 'unbound' state in line with ligand unbinding. At least for LAT, there must be a 'memory' from previous signalling lasting several seconds, which means the system has not reset, as required for true KP.

      Slower resetting of downstream signaling events in a kinetic proofreading cascade is not a given, as it could be the case that all events reset at the same rate. One requirement for kinetic proofreading is that events in the chain be irreversible on the timescale of the ligand binding half-life. The steps are reset through an orthogonal pathway, as opposed to traversing back down a chain of reversible reactions. Both the TCR and LAT are dephosphorylated by the phosphatase CD45, and it would be possible for CD45 to dephosphorylate both proteins at the same rate (or even dephosphorylate LAT faster than the TCR). To clarify this point, we have expanded the discussion of possible reset mechanisms on page 21 line 451, as reproduced below.

      “An attractive reset mechanism is the segregation of CD45 away from bound receptors, creating spatial regions in which TCR and LAT associated activating events can occur (S. J. Davis & van der Merwe, 2006). Super-resolution microscopy by Razvag et al. measured TCR/CD45 segregated regions within seconds of antigen contact at the tips of T cell microvilli (Razvag et al., 2018). Upon unbinding these regions of phosphatase exclusion collapse, allowing CD45 to dephosphorylate receptor ITAMs and LAT clusters. However, the rate of dephosphorylation for LAT and receptor ITAMs could differ. LAT clusters exclude CD45 in reconstituted bilayer systems, potentially limiting the dephosphorylation to LAT molecules at the edges of the cluster thus slowing reset (Su et al., 2016). The kinetics of multivalent protein-protein interactions within TCR and LAT clusters can also influence dephosphorylation and dissociation rates (Goyette et al., 2022).

      A CD45-mediated reset mechanism would restrict proofreading to membrane-bound signaling events occurring within a CD45-depleted region. Downstream events that dissociate away from the membrane or diffuse out of the segregated region could not directly participate in the proofreading chain, as the collapse of a CD45 segregated region could not reset signaling entities released into the cytosol (e.g. release of IP3 in the cleavage of PIP2 to DAG).”

      We also added discussion of recent work from Harris et al. quantifying the slower timescale of Ca++ and ERK reset upon TCR signal termination on Page 23 line 498 as reproduced below.

      “Recently Harris et al. quantified the reset rate of the downstream signaling events Ca++ release and ERK phosphorylation upon signal inhibition to be 29 seconds and 3 minutes respectively (Harris et al., 2021). They showed both Ca++ and ERK levels can persist across short inhibitions of signaling. What makes LAT clusters different than these persistent downstream events? The dissolution of LAT clusters is directly triggered by the unbinding of ligand from the TCR, and both the TCR and LAT are de-phosphorylated by CD45. To our knowledge, the rate of ERK dephosphorylation or cytosolic Ca++ depletion are not accelerated by TCR unbinding, and are turned over through constant rather than agonist-gated degradation. A useful future line of inquiry would be to quantify the reset rate for signaling steps throughout the cascade upon ligand unbinding versus orthogonal signal inhibition (e.g. kinase inhibition).”

    1. Author Response

      Reviewer #1 (Public Review):

      The paper presents a Bayesian model framework for estimating individual perceptual uncertainty from continuous tracking data, taking into account motor variability, action cost, and possible misestimation of the generative dynamics. While the contribution is mostly technical, the analyses are well done and clearly explained. The paper provides therefore a didactic resource for students wishing to implement similar models on continuous action data.

      First off, the paper is lucidly written - which made it a very pleasant read, especially compared to many other modeling papers, and the authors are to be congratulated for this. As such, the paper provides a valuable resource for didactic purposes alone. While the employed methods are not necessarily individually novel, the assembly of various parts into a coherent framework appears nonetheless valuable.

      Thank you for the positive evaluation!

      I have two major concerns, though:

      1). My main comment regards the model comparison using WAIC (Figure 4E) or cross-validation (Figure S4a): If we translate these numbers into Bayes factors, they are extraordinarily high. I assume that the p(x_i|\theta_s) in equation 7 are calculated assuming that the motor noise on u_{i,t} is independent? This would assume that motor processes act i.i.d with a timeframe of 60ms, which is probably not a very realistic assumption- given that much of the motor variability (as stated by the authors) comes likely from a central (i.e. planning) origin. Would the delta-WAIC not be much smaller if motor noise was assumed to be correlated across time points? Would this assumption change the \sigma estimates?

      Thank you for posing this question. First, sequential models tend to have much larger differences in the likelihood of parameters given data because of the large number of individual data points within a single sequence. Thus, it is not uncommon for model comparison to show much more extreme differences between models for sequential data, as is the case in the present manuscript.

      Second, since our computational framework is based on LQG control, the model indeed assumes that motor noise is independent across time steps. We agree that this assumption might not be realistic for time steps of 16ms duration. While this assumption is certainly a simplification, the assumption of independent noise across time steps is very common both in perceptual models as well as in models of motor control, and there is to our knowledge no computationally straightforward way around it in the LQG framework. It thus applies to all of the models considered in this paper, as they all assume temporally uncorrelated noise, both in perception and action. Therefore, the ranking between the models in the model comparison should hopefully not be affected in a systematic way favoring individual models disproportionately more than others, although the magnitudes of differences in WAIC might be smaller. Since the differences in WAIC are currently in the range of 1e4, we think that they will still be significant, even when accounting for correlated noise.

      Third, we think that the simplifying assumption of independent noise does not invalidate the calculation of the WAIC, which assumes independence across trials. The p(x_i | theta_s) in equation (8) are the likelihoods of whole trials. To compute them, we assume independence of the motor noise across time steps.
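
      The two levels of independence discussed above (time steps within a trial, and trials within the WAIC) can be made concrete with a short Python sketch. It uses the standard WAIC definition on the deviance scale, −2 (lppd − p_WAIC), with per-trial log-likelihoods evaluated at posterior samples. The function names and the input layout are hypothetical, and the −2 scaling is an assumption about the convention used in the manuscript.

```python
import math
from statistics import variance


def trial_log_lik(step_log_liks):
    """Log-likelihood of one trial, assuming independent noise across time steps."""
    return sum(step_log_liks)


def waic(log_lik):
    """WAIC on the deviance scale.

    log_lik[s][i] = log p(x_i | theta_s) for posterior sample s and trial i.
    Requires at least two posterior samples.
    """
    n_samples = len(log_lik)
    n_trials = len(log_lik[0])
    lppd = 0.0
    p_waic = 0.0
    for i in range(n_trials):
        column = [log_lik[s][i] for s in range(n_samples)]
        # log of the posterior-mean likelihood, computed stably (log-sum-exp).
        m = max(column)
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_samples)
        # Effective number of parameters: posterior variance of the log-likelihood.
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic)
```

      Because each trial's log-likelihood is a sum over many time steps, even small per-step differences between models accumulate, which is one reason WAIC differences for sequential data can be very large.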

      We have added a short passage in the subsection ‘model comparison’:

      “Note that the assumption of independent noise across time steps might lead to WAIC values that are larger than those obtained under a more realistic noise model involving correlations across time. However, this should not necessarily affect the ranking between models in a systematic way, i.e. favoring individual models disproportionately more than others.”

      and a passage in the discussion that points out that modeling the noise as being independent across time points is a simplifying assumption:

      “Finally, assuming independent noise across time steps at the experimental sampling rate of (60Hz) is certainly a simplifying assumption. Nevertheless, the assumption of independent noise across time steps is very common both in models of perceptual inference as well as in models of motor control, and there is to our knowledge no computationally straightforward way around it in the LQG framework.”

      2). While the results in Figure 4a are interesting, the deviation of the \sigma estimates from the standard psychophysical estimates for the most difficult condition remains unexplained. What are the limits of this method in estimating perceptual acuity near the perceptual threshold? Is there a problem that subjects just "give up" and the motor cost becomes overwhelming? Would this not invalidate the method for threshold detection?

      We fully agree that for the most difficult conditions at the lowest contrasts all sequential models we considered are biased with respect to the uncertainties obtained with the 2AFC experiment, which is supposed to be equivalent. Interestingly, when considering synthetic data, we did not see such a discrepancy. Thus, the observed bias points towards an additional mechanism such as a computational cost or computational uncertainty, that is not captured by the current models at very low contrast.

      For the results in Fig. 4, we assumed a constant behavioral cost across all conditions. The assumption that the cost is independent of perceptual uncertainty might not hold in reality, exactly in line with your hypothesis that subjects might just "give up". There are other possible explanations, though, that could potentially be relevant here. For example, the visual system is known to integrate visual signals over longer times, when contrast is lower. This may introduce additional non-linearities in the integration, which could affect the sensitivity, as already pointed out in the study by Bonnen et al. (2015).

      We have added the following passage in the discussion section:

      “In the lowest contrast conditions, all models we considered show a large and systematic deviation in the estimated perceptual uncertainty compared to the equivalent 2AFC task. Note that when considering synthetic data, we did not see such a discrepancy. Thus, the observed bias points towards additional mechanisms such as a computational cost or computational uncertainty, that are not captured by the current models at very low contrast. One reason for this could be that the assumption of constant behavioral costs across different contrast conditions might not hold at very low contrasts, because subjects might simply give up tracking the target although they can still perceive its location. Another possible explanation is that the visual system is known to integrate visual signals over longer times at lower contrasts [Dean & Tolhurst, 1986; Bair & Movshon, 2004], which could affect not only sensitivity in a nonlinear fashion but could also lead to nonlinear control actions extending across a longer time horizon. Further research will be required to isolate the specific reasons.”

      Reviewer #2 (Public Review):

      This manuscript develops and describes a framework for the analysis of data from so-called continuous psychophysics experiments, a relatively recent approach that leverages continuous behavioral tracking in response to dynamic stimuli (e.g. targets following a position random walk). Continuous psychophysics has the potential to dramatically improve the pace of data collection without sacrificing the ability to accurately estimate parameters of psychophysical interest. The manuscript applies ideas from optimal control theory to enrich the analysis of such data. They develop a nested set of data-analytic models: Model 1: the Kalman filter (KF), Model 2: the optimal actor (which is a special case of a linear quadratic regulator appropriate for linear dynamics and Gaussian variability), Model 3: the bounded actor with behavioral costs, and Model 4: the bounded actor with behavioral costs and subjective beliefs. Each successive model incorporates parameters that the previous model did not. Each parameter is of potential importance in any serious attempt to model human visuomotor behavior. They advertise that their methods improve the accuracy of the inferred values of certain parameters relative to previous methods. And they advertise that their methods enable the estimation of certain parameters that previous analyses did not.

      What were the parameters? In this context, the Kalman filter model has one free parameter: perceptual uncertainty of target position (\sigma). The optimal actor (Model 2) incorporates perceptual uncertainty of cursor position (\sigma_p) and motor variability (\sigma_m), in addition to perceptual uncertainty of target position (\sigma) that is included in the Kalman filter (Model 1). The bounded actor with behavioral costs (Model 3) incorporates a control cost parameter (c) that penalizes effort ('movement energy'). And the bounded actor with behavioral costs and subjective beliefs (Model 4) further incorporates the human observer's possibly mistaken 'beliefs' about target dynamics (i.e. how the human's internal model of target motion differs from the true generative model). This model allows for the true target dynamics (position-random-walk with drift = \sigma_rw) to be mistakenly believed to be governed by a position-random-walk with drift = \sigma_s plus a velocity-random-walk with drift = \sigma_v.
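
      To make the role of the single free parameter of Model 1 concrete, the following is a minimal sketch of a scalar Kalman filter tracking a position random walk. It is not the authors' implementation: the function names are hypothetical, the filter is run at its steady-state gain, and the additional parameters of Models 2 to 4 (\sigma_p, \sigma_m, c, \sigma_s, \sigma_v) are deliberately left out.

```python
import numpy as np

def steady_state_gain(sigma_rw, sigma):
    """Steady-state Kalman gain for a random-walk target.

    sigma_rw: standard deviation of the random-walk step (process noise).
    sigma:    perceptual uncertainty of target position (observation noise).
    """
    q, r = sigma_rw ** 2, sigma ** 2
    # The prior variance p satisfies p = p * r / (p + r) + q,
    # i.e. p**2 - q * p - q * r = 0; take the positive root.
    p = 0.5 * (q + np.sqrt(q ** 2 + 4.0 * q * r))
    return p / (p + r)

def kalman_track(observations, sigma_rw, sigma):
    """Filtered position estimates for a sequence of noisy observations."""
    k = steady_state_gain(sigma_rw, sigma)
    estimates = np.empty(len(observations))
    x_hat = observations[0]  # initialise at the first observation
    for t, y in enumerate(observations):
        x_hat = x_hat + k * (y - x_hat)  # move towards the new observation
        estimates[t] = x_hat
    return estimates
```

      A larger \sigma gives a smaller gain, so the estimate follows the target more sluggishly; this lag is what makes \sigma identifiable from tracking data in the simplest model.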

      The authors develop each of these models, show on simulated data that true model parameters can be accurately inferred, and then analyze previously collected data from three papers that helped to introduce the continuous psychophysics approach (Bonnen et al. 2015, 2017 & Knoll et al. 2018). They report that, of the considered models, the most sophisticated model (Model 4) provides the best accounting of previously collected data. This model more faithfully approximates the cross-correlograms relating target and human tracking velocities than the Kalman filter model, and is favored by the widely applicable information criterion (WAIC).

      The manuscript makes clear and timely contributions. Methods that are capable of accurately estimating the parameters described above from continuous psychophysics experiments have obvious value to the community. The manuscript tackles a difficult problem and seems to have made important progress. Some topics of central importance were not discussed with sufficient detail to satisfy an interested reader, so I believe that additional discussion and/or analyses are required. But the work appears to be well-executed and poised to make a nice contribution to the field.

      The manuscript, however, was an uneven read. Parts of it were very nicely written, and clearly explained the issues of interest. Other parts seemed organized around debatable logic, making inappropriate comparisons to--and misleading characterizations of--previous work. Other parts still were weakened by poor editing, typos, and grammatical mistakes.

      Overall, it is a nice piece of work. But the authors should provide substantially more discussion so that readers will develop a better intuition for how and why the inference routines enable accurate estimation, and how the values of certain parameters trade off with one another. Most especially, the authors should be very careful to accurately describe and appropriately use the previous literature.

      Thanks for the generous overall assessment and the thorough review! We hope that we can address the points you raised in our revised manuscript with the answers to your specific comments below.

      To summarize, we have substantially revised the discussion section to clarify our reasoning and avoid potential misinterpretations of parts of our manuscript as a misrepresentation of previous work. We have also extended the introduction and the exposition of our models in the results section to help readers develop an intuition about the models and inference routines.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper describes a systematic biochemical analysis of UBX proteins in facilitating protein unfolding by the p97-UFD1-NPL4 (referred to here as the p97 complex). The p97 complex binds Ub and unfolds it to allow the ubiquitylated protein to be translocated into the p97 ATPase pore for unfolding. This paper demonstrates that UBX proteins are able to reduce the necessary ubiquitin chain length in order to support unfolding by p97. They explore this using ubiquitylated CMG helicase as a substrate. Removal of CMG helicase from replicated DNA is required for completion of DNA synthesis.

      First the authors demonstrate that the p97 complex only unfolds CMG with very long Ub chains. They then show that the high threshold for Ub is reduced when UBXN7, FAF1 or FAF2 are added. These proteins bind to both the p97 complex and Ub in substrates. This is then followed up in cells by demonstrating that removal of UBXN7 and FAF1 reduces CMG disassembly and is synthetic with reduced CMG ubiquitin ligase activity.

      The conclusion that human p97 requires UBX proteins to support unfolding/segregase activity when Ub chains are short would be strengthened by more precise characterization of the length of ubiquitin chains being studied, as the methods do not precisely determine the chain lengths and how this is overlapping with the number and location of primary ubiquitylation sites on Mcm7.

      Please see our reply above to essential revision point 2 (data in Figure 1-figure supplement 1 and Figure 2-figure supplement 3).

      The in cellulo results, while consistent with a contributing role for FAF1 and UBXN7 in disassembly of the CMG by p97, indicate that either other factors are required in cells or that p97 can disassemble CMG with relatively short chains in cells without the need for the UBX proteins. This needs to be reconciled with the proposed model.

      We now discuss on lines 444-450 that CMG disassembly in the absence of UBXN7 and FAF1 might be promoted by additional UBX proteins not characterised in this study, or else be due to extensive CMG-MCM7 ubiquitylation that bypasses the requirement for UBX proteins (as predicted by our data in Figure 1). Note that short ubiquitin chains on CMG-MCM7 in cells treated with p97 inhibitor need to be interpreted with caution, as it is likely that p97 inhibition lowers the pool of free ubiquitin in cells. This point is discussed on lines 444-445 of the revised manuscript.

      Reviewer #3 (Public Review):

      The ATPase p97 (Cdc48 in yeast) unfolds ubiquitinated substrates with the help of its heterodimeric cofactor UFD1-NPL4 (U-N). Using the previously established CMG helicase complex as model substrate in a fully reconstituted biochemical assay, Fujisawa and Labib show that p97-U-N can efficiently disassemble the helicase complex only when it is modified with multiple, long ubiquitin chains. This is in contrast to the yeast Cdc48-U-N complex, which disassembles helicase complexes carrying long or short (6-10 ubiquitin moieties) chains with similar efficiency. The authors demonstrate that the requirement of p97-U-N for long chains can be overcome by the presence of p97 cofactors of the UBA-UBX type, including UBXN7, FAF1, FAF2 and (much less so) UBXN1. They show that this reduction in the 'ubiquitin threshold' of p97-U-N by UBXN7, FAF1 and FAF2 requires their UBX domain mediating p97 binding. They further show that the UBA and UIM domains of UBXN7 contribute to its activity in the assay, whereas the UBA domain of FAF1 and FAF2 is dispensable. Instead, a coiled-coil domain preceding the UBX domain of FAF1 and FAF2 is required for their activity, and both the coiled-coil-UBX domain organization and its activity are conserved in the worm homologue UBXN-3. Using UBXN7 and FAF1 knockout cells, Fujisawa and Labib then demonstrate that UBXN7 is required for efficient CMG helicase disassembly during S phase, with a minor contribution of FAF1, whereas both cofactors possess redundant roles in mitotic CMG helicase disassembly. Finally, the authors show that UBXN7 and FAF1 double knockout cells are hypersensitive to the NEDDylation inhibitor MLN4924 and suggest that this reflects their importance for p97-U-N unfoldase activity under conditions of restricted ubiquitination activity.

      This manuscript describes the intriguing observation that the yeast and mammalian Cdc48/p97-U-N complexes have distinct requirements, at least in the in vitro assay used, with respect to the substrate's ubiquitination state and to the presence of additional cofactors. While the concept of UBA-UBX cofactors assisting/stimulating Cdc48/p97-U-N activity is well-established, their link to ubiquitin chain length is novel and unexpected. The experiments are performed to a high technical standard, and the conclusions are mostly supported by the data. However, a shortcoming of the paper is that it remains entirely descriptive regarding the effect of the UBX proteins on the ubiquitin threshold, without providing mechanistic insights into their function or the molecular basis underlying the distinct thresholds.

      1) It remains unclear if the failure of p97-U-N to disassemble the helicase complex carrying short ubiquitin chains reflects impaired binding, priming or translocation of the substrate. It should be straightforward to test if the UBA-UBX cofactors simply stabilize the p97-U-N-substrate complex.

      As shown in previous studies, human UFD1-NPL4 bind stably to p97 in the absence of UBX proteins (our new data in Figure 3-figure supplement 2D illustrate this).

      The distinct domain requirements for UBXN7 (UBA, UIM, UBX) and FAF1/FAF2 (coiled-coil-UBX) suggest different mechanisms of stimulation, which should be discussed in more detail.

      We discuss further the roles of UBXN7 and FAF1/FAF2 on lines 533-548.

      The additive defects of the UBXN7 and FAF1 double knockout cells could indicate either redundant functions (as the authors propose) or synergistic function of both cofactors. To that end, the authors could test if UBXN7 and FAF1 can bind simultaneously to the same p97-U-N-substrate complex and if they act synergistically in helicase disassembly, e.g. at limiting cofactor concentrations.

      Previous studies have found that UBXN7 binds to p97 and UFD1-NPL4 with a 1:6:1 ratio and the same is true for FAF1, without any evidence of both UBXN7 and FAF1 binding to the same p97-UFD1-NPL4 complexes (Hanzelmann et al., 2011). Correspondingly, we did not observe any synergistic effect of FAF1 with UBXN7 upon the disassembly of ubiquitylated CMG by p97-UFD1-NPL4, when comparing reactions with a single UBX protein or reactions with both (our unpublished data).

      2) Having all purified proteins at hand, the authors should test which component of the system causes the elevated ubiquitin threshold of mammalian p97-U-N, by combining yeast Cdc48 with mammalian U-N and vice versa, etc.

      We thank the reviewer for this very interesting suggestion. The data are presented in Figure 3, showing that human UFD1-NPL4 and yeast Ufd1-Npl4 set the ubiquitin threshold for their cognate unfoldase enzymes.

      Can yeast Ubx5, which is a clear homologue of UBXN7, substitute for the mammalian UBA-UBX cofactors?

      This was also an interesting suggestion – we tested Ubx5 and didn’t see any stimulation. We didn’t include the data as we lack a positive control for Ubx5 activity.

      3) The authors emphasize that mammalian p97-U-N in the absence of UBA-UBX cofactors requires long ubiquitin chains for activity. However, they should consider the possibility that the critical property is chain topology, rather than chain length. There is evidence that p97-U-N prefers substrates with branched chains (see PMIDs 28512218, 29033132), and multiple ubiquitin chains on the helicase substrate may mimic those.

      We thank the reviewer for raising this important point and we now cite the two papers mentioned above, on lines 171 and 177.

      In the revised version of the manuscript, we carefully characterise the ubiquitin chains that are formed under the various conditions used (Figure 1-figure supplement 1). Importantly, we also show that human p97-UFD1-NPL4 can disassemble highly ubiquitylated CMG, regardless of whether several ubiquitin chains or just one are attached to CMG-Mcm7 (Figure 1-figure supplement A+C; Figure 2-figure supplement 3A).

      Moreover, we also show that human p97-UFD1-NPL4 is comparable to yeast Cdc48-Ufd1-Npl4 in being able to disassemble CMG that is highly ubiquitylated with ‘K48-only’ ubiquitin that cannot form mixed chain linkages (Figure 2-figure supplement 3B).

      These data indicate that p97-UFD1-NPL4 can disassemble heavily ubiquitylated CMG complexes with long K48-linked ubiquitin chains on CMG-Mcm7, regardless of the number of chains and regardless of the presence of other chain linkages (in addition to K48-linked chains).

      It appears that worm CDC48-U-N in the absence of UBXN-3 cannot efficiently disassemble substrate carrying even long chains (Fig. 3 - supplement 2). The authors should discuss this finding in the context of their ubiquitin threshold model.

      This is an interesting point, suggesting that the threshold of C. elegans CDC-48_UFD-1_NPL-4 is even higher than human p97-UFD1-NPL4, in the absence of UBX proteins. However, we think that this issue is beyond the scope of our manuscript and likely requires structural biology to provide a definitive explanation. Our manuscript just uses the C. elegans enzymes to make one simple and clear point – namely that the essential role of the coiled coil domain of human FAF1 is conserved in its worm orthologue UBXN-3.

    1. Author Response

      Reviewer #1 (Public Review):

      This study investigates low-frequency (LF) local field potentials and high-frequency (HF, >30 Hz) broadband activity in response to the visual presentation of faces. To this end, rhythmic visual stimuli were presented to 121 human participants undergoing depth electrode recordings for epilepsy. Recordings were obtained from the ventral occipito-temporal cortex and brain activity was analyzed using a frequency-tagging approach. The results show that the spatial, functional, and timing properties of LF and HF responses are largely similar, which in part contradicts previous investigations in smaller groups of participants. Together, these findings provide novel and convincing insights into the properties and functional significance of LF and HF brain responses to sensory stimuli.

      Strengths

      • The properties and functional significance of LF and HF brain responses is a timely and relevant basic science topic.

      • The study includes intracranial recordings in a uniquely high number of human participants.

      • Using a frequency tagging paradigm for recording and comparing LF and HF responses is innovative and straightforward.

      • The manuscript is well-written and well-illustrated, and the interpretation of the findings is mostly appropriate.

      Weaknesses

      • The writing style of the manuscript sometimes reflects a "race" between the functional significance of LF and HF brain responses and researchers focusing on one or the other. A more neutral and balanced writing style might be more appropriate.

      We would like first to thank the reviewer for his/her positive evaluation as well as constructive and helpful comments for revising our manuscript.

      Regarding the writing style: we had one major goal in this study, which is to investigate the relationship between low and high frequencies. However, it is fair to say – as we indicate in our introduction section – that low frequency responses are increasingly cast aside in the intracranial recording literature. That is, an increasing proportion of publications simply disregard the evoked electrophysiological responses that occur at the low end of the frequency spectrum, to focus exclusively on the high-frequency response (e.g., Crone et al., 2001; Flinker et al., 2011; Mesgarani and Chang, 2012; Bastin et al., 2013; Davidesco et al., 2013; Kadipasoaglu et al., 2016; 2017; Shum et al., 2013; Golan et al., 2016; 2017; Grossman et al., 2019; Wang et al., 2021, see list of references at the end of the reply).

      Thus, on top of the direct objective comparison between the two types of signals that our study originally provides, we think that it is fair to somehow reestablish the functional significance of low frequency activity in intracranial recording studies.

      The writing style reflects that perspective rather than a race between the functional significance of LF and HF brain responses.

      • It remains unclear whether and how the current findings generalize to the processing of other sensory stimuli and paradigms. Rhythmic presentation of visual stimuli at 6 Hz with face stimuli every five stimuli (1.2 Hz) represents a very particular type of sensory stimulation. Stimulation with other stimuli, or at other frequencies likely induce different responses. This important limitation should be appropriately acknowledged in the manuscript.

      We agree with Reviewer 1 (see also Reviewer 2) that it is indeed important to discuss whether the current findings generalize to other brain functions and to previous findings obtained with different methodologies. We argue that our original methodological approach allows maximizing the generalizability of our findings.

      First, the frequency-tagging approach is a longstanding stimulation method, dating from the 1930s (i.e., well before standard evoked potential recording methods; Adrian & Matthews, 1934; intracranially: Kamp et al., 1960) and widely used in vision science (Regan, 1989; Norcia et al., 2015) but also in other domains (e.g., auditory, somato-sensory stimulation). More importantly, this approach not only significantly increases the signal-to-noise ratio of neural responses, but also the objectivity and the reliability of the LF-HF signal comparison (objective identification and quantification of the responses, very similar analysis pipelines).

      Second, regarding the frequency of stimulation, our scalp EEG studies with high-level stimuli (generally faces) have shown that the frequency selection has little effect on the amplitude and the shape of the responses, as long as the frequency is chosen within a suitable range for the studied function (Alonso-Prieto et al., 2013). Regarding the paradigm used specifically in the present study (originally reported in Rossion et al., 2015 and discussed in detail for iEEG studies in Rossion et al., 2018), it has been validated with a wide range of approaches (EEG, MEG, iEEG, fMRI) and populations (healthy adults, patients, children and infants), identifying typically lateralized occipito-temporal face-selective neural activity with a peak in the middle section of the lateral fusiform gyrus (Jonas et al., 2016; Hagen et al., 2020 in iEEG; Gao et al., 2018 in fMRI).

      Importantly, specifically for the paradigm used in the present study, our experiments have shown that the neural face-selective responses are strictly identical whether the faces are inserted at periodic or non-periodic intervals within the train of nonface objects (Quek & Rossion, 2017), that the ratio of periodicity for faces vs. objects (e.g., 1/5, 1/7 … 1/11) does not matter as long as the face-selective responses do not overlap in time (Retter & Rossion, 2016; Retter et al., 2020) and that the responses are identical across a suitable range of base frequency rates (Retter et al., 2020).

      Finally, we fully acknowledge that the category-selective responses would be different in amplitude and localization for other types of stimuli, as also shown in our previous EEG (Jacques et al., 2016) and iEEG (Hagen et al., 2020) studies. Yet, as indicated in our introduction and discussion section, there are many advantages of using such a highly familiar and salient stimulus as faces, and in the visual domain at least we are confident that our conclusions regarding the relationship between low and high frequencies would generalize to other categories of stimuli.

      We added a new section on the generalizability of our findings at the end of the Discussion, p.32-33 (line 880) (see also Reviewer 2’s comments). Please see above in the “essential revisions” for the full added section.

      Reviewer #2 (Public Review):

      The study by Jacques and colleagues examines two types of signals obtained from human intracortical electroencephalography (iEEG) measures, the steady-state visual evoked potential and a broadband response extending to higher frequencies (>100 Hz). The study is much larger than typical for iEEG, with 121 subjects and ~8,000 recording sites. The main purpose of the study is to compare the two signals in terms of spatial specificity and stimulus tuning (here, to images of faces vs other kinds of images).

      The experiments consisted of subjects viewing images presented 6 times per second, with every 5th image depicting a face. Thus the stimulus frequency is 6 Hz and the face image frequency is 1.2 Hz. The main measures of interest are the responses at 1.2 Hz and harmonics, which indicate face selectivity (a different response to the face images than the other images). To compare the two types of signals (evoked potential and broadband), the authors measure either the voltage fluctuations at 1.2 Hz and harmonics (steady-state visually evoked potential) or the fluctuations of broadband power at these same frequencies.
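
      The frequency-tagging logic described above can be sketched in a few lines. This is an illustrative example on synthetic data, not the analysis pipeline of the study: the function names are hypothetical, and excluding face-frequency harmonics that coincide with harmonics of the 6 Hz base rate is an assumption about how face-selective responses would be isolated.

```python
import numpy as np

BASE_HZ = 6.0           # image presentation rate
FACE_HZ = BASE_HZ / 5   # every 5th image is a face, i.e. 1.2 Hz

def face_harmonics(max_hz):
    """Harmonics of the face frequency up to max_hz.

    Every 5th harmonic of 1.2 Hz coincides with a harmonic of the 6 Hz base
    rate; those are skipped here (an assumption, see the text above).
    """
    n_max = int(np.floor(max_hz / FACE_HZ + 1e-9))
    return [round(n * FACE_HZ, 6) for n in range(1, n_max + 1) if n % 5 != 0]

def amplitude_at(signal, fs, freqs):
    """Single-sided FFT amplitude of `signal` at the requested frequencies.

    Assumes the recording length gives a frequency resolution (fs / N) that
    places each requested frequency on an FFT bin.
    """
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / n
    bin_hz = fs / n
    idx = np.rint(np.asarray(freqs) / bin_hz).astype(int)
    return spectrum[idx]
```

      For the evoked potential, `signal` would be the voltage trace; for the broadband measure, it would be the time course of high-frequency power, read out at the same frequencies.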

      Much prior work has led to the interpretation of the broadband signal as the best iEEG correlate of spatially local neuronal activity, with some studies even linking the high-frequency broadband signal to the local firing rate of neurons near the electrode. In contrast, the evoked potential is often thought to arise from synchronous neural activity spread over a relatively large spatial extent. As such, the broadband signal, particularly in higher frequencies (here, 30-160 Hz) is often believed to carry more specific information about brain responses, both in terms of spatial fidelity to the cortical sources (the cortical point spread function) and in terms of functional tuning (e.g., preference for one stimulus class over another). This study challenges these claims, particularly, the first one, and concludes that (1) the point spread functions of the two signals are nearly identical, (2) the cortical locations giving rise to the two signals are nearly identical, and (3) the evoked potential has a considerably higher signal-to-noise ratio.

      These conclusions are surprising, particularly the first one (same point spread functions) given the literature which seems to have mostly concluded that the broadband signal is more local. As such, the findings pose a challenge to the field in interpreting the neuronal basis of the various iEEG signals. The study is large and well done, and the analysis and visualizations are generally clear and convincing. The similarity in cortical localization (which brain areas give rise to face-selective signals) and in point-spread functions are especially clear and convincing.

      We thank the reviewer for his/her fair and positive evaluation of our work and helpful comments.

      Although the reviewer does not disagree or criticize our methodology, we would like to reply to their comment about the surprising nature of our findings (particularly the similar spatial extent of LF and HF). In fact, we think that there is little evidence for a difference in ‘point-spread’ function in the literature, and thus that these results are not really that surprising. As we indicate in the original submission (discussion), in human studies, to our knowledge, the only direct comparisons of the spatial extent of LF and HF responses are performed by counting and reporting the number of electrodes showing a significant response in the two signals (Miller et al., 2007; Crone et al., 1998; Pfurtscheller et al., 2003; see list of references at the end of the reply). Overall, these studies find a smaller number of significant electrodes with HF compared to LF. Intracranial EEG studies pointing to a more focal origin of HF activity generally cite one or several of these publications (e.g. Shum et al., 2013). In the current study, we replicate this finding and provide additional analyses showing that it is confounded with SNR differences across signals and created artificially by the statistical threshold. When no threshold is used and a more appropriate measure of spatial extent is computed (here, spatial extent at half maximum), we find no difference between the 2 signals, except for a small difference in the left anterior temporal lobe. Moreover, in the intracranial EEG literature, the localness of the HF response is often backed by the hypothesis that HF is a proxy for firing rate. Indeed, since spikes are supposed to be local, it is implied that HF has to be local as well. However, while clear correlations have been found between HF measured with micro-electrodes and firing rate (e.g., Nir et al., 2007; Manning et al., 2009), there is no information on how local the activity measured at these electrodes is, and no evidence that the HF signal is more local than the LF signal in these recordings. Last, the link between (local?) firing rate and HF/broadband signal has been shown using micro-electrodes, which vastly differ in size from the macro-electrodes used in ECOG/SEEG recordings. The nature of the relationship and its spatial properties may differ between micro-electrodes and macro-electrodes.
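
      The difference between counting supra-threshold contacts and measuring spatial extent at half maximum can be illustrated with a toy example. This is a sketch under strong assumptions, not the authors' analysis: the function names are hypothetical, the amplitude profile is taken to decrease monotonically with distance from the peak contact, and the numbers are invented for illustration.

```python
import numpy as np

def count_above_threshold(amplitude, threshold):
    """Number of contacts whose amplitude exceeds a fixed threshold."""
    return int(np.sum(np.asarray(amplitude, dtype=float) > threshold))

def extent_at_half_max(distance, amplitude):
    """Distance from the peak at which amplitude falls to half its peak value.

    Assumes amplitude[0] is the (positive) peak and that the profile decreases
    with distance; uses linear interpolation between neighbouring contacts.
    """
    d = np.asarray(distance, dtype=float)
    a = np.asarray(amplitude, dtype=float)
    half = a[0] / 2.0
    below = np.nonzero(a <= half)[0]
    if below.size == 0:
        return float("nan")  # profile never drops to half maximum
    i = below[0]
    frac = (a[i - 1] - half) / (a[i - 1] - a[i])
    return float(d[i - 1] + frac * (d[i] - d[i - 1]))

# Two toy profiles with the same shape but a tenfold difference in amplitude.
distance = np.array([0.0, 1.0, 2.0, 3.0])
weak = np.array([4.0, 3.0, 1.0, 0.0])
strong = 10.0 * weak
```

      With a fixed threshold, the stronger signal yields more supra-threshold contacts even though both profiles have the same shape, whereas the half-maximum extent is identical for the two. This mirrors the argument that contact counts are confounded with SNR.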

      We feel these points were all already discussed thoroughly in the original submission of the manuscript (see p. 28-30 in the revised manuscript) and did not modify the revised manuscript.

      The lack of difference between the two signals (other than SNR), might ordinarily raise suspicion that there is some kind of confound, meaning that the two measures are not independent. Yet there are no obvious confounds: in principle, the broadband measure could reflect the high-frequency portion of the evoked response, rather than a separate, non-phase locked response to the signal. However, this is unlikely, given the rapid fall-off in the SSVEP at amplitudes much lower than the 30 Hz low-frequency end of the broadband measure. And the lack of difference between the two signals should not be confused for a null result: both signals are robust and reliable, and both are largely found in the expected parts of the brain for face selectivity (meaning the authors did not fail to measure the signals - it just turns out that the two measures have highly similar characteristics).

      The current reviewer and reviewer #3 both commented on or raised concerns about the possibility that the HF signal as measured in our study might be contaminated by the LF evoked response, thereby explaining our finding of a strong similarity between the 2 signals.

      This was actually a potential (minor) concern given the time-frequency (wavelet) parameters used in the original manuscript. Indeed, the frequency bandwidth (measured as half width at half maximum) of the wavelet used at the lower bound (30 Hz) of the HF signal extended down to 11 Hz (i.e., half width at half maximum = 19 Hz). At 40 Hz, the bandwidth extended down to 24 Hz (i.e., HWHM = 16 Hz). While low-frequency face-selective responses in that range (above 16 Hz) are negligible (see e.g., Retter & Rossion, 2016; and data below for the present study), they could potentially have slightly contaminated the high-frequency activity.

      To fully ensure that our findings could not be explained by such a contamination, we recomputed the HF signal using wavelets with a smaller frequency bandwidth and changed the high-frequency range to 40-160 Hz. This ensures that the lowest frequency included in the HF signal (defined as the bottom of the frequency range minus half of the frequency bandwidth, i.e., half width at half maximum) is 30 Hz, which is well above the highest significant harmonic of the face-selective response in our frequency-tagging experiment (i.e., 22.8 Hz; defined as the harmonic of the face frequency where, at group level, the number of recording contacts with a significant response was not higher than the number of significant contacts detected for noise in bins surrounding harmonics of the face frequency, see figure below). Thus, the signal measured in the 40-160 Hz range is not contaminated by lower-frequency evoked responses.
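
      The bound described above reduces to simple arithmetic, sketched here in Python. The 10 Hz HWHM at the 40 Hz lower edge is an assumption inferred from the stated 30 Hz result, and all function and variable names are illustrative rather than taken from the analysis code.

```python
# Illustrative check of the HF band's lower spectral edge against the
# highest significant face-selective harmonic. All names are hypothetical.

FACE_FREQ_HZ = 1.2  # face presentation rate (from the text)


def lowest_included_frequency(lower_bound_hz, hwhm_hz):
    """Bottom of the frequency range minus the half width at half maximum."""
    return lower_bound_hz - hwhm_hz


def harmonic_frequency(harmonic_index, base_hz=FACE_FREQ_HZ):
    """Frequency of the n-th harmonic of the face frequency."""
    return harmonic_index * base_hz


# Highest significant harmonic reported in the text: 19 x 1.2 Hz = 22.8 Hz.
highest_harmonic = harmonic_frequency(19)

# Original settings: 30 Hz lower bound, HWHM = 19 Hz (from the text).
original_lowest = lowest_included_frequency(30.0, 19.0)

# Revised settings: 40 Hz lower bound; HWHM = 10 Hz is assumed here because
# the text states that the lowest included frequency is 30 Hz.
revised_lowest = lowest_included_frequency(40.0, 10.0)

original_overlaps = original_lowest < highest_harmonic
revised_overlaps = revised_lowest < highest_harmonic
margin_hz = revised_lowest - highest_harmonic
```

      Under these assumptions the original band edge falls below the highest significant harmonic, whereas the revised edge leaves a margin of roughly 7 Hz.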

      We recomputed all analyses and statistics as reported in the original manuscript with the new HF definition. Overall, this change had very little impact on the findings, except for a slightly lower correlation between HF and LF (in Occipital and Anterior temporal lobe) when using single recording contacts as unit data points. (Note that we slightly modified the way we compute the maximal expected correlation. Originally we used the test-retest reliability averaged over LF and HF; in the revised version we use the lower reliability value of the 2 signals, which is more correct since the lower reliability is the true upper limit of the correlation.) This indicates that the HF activity was mostly independent from the phase-locked LF signal already in the original submission. However, since the revised time-frequency analysis parameters enforce this independence, the revised analyses are reported as the main analyses in the manuscript.
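
      The change in how the maximal expected correlation is computed can be written out as a minimal Python sketch. The reliability values below are made up for illustration, and the function name is hypothetical.

```python
# Minimal sketch of the two ceiling rules described above.
# The reliability values are invented for illustration only.


def max_expected_correlation(reliability_lf, reliability_hf, rule="min"):
    """Ceiling for the LF-HF correlation given test-retest reliabilities.

    rule="mean": average of the two reliabilities (original submission).
    rule="min":  lower of the two reliabilities (revised version).
    """
    if rule == "mean":
        return 0.5 * (reliability_lf + reliability_hf)
    if rule == "min":
        return min(reliability_lf, reliability_hf)
    raise ValueError(f"unknown rule: {rule}")


example_lf, example_hf = 0.9, 0.7  # hypothetical reliabilities
ceiling_original = max_expected_correlation(example_lf, example_hf, rule="mean")
ceiling_revised = max_expected_correlation(example_lf, example_hf, rule="min")
```

      Whenever the two reliabilities differ, the revised rule gives the lower, more conservative ceiling.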

      The manuscript was completely revised accordingly and all figures (main and supplementary) were modified to reflect these new analyses. We also extended the methods section on HF analyses (p. 37) to indicate that HF parameters were selected to ensure independence of the HF signal from the LF evoked response, and provide additional information on wavelet frequency bandwidth.

      There are some limitations to the possible generalizability of the conclusions drawn here. First, all of the experiments are of the same type (steady-state paradigm). It could be that with a different experimental design (e.g., slower and/or jittered presentation) the results would differ. In particular, the regularity of the stimulation (6 Hz images, 1.2 Hz faces) might cause the cortex to enter a rhythmic and non-typical state, with more correlated responses across signal types. Nonetheless, the steady-state paradigm is widely used in research, and even if the conclusions turn out to hold only for this paradigm, they would be important. (And of course, they might generalize beyond it.)

      We understand the concern of the reviewer and appreciate the last statement about the wide use of the steady-state paradigm and the importance of our conclusions. Beyond that, we are very confident that our results can be generalized to slower and jittered presentations. Indeed, with this paradigm in particular, we have compared different frequency rates and periodic and nonperiodic stimulations in previous studies (Retter & Rossion, 2016; Quek et al., 2017; Retter et al., 2020). Importantly, specifically for the paradigm used in the present study, the neural face-selective responses are strictly identical whether the faces are inserted at periodic or non-periodic intervals within the train of nonface objects (Quek & Rossion, 2017), showing that the regularity of stimulation does not cause a non-typical state.

      Please see our reply above to essential revisions and reviewer 1, in which we fully address this issue, as well as the revised discussion section (p. 32-33).

      A second limitation is the type of stimulus and neural responses - images of faces, face-selectivity of neural responses. If the differences from previous work on these types of signals are due to the type of experiment - e.g., finger movements and motor cortex, spatial summation and visual cortex - rather than to the difference in sample size or type of analysis, then the conclusions about the similarity of the two types of signals would be more constrained. Again, this is not a flaw in the study, but rather a possible limitation in the generality of the conclusions.

      This is a good point, which was also discussed above. Please note that this was already partly discussed in the original manuscript when discussing the potential factors explaining the spatial differences between our study and motor cortex studies:

      “Second, the hypothesis for a more focal HF compared to LF signals is mostly supported by recordings performed in a single region, the sensorimotor cortex (Miller et al., 2007; Crone et al., 1998; Pfurtscheller et al., 2003; Hermes et al., 2012), which largely consist of primary cortices. In contrast, here we recorded across a very large cortical region, the VOTC, composed of many different areas with various cortical geometries and cytoarchitectonic properties. Moreover, by recording higher-order category-selective activity, we measured activity confined to associative areas. Both neuronal density (Collins et al., 2010; Turner et al., 2016) and myelination (Bryant and Preuss, 2018) are substantially lower in associative cortices than in primary cortices in primates, and these factors may thus contribute to the lack of spatial extent difference between HF and LF observed here as compared to previous reports.” (p. 29-30).

      Also in the same section (p. 30) we refer to the type of signals compared in previous motor cortex studies:

      “Third, previous studies compared the spatial properties of an increase (relative to baseline) in HF amplitude to the spatial properties of a decrease (i.e. event-related desynchronization) of LF amplitude in the alpha and beta frequency ranges (Crone et al.,1998; 2001; Pfurtscheller et al., 2003; Miller et al., 2007; Hermes et al., 2012). This comparison may be unwarranted due to likely different mechanisms, brain networks and cortical layers involved in generating neuronal increases and decreases (e.g., input vs. modulatory signal, Pfurtscheller and Lopes da Silva, 1999; Schroeder and Lakatos, 2009). In the current study, our frequency-domain analysis makes no assumption about the increase and decrease of signals by face relative to non-face stimuli.”

      In the original submission, we also acknowledged that the functional correspondence between LF and HF signals is not at ceiling (p. 31):

      “We acknowledge that the correlations found here are not at ceiling and that there were also slight offsets in the location of maximum amplitude across signals along electrode arrays (Figures 5 and 6). This lack of a complete functional overlap between LF and HF is also in line with previous reports of slightly different selectivity and functional properties across these signals, such as a different sensitivity to spatial summation (Winawer et al., 2013), to selective attention (Davidesco et al., 2013) or to stimulus repetition (Privman et al., 2011). While part of these differences may be due to methodological differences in signal quantification, they also underline that these signals are not always strongly related, due to several factors. For instance, although both signals involve post-synaptic (i.e., dendritic) neural events, they nevertheless have distinct neurophysiological origins (that are not yet fully understood; see Buzsáki, 2012; Leszczyński et al., 2020; Miller et al., 2009). In addition, these differing neurophysiological origins may interact with the precise setting of the recording sites capturing these signals (e.g., geometry/orientation of the neural sources relative to the recording site, cortical depth in which the signals are measured).”

      Additional arguments regarding the generalizability can be found in the added section of the discussion as mentioned above.

      Finally, the study relies on depth electrodes, which differs from some prior work on broadband signals using surface electrodes. Depth electrodes (stereotactic EEG) are in quite wide use so this too is not a criticism of the methods. Nonetheless, an important question is the degree to which the conclusions generalize, and surface electrodes, which tend to have higher SNR for broadband measures, might, in principle, show a different pattern than that observed here.

      This is an interesting point, which obviously cannot be addressed in our study. We agree with the reviewer’s point. However, in contrast to ECoG, which is restricted to superficial cortical layers and gyri, SEEG has the advantage of sampling all cortical layers and a wide range of anatomical structures (gyri, sulci, and deep structures such as medial temporal structures). Therefore, we believe that using SEEG ensures maximal generalizability of our findings. Overall, the relatively low spatial resolution of these 2 recording methods (i.e., several millimeters) compared to the average cortical thickness (~2-3 mm) makes it very unlikely that SEEG and ECoG would reveal different patterns of LF-HF functional correspondence.

      We added this point in a new section on the generalizability of our findings at the end of the Discussion (p.33, line 896).

      Overall, the large study and elegant approach have led to some provocative conclusions that will likely challenge near-consensus views in the field. It is an important step forward in the quantitative analysis of human neuroscience measurements.

      We sincerely thank the reviewer for his/her appreciation of our work.

      Reviewer #3 (Public Review):

      Jacques et al. aim to assess properties of low- and high-frequency signal content in intracranial stereoelectroencephalography data in the human associative cortex using a frequency-tagging paradigm with face stimuli. In the results, a high correspondence between high- and low-frequency content in terms of concordant dynamics is highlighted. The major critique is that the assessment in the way it was performed is not valid to disambiguate neural dynamics of responses in low- and high-frequency bands and to make general claims about their selectivity and interplay.

      The periodic visual stimulation induces a sharp non-sinusoidal transient impulse response with power across all frequencies (see Fig. 1D time-frequency representation). The calculated mean high-frequency amplitude envelope will therefore be dependent on properties of the used time-frequency calculation as well as noise level (e.g. 1/f contributions) in the chosen frequency band, but it will not reflect intrinsic high-frequency physiology or dynamics as it reflects spectral leakage of the transient response amplitude envelope. For instance, one can generate a synthetic non-sinusoidal signal (e.g., as a sum of sine + a number of harmonics) and apply the processing pipeline to generate the LF and HF components as illustrated in Fig. 1. This will yield two signals which will be highly similar regardless of how the LF component manifests. The fact that the two low and high-frequency measures closely track each other in spatial specificity and amplitudes/onset times and selectivity is due to the fact that they reflect exactly the same signal content. It is not possible with the measures as they have been calculated here to disambiguate physiological low- and high-frequency responses in a general way, e.g., in the absence of such a strong input drive.

      The reviewer expresses strong concerns that our measure of HF activity is merely a reflection of spectral leakage from (lower-frequency) evoked responses. In other words, physiological HF activity would not exist in our dataset and would be artificially created by our analyses. We should start by mentioning that this comment is in no way specific to our study, but could in fact be directed at all electrophysiological studies measuring stimulus-driven responses in higher frequency bands.

      Reviewer 2 also commented on the possible contamination of the HF signal by the evoked response.

      This was indeed a potential (minor) concern given the time-frequency (wavelet) parameters used in the original manuscript. The frequency bandwidth (measured as half width at half maximum, HWHM) of the wavelet used at the lower bound (30 Hz) of the HF signal extended down to 11 Hz (i.e., HWHM = 19 Hz). At 40 Hz, the bandwidth extended down to 24 Hz (i.e., HWHM = 16 Hz). While low-frequency face-selective responses in that range (above 16 Hz) are negligible (see e.g., Retter & Rossion, 2016; and data below for the present study), they could potentially have slightly contaminated the high-frequency activity.

      To ensure that our findings cannot be explained by such a contamination, we recomputed the HF signal using wavelets with a smaller frequency bandwidth and changed the frequency range to 40-160 Hz. This ensures that the lowest frequency included in the HF signal (defined as the bottom of the frequency range minus half of the frequency bandwidth, i.e., half width at half maximum) is 30 Hz. This is well above the highest significant harmonic of the face-selective response in our FPVS experiment, which was 22.8 Hz (defined as the harmonic of the face frequency where, at group level, the number of recording contacts with a significant response was not higher than the number of significant contacts detected for noise in bins surrounding harmonics of the face frequency, see figure below). This ensures that the signal measured in the 40-160 Hz range is not contaminated by lower-frequency evoked responses.

      We recomputed all analyses and statistics from the manuscript with the new HF definition. Overall, this change had very little impact on the findings, except for a slightly lower correlation between HF and LF (in Occipital and Anterior temporal lobe) when using single recording contacts as unit data points. (Note that we slightly modified the way we compute the maximal expected correlation. Originally we used the test-retest reliability averaged over LF and HF; now we use the lower reliability value of the 2 signals, which is more correct since the lower reliability is the true upper limit of the correlation.) This indicates that the HF activity was mostly independent from the phase-locked LF signal already in the original submission. However, since the revised time-frequency analysis parameters enforce this independence, we chose to keep the revised analyses as the main analyses in the manuscript.

      The manuscript was completely revised accordingly and all figures (main and supplementary) were modified to reflect the new analyses. We also extended the methods section on HF analyses (p. 37) to indicate that HF parameters were selected to ensure independence of the HF signal from the LF evoked response, and provide additional information on wavelet frequency bandwidth.

      We believe our change in the time-frequency parameters and frequency range (40-160 Hz), the supplementary analyses using 80-160 Hz signal (per request of reviewer #2; see Figure 5 – figure supplement 4 and 5) and the fact that harmonics of the face frequency signal are not observed beyond ~23Hz, provide sufficient assurances that our findings are not driven by a contamination of HF signal by evoked/LF responses (i.e., spectral leakage).
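
      The leakage argument can be illustrated with a synthetic check in the spirit of the reviewer's suggestion (a periodic non-sinusoidal signal built from a sine plus harmonics). The sketch below is not the pipeline used in the study: it assumes an idealized FFT-based band-pass in place of the wavelet transform, a 512 Hz sampling rate, a 10 s sequence, and 1/k harmonic amplitudes, and all names are hypothetical.

```python
import numpy as np

FS = 512.0   # sampling rate in Hz (assumed)
N = 5120     # 10 s of samples, giving 0.1 Hz frequency resolution
F0 = 1.2     # face frequency in Hz (from the text)
t = np.arange(N) / FS
freqs = np.fft.fftfreq(N, d=1.0 / FS)


def synthetic_response(n_harmonics):
    """Periodic non-sinusoidal signal: F0 plus harmonics with 1/k amplitude."""
    k = np.arange(1, n_harmonics + 1)[:, None]
    return np.sum(np.sin(2 * np.pi * k * F0 * t) / k, axis=0)


def band_envelope(x, low_hz, high_hz):
    """Amplitude envelope of x restricted to [low_hz, high_hz].

    Keeps only positive-frequency bins inside the band and doubles them,
    which yields the analytic signal of the band-limited data.
    """
    spectrum = np.fft.fft(x)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    analytic = np.fft.ifft(2.0 * spectrum * keep)
    return np.abs(analytic)


def amplitude_at(x, target_hz):
    """Single-sided FFT amplitude of x at the bin closest to target_hz."""
    amp = 2.0 * np.abs(np.fft.rfft(x)) / N
    bin_index = int(round(target_hz * N / FS))
    return amp[bin_index]


# Case 1: harmonics stop at 19 x 1.2 Hz = 22.8 Hz, as reported in the text.
narrow = synthetic_response(19)
# Case 2: a sharp transient with harmonics up to 133 x 1.2 Hz = 159.6 Hz.
broad = synthetic_response(133)

# Modulation of the 30-160 Hz envelope at the face frequency in each case.
hf_mod_narrow = amplitude_at(band_envelope(narrow, 30.0, 160.0), F0)
hf_mod_broad = amplitude_at(band_envelope(broad, 30.0, 160.0), F0)
```

      In this idealized setting, an evoked response whose harmonics end below 30 Hz produces essentially no face-frequency modulation of the HF envelope, whereas a response with harmonics inside the HF band does. A wavelet with finite bandwidth will not cut off as sharply as this filter, which is why the HWHM margin matters.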

      With respect to the reviewer’s comment on the 1/f contributions to the frequency band computation, as indicated in the original manuscript, the HF amplitude envelope is converted to percent signal change separately for each frequency bin over the HF frequency range, BEFORE averaging across frequency bands. This step works as a normalization that removes the 1/f bias and ensures that each frequency in the HF range contributes equally to the computed HF signal. This was added to the methods section (HF analysis, p. 38, line 1038): “This normalization step ensures that each frequency in the HF range contributes equally to the computed HF signal, despite the overall 1/f relationship between amplitude and frequency in EEG.”
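
      The normalization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code: the baseline is taken to be a known per-frequency amplitude, the test data are synthetic with an exact 1/f amplitude profile, and the function name is hypothetical.

```python
import numpy as np


def hf_percent_signal_change(amplitude, baseline):
    """Average HF signal after per-frequency percent-signal-change scaling.

    amplitude: array (n_freqs, n_times), amplitude envelope per frequency bin.
    baseline:  array (n_freqs,), reference amplitude per frequency bin
               (how the baseline is defined is an assumption of this sketch).
    """
    psc = 100.0 * (amplitude - baseline[:, None]) / baseline[:, None]
    return psc.mean(axis=0)  # average across frequencies AFTER normalizing


# Synthetic example: amplitudes follow 1/f, and every frequency carries the
# same 10% modulation at the 1.2 Hz face frequency.
hf_freqs = np.arange(40.0, 161.0, 10.0)
times = np.linspace(0.0, 1.0, 200, endpoint=False)
modulation = 1.0 + 0.1 * np.sin(2 * np.pi * 1.2 * times)
one_over_f = 1.0 / hf_freqs
amplitude = one_over_f[:, None] * modulation[None, :]

hf_signal = hf_percent_signal_change(amplitude, baseline=one_over_f)

# Per-frequency percent signal change, to show equal contributions.
per_freq_psc = 100.0 * (amplitude - one_over_f[:, None]) / one_over_f[:, None]
```

      Because each row is divided by its own baseline, the 40 Hz and 160 Hz bins contribute identical percent-signal-change traces even though their raw amplitudes differ fourfold.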

      The connection of the calculated measures to ERPs for the low-frequency and population activity for the high-frequency measures for their frequency tagging paradigm is not clear and not validated, but throughout the text they are equated, starting from the introduction.

      The frequency-tagging approach is widely used in the electrophysiology literature (Norcia et al., 2015) and as such requires no further validation. In the case of our particular design, the connection between frequency-domain and time-domain representations for low frequencies has been shown in many of our publications with scalp EEG (Rossion et al., 2015; Jacques et al., 2016; Retter and Rossion, 2016; Retter et al., 2020). FPVS sequences can be segmented around the presentation of the face image (just like in a traditional ERP experiment) and averaged in the time domain to reveal ERPs (e.g., Jacques et al., 2016; Retter and Rossion, 2016; Retter et al., 2020). Face-selectivity of these ERPs can be isolated by selectively removing the base rate frequencies through notch-filtering (e.g., Retter and Rossion, 2016; Retter et al., 2020). Further, we have shown that the face-selective ERPs generated in such sequences are independent of the periodicity, or temporal predictability, of the face appearance (Quek et al., 2017) and, to a large extent, of the frequency of face presentation (i.e., unless faces are presented too close to each other, i.e., below 400 ms interval; Retter and Rossion, 2016). The high-frequency signal in our study is measured in the same manner as in other studies and we simply quantify the periodic amplitude modulation of the HF signal. HF responses in frequency-tagging paradigms have been measured before (e.g., Winawer et al., 2013). In the current manuscript, Figure 1 provides a rationale for and an explanation of the methodology. We also think that our manuscript in itself provides a form of validation for the quantification of the HF signal in our particular frequency-tagging setup.

    1. Author Response

      Reviewer #1 (Public Review):

      This work by Wei-Jia Luo and colleagues elegantly employs in vitro and in vivo models to demonstrate that within the mouse liver, macrophages respond to lipopolysaccharide (LPS) by releasing active IL-12 (IL-12p70), which is a heterodimer of IL-12p35 and IL-12p40. They observed that the availability of "free" IL-12p35 to this heterodimerization process is governed by the molecular chaperone HLJ1. In response to LPS, HLJ1 separates homodimerized IL-12p35 into monomers, which then can heterodimerize with IL-12p40 to form active IL-12p70. This active IL-12 is released from macrophages in the liver and then acts on neighboring natural killer T cells to release interferon gamma. This interferon gamma circulates systemically and is responsible for mortality in a mouse model of endotoxic shock.

      Overall, this work is mechanistically compelling and demonstrates a novel multicellular inflammatory pathway that contributes to death in a murine model of endotoxic shock. However, it is unclear if the observed pathway is limited to this highly reductionist model, or if it applies to models that better approximate the complexity of human sepsis. Indeed, the long-standing concept of "cytokine storm" as the major mediator of sepsis has largely failed to yield benefits in clinical trials. These numerous and repeated translational failures cast doubt on the translational validity of reductionist in vivo animal models of sepsis.

      We thank the reviewer for the affirmation. One of the major aims of our work is to identify a novel multicellular inflammatory pathway mediated by HLJ1 that contributes to endotoxic shock. We agree that, although our understanding of cytokine storm as the major mediator of sepsis has progressed dramatically over the past decade, these findings have not yet translated into effective treatments. As the reviewer mentioned, almost all clinical trials targeting cytokine effects have failed, especially in the context of sepsis. Among the several possible explanations, the appropriateness of in vivo animal models is a concern (Chousterman et al., 2017). Some approaches to treating cytokine storm have targeted the direct tissue consequences of the inflammatory cascade, such as those on the blood vessels (London et al., 2010). Another possible strategy is to target the signaling that promotes cytokine synthesis and secretion (Maceyka et al., 2012). It may be feasible to quell the cytokine storm after infection by targeting upstream signaling, and reducing cytokine synthesis and secretion is a valid alternative to direct cytokine antagonism (Chousterman et al., 2017). Furthermore, in this study we found that Hlj1−/− mice showed reduced IFN-γ and improved survival when treated with daily systemic antibiotics after CLP surgery (Figure 6), indicating that targeting cytokine storm in combination with antibiotics is a promising therapeutic strategy for sepsis. Taken together, we think an HLJ1-targeting strategy might be a potential therapy for cytokine storm-associated sepsis. We emphasize and discuss this concept in the Discussion of our revised manuscript (Page 19, line 441-453).

      We greatly appreciate that Reviewer #1 and the other reviewers raised this issue. We respond to the comments point by point below.

      This raises several specific concerns with regard to the model used by the investigators:

      (1) The authors use a massive dose of LPS that rapidly leads to the death of mice in 24 hours. This massive and rapid mortality is not consistent with human sepsis, which is a more crescendo course with a mortality of ~30%. Indeed, when the authors used a more clinically-relevant model of mild endotoxemia, HLJ1 appeared to have no impact on mortality (Figure 1A).

      We thank the reviewer for the comment. Because we observed that HLJ1 knockout mice could survive a high dose of LPS, we used 20 mg/kg LPS for the subsequent experiments on the basis of this clear and significant phenotype. We also recognize the importance of testing lower doses of LPS. To address this issue, we performed additional experiments and made the following revisions.

      i. Because 4 mg/kg is a common non-lethal dosage used to induce TLR4 and IFN-γ signaling (Kunze et al., 2019; Malgorzata-Miller et al., 2016), we performed an additional experiment with 4 mg/kg LPS, following the editor’s suggestion. Hlj1−/− mice showed lower serum levels of BUN, creatinine and ALT, and thus less severe organ damage, than Hlj1+/+ mice after 4 mg/kg LPS treatment. The data are shown in panels C and D of the revised Figure 1.

      ii. We also performed an ELISA and found that serum levels of IFN-γ were lower in Hlj1−/− mice than in Hlj1+/+ mice after 4 mg/kg LPS injection. The result is shown in panel C of the revised Figure 2.

      iii. Taken together, these results indicate that the effect of HLJ1 deletion on reducing IFN-γ and alleviating organ injury is also present during moderate endotoxemia. We describe and discuss the results in the revised manuscript (Page 6, line 134-141; Page 18, line 423-437).

      (2) LPS is a model of endotoxemia, not a model of sepsis. Accordingly, it is unclear if the protective benefit of blocking IL-12 will similarly be seen in a live-infection model of sepsis, in which inflammatory signaling may be necessary for pathogen clearance.

      We thank the reviewer for raising these critical issues and providing valuable suggestions. This issue was also mentioned by other reviewers. Although LPS-induced endotoxemia is a simple model with higher reproducibility and reliability than other sepsis models, it indeed cannot represent actual sepsis; it rests on the notion that the host’s response to bacteria, rather than the pathogen itself, leads to mortality and organ failure (Deitch, 2005). Therefore, following the reviewers’ suggestion, we additionally performed a live-infection model of sepsis, cecal ligation and puncture (CLP), which resembles clinical disease and septic shock (Deitch, 2005), to confirm the importance of HLJ1 in sepsis. We found that IFN-γ expression was lower in the liver and spleen of Hlj1−/− mice than in those of Hlj1+/+ mice (Figure 6A and B). We analyzed serum markers of organ dysfunction, and Hlj1−/− mice showed lower serum levels of BUN, creatinine and AST (Figure 6C). H&E staining showed kidney injury at the histological level after CLP surgery, and Hlj1−/− mice showed less severe kidney injury than Hlj1+/+ mice (Figure 6D). We further found that Hlj1−/− mice showed significantly improved survival compared to Hlj1+/+ mice when the mice were treated with systemic antibiotics (Figure 6E). Taken together, we demonstrated that HLJ1 deletion attenuates CLP-induced sepsis with down-regulated IFN-γ, and concluded that the benefit of blocking IL-12 and HLJ1 can similarly be seen in a live-infection model of sepsis. The result is shown below (revised Figure 6). The corresponding result was also added to the revised manuscript (Page 11-12, line 268-286). Please see this passage as well as the above responses to other reviewers.

      Page 11-12, line 268-286 "HLJ1 deletion protects mice from CLP-induced organ dysfunction and septic death. To address the question whether HLJ1 also regulates IFN-γ-dependent septic shock in a live infection model, we performed CLP (cecal ligation and puncture) surgery, which more closely resembles clinical disease and human sepsis. CLP significantly induced transcriptional levels of IFN-γ in the liver of Hlj1+/+ mice compared to mice receiving sham surgery, while Hlj1−/− mice showed significantly lower IFN-γ mRNA than Hlj1+/+ mice (Figure 6A). This phenomenon was not restricted to the liver since lower expression of splenic IFN-γ was also found in Hlj1−/− mice (Figure 6B). The CLP surgery resulted in serious renal and liver damage while Hlj1−/− mice showed alleviated organ dysfunction with significantly lower serum levels of BUN, creatinine and AST (Figure 6C). H&E staining showed kidney injury at the histology level after CLP, while Hlj1−/− mice showed less severe kidney injury than Hlj1+/+ mice (Figure 6D). However, there was no significant difference in survival when comparing Hlj1+/+ and Hlj1−/− mice (Figure 6E). We hypothesized that severe bacteremia contributed to mortality in mice that did not receive any treatment, so we treated mice with systemic antibiotics. As a result, Hlj1−/− mice displayed significantly improved survival compared with Hlj1+/+ mice when mice received daily systemic antibiotics after CLP (Figure 6E). These results implied that an agent responsible for bacterial clearance can be combined with immune modulation such as HLJ1 targeting to improve the outcome of sepsis."

      (3) Finally, it is unclear if the findings are only relevant to mice, or if they also have relevance to humans.

      We acknowledge that human studies are important, but there are practical difficulties to overcome, for example cohort identification, individual variation, and clinical considerations. This is a limitation of our work, since our findings are based only on animal models and human cell lines. We additionally performed CLP experiments, which are more relevant to human sepsis, although they are not a true human study. These have been added as Figure 6 of our revised manuscript. Based on the present results, we plan to initiate clinical studies in humans. For example, we plan to collect blood monocytes from critically ill ICU patients to see whether HLJ1 expression levels in monocytes are higher in patients with sepsis than in patients without sepsis. We also want to know whether HLJ1 expression levels in monocytes or in serum correlate with inflammatory markers such as C-reactive protein, procalcitonin, and lactate in sepsis patients, because we found that serum levels of HLJ1 correlated with IL-12 in mice. In our unpublished preliminary results, HLJ1 can be detected in the serum of patients with sepsis. This encourages us to investigate whether HLJ1 can serve as a diagnostic or prognostic marker in the future. We anticipate reporting these results in future publications. Thank you very much for your understanding.

      Reviewer #2 (Public Review):

      The authors show that HLJ1 converts misfolded IL-12p35 homodimers to monomers, which maintains bioactive IL-12p70 heterodimerization and secretion. In turn, this contributes to increased IL-12 activity, leading to enhanced IFN-gamma production and lethality in mice challenged with LPS to model sepsis.

      Strengths:

      • Huge and diverse dataset (e.g. in vivo, in vitro, single cell RNAseq, adoptive transfer etc.) with interesting findings that could be of relevance to the field.

      We sincerely thank the reviewer for the affirmation. We hope our comprehensive dataset provides novel insight of relevance to the field. Building on this information, we continue to investigate the molecular alterations underlying endotoxin-induced immune responses. We fully agree with the weaknesses raised by the reviewer, take them very seriously, and have revised the manuscript point by point.

      Weaknesses:

      • The flow/narrative of the paper is very hard to follow. This may result from the fact that the order of presented results is a bit puzzling. Normally, one would add-in the cytokine results (now figure 3), after the survival curves in Figure 1. Furthermore, the flow cytometry data presented in Figure 4 is more or less a validation of the scRNAseq data presented in Figure 2 in another organ. Likewise, Figure 5 is sort of a validation of Figure 3 in another organ. The authors seem to jump from organ to organ, from in vivo to in vitro and vice-versa all the time which makes the paper extremely difficult to follow.

      We thank the reviewer for the valuable suggestion. We were in fact also hesitant about this arrangement in our first submission. We have rearranged our results so that the flow/narrative of the paper is easier to follow:

      1. We moved the results of Figure 3 to Figure 2 so that the cytokine array results come after the survival curve results.

      2. The flow cytometry result presented in Figure 4 was moved to Figure 5 so that it comes after the sc-RNA sequencing result.

      3. The qPCR result for pro-inflammatory cytokines presented in Figure 5 was moved to Figure 2-figure supplement 1 so that it serves as a validation of the cytokine array in another organ.

      In addition, following other suggestions from the reviewers, we have rewritten the Introduction and Discussion sections and reorganized the whole manuscript to focus more on the important issues. All modifications and rearrangements can be checked in the revised manuscript with tracked changes. Thank you for your kind suggestions.

      • Use of extremely high dosages of LPS.

      We thank the reviewer for the comment. This issue was raised by several reviewers and the editor. Because we observed that HLJ1 knockout mice could survive a high dose of LPS, we used 20 mg/kg LPS for the subsequent experiments on the basis of this clear and significant phenotype. We also recognize the importance of testing lower doses of LPS. To address this issue, we performed additional experiments and made the following revisions.

      i. Because 4 mg/kg is a common non-lethal dosage used to induce TLR4 and IFN-γ signaling (Kunze et al., 2019; Malgorzata-Miller et al., 2016), we performed an additional experiment with 4 mg/kg LPS, following the editor’s suggestion. Hlj1−/− mice showed lower serum levels of BUN, creatinine and ALT, and thus less severe organ damage, than Hlj1+/+ mice after 4 mg/kg LPS injection (Figure 1C). H&E staining showed kidney injury at the histological level after LPS treatment, and Hlj1−/− mice showed less severe kidney injury than Hlj1+/+ mice (Figure 1D). The data are shown in panels C and D of the revised Figure 1 (below).

      ii. We also performed an ELISA and found that serum levels of IFN-γ were lower in Hlj1−/− mice than in Hlj1+/+ mice after 4 mg/kg LPS injection. The result is shown in panel C of the revised Figure 2 (below).

      iii. Taken together, these results indicate that the effect of HLJ1 deletion on reducing IFN-γ and alleviating organ injury is also present during moderate endotoxemia. We describe and discuss the results in the revised manuscript (Page 6, line 134-141; Page 18, line 423-437).

      • Much of the presented data is replication of previous work. For instance, neutralization of IFN-γ (e.g. Billiau et al., Eur. J. Immunol. 1987; Car et al. J. Exp. Med. 1994) and anti-IL-12 (e.g. Zisman et al., Shock 1997) has been shown to lower mortality in LPS models in mice.

      We thank the reviewer for the reminder and apologize for the unclear description that led to this misunderstanding. To establish the novel role of HLJ1 in sepsis carefully, we built our investigation on several well-established findings. Indeed, the roles of IFN-γ and IL-12 have been recognized in previous studies, and their neutralization has been reported to attenuate LPS-induced endotoxic shock. However, our study focused on the effect of HLJ1 deletion on the IL-12/IFN-γ axis and septic death. First, we observed that IFN-γ and IL-12 decreased after HLJ1 deletion during sepsis. On the one hand, we used IL-12/IFN-γ neutralization and found that it improved survival in wild-type mice but not in Hlj1 knockout mice, suggesting the importance of HLJ1 in IL-12/IFN-γ-mediated mortality. On the other hand, if the difference in mortality between genotypes disappears after IL-12 or IFN-γ neutralization, we can infer that HLJ1 contributes to mortality mainly through IL-12 and IFN-γ signaling. These ideas came from a previous study published in Cell (Ponzetta et al., 2019). The authors elegantly demonstrated the role of Csf3r in the IL-12/IFN-γ axis and subsequent tumor incidence by showing that IFN-γ neutralization alters the phenotype in wild-type mice but not in knockout mice. This rationale prompted us to perform a similar neutralization experiment to understand the precise role of HLJ1 in sepsis.

      • No true sepsis model is used, only LPS. This is important, as for instance neutralization of IFN-γ and IL-12 has been shown to improve outcome in endotoxemia before (see above), but had no effect on survival in more relevant sepsis models such as cecal ligation and puncture (e.g. see Romero et al., Journal of Leukocyte Biology 2010; Zisman et al., Shock 1997). Furthermore, IFN-γ is even proposed (and used on a small scale) as therapy in sepsis patients to reverse immunosuppression.

      We thank the reviewer for raising these critical issues and providing valuable suggestions, which were also mentioned by other reviewers. Although LPS-induced endotoxemia is a simple model with higher reproducibility and reliability compared to other sepsis models, it indeed cannot represent actual sepsis; it is based on the notion that it is the host’s response to bacteria, and not the pathogen itself, that leads to mortality and organ failure (Deitch, 2005). Therefore, we performed an additional model, cecal ligation and puncture (CLP), which resembles clinical disease and septic shock (Deitch, 2005), to confirm the relevance of HLJ1 to human sepsis. Please see our revised Figure 6 (Figure 6) and our responses to the other reviewers above.

      In accordance with the previous result from Romero et al. showing that IFN-γ neutralization did not improve the survival rate, we observed similar survival rates between Hlj1+/+ and Hlj1−/− mice after CLP. However, when Romero et al. treated mice with systemic antibiotics, IFN-γ knockout mice survived significantly better than wild-type mice (Romero et al., 2010). In the CLP model, it is possible that severe bacteremia contributed to mortality in an IFN-γ-independent manner in mice that did not receive antibiotics, so we treated mice with systemic antibiotics immediately after CLP. We found that Hlj1−/− mice showed significantly improved survival compared to Hlj1+/+ mice when treated with systemic antibiotics after CLP surgery (Figure 6E), indicating that targeting the cytokine storm in combination with antibiotics is a promising therapeutic strategy for sepsis. The result is shown in panel E of the revised Figure 6 (Figure 6). This suggests that an HLJ1-targeting strategy could be combined with antibiotics as a combination therapy in future clinical applications. We emphasize and discuss this concept in the Discussion of the revised manuscript (Pages 18-19, lines 441-453).

    1. Author Response

      Reviewer #1 (Public Review):

      In their manuscript "CompoundRay: An open-source tool for high-speed and high-fidelity rendering of compound eyes", the authors describe their software package to simulate vision in 3D environments as perceived through a compound eye of arbitrary geometry. The software uses hardware accelerated ray casting using NVIDIA Optix to generate simulations at very high framerates of ~5000 FPS on recent NVIDIA graphics hardware. The software is released under the permissive MIT license, publicly available at https://github.com/ManganLab/eye-renderer, and well documented. CompoundRay can be extraordinarily useful for computational neuroscience experiments exploring insect vision and robotics with insect like vision devices.

      The manuscript describes the target of the work, realistically simulating vision as perceived by compound eyes in arthropods, and thoroughly reviews the state of the art. The software CompoundRay is then presented to address the shortcomings of existing solutions, which either oversimplify the geometry of compound eyes (e.g. assuming shared focal points), use an unrealistic rendering model (e.g. local geometry projection) or are slower than real-time.

      The manuscript then details implementation choices and the conceptual design and components of the software. The effect of compound eye geometries is discussed using some examples. The speed of the simulator depending on SNR is assessed and shown for three physiological compound eye geometries.

      I find the described open source compound eye vision simulation software extraordinarily useful and important. The manuscript reviews the state of the art well. The figures are well made and easy to understand. The description of the method and software, in my opinion, needs work to make it more succinct and easier to understand (details below). In general, I found relevant concepts and ideas buried in overly complicated meandering descriptions, and important details missing. Some editorial work could help a lot here.

      Thank you for the very positive feedback.

      Major:

      1) The transfer of the scene seen by an arbitrary geometry compound eye into a display image lacks information and discussion about the focal center/ choice of projection. I believe that only the orientation of ommatidia is used to generate this projection which leads to the overlap/ non-coverage in Fig. 5c. Correct? It would be great if, for such scenarios, a semi-orthogonal+cylindrical projection could be added? Also, please explain better.

      For clarification, CompoundRay allows for a number of projection modes from any 3D sampling surface to visualised 2D projections. This has now been made clearer with an updated Methods section “From single ommatidia to full compound eye” (lines 171-188), and also a clearer explanation of the display pipeline within the “CompoundRay Software Pipeline” section (lines 245-247).

      We note that Fig 5 is simply intended as an example of the extreme differences in information that can be provided by nodal (the current state of the art) and non-nodal imagers (as in biological systems). A user could indeed produce custom projections (as now noted in the future work section of the Discussion), such as semi-orthogonal+cylindrical projections, by modifying the projection shaders, but we do not feel that this adds substantially to the desired message of Fig 5, as currently all view images are generated using the same projection method, allowing them to be compared. Further to this, a semi-orthogonal+cylindrical projection would only serve to display these types of eyes and would not be of significant use outside of this category of design. Rather, the utility of CompoundRay for research is now demonstrated by the inclusion of an entirely new example experiment (lines 394-467) (Fig 10), which compares artificial and realistic compound eye models in a visual tracking task.

      In addition, we note that specific references to the “orientation-wise spherical mapping” of images have been added to the appropriate image captions (Fig 5 & 6).

      Finally, we have attempted to be more explicit about the way that 2D projection systems work within CompoundRay (lines 182-185).

      2) It is clear that CompoundRay is fast and addresses complex compound eye geometries. It remains unclear why global illumination models are discussed while the implementation uses ray casting to sample textures without illumination, which is equivalent to projection rendering, which runs fast on much simpler hardware. If the argument is speed and simplicity of writing the code, that's great, write it so. If it is an intrinsic advantage of the ray-casting method, then comparison with the 'many-cameras' approach sketched below should be done:

      In your model, each ommatidium is an independent pin-hole camera. Instead of sampling this camera by ray-casting, you could use projection rendering to generate a small image per ommatidium-camera, then average over the intensities with an appropriate foveation function (Gaussian in your scenario, but could be other kernels). The resolution of the per-camera image defines the number of samples for anti-aliasing, randomizing will be harder than with ray-casting ;). What else is better when using ray-casting? Fewer samples? Hardware support? Possible to increase recursion depth and do more global things than local illumination and shadows? Easier to parallelize on specific hardware and with specific software libraries? Don't you think it would make sense to explain the entire procedure like that? That would make the choice to use ray-casting much easier to understand for naive readers like me.

      Thanks for this feedback; we can see that it was misleading to include this in our previous Methods section. We have now reduced the discussion of global illumination models and moved it to the future work section at the end of the Discussion. We have also added a clarification to the end of this document that summarises this point, as it was raised by multiple reviewers (see Changes Relating to Colour and Light Sampling).
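
      The sampling idea in the reviewer's comment can be illustrated with a minimal sketch. This is a 2D toy model, not CompoundRay's implementation or API: the function names, the striped test scene and all parameter values are assumptions made for illustration. It draws ray directions from a Gaussian acceptance function around the ommatidial axis, so that a plain mean over the samples acts as a Gaussian-weighted average of the scene.

```python
import math
import random


def sample_ommatidium(radiance, axis_angle, acceptance_sigma, n_samples, rng):
    """Estimate the light seen by one ommatidium (hypothetical 2D toy model).

    radiance: function mapping a viewing angle (radians) to an intensity.
    axis_angle: direction the ommatidium points in (radians).
    acceptance_sigma: std. dev. of the Gaussian acceptance function (radians).
    """
    total = 0.0
    for _ in range(n_samples):
        # Drawing directions from the Gaussian means a plain mean over the
        # samples equals a Gaussian-weighted average of the scene.
        angle = rng.gauss(axis_angle, acceptance_sigma)
        total += radiance(angle)
    return total / n_samples


def striped_scene(angle):
    """Hypothetical scene: alternating bright/dark stripes, each 10 degrees wide."""
    stripe_index = math.floor(math.degrees(angle) / 10.0)
    return 1.0 if stripe_index % 2 == 0 else 0.0


rng = random.Random(0)
axis = math.radians(5.0)  # centre of a bright stripe
narrow = sample_ommatidium(striped_scene, axis, math.radians(0.5), 2000, rng)
wide = sample_ommatidium(striped_scene, axis, math.radians(20.0), 2000, rng)
print(narrow, wide)
```

      With a narrow acceptance angle the ommatidium resolves the single bright stripe it points at, while a wide acceptance angle blurs several stripes together. Raising `n_samples` reduces the noise of the estimate at the cost of more rays per ommatidium.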

      3) CompoundRay, as far as I understand, currently renders RGB images at 8-bit precision. This may not be sufficient to simulate the vision of arthropod eyes that are sensitive to other wavelengths and at variable sensitivity.

      Thanks for pointing out this easy-to-miss implementation detail. Indeed, you are correct that the native output is at 8-bit level as is standard to match display equipment. However, we note that the underlying on-GPU implementation operates at a 32-bit depth, so exposing this to the higher-level Python API should be possible, which could then be used as you suggest. We view adding enhanced lighting properties including shadows, illumination and higher bit depths so as to better support increased-bandwidth visual sensor simulation as future updates which we have now outlined in the Discussion (line 549-553).
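
      The practical effect of bit depth can be shown with a small sketch. It is an illustration only and does not use CompoundRay's API: the helper name and the intensity values are assumptions. Two dim floating-point intensities that differ by 50% collapse onto the same 8-bit level once quantised.

```python
def quantise_8bit(value):
    """Map a float intensity in [0, 1] to an 8-bit level (0-255), clamping out-of-range input."""
    clamped = min(max(value, 0.0), 1.0)
    return int(round(clamped * 255))


# Two hypothetical dim intensities, distinct as floating-point values.
dim_a, dim_b = 0.0010, 0.0015
levels = (quantise_8bit(dim_a), quantise_8bit(dim_b))
print(levels)  # both fall below half of one 8-bit step, so both map to level 0
```

      A simulated sensor that needs to distinguish such values would have to read the floating-point data before it is reduced to 8 bits.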

      Reviewer #2 (Public Review):

      In this paper, the authors describe a new software tool which simulates the spatial geometry of insect compound eyes. This new tool improves on existing tools by taking advantage of recent advances in computer graphics hardware which supports high performance real-time ray tracing to enable simulation of insect eyes with greater fidelity than previously. For example, this tool allows the simulation of eyes in which the optical axes of the ommatidia do not converge to a single point and takes advantage of ray tracing as a rendering modality to directly sample the scene with simulated light rays. The paper states these aims clearly and convincingly demonstrates that the software meets these aims. I think the availability of a high-quality, open-source software tool to simulate the geometry of compound eyes will be generally useful to researchers studying vision and visual behavior in insects and roboticists working on bio-inspired visual systems, and I am optimistic that the described tool could fill that role well.

      Thank you for the positive feedback.

      As far as weaknesses of the paper, the most major issue for me is that I could not find any example of why the additional modeling fidelity or speed is useful in understanding a biological phenomenon. While the work is technically impressive, I think such a demonstration would increase its impact substantially.

      An example experiment has been added as requested.

      I can identify a few more, relatively minor, weaknesses: the software tool is not particularly easy to install but I think this is due primarily to the usage of advanced graphics hardware and software libraries and hence not something the authors can easily correct. In fact, the authors provide substantial documentation to help with installation.

      Indeed, we have tried to ease installation as much as possible by providing detailed documentation. This has been updated since initial submission and has proven sufficient for multiple users. We have looked into dockerising the code but, as correctly identified by the reviewer, there are significant challenges due to proprietary hardware and drivers.

      Another weakness of the tool, which the authors might like to address in the paper, is that there are some aspects of insect vision and optics which are not directly addressed. For example, the wavelength and polarization properties of light rays are hardly addressed despite extensive research into the sensation of these properties. Furthermore, the optical model employed here is purely ray based and would not allow investigating the wave nature of light which is important for propagation from the corneal surface to the photoreceptors in many species.

      Indeed, it is correct that the current implementation does not allow such advanced light modelling features, but as our initial aim was to allow arbitrary surface shapes this was considered beyond the scope of this work. However, we have added a short description of extensions that the method would allow without significant architectural changes, which include many of those listed by the reviewer. As the renderer simulates light as it reaches the lens surface, it is hoped that further works will be able to use this natural boundary between the eye surface and its internals to build further computational models that use the data generated in CompoundRay as a starting point to then simulate inside-eye light transport.

    1. but before we do that let me talk about something that's even more fundamental um and helps us to understand the progression of thinking through those four schools to the what's 00:42:10 usually considered the most sophisticated in my jamaica school um and that is the distinction which is really important between existence and intrinsic existence 00:42:23 and the ex and the distinction between no existence and no intrinsic existence so this is these distinctions um if one doesn't fully comprehend the the 00:42:37 majamika system uh not fully comprehend but have some idea of the of the uh my jamaica system one then usually make is not able to make these distinctions so 00:42:49 let's talk about them for a moment um so existence um we when we talk about existence we talk about our ordinary understanding of what's real okay that things are 00:43:03 objects uh things are you know they may be in relationship but what's in relationship are two different distinct objects or entities that are in relationship and that's kind of our normal understanding of existence 00:43:15 so lacking inherent existence or intrinsic existence begs the issue to understand what is intrinsic existence okay and that's the 00:43:27 object of negation for the buddha for nagarjuna and for all those following in this tradition of nagarjuna the uh the majamika school and so 00:43:39 that's not so easy to wrap our heads around uh what is intrinsic existence in a way it's so close that we miss it you know it's it's a little bit like you know 00:43:51 staying in a in a new hotel room in a new city waking up and looking for your glasses and you can't find them and then realizing that they're already on your faces and so 00:44:05 intrinsic existence is things existing independently things existing uh through relationship um things not not things existing dependently not in independently 00:44:19 and so if we look at dependence now we can look at that at several levels and the more obvious levels 
you've mentioned that carlo is cause and effect causality okay but there are also more uh 00:44:33 subtle levels of dependence that the buddha and nagarjuna talk about and are real central to the philosophy so the second level is the relationship between whole and parts and parts to whole it 00:44:46 goes both ways okay that's a a a little bit you know another level if you will of of dependence uh in the particularly you know highlighted by nagarjuna and 00:44:58 then the third level which is the most uh subtle level the subtlest level which is really what we have to start to understand because the opposite of that is this independent or intrinsic 00:45:10 existence okay so this third level we call dependence through designation or sometimes called dependent designation but it's dependence through designation 00:45:22 it's a type of naming or labeling so for example barry we label or name barry my parents gave this name to barry based on a body 00:45:34 okay maybe a little tiny infant body at that time right and also uh in terms of maybe some kind of behaviors or you know how they thought this emotional structure is for this little baby right 00:45:47 he's very calm or he's very you know he's acts out a lot he's very active or you know all those things so upon all that a name is placed in this case barry okay 00:45:59 so that relationship of you know dependence through designation is really what nagarjuna is talking about when we talk about dependence um and so that's very uh 00:46:11 important to understand so the opposite of that coming back to understanding this inherent or intrinsic existence there are many words in english we use synonymous for 00:46:23 ranging not existing intrinsically or inherently or independently or from its own side those are all synonyms um to the tibetan 00:46:36 terminology that i just mentioned um so when people don't have a good appreciation for intrinsic existence and you say then so the second there were two comparisons 00:46:53 
the second comparison is uh non-existence and not inherently existent so when when when when regarding says no inherent existence what often people interpret is no 00:47:07 existence at all and they fall into a nihilism that nothing exists at all so they haven't fully under appreciated this notion of um intrinsic existence so they're throwing the baby out with the 00:47:20 bathwater right when we're throwing out or negating uh intrinsic existence that they don't quite understand what that really means they think it's all of existence and therefore they you know think that nothing exists they throw the 00:47:33 baby out with a backlog so that's that's okay can i interject something before you go ahead and you you you promised us before uh the full schools before uh but but can i 00:47:44 can i make a comment here um of course about you to say because this is free flow so yeah yeah so we you know we gave the title uh 00:47:56 what is real to this uh to this i that seems to me um that's exactly that distinction that that you you made between existence 00:48:09 and intrinsic existence um inherent existence it's a it it's it's uh it's idea that that i found central and and and 00:48:22 essentially essentially useful for me for for the following reason first of all um i mean the notion of reality the notion of existence here are close i mean what what exists is what is real what is that i want to say a couple of things one is 00:48:40 that um we make a distinction with an illusory and real in our everyday life uh which it's well founded i mean if i if i see 00:48:53 the chair and there's a mirror there and i see a chair of the other side of the mirror there's a precise sense in which the chair in which the other side of the mirror is not real well this chair is real 00:49:06 um this distinction has a meaning because i can sit on the chair i can touch that one but i cannot sit on that and touch that one but 00:49:18 then we realize that some aspects of what is illusory in 
the chair in the mirror also are shared by the chair which i just called real which is also illusory in 00:49:31 some other sets um for instance uh the fact of being a chair uh it's uh cut out and back on so i missed you up until now please could you repeat it oh 00:49:44 uh for where for where did you be speak uh when you were saying this distinction between existence and inherent existence and non-existence non-inheritances is 00:49:56 very helpful uh and then after that i lost you yeah i wanted to um make a couple points one is that uh we use a distinction between illusory and real in everyday life for instance we say that 00:50:10 a chair but then i was saying of course then um through science uh we realized that there are illusory aspects in the chair which are just called real as well 00:50:30 but then one is tempted and that's um to say all right so there are many luxury aspects of that chair but there is a a more fundamental level in which uh 00:50:45 there is a description of what is going on there which is a real one and edinton uh made it very very vividly in a well-known uh distinction between the scientific table 00:50:57 and the everyday table when he says look i have two images two tables there there's a table of which i eat which is solid and then there's a table which i view with my scientific eyes which is made by atoms 00:51:09 uh and is not solid there's a lot of emptiness of of not emptiness negatives empty completely different sense i i've heard that that emptiness is 99.9 to the 12th 00:51:20 power based in the atom is that right yes yes but that's of course not negative emptiness that's just the lack of presence of atoms yeah um and adidas says and people use that 00:51:34 by saying the the the the chair of my uh the chairman which i see the solitude is illusory the real chair is the atoms uh this way of using the notion of real and the 00:51:49 notion of um of uh existence so what exists in the atoms uh is dangerously misleading that's what 
00:52:01 i uh because uh it uh um it pushes us to try to resolve the relational and illusory aspect of reality that we see 00:52:15 in terms of some basic fundamental physical reality from which to derive it or in western subjective idealism 00:52:28 in terms and its derivation in terms of some sort of uh fundamental mind or fundamental subject which is a real existing entity 00:52:41 the cartesian mind that is certain of existing itself um or the kantian subject or even the the the fundamentality of the perception 00:52:53 itself in whosoever uh and in phenomenology so there is this western need to anchor um the uh what we mean by real or something final 00:53:07 so uh to to realize that there is dependence but then there is some basic grounds on which everything builds up on which to uh on which to sit and this is what i take emptiness 00:53:23 the notion of empty negative notion of emptiness to be useful uh to to get rid of this urge of finding beyond the uh 00:53:35 the illusory aspect of the world a a basic level which is not um uh real in in in the uh 00:53:47 in the sense of uh uh of of uh uh in which this chair is is real compared to the uh to the chair uh in the mirror but but really the fundamental way so the the the bottom line of the story the 00:54:02 the solid terrain on which to anchor the ultimate um uh uh the end point of the line of dependence the line of dependence ends to some point that's what is real 00:54:15 and and what is this nagarjuna is that that's the wrong question i mean uh it's not only that the chair the table is empty because i can understand it's something else but it's 00:54:26 also that something else is also empty because i can understand it's something else until the point in which there is this emptiness itself it's a it's empty because we shouldn't take it as a 00:54:40 as a fundamental sort of metaphysical principle on which to ground all the rest so this putting this this is yeah just putting this in slightly different 
00:54:51 terminology emptiness is where it allows functionality emptiness is the lack of any kind of essence even on a you know atomic level and i agree with you what you said 00:55:04 that's i think very true um right and this is a look at when we look at the chair versus the reflection of the chair in the mirror it gets a little more complicated because both of them of course lack any 00:55:17 independent existence both okay they're both empty uh in terms of shunyata having said that the metaphor that the buddha used he gave about 10 different 00:55:29 metaphors for you know something to be illusory and one of the important ones that he used was reflection you know he used the reflection of the moon or the full moon in in the still 00:55:41 water that it looks like the moon but in fact of course it's not it's a reflection he used such things as water in a mirage sound of an echo and you know things 00:55:55 like that to illustrate okay now um let me mention two experiments if i may and you correct me where i'm wrong i'm a 00:56:07 pop physicist from the new york times okay um and one is the uh the thought experiment of ed edwin schroedinger okay the so-called shorting her cat paradox 00:56:21 or thought experiment and you have double steel box in which you have a cat there's no doors no windows right and you have a vial of very powerful acid that's 00:56:33 connected to a radioisotope the half-life of the isotope is the same duration as the duration of your experiment your thought experiment so the chance of the cat so if the radioactive material 00:56:46 decays 50 chance it you know somehow pulls a lever and the acid spills killing the cat if that radioisotope does not decay there's no spillage of the of the 00:56:59 of the acid and the cat remains alive so quantum physicists call this superposition where the cat is both alive and dead when you crack open this steel box 00:57:13 then um you observe what's inside and then the cat is either dead if the radio 
isotope you know decayed and knocked over the acid or 00:57:25 it's alive it didn't okay and it's it's either or whereas when you can't observe it it's both it's superposition okay second is the double slit you know you you shoot these electrons or photons you 00:57:40 know through two slits in a metal thing and then you have a screen behind and you look at the the pattern and if you have a little camera observation device at the slit level of the slits observing 00:57:52 you find a pattern below on the back on the screen that suggests what passed through the splits were particles whereas if you remove the observation device you have an interference pattern 00:58:05 suggesting what went through this list were waves okay so these two experiments at least in my very uh you know superficial understanding tell us that observer dependence is very 00:58:18 important in terms of reality okay that whether or not there is or isn't or or maybe you can what type of observer you know presence there is very much influences and determines what's real 00:58:31 and so that then uh jumps into the four you know buddhist schools of philosophy and if we go from the so-called least sophisticated up the third one would be the one you alluded to that's somewhat 00:58:45 similar to bishop barkley in the west and other idealists that say that everything is consciousness everything is mine and things that seem to be solid out there in an external reality are nothing more than projections of our 00:58:58 mind and that's actually a very sophisticated philosophy it's a very sophisticated philosophy one of the things it starts to do is it breaks down this notion of a solid external reality 00:59:10 okay but it's con it's it's critique as you have you also mentioned is that it takes the mind you know to be somehow you know uh absolute or ultimate you 00:59:22 know existing and so then the highest if you will most sophisticated school of mediumica says well what the chidoma modulus the mind-only 
school says that's correct up to a point but the criticism is 00:59:36 there's no uh you know absoluteness about the mind either so then you end up with that you accept an external reality you accept a mind but both you know that is every existent thing uh exists 00:59:49 without having any uh exist in relationship without having any independence or objectivity um and so that's very roughly the at least the the the last two of the three buddhist schools the 01:00:03 third one is divided again into prasangika madhyamaka and svatantrika madhyamaka using tibetan terms that are borrowing from the sanskrit um and the prasangika madhyamaka is considered the most 01:00:16 sophisticated where nothing at all has intrinsic existence the whereas the uh svatantrika madhyamaka they say that some uh conventional reality does exist uh 01:00:30 from its own side having some essence uh so there's a little bit of a distinction in the debate there um so just wanted to to mention those things i'd like you to comment

      Kerzin differentiates between existence and intrinsic existence. Intrinsic existence is what the Buddha and Nagarjuna are trying to negate.

      Rovelli makes a good point about a prevalent attitude that science offers a truer perspective than common sense, while Nagarjuna is pointing out that even the scientific explanation is not the final one. For one thing, it implicitly depends on the existence of a reified self who is the ultimate solidified existing agent and final authority, which Nagarjuna negates with his tetralemma.

    1. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client's interest. The physician, puzzled by a patient's reactions, strikes the trail established in studying an earlier similar case, and runs rapidly through analogous case histories, with side references to the classics for the pertinent anatomy and histology. The chemist, struggling with the synthesis of an organic compound, has all the chemical literature before him in his laboratory, with trails following the analogies of compounds, and side trails to their physical and chemical behavior.

      Very interesting the way it describes different professions. Although there have been some approaches to the memex none of them have been this universally usable.

    2. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.

      Even with current Zettelkasten technology like Logseq, there is no way to create a trail and send that particular trail off to a friend. I wonder what the copyright laws would look like when it comes to sharing excerpts as part of annotated trails like this. Would it be covered under Fair Use? What would a file format or a renderer for this look like?

    3. He can add marginal notes and comments, taking advantage of one possible type of dry photography, and it could even be arranged so that he can do this by a stylus scheme, such as is now employed in the telautograph seen in railroad waiting rooms, just as though he had the physical page before him.

      We have gotten away from written annotations for digital work and I'm not entirely sure it's a good thing. I want to think through the trade-offs of this.

    1. Author Response

      Reviewer #2 (Public Review):

      In this manuscript, Villalta, Schmitt, Estrozi and colleagues report their results on genome compaction in one of the most complex known viruses, the Mimivirus. This work will be of interest to a broad readership, and particularly to virologists and structural biologists. The authors describe a novel mechanism used by mimivirus to compact and package its 1.2 Mb dsDNA genome. In particular, the mimivirus genome is shown to be packed into magnificent cylinder-like assemblies composed of GMC-type oxidoreductases, presenting yet another remarkable case of enzyme exaptation. By using cryo-electron microscopy (cryo-EM) and cryo-electron tomography (cryo-ET), the authors determined the structures of such fibers in several relaxation states, which presumably represent different stages of nucleoprotein unpacking upon delivery into host cytoplasm. The authors also suggest (although do not directly visualize) that the lumen of the genomic fibers contains several viral enzymes, most notably, DNA-dependent RNA polymerase, which is necessary for cytoplasmic replication of the mimivirus. Overall, this is an important discovery, which further expands our appreciation of the "inventiveness" of viruses.

      We thank this reviewer for the positive and constructive comments. We now provide some additional data from unpublished follow-up studies, which we hope will help all reviewers assess the quality and reliability of our work.

      I am not an expert on helical reconstructions and cannot evaluate the validity of the models. Thus, my specific comments will focus on aspects of the work with which I am more familiar.

      1) In light of the presented results, it is reasonable to assume that GMC-type oxidoreductases of the mimivirus are very important for the formation of functional virions. However, in a previous study (PMID: 21646533), it has been shown that the genes encoding GMC-type oxidoreductases can be deleted from the virus genome (M4 mutant) without the loss of infectivity. The M4 virions were devoid of the external fibers decorating the icosahedral capsid, but the genome was still packaged. How do the authors reconcile these results with those presented in the present manuscript? This should be addressed in the Discussion section.

      In fact, like the reviewers, we initially assumed that the GMC-oxidoreductases were essential. Now, we believe it might be premature to assume that GMC-type oxidoreductases are the only type of proteins that can be involved in the scaffolding of the Mimiviridae genomic fibers. We managed to extract the genomic fiber of M4 (the isolate without GMC oxidoreductases). The fiber also has a rod-shaped structure but protein composition analysis of the purified fiber shows that different proteins are involved in its assembly.

      We hope the reviewers will accept that we reserve this finding for a subsequent publication.

      2) The authors state that mimivirus encodes two GMC-type oxidoreductases (qu_946 and qu_143) and that both could be fitted into the electron densities. However, I could not understand whether the authors think that the fibers are heteroassemblies of both oxidoreductases or different fibers are composed of different proteins, or only one is used for fiber formation. Please clarify. In case you are not able to distinguish between the two homologs (e.g., due to limited resolution), state so explicitly.

      We cannot discriminate between the two GMC-oxidoreductases due to their close identity (69% identity, 81% similarity) and the resolution of the map. We think that in most cells the qu_946 GMC-oxidoreductase is the more abundant of the two at the time of genome packaging (between 2 and 9 times more abundant, according to our proteomic study). In some cells, however, the second GMC-oxidoreductase could become the more abundant and, in that case, the genomic fiber is built using qu_143.

      3) I am slightly puzzled by the observed "ball of yarn". It is hard for me to imagine that a cylindrical container/fiber containing a continuous dsDNA genome could be bent or fragmented into bundles because this would break the protein-protein interactions holding the fiber together. In Figures 1C and S1, are these parts of the same fiber or multiple fibers coming out of one capsid? Related to this question - is there evidence (e.g., from qPCR) that Mimivirus carries a single copy of genomic dsDNA per capsid?

      We believe this reviewer should think in terms of packaging. The folded genome is packaged through two lipid membranes (the one lining the capsid interior and the one in the nucleoid) concomitantly with its wrapping by the protein shell ribbon. Thus, there is plenty of space in the nucleoid at the beginning of the packaging and the genomic fiber is gently folded inside. But as more genome needs to be packaged, this compresses the flexible fiber into the nucleoid until it is totally encased in the nucleoid. That also defines the size of the nucleoid in the icosahedral capsid. This tight packaging is exemplified in Fig 1A for instance or the AFM images of the nucleoid enclosed in P3 of this file.

      We provide a more general answer in the answers requested by the editor.

      We think that the entire genome can only be packaged in the capsid through its assembly within the protein shell. We also think the genomic fiber is progressively built on the genomic DNA while it progresses into the capsid, most likely by an energy driven packaging machinery. This process can be compared to bacterial pili assembly, except that pili are built on the surface of the cell, while the genomic fiber is built into a compartment, the nucleoid, forcing it to fold in this compartment, which is only possible due to the high flexibility of the genomic fiber. Thus, the entire genome corresponds to ~40 µm of genomic fiber, which when folded as a ball can entirely fit into the nucleoid. The organization of the genome in a large “tubular structure” and its folding inside the nucleoid compartment has been previously reported by AFM studies of the mimivirus particles (Kuznetsov, Y. G. et al. Virology 2010; Kuznetsov YG et al. J. Virol. 2013, Fig 15), which the authors refer to as “highly condensed nucleoprotein masses about 350 nm in diameter within the inner membrane sacs of virions”, with the presence of tubular structure they refer to as “thick cables of the nucleic acid” (image P3 herein).
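
      A rough sanity check of the lengths quoted above can be sketched as follows. It assumes the standard B-form rise of about 0.34 nm per base pair, a value that is not stated in the response, and the variable names are purely illustrative. Under that assumption the extended 1.2 Mb genome is roughly ten times longer than the ~40 µm fiber, which is at least compatible with the DNA being folded several times inside the protein shell.

```python
# Back-of-the-envelope check of the compaction implied by the quoted figures.
# Assumption (not from the response): B-form dsDNA rises ~0.34 nm per base pair.
GENOME_BP = 1.2e6        # 1.2 Mb genome, as stated in the review
RISE_NM_PER_BP = 0.34    # assumed B-DNA rise per base pair
FIBER_LENGTH_UM = 40.0   # ~40 µm of genomic fiber, as stated in the response


def contour_length_um(n_bp, rise_nm=RISE_NM_PER_BP):
    """Length of fully extended dsDNA, in micrometres."""
    return n_bp * rise_nm / 1000.0


dna_um = contour_length_um(GENOME_BP)
compaction = dna_um / FIBER_LENGTH_UM
print(f"extended dsDNA: {dna_um:.0f} µm, compaction in fiber: ~{compaction:.0f}x")
```

      This estimate only compares linear lengths; it says nothing about how the DNA is actually arranged within the fiber.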

      4) The authors describe the interactions between the monomers in the dimer of qu_946 as well as between qu_946 and DNA. I would also like to see a brief description of protein-protein interactions between subunits within the same helical strand as well as between helical strands, which hold the whole assembly together (i.e., what are the contacts between green subunits as well as between green and yellow subunits shown in Fig 2C). The authors suggest that the shell "would guide the folding of the dsDNA strands into the structure" (L310). To support this statement, the authors could show the lumen of the fiber rendered by electrostatic potential.

      We thank this reviewer for these suggestions. An additional supplementary table (Table S4) is now provided listing the contacting residues in each genomic fiber map and for each GMC-oxidoreductase. The number of contacts obviously decreases in the relaxed structure, but even in the compact forms we noticed that there are relatively few intra- and inter-strand contacts, which may also explain the flexibility of the structure. We now provide a new Figure 3 in which the lumen of the fiber is rendered by electrostatic potential for the Cl1a map and each of the two GMC-oxidoreductases.

      5) Please provide some background information on the distribution of GMC-type oxidoreductases in other families of giant viruses, so that it is clearer whether the described packaging mechanism is specific to mimiviruses or is more widespread.

      This is a central point, also linked to the question about M4. In fact, like the reviewers, we initially assumed that the GMC-oxidoreductases were essential. Now, we believe it might be premature to assume that GMC-type oxidoreductases are the only type of proteins that can be involved in the scaffolding of the Mimiviridae genomic fibers.

      If this reviewer still thinks this is essential to this manuscript we can provide a multiple alignment of the GMC-oxidoreductases of members of each clade upon request.

      Reviewer #3 (Public Review):

      Since it was presented to the scientific community as a viral entity, mimivirus has had an unlimited capacity to cause surprise and admiration. In this manuscript, Villalta, Schmitt, Estrozi, et al. and Abergel present how the gigantic mimivirus genome is organized in the virion. The authors succeeded in developing a protocol to trigger virus genome uncoating followed by purification of genome-associated proteins. The presented data indicate that a helical shield composed of two GMC-type oxidoreductases is associated with the mimivirus genome, named the genomic fiber. By cryo-EM and cryo-tomography, different forms and stages of the genomic fiber were described in detail, indicating the dynamics of the fiber's conformational changes, likely related to genome packing and uncoating during the virus replication cycle. In-depth analysis of a substantial number of individual virus fibers revealed that the mimivirus genome is folded and organized inside the aforementioned helical shield, which seems to be novel among giant icosahedral viruses. Proteomics in association with image analysis indicates that the packed mimivirus genome forms a channel, which accommodates key enzymes related to early phases of the replication cycle, especially RNA polymerase subunits.

      I must disclose that I am not an expert on structural virology and proteomic analysis. Therefore, I don't feel I can contribute to the improvement of this kind of analysis. That said, I congratulate the authors for their efforts to make the manuscript story understandable to nonexperts.

      We are grateful to this reviewer for these positive comments.

      I have a few suggestions and comments:

      1) Please consider the "nucleocapsid" concept during genomic fiber presentation. I believe it fits in;

      We fully agree and this was why we referred to APBV-1. Obviously, it was not clear and we now explicitly use the word “nucleocapsid” in the text.

      2) The "ball of yarn" analogy is nice, but fig 1C shows several fibers unconnected (free) in one of their ends. I am wondering if it means that the genomic fiber is not a long-single structure covering the whole genome, but a bunch of several independent helical structures covering the whole genome and attached in such "ball of yarn". Like several threads connected. Could the authors clarify that please?

      In the “ball of yarn” structures, there are clearly breaks that give the impression of multiple fibers. Yet, these breaks are due to the multiple steps of the extraction, enrichment and purification treatment. The genomic fiber is built as a long (~40 µm) single structure folded in the nucleoid while it is loaded. As a result, it is tightly packed into the nucleoid and broken into fragments upon release due to the fragilizing treatment. As exemplified in the CryoEM image provided above (P9) on freshly opened capsids, these breaks appear to depend on the treatment. This reviewer could also look at the answer we provided to Reviewer 2 point 3 as this could help clarify how it is possible to package the genomic fiber and subsequently fold it into the nucleoid to the point where it is tightly packed and under pressure.

      3) Considering previously published data on proteomics of viral factories and transcriptomics of mimivirus: is there any temporal association between GMC-type oxidoreductases' peak of expression and genome replication during the viral cycle? what about RNA pol subunits? Are all those proteins highly expressed during the late cycle? or do they reach the peak concomitantly with genome replication? This information can support the discussion on the genome-fibers assembly during the cycle.

      We thank this reviewer for these suggestions. We have now added the time of expression of the proteins involved in the genomic fiber composition throughout the manuscript. We added explicit sentences in the main text both for the GMC-oxidoreductases and the RNA polymerase subunits. The RNA polymerase as well as proteins involved in mRNA maturation are in the virion (Table S2 B), and studies by others demonstrate that early transcription takes place in the nucleoid once it is transferred into the host cytoplasm (Reference 24). We also provide the reviewers with a link to the expression data for the different mimivirus genes: http://www.igs.cnrs-mrs.fr/mimivirus/

      4) Taken together, data seem convincing to demonstrate that the virus genome is located inside the helical shield. However, I believe that the authors could better explain why we only see 20 kb fragments in the gel, including in the control (in Fig S2).

      We hope our answers to this comment will convince this reviewer.

      Fig S2 corresponds to a regular 1% agarose gel and not to a PFGE gel. This gel was simply meant to show that there is DNA associated with the genomic fiber, not to show the size of the DNA: the genomic fiber has been broken into pieces, so we do not expect very high molecular weight DNA. I must point out that when the DNA is extracted from Mimivirus capsids using standard kits and pipetting, it also migrates at the top of the gel (Lane 1 in Fig. S2), while it would likely appear as a smear above 20 kb on a PFGE gel. By contrast, when the viral particles are put into plugs prior to lysis, the genomic DNA migrates at the proper size, as shown in the publication from Boyer et al. 2011 (reference 31), showing that the genome of Mimivirus is a linear genome migrating around 1.37 Mb (Fig 1, Panel B, Lane M1). In P9 of this letter, an image of a long (> 6 µm) and flexible fiber is presented.

      Reviewer #4 (Public Review):

      In the manuscript "The giant Mimivirus 1.2 Mb genome is elegantly organized into a 30 nm helical protein shield", the authors show that, when subjected to low pH stress, the Mimivirus particle releases 30nm-diameter filamentous assemblies. These filaments consist of a protein shell that envelopes the Mimivirus genomic DNA. The protein shell is composed of two GMC-oxidoreductases, the same protein that forms the long fibers emanating from the capsid of the Mimivirus.

      Overall, despite being interested in the subject, this scientist was left confused about several aspects of the paper described below. The presentation of the material is also confusing.

      We hope the answers and images we provide to all Reviewers on pages 2 to 12 herein will clarify the various points raised by this reviewer.

      1) The presented data do not allow the estimation of the amount of mimivirus genome organized into 30 nm diameter filaments. Hence, the title of the paper is misleading.

      The entire genome should be packaged in the genomic fiber. This was already observed by others, and we now provide a published image of the nucleoid imaged by AFM. The image was extracted from Kuznetsov et al. J. Virol. 2013. See p9 of this letter.

      2) The filamentous structures are a result of extremely harsh treatment of the virus particle, which starts with a 1.5 hour-long incubation at pH 2. Do the filaments actually exist inside the virus particle as the title of the paper implies?

      The 1 h incubation at 30°C and pH 2 was only applied to recover the nucleoids (see the Materials and Methods section “Nucleoid extraction”) presented in Fig S1A. Acidic treatment was never applied to produce the genomic fiber, as we noticed that it is sensitive to both temperature and acidic treatment. All steps of the extraction protocol were performed at pH 7.5 (section: “Extraction and purification of the mimivirus genomic fiber”). We must emphasize that the release of the genomic fiber can be seen at the very first step of the extraction protocol (protease treatment). The sample was also checked at each step of the protocol by negative staining TEM to assess the status of the genomic fiber. We had to optimize the protocol: a proteolytic treatment that was too mild opened too few particles, although the genomic fiber released was mostly compact, whereas a treatment that was too harsh opened all particles but left the genomic fiber mostly in the ribbon state. We had to compromise to get a decent amount of compact and relaxing structures to be able to perform the present work. We would like to stress that we could reproducibly obtain the genomic fiber from many preparations and that we could observe it with different virions (including M4), even using different protocols (only the one with the best yield is reported in the manuscript).

      In Figure 1B the genomic fiber can be seen inside a virion, still encased in the membrane compartment. These structures were not reported in previous cryo-EM analyses of the virions. As said above, they were only reported by AFM studies of the mimivirus particles (Kuznetsov, Y. G. et al. Virology 2010; Kuznetsov YG et al. J. Virol. 2013, Fig 15). See p9.

      Or [might] these filaments [form during] host take over?

      Or [perhaps] these filaments [result from a harsh in vitro treatment] and have nothing to do with either?

      The first two questions can be answered with the help of cryoFIB tomography, which might be beyond the scope of a "paper revision". However, the properties of the two GMC-oxidoreductases in the presence and in the absence of genomic DNA must be examined in greater detail. Can these proteins, by themselves, form similar hollow filaments (or any filaments) when subjected to the same treatment as the virus?

      I personally have difficulty imagining that such a complex structure could be an artefact of the treatment, for several reasons:

      • It is unlikely that simply putting the GMC-oxidoreductases together with DNA would result in a helical structure where the DNA is folded 5 times and internally lines the protein shell (extended data video 1 of one tomogram). It would be like crystallizing the proteins (in a heterogeneous sample) onto the folded DNA to form a helix with a hollow lumen. The crystallographic work by others on the mimivirus GMC-oxidoreductase did not produce tubular structures either, although they reported 3 crystal forms. They overexpressed the proteins in E. coli and did not report such structures bound to DNA either.

      • Given the presence of compact and relaxed forms, once relaxed the helix cannot go back to a compact state passively by simply rewinding, suggesting that the relaxed forms are the result of decompaction of a constrained structure. This is also supported by the loss of DNA in the relaxed state Cl3. The last steps of unfolding correspond to the loss of one ribbon strand after the other.

      • The intra- and inter-strand contacts between chains are also scarce, supporting an active assembly of the structure. We now provide an additional supplementary Table S4 with the different contacts for the different states of the genomic fiber.

      3) Although the assignment of the qu_946 oxidoreductase to the corresponding cryo-EM density is correct (as the resolution is high enough), I am confused about the other oxidoreductase (qu_143). Where does it fit? Which structure does it form?

      We cannot discriminate between the two GMC-oxidoreductases due to their close identity (69% identity, 81% similarity) and the resolution of the map. We think that in most cells the qu_946 GMC-oxidoreductase is the more abundant of the two at the time of genome packaging (between 2 and 9 times more abundant, according to our proteomic study). In some cells, however, the second GMC-oxidoreductase could become the more abundant and, in that case, the genomic fiber is built using qu_143.

      Equally important, what is going on with the N-terminal 50-residue domain of qu_946? Is there a space for it in the cryoEM map? Is it disordered?

      The N-terminal domain is only present in the fibrils decorating the capsids. As illustrated in Fig S12, when analyzed by MS-based proteomics, the peptide coverage of the GMC-oxidoreductases differs depending on whether they compose the fibrils or the genomic fiber. The N-terminal domain is clearly covered when the fibrils (data not shown) or intact virions are analyzed and not covered when the analysis is performed on the genomic fiber. That is why we propose that this N-terminal domain could be an addressing signal (see main text) and that a protease could be cleaving it in the case of the genomic fiber assembly.

      Main text: The proteomic analyses provided different sequence coverages for the GMC-oxidoreductases depending on whether samples were virions or the purified genomic fiber preparations, with substantial under-representation of the N-terminal domain in the genomic fiber (Fig. S12). Accordingly, the maturation of the GMC-oxidoreductases involved in genome packaging must be mediated by one of the many proteases encoded by the virus or the host cell.

      Indeed, there is no space to accommodate this domain as it would prevent the interaction between the protein shell and the DNA or/and induce an increase of the genomic fiber diameter that would be too big to be accommodated into the nucleoid.

      4) The bubblegram analysis is not very convincing. The bubbles appear to correlate with the length or thickness of the structure - the long or overlapped structures form bubbles. The bubbles may not be due to the presence of DNA.

      The point is, as demonstrated by our structural studies, that the relaxed structure lost the DNA. This is why bubbles cannot be seen in the relaxed broken fibers. On long fibers still in compact form, the DNA is visible in the structure and bubbles can be seen. Yet the evidence for the presence of DNA in the structure is also provided by the agarose gel of the purified genomic fiber and the cryo-EM structures. Bubblegrams are just one additional analysis that was provided.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are sincerely grateful to the reviewers for several key comments that led us to correct some mistakes and better appreciate how to put our findings in the context of recently published data. These changes undoubtedly improved the manuscript.

      Many other reviewer comments seem to equate chaperone binding with a functional chaperone role in de novo folding. These are not the same. Cytosolic chaperones presumably “sample” nearly every protein that is synthesized by cytoplasmic ribosomes. This does not mean that every such protein would misfold if even one of those chaperones failed to bind it. If we want to understand what chaperone mutations might cause human disease due to septin misfolding, for example, it will not be enough to catalog all the chaperones that bind septins. We have already done that. What will help is to understand which chaperones make functional contributions to septin folding and complex assembly. Our study is the first to experimentally address chaperone roles in de novo septin folding, period. We take responsibility for not being sufficiently clear about the goals of our work, and, to emphasize these points, we added one sentence to the Introduction and revised another.

      Another consistent criticism was that the use of the E. coli system, both in vivo and in vitro, limited our ability to gain insight into the folding of septins in eukaryotic cells and led to a “tessellated view”. For example, reviewers claimed that our model about translation elongation rates for Cdc12 was “based mainly on the E. coli system and bioinformatics analysis”. We disagree with this interpretation. Key evidence in support of our model comes from published data in yeast, specifically the much higher density of ribosomes on Cdc12 and the accumulation of ribosomes on the Pro-rich cluster near the Cdc12 N terminus. These are precisely the kinds of “more stringent analysis” in “authentic yeast” (to use Reviewers’ language) that we would have wanted to do to test our model, had they not already been done by others. Without specific suggestions, we struggle to imagine what other kinds of experiments the Reviewers have in mind, apart from a eukaryotic version of a reconstituted cell-free translation system, which Reviewer #1 admits “would be substantially difficult” and “time consuming”. While we are intrigued by the reconstituted eukaryotic cell-free translation system that was published last year (which we mentioned on lines 994-995) and look forward to exploring it in future studies, it is not commercially available and we agree that the amount of effort required to prepare it ourselves is unrealistic for the current study. Most importantly, we do not find in the critiques provided any specific reason why our E. coli-based systems experiments are intrinsically less “stringent” or “rigorous”.

      Accordingly, we think that, together with the results of multiple new experiments (detailed below), the extensive re-writing and re-ordering that we have done in the revised manuscript will be enough to better emphasize the importance and rigor of our findings and thus to address all of the Reviewers’ specific concerns.

      Reviewer 1 thought that our manuscript “does not even provide new information, since the involvement of CCT and the Hsp70 system is not novel” and thought that “the key finding of this manuscript is how chaperones are involved in the de novo folding of septins, which is not conceptually new because of previous findings, including those of the authors”. Reviewer #3 also stated that “the function of Tric/CCT in septin folding and assembly is well documented”.

      We were quite surprised at this reaction, since we dedicated a significant portion of the original manuscript (lines 68-76 and 319-322) to explicitly discussing the only other paper in the literature that specifically addresses the question of whether or not CCT is required for de novo septin folding. As a reminder, that paper explicitly stated that “it is unlikely that CCT is required to fold septins de novo” and “septins probably do not need CCT for biogenesis or folding”. With regard to involvement of the Hsp70 system, the only existing evidence in the literature on this subject is the aggregation of some septins in ssb1∆ ssb2∆ cells. Like the CCT study, that study did not distinguish whether this was a result of problems during septin synthesis and before septin complex assembly, or, alternatively, whether pre-folded and assembled septins were subject to disassembly, misfolding, and aggregation. Our experiments specifically test the fate of newly-synthesized septins prior to assembly in living cells. Our previous findings documented physical interactions between wild-type septins and multiple chaperones but did not address whether these interactions had any functional relevance. We previously reported functional effects of interactions between chaperones and MUTANT septins but, again, these studies did not address functional chaperone requirements for WILD-TYPE septins. While we did our best to highlight these points in the original document without devoting excessive amounts of text, we accept responsibility for not making these points sufficiently clear and to address this issue we added additional text, including the text quoted above, to the Introduction.

      While Reviewer #3 commented that the manuscript “is overall well presented”, Reviewer 1 thought that the manuscript was “complicated to read” with “no logical connections, just a list of many results” and mentioned that part of the difficulty was “that it contains many negative results”.

      In addition to reorganizing the manuscript, as suggested by the reviewers, we added more text at the beginning and end of nearly every section to even more explicitly state the logical connections between results. In our opinion, negative results of properly controlled experiments are valuable to the research community, and we do not understand what it is about negative results that makes them difficult to read about. Many of the extra experiments we performed were in anticipation of being asked to perform them by reviewers, some of which generated negative results. We are reluctant to remove negative results unless there is a more compelling reason. For example, to address another reviewer concern, we did remove the negative results with the Ydj1–Ssa2 compensatory mutants.

      Reviewer #2: “4) Figure 2: The labeling on the protein structure makes it seem like the exact region for Ydj1 and Hsp70 was experimentally identified, when it hasn’t.”

      We acknowledge that the first sentence of the figure legend (“the colored ribbon follows the color scheme in the sequences at right for overlapping β-aggregation, Ydj1 and Hsp70-binding sites”) could be misinterpreted, since only in the second sentence does it say “Sequence alignments show predicted binding sites”. We corrected this mistake, and added the text “Predicted chaperone binding sites” as the first words in the legend to this figure.

      Reviewer #2: “8) The authors confusingly jump back and forth between different Septins and different chaperone (Ssa1-4, Ydj1, Sis1, Hsp104). We would ask the authors to re-arrange the manuscript, collating all the yeast work in one section and bacterial work in another.”

      We re-arranged the manuscript and put all the yeast work in one section and all the bacterial work in another, with the exception of the studies of individually purified Cdc3 and Cdc12, which we put in between the yeast studies of the kinetics of de novo assembly and the yeast studies of post-translational assembly. Our reasoning is that the studies with the purified proteins demonstrate challenges with maintaining native conformations in the absence of chaperones and other septins, which flows naturally into the yeast studies asking about the ability of “excess” septins to maintain oligomerization-competent conformations in the absence of other septins and when we experimentally eliminate specific chaperones. All of the work actually manipulating E. coli genes/proteins is now together.

      Reviewer #3: “1. The co-translational binding of CCT to nascent polypeptide chains has been studied (Stein et al., Mol Cell 2019). While the authors indicate that septin subunits are engaged co-translationally, they do not comment which ones are interacting with CCT and at which state of translation. This information is crucial and should also be mentioned in the discussion section.”

      We are grateful to the Reviewer for bringing up this point, which we had overlooked. We hadn’t noticed that, in the end, only Cdc3 met the CCT confidence threshold to be included in the supplemental data of the Stein et al. paper. All septins co-purified with CCT in an earlier Dekker et al. proteomic study, so we strongly suspect that the failure of the other septins to meet the confidence threshold in the Stein et al. paper reflects the sensitivity of that assay, rather than a significant difference in how septin GTPase domains interact with CCT. We also hadn’t appreciated that according to that study, the main sites in the Cdc3 GTPase domain bound by CCT and Ssb are the same. Hence our statement that Ssb bound to septins “earlier” during translation, and CCT bound “later”, was wrong. Instead, the overlapping Ssb and CCT site in Cdc3 turns out to be remarkably consistent with a conclusion from the Stein et al. paper, that CCT binds Rossmann-fold proteins like septins at sites where “early” beta strands have been translated and expose a chaperone-binding surface that later becomes buried by an alpha helix. We corrected our mistake in the text and in our model figure and added: (1) a new supplemental figure with predicted septin structures and a sequence alignment indicating where CCT and Ssb bound; and (2) text discussing the confidence thresholds for “calling” septin-CCT interaction, the Rossmann-fold binding, and how we interpret Ssb and CCT binding to the same site.

      Reviewer #3 “3. Figure 3: It is recommended to also follow Cdc10-GFP and Cdc12-GFP fluorescence. This will on the one hand generalize the presented findings and provide a direct link to other parts of the study (e.g. crosslinking analysis of Cdc10).”

      We carried out the requested experiment for Cdc12, using Cdc12-mCherry rather than Cdc12-GFP because of the formation of non-native foci that we observed with Cdc12-GFP. We also attempted to analyze Cdc10, using an existing GAL1/10-promoter-driven Cdc10-mCherry plasmid that we’d made a few years ago, but it did not behave as expected, with high expression even in the absence of galactose (not shown), which prevented us from performing the requested experiment. We have a Cdc10-GFP plasmid with the inducible MET15 promoter, but this promoter does not provide sufficiently low levels of expression in repressive conditions, so there would be too much expression at the beginning of the experiment for us to accurately follow accumulation thereafter. Instead, we tried the only other plasmid we had with the GAL1/10-promoter controlling a tagged septin: Cdc11-GFP. Above a certain threshold of expression, Cdc11-GFP formed unexpected cortical foci, but we were still able to perform the analysis and found a clear delay in septin ring signal in cct4 cells, providing the requested generalization to other septins, if not Cdc10.

      Reviewer #3 “5. Figure 4C: The finding that only ssb1 but not ssb2 knockouts have an effect on joining of free Cdc12-mCherry subunits into septin rings is puzzling. Similarly, Ssb1 largely acts co-translationally, while in this assay post-translational septin ring assembly is monitored. The authors need to comment on these two points.”

      We did not examine ssb2 knockouts, so we do not know to what the Reviewer is referring in the first point. If the Reviewer means that they are puzzled by the fact that we saw a phenotype in cells in which only SSB1 was deleted and SSB2 remained, we offer two explanations. As can be seen in the Saccharomyces Genome Database entry for SSB1 (https://yeastgenome.org/locus/S000002388/phenotype), there are at least a dozen known phenotypes associated with deletion of SSB1 in cells with wild-type SSB2. We even showed a very clear septin misfolding/mislocalization phenotype in Supplemental Figure 4D. Thus while our findings are new and provide novel insights into Ssb function, they are not unprecedented. The Reviewer is correct that most Ssb is ribosome-bound and thus Ssb1 “largely acts co-translationally” but ~25% of Ssb is not ribosome-associated (PMID: 1394434). Furthermore, the lack of a strong phenotype for ssb1∆ cells in our new kinetics-of-folding experiment (see below), plus the realization that Ssb and CCT both bind the same site in Cdc3, leads us to a new model: Ssb acts both co- and post-translationally in septin folding, but only the post-translational function is associated with a phenotype in ssb1∆ cells, because in that assay we drastically overexpress a tagged septin and thereby exceed the Ssb chaperone capacity that remains when we delete SSB1. This logic also explains the first ssb1∆ phenotype we saw, when overexpressing Cdc10(D182N)-GFP. In the kinetics-of-folding assay, on the other hand, tagged septin expression is much lower and reducing the amount of total Ssb by ~50% (via SSB1 deletion) likely does not compromise Ssb function in folding the tagged septin. We therefore removed our statement that “Ssb dysfunction leaves nascent septins in non-native conformations that are aggregation-prone and unrecognizable to CCT”, revised our model figure accordingly, and added new text and citations to explain our new model.

      Reviewer #3 “Additionally, they should test whether the appearance of septin ring fluorescence is slowed down in ssb1 mutants (as shown for cct4-1 mutant cells in Figure 3B).”

      We agree that slower septin folding in ssb1∆ cells is a prediction of our model, and we performed the requested experiment and include the results in our revised manuscript. The new data show that the appearance of septin ring fluorescence is not delayed in ssb1∆ mutants, which is easily explained by the ability of Ssb2 to chaperone the folding of the low levels of tagged septin that we express in these kinds of experiments (see above).

      Reviewer #3: “7. Figure 5G: The data is not convincing. This reviewer cannot detect a specific Cdc12 band accumulating in presence of GroEL/ES.”

      We re-ran the reactions with fresh reagents and this time ran the gel longer to reduce excess signal from free fluorescent puromycin and the bright Cdc10 bands. We now see a very clear band for full-length Cdc12 in the reaction with added GroEL/ES, fully consistent with our mass spectrometry results. We updated the figure with the new results.

      Reviewer #3: “Furthermore, the activity tests done for the chaperonin system are confusing (Supplemental Figure 7). The ATPase rate (slope!) of GroEL/GroES seems higher as compared to GroEL but according to the authors it should be opposite.”

      In our assays, the ATPase activity is so fast that for our “time 0” timepoint, much of it has already occurred by the time the reaction can be physically stopped and measured. In other words, the handling time is such that we can’t visualize what happened in the earliest stages of the reaction, where the rates could accurately be estimated as slopes. This is obvious from the fact that at time 0, the absorbance for the “GroEL alone” reaction is already more than twice the absorbance for GroEL+ES. We added clarifying text to the figure legend.
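      The reasoning above can be illustrated with a minimal sketch. It assumes a simple first-order, substrate-limited reaction, which is an illustrative model only and not the authors' assay; the rate constants, the handling delay, and the function names are all hypothetical. It shows how a faster reaction can give a higher reading at the nominal “time 0” and yet a shallower slope afterwards, because most of the substrate is already consumed before the first measurement.

```python
import math

def product_formed(rate_constant, elapsed, total_substrate=1.0):
    """Product formed after `elapsed` time for a first-order, substrate-limited reaction."""
    return total_substrate * (1.0 - math.exp(-rate_constant * elapsed))

def apparent_readout(rate_constant, dead_time, window):
    """Return (signal at the nominal 'time 0', slope over the following window).

    The nominal 'time 0' sample is really taken after `dead_time` has elapsed,
    which stands in for the handling time needed to stop and measure the reaction.
    """
    first = product_formed(rate_constant, dead_time)
    later = product_formed(rate_constant, dead_time + window)
    return first, (later - first) / window

# Hypothetical rate constants and handling delay, in arbitrary units.
fast_signal, fast_slope = apparent_readout(rate_constant=2.0, dead_time=1.0, window=1.0)
slow_signal, slow_slope = apparent_readout(rate_constant=0.5, dead_time=1.0, window=1.0)

print(f"fast reaction: time-0 signal {fast_signal:.2f}, later slope {fast_slope:.2f}")
print(f"slow reaction: time-0 signal {slow_signal:.2f}, later slope {slow_slope:.2f}")
```

      With these assumed values the faster reaction starts from the higher signal but shows the smaller subsequent slope, so slopes measured after the handling delay would rank the two reactions in the wrong order.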

      Reviewer #3: “The refolding assay using Rhodanese as substrate is also confusing: What is the activity of native Rhodanese? The aggregated Rhodanese sample seems to have substantial activity that is not too different from a GroEL/ES-treated one. From the presented data it is not clear to the reviewer to which extend GroEL/ES prevents aggregation and supports folding of denatured Rhodanese.”

      We thank the Reviewer for bringing this to our attention, because we mistakenly left out the values for native Rhodanese with the reporter. With regard to the aggregated Rhodanese, we failed to note that this sample contains urea. When the urea absorbance is subtracted, it is clear that the GroEL/ES-treated sample has higher activity. Furthermore, some native enzyme is likely still active within the aggregated sample, explaining the “substantial activity” that the Reviewer correctly notes. We corrected the figure and added clarifying text to the figure legend.

      Reviewer #3: “the study goes astray following aspects that does not seem relevant to this reviewer (e.g. the role of N-terminal proline residues for Cdc12 translation, Fig. 5E/F).”

      We acknowledge that we did a poor job of introducing the N-terminal Pro-rich cluster in Cdc12 in relation to our model of slow Cdc12 translation. We have now revised and reorganized the manuscript to set up these experiments as a direct test of our model: if ribosome collisions on the body of the ORF drive mRNA decay, then decreasing the spacing of those ribosomes should exacerbate the problem, and eliminating the Pro-rich cluster (where published yeast data already show ribosomes accumulate) is the most logical way to test the prediction. Far from being irrelevant, the results fit the prediction perfectly and thus support the model. We expect that this change will highlight the importance of these experiments for the reader.

      Reviewer #2: “1) Fig. 1 Is the folding of Cdc3 being measured in cells lacking chaperones mentioned towards the end of the paper or are the authors referring to the lack of yeast proteins?”

      We are unclear as to what the Reviewer is asking here. The title of Figure 1 states that these are “purified yeast septins” and the figure legend further emphasizes this fact. Additionally, the Coomassie-stained gel in Figure 1A shows a single band, corresponding to purified 6xHis-Cdc3. The proteins were purified from wild-type E. coli cells, so all E. coli chaperones were present when Cdc3 initially folded, but chaperones and all other proteins were removed during the purification and prior to the analysis. We do not know what change to make.

      Reviewer #2 asked “How do the authors account for the septin defect in Ssa4 delete cells in unstressed conditions where Ssa4 would be very low already? According to the authors previous work, Ssa2 and 3 should be able to compensate.”

      We explicitly addressed this point in the original manuscript (lines 893-898). Again, we think here the Reviewer is equating chaperone binding with chaperone function. According to our previous work, Ssa2 and Ssa3 are able to bind septins, but this does not mean that they can fold septins the same way as Ssa4. We cite several papers that discuss the distinct functional roles for the different Ssa proteins. We do not think that additional clarification of this point would strengthen the manuscript.

      Reviewer #3: “6. Figure 5B: It is unclear why Cdc3 is observed in the pulldown of His-tagged Cdc12 (37˚C), although no Cdc12 was isolated under these conditions. How is that possible?”

      That is not possible. As we indicate in the figure legend and with the red asterisk, the only band appearing in that lane is a non-specific band that cross-reacts with the anti-Cdc3 and/or anti-Cdc11 antibodies. This is why it is also present in the “No septins” control lanes. We made the asterisk larger to help accentuate this point.

      Reviewer #3: “Furthermore, the authors observe a specific effect on Cdc12-Cdc11 assembly in the E. coli groEL mutant. How do they rationalize this specific effect as Cdc12-Cdc3 assembly remained unchanged? This observation also seems in conflict with the suggestion of the authors that Cdc12 preferentially recruits Cdc11 before interacting with Cdc3 (page 45, lane 1024).”

      Cdc11 was not expressed in the groEL mutants because no Cdc11 gene was present in those cells, as explained in the body text and indicated in the labeling above the lanes in Figure 5A. The band near the size of Cdc11 is a non-septin protein that bound to the beads in the groEL-mutant cells, as is shown in the immunoblot using anti-Cdc11 antibodies in Figure 5B. Thus there is no conflict to rationalize.

      Reviewer #1: “The only evidence that CCT binds to septin is the list of LC-MS/MS. Western blotting would provide more solid data.” and “2) The cross-linking experiments appears not to have been successful. Why are the Ssas, Ydjs etc not detected here?”

      First, CCT subunits are relatively low-abundance, expressed at 5- to 50-fold lower levels than other chaperone families in the yeast cytosol (see PMID: 23420633). To the Reviewer’s second point, we did in fact detect other chaperones in our crosslinking mass spectrometry experiments, including Ydj1, multiple Ssa and Ssb chaperones, Hsp104, etc., as can be seen in Table S1. However, they were also detected in negative control experiments. This is not surprising, given that these chaperones are among the most common “contaminants” of affinity-based purification schemes (see the CRAPome database at https://reprint-apms.org/). It was for this reason we had to perform so many negative control experiments, which likely produced some false negative results, as some “real” interactions were likely discarded when the same chaperone showed up in our controls. We added a figure panel with a Venn diagram of overlap between experimental and control samples, and text pointing out this caveat of our approach.

      Second, in this experiment we attempted to identify proteins that transiently interact with a specific region of Cdc10 that will later become buried in a septin-septin oligomerization interface. Due to the transient nature of the interaction, we do not expect to detect high levels of crosslinked chaperones. Mass spectrometry is significantly more sensitive than immunoblotting, so there is no guarantee that we would be able to detect a band even if the crosslinking works as desired. Indeed, the crosslinked bands we saw by immunoblot for GroEL were quite faint (see Figure 2F), despite the fact that GroEL and the T7-promoter-driven Cdc10 were among the most abundant proteins in those E. coli cells.

      Third, there is no commercially available, verified antibody recognizing yeast Cct3 with which to perform the requested immunoblot experiment. Since both the N and C termini of CCT subunits project into the folding chamber, it is unwise to use a standard epitope tagging approach, as the tags may compromise function. Indeed, for purification purposes others inserted an affinity tag in an internal loop in Cct3 (PMID: 16762366). We have a yeast strain with Cct6 tagged in an analogous way, but to perform the requested immunoblot experiment with Cct3 would require creating or obtaining the Cct3-tagged strain, deleting NAM1/UPF1, and introducing our Bpa tRNA/synthetase and GST-6xHis-Cdc10 plasmids. Given the sensitivity of detection concerns stated above, we doubt this would help.

      In summary, we prefer not to attempt the requested immunoblot experiments.

      Reviewer #1: “-Fig. 3B ant related Figures: The experiment to see if GFP-tagged septin accumulates in the bud neck is important, but only the graphs after the analysis are shown. The authors should provide the readers with representative examples from imaging data.”

      We are confused, because the images at the bottom of Figure 3A already show what the Reviewer requests. As stated in the figure legend, these are representative examples of the imaging data from a middle timepoint of one of the experiments. It would be nearly impossible (for space reasons) to provide representative images for all of the timepoints for all of the genotypes for all of the experiments. Since in our new experiments we introduce new tagged septins (Cdc11-GFP and Cdc12-mCherry), we now also include representative images of cells expressing these proteins.

      Reviewer #2: “3) If the authors had evidence of chaperone interaction from their previous study, why did they not simply do IPs with fragments of the septins/chaperones?”

      We are unclear why the Reviewer is suggesting IPs after referring to our previous study. IPs are a poor choice for transient interactions, which is why we mostly avoided them in previous studies, and instead used a novel approach (BiFC) to “trap” chaperone–septin interactions. Moreover, we seek to identify chaperones that bind wild-type septins at future septin-septin interfaces on the path towards the native conformation. Fragments of septin proteins would likely misfold and would therefore likely attract chaperones that wouldn’t normally bind the full-length septin. Indeed, our previous studies demonstrated that even a single non-conservative amino acid substitution was sufficient to alter chaperone-septin binding. Thus IPs with fragments of septins or chaperones would be highly unlikely to yield informative results for the questions we seek to answer. We strongly prefer not to attempt these suggested experiments.

      Reviewer #2: “5) While differences between Ssa paralogs are highly interesting, using deletions of Ssas is not useful, given that yeast compensate by overexpressing other paralogs. The yeast GFP Septin assays should be repeated in yeast lacking all Ssas and expressing one paralog on a constitutive promoter (See numerous papers by Sharma and Masison).”

      We disagree that ssa deletions are “not useful”, since if the overexpressed paralogs cannot fulfill the same function as the deleted SSA, then we will see a phenotype, which we do. Furthermore, we had already obtained and thoroughly tested a strain like the ones mentioned by the reviewer (ECY487, a.k.a. JN516, from Betty Craig’s lab, with ssa2∆ ssa3∆ ssa4∆ and SSA1, which is constitutively expressed, PMID: 8754838), but we found that, as published, it divides slightly more slowly even under the most permissive of conditions. The requested strain cannot be analyzed using our method, because slow accumulation of ring fluorescence could be attributed to other defects unrelated to septin folding. Thus we strongly prefer not to attempt the suggested experiments.

      Reviewer #2: “7) The authors need to clarify the experiment with the Ydj1 D36N and Ssa2 R169H. In Reidy et al, they never fully biochemically test this system and it was never examined for Ssa2-Ydj1. The authors would need to do some fundamental experiments to demonstrate the validity and functionality of this double mutant in yeast.”

      Given that this experiment was unable to generate meaningful data, since the mutations affected the kinetics of induction of the GAL1/10 promoter, we do not think the requested biochemical experiments would add any value to the study. Instead, we removed these studies from the manuscript.

      Reviewer #3: “4. Figure 3B: The difference between wt and cct4-1 cells in appearance of septin ring fluorescence is observed at one timepoint. Since this experiment is considered highly relevant, the authors are asked to include another timepoint to bolster the conclusion that Cdc3-GFP folding and thus septin ring assembly is delayed in the CCT mutant.”

      We carried out new experiments with cct4-1 cells using Cdc12-mCherry and Cdc11-GFP with more timepoints than in our original cct4-1 experiments with Cdc3-GFP. Since these experiments provide the same kinds of results, but at multiple timepoints, we do not see the value in repeating the Cdc3-GFP experiment.

      Reviewer #3: “If Ssb1 functions to maintain Cdc12 in an assembly competent state preventing misfolding, one would expect either enhanced degradation or aggregation of Cdc12-mCherry in ssb1 mutant cells. Did the authors check for such scenario? Septin aggregation has been shown in a ssb1 ssb2 double deletion strain (Willmund et al., 2013), yet the data shown here predict that aggregation might already occur in single ssb1 mutants.”

      We already examined septin aggregation in single ssb1 mutants and showed these data (Supplementary Figure 4D). Indeed, this phenotype was the rationale for testing post-translational septin assembly in ssb1 single mutants. We have seen no evidence of septin degradation in any context (as we mentioned on line 889), so we would not expect it here. While we added new text and a very recent citation showing that many “misfolded” conformations of wild-type E. coli proteins avoid aggregation and degradation, we do not think that the suggested experiments would add enough value to the current study to justify the effort, time and expense.

      Reviewer #3: “Fig. 3C: The figure showing septin ring fluorescence does not include error bars. This is crucial, also because the difference between wt and ssa4 mutant cells is not large.”

      There are, in fact, error bars included in the figure, as can be most clearly seen for the final timepoint for the ssa4∆ cells. For most of the other timepoints the error bars are smaller than the data point symbols (the circles and squares). We do not think that adjusting the size or opacity of the symbols to better show the error bars will be sufficiently valuable to justify the effort.

    1. At the same time, like Harold, I’ve realised that it is important to do things, to keep blogging and writing in this space. Not because of its sheer brilliance, but because most of it will be crap, and brilliance will only occur once in a while. You need to produce lots of stuff to increase the likelihood of hitting on something worthwhile. Of course that very much feeds the imposter cycle, but it’s the only way. Getting back into a more intensive blogging habit 18 months ago, has helped me explore more and better. Because most of what I blog here isn’t very meaningful, but needs to be gotten out of the way, or helps build towards, scaffolding towards something with more meaning.

      Many people treat their blogging practice as an experimental thought space. They try out new ideas, explore a small space, attempt to come to understanding, connect new ideas to their existing ideas.


      Ton Zylstra coins/uses the phrase "metablogging" to think about his blogging practice as an evolving thought space.


      How can we better distill down these sorts of longer ideas and use them to create more collisions between ideas to create new and innovative ideas? What forms might this take?

      The personal zettelkasten is a more concentrated form of this and blogging is certainly within the space as are the somewhat more nascent digital gardens. What would some intermediary "idea crucible" between these forms look like in public, with a simple but compelling interface? How much storytelling and contextualization is needed or not needed to make such points?

      Is there a better space for progressive summarization here so that an idea can be more fully laid out and explored? Then once the actual structure is built, the scaffolding can be pulled down and only the idea remains.

      Reminiscences of scaffolding can be helpful for creating context.

      Consider the pyramids of Giza and the need to reverse engineer how they were built. Once the scaffolding has been taken down and history forgets the methods, it's not always obvious what the original context for objects was, how they were made, or what they were used for. Progressive summarization may potentially fall prey to these effects as well.

      How might we create a "contextual medium" which is more permanently attached to ideas or objects to help prevent context collapse?

      How would this be applied in reverse to better understand sites like Stonehenge or the hundreds of other stone circles, wood circles, and standing stones we see throughout history?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General response to the reviewer

      We thank all reviewers for their constructive comments on our manuscript. We were very pleased to see that the reviewers found our study ‘…represent new insight in the field’ (rev#1) and ‘…contains important and exciting novel findings’ (rev#2), and ‘…gives a more detailed perspective on how Src proteins (Src42A in Drosophila) control epithelial stability and the contraction of specific surfaces of epithelial cells’ (rev#3). The reviewers raised a number of specific points, some of which we have already addressed in a preliminary revision of the manuscript. Other points will require additional experiments, which we will incorporate in a fully revised version of the manuscript.

      Reviewer #1

      (Evidence, reproducibility and clarity (Required)): Highest priority: 1) The Src42A knockdown and germline clone experiments both cause defects in cellularization (Fig. 2B and 9A), which could result in differences in the state of the blastoderm epithelium (cell size, cell number, structural integrity, organization, etc.) between the experimental and control conditions. In addition, Src42A knockdown appears to affect the size and shape of the egg (Fig. 9A and 9C). The manuscript would be strengthened if the authors included data to demonstrate that the initial structure of the epithelium is mostly normal (quantifications of cell size, number, etc.) in the Src42A RNAi condition, as this would bolster the argument that germband extension, rather than due to indirect effects resulting from the cellularization defects. The authors may have relevant data to do this on-hand, for example using data associated with figures 1, 3, 6, and 9.

      Response:

      The cellularization phenotype of src42A knockdown embryos has a penetrance of about 50% and exhibits variable expressivity. We attempted to characterize this phenotype in detail, but failed to identify any dramatic differences in cellularization of the src42A knockdown embryos compared to wild type. The localization of E-cadherin, in turn, is not affected, but occasionally nuclei drop out of the blastoderm before cellularization is completed. This can result in patches of irregular cellularization, but the blastoderm epithelium in stage 6 embryos did not display major defects in overall structure. We will present additional data on the cellularization phenotypes in the fully revised manuscript. As the referee suggested, we will analyze our data to determine potential effects on the cell size, cell number and overall organization of the blastoderm before germband extension. We plan to present these data as an additional Suppl. Mat. Figure in the full revision.

      Lower priority:

      5) Figure 8 - in my opinion, using a FRAP or photoconversion approach would be a more convincing demonstration of differences in E-cadherin residency times / turnover rate than time-lapse imaging of E-cadherin:GFP alone. Authors should decide whether this improvement is worth the investment.

      Response:

      We thank the reviewer for this comment. While we believe that the data presented in Fig. 8 demonstrate a significant difference in the E-cadherin residence time based on E-cadherin-GFP fluorescence intensity, we agree with the referee that FRAP analyses would provide additional evidence to support our conclusion. For the full revision, we will therefore attempt to perform FRAP experiments on src42A knockdown embryos expressing E-cadherin-GFP and compare the recovery time to the wild type.

      Reviewer #1 (Significance (Required)):

      The manuscript by Backer et al. examines the function of Src42A in germband extension during Drosophila gastrulation. Prior studies in the field have shown that Src family kinases play an important role in the early embryo, including cellularization (Thomas and Wieschaus 2004), anterior midgut differentiation (Desprat et al. 2008), and germband extension (Sun et al. 2017; Tamada et al. 2021). In this study, the authors showed that Src42A was enriched at adherens junctions and was moderately enriched along junctions with myosin-II. They then showed that maternal Src42A depletion exhibits phenotypes, starting with cellularization and including a defect in germband extension. The authors focus on defects in germband extension and found that Src42A was required for timely rearrangement of junctions and that the Src42A RNAi phenotype is enhanced by Abl RNAi. Finally the authors show that E-cadherin turnover is affect by Src42A depletion.

      Overall, this study provided a higher resolution description of how Src42A regulates the behavior of junctions during germband extension. I thought the authors conclusions were well supported by the data and represent new insight in the field.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Chandran et al. investigate the role of Src42A in axis elongation during Drosophila gastrulation. Using maternal RNAi and CRISPR/Cas9-induced germline mosaics, they revealed that Src42A is required to contract junctions at anterior/posterior cell interfaces during cell intercalations. Using time-lapse imaging and image analysis, they further revealed the role of Src42A in E-Cad dynamics at cell junctions during this process.

      By analyzing double knockdown embryos for Src42A and Abl, they further showed that Src42A might act in parallel to Abl kinase in regulating cell intercalations. The authors proposed that Src42A is involved in two processes, one affecting tension generated by myosin II and the other acting as a signaling factor at tricellular junctions in controlling E-Cad residence time. Overall, the data are clear and nicely quantified. However, some data do not convincingly support the conclusion, and statistical analyses are missing for an experiment or two. Methods for several quantifications also need improvement in writing. Also, several figures (Figures 6-8) do not match the citation in the text and need to be corrected.

      Page and line numbers were not indicated in the manuscript. For my comments, I numbered pages starting from the title page (Title, page 1; Abstract, page 2, Introduction, pages 3-6; Results, pages 7-14; Discussion, pages 15-18; M&M, 19-23; Figure legends, 28-30) and restarted line numbers for each page. For Figures 6-8 that do not match the citation in the text, I still managed to look at the potentially right panels. All the figure numbers I mention here are as cited in the text. My detailed comments are listed below.

      Response:

      We apologize for the lack of organization of the manuscript and the figure numbering. In the revised version we have added page and line numbers and corrected the figure numbers.

      Major comments: 1. b-Cat/E-Cad signals at the D/V and A/P junctions in Src42Ai (Figs. 5-6). These data are critical for their major conclusion and should be demonstrated more convincingly.

      In Fig. 5A, the authors said, "When the AP border was cut, the detached tAJs moved slower in Src42Ai embryos compared to control (Fig. 5A)". However, even control tAJs do not seem to move that much in the top panels, and I found the images not very convincing.

      Response:

      We thank the referee for commenting on the lack of clarity in the presentation of the data. The overall movement within the first 10 seconds after the laser cut (determined by the movement of adjacent D/V tAJs away from each other) was about 2 µm in the wild type, while in the mutant it was 1 µm. Despite this 50% difference, it may be difficult to appreciate from looking at Fig. 5A in our original submission. The yellow lines in Fig. 5A only showed the region of the cut, but did not indicate the movement of the tAJs away from each other, which may have distracted from the actual movement. We will change the annotation and the marks within the figure to visualize the movement much more clearly in the full revision. In the fully revised manuscript, we will also add movies from the experiments, including marks of the tricellular junctions to follow the displacement, as part of the Supplemental Material.
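      As a rough worked check of the numbers quoted in this response (about 2 µm versus 1 µm of separation in the first 10 seconds after the cut), the relative difference and the mean separation speeds can be written out as follows. The symbols d_wt, d_src, v_wt and v_src are labels introduced here for illustration and do not appear in the manuscript; the speeds assume the displacement is simply averaged over the full 10-second window.

```latex
\[
\frac{d_{\mathrm{wt}} - d_{\mathrm{src}}}{d_{\mathrm{wt}}}
  = \frac{2\,\mu\mathrm{m} - 1\,\mu\mathrm{m}}{2\,\mu\mathrm{m}} = 0.5 = 50\%,
\qquad
\bar{v}_{\mathrm{wt}} = \frac{2\,\mu\mathrm{m}}{10\,\mathrm{s}} = 0.2\,\mu\mathrm{m/s},
\quad
\bar{v}_{\mathrm{src}} = \frac{1\,\mu\mathrm{m}}{10\,\mathrm{s}} = 0.1\,\mu\mathrm{m/s}.
\]
```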

      Based on the genetic interaction between Src42A and Abl using RNAi (Fig. 7), the authors argue that Src42A and Abl may act in parallel. However, the efficiency of Abl RNAi has not been tested. It can be done by RT-PCR or Abl antibody staining. Also, the effect of Abl RNAi alone on germband extension should be tested and compared with Src42A & Abl double RNAi embryos. I expect the experiments can be done within a few weeks without difficulty.

      Response:

      We agree with the referee that it is important to determine the level of depletion in Abl RNAi embryos in order to interpret the genetic relationship between Abl and Src42A. In the full revision of the manuscript, we will follow the advice of the referee and analyze the knockdown, preferably by antibody labeling with an anti-Abl antibody. We will also generate single knockdowns of abl in embryos and determine their effect on germband extension compared to wild type and src42A/abl double knockdowns.

      Minor comments:

      Fig. 2 - Fig. 2B: Higher magnification images of the defective cytoplasm can be shown as insets.

      Response:

      We will add some higher magnification images of the cellularization phenotype in the full revision of the manuscript. In addition, as mentioned in the response to reviewer #1, we will provide a more detailed analysis of the cellularization in src42Ai embryos in the fully revised manuscript.

      • Fig. 2E: A simple quantification of the penetrance of cuticle defects in Src42A mutants and RNAi will be helpful, as shown in Fig. S3.

      Response:

      In the full revision, we will add the quantification of the occurrence of the different classes of cuticle phenotypes.

      Fig. 9 - Fig. 9A: Magnified views of the cytoplasmic clearing can be added as insets.

      Response: As described in our response to the comments made by referee #1, we will add a more detailed analysis of the cellularization phenotype in the full revision.

      Page 14, lines 9-10: More explicit description of the phenotype rather than just "stronger compared to Src42Ai" will be helpful.

      Response:

      In the full revision, we will add a more detailed description of the phenotype and re-analyze and present data on the hatching rate, stage of lethality and cuticle phenotypes.

      Reviewer #2 (Significance (Required)): This work revealed the role of Src42A in regulating germband extension. A previous study suggested the roles of Src42A and Src64 in this developmental process using a partial loss of both proteins (Tamada et al., 2021). Using different approaches, the authors demonstrated a role of Src42A in regulating E-Cad dynamic at cell junctions during Drosophila axis elongation. Most of the analyses were done with maternal knockdown using RNAi, but they successfully generated germline clones for the first time and confirmed the RNAi phenotypes. Overall, this work contains important and exciting novel findings. This work will be of general interest to cell and developmental biologists, particularly researchers studying epithelial morphogenesis and junctional dynamics. I have expertise in Drosophila genetics, epithelial morphogenesis, imaging, and quantitative image analysis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Chandran et al. report on the function of Src42A during cell intercalation in the early Drosophila gastrula. They create a Src42A-specific antibody (there are two Src genes in the fly genome) and examine the localization of Src42A and observe a planar-polarized distribution at cell interfaces. They then measure cell-contractile dynamics and show that T1 contraction is slower after Src42A disruption. The authors then argue that Src42A functions in a parallel pathway to the Abl protein, and that E-cadherin dynamics (turnover) is altered in Src42A disrupted embryos. Src function at these stages has been studied previously (though not to the degree that this study does), and in some respects the manuscript feels a little preliminary (please label figures with figure number!), but after editing this should be a polished study that merits publication in a developmentally-focused journal.

      1) Does the argument that Src42A has two functions fully make sense? Myosin II function is known to affect E-cadherin stability (and vice versa), so it seems that Src42A could affect both MyoII and Ecad by either decreasing Myosin II function/engagement at junctions or by destabilizing Ecad.

      Response:

      We thank the referee for raising an important point that we may not have discussed appropriately in our initial submission. We agree that the reciprocal relationship between actomyosin and E-cadherin might not be reflected unequivocally in our manuscript. As the referee points out, Src42A could affect both MyoII planar localization and E-cadherin dynamics through the same pathway. Previous studies showed that Src is involved in translating the planar polarized distribution of the Toll-2 receptor by recruiting PI3-kinase activity to the Toll-2 receptor complex, resulting in a planar polarized distribution of MyoII at the A/P interfaces. These data, however, do not address the possibility that a well-known Src target, the E-cadherin/ß-catenin complex, which is extensively remodeled in germband extension, contributes to the delay in germband extension. The observed defects in both studies can be attributed to both abnormal planar polarization of MyoII and abnormal dynamics of the E-cadherin/ß-catenin complex. In either of these cases, we suggest that Src42A phosphorylates distinct substrates: the Toll-2 intracellular domain in the MyoII planar polarity pathway and the E-cad/ß-cat complex controlling E-cad dynamics. Given the relationship between MyoII and E-cadherin, however, it is not possible to decide whether these two effects are independent functions of Src42A or are consequences of each other. Since we cannot resolve a possible epistatic relationship between these two potential activities of Src42A, we decided to extend the discussion on this topic by taking both possible scenarios into account and discussing them appropriately. We will add this discussion in the full revision of the manuscript.

      2) One obvious question that arises is the nature of cleavage defects that are mentioned that happen previously to intercalation. For example, is E-cad normal prior to intercalation initiating? How specific are the observed defects to GBE?

      Response:

      Please see our response to referee #1.

      3) Pg. 10, "the shrinking junction along the AP axis strongly reduces its length with an average of 1.25 minute" - what is this measurement? How much is "strongly"?

      Response:

      We thank the referee for pointing out our inappropriate qualitative statement of the experimental data, which was indeed misleading. The measurement of the shrinking junction was based upon the time it takes for the AP interface junction between two adjacent vertices on the DV axis to shrink into a single 4-cell vertex. The time for this contraction was on average 1 minute 25 seconds. The data in Fig.4 A’,C show that after 2 minutes in the control embryo 100% of the observed AP junctions have collapsed and the extension of the new DV junction along AP axis has begun. At the same timepoint of 2 minutes in the src42A knockdown, we show in Fig. 4B’,D that the shrinking of the AP junction interface has still not been completed in 60% of the cases.

      In the full revision, we will remove the qualitative statement and replace it with a correct description of the measurements taken and will refer to the data described in Fig. 4 A-D.

      4) Also pg. 10, "the AP junction was not markedly reduced after 1 minute" - what is the criteria for this statement? X%? 1 minute is very specific, it feels like how much of a reduction/non-reduction should also be specific.

      Response:

      Please see our response to point 3.

      Reviewer #3 (Significance (Required)):

      This study gives a more detailed perspective on how Src proteins (Src42A in Drosophila) control epithelial stability and the contraction of specific surfaces of epithelial cells.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewers #2 and #3 noted that the manuscript was somewhat disorganized, as it lacked page, line and figure numbers. We also noted that the figures were not presented in the correct order during the submission process. In the preliminary revision of the manuscript, we fixed these problems to facilitate the evaluation of our transferred manuscript by editorial boards.

      In addition, we also addressed issues that the referees mentioned by editing the text according to their comments. We also addressed problems regarding the presentation of the figures and statistical analyses of the data. The following changes were made:

      1. We added page numbers and line numbers.
      2. We added figure numbers to the figure panels.
      3. We corrected ordering of figures in the transferred manuscript.
      4. We addressed the following comments by statistical analyses, editing the text and the figures:

        Regarding comments from Reviewer #1:

      Highest Priority:

      2) There is a discrepancy in the staging of embryos used between some of the analyses, which make it hard to interpret some of the data. For example, characterization of the knockdowns in Fig. 1A and B are based on stages 10 and 15, whereas the majority of the paper is focused on earlier stages 6 - 8 during germband extension (e.g., Fig. 1D). The analysis for Fig. 1B would be more meaningful if it was done on the same stages used for subsequent phenotypic analysis so they can be directly compared.

      Response:

      We thank the referee for pointing out an apparent misunderstanding caused by the description of Fig. 1A,B. The data presented in Fig. 1A and 1B do not show RNAi knockdown experiments, but show a comparison between embryos that are heterozygous or homozygous for the loss-of-function allele src42A[26-1]. These data were intended to demonstrate that zygotic mutants still maintain levels of maternal Src42A protein up until late stages of development. Data for embryos at an earlier stage (stage 5) were shown in Supplementary Fig. S1E, where no difference in protein levels of Src42A can be observed between heterozygous and homozygous zygotic src42A[26-1] embryos.

      At the beginning of the results sections 1 and 2 of the preliminary revised manuscript, we added a sentence to address the referee’s concern, stating that earlier stages exhibit no difference in protein levels and referring to Fig. S1E. We also spelled out more explicitly that the experiment (referring to Fig. 1A,B and S1) was intended to look at zygotic mutants and to demonstrate that our novel Src42A antibody was able to detect the reduction of maternal Src42A protein in mid- to late-stage homozygous zygotic embryos.

      3) There is incongruence between figures in terms of which junctional pools (bAJs vs. tAJs) of beta-catenin and E-cadherin are quantified that makes it difficult to draw comparisons between analyses. For example, pTyr levels are examined for both bAJs and tAJs in Figure 3, however, only tAJs are considered in Fig. 8. Similarly, in some cases planar cell polarity is considered (e.g., comparison of levels at AP vs DV bAJs in Fig. 6 and 9), and in other cases (e.g. Fig. 8) it is not.

      Response:

      We thank the referee for commenting on the different readouts for different pools of cell junctions in our experiments. In our study, we considered the effects of src42A RNAi knockdown on both bAJs and tAJs. We decided to present the data for bAJs and tAJs in separate figures for clarity and structure. For example, the data for the effect of src42A knockdown on the planar polarized distribution of E-cadherin at bAJs were presented in Fig. 6, while the effect on E-cadherin residence time at tAJs was presented in Fig. 8. The analysis of pTyr levels considered both pools in order to determine whether src42A knockdown leads to an overall reduction of pTyr levels or to a reduction in a specific junctional pool. From our data we conclude that pTyr levels show a similar reduction in both the bAJs and the tAJs.

      In order to address the reviewer’s comment, we have linked the figures more stringently with the results text of the preliminary revision. We only referred to the reduction in pTyr levels in Fig. 3 to point out that both junctional pools are affected by reduced pTyr in src42Ai embryos. Furthermore, we referred to the individual figure panels when addressing junctional pools and explained in detail the rationale for focusing on particular pools (bAJs or tAJs) in the experiments. For Fig. 6, we point out in the preliminary revised manuscript that we focus the analyses on the known planar polarized distribution of beta-catenin and E-cadherin.

      Lower priority: 1) Introduction, 2nd paragraph - The modes of cell behaviors described to drive cell intercalation leaves out another clear example in the literature - Sun et al., 2017 - which describes a basolateral cell protrusion-based mechanism. While the authors cite this paper later, leaving it out when summarizing the state of the field misrepresents the current knowledge of the range of mechanisms responsible.

      Response:

      We thank the referee for this remark. In the preliminary revision, we have added to the introduction that the cell behaviors associated with germband elongation include apical and basolateral rearrangements of the cells indicating that basolateral protrusions also contribute to the set of mechanisms that drive germ band elongation.

      2) 'defective cytoplasm' - this term is confusing, and could perhaps be replaced with 'cellularization defect', or something similar.

      Response:

      We agree that the term we applied for the cellularization defect may be misleading. The observation we intended to describe with this term was a defect in the cytoplasmic clearing that occurs in the last syncytial division and at the beginning of the cell formation process. We changed the description of this observation accordingly and now refer to the defect in the preliminary revised manuscript as ‘cytoplasmic clearing defect’.

      3) Tests of statistical significance are not uniformly applied across the figures. For instance, Figures 3G + H indicate statistical significance, but Fig. 3D + E do not. Performing statistical tests throughout the paper, or clearly articulating a rationale when they are not used, would strengthen the manuscript. Specifically, the authors should consider this for Fig. 3D + E, and Fig. 7D + E, to support their arguments that rates of germband extension are different between conditions.

      Response:

      We agree with the reviewer and have provided statistical analysis for the data displayed in Fig. 3D,E and Fig. 7D,E in the preliminary revision of the manuscript.

      4) Page 12 - "We found that Src42A showed a distinct localization at the tAJs (Fig. 1B)": Figure 1B shows a quantification of levels at bAJs, not tAJs.

      Response:

      In the preliminary version of the revised manuscript, we added a quantification of the localization of Src42A at the tAJs as part of Suppl. Fig. S4. In Fig. S4A-C we show that Src42A is enriched at the tAJs in comparison to the bAJs.

      Regarding comments from reviewer #2:

      Major Comments:

      In Fig. 6A, b-Cat signals look fuzzier and dispersed and have more background signals in the control, compared to the Src42Ai background. Also, b-Cat signals in the control image do not seem to show enrichment at the D/V border, as shown in Tamada et al., 2012.

      Response:

      We agree with the referee that the image in Fig. 6A for the control is fuzzier and looks dispersed. This is due to the fixation method that we used. In this experiment we did not apply heat fixation, but used formaldehyde fixation, in which b-catenin protein is maintained in the cytoplasm in addition to the junctional pool, creating the fuzzy cytoplasmic staining. We chose to do this in order to be able to co-immunolabel the embryos with b-catenin and E-cadherin antibodies; the latter staining does not work with the heat fixation applied in the Tamada et al. 2012 paper. Despite the slightly lower quality of the staining, the quantification of the data does show an enrichment and clearly indicates an effect of src42A knockdown on the planar polarized distribution of the E-cad/b-cat complex. In the preliminary revision, we added a note to the figure legend to indicate that the fixation procedure was not optimized for b-catenin junctional staining. In the preliminary revision, we also added a quantification of live imaging data recording E-cadherin-GFP in wild-type and src42Ai embryos. We present these additional data in Fig. S5 in the preliminary revision of the manuscript. These data are consistent with the results in Fig. 6 from the immunolabeling and support our conclusion that the E-cadherin AP/DV ratio is increased in Src42A knockdown embryos.

      In Fig. 6B, C, it is not clear how the intensity was measured and how normalization was done. Was the same method used for these quantifications as "Protein levels at bicellular and tricellular AJs" on pages 21-22? Methods should be written more explicitly with enough details.

      Response:

      We thank the referee for pointing out the lack of detail in explaining how the quantification was done. In the preliminary revision of the manuscript, we extended the paragraph entitled ‘Protein levels at bicellular and tricellular junctions’ in the methods section to serve this purpose. It now describes the methods that were applied for each quantification and how the data were normalized.

      Does each sample (experimental repeat) for the D/V border in Fig. 6B match the one right below for the A/P border in Fig. 6C? It should be clearly mentioned in the figure legend. The ratio of the DV intensity to AP intensity will better show the compromised planar polarity of the b-Cat/E-Cad complex.

      Response:

      We thank the reviewer for pointing out a lack of clarity in our presentation. The experimental repeats for each measurement do indeed match, i.e. the measurement of the DV border matches the same adjacent 4-cell pair in the same embryo, and in total 5 distinct embryos were analyzed for each experiment. In the preliminary revision of the manuscript, we explain this detail of the experimental design in the figure legend. In the preliminary revision, we also determined the ratios of DV/AP cell interfaces for b-Cat and E-Cad and added this quantification as panels 6C and 6E for a clearer presentation of the data.
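
      The paired DV/AP comparison described in this response can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the analysis pipeline used for the manuscript: the function name `dv_ap_ratios` is hypothetical, and the intensity values are placeholder numbers rather than data from the study. The one point it encodes from the text is that each DV measurement is paired with the AP measurement from the same embryo before the ratio is taken.

```python
from statistics import mean


def dv_ap_ratios(dv_intensities, ap_intensities):
    """Return one DV/AP intensity ratio per paired measurement.

    Element i of each list is assumed to come from the same embryo
    (hypothetical helper, not taken from the manuscript).
    """
    if len(dv_intensities) != len(ap_intensities):
        raise ValueError("DV and AP measurements must be paired one-to-one")
    if any(ap <= 0 for ap in ap_intensities):
        raise ValueError("AP intensities must be positive")
    return [dv / ap for dv, ap in zip(dv_intensities, ap_intensities)]


# Placeholder normalized intensities for 5 embryos (NOT data from the study).
example_dv = [1.5, 1.2, 1.8, 1.5, 1.0]
example_ap = [1.0, 1.0, 1.2, 1.0, 0.8]

ratios = dv_ap_ratios(example_dv, example_ap)
mean_ratio = mean(ratios)
print(f"per-embryo DV/AP ratios: {ratios}")
print(f"mean DV/AP ratio: {mean_ratio:.2f}")
```

      Computing the ratio per embryo, rather than dividing the two group means, keeps the pairing intact, so embryo-to-embryo differences in overall staining intensity cancel within each pair.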

      Minor notes: Page 4, missing comma after "For example"

      Response: The text was edited accordingly.

      Page 4, "inevitable" does not make sense in this context

      Response: We eliminated ‘inevitable’ and replaced it with ‘critical’ to better indicate the importance of Canoe protein for germband elongation.

      Page 7, lines 6-7 - The localization of Src42A in control should be described in more detail and more clearly here.

      Response: In the preliminary revised manuscript, we described the distribution of Src42A in more detail, pointing out its dynamics and differential distribution at distinct plasma membrane domains.

      Supplemental Fig S1 - Fig. S1D: Based on the head structure and the segmental grooves, the embryo shown here is close to late stage 13/early stage 14, not stage 15. - Fig S1E: It will be helpful if the predicted protein band and non-specific bands are indicated by arrows/arrowheads in the figure.

      Response:

      We thank the referee for their careful observation of the embryonic stage. We agree that the embryo was actually at a younger stage. In the preliminary revision, we replaced the images with an example of an older stage. We will also add arrows to clearly mark the specific protein bands in Fig. S1E.

      Page 7, lines 21-22 - "Src42A was slightly enriched at the AP interface" - To argue that, quantification should be provided.

      Response:

      We thank the referee for pointing out a qualitative statement that we made with regard to the distribution of Src42A at the AP cell interfaces. In the preliminary revision of the manuscript, we present an additional quantification of the imaging data of Src42A immunolabeling. In Figure S4A-C, we now present a quantification of the enrichment of Src42A at the tricellular junctions. In addition, the new Fig. S4D,E shows a quantification of the planar polarized distribution of Src42A at the AP cell interfaces.

      Figure 1 - Fig. 1B: Src42A levels should be compared between control (Src42A/+) and Src42A/Src42A for each stage. It currently shows a comparison between Src42A/Src42A of stages 10 and 15.

      Response:

      We thank the referee for the comment. As indicated in our response to referee #1, the point of this analysis was (1) to provide evidence for the specificity of our new anti-Src42A antibody and (2) to demonstrate the presence of a substantial maternal contribution of Src42A protein in zygotic mutants. We do not see the advantage of providing a detailed developmental Western blot analysis, but we provide data in Suppl. Fig. S1E showing that the level of Src42A is unimpaired in stage 6 zygotic src42A[26-1] homozygous mutant embryos.

      • Fig. 1B: The figure legend says, "dotted line represents mean value and error bars," but there are no dotted lines shown in the figure. Also, what p-value is for ****? It should be mentioned in the figure legend. It also says Src42A levels were normalized against E-Cad intensity here (stages 10 and 15). They have shown that E-Cad levels are affected in Src42A RNAi during gastrulation (Fig. 6). Is E-Cad not affected in Src42A26-1 zygotic mutants at stages 10 and 15?

      Response:

      We thank the referee for pointing out inaccuracies in the presentation and the description of Fig. 1B. In the preliminary revision, we emphasized the marks on the graph and provided p-values throughout. Regarding the E-cadherin levels: E-cadherin levels were altered in src42A RNAi knockdown embryos, but not in zygotic mutants, even at later developmental stages.

      Page 8, line 14 - "Embryos expressing TRiP04138 showed reduced hatching rates with variable penetrance and expressivity depending on the maternal Gal4 driver used (Fig. 2B)" - Fig. 2B doesn't seem to be a right citation for this sentence.

      Response:

      We agree with the referee. In the preliminary revised manuscript, we corrected the reference so that it points to Figure 2A’, which does show the relationship of hatching rate to the various maternal Gal4 drivers.

      • Fig. 2C: It will be helpful to indicate two other non-specific bands in the figure with arrows/arrowheads with a description in the figure legend.

      Response:

      In the preliminary revision, we added an arrow to mark the band specific for Src42A and asterisks to mark non-specific bands in Fig. 2C.

      Page 9, line 9 - This is the first time that the fast and the slow phases of germband extension are mentioned. As these two phases are used to compare the Src42A and Src42A Abl double RNAi phenotypes, they should be introduced and explained better earlier, perhaps in Introduction.

      Response:

      We thank the referee for pointing out that the two phases of germband extension were not introduced. We added a sentence to introduce and define the distinct phases of extension movements in the preliminary revision.

      Fig. 3 - Fig. 3A: It will be helpful to mark the starting and the ending points of germband elongation with different markers (arrows vs. arrowheads or filled vs. empty arrowheads).

      Response:

      In the preliminary revision, we added distinct markers to indicate the start and endpoints of germband elongation to make this figure easier to read.

      • Fig. 3C figure legend: R2 is wrongly mentioned in Fig. 3D, E. Also, R2 (coefficient of determination) needs to be defined either in the figure legend or Materials & Methods.

      Response:

      We thank the referee for pointing out this misleading reference. In the preliminary revision, we corrected the reference to R2 in Fig. 3D,E and will define R2 in the figure legend.

      • Fig. 3D, E: statistical analysis is missing.

      Response:

      In the preliminary revision, we included a statistical analysis of the data (see ref #1). We changed the figure to indicate the data sets that were analyzed and added the p-values to the figure legend.

      • Fig. 3G and H should be cited in the text.

      Response:

      In the preliminary revision, we added references to Fig. 3G,H in the results section alongside the reference to Fig. 3F.

      • Fig. 3F: It should be mentioned that the heat map is shown for pY20 signals in the figure legend, with an intensity scale bar in the figure.

      Response:

      In the preliminary revision, we added an intensity scale bar to the figure panel and mentioned the relationship to the PY20 signal.

      Fig. 7A: Arrows can be added to mark the delayed germband extension.

      Response:

      In the preliminary revision, we added arrows to mark the anterior and posterior extent of the germband.

      Fig. 8A: It should be mentioned that the heat map is shown for E-Cad signals in the figure legend, with an intensity scale bar in the figure.

      Response:

      In the preliminary revision, we added an intensity scale to the heat map and mention the relationship to the E-cadherin signal in the figure legend.

      Fig. S3G: An arrowhead can be added to the gel image to indicate the band described in the legend.

      Response:

      In the preliminary revision, we added an arrow to help annotate the Src42A-specific bands on the Western blot.

      • Fig. 9B: Arrow/arrowheads can be added to show the absence of the signals in the nurse cells.

      Response:

      In the preliminary revision, we added markers to help recognize the reduced signal in the nurse cells and the oocyte.

      • Fig. 9C: Indicate the ending point of the germband extension by arrows.

      Response: In the preliminary revision, we added arrows to mark the anterior and posterior extent of the germband.

      Regarding comments from reviewer #3:

      Minor notes: Page 4, missing comma after "For example"

      Response: The text was edited accordingly.

      Page 4, "inevitable" does not make sense in this context

      Response:

      In the preliminary revision, we eliminated ‘inevitable’ and replaced it with ‘critical’ to better indicate the importance of Canoe protein for germband elongation.

      Description of analyses that authors prefer not to carry out

      Referee #1, point 2 and Referee #2, minor comment on Figure 1. Both referees suggest that Fig. 1A,B should include earlier developmental stages, matching the stages examined in the RNAi knockdown experiments.

      Response:

      The referees’ comments are likely based on a misunderstanding. The data that the reviewers are referring to present analyses of the zygotic phenotype of embryos homozygous for the src42A[26-1] loss-of-function allele. They are not related to the maternal RNAi knockdown experiments, but were meant to demonstrate the existence and extent of a maternal pool of Src42A protein that persists even to late stages in development. The maternal knockdown mutants are analyzed in detail at the appropriate stages in Fig. 2.

      As described in our response above, we don’t feel that a detailed developmental stage Western analysis of wildtype and src42A[26-1] embryos would provide significant additional insights. As also mentioned above, data for an earlier developmental stage (before germband elongation), as requested by the referees, were provided in Suppl. Fig. S1E.

      Referee #1 Point 6) Figure 8E - showing images of multiple tAJs, rather than z-slices of a single vertex, would better support the claim here, as the assertion is that Src42a levels are different between control and sdk RNAi conditions, and not that it varies in the z-dimension.

      Response:

      The image series of Fig. 8E shows one representative example of multiple tAJs that have been imaged for this experiment (n=6 for wild type and n=10 for sdk RNAi). We think that the presentation of Z-slices for this experiment is important, as the protein distribution needs to be considered for a larger area along the apical-lateral cell interface. In addition, the quantification of the data for multiple tAJs was presented in Fig. 8F,G as a graph. We would therefore rather not change this figure in the revised manuscript.

      Referee #3 suggests that anti MyoII staining should accompany the analysis of tension measurements in the germband.

      As this analysis has already been performed by Tamada et al. 2021, we decided not to reproduce these data, but rather extend the analysis towards tension measurements, which support the findings by Tamada et al. 2021 on a functional level. We do not see the added value of adding MyoII labeling.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2021-01016

      Corresponding author(s): Dennis Klug

      1. General Statements [optional]

      Dear editor, dear reviewers,

      Thank you very much for the quick review of our manuscript as well as for the constructive criticism and the interesting discussion of our results. Reading the comments, we realized that we may have put too much emphasis on the in vivo microscopy of sporozoites and their interaction with the salivary gland. We believe that the generated mosquito lines can be used to address different scientific questions, the in vivo microscopy of host-pathogen interactions being only one of them. Because of this imbalance, and to address some of the reviewers' comments, we have partially rewritten the manuscript (particularly the introduction). At the same time, we have included additional data on the inducibility of the promoters used, as well as on the functionality of hGrx1-roGFP2 in the salivary glands. Furthermore, we created an additional figure to better present the expression patterns of the trio and saglin promoters within the median lobe, and we expanded the section on in vivo microscopy of sporozoites. We hope that these results further highlight the significance of our study. Accordingly, we have also changed the title of the manuscript to “A toolbox of engineered mosquito lines to study salivary gland biology and malaria transmission” to indicate the broad applicability of the generated mosquito lines, and we have included an additional co-author, Raquel Mela-Lopez, who conducted the redox analysis. We hope that these changes will adequately answer the questions of the reviewers and address any concerns they may have had. We look forward to hearing from you.

      With our kind regards,

      Dennis Klug

      Katharina Arnold

      Raquel Mela-Lopez

      Eric Marois

      Stéphanie Blandin

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      This manuscript reports the generation and characterization of transgenic lines in the African malaria mosquito Anopheles coluzzii that express fluorescent proteins in the salivary glands, and their potential use for in vivo imaging of Plasmodium sporozoites. The authors tested three salivary gland-specific promoters from the genes encoding anopheline antiplatelet protein (AAPP), the triple functional domain protein (TRIO) and saglin (SAG), to drive expression of DsRed and roGFP2 fluorescent reporters. The authors also generated a SAG knockout line where SAG open reading frame was replaced by GFP. The reporter expression pattern revealed lobe-specific activity of the promoters within the salivary glands, restricted either to the distal lobes (aapp) or the middle lobe (trio and sag). One of the lines, expressing hGrx1-roGFP2 under control of aapp promoter, displayed abnormal morphology of the salivary glands, while other lines looked normal. The data show that expression of fluorescent reporters does not impair Plasmodium berghei development in the mosquito, with oocyst densities and salivary gland sporozoite numbers not different from wild type mosquitoes. Salivary gland reporter lines were crossed with a pigmentation deficient yellow(-) mosquito line to provide proof of concept of in vivo imaging of GFP-expressing P. berghei sporozoites in live infected mosquitoes.

      **Major comments**

      Overall the manuscript is very well written with a clear narrative. The data are very well presented. The generation of the transgenic mosquito lines is elegant and state-of-the art, and the new reporter lines are thoroughly characterized.

      This is a nice piece of work that is suitable for publication, although the in vivo imaging of sporozoites is somewhat preliminary and would benefit from additional experiments to increase the study impact.

      We would like to thank the reviewer for his/her appreciation of our manuscript. In the revised version, we have included additional experiments on in vivo imaging of sporozoites, which allowed us to quantify moving and non-moving sporozoites imaged under the cuticle of live mosquitoes. Although this is still a proof of concept, we believe that these new data provide interesting insights and better illustrate potential applications.

      The reporter mosquito lines express fluorescent salivary gland lobes, yet the authors only provide imaging of parasites outside the glands. It would be relevant to provide images of the parasite inside the fluorescent glands.

      We have now included images showing sporozoites inside the salivary glands in vivo in Figure 8C and discuss possible ways to further improve resolution and efficiency of the imaging procedure in lines 563-586.

      The advantage of the pigmentation-deficient line over simple reporter lines is not clear, essentially due to the background GFP fluorescence in figure 5C. Imaging of GFP-expressing parasites should be performed in mosquitoes after excision of the GFP cassette under control of the 3xP3 promoter. This would probably document the value of the reporter lines more convincingly.

      Indeed, by incorporating two Lox sites in the transgenesis cassette, we designed the yellow(-)KI line to permit removal of the fluorescent cassette and completely exclude expression of the transgenesis reporter EGFP. Still, EGFP expression in the yellow(-)KI adults is restricted to the eye and ovary, as we show now in Figure 7 supplement 1D. In contrast, no EGFP fluorescence was observed in the thorax area (Figure 7 supplement 1D). Therefore, we believe that the benefit of removing the fluorescence cassette for this study is limited. Moreover, the generation of such a line would take at least 3-4 months before experiments could be performed. Nevertheless, we agree with the reviewer that removal of the fluorescence cassette would be instrumental for follow-up studies. To draw the reader's attention to this issue, we now discuss background fluorescence in lines 378-387.

      Along the same line, it is unclear if the DsRed spillover signal in the GFP channel is inherent to the high expression level or to a non-optimal microscope setting. This is a limitation for the use of the reporter lines to image GFP-expressing parasites.

      We have discussed this problem with the head of the imaging platform at our institute, and we believe that it is not a problem that occurs due to incorrect settings. Rather, it seems to be due to the significant expression differences of the two fluorescence reporters used. We agree with the reviewer that this is a limitation and discuss the problem now in lines 416-412 and 565-567.

      The authors should fully exploit the SAG(-) line, which is knockout for saglin and provides a unique opportunity to determine the role of this protein during invasion of the salivary glands. This would considerably augment the impact of the study. In this regard, line 131 and Fig S3E: why is there persistence of a PCR band for non-excised in the sag(-)EX DNA?

      We definitely share the reviewer's enthusiasm about saglin and its role in parasite development in mosquitoes. We have thoroughly characterized the phenotype of sag(-) lines with respect to fitness and Plasmodium infection. These results are described in a separate manuscript currently in peer review and available as a preprint on bioRxiv (https://doi.org/10.1101/2022.04.25.489337). Furthermore, in the revised manuscript, we have included additional data on the transcriptional activity of the saglin promoter with respect to the onset of expression and blood meal inducibility (Figure 2). In addition, we have included a completely new Figure 3 to highlight the spatial differences in transcriptional activity of the saglin promoter compared with the trio promoter. These new data are discussed in lines 206-276.

      There might be a misunderstanding in the interpretation of the genotyping PCR. The PCR shown in Figure 1 – figure supplement 3 displays PCR products for different genomic DNAs (sag(-)EX, sag(-)KI and wild type) using the same primer pair. „Excised“ refers to sag(-)EX, „non excised“ to sag(-)KI and „control“ to wild type. Primers were chosen to yield a PCR product whenever the transgene has integrated; only the shift in size between „excised“ and „non excised“ indicates the loss of the 3xP3-lox fragment. We have now changed the labeling of the respective gel in Figure 1 – figure supplement 3 to make this clearer.

      Did the authors search for alternative integration of the construct to explain the trioDsRed variability?

      We validated trio-DsRed cassette insertion in the X1 locus by PCR. The only way to rule out an additional integration of the transgene would be whole genome sequencing, which we did not perform. Still, we believe that the observed expression patterns are due to locus-specific effects of the X1 locus. Indeed, several lines of evidence point in this direction: (1) transgenesis was realized using the phage ΦC31 integrase that promotes site-specific integration (attP is 38bp long and very unlikely to occur as such in the mosquito genome) and for which we never detected insertion in other sites in the genome for other constructs inserted in X1 and other docking lines; (2) additional unlinked insertions would have been easily detected during the first backcrosses to WT mosquitoes we perform in order to isolate the transgenic line and homozygotise it; (3) we have often observed variegated expression patterns for other transgenes located in the X1 locus in the past, leading us to believe that this locus is subject to variegation influencing the expression of the inserted promoters. Usually, the variation we observe is simpler (e.g. strong and weak expression of the fluorescent reporter placed under the control of the 3xP3 promoter in the same tissues where it is normally expressed), but some promoters are more sensitive to the nearby genomic environment than others, which we believe is the case for trio. Finally, should there be additional insertions of the transgenesis cassette in the genome, they would all have to be linked to the X1 locus, as we would otherwise have detected them in the first crosses as mentioned above, which is unlikely. Thus, although very unlikely, we cannot exclude a single additional and linked insertion possibly explaining the high/low DsRed patterns, but variegation would still be required to explain other patterns. We have mentioned this alternative explanation in the manuscript in lines 522-524.

      Line 254-255. Does the abnormal morphology of SG from aapp-hGrx1-roGFP2 result in reduced sporozoite transmission?

      This is an interesting question. For future experiments, it could indeed be important to test if the transmission of sporozoites by the generated salivary gland reporter lines is not impaired. However, the quantification of the number of sporozoites in aapp-hGrx1-roGFP2 expressing salivary glands did not reveal any significant differences from the wild type (Figure 5 – figure supplement 1B) and would definitely be sufficient to infect mice. As we have no evidence for reduced invasion of sporozoites in the salivary glands of aapp-hGrx1-roGFP2 and of the DsRed reporter lines, no good reason to believe that the expression of fluorescent proteins would interfere with parasite transmission, and as we produced these lines as tools to follow sporozoite interaction with salivary glands, we have not performed transmission experiments.

      Of note, we have now included images of highly infected salivary glands of all reporter lines in Figure 5 – figure supplement 1D to confirm that expression of the respective fluorescence reporter does not interfere with sporozoite invasion. Also, we have not observed that sporozoites avoid salivary gland areas displaying high levels of hGrx1-roGFP2.

      **Minor comments**

      -Line 51: sporogony rather than schizogony

      Schizogony was replaced with sporogony.

      -Line 56: sporozoites are not really deformable as they keep their shape during motility

      This sentence was removed.

      -In the result section, it is not clearly explained where constructs were integrated.

      We have now included the sentence „...with an attP site on chromosome 2L...“ (line 173) and the respective reference (PMID: 25869647) to give more information about the integration site.

      Line 106 and 434-435: for the non-expert reader, it is not clear what X1 refers to, strain or locus for integration?

      X1 refers to both, the locus and the docking line. We have rephrased the beginning of the result section (previously line 106) to give more information about the integration site as mentioned above.

      -Line 112-115: the rationale of integrating GFP instead of SAG is not clearly explained here, but becomes clearer in the discussion (line

      We have slightly rephrased the sentence to better explain the reasoning for this procedure (lines 182-184).

      -Line 140: FigS2A instead of S3A

      This mistake was corrected in the revised manuscript.

      -Perhaps mention that GFP reporters (SG) might be useful to image RFP-expressing parasites.

      We have now included an image of the aapp-hGrx1-roGFP2 line infected with a mCherry expressing P. berghei strain in Fig. 7D.

      -Line 236: the authors cannot exclude integration of an additional copy (as mentioned in the discussion line 367-368).

      As discussed above, we removed „..as a single copy...“ and introduced the possibility of an additional integration linked to X1 (lines 522-524).

      -Line 257-258. The title of this section should be modified as SG invasion was not captured.

      The title was rephrased. It now reads „Salivary gland reporter lines as a tool to investigate sporozoite interactions with salivary glands“ (line 356-357).

      -Line 287: remove "considerable number" since there is no quantification.

      This was removed. In addition, we included new data in this section of the manuscript and rephrased the results accordingly (lines 406-427).

      -Line 400-402: Klug and Frischknecht have shown that motility precedes egress from oocysts (PMID 28115054), so the statement should be modified.

      Thank you for this suggestion. The passage was modified accordingly.

      -Line 404: remove "significant number" since there is no quantification.

      This section was rephrased and the phrase "significant number" was removed (lines 406-427).

      -Line 497: typo "transgenesis"

      The typo was corrected in the revised manuscript.

      -FigS1: add sag-DsRed in the title

      Thank you for spotting this inconsistency, we corrected this mistake (line 1134).

      -Stats: Mann Whitney is adequate for analysis in fig 2C but not 2B, where ANOVA should be used (more than 2 groups).

      We have now performed a one-way ANOVA test and adapted the figure and figure legend accordingly.
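
      The reviewer's statistical point can be illustrated with a minimal sketch: with more than two groups, a one-way ANOVA compares all groups in a single test by relating the variance between group means to the variance within groups. The function name `one_way_anova_f` and the numbers below are hypothetical and chosen for easy arithmetic; they are not data from the manuscript, and the sketch is not the authors' actual analysis.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of measurements, one inner list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical fluorescence readings for three groups (illustration only).
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(example_groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

      The F statistic would then be compared against the F distribution with the reported degrees of freedom to obtain a p-value; a Mann-Whitney test, by contrast, only compares two groups at a time.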

      Reviewer #1 (Significance (Required)):

      This work describes a technical advance that will mainly benefit researchers interested in vector-Plasmodium interactions. Invasion of salivary glands by Plasmodium sporozoites is an essential step for transmission of the malaria parasite, yet remains poorly understood as it is not easily accessible to experimentation. The development of transgenic mosquitoes expressing fluorescent salivary glands and with decreased pigmentation provides novel tools to allow for the first time in vivo imaging in live mosquitos of the interactions between sporozoites and salivary glands.

      Reviewer's expertise: malaria, Plasmodium berghei, genetic manipulation, host-parasite interactions

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The first achievements of the Klug et al. study are (i) the genetic engineering of Anopheles coluzzii mosquitoes reared in an insectarium that stably express distinct fluorescent reporters (DsRed, hGrx1-roGFP2 and EGFP) under the putative "promoters" of genes reported to encode proteins expressed differentially in the pluri-lobal salivary glands (Sg) of anthropophilic blood-feeding adult females, and (ii) the analysis of the promoter activity - based on the selected fluorescent reporter - with a primary focus on the salivary gland/Sg (including at the Sg lobe level) of the adult female but also considering the preimaginal developmental time with larvae and pupa samples. Of note, some data confirm the already reported time-dependent and blood meal-dependent promoter activity for the related Anopheles species. The last part presents a preliminary dataset on live imaging of Plasmodium berghei sporozoites with the aim of highlighting the usefulness of these A. coluzzii transgenic lines to better understand how the rodent Plasmodium sporozoites first colonize and then settle as packed cells in Sg acinar host cells.

      **Major comments**

      The first two objectives presented by the authors have been convincingly achieved with (i) the challenging production of four different lines expressing different single or double reporters chosen by the authors (and appropriately presented in the result text and figure sections), (ii) the careful analysis of the spatiotemporal expression of the DsRed reporter under two "promoters" studied and with regard to the blood feeding event parameter. However, if the reason why the authors have put so much effort into the production of their transgenic mosquitoes is (as mentioned) to provide a significantly improved setting enabling the behavioral analysis of sporozoites upon colonization and survival in the Sg, it seems this part is rather limited. Likely related to this perception is the fact that I found the introductory section often confusing and not direct enough: in particular in distinguishing the rationale from the necessity to produce appropriate models, and in clarifying what is/are the added value(s) offered by these new transgenic line models when compared to what exists (in Anopheles stephensi), with specific evidence that argues for this knowledge gain. At this stage, it is unfortunately not clear to me what the bonus is of imaging the Plasmodium fluorescent sporozoites in hosts with fluorescent salivary gland lobes if one cannot monitor key events of the Sg-sporozoite interaction that were not reachable without the fluorescent mosquito lines. Furthermore, it should be better explained why the rodent Plasmodium species has been chosen rather than Plasmodium falciparum (or other human species), for which A. coluzzii is a natural host; maybe just mentioning that this study would serve as a proof of concept would be fine, but bringing real biological insights would be better.

      We would like to thank the reviewer for his/her evaluation of our manuscript, which has helped us clarify our manuscript on several points. Our goal here was a proof of concept demonstrating potential applications for the fluorescent salivary gland reporter lines and for the low pigmented yellow(-) line we generated. In vivo imaging of sporozoites in salivary glands is one possible application that we intended to use as proof-of-concept, but we tailored the manuscript too restrictively with this aim in mind and neglected other applications as well as characterization of the biology of salivary glands in general. To improve this, we have included further data on the blood inducibility of the promoters tested (Figure 2), the functionality of roGFP2 in the salivary glands (Figure 5), and the use of the generated lines in the examination and definition of expression patterns of salivary gland proteins in vivo (Figure 6). Accordingly, we have adjusted the entire manuscript to adequately describe all the results presented. We have also rephrased major parts of the abstract and the introduction to better describe the impact of salivary gland biology on the transmission of pathogens, and to explain the anatomy of salivary glands in more detail.

      We agree with the reviewer that it would be desirable to show direct salivary gland-sporozoite interactions in vivo. Still, we believe that having mosquito lines expressing a fluorescent marker in the salivary gland as well as weakly pigmented mosquitoes is a first step towards making this visualization possible, although we cannot yet provide much quantitative data about this interaction.

      1- The three genes and gene products selected by the authors should definitely be more systematically explained, which means for example the authors need to introduce the different mosquito species and the parasite-mosquito host pairs they are then referring to for the promoter/encoded proteins of their interest. In the same vein, I did not find any information as to the choice of the mosquito species (A. coluzzii) for the current work. I was curious to know what the advantage is, since better knowledge was available with Anopheles stephensi with respect to (i) Saglin and its promoter activity, (ii) aapp-driven DsRed expression (lines already existing) and (iii) sporozoite-gland interaction.

      We have largely reworded the introduction to clarify the rationale for selecting these three promoters while providing a better understanding of salivary gland biology in general.

      The choice of the mosquito species depends, in our opinion, strongly on the perspective and on the experiments to be performed. We agree with the reviewer that the malaria mosquito A. stephensi is a widely used model, based on its robustness in breeding and its high susceptibility to P. berghei and P. falciparum infections. However, in these cases, both vector-parasite pairs are to some extent artificial. Indeed, although it is also a vector of P. falciparum in some regions, A. stephensi mostly transmits P. vivax, which cannot be cultured in vitro. Thus research efforts on this vector-parasite pair are limited. Also, due to the growing number of observed differences between Anopheles species in their susceptibility to Plasmodium infection and transmission, more research has recently been conducted on African mosquito species. This trend is reinforced by the fact that P. falciparum causes more deaths than any other Plasmodium species infecting humans, making control strategies for species from the A. gambiae complex such as A. coluzzii particularly important. As a result, the number of available genetic tools in A. coluzzii/gambiae has outpaced that in A. stephensi. These include mosquito lines with germline-specific expression of Cas9 for site-directed transgenesis, lines expressing Cre for lox-mediated recombination, and several docking lines. Such tools are, as far as we know, not available in A. stephensi and were essential in reaching our objectives. Docking lines are of particular interest because they allow reliable integration into a characterized locus, which is an advantage over random transposon-mediated integration. Random insertion sites have generally not been characterized in the past, which can cause problems since integrations regularly occur in coding sequences. Docking lines also enable comparison of different transgenes as they are all integrated in the same genetic environment, although this does not exclude some expression variation, as illustrated in our manuscript. For all these reasons, we have chosen to work with A. coluzzii.

      Concerning the use of the murine malaria parasite P. berghei instead of the human parasite P. falciparum, two reasons motivated our choice. (1) For in vivo imaging of sporozoites, we needed a parasite line that is strongly fluorescent at this stage, and no such line exists for P. falciparum. Actually, no fluorescent P. falciparum line able to efficiently infect A. coluzzii has been reported thus far, as reporter genes have all been inserted in the Pfs47 locus that is required by P. falciparum for A. coluzzii colonization. (2) Imaging P. falciparum infected mosquitoes, especially with sporozoites in their salivary glands, requires access to a confocal microscope in a biosafety level 3 laboratory. Hence our objective here was indeed to provide a proof of principle of in vivo imaging of sporozoites in the vicinity of or inside salivary glands using our engineered mosquitoes, and to provide a first analysis of this process using P. berghei as a model of infection. Nevertheless, we agree with the reviewer that the goal should be to work as close as possible to the human pathogen.

      Despite the wide range of topics that this study touches on, we want to try and keep the manuscript as concise as possible. Therefore, we have not discussed the advantages and disadvantages of the different vector-parasite pairs and ask the reviewer to indulge us in this.

      2- To help clarifying the added value of the present study, introducing the species names of the mosquito and the Plasmodium that serve as a model would be appreciated.

      We have now included the name of the Plasmodium species used in line 361. At this position we now also give more details about the transgene this line is carrying. We now mention the mosquito species used, A. coluzzii, at several positions in the manuscript (e.g. lines 52, 162 and 177).

      3- Since a focus is the salivary gland of the blood feeding female Anopheles sp., a rapid description of the glands with different lobes and subdomains the results and figure 1 nicely refer to, would help in the introduction.

      We explain now the anatomy of female and male mosquito salivary glands in the introduction (lines 119-123). The different lobes are now also indicated in the salivary gland images shown in several figures including Figure 1.

      4- That description could logically introduce the few proteins actually identified with lobe specific or cell domain specific expression (apical versus basal side, intracellular or surface exposed, vacuole, duct...) profiles. The context with regard to sporozoite biology would then easily validate the "promoter choice". As a minor remark, I miss the reason why the authors wrote "the astonishing degree of order of the structures (referring to the packing of sporozoites within the Sg acinars) raise the question whether sporozoite can recognize each other". Please clarify, since packing/accumulation can be passive due to cell mechanical constraints, and explain what this point has to do with the question and experimental work proposed here.

      We thank you for this suggestion. We have reworded key parts of the introduction to make the reasons for using the three selected promoters clearer. We also mention now other proteins expressed in the salivary glands which have been characterized in more detail because of their effect on blood homeostasis (e.g. anticoagulants) (lines 136-139).

      The mention of stack formation of salivary gland sporozoites served only to clarify that almost nothing is known about the behavior of sporozoites within the salivary glands in vivo to explain why new methods are needed to make these processes visible. We have now reworded this passage to make this clearer, and we also mention that stack formation could also occur due to mechanical constraints, as suggested by the reviewer (lines 101-102, 106-110).

      5- The selection of hGrx1-roGFP2 is quite interesting and justified but there is then no use of this reporter property in the preliminary characterization of the Sg and Sg-sporozoite interaction. Could the authors provide such characterization?

      We have now included data testing the functionality of hGrx1-roGFP2 in the salivary glands. We also show qualitatively that the redox state of glutathione does not change upon infection with P. berghei sporozoites (Figure 6). We now describe and discuss these new data in lines 337-354.

      6- Figure 1: it would be nice to add in the legend at what time the dissection/imaging has been made (age, blood feeding timing?). I would also omit the double mutant trio-Dsred/aapDsred in the main figure (may be supplemental) since the two single mutants Dsred separately together with the double mutant (with different fluorescence) already provide the information. I would suggest to regroup the phenotypic presentation of the transgenic line made in the KI mosquitoes (current figure 5) in the main figure 1.

      We have now added the missing information about the age of dissected mosquitoes and their feeding status in the legend of Figure 1. We also thank the reviewer for the suggestion to replace one image displaying aapp and trio promoter activity in trans-heterozygous mosquitoes with an image of the pigment deficient mutant yellow(-)KI. Still, due to the changes made to the manuscript based on the reviewers' comments in general, we have now included new data highlighting the functionality of the generated salivary gland reporter lines, investigating the redox state of glutathione as well as the expression pattern of the saglin and trio promoters at the single cell level (see Figure 3 and 6). Therefore it would no longer seem logical to introduce the yellow(-)KI mutant in Figure 1 while further data on this mutant are provided in the last two figures of the manuscript and discussed later in the manuscript (Figure 7 and 8). In addition, we believe that co-expression of different transgenes (carrying fluorescent reporters) in the median and the distal lobes could potentially be interesting for certain applications. We believe that readers who might actually be interested in combining both transgenes in a cross would like to see the outcome to better evaluate the usefulness before experiments are planned and performed. This is especially true because localization as well as expression strength may differ between different fluorescence reporters driven by the same promoter (e.g. the hGrx1-roGFP2 construct appears less bright and more localized to the apex of the distal-lateral lobes than DsRed, while expression of both reporters is driven by the aapp promoter in aapp-hGrx1-roGFP2 and aapp-DsRed, respectively).

      7- Figure 2:

      a) Is there anything known about the Sgs' size change over time? It seems that between day 1 and 2 there is an increase in size and volume, as far as I can evaluate the volume (Fig S4). Could that mean that there is an increase in cell number in the lobes, and therefore more cells expressing the transgene, which would account for the signal intensity increase rather than more transcripts per cell?

      Thank you for this interesting question. The changes in the morphology of the salivary glands in Anopheles gambiae following eclosion have been studied in detail by Wells et al., 2017 (PMID: 28377572), which we now cite in the introduction (line 122-123). According to this reference, cell counts of the salivary gland do not change upon emergence of the adult mosquito. However, we agree with the reviewer that the glands appear smaller and differ in morphology directly after eclosion. We noted that glands of freshly emerged females are more „fragile“ during dissections and lack secretory cavities, as reported by Wells et al., 2017. We believe that the increase in size occurs through the formation and filling of the secretory cavities, which has been reported to take place within the first 4 days after emergence (Wells et al., 2017). This is in accordance with our observations that the promoters of the saliva proteins AAPP and Saglin display only weak activity after hatching, or, in the case of TRIO, are not yet active directly after emergence. The timing of the formation of the secretory cavities is also in agreement with our time course experiment (Figure 2), which shows a strong increase in fluorescence intensity in dissected glands within the first 4 days after emergence.

      b) Why choose 24h after the blood meal to assess promoter activity in the Sgs? Do we have any information on how the blood meal impacts the Sgs' development? At this time anyway the sporozoites are far from being made. Yoshida and Watanabe 2006 mentioned a significant decrease of Sg proteins post-blood feeding. Could the authors detail their rationale based on the questions they wish to address?

      Thank you for this question. Unfortunately, the data available in the literature on this topic are very sparse, so we could only refer to a few previous publications. The decision to quantify the fluorescence signals as early as 24 hours after blood feeding was based on Yoshida et al, Insect Mol. Biol, 2006, PMID: 16907827. The authors of this study generated the first salivary gland reporter line in A. stephensi by using the aapp promoter sequence to drive DsRed expression, and showed by qRT-PCR that DsRed transcripts increase 1-2 days after blood feeding compared to controls. Consistent with this observation and because we were concerned that putative changes in protein levels would only be visible for a short period of time, we began quantification one day after feeding. Since we observed significant changes in fluorescence intensity for the aapp-DsRed and sag(-)KI lines 24 hours after blood feeding, we retained the experimental setup and did not change it further. Nevertheless, we agree with the reviewer that different time points could help determine how long the effect lasts, and whether trio expression might also be regulated by blood feeding, but at a later time point. Still, our main objective here was to validate that the ectopic expression of DsRed driven by the aapp promoter in the aapp-DsRed line was indeed induced upon blood feeding as previously reported (PMID: 16907827). This experiment allowed us to confirm the inducibility of aapp in a different way and to show for the first time that saglin, but not trio, is induced one day after blood feeding. Our transgenic lines could be used for follow-up studies investigating the inducibility of salivary gland-specific promoters by different stimuli, or after infection with Plasmodium sporozoites. For example, for trio, transcription has been shown to increase after infection of the salivary gland by Plasmodium (PMID: 29649443).

      8- Figure 3: The figure is quite informative in terms of subcellular localization. Concerning the section "Natural variation of DsRed expression in trio-DsRed mosquitoes", I think it could be shortened because it is a bit outside the focus of the study.

      We agree with the reviewer that this part of the manuscript stands out somewhat and is not perfectly in line with the remaining results because it does not deal with the salivary gland. Still, we would like to emphasise that in this work, we particularly want to show possible applications of the generated mosquito lines to address unanswered questions in host-parasite interactions and salivary gland biology. As a result, this manuscript establishes potentially important tools. For this reason, we feel it is important to mention the natural variation in DsRed expression, as this natural variation can have a significant impact on crossing schemes (especially with lines carrying other DsRed-marked transgenes) and experiments (e.g. visualizing DsRed expression by western blot in larval and pupal stages). Furthermore, it is important for the use of the line to show that the transgene is inserted only once, at the expected location, which we try to emphasize with figure 4 – figure supplement 1 and figure 4 – figure supplement 2.

      We would also like to note that transgenesis in Anopheles is a relatively young field of research and altered expression patterns of ectopically used promoters have rarely been described so far, although this could have major implications e.g. in the case of gene drives. Therefore, we hope that the data shown will bring this previously neglected observation more into focus and highlight the importance of accurate characterization of generated transgenic mosquito lines.

      9- In contrast the last section of live imaging of P. berghei sporozoites in the vicinity and within salivary gland should be expanded. The 2 sentences summarizing the data are quite frustrating "We also observed single sporozoites moving actively through tissues in a back and forth gliding manner (Fig. 6B, Movie 3) or making contact with the salivary gland although no invasion event could be monitored"

      We have now implemented new data and extended Figure 8 showing the results of the in vivo imaging in a qualitative manner. We have rephrased the result and discussion section accordingly.

      10- I am aware of the technical difficulties of performing live imaging of sporozoites in whole mosquitoes, even when the salivary gland lobe under observation is closely apposed to the cuticle, but that seems to be the final aim of the authors. I looked very carefully at the three movies and I am sorry, but at this stage I could not make a meaningful analysis out of them and could not agree with the conclusions: for instance, the authors specify that sporozoites were undergoing back and forth movements (movie 3), but I do not see that and do not see the Sg contours in the available movies. The authors should also add scale bars and time stamps to their movies. Having an in-depth description with regard to the sub-domain marked by a relevant reporter would strengthen the study, even if images are not collected in the whole mosquito to get higher resolution.

      We thank the reviewer for this comment. We have to admit that parasite imaging in fluorescent salivary glands in vivo is an ambitious goal given the complex biological system we are working with. We believe that the system presented in our manuscript is a first and important step to enable the analysis of the interaction of sporozoites with salivary glands, although in-depth analysis will require further optimization and considerable time, especially to generate quantitative data. Therefore, we have now toned down the significance of our results in this respect and changed the title accordingly. Still, we also provide a more detailed analysis of the data we have already collected (Figure 8 and lines 406-427). Because we focus on the analysis of sporozoites in the thorax area in the revised manuscript, the outlines of the salivary gland are not necessarily visible in the images.

      I am not sure I understand the relevance of this quite condensed sentence in the text. Could the authors rephrase and expand if they wish to keep the issues they refer to. "The sporozoites' distinctive cell polarization and crescent shape, in combination with high motility, allows them to „drill" through tissues". I would stress more on the main unknown in terms of sporozoite-Sg interactions and the need to get right models for applying informative approaches (i.e. here, imaging).

      We thank you for this suggestion. The sentence mentioned has been removed in its entirety. We have also adjusted the text accordingly and reworded most of the introduction to make the narrative clearer (lines 91-119).

      Of note, it could help to point that the "Sgs is a niche in which the sporozoites which egress from the oocyst could mature and be fully competent when co-deposited with the saliva into the dermis of their intermediary hosts"

      We have now implemented a similar sentence in the introduction (lines 93-98).

      Reviewer #2 (Significance (Required)):

      1- Clear technical significance with the challenging molecular genetics achieved in the mosquito A. coluzzii.

      2- More limited biological significance: fair analysis and gain of knowledge of the spatio-temporal pattern of reporter expression under the selected promoters, but limited significance of the final goal analysis, which concerns Plasmodium sporozoite biology after egress from oocysts.

      As stated above, we changed the title to place the focus on the engineered mosquito lines.

      3- Previous reports cited by the authors have used the DsRed reporter and the aapp promoter in another Anopheles species (i.e. A. stephensi, which is also a natural host and vector for human Plasmodium spp.; Yoshida and Watanabe, Insect Mol Biol, 2006; Wells and Andrew, 2019), with significantly better-resolved 3D visualization of GFP-fluorescent P. berghei, but in dissected salivary glands and not in whole mosquitoes. The Wells and Andrew publication entitled "Salivary gland cellular architecture in the Asian malaria vector mosquito Anopheles stephensi" (Parasites & Vectors, 2015) deserves to be referenced and described.

      Thank you very much for this suggestion. We considered citing Wells and Andrews (PMID: 26627194). However, this reference focuses very specifically on the subcellular localization of AAPP and shows only highly magnified sections of immunostained dissected and fixed salivary glands. Working only with the AAPP promoter, we felt it important to refer to the previously observed expression pattern along the entire salivary gland, as shown in Yoshida and Watanabe (PMID: 16907827). Nevertheless, we have cited two other publications by Wells and Andrews (PMID: 31387905 and 28377572) at various points in the manuscript.

      4- Audience: I would say that this work should be of interest of mostly scientists investigating Plasmodium biology (basic and field research) or in entomology of Diptera.

      5- To describe my fields of expertise, I can refer to my extensive initial training in entomology including at one point in the genetic basis of mosquito-virus interaction. I have also been working for more than 20 years in the field of Apicomplexa biology (Plasmodium and Toxoplasma) and I have long-standing interest in live and static high-resolution imaging.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Klug et al. generated salivary gland reporter lines in the African malaria mosquito Anopheles coluzzii using salivary gland-specific promoters of three genes. Lobe-specific reporter activity from these promoters was observed within the salivary glands, restricted either to the distal lobes or the medial lobe. They characterized localization, expression strength and onset of expression in four mosquito lines. They also investigated the possibility of influences of the expressed fluorescent reporters on infection with Plasmodium berghei and salivary gland morphology. Using crosses with a pigmentation deficient mosquito line, they demonstrated that their salivary gland reporter lines represent a valuable tool to study the process of salivary gland colonization by Plasmodium parasites in live mosquitoes. SG positioning close to the cuticle in 20% of females in this strain is another key finding of this study.

      The key findings from this study are largely quite convincing. The authors have created a suite of SG reporter strains using modern genetic techniques that aid in vivo imaging of Plasmodium sporozoites.

      Vesicular staining within salivary acinar cells should be stated as "vesicle-like" staining unless a co-stain experiment in fixed SGs is conducted using antisera against the marker protein(s) and antisera against a known vesicular marker (e.g. Rab11). It may also be possible to achieve this in vivo using perfusion of a lipid dye (e.g. Nile Red), but this is not necessary. As is, in Fig. 3A, there are images in which it appears that the vesicle-like staining is located both within acinar cells' cytoplasm and in the secretory cavities (e.g. Fig. 3A: aapp-DsRed bottom and middle), and this is fine, but should be more inclusively stated. Fixed staining of the reporter strain SGs would allow for clarification of this point. In previous work, other groups have observed vesicle-like structures in both locations (e.g. PMID: 33305876).

      Thank you very much for this suggestion. Indeed, when we observed the vesicle-like localization, we had similar ideas and considered investigating the identity of the observed particles in more detail. Ultimately, however, we concluded that the localization of DsRed does not play a critical role in the use of the lines as such and believe that a more detailed investigation of the trafficking of the fluorescent protein DsRed is beyond the scope of this study.

      We have thus followed the suggestion of the reviewer and now use the phrase "vesicle-like" throughout the manuscript. In addition, we extended the discussion on the different localizations observed and presented some explanations that might have led to this observation. We also included a new reference that investigated the localization of AAPP using immunofluorescence (PMID: 28377572).

      Morphological variation is extensive among individual mosquito SGs, thought to impact infectivity, and well documented in the literature. The manuscript should be edited to make it much clearer (e.g. n = ?) exactly how many SGs, especially in microscopy experiments, were imaged before a "representative" image was selected from each data point and in any additional experiment types where this information is not already presented. Figure S8 is an example where this was done well. Figure 3A-B is an example where this was not well done. All substantial variation (e.g. "we detected a strangulation..." - line 189) across individual SGs within a data point should be noted in the Results. Because of the genetics and labor involved, acceptable sample sizes for minor conclusions may be small (5-10), but should be larger for major conclusions when possible.

      Thank you for this comment. We have improved this point by specifying precisely the number of samples and of repetitions in the respective figure legends. For example, we have now quantified the proportion of moving sporozoites and report both the number of sporozoites evaluated and the number of microscopy sessions required (see Figure 8). Regarding Figure 3, fluorescence expression and localization in salivary gland reporter lines was actually very uniform in each line. We added the following sentence in the legend of revised figures 3 and 5: “Between 54 and 71 images were acquired for each line in ≥3 independent preparation and imaging sessions. Representative images presented here were all acquired in the same session”.

      Sporozoite number within SGs has been shown to be quite variable across the infection timeline, by mosquito species, by parasite strain, in the wild vs. in the lab, and according to additional study conditions. The authors mention that the levels they observed are consistent with their prior studies and experience, but they did not utilize the reporter strains and in vivo imaging to support these conclusions, instead relying on dissected glands and a cell counter. It is important for these researchers to attempt to leverage their in vivo imaging of SG sporozoites for direct quantification, likely using the "Analyze Particles" function in Fiji. The added time investment for this additional analysis would be around two weeks for one person experienced in the use of the imaging software.

      Thank you for this interesting suggestion. Indeed, it would be beneficial to use an imaging-based approach to quantify the sporozoite load inside the salivary glands. We already used "watershed segmentation" in combination with the "Analyze Particles" function in Fiji on images of infected midguts to determine oocyst numbers. Still, we believe this analysis cannot be applied to images of infected salivary glands, mainly because of differences in shape and location of the oocyst and sporozoite stages. Sporozoites inside salivary glands form dense, often multi-layered stacks. Because of this close proximity, watershedding cannot resolve them as single particles which could subsequently be counted. Accumulations of sporozoites would therefore be counted as one particle, likely leading to an underestimation of actual parasite numbers. Furthermore, even if the proximity issue could be resolved, e.g. by performing infections yielding lower sporozoite densities, another problem would be that infected salivary glands prepared for imaging are often slightly damaged, leading to a leak of sporozoites from the gland into the surroundings. These leaked sporozoites are likely not included in the images that would then be used for analysis, potentially leading again to an underestimation of counts. Since these issues are circumvented by the use of a cell counter, we believe that this is still the method of choice for acquiring sporozoite numbers.

      Nevertheless, we can understand the reviewer's concern that counts performed with a hemocytometer do not reflect the variability in the sporozoite load of individual mosquitoes. To highlight that all generated reporter lines can have high sporozoite counts, we have now included images of highly infected salivary glands for each line in Figure 7D.
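
      The undercounting argument made in this response can be illustrated with a small sketch. It is not Fiji's "Analyze Particles" implementation and it omits the watershed step entirely; it only shows, under the assumption that segmentation fails to separate touching objects, how a connected-component count collapses a packed group into a single particle. The function name and the toy masks are hypothetical.

```python
from typing import List, Set, Tuple


def count_particles(mask: List[str]) -> int:
    """Count 4-connected foreground regions ('#') in a binary mask.

    Simplified stand-in for particle counting on a thresholded image;
    all rows of the mask must have the same length.
    """
    rows, cols = len(mask), len(mask[0])
    seen: Set[Tuple[int, int]] = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != "#" or (r, c) in seen:
                continue
            # New region found: flood-fill it so it is counted only once.
            count += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and mask[ny][nx] == "#"
                        and (ny, nx) not in seen
                    ):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count


# Hypothetical toy masks: the same three objects, first well separated,
# then packed side by side as in a dense sporozoite stack.
separated = [
    "##..##..##",
    "##..##..##",
]
packed = [
    "######",
    "######",
]

separated_count = count_particles(separated)  # each object is its own region
packed_count = count_particles(packed)        # touching objects merge
```

      In this toy case the packed mask yields fewer particles than objects present, which is the direction of bias the authors describe for densely infected glands.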

      This manuscript is presented thoughtfully and such that the data and methods could likely be well-replicated, if desired, by other researchers with similar expertise.

      The statistical analysis is appropriate for the experiments conducted. It is currently unclear if some experiments were adequately replicated. That information should be added to the paper throughout where it is missing.

      We do appreciate your comments on our efforts to give all required information for other laboratories to replicate our experiments. We have added the missing information about the number of independent experiments in the respective figure legends wherever appropriate.

      Studies from multiple groups should be more thoroughly referenced when the authors are describing the "vesicle-like" staining patterns observed in SGs from reporter strains (e.g. Fig. 3A). Is this similar to the SG vesicle-like structures observed previously (e.g. PMIDs: 28377572, 33305876, and others)?

      Thank you for this comment. We did not discuss this observation in detail in the first version of our manuscript because the observed localization was rather unexpected, as DsRed was not fused to the AAPP leader/signal peptide. The observed localization is therefore difficult to explain, however, we have expanded the discussion on this (lines 465-482) and now cite one of the proposed references (PMID: 28377572, lines 468-469).

      There are minor grammar issues in the manuscript text (e.g. "Up to date" should be "To date"). The figures are primarily presented very clearly and accurately. One minor suggestion: In cases such as Fig. S2A images 3 and 6, where some of the staining labels are very difficult to read, please move all labels for the figure to boxes located directly above the image.

      We are sorry for the grammatical errors we have missed in the first version of our manuscript. We have now performed a grammar check over the whole manuscript. We have also increased the font size of the captions in the above figures and tried to make them better readable by moving the captions over the images.

      The data and conclusions are presented well.

      Reviewer #3 (Significance (Required)):

      This report represents a significant technical advance (improved in vivo reporter strain and sporozoite imaging), and a minor conceptual advance (active sporozoite motility), for the field.

      This work builds on previous SG live imaging studies involving Plasmodium-infected mosquitoes (e.g. Sinnis lab, Frischknecht lab, etc.), addressing one of the major challenges from these studies (reliable in vivo imaging inside mosquito SGs).

      This work will appeal to a relatively small audience of vector biology researchers with an interest in SGs. Many in the field still see the SGs as intractable, instead choosing to focus on the midgut due to ease of manipulation. Perhaps work like this will spark new interest in tangential research areas.

      I have sufficient expertise to evaluate the entirety of this manuscript. Some descriptors of my perspective include: bioinformatics, SG molecular biology, mosquito salivary glands, microscopy, RNA interference, SG infection, and SG cell biology.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Klug et al. generated transgenic mosquito lines expressing fluorescent reporters regulated by salivary gland-specific promoters and characterized fluorescent reporter expression level over time, subcellular localization of the fluorescent reporters, and impact on P. berghei oocyst and salivary gland sporozoite generation. In addition, by crossing one of the lines (aapp-DsRed) with yellow(-) KI mosquitoes, they open up the possibility of performing in vivo visualization of salivary glands and sporozoites.

      Overall the generation and characterization of these transgenic lines is well-done and will be helpful to the field. However, there are several concerns with the in vivo imaging data shown in Figure 6, which does not convincingly show fluorescent sporozoites in the lobe or secretory cavity of a fluorescent salivary gland lobe. This needs to be addressed. Points related to this concern are outlined below:

      (1) Although the authors mention that the DsRed signal was strong enough to see with GFP channel, it would be more appropriate to show that the DsRed signal from salivary glands and GFP channel image co-localize.

      We now show a merge of the GFP and DsRed signal in Figure 7 – figure supplement 2. The yellow appearance of the salivary gland in the merge likely indicates the spillover of the DsRed signal into the GFP channel. In addition, we discuss the issue in lines 416-412 and 565-567.

      (2) Mosquitoes were pre-sorted using the GFP fluorescence of the sporozoites on day 17-21. From figure 4B, the median salivary gland sporozoite number was about 10,000 sporozoites/mosquito on day 17-18. However, in Figure 6A there are no sporozoites in the secretory cavities. They should be able to see sporozoites in the cavities at this time. Can the authors confirm that they can visualize sporozoites in secretory cavities in vivo and perhaps show a picture of this?

      This is entirely correct. We also examined mosquitoes for the presence of sporozoites in the salivary glands and wing joints prior to imaging, as shown in Figure 7B and Figure 7 – figure supplement 2A, to increase the probability that sporozoites could be observed. Nevertheless, the area of the salivary gland that comes to the surface is often small and limited to a few cells that can be imaged with good resolution. Unfortunately, these same cells were often not infected although other regions of the salivary glands must have been very well infected based on the previously observed GFP screening (Figure 7B). In addition, with the confocal microscope available to us, we struggled to achieve the necessary depth to image sporozoites in the cavities of the salivary gland cells. For this reason, we were often able to detect a strong GFP signal in the background, but not always to resolve the sporozoites sufficiently well. Still, we have now included an image showing sporozoites in salivary glands (Figure 8C). However, we believe that the method can be further improved to be more efficient and provide better resolution. We discuss possible ways to further improve the imaging in lines 563-586.

      (3) There is no mention of the number of experiments performed (reproducibility) and no quantification of the imaging data. In the results (line 287-288), the authors state that sporozoites are present in tissue close to the gland and sometimes perform active movement. How can this be? Do they believe these sporozoites are en route to entering? More relevant to this study would be a demonstration that they can see sporozoites in the secretory cavities of the salivary gland epithelial cells; this should be shown. If they have already performed a number of experiments, I would suggest quantifying the number of sporozoites observed in defined regions. The mention that sporozoites are moving is confounded by the flow of hemolymph. How do they know that the sporozoites are motile versus being carried by the hemolymph? Perhaps it's premature to jump to sporozoite motility in the mosquito when they haven't even shown sporozoite presence in the salivary glands.

      Thank you very much for this comment. We have followed the suggestions of the reviewer and have now quantified the behavior of sporozoites in the thorax area of the mosquito. For the analysis, we only considered sporozoites that could be observed for at least 5 minutes. This analysis revealed that 26% of persistent sporozoites performed active movements, which in most cases resembled patch gliding previously described in vitro. We adjusted the results section accordingly. In addition, we have changed the figure legend to accurately indicate the number of experiments performed. Likewise, we now also provide an image of sporozoites that we assume are located in the salivary gland (Figure 8C). Although we have not yet been able to image and quantify vector-sporozoite interactions extensively (further improvements would be required, as mentioned previously), we believe these results illustrate the potential of the transgenic lines.
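
      The quantification described in this response (restrict to sporozoites observed for at least 5 minutes, then report the share showing active movement) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the `Track` record, the function name and all example values are hypothetical, and only the 5-minute threshold is taken from the text.

```python
from dataclasses import dataclass
from typing import List

MIN_OBSERVATION_MIN = 5.0  # inclusion threshold stated in the response


@dataclass
class Track:
    """One observed sporozoite (hypothetical record layout)."""
    observed_min: float  # how long the sporozoite stayed in view, in minutes
    active: bool         # whether active movement was scored for it


def fraction_active(tracks: List[Track],
                    min_observation_min: float = MIN_OBSERVATION_MIN) -> float:
    """Fraction of persistent sporozoites that showed active movement."""
    persistent = [t for t in tracks if t.observed_min >= min_observation_min]
    if not persistent:
        raise ValueError("no sporozoite met the minimum observation time")
    return sum(t.active for t in persistent) / len(persistent)


# Made-up example values, for illustration only (not data from the study).
example_tracks = [
    Track(observed_min=12.0, active=True),
    Track(observed_min=6.5, active=False),
    Track(observed_min=3.0, active=True),   # excluded: observed < 5 min
    Track(observed_min=5.0, active=False),
    Track(observed_min=9.0, active=False),
]

example_fraction = fraction_active(example_tracks)
```

      Applying the observation-time filter before computing the proportion matters here: briefly visible sporozoites, which may simply be carried past by hemolymph, do not enter the denominator.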

      (4) In vivo imaging has been performed with the mosquito lying sideways. Was this the best orientation? Have you tried other orientations, such as from the front (Figure 5B orientation)?

      It is true that in the abdominal view as shown in Figure 7B the fluorescence in the salivary glands is very well visible. This is mainly due to the fact that in this area the cuticle is almost transparent and therefore serves as a kind of "window". Nevertheless, the salivary glands are not close to the cuticle in this position, which makes good confocal imaging impossible. Imaging always worked best where the salivary gland was very close to the cuticle, and this was always laterally. However, there were differences in the position of the salivary glands in individual mosquitoes, which also led to slight differences in the imaging angle.

      Overall, the text is easy to follow and I have only few suggestions.

      Thank you for this comment.

      In the results section, the authors describe DsRed expression during mosquito development (line 194-236) after they describe the subcellular localization of the fluorescent reporters. I felt the flow was disrupted. Thus, this part (line 194-236) could be summarized and moved to line 135. In this way, the results section would flow according to the main figures.

      Thank you very much for this suggestion. We have considered your idea, but based on the changes we have made in response to reviewer comments and new data implemented in the form of two new figures, we believe the current order in the results section is more appropriate. The rationale was primarily to first characterize the expression of fluorescent reporters in the salivary glands of all lines before going into more detail on expression in other tissues of a single line. We then finish with potential applications like in vivo imaging of sporozoite interactions with salivary glands.

      Also, and as mentioned previously (reviewer 2, point 8), we believe it is important to describe the variability of ectopic promoter expression at a given locus with sufficient details, as this has not been characterized thus far despite its importance.

      In the results section, text line 186-190, the authors describe the morphological alteration of the salivary gland in aapp-hGrx1-roGFP2. I would suggest mentioning that this observation was made in only one of the lateral lobes. (I saw that it was mentioned in the figure legend but not in the main text.)

      We believe there has been a misunderstanding. The morphological alteration in salivary glands expressing aapp-hGrx1-roGFP2 was observed in all distal-lateral lobes to varying degrees (quantification in Figure 6E). To include as many salivary glands as possible in the quantification and because in some images only one distal-lateral lobe was in focus, only the diameter of one lobe per salivary gland was measured and evaluated. We have now revised the legend to prevent further misunderstandings.

      In the discussion section, the authors discuss the localization of the fluorescent reporters (line 322-331). When I looked at the aapp-DsRed localization pattern (Figure 3A), the pattern looked similar to that in the previous publication by Wells et al. 2017 (https://www.nature.com/articles/s41598-017-00672-0). This publication used an AAPP antibody together with other markers (Figure 4-7). It could be worth referencing in the discussion section.

      Thank you for this suggestion. According to the information available through Vectorbase, we did not fuse DsRed with any coding sequence of AAPP that could potentially encode a trafficking signal. Therefore, it is rather unlikely that the observed DsRed localization in our aapp-DsRed line and the localization observed by AAPP immunofluorescence staining in WT mosquitoes match. This is further exemplified by the cytoplasmic localization of hGrx1-roGFP2 in the aapp-hGrx1-roGFP2 line, where the reporter gene was cloned under the control of the same promoter. For this reason, we had not mentioned this reference in the first version of the manuscript. In the revised manuscript, we have included now the suggested reference (lines: 475-476) and extended the discussion on possible reasons which led to the observed localization pattern.

      In the text, authors describe salivary gland lobes as distal lobes and middle lobe. It would be more accurate to refer to the lobes as the lateral and medial lobes. The lateral lobes can then be sub-divided into proximal and distal portions. I would suggest to use distal lateral lobes, proximal lateral lobes and median lobe as other references use (Wells M.B and Andrew D.J, 2019).

      Thank you for this suggestion. We have corrected the nomenclature for the description of the salivary gland anatomy as suggested throughout the manuscript.

      Overall, the figures are easy to understand and I have following suggestions and questions.

      Figure 1C) It is hard to see WT salivary gland median lobe. If authors have better image, please replace it so that it would be easier to compare WT and transgenic lines.

      We have replaced the wild-type images of salivary glands in this figure and labeled the median and distal-lateral lobes accordingly (see Figure 1).

      Figure 2) While it was interesting to observe the significant expression differences between day 3 and day 4, have you checked whether this expression is maintained over time, or declines or increases (especially on day 17-21, when the authors perform in vivo imaging)?

      Thank you for this interesting question. We have not quantified fluorescence intensities in older mosquitoes. Nevertheless, we regularly observed spillover of the DsRed signal into the GFP channel during sporozoite imaging, suggesting that expression levels, at least in aapp-DsRed expressing mosquitoes, remain high even in mosquitoes >20 days of age (see Figure 8A). We also confirmed this observation by dissecting salivary glands from old mosquitoes, whose distal lateral lobes always showed a strong pink coloration even in normal transmission light (data not shown).

      Figure 3A) There is no description of "Nuc" in figure legend. If "nuc" refers to nucleus, have you stained with nucleus staining dye (example, DAPI)?

      Thank you for spotting this missing information in the legend. Initial images shown in this figure were not stained with a nuclear dye. To test whether the observed GFP expression pattern really colocalizes with DNA, we performed further experiments in which salivary glands from both aapp-hGrx1-roGFP2 and sag(-)KI mosquitoes were stained with Hoechst. We have now included these new data in Figure 3 - figure supplement 1. It appears that GFP is concentrated around the nuclei of the acinar cells, which makes the nuclei clearly visible even without DNA staining.

      Figure 4B) The number of biological replicates in the figure and the legend do not match (In the figure, there are 3-5 data points and, in the legend, text says 3 biological replicates.)

      Thank you for spotting this inconsistency. The number of biological replicates refers to the number of mosquito generations used for experiments. The difference is due to the fact that sometimes two experiments were performed with the same generation of mosquitoes using two different infected mice. We have clarified the legend accordingly to avoid misunderstandings.

      Figure 4C) The number of data points from (B) is 5. However, in (C) only 4 data points are presented.

      We have corrected this mistake. In the previous version, the results of two technical replicates were inadvertently plotted separately in (B) instead of the mean.
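
      The correction described here, plotting the mean of technical replicates rather than each replicate separately, amounts to grouping measurements by experiment before plotting. A minimal sketch, assuming each count is tagged with a hypothetical experiment identifier; the identifiers and values below are invented for illustration and are not data from the manuscript.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def mean_per_experiment(counts: List[Tuple[str, float]]) -> Dict[str, float]:
    """Collapse technical replicates into one mean value per experiment."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for experiment_id, value in counts:
        grouped[experiment_id].append(value)
    return {exp_id: mean(values) for exp_id, values in grouped.items()}


# Invented example: five raw counts, two of which are technical
# replicates of the same experiment ("exp2").
raw_counts = [
    ("exp1", 8000),
    ("exp2", 10000),
    ("exp2", 12000),
    ("exp3", 9000),
    ("exp4", 11000),
]

per_experiment = mean_per_experiment(raw_counts)
```

      With this grouping, five raw values become four plotted data points, which is the kind of discrepancy the reviewer noticed between panels (B) and (C).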

      Figure 5) I would suggest to have thorax image of P. berghei infected mosquito to show both salivary glands and parasites.

      Thank you for this suggestion. Images in Figure 7B (previously Figure 5) were replaced with an infected specimen to show salivary glands (DsRed) and sporozoites (GFP) together.

      Reviewer #4 (Significance (Required)):

      The transgenic lines that the authors created have potential for in vivo imaging of salivary gland and sporozoite interactions. Since the aapp and trio lines have distinct fluorescence expression, they could help elucidate why sporozoites are more likely to invade the distal lateral lobes compared to the median lobe.

      My areas of expertise are confocal microscope imaging, mosquito salivary gland and Plasmodium infection and sporozoite motility.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The first achievements of the Klug et al. study are (i) the genetic engineering of insectary-reared Anopheles coluzzii mosquitoes that stably express distinct fluorescent reporters (DsRed, hGrx1-roGFP2 and EGFP) under the putative "promoters" of genes reported to encode proteins expressed differentially in the multi-lobed salivary glands (Sg) of anthropophilic blood-feeding adult females, and (ii) the analysis of the promoter activity, based on the selected fluorescent reporter, with a primary focus on the salivary gland (including at the Sg lobe level) of the adult female but also considering the preimaginal developmental time with larval and pupal samples. Of note, some data confirm the already reported time-dependent and blood meal-dependent promoter activity for the related Anopheles species. The last part presents a preliminary dataset on live imaging of Plasmodium berghei sporozoites with the aim of highlighting the usefulness of these A. coluzzii transgenic lines to better understand how the rodent Plasmodium sporozoites first colonize and then settle as packed cells in Sg acinar host cells.

      Major comments

      The first two objectives presented by the authors have been convincingly achieved with (i) the challenging production of four different lines expressing different single or double reporters chosen by the authors (and appropriately presented in the result text and figure sections), and (ii) the careful analysis of the spatiotemporal expression of the DsRed reporter under the two "promoters" studied and with regard to the blood feeding event. However, if the reason why the authors have put so much effort into the production of their transgenic mosquitoes is (as mentioned) to provide a significantly improved setting enabling the behavioral analysis of sporozoites upon colonization and survival in the Sg, this part seems rather limited. Likely related to this perception is the fact that I found the introductory section often confusing and not direct enough: in particular in distinguishing the rationale from the necessity to produce appropriate models, and in clarifying what added value these new transgenic lines offer compared to what exists (in Anopheles stephensi), with specific evidence that argues for this knowledge gain. At this stage, it is unfortunately not clear to me what the benefit is of imaging fluorescent Plasmodium sporozoites in hosts with fluorescent salivary gland lobes if one cannot monitor key events of the Sg-sporozoite interaction that were not reachable without the fluorescent mosquito lines. Furthermore, it should be better explained why the rodent Plasmodium species has been chosen rather than Plasmodium falciparum (or other human species), for which A. coluzzii is a natural host; maybe just mentioning that this study serves as a proof of concept would be fine, but bringing real biological insights would be better.

      1- The three genes and gene products selected by the authors should definitely be explained more systematically, which means, for example, that the authors need to introduce the different mosquito species and the parasite-mosquito host pairs they refer to for the promoters/encoded proteins of interest. In the same vein, I did not find any information on the choice of mosquito species (A. coluzzii) for the current work. I was curious to know what the advantage is, since better knowledge was available for Anopheles stephensi with respect to (i) Saglin and its promoter activity, (ii) aap-driven DsRed expression (lines already existing) and (iii) sporozoite-gland interaction.

      2- To help clarify the added value of the present study, introducing the species names of the mosquito and the Plasmodium that serve as a model would be appreciated.

      3- Since a focus is the salivary gland of the blood feeding female Anopheles sp., a rapid description of the glands with different lobes and subdomains the results and figure 1 nicely refer to, would help in the introduction.

      4- That description could logically introduce the few proteins actually identified with lobe-specific or cell domain-specific expression profiles (apical versus basal side, intracellular or surface-exposed, vacuole, duct...). The context with regard to sporozoite biology would then easily validate the "promoter choice". As a minor remark, I do not see why the authors wrote "the astonishing degree of order of the structures (referring to the packing of sporozoites within the Sg acinars) raise the question whether sporozoite can recognize each other". Please clarify, since packing/accumulation can be passive due to mechanical constraints on the cells, and explain what this point has to do with the question and experimental work proposed here.

      5- The selection of hGrx1-roGFP2 is quite interesting and justified but there is then no use of this reporter property in the preliminary characterization of the Sg and Sg-sporozoite interaction. Could the authors provide such characterization?

      6- Figure 1: it would be nice to add in the legend at what time the dissection/imaging was done (age, blood feeding timing?). I would also omit the double mutant trio-DsRed/aapDsRed from the main figure (maybe move it to the supplement), since the two single DsRed mutants shown separately, together with the double mutant (with different fluorescence), already provide the information. I would suggest regrouping the phenotypic presentation of the transgenic line made in the KI mosquitoes (current figure 5) into the main figure 1.

      7- Figure 2:

      a) Is there anything known about the change in Sg size over time? It seems that between day 1 and 2 there is an increase in size and volume, as far as I can evaluate the volume (Fig S4). Could that mean that there is an increase in cell number in the lobes, and therefore more cells expressing the transgene, which would account for the signal intensity increase rather than more transcripts per cell?

      b) Why choose 24 h after the blood meal to assess promoter activity in the Sgs? Do we have any information on how the blood meal impacts Sg development? At this time, anyway, the sporozoites are far from being made. Yoshida and Watanabe 2006 mentioned a significant decrease of Sg proteins post-blood feeding. Could the authors detail their rationale based on the questions they wish to address?

      8- Figure 3: The figure is quite informative in terms of subcellular localization. Concerning the section "Natural variation of DsRed expression in trio-DsRed mosquitoes", I think it could be shortened because it is a bit outside the focus of the study.

      9- In contrast, the last section on live imaging of P. berghei sporozoites in the vicinity of and within the salivary gland should be expanded. The two sentences summarizing the data are quite frustrating: "We also observed single sporozoites moving actively through tissues in a back and forth gliding manner (Fig. 6B, Movie 3) or making contact with the salivary gland although no invasion event could be monitored".

      10- I am aware of the technical difficulties of performing live imaging of sporozoites in whole mosquitoes, even when the salivary gland lobe under observation is closely apposed to the cuticle, but that seems to be the final aim of the authors. I looked very carefully at the three movies and, I am sorry, but at this stage I could not make a meaningful analysis of them and could not agree with the conclusions: for instance, the authors specify that sporozoites were undergoing back-and-forth movements (movie 3), but I do not see that, nor do I see the Sg contours in the available movies. The authors should also add scale bars and time stamps to their movies. Having an in-depth description with regard to the sub-domain marked by a relevant reporter would strengthen the study, even if images are not collected in the whole mosquito in order to get higher resolution.

      I am not sure I understand the relevance of this quite condensed sentence in the text: "The sporozoites' distinctive cell polarization and crescent shape, in combination with high motility, allows them to 'drill' through tissues". Could the authors rephrase and expand it if they wish to keep the issues they refer to? I would put more stress on the main unknowns in terms of sporozoite-Sg interactions and the need for the right models to apply informative approaches (i.e. here, imaging).

      Of note, it could help to point that the "Sgs is a niche in which the sporozoites which egress from the oocyst could mature and be fully competent when co-deposited with the saliva into the dermis of their intermediary hosts"

      Significance

      1- Clear technical significance with the challenging molecular genetics achieved in the mosquito A. coluzzii.

      2- More limited biological significance: fair analysis and gain of knowledge on the spatio-temporal expression of the reporters under the selected promoters, but limited significance of the final-goal analysis, which concerns Plasmodium sporozoite biology once egressed from oocysts.

      3- Previous reports cited by the authors have used the DsRed reporter and the aap promoter in another Anopheles (i.e. A. stephensi, which is also a natural host and vector for human Plasmodium spp.; Yoshida and Watanabe, Insect Mol Biol, 2006; Wells and Andrew, 2019), with significantly better-resolved 3D visualization of GFP-fluorescent P. berghei, but in dissected salivary glands and not in whole mosquitoes. The Wells and Andrew publication entitled "Salivary gland cellular architecture in the Asian malaria vector mosquito Anopheles stephensi" in Parasites & Vectors, 2015 would deserve to be referenced and described.

      4- Audience: I would say that this work should be of interest mostly to scientists investigating Plasmodium biology (basic and field research) or the entomology of Diptera.

      5- To describe my fields of expertise, I can refer to my extensive initial training in entomology including at one point in the genetic basis of mosquito-virus interaction. I have also been working for more than 20 years in the field of Apicomplexa biology (Plasmodium and Toxoplasma) and I have long-standing interest in live and static high-resolution imaging.

    1. Author Response

      Reviewer #1 (Public Review):

      The data support the claims, and the manuscript does not have significant weaknesses in its present form. Key strengths of the paper include using a creative HR-based reporter system combining different inducible DSB positions along a chromosome arm and testing plasmid-based and chromosomal donor sequences. Combining that system with the visualization of specific chromosomal sites via microscopy is powerful. Overall, this work will constitute a timely and helpful contribution to the field of DSB/genome mobility in DNA repair, especially in yeast, and may inform similar mechanisms in other organisms. Importantly, this study also reconciles some of the apparent contradictions in the field.

      We thank the reviewer for these positive comments on the quality of the THRIV system, in helping us to understand global mobility and to reconcile the different studies in the field. The possibility that these mobilities also exist in other organisms is attractive because they could be a way to anticipate the position of the damage in the genome and its possible outcome.

      Reviewer #2 (Public Review):

      The authors are clarifying the role of global mobility in homologous recombination (HR). Global mobility is positively correlated with recombinant product formation in some reports. However, some studies argue the contrary and report that global mobility is not essential for HR. To characterize the role of global chromatin mobility during HR, the authors set up a system in haploid yeast cells that allows simultaneously tracking of HR at the single-cell level and allows the analysis of different positions of the DSB induction. By moving the position of the DSB within their system, the authors postulate that the chromosomal conformation surrounding a DNA break affects the global mobility response. Finally, the authors assessed the contributions of H2A(X) phosphorylation, checkpoint progression and Rad51 in the mobility response.

      One of the strengths of the manuscript is the development of "THRIV" as an efficient method for tracking homologous recombination in vivo. The authors take advantage of the power of yeast genetics and use gene deletions and as well as mutations to test the contribution of H2A(X) phosphorylation, checkpoint progression and Rad51 to the mobility response in their THRIV system.

      A major weakness in the manuscript is the lack of a marker to indicate that DSB formation has occurred (or is occurring). Although at 6 hours there is 80% I-SceI cutting, around 20% of the cells are uncut and cannot be distinguished from the ones that are cut (or have already been repaired). Thus, the MSD analysis is done blind with respect to cells actually undergoing DSB repair.

      The authors clearly outlined their aims and have substantial evidence to support their conclusions. They discovered new features of global mobility that may clear up some of the controversies in the field. They overinterpreted some of their observations, but these criticisms can be easily addressed.

      The authors addressed conflicting results concerning the importance of global mobility to HR, and their results aid in reconciling some of the controversies in the field. A key strength of this manuscript is the analysis of global mobility in response to breaks at different locations within chromosomes. They identified two types of DSB-induced global chromatin mobility involved in HR and postulate that they differ based on the position of the DSB. For example, DSBs close to the centromere exhibit increased global mobility that is not essential for repair and depends solely on H2A(X) phosphorylation. However, if the DSB is far away from the centromere, then global mobility is essential for HR and is dependent on H2A(X) phosphorylation, checkpoint progression as well as the Rad51 recombinase.

      The Bloom lab had previously identified differences in mobility based on the position of the tracked site. However, in the study reported here, the mobility response is analyzed after inducing DSBs located at different positions along the chromosome.

      They also addressed the question of the importance of the Rad51 protein in increased global mobility in haploid cells. Previous studies used DNA damaging agents that induce DSBs randomly throughout the genome, where it would have been rare to induce DSBs near the centromere. In the studies reported in this manuscript, they find no increase in global mobility in a rad51∆ background for breaks induced near the centromere (proximal), but find that breaks induced near the telomeres (distal), are dependent on both gamma-H2A(X) spreading and the Rad51 recombinase.

      We thank the referee for his constructive comments on the strength of our system to accurately determine the impact of a DSB according to its position in the genome. Concerning the issue of damaged cells that were not detected, it is a very important and exciting issue because it confronts our data with the question of biological heterogeneity. We provide evidence on the consistency of our findings despite the lack of detection of undamaged cells.

      Reviewer #3 (Public Review):

      In this study, Garcia Fernandez et al. employ a variety of genetic constructs to define the mechanism underlying the global chromatin mobility elicited in response to a single DNA double-strand break (DSB). Such local and global chromatin mobility increases have been described a decade ago by the Gasser and Rothstein laboratories, and a number of determinants have been identified: one epistasis group results in H2A-S129 phosphorylation via Rad9 and Mec1 activation. The mechanism is thought to be due to chromatin rigidification (Herbert 2017; Miné-Hattab 2017) or general eviction of histones (Cheblal 2020). More enigmatic, global chromatin mobility increase also depends on Rad51, a central recombination protein downstream of checkpoint activation (Smith & Rothstein 2017), which is also required for local DSB mobility (Dion .. Gasser 2012). The authors set out to address this difficulty in the field.

      A premise of their study is the convergence of two types of observations: First, the H2A phosphorylation ChIP profile matches that of Rad51, with both spreading in trans on other chromosomes at the level of centromeres when a DSB occurs in the vicinity of one of them (Renkawitz 2014). Second, global mobility depends on H2A phosphorylation and on Rad51 (their previous study Herbert 2017). They thus address whether the Rad51-ssDNA filament (and associated proteins) marks the chromatin engaged during the homology search. They found that the extent of the mobility depends on the residency time of the filament in a particular genomic and nuclear region, which can be induced at an initially distant trans site by providing a region of homology. Unfortunately, these findings are not clearly apparent from the title and the abstract, and in fact somewhat misrepresented in the manuscript, which would call for a rewrite (see points below).

      The main goal of our study was to understand the role of global mobility in repair by homologous recombination, depending on the location of the damage. We found distinct global mobility mechanisms, in particular in the involvement of the Rad51 nucleofilament, depending on whether the DSB was pericentromeric or not. It is thus likely that when the DSB is far from the pericentromere, the residence time of the Rad51 nucleofilament with the donor has an impact on global mobility. Although our experiments were not designed to directly address the residence time of the nucleofilament, we now discuss in more detail the causes and consequences of global mobility.

      To this end, they induce the formation of a site-specific DSB in either of two regions: a centromere-proximal region and a telomere-proximal region, and measure the mobility of an undamaged site near the centromere on another chromosome (with a LacO-LacI-GFP system). This system reveals that only the centromere-proximal DSB induces the mobility of the centromere-proximal undamaged site, in a Rad9- and Rad51-independent manner. Providing a homologous donor in the vicinity of the LacO array (albeit in trans) restores its mobility when the DSB is located in a subtelomeric region, in a Rad9- and Rad51-dependent fashion. These genetic requirements are the same as those described for local DSB mobility (Dion & Gasser 2012), drawing a link between the two types of mobility, which to my knowledge was not described. The authors should focus their message (too scattered in the current manuscript), on these key findings and the diffusive "painting" model, in which the canvas is H2A, the moving paintbrush Mec1, and the hand the Rad51-ssDNA filament whose movement depends on Rad9. In the absence of Rad51-Rad9 the hand stays still, only decorating H2A in its immediate environment. The amount of paint deposited depends on the residency time of the Rad51-ssDNA-Mec1 filament in a given nuclear region. This synthesis is in agreement with the data presented and contrasts with their proposal that "two types of global mobility" exist.

      The brush model is very useful in explaining the distal mobility, which indeed is linked to the genetic requirements of local mobility, but it is also helpful to think of a different model when pericentromeric damage occurs. To stay with painting techniques, this model would be similar to the pouring technique, in which oil paint is deposited on water and spreads in a multidirectional manner. It is likely that Mec1 or Tel1 are the factors responsible for this spreading pattern. We therefore propose to maintain the notion of two distinct types of mobility. Without going into pictorial techniques in the text, we have attempted to clarify these two models in the manuscript.

      The rest of the manuscript attempts to define a role in DSB repair of this phosphor-H2A-dependent mobility, using a fluorescence recovery assay upon DSB repair. They correlate a defect in the centromere-proximal mobility (in the rad9 or h2a-s129a mutant) when a DSB is distantly induced in the subtelomere with a defect in repairing the DSB. Repair efficiency is not affected by these mutations when the donor is located initially close to the DSB site. This part is less convincing, as repair failure specifically at a distant donor in the rad9 and H2A-S129A mutants may result from other defects relating to chromatin than its mobility (i.e. affecting homology sampling, DNA strand invasion, D-loop extension, D-loop disruption, etc), which could be partially alleviated by repeated DSB-donor encounters when the two are spatially close. In fact, suggesting that undamaged site mobility is required for the early step of the homology search directly contradicts the fact that the centromere-proximal mobility induced by a subtelomeric DSB depends on the presence of a donor near the centromere: mobility is thus a product of homology identification and increased Rad51-ssDNA filament residency in the vicinity of the centromere, and so downstream of homology search. This is a major pitfall in their interpretation and model.

      We thank the referee for helping to clarify the question of the cause and consequence of global mobility. As he pointed out, the fact that a donor is required to observe both H2A phosphorylation and distal mobility implicates the recombination process itself, as well as the residence time of the Rad51 nucleofilament, in the γ-H2A(X) spreading and indicates that recombination would be the cause of distal mobility. In contrast, the fact that proximal mobility can exist independently of homologous recombination suggests that in this particular configuration, HR would then be a consequence of proximal mobility.

      In conclusion, I think the data presented are of importance, as they identify a link between local and global chromatin mobility. The authors should rewrite their manuscript and reorganize the figures to focus on the painter model that their data support. I propose experiments that will help bolster the manuscript conclusions.

      1) Attempt dual-color tracking of the DSB (i.e. Rad52-mCherry or Ddc1-mCherry) and the donor site, and track MSD as a function of proximity between the DSB and the Lac array (with DSB +/-dCen). The expectation is that only upon contact (or after getting in close range) should the MSD at the centromere-proximal LacO array increase with a DSB at a subtelomere. Furthermore, this approach will help distinguish MSDs in cells bearing a DSB (Rad52 foci) from undamaged ones (no Rad52 foci)(see Mine-Hattab & Rothstein 2012). This would help overcome the inefficient DSB induction of their system (less than 50% at 1 hr post-galactose addition, and reaching 80% at 6 hr). For the reader to have a better appreciation of the data distribution, replace the whisker plots of MSD at 10 seconds with either scatter dot plot or violin plots, whichever conveys most clearly the distribution of the data: indeed, a bimodal distribution is expected in the current data, with undamaged cells having lower, and damaged cells having higher MSDs.
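
      As an illustration of the quantity under discussion, the sketch below computes a time-averaged mean square displacement (MSD) for a single 2D track at a chosen lag. It is a minimal example under stated assumptions, not the authors' analysis pipeline: the function name `msd_at_lag`, the representation of a track as a list of (x, y) positions, and the synthetic track are all hypothetical. Converting a lag in frames to "10 seconds" would additionally depend on the acquisition frame interval, which is not given here.

```python
from typing import List, Tuple


def msd_at_lag(track: List[Tuple[float, float]], lag: int) -> float:
    """Time-averaged mean square displacement of one 2D track.

    `track` holds successive (x, y) positions of a tagged locus and
    `lag` is the time lag expressed in frames (hypothetical layout).
    """
    if lag <= 0 or lag >= len(track):
        raise ValueError("lag must be between 1 and len(track) - 1")
    squared_displacements = [
        (track[i + lag][0] - track[i][0]) ** 2
        + (track[i + lag][1] - track[i][1]) ** 2
        for i in range(len(track) - lag)
    ]
    return sum(squared_displacements) / len(squared_displacements)


# Synthetic track for illustration only: steady drift along x.
example_track = [(0.1 * i, 0.0) for i in range(20)]
print(msd_at_lag(example_track, 10))
```

      Computing one such value per cell and plotting the values individually (scatter or violin plot) is what would reveal a bimodal distribution, if undamaged and damaged cells indeed differ.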

      The reviewer raises two points here.

      The first point concerns the residence time of the Rad51 filament with the donor when a subtelomeric DSB occurs. Measuring the MSDs as a function of the distance between the donor and Rad52-mCherry (or Ddc1-mCherry) would make it possible to decide whether global mobility is a cause or a consequence. Thus, if mobility is the consequence of (stochastic) contact, leading to more efficient homologous recombination, we would see an increase in MSDs only when the distance between donor and filament is small. Conversely, if global mobility is the cause of contact, the increase in mobility would be visible even when the distance between donor and filament is large. This would require a labelling system with three different fluorophores: one for global mobility, one for the donor and one for following the filament. This triple labelling is still to be developed.

      The second point concerns the important question of the heterogeneity of a population, a central challenge in biology. Here we wish to distinguish between undamaged and damaged cells. Even if a selection of the damaged cells had been made, this would not entirely resolve the inherent cell-to-cell variation: at a given time, it is possible that a cell, although damaged, moves little and, conversely, that a cell moves more even if not damaged. The question of heterogeneity is therefore important and the subject of intense research that goes beyond the framework of our work (Altschuler and Wu, 2010). However, to begin to clarify whether a bias could exist when considering a mixed population (20% undamaged and 80% damaged), we analyzed MSDs using a scatter plot. We considered the two populations of cells in which the damage is best controlled, i.e. i) the red population, which we know has been repaired and, importantly, has lost the cut site and will not be cut again (undamaged-only population) and ii) the white population, blocked in G2/M because it is damaged and not repaired (damaged-only population). These two populations show very significant differences in their median MSDs. We artificially mixed the MSD values obtained from these two populations at a rate of 20% undamaged-only cells and 80% damaged-only cells. We observed that the mean MSDs of the damaged-only and undamaged-only cells were significantly different. Yet, the mean MSD of damaged-only cells was not statistically different from the mean MSD of the 20%-80% mixed cell population. Thus, the conclusions based on the average MSDs of all cells remain consistent.

      Scatter plot showing the MSD at 10 seconds of the damaged-only population (in white), the repaired-only population (in red), or the 20%-80% mixed population
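
      The mixing procedure described in this response can be sketched as follows. This is only an illustration under assumptions: the values are synthetic (drawn from arbitrary Gaussian distributions, not the authors' measurements), the helper `mix_populations` is hypothetical, and the response does not state which statistical test was applied, so the sketch only compares means.

```python
import random
from statistics import mean


def mix_populations(undamaged, damaged, frac_undamaged, n, rng):
    """Resample n MSD values with a fixed fraction taken from `undamaged`.

    Values are drawn with replacement; the remainder comes from `damaged`.
    """
    k = round(frac_undamaged * n)
    return rng.choices(undamaged, k=k) + rng.choices(damaged, k=n - k)


rng = random.Random(0)
# Synthetic MSD values in arbitrary units, for illustration only.
undamaged_only = [rng.gauss(0.03, 0.01) for _ in range(200)]
damaged_only = [rng.gauss(0.06, 0.02) for _ in range(200)]
mixed = mix_populations(undamaged_only, damaged_only, 0.2, 200, rng)

for name, values in [
    ("undamaged-only", undamaged_only),
    ("damaged-only", damaged_only),
    ("20%-80% mixed", mixed),
]:
    print(f"{name}: mean MSD = {mean(values):.4f}")
```

      With a 20% admixture, the mixed mean is shifted from the damaged-only mean by one fifth of the difference between the two populations, which is why a small contaminating fraction may leave the comparison of means largely unchanged.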

      2) Perform the phospho-H2A ChIP-qPCR in the C and S strains in the absence of Rad51 and Rad9, to strengthen the painter model.

      ChIP experiments in mutant backgrounds as well as phosphorylation/dephosphorylation kinetics would corroborate the mobility data described here, but are beyond the scope of this manuscript. Yet, a phospho-H2A ChIP experiment was performed in a Δrad51 mutant in Renkawitz et al. 2013. In that case, γH2A propagation was restricted only to the region around the DSB, corroborating both the requirement for Rad51 in distal mobility and the lack of requirement for Rad51 in proximal mobility.

      3) Their data at least partly run against previously published results, or fail to account for them. For instance, it is hard to see how their model (or the painter model), could explain the constitutively activated global mobility increase observed by Smith .. Rothstein 2018 in a rad51 rad52 mutant. Furthermore, the Gasser lab linked the increased chromatin mobility to a general loss of histones genome-wide, which would be inconsistent with the more localized mechanism proposed here. Do they represent an independent mechanism? These conflicting observations need to be discussed in detail.

      Apart from the fact that the mechanisms in place in a haploid or a diploid cell are not necessarily comparable, it is not clear to us that our data are inconsistent with that of Smith et al. (Smith et al., 2018). Indeed, it is not known by which mechanisms the increase in global mobility is constitutively activated in a Δrad51 Δrad52 mutant. But according to their hypothesis the induction of a checkpoint is likely and so is the phosphorylation of H2A. It would be interesting to verify γH2A in such a context. This question is now mentioned in the main text.

      Concerning histone loss, it appears to differ depending on the number of DSBs. Upon multiple DNA damage events following genotoxic treatment with Zeocin, Susan Gasser's group has clearly established that nucleosome loss occurs (Cheblal et al., 2020; Hauer et al., 2017). Nucleosome loss, like H2A phosphorylation as we have shown (Garcia Fernandez et al., 2021; Herbert et al., 2017), leads to increased global mobility. The state of chromatin following these histone losses or modifications is not yet fully understood, but the two could coexist. In the case of a single DSB induced by HO, it is the local mobility of the MAT locus that is examined (Fig. 3B in Cheblal et al., 2020). In this case, the increase in mobility is indeed dependent on Arp8, which controls histone degradation, and correlates with a polymer pattern consistent with normal chromatin. It is likely that histone degradation occurs locally when a single DSB occurs. Concerning genome-wide histone loss, the question remains open. If histone eviction nevertheless occurred globally upon a single DSB, both types of modifications could be possible. This aspect is now mentioned in the discussion.

    1. anticipations is key to 01:08:38 everything and attention is key to everything so every organism does that plants and everything else and it doesn't require a central nervous system 01:08:51 and and you i might add to this that not only is every organism cognitive but essentially every organism organism is cooperative to those cooperation and cognition 01:09:03 go hand in hand because any intelligent organism any organism that can act to better its you know viability is going to cooperate in 01:09:17 meaningful ways with other organisms and you know other species and things like that nice point because um there's cost to communication whether it's exactly whether it's the cost of making the pheromone 01:09:30 or just the time which is super finite or attention fundamentally and so costly interactions through time the game theory are either to exploit and stabilize which is fragile 01:09:42 or to succeed together yeah exactly and and and succeeding together cooperation is is is like everywhere once you once you understand what you're looking 01:09:54 for it's in the biologic world it's like everywhere so this idea that we're you know one one one person against all or you know we're a dog eat dog universe i mean it's you 01:10:08 know in a certain sense it's true obviously tigers eat you know whatever they eat zebras or whatever i mean that happens yes of course but in the larger picture 01:10:19 over and over multiple time scales not just uh you know in five minutes but over evolutionary time scales and uh you know developmental time scales and everything the cooperation is really the rule 01:10:33 for the most part and if you need if any listener needs proof of that just think of who you think of your body i mean there's about a trillion some trillion some cells 01:10:45 that are enormously harmonious like your blood pumps every day or you know this is a this is like a miracle i don't want to use the word miracle because i want to get into 01:10:59 whatever that might imply but uh it is amazing awe-inspiring the the depth of cooperation just in our own bodies is like that's that's like 01:11:12 evolution must prefer cooperation or else there would never be such a complex uh pattern of cooperation as we see just in one human body 01:11:26 just to give one example from the bees so from a species i study it's almost like a sparring type of cooperation because when it was discovered that there were some workers with developed ovaries 01:11:38 there was a whole story about cheating and policing and about altruism and this equation says this and that equation says that and then when you take a step back it's like the colony having a distribution of over-reactivation 01:11:51 may be more ecologically resilient so um i as an evolutionary biologist never think well my interpretation of what would be lovey-dovey in this system must be how it works because that's so 01:12:05 clearly not true it's just to say that there are interesting dynamics within and between levels and in the long run cooperation and stable cooperation and like learning to adapt 01:12:17 to your niche is a winning strategy in a way that locking down just isn't but unfortunately under high um stress and 01:12:29 uh high uncertainty conditions simple strategies can become rife so that's sort of a failure mode of the population

      The human body, or the body of ANY multicellular animal or plant, is a prime example of cooperation: billions of cells cooperating with each other to regulate the body system.

      The body of any multi-cellular organism, whether flora or fauna is an example of exquisite cellular and microbial cooperation. A multi-cellular organism is itself a superorganism in this sense. And social organisms then constitute an additional layer of superorganismic behavior.

    1. Reviewer #2 (Public Review):

      The research paper presents a modeling approach aimed at disentangling mothers' genetic effects on their offspring into two components: prenatal environment and postnatal environment. Specifically, the authors use SEM on adopted and non-adopted individuals from the UK Biobank and leverage the variation in genetic similarities across different family structures. Because the UK Biobank was not created as an adoption study, they build seven different family structures to include all possible family combinations that can provide information regarding the two parameters of interest: those representing the prenatal and postnatal environment respectively. The model is applied to two phenotypes (birthweight and educational attainment) to illustrate it.

      The results indicate an 'expected pattern of maternal genetic effect on offspring birthweight' and 'unexpectedly large prenatal (intrauterine) maternal genetic effects on offspring education attainment'. The authors mention that this result can likely be explained by adopted offspring being raised by biological relatives. They then show simulations supporting this hypothesis.

      We praise the authors for the complex analyses executed and the work done to create the model and make the scripts available to the research community. The models can be a valuable addition to the behavior genetics literature and to researchers' toolkit. We do however have a few concerns regarding 1. the meaning of the results, 2. model building decisions and the choice of sample and 3. the way some limitations are addressed. We go into more detail on each of these points.

      1. Interest to study mothers' genetic effects as acting via the prenatal environment or the postnatal environment and the meaning of the parameters tested by the model

      I think this is an interesting question and a useful distinction for a number of phenotypes and the authors use the adoption design in an innovative way to define and estimate parameters that correspond to this distinction. However, I would suggest that the expressions of prenatal environmental effect and postnatal environmental effect (as distinct pathways for mother's gene to be expressed) seem to be an overstatement.

      The definition of maternal genetic effects (effects of the mother's genotype on her child's phenotype, over and above any genetic transmission) cites Wolf & Wade 2009 (line 56), who mention the more general notion of 'maternal effects', defined as the effect of the maternal genotype, phenotype (or both) on the offspring. I would argue that postnatal maternal genetic effects (as currently defined in the paper) are likely environmental effects and not only 'genetic effects'.

      These environmental effects are indeed partly influenced by mother's genes, but also strongly affected by other variables such as culture, generation, SES, education. It is not possible to disentangle these effects in the design(s) used here.

      This consideration can affect the authors' definition of the covariance between an adopted individual's genotype and phenotype as a function of prenatal (but not postnatal) maternal genetic effects (line 93-94). The authors' current assumption does not consider the potential for environmental modulation of the effect of adoptive mothers' genes (which are not zero for several phenotypes). Postnatal maternal genetic effects are thus also likely to capture and represent environmental differences.

      2. Model building decisions specific to the UK biobank

      One of the main issues is that the method is tested on a sample that was not built as an adoption design. This forced the authors to make decisions to circumvent this problem and led to important limitations that are not inherent to their method, but to the specific sample they applied it to.

      a. Having adoptive parents partly genetically related to the child breaks the logic of the adoption design. Thus, it brings back the genetic confound (passive gene-environment correlation) problem of the usual family-based design. In their case, it alters their ability to differentiate between prenatal and postnatal environment.

      b. In section starting on line 426, the authors have included simulations to show how this issue could be addressed. However, it does not help the fact that in their model applied to the UK biobank, the information regarding the degree of genetic similarity between adopting parents and biological parents and the child is unknown.

      c. To address this problem in their analyses of UK biobank, authors used (Lines 302 & 417) information regarding whether children were breastfed or not (on the basis that this knowledge would be more common if the child was raised by a biological family relative) to identify adopted singletons raised by biological relatives. However, this is, at best, a mediocre index of genetic relatedness. I can see other reasons for participants to know whether they were breastfed: because they were adopted at an older age, or because they are still (or have been) in contact with their biological mother. It is also possible, albeit rare, that adoptive parents may breastfeed a child via the use of drugs to stimulate milk production. Line 420: the fact that the prenatal maternal estimate became non-significant after removing participants that were breastfed does provide results more in line with what would be expected. But we can't use expected results as a basis to evaluate the validity of the approach. The absence of GxE and rGE are two other strong assumptions of the model whose violation could also produce this kind of unexpected result.

      d. I would suggest discussing the issue of genetic relatedness between adopting parents and offspring in terms of passive rGE which is a common problem for the estimation of parental effects in every familial design.

      e. Line 291: why use an unweighted PRS for EY3 (Lee, 2018), while the usual way of computing PRS (as a weighted sum of risk alleles) was used for birthweight?

      3. Limitations

      Assess other limitations of their method.

      a. limitation of the availability of birth father information,

      b. prenatal events uncorrelated with birthmother's genes (disease or accidents),

      c. Inferring prenatal environment effect from higher birth mother correlation compared to birthfather is subject to bias from measurement differences between the two (Loehlin, 2016).

      d. age at which the child is adopted (if the child has been partly raised by birth parents before adoption, it would bias (raise) the estimates of prenatal effects).

      e. evocative rGE not mentioned. It has been shown that parents partly react to children's behaviors. Thus, the estimate of maternal genetic postnatal effects could be biased (lowered) by evocative gene-environment correlation. In other words, the model also assumes no evocative gene-environment correlation.

      Final thoughts:

      1. I would like a better case made for why it is important to distinguish genetic effects into prenatal and postnatal effect.

      2. I would suggest the authors make a clear distinction between the limits inherent to their sample (UK biobank) and those inherent to their methodological approach. I see that the method's important usefulness is plagued by limits inherent to the sample used. At the same time, I am not aware of the availability of a big enough sample of adopted children with genotypic information available to compute PRS.

    2. Author Response

      Reviewer #2 (Public Review):

      Summary

      The research paper presents a modeling approach aimed at disentangling mothers' genetic effects on their offspring into two components: prenatal environment and postnatal environment. Specifically, the authors use SEM on adopted and non-adopted individuals from the UK Biobank and leverage the variation in genetic similarities from different family structures. Because the UK Biobank was not created as an adoption study, they build seven different family structures to include all possible family combinations that can provide information regarding the two parameters of interest: those representing prenatal and postnatal environment respectively. The model is applied to two phenotypes (birthweight and educational attainment) to illustrate it.

      The results indicate an 'expected pattern of maternal genetic effect on offspring birthweight' and 'unexpectedly large prenatal (intrauterine) maternal genetic effects on offspring educational attainment'. The authors mention this result can likely be explained by adopted offspring being raised by biological relatives. They then show simulations supporting this hypothesis.

      We praise the authors for the complex analyses executed and the work done to create the model and make the scripts available to the research community. The models can be a valuable addition to the behavior genetics literature and to researchers' toolkit. We do however have a few concerns regarding 1. the meaning of the results, 2. model building decisions and the choice of sample and 3. the way some limitations are addressed. We go into more detail on each of these points.

      1) Interest in studying mothers' genetic effects as acting via the prenatal environment or the postnatal environment and the meaning of the parameters tested by the model.

      I think this is an interesting question and a useful distinction for a number of phenotypes and the authors use the adoption design in an innovative way to define and estimate parameters that correspond to this distinction. However, I would suggest that the expressions of prenatal environmental effect and postnatal environmental effect (as distinct pathways for mother's gene to be expressed) seem to be an overstatement.

      The definition of maternal genetic effects (effects of the mother's genotype on her child's phenotype, over and above any genetic transmission) cites Wolf & Wade 2009 (line 56), who mention the more general notion of 'maternal effects', defined as the effect of the maternal genotype, phenotype (or both) on the offspring. I would argue that postnatal maternal genetic effects (as currently defined in the paper) are likely environmental effects and not only 'genetic effects'. These environmental effects are indeed partly influenced by mother's genes, but also strongly affected by other variables such as culture, generation, SES, education. It is not possible to disentangle these effects in the design(s) used here.

      Although we have referred to the maternal effects estimated in our manuscript as “prenatal maternal genetic effects” and “postnatal maternal genetic effects”- all of these effects on the offspring are mediated through maternal phenotypes (which as the reviewer correctly notes, will be influenced by both genes and the environment). In other words, the maternal PRS used in our study proxies some maternal phenotype/s that then forms part of the offspring’s prenatal and/or postnatal environment which then affects the offspring’s phenotype. We have referred to these effects as maternal genetic effects rather than just maternal effects to emphasize the causal link with the maternal genotype and the fact that we are only proxying that part of the maternal phenotype that is explained by the relevant genetic variation (NB. This is consistent with the Wolf & Wade 2009 definition of maternal effects i.e. “…the causal influence of maternal genotypes on offspring phenotypes…”). We agree with the reviewer that our model is not attempting to disentangle proportions of variance due to genetic and environmental factors (which is not its purpose).

      This consideration can affect the authors' definition of the covariance between an adopted individual's genotype and phenotype as a function of prenatal (but not postnatal) maternal genetic effects (line 93-94). The authors' current assumption does not consider the potential for environmental modulation of the effect of adoptive mothers' genes (which are not zero for several phenotypes). Postnatal maternal genetic effects are thus also likely to capture and represent environmental differences.

      Assuming that adopted offspring are not biologically related to their adoptive mothers, then adopted individuals’ PRS should not be correlated with adoptive mothers’ PRS. The corollary is that adoptive mothers’ PRS should not influence the covariance between adopted individuals’ PRS and phenotype (i.e. regardless of whether there is environmental modulation of the effect of adopted mothers’ genes on offspring phenotype). It is true, however, that we do not consider genotype by environment interaction effects in our model, and that this is a limitation of our model. We allude to this important point several times in the Discussion:

      “Those assumptions explicitly encoded in Figure 1 include that the total maternal genetic effect can be decomposed into the sum of prenatal and postnatal components, that genetic effects are homogenous across biological and adoptive families, the absence of genotype x environment interaction…”

      And

      “In contrast, in our design it is more important that genetic effect sizes are homogenous across adopted and non-adopted individuals (i.e. no genotype by environment interaction)…”.

      At the request of the reviewer, we now include additional discussion of GxE and other assumptions of our model in further detail in Supplementary File 17.

      2) Model building decisions specific to the UK biobank. One of the main issues is that the method is tested on a sample that was not built as an adoption design. This forced the authors to make decisions to circumvent this problem and led to important limitations that are not inherent to their method, but to the specific sample they applied it to.

      a) Having adoptive parents partly genetically related to the child breaks the logic of the adoption design. Thus, it brings back the genetic confound (passive gene-environment correlation) problem of the usual family-based design. In their case, it alters their ability to differentiate between prenatal and postnatal environment.

      We agree that the UK Biobank was never designed for this purpose, and that data from it regarding adoption is less than perfect. Nevertheless, we think that an important conclusion of our paper is that large-scale biobanks (which, because of their size, contain many hundreds/thousands of adopted individuals) can be used to partition maternal genetic effects into prenatal and postnatal components, provided good quality data on the adoption process and/or genetic information on the adoptive parents has been gathered.

      To help address the reviewer’s concerns we have created a Supplementary Table (Supplementary File 17) that summarizes some of the main limitations/assumptions of our model, whether they are specific to the UK Biobank dataset or intrinsic to our method, their consequences on model parameters, and possible options for addressing them.

      b) In section starting on line 426, the authors have included simulations to show how this issue could be addressed. However, it does not help the fact that in their model applied to the UK biobank, the information regarding the degree of genetic similarity between adopting parents and biological parents and the child is unknown.

      We agree- but we feel it is important to demonstrate (a) that cryptic biological relatedness between adopted individuals and their adoptive parents is a potential issue not only for our study, but for other studies attempting to utilize this information in the UK Biobank, and (b) that cryptic relatedness can be dealt with effectively through appropriate modelling in our SEM framework (i.e. even if it is not possible with the current data from UK Biobank). The corollary is that we recommend that the UK Biobank (and other large-scale biobanks) attempt to acquire information on adopted individuals and their parents through e.g. questionnaire.

      c) To address this problem in their analyses of UK biobank, authors used (Lines 302 & 417) information regarding whether children were breastfed or not (on the basis that this knowledge would be more common if the child was raised by a biological family relative) to identify adopted singletons raised by biological relatives. However, this is, at best, a mediocre index of genetic relatedness. I can see other reasons for participants to know whether they were breastfed: because they were adopted at an older age, or because they are still (or have been) in contact with their biological mother. It is also possible, albeit rare, that adoptive parents may breastfeed a child via the use of drugs to stimulate milk production. Line 420: the fact that the prenatal maternal estimate became non-significant after removing participants that were breastfed does provide results more in line with what would be expected. But we can't use expected results as a basis to evaluate the validity of the approach. The absence of GxE and rGE are two other strong assumptions of the model whose violation could also produce this kind of unexpected result.

      We agree that (a) the inclusion of adopted individuals whose adoptive parents are biologically related to them is only one possible reason for unexpectedly strong prenatal maternal genetic effect estimates, (b) attempting to remove these individuals from the analysis using a proxy like breastfeeding information is less than perfect. As indicated above, we now discuss in detail alternative explanations for our results including violations of assumptions regarding the absence of GxE and rGE, and other explanations (assortative mating, stratification etc) (see new text in the Discussion and Supplementary File 17).

      d) I would suggest discussing the issue of genetic relatedness between adopting parents and offspring in terms of passive rGE which is a common problem for the estimation of parental effects in every familial design.

      We now include mention of passive rGE in the Discussion:

      “Rather we hypothesize it is possible that our model could have been misspecified in that substantial numbers of adopted individuals in the UK Biobank may have in fact been raised by their biological relatives. This can be thought of as (unintentional) reintroduction of passive gene-environment correlation into the study. In other words, adopted children are brought up by their genetic relatives, who in turn provide the environment in which they are raised. This induces a correlation between adopted individuals’ PRS and their environment.”

      e) Line 291: why use an unweighted PRS for EY3 (Lee, 2018), while the usual way of computing PRS (as a weighted sum of risk alleles) was used for birthweight?

      We thank the reviewer for pointing this inconsistency out. We have now rerun the analyses using weighted and unweighted PRS for both birth weight and educational attainment. The reason for running both sets of analyses is that the GWAS on which the SNPs are selected (i.e. the weights are based), contains UK Biobank individuals. This may inflate the overall strength of association between the PRS and outcome through winner’s curse (although not differentially between individuals from adoptive and biological families). In contrast, unweighted scores should be much more robust to this inflation, and so are a useful sanity check on the results.
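      The distinction between weighted and unweighted scores discussed above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `polygenic_score`, the dosages and the effect sizes are all hypothetical, and the sketch assumes that dosages are already coded as counts of the trait-increasing allele.

```python
from typing import Optional, Sequence


def polygenic_score(dosages: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Combine per-SNP allele dosages (0, 1 or 2) into a single score.

    With no weights, every trait-increasing allele counts equally
    (unweighted score). With weights, each dosage is multiplied by its
    GWAS effect size estimate (weighted score).
    """
    if weights is None:
        return float(sum(dosages))
    if len(weights) != len(dosages):
        raise ValueError("need exactly one weight per SNP")
    return float(sum(d * w for d, w in zip(dosages, weights)))


# Hypothetical individual genotyped at four SNPs.
dosages = [2, 0, 1, 1]
# Hypothetical GWAS effect sizes, used for illustration only.
betas = [0.5, 0.25, 0.125, 0.25]

unweighted = polygenic_score(dosages)        # simple allele count
weighted = polygenic_score(dosages, betas)   # effect-size-weighted sum
```

      Because the unweighted score ignores the effect size estimates, it does not inherit any inflation in those estimates from overlap between the GWAS sample and the target sample, although SNP selection itself can still be affected.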

      3) Limitations

      As our Discussion is already very long, we have created a Supplementary Table (Supplementary File 17) that summarizes some of the main limitations/assumptions of our model, their consequences on model parameters, and possible options for addressing them. We also discuss specific concerns raised by the referee below.

      Assess other limitations of their method.

      a) limitation of the availability of birth father information,

      Our model does not require information on adopted individuals’ birth fathers (although it does require PRS on non-adopted individuals’ birth fathers, which is typically readily available). It does, however, make the assumption that fathers do not contribute prenatally to offspring traits, which we think is a reasonable assumption for the majority of offspring phenotypes. If PRS for adopted individuals’ biological fathers were available, then prenatal paternal genetic effects could be estimated as part of the model. To accommodate the reviewer’s request, we have included and discussed this limitation/assumption in more detail in Supplementary File 17.

      b) prenatal events uncorrelated with birthmother's genes (disease or accidents),

      We agree that our model assumes that maternal genotype is uncorrelated with prenatal environmental factors. We now discuss this assumption/limitation further in Supplementary File 17.

      c) Inferring prenatal environment effect from higher birth mother correlation compared to birthfather is subject to bias from measurement differences between the two (Loehlin, 2016).

      Whilst this is a limitation of adoption designs that estimate prenatal effects using the difference between maternal and paternal correlations with offspring phenotypes, this is not actually a limitation of our model. In our model we do not use (phenotypic) mother-child and father-child correlations (we use PRS-phenotype correlations). Also, in our model, information on the size of the prenatal (and postnatal) maternal genetic effects primarily comes from the difference between the PRS-phenotype covariance in adopted singletons and the PRS-phenotype covariance in non-adopted individuals (i.e. not from the difference between maternal and paternal correlations with offspring phenotypes). We state this in the Introduction and Methods e.g.:

      “Thus, the difference between the genotype-phenotype covariance in adopted and non-adopted singleton individuals provides important information on the likely size of postnatal genetic effects.”

      It is also worth noting, that in our model, the size of the paternal PRS-offspring association does not factor into the estimation of maternal genetic effects (nor does the difference between the maternal PRS-offspring phenotype association and the paternal PRS-offspring phenotype association). Also, our model takes into account if there are differences in the amount of (random) measurement error in adoptive and non-adoptive families.
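      The logic of comparing PRS-phenotype covariances in adopted and non-adopted singletons can be sketched with a toy simulation. This is a deliberately simplified linear model, not the authors' SEM: it assumes standardized PRS, random mating, adoptive mothers unrelated to the child, no GxE or rGE, and hypothetical effect sizes (`direct`, `prenatal`, `postnatal`). Under these assumptions the offspring-mother PRS covariance is 0.5, so the covariance gap between the two groups is expected to be half the postnatal effect.

```python
import random
from statistics import fmean


def simulate_cov(n: int, adopted: bool, direct: float, prenatal: float,
                 postnatal: float, rng: random.Random) -> float:
    """Sample covariance between offspring PRS and offspring phenotype."""
    prs, pheno = [], []
    for _ in range(n):
        bio_mother = rng.gauss(0, 1)
        bio_father = rng.gauss(0, 1)
        # Offspring PRS: parental average plus segregation noise (variance 1).
        child = 0.5 * (bio_mother + bio_father) + rng.gauss(0, 0.5 ** 0.5)
        # Adopted: the rearing mother is an unrelated woman (toy assumption).
        rearing_mother = rng.gauss(0, 1) if adopted else bio_mother
        y = (direct * child + prenatal * bio_mother
             + postnatal * rearing_mother + rng.gauss(0, 1))
        prs.append(child)
        pheno.append(y)
    mean_prs, mean_y = fmean(prs), fmean(pheno)
    return sum((g - mean_prs) * (v - mean_y)
               for g, v in zip(prs, pheno)) / (n - 1)


rng = random.Random(2022)
# Hypothetical effect sizes, chosen only for illustration.
params = dict(direct=0.3, prenatal=0.1, postnatal=0.2)
cov_non_adopted = simulate_cov(100_000, adopted=False, rng=rng, **params)
cov_adopted = simulate_cov(100_000, adopted=True, rng=rng, **params)

# Expected covariances: direct + 0.5 * (prenatal + postnatal) versus
# direct + 0.5 * prenatal, so twice the gap recovers the postnatal effect.
postnatal_estimate = 2 * (cov_non_adopted - cov_adopted)
```

      In this toy setting `postnatal_estimate` should land close to the simulated value of 0.2, up to sampling error. If some adopted individuals were in fact raised by biological relatives, the gap would shrink and the postnatal effect would be understated, which is the misspecification the authors discuss.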

      d) age at which the child is adopted (if the child has been partly raised by birth parents before adoption, it would bias (raise) the estimates of prenatal effects).

      We agree and now discuss this limitation further in Supplementary File 17.

      e) evocative rGE not mentioned. It has been shown that parents partly react to children's behaviors. Thus, the estimate of maternal genetic postnatal effects could be biased (lowered) by evocative gene-environment correlation. In other words, the model also assumes no evocative gene-environment correlation.

      We agree and now discuss this limitation in Supplementary File 17 (although we note that the effect that evocative rGE will have on the SEM parameters will depend on the direction of the gene-environment correlation).

      Final thoughts

      1) I would like a better case made for why it is important to distinguish genetic effects into prenatal and postnatal effect.

      We have included the following text in the Introduction:

      “Given the increasing number of variants identified in GWAS that exhibit robust maternal genetic effects, a natural question to ask is whether these loci exert their effects on offspring phenotypes through intrauterine mechanisms, the postnatal environment, or both. Indeed, resolving maternal effects into prenatal and postnatal sources of variation could be a valuable first step in eventually elucidating the underlying mechanisms behind these associations (Armstrong-Carter et al. 2020), directing investigators to where they should focus their attention, and in the case of disease-related phenotypes, yielding potentially important information regarding the optimal timing of interventions. For example, the demonstration of maternal prenatal effects on offspring IQ/educational attainment, suggests that if the mediating factors that were responsible could be identified, then improvements in the prenatal care of mothers and their unborn babies which target these factors, could yield useful increases in offspring IQ/educational attainment.”

      2) I would suggest the authors make a clear distinction between the limits inherent to their sample (UK biobank) and those inherent to their methodological approach. I see that the method's important usefulness is plagued by limits inherent to the sample used. At the same time, I am not aware of the availability of a big enough sample of adopted children with genotypic information available to compute PRS.

      One of the main limitations inherent to our sample (UK Biobank) is the fact that currently we cannot be certain that adopted individuals are not biologically related to their adoptive parents. As we demonstrate, this limitation could be addressed if information were gathered regarding the relationships, which at least in principle could be done relatively easily in the UK Biobank (e.g. by questionnaire, or even better, by genotyping adoptive parents where possible). The SEMs could then be adjusted to take these relationships into account. We discuss this limitation, and many others, in Supplementary File 17, and divide the table according to whether the limitation is primarily a consequence of the dataset (UK Biobank) or the method more broadly.

      We agree with the reviewer that the size of adoption studies is currently limited (e.g. Texas Adoption Project; Colorado Adoption Study etc). Nevertheless, it is likely that the number of adopted individuals available in large-scale Biobanks will increase over time, in which case models like the one espoused in this manuscript will become increasingly useful. Importantly, our method does not require adoptive families in order to partition maternal effects, merely adopted singleton individuals, and reliable information on the biological relatedness (or lack thereof) of their adoptive parents. We feel therefore that it is important that this sort of information be gathered so that the adopted individuals within these large-scale resources can be leveraged to examine interesting questions like the ones discussed in our manuscript.

      We have added these points to the Discussion:

      “We argue that of greater consequence for the validity of our model is that any genetic relationship between adoptive and biological parents is accurately modelled and included in the SEM. Through simulation, we have shown that the consequences of model misspecification depend upon which biological and adoptive parents are related, the nature of this relationship, and the proportion of adopted individuals in the sample who have had their relationship misspecified. Our simulations also showed that correctly modelling this relationship returns asymptotically unbiased effect estimates and correct type I error rates. Clearly, knowing these cryptic relationships in the UK Biobank would allow us to properly model them and better estimate prenatal and postnatal maternal genetic effects using this resource. We emphasize that accurately modelling these relationships does not require that actual genotypes for adoptive and/or biological parents be obtained (although this would be advantageous in terms of statistical power) as our SEM allows us to model these relationships in terms of latent variables. Indeed, as large-scale resources like the UK Biobank become more common, we expect that the number of adopted individuals who have GWAS will also increase, and consequently models like the one espoused in this manuscript will become increasingly useful. High quality phenotypic information on these adopted individuals and their adoptive parents including whether they share any biological relationship will be critical to making the most of these resources.”

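    1. The MIT/Brown Vannevar Bush Symposium was held at MIT on October 12-13, 1995, to celebrate the 50th anniversary of the publication of Vannevar Bush's seminal article "As We May Think" in The Atlantic Monthly in July 1945. The video of the event is in five parts; this is part five, which mainly covers:

      1. Lee Sproull's talk: "Information Is Not Enough: Computer Support for Productive Work". Abstract: Any vision of a new technology implies a vision of human beings and their behavior. In this talk, I describe the vision of human behavior associated with the most influential technological vision of personal computing, epitomized by Vannevar Bush's Memex: the vision of the solitary thinker and problem solver. I contrast this vision with an alternative view of how productive human behavior actually occurs: in interdependent social relationships. I review the current state of computer support for social actors and propose an alternative view in which information processing is subordinate to relationship management.

      2. Alan Kay's talk: "Simex: the neglected part of Bush's Vision". Abstract: Bush's vision of a hyperlinked library of 10,000 volumes in a single desk had an enormous influence on the development of personal computing, and it can be realized today (and even surpassed via the Internet). However, although Bush worked on (analog) computer simulation in the 1930s, it is likely that he could not see any simulation role for Memex, either from his own work or from the newly built ENIAC. What was missing from Bush's vision, and can it be invented today?

      3. Day 2 panel discussion.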

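    1. The MIT/Brown Vannevar Bush Symposium was held at MIT on October 12-13, 1995, to celebrate the 50th anniversary of the publication of Vannevar Bush's seminal article "As We May Think" in The Atlantic Monthly in July 1945. The video of the event is in five parts; this is part four, which mainly covers: 1. Raj Reddy: "Bush's Intelligent Systems Revisited". Abstract: In his famous paper "As We May Think", Vannevar Bush offered a vision for creating machines capable of interpreting pictures, taking dictation, understanding language, using hyperlinks, and performing associative retrieval from digital libraries. In this talk, we will review the progress made on these predictions over the past 50 years.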

  6. Local file Local file
    1. 'I don't think it's anything—I mean, I don't think it was ever put to any use. That's what I like about it. It's a little chunk of history that they've forgotten to alter. It's a message from a hundred years ago, if one knew how to read it.'

      Winston and Julia are examining a glass paperweight in George Orwell's 1984 without having context of what it is or for what it was used.

      This is the same sort of context collapse caused by distance in time and memory that archaeologists face when examining found objects.

      How does one pull out the meaning from such distant objects in an exegetical way? How can we more reliably rebuild or recreate lost contexts?

      Link to:
      - Stonehenge is a mnemonic device
      - mnemonic devices in archaeological contexts (Neolithic carved stone balls)


      Some forms of orality-based methods and practices can be viewed as a method of "reading" physical objects.


      Ideograms are an evolution on the spectrum from orality to literacy.


      It seems odd to be pulling these sorts of insight out of my prior experiences and reading while reading something so wholly "other". But isn't this just what "myths" in oral cultures actually accomplish? We link particular ideas to pieces of story, song, art, and dance so that they may be remembered. In this case Orwell's glass paperweight has now become a sort of "talking rock" for me. Certainly it isn't done in any sort of sense that Orwell would have expected, presumed, or even intended.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to the reviewers

      Manuscript number: RC-2022-01407

      Corresponding author(s): Ivana, Nikić-Spiegel

      1. General Statements

      We would like to thank the reviewers for careful reading of our manuscript and for their insightful and useful comments. We are happy to see that the reviewers find these results to be of interest and significance. The way we understand the reviewers’ reports, their main concerns can be roughly divided into the following categories: 1) providing more quantitative data, 2) interpretation of the Annexin V/PI assay, and 3) additional evidence for calpain involvement. We intend to address these experimentally or by modifying the text, as outlined below.

      2. Description of the planned revisions

      Reviewer #1

      Fig1A/B o SYTO 16 staining suggests slight reshaping of nucleus upon spermine NONOate, showing less blurry punctae. From the SYTO 16 profile, this should be quantifiable.

      By looking at the shown examples and the entire dataset, it appears to us as if neuronal nuclei are shrinking upon spermine NONOate treatment resulting in their less blurry appearance. We are not sure if this is what the reviewer is referring to, but this can also be quantified by measuring changes in neuronal nuclear size. We already have this data from the measurements shown in Fig4 and we intend to show it in the revised version of the manuscript. Line profile measurements are also possible, but the nuclear size quantification might be more suitable for this purpose.

      o There is a subset of neuron nuclei that are SYTO 16 positive. Please quantify the ratio

      We will use our existing dataset to quantify the ratio of NFL positive and SYTO16 positive nuclei.

      FigS1A o Show NeuN with Anti-NFL merged figures

      We will show merged NeuN and anti-NFL images, which might require rearrangement of the existing figures and figure panels. We will do this in the revised manuscript.

      FigS1C o Show quantification and timeline. I want to know whether there is also a plateau reached here.

      As the data shown in the FigS1C do not include NeuN staining, we will do additional experiments and perform proposed quantifications.

      FigS2A-F o Though the statements might be true, selecting one nucleus for a line profile as a statement for the whole dataset seems problematic. Average a larger number of unbiased selected nuclei profiles across multiple cultures to make a stronger statement, or a percentage of positive nuclei as in FigS1b.

      Corresponding images and line profiles are representative of the entire dataset. However, we agree with the reviewer that this is not obvious from the current manuscript version. Thus, to strengthen our findings, we intend to quantify the percentage of positive nuclei as in FigS1b. The only difference will be that instead of NeuN, we will use SYTO16 as a nuclear marker. The reason being that the existing datasets contain images of NFL and SYTO16 and not NeuN.

      FigS3 • There are no fluorescence profiles, no quantification

      As the reviewer suggests, we will quantify the ratio of NFL positive and SYTO16 positive nuclei, and include the quantifications in the revised manuscript.

      General statement: There do seem to be punctated patterns of non-nucleus accumulating NFL fragments. Can they be localized to any specific structure?

      We assume that the reviewer is referring to neuronal/axonal debris. They are present after injury but they do not colocalize with nuclear stains. We will address this in the revised manuscript.

      Fig1C-F • I find it too simplistic to categorize c+f and d+e together. There is a huge difference in the examples of nuclear localization between d and e. To not comment on their distinction (if that is consistent) is problematic. Also, since we don't see a merge with either NeuN or SYTO 16, reader quantification is difficult.

      We thank the reviewer for bringing this up. We will carefully check our entire dataset and we will update the figures and the text accordingly. We will also show the corresponding SYTO16 images, as the reviewer suggested.

      Would the microfluidic device construction allow for time to transport any axonally damaged fragments to the soma?

      Yes, the construction of the microfluidic devices allows the transport of axonal proteins back to the soma. Based on our experiments, it seems that damaged NFL from the axonal compartment could be contributing to the accumulation of NFL fragments in the nuclei. However, this contribution seems to be minimal as we cannot detect nuclear NFL upon the injury of axons alone. Alternatively, it could be that the processing of axonal NFL fragments proceeds differently if neuronal bodies are not injured and that this is the reason we don’t detect the NFL nuclear accumulation upon injury of axons alone. We will discuss this in the revised manuscript.

      Fig2C+D • The statement ".... no annexin V was detected on the cell membrane" needs to be shown more clearly

      We will modify figures to address this comment.

      • Please provide merged AnnexinV/PI images

      We will modify figures to address this comment.

      • The conclusion about 2D, that nuclear accumulated NFL overlaps with PI is not supported by the example image shown. There are plenty of PI positive spots that are not NFL positive and even several NFL positive ones that do not have a clear PI staining. Please quantify and then show a very clear result in order to be able to suggest necrosis as the underlying process.

      We are not sure if we understand the reviewer’s concern correctly. We will try to clarify it here and in the revised text. If necessary, we will tone down our conclusion, but the reason why not all of PI positive spots are NFL positive is most likely due to the fact that not all injured nuclei are NFL positive. We quantified in FigS1 that up to 60% of nuclei under injury conditions show NFL accumulations. That is why we are not surprised to see some PI positive/NFL negative nuclei. And the fact that there are some NFL positive nuclei which appear to be PI negative is most likely related to the fact that the PI binding is affected. In addition, upon closer inspection of NFL and PI panels in Fig2d it can be observed that NFL positive nuclei are also PI positive, albeit with a lower PI fluorescence intensity. We will modify the figure to show this clearly in the revised manuscript.

      FigS5 C+D • If the case is made that nitric oxide damage induces necrosis, then why is it that the AnnexinV example of Staurosporine exposure (which induces apoptosis) looks similar to that of nitric oxide damage in Fig2d and necrosis induction with Saponin looks very different?

      We thank the reviewer for bringing this up. We will try to clarify this in the revised manuscript. Regarding the specific questions, the most likely explanation for why staurosporine-treated neurons look similar to the ones treated with spermine NONOate is that in the late stages of apoptosis the cell membrane ruptures and allows PI to label nuclei. This is probably the case here, as illustrated by the nucleus in the middle of the image (FigS5c), which shows the fragmentation characteristic of apoptosis. This does not happen in early apoptotic cells due to the presence of an intact plasma membrane. On the other hand, the reason why saponin-treated cultures look different compared to spermine NONOate is that membranes are destroyed by saponin so that the PI can enter the cell. For that reason, there could not have been any AnnexinV binding to the membrane which would correspond to the AnnexinV signal of spermine NONOate-treated neurons. As we will discuss below, we did not try to mimic spermine NONOate-induced injury with saponin treatment. Instead, this was a control condition for PI labeling and imaging. We also used a rather high concentration of saponin, which probably destroyed all the membranes, which was not the case with spermine NONOate treatment. We intend to do additional control experiments to address this.

      • Additionally, does necrosis induction with Saponin also cause NFL fragment accumulation in the nucleus? Please show a co-staining of them. Also, the authors want to make a claim about reduce PI binding in NFL accumulated necrotic cells. In these examples, the intensity of the nuclear stain of PI with Saponin looks dimmer than with Staurosporine. Are the color scalings similar? It might be that the necrotic process itself causes reducing binding of PI and is not related to the presence of NFL.

      With regards to this question, it is important to note that Annexin V and PI imaging was done in living cells. To obtain the corresponding anti-NFL signal as shown in Fig 2c,d we had to fix the neurons, perform immunocytochemistry and identify the same field of view. We tried to do the same procedure after saponin treatment (Supplementary Figure 5d) but the correlative imaging was very difficult due to the detachment of neurons from the coverslip after the saponin treatment. For this reason, we could not identify the same field of view co-stained with NFL. However, other fields of view did not show NFL fragment accumulation. This could also be the consequence of the high saponin concentration that we used as we discuss above. We have also noticed the reduced intensity of PI binding in the nuclei of saponin-treated neurons. However, if the necrotic process itself reduces the binding of PI to the DNA, then all of the neurons treated with spermine NONOate would have an equally low PI signal. In our experiments, only the nuclei which contained NFL accumulations had a low PI signal, while the signal of NFL-negative nuclei was higher (as shown in Fig2d). We would also like to point out again that the saponin treatment was our control of the PI’s ability to penetrate cells and bind the DNA, as well as our imaging conditions, and not the control of the necrotic process itself. This is the reason why we didn’t go into details about neuronal morphology and NFL localization upon saponin treatment. We thank the reviewer for pointing this out since it prompted us to reevaluate what we wrote in the corresponding paragraph of the manuscript. 
We realized that the confusion might stem from our explanation of the AnnexinV/PI assay controls in the lines 196-198 (“Additional control experiments in which neurons were treated with 10 μM staurosporine (a positive control for induction of apoptosis) or with 0.1% saponin (a positive control for induction of necrosis) confirmed the efficiency of the annexin V/PI assay (Supplementary Fig. 5c,d).”). We will modify this portion of the text to clearly state that staurosporine and saponin treatments were controls of the AnnexinV and PI binding to their respective targets and not of the apoptosis/necrosis process. When it comes to the saponin treatment, our intention was only to permeabilize the membranes in order to allow PI penetration and DNA binding and not to induce necrosis or to mimic the effect of the spermine NONOate. We also intend to perform experiments with lower concentration of saponin to try to address this experimentally in addition to the text modifications.

      Fig3d • Please show similarly scaled images from controls for proper comparison

      We will show similarly scaled images of the control neurons so that they can be properly compared. They were initially not scaled the same for visualization purposes, but we will modify this in the revised manuscript.

      • How do the authors scale the degree and kinetics of induced damage between application of hydrogen peroxide/CCCP and glutamate toxicity? Does glutamate toxicity take longer to affect the cell, not allowing enough time to accumulate NFL fragments in the nucleus?

      It is challenging to scale the degree and kinetics of induced damage with different stressors. That is why we did not attempt to do this. Instead, we set different injury conditions based on the published literature. That is why we can only speculate on this point. It may be that the glutamate toxicity takes “longer” to affect the cells, even though it is very difficult to compare the stressors on a timescale, especially when considering different mechanisms of action. We will discuss this limitation in the revised manuscript.

      Fig4B • Some groups (like NO and NO + emricasan) have much larger numbers of close to 0 intensity, compared to the control group. Why?

      We were wondering the same when we analyzed the data. The fact that our nuclear fluorescence intensity analysis picked up NFL signal in control neurons which had no nuclear NFL accumulation made us realize that the intensity measured in the nuclei of control group comes entirely from the out of focus fluorescence – from neurofilaments in cell bodies, dendrites and axons (an example can be seen in the FigS6). That is why we presented the corresponding data with a cut-off value based on the control signal (as mentioned in lines 238-240). Since the oxidative injury causes NFL degradation (not only in neuronal soma, but also neuronal processes), the overall fluorescence intensity of the NFL immunocytochemical staining is reduced in injured neurons. We can see that in all of our images. Consequently, there is no contribution of out of focus fluorescent signal to the measured fluorescence intensity in the majority of nuclei. Due to that, the nuclei without NFL accumulation (at least 40% of injured nuclei) will appear to have a close to 0 intensity of the fluorescent signal. We will discuss and clarify this additionally in the revised manuscript.

      • Please add the ratio of above/below threshold (50/50 obviously in controls)

      We will update the figure in the revised manuscript.
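As an illustration only (not the authors' analysis script), the requested above/below-threshold ratio could be computed as sketched below. The sketch assumes that the cut-off is the median of the control group, which is what the reviewer's "50/50 in controls" remark suggests; the actual cut-off definition is the one given in the manuscript. The intensity values and the function name `fraction_above` are hypothetical.

```python
from statistics import median


def fraction_above(values, cutoff):
    """Return the fraction of values strictly above the cutoff."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(v > cutoff for v in values) / len(values)


# Hypothetical nuclear intensities (arbitrary units), for illustration only.
control = [10.0, 12.0, 14.0, 16.0]
injured = [0.5, 0.8, 1.0, 30.0, 45.0]

# Assumption: the cut-off is the median of the control measurements.
cutoff = median(control)

control_fraction = fraction_above(control, cutoff)
injured_fraction = fraction_above(injured, cutoff)
print(cutoff, control_fraction, injured_fraction)
```

With a median-based cut-off, half of the control nuclei fall above the threshold by construction, so only the injured groups carry information in this ratio.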

      • The description of the CTCF value calculation seems a little... muddled? Several parameters are described whereas "integrated density" is not even used. Why not simply mean intensity of nuclear ROI-mean intensity of background ROI?

      We included the integrated density in the description since it is measured together with the raw integrated density and can also be used for the CTCF value calculation. However, since we didn’t use it for the CTCF calculation, we will remove it from the corresponding section of the manuscript. We calculated the CTCF value instead of calculating mean intensity of the nuclear ROI - mean intensity of the background ROI, since the CTCF value also takes into account the area of the ROI and not just the mean intensity.
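The difference between the two measures can be made concrete with a small sketch. It assumes the commonly used CTCF definition (raw integrated density of the ROI minus ROI area times mean background intensity), with area counted in pixels; this is an assumption for illustration and not necessarily the exact formula used in the manuscript. The pixel values and function names are hypothetical.

```python
def ctcf(roi_pixels, background_pixels):
    """Corrected total cell fluorescence under the assumed definition:
    raw integrated density - (ROI area in pixels * mean background)."""
    raw_integrated_density = sum(roi_pixels)
    area = len(roi_pixels)
    background_mean = sum(background_pixels) / len(background_pixels)
    return raw_integrated_density - area * background_mean


def mean_difference(roi_pixels, background_pixels):
    """Mean ROI intensity minus mean background intensity."""
    roi_mean = sum(roi_pixels) / len(roi_pixels)
    background_mean = sum(background_pixels) / len(background_pixels)
    return roi_mean - background_mean


# Hypothetical nuclei with the same mean intensity but different areas.
small_nucleus = [10.0] * 4
large_nucleus = [10.0] * 6
background = [2.0] * 8

print(mean_difference(small_nucleus, background))  # same for both nuclei
print(mean_difference(large_nucleus, background))
print(ctcf(small_nucleus, background))  # scales with ROI area
print(ctcf(large_nucleus, background))
```

Under this definition, CTCF equals the ROI area multiplied by the background-subtracted mean intensity, so two nuclei with identical mean intensity but different sizes yield different CTCF values.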

      • Also, please tell me if the areas for nuclear ROIs change, as I noted for Fig1A/B

      We will include this information in the revised manuscript.

      • To make sure that one of the 3 experimental repeats didn't skew the results, please show the median fluorescence intensity for each individual experiment to clarify that the supposed effect is repeated across experiments.

      We have already noticed that in the earliest of the three experiments overall fluorescence intensity was higher, but this was consistent across all the experimental groups and did not skew the results or affect the overall conclusion. However, we will double-check this and revise the figure.
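A per-experiment check of this kind can be sketched as follows. This is a minimal illustration, not the authors' analysis: the record layout `(experiment, group, intensity)`, the group labels, and the numbers are all hypothetical. It groups measurements by experiment and treatment group and reports the median of each, so that a uniformly brighter experiment can be told apart from one that reverses the effect.

```python
from collections import defaultdict
from statistics import median


def medians_by_experiment(records):
    """Map (experiment, group) -> median intensity."""
    grouped = defaultdict(list)
    for experiment, group, intensity in records:
        grouped[(experiment, group)].append(intensity)
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical data: experiment 1 is brighter overall than experiment 2,
# but the direction of the effect is the same in both.
records = [
    (1, "control", 20.0), (1, "control", 24.0),
    (1, "NO", 60.0), (1, "NO", 70.0),
    (2, "control", 10.0), (2, "control", 12.0),
    (2, "NO", 30.0), (2, "NO", 36.0),
]

result = medians_by_experiment(records)
for key in sorted(result):
    print(key, result[key])
```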

      • From the text "...and due to the NFL degradation during injury...": this seems to contradict the process? Either the NFL fragment accumulates in the nucleus or it is degraded during injury. And isn't the degradation through calpain what supposedly allows this fragment of NFL to go to the nucleus in the first place? I reckon that the authors are possibly trying to reconcile why there are many close-to-0 intensity nuclei in the NO and NO + emricasan groups, but I don't feel the explanation given here fits.

      As we tried to explain in our response above, we think that the overall degradation of neurofilaments in neurons affects the fluorescence intensity originating from the out of focus neurofilaments. Therefore, the nuclei without NFL accumulation in injured conditions have a close to 0 fluorescence intensity. Additionally, we think that this is not an either/or situation, but that both degradation and nuclear accumulation of NFL happen simultaneously. We also think that degradation of axonal NFL and the transport of its tail domain to the soma will at least partially contribute to the accumulation in the nucleus. In any case, degradation and nuclear accumulation seem to be differentially regulated in individual neurons, as some of them show nuclear NFL accumulation and some not. Furthermore, calpain and other mechanisms could also cause NFL degradation up to the point at which these fragments can no longer be recognized by the anti-NFL antibody leading to the loss of signal. We will try to clarify this in the revised version of the manuscript.

      Fig5 • Does the distribution of this GFP in B match any of the various antibody stainings of different NFL fragments? Perhaps this is still a valid fragment of NFL, just not picked up by any AB?

      The GFP signal in B appears rather homogenous and it does not match any of the various antibody stainings of different NFL fragments. As the reviewer points out, this could also be a valid fragment of NFL fused to GFP that none of our antibodies is recognizing. We will clarify this in the revised manuscript.

      • "... and was indistinguishable from the full length NFL-GFP." Based on what parameters?

      We will clarify this in the revised text, but we meant in terms of overall neurofilament network and cell appearance, which is commonly used to test the effect of NFL mutations.

      • The authors claim that b is different from d, but I am not convinced. I would like to see a time dependent curve from multiple cells showing a differential change in nuclear and cytosolic GFP signal.

      As we also wrote in the manuscript, in the majority of neurons that were monitored during injury we were not able to detect an increase in the GFP fluorescence intensity in the nucleus. This is what prompted further experiments with NFL(ΔA461–D543)-FLAG. We will clarify this additionally in the revised manuscript and perform line profile intensity measurements to show the difference in nuclear and cytosolic GFP signal.

      • Secondly, the somatic GFP intensity for NFL increases for full length NFL-GFP. How is this explained, if it is only a separation of NFL and GFP? If anything, GFP should float away. And if the answer is that NFL is recruited to the nucleus, you showed that inhibition of calpain activity partially prevents that. So, if calpain activity is necessary for the transport of NFL to the nucleus, then wouldn't it also cut the GFP from NFL before it reaches the nucleus?

      We thank the reviewer for bringing this up and we apologize for the confusion. This can be explained by the fact that the images were scaled in a way that the GFP signal over time could still be seen easily (i.e. differently across different time points which we unfortunately forgot to mention in the figure legend). In the revised manuscript, we will either scale the images the same or we will alternatively show the displayed grey values in individual panels.

      Fig6 • It is recommended to overlap the transfected cells with a stain for endogenous NFL to show that despite the absence of the FLAG-tag, there is still NFL.

      We did not overlap the anti-NFL with anti-FLAG and SYTO16 staining, due to the space constraint and the intent to clearly show the overlap of FLAG and SYTO16 signals in the merged images above the graphs. However, the line profile intensity measurements were done in all three channels and show that despite the absence of FLAG, there is still NFL in the nucleus (Fig6b), or that both FLAG and NFL are present in the nucleus (Fig6d, NFL signal shown in gray). However, as this is not obvious and can easily be overlooked, we will show the endogenous NFL staining overlap in the revised version of the manuscript.

      Fig7 • "...all disrupted neurofilament assembly...": this sounds like the staining for native NFL supposedly shows a distortion due to a dominant negative effect of the expression of these constructs? Please clarify.

      Yes, we were referring to the disruption of neurofilament assembly due to a dominant negative effect of the expression of NFL domains. We will clarify this in the revised version of the manuscript.

      Discussion: • The authors show that after overexpression of the head domain only, it possibly passively diffuses into the nucleus even in the absence of oxidative injury. However, it seems to be suggested as well that the head domain would not be freely floating around if it wouldn't be for increased calpain activity as a result of oxidative injury in the first place. Therefore, a head domain fragment localized in the nucleus would still more prominently happen upon oxidative injury and interact with DNA through prior identified putative DNA interaction sites from Wang et al. Please comment.

      That is correct. Upon injury and calpain cleavage, it is conceivable that a fragment containing the NFL head domain would also be present in the cell and could potentially diffuse to the nucleus and interact with the DNA. However, by staining injured neurons with an antibody that recognizes amino acids 6-25 of the NFL head domain, we were not able to detect an NFL signal in the nucleus (FigS2a,b). It could be that either the NFL head domain does not localize in the nuclei upon injury, or that the fragment localizing in the nucleus does not contain amino acids 6-25 of the NFL head domain. As the putative DNA-binding sites described by Wang et al involve 7 amino acids located in the first 25 residues of the NFL head domain, we would expect to detect it with the aforementioned antibody. However, as that was not the case we speculated that the interaction of NFL and DNA occurs differently in living cells, as opposed to the test tube conditions utilized by Wang et al. We will comment and clarify this in the revised version of the manuscript.

      • Reviewer #2

      • Major Comments:

      • The initial data presented in the paper is good, does response of oxidative damage with proper controls, testing the antibodies to NF-L and etc. (Fig. 1-Fig. 4).

      We thank the reviewer for their positive feedback.

      1. The evidence for calpain involvement in NF-L cleavage during oxidative damage is missing. Provide the evidence for full length NF-L construct and deletion mutants transfected into cells by immunoblot for cleavage of NF-L, perform nuclear and cytoplasmic extract preparations and show that enrichment of the tagged cleaved NF-L fragment in nuclear fraction.

      We thank the reviewer for their comments and suggestions. Since we saw in our microscopy experiments that calpain inhibition reduced the accumulation of NFL in the nucleus, and since it is known that NFL is a calpain substrate (Schlaepfer et al., 1985; Kunz et al., 2004 and others), we did not perform additional experiments to confirm the involvement of calpain in NFL degradation during injury. However, to strengthen our findings, we intend to perform the suggested experiments and include the results in the revised manuscript.

      1. Show calpain activation during oxidative damage by performing alpha-Spectrin immunoblots to identify calpain specific 150-kDa Spectrin and caspase specific 120-kDa fragment generation in these cells. Also, calpain activation can be measured by MAP2 level alteration and p35 to p25 conversion. Without this evidence it's very hard to believe if the calpain activity is increased or decreased during oxidative damage and these markers are altered by using calpain inhibitors.

      To confirm the calpain activation, we intend to perform anti-alpha spectrin and/or anti-MAP2 blots in lysates of control and injured neurons and include the results in the revised manuscript.

      1. The premise that NF proteins are absent in cell bodies and present only in axons is not correct. It has been demonstrated by multiple investigators that NFs are present in the perikaryon and dendrites of many types of neurons (Dahl, 1983, Experimental Cell Research). Dr. Ron Liem's group showed NF protein expression in cell bodies of dorsal root ganglion cells (Adebola et al., 2015, Human Mol Genetics) and also showed N-terminal antibodies for NF-L, NF-M and NF-H stain rat cerebellar neuronal cell bodies and dendrites (Kaplan et al., 1991, Journal of Neuroscience Research) when NFs are less phosphorylated. (Schlaepfer et al., 1981, Brain Research) show staining of cell bodies of cortex and dorsal root ganglion cell bodies with NF antibody Ab150, and Yuan et al., 2009 in mouse cortical neurons with GFP tagged NF-L.

      We are not sure what the reviewer is referring to since we cannot find a corresponding section in which we claim that NF proteins are absent in cell bodies. We wrote the following “Anti-NFL antibody staining of neurons treated with the control compound showed the expected neurofilament morphology, that is, a strong fluorescence intensity in axons and lower intensity in cell bodies and dendrites (Fig. 1a)” in our results section (lines 119-121), but the claim we were trying to make there was that NF proteins are particularly abundant in axons. We will clarify this in the revised manuscript.

      1. Quantifying NF-L signal or tagged NF-L fragment signals in the cell body by ICC has many problems and making conclusions. It's extremely difficult to have control over levels of proteins in transfected overexpression models and comparing two or three different constructs with each other by ICC. Not every cell expresses same levels of protein in transfected cells and quantifying it by ICC again has a major problem. This can be addressed if there are stable lines that express equal levels of protein in all cells that comparisons can be made. Under these circumstances validation of the hypothesis presented in the study has no strong direct evidence to demonstrate that calpain is activated and NF-L fragment translocate to the nucleus.

      We agree that the results from overexpression-based experiments should be interpreted with caution as levels of expression vary between the cells. We intend to discuss this in the revised manuscript. However, we find it difficult to experimentally address this comment since we are not sure which specific experiments the reviewer is referring to. With regards to this, we would like to emphasize that most of the initial experiments in which we observed NFL accumulation in the nuclei of injured neurons were based on the ICC labeling of endogenous NFL and didn’t involve its overexpression. This includes labeling of endogenous NFL in various types of neurons, comparing the effects of different types of oxidative injury, as well as testing the effects of calpain inhibition on the observed nuclear accumulation (Figures 1-4; Supplementary Figures 1-6). We later resorted to the overexpression experiments in primary neurons (Figures 5-7; Supplementary Figure 7, 10) to gain more information about the identity of NFL fragment which was detected in the nucleus. Due to the low transfection efficiency of primary neurons, we performed an additional set of overexpression experiments in neuroblastoma ND7/23 cells (Figure 8; Supplementary Figures 8,9) and obtained similar results in a higher number of cells. We agree that having stable cell lines which e.g. express same levels of NFL domains would be a more elegant approach and we intend to make them for our follow-up studies, however the generation of said stable cell lines might be beyond the scope of this revision. Furthermore, looking at our data with overexpression of NFL domains in ND7/23 cells (Supplementary Figure 8,9), it appears to us as if different domains are rather homogenously expressed in different cells. While the expression levels might vary, it seems that they all show the same trend when it comes to their localization (which was the main point of those experiments).

      1. The interpretation that NF-L preventing DNA labeling cells is misinterpretation. NFs have very long half-life compared to other proteins. Due to oxidative damage, DNA is degraded in the cells but NFs that have very long half-life you see as NFs rings in the dead cells. So, NFs do not prevent DNA labeling, but DNA or chromatin is degraded in dead cells.

      We thank the reviewer for their useful insight. DNA degradation could certainly be the reason why we observe a lower fluorescence intensity of the propidium iodide fluorescence in the nuclei of injured neurons. We intend to discuss this in the revised manuscript. However, if the DNA degradation is the only reason for the lower PI fluorescence intensity, then the PI fluorescence intensity would be the same in all injured nuclei. In our experiments, we saw the reduced PI fluorescence intensity in nuclei that contained NFL accumulations and not in other nuclei. Additionally, we observed a reduction of SYTO16 fluorescent labeling of nuclei which contained accumulations of the NFL tail domain, even in the absence of oxidative injury. Due to these reasons we speculated that NFL accumulation in the nucleus might hinder nuclear dyes from interacting with the DNA. But this is only a speculation and we will try to clarify this further in the revised manuscript including alternative explanations.

      Minor comments: 1. In the introduction on page 4 reference is missing for NF transport, aggregation and perikaryal accumulation (on line 93).

      We will add a reference to the revised manuscript.

      1. The statement in discussion on page 14 line 454 for Zhu et al., 1997 study is not accurate. It should be modified to sciatic nerve crush not spinal cord injury.

      We will correct this mistake in the revised manuscript.

      1. What is the size of the calpain cleaved NF-L tail domain? If you perform immunoblots on cell extracts treated with oxidative agents one would know it.

      We will perform immunoblots on cell lysates and incorporate the corresponding results in the revised manuscript.

      1. Authors could make their conclusions clear. This is particularly true for the experiments in Figure 4 panels c and d. It is very difficult to understand the conclusions of the experiments. First state the expectation and then described whether the expectation is true or different.

      We will do as the reviewer suggested in the revised manuscript.

      1. The ICC images are at extremely low magnification. They should be shown at 100x or 120x so that details of the cell body and the nucleus can be seen.

      Our intention was to show larger fields of view and wherever appropriate insets, but we will try to improve this in the revised manuscript by either zooming in, cropping or adding additional insets with individual cell bodies and nuclei. In general, images were taken with an optimal resolution/pixel size in mind for any of the used objectives (60x/1.4 NA or 100x/1.49 NA) and we can easily modify our figure panels to show more details.

      1. Oxidative damage leads to beaded accumulation of NF-L in neurites and axons. Authors should address this issue.

      We will discuss this in the revised manuscript.

      1. The combination treatment of the inhibitors (last 3 sets of the Fig. 4 b) has no statistical significance and should be removed.

      Actually, these differences were statistically significant (Supplementary Table 1). For clarity and as described in the figure legend (line 516: “The most relevant significant differences are indicated with an asterisk”) we showed only a subset of them on the graph, but we will change this in the revised manuscript.

      1. Why do only two antibodies recognize cleaved NF-L? If the antibodies are directed at the tail region, they should recognize it unless the phosphorylated tail at Ser473 inhibits the antibody binding. In that case NF-L Ser473 specific antibody (EMD Millipore: MABN2431) may be used to test this idea.

      This is a very good point that we also wonder about. Even if all antibodies are directed at tail region, exact epitopes are not described for all of them. That makes it also difficult for us to understand and speculate on this. However, we have already ordered the new antibody as suggested by the reviewer and we will experimentally test it.

      **Referees cross-commenting**

      I agree with the reviewer#1 about presenting the quantification data for the indicated figures to make conclusions strong and see how much of variation is there among sampled cells.

      As discussed in our response to reviewer #1, we will provide additional quantifications.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      4. Description of analyses that authors prefer not to carry out

      Reviewer #2, major comment 7. Authors could do chromatin immunoprecipitation (ChIP) analysis to identify NF-L binding sites on chromatin and perform gel shift assays to show NF-L tail domain binding to specific consensus DNA sequences.

      We thank the reviewer for their suggestion. We are very interested in performing additional experiments and identifying the NFL binding sites on the DNA (either by chromatin immunoprecipitation or DamID-seq) and we intend to perform these experiments as soon as possible. Unfortunately, at the moment we do not have the expertise to perform such experiments in our lab. Instead, this type of follow-up project requires establishing a collaboration which is beyond the scope of this revision.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper examines EEG responses time-locked to (or "entrained" by) musical features and how these depend on tempo and feature identity. Results revealed stronger entrainment to "spectral flux" than to other, more commonly tested features such as amplitude envelope. Entrainment was also strongest for lowest rates tested (1-2 Hz).

      The paper is well written, its structure is easy to follow and the research topic is explained in a way that makes it accessible to readers outside of the field. Results will advance the scientific field and give us further insights into neural processes underlying auditory and music perception. Nevertheless, there are a few points that I believe need to be clarified or discussed to rule out alternative explanations or to better understand the acquired data.

      We thank the Reviewer for taking the time to evaluate our manuscript and for the positive response. We have now conducted further analyses to strengthen our conclusion that neural synchronization was strongest at slower musical tempi and to rule out an alternative explanation that neural synchronization was strongest for music presented near its own original or “natural” tempo. We also added some points to the Discussion in response to your comments; revised text is reproduced as part of our point-by-point responses below for your convenience. The page and line numbers correspond to the manuscript file without track changes.

      1) Results reveal spectral flux as the musical feature producing strongest entrainment. However, entrainment can only be compared across features in an unbiased way if these features are all equally present in the stimulus. I wonder whether entrainment to spectral flux is only most pronounced because the latter is the most prominent feature in music. Can the authors rule out such an explanation?

      Respectfully, it is not fully clear to us based on the literature that entrainment can only be compared across features fairly when those features are equally present in the stimulus. Previous work in the speech domain has compared entrainment to amplitude envelope vs. spectrogram, vs. a symbolic representation of the time of occurrence of different phonemes (Di Liberto et al., 2015). Work in the music domain has compared entrainment to amplitude envelope (and its derivative) vs. features quantifying melodic expectation (surprise and entropy, quantified using a hidden Markov model trained on a corpus of Western music; Di Liberto et al., 2020). In these papers, there was no quantification of the degree to which each feature was present in the stimulus material, and when comparing such qualitatively different features, it is not clear to us how one would do so. Nonetheless, these studies used the resulting TRF-based dependent measures to evaluate which feature best predicted the neural response. Here, although we do not know what acoustic feature might be most present / strongest in music, we believe that we can investigate the degree to which each feature predicts the neural response. In fact, we might argue the reverse of the logic in your comment – that the TRF results actually tell us which feature is perceptually or psychologically the most important in terms of driving brain responses, which may not be fully predictable from the acoustics of those features.

      From a data analysis perspective, we have independently normalized (z-scored) each feature as well as the neural data, as prescribed in Crosse et al., 2021, to try to level the playing field for the musical features we are comparing. Moreover, we made changes in the discussion to acknowledge your concern. The text is reproduced here for your convenience.

      p. 26, l. 489-497: “One hurdle to performing any analysis of the coupling between neural activity and a stimulus time course is knowing ahead of time the feature or set of features that will well characterize the stimulus on a particular time scale given the nature of the research question. Indeed, there is no necessity that the feature that best drives neural synchronization will be the most obvious or prominent stimulus feature. Here, we treated feature comparison as an empirical question (Di Liberto et al., 2015), and found that spectral flux is a better predictor of neural activity than the amplitude envelope of music. Beyond this comparison though, the issue of feature selection also has important implications for comparisons of neural synchronization across, for example, different modalities.”

      2) Spectral analyses of neural data often yield the strongest power at lowest frequencies. Measures of entrainment can be biased by the amount of power present, where entrainment increases with power. Can the authors rule out that the advantage for lower frequencies is a reflection of such an effect?

      Thank you for this insightful comment. In response to your comment and the comments of Reviewer 3, we normalized the TRF correlations, stimulus–response correlations, and stimulus–response coherences by surrogate distributions that were calculated separately for each musical feature and – importantly – for every tempo condition. Following Zuk et al., 2021, we formed surrogate distributions by shifting the relevant neural data time course relative to the stimulus-feature time courses by a random amount. We did this 50 times, and for each shift re-calculated all dependent measures. We then normalized our dependent measures calculated from the intact time series relative to these surrogate distributions by subtracting the mean and dividing by the standard deviation of the surrogate distribution (“z-scoring”). Since the approach of shifting the neural data leaves the neural time series intact, the power spectrum of the data is preserved, but only its relationship to the stimulus is destroyed. After normalization, the plots obviously look a little different, but the main results – a higher level of neural synchronization to slower stimulation tempi and in response to the spectral flux – remain.

      The changes can be found throughout the manuscript, but especially on p. 11, l. 210-218, Figures 2-3 and a more detailed explanation in the Methods section.

      p. 39, l. 821-829: “In order to control for any frequency-specific differences in the overall power of the neural data that could have led to artificially inflated observed neural synchronization at lower frequencies, the SRCorr and SRCoh values were z-scored based on a surrogate distribution (Zuk et al., 2021). Each surrogate distribution was generated by shifting the neural time course by a random amount relative to the musical feature time courses, keeping the time courses of the neural data and musical features intact. For each of 50 iterations, a surrogate distribution was created for each stimulation subgroup and tempo condition. The z-scoring was calculated by subtracting the mean and dividing by the standard deviation of the surrogate distribution.”
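      As an illustration only (this is not the analysis code used for the manuscript), the surrogate normalization described above can be sketched in Python with NumPy. The function name `surrogate_zscore`, the use of a plain Pearson correlation as the dependent measure, and the toy signals are assumptions made for this sketch; in the actual analysis the same logic would be applied to the TRF correlations, SRCorr and SRCoh, separately per musical feature and tempo condition.

```python
import numpy as np

def surrogate_zscore(stimulus, neural, n_surrogates=50, seed=0):
    """Z-score a stimulus-response correlation against circular-shift surrogates.

    Hypothetical helper: a Pearson correlation stands in for the
    dependent measures (TRF correlation, SRCorr, SRCoh).
    """
    rng = np.random.default_rng(seed)
    n = len(neural)
    # Observed measure on the intact, aligned time series.
    observed = np.corrcoef(stimulus, neural)[0, 1]
    surrogate = np.empty(n_surrogates)
    for i in range(n_surrogates):
        # A circular shift keeps every neural sample (and therefore the
        # power spectrum) intact but breaks the alignment with the stimulus.
        shift = rng.integers(1, n)
        surrogate[i] = np.corrcoef(stimulus, np.roll(neural, shift))[0, 1]
    # Subtract the surrogate mean and divide by the surrogate standard deviation.
    return (observed - surrogate.mean()) / surrogate.std()

# Toy example: a noisy "neural" signal that follows the stimulus feature.
rng = np.random.default_rng(1)
stimulus = rng.standard_normal(2000)
neural = stimulus + rng.standard_normal(2000)
z = surrogate_zscore(stimulus, neural)
```

      Because the shifted copy has exactly the same amplitude spectrum as the original, any bias that comes purely from the amount of low-frequency power appears in the surrogate distribution as well and is removed by the z-scoring.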

      A related point, what was the dominant rate of spectral flux in the original set of stimuli, before tempo was manipulated? Could it be that the slow tempo was preferred because in this case participants listened to a most "natural" stimulus?

      This is a good point, thank you. We did two things to attempt to address this (see also comment Reviewer 3). First, the original tempo for each song can be found in Supplementary Table 1. To make the table more readable and more comparable with the main manuscript, we have updated the table and now state the original tempi in BPM and Hz. Second, we added histograms of the original tempi across all songs as well as the maximum amount by which all songs were tempo-shifted (i.e., the maximum tempo difference between the slowest (or fastest) version of each song segment compared to the original tempo). These histograms have been added to Figure 1 – figure supplement 2, and are paraphrased here for your convenience (p. 13 l. 265-273): The original tempo of the set of musical stimuli ranges between 1-2.75 Hz. This indeed overlaps with the tempo range that revealed strongest neural synchronization. When songs were tempo-shifted to be played at a slower tempo than the original, they were shifted by ~0.25-1.25 Hz. In contrast, shifting a song to have a faster tempo typically involved a larger shift of ~1-2.25 Hz. Thus, it is definitely possible that tempo, degree of tempo shift, and proximity to “natural” tempo were not completely independent values.

      For that reason, to investigate the effects of the amount of tempo manipulation on neural synchronization, we conducted an additional analysis. We compared TRF correlations for a) songs that were shifted very little relative to their original tempi to b) songs that were shifted a lot relative to their original tempi. We did not have enough song stimuli to do this for every stimulation tempo, but we were able to do the TRF correlation comparison for two illustrative stimulation tempo conditions (at 2.25 Hz and 1.5 Hz). In those tempo conditions, we took the TRF correlations for up to three trials per participant when the original tempo was around the manipulated tempo (1.25-1.6 Hz for 1.5 Hz or 2.01-2.35 Hz for 2.25 Hz) and compared them to those trials where the original tempo was around 0.75–1 Hz faster or slower than the manipulated tempo at which the participants heard the songs (Figure 3 – figure supplement 2). This analysis revealed that there was no significant effect of the original music tempi on the neural response (please see Materials and Methods, p. 40, l. 855-861 and Results p. 13, l. 265-273). In response to your and Reviewer 3’s comments, we also added this additional point to the discussion.

      p. 23-24 l. 427-436: “The tempo range within which we observed strongest synchronization partially coincides with the original tempo range of the music stimuli (Figure 1 – figure supplement 2). A control analysis revealed that the amount of tempo manipulation (difference between original music tempo and tempo the music segment was presented to the participant) did not affect TRF correlations. Thus, we interpret our data as reflecting a neural preference for specific musical tempi rather than an effect of naturalness or the amount that we had to tempo shift the stimuli. However, since our experiment was not designed to answer this question, we were only able to conduct this analysis for two tempi, 2.25 Hz and 1.5 Hz (Figure 3 – figure supplement 3), and thus are not able to rule out the influence of the magnitude of tempo manipulation on other tempo conditions.”

      3) The authors have a clear hypothesis about the frequency of the entrained EEG response: The one that corresponds to the musical tempo (or harmonics). It seemed to me that analyses do not sufficiently take that hypothesis into account and often include all possible frequencies. Restricting the analysis pipeline to frequencies that are expected to be involved might reduce the number of comparisons needed and therefore increase statistical power.

      Although we manipulated tempo, and so had an a priori hypothesis about the frequency at which the beat would be felt, natural music is a complex stimulus composed of different instruments playing different lines at different time scales, many or most of which are nonisochronous. Thus, we analyzed the data in two different ways – 1) based on TRFs and 2) based on stimulus–response correlation and coherence. Stimulus–response coherence is a frequency-domain measure, and so it was possible to do exactly as you suggest here and consider coherence only at the stimulation tempo and first harmonic, which we did (Figure 2E-J). However, for the TRF analyses, we followed previous literature (e.g., Ding et al., 2014; Di Liberto et al., 2020; Teng et al., 2021), and considered broader-band EEG activity (bandpass filtered at 0.5-30 Hz). Previous work has shown that the beat in music evokes a neural response at harmonics up to at least 4 times the beat rate (Kaneshiro et al., 2020), so we wanted to leave a broad frequency range intact in the neural data. Despite being based on differently filtered data, we found that the dependent measures from the two analysis approaches were correlated, which suggests to us that neural tracking at the stimulation tempo itself was probably the largest contributor to the results we observed here.

      Related to your comment, we added two points to our discussion, which we reproduce here for your convenience.

      p. 24-25, l. 453-461: “Regardless of the reason, since frequency-domain analyses separate the neural response into individual frequency-specific peaks, it is easy to interpret neural synchronization (SRCoh) or stimulus spectral amplitude at the beat rate and the note rate – or at the beat rate and its harmonics – as independent (Keitel et al., 2021). However, music is characterized by a nested, hierarchical rhythmic structure, and it is unlikely that neural synchronization at different metrical levels goes on independently and in parallel. One potential advantage of TRF-based analyses is that they operate on relatively wide-band data compared to Fourier-based approaches, and as such are more likely to preserve nested neural activity and perhaps less likely to lead to over- or misinterpretation of frequency-specific effects.”

      p. 29 l. 564-577: “Despite their differences, we found strong correspondence between the dependent variables from the two types of analyses. Specifically, TRF correlations were strongly correlated with stimulation-tempo SRCoh, and this correlation was higher than for SRCoh at the first harmonic of the stimulation tempo for the amplitude envelope, derivative and beat onsets (Figure 4 - figure supplement 1). Thus, despite being computed on a relatively broad range of frequencies, the TRF seems to be correlated with frequency-specific measures at the stimulation tempo. The strong correspondence between the two analysis approaches has implications for how users interpret their results. Although certainly not universally true, we have noticed a tendency for TRF users to interpret their results in terms of a convolution of an impulse response with a stimulus, whereas users of stimulus–response correlation or coherence tend to speak of entrainment of ongoing neural oscillations. The current results demonstrate that the two approaches produce similar results, even though the logic behind the techniques differs. Thus, whatever the underlying neural mechanism, using one or the other does not necessarily allow us privileged access to a specific mechanism.”

      Reviewer #2 (Public Review):

      Kristin Weineck and coauthors investigated the neural entrainment to different features of music, specifically the amplitude envelope, its derivative, the beats and the spectral flux (which describes how fast spectral changes are) and its dependence on the tempo of the music and self-reports of enjoyment, familiarity and ease of beat perception.

      They use and compare analysis approaches typically used when working with naturalistic stimuli: temporal response functions (TRFs) or reliable components analysis (RCA) to correlate the stimulus with its neural response (in this case, the EEG). The spectral flux seems the best music descriptor among the tested ones with both analyses. They find a stronger neural response to stimuli with slower beat rates and predictable stimuli, namely familiar music with an easy-to-perceive beat. Interestingly, the analysis does not show a statistically significant difference between musicians and non-musicians.

      The authors provide an extensive analysis of the data, but some aspects need to be clarified and extended.

      We thank the Reviewer for taking the time to evaluate and summarize our manuscript and for the great comments. We addressed the concerns and made changes throughout the manuscript, but especially in the introduction and discussion sections about the terminology (neural entrainment and neural measures), musical features of the stimuli, and musical experience of the participants. Below you can find the alterations described in more detail. The page and line numbers correspond to the manuscript file without track changes.

      1) It would be helpful to clarify better the concepts of neural entrainment, synchronization and neural tracking and their meaning in this specific context. Those terms are often used interchangeably, and it can be hard for the reader to follow the rest of the paper if they are not explicitly defined and related to each other in the introduction. Note that this is fundamental to understanding the primary goal of the paper. The authors clarify this point only at the end of the discussion (lines 570-576). I suggest moving this part to the introduction. Still, it is unclear why the authors use the TRF model and then say they want to be agnostic about the physiological mechanisms underlying entrainment. The choice of the TRF (as well as the stimulus representation) automatically implies a hypothesis about a physiological mechanism, i.e., the EEG reflects convolution of the stimulus properties with an impulse response. Please could you clarify this point? I might have missed it.

      Thank you for this valuable comment. We agree that it is fundamental to define and uniformly use terminology, and have made changes throughout the manuscript along these lines. First of all, we have changed all instances of “neural entrainment” or “neural tracking” to “neural synchronization”, as we think this term avoids evoking a specific theoretical background or strong mechanistic assumptions. Second, we have moved the Discussion paragraph you mention to the Introduction and expanded it. Specifically, we take the opportunity to address the association between specific analysis approaches (TRFs vs. stimulus–response correlation or coherence) and specific mechanistic assumptions (convolution of stimulus properties with an impulse response vs. entrainment of an ongoing oscillation, respectively). This allowed us to clarify what we mean when we say we prefer to stay agnostic to specific mechanistic interpretations. We are happy to have had the chance to strengthen this discussion, and think it benefits the manuscript a lot.

      We reproduce the new Introduction paragraph here for your convenience.

      p. 5-6, l. 101-123: “The current study investigated neural synchronization to natural music by using two different analysis approaches: Reliable Components Analysis (RCA) (Kaneshiro et al., 2020) and temporal response functions (TRFs) (Di Liberto et al., 2020). A theoretically important distinction here is whether neural synchronization observed using these techniques reflects phase-locked, unidirectional coupling between a stimulus rhythm and activity generated by a neural oscillator (Lakatos et al., 2019) versus the convolution of a stimulus with the neural activity evoked by that stimulus (Zuk et al., 2021). TRF analyses involve modeling neural activity as a linear convolution between a stimulus and relatively broad-band neural activity (e.g., 1–15 Hz or 1–30 Hz; Crosse et al., 2016, Crosse et al., 2021); as such, there is a natural tendency for papers applying TRFs to interpret neural synchronization through the lens of convolution (though there are plenty of exceptions to this e.g., (Crosse et al., 2015, Di Liberto et al., 2015)). RCA-based analyses usually calculate correlation or coherence between a stimulus and relatively narrow-band activity, and in turn interpret neural synchronization as reflecting entrainment of a narrow-band neural oscillation to a stimulus rhythm (Doelling and Poeppel, 2015, Assaneo et al., 2019). Ultimately, understanding under what circumstances and using what techniques the neural synchronization we observe arises from either of these physiological mechanisms is an important scientific question (Doelling et al., 2019, Doelling and Assaneo, 2021, van Bree et al., 2022). However, doing so is not within the scope of the present study, and we prefer to remain agnostic to the potential generator of synchronized neural activity. Here, we refer to and discuss “entrainment in the broad sense” (Obleser and Kayser, 2019) without making assumptions about how neural synchronization arises, and we will moreover show that these two classes of analysis techniques strongly agree with each other.”

      2) Interestingly, the neural response to music seems stronger for familiar music. Can the authors clarify how this is not in contrast with previous works that show that violated expectations evoke stronger neural responses ([Di Liberto et al., 2020] using TRFs and [Kaneshiro et al., 2020] using RCA)? [Di Liberto et al., 2020] showed that the neural response of musicians is stronger than that of non-musicians as they have a stronger expectation (see point 2). However, in the present manuscript, the analysis does not show a statistically significant difference between musicians and non-musicians. The authors state that they had different degrees of musical training in their dataset, and therefore it is hard to see a clear difference. Still, in the "Materials and Methods" section, they divided the participants into these two groups, confusing the reader.

      Our findings are consistent with previous studies showing stronger inter-subject correlation in response to music in a familiar style vs. music in an unfamiliar style (Madsen et al., 2019) and stronger phase coherence in response to familiar relative to unfamiliar sung utterances (Vanden Bosch der Nederlanden et al., 2022). We actually don’t think our results (stronger neural synchronization for familiar music) or these previous results are incompatible with work showing that violations of expectations evoke stronger neural responses. This work either manipulated music so it violated expectations (Kaneshiro et al., 2020) or explicitly modeled “surprisal” as a feature (Di Liberto et al., 2020). Thus, we could think of those stronger neural responses to expectancy violations as reflecting something like “prediction error”. Our music stimuli did not contain any violations, and we were unable to model responses to surprisal given the nature of our music stimuli, as we better explain below (p. 27 l. 514-529). Thus, neural synchronization was stronger to familiar music, and we would argue that listeners were able to form stronger expectations about music they already knew. We would predict that expectancy violations in familiar music would evoke stronger neural responses than those in unfamiliar music, though we did not test that here. We now include a paragraph in the Discussion reconciling our findings with the papers you have cited.

      p. 27 l. 514-529: “We found that the strength of neural synchronization depended on the familiarity of music and the ease with which a beat could be perceived (Figure 5). This is in line with previous studies showing stronger neural synchronization to familiar music (Madsen et al., 2019) and familiar sung utterances (Vanden Bosch der Nederlanden et al., 2022). Moreover, stronger synchronization for musicians than for nonmusicians has been interpreted as reflecting musicians’ stronger expectations about musical structure. On the surface, these findings might appear to contradict work showing stronger responses to music that violated expectations in some way (Kaneshiro et al., 2020, Di Liberto et al., 2020). However, we believe these findings are compatible: familiar music would give rise to stronger expectations and stronger neural synchronization, and stronger expectations would give rise to stronger “prediction error” when violated. In the current study, the musical stimuli never contained violations of any expectations, and so we observed stronger neural synchronization to familiar compared to unfamiliar music. There was also higher neural synchronization to music with subjectively “easy-to-tap-to” beats. Overall, we interpret our results as indicating that stronger neural synchronization is evoked in response to music that is more predictable: familiar music and with easy-to-track beat structure.”

      Your other question was why we did not see effects of musical training / sophistication on neural synchronization to music, when other studies have. There are a few possible reasons for this. One is that previous studies aiming to explicitly test the effects of musical training recruited either professional musicians or individuals with a high degree of musical training for their “musician” sample. In contrast, we did not target individuals with any degree of musical training, but attempted this analysis in a post-hoc way. For this reason, our musicians and nonmusicians were not as different from each other in terms of musical training as in previous work. Given this, we have opted to remove the artificial split into musician and nonmusician groups, and now only include a correlation with musical sophistication (as you suggest in your next comment), which was also nonsignificant (Figure 5 – figure supplement 2).

      3) Musical expertise was also assessed using the Goldsmiths Musical Sophistication Index, which could be an alternative to the two-group comparison between musicians and non-musicians. Does this mean that in Figure 5, we should see a regression line (the higher the Gold-MSI, the higher should be the TRF correlation)? Since we do not see any significant effect, might this be due to the choice of the audio descriptor? The spectral flux is not a high-level descriptor; maybe it is worth testing some high-level descriptors such as entropy and surprise. The choice of the stimulus features defines linear models such as the TRF as they determine the hierarchical level of auditory processing, and for testing the musical expertise, we might need more than acoustic features. The authors should elaborate more on this point.

      It is true that the Goldsmiths Musical Sophistication Index serves as an alternative way of investigating the effects of musical expertise on neural synchronization to natural music, and we now include this approach exclusively instead of dividing our sample (see response to the previous comment). Indeed, if musical sophistication had an effect on the TRF correlations in this study, we would see a regression line in Figure 5 – figure supplement 2. Based on our experiment it is difficult to assess whether the lack of a correlation between neural measures and musical expertise is due to our choice of stimulus features. That is because our experiment was designed to investigate the effects of fundamental acoustic features of music, and it was not possible to calculate high-level descriptors, such as the entropy or surprisal, for the music stimuli we chose to work with – the stimuli were polyphonic, and moreover were purchased in a .wav format, so we do not have access to the individual MIDI versions or sheet music of each song that would have been necessary to apply, for example, the IDyOM (Information Dynamics of Music) model. As we cannot rule out that the (lack of) effects of varying levels of musical expertise on TRF correlations is due to our choice of stimulus features, we added this to the discussion.

      p. 28 l. 541-546: “Another potential reason for the lack of difference between musicians and non-musicians in the current study could originate from the choice of utilizing pure acoustic audio-descriptors as opposed to “higher order” musical features. However, “higher order” features such as surprise or entropy that have been shown to be influenced by musical expertise (Di Liberto et al., 2020), are difficult to compute for natural, polyphonic music.”

      4) Regarding the stimulus representation, I have a few points. The authors say that the amplitude envelope is a too limited representation for music stimuli. However, before testing the spectral flux, why not test the spectrogram as in previous studies? Moreover, the authors tested the TRF on combining all features, but it was not clear how they combined the features.

      One of the main reasons that we did not use the spectrogram as a feature was that it wouldn’t be possible to use a two-dimensional representation for the RCA-based measures, SRCorr and SRCoh, so we would not have been able to compare across analysis approaches. However, spectral flux is calculated directly from the spectrogram, and so is a useful one-dimensional measure that captures the spectro-temporal fluctuations present in the spectrogram (https://musicinformationretrieval.com/novelty_functions.html). Thank you for making this important point, we added this explanation to the Materials and Methods section (p. 35 l. 726-727).
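      To make the relationship between the spectrogram and spectral flux concrete, here is a minimal Python sketch of one common definition: the half-wave-rectified frame-to-frame change in the magnitude spectrum, summed over frequency. The function name `spectral_flux`, the frame length, the hop size and the rectification step are assumptions for illustration and may differ from the exact settings used for the manuscript.

```python
import numpy as np

def spectral_flux(signal, frame_len=1024, hop=512):
    """One-dimensional spectral flux from a short-time magnitude spectrum.

    Hypothetical helper: frame_len and hop are illustrative defaults.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (frames, frequency bins)
    # Change between consecutive frames; keep only increases in energy,
    # then collapse the frequency axis into a single value per frame step.
    diff = np.diff(magnitude, axis=0)
    return np.maximum(diff, 0.0).sum(axis=1)

# Toy example: constant-amplitude tone that jumps from 440 Hz to 880 Hz.
fs = 8000
t = np.arange(fs) / fs
tone = np.where(t < 0.5, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t))
flux = spectral_flux(tone)
```

      In this toy signal the amplitude envelope is flat throughout, yet the flux is largest around the frequency change, which is the kind of spectro-temporal event that an envelope-only representation would miss.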

      We apologize for not explaining the multivariate TRF approach more clearly. Instead of using only one stimulus feature, e.g., the amplitude envelope, several stimulus features can be concatenated into a matrix (with the dimensions time T × 4 musical features M at different time lags), which is then used as an input for the mTRFcrossval, mTRFtrain and mTRFpredict functions of the mTRF Matlab Toolbox (Crosse et al., 2016) – this is exactly how using a 2D feature like the spectrogram would work. The multivariate TRF is calculated by extending the stimulus lag matrix (time course of one musical feature at different time lags, T × τwindow) by an additional dimension (time courses of several musical features at different time lags, T × M × τwindow). We added an explanation to the Methods section of the manuscript and hope that the approach is now easier to understand:

      p. 39 l. 840-842: “For the multivariate TRF approach, the stimulus features were combined by replacing the single time-lag vector by several time-lag vectors for every musical feature (Time x 4 musical features at different time lags).”
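      The construction of the multivariate lag matrix can be sketched in Python as follows. This is not the mTRF-Toolbox implementation; `lag_matrix` and `fit_trf` are hypothetical helpers, lags are given in samples, and a plain ridge regression stands in for the toolbox's regularized fit. The T × M × τwindow array is flattened here into a two-dimensional design matrix with M × τwindow columns.

```python
import numpy as np

def lag_matrix(features, lags):
    """Stack time-lagged copies of each feature: (T, M) -> (T, M * len(lags)).

    Columns are grouped by lag, and within each lag ordered by feature.
    """
    T, M = features.shape
    X = np.zeros((T, M * len(lags)))
    for j, lag in enumerate(lags):
        shifted = np.zeros((T, M))
        if lag > 0:
            shifted[lag:] = features[:-lag]   # stimulus precedes the response
        elif lag < 0:
            shifted[:lag] = features[-lag:]   # stimulus follows the response
        else:
            shifted[:] = features
        X[:, j * M:(j + 1) * M] = shifted
    return X

def fit_trf(features, response, lags, ridge=1.0):
    """Ridge-regression forward model from lagged features to one EEG channel."""
    X = lag_matrix(features, lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ response)

# Toy example: T = 500 samples, M = 4 features, 5 lags, noiseless response.
rng = np.random.default_rng(0)
features = rng.standard_normal((500, 4))
lags = range(0, 5)
true_w = rng.standard_normal(4 * 5)
response = lag_matrix(features, lags) @ true_w
w_hat = fit_trf(features, response, lags, ridge=1e-6)
```

      With a single feature (M = 1) the same code reduces to the univariate case, so the multivariate model differs only in the number of columns of the design matrix.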

      Reviewer #3 (Public Review):

      Subjects listened to various excerpts from music recordings that were designed to cover musical tempi ranging from 1-4 Hz, and EEG was recorded as subjects listened to these excerpts. The main and novel findings of the study were: 1) spectral flux, measuring sudden changes in frequency, was tracked better in the EEG than other measures of fluctuations in amplitude, 2) neural tracking seemed to be best for the slowest tempi, 3) measures of neural tracking were higher when subjects rated an excerpt as high for ease-of-tapping and familiarity, and 4) their measure of the mapping between stimulus feature and response could predict whether a subject tapped at the expected tempo or at 2x the expected tempo after listening to the musical excerpt.

      One of the key strengths of this study is the use of novel methodologies. The authors in this study used natural and digitally manipulated music covering a wide range of tempi, which is unique among studies of musical beat tracking. They also included measures of both stimulus-response correlation and phase coherence along with a method of linear modeling (the temporal response function, or TRF) in order to quantify the strength of tracking, showing that they produce correlated results. Lastly, and perhaps most importantly, they also had subjects tap along with the music after listening to the full excerpt. While having a measure of tapping rate itself is not new, combined with their other measures they were able to demonstrate that neural data predicted the hierarchical level of tapping rate, opening up opportunities to study the relationship between neural tracking, musical features, and a subject's inferred metrical level of the musical beat.

      Additionally, the finding that spectral flux produced the best correlations with the EEG data is an important one. Many studies have focused primarily on the envelope (amplitude fluctuations) when quantifying neural tracking of continuous sounds, but this study shows that, for music at least, spectral flux may add information that is tracked by the EEG. However, given that it is also highly correlated with the envelope, what additional features spectral flux contributes to measuring EEG tracking is not clear from the current results and worth further study.

      All four of their main findings are important for research into the neural coding of musical rhythm. I have some concerns, however, that two of these findings could be a consequence of the methods used, and one could be explained by related correlations to acoustic features:

      We thank the Reviewer for the very helpful review, the summary, and the great suggestions. We addressed the comments and performed additional analysis. We made changes throughout the manuscript, but especially 1) concerning the potential advantage of the neural response to slower music, 2) the effects of the amount of tempo manipulation on neural synchronization, 3) the SVM-related analysis and 4) the relation between stimulus features and behavioral ratings. The implemented modifications can be found below in more detail. The page and line numbers correspond to the manuscript file without track changes.

      The authors found that their measures of neural tracking were highest for the lowest musical tempos. This is interesting, but it is also possible that this is a consequence of lower frequencies producing a larger spread of correlations. Imagine two signals that are fluctuating in time with a similar pattern of fluctuation. When they are correctly aligned they are correlated with each other, but if you shift one of the signals in time those fluctuations are mismatched and you can end up with zero or negative correlations. Now imagine making those fluctuations much slower. If you use the same time shifts as before, the signals will still be fairly correlated, because the signals change much more slowly. As a result, the span of null correlations also increases. This can be corrected by normalizing the true correlations and prediction accuracies with a null distribution at each tempo. But with this in mind, it is hard to conclude if the greater correlations found for lower musical tempos in their current form are a true effect.
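      The argument above can be checked with a small numerical example. The sketch below is an idealized illustration with pure sinusoids, not the reviewer's or the authors' code; `shifted_correlation`, the sampling rate and the 0.1 s shift are assumptions. For a sinusoid of frequency f shifted by Δ seconds, the correlation with the unshifted copy is cos(2πfΔ), so the same shift leaves a slow signal strongly correlated while a fast signal becomes anticorrelated.

```python
import numpy as np

def shifted_correlation(freq_hz, shift_s, fs=1000, duration_s=10.0):
    """Correlation between a sinusoid and a circularly shifted copy of itself."""
    t = np.arange(int(fs * duration_s)) / fs
    x = np.sin(2 * np.pi * freq_hz * t)
    # Circular shift by the requested number of samples.
    y = np.roll(x, int(round(shift_s * fs)))
    return np.corrcoef(x, y)[0, 1]

slow = shifted_correlation(1.0, 0.1)   # 1 Hz signal, 100 ms shift
fast = shifted_correlation(4.0, 0.1)   # 4 Hz signal, same shift
```

      Here `slow` is about 0.81 and `fast` is about -0.81, which is why a null distribution built from a fixed set of shifts has a tempo-dependent spread and why the normalization has to be done separately at each tempo.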

      Thank you for this great suggestion. We followed your lead (Zuk et al., 2021), and normalized all measures of neural synchronization (TRF correlation, SRCorr, SRCoh) relative to a surrogate distribution. The surrogate distribution was calculated by randomly and circularly shifting the neural data relative to the musical features for each of 50 iterations. This was done separately for every musical feature and stimulation tempo condition (Figures 2 and 3). After normalization, the results look qualitatively similar and the main results – spectral flux and slow stimulation tempi resulting in highest levels of neural synchronization – persist.

      The changes in the manuscript based on your comment (and the comment of Reviewer 1) can be found throughout the manuscript, but especially on p. 11, l. 210-218, Figures 2-3 and a more detailed explanation in the Methods section:

      p. 39, l. 821-829: “In order to control for any frequency-specific differences in the overall power of the neural data that could have led to artificially inflated observed neural synchronization at lower frequencies, the SRCorr and SRCoh values were z-scored based on a surrogate distribution (Zuk et al., 2021). Each surrogate distribution was generated by shifting the neural time course by a random amount relative to the musical feature time courses, keeping the time courses of the neural data and musical features intact. For each of 50 iterations, a surrogate distribution was created for each stimulation subgroup and tempo condition. The z-scoring was calculated by subtracting the mean and dividing by the standard deviation of the surrogate distribution.”
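
      The surrogate normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a plain Pearson correlation stands in for the synchronization measure (the manuscript uses TRF correlation, SRCorr and SRCoh), and the function names, the fixed seed, and the default of 50 iterations taken from the response are choices made here for illustration only.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return num / den if den else 0.0


def circular_shift(x, k):
    """Rotate x by k samples, wrapping the end around to the start."""
    x = list(x)
    k %= len(x)
    return x[-k:] + x[:-k] if k else x


def surrogate_zscore(neural, feature, n_iter=50, seed=0):
    """z-score the observed correlation against circular-shift surrogates.

    Hypothetical helper: the observed value is compared with a null
    distribution built by misaligning the neural data and the feature.
    """
    rng = random.Random(seed)
    observed = pearson(neural, feature)
    null = []
    for _ in range(n_iter):
        # A random non-zero shift keeps each time course intact but misaligned.
        k = rng.randrange(1, len(neural))
        null.append(pearson(circular_shift(neural, k), feature))
    mu = statistics.fmean(null)
    sd = statistics.stdev(null)  # assumes the surrogates are not all identical
    return (observed - mu) / sd
```

      Because the null distribution is built separately for each condition, a slowly fluctuating signal that yields a wider spread of chance correlations is penalized by a larger standard deviation, which is the correction the reviewer asked for.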

      If the strength of neural tracking at low tempos is a true effect, it is worth noting that the original tempi for the music clips span 1 - 2.5 Hz (Supplementary Table 1), roughly the range of tempi exhibiting the largest prediction accuracies and correlations. All tempos above this range are produced by digitally manipulating the music. It is possible that the neural tracking measures are higher for music without any digital manipulations rather than reflecting the strength of tracking at various tempi. This could also be related to the authors' finding that neural tracking was better for more familiar excerpts. This alternative interpretation should be acknowledged and mentioned in the discussion.

      Thank you for these important suggestions (see also comment #2 (part 2) from Reviewer 1). First, it is important to note that all music stimuli were tempo manipulated: even if the tempo of an original music segment was, e.g., 2 Hz and the same song was presented at 2 Hz, it was still converted via the MAX patch to 2 Hz again (to make it comparable to the other musical stimuli). Second, it is true that we cannot fully exclude the possibility that the amount of tempo manipulation could have an effect on neural synchronization to music, meaning that less tempo-manipulated music segments (i.e., a stimulation tempo close to the original tempo) could result in higher neural synchronization. However, we have now conducted an additional analysis to address this as best we could.

      We compared TRF correlations for a) songs that were shifted very little relative to their original tempi to b) songs that were shifted a lot relative to their original tempi. We did not have enough song stimuli to do this for every stimulation tempo, but we were able to do the TRF correlation comparison for two illustrative stimulation tempo conditions (at 2.25 Hz and 1.5 Hz). In those tempo conditions, we took the TRF correlations for up to three trials per participant when the original tempo was around the manipulation tempo (1.25-1.6 Hz for 1.5 Hz or 2.01-2.35 Hz for 2.25 Hz) and compared them to those trials where the original tempo was around 0.75–1 Hz faster or slower than the manipulated tempo at which the participants heard the songs (Figure 3 – figure supplement 2). This analysis revealed that there was no significant effect of the original music tempi on the neural response (please see Material and Methods, p. 40, l. 855-861 and Results p. 13, l. 265-273). In response to your and Reviewer 1’s comments, we also added it to the discussion.

      p. 23-24 l. 427-436: “The tempo range within which we observed strongest synchronization partially coincides with the original tempi of the music stimuli (Figure 1 – figure supplement 2). A control analysis revealed that the amount of tempo manipulation (difference between original music tempo and tempo the music segment was presented to the participant) did not affect TRF correlations. Thus, we interpret our data as reflecting a neural preference for specific musical tempi rather than an effect of naturalness or the amount that we had to tempo shift the stimuli. However, since our experiment was not designed to answer this question, we were only able to conduct this analysis for two tempi, 2.25 Hz and 1.5 Hz (Figure 3 – figure supplement 3), and thus are not able to rule out the influence of tempo manipulation on other tempo conditions.”

      We also provide more information to the reader about the amount of tempo shift that each stimulus underwent. We added two plots to the manuscript that show 1) the distribution of original tempi of the music stimuli and 2) the distribution of the amount of tempo manipulation across all stimuli (Figure 1 – figure supplement 2).

      Their last finding regarding predicting tapping rates is novel and important, and the model they use to make those predictions does well. But I am concerned by how well it performs (Figure 6), since it is not clear what features of the TRF are being used to produce this discrimination. Are the effects producing discriminable tapping rates and stimulation tempi apparent in the TRF? I noticed, though, that these results came from two stages of modeling: TRFs were first fit to groups of excerpts with different tapping rates or stimulation tempo separately, then a support vector machine (SVM) was used to discriminate between the two groups. So, another way to think about this pipeline is that two response models (TRFs) were generated for the separate groups, and the SVM finds a way of differentiating between them. There is no indication about what features of the TRFs the SVM is using, and it is possible this is overfitting. Firstly, I think it needs to be clearer how the TRFs are being computed from individual trials. Secondly, the authors construct surrogate data by shuffling labels (before training) but it is not clear at which training stage this is performed. They can correct for possible issues of overfitting by comparing to surrogate data where shuffling happens before the TRF computation, if this wasn't done already.

      Thank you for noticing this important point. You are absolutely right: when re-analyzing that part of the results based on your comment, we noticed that we had an error in our understanding of the analysis pipeline. Indeed, we first calculated two TRF models for the separate groups (e.g., stimulation tempo = tapping tempo vs. stimulation tempo = 2 × tapping tempo) based on all trials of each group apart from the left-out trial. Next, the resulting TRFs were fed into the SVM, which was used to predict the group. The shuffling of the surrogate data occurred at the SVM training step.

      Based on your comment, we tried several approaches to solve this problem. First, we calculated TRFs on a single-trial basis (instead of using the two-group TRFs as before, only one trial was used to calculate the TRFs) and submitted the resulting TRFs to the SVM. The resulting SVM accuracy was compared to a “surrogate SVM accuracy”, which was calculated by shuffling the labels when training the SVM classifier. Second, we shuffled, as you suggest, the labels not at the SVM training step, but instead prior to the TRF calculation. This way we could compare our “original” SVM accuracies (based on the two-group TRFs) to a fairer surrogate dataset. However, in both cases the resulting SVM accuracies were no better than those for the surrogate data. Therefore, we felt that it was fairest to remove this part from the manuscript. We are aware that this was one of the main results of the paper and we are sorry that we had to remove it. However, we feel that our paper is still strong and offers a variety of different results that are important for the auditory neuroscience community.

      Lastly, they show that their measures of neural tracking are larger for music with high familiarity and high ease-of-tapping. I expect these qualitative ratings could be a consequence of acoustic features that produce better EEG correlations and prediction accuracies, especially ease-of-tapping. For example, music with acoustically-salient events are probably easier to tap to and would produce better EEG correlations and prediction accuracies, hence why ease-of-tapping is correlated with the measures of neural tracking. To understand this better, it would be useful to see how the stimulus features correlate with each of these behavioral ratings.

      We agree that our rating-based results could be influenced by acoustic stimulus features (at least for ease of tapping; it is not clear to us why familiarity would be related to acoustics). As it is difficult to correlate stimulus features (time-domain, and one time course per song) with behavioral ratings (one single value per song per participant), we conducted a frequency-domain analysis on the musical features to arrive at a single value quantifying the strength of spectral flux at the stimulation frequency and its first harmonic. We calculated single-trial FFTs on the spectral flux (which was used for the main Figure 5) for the 15 highest- and 15 lowest-rated trials per behavioral category (enjoyment, familiarity, ease to tap the beat) and participant. We compared the z-scored FFT peaks at the stimulation tempo and first harmonic for the top- and bottom-rated stimuli. We did observe significant acoustic differences between top- and bottom-rated stimuli in each category, but the differences were not in the direction that would be expected based on acoustically more salient events leading to better TRF correlations, with the exception of ease of tapping. Easy-to-tap music did indeed have stronger spectral flux than difficult-to-tap music, which is intuitive. However, spectral flux was stronger for more enjoyed music (we did not see any significant differences between TRF correlations of more vs. less enjoyed music; Figure 5C) and for less familiar music (this is the opposite of what we saw for the TRF measures). Overall, given the inconsistent relationship between acoustics, behavioral ratings, and TRF measures, we would argue that acoustic features alone cannot explain our results (Figure 5 – figure supplement 1, p. 21 l. 381 – 387).
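
      Reducing a feature time course to a single value at the stimulation frequency and its first harmonic could look roughly like the sketch below. It is only an illustration under stated assumptions: the response does not say what the FFT peaks were z-scored against, so the code z-scores each peak against all positive-frequency bins of the same spectrum, and the function names and the direct DFT (used here to avoid external libraries) are choices made for this example.

```python
import cmath
import math
import statistics


def amplitude_spectrum(x, fs):
    """Return (freqs, amplitudes) for the positive-frequency DFT bins of x."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]  # remove the mean so it does not leak into low bins
    freqs, amps = [], []
    for k in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))
        freqs.append(k * fs / n)
        amps.append(abs(coef) / n)
    return freqs, amps


def zscored_peak(x, fs, target_hz):
    """z-score of the amplitude at the bin nearest target_hz.

    Assumption: normalization is relative to all positive-frequency bins.
    """
    freqs, amps = amplitude_spectrum(x, fs)
    idx = min(range(len(freqs)), key=lambda i: abs(freqs[i] - target_hz))
    mu = statistics.fmean(amps)
    sd = statistics.pstdev(amps)
    return (amps[idx] - mu) / sd
```

      For a stimulation tempo of 2 Hz, `zscored_peak(flux, fs, 2.0)` and `zscored_peak(flux, fs, 4.0)` would give the two values to compare between top- and bottom-rated trials, where `flux` and `fs` are a hypothetical spectral-flux time course and its sampling rate.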

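    1. The MIT/Brown Vannevar Bush Symposium was held at MIT on October 12-13, 1995, to mark the 50th anniversary of Vannevar Bush's seminal article “As We May Think”, published in The Atlantic Monthly in July 1945. The event video is in five parts; this is Part 1, which mainly covers:

      1. Opening remarks by Paul Penfield.

      2. Andy van Dam introduces Vannevar Bush and his career.

      3. Paul Kahn (Memex expert): Visual tour of Bush's work.

      4. Douglas Engelbart: The Strategic Pursuit of Collective IQ. Abstract: To me, the legacy Bush left in “As We May Think” bears directly on the very real and important potential for raising the collective IQ of the social organisms that human organizations represent. The companies, institutions, and indeed nations that pursue this potential most seriously and effectively will clearly gain a strong success/survival advantage. Beyond that, whether humanity as a whole can survive in a healthy and “humane” social, political, economic, and ecological environment may well depend on how soon and how effectively we explicitly pursue this potential.

      Serious pursuit will involve many changes in the way we think, coordinated with many concurrent changes in “the way we work”, as well as in how we collaborate, share, take on new roles, exercise new or different skills and sets of methods, and so on. In short, it will involve fundamentally new ways of coupling basic human sensory, motor, mental, and learning capabilities to the tasks of collectively developing, integrating, and applying knowledge.

      Effective pursuit will require a strategic approach whose acceptance will certainly involve critical shifts in some prevailing paradigms. I would like to describe them, along with their relative roles in a candidate “bootstrapping” strategy for pursuing a significant, large-scale increase in collective IQ.

      Technology is only one important element of that strategy; within it, the key is to accelerate the development of open hyperdocument systems (OHS), with appropriate goals for generic functionality, application domains, interoperability, and scalability. The exciting emergence of WWW/HTML provides an extremely important boost; I would like to describe some candidates for the next stage of evolution toward the OHS goals.

      5. Ted Nelson (Theodor Holm Nelson): Where the Trail Leads. Abstract: Like any concise prophetic work, “As We May Think” supports many interpretations and raises questions of extrapolation. We gather today to pay tribute and to argue over whose ideas most faithfully express what was originally said.

      Bush foresaw a publicly accessible, rapidly accessible literature of connections that would let people publish connections between already-existing materials. But the structure he foresaw, which he called “trails”, is quite different from today's spaghetti hypertext; Bush's structure was based on transclusion rather than links. It deserves detailed study.

      Properly extrapolated and polished, I believe this idea leads to cross-parallel media (in which connected objects are seen together with their connections), and to the design of a broad copyright arrangement that permits unencumbered reuse.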

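    1. The MIT/Brown Vannevar Bush Symposium was held at MIT on October 12-13, 1995, to mark the 50th anniversary of Vannevar Bush's seminal article “As We May Think”, published in The Atlantic Monthly in July 1945. The event video is in five parts; this is Part 3, which mainly covers:

      1. A talk by Michael Lesk: “The Seven Ages of Information Retrieval”. Abstract: Vannevar Bush's 1945 article set a goal of fast access to the contents of the world's libraries, which looks as if it will be achieved by 2010, 65 years later. Its history is thus comparable to that of a person. Information retrieval had its schoolboy phase of research in the 1950s and early 1960s; it then struggled for adoption in the 1970s, but in the 1980s and 1990s, with the routine use of free-text retrieval systems, it has been accepted. For example, my company no longer prints its corporate telephone directory on paper. Now it is moving forward, taking on sound and image retrieval projects while making most of what is now in libraries available electronically. We can look forward to Bush's dream being fulfilled within a single lifetime.

      2. Day 1 panel discussion.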

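    1. The MIT/Brown Vannevar Bush Symposium was held at MIT on October 12-13, 1995, to mark the 50th anniversary of Vannevar Bush's seminal article “As We May Think”, published in The Atlantic Monthly in July 1945. The event video is in five parts; this is Part 2, which mainly covers:

      1. A talk by Robert Kahn: “Augmenting Bush's Vision with Digital Technology”. Abstract: Although Vannevar Bush described the importance of information sharing in his classic paper “As We May Think”, his vision was necessarily limited by the technology of his day. In particular, the digital computing and communications technologies we now take for granted had not even entered his frame of reference. This talk will explore the possible evolution of computer and communications infrastructure, and the roles of architecture, technology, and intelligence in that system. Connectivity, together with a virtually unlimited supply of digital objects, generic services, and applications, will stimulate the sharing of ideas, joint activities of many kinds, virtual entities, and teamwork on the network. The role of software agents inside and outside the network will be considered in the context of distributed task execution. Finally, the prospects for intelligent distributed systems will be explored.

      2. A talk by Tim Berners-Lee: “Hypertext and Our Collective Destiny”. Abstract: Bush considered the plight of researchers deluged with inaccessible information. He proposed the MEMEX, a machine that could be accessed quickly and would allow random links between pieces of information. Since then, networks and computers have taken us beyond this visionary idea in speed and convenience. Yet we have seen no great advance in our ability to solve political problems, manage large organizations, or amplify the intuition of our groups. We must do more than empower the individual. We must enable people and machines interacting together to act as a group in new ways. Now that we can make trails through our information, we must create a substrate in which those trails will grow into an increasingly meaningful whole rather than a tangled mass. We and our documents can operate together as one large machine, but not as one large mind. Groups of all sizes must acquire the gifts of intuition, association, and invention, which we usually associate with people rather than machines, before we can meet Bush's challenge to humanity to “grow in the wisdom of race experience” rather than “perish in conflict”.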

  7. Jun 2022
    1. Author Response

      Reviewer #1 (Public Review):

      The authors set out to consider more the role of the predator in predator-prey interactions, particularly from a collective locomotion aspect. This is an aspect which at times has been overlooked, with many theories, experiments and models focusing largely on the prey response, independent of how the predator behaves. The major strengths are the (1) excellent writing, (2) quality of the figures, (3) quantity of data, and (4) question tackled. The major weaknesses are (1) the volume of information (as a reader, it is quite hard to distil key points from the sheer volume of what has been presented), (2) the confined captive environment making it difficult to draw comparisons with a wild-type scenario, and (3) lack of clarity about the wider implications of the work outside of the immediate field.

      We thank the reviewer for their thoughtful review and positive comments. To address the weaknesses highlighted by the reviewer, we have revised our manuscript throughout.

      Reviewer #2 (Public Review):

      The manuscript describes a laboratory-based predator-prey experiment in which pike hunt shiner fish as a way to gain insight into the selective pressures driving the evolution of collective behavior. Unlike the predictions of classical theoretical work in which prey on the edge of social groups are considered to be at highest risk of predation, the fish in the center of the school were primarily targeted by the pike. This is because the pike uses a hunting behavior in which it slowly moves to the center of the school, seemingly undetected, until it rapidly attacks prey directly in front of its snout. This study also differs from previous studies in that both the predator and prey motion are examined, and the success of predation attempts was precisely determined. While the study demonstrates why shiners would be under selective pressure to avoid the center of a school, I am not convinced that the results explain why shiners evolved to have schooling behavior.

      The reviewer indeed highlights one of the main findings of our study, that fish closer to the group center are more at risk of being attacked by pike. They also give a proper account of its possible explanation, and highlight some of the main ways in which our study differs from previous work. The reviewer states that our results do not explain why shiners evolved to school. We agree and note that we also don’t claim this anywhere in the manuscript. Rather, we state our study provides important new insights about differential predation risk in groups of prey and highlight the important role of predator attack strategy and decision-making and prey response, with potential repercussions for the costs and benefits of grouping.

      We have considerably revised our introduction to better explain the importance of understanding differential predation risk in animal groups (lines 36-50): A key challenge in the life of most animals is to avoid being eaten. Via effects such as enhanced predator detection (Lima, 1995; Magurran et al., 1985), predator confusion (Landeau and Terborgh, 1986), and risk dilution effects (Foster and Treherne, 1981; Turner and Pitcher, 1986), individuals living and moving in groups can reduce their risk of predation (Ioannou et al., 2012; Krause and Ruxton, 2002; Pitcher and Parrish, 1993; Ward and Webster, 2016). This helps explain why strong predation pressure is known to drive the formation of larger and more cohesive groups (Beauchamp, 2004; Krause and Ruxton, 2002; B. Seghers, 1974). However, the costs and benefits of grouping are not shared equally among individuals within groups, and besides differential food intake and costs of locomotion, group members themselves may experience widely varying risks of predation (Handegard et al., 2012; Krause, 1994; Krause and Ruxton, 2002). Where and who predators attack within groups not only has major implications for the selection of individual phenotypes, and thereby the emergence of collective behaviour and the functioning of animal groups (Farine et al., 2015; Jolles et al., 2020; Ward and Webster, 2016), but also shapes the social behaviour of prey and the properties and structure of prey groups. Hence, a better understanding of the factors that influence predation risk within animal groups is of fundamental importance.

      And in the discussion we now better explain the potential evolutionary consequences of the findings of our work (lines 456-466): Predation is seen as one of the main factors to shape the collective properties of animal groups (Herbert-Read et al., 2017) and has so far generally been seen to drive the formation of larger, more cohesive groups that exhibit collective, coordinated motion (see e.g. Beauchamp, 2004; Ioannou et al., 2012; B. H. Seghers, 1974). Our finding that central individuals are more at risk of being predated could actually have the opposite effect, with schooling having a selective disadvantage and over time resulting in weaker collective behaviour and less cohesive schools. However, we do not deem this likely as selection is likely to be group-size dependent, as discussed above. Furthermore, our multi-model inference approach revealed that, despite more central individuals experiencing higher predation risk, being close to others inside the school was still associated with a lower risk of being targeted. As most prey experience many types of predators, including sit-and-wait predators and active predators that hunt for prey, the extent and direction of such selection effects will depend on the broader predation landscape in which prey find themselves.

      Major strengths of the paper include the precise recording of the location and orientation of all fish at all times during the experiments. This indeed provides a rich dataset that can be used to search for the factors that predict the likelihood of attack and escape with higher statistical power.

      The major concern I have about the manuscript is that the results somewhat contradict the aim of the paper as expressed in the introduction and discussion: that predator-prey interactions explain the emergent evolution of collective behavior. Figure 2C shows that fish in smaller clusters or those that were totally isolated experienced lower rates of predation and were not included in any subsequent analyses. This would suggest that shiners experiencing predation from pike would be under strong selection to avoid schooling behavior altogether. Can you compare the likelihood of predation for individuals in non-central school locations compared to individuals outside of schools altogether? It might be helpful to investigate whether other predators of shiners use predation strategies that target prey on the edge of the school to help explain why schooling could be useful. Did the likelihood of schooling decrease throughout the trials?

      The reviewer makes a good point regarding the observation that pike tended to mainly attack individuals in the main school, questioning if this would result in a selective disadvantage for schooling. We would like to point out that this result concerns the likelihood of attacking an individual, not the likelihood of a successful attack. If we look at the latter, we find that 5 out of 8 attacks away from the main school were successful, a ratio that is actually similar to that of the main school. More importantly, when wanting to understand how predation risk is linked to group size, one needs to look at the per capita risk. If we do that for the group size we used in our study, despite a moderately elevated risk of being predated in a large group, the shiners in the main school still had a considerably lower individual risk of being killed than those that occurred in small sub-groups or were alone. We would like to note that in our study the shiners did not really show proper fission-fusion behaviour and for by far the majority of the time the shiners were in one large cohesive school. Therefore, we feel our dataset is not suitable for a proper investigation of the role of group size in predation risk.

      We now clarify these points in the discussion (lines 467-471): While the finding that pike were more likely to attack the main school may also appear to indicate a selective disadvantage to school, calculating the per-capita-risk for each individual would actually reveal it is still safest to be part of the main school. Nevertheless, as the shiners in our study rarely exhibited fission-fusion dynamics we feel our dataset is not appropriate to make proper inferences about how predation risk is linked to group size.

      We have also slightly extended the relevant sentences in the results to further clarify the clustering results (lines 144-150): We found that, by and large, the shiners were organised in one large, cohesive school at the time of attack and rarely showed fission-fusion behaviour (merging and splitting of schools) during the trials. Only occasionally were there one or two singletons besides the main school (25 attacks) or multiple clusters of more than two fish (12 attacks; Figure 2C), which tended to exist relatively briefly (mean school size: 36.5 ± 0.8). In more than 80% of these cases, pike still targeted an individual in the main cluster (Figure 2C).

      We now also provide more discussion about other predator types being likely to attack central prey (lines 343-354): That predators may actually enter groups and strike at central individuals is not often considered (Hirsch and Morrell, 2011), possibly because it contrasts with the long-standing idea that predation risk is higher on the edge of animal groups (Duffield and Ioannou, 2017; Krause, 1994; Krause and Ruxton, 2002; Stankowich, 2003). However, our finding is in line with the predictions of theoretical work that suggest that the extent of marginal predation may depend on attack strategy and declines with the distance from which the predator attacks (Hirsch and Morrell, 2011). Furthermore, increased risk of individuals near the centre of groups may be more widespread than currently thought. Predators not only exhibit stealthy behavioural tactics that enable them to approach and attack central individuals, as we show here, but may also do so by attacking groups from above (Brunton, 1997) or below (Clua and Grosvalet, 2001; Hobson, 1963; but see Romey et al., 2008), and by rushing into the main body of the group (Handegard et al., 2012; Hobson, 1963; Parrish et al., 1989).

      We furthermore discuss the potential role of group size on the observed effects (lines 441-455): In particular, while group size is not expected to much affect whether ambush predators are likely to attack internal individuals, the specific risk of central individuals could both be hypothesized to decrease with group size, such as if the predator is more likely to attack when surrounded by prey, or to not be affected by it, such as if the predator actively targets central individuals. Whatever the process, the observed findings are likely for prey that move in groups of somewhat intermediate size; for very large groups, such as the huge schools encountered in the pelagic, ambush predators may simply not be able to attack the group centre due to spatial constraints. More generally, the tendency for predators to attack the centre of moving groups may depend on the medium in which the predator-prey interactions occur. As in the air there is potential for (fatal) collisions, and on land it is physically difficult for predators to enter groups and predators’ size advantage tends to be more limited, predators may be less likely to go for the group centre as compared to in aquatic or mixed (e.g. aerial predator hunting aquatic prey) systems. Hence, the important interplay we highlight between predator attack strategy and prey response may have different implications across different predator-prey systems and warrants concerted further research effort.

      Finally, in response to the reviewer’s question of whether the likelihood of schooling decreased through the trials, we did not see a change in packing fraction (median nearest-neighbour distance) with repeated exposure to the pike, but shiners increasingly avoided the area directly in front of the pike’s head (lines 182-186): While the shiners did not show a change in their packing fraction (median nearest-neighbour distance) with repeated exposure to the pike (F(1,52) = 1.81, p = 0.185), they increasingly avoided the area directly in front of the pike’s head (Appendix 2 – Figure 1A), resulting in the pike attacking from increasingly further away (target distance: F(1,52) = 45.52, p < 0.001, see Appendix 2 – Figure 1B,C). See also Appendix 2.
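
      The packing measure named above, the median nearest-neighbour distance, can be computed from tracked positions along the lines of the following sketch. It is not the authors' tracking code: the function name is hypothetical, and the input is assumed to be a list of (x, y) coordinates for one video frame in arbitrary length units.

```python
import math
import statistics


def median_nearest_neighbour_distance(positions):
    """Median over individuals of the distance to their closest neighbour."""
    if len(positions) < 2:
        raise ValueError("need at least two individuals")
    nearest = []
    for i, p in enumerate(positions):
        # Distance from individual i to every other individual; keep the smallest.
        nearest.append(
            min(math.dist(p, q) for j, q in enumerate(positions) if j != i)
        )
    return statistics.median(nearest)
```

      Using the median rather than the mean keeps the measure robust to the occasional singleton far from the main school.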

      I am also curious whether tank size affects the behavior of the fish, both of the shiners and the pike. The pike seem to be approximately 1/3 the shortest length of the tank, and 6 inches of depth have constrained the movement to be mostly in the 2D plane. A lack of open space might limit the pike's ability to hunt in any way other than this stealthy strategy. Has this stealthy hunting strategy been described in other experiments in larger or more naturalistic conditions? Does open space affect the shiners' propensity to school? Although the manuscript describes that shiners tend to school near the surface of water, does the shallow depth affect the pike's behavior? The manuscript states that some pike never attacked -- were these the largest in the study?

      While the tank is small relative to the real world, we decided on this size of ~2 m² based on previous experimental work on predator-prey dynamics. As we stated in the methods of the original manuscript (lines 543-545), we expect that, had a much larger space been used, pike would still show the same approach and attack behaviour linked to their stealthy attack strategy. The stealthy hunting behaviour of pike and similar predators, and their ability to thereby get very close to their prey, has been described elsewhere (see e.g. references on lines 332-344 of the original manuscript).

      We now better explain the potential limitation of the arena size in the discussion (lines 472-480): Laboratory studies on predator-prey dynamics like ours do, of course, have their limitations. Although the size of the arena we used (~2 m²) is in line with behavioural studies with large schools of fish (e.g. Sosna et al., 2019; Strandburg-Peshkin et al., 2013) and experiments with live predators attacking schooling prey (Bumann et al., 1997; Magurran and Pitcher, 1987; Neill and Cullen, 1974; Romenskyy et al., 2020; Theodorakis, 1989), compared to conditions in the wild the prey and predator had limited space to move. However, as pike are ambush predators they tend to move relatively little to search for prey and rather rely on prey movement for encounters (Nilsson and Eklöv, 2008). Increasing tank size would have made effective tracking extremely difficult, or impossible, and while a much larger tank is expected to considerably increase latency to attack, we expect it to have relatively little effect on the observed findings.

      We agree that the shallow depth of the tank is a limitation of our study and may have somewhat restricted the pikes’ natural behaviour, although pilot experiments showed that the pike exhibited normal movements and attack behaviours. Fish were tested in very shallow water to be able to acquire detailed individual-based tracking of the schools as well as compute features related to the visual field of the fish. We would also like to note that both shiners and pike can often be found in the littoral zone and occur in very shallow water only a few tens of cm deep (see e.g. Krause et al., 2000b; Pierce et al., 2013; Skov et al., 2018), with some experimental work furthermore showing that pike may actually prefer shallow water (Hawkins et al., 2005). We don’t think that increasing the depth of the tank would have considerably changed the predatory behaviour of the pike, as the pike would be expected to still use their stealthy approach to get close to their prey even if the prey school were more three-dimensional.

      We now provide a much more extensive discussion of the limited depth used in the discussion (lines 480-494): In terms of water depth, fish were tested in very shallow water. This was primarily done to be able to keep track of individual identities and compute features related to the visual field of the fish. Shiners naturally school in very shallow water conditions as well as near the surface in deeper water in the wild (Hall et al., 1979; Krause et al., 2000b; Stone et al., 2016) and pike also primarily occur in the shallow littoral zone, sometimes only a few tens of cm deep (Pierce et al., 2013; Skov et al., 2018). Furthermore, pilot experiments showed the pike did exhibit normal swimming and attack behaviour with attack speeds and acceleration comparable to previous work (Domenici and Blake, 1997; Walker et al., 2005). Other recent work on predator-prey dynamics did not find a considerable impact of adding the third dimension to their analyses (Romenskyy et al., 2020). Still, the water depth used is a limiting factor of our study and in the future this type of work should be extended to deeper water while still keeping track of individual identities over time. We expect that adding the third dimension would not change the stealthy attack behaviour of the pike and therefore still put more central individuals most at risk, but possibly attack success would be reduced because of increased predator visibility and prey escape potential in the vertical plane, which remains to be tested.

      We did not observe a relationship between pike size and tendency to attack.

      Reviewer #3 (Public Review):

      While it has long been clear that animals in groups (e.g., fish schools) benefit in terms of safety in numbers, there has also been a keen interest in which animals in the group are at higher versus lower risk (e.g., those in front, or along the edges) and how that might depend on the predator's attack strategy. This study addresses these important predator-prey details using a common predatory fish (northern pike) attacking schools of prey fish (golden shiners). A strength of the study is that it uses cutting-edge video tracking and computational/statistical methods that allow it to quantify and follow each fish's (1 predator and 40 prey in a group) spatial position, relative spacing, orientation and even each individual's visual field and movement throughout each of 125 attacks. Most (70%) of these attacks were successful, but many were not. The variation in attack success allowed the investigators to do statistical analyses to identify key predator and prey behaviors that are associated with successful vs. unsuccessful attacks.

      The study yielded numerous interesting insights. While conventional wisdom pictures predators initiating an attack from outside of the group thus putting individuals at the group's edge at greatest risk, this study found that pike typically approached the school of prey head-on both in terms of the group's orientation and direction of movement, and often stealthily moved within the group before initiating an attack. To understand which prey individual was targeted by the predator, the highly quantitative video analyses examined 11 measures of each individual prey's position and orientation at the time that the pike initiated its attack. Of course, pike showed a strong tendency to target one of the 3 closest prey, particularly prey that were more or less directly in front of the pike. However, contrary to conventional wisdom, the analysis showed that targeted prey were closer to the center than the edge, and that an individual's position and orientation relative to other nearby prey also played an important role in whether it might be targeted by the predator. Not surprisingly, analyses showed that targeted prey were more likely to escape if they were further from the predator's head and if they exhibited higher maximum acceleration. Interestingly, during the actual strike, on average, the predator accelerated to a speed about 50% faster than the velocity of the targeted prey.

      A limitation of the study (that the authors describe and discuss) is that it was conducted in a tank with no spatial refuges whereas in nature, pike are often found in areas with vegetation, and schools of prey can often potentially respond to the presence of a predator by moving towards refuge (e.g., vegetation). Also, the study was done in very shallow water (6 cm) -- likely shallower than many, if not most, natural predator-prey interactions for these species. In deeper water, the predator-prey interaction might be better analyzed in three dimensions (i.e., also accounting for variation in vertical height in the water), though the authors argue that this conventional idea is not necessarily true.

      Overall, this study provides an impressive example of how the use of modern technology and statistical analyses allows us to better describe and understand the fine-scale behaviors that affect an interaction of high importance for ecology and evolution.

      We thank the reviewer for the care and attention put in their review and their detailed objective assessment of our study.

      Regarding refuge use, it is true that in the wild pike are often found in areas with vegetation, but it is actually predominantly younger pike seeking refuge among vegetation from predators themselves, including from cannibalism by larger pike (see Skov & Lucas, 2018 Chapter 5). Vegetation is also used by pike as background camouflage rather than a refuge per se, but due to their elongated body and narrow frontal body pike are able to approach and ambush prey when no vegetation is available, as we show in our study. During pilot experiments we did provide pike with refuges, but they never used them. Refuges would also have given the shiners a place to hide, which would have considerably impacted our ability to investigate predation risk within the schools, so no refuges were provided during the experiment.

      We have now added an explanation about not using refuges in the discussion (lines 495-502): For our experiments we used a testing arena without any internal structures such as refuges. This was a strategic decision as providing a more complex environment would have impacted the ability of the shiners to school in large groups and would have led fish to hide under cover. Although studying predator-prey dynamics in more complex environments would be interesting in its own regard, it would not have allowed us to study the questions we are interested in about the predation risk of free-schooling prey. Furthermore, pilot experiments indicated that the pike never used refuges (consistent with previous work, see Turesson and Brönmark, 2004), so they were not further provided during the actual experiment.

      Regarding the shallow depth of the tank, we now better acknowledge this limitation and explain our reasoning (lines 480-482): In terms of water depth, fish were tested in relatively very shallow water. This was primarily done to be able to keep track of individual identities and compute features related to the visual field of the fish. We would also like to note that both shiners and pike spend a lot of their life in the littoral zone and occur in very shallow water of only a few tens of cm (see e.g. Krause et al., 2000b; Pierce et al., 2013; Skov et al., 2018). Although the limited vertical space may have restricted the pikes’ natural behaviour to some extent, they did exhibit normal swimming and attack behaviour with attack speeds and acceleration comparable to previous work (Domenici and Blake, 1997; Walker et al., 2005). We now better discuss the limitation of the shallow depth used in the discussion on lines 477-494 (see also our responses above).

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, He and collaborators analyse eight samples from six patients with acral melanoma through single-cell RNA sequencing. They describe the tumour microenvironment in these tumours, including descriptions of interactions among distinct cell types and potential biomarkers. I believe the work is thoroughly done, but I have identified a few concerns in their depiction and interpretation of their results.

      Strengths:

      1) One of the few available single-cell studies of acral melanoma, including a non-European cohort of patients.

      2) Data will be very useful to study the immune landscape of these rare tumours.

      3) Data include adjacent tissue, primary tumours and a metastatic sample, covering all disease stages.

      4) Analyses seem to be carefully done.

      Things to improve:

      1) Figures need much more description to be understandable, in particular, axes should be clearly labeled and the colour code should be specified

      Thank you for your generous comments and suggestions. We have improved the integrity of some figures and added some figure legends. I believe this will further improve the quality of our manuscript.

      2) In some places, I would recommend the authors soften their interpretation of their analyses (for example, when they suggest targeting TNFRSF9+ T cells as a novel therapy), as these are nearly all bioinformatic in a small number of samples

      As for the conclusions of TNFRSF9, we indeed provided a possibility that TNFRSF9 may serve as a novel therapy. We made some changes to soften the statement. In addition, we have added instructions and explanations in the Discussion section.

      3) I don't think the experiments add much to the literature, as these test already known oncogenes on a common, non-acral melanoma cell line.

      Thanks for your comments regarding the experiments included in our study. We have pointed out this deficiency in the Discussion section, and made some experimental changes. For example, we have removed the TWIST1-related experiments from the main Results section and shown them only as non-focus work in the Supplementary Figure.

      It is difficult for us to obtain AM cell lines. No commercial AM cell lines can be purchased from ATCC or ECACC. AM cell lines are more difficult to establish and there are few reports on methods for establishing primary acral melanoma cell cultures (PMID: 22578220, PMID: 17488338). Some Japanese and Chinese researchers have isolated the primary generation of AM cells (e.g., PMID: 17488338, PMID: 22578220, PMID: 34097822), but due to the customs policy and the COVID-19 epidemic, we could not receive them within a short period. Moreover, these studies also stated their limitations; namely, that the stability during serial passaging had not been evaluated. Therefore, it may be very time-consuming to obtain operable AM cell lines for functional assays. However, our research group would like to have the opportunity to separate and culture primary cells in subsequent studies, and improve relevant experiments according to your valuable suggestions. Many thanks again for your comments.

      Reviewer #2 (Public Review):

      The study presented by Zan He et al. dissects the main interactions between malignant and stromal cells present in acral melanoma samples and in adjacent tissues using single-cell RNA sequencing. The study describes factors that allow communication between the different cell types, with a special focus on macrophages, lymphocytes and fibroblasts, along with malignant cells. Factors playing a role in cell-cell communication are identified and suggested to be relevant prognostic markers and/or attractive therapeutic targets.

      Historically, the study of acral melanomas has been neglected due to the low incidence among people of European descent, and this formed an important gap of knowledge in the field and hindered the development of effective therapies to control the disease. Therefore, studies that address this unmet need in melanoma research are very important and should be encouraged. This includes single-cell sequencing studies that allow one to study the complexity of tumours, including microenvironment features that influence the development and effectiveness of certain types of treatment. The present study contributes information on how cells interact in the acral melanoma microenvironment and this could be a first step toward better understanding how these interactions influence acral melanoma development, progression, and therapy response.

      However, there are a few points that should be carefully considered. The authors use 3 adjacent tissues (which in theory are composed of normal skin next to a cancer lesion), 4 primary tumor samples, and one lymph node metastasis as a model to study tumor progression. Adjacent tissue is not considered a stage of tumour progression and the sample size is too small to rule out sample-dependent effects. The study is descriptive in nature and could better contextualize the findings regarding what is known for other subtypes of melanomas or other tumours. This is especially important to help readers understand why it would be relevant to study cutaneous melanomas located in acral skin. It would be helpful to explain how different it is from non-acral cutaneous melanoma, and what this study adds compared to other single-cell studies from cutaneous acral and non-acral melanomas.

      Thank you for your generous comments. It is not accurate to represent the adjacent tissue samples as ‘tumour progression’, and our study did not want to focus on the tumour developmental process. We have revised related description in the text. Tumour adjacent tissues (ATs) have always been the focus of research on TMEs. Some studies believe that there are a lot of mutations and clone amplification in normal tissues adjacent to cancer, which may be in a pre-cancerous state (PMID: 33004515), and many single-cell studies of tumours have also sampled and paired para-cancer tissues (e.g., PMID: 29988129; PMID: 35303421).

      The problem of sample size limits the generality of the results, as we pointed out in the Discussion section. Most acral melanoma (AM) patients opt for surgical resection at an early stage to avoid the possibility of metastasis. Hence, we rarely encounter patients with lymph gland (LG) metastases. We only collected one metastatic sample, because it is very rare in clinic. However, the sample has a high quality, such as a high cell activity of single cell suspension after dissociation (95.30%), and a rich amount of tumour cells and other stroma cells. Therefore, we added its sequencing data into the overall analyses, hoping to contribute to the comprehensiveness of resources and research.

      It is important to link this study with the findings regarding what is known for other subtypes of melanomas. We have now added a comparison of AMs with non-acral skin cutaneous melanomas (CMs), using the published data. Your comments and advice are very helpful to us, and we believe that the current manuscript is more comprehensive and complete.

    1. Author Response

      Reviewer #1 (Public Review):

      In this paper, the authors estimate growth curves ('nomograms') for hippocampal volume (HV) using Gaussian process regression applied to UK Biobank data and evaluate the influence of polygenic scores for HV on the estimated centile curves. By taking this into account, the centile scores are shifted up or down accordingly. The authors then apply this to the ADNI cohort and show that subjects with dementia mostly lie in the lower centiles, but this does not improve the prediction of transition from mild cognitive impairment to dementia.

      This paper is reasonably well written and the finding that centile curves for different phenotypes are sensitive to genetic features will be of interest to many in the field, albeit perhaps somewhat unsurprising given the polygenic score evaluated here is for the same phenotype under investigation (i.e. HV). I think using centiles derived from nomograms/normative models for precisely assessing both current staging and progression of neurological disorders is a highly promising direction. Regarding this manuscript, I have a few comments about the methodology and interpretation of results, which I will outline below.

      • My most significant concern is that the assumption of Gaussian residuals appears to be violated by the HV phenotypes that the authors fit their GP to. For example, in figure 2, the distribution is clearly skewed, and the lower centiles, in particular, are poorly fit to the data. First, please provide additional metrics to assess the fit and calibration of these models quantitatively (the latter can be done e.g. via Q-Q plots).

      Thanks for pointing this out. We are sorry for causing this confusion. The skew in the figure appears because the scatter plot overlaid with the GP-generated nomogram shows ADNI samples of all diagnoses, not the UKB training data used for the GP. The lower centiles are mainly occupied by the participants with AD or MCI (see the new plots in Figure 5). In addition, the healthy subjects from ADNI do indeed fit the model reasonably well. We have added a supplementary figure to show just the healthy subjects and have made the following edits in the text to address the confusion:

      Lines 143-149: “Nomograms of healthy subjects generated using the SWA and GPR method displayed similar trends (Figure 2; Supplementary Figure S8). … This extension allowed 86% of all diagnostic groups from the ADNI to be evaluated versus 56% in the SWA Nomograms (Figure 2; Figure 2 – Figure Supplement 2).”

      Lines 159-170 (description of figure 2): “Figure 2: Comparing Nomogram Generation Methods. Nomograms produced from healthy UKB subjects using the sliding window approach (SWA) (red lines) and Gaussian process regression (GPR) method (grey lines) … The benefits of this extension can be seen with scatter plots of ADNI subjects of all diagnoses overlaid (E, F… A similar figure with only the Cognitively Normal ADNI subjects can be found in Figure 2 – Figure Supplement 2”

      Second, I think if the authors wish to make precise inferences about the centile distribution for the reference model, then the deviation from Gaussianity ought to be accommodated in some manner. There are several options for this, including different noise models (e.g. Gamma, inverse Gamma, SHASH, etc), variable transformation, or quantile regression. One option that could be useful in the context of Gaussian process regression is the use of likelihood warping (see e.g. Fraza et al 2021 Neuroimage and references therein) which was originally developed for GP models. I would recommend the authors pursue one of these routes and provide metrics to properly gauge the fit.

      This is an excellent point. However, we believe that given that the training data indeed follows a Gaussian distribution (see new Figure 4 – Figure Supplement 3; reproduced below) across the relevant strata (sex, PGS) and across age groups, such modifications are not required.
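The role of the Gaussian assumption in this exchange can be illustrated with a minimal sketch. It assumes (this is not taken from the manuscript) that the nomogram at a given age is summarised by a predictive mean and standard deviation, so that a centile is simply the normal CDF of the standardised residual. The function names and all numbers are hypothetical.

```python
from statistics import NormalDist

# Hypothetical helpers: neither the names nor the numbers come from the manuscript.

def centile(observed, predicted_mean, predicted_sd):
    """Percentile (0-100) of an observed volume under a Gaussian predictive distribution."""
    z = (observed - predicted_mean) / predicted_sd  # standardised residual
    return 100.0 * NormalDist().cdf(z)

def centile_curve_value(predicted_mean, predicted_sd, pct):
    """Volume at which the pct-th centile curve sits for one age."""
    return predicted_mean + predicted_sd * NormalDist().inv_cdf(pct / 100.0)

if __name__ == "__main__":
    # Illustrative values only: mean 4000 mm^3, sd 400 mm^3 at some age.
    print(centile(3400, 4000, 400))             # a volume 1.5 sd below the mean
    print(centile_curve_value(4000, 400, 2.5))  # where the 2.5th centile curve lies
```

If the residuals were skewed rather than Gaussian, the lower centile curves computed this way would be misplaced, which is why the reviewer asks for calibration checks.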

      • Related to the above, it is likely that the selection of subjects with high/low polygenic scores for HV changes the shape of the distribution. It is currently impossible to assess this because no data points are shown in these cases. Please also add this information, along with comparable quantitative metrics to those for the models above.

      Thank you for bringing this up. We have now added a new supplementary figure with the shape of these distributions along with the Shapiro-Wilk test results for each of them. As can be seen, the Shapiro-Wilk test detects mild deviation from normality in some cases. However, given the size of the strata (N>2000), this is not surprising. Moreover, if a multiple-testing correction were applied across the 48 comparisons, none of the tests would be significant at the corrected threshold (P<0.001).
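The corrected threshold quoted above is consistent with a Bonferroni correction at a family-wise level of 0.05, although the response names neither the method nor the level. Under that assumption the arithmetic is

$$\alpha_{\text{corrected}} = \frac{0.05}{48} \approx 0.00104,$$

which rounds to the reported P<0.001. A minimal sketch, with hypothetical function names and made-up p-values:

```python
# Assumption: Bonferroni correction at a family-wise alpha of 0.05.
# The source only states 48 comparisons and a corrected threshold of P<0.001.

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def survives_correction(p_values, alpha=0.05):
    """Flag which p-values stay significant after correcting for len(p_values) tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

if __name__ == "__main__":
    print(bonferroni_threshold(0.05, 48))
    # Made-up p-values for illustration, not results from the study.
    print(survives_correction([0.03, 0.0004, 0.2]))
```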

      • How did the authors handle site effects? There appears to be no adjustment for the fact that the ADNI data are acquired from different sites that were not used during the estimation of the normative models. I would expect to see this dealt with properly (e.g. via fixed or random effects included in the modelling) or at the very least a convincing demonstration that site effects are not clearly biasing the results.

      We agree that site effects are a major issue; we have rerun the application experiments after adjusting the ADNI volumes with NeuroCombat. The results did not change significantly, but we have changed all the reported results with the updated results. In addition, we noted this in the methods section:

      Lines 442-445: Finally, we used NeuroCombat 1 to adjust across ADNI sites and harmonize the volumes with the UKB Dataset. To do this we modelled 58 batches (UKB data as one batch and 57 ADNI sites as separate batches) and added ICV, sex, and diagnosis (assigning all UKB as Healthy and using the diagnosis columns in ADNI) to retain biological variation.

      • How do the authors interpret the finding that the relationship between the polygenic scores and HV is different in the cohorts they consider (i.e. bimodal in UKB and unimodal in ADNI)? Does this call into question the appropriateness of the subsampled model for the clinical cohort?

      While we do see a bimodal distribution in UKB, the effect is not very strong, as the other reviewers commented. Therefore, we have de-emphasized this aspect. One reason may be that we detect the slightly bimodal aspect in UKB because of greater statistical power due to the large sample size (one order of magnitude larger). A further aspect is the SNP data used, i.e., differences in genotyping platform and imputation. This is also the reason why integrating PGS directly into the predictive model comes with additional challenges. We have addressed this topic briefly in our discussion: Lines 390-392: “Lastly, a recent study of PGS uncertainty revealed large variance in PGS estimates (ref. 63), which may undermine PGS based stratification; hence a more sophisticated method of building PGS or stratification may improve results further.”

      • Perhaps the authors can comment on (or better, evaluate) how this genetic shift could be accommodated in normative models (e.g. the possibility of including polygenic risk scores as predictor variables in the normative model). This would remove the need for post hoc adjustment and would allow more precise control over the adjustment than just taking the upper/lower xxx % of the PGS distribution as is done in the current manuscript.

      We agree that integrating the genetics directly into the normative models is a great idea, and this is the direction we will explore in future work. However, PGS themselves are prone to show ‘site’ effects that depend on the genotyping method that was used as well as on the quality of genotyping and imputation. As a consequence, using the ‘raw’ PGS scores in predictive models brings its own challenges. Therefore, we feel that the current framework is simpler at this point and illustrates the potential of PGS when combined with normative models.

      • Related to my point above, it is perhaps unsurprising that the polygenic score for the HV phenotype influences the centile distribution. I think the paper would benefit considerably by also evaluating other polygenic scores (e.g., APOE4 as in some of the prior cited references). It would be interesting to compare the magnitude and shape differences for these adjustments. The authors can consider this an optional suggestion.

      Our rationale for focusing on HV PGS was that we sought to improve the accuracy of the normative model. Genetics influence HV, and this is a first attempt to adjust for this in the normative modeling framework. Indeed, APOE-e4 has a sizable effect on HV. However, this is most likely mediated by nascent accelerated neurodegeneration, i.e., Alzheimer’s disease. Thus, in our view focusing on APOE-e4 would mean focusing on a disease effect. We address this issue briefly in the discussion (Lines 326-334). For sensitivity analysis, we did indeed test other PGS, such as AD and Whole-Brain-Volume, and found that these do not affect the normative models for HV.

      Reviewer #3 (Public Review):

      Given the large variation in and high heritability of hippocampus volume in the population, taking out known variation in the healthy population is a nice way of reducing heterogeneity, and a step forward towards using normative models in clinical practice. The dataset the nomograms are based on is large enough to do so even when stratified by polygenic scores for hippocampal volume, and these provide interesting information on the role of genetics in hippocampus volume.

      There are however several concerns regarding the applicability of the models to the ADNI dataset. First, the lack of overlap in the age range between the dataset the model is trained on and the application to subjects that are outside that age range is questionable. The authors prefer Gaussian process regression (GPR) over a sliding window-based approach using the argument that the former allows for predictions in a larger age range but extrapolation beyond the reach of the data is usually not valid. The claim that Supplementary Figure 6 shows accurate extension beyond these limits is in my opinion not justified. If anything, we can be rather certain that the extensive growth of the hippocampus up to age 48 is not realistic (see e.g. Dima et al., 2022).

      As mentioned already in response to reviewer #1, this was a miscommunication on our side. We only used the ADNI samples that were within the age range of the models they were being plotted against. The GPR model did not require smoothing at the edges of the age-range and thus can support a wider age range than the SWA. This is why we stated that the extension of the nomograms enabled more of the ADNI dataset to be used, i.e., because otherwise these samples were outside the range of the model and could not be used.

      We have changed the following lines in the manuscript to make this idea explicit:

      Lines 477-478 (end of GPR methods section): “For both SWA and GPR models, we only tested the ADNI samples that lay within the age range of each model respectively.”
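The restriction described in that sentence can be sketched as a simple filter. The age limits, the sample records, and the function name below are hypothetical and chosen only to show why a model covering a wider age range lets more samples be evaluated; they are not the ranges used in the manuscript.

```python
# Hypothetical illustration: the age limits and samples are made up.

def within_age_range(samples, min_age, max_age):
    """Keep only samples whose age falls inside a model's supported range (inclusive)."""
    return [s for s in samples if min_age <= s["age"] <= max_age]

if __name__ == "__main__":
    adni_like = [{"id": "a", "age": 58}, {"id": "b", "age": 71}, {"id": "c", "age": 84}]
    narrow = within_age_range(adni_like, 60, 75)  # stand-in for a sliding-window model
    wide = within_age_range(adni_like, 50, 80)    # stand-in for a GPR model
    print(len(narrow), len(wide))
```

Nothing here extrapolates: a sample outside a model's range is dropped rather than scored against an extended curve.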

      Regarding the accurate extension claim, we have edited the line (411-412) in the discussion so that it now reads:

      Lines 347-348 “In fact, our GPR model can potentially be extended a few years beyond those limits”

      Thank you for pointing out the discrepancy in the hippocampal growth around age 48 with the results of Dima et al. 2022. Although sample sizes in the two studies are similar, the data availability in UKB for ages 45-50 is rather sparse (N<100; see new Figure 4 – Figure Supplement 3). Thus, the observed growth is likely due to under-sampling. The growth effect has been observed in other studies using UKB data (refs. 7, 8). We have noted this in the discussion:

      Lines 354-356: “However, there is a possibility that our results suffer from edge effects. For example, we suspect that the peak noted in the male nomogram is likely due to under-sampling in the younger participants.”

      Second, the drop in mean 'percentile' difference between high and low polygenic scoring individuals when one uses genetically adjusted nomograms seems nice, but this difference is currently just a number and the reader cannot see whether this difference is significant, or clinically relevant.

      We have now provided a new figure (Figure 5) that shows the boxplots behind those numbers. The MCI-to-AD conversion analyses in the ADNI explored the clinical benefit of genetically adjusted nomograms. However, adjusted and unadjusted percentiles performed equally well. In the discussion we argue that the MCI stage is already too late and earlier stages may benefit from the increased precision:

      Lines 373-378: “However, despite this sizable effect, genetically adjusted nomograms did not provide additional insight into distinguishing MCI subjects that remained stable or converted to AD. Nonetheless, the added precision may prove more useful in early detection of deviation among CN subjects, for instance in detecting subtle hippocampal volume loss in individuals with presymptomatic neurodegeneration.”

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, the authors show that the turnover of centriole components is necessary for proper centriole maintenance within Drosophila cultured cells (during prolonged cell cycle arrest) and within Drosophila oocytes, where centrioles are normally degraded prior to fertilisation. They highlight Ana1 as an important player in centriole maintenance. The authors begin with a candidate screen to identify core centriole proteins that are required to properly maintain centrioles. They then focus on Ana1, given that its depletion had the strongest effect, and show that its depletion leads to a reduction in the levels of centriole components in Drosophila oocytes. They show that the previously observed ability of centriole-targeted Polo to counteract centriole loss depends at least in part on Ana1 and that targeting Ana1 to centrioles also counteracts centriole loss. The authors conclude that Ana1 is a component of the PCM-promoted centriole integrity pathway.

      Major comments

      1. The authors say that Plk4 depletion does not lead to centriole loss, but there are significant differences in centriole number between the control and Plk4 depletion cells in Fig 1F and S1D. Please comment.
      2. One of the main results is that depletion of centriole components leads to a reduction in centrosome numbers when measured 8 days after S-phase arrest. I wonder whether a restriction of centriole duplication could add to this effect? Any cells that were in G2 or M phase when the drugs were added would presumably progress into the following S-phase and duplicate their inherited centrioles, but not if centriole duplication proteins had been depleted. It's true that Plk4 depletion leads to a relatively mild centriole loss phenotype, but can the authors be sure that this is not due to variations in the efficiency of different RNAi constructs? Perhaps the authors can show that Plk4 depletion efficiently prevents centriole duplication under otherwise normal conditions.
      3. The authors show that Ana1 depletion has the strongest effect, but this could in theory be due to differences in RNAi efficiency. I don't expect the authors to show the efficiency of all RNAi constructs, but they could state in the text that this is a caveat e.g. "...although we cannot rule out the possibility that differences in RNAi efficiency lead to the observed differences in severity of phenotype..."
      4. A key conclusion is that core centriole components turn over to some extent and that the incorporation of new molecules is necessary for centriole maintenance. This is a very interesting and important point and so it would be nice to have more direct data to support it. This could be done in different ways, including transfecting fluorescently tagged centriole components after S-phase arrest and showing that some molecules become incorporated into the centrioles, or by performing FRAP experiments. Of course, it is possible that the turnover is so low that the incorporated fluorescent molecules cannot be detected...
      5. The authors show that depletion of Ana1 from oocytes leads to a reduction in the intensity of centriole markers. They do not measure centrosome numbers, as the centrosomes cluster too tightly. The authors therefore can't be certain that Ana1 depletion leads to a reduction in centrosome numbers. The authors could show this by inhibiting centrosome clustering while depleting Ana1. There is a recent BioRxiv paper showing that centrosome clustering can be inhibited by depletion of Kinesin-1.
      6. In Figure 3B the authors show that expression of GFP-Polo-PACT partially rescues the effect of "all PCM" depletion, but this seems strange given that Polo's role is presumably to recruit PCM (which has been depleted). Can the authors comment? Also, it would make sense to test whether GFP-Polo-PACT can rescue centriole loss after the depletion of Ana1 alone (not Ana1 and all PCM). If Ana1 has a role in recruiting Polo (either directly or indirectly), which has been shown previously in mitotic cells, then there should be a rescue to some extent.
      7. In Fig4A,C, the authors say that γ-tubulin levels at centrosomes increase when GFP-Polo is forced onto the centrosomes - the graph seems to show a big increase, but the pictures do not...? Are the authors measuring total levels at all centrosomes? If so, I think they should be measuring the average at individual centrosomes. Also, why is the level of GFP alone not much higher when expressed with GFPnanoPACT (Fig 1B)? Presumably GFP should be recruited to the centrosomes by GFPnanoPACT.
      8. The authors show that tethering Ana1-GFP to the centrioles counteracts centriole loss in oocytes (Fig4G). They say that the centrosomes are most likely inactive because they don't recruit PCM, but they have only looked at γ-tubulin, which is a downstream component of the PCM. I think it is important to check whether Polo is recruited, given that tethering Polo to centrioles also counteracts centriole loss and that a recent paper showed that Ana1 has a role in recruiting Polo to centrosomes (Alvarez-Rodrigo et al., 2021). The authors also say that these centrosomes do not organise microtubules but do not show the data.
      9. The authors propose that Ana1 is downstream of the PCM, and so over-expressing Ana1 should at least partially rescue centriole loss after PCM depletion. But I don't really agree with this. If Ana1 relies on the PCM then how would its overexpression manage to rescue the phenotype in the absence of the PCM? The finding that over-expressing Ana1 partially rescues centriole loss may instead suggest that Ana1 is either upstream of the PCM or part of an independent pathway. Indeed, the authors show that depletion of both the PCM and Ana1 has a stronger effect than either depletion individually, which is indicative of two independent pathways.

      Minor comments

      1. When the authors say that the centriole wall and cartwheel components are "dynamic" I think that they need to make it clear that this "dynamicity" is not very fast. Using the term dynamic tends to suggest rapid turnover (like in the PCM). Perhaps the authors could use the term "slow exchange" or something similar.
      2. The authors currently use a 0 or 1 centriole categorisation - it would be nice to see the breakdown of what percentage of cells have 0, 1, 2, or >2 centrioles, perhaps in a supplementary Excel file.

      Significance

      How centrioles are eliminated in certain cells is an interesting question and the data presented is also relevant to understanding centriole biology in general, because it seems that some apparently very stable structural proteins actually turn over. It is widely known that PCM proteins turn over relatively quickly, but core centriole proteins are considered to be stably incorporated. The data will therefore raise interest in the centrosome field. I do, however, feel that for the authors to make this point more strongly it would be good to show this more directly. Overall, this is a very interesting paper that is well written. The data is well presented and supports the conclusions that centriole components turn over and that Ana1 is involved in maintaining centriole integrity.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript Pimenta-Marques et al. build on their previous work addressing how centrioles are stabilized and maintained or destabilized and disassembled, depending on the cell type and developmental context. Using Drosophila cell culture and oogenesis as an in vivo model for centriole destabilization, they identify the centriole wall protein Ana1 as a central player in centriole stability. Its presence is required for the maintenance even of mature centrioles, suggesting that there is continued turnover of centriole structural components.

      Major comments:

      1. The experiments and results are very well described and most of the conclusions are supported by the data. One aspect needs clarification though. It is not clear to this reviewer how the authors envision the regulation and mechanism by which Ana1 functions in centriole stability. The data suggest that it can stabilize centrioles independently of PCM (Fig. 3B, 5B), yet the authors claim in the results and discussion that it functions downstream of PCM. As presented, this does not make sense. I would argue the opposite: it may function upstream of or in parallel to the PCM. Related to the above, the last sentence of the intro states: "Finally, we found that both Polo and the PCM require ANA1 to promote centriole structural integrity." This is shown for Polo, but where is the data showing that PCM requires ANA1 for promoting centriole stability?
      2. I have a concern regarding the number n used for statistics in the quantifications. In many cases it seems that the number n of cells etc. was used (e.g. n>100 cells) rather than the number of experiments (e.g. n=3). The statistics should measure variability between experimental repetitions, not between cells etc. If statistics were indeed not done on experiments and would have to be changed, some of the observed effects may not be statistically significant and would require additional experimental replicates, which would increase the time needed for revision.
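
      The distinction drawn in comment 2 can be illustrated with a minimal sketch. The numbers below are invented for illustration and are not taken from the manuscript; the function names `replicate_means` and `welch_t` are hypothetical. The sketch assumes each condition was repeated in three independent experiments, collapses the per-cell counts to one mean per experiment, and computes a Welch t statistic on those means, so that n is the number of experiments rather than the number of cells.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(cells_by_experiment):
    """Collapse per-cell measurements to one mean per independent experiment."""
    return [mean(cells) for cells in cells_by_experiment]


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = stdev(a) ** 2 / len(a)  # squared standard error of sample a
    vb = stdev(b) ** 2 / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical centriole counts per cell, grouped by experiment (3 repeats each).
control = [[2, 2, 1, 2], [2, 1, 2, 2], [2, 2, 2, 2]]
depleted = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 1]]

control_means = replicate_means(control)    # n = 3 experiments, not 12 cells
depleted_means = replicate_means(depleted)  # n = 3 experiments, not 12 cells
t_stat, dof = welch_t(control_means, depleted_means)
print(f"t = {t_stat:.2f}, df = {dof:.1f}, n per group = {len(control_means)}")
```

      Pooling all cells instead would inflate n and shrink the standard error, which is the concern raised above; a p-value would then be read from a t distribution with the degrees of freedom computed here.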

      Minor comments:

      1. I would advise the authors to improve the presentation of the figures. In particular, the labels are in many cases very small and difficult to read. Readability is also reduced by the use of bold font in the labels and a mix of various font sizes within single figure panels.
      2. The Results section could be shortened and made more readable by moving several paragraphs to the intro or discussion.
      3. The introduction is quite long and some parts read more like an introduction of a review on the topic.

      Significance

      This is a nice, focused study on the requirements underlying centriole stability and maintenance. The first part identifies the cartwheel, the centriole wall, and the PCM as important for centriole maintenance. The remaining parts identify and focus on the essential role of ANA1 in this process. This is an important finding, since the mechanisms underlying centriole stability and maintenance are poorly understood, yet highly relevant. Some cell types inactivate and/or disassemble centrioles during differentiation and this is likely important to their function. Providing more mechanistic insight, for example, regarding the relationship between ANA1 and PCM recruitment or the regulation of ANA1's centriole function by Polo, would have further strengthened the study. The audience interested in this work will be cell and developmental biologists. My expertise is in centrosome biology and microtubule organization.

      Referees cross-commenting

      I agree with the additional points raised by the other reviewers. I still think that overall the paper is fine and most things could be addressed in a reasonable time frame. The work does not provide much mechanism though. In this regard, the confusing placement of ANA1 downstream of PCM would be the only mechanistic aspect, and it seems the authors got it wrong, at least based on the provided data. Here, additional experiments could elucidate these relationships further, but if this is not the goal, text changes could also address this and it would remain a smaller, more focused study.

    1. Peer review report

      Reviewer: Yulia Karmanova

      Institution: Research Centre Kairos

      email: yulia.karmanova@gmail.com


      General assessment

      In my honest opinion, the topic of intercultural competence (ICC) should be of great interest not only to researchers involved in linguistics and pedagogics but to a general reader as well. By developing ICC, which represents a set of skills needed when encountering people from various backgrounds, one can learn valuable communication skills and flexibility in behaviour, and become more aware of one’s own lack of tact and tolerance.

      The manuscript is well written in an engaging and lively style, and it provides excellent context about linguistic cues of ICC that will help educators steer and stimulate the ICC development of their students.

      The manuscript cites relevant and sufficient literature that provides a very useful resource for current practitioners.

      I do not identify fundamental flaws in the manuscript, there is nothing illogical or irrational, although I have a few suggestions for minor improvements. Please see my comments below for further details.


      Essential revisions that are required to verify the manuscript

      No essential revisions. The manuscript clearly describes the research methods of data collection and analysis as well as other meaningful parameters. Section number 3 (Research Method and Results) is recipe-like, the study can be reproduced.

      The data collected for the research is impressive: 1,635 blogs (on average 400 words each) written by 672 students majoring in Hotel Management.

      The data and analysis provided in the manuscript are not deprived of clarity and logic. No additional experiments are needed to validate the results presented in the manuscript.

      Discussion and conclusion section aligns with objectives stated in the first section.

      The authors of the manuscript made a valuable contribution by identifying linguistic markers for ICC in the language use of students blogging about intercultural experiences: I-perspective lexemes, insight verbs and quantifiers. These language cues make ICC more «tangible» and as a result provide teachers with concrete tools for giving students more targeted ICC assessments in their reflective writing tasks. By giving certain linguistic prompts to students, educators may form a more thoughtful and personalised approach in describing their intercultural experience.


      Other suggestions to improve the manuscript

      The content of the manuscript is scientifically sound but has minor shortcomings that could be improved by further revisions.

      I do agree with the limitations of the research mentioned by the authors, especially with the lack of the explanatory value of a significant difference in frequency of use of the linguistic markers which I think can be resolved in future studies of this topic.

      I suggest that the authors involve more assessors in their future research. Two lecturer-researchers and three senior students were involved in the process, which I assume is not enough for research on such a large scale. A bigger team of professional assessors could make a valuable contribution when analysing the data and resolving emerging research questions.

      I would also recommend providing the manuscript with brief comments on the meanings of the parameters in column 4 (Table 3, 4, 5, 6) for readers’ clarity. What do t, p and n.s. stand for?

      I believe that the manuscript would benefit from correcting minor inaccuracies. I would recommend the following:

      • replace «his» with gender-neutral «their», page 6: In these blogs, the language use of students serves as a vehicle of information on the students’ development of ICC, offering the reader concrete cues – henceforth referred to as linguistic markers – of his reflective learning process.

      • add a space between that and are, page 19: In order to bring more focus to our research, we initially focused on word categories thatare characteristic of properties that can be linked to ICC and cultural sensitivity, such as openness, self- relativity, curiosity and reflection or analytical thinking.

      • add missing parentheses, page 22: Deardorff, D. 2006. Identification and Assessment of Intercultural Competence as a Student Outcome of Internationalisation. Journal of Studies in International Education, 10 (3), 241-266.

      All in all, I find the topic of the manuscript fascinating and the research question relevant and essential to the field.


      Decision

      Verified manuscript: The content is scientifically sound, only minor amendments (if any) are suggested.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript by O'Herron et al. describes an all-optical method combining optogenetic stimulation and 2-photon microscopy imaging to simultaneously manipulate and monitor brain microvasculature contractility in three dimensions. The method itself, which represents a microvasculature-targeted variation on a theme previously elaborated for simultaneous stimulation and monitoring of ensembles of neurons, employs a spatial light modulator (SLM) to create three-dimensional activation patterns in the brains of cranial window-model transgenic mice expressing the excitatory opsin, ReaChR, in mural cells (smooth muscle cells and pericytes) under control of the PDGFRβ promoter. The authors demonstrated that, by splitting a single 1040-nm stimulating beam into multiple beamlets using an SLM, this system is capable of optogenetically activating ReaChR at discrete depths in the neocortex, depolarizing mural cells and producing highly localized constrictions in targeted, individual microvessels. Using this system to investigate the kinetics of optogenetic-induced contraction and sensory-evoked dilation, the authors found that the onset of optogenetically evoked contraction was much more rapid than that of sensory-evoked dilation, concluding that the observed lag between sensory stimulation and vascular response does not reflect intrinsic limitations of mural cell contractile mechanisms but is instead attributable to the time course of neurovascular coupling mechanisms. They further found that by titrating the stimulation duration they could completely negate the vasodilatory response to a concurrent sensory stimulus.

      1) The red-shifted opsin, ReaChR, represents an improvement over opsins used in previously described 3D neuronal activation/monitoring systems. In particular, brief single-photon stimulation (100 ms) of ReaChR led to rapid, robust arteriole constrictions throughout the activation volume, whereas a previous generation ChR2 opsin required stimulation for seconds to achieve slowly appearing constrictions.

      Thank you for pointing out this key takeaway from our manuscript. In Figure 9 of the revised manuscript, we provide a comparison of ReaChR-induced vasoconstriction, with data previously collected across microvascular zones using line-scanning in ChR2-expressing mice. These data show how ReaChR produces faster and more potent vasoconstriction in alpha-SMA expressing SMCs and ensheathing pericytes, but has similar effects on the slow contraction with capillary pericytes.

      2) Single-photon stimulation was capable of completely stopping blood flow in a "first order pre-capillary branch". (Not clear what is meant by the phrase "pre-capillary branch"; anatomically, penetrating arterioles feed capillary branches.) While this speaks to the effectiveness of the method, it also highlights potential supraphysiological effects of stimulation and the importance of titrating stimulus intensity/duration to achieve physiologically meaningful responses.

      We have removed the term “pre-capillary” to avoid causing confusion, and now use the term arteriole-capillary transition to denote the alpha-SMA positive segment that lies between the penetrating arteriole (0th order) and the alpha-SMA low/negative capillaries (>4th order). The rationale for this terminology is provided in our new review (PMID: 34672718), which explains why the transitional zone should be considered a separate vessel type that is not arteriole and not capillary.

      We agree with the reviewer that titration of stimulation power/duration will be important and will depend on the application. We addressed this point by performing measurements of arteriole diameter with graded laser powers (Figures 5 & 7). There are many parameters to explore, but for the purposes of this manuscript, we clarify that the effect is titratable and that users should define physiological ranges in their specific circumstances, which may differ based on the experimental goals, age of mice, arteriolar size and vascular zone, and other factors.

      We also note that some applications may want to mimic pathophysiological levels of constriction, for example to mimic the effects of arterial vasospasm after subarachnoid hemorrhage, or ensheathing pericyte contraction with MCAo stroke (PMID: 26119027), or to examine the neural consequences of transient small vessel occlusion.

      3) In assessing effects of laser power, the authors assert that "increasing the laser power only slightly expanded the range of constriction". This seems a bit of an overstatement, given that increasing power (30-fold) had a greater effect on the spread (3x) than the magnitude (2x) of the response.

      Thank you for pointing this out. We have re-worded this section to avoid the overstatement and to emphasize the results more clearly on the spatial spread of constriction relative to laser power.

      The difference images in Figures 4B-C, G-H demonstrated that there was very limited spread of the constriction beyond the stimulation spots. We tested the effect of laser power on the spatial spread of constriction by stimulating with a broad range of power levels. We found that increasing the laser power led to a small increase in the spread of constriction. For example, a 30-fold increase in power (from 5 mW to 150 mW total power) led to ~3-fold increase in the spread of constriction (from ~25 µm to ~75 µm) (Figure 5A-H).
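
      The fold changes quoted above can be checked with a short calculation. The sketch below uses only the four numbers given in the response (5 mW, 150 mW, ~25 µm, ~75 µm). The scaling exponent it prints assumes, purely for illustration, a power-law relationship between spread and power; that assumption and the helper name `fold_change` are not taken from the manuscript.

```python
from math import log


def fold_change(low, high):
    """Ratio of the higher value to the lower value."""
    return high / low


power_fold = fold_change(5.0, 150.0)   # total stimulation power, mW
spread_fold = fold_change(25.0, 75.0)  # approximate spread of constriction, µm

# Illustrative assumption only: if spread scaled as power**k, then
# k = log(spread fold change) / log(power fold change).
exponent = log(spread_fold) / log(power_fold)

print(f"power: {power_fold:.0f}-fold, spread: {spread_fold:.0f}-fold, k = {exponent:.2f}")
```

      Under that assumption the exponent is well below 1, which is one way of expressing that the spread grows much more slowly than the power.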

      4) The suggestion that penetrating brain arterioles possess a mechanism for upstream conduction of constrictive responses is intriguing (although this intrigue is tempered by the lack of experimental support for the operation of such a mechanism in the brain microvasculature).

      We are also intrigued by this hypothesis, which was supported by some evidence from a recent study of retinal vasculature. Using neurocytin tracer injections into capillary pericytes, Kovacs-Oller et al. showed that pericytes are linked through gap junctions and that there is upstream directional diffusion of tracer. Further, they showed that electrical stimulation of a pericyte could lead to directional constriction from capillaries back to the arteriole in the retina (PMID: 32566247). The planar orientation of retinal vasculature makes this phenomenon easier to see. However, the 3D architecture of cortical vasculature is more challenging to study, particularly since the propagation along arterioles occurs along the Z axis, where spatiotemporal resolution of imaging is limited.

      Given our new data on the effects of laser power on axial spread (see reply to points 10-13 below) and the difficulty in separating active propagation from out-of-focus activation, we think there is not sufficient evidence to claim that penetrating arterioles are propagating the signal through some active process. Further experiments, including studies of the mechanisms involved, will be needed to address this hypothesis. Therefore, we have removed any discussion of potential propagation of the signal, and instead focus on the relationship between laser power and axial resolution of activation.

      5) The authors' premise for comparing contractile kinetics with sensory-evoked kinetics is flawed. In attempting to use the kinetics of optogenetic-induced constriction to infer something about the kinetics of sensory-evoked dilation, they are implicitly assuming that the kinetics of contraction and dilation processes intrinsic to mural cells are the same. This is highlighted by their use of the phrase "kinetics of the vasculature", which elides the possibility that dilation and contraction kinetics intrinsic to mural cells are different. Support for this latter possibility is provided by a previous report on renal afferent arterioles showing that the kinetics of myogenic constriction in arterioles are "substantially faster" than those of dilation (PMID: 24173354). Thus, their data do not rule out the possibility that the delay between sensory stimulation and vascular response reflects a slower intrinsic dilatory response rather than the time course of neurovascular coupling mechanisms. Furthermore, arterioles have an internal elastic lamina (IEL), which also determines the rates and degree of constriction and dilation. The IEL ends with the arterioles, and vessels with ensheathing contractile pericytes (and downstream) lack the constraints of the IEL.

      We thank the reviewer for this constructive critique. We agree that there are many issues in comparing kinetics between sensory evoked dilation and our optogenetic constriction. We have re-worded this section to avoid any mechanistic implications in the discussion of the kinetics of the different processes. However, we wish to still incorporate the details about the rapid kinetics of constriction to highlight the utility of the approach to intervene/perturb sensory-evoked responses, given that contraction can be titrated and precisely timed. We discuss the utility of this approach further below.

      6) It's not at all clear how overriding sensory-evoked dilation with optogenetically generated constriction provides a means for distinguishing neural activity from vascular responses. In particular, it is not clear how performing this maneuver while monitoring neuronal activity can provide the suggested insight into "aspects" of functional hyperemia that are essential to neuronal function beyond the relatively trivial observation that there is a point at which blood flow is too low to support continued neuronal activity.

      Thank you for raising this point. We have added more detail to our thoughts on why over-riding functional hyperemia could provide insight into the dependence of neural activity on the blood flow increase. Neural circuits are extremely complex with many different sub-types of neurons playing different roles. These subtypes have been shown to have different metabolic sensitivities and thus, may be differentially affected by blocking functional hyperemia (PMID: 26284893). This could lead to altered circuit activity which could have profound consequences for neural processing. Additionally, the energy budgets of different cellular functions within neurons are quite different (PMID: 22434069) and reducing available energy by blocking functional hyperemia could lead to differing degrees of dysfunction across important cellular processes (e.g. re-establishing the membrane potential, recycling neurotransmitters) which could again have important consequences for neural coding. Furthermore, it has been shown that there is a steep gradient of oxygen moving away from penetrating arterioles, and so neurons at greater distances from vessels may be differentially affected by blocking the hyperemic response (PMID: 21940458).

      7) With the exception of vasculo-neural coupling, where it would be the method of choice, the technology described leaves the impression of a capability in search of an application. That said, the ability to control blood flow to the point of completely stopping it may ultimately have applications in pathological settings.

      In addition to our response above on the utility of over-riding arteriole dilation during functional hyperemia, we have added to the discussion more potential uses of the technique. These include: (1) To be able to manipulate blood flow without using pharmacology or having to induce neural activity could be useful for a variety of studies involving intrinsic reactivity and compliance of vessels in both health and disease. (2) The different microvascular zones have distinct contractile kinetics. There are details that remain unstudied, such as the kinetics of different sized vessels, their location in the network, their identity as collateral arterioles or pial arterioles. Vascular optogenetics can dissect the contractile characteristics of different vessel types, similar to probing a circuit board. (3) Studies of the physiological significance of vasomotion, with respect to brain clearance of metabolic waste products. Being able to directly drive vasomotion and alter its amplitude and frequency will be an important tool for studies in this field. (4) Functional hyperemia is also impaired in many diseases, but this dysfunction could arise from impaired activity of neurons, astrocytes, or vessels. Therefore, a method to disentangle specific changes to blood vessels in vivo could be useful for understanding the vascular contributions to such diseases.

      Reviewer #2 (Public Review):

      The manuscript by O'Herron et al. describes a new technique for all-optical interrogation of the vasculature in vivo. They expressed optogenetic actuator ReaChR in vascular smooth muscle. They activated ReaChR using single-photon or 2-photon absorption. In both cases, they observed rapid and reversible constriction (presumably, due to Ca increase). Single-photon activation produced widespread constriction; two-photon activation allowed targeting of individual vessels. Using a commercial 2-photon system with a spatial light modulator on the photoactivation 1040-nm beam, they demonstrated localized constriction at multiple points along the small and large cerebral arterioles, targeted simultaneously by individual beamlets. Overall, this is a very interesting paper that clearly lays out the methodology and experimental design and carefully considers a number of potential limitations and pitfalls. This paper will serve as a valuable resource for a large community of eLife readers interested in cerebrovascular physiology in health and disease as well as in neurovascular coupling and interpretation of noninvasive imaging.

      Given the chronic nature of the optical window, it is not clear why imaging was done under anesthesia. This point requires explanation. There is a concern that targeting of the vessel wall is not possible in awake animals due to brain motion. If so, that would be a serious limitation of the methodology.

      To ensure that our method is compatible with awake experiments, we have added awake data to the manuscript (Figure 10). We show that individual vessels can be independently targeted in the awake animal and the outcomes are not profoundly different than in the anesthetized state. As with all awake experiments, due diligence must be taken to ensure the preparation is as stable as possible, and the occasional trial may have to be removed if motion artifacts are too large.

      Reviewer #3 (Public Review):

      Strengths: In the vascular field, previous implementations of optogenetics to constrict and dilate blood vessels have used either single-photon full-field and fiber illumination, or alternatively confocal and 2-photon raster scanning of individual vascular segments. The former is limited in spatial precision, activating multiple vessels over a large area, whereas raster scanning is not ideal for accumulating currents and often results in slow temporal precision. Spatial light modulator (SLM) generated diffraction patterns to achieve patterned illumination have become increasingly used in neuroscience to achieve reliable 2-photon activation of targeted neuron populations. Here the authors use this technology to depolarize and constrict smooth muscle cells in vivo. By imaging and stimulating with 2 laser lines and different optical paths, they are able to stimulate opsin-expressing cells and image simultaneously, which is advantageous. By using the red-shifted opsin ReaChR for their experiments, it is possible to combine this approach (cautiously) with imaging many of the classically used 2-photon fluorophores and genetic indicators, with excitation spectra <1040 nm. Future work using variations of the technique is likely to gain valuable insight into neurovascular biology.

      Weaknesses: A major limitation of the current study is that although the authors achieve high spatial precision of ReaChR activation in the xy plane, the axial precision appears extremely poor compared to what would have been expected. For example, in Fig. 5-1 (using a 0.8NA, 16x objective), the authors achieve equivalent levels of surface arteriole constriction even when the SLM is focused 200 µm above the brain, and even larger constrictions as they initially move the focus away from the imaging plane. Although the axial spatial resolution appears better with the 1.1NA - 25X objective, such a large point spread function largely limits the utility of the technique, as there will always be a concern as to whether the effects are spatially specific and not due to activation of vascular cells above and/or below the site of interest. This experiment that the authors have presented on axial precision is extremely important as it outlines a very important limitation of the technique (which is likely power dependent), but it remains to be completely characterized and understood. One possibility is that the power levels used by the authors are already above saturation, a problem raised by Rickgauer and Tank (2009; PMID: 19706471), and therefore they may be able to refine the axial precision by using lower power. Further controls would be valuable to understand the precise cause of this large axial spread, as it doesn't quite add up with the diameter of the bleach spot shown in Figure 5-1D (some suggestions outlined in recommendations to the authors).

      We agree with the reviewers on this point. We conducted several new experiments to help elucidate the limits of axial resolution. First, we have dropped the comparison between objectives with different NAs. This comparison led to unnecessary confusion, and it is common knowledge that lower NA objectives will have poorer resolution in the axial plane. We now mention this as a factor to consider, but have removed it from the figures. Second, we have shown, as the reviewer suggests below, that the stimulation power used has a dramatic effect on the axial spread of constriction (Figure 6E and Figure 7). Low powers indeed show a narrower axial spread. However, we typically use higher powers (near or above 100 mW) to generate large constrictions in penetrating arteries, and we also include these levels to show the greater axial spread they cause. In summary, we confirm with lower powers the 3D precision of the two-photon optogenetic technique, and we show that higher powers can be used to broadly constrict penetrating arterioles for studies seeking to modulate blood flow in columns of cortical tissue supplied by penetrating arterioles.

      Regarding the stated inconsistency with the bleached spots, we think this mostly has to do with the difference between photo-bleaching fluorescent material (requiring lots of laser power) and photo-activating opsin channels (which can be done with much less power for very sensitive opsins). Additionally, the slide we bleached is optimally activated at ~800nm and so our 1040 nm stimulation required enormous power to burn the spot.

      The current version of the paper also lacks adequate quantification of the results as it is composed primarily of representative examples, which limits a proper assessment of reproducibility and variability of the effects.

      We agree that showing population averages will be more informative to the field. In the original submission, we showed mostly examples because the early data explored a large parameter space (size and number of spots, position on vessels, duration and intensity of stimulation; if a stimulation train, the duration, number, and inter-pulse interval of stimulation) rather than one fixed set of conditions. However, we have now collected new data where parameters were typically the same and included population average plots in the figures that previously had only individual examples (Figures 2G,I, 4I,M, 4-1C, 5I, 6E,F, 7, 11-2) as well as the new data (Figures 8, 9, 10).

    1. Author Response

      Reviewer #1 (Public Review):

      LaRue, Linder and colleagues present an automation (GLO-Bot) and analysis pipeline building on the previously developed GLO-Roots, which makes use of a constitutively expressed luciferase gene to image plant roots in thin soil containers (rhizotrons). After validation of the system using a set of 6 accessions, the authors then take advantage of the increased throughput to phenotype root system architecture (RSA) of 93 natural Arabidopsis accessions and perform genome-wide association to identify polymorphic genomic regions that are associated with specific RSA traits. I appreciate that the authors made all data available via zenodo.

      The authors succeeded in automating the GLO-Roots system. Overall, the GLO-Bot appears to be a nice platform to collect time-lapse images of root growth in soil-substrate using rhizotrons. The automation of the GLO-Roots system using the GLO-Bot is well described, although not in sufficient detail to be rebuilt by interested researchers, e.g. the software controlling the robot is not described or made available, precluding wide adoption of the method. The image processing pipeline is clearly described in the methods and in Figure 2. The pipeline is open source and available for use and appears to work well overall, although in some cases the vector representation of the root system appears to be incomplete.

      We thank reviewer #1 for raising these concerns. We have now made the general code for the software available (GitHub: https://github.com/rhizolab/rhizo-server). In addition, we uploaded the rhizotron laser cutting files (Zenodo DOI: https://doi.org/10.5281/zenodo.6694558) that would facilitate rebuilding the robot.

      We understand the concerns about the vector representations of the root system.

      These root system structures visible on the GLO-Bot images are indeed disconnected in many locations, due to variability in the reporter’s intensity and obstruction of the light path by soil particles. For traits like root angle, the disconnected nature of the root system is much less impactful as this method naturally uses “segments” of the root as individual elements for angle measurements.

      The authors then present a quantitative analysis of RSA using a set of 93 accessions, with 6 replicates per accession, generating a large dataset on the diversity of RSA in Arabidopsis. Using average angle per day, the authors identify SNPs that are significantly associated with angle at 28 days after sowing, and they describe a correlation between this trait and the mean diurnal temperature range at the site where the accession was originally collected. The main weakness of the manuscript in its current form lies in some details of the quantitative genetic analysis. In my opinion the quantitative genetic analysis would benefit from additional quality control as there are peculiarities in the dataset that was used as the basis for GWAS.

      We understand the concerns from reviewer #1 about the quantitative genetic analysis. Ultimately, we performed the analyses in the way we explained in the paper with careful consideration. We have added descriptions of the rationale for choosing certain methods, which we hope clarify why we did the analyses in the way we did. We hope this paper serves as a resource for others to pursue additional studies on traits relevant to their research.

      Reviewer #2 (Public Review):

      Therese LaRue and colleagues have developed a second generation of the GLO-Roots system that had been developed in their lab and published in 2015. Importantly, the new system (GLO-Bot) and the analysis of the resulting images has now been largely automated and therefore provides a throughput allowing for genetic studies. In an impressive endeavor the authors have transformed more than 100 diverse accessions that had been selected using sensible criteria with the luciferase construct, which then allowed the RSA of these accessions to be measured using the GLO-Bot system. On a set of 6 diverse accessions, the authors carefully identify meaningful RSA traits that they then quantified in the accessions of a larger panel of almost 100 accessions. They also benchmarked the new imaging processing tools against gold-standard manual tools. Overall, they show that the data acquisition and analysis is reproducible and reasonably accurate. They then proceeded to conduct GWAS using the RSA traits and identified several significantly associated candidate SNPs. Finally, they correlated the RSA with environmental variables and found interesting correlations that are consistent with prior studies.

      Strengths:

      The manuscript presents interesting root phenotyping technology, a comprehensive atlas of RSA under rhizotron lab conditions in Arabidopsis, candidate genes potentially underlying RSA traits, and interesting associations of RSA and climate variables. This will be inspiring and useful to many other researchers and has the potential to be explored further in future studies.

      We thank the reviewer for the encouraging feedback.

      Weaknesses:

      Some aspects of the data analyses are not well described and should be described more. The trait data is heavily processed to "breeding values" and it is a bit unclear when unprocessed and processed trait data is used and why. Also, limitations and caveats are not discussed sufficiently. For instance, presenting and discussing the issues and caveats of measuring RSA that was generated in thin and not very wide soil sheets using the GLO-Bot system when natural growth in soil is usually largely unconstrained. Moreover, the analysis of potential candidate genes from the GWAS is not very well developed. Finally, the trait data was not available with the manuscript and a major impact of a resource like this will come from the data being fully available to the community.

      We appreciate the broad comments on the manuscript and have tried to address them through the specific responses below. Overall we believe the approaches we used are effective but with specific caveats and have used the revision as a means of better communicating the limitations of the approaches chosen.

      Reviewer #3 (Public Review):

      The authors provide a thorough description of a method to transform plants to be bioluminescent upon application of the required substrate such that roots are visible on the windows of rhizoboxes. They have expanded on previous work by automating the imaging process with a robot that moves rhizoboxes to an imager where images are captured. They have improved the image analysis pipeline to be mostly automated with a user presumably needed to run various scripts in batch mode on directories of images. One novel aspect of the image analysis pipeline is in using image subtraction to subtract the root system at the previous time point from the current one in order to identify new growth.

      We thank the reviewer for highlighting the strengths of the manuscript.

      Overall, I think the authors provide a great amount of detail where needed and in the methods, but reproducibility would benefit from more information about the actual root traits measured. For example, one concern would be if root length is computed only by summing pixels, without considering that diagonal pixel steps have a length of the square root of two, sqrt(2).

      This is a valid concern. Rather than just summing the pixels, the length of the segments is actually calculated using the “Feret Diameter” (or caliper length) function in ImageJ, which does take diagonals into consideration.
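
      To illustrate the reviewer's point, the following minimal sketch compares a naive pixel count with a caliper-style length for two hypothetical five-pixel segments. The function names and test segments are assumptions made for illustration, and `caliper_length` is a simplified stand-in that measures between pixel centers; it is not ImageJ's implementation, which works on the selection outline and may give slightly different values.

```python
import math
from itertools import combinations


def pixel_count_length(pixels):
    """Naive length: one unit per pixel, regardless of direction."""
    return len(pixels)


def caliper_length(pixels):
    """Largest Euclidean distance between any two pixel centers.

    Simplified stand-in for a Feret (caliper) diameter; an assumption for
    illustration, not the ImageJ algorithm.
    """
    if len(pixels) < 2:
        return 0.0
    return max(math.dist(p, q) for p, q in combinations(pixels, 2))


# Hypothetical 5-pixel root segments given as (x, y) pixel coordinates.
horizontal = [(i, 0) for i in range(5)]
diagonal = [(i, i) for i in range(5)]

if __name__ == "__main__":
    for name, segment in (("horizontal", horizontal), ("diagonal", diagonal)):
        print(name, pixel_count_length(segment), round(caliper_length(segment), 3))
```

      Both segments contain five pixels, yet the end-to-end distance of the diagonal one is sqrt(2) times that of the horizontal one, which is the discrepancy a pure pixel count would miss.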

      While the methodological aspects of the paper are compelling, the authors have furthered the significance through a biological application for genetic analysis among accessions of Arabidopsis and correlating root traits to climatic 'envirotypes' or data from the origin site of the respective accession. This genetic analysis would be furthered by greater consideration of time series analysis and multi-trait analysis, which is possible in GEMMA. The authors could consider genetic analysis of the PCA traits as well. Given the novelty of this type of time-series, multi-trait data - the authors can reach further here.

      Absolutely, PCA approaches to disentangle the phenotype space would be highly interesting to further investigate, which we started in Supplemental Figure 8. This figure decomposes all the data points including replicates and temporal values of the same replicate. The PC1 therefore mostly captures how plants change over time, while PC2 seems to capture the main trade-off of wide/horizontal vs deep/vertical root architectures that we describe throughout the text. We could make use of this PC space to quantify the average value per genotype in PC2 and utilize this value for GWA, although it is not obvious how replicated and temporal measurements behave in PCA and what the consequences would be when computing a genotype value. There is definitely interesting work that we aim to pursue in this direction in the future.

      Regarding the additional capabilities of GEMMA, we are not aware of a subtool that is able to analyze time series directly in GEMMA, but we will look into it. The multi-trait analysis in GEMMA is also interesting. We have utilized the multi-trait feature in the past, but this is limited to very few traits. We have 8 time points, thus 8 traits. For reference, when we have run multi-trait LMM with 2 traits, we have typically seen runtimes of ~9 days in large clusters. New tools continue to emerge in the field of quantitative genetics, such as the use of summary statistics of multiple GWAs to gain new insights, which we will pursue in the future. We have added possible future directions to the discussion section (page 14).

      As far as the general structure of the manuscript, I struggled with the results mixing in the methods such that I was never sure if the lack of detail in methods there would be addressed later, along with the mixture of discussions. Perhaps these are personal choices, but the methods were also after supplemental. I simply ask the authors to consider the reader here by being honest with my own experience reading this manuscript.

      We appreciate this comment of reviewer #3. Since this is a “Tools and Resources” article, we believe that a substantial part of the results section should include the methods that were applied. The methodology mentioned in the results section should always help the reader to understand the illustrated results in the figures. If readers would like to apply certain methods, however, more details can be found in the materials and methods section. We apologize if this was not always successful and led to confusion. In the final formatted version, all supplemental figures would be linked to the main figures so that the materials and methods section would follow the discussion.

      Overall, I believe this manuscript advanced root phenotyping by providing relatively high-throughput (imaging is slow due to the long exposure times) data and doing the time-series, multi-trait genetic mapping. The authors mention imaging shoots but no data is presented - presumably, it would be interesting to tie that in but there may be reasons not to. The authors could also discuss more the advantages of this approach relative to color imaging that has also advanced significantly since the original GLO-Root paper was released. Last, I am not sure the description of the 6 accessions study adds much value to the paper, and probably many other preliminary studies were done to prototype. Overall, this is fantastic and substantial work presented in a compelling way.

      Unfortunately, the shoot images that were taken did not have sufficient quality for further analysis and, due to technical problems, the set of shoot images is not complete. We removed the part on shoot imaging from the text. It now reads: “Inside the imaging system, the rhizotrons were rotated using a Lambda 10-3 Optical Filter Changer (Sutter Instrument®, Novato, CA). If it was the first imaging day or a designated luciferin day (every six days), GLO-Bot added 50 mL of 300 μM D-luciferin (Biosynth International Inc., Itasca, IL) to the top of each rhizotron immediately before loading the rhizotron into the imager.”

      The advantage of the GLO-Roots method over color imaging is clearly that it can capture a more complete image of root systems with finer roots (like Arabidopsis). We have added the possibility of using RGB imaging for bigger root systems to the discussion section (page 13).

    1. Is maintenance a privilege?

      I think in many ways it has become a privilege. In an age when practical skills and ability to repair are relatively rare, and when it is often cheaper (in money and time) to buy new, I think maintenance is a privilege. Can we share it, teach people to fish so to speak? Perhaps knowing how to maintain simply isn't enough; slim margins of personal time may not be best spent maintaining things (as opposed to maintaining oneself).

    1. Reviewer #1 (Public Review):

      The key question that Huang et al. are addressing is which approach, paratransgenesis, transgenesis, or the combination of both, is the most promising to combat malaria, killing parasites without affecting the mosquito host. They explored this question by generating a transgenic mosquito line secreting two effector molecules in the midgut and salivary glands, and infecting mosquitoes with Serratia bacteria expressing effector molecules. Their major finding is that a combination of both strategies has the highest inhibition of parasite development compared to transgenesis or paratransgenesis alone. This is further confirmed by mouse infections with a rodent malaria model showing that a combination of both strategies inhibits transmission to naïve mice.

      This study is comprehensive and provides significant information on the possible use of these approaches for malaria control. The effects on parasite development are clear and convincingly confirm that these strategies have the potential for reducing malaria transmission. It cannot be ruled out, however, that the more pronounced effects on parasite development of the combined approaches may be due to differences in the fitness of these mosquitoes rather than a true additive or synergistic action between transgenesis and paratransgenesis. Another limitation is that the authors do not show when parasites are killed and do not provide direct evidence of the role of the bacterial-expressed factors in the killing mechanism.

      The authors show very convincingly that transgenic mosquitoes (all possible combinations) have comparable fitness to wild types. However, these fitness studies are lacking in Serratia-infected mosquitoes, and in the transgenic-paratransgenic combination. Are those mosquitoes as fit as WT? Fitness costs could negatively affect parasite development indirectly, rendering the comparison between the treatments impossible (and negatively impacting this possible strategy). These are key controls that need to be added to the manuscript in order to support the finding that the combination is the best approach.

      It is surprising that the Sg/E line inhibits oocyst development given it uses a salivary gland promoter. The authors hypothesize that this is most likely explained by mosquitoes ingesting saliva with the blood meal. This hypothesis is interesting but needs to be tested by determining the presence of Scorpine and MP2 protein in the blood bolus. Also, at what stage are parasites killed?

      While the authors test the expression levels of Scorpine and MP2 by qRT-PCR and western blot in transgenic mosquitoes, they did not test levels in paratransgenic ones. In which tissues are these factors produced in Serratia-infected mosquitoes? Are Scorpine and MP2 produced in the midguts and/or salivary glands? And at what level? A quantitative comparison of scorpine and MP2 protein levels in transgenic and paratransgenic mosquitoes is important to determine whether levels are correlated to the effects on parasite development.

      Related to this, the engineered Serratia bacteria appear to express 5 effector molecules rather than just MP2 and Scorpine. This obviously can affect the results and also makes a direct comparison less meaningful, but we couldn't find any information on the other effectors, or on whether they are expressed and potentially responsible for the observed anti-parasitic activity.

      More information about the experimental setup is needed. The authors used a piggybac approach that has led to multiple insertions in some of the mosquito lines. Which lines did they use for the experiments? This is not clear in the manuscript. If multiple insertions were used, this should be stated and the feasibility of maintaining them (and efficacy) over different generations should be discussed.

      Oocyst and sporozoite data are not normally distributed, and therefore presenting the median instead of the mean is more informative. Furthermore, the statistical analyses done do not appear to be appropriate for this data. The authors need to either FDR-correct for multiple comparisons or do a Kruskal-Wallis test with post hoc testing. It would also be important to do statistical analyses on the prevalence.
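
      One of the two options the reviewer names, FDR correction for multiple comparisons, can be sketched as follows. This is a minimal Benjamini-Hochberg adjustment written for illustration; the function name and the p-values are hypothetical and are not taken from the manuscript, and the sketch does not cover the alternative Kruskal-Wallis test with post hoc testing.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons.
raw_p = [0.01, 0.04, 0.03, 0.005]

if __name__ == "__main__":
    for raw, adj in zip(raw_p, benjamini_hochberg(raw_p)):
        print(f"raw={raw:.3f} adjusted={adj:.3f}")
```

      A comparison is then called significant when its adjusted value falls below the chosen false discovery rate.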

      When discussing the ethical consequences of this approach, the authors should also discuss the possible effects of QF2, scorpine, and MP2 secretions in humans upon a blood feed.

      The authors show Serratia vertical transmission over three generations, but as the CFUs decrease over multiple generations, they should discuss whether low levels of Serratia can still block parasite development. In general, the manuscript lacks a thorough discussion of the limitations of this study.

      The discussion around line 280 should be more nuanced. I don't think the word 'protected' can be used as mice were not immunized but were simply not infected.

    1. Reviewer #1 (Public Review):

      The authors look at a few different nematode species to compare the dynamics of anaphase. They find that in some species the spindle oscillates transversely in anaphase, and in other species it does not. They ask what accounts for this different behavior. To address this question, they use ablation of the central spindle, and conclude from the result, correctly, that after the ablation the centrosomes are pulled to the opposite poles of the cell in all species. However, the magnitude, half-time and initial velocity of the recoil differ.

      To understand what accounts for the quantitative difference, the authors

      1) use a simple viscoelastic model of a constant force, F, pulling against a spring (with constant stiffness k), while the object moves through the viscous medium.

      2) estimate the cytoplasmic viscosity from tracking yolk granules,

      3) estimate parameters F and k from fitting the exponential recoil curves. They find that the greatest correlation between having transverse oscillation or not is with lower or higher viscosity, not with magnitude of the force or stiffness of the spring.
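
      The model summarized in these three steps can be sketched as follows, under the assumption that it takes the standard form drag * dx/dt = F - k * x with x(0) = 0, which yields an exponential recoil. The drag coefficient and all function and parameter names are assumptions introduced for illustration; the manuscript's exact formulation is not given in this review.

```python
import math


def recoil_displacement(t, force, stiffness, drag):
    """Displacement x(t) solving drag * dx/dt = force - stiffness * x, x(0) = 0."""
    plateau = force / stiffness   # final displacement F / k
    tau = drag / stiffness        # relaxation time
    return plateau * (1.0 - math.exp(-t / tau))


def recoil_summary(force, stiffness, drag):
    """Magnitude, initial velocity and half-time of the exponential recoil."""
    return {
        "plateau": force / stiffness,
        "initial_velocity": force / drag,
        "half_time": math.log(2.0) * drag / stiffness,
    }


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary units.
    print(recoil_summary(force=10.0, stiffness=2.0, drag=4.0))
```

      In this form the recoil curve fixes only two quantities, the plateau F/k and the time constant drag/k, so an independent estimate of the drag (for example from the cytoplasmic viscosity) is needed before F and k can be separated.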

      Two major problems with this study can be identified:

      1) Meaning and significance: It is not clear if the transverse oscillations have a functional significance. In fact, they are more likely than not simply a byproduct of complex nonlinear mechanics of the mitotic spindle. It is important to understand what we can learn about the spindle mechanics from these oscillations, but there may be no evolutionary significance here. If the authors were asking - how, in many different species, the spindle scales with the cell size in the same way (as was done in Farhadifar et al 2020, which the authors do not cite) despite large parameter variations - that would be a different story. But asking which parameter change is responsible for the behavior change is less meaningful.

      2) The study is not convincing, mainly because the model used for the fit is overly simplistic. The force is not constant, the spring stiffness is not constant, the mechanics is not, etc. There are a few different, very complex models, of the anaphase spindle with transverse oscillations - comparing to simulations of these models would be more convincing. Also, I am not quite sure whether the volume fraction of yolk is a useful parameter. Does not measuring MSD give us the diffusion coefficient and viscosity directly? I think using the factor depending on the volume fraction artificially inflates the viscosity differences. Lastly, I do not understand the theoretical argument based on comparison with Nedelec's model: in that model, increasing viscosity only slowed the oscillations down, not abolished them.
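
      The reviewer's question about obtaining viscosity directly from the MSD can be made concrete with a minimal sketch. It assumes free diffusion of a spherical granule tracked in two dimensions, so that MSD = 4 * D * t, and the Stokes-Einstein relation; it is not the authors' procedure, and the function names, granule radius, temperature and synthetic data are hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def diffusion_from_msd(lags, msd, dimensions=2):
    """Diffusion coefficient from a least-squares line through the origin.

    Assumes free diffusion, MSD = 2 * dimensions * D * t.
    """
    slope = sum(t * m for t, m in zip(lags, msd)) / sum(t * t for t in lags)
    return slope / (2 * dimensions)


def stokes_einstein_viscosity(diffusion, radius, temperature=293.15):
    """Viscosity in Pa*s for a sphere of the given radius (m), SI units throughout."""
    return BOLTZMANN * temperature / (6 * math.pi * diffusion * radius)


if __name__ == "__main__":
    # Synthetic, noise-free data for a hypothetical D of 1e-15 m^2/s.
    true_d = 1e-15
    lags = [1.0, 2.0, 3.0, 4.0, 5.0]          # seconds
    msd = [4 * true_d * t for t in lags]      # m^2
    d_est = diffusion_from_msd(lags, msd)
    print(d_est, stokes_einstein_viscosity(d_est, radius=0.5e-6))
```

      Whether these assumptions hold in a crowded cytoplasm is a separate question; the sketch only shows that no volume-fraction factor enters this particular route from MSD to viscosity.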

      In short, much more thorough investigation would be needed to understand which differences between the species account for the presence or absence of the oscillations, and one may question whether the answer would have a deep impact on our understanding of spindle mechanics.

    1. Reviewer #2 (Public Review):

      A summary of what the authors were trying to achieve:

      The authors have developed an approach to prediction of T cell receptor:peptide-MHC (TCR:pMHC) interactions that relies on 3D model building (with published tools) followed by feature extraction and machine learning. The goal is to use structural and energetic features extracted from 3D models to discriminate binding from non-binding TCR:pMHC pairs. They are not the first to make such an attempt (e.g., Lanzarotti, Marcotili, Nielsen, Mol. Imm. 2018), but they provide a detailed critical evaluation of the approach that sets the stage for future attempts. The hope is that structure-based approaches may have better power to generalize from limited training data and/or to model unseen pMHCs.

      An account of the major strengths and weaknesses of the methods and results:

      The authors first report (section 4.1) that their structural and energetic features contain information on binding mode, highlighting complexes with reversed binding polarity, for example, and partly discriminating MHC class I from MHC class II structures. This is encouraging but not terribly surprising. Also, with regard to MHC I vs II discrimination, it is not clear how the class II peptides are registered with respect to one another. This needs to be done by alignment on MHC and mapping of structurally-corresponding peptide positions, since the extent of N- and C-terminal peptide overhangs varies between structures and is largely irrelevant to the docking mode. Interactions between the TCR and MHC are ignored in the feature extraction process; it's possible that including these interactions could improve performance. The authors state: "To be noted that not all structures could be successfully modelled by TCRpMHC models, and so we could not submit them to the feature extraction pipeline." It's unclear what effect this could have on the results: if the modeling failures are cases of structures for which no good CDR templates could be identified, then perhaps this could bias the results.

      Section 4.2 reports a negative result: unsupervised learning applied to the extracted features is unable to discriminate binding from non-binding complexes. This suggests that there is not likely to be a simple energetic feature, such as overall binding energy, that reliably discriminates the true binders. In Section 4.3, the authors turn to supervised learning, in which training examples inform prediction by a classifier. One finding is that the pure-sequence approach using Atchley-factor encoding of the TCR:pMHC outperforms the structure-based approaches, though not by much. A combined model incorporating Atchley factors and structural features does slightly better. These results are a little hard to interpret because we don't know how challenging the 10-fold internal cross-validation is. It doesn't sound like there is any attempt to avoid testing on TCR:pMHCs that are nearly identical to TCR:pMHCs in the training sets, and the structural database is highly redundant, containing many slight variants of well-studied systems. It's also not clear how overlap between the template database used for 3D modeling and the testing set was handled; my guess is that since the model building is an external tool this was not controlled. Together, these factors may explain why the results on independent test sets are, for the most part, significantly worse than the cross-validation results. Another take-home message from the independent validation is that the sequence-only method seems to outperform the sequence+structure or structure-only methods. Although these are described as "out-of-sample validation", it's not clear how different these independent TCR:pMHC examples are from the structure dataset on which the model was trained.

      Sections 4.4 and 4.5 report that prediction accuracy varies significantly across epitopes, and this is in part determined by sequence similarity to the structural database (which provides templates for modeling and also constitutes the training set for the model). In section 4.6, the authors determine that the model does not appear to be able to predict binding affinity (as opposed to the binary decision, binding versus non-binding). Finally, in section 4.7 the authors benchmark the predictor against two publicly available, sequence-based predictors. When predicting for epitopes present in their training sets, all methods do reasonably well, with the edge going to the sequence-based ERGO method. When predicting for epitopes not present in their training sets, none of the methods perform very well. The authors state that "these results suggest that the structure-based models developed in this study perform as well as the state-of-the-art sequence-based models in predicting binding to novel pMHC, despite learning from a much smaller training set." This may be true, but the predictions themselves are not much better than random guessing (AUROCs around 0.5-0.6).

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      I'm doubtful that the proposed methods will form the basis of a practical prediction algorithm. In the absence of ability to generalize to unseen epitopes, simpler sequence-based approaches that leverage the ever-growing dataset of TCR:pMHC interactions seem preferable. I still think the study has value as a template and roadmap for future efforts, and a baseline for comparison. For me, a key unanswered question is whether the model-derived structural features are just a different, slightly noisier way of memorizing sequence, or actually contain orthogonal information that can enhance predictions. It might be possible to gain insight into this question by looking more carefully at the impact of model-building accuracy on performance (the authors use sequence similarity as a proxy, but this is confounded by overlap between the training set and the template set used for modeling). If model-building really adds something, it seems plausible that it does so by accurately capturing physical features of the true binding mode.

      A discussion of the likely impact of the work on the field, and the utility of the methods and data to the community:

      As stated above, I think the present work will have a positive impact on the field of TCR:pMHC prediction by critically evaluating the structure-based approach (and also by testing two previously published methods on independent data). I am less convinced of the utility of the specific methods than of the overall conceptual framework, evaluation procedures, and training/testing sets.

      Any additional context you think would help readers interpret or understand the significance of the work:

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): **Summary:** Techniques to probe the local environment of membrane proteins are sparse, although the influence of lipids on the membrane protein's function has been known for many years. Therefore, the paper by Umebayashi et al. is important. The environment-sensitive dye Nile red (NR) coupled to a membrane protein is an appropriate sensor for monitoring the local membrane fluidity. Linking of Nile red to the receptor via a flexible tether was achieved with the acyl carrier protein (ACP)-tag method. Experiments showed that depending on the ACP site a certain linker length is required to have NR inserted in the membrane and thus be an effective sensor for lipid disorder. This technology could be of general usability to study the environment of membrane proteins in the context of their function. As an example, the technique allowed insulin induced membrane disorder in the close insulin receptor vicinity to be observed. Further, results suggested that tyrosine activity is required for this disorder to happen. The experimental results appear to be complete and controls were made.

      **Major comments:** 1) Sometimes technical terms are used without explanation: What is the GP value? What is ACP-IR? The spectrum was measured in number of rois? The reader can work those abbreviations out, but it would be nice to have them defined.

      We have made a list of abbreviations.

      2) Fig. 1d) is confusing. The ACP-IR labelling is evident in 3 panels, but there is no difference in the color (emission spectra of 1992-ACP-IR vs 2031-ACP-IR should be visible??). The DAPI staining is very different. When doing the latter, how difficult is it to get the staining equal?

      The differences in spectra cannot be seen because we used pseudo colors for display of the DAPI and CoA-PEG-NR staining. The reviewer’s comment about the unequal DAPI staining is correct. The reason for this is most likely that the cell membrane is unequally permeabilized by PFA treatment. As the point of this figure is just to show that the plasma membrane is labeled, dependent upon the expression of the ACP-tagged insulin receptor, we don’t think that the variable intensities of the DAPI staining are important. DAPI is simply used to indicate the position of the cells.

      3) How can one interpret Fig. 4: a) Control goes over 4 frames, at 240" insulin is added, and 10 frames should show a fluctuation difference?

      We showed 4 frames after control treatment to demonstrate that the control treatment caused no significant change. We expected that clear changes would be evoked by insulin treatment in GP images; however, these changes, while visible in the GP images, are difficult to see for the untrained observer. This is the reason why we used the ZNCC method in the subsequent figures to better visualize the changes.

      b) A color shift from blue to green is visible after insulin addition. But it is faint - difficult to assess from the pseudo color scheme. What does 1000 pixel top/1000 pixel bottom mean in c). Is it an attempt to better visualize the fluctuation? It is difficult to recognize a difference before and after adding insulin. d) It seems that the kymograph set should show this. What is the color scale? Why is 3 so untypical, i.e., no change? Box 6 is also peculiar: the left side does not show a strong change upon insulin administration, the right side does. Why?

      We appreciate the helpful comments for improving our manuscript.

      As pointed out, the change of GP value is extremely small before and after insulin addition, so it is difficult to fully visualize the change with normal pseudo-color expression. To deal with this, we adopted the following two methods to visualize minute changes.

      1) Visualization of local changes of the statistical GP value shown by ZNCC throughout the time-lapse images (Fig. 6 and Fig. S2B).

      2) Visualization of the top/bottom 1000 pixels of the sorted ZNCC values in each image (Fig. 7 and Fig. S2C). The top 1000 pixels are the ones that showed the largest changes. The bottom 1000 pixels are the ones that showed the smallest changes.

      With these representations, we found that the level of the response to the insulin signal was spatially and temporally heterogeneous in the membrane.

      As for the color scale, in order to clarify the meaning of the color differences, we have added a description of the relationship between the color and the ZNCC value to the results section.

      4) How is the kymogram calculated? The legend says 'The horizontal dimension represents the averaged ZNCC inside the rectangular area, and the vertical dimension represents time'. The averaged ZNCC is a single value, so it is not clear why the kymogram shows a variation from left to right. May it be the ZNCC was averaged just vertically?

      We apologize that we did not provide information on how the kymograph was made.

      In the yellow rectangular area (Fig. 6B), the ZNCC values of the pixels with the same x coordinate were averaged vertically, and these averages are laid out along the horizontal direction of the kymograph. That is, one horizontal line of the kymograph holds the spatial distribution of the ZNCC value along the horizontal direction of the membrane, and the vertical direction shows its change over time. To make it easier to understand, we refined the description of the kymograph in the legend of Fig. 6.
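
      The construction described in this reply can be sketched as follows. The function names, the rectangle arguments and the tiny example frames are hypothetical; the sketch only mirrors the stated procedure of averaging over y for each x and stacking one row per time point.

```python
def kymograph_row(zncc_frame, top, bottom, left, right):
    """Average ZNCC over y for each x inside rows top..bottom-1, columns left..right-1."""
    height = bottom - top
    return [
        sum(zncc_frame[y][x] for y in range(top, bottom)) / height
        for x in range(left, right)
    ]


def kymograph(zncc_frames, top, bottom, left, right):
    """One averaged row per time point: rows are time, columns are x position."""
    return [kymograph_row(frame, top, bottom, left, right) for frame in zncc_frames]


# Two hypothetical 2x3 ZNCC frames (nested lists indexed as frame[y][x]).
frames = [
    [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
    [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]],
]

if __name__ == "__main__":
    for row in kymograph(frames, top=0, bottom=2, left=0, right=3):
        print(row)
```

      Each output row keeps the left-to-right variation within the rectangle, which is why the kymograph is not a single value per time point.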

      5) When calculating cross-correlation values on images, they need to be aligned. What fraction of the total image does the selected 19x19 box represent? As described, I imagine that a rolling CC over 19x19 pixels is calculated over an image from the time lapse series comparing it with the reference Iave(x,y). Compared to the 3x3 median filtered CP image, the ZNCC image should then be much more blurred??

      Below we provide more information regarding the calculation of ZNCC.

      Each local window for the ZNCC calculation is a 19x19-pixel area centered on a single pixel, and this is done for every pixel except those at the edges of the image. The ZNCC value calculated in that window is assigned to the center pixel of the window. A new window centered on the adjacent pixel is then set and a new ZNCC value is calculated. That is, the calculation window slides across the whole image. Because the calculated ZNCC value is assigned only to the center pixel of the window, not to all pixels in the window, there is no blurring effect like that of median filtering.

      The figure below shows a schematic view of our ZNCC calculation.

      Schematic view of our ZNCC calculation
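The sliding-window procedure described in this reply can be sketched as follows. This is a minimal, unoptimized illustration under stated assumptions, not the authors' implementation: the function name `zncc_map` is hypothetical, ZNCC is taken to be the standard zero-mean normalized cross-correlation, edge pixels without a full window are left as NaN, and windows with zero variance are also marked NaN.

```python
import numpy as np

def zncc_map(image, reference, window=19):
    """Sliding-window ZNCC between `image` and `reference` (same shape).

    The value computed in each window is written only to the window's
    centre pixel; edge pixels without a full window stay NaN.
    """
    if image.shape != reference.shape:
        raise ValueError("image and reference must have the same shape")
    half = window // 2
    rows, cols = image.shape
    out = np.full((rows, cols), np.nan)
    for y in range(half, rows - half):
        for x in range(half, cols - half):
            a = image[y - half:y + half + 1, x - half:x + half + 1].astype(float)
            b = reference[y - half:y + half + 1, x - half:x + half + 1].astype(float)
            a = a - a.mean()  # zero-mean each window
            b = b - b.mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum())
            # A flat window has zero variance, so ZNCC is undefined there.
            out[y, x] = (a * b).sum() / denom if denom > 0 else np.nan
    return out

# Hypothetical data: a random reference image compared with itself.
rng = np.random.default_rng(0)
ref = rng.random((25, 25))
same = zncc_map(ref, ref)
inverted = zncc_map(-ref, ref)
```

Because each output pixel receives exactly one window's value, the map has the same sampling as the input image, although neighboring values are computed from overlapping windows.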

      **Minor comment:** On page 16 supplementary is not spelled properly.

      corrected

      Reviewer #1 (Significance (Required)):

      The key point of this paper is convincing and the new technology appears to have a lot of potential. It can be applied to study membrane protein function in the context of its environment, the lipid bilayer.

      Membrane fluidity measurements have been developed (e.g., using fluorescent probes like laurdan). However, the trick to link a probe like nile red by ACP technology to the insulin receptor and to observe its activity is quite new.

      A most recent description of such a technology is in TrAC Trends in Analytical Chemistry Volume 133, December 2020, 116092.

      This is an interesting review, but not directly impacting on our work.

      **Referees cross-commenting**

      All comments are constructive and important. The paper is important but needs to be amended as proposed.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:** In this manuscript, authors generated an ACP-attached Nile Red probe in order to specifically label Insulin receptor in the membrane. Owing to this specificity, one can measure the lipid membrane properties around a specific protein in the membrane.

      **Major comments:**

      For the conclusions in the manuscript to be convincing, in my opinion, these additional data need to be added. Some of these are new experiments, and some are detailed analysis of existing data. The new experiments are not for new line of investigation, instead it is to confirm their statements and conclusions. The major point is the reliability of spectral shift. In usual environment sensitive probes, it is certain that they are in the membrane whatever is done to the membrane. However, when the probe is attached to a protein, it is not trivial to have the same confidence that the probe is always inside the membrane, and it is in the same plane of the membrane. 1992-ACP-IR is a good example; authors state that it binds to the protein outside the membrane, but when there is cholesterol addition and -maybe more interestingly- cholesterol removal, the dye still reacts and changes its emission (even PreCT changes its emission quite a bit at the 570 nm region). This is a clear indication of a change in localization of the probe upon some changes in the membrane. This implies that observed spectral shifts may not be due to lipid packing differences, but due to localization of the probes. For this reason, it is crucial to know where any environment sensitive probe localize in the membrane with respect to membrane normal, and this knowledge is more important for this probe. Related to this, the spectral difference upon insulin treatment and activation of insulin receptor could be due to changes in probe's localization in the membrane. Especially because authors show in Fig1e, the spectra can change depending on the probe localization. Relatedly, quantum yield of NR should be significantly different when it is inside vs outside membrane. Authors should show QY for 1992-ACP-NR and 2031-ACP-NR with different PEG lengths and upon insulin treatment.

      We understand the logic of the request to measure the QY, since the QY of Nile red is much higher in organic solvents than in aqueous solutions, so it might be predicted that the QY of Nile red is higher in a lipid bilayer than when covalently bound to the protein in an aqueous environment. However, this argument depends upon the mechanism for the increase in quantum yield when going from aqueous to a non-polar solution. One possible explanation is based on the intrinsic properties of the dye under the two conditions. The alternative explanation would be that the dye would aggregate (be insoluble) in aqueous solution and therefore either not fluoresce or self-quench. In this case, we believe that the latter is the explanation because we and others have previously shown the turn-on properties of the probe when binding to proteins (SNAP-tag and others). It is not simple to measure QY in the cell under a microscope, but we have done something similar, shown in supplementary figure 4. We labeled the three ACP-receptor complexes with PEG11-Nile red and co-stained with antibody to the Insulin Receptor. We then calculated a relative quantum yield. There was very little difference between the relative quantum yields, so we conclude that it is not the environment of the probe that affects the quantum yield under these conditions, but the fact that it is covalently attached to a protein and incapable of forming aggregates. What distinguishes these constructs is the emission spectrum, not the quantum yield. In supplementary Table 2 we also did QY measurements in vitro and we could reproduce the increase of quantum yield by association with liposomes or in organic solvents. We tested whether non-covalent association with a protein would increase the QY by incubation with the lipid binding protein, BSA, in PBS. This was not the case, strongly pointing to the conclusion that it is the covalent association with the protein that increases the QY, not association with a protein as such. We believe that our demonstrations of changes in fluorescent spectra with changes in cholesterol, of large changes in fluorescent spectra with linker length for the 1992 construct, and of voltage sensitivity using patch-clamp prove that the Nile red is reporting on the membrane environment under the conditions we propose.

      **Minor comments:**

      • Fig 1d requires quantification

      We do not agree with this. The panel is simply meant to show that labeling depends on expression of the relevant ACP-IR constructs. There is no detectable labeling of the control.

      • Voltage sensitivity of different PEG length of 2031-ACP probe should be added.

      We have added these data in Figure 2, panel E.

      • Fig 3a graph should show all data points, not only bar graphs. Also, the band in 3a for +CoA-PEG-NR is dimmer than other bands, is it specific to this particular gel since quantification does not show any difference?

      There is no significant difference.

      • Fig 4d, colour code is needed.

      Done

      • Fig 5b and Fig3d are basically the same experiments in terms of control measurement, why is the difference in 3b is 0.04 GP unit while it is 0.007 GP unit?

      We explain this in the MS, but have improved the Y-axis title of the graph in Fig. 5b so that the difference in what is plotted is clear.

      • Why is inhibitor data so noisy? We should discuss.

      We do not know exactly why the inhibitor data are noisy, but we speculate that the actin cytoskeleton and phosphoinositide-dependent signaling affect membrane stability, so the membrane environment would fluctuate in the presence of latrunculin B or the PI3K inhibitor.

      Reviewer #2 (Significance (Required)): Overall, this is a very useful approach, and this line of research will yield very useful tools to shed light on how lipids surrounding proteins can change their function. Major advance of the paper is the new chemical biology tool. There is also biological data on how insulin can change the insulin receptor's membrane environment which is contradictory to some old literature claiming that InsR becomes more "rafty" upon insulin treatment (e.g., PMID: 11751579).

      If this type of tagging proves robust and reproducible (limitations and concerns listed above and below), it could be used by other researchers to tag their protein of interest and investigate the lipid environment around those proteins.

      The downside of this method is that the probe requires ACP tag, a relatively less used tag than others in biology, therefore researchers interested in using this probe should have their proteins with ACP tag. Moreover, the linker length and ACP-tag position are quite crucial parameters (and probably should be optimized for each protein). Longer PEG lengths cannot report on changes efficiently (Fig3b), while shorter lengths are prone to artefacts as they can go out of membrane (Fig1 and Fig2). This might limit its widespread use.

      The reason for using the ACP tag is that neither the SNAP tag nor the HALO tag worked. The tethered Nile Red preferred to bind to the tag rather than insert into the membrane.

      **Referees cross-commenting** I agree with all comments and concerns of other reviewers. I see the usability and potential of this new technology along with its limitations as all three reviewers pointed out.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): See below. No concerns on any of these issues.

      Reviewer #3 (Significance (Required)): **Critique:** This MS reports a proof-of-principle for using site-directed environmentally sensitive probe technology to assess the local membrane environment of a receptor tyrosine kinase (IR) upon activation. This technology addresses a major gap in our arsenal of tools to study the mechanisms of membrane signaling as the parameters of interest are biophysical parameters rather than purely biochemical ones. How to do this with spatial and temporal resolution is a major challenge. This study builds on previous work by the Riezman group that develops an extrinsic labeling system to tether Nile Red to specific sites on the ectodomain of a signaling receptor and then probe local membrane environments as a function of receptor activity. This is a carefully done study that is well-controlled, clever in design and well-described. Although the major issues to which such a general technology could contribute involve intracellular (and not extracellular) events, the advances described will be of general interest -- particularly that local membrane order decreases when IR becomes activated. Specific comments for the authors' consideration follow:

      **Specific Comments:** (i) As a general comment, the authors are measuring extracellular plasma membrane leaflet properties that may or may not translate to what is happening in the local inner leaflet environment. A general reader may well miss the significance of this. This point needs to be more explicitly emphasized in the Discussion.

      This has been discussed in the revised version.

      (ii) Why not treat cells with a PLC inhibitor to block PIP2 hydrolysis and ask if that inhibits membrane disorder. It is PIP2 hydrolysis/resynthesis that regulates the actin cytoskeleton at signaling receptors and this seems an attractive candidate for study.

      There is a long list of attractive post-signaling events of the insulin receptor and how this works in different cell types that could be tested. We believe that this is beyond the scope of this study and we encourage others to do this.

      (iii) The data acquisition time is at least 4 min which is long enough for activated receptors to be recruited to sites of endocytosis. Can the authors exclude the possibility that what they are measuring isn't reflective of such spatial reorganization? Does a clathrin inhibitor block the observed change in local membrane order for activated IR?

      We determined localization to AP2 adaptor-containing clathrin-coated pits at the cell surface and showed that, during the time-course of the experiment, there is no significant change in co-localization or evidence for endocytosis (new figure 9). Therefore, we decided not to do the clathrin inhibitor blocking experiment because we believe that it could only lead to indirect effects.

      (iv) Receptor activation is accompanied by other transitions such as dimerization, etc. Can the authors exclude the possibility that what they are measuring is related to changes in depth of insertion of the NR probe into the plasma membrane outer leaflet that is a consequence of IR conformational transitions associated with activation?

      This is highly unlikely given the fact that fluidification of the membrane environment is found with all length linkers. Given the intervals in increases in linker length on the 2031 construct, which is the closest to the membrane, it is very difficult to conceive that any of the ones larger than 5 PEGs significantly restrict the membrane insertion of the dye.

      **Referees cross-commenting**

      I think we have a consensus opinion

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      See below. No concerns on any of these issues.

      Significance

      Critique:

      This MS reports a proof-of-principle for using site-directed environmentally sensitive probe technology to assess the local membrane environment of a receptor tyrosine kinase (IR) upon activation. This technology addresses a major gap in our arsenal of tools to study the mechanisms of membrane signaling as the parameters of interest are biophysical parameters rather than purely biochemical ones. How to do this with spatial and temporal resolution is a major challenge. This study builds on previous work by the Riezman group that develops an extrinsic labeling system to tether Nile Red to specific sites on the ectodomain of a signaling receptor and then probe local membrane environments as a function of receptor activity.

      This is a carefully done study that is well-controlled, clever in design and well-described. Although the major issues to which such a general technology could contribute involve intracellular (and not extracellular) events, the advances described will be of general interest -- particularly that local membrane order decreases when IR becomes activated. Specific comments for the authors' consideration follow:

      Specific Comments:

      (i) As a general comment, the authors are measuring extracellular plasma membrane leaflet properties that may or may not translate to what is happening in the local inner leaflet environment. A general reader may well miss the significance of this. This point needs to be more explicitly emphasized in the Discussion.

      (ii) Why not treat cells with a PLC inhibitor to block PIP2 hydrolysis and ask if that inhibits membrane disorder. It is PIP2 hydrolysis/resynthesis that regulates the actin cytoskeleton at signaling receptors and this seems an attractive candidate for study.

      (iii) The data acquisition time is at least 4 min which is long enough for activated receptors to be recruited to sites of endocytosis. Can the authors exclude the possibility that what they are measuring isn't reflective of such spatial reorganization? Does a clathrin inhibitor block the observed change in local membrane order for activated IR?

      (iv) Receptor activation is accompanied by other transitions such as dimerization, etc. Can the authors exclude the possibility that what they are measuring is related to changes in depth of insertion of the NR probe into the plasma membrane outer leaflet that is a consequence of IR conformational transitions associated with activation?

      Referees cross-commenting

      I think we have a consensus opinion

    1. Author Response

      Reviewer #1 (Public Review):

      This study addresses the important question of understanding the cellular physiology of cholinergic interneurons in the striatum. These interneurons play a key role in learning and performance of motivated behaviors, and are central to movement disorders, psychiatric disease, and addiction. Their unique physiology, which includes tonic pacemaking activity and active conductances that shape integration of dendritic inputs, is critical to their function but is still incompletely understood. The authors cleverly integrate a series of innovative electrophysiological and optical approaches to gain insight into dendritic physiology of these neurons. Their creative approach yields some interesting and novel findings. However, there are technical and conceptual concerns that need to be addressed before these results can be readily interpreted. Some refinement of analysis and presentation, and potentially some additional experiments, will therefore be required to strengthen the conclusions and facilitate interpretation of the results.

      We believe that with several new sets of experiments and simulations, we have successfully refined the analysis and addressed the technical and conceptual problems. Indeed, we strengthened the conclusion with a novel pharmacological experiment that provided model-independent evidence of proximal-only boosting.

      Major concerns:

      1) This manuscript focuses on how the differential physiology of proximal and distal dendrites contributes to physiological activity and integration of inputs in cholinergic interneurons, suggesting that NaP and HCN currents act in concert to selectively boost inputs onto proximal dendrites (from thalamus), relative to inputs onto distal dendrites (from cortex). The results presented in Figures 1-4 are consistent with a distinct physiology of proximal-vs-distal dendrites based on purely electrical properties. Indeed, Figure 5 initially appears consistent with this model as well, since thalamic inputs (onto proximal dendrites) are boosted by an NaP conductance, while cortical inputs (onto distal dendrites) are not. This raises a key conceptual question: why are cortical inputs onto distal dendrites not boosted? Any depolarization of distal dendrites must pass through proximal dendrites before reaching the recording electrode at the soma. Shouldn't this signal be subject to the same active and passive conductances, and consequently the same boosting that shapes thalamic inputs onto proximal dendrites?

      You are absolutely right in the case of a linear model (passive or quasi-linear). However, for a nonlinear system, there can be preferential boosting of proximal inputs. The new Appendix 2 addresses this point with computer simulations.

      2) The quasi-linear approach to characterizing active and passive membrane properties is promising, and the choice of a cable-based model is well supported. However, the model itself is rather opaque, which limits confidence in the interpretation of the results. Additional analysis and description should be presented to alleviate concerns about whether the experimental data, which has a limited number of measurable values, may be over-fit by a model with too many free parameters. For example, why is the radius of the dendrite a free parameter that is allowed to vary in the full field vs proximal experiment (Lines 253-256) - and isn't it a serious red flag that the value returned for proximal dendrites is smaller than for the full field? Additional tables (e.g. fixed and free parameters and how they were determined), and figures (plots of how those parameters influence the fits, and how the parameters interact with one another) would considerably strengthen confidence in the conclusions drawn by the authors.

      Thank you very much for this comment. We have added to the new ms a table with all the parameters fit in the various figures, and have discussed the possible pitfalls of overfitting. Most importantly, we have provided a new appendix (#1) to the manuscript that explains the effects of the various model parameters in a systematic fashion, beginning with passive dendrites, followed by the effects of boosting and then the effect of restorative currents that give rise to resonances. This appendix addresses the questions raised by the reviewer regarding how the various parameters influence the fits.

      We apologize if we created confusion with respect to the meaning of the parameter r. It does not represent the radius of the dendrites (which is not explicitly represented at all, only implicitly through the space constant) but rather the electrotonic range of illumination. We indeed find that the fits consistently estimate a value of r for the proximal illumination which is smaller than that estimated for the full-field illumination, as it should be.

      Finally, our new pharmacological demonstration of differential boosting in the case of proximal vs. full-field illumination (see above) is entirely independent of the quasi-linear model fit. So the main thrust of the ms, which is to demonstrate a proximal localization of nonlinearities and its correspondence to the spatial localization of excitatory afferent inputs, is now achieved, at least vis-à-vis the NaP current, independently of the quasi-linear model. However, we still find the model useful as it is used to estimate the distribution of HCN currents and provides a framework to think about how to manipulate dendritic nonlinearities experimentally.

      3) Technically, the use of ChR2 to modulate dendritic currents is creative. While the authors rightly acknowledge that activation/deactivation kinetics of the ChR2 channel will contribute to filtering, this important point should be expanded with additional analysis and potentially with new experiments. Of particular concern is the transition of ChR2 channels to an inactivated state over the comparatively long oscillating light pulse in Figure 3. Inactivation of ChR2 is prominent over this timescale and would precisely co-vary with the shift in oscillation frequency. To address this, the authors should present a direct measurement of this inactivation and account for it in their analysis of the chirp data. Alternatively, the chirp stimulus could be presented backwards (starting at high frequency), so that comparison of forwards-vs-backwards chirp recordings could disentangle this artefact. Either one or both of these additional experiments would be critical for interpreting the roll-off in photocurrent responses at high frequencies reported in Figure 3.

      Touché! You were spot on with this critique and we were wrong. We have now conducted several new experiments (that appear in the main text and in Figure 3 and all its supplements) that show that including ChR2 kinetics explicitly in the model fits actually makes the fits more self-consistent and removes some of the glaring differences between the results from the somatic voltage perturbations (Figures 1–2) and the optogenetic illumination (Figure 3). So as per your request, we have now presented a direct measurement of the deactivation (Figure 3–figure supplement 1) and we have played the “chirp” backwards (Appendix 1–figure 2) to address the issue of inactivation.
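The forward-versus-backward chirp control can be illustrated with a short sketch. This is only an illustration under stated assumptions: the source does not give the stimulus parameters, so the duration, frequency range, sampling rate and the function name `linear_chirp` are all hypothetical, a linear frequency sweep is assumed, and the waveform is left normalized to [-1, 1] rather than scaled to a light intensity.

```python
import numpy as np

def linear_chirp(duration_s, f_start_hz, f_end_hz, sample_rate_hz):
    """Return (t, waveform) for a sinusoid whose frequency sweeps linearly."""
    n = int(round(duration_s * sample_rate_hz))
    t = np.arange(n) / sample_rate_hz
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    # Phase is the time integral of the instantaneous frequency.
    phase = 2 * np.pi * (f_start_hz * t + 0.5 * sweep_rate * t**2)
    return t, np.sin(phase)

# Hypothetical parameters: 10 s sweep from 0.5 Hz to 20 Hz at 1 kHz sampling.
t, forward = linear_chirp(10.0, 0.5, 20.0, 1000.0)
backward = forward[::-1]  # same waveform played in reverse: high to low frequency
```

If slow inactivation accumulates with elapsed time rather than with frequency, its effect appears at the high-frequency end of the forward response and at the low-frequency end of the backward response, which is why comparing the two can separate it from a genuine frequency-dependent roll-off.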

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      First we would like to express our deep gratitude to the reviewers for thoroughly and fairly reviewing our work.


      Reviewer #1:

      Major Concerns

      1. A major concern I have is with the use of DAPT to modulate Notch signaling, and investigate the impact on integrins, Yap, cadherins, etc. Gamma-secretase, the target of DAPT, cleaves not only Notch receptors, but also IntegrinB1, Nectins, Cadherins, Ephrins and more. This recent review lists 149 substrates (Guner & Lichtenthaler Seminars in Cell & Developmental Biology 2020). The risk that some of the results reflect DAPT impact on IntegrinB1, Cadherins etc themselves is significant. The authors should validate their findings with more specific modulation of Notch activity, for example with a Notch blocking antibody, with siRNA, or with SAHM1.

      We agree with the reviewer's comment and will add additional key experiments using SAHM1 as an alternative inhibitor of Notch activity.

      Furthermore, EGTA was used to "acutely destabilize VE-Cadherin". But EGTA chelates Calcium, which is essential for Notch structure, and EGTA is thus a well-known activator of Notch signaling (see eg Rand MD et al. (2000) Calcium depletion dissociates and activates heterodimeric notch receptors. Mol Cell Biol). The authors rightfully describe and cite this paper, but the use of EGTA nonetheless confounds interpretation. The authors check for NICD levels (at what timepoint?) but the staining is cytoplasmic (also not labelled in the figure per se, but described in the figure legend? - please label the staining in the panel). And in any case, NICD is very short-lived and nuclear staining cannot be taken as a hallmark of signaling activity. In particular if staining is performed at a time point at which the receptor and NICD may have been exhausted/depleted. The authors should validate these observations/conclusions with the Notch reporter to conclusively demonstrate whether EGTA does not activate Notch in their system.

      To test whether transient treatment with EGTA causes Notch activation we will repeat this experiment with Notch reporter activity as readout.

      Trans-endocytosis of NECD on different substrates: the authors suggest that trans-endocytosis of NECD by Dll4 increases on softer substrates. But the authors also show that soft substrates lead to spreading out of cells, which could confound interpretation (is overlapping membranes, not internalization). The authors could validate trans-endocytosis by FACS: check if red Dll4+ cells contain more NECD. It is also not clear to me in this experiment whether the authors are looking at green NECD, or Notch1 full length, since they write "overlap of Notch1 and Dll4", which would not reflect trans-endocytosis but interactions at the cell surface for both cells. Please also define "overlay intensity", or explain further.

      We will validate the trans-endocytosis by flow cytometry. In addition, we now describe the procedure for microscopic analysis more clearly (methods section, p 4; results section, p 17-19).

      The authors conclude their introduction with a statement that mechanosensitivity of Notch is linked to endocytosis, but their conclusion from Fig 6C was that Notch stiffness-dependence was independent of endocytosis, using the rhDll4..?

      We have now rephrased this sentence.


      Minor concerns

      1. In the introduction, the authors describe Dll3 as a Notch ligand that activates Notch signaling in trans. To my knowledge, Dll3 has only been described as a cis-inhibitor of Notch signaling. (I think this may have arisen during repeated edits of the manuscript!)

      This has now been corrected in the current version.

      In the introduction, the authors state that Notch1, Dll4 and Jag1 control angiogenesis, but then they only describe what Notch1/Dll4 do in the next few sentences. Perhaps one sentence to describe the role of Jag1 would help avoid the feeling of being "left hanging".

      This has now been corrected in the current version.

      Data presentation: please show all bar graphs with the individual replicates (dotplots).

      We have now changed all bar graphs into scatter plots.

      Data analysis/normalization: many graphs represent normalization of values in multiple steps which are not described in the methods/legends/results. For example, Notch reporter gene activity (Fig 1A) is Firefly divided by Renilla, and presumably normalized to the control condition at 1 (or an average of 1 for the three controls?). This is not explained. Also, it is not clear whether the data reported for the Control condition are Huvec on rhDll4 compared (normalized) to Huvec on control substrate (and similar for each other condition). What controls are included in this experiment? Please provide the full data to provide insight into the magnitude of activation by Dll4 itself. Perhaps "Control" is without rhDll4? But the bar underneath A/B implies this rhDll4 was used in all conditions.

      We have edited our manuscript accordingly to avoid these ambiguities.
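The two-step normalization the reviewer presumes (Firefly divided by Renilla, then scaling so that the control condition averages 1) can be written out explicitly. This is a sketch of that presumed scheme only; the source does not state the authors' actual procedure, and the function name and all numbers below are hypothetical.

```python
def normalized_reporter_activity(firefly, renilla, control_firefly, control_renilla):
    """Firefly/Renilla ratio per replicate, scaled to the mean control ratio."""
    ratios = [f / r for f, r in zip(firefly, renilla)]
    control_ratios = [f / r for f, r in zip(control_firefly, control_renilla)]
    control_mean = sum(control_ratios) / len(control_ratios)
    # After this step the control replicates average exactly 1.
    return [ratio / control_mean for ratio in ratios]

# Hypothetical luminescence readings, two replicates per condition.
treated = normalized_reporter_activity(
    [200.0, 400.0], [100.0, 100.0],   # treated: Firefly, Renilla
    [100.0, 300.0], [100.0, 100.0],   # control: Firefly, Renilla
)
```

Scaling to the mean of the control replicates, rather than forcing each control replicate to 1, keeps the spread of the control visible in the plot.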

      Statistics: data should be presented as means +/- standard deviation, not standard error of the mean (see for example Barde & Barde Perspect Clin Res. 2012): "SEM quantifies uncertainty in estimate of the mean whereas SD indicates dispersion of the data from mean. As readers are generally interested in knowing the variability within sample, descriptive data should be precisely summarized with SD."

      We now use SD instead of SEM.
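The distinction the reviewer draws between SD and SEM can be made concrete with a small sketch. The function name and replicate values are hypothetical; the sample standard deviation (n - 1 denominator) is assumed.

```python
import math
import statistics

def sd_and_sem(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)       # dispersion of the data around the mean
    sem = sd / math.sqrt(len(values))   # uncertainty of the estimated mean
    return sd, sem

# Hypothetical replicate measurements.
sd, sem = sd_and_sem([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Because the SEM shrinks with the square root of the sample size while the SD does not, error bars drawn as SEM look tighter than the underlying replicate-to-replicate variability.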

      Statistics: In the Methods section, the authors state that one-way ANOVA was followed by Dunnett's multiple comparison test, and two-way ANOVA was followed by Tukey's multiple comparison test. Dunnett is used to compare every mean to a control mean, while Tukey is used to compare every mean with every other mean. Fig 1 describes using Dunnett for Fig 1B, but the end of the legend says Tukey was used. However Fig 1A,C show internal pairwise comparisons to plastic. Please be sure to explain which statistics were used where, and why, and if plastic was set as the comparator, please be explicit about this. Fig 3 uses "Sidak's corrected two-way ANOVA" and "Sidak's multiple comparison test"? I think Sidak is a method to correct alpha or p for multiple comparisons, as stated in the first instance, but it is not described why this was used here, and not in other analyses, and whether the authors then applied Tukey's post-hoc test as described in the methods section? Similar comments for Fig 6. It is counter-intuitive that the plastic-1.5 kPa PDMS difference with no error-bar overlap in 1A would be 1-star significance, while the plastic-70 kPa difference with almost overlapping error bars in 1B would be 4-star significance. Please check/show values. In Fig 1B Figure legend, the authors write "Data is presented in a bar plot and compared with the integrin β1 intensities without DAPT treatment", but this is not the statistical comparison presented. Fig 3B shows a very minor difference with overlapping error bars as 3-star significance? Is this correct?

      We have checked all statistical issues and corrected them where necessary. Since the sample size and variance were homogeneous in all comparisons, we now uniformly use ANOVA with Tukey's multiple comparison test as the post hoc test to keep things simple.

      How much nuclear NICD (NICD intensity) is there in control conditions? (Control missing from Fig 1B, D).

      We will repeat the experiment and compare the NICD levels with those in non-activated cells on plastic.

      A DAPI counterstaining for 1B/D right panels would facilitate evaluation of whether NICD nuclear intensity is increased. The same applies for nuclear YAP assessment in Fig 3B. I assume a nuclear counter-stain was done for quantification of nuclear NICD intensity, and nuclear YAP intensity, but this is not described in the Materials and Methods, please add a description of how intensity was quantified, and provide nuclear counterstain images. (Also, what is the unit on the y-axis of "intensity" graphs? Arbitrary units (a.u.)?)

      The counterstaining method with Hoechst as well as the use of the nuclear staining for quantitative analysis of images are now described in the Methods section and where needed in the figure legends. The y-axis of the intensity graphs now has a dimension (a.u.). We decided against overlay of the nuclear staining with the NICD or YAP images for graphical reasons (visibility of the respective staining).

      How much "overall" integrin B1 is there in DAPT-treated conditions in Fig 2C? (related to the concept that DAPT could be cleaving integrin B1, it could be depleted at 24 hours..?)

      We will additionally add this experiment and validate the effect of Notch inhibition on the overall integrin level using the alternative inhibitor SAHM1.

      More details regarding the analysis procedure need to be added to the Methods Section. Were cells segmented and then mean intensity estimated for the whole cell? Was this done by means of Intensity Ratio Nuclei Cytoplasm Tool plugin for Fiji alone? Were images background corrected, corrected for inhomogeneous illumination, normalized? In the case of Integrin beta 1 active, the expression seems to be patterned, was intensity expressed as mean intensity of every pixel corresponding to cytoplasm? For VE Cadherin staining, how was intensity estimated (only pixels corresponding to membrane were considered or every pixel of the cell)? Many figures are originated from a confocal microscope: were z-stacks acquired and then maximum projections done? Were z-stacks acquired and then fluorescence quantified in 3D images? Was a single plane acquired or analyzed, and if that is the case, how was this plane chosen?

      The requested information has now been inserted in the respective results and method sections.

      In Fig 4A, how is VE-Cadherin intensity quantified? As an average per field of view? Or per cell? And if per cell, how was each cell delineated? And if not per cell, how were equal cell numbers ensured? In FRAP experiment, how was intensity quantified? Was it per cell, per field of view or per region? Was each bleached region analyzed separately, or each cell? The datapoints should be either added to Figure 4C or as supplementary to assess the fitting. How many bleached regions per cell were done and how many cells were analyzed? In FRAP experiment, was bleaching done with an increased pixel dwell time? Was laser intensity increased? Do you have an estimation of laser power (not percentage) or flux?

      These issues are now described in more detail in the respective figure legend.

      Figure S2 is not referenced in the manuscript - I think a reference to "Figure S3" in the NECD transendocytosis section (no page numbers or line numbering) should be to Fig S2 instead?

      Sorry for this mistake! We corrected this now.

      In Figure 5A NICD nuclear intensity normalized somehow (normalization not explained), and stiffness no longer appears to regulate NICD levels as shown in Figure 1B.

      We have now described the normalization better in the figure legend. The difference to the results in Fig. 1B is that in Fig. 5A the cells were not activated by Dll4 sender cells or rhDll4 (endogenous Notch activity). This is now stated more clearly.

      Fig 6B: From the immuno at right there is a clear stiffness-dependent difference in Transferrin uptake. How were "single cell uptake" and "number of particles" quantified? (How were cell bodies identified?) Uptake could also be verified with FACS.

      In this point, we disagree with the reviewer: we really do not see a systematic difference in intensities between the different substrates. The process of image analysis is now better described in the figure legend. The result was so clear that we did not use FACS as complementary approach.

      Fig 6C: there appear to be very different numbers of cells in the brightfield image at right. Are the 70, 1.5, and 0.5 kPa Notch reporter activities different from one another or only different from plastic? Might these results reflect cell density/increased Notch signaling due to more cell-cell contacts?

      Unfortunately, with decreasing stiffness the PDMS gels become optically more and more cloudy, giving the false impression of a higher cell number. We tried to circumvent this by changing contrast and brightness of the images, but to no satisfying effect. We now mention this issue in the figure legend.

      How was the Dll4 coating of the different substrates done?

      The coating of the substrates is now described under a specific subheading in the Methods section.

      It would be helpful to describe the composition of Collagen G (Collagen I) in the text (it is a risk to expect vendor information to remain available indefinitely).

      The role and composition of the Collagen G coatings was included in the text (p 7). Further information on the manufacturer of the product used is included in the methods section.

      Please list catalog numbers for all reagents, and dilutions used for antibodies.

      We have added this information wherever possible.

      Instead of using red and green for images, maybe cyan, yellow and/or magenta could be used to help the reader see what is being shown (especially if the reader might be color blind).

      We will of course adhere to the respective policy of the publishing journal, once the manuscript is accepted.

      Packages and tools such as Intensity Ratio Nuclei Cytoplasm Tool plugin for FIJI should be referenced.

      We have now referenced respective tools.

      Reviewer #2:

      Major comments:

      Is there difference on a growth rate of cells on softer vrs stiffer gels that could affect cell morphology/signaling pathways?

      This is an important point and we will perform additional respective experiments.

      Nuclear localization of NICD and YAP would be good to validate with western blot.

      Quantification of Western Blots (especially after nuclear isolation) is – at least in our hands – much less sensitive and reliable than quantitative imaging. We do not think that this experiment would strengthen our study.

      In Figure 3 and Figure 5, siRNA experiments would strengthen the data. DAPT is not only an inhibitor of Notch but affects other proteins as well. This should be stated.

      A similar point was raised by Reviewer#1 with the suggestion to use SAHM1 as an alternative to DAPT. As suggested we will add these experiments.

      How was the mean VE-cadherin branch length determined? This term often refers to angiogenesis assay/sprout formation and maybe another one should be considered here to describe VE-cadherin junction morphology.

      Add to all figure texts how many cells were used for the analyses.

      The cell number is now added wherever appropriate.

      In Fig. 6C the cell morphology of HUVECs look abnormal in comparison to other images and should be re-done.

      In contrast to all other experiments, the cells were not confluent in this case. The different morphology is a sign of the lack of neighbours, not of some problem with the cells.

      Was all the data normally distributed and thus ANOVA was used? Please add more details on the statistics part. Did you remove outliers?

      As also suggested by Reviewer #1, we have added more information on the statistics and streamlined them. The data are normally distributed; outliers were not removed.

      MTT assay of DAPT would need to be presented as it can be cytotoxic. Cells are not well visible in Fig 2C with DAPT. DAPI and F-actin staining would help to see the cell morphology.

      We will add respective data on cell viability after DAPT (and SAHM1) treatment in a revised version of the manuscript.

      Minor comments:

      Please clarify how coating with rhDll4 is done as this was unclear at least for this reviewer.

      The coating of the substrates is now described under a specific subheading in the Methods section.

      HUVECs are known to be hard to transfect. Please provide data on transfection efficiencies of all transiently transfected cells.

      We did not systematically monitor transfection efficiencies in this context, since there was always an internal control (e.g. co-reporter in the reporter gene assay) or the data were obtained by single-cell-based quantification. Generally, we achieve transfection efficiencies of around 30% with HUVECs.

      Reviewer #3:

      Major comments:

      1) The authors use recombinant Dll4 or Dll4-expressing ("sender") cells to activate Notch in co-cultured cells. This is per se fine; however, one might over-estimate all other observed downstream effects as endogenous Notch activity is lower. It would be important to see how naïve HUVEC or other primary endothelial cells respond to changes in stiffness. qPCR of Notch target genes such as Hey1, Hey2, Hes5, Dll4 is frequently used as a readout of Notch activity in this context. Also, the Notch transcriptional reporter assay might be a suitable read-out.

      In Fig.5A we show data on endogenous Notch activity (- EGTA) on substrates with different stiffness. In this case NICD levels in the nucleus do not differ. It will definitely be interesting to repeat this experiment based on the reporter gene assay.

      2) As the authors mention in the Discussion, cell density could be of utmost importance given the fact that Notch signaling usually is assumed as an in trans signaling event between adjacent cell membranes. However, also other signaling modes (in cis, cis inhibition, JAG1 vs DLL4 ratio) might be important. As such, the authors should carefully document and report on cell density in all experiments. Secondly, the authors should use other conditions such as sparse cell density and thirdly the authors should measure transcriptional effects of stiffness on Notch ligand expression.

      In all experiments (with the exception of Fig. 6C) we used confluent cells. With the sparse cells (Fig. 6C) we also observe stiffness dependency. Investigating Notch ligand expression is definitely a good idea and will be investigated in the revised manuscript.

      3) The authors need to compare stiffness in their model with physiological conditions in developing tissues and ideally also in tumor which often have increased tissue stiffness.

      Good point! We have now integrated such comparisons in the Discussion.

      4) Is Notch activation due to changes in stiffness dependent on the presence of ligands or could it be that (unspecific) binding of Notch receptors to ECM could trigger cleavage just by conformational change?

      Since there is no stiffness dependent response on collagen (Fig. 6C, left panel), an effect of unspecific binding is highly unlikely.

    1. Author Response

      Reviewer #1 (Public Review):

      In this article, the authors investigated the role of sleep and brain oscillations in visual cortical plasticity in adult humans. The authors tested the effect of 2 hours of monocular deprivation (MD) on ocular dominance measured by binocular rivalry. In the main MDN session, MD was performed in the late evening, followed by 2 hours of sleep, during which EEG was measured. After the sleep session, ocular dominance was measured, which was followed by 4 hours of sleep, then ocular dominance was measured again in the morning. The results show that the effect of MD was preserved 6 hours after MD. The effect of MD correlated with sleep spindle and slow oscillation measures. The questions asked by the study are timely and findings are important in understanding the visual cortical plasticity in human adults, but I have some concerns regarding the experimental design, analysis, and interpretation of the results, which are listed below.

      Thank you for the positive summary of our results.

      • The authors investigated EEG activities in the central and occipital regions. The results of the relationship between slow oscillations / sleep spindles and deprivation index are very interesting. However, it appears that the activities were averaged across hemispheres in the occipital region. Previous studies (e.g. Lunghi et al., 2011; Binda et al., 2018) have demonstrated that MD is associated with up-scaling of the deprived eye and with down-scaling of the non-deprived eye (page 11). I wonder whether sleep slow oscillations and / or spindles are modulated locally in the deprived occipital region? To answer the first question raised by the authors (how MD affects subsequent sleep), wouldn't it be important to compare between deprived vs. non-deprived regions?

      In humans, the purely monocular recipient cortical regions are very small and represent only the very far visual periphery. These regions are impossible to locate by EEG and are also difficult to locate with high-resolution fMRI (ref to Koulla CB). Visual cortical organization is based on the visual field map: neurons whose visual receptive fields lie next to one another in visual space are located next to one another in cortex, forming one complete representation of contralateral visual space, independently of the eye from which the visual information comes. However, at finer scales ocular dominance columns exist, and Binda et al (2018) showed that in adult humans MD boosts the BOLD response to the deprived eye, changing ocular dominance of V1 vertices, consistent with homeostatic plasticity. All these are well-known facts in the vision community, and we believe it is not worthwhile to discuss them.

      • To answer the second question (how sleep contributes to consolidation of visual homeostatic plasticity), the authors compared the deprivation index between two sessions, the main MDN and a control MDM session. The experimental designs for these two sessions were quite different. For example, MD was conducted in the evening in MDN, whereas it was conducted in the morning in MDM. Since there may be circadian effects on plasticity (Frank, 2016), the comparisons between these sessions may not be sufficient in investigating the effect of sleep itself (it could be merely due to circadian effect).

      Thank you for raising this important issue. We performed the dark exposure experiment in the morning because we wanted to minimize the occurrence of sleep during the two hours spent by participants lying down in complete darkness. Preventing sleep under these conditions in the late evening would have been extremely challenging. In order to investigate a possible influence of the circadian rhythm on visual homeostatic plasticity and its decay over time, we have performed an additional experiment. In this experiment, we have tested the effect of 2h of monocular deprivation in the same participants either early in the morning or late at night (at a time of the day comparable to the MDnight and MDmorn conditions in the main study). We report the results of this control experiment in the supplementary materials (Figure S2). We found that the effect of monocular deprivation follows a similar timecourse for the two conditions (ocular dominance returns to baseline levels within 120 minutes after eye-patch removal). Moreover, we also report that the effect of MD is slightly (but significantly) larger in the morning, compared to the evening. The results of this experiment rule out a contribution of circadian effects and reinforce the evidence of a specific effect of sleep in maintaining visual homeostatic plasticity.

      • The authors argue that NREM sleep consolidates the effect of MD. However, consolidation may last days to months or even years (Dudai et al., 2015). Since the effect is gone in 6 hours or so, it may be difficult to interpret it as consolidation. Although the findings of the effects of sleep on ocular dominance plasticity are interesting, the interpretations of the results may need to be clarified or revised.

      We thank the reviewer for raising this issue. We agree that the data show a substantial delay in the decay process of the MD effects after the removal of the patch. The present data indicate that specifically the sleep condition and not merely darkness would be responsible for the maintenance of the MD-induced effect during the night. Therefore, we gladly adhere to the request and propose to say that sleep stabilizes/maintains the effects of MD as long as sleep itself persists. Having said that, we would like to point out that the MD boost in amblyopic patients gets consolidated for up to one year and increases across night sleep as we reported in Lunghi, Sframeli et al (2019). Although these data strongly suggest that real consolidation may occur, we agree with the reviewer that our data did not directly address this question and have changed the manuscript accordingly.

      Reviewer #2 (Public Review):

      This manuscript is an interesting follow up on a substantial literature on the role of sleep in promoting critical period ocular dominance plasticity, and the role of sleep in promoting adult V1 plasticity following presentation of a novel visual stimulus. For nearly all of that literature (i.e. coming from cats and mice), the focus has mainly been on Hebbian mechanisms. The authors here propose to advance the field by investigating plasticity in adult human V1, which the authors consider to be homeostatic rather than Hebbian, and which the authors consider to be a form of sleep-dependent consolidation. This is an exciting goal, and the overall study designs and control will test the effects of brief MD and subsequent sleep or wake in the dark on V1 processing for the two eyes.

      Thank you for the positive commentary on our study.

      However, the outcomes of the study suggest that the changes observed in V1 across sleep may actually be the opposite of consolidation - rather it is decay of an effect on V1 function caused by prior wake experience (MD), which disappears over subsequent hours.

      We thank the reviewer for raising this issue. We agree that the data show a substantial delay in the decay process of the MD effects after the removal of the patch. The present data indicate that specifically the sleep condition and not merely darkness would be responsible for the maintenance of the MD-induced effect during the night. Therefore, we gladly adhere to the request and propose to say that sleep stabilizes/maintains the effects of MD as long as sleep itself persists. We have revised the entire MS through the various sections to handle this important aspect and to consider that a classic correlate of memory consolidation during sleep (spindles density) also turns out to be associated with maintenance of the MD-induced ocular dominance effect.

      The authors claim differences due to sleep, but there is not a direct statistical comparison between sleep and awake-in-the-dark controls.

      We now directly compare the effect of monocular deprivation and its decay after two hours in the sleep vs dark exposure condition (MDnight vs MDmor). We now plot the results of the two conditions in the same graph (Figure 2). We found a significant interaction effect between the factors TIME (before and after) and CONDITION (MDnight and MDmor), indicating a specific role of sleep in prolonging the decay of short-term monocular deprivation.

      There is also no quantification of sleep architecture across the sleep period, to determine whether REM or NREM play a role.

      We have provided a summary table of sleep architecture in the revised version of the Supplementary Materials. The table shows descriptive statistics of sleep architecture on MDnight and CN. Also, we report the result of the paired comparison between the nights and the Spearman correlations between the deprivation indices (DI before and DI after) and the changes between the nights in sleep architecture. Tests indicate that MD does not produce any main effect on the sleep architecture and that there are no substantial associations found between sleep architecture parameters and deprivation indices. Thus, it appears that changes in SSO and spindle frequency and amplitude did not lead to an alteration in the amount of N2 or N3 sleep, as we might expect. At the beginning of the Results section we refer to the table and to the lack of statistically significant effects.

      Finally, while there are tests of changes in NREM oscillations with previous plasticity in wake, there are no direct tests of changes across sleep - i.e. the very changes that could be considered consolidation.

      We thank the reviewer for stimulating us to investigate whether there are any NREM parameters whose change within the sleep cycle can be related to the degree of plasticity maintenance observed at the end of the two hours of sleep.

      For this aim, we 1) partitioned SSO and spindle events into tertiles according to their occurrence time, 2) estimated the average measures of events belonging to the first and last tertile, and considered the variation between tertiles as an estimate of the changes across sleep. We then tested whether there is a consistent relationship between measures of individual retained plasticity (DI after) and changes in SSO and sleep spindles across sleep.

      We did the across sleep analysis of the SSO and spindles measurements and as previously explained none of the parameters showed associations across sleep with the individual DI after sleep. We report these results in the supplementary materials (Figure S8).
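
      The tertile procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: events are taken to be (onset time, measure) pairs for one participant, tertiles are formed by rank so that each holds the same number of events, and the function name and toy values are hypothetical.

```python
from statistics import mean


def tertile_change(events):
    """Change across sleep: mean measure in the last tertile minus the first tertile.

    `events` is a list of (onset_time, measure) pairs; events are ordered by
    onset time and split by rank into three equally sized groups.
    """
    ordered = sorted(events, key=lambda event: event[0])
    n = len(ordered) // 3  # events per tertile (any remainder stays in the middle)
    if n == 0:
        raise ValueError("need at least three events")
    first = [measure for _, measure in ordered[:n]]
    last = [measure for _, measure in ordered[-n:]]
    return mean(last) - mean(first)


# Toy data: onset time in minutes, spindle amplitude in arbitrary units.
events = [(30, 10.0), (10, 12.0), (20, 11.0), (50, 8.0), (40, 9.0), (60, 7.0)]
print(tertile_change(events))
```

      One such change score per participant and per EEG measure could then be correlated with the individual DI after sleep.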

      Finally, it is also not clear that the decay of response changes is due to homeostatic plasticity - it could be just that - decay of plasticity that occurred previously. The terminology used - e.g. consolidation, homeostatic vs. Hebbian - doesn't seem well founded based on data.

      Thank you for raising an important point. In our study homeostatic plasticity refers to the effect of short-term monocular deprivation (so the plasticity occurred before sleep). We have rephrased the interpretation of our results in terms of stabilization/maintenance rather than consolidation of plasticity.

      About homeostatic vs Hebbian plasticity, there is broad agreement in the literature that the effects are indeed different. We now make clear in the text that Hebbian plasticity is usually associated with the boost of the signals most successful in driving a neuronal response or a behavior. Here, MD produced a boost of the unused, and probably silent, eye, and as such the boost is very difficult to explain in terms of Hebbian plasticity. We now make this clear in the introduction.

      Reviewer #3 (Public Review):

      In this study, Menicucci et al. induced plastic changes in ocular dominance by applying an eye-patch to the dominant eye (monocular deprivation, MD). This manipulation resulted in a shift toward even more dominance of the deprived eye, as assessed though a binocular rivalry protocol. This effect was stabilized during sleep whereas it quickly decreases in waking (in the dark). The authors interpret the MD effect as the resultant of cortical plasticity over primary visual areas and its maintenance during sleep as the consolidation of these changes. The authors thus connect their work to the literature on sleep consolidation. They further show that the magnitude of the MD effect is positively correlated with sleep markers that are involved in memory consolidation (slow oscillations and sleep spindles).

      However, I have first conceptual issues with this study. Indeed, previous findings on the replay of memories during sleep and their consolidation were mostly obtained in hippocampus-dependent forms of learning. Here, I do not really see what is it that would be replayed. Thus, I struggle understanding how rhythms, such as sleep spindles, that have been linked to the transfer of hippocampal memories to the neocortex, would be mechanistically associated with low-level plastic changes restricted to primary visual areas. In addition, the effects were observed over occipital electrodes, where sleep spindles are far fewer and lower in amplitude than other cortical regions. Furthermore, the association between MD-related plasticity and slow oscillations is interesting but, since these slow oscillations organize sleep slow waves, the lack of correlation with slow wave is surprising.

      We agree with the reviewer that many of our results are indeed surprising, especially those related to the involvement of the spindles, and for these reasons we believe that eLife would be the appropriate journal to present our work. At present, the fact that sleep spindles have mainly been associated with mediating the transfer of memory does not exclude a more general involvement in other sensory functions.

      Connected to these conceptual issues, I think the present work has some important methodological limitations. First of all, the analyses included a rather small number of participants, which could make some analyses, in particular correlational analyses, severely underpowered.

      We thank you for prompting us to emphasize this limitation. In the Participants section of Materials and Methods, we now point out that, given the complexity of the experimental design, the need to describe sleep through several parameters, the sample size used, and the need to correct for multiple tests, only associations with a strong effect size could be highlighted.

      Secondly, the approach used to explore the correlation between plasticity and sleep features focused on subset of electrodes (ROI) defined a priori. It is therefore difficult to conclude on the specificity of the results. Given the topographical maps provided by the authors, I am wondering if a more exhaustive analysis of the effect at the electrode level could not yield more robust findings.

      The need for ROIs is based on the interindividual variability of brain structures, in particular the large anatomical variability of V1 orientation implying a variably oriented dipole and a variable maximal representation of visual potentials over electrodes from Oz to CPz. Moreover, we have to cope with the volume conduction effect that limits EEG spatial resolution.

      With these limitations in mind, we very gladly adhere to the reviewer's request to evaluate the effects on individual electrodes in more detail. To this end we have prepared supplementary figures which show boxplots and scatterplots for the electrodes inside the ROIs to evaluate main effects and associations, respectively.

      Finally, given the number of features tested, I think it is important to clarify the strategy used to correct for multiple comparisons.

      We thank the reviewer for highlighting an unclear point. In the revised version of the Statistical analyses section, we have provided missing details of the procedure used for handling false positives due to multiple testing. Basically, we applied the FDR correction for each question we asked.

      For example, “at which time points does dominance remain significantly different from baseline?” or, “which EEG feature and in which area of the scalp shows changes significantly dependent on plasticity induced by monocular deprivation?” For each of these questions, we made a group of tests (for the first example, dependent on the number of points at which ocular dominance was assessed until the morning; for the second example, on the number of EEG features examined multiplied by the number of areas in which they were assessed) to which Benjamini & Hochberg's FDR correction was then applied.
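
      The per-question correction described above can be illustrated with a short, self-contained Python sketch of the Benjamini-Hochberg adjustment applied separately to each family of tests. The family names and p-values are made-up placeholders for illustration, not results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# One family of tests per question (hypothetical p-values).
families = {
    "time_points_vs_baseline": [0.001, 0.02, 0.04, 0.30],
    "eeg_feature_by_area": [0.004, 0.03, 0.20],
}
adjusted = {name: benjamini_hochberg(ps) for name, ps in families.items()}
for name, values in adjusted.items():
    print(name, [round(v, 4) for v in values])
```

      Correcting within each family, rather than across all tests pooled together, means the size of one question's family does not affect the thresholds applied to another.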

    1. Author Response

      Reviewer #1 (Public Review):

      The role of the parietal (PPC), the retrosplenial (RSP) and the visual cortex (S1) was assessed in three tasks corresponding to a simple visual discrimination task, a working-memory task and a two-armed bandit task all based on the same sensory-motor requirements within a virtual reality framework. A differential involvement of these areas was reported in these tasks based on the effect of optogenetic manipulations. Photoinhibition of PPC and RSP was more detrimental than photoinhibition of S1 and more drastic effects were observed in presumably more complex tasks (i.e. working-memory and bandit task). If mice were trained with these more complex tasks prior to training in the simple discrimination task, then the same manipulations produced large deficits suggesting that switching from one task to the other was more challenging, resulting in the involvement of possibly larger neural circuits, especially at the cortical level. Calcium imaging also supported this view with differential signaling in these cortical areas depending on the task considered and the order in which they were presented to the animals. Overall the study is interesting and the fact that all tasks were assessed relying on the same sensory-motor requirements is a plus, but the theoretical foundations of the study seem a bit loose, opening the way to alternate ways of interpreting the data than "training history".

      1) Theoretical framework:

      The three tasks used by the authors should be better described at the theoretical level. While the simple task can indeed be considered a visual discrimination task, the other two tasks operationally correspond to a working-memory task (i.e. delay condition which is indeed typically assessed in a Y- or a T-maze in rodent) or a two-armed bandit task (i.e. the switching task), respectively. So these three tasks are qualitatively different, are therefore reliant on at least partially dissociable neural circuits and this should be clearly analyzed to explain the rationale of the focus on the three cortical regions of interest.

      We are glad to see that the reviewer finds our study interesting overall and sees value in the experimental design. We agree that in the previous version, we did not provide enough motivation for the specific tasks we employed and the cortical areas studied.

      Navigating to reward locations based on sensory cues is a behavior that is crucial for survival and amenable to a head-fixed laboratory setting in virtual reality for mice. In this context of goal-directed navigation based on sensory cues, we chose to center our study on posterior cortical association areas, PPC and RSC, for several reasons. RSC has been shown to be crucial for navigation across species, poised to enable the transformation between egocentric and allocentric reference frames and to support spatial memory across various timescales (Alexander & Nitz, 2015; Fischer et al., 2020; Pothuizen et al., 2009; Powell et al., 2017). It furthermore has been shown to be involved in cognitive processes beyond spatial navigation, such as temporal learning and value coding (Hattori et al., 2019; Todd et al., 2015), and is emerging as a crucial region for the flexible integration of sensory and internal signals (Stacho & Manahan-Vaughan, 2022). It thus is a prime candidate area in the study of how cognitive experience may affect cortical involvement in goal-directed navigation.

      RSC is heavily interconnected with PPC, which is generally thought to convert sensory cues into actions (Freedman & Ibos, 2018) and has been shown to be important for navigation-based decision tasks (Harvey et al., 2012; Pinto et al., 2019). Specific task components involving short-term memory have been suggested to cause PPC to be necessary for a given task (Lyamzin & Benucci, 2019), so we chose such task components in our complex tasks to maximize the likelihood of large PPC involvement to compare the simple task to.

      One such task component is a delay period between cue and the ultimate choice report, which is a common design in decision tasks (Goard et al., 2016; Harvey et al., 2012; Katz et al., 2016; Pinto et al., 2019). We agree with the reviewer that traditionally such a task would be referred to as a working-memory task. However, we refrain from using this terminology because it may cause readers to expect that to solve the task, mice use a working-memory dependent strategy in its strictest and most traditional sense, that is mice show no overt behaviors indicative of the ultimate choice until the end of the delay period. If the ultimate choice is apparent earlier, mice may use what is sometimes referred to as an embodiment-based strategy, which by some readers may be seen as precluding working memory. Indeed, in new choice-decoding analyses from the mice’s running patterns, we show that mice start running towards the side of the ultimate choice during the cue period already (Figure 1—figure supplement 1). Regardless of these seemingly early choices, however, we crucially have found much larger performance decrements from inhibition in mice performing the delay task compared to mice performing the simple task, along with lower overall task performance in the delay task, indicating that the insertion of a delay period increased subjective task difficulty. As traditional working-memory versus embodiment-based strategies are not the focus of our study here and do not seem to inform the performance decrements from inhibition, we chose to label the task descriptively with the crucial task parameter rather than with the supposedly underlying cognitive process.

      For the switching task, we appreciate that the reviewer sees similarities to a two-armed bandit task. However, in a two-armed bandit task, rewards are typically delivered probabilistically, whereas in our task, cue and action values are constant within each of the two rule blocks, and only the rule, i.e. the cue-choice association, reverses across blocks. This is a crucial distinction because in our design, blocks of Rule A in the switching task are identical to the simple task, with fixed cue-choice associations and guaranteed reward delivery if the correct choice is made, allowing a fair comparison of cortical involvement across tasks.

      We have now heavily revised the introduction, results, and discussion sections of the manuscript to better explain the motivation for the tasks and the investigated brain areas. These revisions cover all the points mentioned in this response.

      Furthermore, we agree with the reviewer that the three tasks are qualitatively different and likely depend on at least partially dissociable circuits. We consider the large differences in cortical inhibition effects between the simple and the complex tasks as evidence for this notion. We also want to highlight that in fact, we performed task-specific optogenetic manipulations presented in the Supplementary Material to further understand the involvement of different areas in task-specific processes. In what is now Figure 1—figure supplement 4, we restricted inhibition in the delay task to either the cue period only or delay period only, finding that interestingly, PPC or RSC inhibition during either period caused larger performance drops than observed in the simple task. We also performed epoch-specific inhibition of PPC in the switching task, targeting specifically reward and inter-trial-interval periods following rule switches, in what is now Figure 1—figure supplement 5. With such PPC inhibition during the ITI, we observed no effect on performance recovery after rule switches and thus found PPC activity to be dispensable for rule updates.

      For the working-memory task we do not know the duration of the delay but this really is critical information; per definition, performance in such a task is delay-dependent, this is not explored in the paper.

      We thank the reviewer for pointing out the lack of information on delay duration and have now added this to the Methods section.

      We agree that in classical working memory tasks where the delay duration is purely defined by the experimenter and varied throughout a session, performance is typically dependent on delay duration. However, in our delay task, the delay distance is kept constant, and thus the delay is not varied by the experimenter. Instead, the time spent in the delay period is determined by the mouse, and the only source of variability in the time spent in the delay period is minor differences in the mice’s running speeds across trials or sessions. Notably, the differences in time in the delay period were greatest between mice because some mice ran faster than others. Within a mouse, the time spent in the delay period was generally rather consistent due to relatively constant running speeds. Also, because the mouse had full control over the delay duration, it could very well speed up its running if it started to forget the cue and run more slowly if it was confident in its memory. Thus, because the delay duration was set by the mouse and not the experimenter, it is very challenging or impossible to interpret the meaning and impact of variations in the delay duration. Accordingly, we had no a priori reason to expect a relationship between task performance and delay duration once mice have become experts at the delay task. Indeed, we do not see such a relationship in our data (see plot here, n = 85 sessions across 7 mice). In order to test the effect of delay duration on behavioral performance, we would have to systematically change the length of the delay period in the maze, which we did not do and which would require an entirely new set of experiments.

      Also, the authors heavily rely on "decision-making" but I am genuinely wondering if this is at all needed to account for the behavior exhibited by mice in these tasks (it would be more accurate for the bandit task) as with the perspective developed by the authors, any task implies a "decision-making" component, so that alone is not very informative on the nature of the cognitive operations that mice must compute to solve the tasks. I think a more accurate terminology in line with the specific task considered should be employed to clarify this.

      We acknowledge that the previous emphasis on decision-making may have created expectations that we demonstrate effects that are specific to the ‘decision-making’ aspect of a decision task. As we do not isolate the decision-making process specifically, we have substantially revised our wording around the tasks and removed the emphasis on decision-making, including in the title. Rather than decision-making, we now highlight the navigational aspect of the tasks employed.

      The "switching"/bandit task is particularly interesting. But because the authors only consider trials with highest accuracy, I think they are missing a critical component of this task which is the balance between exploiting current knowledge and the necessity to explore alternate options when the former strategy is no longer effective. So trials with poor performance are thus providing an essential feedback which is a major drive to support exploratory actions and a critical asset of the bandit task. There is an ample literature documenting how these tasks assess the exploration/exploitation trade-off.

      We completely agree with the reviewer that the periods following rule switches are an essential part of the switching task and of high interest. Indeed, ongoing work in the lab is carefully quantifying the mice’s strategy in this task and exploring how mice use errors after switches to update their belief about the rule. In this project, however, a detailed quantification of switching task strategy seemed beyond the scope because our focus was on training history and not on the specifics of each task. While we agree with the reviewer about the interesting nature of the switching period, it would be too much for a single paper to investigate the detailed mechanisms of each task on top of what we already report for training history. Instead, we have now added quantifications of performance recovery after rule switches in Figure 1—figure supplement 2, showing that rule switches cause below-chance performance initially, followed by recovery within tens of trials.
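
A switch-aligned performance curve of the kind described here can be computed by aligning trials to each rule switch and averaging correctness at every trial offset. The following is a minimal sketch under stated assumptions, not the authors' analysis code: the function name `switch_aligned_performance`, the per-trial list layout, and the toy session are hypothetical.

```python
def switch_aligned_performance(rules, correct, window):
    """Mean fraction correct at each trial offset after a rule switch.

    rules   -- per-trial rule label (hypothetical layout, e.g. "A" or "B")
    correct -- per-trial outcome, 1 for correct and 0 for error
    window  -- number of post-switch trials to summarize
    """
    # A switch is any trial whose rule differs from the previous trial's rule.
    switch_trials = [i for i in range(1, len(rules)) if rules[i] != rules[i - 1]]
    totals = [0] * window
    counts = [0] * window
    for s in switch_trials:
        for offset in range(window):
            i = s + offset
            # Stop at the end of the session or at the next rule switch.
            if i >= len(rules) or rules[i] != rules[s]:
                break
            totals[offset] += correct[i]
            counts[offset] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]


# Toy session (made-up values, for illustration only).
rules = ["A"] * 4 + ["B"] * 4 + ["A"] * 4
correct = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1]
curve = switch_aligned_performance(rules, correct, window=3)
print(curve)  # [0.0, 0.5, 1.0]
```

In this toy session, performance is at zero on the first post-switch trial and recovers over the following trials; real sessions would need many switches per mouse for a stable estimate.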

      2) Training history vs learning sets vs behavioral flexibility:

      The authors consider "training history" as the unique angle to interpret the data. Because the experimental setup is the same throughout all experiments, I am wondering if animals are just simply provided with a cognitive challenge assessing behavioral flexibility given that they must identify the new rule while restraining from responding using previously established strategies. According to this view, it may be expected for cortical lesions to be more detrimental because multiple cognitive processes are now at play.

      It is also possible that animals form learning sets during successive learning episodes which may interfere with or facilitate subsequent learning. Little information is provided regarding learning dynamics in each task (e.g. trials to criterion depending on the number of tasks already presented) to have a clear view on that.

      We thank the reviewer for raising these interesting ideas. We have now evaluated these ideas in the context of our experimental design and results. One of the main points to consider is that for mice transitioned from either of the complex tasks to the simple task, the simple task is not a novel task, but rather a well-known simplification of the previous tasks. Mice that are experts on the delay task have experienced the simple task, i.e. trials without a delay period, during their training procedure before being exposed to delay periods. Switching task expert mice know the simple task as one rule of the switching task and have performed according to this rule in each session prior to the task transition. Accordingly, upon the transition to the simple task, both delay task expert mice and switching task expert mice perform at very high levels on the very first simple task session. We now quantify and report this in Figure 2—figure supplement 1 (A, B). This is crucial to keep in mind when assessing ‘learning sets’ or ‘behavioral flexibility’ as possible explanations for the persistent cortical involvement after the task transitions. In classical learning sets paradigms, animals are exposed to a series of novel associations, and the learning of previous associations speeds up the learning of subsequent ones (Caglayan et al., 2021; Eichenbaum et al., 1986; Harlow, 1949). This is a distinct paradigm from ours because the simple task does not contain associations that are new to the mice already trained on the complex tasks. Relatedly, the simple task is unlikely to present a challenge of behavioral flexibility to these mice given our experimental design and the observation of high simple task performance in the first session after the task transition.

      We now clarify these points in the introduction, results, and discussion sections, also acknowledging that it will be of interest for future work to investigate how learning sets may affect cortical task involvement.

      3) Calcium imaging data versus interventions:

      The value of the calcium imaging data is not entirely clear. Does this approach bring a new point to consider to interpret or conclude on behavioral data or is it to be considered convergent with the optogenetic interventions? Very specific portions of behavioral data are considered for these analyses (e.g. only highly successful trials for the switching/bandit task) and one may wonder if considering larger or different samples would bring similar insights. The whole take on noise correlation is difficult to apprehend because of the same possible interpretation issue, does this really reflect training history, or that a new rule now must be implemented or something else? I don't really get how this correlative approach can help to address this issue.

      We thank the reviewer for pointing out that the relationship between the inhibition dataset and calcium imaging dataset is not clear enough. We restricted analyses of inhibition and calcium imaging data in the switching task to the identical cue-choice associations as present in the simple task (i.e. Rule A trials of the switching task). We did this because we sought to make the fairest and most convincing comparison across tasks for both datasets. However, we can now see that not reporting results with trials from the other rule causes concerns that the reported differences across tasks may only hold for a specific subset of trials.

      We have now added analyses of optogenetic inhibition effects and calcium imaging results considering Rule B trials. In Figure 1—figure supplement 2, we show that when considering only Rule B trials in the switching task, effects of RSC or PPC inhibition on task performance are still increased relative to the ones observed in mice trained on and performing the simple task. We also show that overall task performance is lower in Rule B trials of the switching task than in the simple task, mirroring the differences across tasks when considering Rule A trials only.

      We extended the equivalent comparisons to the calcium imaging dataset, only considering Rule B trials of the switching task in Figure 4—figure supplement 3. With Rule B trials only, we still find larger mean activity and trial-type selectivity levels in RSC and PPC, but not in V1, compared to the simple task, as well as lower noise correlations. We thus find that our conclusions about area necessity and activity differences across tasks hold for Rule B trials and are not due to only considering a subset of the switching task data.
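
For readers less familiar with the measure, noise correlations are commonly computed by removing each neuron's mean response per trial type and then correlating the remaining trial-to-trial fluctuations between pairs of neurons. The sketch below shows that general definition only; it is not the authors' pipeline, and the function names, data layout, and toy numbers are assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def residuals(responses, trial_types):
    """Subtract the mean response of each trial type from every trial."""
    by_type = defaultdict(list)
    for r, t in zip(responses, trial_types):
        by_type[t].append(r)
    means = {t: sum(v) / len(v) for t, v in by_type.items()}
    return [r - means[t] for r, t in zip(responses, trial_types)]


def noise_correlation(neuron_a, neuron_b, trial_types):
    """Correlation of trial-to-trial fluctuations around trial-type means."""
    return pearson(residuals(neuron_a, trial_types),
                   residuals(neuron_b, trial_types))


# Toy responses (made-up values): the two neurons prefer opposite trial types
# but fluctuate together from trial to trial within each type.
trial_types = ["left", "left", "left", "right", "right", "right"]
neuron_a = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
neuron_b = [2.0, 3.0, 4.0, 0.0, 1.0, 2.0]

raw_r = pearson(neuron_a, neuron_b)
noise_r = noise_correlation(neuron_a, neuron_b, trial_types)
print(round(raw_r, 2), round(noise_r, 2))  # -0.48 1.0
```

The toy example separates the two quantities: the raw correlation is negative because the neurons have opposite tuning, while the noise correlation is positive because their residual fluctuations covary.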

      In Figure 4—figure supplement 4, we further leverage the inclusion of Rule B trials and present new analyses of different single-neuron selectivity categories across rules in the switching task, reporting a prevalence of mixed selectivity in our dataset.

      Furthermore, to clarify the link between the optogenetic inhibition and the calcium imaging datasets, we have revised the motivation for the imaging dataset, as well as the presentation of its results and discussion. Investigating an area’s neural activity patterns is a crucial first step towards understanding how differential necessity of an area across tasks or experience can be explained mechanistically on a circuit level. We now elaborate on the fact that mechanistically, changes in an area’s necessity may or may not be accompanied by changes in activity within that area, as previous work in related experimental paradigms has reported differences in necessity in the absence of differences in activity (Chowdhury & DeAngelis, 2008; Liu & Pack, 2017). This phenomenon can be explained by differences in the readout of an area’s activity. We now make more explicit that in contrast to the scenario where only the readout changes, we find an intriguing correspondence between increased necessity (as seen in the inhibition experiments) and increased activity and selectivity levels (as seen in the imaging experiments) in cortical association areas depending on the current task and previous experience. Rather than attributing the increase in necessity solely to these observed changes in activity, we highlight that in the simple task condition already, cortical areas contain a high amount of task information, ruling out the idea that insufficient local information would cause the small performance deficits from inhibition. Our results thus suggest that differential necessity across tasks and experience may still require changes at the readout level despite changes in local activity. 
We view our imaging results as an exciting first step towards a mechanistic understanding of how cognitive experience affects cortical necessity, but we stress that future work will need to test directly the relationship between cortical necessity and various specific features of the neural code.

      Reviewer #2 (Public Review):

      The authors use a combination of optogenetics and calcium imaging to assess the contribution of cortical areas (posterior parietal cortex, retrosplenial cortex, S1/V1) on a visual-place discrimination task. Head-fixed mice were trained on a simple version of the task where they were required to turn left or right depending on the visual cue that was present (e.g. X = go left; Y = go right). In a more complex version of the task the configurations were either switched during training or the stimuli were only presented at the beginning of the trial (delay).

      The authors found that inhibiting the posterior parietal cortex and retrosplenial cortex affected performance, particularly on the complex tasks. However, previous training on the complex tasks resulted in more pronounced impairments on the simple task than when behaviourally naïve animals were trained/tested on a simple task. This suggests that the more complex tasks recruit these cortical areas to a greater degree, potentially due to increased attention required during the tasks. When animals then perform the simple version of the task their previous experience of the complex tasks is transferred to the simple task resulting in a different pattern of impairments compared to that found in behaviorally naïve animals.

      The calcium imaging data showed a similar pattern of findings to the optogenetic study. There was overall increased activity in the switching tasks compared to the simple tasks consistent with the greater task demands. There was also greater trial-type selectivity in the switching task compared to the simple task. This increased trial-type selectivity in the switching tasks was subsequently carried forward to the simple task so that activity patterns were different when animals performed the simple task after experiencing the complex task compared to when they were trained on the simple task alone.

      Strengths:

      The use of optogenetics and calcium-imaging enables the authors to look at the requirement of these brain structures both in terms of necessity for the task when disrupted as well as their contribution when intact.

      The use of the same experimental set up and stimuli can provide a nice comparison across tasks and trials.

      The study nicely shows that the contribution of cortical regions varies with task demands and that longer-term changes in neuronal responses can transfer across tasks.

      The study highlights the importance of considering previous experience and exposure when understanding behavioural data and the contribution of different regions.

      The authors include a number of important controls that help with the interpretation of the findings.

      We thank the reviewer for pointing out these strengths in our work and for finding our main conclusions supported.

      Weaknesses:

      There are some experimental details that need to be clarified to help with understanding the paper in terms of behavior and the areas under investigation.

      The use of the same stimuli throughout is beneficial as it allows direct comparisons with animals experiencing the same visual cues. However, it does limit the extent to which you can extrapolate the findings. It is perhaps unsurprising to find that learning about specific visual cues affects subsequent learning and use of those specific cues. What would be interesting to know is how much of what is being shown is cue specific learning or whether it reflects something more general, for example schema learning which could be generalised to other learning situations. If animals were then trained on a different discrimination with different stimuli would this previous training modify behavior and neural activity in that instance. This would perhaps be more reflective of the types of typical laboratory experiments where you may find an impairment on a more complex task and then go on to rule out more simple discrimination impairments. However, this would typically be done with slightly different stimuli so you don't introduce transfer effects.

      We agree with the reviewer that investigating the effects of schema learning on cortical task involvement is an exciting future direction and have now explicitly mentioned this in the Discussion section. As the reviewer points out, however, our study was not designed to test this idea specifically. Because investigating schema learning would require developing and implementing an entirely new set of behavioral task variants, we feel this is beyond the scope of the current work. As to the question of how generalized the effects of cognitive experience are, our data in the run-to-target task suggest that if task settings are sufficiently distinct, cortical involvement can be similarly low regardless of complex task experience (now Figure 3—figure supplement 1). This finding is in line with recent work (Pinto et al., 2019), where cortical involvement appears to change rapidly depending on major differences in task demands. However, work in MT has shown that previous motion discrimination training using dots can alter MT involvement in motion discrimination of gratings (Liu & Pack, 2017), highlighting that cortical involvement need not be tightly linked to the sensory cue identity.

      It is not clear whether length of training has been taken into account for the calcium imaging study given the slow development of neural representations when animals acquire spatial tasks.

      We apologize that the training duration and the temporal relationship between task acquisition and calcium imaging were not documented for the calcium imaging dataset. Please see our detailed reply under the ‘recommendations for the authors’ from Reviewer 2 below.

      The authors are presenting the study in terms of decision-making, however, it is unclear from the data as presented whether the findings specifically relate to decision making. I'm not sure the authors are demonstrating differential effects at specific decision points.

      We understand that the previous emphasis on decision-making may have created expectations that we demonstrate effects that are specific to the ‘decision-making’ aspect of a decision task. As we do not isolate the decision-making process specifically, we have substantially revised our wording around the tasks and removed the emphasis on decision-making, including in the title. Rather than decision-making, we now highlight the navigational aspect of the tasks employed.

      While we removed the emphasis on the decision-making process in our tasks, we found the reviewer’s suggestion to measure ‘decision points’ a useful additional behavioral characterization across tasks. So, we quantified how soon a mouse’s ultimate choice can be decoded from its running pattern as it progresses through the maze towards the Y-intersection. We now show these results in Figure 1—figure supplement 1. Interestingly, we found that in the delay task, choice decoding accuracy was already very high during the cue period before the onset of the delay. Nevertheless, we had shown that overall task performance and performance with inhibition were lower in the delay task compared to the simple task. Also, in segment-specific inhibition experiments, we had found that inhibition during only the delay period or only the cue period decreased task performance substantially more than in the simple task, thus finding an interesting absence of differential inhibition effects around decision points. Overall, how early a mouse made its ultimate decision did not appear predictive of the inhibition-induced task decrements, which we also directly quantify in Figure 1—figure supplement 1.

    2. Reviewer #1 (Public Review):

      The role of the parietal (PPC), the retrosplenial (RSP) and the visual cortex (S1) was assessed in three tasks corresponding to a simple visual discrimination task, a working-memory task and a two-armed bandit task all based on the same sensory-motor requirements within a virtual reality framework. A differential involvement of these areas was reported in these tasks based on the effect of optogenetic manipulations. Photoinhibition of PPC and RSP was more detrimental than photoinhibition of S1 and more drastic effects were observed in presumably more complex tasks (i.e. working-memory and bandit task). If mice were trained with these more complex tasks prior to training in the simple discrimination task, then the same manipulations produced large deficits suggesting that switching from one task to the other was more challenging, resulting in the involvement of possibly larger neural circuits, especially at the cortical level. Calcium imaging also supported this view with differential signaling in these cortical areas depending on the task considered and the order in which they were presented to the animals. Overall the study is interesting and the fact that all tasks were assessed relying on the same sensory-motor requirements is a plus, but the theoretical foundations of the study seem a bit loose, opening the way to alternate ways of interpreting the data than "training history".

      1) Theoretical framework:

      The three tasks used by the authors should be better described at the theoretical level. While the simple task can indeed be considered a visual discrimination task, the other two tasks operationally correspond to a working-memory task (i.e. delay condition which is indeed typically assessed in a Y- or a T-maze in rodent) or a two-armed bandit task (i.e. the switching task), respectively. So these three tasks are qualitatively different, are therefore reliant on at least partially dissociable neural circuits and this should be clearly analyzed to explain the rationale of the focus on the three cortical regions of interest. For the working-memory task we do not know the duration of the delay but this really is critical information; per definition, performance in such a task is delay-dependent, this is not explored in the paper.

      Also, the authors heavily rely on "decision-making" but I am genuinely wondering if this is at all needed to account for the behavior exhibited by mice in these tasks (it would be more accurate for the bandit task) as with the perspective developed by the authors, any task implies a "decision-making" component, so that alone is not very informative on the nature of the cognitive operations that mice must compute to solve the tasks. I think a more accurate terminology in line with the specific task considered should be employed to clarify this.

      The "switching"/bandit task is particularly interesting. But because the authors only consider trials with highest accuracy, I think they are missing a critical component of this task which is the balance between exploiting current knowledge and the necessity to explore alternate options when the former strategy is no longer effective. So trials with poor performance are thus providing an essential feedback which is a major drive to support exploratory actions and a critical asset of the bandit task. There is an ample literature documenting how these tasks assess the exploration/exploitation trade-off.

      2) Training history vs learning sets vs behavioral flexibility:

      The authors consider "training history" as the unique angle to interpret the data. Because the experimental setup is the same throughout all experiments, I am wondering if animals are just simply provided with a cognitive challenge assessing behavioral flexibility given that they must identify the new rule while restraining from responding using previously established strategies. According to this view, it may be expected for cortical lesions to be more detrimental because multiple cognitive processes are now at play.

      It is also possible that animals form learning sets during successive learning episodes which may interfere with or facilitate subsequent learning. Little information is provided regarding learning dynamics in each task (e.g. trials to criterion depending on the number of tasks already presented) to have a clear view on that.

      3) Calcium imaging data versus interventions:

      The value of the calcium imaging data is not entirely clear. Does this approach bring a new point to consider to interpret or conclude on behavioral data or is it to be considered convergent with the optogenetic interventions? Very specific portions of behavioral data are considered for these analyses (e.g. only highly successful trials for the switching/bandit task) and one may wonder if considering larger or different samples would bring similar insights. The whole take on noise correlation is difficult to apprehend because of the same possible interpretation issue, does this really reflect training history, or that a new rule now must be implemented or something else? I don't really get how this correlative approach can help to address this issue.

    1. Reviewer #1 (Public Review): 

      This study compares concentrations of immune mediators in vaginal samples of young women who report having had or report not having had vaginal sex. The study finds that the concentration of many immune markers is higher in samples of women who report having had sex than in samples of women who report not yet having had sex. While the results are interesting and suggestive, I do not believe this result necessarily indicates that vaginal sex increases levels of these immune mediators (a causal relationship) and that the evidence presented here is strong enough to draw this conclusion. 

      This study presents many methodological strengths. The sample size is amply sufficient to achieve high statistical power for this research question. A particular strength of this analysis is the relatively large number of participants who provided paired before and after sex samples. These samples are particularly valuable because stronger conclusions can be drawn from them, as their comparison is less likely to be confounded by unmeasured confounders. The statistical methods are largely appropriate for the research question, with the use of random effects to account for the correlation in multiple measures per participant. 

      The reason I would not draw causal conclusions from this analysis is that there is a high potential for unmeasured confounding of the association between sex and the concentration of immune mediators. The variables that were included in the multivariable analysis were for the most part not confounders, so the authors cannot claim that their results are free from potential confounding. Confounders are in general variables which are common causes of both the exposure of interest (vaginal sex) and the outcome (level of immune markers), and which are not on the causal pathway and are not a downstream effect of the outcome (inverse causality). The only included variable that is a potential confounder is age. Most other variables (pregnancy, contraception, Nugent score, Chlamydia infection, and HSV-2 seropositivity) are either potential mediators of the effect of sex or downstream effects of the level of immune markers. It does not follow that adjustment for these variables would necessarily lead to an underestimation of the causal effect, as it is possible some of these variables have complex relationships with immune mediators, so it is difficult to predict how adjusting for these variables would influence results. Some of these variables are also potentially colliders, so adjustment for them may lead to bias (see an introduction to this topic in Holmberg MJ, Andersen LW. Collider Bias. JAMA. 2022;327(13):1282-1283. doi:10.1001/jama.2022.1820). There is no consideration of general social determinants of health that are more likely to be confounders because they potentially influence both sexual behavior and the immune system: socioeconomic status, ethnicity, education, employment, housing, food security, access to health care, etc. There is overwhelming evidence that young people who are sexually active tend to have very different socioeconomic characteristics than young people who are not sexually active. It is therefore difficult to assess whether the higher level of immune markers in women who are sexually active truly represents a causal effect of sex or simply reflects differences in the type of women who have sex.
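
The collider concern can be illustrated with a small simulation. The sketch below uses purely synthetic numbers, not data from the study: an exposure and an outcome are generated independently, a third variable is caused by both, and conditioning on that third variable (here by selecting on it) induces a spurious association. The variable names, the selection threshold, and the sample size are assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def simulate_collider(n=20000, seed=0):
    rng = random.Random(seed)
    exposure = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [rng.gauss(0, 1) for _ in range(n)]  # independent of exposure
    # The collider is a common effect of exposure and outcome, plus noise.
    collider = [e + o + rng.gauss(0, 1) for e, o in zip(exposure, outcome)]

    r_all = pearson(exposure, outcome)
    # Conditioning on the collider: keep only units with a high collider value.
    kept = [(e, o) for e, o, c in zip(exposure, outcome, collider) if c > 1.0]
    r_conditioned = pearson([e for e, _ in kept], [o for _, o in kept])
    return r_all, r_conditioned


r_all, r_conditioned = simulate_collider()
print(f"all units: r = {r_all:.2f}; conditioned on collider: r = {r_conditioned:.2f}")
```

In the full synthetic sample the correlation is close to zero, as constructed, whereas within the selected subset it is clearly negative. Regression adjustment for a collider can distort an estimate in a similar way, and the direction of the distortion depends on the causal structure.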

      The paired analysis also suggests that the main analysis is likely to be confounded. The evidence from the paired analysis is much stronger than the evidence from the unpaired main analysis because the paired analysis inherently adjusts for many unmeasured confounders that lead to women having sex by a certain age; the differences in paired samples are likely much closer to the causal effect of sex than the differences from the unpaired samples. We see that, in the paired analysis, the differences in levels of immune mediators before and after sex are systematically much smaller and non-significant for most immune markers. This suggests to me that the main analysis is confounded and overestimates the effect of sex on immune markers. If there is a causal effect, it is likely to be much smaller than the one estimated in the main unpaired analysis.
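
The logic of this argument can be sketched with a toy simulation in which an unmeasured, person-level factor raises both the probability of exposure and the marker level. Everything below is synthetic and hypothetical (the effect size, the exposure probabilities, the noise model, and the names); it is meant only to show why a within-person comparison can sit closer to the causal effect than a between-person comparison, not to reproduce the study's analysis.

```python
import random
from statistics import mean


def simulate_paired_vs_unpaired(n=20000, true_effect=0.2, seed=1):
    rng = random.Random(seed)
    exposed_levels, unexposed_levels, paired_diffs = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)               # unmeasured person-level confounder
        before = u + rng.gauss(0, 1)      # marker level before exposure
        # The confounder also drives who becomes exposed.
        exposed = rng.random() < (0.8 if u > 0 else 0.2)
        if exposed:
            after = u + true_effect + rng.gauss(0, 1)
            exposed_levels.append(after)
            paired_diffs.append(after - before)  # u cancels within a person
        else:
            unexposed_levels.append(before)
    unpaired_estimate = mean(exposed_levels) - mean(unexposed_levels)
    paired_estimate = mean(paired_diffs)
    return unpaired_estimate, paired_estimate


unpaired_estimate, paired_estimate = simulate_paired_vs_unpaired()
print(f"unpaired: {unpaired_estimate:.2f}, paired: {paired_estimate:.2f}")
```

Under these assumptions the unpaired difference is several times larger than the simulated causal effect of 0.2, while the paired difference stays close to it. The paired comparison still would not remove confounders that change over time within a person.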

      The authors argue that the smaller effects seen in the paired analysis might be due to an effect of time, where samples closer to the start of sex show smaller differences. However, I would need more evidence to be convinced of this. Notably, they use a spline analysis in Figure 4 to show the effect of time since vaginal sex. However, I would have liked to see the p-values for the time-dependent spline effect, in order to see whether the data supports that a difference in slopes before and after sex significantly improves the model. I suspect many of the splines are not significant and may not lend strong support to the hypothesis that time since sex has an effect. It is however difficult to assess this visually without a formal test. 

      While the results from the systematic review and meta-analysis are interesting and show that at least two other studies have shown similar results, I wonder whether these other studies have similar issues of confounding. The other previous studies have even fewer paired samples, so are likely to have weaker evidence than the current study.

      In summary, I think this study has some important methodological strengths in terms of sampling and study design. However, I believe the interpretation of the results should be more tempered and cautious; while there are differences in levels of immune markers between women who have and have not had sex, there is not to my mind sufficient evidence that this difference is the result of a causal effect of initiation of vaginal sex, as there is likely to be some collider bias and unmeasured residual confounding in the analysis.

    1. Author Response

      Reviewer #2 (Public Review):

      In this study, Radtke et al. use a model of helminth infection in IL-4-IRES-eGFP (4get) mice, in which transcription at the Il4 locus is reported by eGFP, in order to define the transcriptional signatures and clonal relatedness between Il4-licensed, CD4+ T cells in the mesenteric lymph nodes (mLN) and lungs. By infecting 4get mice with the hookworm Nippostrongylus brasiliensis, which is well described to induce a robust type 2 immune response, the authors isolated and sorted eGFP+CD4+ T cells from the mLN and lungs at day 10 post infection and performed single cell RNA-seq analysis using the 10X Chromium platform. Transcriptional profiling of activated CD4+ T cells with scRNA-seq has been performed in a murine model of allergic asthma, including the lung and lung-draining lymph nodes, but this study involved unbiased capture of all activated CD4+ T cells (Tibbitt et al., Immunity, 2019). Radtke et al. have used a distinct model with Nippostrongylus brasiliensis and have focused on sorting Il4-licensed, CD4+ T cells, allowing for a greater number of captured CD4+ T cells with a "type 2" lymphocyte program for single cell analysis. Furthermore, this study sought to identify distinct and overlapping transcriptional signatures and clonal relatedness between Il4-licensed, CD4+ T cells in two "distant" tissues. In support of such an approach, there is growing evidence for tissue-specific and model-specific features of CD4+ T cell differentiation (Poholek, Immunohorizons, 2021; Hiltensperger et al., Nature Immunol, 2021; Kiner et al., Nature Immunol, 2021).

      Upon dimension reduction, the authors found mLN- and lung-specific clusters, including two juxtaposed clusters that form a "bridge" between the mLN and lung compartments, suggesting immigrating and/or emigrating cells. Consistent with previous studies, the dominant lung cluster (L2) exhibited unique expression of Il5 and Il13, enhanced IL-33 and IL-2 signaling, and exhibited an effector/resident memory profile. The authors did find a small cluster in the mLN (ML4) with an effector/resident memory signature that also expressed CCR9, suggesting the potential for homing to the gut mucosa. Whether this population is specific to the mLN or would also be found in the lung-draining lymph nodes remains unclear. In the mLN, the authors also describe an iNKT cell cluster with CCR9 expression and a CD4+ T cell cluster with a myeloid gene signature, but the significance of these populations remains unclear.

      The authors then use RNA velocity analysis to infer the developmental trajectory of Il4-licensed, CD4+ T cells from the two tissue sites. Consistent with previous studies, the authors found that T cell proliferation was associated with fate decisions. Furthermore, among the two lung CD4+ T cell clusters, L1 represents highly differentiated, effector Th2 cells while L2, which is juxtaposed to the mLN clusters, represents a population likely entering the lung with the potential to differentiate into L1 cells.

      Next, the authors perform TCR repertoire analysis. The authors identified a broad TCR repertoire with the majority of distinct TCRs being found in only one cell. Among the TCRs found in more than one cell, a substantial number of clones can be found in both tissue sites, which is consistent with the findings that individual CD4+ T cell clones can produce different types of effector cells (Tubo et al., Cell, 2013). The authors find significant overlap of clones between the mLN and lung. In addition, they also identify clones enriched in a particular site and suggest that this represents local expansion. However, an alternative possibility is that certain CD4+ T cell clones are expanded at a particular site because the specific TCR preferentially instructs a particular cell fate. For example, fate-mapping of individual naïve CD8+ T cells suggests that certain T cell clones exhibit a greatly heightened capacity to form tissue-resident memory T cells over other cell fates (Kok et al., J Exp Med, 2020). Lastly, the authors analyze CDR3 sequences, finding the most abundant CDR3 motif belonging to the invariant TCRa chain of iNKTs. Among conventional CD4+ T cells, the abundant CDR3 motifs were not restricted to an exact TCRa/TCRb combination beyond a slight preferential usage of the Trbv1 gene. While TCR repertoire analysis allows for defining clonal relatedness among Il4-licensed, CD4+ T cells, the importance and relevance of the above findings to the in vivo type 2 immune response remain unclear.
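
      The bookkeeping behind this kind of clonal summary can be sketched as follows. This is only an illustration with made-up cells and clonotype labels; it is not the authors' pipeline, and the function name `summarize_clones` is hypothetical.

```python
from collections import Counter

# Hypothetical (site, clonotype) pairs, one per sequenced cell.
cells = [
    ("mLN", "A"), ("mLN", "A"), ("lung", "A"),
    ("mLN", "B"),
    ("lung", "C"), ("lung", "C"),
    ("mLN", "D"),
    ("lung", "E"),
]


def summarize_clones(cells):
    """Count clonotypes, singletons, expanded clones, and clones seen at >1 site."""
    sizes = Counter(clone for _, clone in cells)
    sites = {}
    for site, clone in cells:
        sites.setdefault(clone, set()).add(site)
    expanded = sorted(c for c, n in sizes.items() if n > 1)
    shared = sorted(c for c in expanded if len(sites[c]) > 1)
    return {
        "clonotypes": len(sizes),
        "singletons": len(sizes) - len(expanded),
        "expanded": expanded,
        "shared_between_sites": shared,
    }


if __name__ == "__main__":
    print(summarize_clones(cells))
```

      In this toy input, clone A is expanded and present at both sites, while clone C is expanded at one site only. A count like this cannot by itself distinguish local expansion from a TCR-instructed fate preference, which is the alternative raised above.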

      There are several limitations of the study:

      (1) The authors use the term "Th2 cells" to describe all Il4-licensed, CD4+ T cells. While CD4+ T helper cell nomenclature has evolved, Th2 cells and Tfh2 cells are generally used to describe distinct subsets driven by unique transcriptional programs (Ruterbusch et al., Annu Rev Immunol, 2020). While previous data suggested that Tfh2 cells are precursors to effector Th2 cells, subsequent studies support a model in which Tfh2 and Th2 cells represent distinct developmental pathways and should be designated as distinct subsets (Ballesteros-Tato et al., Immunity, 2016; Tibbitt et al., Immunity, 2019). Consequently, the authors' broad use of "Th2 cells" and a description of "Th2 cell heterogeneity" includes CD4+ T cell subsets with distinct developmental pathways that include canonical Th2 cells as well as Tfh2 and iNKT cells. The clarity of the manuscript would be improved by describing eGFP+CD4+ cells as Il4-licensed, CD4+ T cells rather than Th2 cells.

      We thank the reviewer for the helpful comment and now state in the introduction (lines 76-78) that our IL-4 reporter-positive population also includes cells that don’t meet the Th2 criteria.

      (2) The authors used perfused lungs to isolate Il4-licensed, CD4+ T cells for scRNA-seq of "Th2 cells" in the lung tissue. However, previous studies indicate that leukocytes, including CD4+ T cells, in lung vasculature are not completely removed by perfusion, which confounds the interpretation of a tissue cell profile due to contaminating circulating cells (Galkina, E et al., J Clin Invest, 2005; Anderson, KG et al., Nat Protoc, 2014). This is particularly true in the lung and relevant as the authors found a lung cluster (L2) with a circulating signature and suggested that L2 may represent recent immigrant "Th2 cells". Thus, it is unclear whether the L2 cluster identifies immigrant Th2 cells or simply reflects circulating Th2 cells trapped in the lung vasculature. The study would benefit from using intravascular staining to discriminate cells within the lungs from those in the circulation (Anderson, KG et al., Nat Protoc, 2014) for the proper isolation of Il4-licensed lung CD4+ T cells to truly define immigrant "Th2 cells" within the lung parenchyma.

      Following the reviewer's suggestion, we performed intravascular staining to discriminate cells within the lungs from those in the circulation (new Figure 2—figure supplement 1). According to the intravascular staining method (with slightly increased time between i.v. injection and sacrifice compared to Anderson, KG et al., Nat Protoc, 2014, for a higher probability of successful staining), the L2 lung cluster is a mixture of circulating cells and immigrating cells, which we describe in the text (lines 210-213). The finding that the cells from the vasculature and the cells we classified as “migrating” seem to cluster together based on the similarity of their expression profiles on our UMAP further supports the classification of the L2 tissue fraction as “recent immigrants”. We thank the reviewer for this helpful comment, which improved the quality of the manuscript.

      (3) The authors describe T cell exchange/trafficking across organs. However, in general, inter-organ trafficking refers to lymphocyte trafficking between distinct non-lymphoid tissues, rather than trafficking between lymph nodes and peripheral tissues (Huang et al., Science, 2018). Rather than inter-organ trafficking, the authors have described shared and distinct features of Il4-licensed, CD4+ T cells from a draining lymph node of one organ (gut) and a distant non-lymphoid organ (lung). The experimental approach used makes interpretation of some of the findings challenging. Specifically, canonical effector Th2 cell differentiation is well described to occur via two checkpoints, including the draining lymph node and the peripheral (non-lymphoid) tissue (Liang et al., Nature Immunol, 2011; Van Dyken et al., Nature Immunol, 2016; Tibbitt et al., Immunity, 2019). In the draining lymph node, Th2 cells acquire the capacity to express IL-4 alone, but do not complete effector Th2 cell differentiation until trafficking to the inflamed peripheral tissues and receiving additional inflammatory signals. Consequently, it is unclear whether the differences identified in the mesenteric lymph node and lungs simply reflect well-described differences between the two Th2 cell checkpoints or organ-specific differences (gut vs lung). Il4-licensed, CD4+ T cells from the intestinal mucosa and lung-draining lymph node would also be needed to truly define organ-specific differences during helminth infection.

      Following the reviewer's suggestion, we now avoid the term “inter-organ trafficking” and have replaced it with “at distant sites” in the title. As the reviewer points out, we chose the setup of comparing a lymphoid and a non-lymphoid organ to acquire a broad picture of Th2 developmental stages in Nb infection. The limited overlap in clusters on the UMAP shows that expression profiles between MLN and lung strongly differ. However, this notion is not in conflict with cells of both organs being in a different developmental stage. We added information to highlight this in the manuscript (lines 99-101). Lung and MLN (rather than medLN and MLN) were selected to enable clonal relatedness/distribution analysis of T cells at distant sites. As part of the revision, we additionally provide newly generated single cell sequencing data that compare medLN and MLN cells at day 10 after Nb infection and find that UMAP clusters are largely overlapping between medLN and MLN (new Figure 1—figure supplement 3). This suggests that there is no broad medLN/MLN site-specific signature present that would force the medLN and MLN cells to cluster apart. Addition of the newly generated medLN/MLN data on the lung/MLN UMAP based on shared anchors (Stuart et al. Cell. 2019) also leads to a clear separation between all LN and lung cells, supporting that cells don’t cluster due to a site-specific respiratory tract vs intestinal tract signature but likely based on developmental stages (new Fig. 1C,D). Exceptions are defined effector clusters that show signs of a site-specific signature (L1 expresses Ccr8, MLN4 and MLN6 express Ccr9; differences are also suggested by clustering described in lines 247-252). A similar phenotype to the one observed on the transcriptional level is observed when we cluster medLN/MLN and lung cells based on scRNAseq-suggested surface marker expression after flow cytometric analysis, extending analysis to medLN on protein level (new Fig. 3).
It would also have been interesting to include lamina propria T cells, as the reviewer suggested, but we were not able to extract high-quality cells at day 10 after Nb infection, which is a common limitation in the Nb model.

      (4) The study includes a single time point (day 10) whereas Tibbitt et al. performed scRNAseq in the lung and lung-draining lymph node at multiple time points during type 2 immunity (Tibbitt et al., Immunity, 2019). As a result, it remains unclear how similarities or differences between the mesenteric lymph node and lung response would change over the duration of helminth infection, especially given the helminth life cycle involves multiple infection stages.

      As part of the revision, we screened for surface marker expression in the single cell sequencing dataset on transcript level and stained these on protein level (new Fig. 3 and Figure 3—figure supplement 1). This allows us to follow the populations defined by scRNAseq longitudinally (d0, d6, d8, d10) by flow cytometry during Nb infection. We compared medLN, MLN and lung. The dynamics of the response in the medLN and the MLN seem similar, with a small delay in the MLN compared to medLN.

      Nb, with its relatively well-defined migratory path through the body, provides a relevant, complex model antigen naturally present in the respiratory tract and the intestine during infection. However, this complexity and relevance often come with limitations. While stage 4 larvae are found in lung and gut and certainly provide a shared antigen basis between both sites (migration stage from lung to intestine; Camberis et al. Curr Protoc Immunol. 2003), we also think that there is a reasonable number of antigens shared between different larval stages, as well as antigens (either actively secreted or from dying larvae) that are systemically distributed. There are probably immunogenic differences between larval stages, but analyzing these is beyond the scope of the manuscript.

      While, for example, Tibbitt et al. nicely define cell clusters with a limited number of cells, they don’t include any TCR analysis and clonal information. Not much was known about the expansion of T cells in the different clusters in one organ and between organs, and we provide relevant data in this regard. Furthermore, HDM as an allergy model might invoke different Th2 differentiation pathways, as e.g. Tfh13 cells are found in allergic settings but not in worm models (Gowthaman U, Science. 2019). With our approach on single cell level, we were able to show effective distribution of a number of T cell clones in a highly heterogeneous immune response and to describe and functionally validate successfully expanded clones / expanded TCR chains later on (e.g. new Fig. 6). This kind of analysis has not been performed for a worm model before.

      (5) The study analyzed one scRNA-seq experiment that included two mice, without validation via flow cytometry or another method to infer a role of a particular finding in the type 2 immune response in vivo.

      As noted above, we screened for surface marker expression in the single cell sequencing dataset on transcript level and measured these on protein level by flow cytometry, as the reviewer suggested. This allows us to follow the populations defined by scRNAseq longitudinally (d0, d6, d8, d10) during Nb infection (new Fig. 3). Furthermore, we added a newly generated set of scRNAseq data which confirms and extends findings made in the initial sequencing experiment (Fig. 1C,D and Figure 1—figure supplement 3). We also included validation experiments based on the performed TCR analysis: we retrovirally expressed three TCRs from our study and confirmed Nb-specific expansion for one of them in vivo (new Fig. 6 and Figure 6—figure supplement 1).

    1. The third UDL principle is to provide multiple means of expression and action. We find it helpful to think of this as the principle that transcends social annotation: at this point, students use what they’ve learned through engagement with the material to create new knowledge. This kind of work tends to happen outside of the social annotation platform as students create videos, essays, presentations, graphics, and other products that showcase their new knowledge.

      I'm not sure I agree here as one can take other annotations from various texts throughout a course and link them together to create new ideas and knowledge within the margins themselves. Of course, at some point the ideas need to escape the margins to potentially take shape with a class wiki, new essays, papers, journal articles or longer pieces.

      Use of social annotation across several years of a program this way may help to super-charge students' experiences.

    1. “How might we, both individually and as a society, creatively generate new visions of what it means to grow old?”

      I agree with Minha's assessment of the project. Her research question is phrased perfectly for the overall topic of these combined videos. I can't stop, and I think I won't stop, thinking about what it truly means for me to age. Each voice represents a background that provides a resource for both the voice owner and the audience to answer this question. Aging for me means being more cautious with words and actions. I consciously do this because I see everyone around me go through this process and talk about it. Aging for me means looking at my grandparents and thinking about what I will do and what I will look like when I reach their age. I thought about this question a few times when I was much younger, then there was a long period of me not worrying about it at all, and in college the question came back to me more and more frequently. I often ask myself if my future kids/grandkids (if I ever have them) would care about me, and life after death is something that seems to have been in my head for the longest time. Aging for me means carrying new responsibilities. I know that there are things that were acceptable when I was one year younger and became inapplicable for me the year after, and vice versa. "What does it mean to age?" is repeatedly asked throughout the video, motivating us to give it a try and craft our own response. This research question summarizes well the purpose that these 'storytellers' and collaborators embed in this project and gives a bigger and better understanding of it. Like taylortots, I may revisit this project from time to time with newer perspectives about the definition of growing old. Thank you for the insightful post!